linux

Author	SHA1	Message	Date
Andi Kleen	224e2c977b	perf script: Support insn and insnlen When looking at Intel PT traces with perf script it is useful to have some indication of the instruction. Dump the instruction bytes and instruction length, which can be used for simple pattern analysis in scripts. % perf record -e intel_pt// foo % perf script --itrace=i0ns -F ip,insn,insnlen ffffffff8101232f ilen: 5 insn: 0f 1f 44 00 00 ffffffff81012334 ilen: 1 insn: 5b ffffffff81012335 ilen: 1 insn: 5d ffffffff81012336 ilen: 1 insn: c3 ffffffff810123e3 ilen: 1 insn: 5b ffffffff810123e4 ilen: 2 insn: 41 5c ffffffff810123e6 ilen: 1 insn: 5d ffffffff810123e7 ilen: 1 insn: c3 ffffffff810124a6 ilen: 2 insn: 31 c0 ffffffff810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01 ffffffff810124b1 ilen: 2 insn: 75 87 ... Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/1475847747-30994-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-24 11:07:30 -03:00
Jiri Olsa	af09b2d35e	perf c2c report: Add --show-all option Normally we limit the main list to contain only entries with HITM % value > 0.0005, but it might be useful to display all captured entries. Adding --show-all option for that. Requested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-nokgjdwikbegec5jzj4mxhqc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-21 10:32:02 -03:00
Jiri Olsa	18f278d2dd	perf c2c report: Add --no-source option Add a possibility to disable source line column with new --no-source option. It source line data could take lot of time to retrieve, so it could be a performance burden for big data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-8p6s2727fq8nbsm3it5gix3p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-21 10:32:01 -03:00
Jiri Olsa	465f27a3b2	perf c2c: Add man page and credits Add man page for c2c command and credits to builtin-c2c.c file. Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-21 10:32:01 -03:00
Nambong Ha	2ad8327fd0	perf top/report: Add tips about a list option Add two tips that describe --list option of config sub-command and explain how to choose particular config file location. Signed-off-by: Nambong Ha <over3025@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <taeung@kosslab.kr> Link: http://lkml.kernel.org/r/1475191562-3240-1-git-send-email-over3025@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-05 19:51:53 -03:00
Donghyun Kim	49343235d0	perf report/top: Add a tip about system-wide collection from all CPUs Signed-off-by: Donghyun Kim <dongdong9335@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <taeung@kosslab.kr> Link: http://lkml.kernel.org/r/1475187357-21882-1-git-send-email-dongdong9335@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-05 19:50:25 -03:00
Kim SeonYoung	8649b6434e	perf report/top: Add a tip about source line numbers with overhead There is a existing tip as below. If you have debuginfo enabled, try: perf report -s sym,srcline However this tip only describe a condition to use --sort sym,scrline options. So there is lack of explanation in the tip. I think that it would be better to add a tip that exactly explains the feature of --sort srcline. Signed-off-by: Seonyoung Kim <adamas0414@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <taeung@kosslab.kr> Link: http://lkml.kernel.org/r/1475194602-5596-1-git-send-email-adamas0414@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-05 19:47:55 -03:00
Sukadev Bhattiprolu	c8d6828a65	perf list: Support long jevents descriptions Previously we were dropping the useful longer descriptions that some events have in the event list completely. This patch makes them appear with perf list. Old perf list: baclears: baclears.all [Counts the number of baclears] vs new: perf list -v: ... baclears: baclears.all [The BACLEARS event counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. The BACLEARS.ANY event counts the number of baclears for any type of branch] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1473978296-20712-13-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-03 21:35:47 -03:00
Andi Kleen	1c5f01fe86	perf list: Add a --no-desc flag Add a --no-desc flag to 'perf list' to not print the event descriptions that were earlier added for JSON events. This may be useful to get a less crowded listing. It's still default to print descriptions as that is the more useful default for most users. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1473978296-20712-9-git-send-email-sukadev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-10-03 21:35:45 -03:00
Adrian Hunter	1b36c03e35	perf record: Add support for using symbols in address filters Symbols come from either the DSO or /proc/kallsyms for the kernel. Details of the functionality can be found in Documentation/perf-record.txt. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-29 11:17:02 -03:00
Simon Que	2acad19500	perf tools: Update documentation info about quipper The existing link is outdated. The most recent quipper code can be found at the new URL. Committer notes: Quipper is a C++ parser that can be used to convert from a perf.data file to and from a protobuf, a Chromium OS facility. Signed-off-by: Simon Que <sque@chromium.org> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Chong Jiang <chongjiang@chromium.org> Link: http://lkml.kernel.org/n/tip-4q1nm7jl3vovp66p5bki20pq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-29 11:16:44 -03:00
Adrian Hunter	a9e57009da	perf record: Fix documentation 'event_sources' -> 'event_source' Change '/sys/bus/event_sources' to the correct path which is '/sys/bus/event_source'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Link: http://lkml.kernel.org/r/1474641528-18776-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-27 15:00:29 -03:00
Mathieu Poirier	dd60fba732	perf tools: Add infrastructure for PMU specific configuration This patch adds PMU driver specific configuration to the parser infrastructure by preceding any term with the '@' letter. As such doing something like: perf record -e some_event/@cfg1,@cfg2=config/ ... will see 'cfg1' and 'cfg2=config' being added to the list of evsel config terms. Token 'cfg1' and 'cfg2=config' are not processed in user space and are meant to be interpreted by the PMU driver. First the lexer/parser are supplemented with the required definitions to recognise the driver specific configuration. From there they are simply added to the list of event terms. The bulk of the work is done in function "parse_events_add_pmu()" where driver config event terms are added to a new list of driver config terms, which in turn spliced with the event's new driver configuration list. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1473179837-3293-4-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-13 17:09:11 -03:00
Masami Hiramatsu	428aff82e9	perf probe: Ignore vmlinux buildid if offline kernel is given Ignore the buildid of running kernel when both of --definition and --vmlinux is given because that kernel should be off-line. This also skips post-processing of kprobe event for relocating symbol and checking blacklist, because it can not be done on off-line kernel. E.g. without this fix perf shows an error as below ---- $ perf probe --vmlinux=./vmlinux-arm --definition do_sys_open ./vmlinux-arm with build id 7a1f76dd56e9c4da707cd3d6333f50748141434b not found, continuing without symbols Failed to find symbol do_sys_open in kernel Error: Failed to add events. ---- with this fix, we can get the definition ---- $ perf probe --vmlinux=./vmlinux-arm --definition do_sys_open p:probe/do_sys_open do_sys_open+0 ---- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/147214228193.23638.12581984840822162131.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-01 09:44:14 -03:00
Masami Hiramatsu	1c20b1d154	perf probe: Show trace event definition Add --definition/-D option for showing the trace-event definition in stdout. This can be useful in debugging or combined with a shell script. e.g. ---- # perf probe --definition 'do_sys_open $params' p:probe/do_sys_open _text+2261728 dfd=%di:s32 filename=%si:u64 flags=%dx:s32 mode=%cx:u16 ---- Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/147214226712.23638.2240534040014013658.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-01 09:44:13 -03:00
Milian Wolff	893c5c798b	perf config: Show default report configuration in example and docs Signed-off-by: Milian Wolff <milian.wolff@kdab.com> LPU-Reference: 20160830134106.21240-2-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-09-01 09:44:13 -03:00
Masami Hiramatsu	bdca79c2bf	ftrace: kprobe: uprobe: Show u8/u16/u32/u64 types in decimal Change kprobe/uprobe-tracer to show the arguments type-casted with u8/u16/u32/u64 in decimal digits instead of hexadecimal. To minimize compatibility issue, the arguments without type casting are typed by x64 (or x32 for 32bit arch) by default. Note: all arguments set by old perf probe without types are shown in decimal by default. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Naohiro Aota <naohiro.aota@hgst.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/147151076135.12957.14684546093034343894.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 17:06:38 -03:00
Masami Hiramatsu	9254378725	perf probe: Support hexadecimal casting Support hexadecimal unsigned integer casting by 'x'. This allows user to explicitly specify the output format of the probe arguments as hexadecimal. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Naohiro Aota <naohiro.aota@hgst.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/147151072679.12957.4458656416765710753.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 17:06:37 -03:00
Masami Hiramatsu	17ce3dc7e5	ftrace: kprobe: uprobe: Add x8/x16/x32/x64 for hexadecimal types Add x8/x16/x32/x64 for hexadecimal type casting to kprobe/uprobe event tracer. These type casts can be used for integer arguments for explicitly showing them in hexadecimal digits in formatted text. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Naohiro Aota <naohiro.aota@hgst.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/147151067029.12957.11591314629326414783.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 15:38:09 -03:00
Arnaldo Carvalho de Melo	fa1f456592	perf report: Allow configuring the default sort order in ~/.perfconfig Allows changing the default sort order from "comm,dso,symbol" to some other default, for instance "sym,dso" may be more fitting for kernel developers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-pm1h5puxua8nsxksd68fjm8r@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-23 15:37:33 -03:00
Naohiro Aota	19f00b0117	perf probe: Support signedness casting The 'perf probe' tool detects a variable's type and use the detected type to add a new probe. Then, kprobes prints its variable in hexadecimal format if the variable is unsigned and prints in decimal if it is signed. We sometimes want to see unsigned variable in decimal format (i.e. sector_t or size_t). In that case, we need to investigate the variable's size manually to specify just signedness. This patch add signedness casting support. By specifying "s" or "u" as a type, perf-probe will investigate variable size as usual and use the specified signedness. E.g. without this: $ perf probe -a 'submit_bio bio->bi_iter.bi_sector' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 $ cat trace_pipe\|head dbench-9692 [003] d..1 971.096633: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x3a3d00 dbench-9692 [003] d..1 971.096685: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x1a3d80 dbench-9692 [003] d..1 971.096687: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x3a3d80 ... // need to investigate the variable size $ perf probe -a 'submit_bio bio->bi_iter.bi_sector:s64' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s64) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 With this: // just use "s" to cast its signedness $ perf probe -v -a 'submit_bio bio->bi_iter.bi_sector:s' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 $ cat trace_pipe\|head dbench-9689 [001] d..1 1212.391237: submit_bio: (submit_bio+0x0/0x140) bi_sector=128 dbench-9689 [001] d..1 1212.391252: submit_bio: (submit_bio+0x0/0x140) bi_sector=131072 dbench-9697 [006] d..1 1212.398611: submit_bio: (submit_bio+0x0/0x140) bi_sector=30208 This commit also update perf-probe.txt to describe "types". Most parts are based on existing documentation: Documentation/trace/kprobetrace.txt Committer note: Testing using 'perf trace': # perf probe -a 'submit_bio bio->bi_iter.bi_sector' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 # trace --no-syscalls --ev probe:submit_bio 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=0xc133c0) 3181.861 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffb8) 3181.881 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffc0) 3184.488 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffc8) <SNIP> 4717.927 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x4dc7a88) 4717.970 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x4dc7880) ^C[root@jouet ~]# Now, using this new feature: [root@jouet ~]# perf probe -a 'submit_bio bio->bi_iter.bi_sector:s' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 [root@jouet ~]# trace --no-syscalls --ev probe:submit_bio 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145704) 0.017 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145712) 0.019 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145720) 2.567 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145728) 5631.919 probe:submit_bio:(ffffffffac3aee00) bi_sector=0) 5631.941 probe:submit_bio:(ffffffffac3aee00) bi_sector=8) 5631.945 probe:submit_bio:(ffffffffac3aee00) bi_sector=16) 5631.948 probe:submit_bio:(ffffffffac3aee00) bi_sector=24) ^C# With callchains: # trace --no-syscalls --ev probe:submit_bio/max-stack=10/ 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662544) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 0.023 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662552) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 0.027 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662560) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 2.593 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662568) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) journal_submit_commit_record+0xa82001ac ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa82012e8 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) ^C# Signed-off-by: Naohiro Aota <naohiro.aota@hgst.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1470710408-23515-1-git-send-email-naohiro.aota@hgst.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-09 10:52:22 -03:00
Brendan Gregg	bcdc09af3e	perf script: Add 'bpf-output' field to usage message This adds the 'bpf-output' field to the perf script usage message, and docs. Signed-off-by: Brendan Gregg <bgregg@netflix.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1470192469-11910-4-git-send-email-bgregg@netflix.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-09 10:46:43 -03:00
Jiri Olsa	b6f35ed774	perf record: Add --sample-cpu option Adding --sample-cpu option to be able to explicitly enable CPU sample type. Currently it's only enable implicitly in case the target is cpu related. It will be useful for following c2c record tool. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1470074555-24889-8-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-08-02 16:33:29 -03:00
Wang Nan	4ea648aec0	perf record: Add --tail-synthesize option When working with overwritable ring buffer there's a inconvenience problem: if perf dumps data after a long period after it starts, non-sample events may lost, which makes following 'perf report' unable to identify proc name and mmap layout. For example: # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output \ dd if=/dev/zero of=/dev/null send SIGUSR2 after dd runs long enough. The resuling perf.data lost correct comm and mmap events: # perf script -i perf.data.2016061522374354 perf 24478 [004] 2581325.601789: raw_syscalls:sys_exit: NR 0 = 512 ^^^^ Should be 'dd' 27b2e8 syscall_slow_exit_work+0xfe2000e3 (/lib/modules/4.6.0-rc3+/build/vmlinux) 203cc7 do_syscall_64+0xfe200117 (/lib/modules/4.6.0-rc3+/build/vmlinux) b18d83 return_from_SYSCALL_64+0xfe200000 (/lib/modules/4.6.0-rc3+/build/vmlinux) 7f47c417edf0 [unknown] ([unknown]) ^^^^^^^^^^^^ Fail to unwind This patch provides a '--tail-synthesize' option, allows perf to collect system status when finalizing output file. In resuling output file, the non-sample events reflect system status when dumping data. After this patch: # perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \ dd if=/dev/zero of=/dev/null # perf script -i perf.data.2016061600544998 dd 27364 [004] 2583244.994464: raw_syscalls:sys_enter: NR 1 (1, ... ^^ Correct comm 203a18 syscall_trace_enter_phase2+0xfe2001a8 ([kernel.kallsyms]) 203aa5 syscall_trace_enter+0xfe200055 ([kernel.kallsyms]) 203caa do_syscall_64+0xfe2000fa ([kernel.kallsyms]) b18d83 return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms]) d8e50 __GI___libc_write+0xffff01d9639f4010 (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so) ^^^^^ Correct unwind This option doesn't aim to solve this problem completely. If a process terminates before SIGUSR2, we still lost its COMM and MMAP events. For example, we can't unwind correctly from the final perf.data we get from the previous example, because when perf collects the final output file (when we press C-c), 'dd' has been terminated so its '/proc/<pid>/mmap' becomes empty. However, this is a cheaper choice. To completely solve this problem we need to continously output non-sample events. To satisify the requirement of daemonization, we need to merge them periodically. It is possible but requires much more code and cycles. Automatically select --tail-synthesize when --overwrite is provided. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-16-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-15 17:27:52 -03:00
Wang Nan	626a6b784e	perf tools: Enable overwrite settings This patch allows following config terms and option: Globally setting events to overwrite; # perf record --overwrite ... Set specific events to be overwrite or no-overwrite. # perf record --event cycles/overwrite/ ... # perf record --event cycles/no-overwrite/ ... Add missing config terms and update the config term array size because the longest string length has changed. For overwritable events, it automatically selects attr.write_backward since perf requires it to be backward for reading. Test result: # perf record --overwrite -e syscalls:enter_nanosleep usleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ] # perf evlist -v syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP\|TID\|TIME\|CPU\|PERIOD\|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1468485287-33422-14-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-15 17:27:51 -03:00
Masami Hiramatsu	7e9fca51fb	perf probe: Support a special SDT probe format Support a special SDT probe format which can omit the '%' prefix only if the SDT group name starts with "sdt_". So, for example both of "%sdt_libc:setjump" and "sdt_libc:setjump" are acceptable for perf probe --add. E.g. without this: # perf probe -a sdt_libc:setjmp Semantic error :There is non-digit char in line number. ... With this: # perf probe -a sdt_libc:setjmp Added new event: sdt_libc:setjmp (on %setjmp in /usr/lib64/libc-2.20.so) You can now use it in all perf tools, such as: perf record -e sdt_libc:setjmp -aR sleep 1 Suggested-by: Brendan Gregg <brendan.d.gregg@gmail.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146831794674.17065.13359473252168740430.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-13 23:09:09 -03:00
Masami Hiramatsu	36a009fe07	perf probe: Accept %sdt and %cached event name To improve usability, support %[PROVIDER:]SDTEVENT format to add new probes on SDT and cached events. e.g. ---- # perf probe -x /lib/libc-2.17.so %lll_lock_wait_private Added new event: sdt_libc:lll_lock_wait_private (on %lll_lock_wait_private in /usr/lib/libc-2.17.so) You can now use it in all perf tools, such as: perf record -e sdt_libc:lll_lock_wait_private -aR sleep 1 # perf probe -l \| more sdt_libc:lll_lock_wait_private (on __lll_lock_wait_private+21 in /usr/lib/libc-2.17.so) ---- Note that this is not only for SDT events, but also normal events with event-name. e.g. define "myevent" on cache (-n doesn't add the real probe) ---- # perf probe -x ./perf --cache -n --add 'myevent=dso__load $params' ---- Reuse the "myevent" from cache as below. ---- # perf probe -x ./perf %myevent ---- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146831788372.17065.3645054540325909346.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-13 23:09:05 -03:00
Arnaldo Carvalho de Melo	175b968b81	perf report: Introduce --stdio-color to setup the color output mode selection 'perf report --stdio' will colorize entries with most hits and possibly some other aspects of its output, but those colors gets suppressed if we redirect the output to a non-tty, allow keeping the colors by adding a new option, --stdio-color, now this use case will also output escape sequences for colors: $ perf annotate --stdio-color \| more Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3iuawqjldu4i8gziot7e3d5n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-12 00:00:39 -03:00
Arnaldo Carvalho de Melo	53fe4ba1da	perf annotate: Introduce --stdio-color to setup the color output mode selection 'perf annotate --stdio' will colorize entries with most hits and possibly some other aspects of its output, but those colors gets suppressed if we redirect the output to a non-tty, allow keeping the colors by adding a new option, --stdio-color, now this use case will also output escape sequences for colors: $ perf annotate --stdio-color \| more Based-on-a-patch-by: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-sjrnixani5pg6qez640gaxhf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-12 00:00:39 -03:00
Chris Phlipot	3d0376113e	perf tools: Update android build documentation Update the android build documentation according to recent android build fixes. The instructions for step 1a and step 2 were updated to work with NDK version 11(oldest supported version) and NDK version 12(current version). Signed-off-by: Chris Phlipot <cphlipot0@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1467349955-1135-5-git-send-email-cphlipot0@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-04 20:27:27 -03:00
Masami Hiramatsu	6430a94ead	perf buildid-cache: Scan and import user SDT events to probe cache perf buildid-cache --add <binary> scans given binary and add the SDT events to probe cache. "sdt_" prefix is appended for all SDT providers to avoid event-name clash with other pre-defined events. It is possible to use the cached SDT events as other cached events, via perf probe --add "sdt_<provider>:<event>=<event>". e.g. ---- # perf buildid-cache --add /lib/libc-2.17.so # perf probe --cache --list \| head -n 5 /usr/lib/libc-2.17.so (a6fb821bdf53660eb2c29f778757aef294d3d392): sdt_libc:setjmp=setjmp sdt_libc:longjmp=longjmp sdt_libc:longjmp_target=longjmp_target sdt_libc:memory_heap_new=memory_heap_new # perf probe -x /usr/lib/libc-2.17.so \ -a sdt_libc:memory_heap_new=memory_heap_new Added new event: sdt_libc:memory_heap_new (on memory_heap_new in /usr/lib/libc-2.17.so) You can now use it in all perf tools, such as: perf record -e sdt_libc:memory_heap_new -aR sleep 1 # perf probe -l sdt_libc:memory_heap_new (on new_heap+183 in /usr/lib/libc-2.17.so) ---- Note that SDT event entries in probe-cache file is somewhat different from normal cached events. Normal one starts with "#", but SDTs are starting with "%". Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736025058.27797.13043265488541434502.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-04 19:39:00 -03:00
Masami Hiramatsu	8d993d9690	perf probe: Add group name support Allow user to set group name for adding new event. Note that user must ensure that the group name doesn't conflict with existing group name carefully. E.g. Existing group name can conflict with other events. Especially, using the group name reserved for kernel modules can hide kernel embedded events when loading modules. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736024091.27797.9471545190066268995.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-04 19:39:00 -03:00
Masami Hiramatsu	4a0f65c102	perf probe: Remove caches when --cache is given 'perf probe --del' removes caches when '--cache' is given. Note that the delete pattern is not the same as for normal events. If you cached probes with event name, --del "eventname" works as expected. However, if you skipped it, the cached probes doesn't have actual event name. In that case --del "probe-desc" is required (wildcard is acceptable). For example a cache entry has the probe-desc "vfs_read $params", you can remove it with --del 'vfs_read'. ----- # perf probe --cache --list /[kernel.kallsyms] (1466a0a250b5d0070c6d0f03c5fed30b237970a1): vfs_read $params /usr/lib64/libc-2.17.so (c31ffe7942bfd77b2fca8f9bd5709d387a86d3bc): getaddrinfo $params # perf probe --cache --del vfs_read\ Removed cached event: probe:vfs_read # perf probe --cache --list /[kernel.kallsyms] (1466a0a250b5d0070c6d0f03c5fed30b237970a1): /usr/lib64/libc-2.17.so (c31ffe7942bfd77b2fca8f9bd5709d387a86d3bc): getaddrinfo $params ----- Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736021651.27797.10250879847070772920.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-01 11:34:57 -03:00
Masami Hiramatsu	1f3736c9c8	perf probe: Show all cached probes perf probe --list shows all cached probes when --cache is given. Each caches are shown with on which binary that probed. E.g.: ----- # perf probe --cache vfs_read \$params # perf probe --cache -x /lib64/libc-2.17.so getaddrinfo \$params # perf probe --cache --list [kernel.kallsyms] (1466a0a250b5d0070c6d0f03c5fed30b237970a1): vfs_read $params /usr/lib64/libc-2.17.so (c31ffe7942bfd77b2fca8f9bd5709d387a86d3bc): getaddrinfo $params ----- Note that $params requires debuginfo. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/146736020674.27797.13488316780383460180.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-07-01 11:34:57 -03:00
Jiri Olsa	7fa9b8fba0	perf test: Add -F/--dont-fork option Adding -F/--dont-fork option to bypass forking for each test. It's useful for debugging test. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Nilay Vaish <nilayvaish@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1467113345-12669-1-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-30 18:27:45 -03:00
Andi Kleen	d4897e1935	perf tools: Add documentation for perf.data on disk format Add some documentation for the on disk format of perf.data. This is not documenting the actual perf events -- which are documented in perf_event.h -- but just the additional headers that perf record adds around them when writing the data to disk. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1466800885-12974-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-29 10:07:23 -03:00
Wang Nan	9e1a7ea19f	perf data ctf: Add '--all' option for 'perf data convert' After this patch, 'perf data convert' convert comm events to output CTF stream. Result: # perf record -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.378 MB perf.data (73 samples) ] # perf data convert --to-ctf ./out.ctf [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ] [ perf data convert: Converted and wrote 0.003 MB (73 samples) ] # babeltrace --clock-seconds ./out.ctf/ [10627.402515791] (+?.?????????) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } [10627.402518972] (+0.000003181) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } ... // only sample event is converted # perf data convert --all --to-ctf ./out.ctf [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ] [ perf data convert: Converted and wrote 0.023 MB (73 samples, 384 non-samples) ] # babeltrace --clock-seconds ./out.ctf/ [ 0.000000000] (+?.?????????) perf_comm: { cpu_id = 0 }, { pid = 1, tid = 1, comm = "init" } [ 0.000000000] (+0.000000000) perf_comm: { cpu_id = 0 }, { pid = 2, tid = 2, comm = "kthreadd" } [ 0.000000000] (+0.000000000) perf_comm: { cpu_id = 0 }, { pid = 3, tid = 3, comm = "ksoftirqd/0" } ... // comm events are converted [10627.402515791] (+10627.402515791) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } [10627.402518972] (+0.000003181) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81065AF4, perf_tid = 0, perf_pid = 0, perf_period = 1 } ... // samples are also converted Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1466767332-114472-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-28 10:54:57 -03:00
Adrian Hunter	e216708d98	perf script: Add callindent option Based on patches from Andi Kleen. When printing PT instruction traces with perf script it is rather useful to see some indentation for the call tree. This patch adds a new callindent field to perf script that prints spaces for the function call stack depth. We already have code to track the function call stack for PT, that we can reuse with minor modifications. The resulting output is not quite as nice as ftrace yet, but a lot better than what was there before. Note there are some corner cases when the thread stack gets code confused and prints incorrect indentation. Even with that it is fairly useful. When displaying kernel code traces it is recommended to run as root, as otherwise perf doesn't understand the kernel addresses properly, and may not reset the call stack correctly on kernel boundaries. Example output: sudo perf-with-kcore record eg2 -a -e intel_pt// -- sleep 1 sudo perf-with-kcore script eg2 --ns -F callindent,time,comm,pid,sym,ip,addr,flags,cpu --itrace=cre \| less ... swapper 0 [000] 5830.389116586: call irq_exit ffffffff8104d620 smp_call_function_single_interrupt+0x30 => ffffffff8107e720 irq_exit swapper 0 [000] 5830.389116586: call idle_cpu ffffffff8107e769 irq_exit+0x49 => ffffffff810a3970 idle_cpu swapper 0 [000] 5830.389116586: return idle_cpu ffffffff810a39b7 idle_cpu+0x47 => ffffffff8107e76e irq_exit swapper 0 [000] 5830.389116586: call tick_nohz_irq_exit ffffffff8107e7bd irq_exit+0x9d => ffffffff810f2fc0 tick_nohz_irq_exit swapper 0 [000] 5830.389116919: call __tick_nohz_idle_enter ffffffff810f2fe0 tick_nohz_irq_exit+0x20 => ffffffff810f28d0 __tick_nohz_idle_enter swapper 0 [000] 5830.389116919: call ktime_get ffffffff810f28f1 __tick_nohz_idle_enter+0x21 => ffffffff810e9ec0 ktime_get swapper 0 [000] 5830.389116919: call read_tsc ffffffff810e9ef6 ktime_get+0x36 => ffffffff81035070 read_tsc swapper 0 [000] 5830.389116919: return read_tsc ffffffff81035084 read_tsc+0x14 => ffffffff810e9efc ktime_get swapper 0 [000] 5830.389116919: return ktime_get ffffffff810e9f46 ktime_get+0x86 => ffffffff810f28f6 __tick_nohz_idle_enter swapper 0 [000] 5830.389116919: call sched_clock_idle_sleep_event ffffffff810f290b __tick_nohz_idle_enter+0x3b => ffffffff810a7380 sched_clock_idle_sleep_event swapper 0 [000] 5830.389116919: call sched_clock_cpu ffffffff810a738b sched_clock_idle_sleep_event+0xb => ffffffff810a72e0 sched_clock_cpu swapper 0 [000] 5830.389116919: call sched_clock ffffffff810a734d sched_clock_cpu+0x6d => ffffffff81035750 sched_clock swapper 0 [000] 5830.389116919: call native_sched_clock ffffffff81035754 sched_clock+0x4 => ffffffff81035640 native_sched_clock swapper 0 [000] 5830.389116919: return native_sched_clock ffffffff8103568c native_sched_clock+0x4c => ffffffff81035759 sched_clock swapper 0 [000] 5830.389116919: return sched_clock ffffffff8103575c sched_clock+0xc => ffffffff810a7352 sched_clock_cpu swapper 0 [000] 5830.389116919: return sched_clock_cpu ffffffff810a7356 sched_clock_cpu+0x76 => ffffffff810a7390 sched_clock_idle_sleep_event swapper 0 [000] 5830.389116919: return sched_clock_idle_sleep_event ffffffff810a7391 sched_clock_idle_sleep_event+0x11 => ffffffff810f2910 __tick_nohz_idle_enter ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1466689258-28493-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-23 17:04:26 -03:00
Adrian Hunter	055cd33d93	perf script: Print sample flags more nicely The flags field is synthesized and may have a value when Instruction Trace decoding. The flags are "bcrosyiABEx" which stand for branch, call, return, conditional, system, asynchronous, interrupt, transaction abort, trace begin, trace end, and in transaction, respectively. Change the display so that known combinations of flags are printed more nicely e.g.: "call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b", "int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs", "async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB", "tr end" for "bE". However the "x" flag will be displayed separately in those cases e.g. "jcc (x)" for a condition branch within a transaction. Example: perf record -e intel_pt//u ls perf script --ns -F comm,cpu,pid,tid,time,ip,addr,sym,dso,symoff,flags ... ls 3689/3689 [001] 2062.020965237: jcc 7f06a958847a _dl_sysdep_start+0xfa (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9588450 _dl_sysdep_start+0xd0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965237: jmp 7f06a9588461 _dl_sysdep_start+0xe1 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95885a0 _dl_sysdep_start+0x220 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965237: jmp 7f06a95885a4 _dl_sysdep_start+0x224 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9588470 _dl_sysdep_start+0xf0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965904: call 7f06a95884c3 _dl_sysdep_start+0x143 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a9589140 brk+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020965904: syscall 7f06a958914a brk+0xa (/lib/x86_64-linux-gnu/ld-2.19.so) => 0 [unknown] ([unknown]) ls 3689/3689 [001] 2062.020966237: tr strt 0 [unknown] ([unknown]) => 7f06a958914c brk+0xc (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: return 7f06a9589165 brk+0x25 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95884c8 _dl_sysdep_start+0x148 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: jcc 7f06a95884d7 _dl_sysdep_start+0x157 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: call 7f06a95885f0 _dl_sysdep_start+0x270 (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a958ac50 strlen+0x0 (/lib/x86_64-linux-gnu/ld-2.19.so) ls 3689/3689 [001] 2062.020966237: jcc 7f06a958ac6e strlen+0x1e (/lib/x86_64-linux-gnu/ld-2.19.so) => 7f06a958ac60 strlen+0x10 (/lib/x86_64-linux-gnu/ld-2.19.so) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1466689258-28493-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-23 16:36:59 -03:00
Wang Nan	0aab21363f	perf record: Add --dry-run option to check cmdline options With '--dry-run', 'perf record' doesn't do reall recording. Combine with llvm.dump-obj option, --dry-run can be used to help compile BPF objects for embedded platform. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1466064161-48553-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-21 13:18:35 -03:00
Adrian Hunter	cbb0bba9f3	perf script: Fix documentation of '-f' when it should be '-F' The documentation for perf script mixes up '-f' and '-F'. Fix it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/None Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-21 13:18:33 -03:00
Masami Hiramatsu	2fd457a345	perf probe: Add --cache option to cache the probe definitions Add --cache option to cache the probe definitions. This just saves the result of the dwarf analysis to probe cache. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20160615032840.31330.44412.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-15 14:34:42 -03:00
Jiri Olsa	b0d745b3c3	perf mem: Add --ldlat option Adding --ldlat option to specify desired latency for loads event. Specify 50 as loads event latency: $ perf mem record -e ldlat-loads -v --ldlat 50 true calling: record -W -d -e cpu/mem-loads,ldlat=50/P true Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1465928361-2442-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-15 10:35:27 -03:00
Andi Kleen	44b1e60ab5	perf stat: Basic support for TopDown in perf stat Add basic plumbing for TopDown in perf stat TopDown is intended to replace the frontend cycles idle/ backend cycles idle metrics in standard perf stat output. These metrics are not reliable in many workloads, due to out of order effects. This implements a new --topdown mode in perf stat (similar to --transaction) that measures the pipe line bottlenecks using standardized formulas. The measurement can be all done with 5 counters (one fixed counter) The result are four metrics: FrontendBound, BackendBound, BadSpeculation, Retiring that describe the CPU pipeline behavior on a high level. The full top down methology has many hierarchical metrics. This implementation only supports level 1 which can be collected without multiplexing. A full implementation of top down on top of perf is available in pmu-tools toplev. (http://github.com/andikleen/pmu-tools) The current version works on Intel Core CPUs starting with Sandy Bridge, and Atom CPUs starting with Silvermont. In principle the generic metrics should be also implementable on other out of order CPUs. TopDown level 1 uses a set of abstracted metrics which are generic to out of order CPU cores (although some CPUs may not implement all of them): topdown-total-slots Available slots in the pipeline topdown-slots-issued Slots issued into the pipeline topdown-slots-retired Slots successfully retired topdown-fetch-bubbles Pipeline gaps in the frontend topdown-recovery-bubbles Pipeline gaps during recovery from misspeculation These metrics then allow to compute four useful metrics: FrontendBound, BackendBound, Retiring, BadSpeculation. Add a new --topdown options to enable events. When --topdown is specified set up events for all topdown events supported by the kernel. Add topdown-* as a special case to the event parser, as is needed for all events containing -. The actual code to compute the metrics is in follow-on patches. v2: Use standard sysctl read function. v3: Move x86 specific code to arch/ v4: Enable --metric-only implicitly for topdown. v5: Add --single-thread option to not force per core mode v6: Fix output order of topdown metrics v7: Allow combining with -d v8: Remove --single-thread again v9: Rename functions, adding arch_ and topdown_. v10: Expand man page and describe TopDown better Paste intro into commit description. Print error when malloc fails. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1464119559-17203-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-06-06 17:04:15 -03:00
Andi Kleen	508be0dfe6	perf report: Add srcline_from/to branch sort keys Add "srcline_from" and "srcline_to" branch sort keys that allow to show the source lines of a branch. That makes it much easier to track down where particular branches happen in the program, for example to examine branch mispredictions, or to associate it with cycle counts: % perf record -b -e cycles:p ./tcall % perf report --sort srcline_from,srcline_to,mispredict ... 15.10% tcall.c:18 tcall.c:10 N 14.83% tcall.c:11 tcall.c:5 N 14.12% tcall.c:7 tcall.c:12 N 14.04% tcall.c:12 tcall.c:5 N 12.42% tcall.c:17 tcall.c:18 N 12.39% tcall.c:7 tcall.c:13 N 12.27% tcall.c:13 tcall.c:17 N ... % perf report --sort srcline_from,srcline_to,cycles ... 17.12% tcall.c:18 tcall.c:11 1 17.01% tcall.c:12 tcall.c:6 1 16.98% tcall.c:11 tcall.c:6 1 15.91% tcall.c:17 tcall.c:18 1 6.38% tcall.c:7 tcall.c:17 7 4.80% tcall.c:7 tcall.c:12 8 4.21% tcall.c:7 tcall.c:17 8 2.67% tcall.c:7 tcall.c:12 7 2.62% tcall.c:7 tcall.c:12 10 2.10% tcall.c:7 tcall.c:17 9 1.58% tcall.c:7 tcall.c:12 6 1.44% tcall.c:7 tcall.c:12 5 1.38% tcall.c:7 tcall.c:12 9 1.06% tcall.c:7 tcall.c:17 13 1.05% tcall.c:7 tcall.c:12 4 1.01% tcall.c:7 tcall.c:17 6 Open issues: - Some kernel symbols get misresolved. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-23 11:25:16 -03:00
Arnaldo Carvalho de Melo	fe176085a4	perf tools: Fix usage of max_stack sysctl We cannot limit processing stacks from the current value of the sysctl, as we may be processing perf.data files, possibly from other machines. Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can be overriden using --max-stack or equivalent. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Fixes: `4cb93446c5` ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack") Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-05-20 11:43:56 -03:00
Wang Nan	0c1d46a879	perf record: Disable buildid cache options by default in switch output mode The cost of buildid cache processing is high: reading all events in output perf.data, opening each elf file to read buildids then copying them into ~/.debug directory. In switch output mode, these heavy works block perf from receiving perf events for too long. Enable no-buildid and no-buildid-cache by default if --switch-output is provided. Still allow user use --no-no-buildid to explicitly enable buildid in this case. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-6-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Updated man page ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-28 09:58:59 -03:00
Wang Nan	eca857ab38	perf record: Force enable --timestamp-filename when --switch-output is provided Without this patch, the last output doesn't have timestamp appended if --timestamp-filename is not explicitly provided. For example: # perf record -a --switch-output & [1] 11224 # kill -s SIGUSR2 11224 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622372823 ] # fg perf record -a --switch-output ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ] # ls -l total 836 -rw------- 1 root root 33256 Dec 26 22:37 perf.data <---- Odd -rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-5-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Updated man page, that also got an entry for --timestamp-filename ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-28 09:58:59 -03:00
Wang Nan	3c1cb7e372	perf record: Split output into multiple files via '--switch-output' Allow 'perf record' to split its output into multiple files. For example: # ~/perf record -a --timestamp-filename --switch-output & [1] 10763 # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622314468 ] # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] # [ perf record: Dump perf.data.2015122622314762 ] # kill -s SIGUSR2 10763 [ perf record: dump data: Woken up 1 times ] #[ perf record: Dump perf.data.2015122622315171 ] # fg perf record -a --timestamp-filename --switch-output ^C[ perf record: Woken up 1 times to write data ] [ perf record: Dump perf.data.2015122622315513 ] [ perf record: Captured and wrote 0.014 MB perf.data.<timestamp> (296 samples) ] # ls -l total 920 -rw------- 1 root root 797692 Dec 26 22:31 perf.data.2015122622314468 -rw------- 1 root root 59960 Dec 26 22:31 perf.data.2015122622314762 -rw------- 1 root root 59912 Dec 26 22:31 perf.data.2015122622315171 -rw------- 1 root root 19220 Dec 26 22:31 perf.data.2015122622315513 Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1461178794-40467-4-git-send-email-wangnan0@huawei.com Signed-off-by: He Kuang <hekuang@huawei.com> [ Added man page entry, used the re-synthesize patch in this series as a fixup ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-28 09:58:59 -03:00
Arnaldo Carvalho de Melo	4cb93446c5	perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack There is an upper limit to what tooling considers a valid callchain, and it was tied to the hardcoded value in the kernel, PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl, make it read it and use that as the upper limit, falling back to PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2016-04-27 10:29:07 -03:00

1 2 3 4 5 ...

538 Commits