linux

Kan Liang 384b60557b perf tools: Construct LBR call chain

LBR call stack only has user-space callchains. It is output in the
PERF_SAMPLE_BRANCH_STACK data format. For kernel callchains, it's
still in the form of PERF_SAMPLE_CALLCHAIN.

The perf tool has to handle both data sources to construct a
complete callstack.

For the "perf report -D" option, both lbr and fp information will be
displayed.

A new call chain recording option "lbr" is introduced into the perf
tool for LBR call stack. The user can use --call-graph lbr to get
the call stack information from hardware.

Here are some examples.

When profiling bc(1) on Fedora 19:

    echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph lbr bc -l < cmd

If enabling LBR, perf report output looks like:

    50.36%  bc  bc  [.] bc_divide
            |
            --- bc_divide
                execute
                run_code
                yyparse
                main
                __libc_start_main
                _start

    33.66%  bc  bc  [.] _one_mult
            |
            --- _one_mult
                bc_divide
                execute
                run_code
                yyparse
                main
                __libc_start_main
                _start

     7.62%  bc  bc  [.] _bc_do_add
            |
            --- _bc_do_add
               |
               |--99.89%-- 0x2000186a8
                --0.11%-- [...]

     6.83%  bc  bc  [.] _bc_do_sub
            |
            --- _bc_do_sub
               |
               |--99.94%-- bc_add
               |          execute
               |          run_code
               |          yyparse
               |          main
               |          __libc_start_main
               |          _start
                --0.06%-- [...]

     0.46%  bc  libc-2.17.so  [.] __memset_sse2
            |
            --- __memset_sse2
               |
               |--54.13%-- bc_new_num
               |          |
               |          |--51.00%-- bc_divide
               |          |          execute
               |          |          run_code
               |          |          yyparse
               |          |          main
               |          |          __libc_start_main
               |          |          _start
               |          |
               |          |--30.46%-- _bc_do_sub
               |          |          bc_add
               |          |          execute
               |          |          run_code
               |          |          yyparse
               |          |          main
               |          |          __libc_start_main
               |          |          _start
               |          |
               |           --18.55%-- _bc_do_add
               |                     bc_add
               |                     execute
               |                     run_code
               |                     yyparse
               |                     main
               |                     __libc_start_main
               |                     _start
               |
                --45.87%-- bc_divide
                          execute
                          run_code
                          yyparse
                          main
                          __libc_start_main
                          _start

If using FP, perf report output looks like:

    echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph fp bc -l < cmd

    50.49%  bc  bc  [.] bc_divide
            |
            --- bc_divide

    33.57%  bc  bc  [.] _one_mult
            |
            --- _one_mult

     7.61%  bc  bc  [.] _bc_do_add
            |
            --- _bc_do_add
                0x2000186a8

     6.88%  bc  bc  [.] _bc_do_sub
            |
            --- _bc_do_sub

     0.42%  bc  libc-2.17.so  [.] __memcpy_ssse3_back
            |
            --- __memcpy_ssse3_back

If using LBR, perf report -D output looks like:

    3458145275743 0x2fd750 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x2): 9748/9748: 0x408ea8 period: 609644 addr: 0
    ... LBR call chain: nr:8
    .....  0: fffffffffffffe00
    .....  1: 0000000000408e50
    .....  2: 000000000040a458
    .....  3: 000000000040562e
    .....  4: 0000000000408590
    .....  5: 00000000004022c0
    .....  6: 00000000004015dd
    .....  7: 0000003d1cc21b43
    ... FP chain: nr:2
    .....  0: fffffffffffffe00
    .....  1: 0000000000408ea8
     ... thread: bc:9748
     ...... dso: /usr/bin/bc

The LBR call stack has the following known limitations:

 - Zero length calls are not filtered out by the hardware
 - Exception handling such as setjmp/longjmp will have calls/returns not match
 - Pushing different return address onto the stack will have calls/returns not match
 - If callstack is deeper than the LBR, only the last entries are captured

Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1420482185-29830-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

2015-02-18 17:16:18 +01:00
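
The commit message above says the perf tool has to combine two data sources: the kernel part of the sampled callchain (PERF_SAMPLE_CALLCHAIN) and the user part taken from the LBR records (PERF_SAMPLE_BRANCH_STACK). The following is a minimal C sketch of such a merge, not the code from this commit. `struct lbr_entry`, `merge_lbr_chain` and `CONTEXT_USER` are hypothetical names, and the ordering used here (the newest record's call target first, then each record's call site) is an assumption read off the "LBR call chain" dump shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Marker separating kernel from user entries in a sampled callchain.
 * Same value as the fffffffffffffe00 entry in the dump, i.e. (u64)-512. */
#define CONTEXT_USER ((uint64_t)-512)

/* Hypothetical, simplified stand-in for one LBR branch record. */
struct lbr_entry {
	uint64_t from;	/* address of the call instruction */
	uint64_t to;	/* address of the call target */
};

/*
 * Build a callee-first chain in 'out':
 *   1. the sampled callchain up to and including the CONTEXT_USER marker,
 *   2. the target of the newest LBR record (the function being executed),
 *   3. the call site of every LBR record, newest first.
 * Returns the number of entries written, or 0 if 'out' is too small.
 */
static size_t merge_lbr_chain(const uint64_t *fp_ips, size_t fp_nr,
			      const struct lbr_entry *lbr, size_t lbr_nr,
			      uint64_t *out, size_t out_cap)
{
	size_t n = 0, i;

	assert(out != NULL);

	/* Kernel part: copy until the user-context marker has been copied. */
	for (i = 0; i < fp_nr; i++) {
		if (n == out_cap)
			return 0;
		out[n++] = fp_ips[i];
		if (fp_ips[i] == CONTEXT_USER)
			break;
	}

	if (lbr_nr == 0)
		return n;

	/* User part needs lbr_nr call sites plus one call target. */
	if (out_cap - n < lbr_nr + 1)
		return 0;

	out[n++] = lbr[0].to;
	for (i = 0; i < lbr_nr; i++)
		out[n++] = lbr[i].from;

	return n;
}
```

Under these assumptions, an FP chain of two entries and six LBR records give 1 + 1 + 6 = 8 merged entries, which is consistent with the `nr:8` line in the dump.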
include	tools: Remove bitops/hweight usage of bits in tools/perf	2015-01-16 17:49:29 -03:00
scripting-engines	perf tools: Remove EOL whitespaces	2015-01-21 13:24:31 -03:00
abspath.c
alias.c	perf tools: Introduce zfree	2013-12-27 15:17:00 -03:00
annotate.c	perf tools: Remove EOL whitespaces	2015-01-21 13:24:31 -03:00
annotate.h	perf tools: Fix segfault for symbol annotation on TUI	2015-01-16 17:49:29 -03:00
bitmap.c
build-id.c	perf buildid-cache: Remove extra debugdir variables	2014-12-09 09:14:34 -03:00
build-id.h	perf build-id: Move disable_buildid_cache() to util/build-id.c	2014-11-19 12:33:46 -03:00
cache.h	perf tools: Elide strlcpy warning with uclibc	2015-01-16 17:49:29 -03:00
callchain.c	perf tools: Enable LBR call stack support	2015-02-18 17:16:17 +01:00
callchain.h	perf tools: Enable LBR call stack support	2015-02-18 17:16:17 +01:00
cgroup.c	perf evlist: Introduce evlist__for_each() & friends	2014-01-13 10:06:25 -03:00
cgroup.h
cloexec.c	perf util: Replace strerror with strerror_r for thread-safety	2014-08-15 10:58:35 -03:00
cloexec.h	perf tools: Enable close-on-exec flag on perf file descriptor	2014-07-18 09:09:34 +02:00
color.c	perf tools: Remove some unused functions from color.c	2015-01-21 13:24:32 -03:00
color.h	perf tools: Remove some unused functions from color.c	2015-01-21 13:24:32 -03:00
comm.c	perf tools: Identify which comms are from exec	2014-08-13 19:23:08 -03:00
comm.h	perf tools: Add facility to export data in database-friendly way	2014-10-29 10:32:49 -02:00
config.c	perf tools: Add --buildid-dir option to set cache directory	2014-12-09 09:14:35 -03:00
cpumap.c	perf tools: Use cpu/possible instead of cpu/kernel_max	2014-04-22 17:39:16 +02:00
cpumap.h	perf tools: Allow ability to map cpus to nodes easily	2014-04-22 17:39:12 +02:00
ctype.c
data.c	perf util: Replace strerror with strerror_r for thread-safety	2014-08-15 10:58:35 -03:00
data.h	perf tools: Add perf_data_file__write interface	2013-12-02 09:22:46 -03:00
db-export.c	perf tools: Defer export of comms that were not 'set'	2014-11-03 18:11:59 -03:00
db-export.h	perf tools: Defer export of comms that were not 'set'	2014-11-03 18:11:59 -03:00
debug.c	perf tools: Allow to force redirect pr_debug to stderr.	2014-11-24 18:03:48 -03:00
debug.h	perf: Use strerror_r instead of strerror	2014-08-15 10:54:29 -03:00
dso.c	perf symbols: Convert lseek + read to pread	2015-01-29 17:02:01 -03:00
dso.h	perf callchain: Cache eh/debug frame offset for dwarf unwind	2015-01-29 16:20:42 -03:00
dwarf-aux.c	perf probe: Fix perf probe to find correct variable DIE	2014-06-04 14:49:20 +02:00
dwarf-aux.h	perf probe: Fix to find line information for probe list	2013-10-04 15:16:05 -03:00
environment.c
event.c	perf tools: Add id index	2014-10-29 11:24:47 -02:00
event.h	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-12-09 21:18:06 -08:00
evlist.c	tools lib fs: Adopt debugfs open strerrno method	2015-01-22 10:34:22 -03:00
evlist.h	tools lib fs: Adopt debugfs open strerrno method	2015-01-22 10:34:22 -03:00
evsel.c	perf tools: Enable LBR call stack support	2015-02-18 17:16:17 +01:00
evsel.h	perf tools: Construct LBR call chain	2015-02-18 17:16:18 +01:00
exec_cmd.c
exec_cmd.h
find-vdso-map.c	perf tools: Build programs to copy 32-bit compatibility	2014-10-29 10:32:48 -02:00
generate-cmdlist.sh	tools/perf: Standardize feature support define names to: HAVE_{FEATURE}_SUPPORT	2013-10-09 08:48:28 +02:00
header.c	perf header: Set header version correctly	2015-01-29 16:53:11 -03:00
header.h	perf build-id: Move build-id related functions to util/build-id.c	2014-11-05 10:14:07 -03:00
help.c	perf tools: Use zfree to help detect use after free bugs	2013-12-27 17:08:19 -03:00
help.h
hist.c	perf tools: Pass struct perf_hpp_fmt to its callbacks	2015-01-21 13:24:34 -03:00
hist.h	perf tools: Pass struct perf_hpp_fmt to its callbacks	2015-01-21 13:24:34 -03:00
intlist.c	perf util: Add findnew method to intlist	2013-10-14 10:28:48 -03:00
intlist.h	perf util: Add findnew method to intlist	2013-10-14 10:28:48 -03:00
kvm-stat.h	perf kvm stat report: Save pid string in opts.target.pid	2014-09-17 17:08:07 -03:00
levenshtein.c
levenshtein.h
machine.c	perf tools: Construct LBR call chain	2015-02-18 17:16:18 +01:00
machine.h	perf tools: Add facility to export data in database-friendly way	2014-10-29 10:32:49 -02:00
map.c	perf callchain: Make get_srcline fall back to sym+offset	2014-11-24 18:03:47 -03:00
map.h	perf symbols: Introduce 'for' method to iterate over the symbols with a given name	2015-01-21 10:06:15 -03:00
ordered-events.c	perf session: Add option to copy events when queueing	2014-10-15 17:39:03 -03:00
ordered-events.h	perf session: Add option to copy events when queueing	2014-10-15 17:39:03 -03:00
pager.c	perf tools: Add cat as fallback pager	2014-05-21 11:48:33 +02:00
parse-events.c	Merge branch 'perf/hw_breakpoints' into perf/core	2015-01-28 15:48:59 +01:00
parse-events.h	Merge branch 'perf/hw_breakpoints' into perf/core	2015-01-28 15:48:59 +01:00
parse-events.l	perf tools: allow user to specify hardware breakpoint bp_len	2014-12-03 15:14:29 +01:00
parse-events.y	perf tools: allow user to specify hardware breakpoint bp_len	2014-12-03 15:14:29 +01:00
parse-options.c	perf tools: Allow use of an exclusive option more than once	2015-01-21 13:24:33 -03:00
parse-options.h	perf tools: Add support for exclusive option	2014-10-29 10:32:47 -02:00
path.c	tools/perf: Turn strlcpy() into a __weak function	2013-10-09 08:48:49 +02:00
perf_regs.c	perf tools: Cache register accesses for unwind processing	2014-06-12 16:53:19 +02:00
perf_regs.h	perf tools: Cache register accesses for unwind processing	2014-06-12 16:53:19 +02:00
PERF-VERSION-GEN	perf tools: Fix version when building out of tree	2013-11-07 10:40:47 -03:00
pmu.c	perf tools: Extend format_alias() to include event parameters	2015-01-21 13:24:33 -03:00
pmu.h	perf tools: Add snapshot format file parsing	2014-11-24 18:03:51 -03:00
pmu.l
pmu.y	perf tools: Fix build with bison 2.3 and older.	2013-02-14 16:12:34 -03:00
probe-event.c	perf probe: Fix probing kretprobes	2015-01-21 10:06:24 -03:00
probe-event.h	perf probe: Do not access kallsyms when analyzing user binaries	2014-09-17 18:01:14 -03:00
probe-finder.c	perf probe: Fix crash in dwarf_getcfi_elf	2015-01-02 12:44:01 -03:00
probe-finder.h	perf probe: Support distro-style debuginfo for uprobe	2014-02-18 09:38:44 -03:00
pstack.c	perf tools: Move pr_* debug macros into debug object	2014-07-17 12:58:39 -03:00
pstack.h	perf tools: Finish the removal of 'self' arguments	2013-11-05 15:32:36 -03:00
python-ext-sources	tools: Remove bitops/hweight usage of bits in tools/perf	2015-01-16 17:49:29 -03:00
python.c	perf tools: Remove EOL whitespaces	2015-01-21 13:24:31 -03:00
quote.c
quote.h
rblist.c	perf util: Add findnew method to intlist	2013-10-14 10:28:48 -03:00
rblist.h	perf util: Add findnew method to intlist	2013-10-14 10:28:48 -03:00
record.c	perf tools: Use sysctl__read_int instead of ad-hoc copies	2014-12-11 17:53:04 -03:00
run-command.c	perf util: Replace strerror with strerror_r for thread-safety	2014-08-15 10:58:35 -03:00
run-command.h
session.c	perf tools: Construct LBR call chain	2015-02-18 17:16:18 +01:00
session.h	perf tools: Do not use __perf_session__process_events() directly	2015-01-29 16:36:32 -03:00
setup.py	tools/: Convert to new topic libraries	2013-12-16 16:03:27 -03:00
sigchain.c
sigchain.h
sort.c	perf tools: Pass struct perf_hpp_fmt to its callbacks	2015-01-21 13:24:34 -03:00
sort.h	perf tools: Add +field argument support for --field option	2014-08-24 08:11:19 -03:00
srcline.c	perf: Fix building warning on ARM 32	2014-12-19 13:09:43 +01:00
stat.c	perf stats: Add max and min stats	2013-08-07 17:35:26 -03:00
stat.h	tools: Consolidate types.h	2014-05-01 21:22:39 +02:00
strbuf.c	perf tools: Use zfree to help detect use after free bugs	2013-12-27 17:08:19 -03:00
strbuf.h
strfilter.c	perf tools: Use zfree to help detect use after free bugs	2013-12-27 17:08:19 -03:00
strfilter.h	perf tools: Finish the removal of 'self' arguments	2013-11-05 15:32:36 -03:00
string.c	Revert "perf tools: Default to cpu// for events v5"	2014-10-15 16:04:33 -03:00
strlist.c	perf tools: Fix build error due to zfree() cast	2014-01-15 15:10:04 -03:00
strlist.h	perf tools: Stop using 'self' in strlist	2013-01-25 12:49:28 -03:00
svghelper.c	perf timechart: Implement IO mode	2014-07-10 00:22:54 +02:00
svghelper.h	perf timechart: Implement IO mode	2014-07-10 00:22:54 +02:00
symbol-elf.c	perf symbols: Support to read compressed module from build-id cache	2015-01-29 16:56:54 -03:00
symbol-minimal.c	perf symbols: Fix use after free in filename__read_build_id	2014-12-17 11:58:17 -03:00
symbol.c	perf tools: Remove EOL whitespaces	2015-01-21 13:24:31 -03:00
symbol.h	perf symbols: Introduce method to iterate symbols ordered by name	2015-01-21 10:05:54 -03:00
target.c	perf record: Make per-cpu mmaps the default.	2013-11-27 14:58:36 -03:00
target.h	perf target: Move the checking of which map function to call into function.	2013-12-04 13:46:37 -03:00
thread_map.c	perf thread_map: Create dummy constructor out of open coded equivalent	2014-10-14 17:32:52 -03:00
thread_map.h	perf thread_map: Create dummy constructor out of open coded equivalent	2014-10-14 17:32:52 -03:00
thread-stack.c	perf tools: Enhance the thread stack to output call/return data	2014-11-03 17:43:56 -03:00
thread-stack.h	perf tools: Enhance the thread stack to output call/return data	2014-11-03 17:43:56 -03:00
thread.c	perf tools: Only override the default :tid comm entry	2014-11-19 12:37:26 -03:00
thread.h	perf tools: Add a thread stack for synthesizing call chains	2014-11-03 17:10:59 -03:00
tool.h	perf tools: Add id index	2014-10-29 11:24:47 -02:00
top.c	perf tools: Rename 'perf_record_opts' to 'record_opts	2013-12-19 14:43:45 -03:00
top.h	tools: Consolidate types.h	2014-05-01 21:22:39 +02:00
trace-event-info.c	perf tools: Move pr_* debug macros into debug object	2014-07-17 12:58:39 -03:00
trace-event-parse.c	perf tools: Fix memory leak in event_format__print function	2014-02-18 09:34:47 -03:00
trace-event-read.c	perf tools: Remove needless getopt.h includes	2014-07-17 12:59:00 -03:00
trace-event-scripting.c	perf scripting: Add 'flush' callback to scripting API	2014-08-22 13:12:11 -03:00
trace-event.c	tools lib traceevent: Make plugin unload function receive pevent	2014-01-15 15:10:40 -03:00
trace-event.h	perf scripting: Add 'flush' callback to scripting API	2014-08-22 13:12:11 -03:00
tsc.c	perf tools: Move rdtsc() function	2014-07-23 11:48:11 -03:00
tsc.h	perf tools: Move rdtsc() function	2014-07-23 11:48:11 -03:00
unwind-libdw.c	perf callchains: Use thread->mg->machine	2014-10-29 10:32:46 -02:00
unwind-libdw.h	perf tools: Add libdw DWARF post unwind support	2014-02-24 09:29:36 -03:00
unwind-libunwind.c	perf callchain: Cache eh/debug frame offset for dwarf unwind	2015-01-29 16:20:42 -03:00
unwind.h	perf callchains: Use thread->mg->machine	2014-10-29 10:32:46 -02:00
usage.c
util.c	perf tools: Use sysctl__read_int instead of ad-hoc copies	2014-12-11 17:53:04 -03:00
util.h	perf: Fix building warning on ARM 32	2014-12-19 13:09:43 +01:00
values.c	perf tools: Use zfree to help detect use after free bugs	2013-12-27 17:08:19 -03:00
values.h	tools: Consolidate types.h	2014-05-01 21:22:39 +02:00
vdso.c	perf tools: Do not attempt to run perf-read-vdso32 if it wasn't built	2014-10-29 10:32:48 -02:00
vdso.h	perf tools: Add support for 32-bit compatibility VDSOs	2014-10-29 10:32:48 -02:00
wrapper.c	perf tools: Use __maybe_used for unused variables	2012-09-11 12:19:15 -03:00
xyarray.c
xyarray.h
zlib.c	perf tools: Add gzip decompression support for kernel module	2014-11-05 10:11:26 -03:00