linux/tools/perf
Chris Phlipot 9919a65ec5 perf callchain: Fix incorrect ordering of entries
The existing implementation of thread__resolve_callchain, under certain
circumstances, can assemble callchain entries in the incorrect order.

The callchain entries are resolved incorrectly for a sample when all of
the following conditions are met:

1. callchain_param.order is set to ORDER_CALLER

2. thread__resolve_callchain_sample is able to resolve callchain entries
   for the sample.

3. unwind__get_entries is also able to resolve callchain entries for the
   sample.

The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called when
callchain_param.order is set to ORDER_CALLER.

Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.

How to Reproduce the Existing Bug:

Modifying perf script to print call trees in the opposite order or
applying the remaining patches from this series and comparing the
results output from export-to-postgtresql.py are the easiest ways to see
the bug, however it can still be seen in current builds using perf
report.

Here is how i can reproduce the bug using perf report:

  # perf record --call-graph=dwarf stress -c 1 -t 5

when i run this command:

  # perf report --call-graph=flat,0,0,callee

This callchain, containing kernel (handle_irq_event, etc) and userspace
samples (__libc_start_main, etc) is contained in the output, which looks
correct (callee order):

                gen8_irq_handler
                handle_irq_event_percpu
                handle_irq_event
                handle_edge_irq
                handle_irq
                do_IRQ
                ret_from_intr
                __random
                rand
                0x558f2a04dded
                0x558f2a04c774
                __libc_start_main
                0x558f2a04dcd9

Now run this command using caller order:

  # perf report --call-graph=flat,0,0,caller

It is expected to see the exact reverse of the above when using caller
order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
bottom) in the output, but it is nowhere to be found.

instead you see this:

                ret_from_intr
                do_IRQ
                handle_irq
                handle_edge_irq
                handle_irq_event
                handle_irq_event_percpu
                gen8_irq_handler
                0x558f2a04dcd9
                __libc_start_main
                0x558f2a04c774
                0x558f2a04dded
                rand
                __random

Notice how internally the kernel symbols are reversed and the user space
symbols are reversed, but the kernel symbols still appear above the user
space symbols.

if this patch is applied and perf script is re-run, you will see the
expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
at the bottom):

                0x558f2a04dcd9
                __libc_start_main
                0x558f2a04c774
                0x558f2a04dded
                rand
                __random
                ret_from_intr
                do_IRQ
                handle_irq
                handle_edge_irq
                handle_irq_event
                handle_irq_event_percpu
                gen8_irq_handler

Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-05-06 08:59:47 -03:00
..
arch perf symbols: Fix kallsyms perf test on ppc64le 2016-05-05 21:04:03 -03:00
bench perf bench: Remove one more die() call 2016-04-26 13:28:40 -03:00
config perf tools: Build syscall table .c header from kernel's syscall_64.tbl 2016-04-08 09:58:14 -03:00
Documentation perf record: Disable buildid cache options by default in switch output mode 2016-04-28 09:58:59 -03:00
jvmti perf jit: Add support for using TSC as a timestamp 2016-04-01 18:42:55 -03:00
python perf python: Support the PERF_RECORD_SWITCH event 2015-10-07 19:41:50 -03:00
scripts perf script: Fix postgresql ubuntu install instructions 2016-04-19 12:36:54 -03:00
tests perf hists: Move sort__need_collapse into struct perf_hpp_list 2016-05-05 21:03:58 -03:00
trace perf trace: Move msg_flags beautifier to tools/perf/trace/beauty/ 2016-04-28 09:58:59 -03:00
ui perf hists: Move sort__has_comm into struct perf_hpp_list 2016-05-05 21:04:02 -03:00
util perf callchain: Fix incorrect ordering of entries 2016-05-06 08:59:47 -03:00
.gitignore perf tools: Add Intel PT instruction decoder 2015-08-17 11:11:36 -03:00
Build perf tools: Set and pass DOCDIR to builtin-report.c 2016-01-12 12:42:07 -03:00
builtin-annotate.c perf machine: Rename perf_event__preprocess_sample to machine__resolve 2016-03-23 12:03:08 -03:00
builtin-bench.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-buildid-cache.c perf tools: Move timestamp creation to util 2016-01-29 17:30:06 -03:00
builtin-buildid-list.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-config.c perf config: Make show_config() use perf_config_set 2016-04-14 09:15:47 -03:00
builtin-data.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-diff.c perf hists: Move sort__need_collapse into struct perf_hpp_list 2016-05-05 21:03:58 -03:00
builtin-evlist.c perf evlist: Add --trace-fields option to show trace fields 2016-01-08 14:23:02 -03:00
builtin-help.c perf help: Use asprintf instead of adhoc equivalents 2016-03-23 16:36:07 -03:00
builtin-inject.c perf tools: Add time conversion event 2016-03-31 10:52:24 -03:00
builtin-kmem.c perf callchain: Start moving away from global per thread cursors 2016-04-14 14:48:07 -03:00
builtin-kvm.c perf evsel: Do not use globals in config() 2016-04-11 22:18:20 -03:00
builtin-list.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-lock.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-mem.c perf mem: Add -U/-K (--all-user/--all-kernel) options 2016-03-30 11:14:07 -03:00
builtin-probe.c perf subcmd: Create subcmd library 2015-12-17 14:27:14 -03:00
builtin-record.c perf record: Generate tracking events for process forked by perf 2016-04-28 09:58:59 -03:00
builtin-report.c perf hists: Move sort__has_parent into struct perf_hpp_list 2016-05-05 21:03:59 -03:00
builtin-sched.c perf sched map: Display only given cpus 2016-04-13 10:11:52 -03:00
builtin-script.c perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack 2016-04-27 10:29:07 -03:00
builtin-stat.c perf stat: Add --metric-only support for -A 2016-03-10 16:50:47 -03:00
builtin-timechart.c perf machine: Rename perf_event__preprocess_sample to machine__resolve 2016-03-23 12:03:08 -03:00
builtin-top.c perf hists: Move sort__has_socket into struct perf_hpp_list 2016-05-05 21:04:01 -03:00
builtin-trace.c perf trace: Do not print raw args list for syscalls with no args 2016-05-06 08:44:30 -03:00
builtin-version.c perf tools: Move cmd_version() to builtin-version.c 2015-12-09 13:42:03 -03:00
builtin.h perf tools: Remove needless 'extern' from function prototypes 2016-03-23 15:06:35 -03:00
command-list.txt perf tools: Do not show trace command if it's not compiled in 2016-01-08 12:46:17 -03:00
CREDITS
design.txt
Makefile perf build tests: Do parallell builds with 'build-test' 2016-02-04 15:57:00 -03:00
Makefile.perf perf tools: Build syscall table .c header from kernel's syscall_64.tbl 2016-04-08 09:58:14 -03:00
MANIFEST perf bench: Fix detached tarball building due to missing 'perf bench memcpy' headers 2016-03-24 12:28:57 -03:00
perf-archive.sh
perf-completion.sh perf tools: Avoid confusion with preloaded bash function for perf bash completion 2015-03-19 13:53:27 -03:00
perf-read-vdso.c
perf-sys.h perf tools: Move generic barriers out of perf-sys.h 2015-05-08 16:05:08 -03:00
perf-with-kcore.sh perf tools: Fix perf-with-kcore handling of arguments containing spaces 2015-08-06 16:48:27 -03:00
perf.c perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack 2016-04-27 10:29:07 -03:00
perf.h perf tools: Ditch record_opts.callgraph_set 2016-04-18 12:26:27 -03:00