When '--sort' is not set, 'perf mem report" will print a null pointer as
the output value of sort order, so fix it.
Example:
Before this patch:
$ perf mem report
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 18 of event 'cpu/mem-loads/pp'
# Total weight : 188
# Sort order : (null)
#
...
After this patch:
$ perf mem report
# To display the perf.data header info, please use --header/--header-only options.
#
# Samples: 18 of event 'cpu/mem-loads/pp'
# Total weight : 188
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
...
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1427082605-12881-1-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since hist_entry__delete() nowadays doesn't actually frees anything that
may be in use by the annotation code.
Eventually we will solve this for good by reference counting struct
symbol.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-uldtgljymtrkns0knpiso5op@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding support to add field(s) to default field order via using the '+'
prefix, like for report:
$ perf report
Samples: 10 of event 'cycles', Event count (approx.): 4463799
Overhead Command Shared Object Symbol
32.40% ls [kernel.kallsyms] [k] filemap_fault
28.19% ls [kernel.kallsyms] [k] get_page_from_freelist
23.38% ls [kernel.kallsyms] [k] enqueue_entity
15.04% ls [kernel.kallsyms] [k] mmap_region
$ perf report -F +period,sample
Samples: 10 of event 'cycles', Event count (approx.): 4463799
Overhead Period Samples Command Shared Object Symbol
32.40% 1446493 1 ls [kernel.kallsyms] [k] filemap_fault
28.19% 1258486 1 ls [kernel.kallsyms] [k] get_page_from_freelist
23.38% 1043754 1 ls [kernel.kallsyms] [k] enqueue_entity
15.04% 671160 1 ls [kernel.kallsyms] [k] mmap_region
Works in general for commands using --field option.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jean Pihet <jean.pihet@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1408715919-25990-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In perf's 'mem-mode', one can get access to a whole bunch of details specific to a
particular sample instruction. A bunch of those details relate to the data
address.
One interesting thing you can do with data addresses is to convert them into a unique
cacheline they belong too. Organizing these data cachelines into similar groups and sorting
them can reveal cache contention.
This patch creates an alogorithm based on various sample details that can help group
entries together into data cachelines and allows 'perf report' to sort on it.
The algorithm relies on having proper mmap2 support in the kernel to help determine
if the memory map the data address belongs to is private to a pid or globally shared.
The alogortithm is as follows:
o group cpumodes together
o group entries with discovered maps together
o sort on major, minor, inode and inode generation numbers
o if userspace anon, then sort on pid
o sort on cachelines based on data addresses
The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'.
Sample output:
#
# Samples: 206 of event 'cpu/mem-loads/pp'
# Total weight : 2534
# Sort order : dcacheline,pid
#
# Overhead Samples Data Cacheline Command: Pid
# ........ ............ ...................................................................... ..................
#
13.22% 1 [k] 0xffff88042f08ebc0 swapper: 0
9.27% 1 [k] 0xffff88082e8cea80 swapper: 0
3.59% 2 [k] 0xffffffff819ba180 swapper: 0
0.32% 1 [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0 swapper: 0
0.32% 1 [k] timekeeper_seq+0xfffffffffffffff8 swapper: 0
Note: Added a '+1' to symlen size in hists__calc_col_len to prevent the next column
from prematurely tabbing over and mis-aligning. Not sure what the problem is.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
After output/sort fields refactoring, it's expensive
to check the elide bool in its current location inside
the 'struct sort_entry'.
The perf_hpp__should_skip function gets highly noticable in
workloads with high number of output/sort fields, like for:
$ perf report -i perf-test.data -F overhead,sample,period,comm,pid,dso,symbol,cpu --stdio
Performance report:
9.70% perf [.] perf_hpp__should_skip
Moving the elide bool into the 'struct perf_hpp_fmt', which
makes the perf_hpp__should_skip just single struct read.
Got speedup of around 22% for my test perf.data workload.
The change should not harm any other workload types.
Performance counter stats for (10 runs):
before:
358,319,732,626 cycles ( +- 0.55% )
467,129,581,515 instructions # 1.30 insns per cycle ( +- 0.00% )
150.943975206 seconds time elapsed ( +- 0.62% )
now:
278,785,972,990 cycles ( +- 0.12% )
370,146,797,640 instructions # 1.33 insns per cycle ( +- 0.00% )
116.416670507 seconds time elapsed ( +- 0.31% )
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20140601142622.GA9131@krava.brq.redhat.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
If -g cumulative option is given, it needs to show entries which don't
have self overhead. So apply percent-limit to accumulated overhead
percentage in this case.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-14-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Maintain accumulated stat information in hist_entry->stat_acc if
symbol_conf.cumulate_callchain is set. Fields in ->stat_acc have same
vaules initially, and will be updated as callchain is processed later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Currently, what the sort_entry does is just identifying hist entries
so that they can be grouped properly. However, with -F option
support, it indeed needs to sort entries appropriately to be shown to
users. So add ->sort() member to do it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-13-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The -F/--fields option is to allow user setup output field in any
order. It can receive any sort keys and following (hpp) fields:
overhead, overhead_sys, overhead_us, sample and period
If guest profiling is enabled, overhead_guest_{sys,us} will be
available too.
The output fields also affect sort order unless you give -s/--sort
option. And any keys specified on -s option, will also be added to
the output field list automatically.
$ perf report -F sym,sample,overhead
...
# Symbol Samples Overhead
# .......................... ............ ........
#
[.] __cxa_atexit 2 2.50%
[.] __libc_csu_init 4 5.00%
[.] __new_exitfn 3 3.75%
[.] _dl_check_map_versions 1 1.25%
[.] _dl_name_match_p 4 5.00%
[.] _dl_setup_hash 1 1.25%
[.] _dl_sysdep_start 1 1.25%
[.] _init 5 6.25%
[.] _setjmp 6 7.50%
[.] a 8 10.00%
[.] b 8 10.00%
[.] brk 1 1.25%
[.] c 8 10.00%
Note that, the example output above is captured after applying next
patch which fixes sort/comparing behavior.
Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-12-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The perf uses different default sort orders for different use-cases,
and this was scattered throughout the code. Add get_default_sort_
order() function to handle this and change initial value of sort_order
to NULL to distinguish it from user-given one.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1400480762-22852-10-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
There is no point in sort.h including itself.
The include was added when the file was created, in commit "perf tools:
Create util/sort.and use it" (dd68ada2d) and added a include to "sort.h"
in lot of files (all the files that started using the file). It was
probably added by mistake on sort.h too.
Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383776454-10595-1-git-send-email-rodrigo@sdfg.com.ar
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They convey no information, perhaps I was bitten by some snake at some
point, complete the detox by naming the last of those arguments more
sensibly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-u1r0dnjoro08dgztiy2g3t2q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
At insert time, a hist entry should reference comm at the time otherwise
it'll get the last comm anyway.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-n6pykiiymtgmcjs834go2t8x@git.kernel.org
[ Fixed up const pointer issues ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for recording and displaying the transaction flags.
They are essentially a new sort key. Also display them
in a nice way to the user.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Extend the perf branch sorting code to support sorting by in_tx
or abort_tx qualifiers. Also print out those qualifiers.
This also fixes up some of the existing sort key documentation.
We do not support no_tx here, because it's simply not showing
the in_tx flag.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This is a partial revert of Namhyung's patch
afab87b91f
perf sort: Separate out memory-specific sort keys
He wrote
For global/local weights, I'm not entirely sure to place them into the
memory dimension. But it's the only user at this time.
Well TSX is another (in fact the original) user of the flags, and it
needs them to be common. So move local/global weight back to the common
sort keys.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Link: http://lkml.kernel.org/r/1374188333-17899-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It does not make sense to make some computation (ratio, wdiff), when the
hist_entry is 'dummy' - added via hists__link.
Adding dummy field to struct hist_entry which indicates that it was
added by hists__link and avoiding some of the processing for such
entries.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-g8bxml0n0pnqsrpyd98p0ird@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For example, in an application with an expensive function implemented
with deeply nested recursive calls, the default call-graph presentation
is dominated by the different callchains within that function. By
ignoring these callees, we can collect the callchains leading into the
function and compactly identify what to blame for expensive calls.
For example, in this report the callers of garbage_collect() are
scattered across the tree:
$ perf report -d ruby 2>- | grep -m10 ^[^#]*[a-z]
22.03% ruby [.] gc_mark
--- gc_mark
|--59.40%-- mark_keyvalue
| st_foreach
| gc_mark_children
| |--99.75%-- rb_gc_mark
| | rb_vm_mark
| | gc_mark_children
| | gc_marks
| | |--99.00%-- garbage_collect
If we ignore the callees of garbage_collect(), its callers are coalesced:
$ perf report --ignore-callees garbage_collect -d ruby 2>- | grep -m10 ^[^#]*[a-z]
72.92% ruby [.] garbage_collect
--- garbage_collect
vm_xmalloc
|--47.08%-- ruby_xmalloc
| st_insert2
| rb_hash_aset
| |--98.45%-- features_index_add
| | rb_provide_feature
| | rb_require_safe
| | vm_call_method
Signed-off-by: Greg Price <price@mit.edu>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130623031720.GW22203@biohazard-cafe.mit.edu
Link: http://lkml.kernel.org/r/20130708115746.GO22203@biohazard-cafe.mit.edu
Cc: Fengguang Wu <fengguang.wu@intel.com>
[ remove spaces at beginning of line, reported by Fengguang Wu ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current logic is to attach pair to the leader hist_entry.
Arguments of hist_entry__add_pair function were placed the other way
round.. driving me crazy.
I.e. list_add_tail expects (new_node, head).
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1355404152-16523-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The same code was duplicate to places, factor them out to common
sort__setup_elide().
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364991979-3008-11-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since they're used only for perf mem, separate out them to a different
dimension so that normal user cannot access them by any chance.
For global/local weights, I'm not entirely sure to place them into the
memory dimension. But it's the only user at this time.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364991979-3008-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's used for determining current sort mode which can be one of
NORMAL, BRANCH and new MEMORY.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1364816125-12212-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds the sorting and histogram support
functions to enable profiling of memory accesses.
The following sorting orders are added:
- symbol_daddr: data address symbol (or raw address)
- dso_daddr: data address shared object
- locked: access uses locked transaction
- tlb : TLB access
- mem : memory level of the access (L1, L2, L3, RAM, ...)
- snoop: access snoop mode
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed, the move of methods to
machine.[ch], and the rename of dsrc to data_src, to match the change
made in the PERF_SAMPLE_DSRC in a previous patch. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record has a new option -W that enables weightened sampling.
Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows to both compare relative cost per event
and the total cost over the measurement period.
Add the necessary glue to perf report, record and the library.
v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.
Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the setup_sorting() is called for parsing sort keys and exits
if it failed to add the sort key. As it's included in libperf it'd be
better returning an error code rather than exiting application inside of
the library.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1360130237-9963-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Current perf report gets segmentation fault when a branch stack specific
sort key is provided by --sort option to a perf.data file which contains
no branch infomation. It's because those sort keys reference branch
info of a hist entry unconditionally. Maybe we can change it checks
whether such branch info is valid or not. But if the branch stacks are
not recorded, it'd be nop. Thus it'd be better to make those keys are
unselectable.
This patch separates those keys to a different dimension array, so that
if user passes such a key to a file which has no branch stack will get
following message rather than a segfault.
Error: Invalid --sort key: `symbol_from'
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reported-by: Stefan Beller <stefanbeller@googlemail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1356599507-14226-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Removing displacement from struct hist_entry_diff, because it's not
used. Displacement is not used for sorting, so there's no reason to
pre-calculate it.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1354110769-2998-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a misplaced underscore. In this case, 'hist_entry' is the name of
data structure and we usually put double underscores between data
structure and actual function name.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-8jdq8g6kl6v54hkexrfwsy72@git.kernel.org
[ committer note: put it in front of the patch queue where it came from ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We want to match more than two hists, so that we can match more than two
perf.data files and moreover, match hist_entries (buckets) in multiple
events in a group.
So the "baseline"/"leader" will instead of a ->pair pointer, use a
list_head, that will link to the pairs and hists__match use it.
Following that perf_evlist__link will link the hists in its evsel
groups.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-2kbmzepoi544ygj9godseqpv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding 'wdiff' as new computation way to compare hist entries.
If specified the 'Weighted diff' column is displayed with value 'd'
computed as:
d = B->period * WEIGHT-A - A->period * WEIGHT-B
- A/B being matching hist entry from first/second file specified
(or perf.data/perf.data.old) respectively.
- period being the hist entry period value
- WEIGHT-A/WEIGHT-B being user suplied weights in the the '-c' option
behind ':' separator like '-c wdiff:1,2'.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1349448287-18919-5-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The struct he_stat is for separating out statistics data of a hist
entry. It is required for later changes.
It's just a mechanical change and should have no functional differences.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arun Sharma <asharma@fb.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving the position calculation into the diff command, so the position
as prepared inside struct hist_entry data and there's no need to compute
in the output display path.
Removing 'displacement' from struct perf_hpp as it is no longer needed.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-3-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding pointer back to the parent struct hists for struct hists_entry.
This will be useful in future for any hist_entry's data computation,
that depends on total data of its parent hists.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1349354994-17853-2-git-send-email-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sort__has_sym variable is for checking whether the sort_list
includes 'symbol' as a sort key. It will be used for later patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1347611729-16994-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The repsep_snprintf function was still using standalone field_sep, which
not even set anymore.
Replacing it with 'symbol_conf.field_sep'.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1346946426-13496-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just let it there till the user exits the annotation browser.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-nmaxuzreqhm5k10t2co5sk9a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By using a mutex just for inserting and rotating two hist_entry rb
trees, so that when sorting we can get the last batch of entries created
from the ring buffer, merge it with whatever we have processed so far
and show the output while new entries are being added.
The 'report' tool continues, for now, to do it without threading, but
will use this in the future to allow visualization of results in long
perf.data sessions while the entries are being processed.
The new 'top' tool will be the first user.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9b05atsn0q6m7fqgrug8fk2i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These are probably some old leftovers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
These don't need to be globally visible.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
In order to implement callchains collapsing, we need to keep
track of the maximum depth in a histogram tree of callchains.
This way we'll avoid allocating an arbitrary temporary buffer
size on callchain merge time.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Hellwig <hch@infradead.org>
The stock newt checkbox tree widget we were using was not really
suitable for hist entry + callchain browsing.
The problems with it were manifold:
- We needed to traverse the whole hist_entry rb_tree to add each entry +
callchains beforehand.
- No control over the colors used for each row
So a new tree widget, based mostly on slang, was written.
It extends the ui_browser class already used for annotate to allow the
user to fold/unfold branches in the callchains tree, using extra fields
in the symbol_map class that is embedded in hist_entry and
callchain_node instances to store the folding state and when changing
this state calculates the number of rows that are produced when showing
a particular hist_entry instance.
This greatly speeds up browsing as we don't have to upfront touch all
the entries and only calculate callchain related operations when some
callchain branch is actually unfolded.
The memory footprint is also reduced as the data structure is not
duplicated, just some extra fields for controling callchain state and to
simplify the process of seeking thru entries (nr_rows, row_offset) were
added.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They were globals, and since we support multiple hists and sessions
at the same time, it doesn't make sense to calculate those values
considereing all symbols in all sessions.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In a shared multi-core environment, users want to analyze why their
program was slow. In particular, if the code ran slower only on certain
CPUs due to interference from other programs or kernel threads, the user
should be able to notice that.
Sample usage:
perf record -f -a -- sleep 3
perf report --sort cpu,comm
Workload:
program is running on 16 CPUs
Experiencing interference from an antagonist only on 4 CPUs.
Samples: 106218177676 cycles
Overhead CPU Command
........ ... ...............
6.25% 2 program
6.24% 6 program
6.24% 11 program
6.24% 5 program
6.24% 9 program
6.24% 10 program
6.23% 15 program
6.23% 7 program
6.23% 3 program
6.23% 14 program
6.22% 1 program
6.20% 13 program
3.17% 12 program
3.15% 8 program
3.14% 0 program
3.13% 4 program
3.11% 4 antagonist
3.11% 0 antagonist
3.10% 8 antagonist
3.07% 12 antagonist
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100505181612.GA5091@sharma-home.net>
Signed-off-by: Arun Sharma <aruns@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>