mirror of
https://github.com/torvalds/linux.git
synced 2024-12-14 23:25:54 +00:00
91cea6be90
8 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Jin Yao
|
e69dc84282 |
perf stat: Support --cputype option for hybrid events
In previous patch, we have supported the syntax which enables the event on a specified pmu, such as: cpu_core/<event>/ cpu_atom/<event>/ While this syntax is not very easy for applying on a set of events or applying on a group. In following example, we have to explicitly assign the pmu prefix. # ./perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' -- sleep 1 Performance counter stats for 'sleep 1': 1,158,545 cpu_core/cycles/ 1,003,113 cpu_core/instructions/ 1.002428712 seconds time elapsed A much easier way is: # ./perf stat --cputype core -e '{cycles,instructions}' -- sleep 1 Performance counter stats for 'sleep 1': 1,101,071 cpu_core/cycles/ 939,892 cpu_core/instructions/ 1.002363142 seconds time elapsed For this example, the '--cputype' enables the events from specified pmu (cpu_core). If '--cputype' conflicts with pmu prefix, '--cputype' is ignored. # ./perf stat --cputype core -e cycles,cpu_atom/instructions/ -a -- sleep 1 Performance counter stats for 'system wide': 21,003,407 cpu_core/cycles/ 367,886 cpu_atom/instructions/ 1.002203520 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210909062215.10278-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
2b62b3a611 |
perf parse-events: Add new "metric-id" term
Add a new "metric-id" term to events so that metric parsing can set an ID that can be reliably looked up. Metric parsing currently will turn a metric like "instructions/cycles" into a parse events string of "{instructions,cycles}:W". However, parse-events may change "instructions" into "instructions:u" if perf_event_paranoid=2. When this happens expr__resolve_id currently fails as stat-shadow adds the ID "instructions:u" to match with the counter value and the metric tries to look up the ID just "instructions". A later patch will use the new term. An example of the current problem: $ echo -1 > /proc/sys/kernel/perf_event_paranoid $ perf stat -M IPC /bin/true Performance counter stats for '/bin/true': 1,217,161 inst_retired.any # 0.97 IPC 1,250,389 cpu_clk_unhalted.thread 0.002064773 seconds time elapsed 0.002378000 seconds user 0.000000000 seconds sys $ echo 2 > /proc/sys/kernel/perf_event_paranoid $ perf stat -M IPC /bin/true Performance counter stats for '/bin/true': 150,298 inst_retired.any:u # nan IPC 187,095 cpu_clk_unhalted.thread:u 0.002042731 seconds time elapsed 0.000000000 seconds user 0.002377000 seconds sys Note: nan IPC is printed as an effect of "perf metric: Use NAN for missing event IDs." but earlier versions of perf just fail with a parse error and display no value. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Changbin Du <changbin.du@intel.com> Cc: Denys Zagorui <dzagorui@cisco.com> Cc: Fabian Hemmer <copy@copy.sh> Cc: Felix Fietkau <nbd@nbd.name> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joakim Zhang <qiangqing.zhang@nxp.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nicholas Fraser <nfraser@codeweavers.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: ShihCheng Tu <mrtoastcheng@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Wan Jiabing <wanjiabing@vivo.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20211015172132.1162559-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
8e8bbfb311 |
perf parse-events: Add const to evsel name
The evsel name is strdup-ed before assignment and so can be const. A later change will add another similar string. Using const makes it clearer that these are not out arguments. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Changbin Du <changbin.du@intel.com> Cc: Denys Zagorui <dzagorui@cisco.com> Cc: Fabian Hemmer <copy@copy.sh> Cc: Felix Fietkau <nbd@nbd.name> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joakim Zhang <qiangqing.zhang@nxp.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kees Kook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nicholas Fraser <nfraser@codeweavers.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: ShihCheng Tu <mrtoastcheng@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Wan Jiabing <wanjiabing@vivo.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20211015172132.1162559-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
99fc5941b8 |
perf tools: Fix hybrid config terms list corruption
A config terms list was spliced twice, resulting in a never-ending loop when the list was traversed. Fix by using list_splice_init() and copying and freeing the lists as necessary. This patch also depends on patch "perf tools: Factor out copy_config_terms() and free_config_terms()" Example on ADL: Before: # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname & # jobs [1]+ Running perf record -e "{intel_pt//,cycles/aux-sample-size=4096/pp}" uname # perf top -E 10 PerfTop: 4071 irqs/sec kernel: 6.9% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, 24 CPUs) --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 97.60% perf [.] __evsel__get_config_term 0.25% [kernel] [k] kallsyms_expand_symbol.constprop.13 0.24% perf [.] kallsyms__parse 0.15% [kernel] [k] _raw_spin_lock 0.14% [kernel] [k] number 0.13% [kernel] [k] advance_transaction 0.08% [kernel] [k] format_decode 0.08% perf [.] map__process_kallsym_symbol 0.08% perf [.] rb_insert_color 0.08% [kernel] [k] vsnprintf exiting. # kill %1 After: # perf record -e '{intel_pt//,cycles/aux-sample-size=4096/pp}' uname & Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.060 MB perf.data ] # perf script | head perf-exec 604 [001] 1827.312293: psb: psb offs: 0 ffffffffb8415e87 pt_config_start+0x37 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a3bd event_sched_in.isra.133+0xfd ([kernel.kallsyms]) => ffffffffb856a9a0 perf_pmu_nop_void+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856b10e merge_sched_in+0x26e ([kernel.kallsyms]) => ffffffffb856a2c0 event_sched_in.isra.133+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a45d event_sched_in.isra.133+0x19d ([kernel.kallsyms]) => ffffffffb8568b80 perf_event_set_state.part.61+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8568b86 perf_event_set_state.part.61+0x6 ([kernel.kallsyms]) => ffffffffb85662a0 perf_event_update_time+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a35c event_sched_in.isra.133+0x9c ([kernel.kallsyms]) => ffffffffb8567610 perf_log_itrace_start+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb856a377 event_sched_in.isra.133+0xb7 ([kernel.kallsyms]) => ffffffffb8403b40 x86_pmu_add+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8403b86 x86_pmu_add+0x46 ([kernel.kallsyms]) => ffffffffb8403940 collect_events+0x0 ([kernel.kallsyms]) perf-exec 604 1827.312293: 1 branches: ffffffffb8403a7b collect_events+0x13b ([kernel.kallsyms]) => ffffffffb8402cd0 collect_event+0x0 ([kernel.kallsyms]) Fixes: |
||
Jin Yao
|
c93afadc92 |
perf parse-events: Compare with hybrid pmu name
On hybrid platform, user may want to enable event only on one pmu. Following syntax will be supported: cpu_core/<event>/ cpu_atom/<event>/ For hardware event, hardware cache event and raw event, two events are created by default. We pass the specified pmu name in parse_state and it would be checked before event creation. So next only the event with the specified pmu would be created. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-12-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
94da591b1c |
perf parse-events: Create two hybrid raw events
On hybrid platform, same raw event is possible to be available on both cpu_core pmu and cpu_atom pmu. It's supported to create two raw events for one event encoding. For raw events, the attr.type is PMU type. # perf stat -e r3c -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 4 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 4 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: type 8 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 8 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 r3c: 0: 434449 1001412521 1001412521 r3c: 1: 173162 1001482031 1001482031 r3c: 2: 231710 1001524974 1001524974 r3c: 3: 110012 1001563523 1001563523 r3c: 4: 191517 1001593221 1001593221 r3c: 5: 956458 1001628147 1001628147 r3c: 6: 416969 1001715626 1001715626 r3c: 7: 1047527 1001596650 1001596650 r3c: 8: 103877 1001633520 1001633520 r3c: 9: 70571 1001637898 1001637898 r3c: 10: 550284 1001714398 1001714398 r3c: 11: 1257274 1001738349 1001738349 r3c: 12: 107797 1001801432 1001801432 r3c: 13: 67471 1001836281 1001836281 r3c: 14: 286782 1001923161 1001923161 r3c: 15: 815509 1001952550 1001952550 r3c: 0: 95994 1002071117 1002071117 r3c: 1: 105570 1002142438 1002142438 r3c: 2: 115921 1002189147 1002189147 r3c: 3: 72747 1002238133 1002238133 r3c: 4: 103519 1002276753 1002276753 r3c: 5: 121382 1002315131 1002315131 r3c: 6: 80298 1002248050 1002248050 r3c: 7: 466790 1002278221 1002278221 r3c: 6821369 16026754282 16026754282 r3c: 1162221 8017758990 8017758990 Performance counter stats for 'system wide': 6,821,369 cpu_core/r3c/ 1,162,221 cpu_atom/r3c/ 1.002289965 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-11-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
30def61f64 |
perf parse-events: Create two hybrid cache events
For cache events, they have pre-defined configs. The kernel needs to know where the cache event comes from (e.g. from cpu_core pmu or from cpu_atom pmu). But the perf type PERF_TYPE_HW_CACHE can't carry pmu information. Now the type PERF_TYPE_HW_CACHE is extended to be PMU aware type. The PMU type ID is stored at attr.config[63:32]. When enabling a hybrid cache event without specified pmu, such as, 'perf stat -e LLC-loads -a', two events are created automatically. One is for atom, the other is for core. # perf stat -e LLC-loads -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x400000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x400000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x800000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x800000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 LLC-loads: 0: 1507 1001800280 1001800280 LLC-loads: 1: 666 1001812250 1001812250 LLC-loads: 2: 3353 1001813453 1001813453 LLC-loads: 3: 514 1001848795 1001848795 LLC-loads: 4: 627 1001952832 1001952832 LLC-loads: 5: 4399 1001451154 1001451154 LLC-loads: 6: 1240 1001481052 1001481052 LLC-loads: 7: 478 1001520348 1001520348 LLC-loads: 8: 691 1001551236 1001551236 LLC-loads: 9: 310 1001578945 1001578945 LLC-loads: 10: 1018 1001594354 1001594354 LLC-loads: 11: 3656 1001622355 1001622355 LLC-loads: 12: 882 1001661416 1001661416 LLC-loads: 13: 506 1001693963 1001693963 LLC-loads: 14: 3547 1001721013 1001721013 LLC-loads: 15: 1399 1001734818 1001734818 LLC-loads: 0: 1314 1001793826 1001793826 LLC-loads: 1: 2857 1001752764 1001752764 LLC-loads: 2: 646 1001830694 1001830694 LLC-loads: 3: 1612 1001864861 1001864861 LLC-loads: 4: 2244 1001912381 1001912381 LLC-loads: 5: 1255 1001943889 1001943889 LLC-loads: 6: 4624 1002021109 1002021109 LLC-loads: 7: 2703 1001959302 1001959302 LLC-loads: 24793 16026838264 16026838264 LLC-loads: 17255 8015078826 8015078826 Performance counter stats for 'system wide': 24,793 cpu_core/LLC-loads/ 17,255 cpu_atom/LLC-loads/ 1.001970988 seconds time elapsed 0x4 in 0x400000002 indicates the cpu_core pmu. 0x8 in 0x800000002 indicates the cpu_atom pmu. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jin Yao
|
9cbfa2f64c |
perf parse-events: Create two hybrid hardware events
Current hardware events has special perf types PERF_TYPE_HARDWARE.
But it doesn't pass the PMU type in the user interface. For a hybrid
system, the perf kernel doesn't know which PMU the events belong to.
So now this type is extended to be PMU aware type. The PMU type ID
is stored at attr.config[63:32].
PMU type ID is retrieved from sysfs.
root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
8
root@lkp-adl-d01:/sys/devices/cpu_core# cat type
4
When enabling a hybrid hardware event without specified pmu, such as,
'perf stat -e cycles -a', two events are created automatically. One
is for atom, the other is for core.
# perf stat -e cycles -a -vv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
...
------------------------------------------------------------
perf_event_attr:
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
------------------------------------------------------------
perf_event_attr:
size 120
config 0x800000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
------------------------------------------------------------
...
------------------------------------------------------------
perf_event_attr:
size 120
config 0x800000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
cycles: 0: 836272 1001525722 1001525722
cycles: 1: 628564 1001580453 1001580453
cycles: 2: 872693 1001605997 1001605997
cycles: 3: 70417 1001641369 1001641369
cycles: 4: 88593 1001726722 1001726722
cycles: 5: 470495 1001752993 1001752993
cycles: 6: 484733 1001840440 1001840440
cycles: 7:
|