linux/tools/perf
Kan Liang 6a80d794d7 perf stat: New metricgroup output for the default mode
In the default mode, the current output of the metricgroup include both
events and metrics, which is not necessary and just makes the output
hard to read. Since different ARCHs (even different generations in the
same ARCH) may use different events. The output also vary on different
platforms.

For a metricgroup, only outputting the value of each metric is good
enough.

Add a new field default_metricgroup in evsel to indicate an event of the
default metricgroup. For those events, printout() should print the
metricgroup name rather than each event.

Add perf_stat__skip_metric_event() to skip the evsel in the Default
metricgroup, if it's not running or not the metric event.

Add print_metricgroup_header_t to pass the functions which print the
display name of each metricgroup in the Default metricgroup. Support all
three output methods.

Factor out perf_stat__print_shadow_stats_metricgroup() to print out each
metrics.

On SPR:

Before:

 ./perf_old stat sleep 1

 Performance counter stats for 'sleep 1':

              0.54 msec task-clock:u                     #    0.001 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
                68      page-faults:u                    #  125.445 K/sec
           540,970      cycles:u                         #    0.998 GHz
           556,325      instructions:u                   #    1.03  insn per cycle
           123,602      branches:u                       #  228.018 M/sec
             6,889      branch-misses:u                  #    5.57% of all branches
         3,245,820      TOPDOWN.SLOTS:u                  #     18.4 %  tma_backend_bound
                                                  #     17.2 %  tma_retiring
                                                  #     23.1 %  tma_bad_speculation
                                                  #     41.4 %  tma_frontend_bound
           564,859      topdown-retiring:u
         1,370,999      topdown-fe-bound:u
           603,271      topdown-be-bound:u
           744,874      topdown-bad-spec:u
            12,661      INT_MISC.UOP_DROPPING:u          #   23.357 M/sec

       1.001798215 seconds time elapsed

       0.000193000 seconds user
       0.001700000 seconds sys

After:

$ ./perf stat sleep 1

 Performance counter stats for 'sleep 1':

              0.51 msec task-clock:u                     #    0.001 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
                68      page-faults:u                    #  132.683 K/sec
           545,228      cycles:u                         #    1.064 GHz
           555,509      instructions:u                   #    1.02  insn per cycle
           123,574      branches:u                       #  241.120 M/sec
             6,957      branch-misses:u                  #    5.63% of all branches
                        TopdownL1                 #     17.5 %  tma_backend_bound
                                                  #     22.6 %  tma_bad_speculation
                                                  #     42.7 %  tma_frontend_bound
                                                  #     17.1 %  tma_retiring
                        TopdownL2                 #     21.8 %  tma_branch_mispredicts
                                                  #     11.5 %  tma_core_bound
                                                  #     13.4 %  tma_fetch_bandwidth
                                                  #     29.3 %  tma_fetch_latency
                                                  #      2.7 %  tma_heavy_operations
                                                  #     14.5 %  tma_light_operations
                                                  #      0.8 %  tma_machine_clears
                                                  #      6.1 %  tma_memory_bound

       1.001712086 seconds time elapsed

       0.000151000 seconds user
       0.001618000 seconds sys

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230616031420.3751973-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-16 09:57:19 -03:00
..
arch perf tool x86: Fix perf_env memory leak 2023-06-14 18:19:06 -03:00
bench perf bench sched messaging: Free contexts on exit 2023-06-12 18:18:14 -03:00
dlfilters perf tools: Fix usage of the verbose variable 2022-12-20 15:16:33 -03:00
Documentation perf stat: Document --metric-no-threshold and threshold colors 2023-06-05 16:04:14 -03:00
examples/bpf perf trace: Remove unused bpf map 'syscalls' 2022-11-23 10:30:00 -03:00
include/perf perf bpf: Remove now unused BPF headers 2022-11-04 11:41:48 -03:00
jvmti
pmu-events perf stat,jevents: Introduce Default tags for the default mode 2023-06-15 22:16:48 -03:00
python
scripts perf python scripting: Get rid of unused import in arm-cs-trace-disasm 2023-06-13 23:40:33 -03:00
tests pert tests: Update metric-value for perf stat JSON output 2023-06-15 22:16:59 -03:00
trace perf thread: Add accessor functions for thread 2023-06-12 15:57:53 -03:00
ui perf thread: Add reference count checking 2023-06-12 15:57:53 -03:00
util perf stat: New metricgroup output for the default mode 2023-06-16 09:57:19 -03:00
.gitignore perf jevents: Run metric_test.py at compile-time 2023-02-03 17:11:39 -03:00
Build perf script: Fix Python support when no libtraceevent 2023-03-15 10:27:07 -03:00
builtin-annotate.c perf addr_location: Add init/exit/copy functions 2023-06-12 15:57:53 -03:00
builtin-bench.c perf bench: Add missing setlocale() call to allow usage of %'d style formatting 2023-06-05 10:36:58 -03:00
builtin-buildid-cache.c
builtin-buildid-list.c perf util: Move input_name to util 2023-04-10 19:21:31 -03:00
builtin-c2c.c perf callchain: Use pthread keys for tls callchain_cursor 2023-06-12 15:57:54 -03:00
builtin-config.c perf path: Make mkpath thread safe, remove 16384 bytes from .bss 2023-05-28 10:24:14 -03:00
builtin-daemon.c perf daemon: Dynamically allocate path to perf 2023-05-28 10:23:23 -03:00
builtin-data.c perf util: Move input_name to util 2023-04-10 19:21:31 -03:00
builtin-diff.c perf srcline: Optimize comparision against SRCLINE_UNKNOWN 2023-06-12 18:17:00 -03:00
builtin-evlist.c perf util: Move input_name to util 2023-04-10 19:21:31 -03:00
builtin-ftrace.c perf tools fixes for v6.4: 2nd batch 2023-05-31 15:31:56 -03:00
builtin-help.c perf path: Make mkpath thread safe, remove 16384 bytes from .bss 2023-05-28 10:24:14 -03:00
builtin-inject.c perf inject: Lazily allocate guest_event event_buf 2023-06-12 18:18:14 -03:00
builtin-kallsyms.c perf map: Add helper for ->map_ip() and ->unmap_ip() 2023-04-06 22:10:17 -03:00
builtin-kmem.c perf callchain: Use pthread keys for tls callchain_cursor 2023-06-12 15:57:54 -03:00
builtin-kvm.c perf evsel: Introduce evsel__name_is() method to check if the evsel name is equal to a given string 2023-04-24 14:28:11 -03:00
builtin-kwork.c perf callchain: Use pthread keys for tls callchain_cursor 2023-06-12 15:57:54 -03:00
builtin-list.c perf list: Check arguments to show libpfm4 events 2023-06-12 15:57:53 -03:00
builtin-lock.c perf callchain: Use pthread keys for tls callchain_cursor 2023-06-12 15:57:54 -03:00
builtin-mem.c perf addr_location: Add init/exit/copy functions 2023-06-12 15:57:53 -03:00
builtin-probe.c perf probe: Dynamically allocate params memory 2023-05-28 10:24:02 -03:00
builtin-record.c perf pmus: Remove perf_pmus__has_hybrid 2023-05-27 09:42:38 -03:00
builtin-report.c perf report: Avoid 'parent_thread' thread leak on '--tasks' processing 2023-06-12 15:57:53 -03:00
builtin-sched.c perf sched: Avoid large stack allocations 2023-06-12 18:18:14 -03:00
builtin-script.c perf script: Remove some large stack allocations 2023-06-12 18:18:14 -03:00
builtin-stat.c perf stat: New metricgroup output for the default mode 2023-06-16 09:57:19 -03:00
builtin-timechart.c perf addr_location: Add init/exit/copy functions 2023-06-12 15:57:53 -03:00
builtin-top.c perf top: Add exit routine for main thread 2023-06-12 15:57:54 -03:00
builtin-trace.c perf callchain: Use pthread keys for tls callchain_cursor 2023-06-12 15:57:54 -03:00
builtin-version.c Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL" 2023-05-06 18:07:37 -03:00
builtin.h perf usage: Move usage strings 2023-04-10 19:20:53 -03:00
check-headers.sh tools headers: Make the difference output easier to read 2023-06-09 10:56:40 -03:00
command-list.txt perf help: Use HAVE_LIBTRACEEVENT to filter out unsupported commands 2023-01-02 11:51:53 -03:00
CREDITS
design.txt
Makefile perf tools: Use "grep -E" instead of "egrep" 2022-12-14 15:28:19 -03:00
Makefile.config perf tests: Make x86 new instructions test optional at build time 2023-06-13 23:40:32 -03:00
Makefile.perf perf tests: Make x86 new instructions test optional at build time 2023-06-13 23:40:32 -03:00
MANIFEST tools lib traceevent: Remove libtraceevent 2022-12-14 11:16:12 -03:00
perf-archive.sh
perf-completion.sh perf tools: Fix auto-complete on aarch64 2023-02-08 10:38:10 -03:00
perf-iostat.sh
perf-read-vdso.c
perf-sys.h
perf.c perf util: Move input_name to util 2023-04-10 19:21:31 -03:00
perf.h perf util: Move perf_guest/host declarations 2023-04-10 19:22:05 -03:00