linux/tools/perf/arch
Ian Rogers d98079c05b perf evlist: Keep topdown counters in weak group
On Intel Icelake, topdown events must always be grouped with a slots
event as leader. When a metric is parsed a weak group is formed and
retried if perf_event_open fails. The retried events aren't grouped
breaking the slots leader requirement. This change modifies the weak
group "reset" behavior so that topdown events aren't broken from the
group for the retry.

  $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

   Performance counter stats for 'system wide':

    47,867,188,483      slots                                                         (92.27%)
   <not supported>      topdown-bad-spec
   <not supported>      topdown-be-bound
   <not supported>      topdown-fe-bound
   <not supported>      topdown-retiring
     2,173,346,937      branch-instructions                                           (92.27%)
        10,540,253      branch-misses             #    0.48% of all branches          (92.29%)
        96,291,140      bus-cycles                                                    (92.29%)
         6,214,202      cache-misses              #   20.120 % of all cache refs      (92.29%)
        30,886,082      cache-references                                              (76.91%)
    11,773,726,641      cpu-cycles                                                    (84.62%)
    11,807,585,307      instructions              #    1.00  insn per cycle           (92.31%)
                 0      mem-loads                                                     (92.32%)
     2,212,928,573      mem-stores                                                    (84.69%)
    10,024,403,118      ref-cycles                                                    (92.35%)
        16,232,978      baclears.any                                                  (92.35%)
        23,832,633      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       0.981070734 seconds time elapsed

After:

  $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

   Performance counter stats for 'system wide':

       31040189283      slots                                                         (92.27%)
        8997514811      topdown-bad-spec          #     28.2% bad speculation         (92.27%)
       10997536028      topdown-be-bound          #     34.5% backend bound           (92.27%)
        4778060526      topdown-fe-bound          #     15.0% frontend bound          (92.27%)
        7086628768      topdown-retiring          #     22.2% retiring                (92.27%)
        1417611942      branch-instructions                                           (92.26%)
           5285529      branch-misses             #    0.37% of all branches          (92.28%)
          62922469      bus-cycles                                                    (92.29%)
           1440708      cache-misses              #    8.292 % of all cache refs      (92.30%)
          17374098      cache-references                                              (76.94%)
        8040889520      cpu-cycles                                                    (84.63%)
        7709992319      instructions              #    0.96  insn per cycle           (92.32%)
                 0      mem-loads                                                     (92.32%)
        1515669558      mem-stores                                                    (84.68%)
        6542411177      ref-cycles                                                    (92.35%)
           4154149      baclears.any                                                  (92.35%)
          20556152      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       1.010799593 seconds time elapsed

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220517052724.283874-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17 12:01:18 -03:00
..
alpha perf tools: Add empty Build files for architectures lacking them 2015-08-06 16:24:15 -03:00
arc/annotate perf annotate: Introduce basic support for ARC 2018-12-17 14:59:42 -03:00
arm perf auxtrace: Record whether an auxtrace mmap is needed 2022-05-10 14:27:19 -03:00
arm64 perf auxtrace: Record whether an auxtrace mmap is needed 2022-05-10 14:27:19 -03:00
csky perf arch: Support register names from all archs 2021-12-16 12:18:12 -03:00
mips tools headers UAPI: Sync files changed by new set_mempolicy_home_node syscall 2022-01-20 11:20:37 -03:00
parisc perf tools: Add empty Build files for architectures lacking them 2015-08-06 16:24:15 -03:00
powerpc perf symbol: Remove arch__symbols__fixup_end() 2022-04-28 10:51:40 -03:00
riscv perf jitdump: Add riscv64 support 2022-04-11 16:37:26 -03:00
riscv64/annotate perf annotate: Add riscv64 support 2021-09-27 09:33:44 -03:00
s390 perf auxtrace: Record whether an auxtrace mmap is needed 2022-05-10 14:27:19 -03:00
sh treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
sparc treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
x86 perf evlist: Keep topdown counters in weak group 2022-05-17 12:01:18 -03:00
xtensa treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
Build perf tools: Rename build libperf to perf 2019-02-14 15:18:08 -03:00
common.c perf debug: Remove needless include directives from debug.h 2019-08-31 19:10:19 -03:00
common.h perf env: Remove env.h from other headers where just a fwd decl is needed 2019-08-31 19:10:40 -03:00