Commit Graph

261 Commits

Author SHA1 Message Date
Joakim Zhang
37b9c7bbe1 perf vendor events arm64: Add JSON metrics for imx8mp DDR Perf
Add JSON metrics for imx8mp DDR Perf.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de> <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-imx@nxp.com
Cc: kernel@pengutronix.de
Link: https://lore.kernel.org/r/20210127105734.12198-5-qiangqing.zhang@nxp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-18 10:09:23 -03:00
Joakim Zhang
3a35093ab5 perf vendor events arm64: Add JSON metrics for imx8mq DDR Perf
Add JSON metrics for imx8mq DDR Perf.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de> <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-imx@nxp.com
Cc: kernel@pengutronix.de
Link: https://lore.kernel.org/r/20210127105734.12198-4-qiangqing.zhang@nxp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-18 10:09:23 -03:00
Joakim Zhang
842ed29895 perf vendor events arm64: Add JSON metrics for imx8mn DDR Perf
Add JSON metrics for imx8mn DDR Perf.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de> <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: kernel@pengutronix.de
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-imx@nxp.com
Link: https://lore.kernel.org/r/20210127105734.12198-3-qiangqing.zhang@nxp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-18 10:09:19 -03:00
Joakim Zhang
84b102f564 perf vendor events arm64: Fix indentation of brackets in imx8mm metrics
Fix indentation of brackets in imx8mm metrics.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-imx@nxp.com
Cc: kernel@pengutronix.de
Link: https://lore.kernel.org/r/20210127105734.12198-2-qiangqing.zhang@nxp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-18 10:09:16 -03:00
John Garry
c3a9cdef61 perf vendor events arm64: Reference common and uarch events for A76
Reduce duplication in the JSONs by referencing standard events from
armv8-common-and-microarch.json

In general the "PublicDescription" fields are not modified when somewhat
significantly worded differently than the standard.

Apart from that, description and names for events slightly different to
standard are changed (to standard) for consistency.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-5-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03 13:10:44 -03:00
John Garry
d02d5dc882 perf vendor events arm64: Reference common and uarch events for Ampere eMag
Reduce duplication in the JSONs by referencing standard events from
armv8-common-and-microarch.json

In general the "PublicDescription" fields are not modified when somewhat
significantly worded differently than the standard.

Apart from that, description and names for events slightly different to
standard are changed (to standard) for consistency.

Note that names for events 0x34 and 0x35 are non-standard and remain
unchanged. Those events came from the following originally:

  4c2479c67b/Documentation/arm64/eMAG-ARM-CoreImpDefined.pdf

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: mathieu.poirier@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03 13:10:44 -03:00
John Garry
c77669662f perf vendor events arm64: Add common and uarch event JSON
Add a common and microarch JSON, which can be referenced from CPU JSONs.

For now, brief and public description are as event brief event
description from the ARMv8 ARM [0], D7-11.

The list of events is not complete, as not all events will be referenced
yet.

Reference document is at the following:

[0] https://documentation-service.arm.com/static/5fa3bd1eb209f547eebd4141?token=

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03 13:10:43 -03:00
John Garry
2bf797be81 perf vendor events arm64: Fix Ampere eMag event typo
The "briefdescription" for event 0x35 has a typo - fix it.

Fixes: d35c595bf0 ("perf vendor events arm64: Revise core JSON events for eMAG")
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Nakamura, Shunsuke/中村 俊介 <nakamura.shun@fujitsu.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@openeuler.org
Link: https://lore.kernel.org/r/1611835236-34696-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-03 13:10:43 -03:00
Joakim Zhang
e15a536521 perf vendor events: Add JSON metrics for imx8mm DDR Perf
Add JSON metrics for imx8mm DDR Perf.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Acked-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Link: http://lore.kernel.org/lkml/1607080216-36968-11-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17 14:36:17 -03:00
John Garry
4689f56796 perf jevents: Add support for system events tables
Process the JSONs to find support for "system" events, which are not
tied to a specific CPUID.

A "COMPAT" property is now used to match against the namespace ID from
the kernel PMU driver.

The generated pmu-events.c will now have 2 tables:

a. CPU events, as before.
b. New pmu_sys_event_tables[] table, which will have events matched to
   specific SoCs.

It will look like this:

struct pmu_event pme_hisilicon_hip09_sys[] = {
{
	.name = "cycles",
	.compat = "0x00030736",
	.event = "event=0",
	.desc = "Clock cycles",
	.topic = "smmu v3 pmcg",
	.long_desc = "Clock cycles",
},
{
	.name = "smmuv3_pmcg.l1_tlb",
	.compat = "0x00030736",
	.event = "event=0x8a",
	.desc = "SMMUv3 PMCG l1_tlb. Unit: smmuv3_pmcg ",
	.topic = "smmu v3 pmcg",
	.long_desc = "SMMUv3 PMCG l1_tlb",
	.pmu = "smmuv3_pmcg",
},
...
};

struct pmu_event pme_arm_cortex_a53[] = {
{
	.name = "ext_mem_req",
	.event = "event=0xc0",
	.desc = "External memory request",
	.topic = "memory",
},
{
	.name = "ext_mem_req_nc",
	.event = "event=0xc1",
	.desc = "Non-cacheable external memory request",
	.topic = "memory",
},
...
};

struct pmu_event pme_hisilicon_hip09_cpu[] = {
{
	.name = "l2d_cache_refill_wr",
	.event = "event=0x53",
	.desc = "L2D cache refill, write",
	.topic = "core imp def",
	.long_desc = "Attributable Level 2 data cache refill, write",
},
...
};

struct pmu_events_map pmu_events_map[] = {
{
	.cpuid = "0x00000000410fd030",
	.version = "v1",
	.type = "core",
	.table = pme_arm_cortex_a53
},
{
	.cpuid = "0x00000000480fd010",
	.version = "v1",
	.type = "core",
	.table = pme_hisilicon_hip09_cpu
},
	{
		.table = 0
	},
};

struct pmu_event pme_hisilicon_hip09_cpu[] = {
{
	.name = "uncore_hisi_l3c.rd_cpipe",
	.event = "event=0",
	.desc = "Total read accesses. Unit: hisi_sccl,l3c ",
	.topic = "uncore l3c",
	.long_desc = "Total read accesses",
	.pmu = "hisi_sccl,l3c",
},
{
	.name = "uncore_hisi_l3c.wr_cpipe",
	.event = "event=0x1",
	.desc = "Total write accesses. Unit: hisi_sccl,l3c ",
	.topic = "uncore l3c",
	.long_desc = "Total write accesses",
	.pmu = "hisi_sccl,l3c",
},
...
};

struct pmu_sys_events pmu_sys_event_tables[] = {
{
	.table = pme_hisilicon_hip09_sys,
},
...
};

Committer notes:

Added the fix for architectures without PMU events, provided by John
after I reported the build failing in such systems.

Link: https://lore.kernel.org/lkml/650baaf2-36b6-a9e2-ff49-963ef864c1f3@huawei.com/

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1607080216-36968-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17 14:36:17 -03:00
John Garry
4853f1caa4 perf jevents: Add support for an extra directory level
Currently only upto a level 2 directory is supported, in form
vendor/platform.

Add support for a further level, to support vendor/platform
sub-directories in future, which will be vendor/platform/cpu and
vendor/platform/sys.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1607080216-36968-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17 14:36:17 -03:00
Jin Yao
3d05181a08 perf vendor events: Update Skylake client events to v50
- Update Skylake events to v50.
- Update Skylake JSON metrics from TMAM 4.0.
- Fix the issue in DRAM_Parallel_Reads
- Fix the perf test warning

Before:

  root@kbl-ppc:~# perf stat -M DRAM_Parallel_Reads -- sleep 1
  event syntax error: '{arb/event=0x80,umask=0x2/,arb/event=0x80,umask=0x2,thresh=1/}:W'
                       \___ unknown term 'thresh' for pmu 'uncore_arb'

  valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,percore

  Initial error:
  event syntax error: '..umask=0x2/,arb/event=0x80,umask=0x2,thresh=1/}:W'
                                    \___ Cannot find PMU `arb'. Missing kernel support?

  root@kbl-ppc:~# perf test metrics
  10: PMU events                                 :
  10.3: Parsing of PMU event table metrics               : Skip (some metrics failed)
  10.4: Parsing of PMU event table metrics with fake PMUs: Ok
  67: Parse and process metrics                  : Ok

After:

  root@kbl-ppc:~# perf stat -M MEM_Parallel_Reads -- sleep 1

   Performance counter stats for 'system wide':

           4,951,646      arb/event=0x80,umask=0x2/ #    26.30 MEM_Parallel_Reads       (50.04%)
             188,251      arb/event=0x80,umask=0x2,cmask=1/                                     (49.96%)

         1.000867010 seconds time elapsed

  root@kbl-ppc:~# perf test metrics
  10: PMU events                                 :
  10.3: Parsing of PMU event table metrics               : Ok
  10.4: Parsing of PMU event table metrics with fake PMUs: Ok
  67: Parse and process metrics                  : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/93fae76f-ce2b-ab0b-3ae9-cc9a2b4cbaec@linux.intel.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-16 14:05:32 -03:00
John Garry
644bf4b0f7 perf jevents: Add test for arch std events
Recently there was an undetected breakage for std arch event support.

Add support in "PMU events" testcase to detect such breakages.

For this, the "test" arch needs has support added to process std arch
events. And a test event is added for the test, ifself.

Also add a few code comments to help understand the code a bit better.

Committer testing:

Before:

  # perf test -vv pmu  |& grep l3_cache_rd
  #

After:

  # perf test -vv pmu  |& grep l3_cache_rd
  testing event table l3_cache_rd: pass
  testing aliases PMU cpu: matched event l3_cache_rd
  #

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-By: Kajol Jain<kjain@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/1603364547-197086-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-04 09:42:41 -03:00
John Garry
fa1b41a74d perf jevents: Tidy error handling
There is much duplication in the error handling for directory transvering
for prcessing JSONs.

Factor out the common code to tidy a bit.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-By: Kajol Jain<kjain@linux.ibm.com>
Link: https://lore.kernel.org/r/1603364547-197086-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-04 09:42:41 -03:00
Jin Yao
0dfbe4c646 perf vendor events: Fix DRAM_BW_Use 0 issue for CLX/SKX
Ian reports an issue that the metric DRAM_BW_Use often remains 0.

The metric expression for DRAM_BW_Use on CLX/SKX:

"( 64 * ( uncore_imc@cas_count_read@ + uncore_imc@cas_count_write@ ) / 1000000000 ) / duration_time"

The counts of uncore_imc/cas_count_read/ and uncore_imc/cas_count_write/
are scaled up by 64, that is to turn a count of cache lines into bytes,
the count is then divided by 1000000000 to give GB.

However, the counts of uncore_imc/cas_count_read/ and
uncore_imc/cas_count_write/ have been scaled yet.

The scale values are from sysfs, such as
/sys/devices/uncore_imc_0/events/cas_count_read.scale.
It's 6.103515625e-5 (64 / 1024.0 / 1024.0).

So if we use original metric expression, the result is not correct.

But the difficulty is, for SKL client, the counts are not scaled.

The metric expression for DRAM_BW_Use on SKL:

"64 * ( arb@event\\=0x81\\,umask\\=0x1@ + arb@event\\=0x84\\,umask\\=0x1@ ) / 1000000 / duration_time / 1000"

root@kbl-ppc:~# perf stat -M DRAM_BW_Use -a -- sleep 1

 Performance counter stats for 'system wide':

               190      arb/event=0x84,umask=0x1/ #     1.86 DRAM_BW_Use
        29,093,178      arb/event=0x81,umask=0x1/
     1,000,703,287 ns   duration_time

       1.000703287 seconds time elapsed

The result is expected.

So the easy way is just change the metric expression for CLX/SKX.
This patch changes the metric expression to:

"( ( ( uncore_imc@cas_count_read@ + uncore_imc@cas_count_write@ ) * 1048576 ) / 1000000000 ) / duration_time"

1048576 = 1024 * 1024.

Before (tested on CLX):

root@lkp-csl-2sp5 ~# perf stat -M DRAM_BW_Use -a -- sleep 1

 Performance counter stats for 'system wide':

            765.35 MiB  uncore_imc/cas_count_read/ #     0.00 DRAM_BW_Use
              5.42 MiB  uncore_imc/cas_count_write/
        1001515088 ns   duration_time

       1.001515088 seconds time elapsed

After:

root@lkp-csl-2sp5 ~# perf stat -M DRAM_BW_Use -a -- sleep 1

 Performance counter stats for 'system wide':

            767.95 MiB  uncore_imc/cas_count_read/ #     0.80 DRAM_BW_Use
              5.02 MiB  uncore_imc/cas_count_write/
        1001900010 ns   duration_time

       1.001900010 seconds time elapsed

Fixes: 038d3b53c2 ("perf vendor events intel: Update CascadelakeX events to v1.08")
Fixes: b5ff7f2799 ("perf vendor events: Update SkylakeX events to v1.21")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20201023005334.7869-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-11-03 08:31:31 -03:00
John Garry
caf7f9685d perf jevents: Fix event code for events referencing std arch events
The event code for events referencing std arch events is incorrectly
evaluated in json_events().

The issue is that je.event is evaluated properly from try_fixup(), but
later NULLified from the real_event() call, as "event" may be NULL.

Fix by setting "event" same je.event in try_fixup().

Also remove support for overwriting event code for events using std arch
events, as it is not used.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-By: Kajol Jain<kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/1602170368-11892-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-14 13:43:31 -03:00
Sandipan Das
70830f974e perf vendor events: Fix typos in power8 PMU events
This replaces the incorrectly spelled word "localtion" with "location"
in some power8 PMU event descriptions.

Fixes: 2a81fa3bb5 ("perf vendor events: Add power8 PMU events")
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20201012050205.328523-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-13 11:01:42 -03:00
Jin Yao
b5ff7f2799 perf vendor events: Update SkylakeX events to v1.21
- Update SkylakeX events to v1.21.
- Update SkylakeX JSON metrics from TMAM 4.0.

Other fixes:

- Add NO_NMI_WATCHDOG metric constraint to Backend_Bound
- Fix misspelled error

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 08:46:47 -03:00
Jin Yao
038d3b53c2 perf vendor events intel: Update CascadelakeX events to v1.08
- Update CascadelakeX events to v1.08.
- Update CascadelakeX JSON metrics from TMAM 4.0.

Other fixes:

- Add NO_NMI_WATCHDOG metric constraint to Backend_Bound
- Change 'MB/sec' to 'MB' in UNC_M_PMM_BANDWIDTH.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lore.kernel.org/lkml/20200922031918.3723-1-yao.jin@linux.intel.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-28 08:46:37 -03:00
Arnaldo Carvalho de Melo
056c172201 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-17 15:45:05 -03:00
Henry Burns
56f3a1cdaf perf vendor events amd: Remove trailing commas
The amdzen2/core.json and amdzen/core.json vendor events files have the
occasional trailing comma. Since that goes against the JSON standard,
lets remove it.

Signed-off-by: Henry Burns <henrywolfeburns@gmail.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Link: http://lore.kernel.org/lkml/20200915004125.971-1-henrywolfeburns@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-15 08:53:25 -03:00
Kajol Jain
b1f815c479 perf vendor events power9: Add hv_24x7 core level metric events
This patch adds hv_24x7 core level events in nest_metric.json file and
also add PerChip/PerCore field in metric events.

Result:

power9 platform:

command:# ./perf stat --metric-only -M PowerBUS_Frequency -C 0 -I 1000
     1.000070601                        1.9                        2.0
     2.000253881                        2.0                        1.9
     3.000364810                        2.0                        2.0

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200907064133.75090-6-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 09:19:53 -03:00
Kajol Jain
560ccbc4a5 perf jevents: Add support for parsing perchip/percore events
Initially, every time we want to add new terms like chip, core thread etc,
we need to create corrsponding fields in pmu_events and event struct.

This patch adds an enum called 'aggr_mode_class' which store all these
aggregation like perchip/percore. It also adds new field 'aggr_mode'
to capture these terms.

Now, if user wants to add any new term, they just need to add it in
the enum defined.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200907064133.75090-4-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 09:18:33 -03:00
Kajol Jain
71a374bb18 perf jevents: Add new structure to pass json fields.
This patch adds new structure called 'json_event' inside jevents.c
file to improve the callback prototype inside jevent files.

Initially, whenever user want to add new field, they need to update
in all function callback which make it more and more complex with
increased number of parmeters.

With this change, we just need to add it in new structure 'json_event'.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200907064133.75090-3-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 09:18:04 -03:00
Kajol Jain
0d52b7889b perf jevents: Make json_events() static and ditch jevents.h file
This patch removes jevents.h and makes json_events function static.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200907064133.75090-2-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-10 09:17:03 -03:00
Kim Phillips
09b54b30cc perf vendor events amd: Enable Family 19h users by matching Zen2 events
This enables zen3 users by reusing mostly-compatible zen2 events
until the official public list of zen3 events is published in a
future PPR.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-4-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-04 16:32:44 -03:00
Kim Phillips
08ed77e414 perf vendor events amd: Add recommended events
Add support for events listed in Section 2.1.15.2 "Performance
Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803
Rev 0.54 - Sep 12, 2019".

perf now supports these new events (-e):

  all_dc_accesses
  all_tlbs_flushed
  l1_dtlb_misses
  l2_cache_accesses_from_dc_misses
  l2_cache_accesses_from_ic_misses
  l2_cache_hits_from_dc_misses
  l2_cache_hits_from_ic_misses
  l2_cache_misses_from_dc_misses
  l2_cache_misses_from_ic_miss
  l2_dtlb_misses
  l2_itlb_misses
  sse_avx_stalls
  uops_dispatched
  uops_retired
  l3_accesses
  l3_misses

and these metrics (-M):

  branch_misprediction_ratio
  all_l2_cache_accesses
  all_l2_cache_hits
  all_l2_cache_misses
  ic_fetch_miss_ratio
  l2_cache_accesses_from_l2_hwpf
  l2_cache_hits_from_l2_hwpf
  l2_cache_misses_from_l2_hwpf
  l3_read_miss_latency
  l1_itlb_misses
  all_remote_links_outbound
  nps1_die_to_dram

The nps1_die_to_dram event may need perf stat's --metric-no-group
switch if the number of available data fabric counters is less
than the number it uses (8).

Committer testing:

On a AMD Ryzen 3900x system:

Before:

  # perf list all_dc_accesses   all_tlbs_flushed   l1_dtlb_misses   l2_cache_accesses_from_dc_misses   l2_cache_accesses_from_ic_misses   l2_cache_hits_from_dc_misses   l2_cache_hits_from_ic_misses   l2_cache_misses_from_dc_misses   l2_cache_misses_from_ic_miss   l2_dtlb_misses   l2_itlb_misses   sse_avx_stalls   uops_dispatched   uops_retired   l3_accesses   l3_misses | grep -v "^Metric Groups:$" | grep -v "^$"
  #

After:

  # perf list all_dc_accesses   all_tlbs_flushed   l1_dtlb_misses   l2_cache_accesses_from_dc_misses   l2_cache_accesses_from_ic_misses   l2_cache_hits_from_dc_misses   l2_cache_hits_from_ic_misses   l2_cache_misses_from_dc_misses   l2_cache_misses_from_ic_miss   l2_dtlb_misses   l2_itlb_misses   sse_avx_stalls   uops_dispatched   uops_retired   l3_accesses   l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" | grep -v "^recommended:$"
  all_dc_accesses
       [All L1 Data Cache Accesses]
  all_tlbs_flushed
       [All TLBs Flushed]
  l1_dtlb_misses
       [L1 DTLB Misses]
  l2_cache_accesses_from_dc_misses
       [L2 Cache Accesses from L1 Data Cache Misses (including prefetch)]
  l2_cache_accesses_from_ic_misses
       [L2 Cache Accesses from L1 Instruction Cache Misses (including
        prefetch)]
  l2_cache_hits_from_dc_misses
       [L2 Cache Hits from L1 Data Cache Misses]
  l2_cache_hits_from_ic_misses
       [L2 Cache Hits from L1 Instruction Cache Misses]
  l2_cache_misses_from_dc_misses
       [L2 Cache Misses from L1 Data Cache Misses]
  l2_cache_misses_from_ic_miss
       [L2 Cache Misses from L1 Instruction Cache Misses]
  l2_dtlb_misses
       [L2 DTLB Misses & Data page walks]
  l2_itlb_misses
       [L2 ITLB Misses & Instruction page walks]
  sse_avx_stalls
       [Mixed SSE/AVX Stalls]
  uops_dispatched
       [Micro-ops Dispatched]
  uops_retired
       [Micro-ops Retired]
  l3_accesses
       [L3 Accesses. Unit: amd_l3]
  l3_misses
       [L3 Misses (includes Chg2X). Unit: amd_l3]
  #

  # perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses,l2_cache_hits_from_dc_misses,l2_cache_hits_from_ic_misses,l2_cache_misses_from_dc_misses,l2_cache_misses_from_ic_miss,l2_dtlb_misses,l2_itlb_misses,sse_avx_stalls,uops_dispatched,uops_retired,l3_accesses,l3_misses sleep 2

   Performance counter stats for 'system wide':

       433,439,949      all_dc_accesses                                               (35.66%)
               443      all_tlbs_flushed                                              (35.66%)
         2,985,885      l1_dtlb_misses                                                (35.66%)
        18,318,019      l2_cache_accesses_from_dc_misses                                     (35.68%)
        50,114,810      l2_cache_accesses_from_ic_misses                                     (35.72%)
        12,423,978      l2_cache_hits_from_dc_misses                                     (35.74%)
        40,703,103      l2_cache_hits_from_ic_misses                                     (35.74%)
         6,698,673      l2_cache_misses_from_dc_misses                                     (35.74%)
        12,090,892      l2_cache_misses_from_ic_miss                                     (35.74%)
           614,267      l2_dtlb_misses                                                (35.74%)
           216,036      l2_itlb_misses                                                (35.74%)
            11,977      sse_avx_stalls                                                (35.74%)
       999,276,223      uops_dispatched                                               (35.73%)
     1,075,311,620      uops_retired                                                  (35.69%)
         1,420,763      l3_accesses
           540,164      l3_misses

       2.002344121 seconds time elapsed

  # perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses sleep 2

   Performance counter stats for 'system wide':

       175,943,104      all_dc_accesses
               310      all_tlbs_flushed
         2,280,359      l1_dtlb_misses
        11,700,151      l2_cache_accesses_from_dc_misses
        25,414,963      l2_cache_accesses_from_ic_misses

       2.001957818 seconds time elapsed

  #

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-3-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-04 16:32:22 -03:00
Kim Phillips
ab22eea35f perf vendor events amd: Add ITLB Instruction Fetch Hits event for zen1
The ITLB Instruction Fetch Hits event isn't documented even in later
zen1 PPRs, but it seems to count correctly on zen1 hardware.

Add it to zen1 group so zen1 users can use the upcoming IC Fetch Miss
Ratio Metric.

The IF1G, 1IF2M, IF4K (Instruction fetches to a 1 GB, 2 MB, and 4K page)
unit masks are not added because unlike zen2 hardware, zen1 hardware
counts all its unit masks with a 0 unit mask according to the old
convention:

  zen1$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1

   Performance counter stats for 'sleep 1':

             211,318      cpu/event=0x94/u
             211,318      cpu/event=0x94,umask=0xff/u

Rome/zen2:

  zen2$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1

   Performance counter stats for 'sleep 1':

                   0      cpu/event=0x94/u
             190,744      cpu/event=0x94,umask=0xff/u

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # on Zen2 only (3900x)
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-2-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-04 16:32:22 -03:00
Kim Phillips
60d804521e perf vendor events amd: Add L2 Prefetch events for zen1
Later revisions of PPRs that post-date the original Family 17h events
submission patch add these events.

Specifically, they were not in this 2017 revision of the F17h PPR:

Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017

But e.g., are included in this 2019 version of the PPR:

Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019

Fixes: 98c07a8f74 ("perf vendor events amd: perf PMU events for AMD Family 17h")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-1-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-04 16:32:13 -03:00
Namhyung Kim
e62458e394 perf jevents: Fix suspicious code in fixregex()
The new string should have enough space for the original string and the
back slashes IMHO.

Fixes: fbc2844e84 ("perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: William Cohen <wcohen@redhat.com>
Link: http://lore.kernel.org/lkml/20200903152510.489233-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-09-03 15:38:05 -03:00
Paul A. Clarke
8989f5f076 perf stat: Update POWER9 metrics to utilize other metrics
These changes take advantage of the new capability added in merge commit
00e4db5125 "Allow using computed metrics
in calculating other metrics".

The net is a simplification of the expressions for a handful of metrics,
but no functional change.

Signed-off-by: Paul Clarke <pc@us.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200813222155.268183-1-pc@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-14 09:38:53 -03:00
Arnaldo Carvalho de Melo
b1aa3db2c1 Merge remote-tracking branch 'torvalds/master' into perf/core
Minor conflict in tools/perf/arch/arm/util/auxtrace.c as one fix there
was cherry-picked for the last perf/urgent pull req to Linus, so was
already there.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-08-03 09:37:31 -03:00
Thomas Richter
3d3af181d3 s390/cpum_cf,perf: change DFLT_CCERROR counter name
Change the counter name DLFT_CCERROR to DLFT_CCFINISH on IBM z15.
This counter counts completed DEFLATE instructions with exit code
0, 1 or 2. Since exit code 0 means success and exit code 1 or 2
indicate errors, change the counter name to avoid confusion.
This counter is incremented each time the DEFLATE instruction
completed regardless if an error was detected or not.

Fixes: d68d5d51dc ("s390/cpum_cf: Add new extended counters for IBM z15")
Fixes: e7950166e4 ("perf vendor events s390: Add new deflate counters for IBM z15")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-21 13:53:56 +02:00
Kajol Jain
78194fb486 perf vendor events power9: Added nest imc metric events
Added nest imc metric events.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nageswara R Sastry <nasastry@in.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: maddy@linux.ibm.com
Link: http://lore.kernel.org/lkml/20200703065658.377467-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-06 09:38:03 -03:00
Ed Maste
82352ae28f perf tools: Correct license on jsmn JSON parser
This header is part of the jsmn JSON parser, introduced in 867a979a83.
Correct the SPDX tag to indicate that it is under the MIT license.

Signed-off-by: Ed Maste <emaste@freebsd.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lore.kernel.org/lkml/20200528170858.48457-1-emaste@freefall.freebsd.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-29 16:51:38 -03:00
Paul A. Clarke
acd1ac2315 perf stat: POWER9 metrics: expand "ICT" acronym
Uses of "ICT" and "Ict" are expanded to "Instruction Completion Table".

Signed-off-by: Paul Clarke <pc@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1589915886-22992-1-git-send-email-pc@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:27 -03:00
Paul A. Clarke
63b5930f4a perf vendor events power9: Add missing metrics to POWER9 'cpi_breakdown'
Add the following metrics to the POWER9 'cpi_breakdown' metricgroup:

- ict_noslot_br_mpred_cpi
- ict_noslot_br_mpred_icmiss_cpi
- ict_noslot_cyc_other_cpi
- ict_noslot_disp_held_cpi
- ict_noslot_disp_held_hb_full_cpi
- ict_noslot_disp_held_issq_cpi
- ict_noslot_disp_held_other_cpi
- ict_noslot_disp_held_sync_cpi
- ict_noslot_disp_held_tbegin_cpi
- ict_noslot_ic_l2_cpi
- ict_noslot_ic_l3_cpi
- ict_noslot_ic_l3miss_cpi
- ict_noslot_ic_miss_cpi

Signed-off-by: Paul Clarke <pc@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1588868938-21933-3-git-send-email-pc@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:25 -03:00
Ian Rogers
f2682a8fe9 perf metrics: Fix parse errors in power9 metrics
Mismatched parentheses.

Fixes: 7f3cf5ac77 (perf vendor events power9: Cpi_breakdown & estimated_dcache_miss_cpi metrics)
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Paul Clarke <pc@us.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Haiyan Song <haiyanx.song@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200501173333.227162-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:25 -03:00
Ian Rogers
981d169f90 perf metrics: Fix parse errors in power8 metrics
Mismatched parentheses.

Fixes: dd81eafacc (perf vendor events power8: Cpi_breakdown & estimated_dcache_miss_cpi metrics)
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Paul Clarke <pc@us.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Haiyan Song <haiyanx.song@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200501173333.227162-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:25 -03:00
Ian Rogers
7db61f384d perf metrics: Fix parse errors in skylake metrics
Remove over escaping with \\.

Fixes: fd5500989c (perf vendor events intel: Update metrics from TMAM 3.5)
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Haiyan Song <haiyanx.song@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200501173333.227162-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:25 -03:00
Ian Rogers
92aa1c2bdb perf metrics: Fix parse errors in cascade lake metrics
Remove over escaping with \\.
Remove extraneous if 1 if 0 == 1 else 0 else 0.

Fixes: fd5500989c (perf vendor events intel: Update metrics from TMAM 3.5)
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Haiyan Song <haiyanx.song@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200501173333.227162-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-05-28 10:03:25 -03:00
Kajol Jain
354575c00d perf vendor events power9: Add hv_24x7 socket/chip level metric events
The hv_24×7 feature in IBM® POWER9™ processor-based servers provide the
facility to continuously collect large numbers of hardware performance
metrics efficiently and accurately.

This patch adds hv_24x7  metric file for different Socket/chip
resources.

Result:

power9 platform:

  command:# ./perf stat --metric-only -M Memory_RD_BW_Chip -C 0 -I 1000

     1.000096188          0.9           0.3
     2.000285720          0.5           0.1
     3.000424990          0.4           0.1

  command:# ./perf stat --metric-only -M PowerBUS_Frequency -C 0 -I 1000

     1.000097981          2.3           2.3
     2.000291713          2.3           2.3
     3.000421719          2.3           2.3
     4.000550912          2.3           2.3

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20200401203340.31402-8-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-30 10:48:33 -03:00
Shaokun Zhang
454a8be0cf perf pmu: Fix function name in comment, its get_cpuid_str(), not get_cpustr()
get_cpuid_str() is used in tools/perf/arch/xxx/util/header.c,
fix the name in comment.

Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lore.kernel.org/lkml/1588141992-48382-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-30 10:48:33 -03:00
Jin Yao
8ed1faf015 perf pmu-events x86: Use CPU_CLK_UNHALTED.THREAD in Kernel_Utilization metric
The kernel utilization metric does multiplexing currently and is somewhat
unreliable. The problem is that it uses two instances of the fixed counter,
and the kernel has to multipleplex which causes errors. So should use
CPU_CLK_UNHALTED.THREAD instead.

Before:

  # perf stat -M Kernel_Utilization -- sleep 1

  Performance counter stats for 'sleep 1':

          1,419,425      cpu_clk_unhalted.ref_tsc:k
      <not counted>      cpu_clk_unhalted.ref_tsc	(0.00%)

After:

  # perf stat -M Kernel_Utilization -- sleep 1

  Performance counter stats for 'sleep 1':

            746,688      cpu_clk_unhalted.thread:k #      0.7 Kernel_Utilization
          1,088,348      cpu_clk_unhalted.thread

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200309013125.7559-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-04-03 09:37:56 -03:00
John Garry
d844780887 perf jevents: Support test events folder
With the goal of supporting pmu-events test case, introduce support for
a test events folder.

These test events can be used for testing generation of pmu-event tables
and alias creation for any arch.

When running the pmu-events test case, these test events will be used as
the platform-agnostic events, so aliases can be created per-PMU and
validated against known expected values.

To support the test events, add a "testcpu" entry in pmu_events_map[].
The pmu-events test will be able to lookup the events map for "testcpu",
to verify the generated tables against expected values.

The resultant generated pmu-events.c will now look like the following:

  struct pmu_event pme_ampere_emag[] = {
  {
  	.name = "ldrex_spec",
  	.event = "event=0x6c",
  	.desc = "Exclusive operation spe...",
  	.topic = "intrinsic",
  	.long_desc = "Exclusive operation ...",
  },
  ...
  };

  struct pmu_event pme_test_cpu[] = {
  {
  	.name = "uncore_hisi_ddrc.flux_wcmd",
  	.event = "event=0x2",
  	.desc = "DDRC write commands. Unit: hisi_sccl,ddrc ",
  	.topic = "uncore",
  	.long_desc = "DDRC write commands",
  	.pmu = "hisi_sccl,ddrc",
  },
  {
  	.name = "unc_cbo_xsnp_response.miss_eviction",
  	.event = "umask=0x81,event=0x22",
  	.desc = "Unit: uncore_cbox A cross-core snoop resulted ...",
  	.topic = "uncore",
  	.long_desc = "A cross-core snoop resulted from L3 ...",
  	.pmu = "uncore_cbox",
  },
  {
  	.name = "eist_trans",
  	.event = "umask=0x0,period=200000,event=0x3a",
  	.desc = "Number of Enhanced Intel SpeedStep(R) ...",
  	.topic = "other",
  },
  {
  	.name = 0,
  },
  };

  struct pmu_events_map pmu_events_map[] = {
  ...
  {
  	.cpuid = "0x00000000500f0000",
  	.version = "v1",
  	.type = "core",
  	.table = pme_ampere_emag
  },
  ...
  {
  	.cpuid = "testcpu",
  	.version = "v1",
  	.type = "core",
  	.table = pme_test_cpu,
  },
  {
  	.cpuid = 0,
  	.version = 0,
  	.type = 0,
  	.table = 0,
  },
  };

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-24 10:35:59 -03:00
John Garry
c52db67a74 perf jevents: Add some test events
Add some test PMU events. The events are randomly chosen from x86 and
arm64 JSONs. The events include CPU and uncore events.

Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1584442939-8911-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-24 10:35:59 -03:00
Vijay Thakkar
b5b8a7cf14 perf vendor events amd: Update Zen1 events to V2
This patch updates the PMCs for AMD Zen1 core based processors (Family
17h; Models 0 through 2F) to be in accordance with PMCs as
documented in the latest versions of the AMD Processor Programming
Reference [1], [2] and [3]. Note that some events, such as FPU pipe
assignment are missing in [1], and therefore [3] is included for full
coverage of events.

PMCs added:

  fpu_pipe_assignment.dual{0|1|2|3}
  fpu_pipe_assignment.total{0|1|2|3}
  ls_mab_alloc.dc_prefetcher
  ls_mab_alloc.stores
  ls_mab_alloc.loads
  bp_dyn_ind_pred
  bp_de_redirect

PMC removed:

  ex_ret_cond_misp

Cumulative counts, fpu_pipe_assignment.total and
fpu_pipe_assignment.dual, existed in v1, but did expose port-level
counters.

ex_ret_cond_misp has been removed as it has been removed from the latest
versions of the PPR, and when tested, always seems to sample zero as
tested on a Ryzen 3400G system.

[1]: Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.

[2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h,
Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019.

[3]: OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018

All of the PPRs can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=206537

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: vijay thakkar <vijaythakkar@me.com>
Link: http://lore.kernel.org/lkml/20200318190002.307290-4-vijaythakkar@me.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-24 10:35:58 -03:00
Vijay Thakkar
2079f7aa0a perf vendor events amd: Add Zen2 events
This patch adds PMU events for AMD Zen2 core based processors, namely,
Matisse (model 71h), Castle Peak (model 31h) and Rome (model 2xh), as
documented in the AMD Processor Programming Reference for Matisse [1].
The model number regex has been set to detect all the models under
family 17 that do not match those of Zen1, as the range is larger for
zen2.

Zen2 adds some additional counters that are not present in Zen1 and
events for them have been added in this patch. Some counters have also
been removed for Zen2 thatwere previously present in Zen1 and have been
confirmed to always sample zero on zen2. These added/removed counters
have been omitted for brevity but can be found here:
https://gist.github.com/thakkarV/5b12ca5fd7488eb2c42e451e40bdd5f3

Note that PPR for Zen2 [1] does not include some counters that were
documented in the PPR for Zen1 based processors [2]. After having tested
these counters, some of them that still work for zen2 systems have been
preserved in the events for zen2. The counters that are omitted in [1]
but are still measurable and non-zero on zen2 (tested on a Ryzen 3900X
system) are the following:

  PMC 0x000 fpu_pipe_assignment.{total|total0|total1|total2|total3}
  PMC 0x004 fp_num_mov_elim_scal_op.*
  PMC 0x046 ls_tablewalker.*
  PMC 0x062 l2_latency.l2_cycles_waiting_on_fills
  PMC 0x063 l2_wcb_req.*
  PMC 0x06D l2_fill_pending.l2_fill_busy
  PMC 0x080 ic_fw32
  PMC 0x081 ic_fw32_miss
  PMC 0x086 bp_snp_re_sync
  PMC 0x087 ic_fetch_stall.*
  PMC 0x08C ic_cache_inval.*
  PMC 0x099 bp_tlb_rel
  PMC 0x0C7 ex_ret_brn_resync
  PMC 0x28A ic_oc_mode_switch.*
  L3PMC 0x001 l3_request_g1.*
  L3PMC 0x006 l3_comb_clstr_state.*

[1]: Processor Programming Reference (PPR) for AMD Family 17h Model 71h,
Revision B0 Processors, 56176 Rev 3.06 - Jul 17, 2019

[2]: Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019

All of the PPRs can be found at:

https://bugzilla.kernel.org/show_bug.cgi?id=206537

Here are the results of running "fpu_pipe_assignment.total" events on my
Ryzen 3900X family 17h model 71h system:

Before this patch:

  $> perf list *fpu_pipe_assignment*

List of pre-defined events (to be used in -e):

After:

  $> perf list *fpu_pipe_assignment*

  floating point:
  fpu_pipe_assignment.total
      [Total number of fp uOps]
  fpu_pipe_assignment.total0
      [Total number uOps assigned to pipe 0]
  fpu_pipe_assignment.total1
      [Total number uOps assigned to pipe 1]
  fpu_pipe_assignment.total2
      [Total number uOps assigned to pipe 2]
  fpu_pipe_assignment.total3
      [Total number uOps assigned to pipe 3]

  Metric Groups:

  $> perf stat -e fpu_pipe_assignment.total sleep 1

  Performance counter stats for 'sleep 1':

              25,883      fpu_pipe_assignment.total

         1.004145868 seconds time elapsed

         0.001805000 seconds user
         0.000000000 seconds sys

Usage tests while running Linpackin the background:

  $> perf stat -I1000 -e fpu_pipe_assignment.total
       1.000266796     79,313,191,516      fpu_pipe_assignment.total
       2.000809630     68,091,474,430      fpu_pipe_assignment.total
       3.001028115     52,925,023,174      fpu_pipe_assignment.total

  $> perf record -e fpu_pipe_assignment.total,fpu_pipe_assignment.total0 -a sleep 1
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Captured and wrote 4.031 MB perf.data (64764 samples) ]

  $> perf report --stdio --no-header | head -30
      98.33%  xhpl             xhpl                          [.] dgemm_kernel
       0.28%  xhpl             xhpl                          [.] dtrsm_kernel_LT
       0.10%  xhpl             [kernel.kallsyms]             [k] entry_SYSCALL_64
       0.08%  xhpl             xhpl                          [.] idamax_k
       0.07%  baloo_file_extr  liblmdb.so                    [.] mdb_mid2l_insert
       0.06%  xhpl             xhpl                          [.] dgemm_itcopy
       0.06%  xhpl             xhpl                          [.] dgemm_oncopy
       0.06%  xhpl             [kernel.kallsyms]             [k] __schedule
       0.06%  xhpl             [kernel.kallsyms]             [k] syscall_trace_enter
       0.06%  xhpl             [kernel.kallsyms]             [k] native_sched_clock
       0.06%  xhpl             [kernel.kallsyms]             [k] pick_next_task_fair
       0.05%  xhpl             xhpl                          [.] blas_thread_server.llvm.15009391670273914865
       0.04%  xhpl             [kernel.kallsyms]             [k] do_syscall_64
       0.04%  xhpl             [kernel.kallsyms]             [k] yield_task_fair
       0.04%  xhpl             libpthread-2.31.so            [.] __pthread_mutex_unlock_usercnt
       0.03%  xhpl             [kernel.kallsyms]             [k] cpuacct_charge
       0.03%  xhpl             [kernel.kallsyms]             [k] syscall_return_via_sysret
       0.03%  xhpl             libc-2.31.so                  [.] __sched_yield
       0.03%  xhpl             [kernel.kallsyms]             [k] __calc_delta

  $> perf annotate --stdio2 dgemm_kernel | egrep '^ {0,2}[0-9]+' -B2 -A2
                  sub          $0x60,%rsp
                  mov          %rbx,(%rsp)
    0.00          mov          %rbp,0x8(%rsp)
                  mov          %r12,0x10(%rsp)
    0.00          mov          %r13,0x18(%rsp)
                  mov          %r14,0x20(%rsp)
                  mov          %r15,0x28(%rsp)
  --
                  mov          %rdi,%r13
                  mov          %rsi,0x28(%rsp)
    0.00          mov          %rdx,%r12
                  vmovsd       %xmm0,0x30(%rsp)
                  shl          $0x3,%r10
                  mov          0x28(%rsp),%rax
    0.00          xor          %rdx,%rdx
                  mov          $0x18,%rdi
                  div          %rdi
  --
                  nop
            a0:   mov          %r12,%rax
    0.00          shl          $0x3,%rax
                  mov          %r8,%rdi
                  lea          (%r8,%rax,8),%r15
  --
                  mov          %r12,%rax
                  nop
    0.00    c0:   vmovups      (%rdi),%ymm1
    0.09          vmovups      0x20(%rdi),%ymm2
    0.02          vmovups      (%r15),%ymm3
    0.10          vmovups      %ymm1,(%rsi)
    0.07          vmovups      %ymm2,0x20(%rsi)
    0.07          vmovups      %ymm3,0x40(%rsi)
    0.06          add          $0x40,%rdi
                  add          $0x40,%r15
                  add          $0x60,%rsi
    0.00          dec          %rax
                ↑ jne          c0
                  mov          %r9,%r15
  --
                  nop
           110:   lea          0x80(%rsp),%rsi
    0.01          add          $0x60,%rsi
    0.03          mov          %r12,%rax
    0.00          sar          $0x3,%rax
                  cmp          $0x2,%rax
                ↓ jl           d26
                  prefetcht0   0x200(%rdi)
    0.01          vmovups      -0x60(%rsi),%ymm1
    0.02          prefetcht0   0xa0(%rsi)
    0.00          vbroadcastsd -0x80(%rdi),%ymm0
    0.00          prefetcht0   0xe0(%rsi)
    0.03          vmovups      -0x40(%rsi),%ymm2
    0.00          prefetcht0   0x120(%rsi)
                  vmovups      -0x20(%rsi),%ymm3
                  vmulpd       %ymm0,%ymm1,%ymm4
    0.01          prefetcht0   0x160(%rsi)
                  vmulpd       %ymm0,%ymm2,%ymm8
    0.01          vmulpd       %ymm0,%ymm3,%ymm12
    0.02          prefetcht0   0x1a0(%rsi)
    0.01          vbroadcastsd -0x78(%rdi),%ymm0
                  vmulpd       %ymm0,%ymm1,%ymm5
    0.01          vmulpd       %ymm0,%ymm2,%ymm9
                  vmulpd       %ymm0,%ymm3,%ymm13
    0.01          vbroadcastsd -0x70(%rdi),%ymm0
                  vmulpd       %ymm0,%ymm1,%ymm6
    0.00          vmulpd       %ymm0,%ymm2,%ymm10
    0.00          add          $0x60,%rsi

  ... snip ...

                  nop
          65e0:   vmovddup     -0x60(%rsi),%xmm2
    0.00          vmovups      -0x80(%rdi),%xmm0
                  vmovups      -0x70(%rdi),%xmm1
    0.00          vmovddup     -0x58(%rsi),%xmm3
                  vfmadd231pd  %xmm0,%xmm2,%xmm4
    0.00          vfmadd231pd  %xmm1,%xmm2,%xmm5
    0.00          vfmadd231pd  %xmm0,%xmm3,%xmm6
    0.00          vfmadd231pd  %xmm1,%xmm3,%xmm7
    0.00          add          $0x10,%rsi
                  add          $0x20,%rdi
    0.00          dec          %rax
                ↑ jne          65e0
                  nop
                  nop
          6620:   vmovddup     0x30(%rsp),%xmm0
    0.00          vmulpd       %xmm0,%xmm4,%xmm4
    0.00          vmulpd       %xmm0,%xmm5,%xmm5
                  vmulpd       %xmm0,%xmm6,%xmm6
                  vmulpd       %xmm0,%xmm7,%xmm7
                  vaddpd       (%r15),%xmm4,%xmm4
                  vaddpd       0x10(%r15),%xmm5,%xmm5
    0.00          vaddpd       (%r15,%r10,1),%xmm6,%xmm6
    0.00          vaddpd       0x10(%r15,%r10,1),%xmm7,%xmm7
    0.00          vmovups      %xmm4,(%r15)
                  vmovups      %xmm5,0x10(%r15)
    0.00          vmovups      %xmm6,(%r15,%r10,1)
                  vmovups      %xmm7,0x10(%r15,%r10,1)
                  add          $0x20,%r15
  --
                  lea          (%r8,%rax,8),%r8
          69d8:   mov          0x20(%rsp),%r14
    0.00          test         $0x1,%r14
                ↓ je           6d84
                  mov          %r9,%r15
  --
                  vbroadcastsd -0x28(%rsi),%ymm3
                  vfmadd231pd  (%rdi),%ymm0,%ymm4
    0.00          vfmadd231pd  0x20(%rdi),%ymm1,%ymm5
                  vfmadd231pd  0x40(%rdi),%ymm2,%ymm6
                  vfmadd231pd  0x60(%rdi),%ymm3,%ymm7
  --
                  vmulpd       %ymm0,%ymm4,%ymm4
                  vaddpd       (%r15),%ymm4,%ymm4
    0.00          vmovups      %ymm4,(%r15)
                  add          $0x20,%r15
                  dec          %r11
  --
                  mov          %rbx,%rsp
                  mov          (%rsp),%rbx
    0.01          mov          0x8(%rsp),%rbp
                  mov          0x10(%rsp),%r12
                  mov          0x18(%rsp),%r13

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200318190002.307290-3-vijaythakkar@me.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-24 10:35:58 -03:00
Vijay Thakkar
c5f18e9e94 perf vendor events amd: Restrict model detection for zen1 based processors
This patch changes the previous blanket detection of AMD Family 17h
processors to be more specific to Zen1 core based products only by
replacing model detection regex pattern [[:xdigit:]]+ with
([12][0-9A-F]|[0-9A-F]), restricting to models 0 though 2f only.

This change is required to allow for the addition of separate PMU events
for Zen2 core based models in the following patches as those belong to
family 17h but have different PMCs. Current PMU events directory has
also been renamed to "amdzen1" from "amdfam17h" to reflect this
specificity.

Note that although this change does not break PMU counters for existing
zen1 based systems, it does disable the current set of counters for zen2
based systems. Counters for zen2 have been added in the following
patches in this patchset.

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200318190002.307290-2-vijaythakkar@me.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-03-24 10:35:53 -03:00
Ingo Molnar
d1c9f7d117 perf/core improvements and fixes:
perf record:
 
   Alexey Budankov:
 
   - Fix binding of AIO user space buffers to nodes
 
 maps:
 
   Dominik b. Czarnota:
 
   - Fix off by one in strncpy() size argument.
 
   Arnaldo Carvalho de Melo:
 
   - Use strstarts() to look for Android libraries.
 
   Ian Rogers:
 
   - Give synthetic mmap events an inode generation.
 
 man pages:
 
   Ian Rogers:
 
   - Set man page date to last git commit.
 
 perf test:
 
   Ian Rogers:
 
   - Print if shell directory isn't present.
 
 perf report:
 
   Jin Yao:
 
   - Fix no branch type statistics report issue.
 
 perf expr:
 
   Jiri Olsa:
 
   - Fix copy/paste mistake
 
 vendor events:
 
   Kan Liang:
 
   - Support metric constraints.
 
 vendor events intel:
 
   Kan Liang:
 
   - Add NO_NMI_WATCHDOG metric constraint.
 
 vendor events s390:
 
   Thomas Richter:
 
  - Add new deflate counters for IBM z15.
 
 ARM cs-etm:
 
   Leo Yan:
 
   - Last branch improvements.
 
 intel-pt:
 
   Adrian Hunter:
 
   - Update intel-pt.txt file with new location of the documentation.
 
   - Add Intel PT man page references.
 
   - Rename intel-pt.txt and put it in man page format.
 
 perl scripting:
 
   Michael Petlan:
 
  - Add common_callchain to fix argument order.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXnFBiwAKCRCyPKLppCJ+
 J4sOAQDTh5w3GFDOKzFHLqXWOE9mlsXnS7tHdkypuRweBpuQXQEA0Sq125ludwe7
 pzZ1MFqZJ85lw0mfDqBV9E1PlgQz8Q8=
 =1uH9
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-5.7-20200317' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf record:

  Alexey Budankov:

  - Fix binding of AIO user space buffers to nodes

maps:

  Dominik b. Czarnota:

  - Fix off by one in strncpy() size argument.

  Arnaldo Carvalho de Melo:

  - Use strstarts() to look for Android libraries.

  Ian Rogers:

  - Give synthetic mmap events an inode generation.

man pages:

  Ian Rogers:

  - Set man page date to last git commit.

perf test:

  Ian Rogers:

  - Print if shell directory isn't present.

perf report:

  Jin Yao:

  - Fix no branch type statistics report issue.

perf expr:

  Jiri Olsa:

  - Fix copy/paste mistake

vendor events:

  Kan Liang:

  - Support metric constraints.

vendor events intel:

  Kan Liang:

  - Add NO_NMI_WATCHDOG metric constraint.

vendor events s390:

  Thomas Richter:

 - Add new deflate counters for IBM z15.

ARM cs-etm:

  Leo Yan:

  - Last branch improvements.

intel-pt:

  Adrian Hunter:

  - Update intel-pt.txt file with new location of the documentation.

  - Add Intel PT man page references.

  - Rename intel-pt.txt and put it in man page format.

perl scripting:

  Michael Petlan:

 - Add common_callchain to fix argument order.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Conflicts:
	tools/perf/util/map.c
2020-03-19 15:02:26 +01:00