mirror of
https://github.com/torvalds/linux.git
synced 2024-12-14 15:13:52 +00:00
d9ca43c06f
13684 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Nick Forrington
|
08c1d7a159 |
perf vendor events arm64: Arm Cortex-A78C and X1C
Add PMU events for the Arm Cortex-A78C and Arm Cortex-X1C CPUs. Events for Arm Cortex-A78C match those for Arm Cortex-A78. Events for Arm Cortex-X1C match those for Arm Cortex- X1. As such, this is just a mapfile change. Main ID Register (MIDR) and event data is sourced from the corresponding Arm Technical Reference Manuals: Arm Cortex-A78C: https://developer.arm.com/documentation/102226/ Arm Cortex-X1C: https://developer.arm.com/documentation/101968/ Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Nick Forrington <nick.forrington@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrew Kilroy <andrew.kilroy@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220610174459.615995-1-nick.forrington@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ebcdbf7a6a |
perf vendor events: Update Intel snowridgex
Update to v1.20, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the snowridgex files into perf and update mapfile.csv. Tested on a non-snowridgex with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok This change just updates the version number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-31-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
6b47be608b |
perf vendor events: Update Intel westmereex
Update to v3, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the westmereex files into perf and update mapfile.csv. This change just aligns whitespace and updates the version number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-30-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
4823edd648 |
perf vendor events: Update Intel westmereep-sp
Update to v3, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the westmereep-sp files into perf and update mapfile.csv. This change just aligns whitespace and updates the version number. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-29-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ae2fa1ccf1 |
perf vendor events: Update Intel westmereep-dp
Update to v2, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the westmereep-dp files into perf and update mapfile.csv. This change just aligns whitespace. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-28-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
5e1dd4f24a |
perf vendor events: Update Intel tigerlake
Update to v1.07, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the tigerlake files into perf and update mapfile.csv. Tested on a non-tigerlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-27-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
59fd7d3225 |
perf vendor events: Update Intel skylakex
Update to v1.28, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the skylakex files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-26-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
35d6527701 |
perf vendor events: Update Intel skylake
Update to v53, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the skylake files into perf and update mapfile.csv. Tested on a non-skylake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-25-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
89072caf14 |
perf vendor events: Update Intel silvermont
Update to v14, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the silvermont files into perf and update mapfile.csv. Other than aligning whitespace this change just folds the mapfile.csv entries for silvertmont onto one line. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-24-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
34122105f9 |
perf vendor events: Update Intel sapphirerapids
Update to v1.04, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the sapphirerapids files into perf and update mapfile.csv. Tested on a non-sapphirerapids with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-23-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
777e131244 |
perf vendor events: Update Intel sandybridge
Update to v17, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the sandybridge files into perf and update mapfile.csv. Tested on a non-sandybridge with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-22-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
8fe33fd5d3 |
perf vendor events: Update Intel nehalemex
Update to v3, there are no TMA metrics for nehalemex. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the nehalemex files into perf and update mapfile.csv. Tested on a non-nehalemex with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Note: most of this change is just sorting the keys in the json dictionaries. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-21-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
bcc344a3bf |
perf vendor events: Update Intel nehalemep
Update to v3, the are no TMA metrics for nehalemep. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the nehalemep files into perf and update mapfile.csv. Tested on a non-nehalemep with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-20-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
1ab4ef06fa |
perf vendor events: Add Intel meteorlake
Events are v1.00, there are no metrics yet. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the events and metrics. Manually copy the meteorlake files into perf and update mapfile.csv. Tested on a non-meteorlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-19-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ae7bcd600e |
perf vendor events: Update Intel knightslanding
Update to v9, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the knightslanding files into perf and update mapfile.csv. Tested on a non-knightslanding with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Note: uncore-memory has become uncore-other as the topic was determined this way in the conversion scripts. For simplicity the scripts naming is maintained. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-18-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
376d8b581b |
perf vendor events: Update Intel jaketown
Update to v21, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the jaketown files into perf and update mapfile.csv. Tested on a non-jaketown with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
6220136831 |
perf vendor events: Update Intel ivytown
Update to v21, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the ivytown files into perf and update mapfile.csv. Tested on a non-ivytown with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
80c14459f6 |
perf vendor events: Update Intel ivybridge
Update to v22, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the ivybridge files into perf and update mapfile.csv. Tested on a non-ivybridge with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
d214d0c261 |
perf vendor events: Update Intel icelakex
Update to v1.15, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the icelakex files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
a4a4353ebf |
perf vendor events: Update Intel icelake
Update to v1.14, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the icelake files into perf and update mapfile.csv. Tested on a non-icelake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
859fe0f4f2 |
perf vendor events: Update Intel haswellx
Update to v25, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the haswellx files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Failed 93: perf all PMU test : Ok The test 91 failure is a pre-existing failure on the test system with the metric Load_Miss_Real_Latency which is fixed by prefixing it with --metric-no-group. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
8e6389f931 |
perf vendor events: Update Intel haswell
Update to v31, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the haswell files into perf and update mapfile.csv. Tested on a non-haswell with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ae54f70dd9 |
perf vendor events: Update goldmontplus mapfile.csv
Align end of file whitespace with what is generated by: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py Correct the version in mapfile.csv. Event json remains at v1.01, there are no goldmontplus metrics. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
beb2db9bed |
perf vendor events: Update goldmont mapfile.csv
Align end of file whitespace with what is generated by: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py Modify mapfile.csv to have a missing goldmont cpuid. Event json remains at v13, there are no goldmont metrics. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
3c9c315711 |
perf vendor events: Update Intel elkhartlake
Update to v1.03. Elkhartlake metrics aren't in TMA but basic metrics are left unchanged. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the elkhartlake files into perf and update mapfile.csv. Tested on a non-elkhartlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
f9d45862ec |
perf vendor events: Update Intel cascadelakex
Update to v1.16, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the cascadelakex files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
9709ede1a1 |
perf vendor events: Update bonnell mapfile.csv
Align end of file whitespace with what is generated by: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py Fold the mapfile.csv entries together with a more complex regular expression. This will reduce the pmu-events.c table size. The files following this change are still at v4. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
a95ab294a5 |
perf vendor events: Update Intel alderlake
Update to v1.13, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the alderlake files into perf and update mapfile.csv. Tested on a non-alderlake with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ef908a1925 |
perf vendor events: Update Intel broadwellde
Update to v7, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the broadwellde files into perf and update mapfile.csv. Tested on a non-broadwellde with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
1775634ea4 |
perf vendor events: Update Intel broadwell
Update to v26, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the broadwell files into perf and update mapfile.csv. Tested on a non-broadwell with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
4266081e33 |
perf vendor events: Update Intel broadwellx
Update to v19, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py to download and generate the latest events and metrics. Manually copy the broadwellx files into perf and update mapfile.csv. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 90: perf all metricgroups test : Ok 91: perf all metrics test : Skip 93: perf all PMU test : Ok Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220727220832.2865794-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
9a24180567 |
perf bpf: Remove undefined behavior from bpf_perf_object__next()
bpf_perf_object__next() folded the last element in the list test with the empty list test. However, this meant that offsets were computed against null and that a struct list_head was compared against a 'struct bpf_perf_object'. Working around this with clang's undefined behavior sanitizer required -fno-sanitize=null and -fno-sanitize=object-size. Remove the undefined behavior by using the regular Linux list APIs and handling the starting case separately from the end testing case. Looking at uses like bpf_perf_object__for_each(), as the constant NULL or non-NULL argument can be constant propagated, the code is no less efficient. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Christy Lee <christylee@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Rix <trix@redhat.com> Cc: bpf@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
882528d2e7 |
perf symbol: Skip symbols if SHF_ALLOC flag is not set
Some symbols are observed with the 'st_value' field zeroed. E.g. libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which resides in the '.gnu.warning.getwd' section. Unlike normal sections, such kind of sections are used for linker warning when a file calls deprecated functions, but they are not part of memory images, the symbols in these sections should be dropped. This patch checks the section attribute SHF_ALLOC bit, if the bit is not set, it skips symbols to avoid spurious ones. Suggested-by: Fangrui Song <maskray@google.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chang Rui <changruinj@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Leo Yan
|
2d86612aac |
perf symbol: Correct address for bss symbols
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols. This is a common
issue on both x86 and Arm64 platforms.
Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:
...
Disassembly of section .bss:
0000000000004040 <completed.0>:
...
0000000000004080 <buf1>:
...
00000000000040c0 <buf2>:
...
0000000000004100 <thread>:
...
First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.
# ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf2 0x30a8-0x30e8
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf1 0x3068-0x30a8
...
The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section. The perf tool uses below formula to convert
a symbol's memory address to a file address:
file_address = st_value - sh_addr + sh_offset
^
` Memory address
We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.
The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.
Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info. A better choice for converting
memory address to file address is using the formula:
file_address = st_value - p_vaddr + p_offset
This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.
After applying the change:
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf2 0x30c0-0x3100
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf1 0x3080-0x30c0
...
Fixes:
|
||
Leo Yan
|
b226521923 |
perf scripts python: Let script to be python2 compliant
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.
To fix the building failure, this patch changes from the python f-string
format to traditional string format.
Fixes:
|
||
Ian Rogers
|
a061a8ad3f |
perf test: Avoid sysfs state affecting fake events
Icelake has a slots event, on my Skylakex I have CPU events in sysfs of topdown-slots-issued and topdown-total-slots. Legacy event parsing would try to use '-' to separate parts of an event and so perf_pmu__parse_init sets 'slots' to be a PMU_EVENT_SYMBOL_SUFFIX2. As such parsing the slots event for a fake PMU fails as a PMU_EVENT_SYMBOL_SUFFIX2 isn't made into the PE_PMU_EVENT_FAKE token. Resolve this issue by test initializing the PMU parsing state before every parse. This must be done every parse as the state is removes after each parse_events. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sedat Dilek <sedat.dilek@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220725223633.2301737-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
bedd17381b |
perf vendor events intel: Update event list for haswellx
Update JSON core/uncore events for haswellx to perf. Based on HSX JSON list v24: https://download.01.org/perfmon/HSX Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220614145019.2177071-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
28738de918 |
perf vendor events intel: Update event list for broadwellx
Update JSON core/uncore events for broadwellx to perf. Based on BDX JSON list v19: https://download.01.org/perfmon/BDX Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220614145019.2177071-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
b43a5442d8 |
perf vendor events intel: Update event list for Snowridgex
More uncore events are added in the converter tool: https://github.com/intel/event-converter-for-linux-perf Keep both alias and the original name for the events, in case someone already used the alias in their script. Generate the perf events based on Snowridgex(SNR) event list v1.20: https://download.01.org/perfmon/SNR/ Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220609094222.2030167-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
9146af4413 |
perf vendor events intel: Rename tremontx to snowridgex
Tremontx was an old name for Snowridgex, so rename Tremontx to Snowridgex. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220609094222.2030167-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
6a92916de5 |
perf vendor events intel: Update event list for Sapphirerapids
Update JSON event list for Sapphirerapids to perf. Based on JSON list v1.02: https://download.01.org/perfmon/SPR/ Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220607092749.1976878-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Zhengjun Xing
|
5fa2481cdf |
perf vendor events intel: Update event list for Alderlake
Update JSON event list for Alderlake to perf. It is a hybrid event list for both Atom and Core. Based on JSON list v1.11: https://download.01.org/perfmon/ADL/ Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220607092749.1976878-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Colin Ian King
|
8147f79ea5 |
perf inject: Fix spelling mistake "theads" -> "threads"
There is a spelling mistake in a pr_err message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20220721124528.20997-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
acfb65fe1d |
perf kwork: Add workqueue trace BPF support
Implements workqueue trace bpf function. Test cases: # perf kwork -k workqueue lat -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (w)addrconf_verify_work | 0002 | 5.856 ms | 1 | 5.856 ms | 111994.634313 s | 111994.640169 s | (w)vmstat_update | 0001 | 1.247 ms | 1 | 1.247 ms | 111996.462651 s | 111996.463899 s | (w)neigh_periodic_work | 0001 | 1.183 ms | 1 | 1.183 ms | 111996.462789 s | 111996.463973 s | (w)neigh_managed_work | 0001 | 0.989 ms | 2 | 1.635 ms | 111996.462820 s | 111996.464455 s | (w)wb_workfn | 0000 | 0.667 ms | 1 | 0.667 ms | 111996.384273 s | 111996.384940 s | (w)bpf_prog_free_deferred | 0001 | 0.495 ms | 1 | 0.495 ms | 111986.314201 s | 111986.314696 s | (w)mix_interrupt_randomness | 0002 | 0.421 ms | 6 | 0.749 ms | 111995.927750 s | 111995.928499 s | (w)vmstat_shepherd | 0000 | 0.374 ms | 2 | 0.385 ms | 111991.265242 s | 111991.265627 s | (w)e1000_watchdog | 0002 | 0.356 ms | 5 | 0.390 ms | 111994.528380 s | 111994.528770 s | (w)vmstat_update | 0000 | 0.231 ms | 2 | 0.365 ms | 111996.384407 s | 111996.384772 s | (w)flush_to_ldisc | 0006 | 0.165 ms | 1 | 0.165 ms | 111995.930606 s | 111995.930771 s | (w)flush_to_ldisc | 0000 | 0.094 ms | 2 | 0.095 ms | 111996.460453 s | 111996.460548 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k workqueue rep -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)e1000_watchdog | 0002 | 0.627 ms | 2 | 0.324 ms | 112002.720665 s | 112002.720989 s | (w)flush_to_ldisc | 0007 | 0.598 ms | 2 | 0.534 ms | 112000.875226 s | 112000.875761 s | (w)wq_barrier_func | 0007 | 0.492 ms | 1 | 0.492 ms | 112000.876981 s | 112000.877473 s | (w)flush_to_ldisc | 0007 | 0.281 ms | 1 | 0.281 ms | 112005.826882 s | 112005.827163 s | (w)mix_interrupt_randomness | 0002 | 0.229 ms | 3 | 0.102 ms | 112005.825671 s | 112005.825774 s | (w)vmstat_shepherd | 0000 | 0.202 ms | 1 | 0.202 ms | 112001.504511 s | 112001.504713 s | (w)bpf_prog_free_deferred | 0001 | 0.181 ms | 1 | 0.181 ms | 112000.883251 s | 112000.883432 s | (w)wb_workfn | 0007 | 0.130 ms | 1 | 0.130 ms | 112001.505195 s | 112001.505325 s | (w)vmstat_update | 0000 | 0.053 ms | 1 | 0.053 ms | 112001.504763 s | 112001.504815 s | -------------------------------------------------------------------------------------------------------------------------------- Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-18-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
5a81927a40 |
perf kwork: Add softirq trace BPF support
Implements softirq trace bpf function. Test cases: Trace softirq latency without filter: # perf kwork -k softirq lat -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0005 | 0.281 ms | 3 | 0.338 ms | 111295.752222 s | 111295.752560 s | (s)RCU:9 | 0002 | 0.262 ms | 24 | 1.400 ms | 111301.335986 s | 111301.337386 s | (s)SCHED:7 | 0005 | 0.177 ms | 14 | 0.212 ms | 111295.752270 s | 111295.752481 s | (s)RCU:9 | 0007 | 0.161 ms | 47 | 2.022 ms | 111295.402159 s | 111295.404181 s | (s)NET_RX:3 | 0003 | 0.149 ms | 12 | 1.261 ms | 111301.192964 s | 111301.194225 s | (s)TIMER:1 | 0001 | 0.105 ms | 9 | 0.198 ms | 111301.180191 s | 111301.180389 s | ... <SNIP> ... (s)NET_RX:3 | 0002 | 0.098 ms | 6 | 0.124 ms | 111295.403760 s | 111295.403884 s | (s)SCHED:7 | 0001 | 0.093 ms | 19 | 0.242 ms | 111301.180256 s | 111301.180498 s | (s)SCHED:7 | 0007 | 0.078 ms | 15 | 0.188 ms | 111300.064226 s | 111300.064415 s | (s)SCHED:7 | 0004 | 0.077 ms | 11 | 0.213 ms | 111301.361759 s | 111301.361973 s | (s)SCHED:7 | 0000 | 0.063 ms | 33 | 0.805 ms | 111295.401811 s | 111295.402616 s | (s)SCHED:7 | 0003 | 0.063 ms | 14 | 0.085 ms | 111301.192255 s | 111301.192340 s | -------------------------------------------------------------------------------------------------------------------------------- Trace softirq latency with cpu filter: # perf kwork -k softirq lat -b -C 1 Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0001 | 0.178 ms | 5 | 0.572 ms | 111435.534135 s | 111435.534707 s | -------------------------------------------------------------------------------------------------------------------------------- Trace softirq latency with name filter: # perf kwork -k softirq lat -b -n SCHED Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)SCHED:7 | 0001 | 0.295 ms | 15 | 2.183 ms | 111452.534950 s | 111452.537133 s | (s)SCHED:7 | 0002 | 0.215 ms | 10 | 0.315 ms | 111460.000238 s | 111460.000553 s | (s)SCHED:7 | 0005 | 0.190 ms | 29 | 0.338 ms | 111457.032538 s | 111457.032876 s | (s)SCHED:7 | 0003 | 0.097 ms | 10 | 0.319 ms | 111452.434351 s | 111452.434670 s | (s)SCHED:7 | 0006 | 0.089 ms | 1 | 0.089 ms | 111450.737450 s | 111450.737539 s | (s)SCHED:7 | 0007 | 0.085 ms | 17 | 0.169 ms | 111452.471333 s | 111452.471502 s | (s)SCHED:7 | 0004 | 0.071 ms | 15 | 0.221 ms | 111452.535252 s | 111452.535473 s | (s)SCHED:7 | 0000 | 0.044 ms | 32 | 0.130 ms | 111460.001982 s | 111460.002112 s | -------------------------------------------------------------------------------------------------------------------------------- Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-17-yangjihong1@huawei.com [ Add {} for multiline if blocks ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
420298aefe |
perf kwork: Add IRQ trace BPF support
Implements irq trace bpf function. Test cases: Trace irq without filter: # perf kwork -k irq rep -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 31.026 ms | 285 | 1.493 ms | 110326.049963 s | 110326.051456 s | eth0:10 | 0002 | 7.875 ms | 96 | 1.429 ms | 110313.916835 s | 110313.918264 s | ata_piix:14 | 0002 | 2.510 ms | 28 | 0.396 ms | 110331.367987 s | 110331.368383 s | -------------------------------------------------------------------------------------------------------------------------------- Trace irq with cpu filter: # perf kwork -k irq rep -b -C 0 Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 34.288 ms | 282 | 2.061 ms | 110358.078968 s | 110358.081029 s | -------------------------------------------------------------------------------------------------------------------------------- Trace irq with name filter: # perf kwork -k irq rep -b -n eth0 Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- eth0:10 | 0002 | 2.184 ms | 21 | 0.572 ms | 110386.541699 s | 110386.542271 s | -------------------------------------------------------------------------------------------------------------------------------- Trace irq with summary: # perf kwork -k irq rep -b -S Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 42.923 ms | 285 | 1.181 ms | 110418.128867 s | 110418.130049 s | eth0:10 | 0002 | 2.085 ms | 20 | 0.668 ms | 110416.002935 s | 110416.003603 s | ata_piix:14 | 0002 | 0.970 ms | 4 | 0.656 ms | 110424.034482 s | 110424.035138 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 309 Total runtime (msec) : 45.977 (0.003% load average) Total time span (msec) : 17017.655 -------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork -k irq rep -b Starting trace, Hit <Ctrl+C> to stop and report ^C Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- nvme0q20:145 | 0019 | 0.570 ms | 28 | 0.064 ms | 26966.635102 s | 26966.635167 s | amdgpu:162 | 0002 | 0.568 ms | 29 | 0.068 ms | 26966.644346 s | 26966.644414 s | nvme0q4:129 | 0003 | 0.565 ms | 31 | 0.037 ms | 26966.614830 s | 26966.614866 s | nvme0q16:141 | 0015 | 0.205 ms | 66 | 0.012 ms | 26967.145161 s | 26967.145174 s | nvme0q29:154 | 0028 | 0.154 ms | 44 | 0.014 ms | 26967.078970 s | 26967.078984 s | nvme0q10:135 | 0009 | 0.134 ms | 43 | 0.011 ms | 26967.132093 s | 26967.132104 s | nvme0q2:127 | 0001 | 0.132 ms | 26 | 0.011 ms | 26966.883584 s | 26966.883595 s | nvme0q25:150 | 0024 | 0.127 ms | 32 | 0.014 ms | 26966.631419 s | 26966.631433 s | nvme0q14:139 | 0013 | 0.110 ms | 21 | 0.017 ms | 26966.760843 s | 26966.760861 s | nvme0q30:155 | 0029 | 0.102 ms | 30 | 0.022 ms | 26966.677171 s | 26966.677193 s | nvme0q13:138 | 0012 | 0.088 ms | 20 | 0.015 ms | 26966.738733 s | 26966.738748 s | nvme0q6:131 | 0005 | 0.087 ms | 13 | 0.020 ms | 26966.648445 s | 26966.648465 s | nvme0q28:153 | 0027 | 0.066 ms | 12 | 0.015 ms | 26966.771431 s | 26966.771447 s | nvme0q26:151 | 0025 | 0.060 ms | 13 | 0.012 ms | 26966.704266 s | 26966.704278 s | nvme0q21:146 | 0020 | 0.054 ms | 20 | 0.011 ms | 26967.322082 s | 26967.322094 s | nvme0q1:126 | 0000 | 0.046 ms | 11 | 0.013 ms | 26966.859754 s | 26966.859767 s | nvme0q17:142 | 0016 | 0.046 ms | 10 | 0.011 ms | 26967.114513 s | 26967.114524 s | xhci_hcd:74 | 0015 | 0.041 ms | 3 | 0.016 ms | 26967.086004 s | 26967.086020 s | nvme0q8:133 | 0007 | 0.039 ms | 12 | 0.008 ms | 26966.712056 s | 26966.712063 s | nvme0q32:157 | 0031 | 0.036 ms | 10 | 0.014 ms | 26966.627054 s | 26966.627068 s | nvme0q9:134 | 0008 | 0.036 ms | 11 | 0.011 ms | 26967.258452 s | 26967.258462 s | nvme0q7:132 | 0006 | 0.024 ms | 3 | 0.014 ms | 26966.767404 s | 26966.767418 s | nvme0q11:136 | 0010 | 0.023 ms | 5 | 0.006 ms | 26966.935455 s | 26966.935461 s | nvme0q31:156 | 0030 | 0.018 ms | 5 | 0.006 ms | 26966.627517 s | 26966.627524 s | nvme0q12:137 | 0011 | 0.015 ms | 2 | 0.014 ms | 26966.799588 s | 26966.799602 s | enp5s0-rx-0:164 | 0006 | 0.009 ms | 2 | 0.005 ms | 26966.742024 s | 26966.742028 s | enp5s0-rx-1:165 | 0007 | 0.006 ms | 2 | 0.004 ms | 26966.939486 s | 26966.939490 s | enp5s0-tx-0:166 | 0008 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s | enp5s0-tx-1:167 | 0009 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s | -------------------------------------------------------------------------------------------------------------------------------- #t Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-16-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
daf07d2207 |
perf kwork: Implement BPF trace
'perf record' generates perf.data, which generates extra interrupts for hard disk, amount of data to be collected increases with time. Using eBPF trace can process the data in kernel, which solves the preceding two problems. Add -b/--use-bpf option for latency and report to support tracing kwork events using eBPF: 1. Create bpf prog and attach to tracepoints, 2. Start tracing after command is entered, 3. After user hit "ctrl+c", stop tracing and report, 4. Support CPU and name filtering. This commit implements the framework code and does not add specific event support. Test cases: # perf kwork rep -h Usage: perf kwork report [<options>] -b, --use-bpf Use BPF to measure kwork runtime -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): runtime, max, count -S, --with-summary Show summary with statistics --time <str> Time span for analysis (start,stop) # perf kwork lat -h Usage: perf kwork latency [<options>] -b, --use-bpf Use BPF to measure kwork latency -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): avg, max, count --time <str> Time span for analysis (start,stop) # perf kwork lat -b Unsupported bpf trace class irq # perf kwork rep -b Unsupported bpf trace class irq Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com [ Simplify work_findnew() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
bcc8b3e88d |
perf kwork: Implement perf kwork timehist
Implements framework of perf kwork timehist, to provide an analysis of kernel work events. Test cases: # perf kwork tim Runtime start Runtime end Cpu Kwork name Runtime Delaytime (TYPE)NAME:NUM (msec) (msec) ----------------- ----------------- ------ ------------------------------ ---------- ---------- 91576.060290 91576.060344 [0000] (s)RCU:9 0.055 0.111 91576.061470 91576.061547 [0000] (s)SCHED:7 0.077 0.073 91576.062604 91576.062697 [0001] (s)RCU:9 0.094 0.409 91576.064443 91576.064517 [0002] (s)RCU:9 0.074 0.114 91576.065144 91576.065211 [0000] (s)SCHED:7 0.067 0.058 91576.066564 91576.066609 [0003] (s)RCU:9 0.045 0.110 91576.068495 91576.068559 [0000] (s)SCHED:7 0.064 0.059 91576.068900 91576.068996 [0004] (s)RCU:9 0.096 0.726 91576.069364 91576.069420 [0002] (s)RCU:9 0.056 0.082 91576.069649 91576.069701 [0004] (s)RCU:9 0.052 0.111 91576.070147 91576.070206 [0000] (s)SCHED:7 0.060 0.057 91576.073147 91576.073202 [0000] (s)SCHED:7 0.054 0.060 <SNIP> # perf kwork tim --max-stack 2 -g Runtime start Runtime end Cpu Kwork name Runtime Delaytime (TYPE)NAME:NUM (msec) (msec) ----------------- ----------------- ------ ------------------------------ ---------- ---------- 91576.060290 91576.060344 [0000] (s)RCU:9 0.055 0.111 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.061470 91576.061547 [0000] (s)SCHED:7 0.077 0.073 irq_exit_rcu <- sysvec_call_function_single 91576.062604 91576.062697 [0001] (s)RCU:9 0.094 0.409 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.064443 91576.064517 [0002] (s)RCU:9 0.074 0.114 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.065144 91576.065211 [0000] (s)SCHED:7 0.067 0.058 irq_exit_rcu <- sysvec_call_function_single 91576.066564 91576.066609 [0003] (s)RCU:9 0.045 0.110 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.068495 91576.068559 [0000] (s)SCHED:7 0.064 0.059 irq_exit_rcu <- sysvec_call_function_single 91576.068900 91576.068996 [0004] (s)RCU:9 0.096 0.726 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.069364 91576.069420 [0002] (s)RCU:9 0.056 0.082 irq_exit_rcu <- sysvec_apic_timer_interrupt 91576.069649 91576.069701 [0004] (s)RCU:9 0.052 0.111 irq_exit_rcu <- sysvec_apic_timer_interrupt <SNIP> Committer testing: # perf kwork -k workqueue timehist | head -40 Runtime start Runtime end Cpu Kwork name Runtime Delaytime (TYPE)NAME:NUM (msec) (msec) ----------------- ----------------- ------ ------------------------------ ---------- ---------- 26520.211825 26520.211832 [0019] (w)free_work 0.007 0.004 26520.212929 26520.212934 [0020] (w)free_work 0.005 0.004 26520.213226 26520.213228 [0014] (w)kfree_rcu_work 0.002 0.004 26520.214057 26520.214061 [0021] (w)free_work 0.004 0.004 26520.221239 26520.221241 [0007] (w)kfree_rcu_work 0.002 0.009 26520.223232 26520.223238 [0013] (w)psi_avgs_work 0.005 0.006 26520.230057 26520.230060 [0020] (w)free_work 0.003 0.003 26520.270428 26520.270434 [0015] (w)free_work 0.006 0.004 26520.270546 26520.270550 [0014] (w)free_work 0.004 0.003 26520.281626 26520.281629 [0015] (w)free_work 0.003 0.002 26520.287225 26520.287230 [0012] (w)psi_avgs_work 0.005 0.008 26520.287231 26520.287235 [0001] (w)psi_avgs_work 0.004 0.011 26520.287236 26520.287239 [0001] (w)psi_avgs_work 0.003 0.012 26520.329488 26520.329492 [0024] (w)free_work 0.004 0.004 26520.330600 26520.330605 [0007] (w)free_work 0.005 0.004 26520.334218 26520.334218 [0007] (w)kfree_rcu_monitor 0.001 0.002 26520.335220 26520.335221 [0005] (w)kfree_rcu_monitor 0.001 0.004 26520.343980 26520.343985 [0007] (w)free_work 0.005 0.002 26520.345093 26520.345097 [0006] (w)free_work 0.004 0.003 26520.351233 26520.351238 [0027] (w)psi_avgs_work 0.005 0.008 26520.353228 26520.353229 [0007] (w)kfree_rcu_work 0.001 0.002 26520.353229 26520.353231 [0005] (w)kfree_rcu_work 0.001 0.006 26520.382381 26520.382383 [0006] (w)free_work 0.003 0.002 26520.386547 26520.386548 [0006] (w)free_work 0.002 0.001 26520.391243 26520.391245 [0015] (w)console_callback 0.002 0.016 26520.415369 26520.415621 [0027] (w)btrfs_work_helper 0.252 26520.415351 26520.416174 [0002] (w)btrfs_work_helper 0.823 0.037 26520.415343 26520.416304 [0031] (w)btrfs_work_helper 0.961 26520.415335 26520.417078 [0001] (w)btrfs_work_helper 1.743 26520.415250 26520.417564 [0002] (w)wb_workfn 2.314 26520.424777 26520.424787 [0002] (w)btrfs_work_helper 0.010 26520.424788 26520.424798 [0002] (w)btrfs_work_helper 0.010 26520.424790 26520.424805 [0001] (w)btrfs_work_helper 0.016 0.016 26520.424801 26520.424807 [0002] (w)btrfs_work_helper 0.006 26520.424809 26520.424831 [0002] (w)btrfs_work_helper 0.022 0.030 26520.424824 26520.424835 [0027] (w)btrfs_work_helper 0.011 26520.424809 26520.424867 [0001] (w)btrfs_work_helper 0.059 0.032 # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-14-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
53e49e32ae |
perf kwork: Add workqueue latency support
Implements workqueue latency function. Test cases: # perf kwork -k workqueue lat Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (w)vmstat_update | 0001 | 5.004 ms | 1 | 5.004 ms | 44001.745646 s | 44001.750650 s | (w)vmstat_update | 0006 | 1.773 ms | 1 | 1.773 ms | 44000.830840 s | 44000.832613 s | (w)vmstat_shepherd | 0000 | 0.992 ms | 8 | 2.474 ms | 44007.717845 s | 44007.720318 s | (w)vmstat_update | 0000 | 0.974 ms | 5 | 2.624 ms | 44004.785970 s | 44004.788594 s | (w)e1000_watchdog | 0002 | 0.687 ms | 5 | 2.632 ms | 44005.009334 s | 44005.011966 s | (w)vmstat_update | 0002 | 0.307 ms | 1 | 0.307 ms | 44004.817395 s | 44004.817702 s | (w)vmstat_update | 0004 | 0.296 ms | 1 | 0.296 ms | 43997.913677 s | 43997.913973 s | (w)mix_interrupt_randomness | 0000 | 0.283 ms | 285 | 3.724 ms | 44006.790889 s | 44006.794613 s | (w)neigh_managed_work | 0001 | 0.271 ms | 1 | 0.271 ms | 43997.665542 s | 43997.665813 s | (w)vmstat_update | 0005 | 0.261 ms | 1 | 0.261 ms | 44007.820542 s | 44007.820803 s | (w)neigh_managed_work | 0004 | 0.220 ms | 1 | 0.220 ms | 44002.953287 s | 44002.953507 s | (w)neigh_periodic_work | 0004 | 0.217 ms | 1 | 0.217 ms | 43999.929718 s | 43999.929935 s | (w)mix_interrupt_randomness | 0002 | 0.199 ms | 5 | 0.310 ms | 44005.012316 s | 44005.012625 s | (w)vmstat_update | 0003 | 0.199 ms | 4 | 0.307 ms | 44005.714391 s | 44005.714699 s | (w)gc_worker | 0001 | 0.071 ms | 173 | 1.128 ms | 44002.062579 s | 44002.063707 s | -------------------------------------------------------------------------------------------------------------------------------- INFO: 0.020% skipped events (17 including 10 raise, 7 entry, 0 exit) Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-13-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
19807bba5a |
perf kwork: Add softirq latency support
Implements softirq latency function. Test cases: # perf kwork -k softirq lat Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0006 | 1.048 ms | 1 | 1.048 ms | 44000.829759 s | 44000.830807 s | (s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s | (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s | (s)SCHED:7 | 0000 | 0.158 ms | 677 | 2.639 ms | 44004.785716 s | 44004.788355 s | ... <SNIP> ... (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s | (s)TIMER:1 | 0005 | 0.128 ms | 1 | 0.128 ms | 44007.820346 s | 44007.820474 s | (s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat -C 1,2 Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s | (s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s | (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)NET_RX:3 | 0002 | 0.106 ms | 5 | 0.163 ms | 44005.012255 s | 44005.012418 s | (s)TIMER:1 | 0002 | 0.084 ms | 9 | 0.114 ms | 44005.009168 s | 44005.009282 s | (s)SCHED:7 | 0001 | 0.049 ms | 655 | 0.837 ms | 44005.707998 s | 44005.708835 s | (s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat -n RCU Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s | (s)RCU:9 | 0004 | 0.237 ms | 26 | 0.792 ms | 43997.683018 s | 43997.683810 s | (s)RCU:9 | 0007 | 0.217 ms | 140 | 1.335 ms | 43997.671080 s | 43997.672415 s | (s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s | (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat -s count,avg -n RCU Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s | (s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s | (s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)RCU:9 | 0007 | 0.217 ms | 140 | 1.335 ms | 43997.671080 s | 43997.672415 s | (s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s | (s)RCU:9 | 0004 | 0.237 ms | 26 | 0.792 ms | 43997.683018 s | 43997.683810 s | (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k softirq lat --time 43997, Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0006 | 1.048 ms | 1 | 1.048 ms | 44000.829759 s | 44000.830807 s | (s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s | (s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s | (s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s | (s)TIMER:1 | 0004 | 0.083 ms | 21 | 0.127 ms | 44004.969171 s | 44004.969298 s | ... <SNIP> ... (s)SCHED:7 | 0005 | 0.050 ms | 4 | 0.086 ms | 43997.684852 s | 43997.684938 s | (s)SCHED:7 | 0001 | 0.049 ms | 655 | 0.837 ms | 44005.707998 s | 44005.708835 s | (s)SCHED:7 | 0007 | 0.044 ms | 171 | 0.077 ms | 43997.943265 s | 43997.943342 s | (s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s | -------------------------------------------------------------------------------------------------------------------------------- Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-12-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
ad3d9f7a92 |
perf kwork: Implement perf kwork latency
Implements framework of perf kwork latency, which is used to report time properties such as delay time and frequency. Test cases: # perf kwork lat -h Usage: perf kwork latency [<options>] -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): avg, max, count --time <str> Time span for analysis (start,stop) # perf kwork lat -C 199 Requested CPU 199 too large. Consider raising MAX_NR_CPUS Invalid cpu bitmap # perf kwork lat -i perf_no_exist.data failed to open perf_no_exist.data: No such file or directory # perf kwork lat -s avg1 Error: Unknown --sort key: `avg1' Usage: perf kwork latency [<options>] -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): avg, max, count --time <str> Time span for analysis (start,stop) # perf kwork lat --time FFFF, Invalid time span # perf kwork lat Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end | -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- INFO: 36.570% skipped events (31537 including 0 raise, 31537 entry, 0 exit) Since there are no latency-enabled events, the output is empty. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-11-yangjihong1@huawei.com [ Add {} for multiline if blocks ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
8dbc3c8689 |
perf kwork: Add workqueue report support
Implements workqueue report function. Test cases: # perf kwork -k workqueue rep Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)gc_worker | 0001 | 1912.389 ms | 173 | 12.896 ms | 44002.050787 s | 44002.063683 s | (w)mix_interrupt_randomness | 0000 | 24.308 ms | 285 | 3.349 ms | 44004.784908 s | 44004.788257 s | (w)e1000_watchdog | 0002 | 5.332 ms | 5 | 2.059 ms | 44000.914366 s | 44000.916424 s | (w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s | (w)vmstat_shepherd | 0000 | 0.964 ms | 8 | 0.195 ms | 43997.986453 s | 43997.986648 s | (w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s | (w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s | (w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s | (w)mix_interrupt_randomness | 0002 | 0.114 ms | 5 | 0.037 ms | 44005.012625 s | 44005.012662 s | (w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s | (w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s | (w)neigh_periodic_work | 0004 | 0.039 ms | 1 | 0.039 ms | 43999.929935 s | 43999.929974 s | (w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s | (w)neigh_managed_work | 0001 | 0.036 ms | 1 | 0.036 ms | 43997.665813 s | 43997.665849 s | (w)neigh_managed_work | 0004 | 0.036 ms | 1 | 0.036 ms | 44002.953507 s | 44002.953543 s | (w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k workqueue rep -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)gc_worker | 0001 | 1912.389 ms | 173 | 12.896 ms | 44002.050787 s | 44002.063683 s | (w)mix_interrupt_randomness | 0000 | 24.308 ms | 285 | 3.349 ms | 44004.784908 s | 44004.788257 s | (w)e1000_watchdog | 0002 | 5.332 ms | 5 | 2.059 ms | 44000.914366 s | 44000.916424 s | (w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s | (w)vmstat_shepherd | 0000 | 0.964 ms | 8 | 0.195 ms | 43997.986453 s | 43997.986648 s | (w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s | (w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s | (w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s | (w)mix_interrupt_randomness | 0002 | 0.114 ms | 5 | 0.037 ms | 44005.012625 s | 44005.012662 s | (w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s | (w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s | (w)neigh_periodic_work | 0004 | 0.039 ms | 1 | 0.039 ms | 43999.929935 s | 43999.929974 s | (w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s | (w)neigh_managed_work | 0001 | 0.036 ms | 1 | 0.036 ms | 43997.665813 s | 43997.665849 s | (w)neigh_managed_work | 0004 | 0.036 ms | 1 | 0.036 ms | 44002.953507 s | 44002.953543 s | (w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 500 Total runtime (msec) : 1945.085 (0.192% load average) Total time span (msec) : 10155.026 -------------------------------------------------------------------------------------------------------------------------------- # perf kwork -k workqueue rep -n vmstat_update Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s | (w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s | (w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s | (w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s | (w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s | (w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s | (w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s | (w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s | -------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork -k workqueue rep -C 1 | head -20 Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (w)commit_work | 0001 | 25.896 ms | 2 | 13.200 ms | 26522.906700 s | 26522.919900 s | (w)commit_work | 0001 | 13.316 ms | 1 | 13.316 ms | 26522.573246 s | 26522.586562 s | (w)commit_work | 0001 | 13.177 ms | 1 | 13.177 ms | 26522.673406 s | 26522.686583 s | (w)commit_work | 0001 | 12.630 ms | 1 | 12.630 ms | 26522.123921 s | 26522.136551 s | (w)btrfs_work_helper | 0001 | 3.544 ms | 1 | 3.544 ms | 26529.131296 s | 26529.134840 s | (w)btrfs_work_helper | 0001 | 3.330 ms | 1 | 3.330 ms | 26529.137698 s | 26529.141028 s | (w)btrfs_work_helper | 0001 | 2.855 ms | 1 | 2.855 ms | 26529.134842 s | 26529.137697 s | (w)btrfs_work_helper | 0001 | 2.757 ms | 1 | 2.757 ms | 26529.124086 s | 26529.126843 s | (w)btrfs_work_helper | 0001 | 2.182 ms | 1 | 2.182 ms | 26529.141030 s | 26529.143212 s | (w)btrfs_work_helper | 0001 | 1.743 ms | 1 | 1.743 ms | 26520.415335 s | 26520.417078 s | (w)btrfs_work_helper | 0001 | 1.499 ms | 1 | 1.499 ms | 26529.127774 s | 26529.129272 s | (w)btrfs_work_helper | 0001 | 1.446 ms | 1 | 1.446 ms | 26529.129848 s | 26529.131294 s | (w)btrfs_work_helper | 0001 | 1.373 ms | 1 | 1.373 ms | 26523.808270 s | 26523.809643 s | (w)wb_workfn | 0001 | 1.165 ms | 2 | 0.763 ms | 26527.071056 s | 26527.071819 s | (w)btrfs_work_helper | 0001 | 0.926 ms | 1 | 0.926 ms | 26529.126846 s | 26529.127771 s | (w)btrfs_work_helper | 0001 | 0.571 ms | 1 | 0.571 ms | 26529.129275 s | 26529.129846 s | (w)wb_workfn | 0001 | 0.525 ms | 1 | 0.525 ms | 26522.975151 s | 26522.975676 s | # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-10-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
4c14819169 |
perf kwork: Add softirq report support
Implements softirq kwork report function. Test cases: # perf kwork -k softirq rep Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s | (s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s | (s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s | (s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s | ... <SNIP> ... (s)RCU:9 | 0004 | 0.830 ms | 26 | 0.058 ms | 43997.666418 s | 43997.666476 s | (s)TIMER:1 | 0001 | 0.471 ms | 4 | 0.158 ms | 44007.834694 s | 44007.834852 s | (s)RCU:9 | 0006 | 0.220 ms | 7 | 0.048 ms | 44004.833764 s | 44004.833812 s | (s)NET_RX:3 | 0002 | 0.164 ms | 5 | 0.049 ms | 44005.012418 s | 44005.012466 s | (s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s | (s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s | (s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s | -------------------------------------------------------------------------------------------------------------------------------- # # perf kwork -k softirq rep -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s | (s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s | (s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s | (s)SCHED:7 | 0000 | 63.631 ms | 680 | 2.690 ms | 44006.721976 s | 44006.724666 s | ... <SNIP> ... (s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s | (s)RCU:9 | 0006 | 0.220 ms | 7 | 0.048 ms | 44004.833764 s | 44004.833812 s | (s)NET_RX:3 | 0002 | 0.164 ms | 5 | 0.049 ms | 44005.012418 s | 44005.012466 s | (s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s | (s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s | (s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 12748 Total runtime (msec) : 661.433 (0.065% load average) Total time span (msec) : 10176.441 -------------------------------------------------------------------------------------------------------------------------------- # # perf kwork -k softirq rep -s count,max Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s | (s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s | (s)SCHED:7 | 0002 | 50.039 ms | 1731 | 0.074 ms | 44005.009447 s | 44005.009521 s | (s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s | (s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s | ... <SNIP> ... (s)RCU:9 | 0002 | 35.241 ms | 932 | 0.407 ms | 44005.009541 s | 44005.009949 s | (s)RCU:9 | 0000 | 45.710 ms | 702 | 1.144 ms | 44004.787023 s | 44004.788167 s | (s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s | (s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s | (s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s | -------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork -k softirq report -C 2 -s count,max Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- (s)SCHED:7 | 0002 | 0.980 ms | 159 | 0.024 ms | 26035.571037 s | 26035.571061 s | (s)RCU:9 | 0002 | 0.124 ms | 88 | 0.021 ms | 26035.177050 s | 26035.177071 s | (s)TIMER:1 | 0002 | 0.122 ms | 56 | 0.007 ms | 26035.468045 s | 26035.468052 s | -------------------------------------------------------------------------------------------------------------------------------- # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-9-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
94348520c6 |
perf kwork: Add irq report support
Implements irq kwork report function. Test cases: # perf kwork record -- sleep 10 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 6.134 MB perf.data ] # perf kwork report Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -C 2 Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -C 3 Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -i perf.data Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -s max,freq Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s | eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- Total count : 18289 Total runtime (msec) : 1167.686 (0.115% load average) Total time span (msec) : 10159.155 -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report --time 44005, Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- virtio0-requests:25 | 0000 | 402.173 ms | 4695 | 0.981 ms | 44007.831992 s | 44007.832973 s | eth0:10 | 0002 | 0.089 ms | 2 | 0.058 ms | 44005.012222 s | 44005.012280 s | -------------------------------------------------------------------------------------------------------------------------------- Committer testing: # perf kwork report Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- nvme0q5:130 | 0004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s | amdgpu:162 | 0002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s | nvme0q24:149 | 0023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s | nvme0q20:145 | 0019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s | nvme0q31:156 | 0030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s | nvme0q8:133 | 0007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s | nvme0q6:131 | 0005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s | nvme0q19:144 | 0018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s | nvme0q7:132 | 0006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s | nvme0q18:143 | 0017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s | nvme0q17:142 | 0016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s | enp5s0-rx-0:164 | 0006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s | enp5s0-tx-0:166 | 0008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s | -------------------------------------------------------------------------------------------------------------------------------- # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-8-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
f98919ec4f |
perf kwork: Implement 'report' subcommand
Implements framework of 'perf kwork report', which is used to report time properties such as run time and frequency: Test cases: # perf kwork Usage: perf kwork [<options>] {record|report} -D, --dump-raw-trace dump raw trace in ASCII -f, --force don't complain, do it -k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, etc) -v, --verbose be more verbose (show symbol address, etc) # perf kwork report -h Usage: perf kwork report [<options>] -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): runtime, max, count -S, --with-summary Show summary with statistics --time <str> Time span for analysis (start,stop) # perf kwork report Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -S Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end | -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- Total count : 0 Total runtime (msec) : 0.000 (0.000% load average) Total time span (msec) : 0.000 -------------------------------------------------------------------------------------------------------------------------------- # perf kwork report -C 0,100 Requested CPU 100 too large. Consider raising MAX_NR_CPUS Invalid cpu bitmap # perf kwork report -s runtime1 Error: Unknown --sort key: `runtime1' Usage: perf kwork report [<options>] -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): runtime, max, count -S, --with-summary Show summary with statistics --time <str> Time span for analysis (start,stop) # perf kwork report -i perf_no_exist.data failed to open perf_no_exist.data: No such file or directory # perf kwork report --time 00FFF, Invalid time span Since there are no report supported events, the output is empty. Briefly describe the data structure: 1. "class" indicates event type. For example, irq and softiq correspond to different types. 2. "cluster" refers to a specific event corresponding to a type. For example, RCU and TIMER in softirq correspond to different clusters, which contains three types of events: raise, entry, and exit. 3. "atom" includes time of each sample and sample of the previous phase. (For example, exit corresponds to entry, which is used for timehist.) Committer notes: - Add {} for multiline if blocks. - report_print_work() should either return that ret variable that accounts how many bytes were printed or stop accounting and be void. Do the former for now to avoid this: builtin-kwork.c:534:6: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable] int ret = 0; ^ 1 error generated. When building with: ⬢[acme@toolbox perf]$ clang --version clang version 13.0.0 (https://github.com/llvm/llvm-project e8991caea8690ec2d17b0b7e1c29bf0da6609076) Also: - if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) { + if (dst_type < KWORK_TRACE_MAX) { Several versions of clang and at least this gcc: 3 51.40 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0) builtin-kwork.c:411:16: error: comparison of unsigned enum expression >= 0 is always true [-Werror,-Wtautological-compare] if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) { As the first entry in a enum is zero. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-7-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
97179d9d08 |
perf kwork: Add workqueue kwork record support
Record workqueue events workqueue:workqueue_activate_work, workqueue:workqueue_execute_start & workqueue:workqueue_execute_end Tese cases: Record all events: # perf kwork record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.857 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:irq_handler_entry irq:irq_handler_exit irq:softirq_raise irq:softirq_entry irq:softirq_exit workqueue:workqueue_activate_work workqueue:workqueue_execute_start workqueue:workqueue_execute_end dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Record workqueue events: # perf kwork -k workqueue record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.081 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date workqueue:workqueue_activate_work workqueue:workqueue_execute_start workqueue:workqueue_execute_end dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Committer testing: # perf kwork record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 3.430 MB perf.data (24130 samples) ] # perf evlist -v irq:irq_handler_entry: type: 2, size: 128, config: 0x97, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:irq_handler_exit: type: 2, size: 128, config: 0x96, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_raise: type: 2, size: 128, config: 0x93, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_entry: type: 2, size: 128, config: 0x95, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_exit: type: 2, size: 128, config: 0x94, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 workqueue:workqueue_activate_work: type: 2, size: 128, config: 0x106, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 workqueue:workqueue_execute_start: type: 2, size: 128, config: 0x105, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 workqueue:workqueue_execute_end: type: 2, size: 128, config: 0x104, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 dummy:HG: type: 1, size: 128, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|RAW|IDENTIFIER, read_format: ID, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # perf script | grep workqueue | head swapper 0 [018] 26035.043289: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368 kworker/18:2-ev 70440 [018] 26035.043293: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work kworker/18:2-ev 70440 [018] 26035.043301: workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work swapper 0 [021] 26035.044704: workqueue:workqueue_activate_work: work struct 0xffff8b8ffef6e368 kworker/21:0-ev 4080535 [021] 26035.044709: workqueue:workqueue_execute_start: work struct 0xffff8b8ffef6e368: function free_work kworker/21:0-ev 4080535 [021] 26035.044716: workqueue:workqueue_execute_end: work struct 0xffff8b8ffef6e368: function free_work swapper 0 [018] 26035.045230: workqueue:workqueue_activate_work: work struct 0xffff8b8ffeeae368 kworker/18:2-ev 70440 [018] 26035.045232: workqueue:workqueue_execute_start: work struct 0xffff8b8ffeeae368: function free_work kworker/18:2-ev 70440 [018] 26035.045235: workqueue:workqueue_execute_end: work struct 0xffff8b8ffeeae368: function free_work swapper 0 [001] 26035.052046: workqueue:workqueue_activate_work: work struct 0xffff8b8108901590 # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-5-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
e643932190 |
perf kwork: Add softirq kwork record support
Record softirq events irq:softirq_raise, irq:softirq_entry & irq:softirq_exit. Test cases: Record all events: # perf kwork record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.897 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:irq_handler_entry irq:irq_handler_exit irq:softirq_raise irq:softirq_entry irq:softirq_exit dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Record softirq events: # perf kwork -k softirq record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.141 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:softirq_raise irq:softirq_entry irq:softirq_exit dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Committer testing: # perf kwork record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 3.078 MB perf.data (17433 samples) ] # perf evlist -v irq:irq_handler_entry: type: 2, size: 128, config: 0x97, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:irq_handler_exit: type: 2, size: 128, config: 0x96, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_raise: type: 2, size: 128, config: 0x93, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_entry: type: 2, size: 128, config: 0x95, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 irq:softirq_exit: type: 2, size: 128, config: 0x94, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, sample_id_all: 1, exclude_guest: 1 dummy:HG: type: 1, size: 128, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|RAW|IDENTIFIER, read_format: ID, inherit: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # perf script | head migration/12 73 [012] 25884.940992: irq:softirq_raise: vec=9 [action=RCU] migration/12 73 [012] 25884.940994: irq:softirq_entry: vec=9 [action=RCU] migration/12 73 [012] 25884.940995: irq:softirq_exit: vec=9 [action=RCU] swapper 0 [004] 25884.940995: irq:softirq_raise: vec=9 [action=RCU] swapper 0 [004] 25884.940998: irq:softirq_entry: vec=9 [action=RCU] swapper 0 [004] 25884.940999: irq:softirq_exit: vec=9 [action=RCU] cc1 71212 [021] 25884.941990: irq:softirq_raise: vec=9 [action=RCU] swapper 0 [004] 25884.941991: irq:softirq_raise: vec=9 [action=RCU] cc1 71212 [021] 25884.941992: irq:softirq_raise: vec=7 [action=SCHED] perf-exec 71208 [013] 25884.941992: irq:softirq_raise: vec=9 [action=RCU] # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-4-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
4f8ae962f0 |
perf kwork: Add irq kwork record support
Record interrupt events irq:irq_handler_entry & irq_handler_exit Test cases: # perf kwork record -o perf_kwork.date -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.556 MB perf_kwork.date ] # # perf evlist -i perf_kwork.date irq:irq_handler_entry irq:irq_handler_exit dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-3-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Yang Jihong
|
0f70d8e9db |
perf kwork: New tool to trace time properties of kernel work (such as softirq, and workqueue)
The 'perf kwork' tool is used to trace time properties of kernel work (such as irq, softirq, and workqueue), including runtime, latency, and timehist, using the infrastructure in the perf tools to allow tracing extra targets. This is the first commit to reuse the 'perf record' framework code to implement a simple record function, kwork is not supported currently. Test cases: # perf usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS] The most commonly used perf commands are: <SNIP> iostat Show I/O performance metrics kallsyms Searches running kernel for symbols kmem Tool to trace/measure kernel memory properties kvm Tool to trace/measure kvm guest os kwork Tool to trace/measure kernel work properties (latencies) list List all symbolic event types lock Analyze lock events mem Profile memory accesses record Run a command and record its profile into perf.data <SNIP> See 'perf help COMMAND' for more information on a specific command. # perf kwork Usage: perf kwork [<options>] {record} -D, --dump-raw-trace dump raw trace in ASCII -f, --force don't complain, do it -k, --kwork <kwork> list of kwork to profile -v, --verbose be more verbose (show symbol address, etc) # perf kwork record -- sleep 1 [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 1.787 MB perf.data ] Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220709015033.38326-2-yangjihong1@huawei.com [ Add {} for multiline if blocks ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
ade5353950 |
perf data: Add missing unistd.h header needed for pid_t
Noticed when processing 'perf kwork' that includes util/data.h without, by luck, having included unistd.h indirectly to get the pid_t typedef. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
1ab55323c5 |
perf lock: Support -t option for 'contention' subcommand
Like perf lock report, it can report lock contention stat of each task. $ perf lock contention -t contended total wait max wait avg wait pid comm 5 945.20 us 902.08 us 189.04 us 316167 EventManager_De 33 98.17 us 6.78 us 2.97 us 766063 kworker/0:1-get 7 92.47 us 61.26 us 13.21 us 316170 EventManager_De 14 76.31 us 12.87 us 5.45 us 12949 timedcall 24 76.15 us 12.27 us 3.17 us 767992 sched-pipe 15 75.62 us 11.93 us 5.04 us 15127 switchto-defaul 24 71.84 us 5.59 us 2.99 us 629168 kworker/u513:2- 17 67.41 us 7.94 us 3.96 us 13504 coroner- 1 59.56 us 59.56 us 59.56 us 316165 EventManager_De 14 56.21 us 6.89 us 4.01 us 0 swapper Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
79079f21f5 |
perf lock: Add -k and -F options to 'contention' subcommand
Like perf lock report, add -k/--key and -F/--field options to control output formatting and sorting. Note that it has slightly different default options as some fields are not available and to optimize the screen space. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
528b9cab3b |
perf lock: Add 'contention' subcommand
The 'perf lock contention' processes the lock contention events and displays the result like perf lock report. Right now, there's not much difference between the two but the lock contention specific features will come soon. $ perf lock contention contended total wait max wait avg wait type caller 238 1.41 ms 29.20 us 5.94 us spinlock update_blocked_averages+0x4c 1 902.08 us 902.08 us 902.08 us rwsem:R do_user_addr_fault+0x1dd 81 330.30 us 17.24 us 4.08 us spinlock _nohz_idle_balance+0x172 2 89.54 us 61.26 us 44.77 us spinlock do_anonymous_page+0x16d 24 78.36 us 12.27 us 3.27 us mutex pipe_read+0x56 2 71.58 us 59.56 us 35.79 us spinlock __handle_mm_fault+0x6aa 6 25.68 us 6.89 us 4.28 us spinlock do_idle+0x28d 1 18.46 us 18.46 us 18.46 us rtmutex exec_fw_cmd+0x21b 3 15.25 us 6.26 us 5.08 us spinlock tick_do_update_jiffies64+0x2c Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
f9c695a211 |
perf lock: Add lock aggregation enum
Introduce the aggr_mode variable to prepare a later code change. The default is LOCK_AGGR_ADDR which aggregates the result for the lock instances. When -t/--threads option is given, it'd be set to LOCK_AGGR_TASK. The LOCK_AGGR_CALLER is for the contention analysis and it'd aggregate the stat by comparing the callstacks. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
fb87158bab |
perf lock: Add flags field in the lock_stat
For lock contention tracepoint analysis, it needs to keep the flags. As nr_readlock and nr_trylock fields are not used for it, let's make it a union. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220725183124.368304-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
6923397cb7 |
perf test: Add test for #system_tsc_freq in metrics
The value should be non-zero on Intel while zero on everything else. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220718164312.3994191-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
1276ade6a5 |
perf tsc: Add cpuinfo fall back for arch_get_tsc_freq()
The CPUID method of arch_get_tsc_freq fails for older Intel processors, such as Skylake. Compute using /proc/cpuinfo. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220718164312.3994191-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Kan Liang
|
bc2373a58a |
perf tsc: Add arch TSC frequency information
The TSC frequency information is required for the event metrics with the literal, system_tsc_freq. For the newer Intel platform, the TSC frequency information can be retrieved from the CPUID leaf 0x15. If the TSC frequency information isn't present the /proc/cpuinfo approach is used. Refactor cpuid() for this use. Note, the previous stack pushing/popping approach was broken on x86-64 that has stack red zones that would be clobbered. Committer testing: Before: $ perf record sleep 0.0001 [ perf record: Woken up 1 times to write data ] $ perf report --header-only |& grep cpuid # cpuid : AuthenticAMD,25,33,0 $ After the patch: $ perf record sleep 0.0001 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.002 MB perf.data (8 samples) ] $ perf report --header-only |& grep cpuid # cpuid : AuthenticAMD,25,33,0 $ Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220718164312.3994191-2-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Namhyung Kim
|
9fe9b252c7 |
perf lock: Fix a copy-n-paste bug
It should be lock_text_end instead of _start.
Fixes:
|
||
Arnaldo Carvalho de Melo
|
41d0914d86 |
perf python: Ignore unused command line arguments when building with clang
Noticed after switching to python3 by default on some older fedora releases: 35 38.20 fedora:27 : FAIL clang version 5.0.2 (tags/RELEASE_502/final) clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument] clang-5.0: error: argument unused during compilation: '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1' [-Werror,-Wunused-command-line-argument] error: command 'clang' failed with exit status 1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
f077c77699 |
perf build: Avoid defining _FORTIFY_SOURCE multiple times
One in perf's CFLAGS and the other in the distro python binding scripts. So if use the usual technique of first -D_FORTIFY_SOURCE then -D it. Noticed with: opensuse tumbleweed: gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Thomas Richter
|
87abe344cd |
perf test: Fix test case 83 ('perf stat CSV output linter') on s390
Perf test case 83: perf stat CSV output linter might fail
on s390.
The reason for this is the output of the command
./perf stat -x, -A -a --no-merge true
which depends on a .config file setting. When CONFIG_SCHED_TOPOLOGY
is set, the output of above perf command is
CPU0,1.50,msec,cpu-clock,1502781,100.00,1.052,CPUs utilized
When CONFIG_SCHED_TOPOLOGY is *NOT* set the output of above perf
command is
0.95,msec,cpu-clock,949800,100.00,1.060,CPUs utilized
Fix the test case to accept both output formats.
Output before:
# perf test 83
83: perf stat CSV output linter : FAILED!
#
Output after:
# ./perf test 83
83: perf stat CSV output linter : Ok
#
Fixes:
|
||
Jason Wang
|
2c91cd88f5 |
perf cs-etm: Fix duplicated 'the' in comment
The double `the' is duplicated in the comment, remove one. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20220716044040.43123-1-wangborong@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Jason Wang
|
c69d33ebfa |
perf probe: Fix duplicated 'the' in comment
The double `the' is duplicated in the comment, remove one. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Zechuan Chen <chenzechuan1@huawei.com> Link: http://lore.kernel.org/lkml/20220716043957.42829-1-wangborong@cdjrlc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Arnaldo Carvalho de Melo
|
63a4354ae7 |
perf scripting perl: Ignore some warnings to keep building with perl headers
On gcc 12 we started seeing this: In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:2999, from util/scripting-engines/trace-event-perl.c:35: /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_is_utf8_valid_partial_char_flags': /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/handy.h:125:23: error: cast from function call of type 'STRLEN' {aka 'long unsigned int'} to non-matching type '_Bool' [-Werror=bad-function-cast] 125 | #define cBOOL(cbool) ((bool) (cbool)) | ^ /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:2363:12: note: in expansion of macro 'cBOOL' 2363 | return cBOOL(is_utf8_char_helper_(s0, e, flags)); | ^~~~~ In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7242: /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h: In function 'Perl_cop_file_avn': /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/inline.h:3489:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 3489 | const char *file = CopFILE(cop); | ^~~~~ In file included from /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/perl.h:7243: /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h: In function 'Perl_newSV_type': /usr/lib/perl5/5.36.0/x86_64-linux-thread-multi/CORE/sv_inline.h:376:5: error: enumeration value 'SVt_LAST' not handled in switch [-Werror=switch-enum] 376 | switch (type) { | ^~~~~~ So disable those warnings to keep building with perl devel headers. Noticed, among other distros, on opensuse tumbleweed: gcc version 12.1.1 20220629 [revision 7811663964aa7e31c3939b859bbfa2e16919639f] (SUSE Linux) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
ee87a0841a |
perf python: Avoid deprecation warning on distutils
Fix the following DeprecationWarning: tools/perf/util/setup.py:31: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives Note: the setuptools module may need installing, for example: $ sudo apt install python-setuptools Reviewer comments: James said: Tested it with python 2.7 and 3.8 by running "make install-python_ext PYTHON=..." Committer notes: Tested with: $ make -k BUILD_BPF_SKEL=1 PYTHON=python3 O=/tmp/build/perf -C tools/perf install-bin ; perf test python $ make -k BUILD_BPF_SKEL=1 O=/tmp/build/perf -C tools/perf install-bin ; perf test python Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220615014206.26651-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Ian Rogers
|
557cc18ee7 |
perf gtk: Only support --gtk if compiled in
If HAVE_GTK2_SUPPORT isn't defined then --gtk can't succeed, don't support it as a command line option in this case. v2. Is a rebase. Patch appears to have been missed in: https://lore.kernel.org/lkml/Ygu40djM1MqAfkcF@kernel.org/ Signed-off-by: Ian Rogers <irogers@google.com> Cc: Stephane Eranian <eranian@google.com> Cc: xaizek <xaizek@posteo.net> Link: https://lore.kernel.org/r/20220707203836.345918-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
2f1d6b41e2 |
perf intel-pt: Add documentation for tracing guest machine user space
Now it is possible to decode a host Intel PT trace including guest machine user space, add documentation for the steps needed to do it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-36-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
98759cca84 |
perf intel-pt: Use guest pid/tid etc in guest samples
When decoding with guest sideband information, for VMX non-root (NR) i.e. guest events, replace the host (hypervisor) pid/tid with guest values, and provide also the new machine_pid and vcpu values. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-35-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
61cd9135d0 |
perf intel-pt: Add machine_pid and vcpu to auxtrace_error
When decoding with guest sideband information, for VMX non-root (NR) i.e. guest errors, replace the host (hypervisor) pid/tid with guest values, and provide also the new machine_pid and vcpu values. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-34-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
71658de4dd |
perf intel-pt: Determine guest thread from guest sideband
Prior to decoding, determine what guest thread, if any, is running. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-33-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
7d1f65b504 |
perf intel-pt: Disable sync switch with guest sideband
The sync_switch facility attempts to better synchronize context switches with the Intel PT trace, however it is not designed for guest machine context switches, so disable it when guest sideband is detected. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-32-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
0bb82cf518 |
perf intel-pt: Track guest context switches
Use guest context switch events to keep track of which guest thread is running on a particular guest machine and VCPU. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-31-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
12374a1622 |
perf intel-pt: Add some more logging to intel_pt_walk_next_insn()
To aid debugging, add some more logging to intel_pt_walk_next_insn(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-30-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
7c0b20d13f |
perf intel-pt: Remove guest_machine_pid
Remove guest_machine_pid because it is not needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-29-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
f9de2f0fd3 |
perf tools: Add perf_event__is_guest()
Add a helper function to determine if an event is a guest event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-28-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
f42bbbf2e9 |
perf tools: Handle injected guest kernel mmap event
If a kernel mmap event was recorded inside a guest and injected into a host perf.data file, then it will match a host mmap_name not a guest mmap_name, see machine__set_mmap_name(). So try matching a host mmap_name in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-27-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
eef8e06eeb |
perf machine: Use realloc_array_as_needed() in machine__set_current_tid()
Prepare machine__set_current_tid() for use with guest machines that do not currently have a machine->env->nr_cpus_avail value by making use of realloc_array_as_needed(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-26-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
97406a7e4f |
perf inject: Add support for injecting guest sideband events
Inject events from a perf.data file recorded in a virtual machine into a perf.data file recorded on the host at the same time. Only side band events (e.g. mmap, comm, fork, exit etc) and build IDs are injected. Additionally, the guest kcore_dir is copied as kcore_dir__ appended to the machine PID. This is non-trivial because: o It is not possible to process 2 sessions simultaneously so instead events are first written to a temporary file. o To avoid conflict, guest sample IDs are replaced with new unused sample IDs. o Guest event's CPU is changed to be the host CPU because it is more useful for reporting and analysis. o Sample ID is mapped to machine PID which is recorded with VCPU in the id index. This is important to allow guest events to be related to the guest machine and VCPU. o Timestamps must be converted. o Events are inserted to obey finished-round ordering. The anticipated use-case is: - start recording sideband events in a guest machine - start recording an AUX area trace on the host which can trace also the guest (e.g. Intel PT) - run test case on the guest - stop recording on the host - stop recording on the guest - copy the guest perf.data file to the host - inject the guest perf.data file sideband events into the host perf.data file using perf inject - the resulting perf.data file can now be used Subsequent patches provide Intel PT support for this. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-25-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
10d3470022 |
perf tools: Add reallocarray_as_needed()
Add helper reallocarray_as_needed() to reallocate an array to a larger size and initialize the extra entries to an arbitrary value. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-24-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
a5367ecb53 |
perf tools: Automatically use guest kcore_dir if present
When registering a guest machine using machine_pid from the id index, check perf.data for a matching kcore_dir subdirectory and set the kallsyms file name accordingly. If set, use it to find the machine's kernel symbols and object code (from kcore). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-23-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
65691e9ff0 |
perf tools: Make has_kcore_dir() work also for guest kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can be stored in the perf.data output directory under the subdirectory named kcore_dir. Guest machines will have their files also under subdirectories beginning kcore_dir__ followed by the machine pid. Make has_kcore_dir() return true also if there is a guest machine kcore_dir. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-22-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
386e0d83d3 |
perf tools: Remove also guest kcore_dir with host kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can be stored in the perf.data output directory under the subdirectory named kcore_dir. Guest machines will have their files also under subdirectories beginning kcore_dir__ followed by the machine pid. Remove these also when removing kcore_dir. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-21-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
13a133b255 |
perf script python: intel-pt-events: Add machine_pid and vcpu
Add machine_pid and vcpu to the intel-pt-events.py script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
6de306b7a5 |
perf script python: Add machine_pid and vcpu
Add machine_pid and vcpu to python sample events and context switch events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-19-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
7151c1d178 |
perf auxtrace: Add machine_pid and vcpu to auxtrace_error
Add machine_pid and vcpu to struct perf_record_auxtrace_error. The existing fmt member is used to identify the new format. The new members make it possible to easily differentiate errors from guest machines. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-18-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
2273e46b98 |
perf dlfilter: Add machine_pid and vcpu
Add machine_pid and vcpu to struct perf_dlfilter_sample. The 'size' can be used to determine if the values are present, however machine_pid is zero if unused in any case. vcpu should be ignored if machine_pid is zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-17-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
e28fb159f1 |
perf script: Add machine_pid and vcpu
Add fields machine_pid and vcpu. These are displayed only if machine_pid is non-zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-16-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
6350490995 |
perf session: Use sample->machine_pid to find guest machine
If machine_pid is set, use it to find the guest machine. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-15-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
3461b65da7 |
perf tools: Add machine_pid and vcpu to perf_sample
When parsing a sample with a sample ID, copy machine_pid and vcpu from perf_sample_id to perf_sample. Note, machine_pid will be zero when unused, so only a non-zero value represents a guest machine. vcpu should be ignored if machine_pid is zero. Note also, machine_pid is used with events that have come from injecting a guest perf.data file, however guest events recorded on the host (i.e. using perf kvm) have the (QEMU) hypervisor process pid to identify them - refer machines__find_for_cpumode(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |