Commit Graph

29618 Commits

Author SHA1 Message Date
Adrian Hunter
a4455e0053 perf data: Add has_kcore_dir()
Add a helper function has_kcore_dir(), so that perf inject can determine if
it needs to keep the kcore_dir.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220520132404.25853-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:11:39 -03:00
Adrian Hunter
180b3d0626 perf inject: Keep some features sections from input file
perf inject overwrites feature sections with information from the current
machine. It makes more sense to keep original information that describes
the machine or software when perf record was run.

Example: perf.data from "Desktop" injected on "nuc11"

 Before:

  $ perf script --header-only -i perf.data-from-desktop | head -15
  # ========
  # captured on    : Thu May 19 09:55:50 2022
  # header version : 1
  # data offset    : 1208
  # data size      : 837480
  # feat offset    : 838688
  # hostname : Desktop
  # os release : 5.13.0-41-generic
  # perf version : 5.18.rc5.gac837f7ca7ed
  # arch : x86_64
  # nrcpus online : 28
  # nrcpus avail : 28
  # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
  # cpuid : GenuineIntel,6,85,4
  # total memory : 65548656 kB

  $ perf inject -i perf.data-from-desktop -o injected-perf.data

  $ perf script --header-only -i injected-perf.data | head -15
  # ========
  # captured on    : Fri May 20 15:06:55 2022
  # header version : 1
  # data offset    : 1208
  # data size      : 837480
  # feat offset    : 838688
  # hostname : nuc11
  # os release : 5.17.5-local
  # perf version : 5.18.rc5.g0f828fdeb9af
  # arch : x86_64
  # nrcpus online : 8
  # nrcpus avail : 8
  # cpudesc : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  # cpuid : GenuineIntel,6,140,1
  # total memory : 16012124 kB

 After:

  $ perf inject -i perf.data-from-desktop -o injected-perf.data

  $ perf script --header-only -i injected-perf.data | head -15
  # ========
  # captured on    : Fri May 20 15:08:54 2022
  # header version : 1
  # data offset    : 1208
  # data size      : 837480
  # feat offset    : 838688
  # hostname : Desktop
  # os release : 5.13.0-41-generic
  # perf version : 5.18.rc5.gac837f7ca7ed
  # arch : x86_64
  # nrcpus online : 28
  # nrcpus avail : 28
  # cpudesc : Intel(R) Core(TM) i9-9940X CPU @ 3.30GHz
  # cpuid : GenuineIntel,6,85,4
  # total memory : 65548656 kB

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220520132404.25853-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:11:25 -03:00
Adrian Hunter
618ee7838e libperf: Add preadn()
Add preadn() to provide pread() and readn() semantics.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/ab8918a4-7ac8-a37e-2e2c-28438c422d87@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:11:12 -03:00
Adrian Hunter
237c96b8c1 perf header: Add ability to keep feature sections
Many feature sections should not be re-written during perf inject. In
preparation to support that, add callbacks that a tool can use to copy
a feature section from elsewhere. perf inject will use this facility to
copy features sections from the input file.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220520132404.25853-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:11:01 -03:00
Ian Rogers
1634b5a1f1 perf jevents: Modify match field
The match_field function looks for json values to append to the event
string. As the C code processes these in order the output order matches
that in the json dictionary. Python json readers read the entire
dictionary and lose the ordering. To make the python and C output
comparable make the C code first read the extra fields then append them
to the event in an order not determined by their order in the file.

Modify the pmu-events test so that test expectations match the new
order.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220511211526.1021908-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:08:15 -03:00
Ian Rogers
afba2b08e1 perf vendor events: Fix Ivytown UNC_M_ACT_COUNT.RD umask
The event had two umasks with the umask of 3 being correct.
Note: this change wasn't automatically generated as there is no CSV for
Ivytown uncore events at:
https://github.com/intel/event-converter-for-linux-perf

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220511211526.1021908-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:08:06 -03:00
Ian Rogers
a583bf1878 perf vendor events: Fix Alderlake metric groups
Remove unnecessary empty groups.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220511211526.1021908-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:07:54 -03:00
Ian Rogers
fcb120d50c perf jevents: Append PMU description later
Append the PMU information from "Unit" to the description later. This
avoids a problem when "Unit" appears early in a json event and the
information prepends the description rather than being the expected
suffix.

Update the pmu-events test so that expectations now match the improved
output.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Forrington <nick.forrington@arm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220511211526.1021908-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:07:41 -03:00
Ian Rogers
2cf88f4614 perf test: Use skip in PERF_RECORD_*
Check if the error code is EACCES and make the test a skip with
a "permissions" skip reason if so.

Committer testing:

Before:

  $ perf test PERF_RECORD
    8: PERF_RECORD_* events & perf_sample fields            : FAILED!
  $

After:

  $ perf test PERF_RECORD
    8: PERF_RECORD_* events & perf_sample fields            : Skip (permissions)
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:05:07 -03:00
Ian Rogers
7741e03e80 perf test: Parse events break apart tests
Break multiple tests in the main test into individual test cases. Make
better use of skip and add reasons. Skip also for parse event permission
issues (detected by searching the error string). Rather than break out
of tests on the first failure, keep going and logging to pr_debug.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:04:59 -03:00
Ian Rogers
8252e7917e perf test: Parse events tidy evlist_test
Remove two unused variables. Make structs const. Also fix the array
index (aka id) for the event software/r0x1a/.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:04:55 -03:00
Ian Rogers
b58eca408c perf test: Parse events tidy terms_test
Remove an unused variables. Make structs const. Fix checkpatch issue wrt
unsigned not being with an int.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:04:46 -03:00
Ian Rogers
7312c36ce6 perf test: Basic mmap use skip
If opening the event fails for basic mmap with EACCES it is more
likely permission related that a true error. Mark the test as skip
in this case and add a skip reason.

Committer testing:

Before:

  $ perf test "mmap interface"
    4: Read samples using the mmap interface           : FAILED!
  $

After:

  $ perf test "mmap interface"
    4: Read samples using the mmap interface           : Skip (permissions)
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:03:39 -03:00
Ian Rogers
f9b10c82fa perf test: Use skip in openat syscall
Failures to open the tracepoint cause this test to fail, however,
typically such failures are permission related. Lower the failure to
just skipping the test in those cases and add a skip reason.

Committer testing:

Before:

  $ perf test "openat syscall"
    2: Detect openat syscall event                        : FAILED!
    3: Detect openat syscall event on all cpus            : FAILED!
  $

After:

  $ perf test "openat syscall"
    2: Detect openat syscall event                        : Skip (permissions)
    3: Detect openat syscall event on all cpus            : Skip (permissions)
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:01:57 -03:00
Ian Rogers
740f8a8241 perf test: Use skip in vmlinux kallsyms
Currently failures in reading vmlinux or kallsyms result in a test
failure. However, the failure is typically permission related. Prefer to
flag these failures as skip.

Committer testing:

Before:

  $ perf test vmlinux
    1: vmlinux symtab matches kallsyms                 : FAILED!
  $

After:

  $ perf test vmlinux
    1: vmlinux symtab matches kallsyms                 : Skip
  $

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:01:15 -03:00
Ian Rogers
cfa5013a41 perf test: Skip reason for suites with 1 test
When a suite has just 1 subtest, the subtest number is given as -1 to
avoid indented printing. When this subtest number is seen for the skip
reason, use the reason of the first test.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220518042027.836799-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 10:00:43 -03:00
Ian Rogers
0b9462d0ac perf stat: Make use of index clearer with perf_counts
Try to disambiguate further when perf_counts is being accessed it is
with a cpu map index rather than a CPU.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220519032005.1273691-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:54:02 -03:00
Ian Rogers
54668a4ea0 perf bpf_counter: Tidy use of CPU map index
BPF counters are typically running across all CPUs and so the CPU map
index and CPU number are the same. There may be cases with offline CPUs
where this isn't the case and so ensure the cpu map index for
perf_counts is going to be a valid index by explicitly iterating over
the CPU map. This also makes it clearer that users of perf_counts are
using an index. Collapse some multiple uses of perf_counts into single
uses.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220519032005.1273691-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:53:06 -03:00
Ian Rogers
e696f6dbbf perf cpumap: Add perf_cpu_map__for_each_idx()
A variant of perf_cpu_map__for_each_cpu() that just iterates index values
without the corresponding load of the CPU.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220519032005.1273691-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:52:38 -03:00
Ian Rogers
0dd9769f0c perf stat: Add stat record+report test
This would have caught:

  "Subject: Re: perf stat report segfaults"
  https://lore.kernel.org/linux-perf-users/CAP-5=fWQR=sCuiSMktvUtcbOLidEpUJLCybVF6=BRvORcDOq+g@mail.gmail.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220519032005.1273691-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:51:55 -03:00
Namhyung Kim
7c3bcbdf44 perf lock: Add -t/--thread option for report
The -t option is to show per-thread lock stat like below:

  $ perf lock report -t -F acquired,contended,avg_wait

                Name   acquired  contended   avg wait (ns)

                perf     240569          9            5784
             swapper     106610         19             543
              :15789      17370          2           14538
        ContainerMgr       8981          6             874
               sleep       5275          1           11281
     ContainerThread       4416          4             944
     RootPressureThr       3215          5            1215
         rcu_preempt       2954          0               0
        ContainerMgr       2560          0               0
             unnamed       1873          0               0
     EventManager_De       1845          1             636
     futex-default-S       1609          0               0
  ...

Committer notes:

Add that option to the 'perf lock report' man page.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220521010811.932703-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:49:35 -03:00
Namhyung Kim
79d9333b85 perf lock: Do not discard broken lock stats
Currently it discards a lock_stat for a lock instance when there's a
broken lock_seq_stat in a single task for the lock.  But it also means
that the existing (and later) valid lock stat info for that lock will
be discarded as well.

This is not ideal since we can lose many valuable info because of a
single failure.  Actually those failures are indepent to the existing
stat.  So we can only discard the broken lock_seq_stat but keep the
valid lock_stat.

The discarded lock_seq_stat will be reallocated in a subsequent event
with SEQ_STATE_UNINITIALIZED which will be ignored until it see the
start of the next sequence.  So it should be ok just free it.

Before:

  $ perf lock report -F acquired,contended,avg_wait

  Warning:
  Processed 1401603 events and lost 18 chunks!

  Check IO/CPU overload!

                  Name   acquired  contended   avg wait (ns)

         rcu_read_lock     251225          0               0
   &(ei->i_block_re...       8731          0               0
   &sb->s_type->i_l...       8731          0               0
    hrtimer_bases.lock       5261          0               0
    hrtimer_bases.lock       2626          0               0
    hrtimer_bases.lock       1953          0               0
    hrtimer_bases.lock       1382          0               0
      cpu_hotplug_lock       1350          0               0
    hrtimer_bases.lock       1273          0               0
    hrtimer_bases.lock       1269          0               0
    hrtimer_bases.lock       1198          0               0
   ...

New:
                  Name   acquired  contended   avg wait (ns)

         rcu_read_lock     251225          0               0
   tk_core.seq.seqc...      54074          0               0
          &xa->xa_lock      17470          0               0
        &ei->i_es_lock      17464          0               0
       &ei->i_raw_lock       9391          0               0
   &mapping->privat...       8734          0               0
       &ei->i_data_sem       8731          0               0
   &(ei->i_block_re...       8731          0               0
   &sb->s_type->i_l...       8731          0               0
   jiffies_seq.seqc...       6953          0               0
        &mm->mmap_lock       6889          0               0
             balancing       5768          0               0
    hrtimer_bases.lock       5261          0               0
   ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220521010811.932703-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:47:41 -03:00
Leo Yan
12aeaaba08 perf c2c: Update documentation for store metric 'N/A'
The 'N/A' metric is added for store operations, update documentation to
reflect changes in the report table.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adam Li <adamli@amperemail.onmicrosoft.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220518055729.1869566-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:36:47 -03:00
Leo Yan
550b4d6f9a perf c2c: Add dimensions for 'N/A' metrics of store operation
Since now we have the statistics 'st_na' for store operations, add
dimensions for the 'N/A' (no available memory level) metrics and the
associated percentage calculation for the single cache line view.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adam Li <adamli@amperemail.onmicrosoft.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220518055729.1869566-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:36:34 -03:00
Leo Yan
9845063710 perf mem: Add stats for store operation with no available memory level
Sometimes we don't know memory store operations happen on exactly which
memory (or cache) level, the memory level flag is set to PERF_MEM_LVL_NA
in this case; a practical example is Arm SPE AUX trace sets this flag
for all store operations due to absent info for cache level.

This patch is to add a new item "st_na" in structure c2c_stats to add
statistics for store operations with no available cache level.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adam Li <adamli@amperemail.onmicrosoft.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220518055729.1869566-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:36:12 -03:00
Ian Rogers
508c9fbce0 perf build: Error for BPF skeletons without LIBBPF
LIBBPF requires LIBELF so doing "make BUILD_BPF_SKEL=1 NO_LIBELF=1"
fails with compiler errors about missing declarations. Similar could
happen if libbpf feature detection fails.

Prefer to error when BUILD_BPF_SKEL is enabled but LIBBPF isn't.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220520211826.1828180-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:33:39 -03:00
Arnaldo Carvalho de Melo
0869331fba Merge remote-tracking branch 'torvalds/master' into perf/core
To get the rest of 5.18.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23 09:32:49 -03:00
Linus Torvalds
eaea45fc0e perf tools fixes for v5.18: 6th batch
- Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events in 'perf stat'.
 
 - Fix x86's arch__intr_reg_mask() for the hybrid platform.
 
 - Address 'perf bench numa' compiler error on s390.
 
 - Fix check for btf__load_from_kernel_by_id() in libbpf.
 
 - Fix "all PMU test" 'perf test' to skip hv_24x7/hv_gpci tests on powerpc.
 
 - Fix session topology test to skip the test in guest environment.
 
 - Skip BPF 'perf test' if clang is not present.
 
 - Avoid shell test description infinite loop in 'perf test'.
 
 - Fix Intel LBR callstack entries and nr print message.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYokzYwAKCRCyPKLppCJ+
 J9rLAQCwp6FAAbbh/Kxv9jiU51xTYRItpjNk0SDJsh+hb+vwYgD7BTvbTco5d6TJ
 n8BkHw2v3iLCUHJX23qzMKgZyOa87w4=
 =9T87
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v5.18-2022-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
   in 'perf stat'.

 - Fix x86's arch__intr_reg_mask() for the hybrid platform.

 - Address 'perf bench numa' compiler error on s390.

 - Fix check for btf__load_from_kernel_by_id() in libbpf.

 - Fix "all PMU test" 'perf test' to skip hv_24x7/hv_gpci tests on
   powerpc.

 - Fix session topology test to skip the test in guest environment.

 - Skip BPF 'perf test' if clang is not present.

 - Avoid shell test description infinite loop in 'perf test'.

 - Fix Intel LBR callstack entries and nr print message.

* tag 'perf-tools-fixes-for-v5.18-2022-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf session: Fix Intel LBR callstack entries and nr print message
  perf test bpf: Skip test if clang is not present
  perf test session topology: Fix test to skip the test in guest environment
  perf bench numa: Address compiler error on s390
  perf test: Avoid shell test description infinite loop
  perf regs x86: Fix arch__intr_reg_mask() for the hybrid platform
  perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc
  perf stat: Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
  perf build: Fix check for btf__load_from_kernel_by_id() in libbpf
2022-05-21 14:14:02 -10:00
Chengdong Li
51d0bf99b8 perf session: Fix Intel LBR callstack entries and nr print message
When generating callstack information from branch_stack(Intel LBR), the
actual number of callstack entry should be bigger than the number of
branch_stack, for example:

	branch_stack records:
		B() -> C()
		A() -> B()
	converted callstack records should be:
		C()
		B()
		A()
though, the number of callstack equals
to the number of branch stack plus 1.

This patch fixes above issue in branch_stack__printf(). For example,

	# echo 'scale=2000; 4*a(1)' > cmd
	# perf record --call-graph lbr bc -l < cmd

Before applying this patch, `perf script -D` output:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:6    # which is not consistent with LBR records.
	.....  0: 000000000040a410
	.....  1: 0000000000408650    # ditto
	.....  2: 00000000004022f2
	.....  3: 00000000004015f5
	.....  4: 00007f5ed6dcb553
	.....  5: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

After applied:

	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
	... LBR call chain: nr:8
	.....  0: fffffffffffffe00
	.....  1: 000000000040a410
	.....  2: 000000000040573c
	.....  3: 0000000000408650
	.....  4: 00000000004022f2
	.....  5: 00000000004015f5
	.....  6: 00007f5ed6dcb553
	.....  7: 0000000000401698
	... FP chain: nr:2
	.....  0: fffffffffffffe00
	.....  1: 000000000040a6d8
	... branch callstack: nr:7
	.....  0: 000000000040a410
	.....  1: 000000000040573c
	.....  2: 0000000000408650
	.....  3: 00000000004022f2
	.....  4: 00000000004015f5
	.....  5: 00007f5ed6dcb553
	.....  6: 0000000000401698
	 ... thread: bc:17990
	 ...... dso: /usr/bin/bc
	bc 17990 1220022.677386:     894172 cycles:
			  40a410 [unknown] (/usr/bin/bc)
			  40573c [unknown] (/usr/bin/bc)
			  408650 [unknown] (/usr/bin/bc)
			  4022f2 [unknown] (/usr/bin/bc)
			  4015f5 [unknown] (/usr/bin/bc)
		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
			  401698 [unknown] (/usr/bin/bc)

Change from v1:
	- refined code style according to Jiri's review comments.

Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: likexu@tencent.com
Link: https://lore.kernel.org/r/20220517015726.96131-1-chengdongli@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:56:24 -03:00
Athira Rajeev
8994e97be3 perf test bpf: Skip test if clang is not present
Perf BPF filter test fails in environment where "clang" is not
installed.

Test failure logs:

<<>>
 42: BPF filter                    :
 42.1: Basic BPF filtering         : Skip
 42.2: BPF pinning                 : FAILED!
 42.3: BPF prologue generation     : FAILED!
<<>>

Enabling verbose option provided debug logs which says clang/llvm needs
to be installed. Snippet of verbose logs:

<<>>
 42.2: BPF pinning                  :
 --- start ---
test child forked, pid 61423
ERROR:	unable to find clang.
Hint:	Try to install latest clang/llvm to support BPF.
        Check your $PATH

<<logs_here>>

Failed to compile test case: 'Basic BPF llvm compile'
Unable to get BPF object, fix kbuild first
test child finished with -1
 ---- end ----
BPF filter subtest 2: FAILED!
<<>>

Here subtests, "BPF pinning" and "BPF prologue generation" failed and
logs shows clang/llvm is needed. After installing clang, testcase
passes.

Reason on why subtest failure happens though logs has proper debug
information:

Main function __test__bpf calls test_llvm__fetch_bpf_obj by
passing 4th argument as true ( 4th arguments maps to parameter
"force" in test_llvm__fetch_bpf_obj ). But this will cause
test_llvm__fetch_bpf_obj to skip the check for clang/llvm.

Snippet of code part which checks for clang based on
parameter "force" in test_llvm__fetch_bpf_obj:

<<>>
if (!force && (!llvm_param.user_set_param &&
<<>>

Since force is set to "false", test won't get skipped and fails to
compile test case. The BPF code compilation needs clang, So pass the
fourth argument as "false" and also skip the test if reason for return
is "TEST_SKIP"

After the patch:

<<>>
 42: BPF filter                    :
 42.1: Basic BPF filtering         : Skip
 42.2: BPF pinning                 : Skip
 42.3: BPF prologue generation     : Skip
<<>>

Fixes: ba1fae431e ("perf test: Add 'perf test BPF'")
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lore.kernel.org/r/20220511115438.84032-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:54:21 -03:00
Athira Rajeev
cfd7092c31 perf test session topology: Fix test to skip the test in guest environment
The session topology test fails in powerpc pSeries platform.

Test logs:

  <<>>
  Session topology : FAILED!
  <<>>

This testcases tests cpu topology by checking the core_id and socket_id
stored in perf_env from perf session. The data from perf session is
compared with the cpu topology information from
"/sys/devices/system/cpu/cpuX/topology" like core_id,
physical_package_id.

In case of virtual environment, detail like physical_package_id is
restricted to be exposed. Hence physical_package_id is set to -1. The
testcase fails on such platforms since socket_id can't be fetched from
topology info.

Skip the testcase in powerpc if physical_package_id returns -1.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>---
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220511114959.84002-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:52:33 -03:00
Thomas Richter
f8ac1c4784 perf bench numa: Address compiler error on s390
The compilation on s390 results in this error:

  # make DEBUG=y bench/numa.o
  ...
  bench/numa.c: In function ‘__bench_numa’:
  bench/numa.c:1749:81: error: ‘%d’ directive output may be truncated
              writing between 1 and 11 bytes into a region of size between
              10 and 20 [-Werror=format-truncation=]
  1749 |        snprintf(tname, sizeof(tname), "process%d:thread%d", p, t);
                                                               ^~
  ...
  bench/numa.c:1749:64: note: directive argument in the range
                 [-2147483647, 2147483646]
  ...
  #

The maximum length of the %d replacement is 11 characters because of the
negative sign.  Therefore extend the array by two more characters.

Output after:

  # make  DEBUG=y bench/numa.o > /dev/null 2>&1; ll bench/numa.o
  -rw-r--r-- 1 root root 418320 May 19 09:11 bench/numa.o
  #

Fixes: 3aff8ba0a4 ("perf bench numa: Avoid possible truncation when using snprintf()")
Suggested-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520081158.2990006-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:19 -03:00
Ian Rogers
caaaa55477 perf test: Avoid shell test description infinite loop
for_each_shell_test() is already strict in expecting tests to be files
and executable. It is sometimes possible when it iterates over all files
that it finds one that is executable and lacks a newline character. When
this happens the loop never terminates as it doesn't check for EOF.

Add the EOF check to make this loop at least bounded by the file size.

If the description is returned as NULL then also skip the test.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220517204144.645913-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:19 -03:00
Kan Liang
01b28e4a58 perf regs x86: Fix arch__intr_reg_mask() for the hybrid platform
The X86 specific arch__intr_reg_mask() is to check whether the kernel
and hardware can collect XMM registers. But it doesn't work on some
hybrid platform.

Without the patch on ADL-N:

  $ perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
  R11 R12 R13 R14 R15

The config of the test event doesn't contain the PMU information. The
kernel may fail to initialize it on the correct hybrid PMU and return
the wrong non-supported information.

Add the PMU information into the config for the hybrid platform. The
same register set is supported among different hybrid PMUs. Checking
the first available one is good enough.

With the patch on ADL-N:

  $ perf record -I?
  available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
  R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
  XMM10 XMM11 XMM12 XMM13 XMM14 XMM15

Fixes: 6466ec14aa ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:19 -03:00
Athira Rajeev
451ed8058c perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc
"perf all PMU test" picks the input events from "perf list --raw-dump
pmu" list and runs "perf stat -e" for each of the event in the list. In
case of powerpc, the PowerVM environment supports events from hv_24x7
and hv_gpci PMU which is of example format like below:

- hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
- hv_gpci/event,partition_id=?/

The value for "?" needs to be filled in depending on system and
respective event. CPM_ADJUNCT_INST needs have core value and domain
value. hv_gpci event needs partition_id.  Similarly, there are other
events for hv_24x7 and hv_gpci having "?" in event format. Hence skip
these events on powerpc platform since values like partition_id, domain
is specific to system and event.

Fixes: 3d5ac9effc ("perf test: Workload test of all PMUs")
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20220520101236.17249-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21 14:45:06 -03:00
Linus Torvalds
6c3f5bec9b ARM:
* Correctly expose GICv3 support even if no irqchip is created
   so that userspace doesn't observe it changing pointlessly
   (fixing a regression with QEMU)
 
 * Don't issue a hypercall to set the id-mapped vectors when
   protected mode is enabled (fix for pKVM in combination with
   CPUs affected by Spectre-v3a)
 
 x86: Five oneliners, of which the most interesting two are:
 
 * a NULL pointer dereference on INVPCID executed with
   paging disabled, but only if KVM is using shadow paging
 
 * an incorrect bsearch comparison function which could truncate
   the result and apply PMU event filtering incorrectly.  This one
   comes with a selftests update too.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKH1qYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMadgf9E1u5skRjtv+RWPbfs/v3irnirY/L
 x5TaVb2yiPahNH5qgFL2xnJ9jCcCNlxxn5uKpEAi0JFrqc6uCS0Rh1TPfqEN0lLt
 5PGJD2JSKXAWVRkObS3j5iZuQX4ZvDRY53eSQv6pdcU+evjTq1H5WZ83uciqo0J5
 xilKEtUIpJ9o0ELw9BjAd3vlRlOPpveHq+48DJN7cO0L/eju9Lz9kqJQTE7WQato
 SsmpXPNIaSlk3U3yWAfOYgzyVkZQW/JiKS++TfVr5VQMppbOI6bxo3UfDAygiA9e
 9KZAWrwoXqDMNp9756Y6lfT7g8PblnXgOvTXa/cV+ypaeAuuTU/iBSLwxQ==
 =gWsP
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Correctly expose GICv3 support even if no irqchip is created so
     that userspace doesn't observe it changing pointlessly (fixing a
     regression with QEMU)

   - Don't issue a hypercall to set the id-mapped vectors when protected
     mode is enabled (fix for pKVM in combination with CPUs affected by
     Spectre-v3a)

  x86 (five oneliners, of which the most interesting two are):

   - a NULL pointer dereference on INVPCID executed with paging
     disabled, but only if KVM is using shadow paging

   - an incorrect bsearch comparison function which could truncate the
     result and apply PMU event filtering incorrectly. This one comes
     with a selftests update too"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
  KVM: x86: hyper-v: fix type of valid_bank_mask
  KVM: Free new dirty bitmap if creating a new memslot fails
  KVM: eventfd: Fix false positive RCU usage warning
  selftests: kvm/x86: Verify the pmu event filter matches the correct event
  selftests: kvm/x86: Add the helper function create_pmu_event_filter
  kvm: x86/pmu: Fix the compare function used by the pmu event filter
  KVM: arm64: Don't hypercall before EL2 init
  KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC
  KVM: x86/mmu: Update number of zapped pages even if page list is stable
2022-05-20 20:34:59 -10:00
Kan Liang
e0e14cdff3 perf parse-events: Move slots event for the hybrid platform too
The commit 94dbfd6781 ("perf parse-events: Architecture specific
leader override") introduced a feature to reorder the slots event to
fulfill the restriction of the perf metrics topdown group. But the
feature doesn't work on the hybrid machine.

  $ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1

   Performance counter stats for 'system wide':

       <not counted>      cpu_core/instructions/
       <not counted>      cpu_core/slots/
     <not supported>      cpu_core/topdown-retiring/

         1.002871801 seconds time elapsed

A hybrid platform has a different PMU name for the core PMUs, while
current perf hard code the PMU name "cpu".

Introduce a new function to check whether the system supports the perf
metrics feature. The result is cached for the future usage.

For X86, the core PMU name always has "cpu" prefix.

With the patch:

  $ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1

   Performance counter stats for 'system wide':

          76,337,010      cpu_core/slots/
          10,416,809      cpu_core/instructions/
          11,692,372      cpu_core/topdown-retiring/

         1.002805453 seconds time elapsed

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-5-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:13:37 -03:00
Kan Liang
e7d1374ed5 perf parse-events: Support different format of the topdown event name
The evsel->name may have a different format for a topdown event, a pure
topdown name (e.g., topdown-fe-bound), or a PMU name + a topdown name
(e.g., cpu/topdown-fe-bound/). The cpu/topdown-fe-bound/ kind format
isn't supported by the arch_evlist__leader(). This format is a very
common format for a hybrid platform, which requires specifying the PMU
name for each event.

Without the patch,

  $ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1

   Performance counter stats for 'system wide':

       <not counted>      instructions
       <not counted>      slots
     <not supported>      cpu/topdown-fe-bound/

         1.003482041 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
          echo 0 > /proc/sys/kernel/nmi_watchdog
          perf stat ...
          echo 1 > /proc/sys/kernel/nmi_watchdog
  The events in group usually have to be from the same PMU. Try reorganizing the group.

With the patch,

  $ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1

  Performance counter stats for 'system wide':

         157,383,996      slots
          25,011,711      instructions
          27,441,686      cpu/topdown-fe-bound/

         1.003530890 seconds time elapsed

Fixes: bc355822f0 ("perf parse-events: Move slots only with topdown")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-4-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:12:57 -03:00
Kan Liang
e8f4f794d7 perf stat: Always keep perf metrics topdown events in a group
If any member in a group has a different cpu mask than the other
members, the current perf stat disables group. when the perf metrics
topdown events are part of the group, the below <not supported> error
will be triggered.

  $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1
  WARNING: grouped events cpus do not match, disabling group:
    anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ }

   Performance counter stats for 'system wide':

         141,465,174      slots
     <not supported>      topdown-retiring
       1,605,330,334      uncore_imc_free_running_0/dclk/

The perf metrics topdown events must always be grouped with a slots
event as leader.

Factor out evsel__remove_from_group() to only remove the regular events
from the group.

Remove evsel__must_be_in_group(), since no one use it anymore.

With the patch, the topdown events aren't broken from the group for the
splitting.

  $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1
  WARNING: grouped events cpus do not match, disabling group:
    anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ }

   Performance counter stats for 'system wide':

         346,110,588      slots
         124,608,256      topdown-retiring
       1,606,869,976      uncore_imc_free_running_0/dclk/

         1.003877592 seconds time elapsed

Fixes: a9a1790247 ("perf stat: Ensure group is defined on top of the same cpu mask")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:11:13 -03:00
Kan Liang
39d5f412da perf evsel: Fixes topdown events in a weak group for the hybrid platform
The patch ("perf evlist: Keep topdown counters in weak group") fixes the
perf metrics topdown event issue when the topdown events are in a weak
group on a non-hybrid platform. However, it doesn't work for the hybrid
platform.

  $./perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
  cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
  cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
  cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
  cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
  cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
  cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

  Performance counter stats for 'system wide':

       751,765,068      cpu_core/slots/                        (84.07%)
   <not supported>      cpu_core/topdown-bad-spec/
   <not supported>      cpu_core/topdown-be-bound/
   <not supported>      cpu_core/topdown-fe-bound/
   <not supported>      cpu_core/topdown-retiring/
        12,398,197      cpu_core/branch-instructions/          (84.07%)
         1,054,218      cpu_core/branch-misses/                (84.24%)
       539,764,637      cpu_core/bus-cycles/                   (84.64%)
            14,683      cpu_core/cache-misses/                 (84.87%)
         7,277,809      cpu_core/cache-references/             (77.30%)
       222,299,439      cpu_core/cpu-cycles/                   (77.28%)
        63,661,714      cpu_core/instructions/                 (84.85%)
                 0      cpu_core/mem-loads/                    (77.29%)
        12,271,725      cpu_core/mem-stores/                   (77.30%)
       542,241,102      cpu_core/ref-cycles/                   (84.85%)
             8,854      cpu_core/cache-misses/                 (76.71%)
         7,179,013      cpu_core/cache-references/             (76.31%)

         1.003245250 seconds time elapsed

A hybrid platform has a different PMU name for the core PMUs, while
the current perf hard code the PMU name "cpu".

The evsel->pmu_name can be used to replace the "cpu" to fix the issue.
For a hybrid platform, the pmu_name must be non-NULL. Because there are
at least two core PMUs. The PMU has to be specified.
For a non-hybrid platform, the pmu_name may be NULL. Because there is
only one core PMU, "cpu". For a NULL pmu_name, we can safely assume that
it is a "cpu" PMU.

In case other PMUs also define the "slots" event, checking the PMU type
as well.

With the patch,

  $ perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
  cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
  cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
  cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
  cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
  cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
  cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1

  Performance counter stats for 'system wide':

     766,620,266   cpu_core/slots/                                        (84.06%)
      73,172,129   cpu_core/topdown-bad-spec/ #    9.5% bad speculation   (84.06%)
     193,443,341   cpu_core/topdown-be-bound/ #    25.0% backend bound    (84.06%)
     403,940,929   cpu_core/topdown-fe-bound/ #    52.3% frontend bound   (84.06%)
     102,070,237   cpu_core/topdown-retiring/ #    13.2% retiring         (84.06%)
      12,364,429   cpu_core/branch-instructions/                          (84.03%)
       1,080,124   cpu_core/branch-misses/                                (84.24%)
     564,120,383   cpu_core/bus-cycles/                                   (84.65%)
          36,979   cpu_core/cache-misses/                                 (84.86%)
       7,298,094   cpu_core/cache-references/                             (77.30%)
     227,174,372   cpu_core/cpu-cycles/                                   (77.31%)
      63,886,523   cpu_core/instructions/                                 (84.87%)
               0   cpu_core/mem-loads/                                    (77.31%)
      12,208,782   cpu_core/mem-stores/                                   (77.31%)
     566,409,738   cpu_core/ref-cycles/                                   (84.87%)
          23,118   cpu_core/cache-misses/                                 (76.71%)
       7,212,602   cpu_core/cache-references/                             (76.29%)

       1.003228667 seconds time elapsed

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 11:09:41 -03:00
Ian Rogers
92d579ea32 perf stat: Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
Stat events can come from disk and so need a degree of validation. They
contain a CPU which needs looking up via CPU map to access a counter.

Add the CPU to index translation, alongside validity checking.

Discussion thread:

  https://lore.kernel.org/linux-perf-users/CAP-5=fWQR=sCuiSMktvUtcbOLidEpUJLCybVF6=BRvORcDOq+g@mail.gmail.com/

Fixes: 7ac0089d13 ("perf evsel: Pass cpu not cpu map index to synthesize")
Reported-by: Michael Petlan <mpetlan@redhat.com>
Suggested-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20220519032005.1273691-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 10:54:06 -03:00
Arnaldo Carvalho de Melo
0ae065a5d2 perf build: Fix check for btf__load_from_kernel_by_id() in libbpf
Avi Kivity reported a problem where the __weak
btf__load_from_kernel_by_id() in tools/perf/util/bpf-event.c was being
used and it called btf__get_from_id() in tools/lib/bpf/btf.c that in
turn called back to btf__load_from_kernel_by_id(), resulting in an
endless loop.

Fix this by adding a feature test to check if
btf__load_from_kernel_by_id() is available when building perf with
LIBBPF_DYNAMIC=1, and if not then provide the fallback to the old
btf__get_from_id(), that doesn't call back to btf__load_from_kernel_by_id()
since at that time it didn't exist at all.

Tested on Fedora 35 where we have libbpf-devel 0.4.0 with LIBBPF_DYNAMIC
where we don't have btf__load_from_kernel_by_id() and thus its feature
test fail, not defining HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID:

  $ cat /tmp/build/perf-urgent/feature/test-libbpf-btf__load_from_kernel_by_id.make.output
  test-libbpf-btf__load_from_kernel_by_id.c: In function ‘main’:
  test-libbpf-btf__load_from_kernel_by_id.c:6:16: error: implicit declaration of function ‘btf__load_from_kernel_by_id’ [-Werror=implicit-function-declaration]
      6 |         return btf__load_from_kernel_by_id(20151128, NULL);
        |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  cc1: all warnings being treated as errors
  $

  $ nm /tmp/build/perf-urgent/perf | grep btf__load_from_kernel_by_id
  00000000005ba180 T btf__load_from_kernel_by_id
  $

  $ objdump --disassemble=btf__load_from_kernel_by_id -S /tmp/build/perf-urgent/perf

  /tmp/build/perf-urgent/perf:     file format elf64-x86-64
  <SNIP>
  00000000005ba180 <btf__load_from_kernel_by_id>:
  #include "record.h"
  #include "util/synthetic-events.h"

  #ifndef HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID
  struct btf *btf__load_from_kernel_by_id(__u32 id)
  {
    5ba180:	55                   	push   %rbp
    5ba181:	48 89 e5             	mov    %rsp,%rbp
    5ba184:	48 83 ec 10          	sub    $0x10,%rsp
    5ba188:	64 48 8b 04 25 28 00 	mov    %fs:0x28,%rax
    5ba18f:	00 00
    5ba191:	48 89 45 f8          	mov    %rax,-0x8(%rbp)
    5ba195:	31 c0                	xor    %eax,%eax
         struct btf *btf;
  #pragma GCC diagnostic push
  #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
         int err = btf__get_from_id(id, &btf);
    5ba197:	48 8d 75 f0          	lea    -0x10(%rbp),%rsi
    5ba19b:	e8 a0 57 e5 ff       	call   40f940 <btf__get_from_id@plt>
    5ba1a0:	89 c2                	mov    %eax,%edx
  #pragma GCC diagnostic pop

         return err ? ERR_PTR(err) : btf;
    5ba1a2:	48 98                	cltq
    5ba1a4:	85 d2                	test   %edx,%edx
    5ba1a6:	48 0f 44 45 f0       	cmove  -0x10(%rbp),%rax
  }
  <SNIP>

Fixes: 218e7b775d ("perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions")
Reported-by: Avi Kivity <avi@scylladb.com>
Link: https://lore.kernel.org/linux-perf-users/f0add43b-3de5-20c5-22c4-70aff4af959f@scylladb.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/linux-perf-users/YobjjFOblY4Xvwo7@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20 09:45:41 -03:00
Aaron Lewis
c41ef29cc1 selftests: kvm/x86: Verify the pmu event filter matches the correct event
Add a test to demonstrate that when the guest programs an event select
it is matched correctly in the pmu event filter and not inadvertently
filtered.  This could happen on AMD if the high nybble[1] in the event
select gets truncated away only leaving the bottom byte[2] left for
matching.

This is a contrived example used for the convenience of demonstrating
this issue, however, this can be applied to event selects 0x28A (OC
Mode Switch) and 0x08A (L1 BTB Correction), where 0x08A could end up
being denied when the event select was only set up to deny 0x28A.

[1] bits 35:32 in the event select register and bits 11:8 in the event
    select.
[2] bits 7:0 in the event select register and bits 7:0 in the event
    select.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220517051238.2566934-3-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-20 07:06:55 -04:00
Aaron Lewis
04baa2233d selftests: kvm/x86: Add the helper function create_pmu_event_filter
Add a helper function that creates a pmu event filter given an event
list.  Currently, a pmu event filter can only be created with the same
hard coded event list.  Add a way to create one given a different event
list.

Also, rename make_pmu_event_filter to alloc_pmu_event_filter to clarify
it's purpose given the introduction of create_pmu_event_filter.

No functional changes intended.

Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Message-Id: <20220517051238.2566934-2-aaronlewis@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-05-20 07:06:55 -04:00
Linus Torvalds
d904c8cc03 Networking fixes for 5.18-rc8, including fixes from can, xfrm and
netfilter subtrees.
 
 Notably this reverts a recent TCP/DCCP netns-related change
 to address a possible UaF.
 
 Current release - regressions:
   - tcp: revert "tcp/dccp: get rid of inet_twsk_purge()"
 
   - xfrm: set dst dev to blackhole_netdev instead of loopback_dev in ifdown
 
 Previous releases - regressions:
   - netfilter: flowtable: fix TCP flow teardown
 
   - can: revert "can: m_can: pci: use custom bit timings for Elkhart Lake"
 
   - xfrm: check encryption module availability consistency
 
   - eth: vmxnet3: fix possible use-after-free bugs in vmxnet3_rq_alloc_rx_buf()
 
   - eth: mlx5: initialize flow steering during driver probe
 
   - eth: ice: fix crash when writing timestamp on RX rings
 
 Previous releases - always broken:
   - mptcp: fix checksum byte order
 
   - eth: lan966x: fix assignment of the MAC address
 
   - eth: mlx5: remove HW-GRO from reported features
 
   - eth: ftgmac100: disable hardware checksum on AST2600
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmKGAYYSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkrt8P/2GyYNQT7q0h3Plsxc/m1tIUCPiERROE
 zIU0R2QVc64xpkMISeVb3YYpa3eqhtQsNWgt7Xsr1NRXBmyx60dvGpS81w8Gnxuo
 ruA7SxnH6OA0usviiYPmeGP9emvCEkO5YRW5kxl1Cpum19yNxjfZKJ6ARk0IDp/D
 C1S91PYtF9s25Yytrlpv9lVVBvTHQxg2EQocZHxO+7/j2O8jJP/NAYltpVaRNC2W
 gLcOWTAujrjAfpdsBhJsWXv4dTCQOAgnIXYP9P1JdFMAZtkXoYQUjaXP7dsaAXHw
 iE9FBRkqDKVhj94CxR6VPOSo0kVvOuBfkc1eJeZ74lvahkHBq4EyiVCo6/JhNQTd
 /bi/mTeUlI9yYyu/j9lMDy4CwOuiB69Dl4vNR/G5C1rF7l1vQkZr50pnD96MePwu
 9fR5+ipZsDhj5c77OMiraqnnOyWXVtD2YCZCCw80a9/aWG4zxcIDtnNQIfqAACvx
 0wNgG2bPSKRablytep1Qs84Vvupaa1cC2eTBbA+6LzQqk3CR9/YMUSD6MXitxQyD
 RJYbm5QMqdW2QH8zE21E+8wzIPeN9m66lJFppuntuB+I/CHWAnP/CmdbWysR3FQ+
 5ZisPh4PUqb1VIzGKUbym/D9FB20Vc8zq6oQa8LqiIOODUrxQMg3F2O43OWsYsn3
 TDNCwo5BQ/Z8
 =C848
 -----END PGP SIGNATURE-----

Merge tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can, xfrm and netfilter subtrees.

  Notably this reverts a recent TCP/DCCP netns-related change to address
  a possible UaF.

  Current release - regressions:

   - tcp: revert "tcp/dccp: get rid of inet_twsk_purge()"

   - xfrm: set dst dev to blackhole_netdev instead of loopback_dev in
     ifdown

  Previous releases - regressions:

   - netfilter: flowtable: fix TCP flow teardown

   - can: revert "can: m_can: pci: use custom bit timings for Elkhart
     Lake"

   - xfrm: check encryption module availability consistency

   - eth: vmxnet3: fix possible use-after-free bugs in
     vmxnet3_rq_alloc_rx_buf()

   - eth: mlx5: initialize flow steering during driver probe

   - eth: ice: fix crash when writing timestamp on RX rings

  Previous releases - always broken:

   - mptcp: fix checksum byte order

   - eth: lan966x: fix assignment of the MAC address

   - eth: mlx5: remove HW-GRO from reported features

   - eth: ftgmac100: disable hardware checksum on AST2600"

* tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.
  ptp: ocp: change sysfs attr group handling
  selftests: forwarding: fix missing backslash
  netfilter: nf_tables: disable expression reduction infra
  netfilter: flowtable: move dst_check to packet path
  netfilter: flowtable: fix TCP flow teardown
  net: ftgmac100: Disable hardware checksum on AST2600
  igb: skip phy status check where unavailable
  nfc: pn533: Fix buggy cleanup order
  mptcp: Do TCP fallback on early DSS checksum failure
  mptcp: fix checksum byte order
  net: af_key: check encryption module availability consistency
  net: af_key: add check for pfkey_broadcast in function pfkey_process
  net/mlx5: Drain fw_reset when removing device
  net/mlx5e: CT: Fix setting flow_source for smfs ct tuples
  net/mlx5e: CT: Fix support for GRE tuples
  net/mlx5e: Remove HW-GRO from reported features
  net/mlx5e: Properly block HW GRO when XDP is enabled
  net/mlx5e: Properly block LRO when XDP is enabled
  net/mlx5e: Block rx-gro-hw feature in switchdev mode
  ...
2022-05-19 05:50:29 -10:00
Joachim Wiberg
090f9dd092 selftests: forwarding: fix missing backslash
Fix missing backslash, introduced in f62c5acc80.  Causes all tests to
not be installed.

Fixes: f62c5acc80 ("selftests/net/forwarding: add missing tests to Makefile")
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Acked-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20220518151630.2747773-1-troglobit@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-18 20:09:47 -07:00
Ian Rogers
6a973e2919 perf test: Add basic stat and topdown group test
Add a basic stat test.

Add two tests of grouping behavior for topdown events. Topdown events
are special as they must be grouped with the slots event first.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220517052724.283874-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17 12:02:30 -03:00
Ian Rogers
d98079c05b perf evlist: Keep topdown counters in weak group
On Intel Icelake, topdown events must always be grouped with a slots
event as leader. When a metric is parsed a weak group is formed and
retried if perf_event_open fails. The retried events aren't grouped
breaking the slots leader requirement. This change modifies the weak
group "reset" behavior so that topdown events aren't broken from the
group for the retry.

  $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

   Performance counter stats for 'system wide':

    47,867,188,483      slots                                                         (92.27%)
   <not supported>      topdown-bad-spec
   <not supported>      topdown-be-bound
   <not supported>      topdown-fe-bound
   <not supported>      topdown-retiring
     2,173,346,937      branch-instructions                                           (92.27%)
        10,540,253      branch-misses             #    0.48% of all branches          (92.29%)
        96,291,140      bus-cycles                                                    (92.29%)
         6,214,202      cache-misses              #   20.120 % of all cache refs      (92.29%)
        30,886,082      cache-references                                              (76.91%)
    11,773,726,641      cpu-cycles                                                    (84.62%)
    11,807,585,307      instructions              #    1.00  insn per cycle           (92.31%)
                 0      mem-loads                                                     (92.32%)
     2,212,928,573      mem-stores                                                    (84.69%)
    10,024,403,118      ref-cycles                                                    (92.35%)
        16,232,978      baclears.any                                                  (92.35%)
        23,832,633      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       0.981070734 seconds time elapsed

After:

  $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1

   Performance counter stats for 'system wide':

       31040189283      slots                                                         (92.27%)
        8997514811      topdown-bad-spec          #     28.2% bad speculation         (92.27%)
       10997536028      topdown-be-bound          #     34.5% backend bound           (92.27%)
        4778060526      topdown-fe-bound          #     15.0% frontend bound          (92.27%)
        7086628768      topdown-retiring          #     22.2% retiring                (92.27%)
        1417611942      branch-instructions                                           (92.26%)
           5285529      branch-misses             #    0.37% of all branches          (92.28%)
          62922469      bus-cycles                                                    (92.29%)
           1440708      cache-misses              #    8.292 % of all cache refs      (92.30%)
          17374098      cache-references                                              (76.94%)
        8040889520      cpu-cycles                                                    (84.63%)
        7709992319      instructions              #    0.96  insn per cycle           (92.32%)
                 0      mem-loads                                                     (92.32%)
        1515669558      mem-stores                                                    (84.68%)
        6542411177      ref-cycles                                                    (92.35%)
           4154149      baclears.any                                                  (92.35%)
          20556152      ARITH.DIVIDER_ACTIVE                                          (84.59%)

       1.010799593 seconds time elapsed

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220517052724.283874-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17 12:01:18 -03:00
Adrian Hunter
75659c6fb5 perf scripts python: intel-pt-events.py: Print ptwrite value as a string if it is ASCII
It can be convenient to put a string value into a ptwrite payload as
a quick and easy way to identify what is being printed.

To make that useful, if the Intel ptwrite payload value contains only
printable ASCII characters padded with NULLs, then print it also as a
string.

Using the example program from the "Emulated PTWRITE" section of
tools/perf/Documentation/perf-intel-pt.txt:

 $ echo -n "Hello" | od -t x8
 0000000 0000006f6c6c6548
 0000005
 $ perf record -e intel_pt//u ./eg_ptw 0x0000006f6c6c6548
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.016 MB perf.data ]
 $ perf script --itrace=ew intel-pt-events.py
 Intel PT Branch Trace, Power Events, Event Trace and PTWRITE
      Switch In   38524/38524 [001]     24166.044995916     0/0
           eg_ptw 38524/38524 [001]     24166.045380004   ptwrite  jmp                   IP: 0 payload: 0x6f6c6c6548 Hello     56532c7ce196 perf_emulate_ptwrite+0x16 (/home/ahunter/git/work/eg_ptw)
 End

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220509152400.376613-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17 11:56:15 -03:00
Adrian Hunter
a5014310f7 perf script: Print Intel ptwrite value as a string if it is ASCII
It can be convenient to put a string value into a ptwrite payload as
a quick and easy way to identify what is being printed.

To make that useful, if the Intel ptwrite payload value contains only
printable ASCII characters padded with NULLs, then print it also as a
string.

Using the example program from the "Emulated PTWRITE" section of
tools/perf/Documentation/perf-intel-pt.txt:

 $ echo -n "Hello" | od -t x8
 0000000 0000006f6c6c6548
 0000005
 $ perf record -e intel_pt//u ./eg_ptw 0x0000006f6c6c6548
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.016 MB perf.data ]
 $ perf script --itrace=ew
           eg_ptw 35563 [005] 18256.087338:     ptwrite:  IP: 0 payload: 0x6f6c6c6548 Hello      55e764db5196 perf_emulate_ptwrite+0x16 (/home/user/eg_ptw)
 $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220509152400.376613-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17 11:56:02 -03:00