Commit Graph

1154771 Commits

Author SHA1 Message Date
Tiezhu Yang
1bad502775 tools x86: Keep list sorted by number in unistd_{32,64}.h
It is better to keep the list sorted by number in unistd_{32,64}.h,
so that more syscall numbers can be added at the proper position.

This is preparation for a later patch; no functional change.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1668052208-14047-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Diederik de Haas
a912f5975f perf test: Replace legacy `...` with $(...)
As detailed in https://www.shellcheck.net/wiki/SC2006:

The use of `...` is legacy syntax with several issues:
1. It has a series of undefined behaviors related to quoting in POSIX.
2. It imposes a custom escaping mode with surprising results.
3. It's exceptionally hard to nest.

$(...) command substitution has none of these problems,
and is therefore strongly encouraged.

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230201214945.127474-3-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Diederik de Haas
5b420cf003 perf test: Replace 'grep | wc -l' with 'grep -c'
To count the number of results from grep, use the '-c' parameter
instead of piping it to 'wc'.

See also https://www.shellcheck.net/wiki/SC2126

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230201214945.127474-2-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Namhyung Kim
dd15480a3d perf stat: Hide invalid uncore event output for aggr mode
The current display code for perf stat iterates over the given cpus and
builds the aggr map to collect the event data for the aggregation mode.

But uncore events have their own cpu maps, and there is no guarantee that
they match the aggr map.  For example, per-package uncore events
would generate a single value for each socket.  When the user asks for
per-core aggregation mode, the output would contain 0 values for other cores.

Thus it needs to check the uncore PMU's cpumask and whether it matches the
current aggregation id.

Before:
  $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0              1               3.73 Joules power/energy-pkg/
  S0-D0-C1              0      <not counted> Joules power/energy-pkg/
  S0-D0-C2              0      <not counted> Joules power/energy-pkg/
  S0-D0-C3              0      <not counted> Joules power/energy-pkg/

         1.001404046 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
  	echo 0 > /proc/sys/kernel/nmi_watchdog
  	perf stat ...
  	echo 1 > /proc/sys/kernel/nmi_watchdog

Cores 1, 2 and 3 should not be printed because the event is handled
only on a cpu in core 0.  With this change, the output becomes like
below.

After:
  $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0              1               2.09 Joules power/energy-pkg/
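
The check described above could be sketched roughly as follows. This is
only an illustration, not the actual perf code: struct aggr_id,
cpu_to_core_id() and pmu_counts_for_aggr_id() are hypothetical names, and
the topology (one socket, two cpus per core) is an assumption made for the
example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified aggregation id: one socket/core pair. */
struct aggr_id {
	int socket;
	int core;
};

/* Assumed topology for illustration only: one socket, two cpus per core. */
static struct aggr_id cpu_to_core_id(int cpu)
{
	struct aggr_id id = { .socket = 0, .core = cpu / 2 };

	return id;
}

/*
 * Return true if at least one cpu in the PMU's cpumask belongs to the
 * aggregation id being printed. Ids for which this returns false would
 * be skipped instead of being printed with a 0 value.
 */
static bool pmu_counts_for_aggr_id(const int *pmu_cpus, size_t nr_cpus,
				   struct aggr_id id)
{
	for (size_t i = 0; i < nr_cpus; i++) {
		struct aggr_id cur = cpu_to_core_id(pmu_cpus[i]);

		if (cur.socket == id.socket && cur.core == id.core)
			return true;
	}
	return false;
}
```

With a per-package cpumask that contains only cpu 0, the helper accepts
core 0 and rejects the other cores, which corresponds to the single
S0-D0-C0 line in the output above.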

Fixes: b897613510 ("perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230125192431.2929677-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Namhyung Kim
7b204399ae perf lock contention: Add -S/--callstack-filter option
The -S/--callstack-filter option limits the displayed entries to those
having the given string in the callstack (not only in the caller shown in
the output).

The following example shows lock contention results if the callstack
has the 'net' substring somewhere.  Note that the caller '__dev_queue_xmit'
does not match it, but the callstack contains 'inet6_csk_xmit'.

This applies even if you don't use the -v option to show the full callstack.

  $ sudo ./perf lock con -abv -S net sleep 1
  ...
   contended   total wait     max wait     avg wait         type   caller

           5     70.20 us     16.13 us     14.04 us     spinlock   __dev_queue_xmit+0xb6d
                          0xffffffffa5dd1c60  _raw_spin_lock+0x30
                          0xffffffffa5b8f6ed  __dev_queue_xmit+0xb6d
                          0xffffffffa5cd8267  ip6_finish_output2+0x2c7
                          0xffffffffa5cdac14  ip6_finish_output+0x1d4
                          0xffffffffa5cdb477  ip6_xmit+0x457
                          0xffffffffa5d1fd17  inet6_csk_xmit+0xd7
                          0xffffffffa5c5f4aa  __tcp_transmit_skb+0x54a
                          0xffffffffa5c6467d  tcp_keepalive_timer+0x2fd
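
The matching rule can be sketched as a substring search over every symbol
in the callstack. This is a minimal illustration under the assumption that
the callstack is already resolved to symbol names; callstack_matches() is
a hypothetical helper, not the function used in perf.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if any resolved symbol name in the callstack contains the
 * filter string. Entries without a resolved name (NULL) are skipped.
 */
static bool callstack_matches(const char *const *stack, size_t depth,
			      const char *filter)
{
	for (size_t i = 0; i < depth; i++) {
		if (stack[i] && strstr(stack[i], filter))
			return true;
	}
	return false;
}
```

For the callstack shown above, the filter 'net' matches through
'inet6_csk_xmit' even though the caller '__dev_queue_xmit' does not
contain it.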

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230126000936.3017683-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Namhyung Kim
3fd7a168bf perf script: Add 'cgroup' field for output
There's no field for the cgroup, so let's add one.  To use it, users need to
specify the --all-cgroups option for perf record to capture the cgroup info.

  $ perf record --all-cgroups -- true

  $ perf script -F comm,pid,cgroup
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...
            true 337112  /user.slice/user-657345.slice/user@657345.service/...

If the data was recorded without --all-cgroups, perf script complains.

  $ perf script -F comm,pid,cgroup
  Samples for 'cycles:u' event do not have CGROUP attribute set. Cannot print 'cgroup' field.
  Hint: run 'perf record --all-cgroups ...'

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230126213610.3381147-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Ross Zwisler
1df49ef9ee perf tools docs: Use canonical ftrace path
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.

But, from Documentation/trace/ftrace.rst:

  Before 4.1, all ftrace tracing control files were within the debugfs
  file system, which is typically located at /sys/kernel/debug/tracing.
  For backward compatibility, when mounting the debugfs file system,
  the tracefs file system will be automatically mounted at:

  /sys/kernel/debug/tracing

A few spots in the perf docs still refer to this older debugfs path, so
let's update them to avoid confusion.

Signed-off-by: Ross Zwisler <zwisler@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: linux-trace-kernel@vger.kernel.org
Link: http://lore.kernel.org/lkml/20230130181915.1113313-5-zwisler@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Rob Herring
2889959489 perf arm-spe: Only warn once for each unsupported address packet
Unknown address packet indexes are not an error: the Arm architecture
can define new ones (and has done so with SPEv1.2), and
implementation-defined ones are also allowed. Printing an error message
for every occurrence of the packet is also needlessly noisy. Change the
message to print just once for each unknown index.
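
One way to warn once per index is to keep a bitmap of the indexes already
reported. The sketch below assumes an 8-bit index; warn_once_for_index(),
the warned[] bitmap and the message text are illustrative assumptions and
not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* One bit per possible 8-bit packet index that was already reported. */
static uint64_t warned[256 / 64];

/*
 * Print a warning the first time an unknown index is seen.
 * Returns true if the warning was printed, false if it was suppressed.
 */
static bool warn_once_for_index(uint8_t idx)
{
	uint64_t bit = UINT64_C(1) << (idx % 64);

	if (warned[idx / 64] & bit)
		return false;

	warned[idx / 64] |= bit;
	/* Hypothetical message text, for illustration only. */
	fprintf(stderr, "unsupported address packet index: 0x%x\n",
		(unsigned int)idx);
	return true;
}
```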

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230127205546.667740-1-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Krister Johansen
1c24956542 perf symbols: Symbol lookup with kcore can fail if multiple segments match stext
This problem was encountered on an arm64 system with a lot of memory.
Without kernel debug symbols installed, and with both kcore and kallsyms
available, perf managed to get confused and returned "unknown" for all
of the kernel symbols that it tried to look up.

On this system, stext fell within the vmalloc segment.  The kcore symbol
matching code tries to find the first segment that contains stext and
uses that to replace the segment generated from just the kallsyms
information.  In this case, however, there were two: a very large
vmalloc segment, and the text segment.  This caused perf to get confused
because multiple overlapping segments were inserted into the RB tree
that holds the discovered segments.  However, that alone wasn't
sufficient to cause the problem. Even when we could find the segment,
the offsets were adjusted in such a way that the newly generated symbols
didn't line up with the instruction addresses in the trace.  The most
obvious solution would be to consult which segment type is text from
kcore, but this information is not exposed to users.

Instead, select the smallest matching segment that contains stext
rather than the first matching segment.  This allows us to match the text
segment instead of vmalloc, if one is contained within the other.
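
The selection rule can be sketched as a single pass that keeps the
narrowest segment containing the address. struct segment and
smallest_containing() are hypothetical names for this illustration, and
the end address is assumed to be exclusive.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment description: [start, end), end is exclusive. */
struct segment {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the smallest segment that contains addr, or NULL if no segment
 * contains it. On equal sizes the earlier segment is kept.
 */
static const struct segment *smallest_containing(const struct segment *segs,
						 size_t nr, uint64_t addr)
{
	const struct segment *best = NULL;

	for (size_t i = 0; i < nr; i++) {
		const struct segment *s = &segs[i];

		if (addr < s->start || addr >= s->end)
			continue;
		if (!best || s->end - s->start < best->end - best->start)
			best = s;
	}
	return best;
}
```

If a large vmalloc-like segment and a smaller text segment both contain
stext, the smaller one wins regardless of the order in which they are
listed.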

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Reaver <me@davidreaver.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230125183418.GD1963@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Athira Rajeev
3980ee9ad8 perf probe: Fix usage when libtraceevent is missing
While parsing tracepoint events in the parse_events_add_tracepoint()
function, the code checks for HAVE_LIBTRACEEVENT support. This is needed
since libtraceevent is necessary for tracepoints. But perf probe does not
check for libtraceevent when adding probe points. Hence, in an environment
where libtraceevent-devel is missing, adding a probe point shows the
message below even though the probe can't be used via perf record.

Example:

Adding probe point:
	./perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
	Added new event:
	  probe:vfs_getname    (on getname_flags:72 with pathname=result->name:string)

	You can now use it in all perf tools, such as:

		perf record -e probe:vfs_getname -aR sleep 1

But trying perf record:
	./perf  record -e probe:vfs_getname -aR sleep 1
	event syntax error: 'probe:vfs_getname'
				\___ unsupported tracepoint
	libtraceevent is necessary for tracepoint support
	Run 'perf list' for a list of valid events

Builtin tools like perf record need libtraceevent to
parse tracefs. But the probe can still be used by enabling it
via tracefs. This patch makes the probe usage message shown to the user
depend on the presence of libtraceevent. With the fix:

 # ./perf probe 'pmu:myprobe=schedule'
 Added new event:
   pmu:myprobe          (on schedule)

 perf is not linked with libtraceevent, to use the new probe you can use tracefs:

	cd /sys/kernel/tracing/
	echo 1 > events/pmu/myprobe/enable
	echo 1 > tracing_on
	cat trace_pipe
	Before removing the probe, echo 0 > events/pmu/myprobe/enable

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230131134748.54567-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:19 -03:00
Adrian Hunter
ce4c8e7966 perf symbols: Get symbols for .plt.got for x86-64
For x86_64, determine a symbol for .plt.got entries. That requires
computing the target offset and finding that in .rela.dyn, which in
turn means .rela.dyn needs to be sorted by offset.

Example:

  In this example, the GNU C Library is using .plt.got for malloc and
  free.

  Before:

    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ perf record -e intel_pt//u uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.027 MB perf.data ]
    $ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp1.txt

  After:

    $ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp2.txt
    $ diff /tmp/cmp1.txt /tmp/cmp2.txt | head -12
    15509,15510c15509,15510
    < 27046.755390907:      7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428380 offset_0x28380@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    < 27046.755390907:      7f0b29428384 offset_0x28380@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    ---
    > 27046.755390907:      7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428380 malloc@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    > 27046.755390907:      7f0b29428384 malloc@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    15821,15822c15821,15822
    < 27046.755394865:      7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428370 offset_0x28370@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    < 27046.755394865:      7f0b29428374 offset_0x28370@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    ---
    > 27046.755394865:      7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b29428370 free@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
    > 27046.755394865:      7f0b29428374 free@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) =>     7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6)
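
The lookup described above (sort the relocations by offset, then search
for the target offset) can be sketched with qsort() and bsearch(). The
struct rela_entry layout, the helper names and the offsets used in any
test data are assumptions for illustration; the real code works on ELF
relocation records.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified relocation record: GOT offset plus symbol name. */
struct rela_entry {
	uint64_t offset;
	const char *name;
};

/* Order relocation records by offset, for both qsort() and bsearch(). */
static int cmp_offset(const void *a, const void *b)
{
	const struct rela_entry *ra = a;
	const struct rela_entry *rb = b;

	return (ra->offset > rb->offset) - (ra->offset < rb->offset);
}

/*
 * Sort the records by offset, then binary-search for the GOT offset that
 * a .plt.got entry jumps through. Returns the symbol name or NULL.
 */
static const char *name_for_target(struct rela_entry *relas, size_t nr,
				   uint64_t target)
{
	struct rela_entry key = { .offset = target, .name = NULL };
	const struct rela_entry *hit;

	qsort(relas, nr, sizeof(*relas), cmp_offset);
	hit = bsearch(&key, relas, nr, sizeof(*relas), cmp_offset);
	return hit ? hit->name : NULL;
}
```

Sorting once and reusing the sorted array for every .plt.got entry would
avoid repeating the qsort() call; it is kept inside the helper here only
to keep the sketch short.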

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:32:01 -03:00
Adrian Hunter
51a188ad8c perf symbols: Start adding support for .plt.got for x86
For x86, .plt.got is used, for example, when the address is taken of a
dynamically linked function. Start adding support by synthesizing a
symbol for each entry. A subsequent patch will attempt to get a better
name for the symbol.

Example:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstpltgot.c
    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    void callfn(void (*fn)(void))
    {
            fn();
    }

    int main()
    {
            fn4();
            fn1();
            callfn(fn3);
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -o tstpltgot tstpltgot.c -L . -ltstpltlib -Wl,-rpath="$(pwd)"
    $ readelf -SW tstpltgot | grep 'Name\|plt\|dyn'
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 6] .dynsym           DYNSYM          00000000000003d8 0003d8 0000f0 18   A  7   1  8
      [ 7] .dynstr           STRTAB          00000000000004c8 0004c8 0000c6 00   A  0   0  1
      [10] .rela.dyn         RELA            00000000000005d8 0005d8 0000d8 18   A  6   0  8
      [11] .rela.plt         RELA            00000000000006b0 0006b0 000048 18  AI  6  24  8
      [13] .plt              PROGBITS        0000000000001020 001020 000040 10  AX  0   0 16
      [14] .plt.got          PROGBITS        0000000000001060 001060 000020 10  AX  0   0 16
      [15] .plt.sec          PROGBITS        0000000000001080 001080 000030 10  AX  0   0 16
      [23] .dynamic          DYNAMIC         0000000000003d90 002d90 000210 10  WA  7   0  8
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstpltgot , filter callfn @ ./tstpltgot' ./tstpltgot
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.011 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    28393.810326915:   tr strt                               0 [unknown] =>     562350baa1b2 main+0x0
    28393.810326915:   tr end  call               562350baa1ba main+0x8 =>     562350baa090 fn4@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1bf main+0xd
    28393.810326917:   tr end  call               562350baa1bf main+0xd =>     562350baa080 fn1@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1c4 main+0x12
    28393.810326917:   call                       562350baa1ce main+0x1c =>     562350baa199 callfn+0x0
    28393.810326917:   tr end  call               562350baa1ad callfn+0x14 =>     7f607d36110f fn3+0x0
    28393.810326922:   tr strt                               0 [unknown] =>     562350baa1af callfn+0x16
    28393.810326922:   return                     562350baa1b1 callfn+0x18 =>     562350baa1d3 main+0x21
    28393.810326922:   tr end  call               562350baa1d3 main+0x21 =>     562350baa0a0 fn2@plt+0x0
    28393.810326924:   tr strt                               0 [unknown] =>     562350baa1d8 main+0x26
    28393.810326924:   tr end  call               562350baa1d8 main+0x26 =>     562350baa060 [unknown]  <- call to fn3 via .plt.got
    28393.810326925:   tr strt                               0 [unknown] =>     562350baa1dd main+0x2b
    28393.810326925:   tr end  return             562350baa1e3 main+0x31 =>     7f607d029d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    28393.810326915:   tr strt                               0 [unknown] =>     562350baa1b2 main+0x0
    28393.810326915:   tr end  call               562350baa1ba main+0x8 =>     562350baa090 fn4@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1bf main+0xd
    28393.810326917:   tr end  call               562350baa1bf main+0xd =>     562350baa080 fn1@plt+0x0
    28393.810326917:   tr strt                               0 [unknown] =>     562350baa1c4 main+0x12
    28393.810326917:   call                       562350baa1ce main+0x1c =>     562350baa199 callfn+0x0
    28393.810326917:   tr end  call               562350baa1ad callfn+0x14 =>     7f607d36110f fn3+0x0
    28393.810326922:   tr strt                               0 [unknown] =>     562350baa1af callfn+0x16
    28393.810326922:   return                     562350baa1b1 callfn+0x18 =>     562350baa1d3 main+0x21
    28393.810326922:   tr end  call               562350baa1d3 main+0x21 =>     562350baa0a0 fn2@plt+0x0
    28393.810326924:   tr strt                               0 [unknown] =>     562350baa1d8 main+0x26
    28393.810326924:   tr end  call               562350baa1d8 main+0x26 =>     562350baa060 offset_0x1060@plt+0x0
    28393.810326925:   tr strt                               0 [unknown] =>     562350baa1dd main+0x2b
    28393.810326925:   tr end  return             562350baa1e3 main+0x31 =>     7f607d029d90 __libc_start_call_main+0x80

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02 16:31:06 -03:00
Adrian Hunter
a1ab12856f perf symbols: Allow for static executables with .plt
A statically linked executable can have a .plt due to IFUNCs, in which
case .symtab is used, not .dynsym. Check the section header link to see
if that is the case, and then use .symtab instead.

Example:

  Before:

    $ cat tstifunc.c
    #include <stdio.h>

    void thing1(void)
    {
            printf("thing1\n");
    }

    void thing2(void)
    {
            printf("thing2\n");
    }

    typedef void (*thing_fn_t)(void);

    thing_fn_t thing_ifunc(void)
    {
            int x;

            if (x & 1)
                    return thing2;
            return thing1;
    }

    void thing(void) __attribute__ ((ifunc ("thing_ifunc")));

    int main()
    {
            thing();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -static -Wall -Wextra -Wno-uninitialized -o tstifuncstatic tstifunc.c
    $ readelf -SW tstifuncstatic | grep 'Name\|plt\|dyn'
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 4] .rela.plt         RELA            00000000004002e8 0002e8 000258 18  AI 29  20  8
      [ 6] .plt              PROGBITS        0000000000401020 001020 000190 00  AX  0   0 16
      [20] .got.plt          PROGBITS        00000000004c5000 0c4000 0000e0 08  WA  0   0  8
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstifuncstatic' ./tstifuncstatic
    thing1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.008 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    15786.690189535:   tr strt                               0 [unknown] =>           4017cd main+0x0
    15786.690189535:   tr end  call                     4017d5 main+0x8 =>           401170 [unknown]
    15786.690197660:   tr strt                               0 [unknown] =>           4017da main+0xd
    15786.690197660:   tr end  return                   4017e0 main+0x13 =>           401c1a __libc_start_call_main+0x6a

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    15786.690189535:   tr strt                               0 [unknown] =>           4017cd main+0x0
    15786.690189535:   tr end  call                     4017d5 main+0x8 =>           401170 thing_ifunc@plt+0x0
    15786.690197660:   tr strt                               0 [unknown] =>           4017da main+0xd
    15786.690197660:   tr end  return                   4017e0 main+0x13 =>           401c1a __libc_start_call_main+0x6a
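
The section header link check can be sketched as follows: follow the link
field of the .rela.plt section header and look at the type of the section
it points to. struct sec_hdr and plt_symbol_table() are hypothetical
simplifications; only the two numeric section types are taken from the
ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as SHT_SYMTAB and SHT_DYNSYM in the ELF specification. */
enum { SEC_TYPE_SYMTAB = 2, SEC_TYPE_DYNSYM = 11 };

/* Hypothetical, reduced section header: only the fields used here. */
struct sec_hdr {
	uint32_t type;
	uint32_t link;
};

/*
 * Follow the link of the relocation section at rel_plt_idx and report
 * which symbol table it refers to. Returns NULL for out-of-range indexes
 * or a linked section that is not a symbol table.
 */
static const char *plt_symbol_table(const struct sec_hdr *secs, size_t nr_secs,
				    size_t rel_plt_idx)
{
	uint32_t link;

	if (rel_plt_idx >= nr_secs)
		return NULL;
	link = secs[rel_plt_idx].link;
	if (link >= nr_secs)
		return NULL;

	switch (secs[link].type) {
	case SEC_TYPE_SYMTAB:
		return ".symtab";
	case SEC_TYPE_DYNSYM:
		return ".dynsym";
	default:
		return NULL;
	}
}
```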

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:51:51 -03:00
Adrian Hunter
60fbb3e49a perf symbols: Allow for .plt without header
A static executable can have a .plt due to the presence of IFUNCs.  In
that case the .plt does not have a header. Check whether there is a
header by comparing the number of .plt entries to the number of relocation
entries.
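
A rough sketch of that comparison, assuming fixed-size .plt entries:
if the .plt holds more entries than there are relocations, the extra
entry is taken to be the header. plt_has_header() is a hypothetical
helper, not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a .plt section starts with a header entry.
 * plt_size and plt_entry_size are in bytes; nr_relocs is the number of
 * entries in the matching relocation section.
 */
static bool plt_has_header(uint64_t plt_size, uint64_t plt_entry_size,
			   uint64_t nr_relocs)
{
	if (!plt_entry_size)
		return false;

	/* One entry per relocation means there is no room for a header. */
	return plt_size / plt_entry_size > nr_relocs;
}
```

As an illustration, a .plt of 0x190 bytes with 16-byte entries (an assumed
entry size) holds 25 entries; with a 0x258-byte .rela.plt of 0x18-byte
records there are also 25 relocations, so no header is present. A .plt of
0x40 bytes against a 0x48-byte .rela.plt gives 4 entries for 3
relocations, so the first entry is a header.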

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:51:31 -03:00
Adrian Hunter
b7dbc0be6e perf symbols: Add support for IFUNC symbols for x86_64
For x86_64, the GNU linker puts IFUNC information in the relocation
addend, so use it to try to find a symbol for plt entries that refer to
IFUNCs.

Example:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstpltifunc.c
    #include <stdio.h>

    void thing1(void)
    {
            printf("thing1\n");
    }

    void thing2(void)
    {
            printf("thing2\n");
    }

    typedef void (*thing_fn_t)(void);

    thing_fn_t thing_ifunc(void)
    {
            int x;

            if (x & 1)
                    return thing2;
            return thing1;
    }

    void thing(void) __attribute__ ((ifunc ("thing_ifunc")));

    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            thing();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -Wno-uninitialized -o tstpltifunc tstpltifunc.c -L . -ltstpltlib -Wl,-rpath="$(pwd)"
    $ readelf -rW tstpltifunc | grep -A99 plt
    Relocation section '.rela.plt' at offset 0x738 contains 8 entries:
        Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
    0000000000003f98  0000000300000007 R_X86_64_JUMP_SLOT     0000000000000000 puts@GLIBC_2.2.5 + 0
    0000000000003fa8  0000000400000007 R_X86_64_JUMP_SLOT     0000000000000000 __stack_chk_fail@GLIBC_2.4 + 0
    0000000000003fb0  0000000500000007 R_X86_64_JUMP_SLOT     0000000000000000 fn1 + 0
    0000000000003fb8  0000000600000007 R_X86_64_JUMP_SLOT     0000000000000000 fn3 + 0
    0000000000003fc0  0000000800000007 R_X86_64_JUMP_SLOT     0000000000000000 fn4 + 0
    0000000000003fc8  0000000900000007 R_X86_64_JUMP_SLOT     0000000000000000 fn2 + 0
    0000000000003fd0  0000000b00000007 R_X86_64_JUMP_SLOT     0000000000000000 getrandom@GLIBC_2.25 + 0
    0000000000003fa0  0000000000000025 R_X86_64_IRELATIVE                        125d
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstpltifunc' ./tstpltifunc
    thing2
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.016 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    21860.073683659:   tr strt                               0 [unknown] =>     561e212c42be main+0x0
    21860.073683659:   tr end  call               561e212c42c6 main+0x8 =>     561e212c4110 fn4@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42cb main+0xd
    21860.073683661:   tr end  call               561e212c42cb main+0xd =>     561e212c40f0 fn1@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42d0 main+0x12
    21860.073683661:   tr end  call               561e212c42d0 main+0x12 =>     561e212c40d0 offset_0x10d0@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42d5 main+0x17
    21860.073698451:   tr end  call               561e212c42d5 main+0x17 =>     561e212c4120 fn2@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42da main+0x1c
    21860.073698451:   tr end  call               561e212c42da main+0x1c =>     561e212c4100 fn3@plt+0x0
    21860.073698452:   tr strt                               0 [unknown] =>     561e212c42df main+0x21
    21860.073698452:   tr end  return             561e212c42e5 main+0x27 =>     7fb51cc29d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    21860.073683659:   tr strt                               0 [unknown] =>     561e212c42be main+0x0
    21860.073683659:   tr end  call               561e212c42c6 main+0x8 =>     561e212c4110 fn4@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42cb main+0xd
    21860.073683661:   tr end  call               561e212c42cb main+0xd =>     561e212c40f0 fn1@plt+0x0
    21860.073683661:   tr strt                               0 [unknown] =>     561e212c42d0 main+0x12
    21860.073683661:   tr end  call               561e212c42d0 main+0x12 =>     561e212c40d0 thing_ifunc@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42d5 main+0x17
    21860.073698451:   tr end  call               561e212c42d5 main+0x17 =>     561e212c4120 fn2@plt+0x0
    21860.073698451:   tr strt                               0 [unknown] =>     561e212c42da main+0x1c
    21860.073698451:   tr end  call               561e212c42da main+0x1c =>     561e212c4100 fn3@plt+0x0
    21860.073698452:   tr strt                               0 [unknown] =>     561e212c42df main+0x21
    21860.073698452:   tr end  return             561e212c42e5 main+0x27 =>     7fb51cc29d90 __libc_start_call_main+0x80
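
The addend lookup can be sketched like this: for an R_X86_64_IRELATIVE
relocation, treat the addend as an address and search the symbols for one
at that address. The struct layouts and ifunc_name_for_reloc() are
hypothetical simplifications. In the readelf output above the IRELATIVE
addend is 0x125d; that this is the address of thing_ifunc is inferred from
the "After" output, not shown by readelf.

```c
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as R_X86_64_JUMP_SLOT and R_X86_64_IRELATIVE. */
enum { RELOC_TYPE_JUMP_SLOT = 7, RELOC_TYPE_IRELATIVE = 37 };

/* Hypothetical, reduced relocation and symbol records. */
struct reloc {
	uint64_t offset;
	uint32_t type;
	uint64_t addend;
};

struct sym {
	uint64_t addr;
	const char *name;
};

/*
 * For an IRELATIVE relocation, return the name of the symbol located at
 * the address stored in the addend. Returns NULL for other relocation
 * types or when no symbol has that address.
 */
static const char *ifunc_name_for_reloc(const struct reloc *r,
					const struct sym *syms, size_t nr_syms)
{
	if (r->type != RELOC_TYPE_IRELATIVE)
		return NULL;

	for (size_t i = 0; i < nr_syms; i++) {
		if (syms[i].addr == r->addend)
			return syms[i].name;
	}
	return NULL;
}
```

A linear scan keeps the sketch short; with many symbols a lookup
structure keyed by address would be the more natural choice.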

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:50:52 -03:00
Adrian Hunter
05963491c0 perf symbols: Record whether a symbol is an alias for an IFUNC symbol
To assist with synthesizing plt symbols for IFUNCs, record whether a
symbol is an alias of an IFUNC symbol.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:49:29 -03:00
Adrian Hunter
78250284b1 perf symbols: Sort plt relocations for x86
For x86, with the addition of IFUNCs, relocation information becomes
disordered with respect to plt. Correct that by sorting the relocations by
offset.
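
As a rough illustration of the idea (not the actual perf code), relocations can be ordered by offset with qsort(). The struct and function names below are hypothetical and keep only the fields needed here. In the readelf output of the example, the IRELATIVE entry is listed last but has the second-lowest offset, so it belongs to the second plt slot:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified relocation record: only the fields needed here. */
struct demo_rel {
	uint64_t offset;	/* r_offset: address of the GOT slot */
	const char *name;	/* symbol name, or NULL for IRELATIVE */
};

/* qsort() comparator ordering relocations by ascending offset. */
int demo_rel_cmp(const void *a, const void *b)
{
	const struct demo_rel *ra = a, *rb = b;

	if (ra->offset < rb->offset)
		return -1;
	return ra->offset > rb->offset;
}

/* Sort in place so that entry i lines up with plt slot i. */
void demo_sort_rels(struct demo_rel *rels, size_t nr)
{
	qsort(rels, nr, sizeof(*rels), demo_rel_cmp);
}
```

With the eight entries from the readelf listing, the entry at offset 0x3fa0 moves from index 7 to index 1, which is the slot that previously got mislabelled as __stack_chk_fail@plt.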

Example:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstpltifunc.c
    #include <stdio.h>

    void thing1(void)
    {
            printf("thing1\n");
    }

    void thing2(void)
    {
            printf("thing2\n");
    }

    typedef void (*thing_fn_t)(void);

    thing_fn_t thing_ifunc(void)
    {
            int x;

            if (x & 1)
                    return thing2;
            return thing1;
    }

    void thing(void) __attribute__ ((ifunc ("thing_ifunc")));

    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            thing();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -Wno-uninitialized -o tstpltifunc tstpltifunc.c -L . -ltstpltlib -Wl,-rpath="$(pwd)"
    $ readelf -rW tstpltifunc | grep -A99 plt
    Relocation section '.rela.plt' at offset 0x738 contains 8 entries:
        Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
    0000000000003f98  0000000300000007 R_X86_64_JUMP_SLOT     0000000000000000 puts@GLIBC_2.2.5 + 0
    0000000000003fa8  0000000400000007 R_X86_64_JUMP_SLOT     0000000000000000 __stack_chk_fail@GLIBC_2.4 + 0
    0000000000003fb0  0000000500000007 R_X86_64_JUMP_SLOT     0000000000000000 fn1 + 0
    0000000000003fb8  0000000600000007 R_X86_64_JUMP_SLOT     0000000000000000 fn3 + 0
    0000000000003fc0  0000000800000007 R_X86_64_JUMP_SLOT     0000000000000000 fn4 + 0
    0000000000003fc8  0000000900000007 R_X86_64_JUMP_SLOT     0000000000000000 fn2 + 0
    0000000000003fd0  0000000b00000007 R_X86_64_JUMP_SLOT     0000000000000000 getrandom@GLIBC_2.25 + 0
    0000000000003fa0  0000000000000025 R_X86_64_IRELATIVE                        125d
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstpltifunc' ./tstpltifunc
    thing2
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.029 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    20417.302513948:   tr strt                               0 [unknown] =>     5629a74892be main+0x0
    20417.302513948:   tr end  call               5629a74892c6 main+0x8 =>     5629a7489110 fn2@plt+0x0
    20417.302513949:   tr strt                               0 [unknown] =>     5629a74892cb main+0xd
    20417.302513949:   tr end  call               5629a74892cb main+0xd =>     5629a74890f0 fn3@plt+0x0
    20417.302513950:   tr strt                               0 [unknown] =>     5629a74892d0 main+0x12
    20417.302513950:   tr end  call               5629a74892d0 main+0x12 =>     5629a74890d0 __stack_chk_fail@plt+0x0
    20417.302528114:   tr strt                               0 [unknown] =>     5629a74892d5 main+0x17
    20417.302528114:   tr end  call               5629a74892d5 main+0x17 =>     5629a7489120 getrandom@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892da main+0x1c
    20417.302528115:   tr end  call               5629a74892da main+0x1c =>     5629a7489100 fn4@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892df main+0x21
    20417.302528115:   tr end  return             5629a74892e5 main+0x27 =>     7ff14da29d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    20417.302513948:   tr strt                               0 [unknown] =>     5629a74892be main+0x0
    20417.302513948:   tr end  call               5629a74892c6 main+0x8 =>     5629a7489110 fn4@plt+0x0
    20417.302513949:   tr strt                               0 [unknown] =>     5629a74892cb main+0xd
    20417.302513949:   tr end  call               5629a74892cb main+0xd =>     5629a74890f0 fn1@plt+0x0
    20417.302513950:   tr strt                               0 [unknown] =>     5629a74892d0 main+0x12
    20417.302513950:   tr end  call               5629a74892d0 main+0x12 =>     5629a74890d0 offset_0x10d0@plt+0x0
    20417.302528114:   tr strt                               0 [unknown] =>     5629a74892d5 main+0x17
    20417.302528114:   tr end  call               5629a74892d5 main+0x17 =>     5629a7489120 fn2@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892da main+0x1c
    20417.302528115:   tr end  call               5629a74892da main+0x1c =>     5629a7489100 fn3@plt+0x0
    20417.302528115:   tr strt                               0 [unknown] =>     5629a74892df main+0x21
    20417.302528115:   tr end  return             5629a74892e5 main+0x27 =>     7ff14da29d90 __libc_start_call_main+0x80

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:49:05 -03:00
Adrian Hunter
b2529f829a perf symbols: Add support for x86 .plt.sec
The section .plt.sec was originally added for MPX and was first called
.plt.bnd. While MPX has been deprecated, .plt.sec is now also used for
IBT.  On x86_64, IBT may be enabled by default, but can be switched off
using gcc option -fcf-protection=none, or switched on by -z ibt or -z
ibtplt. On 32-bit, option -z ibt or -z ibtplt will enable IBT.

With .plt.sec, calls are made into .plt.sec instead of .plt, so it makes
more sense to put the symbols there instead of .plt. A notable
difference is that .plt.sec does not have a header entry.

For x86, when synthesizing symbols for plt, use offset and entry size of
.plt.sec instead of .plt when there is a .plt.sec section.
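
A minimal sketch of the address calculation described above, assuming the .plt header entry has the same size as a regular entry. The struct and function names are hypothetical and are not perf's own:

```c
#include <stdint.h>

/* Hypothetical, simplified view of an ELF section header. */
struct demo_shdr {
	uint64_t offset;	/* sh_offset */
	uint64_t entsize;	/* sh_entsize */
};

/*
 * File offset of the stub for relocation number idx.
 * .plt starts with one header entry (assumed here to be as large as a
 * regular entry), so its stubs begin one entry in. .plt.sec has no header,
 * so its stubs begin at the section start.
 */
uint64_t demo_plt_sym_offset(const struct demo_shdr *plt,
			     const struct demo_shdr *plt_sec, /* NULL if absent */
			     uint64_t idx)
{
	if (plt_sec)
		return plt_sec->offset + idx * plt_sec->entsize;
	return plt->offset + plt->entsize + idx * plt->entsize;
}
```

With the section values from the readelf output in the example below (.plt.sec at 0x1080, entry size 0x10), the four stubs land at 0x1080, 0x1090, 0x10a0 and 0x10b0, matching the low bits of the call targets in the trace.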

Example on Ubuntu 22.04 gcc 11.3:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstplt.c
    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c
    $ gcc -Wall -Wextra -z ibt -o tstplt tstplt.c -L . -ltstpltlib -Wl,-rpath=$(pwd)
    $ readelf -SW tstplt | grep 'plt\|Name'
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [11] .rela.plt         RELA            0000000000000698 000698 000060 18  AI  6  24  8
      [13] .plt              PROGBITS        0000000000001020 001020 000050 10  AX  0   0 16
      [14] .plt.got          PROGBITS        0000000000001070 001070 000010 10  AX  0   0 16
      [15] .plt.sec          PROGBITS        0000000000001080 001080 000040 10  AX  0   0 16
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstplt' ./tstplt
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.015 MB perf.data ]
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    38970.522546686:   tr strt                               0 [unknown] =>     55fc222a81a9 main+0x0
    38970.522546686:   tr end  call               55fc222a81b1 main+0x8 =>     55fc222a80a0 [unknown]
    38970.522546687:   tr strt                               0 [unknown] =>     55fc222a81b6 main+0xd
    38970.522546687:   tr end  call               55fc222a81b6 main+0xd =>     55fc222a8080 [unknown]
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81bb main+0x12
    38970.522546688:   tr end  call               55fc222a81bb main+0x12 =>     55fc222a80b0 [unknown]
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81c0 main+0x17
    38970.522546688:   tr end  call               55fc222a81c0 main+0x17 =>     55fc222a8090 [unknown]
    38970.522546689:   tr strt                               0 [unknown] =>     55fc222a81c5 main+0x1c
    38970.522546894:   tr end  return             55fc222a81cb main+0x22 =>     7f3a4dc29d90 __libc_start_call_main+0x80

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    38970.522546686:   tr strt                               0 [unknown] =>     55fc222a81a9 main+0x0
    38970.522546686:   tr end  call               55fc222a81b1 main+0x8 =>     55fc222a80a0 fn4@plt+0x0
    38970.522546687:   tr strt                               0 [unknown] =>     55fc222a81b6 main+0xd
    38970.522546687:   tr end  call               55fc222a81b6 main+0xd =>     55fc222a8080 fn1@plt+0x0
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81bb main+0x12
    38970.522546688:   tr end  call               55fc222a81bb main+0x12 =>     55fc222a80b0 fn2@plt+0x0
    38970.522546688:   tr strt                               0 [unknown] =>     55fc222a81c0 main+0x17
    38970.522546688:   tr end  call               55fc222a81c0 main+0x17 =>     55fc222a8090 fn3@plt+0x0
    38970.522546689:   tr strt                               0 [unknown] =>     55fc222a81c5 main+0x1c
    38970.522546894:   tr end  return             55fc222a81cb main+0x22 =>     7f3a4dc29d90 __libc_start_call_main+0x80

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:48:31 -03:00
Adrian Hunter
66fe2d53a0 perf symbols: Correct plt entry sizes for x86
In 32-bit executables the .plt entry size can be set to 4 when it is really
16. In fact the only sizes used for x86 (32 or 64 bit) are 8 or 16, so
check for those and, if not, use the alignment to choose which it is.
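
The rule can be sketched as follows. This is an illustration under the assumption that an alignment of 8 implies 8-byte entries and anything else implies 16-byte entries; the function name is hypothetical:

```c
#include <stdint.h>

/*
 * Pick the x86 plt entry size: trust sh_entsize when it is one of the two
 * sizes actually used (8 or 16), otherwise fall back to the alignment.
 */
uint64_t demo_x86_plt_entry_size(uint64_t sh_entsize, uint64_t sh_addralign)
{
	if (sh_entsize == 8 || sh_entsize == 16)
		return sh_entsize;
	return sh_addralign == 8 ? 8 : 16;
}
```

For the 32-bit .plt section in the example below (ES 0x04, alignment 16) this yields 16, and for .plt.got (ES 0x08, alignment 8) it yields 8.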

Example on Ubuntu 22.04 gcc 11.3:

  Before:

    $ cat tstpltlib.c
    void fn1(void) {}
    void fn2(void) {}
    void fn3(void) {}
    void fn4(void) {}
    $ cat tstplt.c
    void fn1(void);
    void fn2(void);
    void fn3(void);
    void fn4(void);

    int main()
    {
            fn4();
            fn1();
            fn2();
            fn3();
            return 0;
    }
    $ gcc --version
    gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    $ gcc -m32 -Wall -Wextra -shared -o libtstpltlib32.so tstpltlib.c
    $ gcc -m32 -Wall -Wextra -o tstplt32 tstplt.c -L . -ltstpltlib32 -Wl,-rpath=$(pwd)
    $ perf record -e intel_pt//u --filter 'filter main @ ./tstplt32' ./tstplt32
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.011 MB perf.data ]
    $ readelf -SW tstplt32 | grep 'plt\|Name'
      [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
      [10] .rel.plt          REL             0000041c 00041c 000028 08  AI  5  22  4
      [12] .plt              PROGBITS        00001030 001030 000060 04  AX  0   0 16   <- ES is 0x04, should be 0x10
      [13] .plt.got          PROGBITS        00001090 001090 000008 08  AX  0   0  8
    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    17894.383903029:   tr strt                               0 [unknown] =>         565b81cd main+0x0
    17894.383903029:   tr end  call                   565b81d4 main+0x7 =>         565b80d0 __x86.get_pc_thunk.bx+0x0
    17894.383903031:   tr strt                               0 [unknown] =>         565b81d9 main+0xc
    17894.383903031:   tr end  call                   565b81df main+0x12 =>         565b8070 [unknown]
    17894.383903032:   tr strt                               0 [unknown] =>         565b81e4 main+0x17
    17894.383903032:   tr end  call                   565b81e4 main+0x17 =>         565b8050 [unknown]
    17894.383903033:   tr strt                               0 [unknown] =>         565b81e9 main+0x1c
    17894.383903033:   tr end  call                   565b81e9 main+0x1c =>         565b8080 [unknown]
    17894.383903033:   tr strt                               0 [unknown] =>         565b81ee main+0x21
    17894.383903033:   tr end  call                   565b81ee main+0x21 =>         565b8060 [unknown]
    17894.383903237:   tr strt                               0 [unknown] =>         565b81f3 main+0x26
    17894.383903237:   tr end  return                 565b81fc main+0x2f =>         f7c21519 [unknown]

  After:

    $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso
    17894.383903029:   tr strt                               0 [unknown] =>         565b81cd main+0x0
    17894.383903029:   tr end  call                   565b81d4 main+0x7 =>         565b80d0 __x86.get_pc_thunk.bx+0x0
    17894.383903031:   tr strt                               0 [unknown] =>         565b81d9 main+0xc
    17894.383903031:   tr end  call                   565b81df main+0x12 =>         565b8070 fn4@plt+0x0
    17894.383903032:   tr strt                               0 [unknown] =>         565b81e4 main+0x17
    17894.383903032:   tr end  call                   565b81e4 main+0x17 =>         565b8050 fn1@plt+0x0
    17894.383903033:   tr strt                               0 [unknown] =>         565b81e9 main+0x1c
    17894.383903033:   tr end  call                   565b81e9 main+0x1c =>         565b8080 fn2@plt+0x0
    17894.383903033:   tr strt                               0 [unknown] =>         565b81ee main+0x21
    17894.383903033:   tr end  call                   565b81ee main+0x21 =>         565b8060 fn3@plt+0x0
    17894.383903237:   tr strt                               0 [unknown] =>         565b81f3 main+0x26
    17894.383903237:   tr end  return                 565b81fc main+0x2f =>         f7c21519 [unknown]

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230131131625.6964-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:47:46 -03:00
Athira Rajeev
766b0beedb perf tests shell: Fix check for libtraceevent support
The test "Use vfs_getname probe to get syscall args filenames" fails in
an environment without libtraceevent support, as shown below:

  82: Use vfs_getname probe to get syscall args filenames             :
  --- start ---
  test child forked, pid 304726
  Recording open file:
  event syntax error: 'probe:vfs_getname*'
                       \___ unsupported tracepoint

  libtraceevent is necessary for tracepoint support
  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  test child finished with -1
  ---- end ----
  Use vfs_getname probe to get syscall args filenames: FAILED!

The environment has debuginfo but is missing the libtraceevent
development package.

Hence perf is compiled without libtraceevent support.  The test tries to
add the probe "probe:vfs_getname" and then uses it with "perf record".  This
fails in the function "parse_events_add_tracepoint" due to the missing
libtraceevent.

Similarly, the "probe libc's inet_pton & backtrace it with ping" test also
fails for the same reason.

Add a function to the 'perf test shell' library to check whether perf record
with --dry-run reports an error about missing libtraceevent support. Update
both tests to call this new function, "skip_no_probe_record_support",
before proceeding to use the probe point via perf record.

With the change:

  82: Use vfs_getname probe to get syscall args filenames             :
  --- start ---
  test child forked, pid 305014
  Recording open file:
  libtraceevent is necessary for tracepoint support
  test child finished with -2
  ---- end ----
  Use vfs_getname probe to get syscall args filenames: Skip

   81: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 305036
  libtraceevent is necessary for tracepoint support
  test child finished with -2
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: Skip

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kjain@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/r/20230201180421.59640-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:44:26 -03:00
Athira Rajeev
84cce3d60c perf tests shell: Add check for perf data file in record+probe_libc_inet_pton test
The "probe libc's inet_pton & backtrace it with ping" test installs a
uprobe and uses perf record/script to check the backtrace. Currently
even if the "perf record" fails, the test reports success. Logs below:

  # ./perf test -v "probe libc's inet_pton & backtrace it with ping"
  81: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 304211
  failed to open /tmp/perf.data.Btf: No such file or directory
  test child finished with 0
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: Ok

Fix this by adding a check for the presence of the perf.data file
before proceeding with "perf script".

With this patch, the test correctly reports the failure:

 # ./perf test -v "probe libc's inet_pton & backtrace it with ping"
 81: probe libc's inet_pton & backtrace it with ping                 :
  --- start ---
  test child forked, pid 304358
  FAIL: perf record failed to create "/tmp/perf.data.Uoi"
  test child finished with -1
  ---- end ----
  probe libc's inet_pton & backtrace it with ping: FAILED!

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/r/20230201180421.59640-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:44:18 -03:00
Namhyung Kim
e072b097d2 perf test: Add pipe mode test to the Intel PT test suite
The test_pipe() function will check perf report and perf inject with
pipe input.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:31:15 -03:00
Namhyung Kim
14bf478441 perf session: Avoid calling lseek(2) for pipe
We should not call lseek(2) on pipes, as it won't work.  We are already
at the proper place to read the AUXTRACE data.  Add a comment like the
one for PERF_RECORD_HEADER_TRACING_DATA.
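
For illustration only (this is not the perf code), the failure is easy to observe: POSIX specifies that lseek() fails with ESPIPE on a pipe, FIFO or socket. The helper name below is hypothetical:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return true if fd supports seeking. lseek(2) fails with ESPIPE on a
 * pipe, FIFO or socket, so such input has to be consumed sequentially.
 */
bool demo_fd_is_seekable(int fd)
{
	return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}
```

Calling this on the read end of a pipe(2) pair returns false with errno set to ESPIPE, whereas a regular file returns true.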

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:31:04 -03:00
Namhyung Kim
aeb802f872 perf intel-pt: Do not try to queue auxtrace data on pipe
When perf processes AUXTRACE_INFO, it calls auxtrace_queue_data() to
collect the AUXTRACE data first.  That won't work with a pipe, since it
needs lseek() to read the scattered aux data.

  $ perf record -o- -e intel_pt// true | perf report -i- --itrace=i100
  # To display the perf.data header info, please use --header/--header-only options.
  #
  0x4118 [0xa0]: failed to process type: 70
  Error:
  failed to process sample

In pipe mode, perf can handle the aux data as it arrives.  But there's
no guarantee that the aux data arrives in time.  So the following warning
will be shown at the beginning:

  WARNING: Intel PT with pipe mode is not recommended.
           The output cannot relied upon.  In particular,
           time stamps and the order of events may be incorrect.

Fixes: dbd134322e ("perf intel-pt: Add support for decoding AUX area samples")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 21:30:05 -03:00
Namhyung Kim
1746212dae perf inject: Use perf_data__read() for auxtrace
copy_bytes() reads the data from the (input) fd and writes it to
the output file.  But it does so with read(2) unconditionally, which
causes a problem by mixing buffered and unbuffered I/O.

You can see the problem when using pipes.

  $ perf record -e intel_pt// -o- true | perf inject -b > /dev/null
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  0x45c0 [0x30]: failed to process type: 71

It should use perf_data__read() to honor the 'use_stdio' setting.

Fixes: 601366678c ("perf data: Allow to use stdio functions for pipe mode")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230131023350.1903992-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-01 19:22:07 -03:00
Mike Leach
c6535b6ba9 perf cs-etm: Update decoder code for OpenCSD version 1.4
OpenCSD version 1.4 is released with support for FEAT_ITE.

This adds a new packet type, with associated output element ID in the
packet type enum - OCSD_GEN_TRC_ELEM_INSTRUMENTATION.

As perf just ignores this packet, add it to the switch statement to
avoid the "enum not handled in switch" error, but do so conditionally so
as not to break the perf build for older OpenCSD installations.
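
The pattern looks roughly like the sketch below. The version macro and enum are hypothetical stand-ins defined locally so the example compiles on its own; they are not the OpenCSD names:

```c
/* Hypothetical stand-ins for a library's version macro and element enum. */
#define DEMO_LIB_VER_NUM 0x010400

enum demo_elem {
	DEMO_ELEM_RANGE,
	DEMO_ELEM_CUSTOM,
#if DEMO_LIB_VER_NUM >= 0x010400
	DEMO_ELEM_INSTRUMENTATION,	/* only exists from version 1.4 on */
#endif
};

/* Return 1 if the element is processed, 0 if it is deliberately ignored. */
int demo_handle_elem(enum demo_elem elem)
{
	switch (elem) {
	case DEMO_ELEM_RANGE:
		return 1;
	case DEMO_ELEM_CUSTOM:
#if DEMO_LIB_VER_NUM >= 0x010400
	case DEMO_ELEM_INSTRUMENTATION:	/* listed only when the header has it */
#endif
		return 0;
	}
	return 0;
}
```

Guarding the case label with the same version check as the enumerator keeps every enumerator handled with a new header, while an older header that lacks the enumerator never sees the label.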

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120153706.20388-1-mike.leach@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-30 14:54:13 -03:00
Naveen N. Rao
dfadf8b315 perf test: Fix DWARF unwind test by adding non-inline to expected function in a backtrace
The 'DWARF unwind' 'perf test' can sometimes fail:

  $ perf test -v 74
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
   74: Test dwarf unwind                                               :
  --- start ---
  test child forked, pid 3785254
  Problems creating module maps, continuing anyway...
  Problems creating module maps, continuing anyway...
  unwind: test__arch_unwind_sample:ip = 0x102d0ad4c (0x36ad4c)
  unwind: access_mem addr 0x7fffc33128c8, val 1031c3228, offset 120
  unwind: access_mem addr 0x7fffc33128d0, val 12427cc70, offset 128
  <snip>
  unwind: test_dwarf_unwind__krava_3:ip = 0x102b8768b (0x1e768b)
  unwind: access_mem addr 0x7fffc3313048, val 7fffc3313050, offset 2040
  unwind: access_mem addr 0x7fffc3313060, val 102b8777c, offset 2064
  unwind: test_dwarf_unwind__krava_2:ip = 0x102b8770b (0x1e770b)
  unwind: access_mem addr 0x7fffc3313088, val 7fffc3313090, offset 2104
  unwind: access_mem addr 0x7fffc33130a0, val 102b87890, offset 2128
  unwind: test_dwarf_unwind__krava_1:ip = 0x102b8777b (0x1e777b)
  unwind: access_mem addr 0x7fffc3313108, val 10323a274, offset 2232
  unwind: access_mem addr 0x7fffc3313110, val ffffffffffffffff, offset 2240
  unwind: access_mem addr 0x7fffc3313118, val 102c08ed0, offset 2248
  unwind: access_mem addr 0x7fffc3313120, val 1031db000, offset 2256
  unwind: access_mem addr 0x7fffc3313128, val 7fffc3313130, offset 2264
  unwind: access_mem addr 0x7fffc3313140, val 102b45ee8, offset 2288
  unwind: '':ip = 0x102b8788f (0x1e788f)
  failed: got unresolved address 0x102b8788f
  unwind: failed with 'no error'
  got wrong number of stack entries 0 != 8
  test child finished with -1
  ---- end ----
  Test dwarf unwind: FAILED!

We expect to resolve test__dwarf_unwind as the last symbol, but that
function can be optimized away:

  $ objdump -tT /usr/bin/perf | grep dwarf_unwind
  000000000083b018 g    DO .data	0000000000000040  Base        tests__dwarf_unwind
  00000000001e7750 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_1
  00000000001e76e0 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_2
  00000000001e7620 g    DF .text	00000000000000b4  Base        0x60 test_dwarf_unwind__krava_3
  00000000001e74f0 g    DF .text	0000000000000128  Base        0x60 test_dwarf_unwind__compare
  00000000001e7350 g    DF .text	000000000000019c  Base        0x60 test_dwarf_unwind__thread
  000000000083b000 g    DO .data	0000000000000018  Base        suite__dwarf_unwind

Fix this similar to commit fdf7c49c20 ("perf tests: Fix dwarf
unwind for stripped binaries") by marking the function as a global and
adding the 'noinline' attribute to it.
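
As a generic illustration of the technique (the function names are hypothetical, not the ones in the perf test), a function that is global and marked noinline keeps its own symbol and stack frame under GCC and Clang, even when the optimizer would otherwise fold it into its caller:

```c
/*
 * Global (not static) and never inlined, so the function keeps its own
 * symbol table entry and stack frame for an unwinder to resolve.
 */
__attribute__((noinline)) int demo_leaf(int x)
{
	return x + 1;
}

/* Without the attribute, the compiler could inline demo_leaf() here. */
int demo_caller(int x)
{
	return demo_leaf(x) * 2;
}
```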

With this patch:

  $ objdump -tT perf | grep dwarf_unwind
  000000000083b018 g    DO .data	0000000000000040  Base        tests__dwarf_unwind
  00000000001e80f0 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_1
  00000000001e8080 g    DF .text	0000000000000068  Base        0x60 test_dwarf_unwind__krava_2
  00000000001e7fc0 g    DF .text	00000000000000b4  Base        0x60 test_dwarf_unwind__krava_3
  00000000001e7e90 g    DF .text	0000000000000128  Base        0x60 test_dwarf_unwind__compare
  00000000001e7cf0 g    DF .text	000000000000019c  Base        0x60 test_dwarf_unwind__thread
  00000000001e8160 g    DF .text	0000000000000248  Base        0x60 test__dwarf_unwind
  000000000083b000 g    DO .data	0000000000000018  Base        suite__dwarf_unwind
  $ ./perf test 74
   74: Test dwarf unwind                                               : Ok

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20230125123442.107156-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-30 14:54:13 -03:00
Ian Rogers
22e06e6825 perf buildid: Avoid copy of uninitialized memory
build_id__init() only copies the buildid data up to size leaving the
rest of the data array uninitialized. Copying the full array during
synthesis means the written event contains uninitialized memory.

Ensure the size is less than the buffer size and only copy the bytes
that were initialized. This was detected by the Clang/LLVM memory
sanitizer.
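
A minimal sketch of the approach, using hypothetical struct and function names rather than perf's own: clamp the size to the destination buffer, zero the destination, and copy only the bytes that are known to be initialized.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BUILD_ID_SIZE 20

/* Hypothetical source: only the first 'size' bytes of data[] are valid. */
struct demo_build_id {
	uint8_t data[DEMO_BUILD_ID_SIZE];
	size_t size;
};

/* Hypothetical event record that gets written out. */
struct demo_event {
	uint8_t build_id[DEMO_BUILD_ID_SIZE];
	uint8_t size;
};

void demo_fill_event(struct demo_event *ev, const struct demo_build_id *bid)
{
	size_t len = bid->size;

	/* Never copy more than the destination can hold. */
	if (len > sizeof(ev->build_id))
		len = sizeof(ev->build_id);

	/* Zero everything first so no uninitialized byte reaches the output. */
	memset(ev, 0, sizeof(*ev));
	memcpy(ev->build_id, bid->data, len);
	ev->size = (uint8_t)len;
}
```

With a 4-byte build ID, bytes 4 to 19 of the event stay zero instead of carrying whatever happened to be in the source array.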

v2. Avoids the potential for copying too much as suggested by Arnaldo.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230120185828.43231-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:35 -03:00
James Clark
86569c0ab1 perf mem/c2c: Document that SPE is used for mem and c2c on ARM
Setup is non-trivial so also link to the full SPE docs.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-perf-users@vger.kernel.or
Link: https://lore.kernel.org/r/20230124145929.557891-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
James Clark
6bc75b4c90 perf cs-etm: Improve missing sink warning message
Make the sink error message more similar to the event error message that
reminds about missing kernel support. The available sinks are also
determined by the hardware so mention that too.

Also, usually it's not necessary to specify the sink, so add that as a
hint.

Now the error for a made up sink looks like this:

  $ perf record -e cs_etm/@abc/
  Couldn't find sink "abc" on event cs_etm/@abc/.
  Missing kernel or device support?

  Hint: An appropriate sink will be picked automatically if one isn't is specified.

For any error other than ENOENT, the same message as before is
displayed.

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/ec7502e6-b406-3997-c2a5-24f98e5c4854@arm.com
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230124110220.460551-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
Arnaldo Carvalho de Melo
0b58d89b1e perf tools: Add Ian Rogers to MAINTAINERS as a reviewer
Ian has been reviewing perf tooling patches consistently for a long
time, so let's reflect that in the MAINTAINERS file so that contributors
add him to the CC list in patch submissions.

Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-27 15:00:34 -03:00
Athira Rajeev
f194210846 perf test buildid: Fix shell string substitutions
The perf test named “build id cache operations” is skipped with the
error below on some distros:

  <<>>
   78: build id cache operations                                       :
  test child forked, pid 111101
  WARNING: wine not found. PE binaries will not be run.
  test binaries: /tmp/perf.ex.SHA1.PKz /tmp/perf.ex.MD5.Gt3 ./tests/shell/../pe-file.exe
  DEBUGINFOD_URLS=
  Adding 4abd406f041feb4f10ecde3fc30fd0639e1a91cb /tmp/perf.ex.SHA1.PKz: Ok
  build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
  ./tests/shell/buildid.sh: 69: ./tests/shell/buildid.sh: Bad substitution
  test child finished with -2
  build id cache operations: Skip
  <<>>

The test script "tests/shell/buildid.sh" uses some string substitution
forms that are supported in bash, but not in "sh" or other shells. The
line that reports "Bad substitution" (line number 69) is:

  <<>>
  link=${build_id_dir}/.build-id/${id:0:2}/${id:2}
  <<>>

Here, taking the first two characters of id with ${id:0:2}, and similar
expressions such as ${id:2}, is not recognised by "sh". The line errors
out and, instead of failing, the test is skipped, as shown in the log
above. So the syntax issue prevents the test from being executed in such
cases. Similarly, the usage "${@: -1}" (to pick the last argument passed
to a function) in “test_record” doesn’t work on all distros.

Fix this by using an alternative shell substitution to pick the required
characters from the string. Also fix the usage of “${@: -1}” to work in
all cases.
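
One portable way to express the same substitutions is sketched below. It
is an illustration only, using POSIX parameter expansion; the function
names are hypothetical and this is not necessarily the code that landed
in tests/shell/buildid.sh.

```shell
#!/bin/sh
# Sketch only: POSIX replacements for bash's ${id:0:2}, ${id:2} and "${@: -1}".
# The function names are hypothetical, not taken from tests/shell/buildid.sh.

# Everything after the first two characters ("??" matches exactly two).
id_tail() {
	printf '%s\n' "${1#??}"
}

# The first two characters: strip the tail computed above from the end.
id_head() {
	rest=${1#??}
	printf '%s\n' "${1%"$rest"}"
}

# Last positional argument: the loop variable keeps the final value.
last_arg() {
	last=
	for last in "$@"; do :; done
	printf '%s\n' "$last"
}

id=4abd406f041feb4f10ecde3fc30fd0639e1a91cb
# Prints 4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb
echo "$(id_head "$id")/$(id_tail "$id")"
```

The printed path matches the .build-id layout seen in the logs of this
commit message (two characters for the directory, the rest for the file).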

Another usage in “test_record” is:

  <<>>
  ${perf} record --buildid-all -o ${data} $@ &> ${log}
  <<>>

This causes 'perf record' to start in the background, which results in
the data file not being created by the time the "check" function is
invoked. The log below shows the 'perf record' output being displayed
after the call to the "check" function.

  <<>>
  running: perf record /tmp/perf.ex.SHA1.EAU
  build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
  link: /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb
  failed: link /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb does not exist
  test child finished with -1
  build id cache operations: FAILED!
  root@machine:~/athira/linux/tools/perf# Couldn't synthesize bpf events.
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.010 MB /tmp/perf.data.bFF ]
  <<>>

Fix this by redirecting the output without using “&”, which starts the
command in the background.
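
The difference can be sketched as follows. In a POSIX shell,
"cmd &> file" is read as "cmd &" followed by "> file", whereas
"> file 2>&1" keeps the command in the foreground. The helper names
below are hypothetical stand-ins, not part of the test script.

```shell
#!/bin/sh
# Sketch only: capture stdout and stderr in a log file without bash's "&>".

# Hypothetical stand-in for a command that writes to both streams.
noisy() {
	echo "to stdout"
	echo "to stderr" >&2
}

# Run a command in the foreground with both streams sent to a log file.
# stdout is redirected first, then stderr is duplicated onto it.
run_logged() {
	log_file=$1
	shift
	"$@" > "$log_file" 2>&1
}

log=$(mktemp)
run_logged "$log" noisy
# The command has finished by this point, so the log is complete.
cat "$log"
rm -f "$log"
```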

Reviewed-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20230119142719.32628-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-23 10:03:07 -03:00
Diederik de Haas
fc5d836c67 perf: Various spelling fixes
Fix various spelling errors as reported by Debian's lintian tool.

"amount of times" -> "number of times"
ocurrence -> occurrence
upto -> up to

Signed-off-by: Diederik de Haas <didi.debian@cknow.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230122122034.48020-1-didi.debian@cknow.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-23 10:00:47 -03:00
Naveen N. Rao
7158005b4e perf test: Switch basic bpf filtering test to use syscall tracepoint
BPF filtering tests can sometimes fail. Running the test in verbose mode
shows the following:

  $ sudo perf test 42
  42: BPF filter                                                      :
  42.1: Basic BPF filtering                                           : FAILED!
  42.2: BPF pinning                                                   : Skip
  42.3: BPF prologue generation                                       : Skip
  $ perf --version
  perf version 4.18.0-425.3.1.el8.ppc64le
  $ sudo perf test -v 42
  42: BPF filter                                                      :
  42.1: Basic BPF filtering                                           :
  --- start ---
  test child forked, pid 711060
  ...
  bpf: config 'func=do_epoll_wait' is ok
  Looking at the vmlinux_path (8 entries long)
  Using /usr/lib/debug/lib/modules/4.18.0-425.3.1.el8.ppc64le/vmlinux for symbols
  Open Debuginfo file: /usr/lib/debug/.build-id/81/56f5a07f92ccb62c5600ba0e4aacfb5f3a7534.debug
  Try to find probe point from debuginfo.
  Matched function: do_epoll_wait [4ef8cb0]
  found inline addr: 0xc00000000061dbe4
  Probe point found: __se_compat_sys_epoll_pwait+196
  found inline addr: 0xc00000000061d9f4
  Probe point found: __se_sys_epoll_pwait+196
  found inline addr: 0xc00000000061d824
  Probe point found: __se_sys_epoll_wait+36
  Found 3 probe_trace_events.
  Opening /sys/kernel/tracing//kprobe_events write=1
  ...
  BPF filter result incorrect, expected 56, got 56 samples
  test child finished with -1
  ---- end ----
  BPF filter subtest 1: FAILED!

The statement above about the result being incorrect looks weird, and it
is due to that particular perf build missing commit 3e11300cdf
("perf test: Fix bpf test sample mismatch reporting"). In reality, due
to commit 4b04e0decd ("perf test: Fix basic bpf filtering test"),
perf expects there to be 56*3 samples.

However, the number of samples we receive is going to be dependent on
where the probes are installed, which is dependent on where
do_epoll_wait gets inlined. On s390x, it looks like probes at all the
inlined locations are hit. But, that is not the case on ppc64le.

Fix this by switching the test to instead use the syscall tracepoint.
This ensures that we will only ever install a single event enabling us
to reliably determine the sample count.

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20230123083224.276404-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-23 09:58:01 -03:00
Arnaldo Carvalho de Melo
91f67b9a64 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick fixes that went via perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-23 09:56:00 -03:00
James Clark
5670ebf54b perf cs-etm: Ensure that Coresight timestamps don't go backwards
There are some edge cases around estimated timestamps that can result
in them going backwards.

One is that after a discontinuity, the last used timestamp is set to 0.
The duration of the next range is then subtracted which could result in
an earlier timestamp than the last instruction. Fix this by not
resetting the last timestamp used on a discontinuity, and make sure that
new estimated timestamps are clamped to be later than that.

Another case is that estimated timestamps could compound over time to
end up being more than the next real timestamp in the trace. Fix this by
clamping the estimates in cs_etm_decoder__do_soft_timestamp() to be no
later than it.
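
The two clamps described above can be pictured roughly as follows. This
is a shell sketch with hypothetical names, not the decoder's C code, and
for simplicity it treats "later than" as "not earlier than".

```shell
#!/bin/sh
# Sketch only: keep an estimated timestamp between the last timestamp already
# used and the next real timestamp. All names are hypothetical.

clamp_estimate() {
	est=$1
	last_used=$2
	next_real=$3
	# Never go back before a timestamp that was already used.
	if [ "$est" -lt "$last_used" ]; then
		est=$last_used
	fi
	# Never run ahead of the next real timestamp in the trace.
	if [ "$est" -gt "$next_real" ]; then
		est=$next_real
	fi
	echo "$est"
}

clamp_estimate 5 10 20    # below the range, prints 10
clamp_estimate 25 10 20   # above the range, prints 20
clamp_estimate 15 10 20   # inside the range, prints 15
```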

cs_etm_decoder__do_soft_timestamp() also updated next_cs_timestamp,
which meant that the next real timestamp was lost and not stored
anywhere. Fix that by only updating cs_timestamp for estimates and keep
next_cs_timestamp untouched.

Finally, use next_cs_timestamp to signify if a timestamp has been
received previously. Because cs_timestamp has the first range
subtracted, it could technically go to 0 which would break the logic.

Testing
=======

It can be verified that timestamps don't go backwards when tracing on a
single core with the following commands. Across multiple cores it's
expected that timestamps are interleaved:

  $ perf record -e cs_etm/@tmc_etr0/k -C 4 taskset -c 4 sleep 1
  $ perf script --itrace=i1ns --ns -Fcomm,tid,pid,time,cpu,event,ip,sym,addr,symoff,flags,callindent > itrace
  $ sed 's/://g' itrace | awk -F ' ' ' { print $4 } ' | awk '{ if ($1 < prev) { print "line:" NR " " $0 } {prev=$1}}'
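
The check in the last command can be tried on its own with a plain list
of timestamps, one per line. The function name and the sample values
below are made up for illustration.

```shell
#!/bin/sh
# Sketch only: report every line whose timestamp is smaller than the one
# before it. Input is one numeric timestamp per line.

find_backwards() {
	awk '{ if (NR > 1 && $1 + 0 < prev) print "line:" NR " " $0; prev = $1 + 0 }'
}

# The third value goes backwards, so this prints: line:3 163.375099
printf '%s\n' 163.375100 163.375101 163.375099 163.375102 | find_backwards
```

No output means that no timestamp went backwards.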

Reported-by: Tanmay Jagdale <tanmay@marvell.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-9-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:46 -03:00
German Gomez
a7fe9a443b perf cs_etm: Set the time field in the synthetic samples
If virtual timestamps are detected, set sample time field accordingly,
otherwise warn the user that the samples will not include accurate
time data.

  | Test notes (FEAT_TRF platform)
  |
  | $ ./perf record -e cs_etm//u -a -- sleep 4
  | $ ./perf script --fields +time
  | 	    perf   422 [000]   163.375100:          1 branches:uH:                 0 [unknown] ([unknown])
  | 	    perf   422 [000]   163.375100:          1 branches:uH:      ffffb8009544 ioctl+0x14 (/lib/aarch64-linux-gnu/libc-2.27.so)
  | 	    perf   422 [000]   163.375100:          1 branches:uH:      aaaaab6bebf4 perf_evsel__run_ioctl+0x90 (/home/german/linux/tools/perf/perf)
  | [...]
  | 	    perf   422 [000]   167.393100:          1 branches:uH:      aaaaab6bda00 __xyarray__entry+0x74 (/home/german/linux/tools/perf/perf)
  | 	    perf   422 [000]   167.393099:          1 branches:uH:      aaaaab6bda0c __xyarray__entry+0x80 (/home/german/linux/tools/perf/perf)
  | 	    perf   422 [000]   167.393099:          1 branches:uH:      ffffb8009538 ioctl+0x8 (/lib/aarch64-linux-gnu/libc-2.27.so)
  |
  | The time from the first sample to the last sample is 4 seconds

Now that times are converted to nanoseconds, also try to estimate the
timestamps more accurately by dividing by some fixed value for
instructions per ns. This prevents long ranges from being estimated
further in the past than would be realistic.

Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-8-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:45 -03:00
German Gomez
2e2f7ceecc perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETE
Read the value of ts_source exposed by the driver and store it in the
ETMv4 and ETE header. If the interface doesn't exist (such as in older
kernels), default to a safe value of -1.

Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-7-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:44 -03:00
German Gomez
326163c552 perf cs_etm: Keep separate symbols for ETMv4 and ETE parameters
Previously, adding a new parameter at the end of ETMv4 meant adding it
somewhere in the middle of ETE, which is not supported by the current
header version.

Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-6-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:41 -03:00
German Gomez
c2b6a8969c perf pmu: Add function to check if a pmu file exists
Add a utility function perf_pmu__file_exists() to check if a given pmu
file exists in the sysfs filesystem.
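
A shell analogue of such a check is sketched below, assuming the usual
bus/event_source/devices/ layout under /sys. The function name and the
SYSFS_ROOT override are assumptions made for illustration; this is not
the C helper itself.

```shell
#!/bin/sh
# Sketch only: test whether a file exists for a given PMU in sysfs.
# SYSFS_ROOT is a hypothetical override so the check can be pointed at a
# scratch directory; it defaults to /sys.

pmu_file_exists() {
	pmu=$1
	file=$2
	[ -e "${SYSFS_ROOT:-/sys}/bus/event_source/devices/$pmu/$file" ]
}

# The result depends on the machine this runs on.
if pmu_file_exists cpu type; then
	echo "cpu/type exists"
else
	echo "cpu/type not found"
fi
```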

Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-5-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:35 -03:00
James Clark
5f2c8efa78 perf pmu: Remove remaining duplication of bus/event_source/devices/...
Use the new perf_pmu__pathname_scnprintf() instead. No functional
changes.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:33 -03:00
James Clark
d50a79cd0f perf pmu: Use perf_pmu__open_file() and perf_pmu__scan_file()
Remove some code that duplicates existing methods. Copy strings where
const strings are required.

No functional changes.

Committer notes:

Add a stub for perf_pmu__scan_file() in tools/perf/util/python.c not to
drag tools/perf/util/pmu.c into the python binding.

This fixes 'perf test python' at this point in this patchset.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:32 -03:00
James Clark
f8ad6018ce perf pmu: Remove duplication around EVENT_SOURCE_DEVICE_PATH
The pattern for accessing EVENT_SOURCE_DEVICE_PATH is duplicated in a
few places, so add two utility functions to cover it. Also just use
perf_pmu__scan_file() instead of pmu_type() which already does the same
thing.

No functional changes.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:17:27 -03:00
Ian Rogers
4cbd5334ff perf tools: Fix foolproof typo
In the context of LBR stitching documentation.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230119201036.156441-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:10:53 -03:00
Adrian Hunter
df8aeaefea perf symbols: Check SHT_RELA and SHT_REL type earlier
Make the code more readable by checking for SHT_RELA and SHT_REL type
earlier.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:10:50 -03:00
Adrian Hunter
375a448184 perf symbols: Combine handling for SHT_RELA and SHT_REL
SHT_REL and SHT_RELA are handled the same way. Simplify by combining the
handling.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:10:43 -03:00
Adrian Hunter
45204677d4 perf symbols: Allow for .plt entries with no symbol
Create a sensible name for .plt entries with no symbol.

Example:

 Before:

   $ perf test --dso /usr/lib/x86_64-linux-gnu/libc.so.6 -vv Symbols 2>/tmp/cmp1.txt

 After:

   $ perf test --dso /usr/lib/x86_64-linux-gnu/libc.so.6 -vv Symbols 2>/tmp/cmp2.txt
   $ diff /tmp/cmp1.txt /tmp/cmp2.txt
    4c4
    < test child forked, pid 53043
    ---
    > test child forked, pid 54372
    23,62c23,62
    <  280f0-28100 g @plt
    <  28100-28110 g @plt
    <  28110-28120 g @plt
    <  28120-28130 g @plt
    <  28130-28140 g @plt
    <  28140-28150 g @plt
    <  28150-28160 g @plt
    <  28160-28170 g @plt
    <  28170-28180 g @plt
    <  28180-28190 g @plt
    <  28190-281a0 g @plt
    <  281a0-281b0 g @plt
    <  281b0-281c0 g @plt
    <  281c0-281d0 g @plt
    <  281d0-281e0 g @plt
    <  281e0-281f0 g @plt
    <  281f0-28200 g @plt
    <  28200-28210 g @plt
    <  28210-28220 g @plt
    <  28220-28230 g @plt
    <  28230-28240 g @plt
    <  28240-28250 g @plt
    <  28250-28260 g @plt
    <  28260-28270 g @plt
    <  28270-28280 g @plt
    <  28280-28290 g @plt
    <  28290-282a0 g @plt
    <  282a0-282b0 g @plt
    <  282b0-282c0 g @plt
    <  282c0-282d0 g @plt
    <  282d0-282e0 g @plt
    <  282e0-282f0 g @plt
    <  282f0-28300 g @plt
    <  28300-28310 g @plt
    <  28310-28320 g @plt
    <  28320-28330 g @plt
    <  28330-28340 g @plt
    <  28340-28350 g @plt
    <  28350-28360 g @plt
    <  28360-28370 g @plt
    ---
    >  280f0-28100 g offset_0x280f0@plt
    >  28100-28110 g offset_0x28100@plt
    >  28110-28120 g offset_0x28110@plt
    >  28120-28130 g offset_0x28120@plt
    >  28130-28140 g offset_0x28130@plt
    >  28140-28150 g offset_0x28140@plt
    >  28150-28160 g offset_0x28150@plt
    >  28160-28170 g offset_0x28160@plt
    >  28170-28180 g offset_0x28170@plt
    >  28180-28190 g offset_0x28180@plt
    >  28190-281a0 g offset_0x28190@plt
    >  281a0-281b0 g offset_0x281a0@plt
    >  281b0-281c0 g offset_0x281b0@plt
    >  281c0-281d0 g offset_0x281c0@plt
    >  281d0-281e0 g offset_0x281d0@plt
    >  281e0-281f0 g offset_0x281e0@plt
    >  281f0-28200 g offset_0x281f0@plt
    >  28200-28210 g offset_0x28200@plt
    >  28210-28220 g offset_0x28210@plt
    >  28220-28230 g offset_0x28220@plt
    >  28230-28240 g offset_0x28230@plt
    >  28240-28250 g offset_0x28240@plt
    >  28250-28260 g offset_0x28250@plt
    >  28260-28270 g offset_0x28260@plt
    >  28270-28280 g offset_0x28270@plt
    >  28280-28290 g offset_0x28280@plt
    >  28290-282a0 g offset_0x28290@plt
    >  282a0-282b0 g offset_0x282a0@plt
    >  282b0-282c0 g offset_0x282b0@plt
    >  282c0-282d0 g offset_0x282c0@plt
    >  282d0-282e0 g offset_0x282d0@plt
    >  282e0-282f0 g offset_0x282e0@plt
    >  282f0-28300 g offset_0x282f0@plt
    >  28300-28310 g offset_0x28300@plt
    >  28310-28320 g offset_0x28310@plt
    >  28320-28330 g offset_0x28320@plt
    >  28330-28340 g offset_0x28330@plt
    >  28340-28350 g offset_0x28340@plt
    >  28350-28360 g offset_0x28350@plt
    >  28360-28370 g offset_0x28360@plt

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:10:37 -03:00
Adrian Hunter
698a0d1a1a perf symbols: Add symbol for .plt header
perf expands the _init symbol over .plt because there are no PLT symbols
at that point, but then dso__synthesize_plt_symbols() creates them.

Fix by truncating the previous symbol and inserting a symbol for .plt
header.

Example:

 Before:

   $ perf test --dso `which uname` -v Symbols
    74: Symbols                                                         :
   --- start ---
   test child forked, pid 191028
   Problems creating module maps, continuing anyway...
   Testing /usr/bin/uname
   Overlapping symbols:
    2000-25f0 g _init
    2040-2050 g free@plt
   test child finished with -1
   ---- end ----
   Symbols: FAILED!
   $ perf test --dso `which uname` -vv Symbols 2>/tmp/cmp1.txt

 After:

   $ perf test --dso `which uname` -v Symbols
    74: Symbols                                                         :
   --- start ---
   test child forked, pid 194291
   Testing /usr/bin/uname
   test child finished with 0
   ---- end ----
   Symbols: Ok
   $ perf test --dso `which uname` -vv Symbols 2>/tmp/cmp2.txt
   $ diff /tmp/cmp1.txt /tmp/cmp2.txt
   4,5c4
   < test child forked, pid 191031
   < Problems creating module maps, continuing anyway...
   ---
   > test child forked, pid 194296
   9c8,9
   <  2000-25f0 g _init
   ---
   >  2000-2030 g _init
   >  2030-2040 g .plt
   100,103c100
   < Overlapping symbols:
   <  2000-25f0 g _init
   <  2040-2050 g free@plt
   < test child finished with -1
   ---
   > test child finished with 0
   105c102
   < Symbols: FAILED!
   ---
   > Symbols: Ok
   $

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:10:28 -03:00
Adrian Hunter
5fec9b171c perf symbols: Do not check ss->dynsym twice
ss->dynsym is checked to be not NULL twice. Remove the first check
because, in fact, there can be a plt with no dynsym, which is something
that will be dealt with later.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:10:23 -03:00
Adrian Hunter
477d5e35b4 perf symbols: Slightly simplify 'err' usage in dso__synthesize_plt_symbols()
Return zero directly instead of needless 'goto out_elf_end' that does
the same thing. That allows 'err' to be initialized to -1 instead of
having to change its value later.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230120123456.12449-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-22 18:10:18 -03:00