Commit Graph

5181 Commits

Author SHA1 Message Date
Adrian Hunter
d2c959803c perf kcore_copy: Iterate phdrs
In preparation to add more program headers, iterate phdrs instead of
assuming there is only one for the kernel text and one for the modules.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:42 -03:00
Adrian Hunter
15acef6c37 perf kcore_copy: Layout sections
In preparation to add more program headers, layout the relative offset
of each section.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:42 -03:00
Adrian Hunter
c9dd1d8949 perf kcore_copy: Calculate offset from phnum
In preparation to add more program headers, calculate offset from the
number of phdrs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:41 -03:00
Adrian Hunter
6e97957d3d perf kcore_copy: Keep a count of phdrs
In preparation to add more program headers, keep a count of phdrs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:41 -03:00
Adrian Hunter
f683820948 perf kcore_copy: Keep phdr data in a list
Currently, kcore_copy makes 2 program headers, one for the kernel text
(namely kernel_map) and one for the modules (namely modules_map). Now
more program headers are needed, but treating each program header as a
special case results in much more code.

Instead, in preparation to add more program headers, change to keep
program header data (phdr_data) in a list.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:40 -03:00
Jin Yao
787e4da9f9 perf annotate: Show group event string for stdio
When we enable the group, for tui/stdio2, the output first line includes
the group event string. While for stdio, it will show only one event.

For example,

perf record -e cycles,branches ./div
perf annotate --group --stdio

    Percent |      Source code & Disassembly of div for cycles (44407 samples)
    ......

The first line doesn't include the event 'branches'.

With this patch, it will show the correct group even string.

perf annotate --group --stdio

    Percent |      Source code & Disassembly of div for cycles, branches (44407 samples)
    ......

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526989115-14435-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:40 -03:00
Adrian Hunter
a8ce99b0ee perf machine: Synthesize and process mmap events for x86 PTI entry trampolines
Like the kernel text, the location of x86 PTI entry trampolines must be
recorded in the perf.data file. Like the kernel, synthesize a mmap event
for that, and add processing for it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:39 -03:00
Adrian Hunter
1c5aae7710 perf machine: Create maps for x86 PTI entry trampolines
Create maps for x86 PTI entry trampolines, based on symbols found in
kallsyms. It is also necessary to keep track of whether the trampolines
have been mapped particularly when the kernel dso is kcore.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-9-git-send-email-adrian.hunter@intel.com
[ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:24:08 -03:00
Adrian Hunter
5759a6820a perf machine: Allow for extra kernel maps
Identify extra kernel maps by name so that they can be distinguished
from the kernel map and module maps.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:59:22 -03:00
Adrian Hunter
4d004365e2 perf machine: Fix map_groups__split_kallsyms() for entry trampoline symbols
When kernel symbols are derived from /proc/kallsyms only (not using
vmlinux or /proc/kcore) map_groups__split_kallsyms() is used. However
that function makes assumptions that are not true with entry trampoline
symbols. For now, remove the entry trampoline symbols at that point, as
they are no longer needed at that point.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:55:59 -03:00
Adrian Hunter
4d99e41365 perf machine: Workaround missing maps for x86 PTI entry trampolines
On x86_64 the PTI entry trampolines are not in the kernel map created by
perf tools. That results in the addresses having no symbols and prevents
annotation.  It also causes Intel PT to have decoding errors at the
trampoline addresses.

Workaround that by creating maps for the trampolines.

At present the kernel does not export information revealing where the
trampolines are.  Until that happens, the addresses are hardcoded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:54:22 -03:00
Adrian Hunter
9cecca325e perf machine: Add nr_cpus_avail()
Add a function to return the number of the machine's available CPUs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:52:49 -03:00
Jin Yao
e2bdbe80a0 perf evlist: Introduce force_leader() method
For non-explicit group (e.g. those created with -e '{eventA,eventB}'),
'perf report' supports a option '--group' which can enable group output.

We also need to support 'perf annotate' with the same '--group'.

Create a new function perf_evlist__force_leader() which contains common
code to force setting the group leader.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526914666-31839-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21 14:40:54 -03:00
Adrian Hunter
19422a9f2a perf tools: Fix kernel_start for PTI on x86
Opickn x86_64, PTI entry trampolines are less than the start of kernel text,
but still above 2^63. So leave kernel_start = 1ULL << 63 for x86_64.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526548928-20790-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19 06:42:51 -03:00
Adrian Hunter
dbbd34a666 perf machine: Add machine__is() to identify machine arch
Add a function to identify the machine architecture.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526548928-20790-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19 06:42:50 -03:00
Jin Yao
3e71fc0319 perf annotate: Create hotkey 'c' to show min/max cycles
In the 'perf annotate' view, a new hotkey 'c' is created for showing the
min/max cycles.

For example, when press 'c', the annotate view is:

  Percent│ IPC     Cycle(min/max)
         │
         │
         │                             Disassembly of section .text:
         │
         │                             000000000003aab0 <random@@GLIBC_2.2.5>:
    8.22 │3.92                           sub    $0x18,%rsp
         │3.92                           mov    $0x1,%esi
         │3.92                           xor    %eax,%eax
         │3.92                           cmpl   $0x0,argp_program_version_hook@@G
         │3.92             1(2/1)      ↓ je     20
         │                               lock   cmpxchg %esi,__abort_msg@@GLIBC_P
         │                             ↓ jne    29
         │                             ↓ jmp    43
         │1.10                     20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+
    8.93 │1.10             1(5/1)      ↓ je     43

When press 'c' again, the annotate view is switched back:

  Percent│ IPC Cycle
         │
         │
         │                Disassembly of section .text:
         │
         │                000000000003aab0 <random@@GLIBC_2.2.5>:
    8.22 │3.92              sub    $0x18,%rsp
         │3.92              mov    $0x1,%esi
         │3.92              xor    %eax,%eax
         │3.92              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x
         │3.92     1      ↓ je     20
         │                  lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
         │                ↓ jne    29
         │                ↓ jmp    43
         │1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
    8.93 │1.10     1      ↓ je     43

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526569118-14217-3-git-send-email-yao.jin@linux.intel.com
[ Rename all maxmin to minmax ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19 06:42:49 -03:00
Jin Yao
48659ebf37 perf annotate: Record the min/max cycles
Currently perf has a feature to account cycles for LBRs

For example, on skylake:

  perf record -b ...
  perf report or perf annotate

And then browsing the annotate browser gives average cycle counts for
program blocks.

For some analysis it would be useful if we could know not only the
average cycles but also the min and max cycles.

This patch records the min and max cycles.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526569118-14217-2-git-send-email-yao.jin@linux.intel.com
[ Switch from max/min to min/max ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-18 16:31:41 -03:00
Sandipan Das
1961018469 perf script: Show virtual addresses instead of offsets
When perf data is recorded with the call-graph option enabled, the
callchain shown by perf script shows the binary offsets of the symbols
as the ip. This is incorrect for kernel symbols as the ip values are
always off by a fixed offset depending on the architecture. If the
offsets from the start of the symbols are printed, they are also
incorrect for both kernel and userspace symbols.

Without the call-graph option, the callchain shows the virtual addresses
of the symbols rather than their binary offsets. The offsets printed in
this case are also correct.

This fixes the inconsistency in perf script's output.

This can be verified on a powerpc64le system running Fedora 27 as
follows:

  # cat /proc/kallsyms | grep sys_write
  ...
  c0000000004025a0 T sys_write
  c0000000004025a0 T __se_sys_write
  ...

  # perf probe -a sys_write

Before applying this patch:

  # perf record -e probe:sys_write -g ~/test
  # perf script -F ip,sym,symoff

                    4125b0 sys_write+0x8000000000008010
                     1b9e0 system_call+0x8000000000008058
                    118234 __GI___libc_write+0xffff0000f52c0024
                     92c74 _IO_file_write@@GLIBC_2.17+0xffff0000f52c0044
                  5afbfd8a [unknown]
                     91a60 new_do_write+0xffff0000f52c0090
                     94638 _IO_do_write@@GLIBC_2.17+0xffff0000f52c0038
                     94bbc _IO_file_overflow@@GLIBC_2.17+0xffff0000f52c014c
                     95a24 __overflow+0xffff0000f52c0064
                     84548 _IO_puts+0xffff0000f52c0218
                       440 main+0xffffffffe0000020
                     236a0 generic_start_main.isra.0+0xffff0000f52c0140
                     23898 __libc_start_main+0xffff0000f52c00b8
                         0 [unknown]
  ...

  # perf record -e probe:sys_write ~/test
  # perf script -F ip,sym,symoff

  c0000000004025b0 sys_write+0x10
  ...

After applying this patch:

  # perf record -e probe:sys_write -g ~/test
  # perf script -F ip,sym,symoff

          c0000000004025b0 sys_write+0x10
          c00000000000b9e0 system_call+0x58
              7fffb70d8234 __GI___libc_write+0x24
              7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44
                  5afc1818 [unknown]
              7fffb7051a60 new_do_write+0x90
              7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38
              7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c
              7fffb7055a24 __overflow+0x64
              7fffb7044548 _IO_puts+0x218
                  10000440 main+0x20
              7fffb6fe36a0 generic_start_main.isra.0+0x140
              7fffb6fe3898 __libc_start_main+0xb8
                         0 [unknown]
  ...

  # perf record -e probe:sys_write ~/test
  # perf script -F ip,sym,symoff

  c0000000004025b0 sys_write+0x10
  ...

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180517063326.6319-1-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17 16:55:29 -03:00
Arnaldo Carvalho de Melo
029c75e5cf perf tools: No need to unconditionally read the max_stack sysctls
Let tools that need to have those variables with the sysctl current
values use a function that will read them.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1ljj3oeo5kpt2n1icfd9vowe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17 16:31:32 -03:00
Arnaldo Carvalho de Melo
9ac94e31ca perf tools: Read the cache line size lazily
It is not read as commonly as 'page_size', so it makes sense to read it
lazily, caching its value when it is first read.

Less files open unconditionally at startup.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-35xhrq91u94uc1djtclek1ie@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17 16:03:34 -03:00
Arnaldo Carvalho de Melo
7014e0e3bf tools lib api fs tracing_path: Introduce opendir() method
That takes care of using the right call to get the tracing_path
directory, the one that will end up calling tracing_path_set() to figure
out where tracefs is mounted.

One more step in doing just lazy reading of system structures to reduce
the number of operations done unconditionaly at 'perf' start.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-42zzi0f274909bg9mxzl81bu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17 14:50:38 -03:00
Arnaldo Carvalho de Melo
25a7d91427 perf parse-events: Use get/put_events_file()
Instead of accessing the trace_events_path variable directly, that may
not have been properly initialized wrt detecting where tracefs is
mounted.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-id7hzn1ydgkxbumeve5wapqz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17 14:49:36 -03:00
Arnaldo Carvalho de Melo
c02cab228e perf tools: Reuse the path to the tracepoint /events/ directory
When using for_each_event() we needlessly rebuild the whole path to
the tracepoint directory, reuse the dir_path instead, saving some cycles
and reducing the size of the next patch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-54bcs15n0cp6gwcgpc4hptyc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17 14:25:07 -03:00
Arnaldo Carvalho de Melo
40c3c0c9ac tools lib api fs tracing_path: Introduce get/put_events_file() helpers
To make reading events files a tad more compact than with
get_tracing_files("events/foo").

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-do6xgtwpmfl8zjs1euxsd2du@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17 12:01:50 -03:00
Arnaldo Carvalho de Melo
17c257e867 tools lib api: Unexport 'tracing_path' variable
One should use tracing_path_mount() instead, so more things get done
lazily instead of at every 'perf' tool call startup.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-fci4yll35idd9yuslp67vqc2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16 16:27:14 -03:00
Arnaldo Carvalho de Melo
d01bd1ac92 perf config: Call perf_config__init() lazily
We check what perf_config__init() does at each perf_config() call,
namely if the static perf_config instance was created, so instead of
bailing out in that case, try to allocate it, bailing if it fails.

Now to get the perf_config() call out of the start of perf's main()
function, doing it also lazily.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-4bo45k6ivsmbxpfpdte4orsg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16 16:11:09 -03:00
YueHaibing
7a36a287de perf bpf: Fix NULL return handling in bpf__prepare_load()
bpf_object__open()/bpf_object__open_buffer can return error pointer or
NULL, check the return values with IS_ERR_OR_NULL() in bpf__prepare_load
and bpf__prepare_load_buffer

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: netdev@vger.kernel.org
Link: https://lkml.kernel.org/n/tip-psf4xwc09n62al2cb9s33v9h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16 10:01:55 -03:00
Kan Liang
3cdc5c2cb9 perf parse-events: Handle uncore event aliases in small groups properly
Perf stat doesn't count the uncore event aliases from the same uncore
block in a group, for example:

  perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000
  #           time             counts unit events
       1.000447342      <not counted>      unc_m_cas_count.all
       1.000447342      <not counted>      unc_m_clockticks
       2.000740654      <not counted>      unc_m_cas_count.all
       2.000740654      <not counted>      unc_m_clockticks

The output is very misleading. It gives a wrong impression that the
uncore event doesn't work.

An uncore block could be composed by several PMUs. An uncore event alias
is a joint name which means the same event runs on all PMUs of a block.
Perf doesn't support mixed events from different PMUs in the same group.
It is wrong to put uncore event aliases in a big group.

The right way is to split the big group into multiple small groups which
only include the events from the same PMU.

Only uncore event aliases from the same uncore block should be specially
handled here. It doesn't make sense to mix the uncore events with other
uncore events from different blocks or even core events in a group.

With the patch:
  #           time             counts unit events
     1.001557653            140,833      unc_m_cas_count.all
     1.001557653      1,330,231,332      unc_m_clockticks
     2.002709483             85,007      unc_m_cas_count.all
     2.002709483      1,429,494,563      unc_m_clockticks

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16 10:01:54 -03:00
Adrian Hunter
5654997838 perf tools: Use the "_stest" symbol to identify the kernel map when loading kcore
The first symbol is not necessarily in the kernel text.  Instead of
using the first symbol, use the _stest symbol to identify the kernel map
when loading kcore.

This allows for the introduction of symbols to identify the x86_64 PTI
entry trampolines.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1525866228-30321-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15 14:31:25 -03:00
Arnaldo Carvalho de Melo
1b16fffa38 perf llvm-utils: Add bpf include path to clang command line
We'll start putting headers for helpers to be used in eBPF proggies in
there:

  # perf trace -v --no-syscalls -e empty.c |& grep "llvm compiling command : "
  llvm compiling command : /usr/lib64/ccache/clang -D__KERNEL__ -D__NR_CPUS__=4 -DLINUX_VERSION_CODE=0x41100   -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h  -I/home/acme/lib/include/perf/bpf -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/4.17.0-rc3-00034-gf4ef6a438cee/build -c /home/acme/bpf/empty.c -target bpf -O2 -o -
  #

Notice the "-I/home/acme/lib/include/perf/bpf"

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6xq94xro8xlb5s9urznh3f9k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15 14:31:17 -03:00
Arnaldo Carvalho de Melo
291c161f6c Merge remote-tracking branch 'tip/perf/urgent' into perf/core
To pick up fixes, notably the revert for the intel_pt//u regression.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15 10:30:17 -03:00
Arnaldo Carvalho de Melo
c23080a6e4 perf tools: Add missing newline when parsing empty BPF proggie
This is not specific to BPF but was found when parsing a .c BPF proggie
that while valid, had no events attached to tracepoints, kprobes, etc:

Very minimal file that perf's BPF code can compile:

  # cat empty.c
  char _license[] __attribute__((section("license"), used)) = "GPL";
  int _version __attribute__((section("version"), used)) = LINUX_VERSION_CODE;
  #

Before this patch:

  # perf trace -e empty.c
  WARNING: event parser found nothinginvalid or unsupported event: 'empty.c'
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
    #

After:

  # perf trace -e empty.c
  WARNING: event parser found nothing
  invalid or unsupported event: 'empty.c'
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8ysughiz00h6mjpcot04qyjj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-11 12:10:02 -03:00
Leo Yan
3a0887997d perf cs-etm: Remove redundant space
There have two spaces ahead function name cs_etm__set_pid_tid_cpu(), so
remove one space and correct indentation.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1525924920-4381-2-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-11 10:46:36 -03:00
Leo Yan
46d5362004 perf cs-etm: Support unknown_thread in cs_etm_auxtrace
CoreSight doesn't allocate thread structure for unknown_thread in ETM
auxtrace, so unknown_thread is NULL pointer.  If the perf data doesn't
contain valid tid and then cs_etm__mem_access() uses unknown_thread
instead as thread handler, this results in a segmentation fault when
thread__find_addr_map() accesses the thread handler.

This commit creates a new thread data which is used by unknown_thread, so
CoreSight tracing can roll back to use unknown_thread if perf data
doesn't include valid thread info.  This commit also releases thread
data for initialization failure case and for normal auxtrace free flow.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1525924920-4381-1-git-send-email-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-11 10:45:23 -03:00
Jin Yao
04d2600ab6 perf annotate: Display all available events on --stdio
When we perform the following command lines:

  $ perf record -e "{cycles,branches}" ./div
  $ perf annotate main --stdio

The output shows only the first event, "cycles" and the displaying
format is not correct.

   Percent         |      Source code & Disassembly of div for cycles (44550 samples)
  -----------------------------------------------------------------------------------
                   :
                   :
                   :
                   :            Disassembly of section .text:
                   :
                   :            00000000004004b0 <main>:
                   :            main():
                   :
                   :                    return i;
                   :            }
                   :
                   :            int main(void)
                   :            {
      0.00 :   4004b0:       push   %rbx
                   :                    int i;
                   :                    int flag;
                   :                    volatile double x = 1212121212, y = 121212;
                   :
                   :                    s_randseed = time(0);
      0.00 :   4004b1:       xor    %edi,%edi
                   :                    srand(s_randseed);
      0.00 :   4004b3:       mov    $0x77359400,%ebx
                   :
                   :                    return i;
                   :            }

The issue is that the value of the 'nr_percent' variable is hardcoded to
1.  This patch fixes it.

With this patch, the output is:

   Percent         |      Source code & Disassembly of div for cycles (44550 samples)
  -----------------------------------------------------------------------------------
                   :
                   :
                   :
                   :            Disassembly of section .text:
                   :
                   :            00000000004004b0 <main>:
                   :            main():
                   :
                   :                    return i;
                   :            }
                   :
                   :            int main(void)
                   :            {
      0.00    0.00 :   4004b0:       push   %rbx
                   :                    int i;
                   :                    int flag;
                   :                    volatile double x = 1212121212, y = 121212;
                   :
                   :                    s_randseed = time(0);
      0.00    0.00 :   4004b1:       xor    %edi,%edi
                   :                    srand(s_randseed);
      0.00    0.00 :   4004b3:       mov    $0x77359400,%ebx
                   :
                   :                    return i;
                   :            }

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: f681d593d1 ("perf annotate: Remove disasm__calc_percent() from disasm_line__print()")
Link: http://lkml.kernel.org/r/1525881435-4092-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-10 15:19:30 -03:00
Arnaldo Carvalho de Melo
4a35a9027f Revert "perf pmu: Fix pmu events parsing rule"
As reported by Adrian Hunter, this breaks intel_pt event parsing:

  # perf record -e intel_pt//u uname
  event syntax error: 'intel_pt//u'
                               \___ parser error
  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  #

This reverts commit 9a4a931ce8.

Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ye1o2mji7x68xotiot1tn1gp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-07 16:28:10 -03:00
Arnaldo Carvalho de Melo
107cad95ff perf machine: Ditch find_kernel_function variants
Since we do not have split symtabs anymore, no need to have explicit
find_kernel_function variants, use the find_kernel_symbol ones.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hiw2ryflju000f6wl62128it@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-30 12:20:54 -03:00
Colin Ian King
246907611e perf tools: Fix spelling mistake: "builid" -> "buildid"
Trivial fix to spelling mistake in error message text

Signed-off-by: Colin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20180427193158.17932-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-30 12:02:03 -03:00
Arnaldo Carvalho de Melo
15e0e2d4ee perf symbols: Move split_kallsyms to struct map_groups
Since it mainly will populate symtabs of its maps (kernel modules).

While looking at this I wonder if map_groups__split_kallsyms_for_kcore()
shouldn't be all that we need, seems much simpler.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-3d1f3iby76popdr8ia9yimsc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27 16:05:15 -03:00
Arnaldo Carvalho de Melo
019c6820d5 perf symbols: kallsyms__delta() needs the kmap, not the map
It was only using the map to obtain its kmap, so do the validation in
its called, __dso__load_kallsyms() and pass the kmap, that will be used
in the following patches in similar simplifications.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-u6p9hbonlqzpl6o1z9xzxd75@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27 15:47:13 -03:00
Arnaldo Carvalho de Melo
333cc76c9d perf symbols: Remove unused dso__load_all_kallsyms() 'map' parameter
Only the 'dso' is needed, so ditch the struct used to pass (map, dso),
passing just the used 'dso' pointer.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-17a4gkk1cs4up4smkviymi2g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27 15:36:15 -03:00
Arnaldo Carvalho de Melo
4e0d1e8bcb perf symbols: Split kernel symbol processing from dso__load_sym()
More should be done to split this function, removing stuff map
relocation steps from the actual symbol table loading.

Arch specific stuff also should go elsewhere, to tools/arch/ and
we should have it keyed by data from the perf_env either in the
perf.data header or from the running environment.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-236gyo6cx6iet90u3uc01cws@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27 15:15:24 -03:00
Arnaldo Carvalho de Melo
857140e816 perf symbols: Remove needless goto
We can plain use the an else to the if block that is right after that
goto, so simplify it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-vnpc2rakf6vc98pcl5z1cfrg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27 10:53:14 -03:00
Arnaldo Carvalho de Melo
3183f8ca30 perf symbols: Unify symbol maps
Remove the split of symbol tables for data (MAP__VARIABLE) and for
functions (MAP__FUNCTION), its unneeded and there were various places
doing two lookups to find a symbol, so simplify this.

We still will consider only the symbols that matched the filters in
place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in
the patch, just so that we consider only the same symbols as before,
to reduce the possibility of regressions.

All the tests on 50-something build environments, in varios versions
of lots of distros and cross build environments were performed without
build regressions, as usual with all pull requests the other tests were
also performed: 'perf test' and 'make -C tools/perf build-test'.

Also this was done at a great granularity so that regressions can be
bisected more easily.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-hiq0fy2rsleupnqqwuojo1ne@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-27 10:47:06 -03:00
Arnaldo Carvalho de Melo
e9814df864 perf symbols: Use map->prot in place of type==MAP__FUNCTION
Its equivalent, one less use of enum map_type.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6m18iv1ty7nh7kxlfmn89sgz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 16:15:08 -03:00
Arnaldo Carvalho de Melo
d183b2614f perf map: Use map->prot in place of type==MAP__FUNCTION
Equivalent, one step more in ditching enum map_type.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-mrjjc87a4tpf896j5u4sql4e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 16:08:38 -03:00
Arnaldo Carvalho de Melo
18231d7946 perf symbols: Use symbol type instead of map->type
map->type is going away, we can derive it from map->prot, so use
the same logic as in the kernel's arch/arm/kernel/module.c file:

  ELF32_ST_TYPE(sym->st_info) == STT_FUNC && !(sym->st_value & 1))

This was introduced in b2f8fb237e ("perf symbols: Fix annotation of
thumb code"), that fix is maintained with this change.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dave Martin <dave.martin@linaro.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dr. David Alan Gilbert <david.gilbert@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-us590h81uqgxaumucfttqj50@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:20 -03:00
Arnaldo Carvalho de Melo
d1fd8d9e6b perf symbols: No need to special case MAP__FUNCTION in fixup
In 39b12f7812 ("perf tools: Make it possible to read object code from
vmlinux") we special case MAP__FUNCTION maps inconsistently, the first
test tests the map type while the following tests added by this patch
don't do that, be consistent and elliminate this special case.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-khmi5jccpcwqa9nybefluzqp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:20 -03:00
Arnaldo Carvalho de Melo
6769e98dde perf sort: Use mmap->prot on "dcacheline" formatting
To match the kernel when setting the PERF_RECORD_MISC_MMAP_DATA bit
in perf_event_attr.header.misc, that gets set when VM_EXEC is not
set in the vm_flags.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-r1z0tbdc7tich469aw4szinx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:19 -03:00
Arnaldo Carvalho de Melo
0f476f2bbc perf machine: Set PROT_EXEC for executable PERF_RECORD_MMAP records
The kernel doesn't fill the map 'prot' field for PERF_RECORD_MMAP
records, and we will use that info to replace checking for
MAP__VARIABLE, so store that when processing the
PERF_RECORD_MISC_MMAP_DATA perf_event_attr.header.misc bit.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-es3zz9r0q2qlssg4wh1w1d8p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:19 -03:00
Arnaldo Carvalho de Melo
af30bffa2f perf symbols: Store the ELF symbol type in the symbol struct
There is code that needs to see if a resolved address is a function, so,
since we're going to ditch the MAP__{FUNCTION,VARIABLE} split, store
that info in the per symbol struct.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-9ugwxz0i8ryg5702rx8u5q6z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:18 -03:00
Arnaldo Carvalho de Melo
e1f2a0d0f2 perf map: Remove map_type arg from map_groups__find()
One more step in ditching the split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-4pour7egur07tkrpbynawemv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:18 -03:00
Arnaldo Carvalho de Melo
404eb5a436 perf thread: Make thread__find_map() search all maps
We still have the split internally, but users don't see it anymore,
simplifying the growing number of cases where we end up searching
in the MAP__VARIABLE maps.

This further paves the way for ditching the split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-86mfxrztf310konutxvhr5ua@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:17 -03:00
Arnaldo Carvalho de Melo
117d3c2474 perf thread: Ditch __thread__find_symbol()
Simulate having all symbols in just one tree by searching the still
existing two trees.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-uss70e8tvzzbzs326330t83q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:17 -03:00
Arnaldo Carvalho de Melo
128cde3379 perf machine: Use machine__find_kernel_function() instead of open coded version
We have that equivalent, shorter helper, use it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-1hcgu3k7vxdy4vknqf3kbtzt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:16 -03:00
Arnaldo Carvalho de Melo
26bd933164 perf thread: Remove addr_type arg from thread__find_cpumode_addr_location()
All callers are for MAP__FUNCTION, so just ditch it and use
thread__find_symbol(), that already ditched MAP__FUNCTION, i.e.
internally uses it till we ditch it for good.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-i0ocxs00b4a0tlrx31lyh2cs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:16 -03:00
Arnaldo Carvalho de Melo
af07eeb04c perf symbols: Remove map_type arg from dso__find_symbol()
One more step to ditch MAP__{VARIABLE,FUNCTION}

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-919d1k13ts62pjipnpibvgwd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:15 -03:00
Arnaldo Carvalho de Melo
dce0478b5f perf map: Remove enum_type arg to map_groups__first()
Only the symbol core needs to use that, so provide a __ variant for that
case, that will end up removed when we ditch the MAP__ split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-x29k9e1ohastsoqbilp3mguh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:15 -03:00
Arnaldo Carvalho de Melo
a2f1c160fe perf symbols: Unexport symbol_type__is_a()
Now this is only used in the symbols.c file, where it will finally
disappear when we remove the MAP_{FUNCTION,VARIABLE} split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-a9t4d4hfrycczq9vpsk5sr8q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:15 -03:00
Arnaldo Carvalho de Melo
e85e0e0ccc perf tools: Use kallsyms__is_function()
Replacing equivalent, the equivalent and longer variation:

	 symbol__is_a(type, MAP__FUNCTION);

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-9t3dqogher54owfl9o2mir52@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:14 -03:00
Arnaldo Carvalho de Melo
5cf88a6325 perf symbols: Shorten dso__(first|last)_symbol()
All users want MAP__FUNCTION, and this split is going away.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sm72zwt1f03ma5uw78l6zze0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:13 -03:00
Arnaldo Carvalho de Melo
abe5449d2d perf map: Shorten map_groups__find() signature
Removing the map_type, that is going away.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-18iiiw25r75xn7zlppjldk48@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:12 -03:00
Arnaldo Carvalho de Melo
1d1a2654ff perf machine: Remove needless map_type from machine__load_vmlinux_path()
Since it uses machine__kernel_map() and this function always returns the
MAP__FUNCTION map, it doesn't make sense to call it with MAP__VARIABLE.

And also this is a step in the direction of nuking the MAP__{FUNCTION,VARIABLE}
split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-0h3eof3kx3kq32ixg5fquf3p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:11 -03:00
Arnaldo Carvalho de Melo
329f0adef3 perf machine: Shorten machine__load_kallsyms() signature
So far the only use is for MAP__FUNCTION, and since we're going to
remove that split, remove the map_type argument in machine__load_kallsyms().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-5dhgh7x8g9hx5hpxlp3k08jp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:10 -03:00
Arnaldo Carvalho de Melo
68a741868a perf machine: Introduce machine__kernel_maps()
That returns the a data structure contained the ordered list of kernel
modules + the main kernel maps, one more step in removing the
MAP__{FUNCTION,VARIABLE} split.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-qsgbxfyaohc80c9ma049dubm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:10 -03:00
Arnaldo Carvalho de Melo
83cf774b02 perf map: Shorten map_groups__find_by_name() signature
Another step in the road to elliminate the MAP_{FUNCTION,VARIABLE}
separation, reducing the exposure to these details in the tools using
the symbol APIs.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-8a1hvrqe3r5i0kw865u3uxwt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:09 -03:00
Arnaldo Carvalho de Melo
d9a5f27460 perf thread: Make thread__find_symbol() return the symbol searched
Instead of just returning it in al.sym, allowing for some simplification
in its users, and to make it consistent with thread__find_map().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-4axi2sigslffdixzxbehvgoj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:09 -03:00
Arnaldo Carvalho de Melo
71a84b5aed perf thread: Make thread__find_map() return the map
It was returning the searched map just on the addr_location passed, with
the function itself returning void.

Make it return the map so that we can make the code more compact.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tzlrrzdeoof4i6ktyqv1t6ks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:08 -03:00
Arnaldo Carvalho de Melo
4546263d72 perf thread: Introduce thread__find_symbol()
Out of thread__find_addr_location(..., MAP__FUNCTION, ...), idea here is to
continue removing references to MAP__{FUNCTION,VARIABLE} ahead of
getting both types of symbols in the same rbtree, as various places do
two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE.

So thread__find_symbol() will eventually do just that, and 'struct
symbol' will have the symbol type, for code that cares about that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-n7528en9e08yd3flzmb26tth@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:07 -03:00
Arnaldo Carvalho de Melo
f07a2d32b5 perf thread: Introduce thread__find_map()
Out of thread__find_add_map(..., MAP__FUNCTION, ...), idea here is to
continue removing references to MAP__{FUNCTION,VARIABLE} ahead of
getting both types of symbols in the same rbtree, as various places do
two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE.

So thread__find_map() will eventually do just that, and 'struct symbol'
will have the symbol type, for code that cares about that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-q27xee34l4izpfau49w103s6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:06 -03:00
Arnaldo Carvalho de Melo
e94b861a23 perf map: Introduce map__has_symbols()
To further simplify checking if symbols are available for a given map
and to reduce the number of users of MAP__{FUNCTION,VARIABLE}.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-iyfoyvbfdti5uehgpjum3qrq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:06 -03:00
Arnaldo Carvalho de Melo
d88205db9c perf dso: Add dso__has_symbols() method
To replace longer code sequences in various places.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tlk3klbkfyjrbfjvryyznfju@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:05 -03:00
Arnaldo Carvalho de Melo
efdd5c6b81 perf symbols: Use __map__is_kernel() instead of ad-hoc equivalent code
Shorter, should be equivalent code, use it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-q90olng8sfkvrnsrwu7xnul6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 13:47:00 -03:00
Jiri Olsa
e55c14af48 perf stat: Add --table option to display time of each run
Add --table option to display time for each run (-r option), like:

  $ perf stat --null -r 5 --table perf bench sched pipe

   Performance counter stats for './perf bench sched pipe' (5 runs):

             # Table of individual measurements:
             5.379 (-0.176)
             5.243 (-0.311)
             5.238 (-0.317)
             5.536 (-0.019)
             6.377 (+0.823)

             # Final result:
             5.555 +- 0.213 seconds time elapsed  ( +-  3.83% )

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180423090823.32309-8-jolsa@kernel.org
[ Document the new option in 'perf stat's man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-26 09:30:27 -03:00
Kan Liang
80ee8c588a perf stat: Fix duplicate PMU name for interval print
PMU name is printed repeatedly for interval print, for example:

  perf stat --no-merge -e 'unc_m_clockticks' -a -I 1000
  #           time             counts unit events
     1.001053069        243,702,144      unc_m_clockticks [uncore_imc_4]
     1.001053069        244,268,304      unc_m_clockticks [uncore_imc_2]
     1.001053069        244,427,386      unc_m_clockticks [uncore_imc_0]
     1.001053069        244,583,760      unc_m_clockticks [uncore_imc_5]
     1.001053069        244,738,971      unc_m_clockticks [uncore_imc_3]
     1.001053069        244,880,309      unc_m_clockticks [uncore_imc_1]
     2.002024821        240,818,200      unc_m_clockticks [uncore_imc_4] [uncore_imc_4]
     2.002024821        240,767,812      unc_m_clockticks [uncore_imc_2] [uncore_imc_2]
     2.002024821        240,764,215      unc_m_clockticks [uncore_imc_0] [uncore_imc_0]
     2.002024821        240,759,504      unc_m_clockticks [uncore_imc_5] [uncore_imc_5]
     2.002024821        240,755,992      unc_m_clockticks [uncore_imc_3] [uncore_imc_3]
     2.002024821        240,750,403      unc_m_clockticks [uncore_imc_1] [uncore_imc_1]

For each print, the PMU name is unconditionally appended to the
counter->name.

Need to check the counter->name first. If the PMU name is already
appended, do nothing.

Committer notes:

Add and use perf_evsel->uniquified_name bool instead of doing the more
expensive strstr(event->name, pmu->name).

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 8c5421c016 ("perf pmu: Display pmu name when printing unmerged events in stat")
Link: http://lkml.kernel.org/r/1524594014-79243-5-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-24 16:12:00 -03:00
Kan Liang
121f325f34 perf evsel: Only fall back group read for leader
Perf doesn't support mixed events from different PMUs (except software
event) in a group. The perf stat should output <not counted>/<not
supported> for all events, but it doesn't. For example,

  perf stat -e '{cycles,uncore_imc_5/umask=0xF,event=0x4/,instructions}'
       <not counted>      cycles
       <not supported>    uncore_imc_5/umask=0xF,event=0x4/
           1,024,300      instructions

If perf fails to open an event, it doesn't error out directly. It will
disable some features and retry, until the event is opened or all
features are disabled. The disabled features will not be re-enabled. The
group read is one of these features.

For the example as above, the IMC event and the leader event "cycles"
are from different PMUs. Opening the IMC event must fail. The group read
feature must be disabled for IMC event and the followed event
"instructions". The "instructions" event has the same PMU as the leader
"cycles". It can be opened successfully. Since the group read feature
has been disabled, the "instructions" event will be read as a single
event, which definitely has a value.

The group read fallback is still useful for the case which kernel
doesn't support group read. It is good enough to be handled only by the
leader.

For the fallback request from members, it must be caused by an error.
The fallback only breaks the semantics of group.  Limit the group read
fallback only for the leader.

Committer testing:

On a broadwell t450s notebook:

Before:

  # perf stat -e '{cycles,unc_cbo_cache_lookup.read_i,instructions}' sleep 1

  Performance counter stats for 'sleep 1':

     <not counted>      cycles
   <not supported>      unc_cbo_cache_lookup.read_i
           818,206      instructions

       1.003170887 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
	echo 0 > /proc/sys/kernel/nmi_watchdog
	perf stat ...
	echo 1 > /proc/sys/kernel/nmi_watchdog

After:

  # perf stat -e '{cycles,unc_cbo_cache_lookup.read_i,instructions}' sleep 1

  Performance counter stats for 'sleep 1':

     <not counted>      cycles
   <not supported>      unc_cbo_cache_lookup.read_i
     <not counted>      instructions

       1.001380511 seconds time elapsed

  Some events weren't counted. Try disabling the NMI watchdog:
	echo 0 > /proc/sys/kernel/nmi_watchdog
	perf stat ...
	echo 1 > /proc/sys/kernel/nmi_watchdog
  #

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes:  82bf311e15 ("perf stat: Use group read for event groups")
Link: http://lkml.kernel.org/r/1524594014-79243-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-24 16:11:59 -03:00
Kan Liang
292c34c102 perf pmu: Fix core PMU alias list for X86 platform
When counting uncore event with alias, core event is mistakenly
involved, for example:

  perf stat --no-merge -e "unc_m_cas_count.all" -C0  sleep 1

  Performance counter stats for 'CPU(s) 0':

                 0      unc_m_cas_count.all [uncore_imc_4]
                 0      unc_m_cas_count.all [uncore_imc_2]
                 0      unc_m_cas_count.all [uncore_imc_0]
           153,640      unc_m_cas_count.all [cpu]
                 0      unc_m_cas_count.all [uncore_imc_5]
            25,026      unc_m_cas_count.all [uncore_imc_3]
                 0      unc_m_cas_count.all [uncore_imc_1]

       1.001447890 seconds time elapsed

The reason is that current implementation doesn't check PMU name of a
event when adding its alias into the alias list for core PMU. The
uncore event aliases are mistakenly added.

This bug was introduced in:
  commit 14b22ae028 ("perf pmu: Add helper function is_pmu_core to
  detect PMU CORE devices")

Checking the PMU name for all PMUs on X86 and other architectures except
ARM.
There is no behavior change for ARM.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: 14b22ae028 ("perf pmu: Add helper function is_pmu_core to detect PMU CORE devices")
Link: http://lkml.kernel.org/r/1524594014-79243-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-24 16:02:29 -03:00
Jiri Olsa
e9add8bac6 perf evsel: Disable write_backward for leader sampling group events
.. and other related fields that do not need to be enabled
for events that have sampling leader.

It fixes the perf top usage Ingo reported broken:

  # perf top -e '{cycles,msr/aperf/}:S'

The 'msr/aperf/' event is configured for write_back sampling, which is
not allowed by the MSR PMU, so it fails to create the event.

Adjusting related attr test.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180423090823.32309-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-23 11:21:56 -03:00
Jiri Olsa
9a4a931ce8 perf pmu: Fix pmu events parsing rule
Currently all the event parsing fails end up in the event_pmu rule, and
display misleading help like:

  $ perf stat -e inst kill
  event syntax error: 'inst'
                       \___ Cannot find PMU `inst'. Missing kernel support?
  ...

The reason is that the event_pmu is too strong and match also single
string. Changing it to force the '/' separators to be part of the rule,
and getting the proper error now:

  $ perf stat -e inst kill
  event syntax error: 'inst'
                       \___ parser error
  Run 'perf list' for a list of valid events
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180423090823.32309-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-23 11:17:27 -03:00
Jiri Olsa
129193bb0c perf stat: Keep the / modifier separator in fallback
The 'perf stat' fallback for EACCES error sets the exclude_kernel
perf_event_attr and tries perf_event_open() again with it. In addition,
it also changes the name of the event to reflect that change by adding
the 'u' modifier.

But it does not take into account the '/' separator, so the event name
can end up mangled, like: (note the '/:' characters)

  $ perf stat -e cpu/cpu-cycles/ kill
  ...
             386,832      cpu/cpu-cycles/:u

Adding the code to check on the '/' separator and set the following
correct event name:

  $ perf stat -e cpu/cpu-cycles/ kill
  ...
             388,548      cpu/cpu-cycles/u

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180423090823.32309-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-23 11:14:10 -03:00
Thomas Richter
ce04abfbd3 perf list: Remove s390 specific strcmp_cpuid_cmp function
Make the type field in pmu-events/arch/s390/mapfile.cvs more generic to
match the created cpuid string for s390.

The pattern also checks for the counter first version number and counter
second version number ([13]\.[1-5]) and the authorization field which
follows.

These numbers do not exist in the cpuid identification string when perf
commands are executed on a z/VM environment (which does not support CPU
counter measurement facility).

CPUID string for LPAR:
   cpuid : IBM,3906,704,M03,3.5,002f
CPUID string for z/VM:
   cpuid : IBM,2964,702,N96

This allows the removal of s390 specific cpuid compare code and uses the
common compare function with its regular expression matching algorithm.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180423081745.3672-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-23 11:03:13 -03:00
Namhyung Kim
ee05d21791 perf machine: Set main kernel end address properly
map_groups__fixup_end() was called to set the end addresses of kernel
and module maps.  But now since machine__create_modules() sets the end
address of modules properly, the only remaining piece is the kernel map.

We can set it with adjacent module's address directly instead of calling
map_groups__fixup_end().  If there's no module after the kernel map, the
end address will be ~0ULL.

Since it also changes the start address of the kernel map, it needs to
re-insert the map to the kmaps in order to keep a correct ordering.  Kim
reported that it caused problems on ARM64.

Reported-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20180419235915.GA19067@sejong
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-23 10:52:55 -03:00
Mathieu Poirier
8a9fd83230 coresight: Move to SPDX identifier
Move CoreSight headers to the SPDX identifier.

Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1524089118-27595-1-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-19 12:29:41 -03:00
Andi Kleen
ccbb6afe08 perf record: Remove suggestion to enable APIC
'perf record' suggests to enable the APIC on errors.

APIC is practically always used today and the problem is usually
somewhere else.

Just remove the outdated suggestion.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20180406203812.3087-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18 15:35:50 -03:00
Andi Kleen
ec3948451e perf record: Remove misleading error suggestion
When perf record encounters an error setting up an event it suggests
to enable CONFIG_PERF_EVENTS. This is misleading because:

- Usually it is enabled (it is really hard to disable on x86)

- The problem is usually somewhere else, e.g. the CPU is not supported
or an invalid configuration has been used.

Remove the misleading suggestion.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20180406203812.3087-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18 15:35:49 -03:00
Thomas Richter
038586c343 perf list: Add s390 support for detailed/verbose PMU event description
'perf list' with flags -d and -v print a description (-d) or a very
verbose explanation (-v) of CPU specific counter events.  These
descriptions are provided with the json files in directory
pmu-events/arch/s390/*.json.

Display of these descriptions on s390 requires the corresponding json
files.

On s390 this does not work because function is_pmu_core() does not
detect the s390 directory name where the CPU specific events are listed.
On x86 it is:

  /sys/bus/event_source/devices/cpu

whereas on s390 it is:

  /sys/bus/event_source/devices/cpum_cf
  /sys/bus/event_source/devices/cpum_sf

Fix this by adding s390 directory name testing to function
is_pmu_core(). This is the same approach as taken for the ARM platform.

Output before:

[root@s35lp76 perf]# ./perf list -d pmu
List of pre-defined events (to be used in -e):

  cpum_cf/AES_BLOCKED_CYCLES/      [Kernel PMU event]
  cpum_cf/AES_BLOCKED_FUNCTIONS/   [Kernel PMU event]
  cpum_cf/AES_CYCLES/              [Kernel PMU event]
  cpum_cf/AES_FUNCTIONS/           [Kernel PMU event]
  ....
  cpum_cf/TX_NC_TEND/              [Kernel PMU event]
  cpum_cf/VX_BCD_EXECUTION_SLOTS/  [Kernel PMU event]
  cpum_sf/SF_CYCLES_BASIC/         [Kernel PMU event]

Output after:

[root@s35lp76 perf]# ./perf list -d pmu
List of pre-defined events (to be used in -e):

  cpum_cf/AES_BLOCKED_CYCLES/      [Kernel PMU event]
  cpum_cf/AES_BLOCKED_FUNCTIONS/   [Kernel PMU event]
  cpum_cf/AES_CYCLES/              [Kernel PMU event]
  cpum_cf/AES_FUNCTIONS/           [Kernel PMU event]
  ....
  cpum_cf/TX_NC_TEND/              [Kernel PMU event]
  cpum_cf/VX_BCD_EXECUTION_SLOTS/  [Kernel PMU event]
  cpum_sf/SF_CYCLES_BASIC/         [Kernel PMU event]

3906:
  bcd_dfp_execution_slots
       [BCD DFP Execution Slots]
  decimal_instructions
       [Decimal Instructions]
  dtlb2_gpage_writes
       [DTLB2 GPAGE Writes]
  dtlb2_hpage_writes
       [DTLB2 HPAGE Writes]
  dtlb2_misses
       [DTLB2 Misses]
  dtlb2_writes
       [DTLB2 Writes]
  itlb2_misses
       [ITLB2 Misses]
  itlb2_writes
       [ITLB2 Writes]
  l1c_tlb2_misses
       [L1C TLB2 Misses]
  .....

cfvn 3:
  cpu_cycles
       [CPU Cycles]
  instructions
       [Instructions]
  l1d_dir_writes
       [L1D Directory Writes]
  l1d_penalty_cycles
       [L1D Penalty Cycles]
  l1i_dir_writes
       [L1I Directory Writes]
  l1i_penalty_cycles
       [L1I Penalty Cycles]
  problem_state_cpu_cycles
       [Problem State CPU Cycles]
  problem_state_instructions
       [Problem State Instructions]
  ....

csvn generic:
  aes_blocked_cycles
       [AES Blocked Cycles]
  aes_blocked_functions
       [AES Blocked Functions]
  aes_cycles
       [AES Cycles]
  aes_functions
       [AES Functions]
  dea_blocked_cycles
       [DEA Blocked Cycles]
  dea_blocked_functions
       [DEA Blocked Functions]
  ....

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180416132314.33249-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17 09:47:39 -03:00
Alexey Budankov
b3f35b5d5d perf report: Extend raw dump (-D) out with switch out event type
Print additional 'preempt' tag for PERF_RECORD_SWITCH[_CPU_WIDE] OUT records when
event header misc field contains PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit set
designating preemption context switch out event:

tools/perf/perf report -D -i perf.data | grep _SWITCH

0 768361415226 0x27f076 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     8/8
4 768362216813 0x28f45e [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT          next pid/tid:     0/0
4 768362217824 0x28f486 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:  4073/4073
0 768362414027 0x27f0ce [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt  next pid/tid:     8/8
0 768362414367 0x27f0f6 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN           prev pid/tid:     0/0

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/6f5aebb9-b96c-f304-f08f-8f046d38de4f@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17 09:47:39 -03:00
Arnaldo Carvalho de Melo
43c4023152 perf annotate: Allow setting the offset level in .perfconfig
The default is 1 (jump_target):

  # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
  Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574
  _raw_spin_lock_irqsave() /proc/kcore
    0.26        nop
    4.61        push   %rbx
   19.33        pushfq
    7.97        pop    %rax
    0.32        nop
    0.06        mov    %rax,%rbx
   14.63        cli
    0.06        nop
                xor    %eax,%eax
                mov    $0x1,%edx
   49.94        lock   cmpxchg %edx,(%rdi)
    0.16        test   %eax,%eax
              ↓ jne    2b
    2.66        mov    %rbx,%rax
                pop    %rbx
              ← retq
          2b:   mov    %eax,%esi
              → callq  *ffffffffb30eaed0
                mov    %rbx,%rax
                pop    %rbx
              ← retq
  #

But one can ask for showing offsets for call instructions by setting
this:

  # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
  Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574
  _raw_spin_lock_irqsave() /proc/kcore
    0.26        nop
    4.61        push   %rbx
   19.33        pushfq
    7.97        pop    %rax
    0.32        nop
    0.06        mov    %rax,%rbx
   14.63        cli
    0.06        nop
                xor    %eax,%eax
                mov    $0x1,%edx
   49.94        lock   cmpxchg %edx,(%rdi)
    0.16        test   %eax,%eax
              ↓ jne    2b
    2.66        mov    %rbx,%rax
                pop    %rbx
              ← retq
          2b:   mov    %eax,%esi
          2d: → callq  *ffffffffb30eaed0
                mov    %rbx,%rax
                pop    %rbx
              ← retq
  #

Or using a big value to ask for all offsets to be shown:

  # cat ~/.perfconfig
  [annotate]

	offset_level = 100

	hide_src_code = true
  # perf annotate --ignore-vmlinux --stdio2 _raw_spin_lock_irqsave
  Samples: 3K of event 'cycles:ppp', 3000 Hz, Event count (approx.): 2766398574
  _raw_spin_lock_irqsave() /proc/kcore
    0.26   0:   nop
    4.61   5:   push   %rbx
   19.33   6:   pushfq
    7.97   7:   pop    %rax
    0.32   8:   nop
    0.06   d:   mov    %rax,%rbx
   14.63  10:   cli
    0.06  11:   nop
          17:   xor    %eax,%eax
          19:   mov    $0x1,%edx
   49.94  1e:   lock   cmpxchg %edx,(%rdi)
    0.16  22:   test   %eax,%eax
          24: ↓ jne    2b
    2.66  26:   mov    %rbx,%rax
          29:   pop    %rbx
          2a: ← retq
          2b:   mov    %eax,%esi
          2d: → callq  *ffffffffb30eaed0
          32:   mov    %rbx,%rax
          35:   pop    %rbx
          36: ← retq
   #

This also affects the TUI, i.e. the default 'perf annotate' and 'perf
top/report' -> A hotkey -> annotate interfaces, when slang-devel is present
in the build, i.e.:

  # perf version --build-options | grep slang
              libslang: [ on  ]  # HAVE_SLANG_SUPPORT
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-venm6x5zrt40eu8hxdsmqxz6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-13 10:00:05 -03:00
Arnaldo Carvalho de Melo
7b366142a5 perf report: Fix switching to another perf.data file
In the TUI the 's' hotkey can be used to switch to another perf.data
file in the current directory, but that got broken in Fixes:
b01141f4f5 ("perf annotate: Initialize the priv are in symbol__new()"),
that would show this once another file was chosen:

    ┌─Fatal Error─────────────────────────────────────┐
    │Annotation needs to be init before symbol__init()│
    │                                                 │
    │                                                 │
    │Press any key...                                 │
    └─────────────────────────────────────────────────┘

Fix it by just silently bailing out if symbol__annotation_init() was already
called, just like is done with symbol__init(), i.e. they are done just once at
session start, not when switching to a new perf.data file.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: b01141f4f5 ("perf annotate: Initialize the priv are in symbol__new()")
Link: https://lkml.kernel.org/n/tip-ogppdtpzfax7y1h6gjdv5s6u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-13 10:00:04 -03:00
Thomas Richter
4f75f1cbf9 perf record: Change warning for missing sysfs entry to debug
Using perf on 4.16.0 kernel on s390 shows this warning:

   failed: can't open node sysfs data

each time I run command perf record ... for example:

  [root@s35lp76 perf]# ./perf record -e rB0000 -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  failed: can't open node sysfs data
  [ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ]
  [root@s35lp76 perf]#

It turns out commit e2091cedd5 ("perf tools: Add MEM_TOPOLOGY feature
to perf data file") tries to open directory named /sys/devices/system/node/
which does not exist on s390.

This is the call stack:
 __cmd_record
 +---> perf_session__write_header
       +---> perf_header__adds_write
             +---> do_write_feat
	           +---> write_mem_topology
		         +---> build_mem_topology
			       prints warning

The issue starts in do_write_feat() which unconditionally loops over all
features and now includes HEADER_MEM_TOPOLOGY and calls write_mem_topology().

Function record__init_features() at the beginning of __cmd_record() sets
all features and then turns off some of them.

Fix this by changing the warning to a level 2 debug output statement.

So it is only shown when debug level 2 or higher is set.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180412133246.92801-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-13 09:59:56 -03:00
Jin Yao
22e9af4e94 perf tools: Rename HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT
To be consistent with other HAVE_XXX_SUPPORT uses in Makefile.config,
this patch renames HAVE_SYSCALL_TABLE to HAVE_SYSCALL_TABLE_SUPPORT and
updates the C code accordingly.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1523269609-28824-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 10:33:31 -03:00
Jin Yao
90ce61b919 perf script: Use HAVE_LIBXXX_SUPPORT to replace NO_LIBXXX
In Makefile.config, we define the conditional compilation variables
HAVE_LIBPERL_SUPPORT and HAVE_LIBPYTHON_SUPPORT.

To make the C code more consistent, this patch replaces
NO_LIBPERL/NO_LIBPYTHON in C code with HAVE_LIBPERL_SUPPORT/
HAVE_LIBPYTHON_SUPPORT.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1523269609-28824-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 10:33:29 -03:00
Arnaldo Carvalho de Melo
592c10e217 perf annotate: Allow showing offsets in more than just jump targets
Jesper wanted to see offsets at callq sites when doing some performance
investigation related to retpolines, so save him some time by providing
an 'struct annotation_options' to control where offsets should appear:
just on jump targets? That + call instructions? All?

This puts in place the logic to show the offsets, now we need to wire
this up in the TUI browser (next patch) and on the 'perf annotate --stdio2"
interface, where we need a more general mechanism to setup the
'annotation_options' struct from the command line.

Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-m3jc9c3swobye9tj08gnh5i7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-12 10:32:39 -03:00
Sandipan Das
fcbd8fa446 perf tests clang: Fix function name for clang IR test
As stated in tests/llvm-src-base.c, the name of the bpf function should
be "bpf_func__SyS_epoll_pwait" but this clang test fails as it tries to
lookup "bpf_func__SyS_epoll_wait".

Before applying patch:

55: builtin clang support                                 :
55.1: builtin clang compile C source to IR                : FAILED!
55.2: builtin clang compile C source to ELF object        : Skip

After applying patch:

55: builtin clang support                                 :
55.1: builtin clang compile C source to IR                : Ok
55.2: builtin clang compile C source to ELF object        : Ok

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Fixes: e67d52d411 ("perf clang: Update test case to use real BPF script")
Link: http://lkml.kernel.org/r/20180404180419.19056-3-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-09 11:13:09 -03:00
Sandipan Das
7854e499f3 perf clang: Add support for recent clang versions
The clang API calls used by perf have changed in recent releases and
builds succeed with libclang-3.9 only. This introduces compatibility
with libclang-4.0 and above.

Without this patch, we will see the following compilation errors with
libclang-4.0+:

 util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’:
 util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
   Opts.Inputs.emplace_back(Path, IK_C);
                                  ^~~~
 util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::Module> perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)’:
 util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
   Clang.setInvocation(&*CI);
                           ^
 In file included from util/c++/clang.cpp:14:0:
 /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr<clang::CompilerInvocation>)
    void setInvocation(std::shared_ptr<CompilerInvocation> Value);
         ^~~~~~~~~~~~~

Committer testing:

Tested on Fedora 27 after installing the clang-devel and llvm-devel
packages, versions:

  # rpm -qa | egrep llvm\|clang
  llvm-5.0.1-6.fc27.x86_64
  clang-libs-5.0.1-5.fc27.x86_64
  clang-5.0.1-5.fc27.x86_64
  clang-tools-extra-5.0.1-5.fc27.x86_64
  llvm-libs-5.0.1-6.fc27.x86_64
  llvm-devel-5.0.1-6.fc27.x86_64
  clang-devel-5.0.1-5.fc27.x86_64
  #

Make sure you don't have some older version lying around in /usr/local,
etc, then:

  $ make LIBCLANGLLVM=1 -C tools/perf install-bin

And in the end perf will be linked agains these libraries:

  # ldd ~/bin/perf | egrep -i llvm\|clang
	libclangAST.so.5 => /lib64/libclangAST.so.5 (0x00007f8bb2eb4000)
	libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x00007f8bb29e3000)
	libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x00007f8bb23f7000)
	libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x00007f8bb2060000)
	libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x00007f8bb1d06000)
	libclangLex.so.5 => /lib64/libclangLex.so.5 (0x00007f8bb1a3e000)
	libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x00007f8bb17d4000)
	libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x00007f8bb15c5000)
	libclangSema.so.5 => /lib64/libclangSema.so.5 (0x00007f8bb0cc9000)
	libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x00007f8bb0a23000)
	libclangParse.so.5 => /lib64/libclangParse.so.5 (0x00007f8bb0725000)
	libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x00007f8bb039a000)
	libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x00007f8bace98000)
	libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x00007f8bab735000)
	libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x00007f8bab4b2000)
	libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x00007f8bab2a1000)
	libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x00007f8bab08e000)
  #

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Fixes: 00b86691c7 ("perf clang: Add builtin clang support ant test case")
Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-09 11:13:08 -03:00
Arnaldo Carvalho de Melo
ad0902e0c4 perf tools: No need to include namespaces.h in util.h
The only thing that is needed there is a forward declaration for 'struct
nsinfo', so disentanble this, which in turns allows built-in clang
builds, i.e. 'make LIBCLANGLLVM=1 -C tools/perf'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-vq26rsuwq1cqylpcyvq89c84@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-09 10:57:50 -03:00
Adrian Hunter
b238db6557 perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
In preparation for supporting AUX area sampling buffers,
auxtrace_queues__add_buffer() needs to be more generic. To that end, move
CPU filtering into it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1520327598-1317-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-06 09:40:41 -03:00
Arnaldo Carvalho de Melo
41a43dacec perf report: Remove duplicated 'samples' in lost samples warning
The following message, emitted when samples are lost due to system
overload, had one 'samples' too many, ditch it:

   Processed 25333 samples and lost 20.88% samples!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: https://lkml.kernel.org/n/tip-oev1469y02hmfere6r2kkxp6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05 14:34:09 -03:00
Arnaldo Carvalho de Melo
c0459a0925 perf annotate: Show group details on the title line
To match what is shown in the main 'perf report/top' title lines, i.e.
if a group is being shown, either a real group (recorded with "-e
'{a,b,c}') or a forced group (using 'perf report --group' for a
perf.data file recorded without {}) we will show multiple columns,
one per event, but we were failing to show the group details, so, for:

 # perf report --header-only | grep cmdline
 # cmdline : /home/acme/bin/perf record -e {cycles,instructions,cache-misses}
 # perf report --group

The first line was showing just "cycles", now it shows the correct line,
which is:

  Samples: 578  of events 'anon group { cycles, instructions, cache-misses }', 4000 Hz, Event count (approx.): 487421794
  syscall_return_via_sysret  /lib/modules/4.16.0-rc7/build/vmlinux
    0.22   2.97   0.00 │    ↓ jmp    6c
                       │      mov    %cr3,%rdi
    1.33  10.89   4.00 │    ↓ jmp    62
                       │      mov    %rdi,%rax
<SNIP>

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 6920e2854e ("perf annotate browser: Show extra title line with event information")
Link: https://lkml.kernel.org/n/tip-i41tqh17c2dabnyzjh99r1oz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05 11:18:39 -03:00
Adrian Hunter
0d75f123a6 perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
In preparation for supporting AUX area sampling buffers,
auxtrace_queues__add_buffer() needs to be more generic. To that end,
move memory allocation for struct buffer into it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1520327598-1317-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-05 11:03:33 -03:00