When debugging the tui browser I find it useful to redirect the debug
log into a file. Currently it's always forced to the message line.
Add an option to force it to stderr. Then it can be easily redirected.
Example:
[root@zoo ~]# perf --debug stderr report -vv 2> /tmp/debug
[root@zoo ~]# tail /tmp/debug
dso open failed, mmap: No such file or directory
dso open failed, mmap: No such file or directory
dso open failed, mmap: No such file or directory
dso open failed, mmap: No such file or directory
dso open failed, mmap: No such file or directory
Using /root/.debug/.build-id/4e/841948927029fb650132253642d5dbb2c1fb93 for symbols
Failed to open /tmp/perf-8831.map, continuing without symbols
Failed to open /tmp/perf-12721.map, continuing without symbols
Failed to open /tmp/perf-6966.map, continuing without symbols
Failed to open /tmp/perf-8802.map, continuing without symbols
[root@zoo ~]#
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1416605880-25055-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to define bfd_demangle() to either a wrapper for
cplus_demangle() or to a stub when NO_DEMANGLE is defined.
That is at odds with using bfd.h for some other reason, as it defines
bfd_demangle() and then if code that wants to use symbol.h, where the
above stubbing/wrapping is done, and bfd.h for other reasons, we end up
with a build error where bfd_demangle() is found to be redefined.
Avoid that by moving the stubbing/wrapping to symbol-elf.c, that is the
only user of such function. If we ever get to a point where there are
more valid users, we can then introduce a header for that.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6wzjpe2fy9xtgchshulixlzw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For lbr-as-callgraph we need to see the line number in the history,
because many LBR entries can be in a single function, and just
showing the same function name many times is not useful.
When the history code is configured to sort by address, also try to
resolve the address to a file:srcline and display this in the browser.
If that doesn't work still display the address.
This can be also useful without LBRs for understanding which call in a large
function (or in which inlined function) called something else.
Contains fixes from Namhyung Kim
v2: Refactor code into common function
v3: Fix GTK build
v4: Rebase
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-7-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently perf report on TUI doesn't print percent for first-level
callchain entry.
I guess it (wrongly) assumes that there's only a single callchain in the
first level.
This patch fixes it by handling the first level callchains same as
others - if it's not 100% it should print the percent value.
Also it'll affect other callchains in the other way around - if it's
100% (single callchain) it should not print the percentage.
Before:
- 30.95% 6.84% abc2 abc2 [.] a
- a
- 70.00% c
- 100.00% apic_timer_interrupt
smp_apic_timer_interrupt
local_apic_timer_interrupt
hrtimer_interrupt
...
+ 30.00% b
+ __libc_start_main
After:
- 30.95% 6.84% abc2 abc2 [.] a
- 77.90% a
- 70.00% c
- apic_timer_interrupt
smp_apic_timer_interrupt
local_apic_timer_interrupt
hrtimer_interrupt
...
+ 30.00% b
+ 22.10% __libc_start_main
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1416816807-6495-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible fixes:
- Fallback to kallsyms when using the minimal 'ELF' loader (Arnaldo Carvalho de Melo)
- Fix annotation with kcore (Adrian Hunter)
- Fix up srcline histogram key formatting (Arnaldo Carvalho de Melo)
- Add missing handler for PERF_RECORD_MMAP2 events in 'perf diff' (Kan Liang)
User visible changes/new features:
- Only print base source file for srcline histogram sort key (Andi Kleen)
- Support source line numbers in annotate using a hotkey (Andi Kleen)
Infrastructure changes and fixes:
- Do not poll events that use the system_wide flag (Adrian Hunter)
- Add perf-read-vdso32 and perf-read-vdsox32 to .gitignore (Adrian Hunter)
- Only override the default :tid comm entry (Adrian Hunter)
- Factor out adding new call chain entries (Andi Kleen)
- Use al.addr to set up call chain (Andi Kleen)
- Use a common function to resolve symbol or name (Andi Kleen)
- Fix ftrace:function event recording (Jiri Olsa)
- Move disable_buildid_cache() to util/build-id.c (Namhyung Kim)
- Clean up libelf feature support code (Namhyung Kim)
- Fix typo in python 'perf test' (WANG Chao)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With srcline key/sort'ing it's useful to have line numbers in the
annotate window. This patch implements this.
Use objdump -l to request the line numbers and save them in the line
structure. Then the browser displays them for source lines.
The line numbers are not displayed by default, but can be toggled on
with 'k'
There is one unfortunate problem with this setup. For lines not
containing source and which are outside functions objdump -l reports
line numbers off by a few: it always reports the first line number in
the next function even for lines that are outside the function.
I haven't found a nice way to detect/correct this. Probably objdump has
to be fixed.
See https://sourceware.org/bugzilla/show_bug.cgi?id=16433
The line numbers are still useful even with these problems, as most are
correct and the ones which are not are nearby.
v2: Fix help text. Handle (discriminator...) output in objdump.
Left align the line numbers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-9-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Enable capture of interrupted machine state for each sample.
Registers to sample are passed per event in the sample_regs_intr bitmask.
To sample interrupt machine state, the PERF_SAMPLE_INTR_REGS must be passed in
sample_type.
The list of available registers is arch dependent and provided by asm/perf_regs.h
Registers are laid out as u64 in the order of the bit order of sample_intr_regs.
This patch also adds a new ABI version PERF_ATTR_SIZE_VER4 because we extend
the perf_event_attr struct with a new u64 field.
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: cebbert.lkml@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Link: http://lkml.kernel.org/r/1411559322-16548-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Disallow setting inv/cmask/etc. flags for all PEBS events
on these CPUs, except for the UOPS_RETIRED.* events on Nehalem/Westmere,
which are needed for cycles:p. This avoids an undefined situation
strongly discouraged by the Intle SDM. The PLD_* events were already
covered. This follows the earlier changes for Sandy Bridge and alter.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/1411569288-5627-3-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
My earlier commit:
86a04461a9 ("perf/x86: Revamp PEBS event selection")
made nearly all PEBS on Sandy/IvyBridge/Haswell to reject non zero flags.
However this wasn't done for the INST_RETIRED.PREC_DIST event
because no suitable macro existed. Now that we have
INTEL_FLAGS_UEVENT_CONSTRAINT enforce zero flags for
INST_RETIRED.PREC_DIST too.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/1411569288-5627-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There were several reports that on some systems writing the SBOX0 PMU
initialization MSR would #GP at boot. This did not happen on all
systems -- my two test systems booted fine.
Writing the three initialization bits bit-by-bit seems to avoid the
problem. So add a special callback to do just that.
This replaces an earlier patch that disabled the SBOX.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-and-Tested-by: Patrick Lu <patrick.lu@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/1415062828-19759-4-git-send-email-andi@firstfloor.org
[ Fixed a whitespace error and added attribution tags that were left out inexplicably. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a CPU hotplugged out, we call perf_remove_from_context() (via
perf_event_exit_cpu()) to rip each CPU-bound event out of its PMU's cpu
context, but leave siblings grouped together. Freeing of these events is
left to the mercy of the usual refcounting.
When a CPU-bound event's refcount drops to zero we cross-call to
__perf_remove_from_context() to clean it up, detaching grouped siblings.
This works when the relevant CPU is online, but will fail if the CPU is
currently offline, and we won't detach the event from its siblings
before freeing the event, leaving the sibling list corrupt. If the
sibling list is later walked (e.g. because the CPU cam online again
before a remaining sibling's refcount drops to zero), we will walk the
now corrupted siblings list, potentially dereferencing garbage values.
Given that the events should never be scheduled again (as we removed
them from their context), we can simply detatch siblings when the CPU
goes down in the first place. If the CPU comes back online, the
redundant call to __perf_remove_from_context() is safe.
Reported-by: Drew Richardson <drew.richardson@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: vincent.weaver@maine.edu
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1415203904-25308-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
Infrastructure changes:
- Add gzip decompression support for kernel modules (Namhyung Kim)
- More prep patches for Intel PT, including a a thread stack and
more stuff made available via the database export mechanism (Adrian Hunter)
- Optimize checking that tracepoint events are defined in perf script perl/python (Jiri Olsa)
- Do not free pevent when deleting tracepoint evsel (Jiri Olsa)
- Fix build-id matching for vmlinux (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There's a problem on finding correct kernel symbols when perf report
runs on a different kernel. Although a part of the problem was solved
by the prior commit 0a7e6d1b68 ("perf tools: Check recorded kernel
version when finding vmlinux"), there's a remaining problem still.
When perf records samples, it synthesizes the kernel map using
machine__mmap_name() and ref_reloc_sym like "[kernel.kallsyms]_text".
You can easily see it using 'perf report -D' command.
After finishing record, it goes through the recorded events to find
maps/dsos actually used. And then record build-id info of them.
During this process, it needs to load symbols in a dso and it'd call
dso__load_vmlinux_path() since the default value of the symbol_conf.
try_vmlinux_path is true. However it changes dso->long_name to a real
path of the vmlinux file (e.g. /lib/modules/3.16.4/build/vmlinux) if one
is running on a custom kernel.
It resulted in that perf report reads the build-id of the vmlinux, but
cannot use it since it only knows about the [kernel.kallsyms] map. It
then falls back to possible vmlinux paths by using the recorded kernel
version (in case of a recent version) or a running kernel silently.
Even with the recent tools, this still has a possibility of breaking
the result. As the build directory is a symbolic link, if one built a
new kernel in the same directory with different source/config, the old
link to vmlinux will point the new file. So it's absolutely needed to
use build-id when finding a kernel image.
In this patch, it's now changed to try to search a kernel dso in the
existing dso list which was constructed during build-id table parsing
so it'll always have a build-id. If not found, search "[kernel.kallsyms]".
Before:
$ perf report
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ...............................
#
72.15% 0.00% swapper [kernel.kallsyms] [k] set_curr_task_rt
72.15% 0.00% swapper [kernel.kallsyms] [k] native_calibrate_tsc
72.15% 0.00% swapper [kernel.kallsyms] [k] tsc_refine_calibration_work
71.87% 71.87% swapper [kernel.kallsyms] [k] module_finalize
...
After (for the same perf.data):
72.15% 0.00% swapper vmlinux [k] cpu_startup_entry
72.15% 0.00% swapper vmlinux [k] arch_cpu_idle
72.15% 0.00% swapper vmlinux [k] default_idle
71.87% 71.87% swapper vmlinux [k] native_safe_halt
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/20140924073356.GB1962@gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1415063674-17206-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>