If somebody happens to name an event with the beginning of 'tracepoint'
(e.g. tracepoint_foo), then it will never be showed with perf list
event_glob, thus we parse the argument 'tracepoint' more carefully for
accuracy.
Example:
Before this patch:
$ perf list tracepoint_foo:*
jbd2:jbd2_start_commit [Tracepoint event]
jbd2:jbd2_commit_locking [Tracepoint event]
jbd2:jbd2_run_stats [Tracepoint event]
block:block_rq_issue [Tracepoint event]
block:block_bio_complete [Tracepoint event]
block:block_bio_backmerge [Tracepoint event]
block:block_getrq [Tracepoint event]
... ...
As shown above, all of the tracepoint events are printed. In fact, the
command's real intention is to print the events of tracepoint_foo.
After this patch:
$ perf list tracepoint_foo:*
tracepoint_foo:tp_foo_enter [Tracepoint event]
tracepoint_foo:tp_foo_exit [Tracepoint event]
As shown above, only the events of tracepoint_foo are printed.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1425032491-20224-3-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The recent new patch "perf tools: Add new 'perf data' command" (commit
2245bf14 in acme's git repo perf/core) has caused a building error when
compiling the source code of perf:
cc1: warnings being treated as errors
builtin-data.c:89: error: missing initializer
builtin-data.c:89: error: (near initialization for ‘data_cmds[1].summary’)
make[2]: *** [builtin-data.o] Error 1
make[2]: *** Waiting for unfinished jobs....
LD bench/perf-in.o
LD tests/perf-in.o
make[1]: *** [perf-in.o] Error 2
make: *** [all] Error 2
This patch fixes the building error above.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1425038026-27604-1-git-send-email-yunlong.song@huawei.com
[ .name == NULL ends the loop, use it instead of seting all fields to NULL ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The minus operator has higher precedence than ?: Add parentheses around
?: fix this.
Before this patch:
$ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
$ perf probe -l -k ../vmlinux
kprobes:myprobe (on do_sys_open)
After this patch:
$ echo 'p:myprobe do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
$ perf probe -l -k ../vmlinux
kprobes:myprobe (on do_sys_open@linux.git/fs/open.c)
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1425034373-14511-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This patch matches the symbol names.
Actually, perf diff once intended to compare the symbol names. The
commit as below can look for a pair by name.
604c5c9297 (perf diff: Change the default sort order to "dso,symbol")
However, at that time, perf diff used a global list of dsos. That means
the binaries which has same name can only be loaded once. That's a
problem for comparing different binaries.
For example, we have an old binary and an updated binary. They very
likely have same name and most of the functions, so only dsos from old
binary will be loaded. When processing the data from updated binary,
perf still use the symbol information from old binary. That's wrong.
Then the commit as below used IP to replace symbol name.
9c443dfdd3 ("perf diff: Fix support for all --sort combinations")
>From that time, perf diff starts to compare the symbol address.
The global dsos is discarded from a patch in 2010.
a1645ce12a ("perf: 'perf kvm' tool for monitoring guest performance
from host")
However, at that time, perf diff already compared by address. So perf
diff cannot work for different binaries as well.
This patch actually rolls back the perf diff to original design. The
document is also changed, so everybody knows the original design is to
compare the symbol names.
Here are some examples:
The only difference between example_v1.c and example_v2.c is the
location of f2 and f3. There is no change in behavior, but the previous
perf diff display the wrong differential profile.
example_v1.c
noinline void f3(void)
{
volatile int i;
for (i = 0; i < 10000;) {
if(i%2)
i++;
else
i++;
}
}
noinline void f2(void)
{
volatile int a = 100, b, c;
for (b = 0; b < 10000; b++)
c = a * b;
}
noinline void f1(void)
{
f2();
f3();
}
int main()
{
int i;
for (i = 0; i < 100000; i++)
f1();
}
example_v2.c
noinline void f2(void)
{
volatile int a = 100, b, c;
for (b = 0; b < 10000; b++)
c = a * b;
}
noinline void f3(void)
{
volatile int i;
for (i = 0; i < 10000;) {
if(i%2)
i++;
else
i++;
}
}
noinline void f1(void)
{
f2();
f3();
}
int main()
{
int i;
for (i = 0; i < 100000; i++)
f1();
}
[lk@localhost perf_diff]$ gcc example_v1.c -o example
[lk@localhost perf_diff]$ perf record -o example_v1.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ]
[lk@localhost perf_diff]$ gcc example_v2.c -o example
[lk@localhost perf_diff]$ perf record -o example_v2.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ]
Old perf diff result:
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
Event 'cycles'
Baseline Delta Shared Object Symbol
........ ....... ................ ...............................
[kernel.vmlinux] [k] __perf_event_task_sched_out
0.00% [kernel.vmlinux] [k] apic_timer_interrupt
[kernel.vmlinux] [k] idle_cpu
[kernel.vmlinux] [k] intel_pstate_timer_func
[kernel.vmlinux] [k] native_read_msr_safe
0.00% [kernel.vmlinux] [k] native_read_tsc
0.00% [kernel.vmlinux] [k] native_write_msr_safe
[kernel.vmlinux] [k] ntp_tick_length
0.00% [kernel.vmlinux] [k] rb_erase
0.00% [kernel.vmlinux] [k] tick_sched_timer
0.00% [kernel.vmlinux] [k] unmap_single_vma
0.00% [kernel.vmlinux] [k] update_wall_time
0.00% example [.] f1
46.24% example [.] f2
53.71% -7.55% example [.] f3
+53.81% example [.] f3
0.02% example [.] main
New perf diff result:
[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
[kernel.vmlinux] [k] __perf_event_task_sched_out
0.00% [kernel.vmlinux] [k] apic_timer_interrupt
[kernel.vmlinux] [k] idle_cpu
[kernel.vmlinux] [k] intel_pstate_timer_func
[kernel.vmlinux] [k] native_read_msr_safe
0.00% [kernel.vmlinux] [k] native_read_tsc
0.00% [kernel.vmlinux] [k] native_write_msr_safe
[kernel.vmlinux] [k] ntp_tick_length
0.00% [kernel.vmlinux] [k] rb_erase
0.00% [kernel.vmlinux] [k] tick_sched_timer
0.00% [kernel.vmlinux] [k] unmap_single_vma
0.00% [kernel.vmlinux] [k] update_wall_time
0.00% example [.] f1
46.24% -0.08% example [.] f2
53.71% +0.11% example [.] f3
0.02% example [.] main
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new buildid cache if the update target file is not cached.
This can happen when an old binary is replaced by new one after caching
the old one. In this case, user sees his operation just failed.
But it does not look straight, since user just pass the binary "path",
not "build-id".
----
# ./perf buildid-cache --add ./perf
(update ./perf to new binary)
# ./perf buildid-cache --update ./perf
./perf wasn't in the cache
#
----
This patch adds given new binary to cache if the new binary is
not cached. So we'll not see the above error.
----
# ./perf buildid-cache --add ./perf
(update ./perf to new binary)
# ./perf buildid-cache --update ./perf
#
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150226065440.23912.1494.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We could end up returning 0 (Ok) with a NULL raw_path. Fix it.
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Link: http://lkml.kernel.org/n/tip-l0kcbcg5f4nnzqt01cv42vec@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix get_real_path to free allocated memory when comp_dir is used for
complementing path and getting an error.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naohiro Aota <naota@elisp.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150226082504.28125.74506.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Recent linux kernel provides a blacklist of the functions which can not
be probed. perf probe can now check this blacklist before setting new
events and indicate better error message for users.
Without this patch,
----
# perf probe --add vmalloc_fault
Added new event:
Failed to write event: Invalid argument
Error: Failed to add events.
----
With this patch
----
# perf probe --add vmalloc_fault
Added new event:
Warning: Skipped probing on blacklisted function: vmalloc_fault
----
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150219143113.14434.5387.stgit@localhost.localdomain
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On Sparc64 perf-trace is failing in many spots due to extended load
instructions being used on misaligned accesses.
(gdb) run trace ls
Starting program: /tmp/perf/perf trace ls
[Thread debugging using libthread_db enabled]
Detaching after fork from child process 169460.
<ls output removed>
Program received signal SIGBUS, Bus error.
0x000000000014f4dc in tp_field__u64 (field=0x4cc700, sample=0x7feffffa098) at builtin-trace.c:61
warning: Source file is more recent than executable.
61 TP_UINT_FIELD(64);
(gdb) bt
0 0x000000000014f4dc in tp_field__u64 (field=0x4cc700, sample=0x7feffffa098) at builtin-trace.c:61
1 0x0000000000156ad4 in trace__sys_exit (trace=0x7feffffc268, evsel=0x4cc580, event=0xfffffc0104912000,
sample=0x7feffffa098) at builtin-trace.c:1701
2 0x0000000000158c14 in trace__run (trace=0x7feffffc268, argc=1, argv=0x7fefffff360) at builtin-trace.c:2160
3 0x000000000015b78c in cmd_trace (argc=1, argv=0x7fefffff360, prefix=0x0) at builtin-trace.c:2609
4 0x0000000000107d94 in run_builtin (p=0x4549c8, argc=2, argv=0x7fefffff360) at perf.c:341
5 0x0000000000108140 in handle_internal_command (argc=2, argv=0x7fefffff360) at perf.c:400
6 0x0000000000108308 in run_argv (argcp=0x7feffffef2c, argv=0x7feffffef20) at perf.c:444
7 0x0000000000108728 in main (argc=2, argv=0x7fefffff360) at perf.c:559
(gdb) p *sample
$1 = {ip = 4391276, pid = 169472, tid = 169472, time = 6303014583281250, addr = 0, id = 72082,
stream_id = 18446744073709551615, period = 1, weight = 0, transaction = 0, cpu = 73, raw_size = 36,
data_src = 84410401, flags = 0, insn_len = 0, raw_data = 0xfffffc010491203c, callchain = 0x0,
branch_stack = 0x0, user_regs = {abi = 0, mask = 0, regs = 0x0, cache_regs = 0x7feffffa098, cache_mask = 0},
intr_regs = {abi = 0, mask = 0, regs = 0x0, cache_regs = 0x7feffffa098, cache_mask = 0}, user_stack = {
offset = 0, size = 0, data = 0x0}, read = {time_enabled = 0, time_running = 0, {group = {nr = 0,
values = 0x0}, one = {value = 0, id = 0}}}}
(gdb) p *field
$2 = {offset = 16, {integer = 0x14f4a8 <tp_field__u64>, pointer = 0x14f4a8 <tp_field__u64>}}
sample->raw_data is guaranteed to not be 8-byte aligned because it is preceded
by the size as a u3. So accessing raw data with an extended load instruction causes
the SIGBUS. Resolve by using memcpy to a temporary variable of appropriate size.
Signed-off-by: David Ahern <david.ahern@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1424376022-140608-1-git-send-email-david.ahern@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some of the tracers bring their own id or pid fields and we can end up
having two of them. This patch adds a "perf_" prefix to the 'generic'
fields so we avoid a clash of the member names.
The change is visible in the babeltrace output:
Before:
$ babeltrace ./ctf-data/
[03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 }
[03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 }
...
Now:
$ babeltrace ./ctf-data/
[03:19:13.962131936] (+0.000001935) cycles: { }, { perf_ip = 0xFFFFFFFF8105443A, perf_tid = 20714, perf_pid = 20714, perf_period = 8 }
[03:19:13.962133732] (+0.000001796) cycles: { }, { perf_ip = 0xFFFFFFFF8105443A, perf_tid = 20714, perf_pid = 20714, perf_period = 114 }
...
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-5-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding 'perf data convert' to convert perf data file into different
format. This patch adds support for CTF format conversion.
To convert perf.data into CTF run:
$ perf data convert --to-ctf=./ctf-data/
[ perf data convert: Converted 'perf.data' into CTF data './ctf-data/' ]
[ perf data convert: Converted and wrote 11.268 MB (100230 samples) ]
The command will create CTF metadata out of perf.data file (or one
specified via -i option) and then convert all sample events into single
CTF stream.
Each sample_type bit is translated into separated CTF event field apart
from following exceptions:
PERF_SAMPLE_RAW - added in next patch
PERF_SAMPLE_READ - TODO
PERF_SAMPLE_CALLCHAIN - TODO
PERF_SAMPLE_BRANCH_STACK - TODO
PERF_SAMPLE_REGS_USER - TODO
PERF_SAMPLE_STACK_USER - TODO
$ perf --debug=data-convert=2 data convert ...
The converted CTF data could be analyzed by CTF tools, like babletrace
or tracecompass [1].
$ babeltrace ./ctf-data/
[03:19:13.962125533] (+?.?????????) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
[03:19:13.962130001] (+0.000004468) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
[03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 }
[03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 }
[03:19:13.962135557] (+0.000001825) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 2087 }
[03:19:13.962137627] (+0.000002070) cycles: { }, { ip = 0xFFFFFFFF81361938, tid = 20714, pid = 20714, period = 37582 }
[03:19:13.962161091] (+0.000023464) cycles: { }, { ip = 0xFFFFFFFF8124218F, tid = 20714, pid = 20714, period = 600246 }
[03:19:13.962517569] (+0.000356478) cycles: { }, { ip = 0xFFFFFFFF811A75DB, tid = 20714, pid = 20714, period = 1325731 }
[03:19:13.969518008] (+0.007000439) cycles: { }, { ip = 0x34080917B2, tid = 20714, pid = 20714, period = 1144298 }
The following members to the ctf-environment were decided to be added to
distinguish and specify perf CTF data:
- domain
It says "kernel" because it contains a kernel trace (not to be
confused with a user space like lttng-ust does)
- tracer_name
It says perf. This can be used to distinguish between lttng and perf
CTF based trace.
- version
The kernel version from stream. In addition to release, this is what
it looks like on a Debian kernel:
release = "3.14-1-amd64";
version = "3.14.0";
[1] http://projects.eclipse.org/projects/tools.tracecompass
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-4-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding new 'perf data' command to provide operations over data files.
The 'perf data convert' sub command is coming in following patch, but
there's possibility for other useful commands like 'perf data ls' (to
display perf data file in directory in ls style).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-3-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding feature check for babeltrace library [1], which will be used for
perf data file CTF [2] conversion in following patches.
The babeltrace library is now automatically detected as standard
feature. It's possible to specify LIBBABELTRACE_DIR make variable to
specify location of installed libbabeltrace, like:
$ make LIBBABELTRACE_DIR=/opt/libbabeltrace/
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libunwind: [ on ]
... libbabeltrace: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... DWARF post unwind library: libunwind
NOTE The installation of the [1] to to used by above make:
$ git clone git://git.efficios.com/babeltrace.git
$ cd babeltrace
$ vim README
$ ./bootstrap
$ ./configure --prefix=/opt/libbabeltrace
$ make prefix=/opt/libbabeltrace
$ sudo make install prefix=/opt/libbabeltrace
Please make sure that the /opt/libbabeltrace/lib directory is in your
LD_LIBRARY_PATH:
$ export LD_LIBRARY_PATH=/opt/libbabeltrace/lib
[1] babeltrace - http://www.efficios.com/babeltrace
[2] Common Trace Format - http://www.efficios.com/ctf
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-2-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ Added missing babeltrace build instructions ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an option to perf record to record running/enabled time for read
events, similar to what stat does.
This is useful to understand multiplexing problems.
Right now the report support is not great, but at least report -D
already supports it.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1424819620-16043-1-git-send-email-andi@firstfloor.org
[ Fixed the Documentation entry to match the OPT_BOOLEAN one ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To use in stdio based tools, like 'trace'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-79kjmerlw6d88csyx1afzwvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To figure out if ordered_events are being used when doing a flush
operation, it is enough to check if there were in fact some events
queued, i.e. look at oe->nr_events.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1c5r404vy766kt5nflv88uag@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All it wants is session->evlist.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6w9663gka3jb1j1rfxxd5jcq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For tools that don't deal with perf.data files, thus do not need to
use perf_session.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-kglq67gvauq9tak02a4se00r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Start to untangle session from delivering samples, as there are
tools that want to use ordered_events and don't use perf_session at all.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-rn4pk3pjxd78sgzrkn19tktp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Because we need to use ordered_events in some cases, so we will need to
first have them in a queue, order that queue, and then process the
event.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-cmkw9zgoh0z4r218957ftp1a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Forgot to do it when adding the feature.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mx152b6x9cgknhw91vsyjlnd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to filter multiple pids in trace, i.e. trace itself,
gnome-terminal, X.org, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-frtpkg7qapqwf7asa35wf8am@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To filter out events for a certain pid, for instance, when tracing
system wide, so that the tracer itself doesn't creates an event loop.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-byoia9dzu4gmkdv87etnd9zf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User visible:
- 'perf trace': Allow mixing with tracepoints and suppressing plain syscalls
(Arnaldo Carvalho de Melo)
Infrastructure:
- Kconfig beachhead (Jiri Olsa)
- Simplify nr_pages validity (Kaixu Xia)
- Fixup header positioning in 'perf list' (Yunlong Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJU3mKaAAoJEBpxZoYYoA71qnYH/1h8zqbQosuy/7Mu2tgLROts
2LSK8M+XD4RKdDVRLK95BIKmZfZkBjeOUE+PJIQ6/Mb1BQGBOmmGQ5oydLf2QUFw
5zVAFS8gec7xGvQpITuZEplJQcqm24CHt7qxUwFlh1DnRzN8eRkW2tHZmr5mfOil
hVpTQYpawRg/HIufDvlMU0Umv28JPQyRpfIF2TilkBxUT6KjYJK1QNuoNsgGS4ZL
r8rEpijRNkbmQZXmIDfZzvlzMx2Bwf0wdGf/1Rod1f1HLD4252ZKc07JCujBpvji
rK/oFj2hHx64r5HUQrOudlQ2B5VvlFKnWKnnb5EgL6gtM4moGhKjNHcUjFy1XLk=
=8zWn
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- No need to explicitely enable evsels for workload started from perf, let it
be enabled via perf_event_attr.enable_on_exec, removing some events that take
place in the 'perf trace' before a workload is really started by it.
(Arnaldo Carvalho de Melo)
- Fix to handle optimized not-inlined functions in 'perf probe' (Masami Hiramatsu)
- Update 'perf probe' man page (Masami Hiramatsu)
- 'perf trace': Allow mixing with tracepoints and suppressing plain syscalls
(Arnaldo Carvalho de Melo)
Infrastructure changes:
- Introduce {trace_seq_do,event_format_}_fprintf functions to allow
a default tracepoint field list printer to be used in tools that allows
redirecting output to a file. (Arnaldo Carvalho de Melo)
- The man page for pthread_attr_set_affinity_np states that _GNU_SOURCE
must be defined before pthread.h, do it to fix the build in some
systems (Josh Boyer)
- Cleanups in 'perf buildid-cache' (Masami Hiramatsu)
- Fix dso cache test case (Namhyung Kim)
- Do Not rely on dso__data_read_offset() to open DSO (Namhyung Kim)
- Make perf aware of tracefs (Steven Rostedt).
- Fix build by defining STT_GNU_IFUNC for glibc 2.9 and older (Vinson Lee)
- AArch64 symbol resolution fixes (Victor Kamensky)
- Kconfig beachhead (Jiri Olsa)
- Simplify nr_pages validity (Kaixu Xia)
- Fixup header positioning in 'perf list' (Yunlong Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
LBR call stack only has user-space callchains. It is output in the
PERF_SAMPLE_BRANCH_STACK data format. For kernel callchains, it's
still in the form of PERF_SAMPLE_CALLCHAIN.
The perf tool has to handle both data sources to construct a
complete callstack.
For the "perf report -D" option, both lbr and fp information will be
displayed.
A new call chain recording option "lbr" is introduced into the perf
tool for LBR call stack. The user can use --call-graph lbr to get
the call stack information from hardware.
Here are some examples.
When profiling bc(1) on Fedora 19:
echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph lbr bc -l < cmd
If enabling LBR, perf report output looks like:
50.36% bc bc [.] bc_divide
|
--- bc_divide
execute
run_code
yyparse
main
__libc_start_main
_start
33.66% bc bc [.] _one_mult
|
--- _one_mult
bc_divide
execute
run_code
yyparse
main
__libc_start_main
_start
7.62% bc bc [.] _bc_do_add
|
--- _bc_do_add
|
|--99.89%-- 0x2000186a8
--0.11%-- [...]
6.83% bc bc [.] _bc_do_sub
|
--- _bc_do_sub
|
|--99.94%-- bc_add
| execute
| run_code
| yyparse
| main
| __libc_start_main
| _start
--0.06%-- [...]
0.46% bc libc-2.17.so [.] __memset_sse2
|
--- __memset_sse2
|
|--54.13%-- bc_new_num
| |
| |--51.00%-- bc_divide
| | execute
| | run_code
| | yyparse
| | main
| | __libc_start_main
| | _start
| |
| |--30.46%-- _bc_do_sub
| | bc_add
| | execute
| | run_code
| | yyparse
| | main
| | __libc_start_main
| | _start
| |
| --18.55%-- _bc_do_add
| bc_add
| execute
| run_code
| yyparse
| main
| __libc_start_main
| _start
|
--45.87%-- bc_divide
execute
run_code
yyparse
main
__libc_start_main
_start
If using FP, perf report output looks like:
echo 'scale=2000; 4*a(1)' > cmd; perf record --call-graph fp bc -l < cmd
50.49% bc bc [.] bc_divide
|
--- bc_divide
33.57% bc bc [.] _one_mult
|
--- _one_mult
7.61% bc bc [.] _bc_do_add
|
--- _bc_do_add
0x2000186a8
6.88% bc bc [.] _bc_do_sub
|
--- _bc_do_sub
0.42% bc libc-2.17.so [.] __memcpy_ssse3_back
|
--- __memcpy_ssse3_back
If using LBR, perf report -D output looks like:
3458145275743 0x2fd750 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x2): 9748/9748: 0x408ea8 period: 609644 addr: 0
... LBR call chain: nr:8
..... 0: fffffffffffffe00
..... 1: 0000000000408e50
..... 2: 000000000040a458
..... 3: 000000000040562e
..... 4: 0000000000408590
..... 5: 00000000004022c0
..... 6: 00000000004015dd
..... 7: 0000003d1cc21b43
... FP chain: nr:2
..... 0: fffffffffffffe00
..... 1: 0000000000408ea8
... thread: bc:9748
...... dso: /usr/bin/bc
The LBR call stack has the following known limitations:
- Zero length calls are not filtered out by the hardware
- Exception handing such as setjmp/longjmp will have calls/returns not
match
- Pushing different return address onto the stack will have
calls/returns not match
- If callstack is deeper than the LBR, only the last entries are
captured
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1420482185-29830-3-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, there are two call chain recording options, fp and dwarf.
Haswell has a new feature that utilizes the existing LBR facility to
record call chains. Kernel side LBR support code provides this as a
third option to record call chains. This patch enables the lbr call
stack support on the tooling side.
LBR call stack has some limitations:
- It reuses current LBR facility, so LBR call stack and branch record
can not be enabled at the same time.
- It is only available for user-space callchains.
However, it also offers some advantages:
- LBR call stack can work on user apps which don't have frame-pointers
or dwarf debug info compiled. It is a good alternative when nothing
else works.
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masanari Iida <standby24x7@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The hearer text 'List of pre-defined events (to be used in -e):' is
placed in an improper function, which causes an abnormal output, e.g.
'perf list hw' shows no guiding text at all, and 'perf list hw
L1-dcache*' shows the guiding text incorrectly in the middle of the
output.
Example
Before this patch:
$ perf list hw L1-dcache*
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
stalled-cycles-backend OR idle-cycles-backend [Hardware event]
stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
List of pre-defined events (to be used in -e): <-- incorrect position
L1-dcache-load-misses [Hardware cache event]
L1-dcache-loads [Hardware cache event]
L1-dcache-prefetch-misses [Hardware cache event]
L1-dcache-prefetches [Hardware cache event]
L1-dcache-store-misses [Hardware cache event]
L1-dcache-stores [Hardware cache event]
After this patch:
$ perf list hw L1-dcache*
List of pre-defined events (to be used in -e): <-- correct position
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
stalled-cycles-backend OR idle-cycles-backend [Hardware event]
stalled-cycles-frontend OR idle-cycles-frontend [Hardware event]
L1-dcache-load-misses [Hardware cache event]
L1-dcache-loads [Hardware cache event]
L1-dcache-prefetch-misses [Hardware cache event]
L1-dcache-prefetches [Hardware cache event]
L1-dcache-store-misses [Hardware cache event]
L1-dcache-stores [Hardware cache event]
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1423833115-11199-8-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the 'segmentation fault' bug of 'perf list --list-cmds', which also
happens in other cases (e.g. record, report ...). This bug happens when
there are no cmds to list at all.
Example:
Before this patch:
$ perf list --list-cmds
Segmentation fault
$
After this patch:
$ perf list --list-cmds
$
As shown above, the result prints nothing rather than a segmentation
fault. The null result means 'perf list' has no cmds to display at this
time.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1423833115-11199-5-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the sparc arch objects building under build framework to be
included in the libperf build object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-160hknrqr27c9zf59japw91y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>