This patch utilizes bpf_object__load() provided by libbpf to load all
objects into kernel.
Committer notes:
Testing it:
When using an incorrect kernel version number, i.e., having this in your
eBPF proggie:
int _version __attribute__((section("version"), used)) = 0x40100;
For a 4.3.0-rc6+ kernel, say, this happens and needs checking at event
parsing time, to provide a better error report to the user:
# perf record --event /tmp/foo.o sleep 1
libbpf: load bpf program failed: Invalid argument
libbpf: -- BEGIN DUMP LOG ---
libbpf:
libbpf: -- END LOG --
libbpf: failed to load program 'fork=_do_fork'
libbpf: failed to load object '/tmp/foo.o'
event syntax error: '/tmp/foo.o'
\___ Invalid argument: Are you root and runing a CONFIG_BPF_SYSCALL kernel?
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
If we instead make it match, i.e. use 0x40300 on this v4.3.0-rc6+
kernel, the whole process goes thru:
# perf record --event /tmp/foo.o -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.202 MB perf.data ]
# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
#
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces bpf__{un,}probe() functions to enable callers to
create kprobe points based on section names a BPF program. It parses the
section names in the program and creates corresponding 'struct
perf_probe_event' structures. The parse_perf_probe_command() function is
used to do the main parsing work. The resuling 'struct perf_probe_event'
is stored into program private data for further using.
By utilizing the new probing API, this patch creates probe points during
event parsing.
To ensure probe points be removed correctly, register an atexit hook so
even perf quit through exit() bpf__clear() is still called, so probing
points are cleared. Note that bpf_clear() should be registered before
bpf__probe() is called, so failure of bpf__probe() can still trigger
bpf__clear() to remove probe points which are already probed.
strerror style error reporting scaffold is created by this patch.
bpf__strerror_probe() is the first error reporting function in
bpf-loader.c.
Committer note:
Trying it:
To build a test eBPF object file:
I am testing using a script I built from the 'perf test -v LLVM' output:
$ cat ~/bin/hello-ebpf
export KERNEL_INC_OPTIONS="-nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.8.3/include -I/home/acme/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated -I/home/acme/git/linux/include -Iinclude -I/home/acme/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -Iinclude/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h"
export WORKING_DIR=/lib/modules/4.2.0/build
export CLANG_SOURCE=-
export CLANG_OPTIONS=-xc
OBJ=/tmp/foo.o
rm -f $OBJ
echo '__attribute__((section("fork=do_fork"), used)) int fork(void *ctx) {return 0;} char _license[] __attribute__((section("license"), used)) = "GPL";int _version __attribute__((section("version"), used)) = 0x40100;' | \
clang -D__KERNEL__ $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o /tmp/foo.o && file $OBJ
---
First asking to put a probe in a function not present in the kernel
(misses the initial _):
$ perf record --event /tmp/foo.o sleep 1
Probe point 'do_fork' not found.
event syntax error: '/tmp/foo.o'
\___ You need to check probing points in BPF file
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
---
Now, with "__attribute__((section("fork=_do_fork"), used)):
$ grep _do_fork /proc/kallsyms
ffffffff81099ab0 T _do_fork
$ perf record --event /tmp/foo.o sleep 1
Failed to open kprobe_events: Permission denied
event syntax error: '/tmp/foo.o'
\___ Permission denied
---
Cool, we need to provide some better hints, "kprobe_events" is too low
level, one doesn't strictly need to know the precise details of how
these things are put in place, so something that shows the command
needed to fix the permissions would be more helpful.
Lets try as root instead:
# perf record --event /tmp/foo.o sleep 1
Lowering default frequency rate to 1000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
# perf evlist
/tmp/foo.o
[root@felicio ~]# perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period,
sample_freq }: 1000, sample_type: IP|TID|TIME|PERIOD, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1,
sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
---
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By introducing new rules in tools/perf/util/parse-events.[ly], this
patch enables 'perf record --event bpf_file.o' to select events by an
eBPF object file. It calls parse_events_load_bpf() to load that file,
which uses bpf__prepare_load() and finally calls bpf_object__open() for
the object files.
After applying this patch, commands like:
# perf record --event foo.o sleep
become possible.
However, at this point it is unable to link any useful things onto the
evsel list because the creating of probe points and BPF program
attaching have not been implemented. Before real events are possible to
be extracted, to avoid perf report error because of empty evsel list,
this patch link a dummy evsel. The dummy event related code will be
removed when probing and extracting code is ready.
Commiter notes:
Using it:
$ ls -la foo.o
ls: cannot access foo.o: No such file or directory
$ perf record --event foo.o sleep
libbpf: failed to open foo.o: No such file or directory
event syntax error: 'foo.o'
\___ BPF object file 'foo.o' is invalid
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
$ file /tmp/build/perf/perf.o
/tmp/build/perf/perf.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
$ perf record --event /tmp/build/perf/perf.o sleep
libbpf: /tmp/build/perf/perf.o is not an eBPF object file
event syntax error: '/tmp/build/perf/perf.o'
\___ BPF object file '/tmp/build/perf/perf.o' is invalid
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
$
$ file /tmp/foo.o
/tmp/foo.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
$ perf record --event /tmp/foo.o sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data ]
$ perf evlist
/tmp/foo.o
$ perf evlist -v
/tmp/foo.o: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
$
So, type 1 is PERF_TYPE_SOFTWARE, config 0x9 is PERF_COUNT_SW_DUMMY, ok.
$ perf report --stdio
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
$
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'bpf-loader.[ch]' files are introduced in this patch. Which will be
the interface between perf and libbpf. bpf__prepare_load() resides in
bpf-loader.c. Following patches will enrich these two files.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By adding libbpf into perf's Makefile, this patch enables perf to build
libbpf if libelf is found and neither NO_LIBELF nor NO_LIBBPF is set.
The newly introduced code is similar to how libapi and libtraceevent
are wired into Makefile.perf.
MANIFEST is also updated for 'make perf-*-src-pkg'.
Append make_no_libbpf to tools/perf/tests/make.
The 'bpf' feature check is appended into default FEATURE_TESTS and
FEATURE_DISPLAY, so perf will check the API version of bpf in
/path/to/kernel/include/uapi/linux/bpf.h. Which should not fail except
when we are trying to port this code to an old kernel.
Error messages are also updated to notify users about the lack of BPF
support in 'perf record' if libelf is missing or the BPF API check
failed.
tools/lib/bpf is added to TAG_FOLDERS to allow us to navigate libbpf
files when working on perf using tools/perf/tags.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1444826502-49291-2-git-send-email-wangnan0@huawei.com
[ Document NO_LIBBPF in Makefile.perf, noted by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we split symbols based on the map comparison, but symbols are stored
within dso objects and maps could point into same dso objects (kernel maps).
Hence we could end up changing rbtree we are currently iterating and mess it
up. It's easily reproduced on s390x by running:
$ perf record -a -- sleep 3
$ perf buildid-list -i perf.data --with-hits
The fix is to compare dso objects instead.
Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151026135130.GA26003@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch allows perf record setting event's attr.inherit bit by
config terms like:
# perf record -e cycles/no-inherit/ ...
# perf record -e cycles/inherit/ ...
So user can control inherit bit for each event separately.
In following example, a.out fork()s in main then do some complex
CPU intensive computations in both of its children.
Basic result with and without inherit:
# perf record -e cycles -e instructions ./a.out
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.205 MB perf.data (47920 samples) ]
# perf report --stdio
# ...
# Samples: 23K of event 'cycles'
# Event count (approx.): 23641752891
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30428312415
# perf record -i -e cycles -e instructions ./a.out
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.111 MB perf.data (24019 samples) ]
...
# Samples: 12K of event 'cycles'
# Event count (approx.): 11699501775
...
# Samples: 12K of event 'instructions'
# Event count (approx.): 15058023559
Cancel inherit for one event when globally enable:
# perf record -e cycles/no-inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.660 MB perf.data (36004 samples) ]
...
# Samples: 12K of event 'cycles/no-inherit/'
# Event count (approx.): 11895759282
...
# Samples: 24K of event 'instructions'
# Event count (approx.): 30668000441
Enable inherit for one event when globally disable:
# perf record -i -e cycles/inherit/ -e instructions ./a.out
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 1.654 MB perf.data (35868 samples) ]
...
# Samples: 23K of event 'cycles/inherit/'
# Event count (approx.): 23285400229
...
# Samples: 11K of event 'instructions'
# Event count (approx.): 14969050259
Committer note:
One can check if the bit was set, in addition to seeing the result in
the perf.data file size as above by doing one of:
# perf record -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.911 MB perf.data (63 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
So, the inherit bit was set in both, now, if we disable it globally using
--no-inherit:
# perf record --no-inherit -e cycles -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.910 MB perf.data (56 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
No inherit bit set, then disabling it and setting just on the cycles event:
# perf record --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.909 MB perf.data (48 samples) ]
# perf evlist -v
cycles/inherit/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
instructions: size: 112, config: 0x1, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ID|CPU|PERIOD, read_format: ID, disabled: 1, freq: 1, sample_id_all: 1, exclude_guest: 1
#
We can see it as well in by using a more verbose level of debug messages in
the tool that sets up the perf_event_attr, 'perf record' in this case:
[root@zoo ~]# perf record -vv --no-inherit -e cycles/inherit/ -e instructions -a usleep 1
------------------------------------------------------------
perf_event_attr:
size 112
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
------------------------------------------------------------
perf_event_attr:
size 112
config 0x1
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|CPU|PERIOD
read_format ID
disabled 1
freq 1
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
<SNIP>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1446029705-199659-2-git-send-email-wangnan0@huawei.com
[ s/u64/bool/ for the perf_evsel_config_term inherit field - jolsa]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Recent GDB (at least on a vanilla Debian box) looks for debug information in
/usr/lib/debug/.build-id/nn/nnnnnnn
where nn/nnnnnn is the build-id of the stripped ELF binary. This is
documented here:
https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
This was not working in perf because we didn't read the build id until
AFTER we searched for the separate debug information file. This patch
reads the build ID and THEN does the search.
Signed-off-by: Dima Kogan <dima@secretsauce.net>
Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This was benign, but wrong. The build-id should live in a char[], not a char*[]
Signed-off-by: Dima Kogan <dima@secretsauce.net>
Link: http://lkml.kernel.org/r/87si6pfwz4.fsf@secretsauce.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Recently 'perf <tool> -h' was made aware of arguments and would show
just the help for the arguments specified, but that required a strict
form, i.e.:
$ perf -h --tui
worked, but:
$ perf -h tui
didn't.
Make it support both cases and also look at the option help when neither
matches, so that he following examples works:
$ perf report -h interface
Usage: perf report [<options>]
--gtk Use the GTK2 interface
--stdio Use the stdio interface
--tui Use the TUI interface
$ perf report -h stack
Usage: perf report [<options>]
-g, --call-graph <print_type,threshold[,print_limit],order,
sort_key[,branch]>
Display call graph (stack chain/backtrace):
print_type: call graph printing style (graph|flat|fractal|none)
threshold: minimum call graph inclusion threshold (<percent>)
print_limit: maximum number of call graph entry (<number>)
order: call graph order (caller|callee)
sort_key: call graph sort key (function|address)
branch: include last branch info to call graph (branch)
Default: graph,0.5,caller,function
--max-stack <n> Set the maximum stack depth when parsing the
callchain, anything beyond the specified depth
will be ignored. Default: 127
$
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xzqvamzqv3cv0p6w3inhols3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently any time we need to access socket or core id for given cpu, we
access the sysfs topology file.
Adding a cpus_aggr_map cpu_map to cache those entries.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding cpu_map__empty_new interface to create empty cpumap with given
size. The cpumap entries are initialized with -1.
It'll be used for caching cpu_map in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Because the 'perf stat record' patches will use the id_offset member
together with the priv pointer.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-29-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now usage_with_options() setup a pager before printing message so normal
printf() or pr_err() will not be shown. The usage_with_options_msg()
can be used to print some help message before usage strings.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445701767-12731-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's annoying to see error or help message when command has many options
like in perf record, report or top. So setup pager when print parser
error or help message - it should be OK since no UI is enabled at the
parsing time. The usage_with_options() already disables it by calling
exit_browser() anyway.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445701767-12731-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that it can be more consistent with other --show-* options. The old
name (--showcpuutilization) is provided only for compatibility.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445701767-12731-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently if an option name is ambiguous it only prints first two
matched option names but no help. It'd be better it could show all
possible names and help messages too.
Before:
$ perf report --show
Error: Ambiguous option: show (could be --show-total-period or
--show-ref-call-graph)
Usage: perf report [<options>]
After:
$ perf report --show
Error: Ambiguous option: show (could be --show-total-period or
--show-ref-call-graph)
Usage: perf report [<options>]
-n, --show-nr-samples
Show a column with the number of samples
--showcpuutilization
Show sample percentage for different cpu modes
-I, --show-info Display extended information about perf.data file
--show-total-period
Show a column with the sum of periods
--show-ref-call-graph
Show callgraph from reference event
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445701767-12731-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some tools have a lot of options, so, providing a way to show help just
for some of them may come handy:
$ perf report -h --tui
Usage: perf report [<options>]
--tui Use the TUI interface
$ perf report -h --tui --showcpuutilization -b -c
Usage: perf report [<options>]
-b, --branch-stack use branch records for per branch histogram filling
-c, --comms <comm[,comm...]>
only consider symbols in these comms
--showcpuutilization
Show sample percentage for different cpu modes
--tui Use the TUI interface
$
Using it with perf bash completion is also handy, just make sure you
source the needed file:
$ . ~/git/linux/tools/perf/perf-completion.sh
Then press tab/tab after -- to see a list of options, put them after -h
and only the options chosen will have its help presented:
$ perf report -h --
--asm-raw --demangle-kernel --group
--kallsyms --pretty --stdio
--branch-history --disassembler-style --gtk
--max-stack --showcpuutilization --symbol-filter
--branch-stack --dsos --header
--mem-mode --show-info --symbols
--call-graph --dump-raw-trace --header-only
--modules --show-nr-samples --symfs
--children --exclude-other --hide-unresolved
--objdump --show-ref-call-graph --threads
--column-widths --fields --ignore-callees
--parent --show-total-period --tid
--comms --field-separator --input
--percentage --socket-filter --tui
--cpu --force --inverted
--percent-limit --sort --verbose
--demangle --full-source-path --itrace
--pid --source --vmlinux
$ perf report -h --socket-filter
Usage: perf report [<options>]
--socket-filter <n>
only show processor socket that match with this filter
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-83mcdd3wj0379jcgea8w0fxa@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When asking for a listing of the options, be it using -h or when an
unknown option is passed, order it by one-letter options, then the ones
having just long names.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-41qh68t35n4ehrpsuazp1dx8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_config() infrastructure we inherited from git calls die() when
the provided config callback returns -1, meaning some key in a config
section is unexpected, that seems ok for a stdio based tool, but in
--tui we end up messing up the output, so just tell the user about the
error, wait for a keystroke and return 0, being more resilient and
proceeding with what we managed to parse.
That die() needs to die, tho :-)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-pqtsffh2kwr5mwm4qg9kgotu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. we want to tell the user about errors found during, for instance,
the ui_browser initialization, so that a call to ui__warning() appears
as a window waiting for a key to be pressed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ederrwizcl6mfz10vfobl5qq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The annotate__configs should be sorted so that it can use bsearch(3).
However commit 0c4a5bcea4 ("perf annotate: Display total number of
samples with --show-total-period") added a new config item at the end.
This resulted in the 'annotate.use_offset' config variable cannot be
found and perf terminated like below:
$ perf report
bad config file line 6 in ~/.perfconfig
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Fixes: 0c4a5bcea4 ("perf annotate: Display total number of samples with --show-total-period")
Link: http://lkml.kernel.org/r/1445396240-3428-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --call-graph option is complex so we should provide better guide for
users. Also change help message to be consistent with config option
names. Now perf top will show help like below:
$ perf top --call-graph
Error: option `call-graph' requires a value
Usage: perf top [<options>]
--call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]>
setup and enables call-graph (stack chain/backtrace):
record_mode: call graph recording mode (fp|dwarf|lbr)
record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>)
default: 8192 (bytes)
print_type: call graph printing style (graph|flat|fractal|none)
threshold: minimum call graph inclusion threshold (<percent>)
print_limit: maximum number of call graph entry (<number>)
order: call graph order (caller|callee)
sort_key: call graph sort key (function|address)
branch: include last branch info to call graph (branch)
Default: fp,graph,0.5,caller,function
Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445524112-5201-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The caller callchain order is useful with --children option since it can
show 'overview' style output, but other commands which don't use
--children feature like 'perf script' or even 'perf report/top' without
--children are better to keep callee order.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445499946-29817-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently 'perf top --call-graph' option is same as 'perf record'. But
'perf top' also need to receive display options in 'perf report'. To do
that, change parse_callchain_report_opt() to allow record options too.
Now perf top can receive display options like below:
$ perf top --call-graph
Error: option `call-graph' requires a value
Usage: perf top [<options>]
--call-graph
<mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]>
setup and enables call-graph (stack chain/backtrace)
recording: fp dwarf lbr, output_type (graph, flat,
fractal, or none), min percent threshold, optional
print limit, callchain order, key (function or
address), add branches
$ perf top --call-graph callee,graph,fp
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445495330-25416-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These messages will be used by 'perf top' in the next patch.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445495330-25416-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a missing field to the perf_event_attr debug output.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1445366797-30894-4-git-send-email-andi@firstfloor.org
[ Print it between config2 and sample_regs_user (peterz)]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf will core dump if --per-socket/core -a are applied for perf stat.
The root cause is that cpu_map__build_map set refcnt of evlist's cpu_map
to 1. It should set refcnt for the newly created cpu_map, not evlist's
cpu_map.
Here is the example:
# perf stat -e cycles --per-socket -a sleep 1
Performance counter stats for 'system wide':
S0 36 30,196,257 cycles
S1 28 15,823,536 cycles
1.001126828 seconds time elapsed
*** Error in `./perf': corrupted double-linked list: 0x00000000021f9090 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3002e7bbe7]
/lib64/libc.so.6[0x3002e7d2b5]
./perf(perf_evsel__delete+0x28)[0x485bdd]
./perf[0x4800e8]
./perf(perf_evlist__delete+0x5e)[0x482cd5]
./perf(cmd_stat+0xf25)[0x432328]
./perf[0x4768e0]
./perf[0x476ad6]
./perf[0x476b41]
./perf(main+0x1d0)[0x476db2]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x3002e21b45]
./perf[0x4202c5]
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1444388363-35936-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch add a new branch type sampling filter to perf record.
It is named 'call' and maps to PERF_SAMPLE_BRANCH_CALL. It samples
direct call branches only, unlike 'any_call' which includes indirect
calls as well.
$ perf record -j call -e cycles .....
The man page is updated accordingly.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: khandual@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1444720151-10275-5-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There's no need to check sampling output fields for events without
perf_event_attr::sample_type field set.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-51-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding data arg to cpu_map__build_map callback, so we could pass data
along to the callback. It'll be needed in following patches to retrieve
topology info from perf.data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-41-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We'll need to call it from perf stat in the stat_script patchkit
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-40-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding AGGR_UNSET mode, so we could distinguish unset aggr_mode in
following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-30-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's used as the perf_evsel::priv data, so the name suits better. Also
we'll need the perf_stat name free for more generic struct.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444992092-17897-29-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Capitalize 'usage' to make it consistent with all the other 'Usage' in
the codes, e.g., usage_builtin.
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Sriram Raghunathan <sriram.r@nokia.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1444894792-2338-3-git-send-email-yunlong.song@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So right now we output this text:
memcpy: Benchmark for memcpy() functions
memset: Benchmark for memset() functions
all: Test all memory access benchmarks
But the right verb to use with benchmarks is to 'run' them, not 'test'
them.
So change this (and all similar texts) to:
memcpy: Benchmark for memcpy() functions
memset: Benchmark for memset() functions
all: Run all memory access benchmarks
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-15-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So right now there's a somewhat inconsistent mess of the benchmarking
code and options sometimes calling benchmarked functions 'functions',
sometimes calling them 'routines'.
Name them 'functions' consistently.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-14-git-send-email-mingo@kernel.org
[ Updated perf-bench man page, pointed out by David Ahern ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have three benchmarking subsystems that specify some sort of 'number
of loops' parameter - but all of them do it inconsistently:
numa: -l/--nr_loops
sched messaging: -l/--loops
mem memset/memcpy: -i/--iterations
Harmonize them to -l/--nr_loops by picking the numa variant - which is
also the most likely one to have existing scripting which we don't want
to break.
Plus improve the parameter help texts to indicate the default value for
the nr_loops variable to keep users from guessing ...
Also propagate the naming to internal variables.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-13-git-send-email-mingo@kernel.org
[ Let the harmonisation reach the perf-bench man page as well ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reorder functions a bit, so that we synchronize the layout of the
memcpy() and memset() portions of the code.
This improves the code, especially after we'll add an strlcpy() variant
as well.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-12-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- fix various typos in user visible output strings
- make the output consistent (wrt. capitalization and spelling)
- offer the list of routines to benchmark on '-r help'.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-11-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So 'perf bench mem memcpy/memset' consistently uses 'len' and 'length'
for buffer sizes - while it's really a memory buffer size. (strings have
length.)
Rename all affected variables.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-10-git-send-email-mingo@kernel.org
[ Update perf-bench man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So bench/mem-functions.c has a 'routine' name for the routines parameter
string, but a 'length_str' name for the length parameter string.
We also have another entity named 'routine': 'struct routine'.
This is inconsistent and confusing: rename 'routine' to 'routine_str'.
Also fix typos in the --routine help text.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-9-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So 'perf bench mem memset/memcpy' has a CPU cycles measurement method,
but calls it 'cycle' (singular) throughout the code, which makes it
harder to read.
Rename all related functions, variables and options to a plural 'cycles'
nomenclature.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-8-git-send-email-mingo@kernel.org
[ s/--cycle/--cycles/g in perf-bench man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So 'perf bench -h' is not very helpful when printing the help line
about the output formatting options:
-f, --format <default>
Specify format style
There are two output format styles, 'default' and 'simple', so improve
the help text to:
-f, --format <default|simple>
Specify the output formatting style
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-7-git-send-email-mingo@kernel.org
[ Removed leftovers from the mem-functions.c rename ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So 'perf bench mem memcpy/memset' has elaborate code to measure
memcpy()/memset() performance both with freshly allocated buffers (which
includes initial page fault overhead) and with preallocated buffers.
But the thing is, the resulting bandwidth results are mostly
meaningless, because page faults dominate so much of the cost.
It might make sense to measure cache cold vs. cache hot performance, but
the code does not do this.
So remove this complication, and always prefault the ranges before using
them.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-6-git-send-email-mingo@kernel.org
[ Remove --no-prefault, --only-prefault from docs, noticed by David Ahern ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So mem-memcpy.c started out as a simple memcpy() benchmark, then it grew
memset() functionality and now I plan to add string copy benchmarks as
well.
This makes the file name a misnomer: rename it to the more generic
mem-functions.c name.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1445241870-24854-5-git-send-email-mingo@kernel.org
[ The "rename" was introducing __unused, wasn't removing the old file,
and didn't update tools/perf/bench/Build, fix it ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently libtraceevent emits warning on unsupported event formats.
However it'd be better to see them only -v option is given. To do that,
it needs to override the warning() function which is used in the
libtracevent. Thus add set_warning_routine() same as set_die_routine()
and check the verbose flag in our warning routine.
Before:
# perf test 5
5: parse events tests :
Warning: [kvmmmu:kvm_mmu_get_page] bad op token {
Warning: [kvmmmu:kvm_mmu_sync_page] bad op token {
Warning: [kvmmmu:kvm_mmu_unsync_page] bad op token {
Warning: [kvmmmu:kvm_mmu_prepare_zap_page] bad op token {
Warning: [kvmmmu:fast_page_fault] function is_writable_pte not defined
...
Ok
After:
# perf test 5
5: parse events tests : Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445268229-1601-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, when 'perf test' is run by a normal user, it'll fail to
access tracepoint events. The output becomes somewhat messy because it
tries to be nice with long error messages and hints.
IMHO this is not needed for 'perf test' by default and AFAIK 'perf test'
uses pr_debug() rather than pr_err() for such messages so that one can
use -v option to see further details on failed testcases if needed.
Before:
$ perf test
1: vmlinux symtab matches kallsyms : FAILED!
2: detect openat syscall event :Error:
No permissions to read
/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
FAILED!
3: detect openat syscall event on all cpus :Error:
No permissions to read
/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
FAILED!
...
After:
$ perf test
1: vmlinux symtab matches kallsyms : FAILED!
2: detect openat syscall event : FAILED!
3: detect openat syscall event on all cpus : FAILED!
...
$ perf test -v 2
2: detect openat syscall event :
--- start ---
test child forked, pid 30575
Error: No permissions to read
/sys/kernel/debug/tracing/events/syscalls/sys_enter_openat
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
test child finished with -1
---- end ----
detect openat syscall event: FAILED!
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1445268229-1601-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With horizontal scrolling, the left/right arrow keys are used to scroll
columns and ENTER/ESC keys are used to enter/exit menu. However if
callchain is recorded, the ENTER key is used to toggle callchain
expansion so there's no way to display menu. Use 'm' key to display the
menu for this case.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444694521-8136-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
unw_word_t is uint64_t even on 32-bit MIPS. Cast it to uintptr_t before
the cast to void *p to get rid of the following errors:
util/unwind-libunwind.c: In function 'access_mem':
util/unwind-libunwind.c:464:4: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
util/unwind-libunwind.c:475:2: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
cc1: all warnings being treated as errors
make[3]: *** [util/unwind-libunwind.o] Error 1
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabinv@axis.com>
Link: http://lkml.kernel.org/r/1443379079-29133-1-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When NO_LIBUNWIND_DEBUG_FRAME=0, use the .debug_frame if the .eh_frame
doesn't contain the approprate unwind tables.
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabinv@axis.com>
Link: http://lkml.kernel.org/r/1443379079-29133-3-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When in the hists browser, i.e. in 'perf report' or in 'perf top', it is
possible to press '/' and specify a substring to filter by symbol name.
Clarify how to remove a filter by making the prompt be:
Please enter the name of symbol you want to see.
To remove the filter later, press / + ENTER
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vbq2b0kyufwy6p0ctkfswcoe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They were repurposed for horizontal scrolling, so use just ENTER/ESC in
the help messages.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: c6c3c02dea ("perf hists browser: Implement horizontal scrolling")
Link: http://lkml.kernel.org/n/tip-n5ar4qg8fs12ax4vhr3rxhxj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Not as the first attempt at finding a vmlinux for the running kernel,
this way we get a more informative filename to present in tools, it will
check that the build-id is the same as the one previously loaded in the
DSO in dso->build_id, reading from /sys/kernel/notes, for instance.
E.g. in the annotation TUI, going from 'perf top', for the scsi_sg_alloc
kernel function, in the first line:
Before:
scsi_sg_alloc /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1
After:
scsi_sg_alloc /lib/modules/4.3.0-rc1+/build/vmlinux
And:
# ls -la /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1
lrwxrwxrwx. 1 root root 81 Sep 22 16:11 /root/.debug/.build-id/28/2777c262e6b3c0451375163c9a81c893218ab1 -> ../../home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
# file ~/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
/root/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=282777c262e6b3c0451375163c9a81c893218ab1, not stripped
#
The same as:
# file /lib/modules/4.3.0-rc1+/build/vmlinux
/lib/modules/4.3.0-rc1+/build/vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=282777c262e6b3c0451375163c9a81c893218ab1, not stripped
Furthermore:
# sha256sum /lib/modules/4.3.0-rc1+/build/vmlinux
e7a789bbdc61029ec09140c228e1dd651271f38ef0b8416c0b7d5ff727b98be2 /lib/modules/4.3.0-rc1+/build/vmlinux
# sha256sum ~/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
e7a789bbdc61029ec09140c228e1dd651271f38ef0b8416c0b7d5ff727b98be2 /root/.debug/home/git/build/v4.3.0-rc1+/vmlinux/282777c262e6b3c0451375163c9a81c893218ab1
[root@zoo new]#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9y42ikzq3jisiddoi6f07n8z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The function can return negative value, assigning it to unsigned
variable can cause memory corruption.
The problem has been detected using proposed semantic patch
scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci [1].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2038576
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/1444122017-16856-1-git-send-email-a.hajda@samsung.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_hpp__init currently does not respect sorting dimensions and the
setup_sorting function could endup queueing same format twice. That
screwed up the perf_hpp__list and got stuck in loop within
perf_hpp__setup_output_field function.
$ perf report -F +overhead
0x00000000004c1355 in perf_hpp__is_sort_entry (format=format@entry=0x880440 <perf_hpp.format>) at util/sort.c:1506
1506 {
#0 0x00000000004c1355 in perf_hpp__is_sort_entry (format=format@entry=0x880440 <perf_hpp.format>) at util/sort.c:1506
#1 0x00000000004c139d in perf_hpp__same_sort_entry (a=a@entry=0x880440 <perf_hpp.format>, b=b@entry=0x2bb2fe0) at util/sort.c:1380
#2 0x00000000004f8d3c in perf_hpp__setup_output_field () at ui/hist.c:554
#3 0x00000000004c1d1e in setup_sorting () at util/sort.c:1984
#4 0x000000000042efbf in cmd_report (argc=0, argv=0x7ffea5a0e790, prefix=<optimized out>) at builtin-report.c:874
#5 0x0000000000476f13 in run_builtin (p=p@entry=0x875628 <commands+168>, argc=argc@entry=3, argv=argv@entry=0x7ffea5a0e790) at perf.c:385
#6 0x000000000047710b in handle_internal_command (argc=3, argv=0x7ffea5a0e790) at perf.c:445
#7 0x0000000000477176 in run_argv (argcp=argcp@entry=0x7ffea5a0e5fc, argv=argv@entry=0x7ffea5a0e5f0) at perf.c:489
#8 0x00000000004773e7 in main (argc=3, argv=0x7ffea5a0e790) at perf.c:606
Using hpp_dimension__add_output function to register the output column.
It will also mark the dimension as taken and omit above stuck.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444134312-29136-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This function will allow to register output column from ui code and
respect taken sort/output dimensions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444134312-29136-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no need to call reset_dimensions within __setup_output_field
function. It's already called in its caller setup_sorting right before
perf_hpp__init, which will be changed in following patch to respect
taken dimension.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444134312-29136-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we dont fail properly when pattern matching fails to find any
tracepoint.
Current behaviour:
$ perf record -e 'sched:krava*' sleep 1
WARNING: event parser found nothinginvalid or unsupported event: 'sched:krava*'
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
This patch change:
$ perf record -e 'sched:krava*' sleep 1
event syntax error: 'sched:krava*'
\___ unknown tracepoint
Error: File /sys/kernel/debug/tracing/events/sched/krava* not found.
Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
Run 'perf list' for a list of valid events
usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444073477-3181-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Do it using the recently introduced ui_brower scrolling mode, setting
ui_browser.columns to the number of sort columns and then, when
rendering each line, skipping as many initial columns as the user
pressed the right arrow.
As the user presses the left arrow, the ui_browser code will remove the
scrolling counter and the left scrolling takes place.
The right arrow key was an alias for ENTER, so people used to press it
may get a bit annoyed at first, sorry! Ditto for ESC and the left key.
Callchains can be left as is or we can, when rendering the Symbol
column, store the at what position on the screen it is and then
using ui_browser__gotorc() to print it from there, i.e. the callchain
would move around with the symbol.
Leaving it as is, i.e. at a fixed position, close to the left, saves
precious screen real state for it, so I'm inclined to leave it as is
now.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ccqq9sabgfge5dwbqjwh71ij@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the classes derived from ui_browser want to do some sort of
horizontal scrolling, they have just to set ui_browser->columns to
the number of columns available.
Those columns can be the number of characters on the screen, if what is
desired is to scroll character by character, or the number of columns in
a spreadsheet like table.
This is what the hist_browser will do, skipping ui_browser->horiz_scroll
columns when rendering each of its lines.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-q6a22bpmpgcr1awgzrmd4jrs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Which is the most common default found in other similar tools.
Requested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://www.youtube.com/watch?v=nXaxk27zwlk
Link: http://lkml.kernel.org/n/tip-v8lq36aispvdwgxdmt9p9jd9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Peter reports that it's possible to trigger a WARN_ON_ONCE() in the
Intel CQM code by combining a hardware event and an Intel CQM
(software) event into a group. Unfortunately, the perf tools are not
able to create this bundle and we need to manually construct a test
case.
For posterity, record Peter's proof of concept test case in tools/perf
so that it presents a model for how we can perform architecture
specific tests, or "arch tests", in perf in the future.
The particular issue triggered in the test case is that when the
counter for the hardware event overflows and triggers a PMI we'll read
both the hardware event and the software event counters.
Unfortunately, for CQM that involves performing an IPI to read the CQM
event counters on all sockets, which in NMI context triggers the
WARN_ON_ONCE().
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/r/1437490509-15373-1-git-send-email-matt@codeblueprint.co.uk
Link: http://lkml.kernel.org/n/tip-3p4ra0u8vzm7m289a1m799kf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move out the x86-specific tests into tools/perf/arch/x86/tests and
define an 'arch_tests' array, which is the list of tests that only apply
to the build architecture.
We can also now begin to get rid of some of the #ifdef code that is
present in the generic perf tests.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/n/tip-9s68h4ptg06ah0lgnjz55mqn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tests that only make sense for some architectures currently live in
the same place as the generic tests. Move out the x86-specific tests
into tools/perf/arch/x86/tests and define an 'arch_tests' array, which
is the list of tests that only apply to the build architecture.
The main idea is to encourage developers to add arch tests to build
out perf's test coverage, without dumping everything in
tools/perf/tests.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kanaka Juvva <kanaka.d.juvva@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vikas Shivappa <vikas.shivappa@intel.com>
Cc: Vince Weaver <vince@deater.net>
Link: http://lkml.kernel.org/n/tip-p4uc1c15ssbj8xj7ku5slpa6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding handling for '-h' and '-v' options to invoke help and version
command respectively.
Current behaviour is:
$ perf -v
Unknown option: -v
Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
$ perf -h
Unknown option: -h
Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
New behaviour:
$ perf -h
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
The most commonly used perf commands are:
annotate Read perf.data (created by perf record) and display annotated code
archive Create archive with object files with build-ids found in perf.data file
bench General framework for benchmark suites
...
$ perf -v
perf version 4.3.rc3.gc99e32
Updated man page.
Requested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to properly initialize column width for symbol_iaddr field, so
all symbols could fit in the column.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sorting on 'symbol' gives to broad a resolution as it can cover a range
of IP address. Use the iaddr instead to get proper sorting on IP
addresses. Need to use the 'mem_sort' feature of perf record.
New sort option is: symbol_iaddr, header label is 'Code Symbol'.
$ perf mem report --stdio -F +symbol_iaddr
# Overhead Samples Code Symbol Local Weight
# ........ ............ ........................ ............
#
54.08% 1 [k] nmi_handle 192
4.51% 1 [k] finish_task_switch 16
3.66% 1 [.] malloc 13
3.10% 1 [.] __strcoll_l 11
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-8-git-send-email-jolsa@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We cant test 'P' modifier gets properly parsed, the functionality test
itself is beyond this suite.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'P' will cause the event to get maximum possible detected precise
level.
Following record:
$ perf record -e cycles:P ...
will detect maximum precise level for 'cycles' event and use it.
Commiter note:
Testing it:
$ perf record -e cycles:P usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.013 MB perf.data (9 samples) ]
$ perf evlist
cycles:P
$ perf evlist -v
cycles:P: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1, mmap2: 1,
comm_exec: 1
$
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'll be used in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The annotated_source::sizeof_sym_hist could easily overflow int size,
resulting in crash in __symbol__inc_addr_samples.
Changing its type int size_t as was probably intended from beginning
based on the initialization code in symbol__alloc_hist.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1444068369-20978-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --interval-print parameter was limited to 100ms. However, for
example, 10ms is required to do sophisticated bandwidth analysis using
uncore events.
The test shows that the overhead of the system-wide uncore monitoring
with 10ms interval is only ~2%. So this patch reduces the minimal
interval-print allowd to 10ms.
But 10ms may not work well for all cases. For example, when the
cpus/threads number is very large, for system-wide core event monitoring
the overhead could be high.
To handle this issue, a warning will be displayed when the
interval-print is set between 10ms to 100ms. So users can make a
decision according to their specific cases.
# perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1
print interval < 100ms. The overhead percentage could be high in some
cases. Please proceed with caution.
# time counts unit events
0.010200451 0.10 MiB uncore_imc_1/cas_count_read/
0.020475117 0.02 MiB uncore_imc_1/cas_count_read/
0.030692800 0.01 MiB uncore_imc_1/cas_count_read/
0.040948161 0.02 MiB uncore_imc_1/cas_count_read/
0.051159564 0.00 MiB uncore_imc_1/cas_count_read/
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com
[ Added warning about overhead when using sub 100ms intervals to the man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When run "perf record -e", the number of samples showed up is wrong on some
32 bit systems, i.e. powerpc and arm.
For example, run the below commands on 32 bit powerpc:
perf probe -x /lib/libc.so.6 malloc
perf record -e probe_libc:malloc -a ls perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.036 MB perf.data (13829241621624967218 samples) ]
Actually, "perf script" just shows 21 samples. The number of samples is also
absurd since samples is long type, but it is printed as PRIu64.
Build test ran on x86-64, x86, aarch64, arm, mips, ppc and ppc64.
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Cc: linaro-kernel@lists.linaro.org
Link: http://lkml.kernel.org/r/1443563383-4064-1-git-send-email-yang.shi@linaro.org
[ Bumped the 'hits' var used together with record.samples to 'unsigned long long' too ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow probing on kernel modules when 'perf' is built without debuginfo
support.
Currently perf-probe --module requires linking with libdw, but this
doesn't make sense.
E.g.
----
# make NO_DWARF=1
# ./perf probe -m pcspkr pcspkr_event%return
Error: unknown switch `m'
----
With this patch
----
# ./perf probe -m pcspkr pcspkr_event%return
Added new event:
probe:pcspkr_event (on pcspkr_event%return in pcspkr)
You can now use it in all perf tools, such as:
perf record -e probe:pcspkr_event -aR sleep 1
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20151002125832.18617.78721.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some PMUs, like the 'intel_bts' one can be used as an event name, i.e.:
$ perf record -e intel_bts:// usleep 1
Is a valid event name.
But the code printing such PMUs was not honouring the 'event_glob'
parameter, so the following line was always appearing:
$ intel_bts// [Kernel PMU event]
Fix it:
$ [acme@felicio linux]$ perf list data
List of pre-defined events (to be used in -e):
uncore_imc/data_reads/ [Kernel PMU event]
uncore_imc/data_writes/ [Kernel PMU event]
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ajb71858n7q7ao77b8pyy74w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The patch f9db0d0f1b ("perf callchain: Allow disabling call graphs
per event") added an ability to enable/disable callchain recording per
event. But it had a problem when the enablement setting is changed at
'perf report' time using -g/--call-graph option.
For example, the following scenario will get a segfault.
$ perf record -ag sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.500 MB perf.data (2555 samples) ]
$ perf report -g none
perf: Segmentation fault
-------- backtrace --------
perf[0x53a98a]
/usr/lib/libc.so.6(+0x335af)[0x7f4e91df95af]
This is because callchain_param.sort() callback was not set but it
tried to call the function as it had the PERF_SAMPLE_CALLCHAIN bit.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: f9db0d0f1b ("perf callchain: Allow disabling call graphs per event")
Link: http://lkml.kernel.org/r/1443587640-24242-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf top didn't add the idle/swapper thread to the machine's thread
list and its comm was displayed as ':0'. Fix it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443577526-3240-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf top uses 'dso,symbol' sort keys by default so it overlooked a
problem in task's comm resolving. When the sort key contains 'comm',
some task's comm is not shown properly. This is because the
perf_top__mmap_read_idx() checks the cpumode value improperly.
The cpumode value of non-sample events are 0 (PERF_RECORD_MISC_CPUMODE_
UNKNOWN) so the events will be ignored by the switch statement. This patch
allows it for non-sample events.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443577526-3240-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A previous patch added a synthesized comm event for forked child process
but it missed that the event should contain area for sample_id_hdr at
the end. It worked by accident since the perf_event union contains
bigger event structs like mmap_events.
This patch fixes it by dynamically allocating event struct including
those area like in perf_event__synthesize_thread_map().
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1443577526-3240-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the user doesn't specify any event, try the most precise "cycles"
available, i.e. start by "cycles:ppp" and go on removing "p" till it
works.
E.g.
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (11 samples) ]
$ perf evlist
cycles:pp
$ perf evlist -v
cycles:pp: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, precise_ip: 2, sample_id_all: 1,
exclude_guest: 1, mmap2: 1, comm_exec: 1
$ grep 'model name' /proc/cpuinfo | head -1
model name : Intel(R) Core(TM) i7-3667U CPU @ 2.00GHz
$
When 'cycles' appears explicitely is specified this will not be tried,
i.e. the user has full control of the level of precision to be used:
$ perf record -e cycles usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data (9 samples) ]
$ perf evlist
cycles
$ perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1,
enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2:
1, comm_exec: 1
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://www.youtube.com/watch?v=nXaxk27zwlk
Link: http://lkml.kernel.org/n/tip-b1ywebmt22pi78vjxau01wth@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that one can, for instance, use it with wc -l:
# perf list *:*write* | wc -l
60
Or to look for the "bio" tracepoints, without 'perf list' headers:
# perf list *:*bio* | head
block:block_bio_backmerge [Tracepoint event]
block:block_bio_bounce [Tracepoint event]
block:block_bio_complete [Tracepoint event]
block:block_bio_frontmerge [Tracepoint event]
block:block_bio_queue [Tracepoint event]
block:block_bio_remap [Tracepoint event]
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ts7sc0x8u4io4cifzkup4j44@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf probe shows more precisely message when it finds given
%return target function is inlined.
Without this fix:
----
# ./perf probe -V getname_flags%return
Return probe must be on the head of a real function.
Debuginfo analysis failed.
Error: Failed to show vars.
----
With this fix:
----
# ./perf probe -V getname_flags%return
Failed to find "getname_flags%return",
because getname_flags is an inlined function and has no return point.
Debuginfo analysis failed.
Error: Failed to show vars.
----
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164137.3733.55055.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf probe --list will get a segfault if the first kprobe event is on a
module and the second or latter one is on the kernel.
e.g.
----
# ./perf probe -q -m pcspkr pcspkr_event
# ./perf probe -q vfs_read
# ./perf probe -l
Segmentation fault (core dumped)
----
This is because the debuginfo_cache fails to handle NULL module name,
which causes segfault on strcmp. (Note that strcmp("something", NULL)
always causes segfault)
To fix this debuginfo_cache__open always translates the NULL module name
to "kernel" (this is correct, because NULL module name means opening the
debuginfo for the kernel)
----
# ./perf probe -l
probe:pcspkr_event (on pcspkr_event@drivers/input/misc/pcspkr.c
in pcspkr)
probe:vfs_read (on vfs_read@ksrc/linux-3/fs/read_write.c)
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164135.3733.23993.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf probe always failed to find appropriate line numbers because of
failing to find .text start address offset from debuginfo.
e.g.
----
# ./perf probe -m pcspkr pcspkr_event:5
Added new events:
probe:pcspkr_event (on pcspkr_event:5 in pcspkr)
probe:pcspkr_event_1 (on pcspkr_event:5 in pcspkr)
You can now use it in all perf tools, such as:
perf record -e probe:pcspkr_event_1 -aR sleep 1
# ./perf probe -l
Failed to find debug information for address ffffffffa031f006
Failed to find debug information for address ffffffffa031f016
probe:pcspkr_event (on pcspkr_event+6 in pcspkr)
probe:pcspkr_event_1 (on pcspkr_event+22 in pcspkr)
----
This fixes the above issue as below.
1. Get the relative address of the symbol in .text by using
map->start.
2. Adjust the address by adding the offset of .text section
in the kernel module binary.
With this fix, perf probe -l shows lines correctly.
----
# ./perf probe -l
probe:pcspkr_event (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
probe:pcspkr_event_1 (on pcspkr_event:5@drivers/input/misc/pcspkr.c in pcspkr)
----
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164132.3733.24643.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a trival bug about libdwfl usage of the report session, it should
explicitly begin and end a report session around dwfl_report_offline().
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164128.3733.59876.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix to remove dot suffix (e.g. .const, .isra) from the second or latter
events which has suffix numbers.
Since the previous commit 35a23ff928 ("perf probe: Cut off the gcc
optimization postfixes from function name") didn't care about the suffix
numbered events, therefore we'll have an error when we add additional
events on the same dot suffix functions.
e.g.
----
# ./perf probe -f -a get_sigframe.isra.2.constprop.3 \
-a get_sigframe.isra.2.constprop.3
Failed to write event: Invalid argument
Error: Failed to add events.
----
This fixes above issue as below:
----
# ./perf probe -f -a get_sigframe.isra.2.constprop.3 \
-a get_sigframe.isra.2.constprop.3
Added new events:
probe:get_sigframe (on get_sigframe.isra.2.constprop.3)
probe:get_sigframe_1 (on get_sigframe.isra.2.constprop.3)
You can now use it in all perf tools, such as:
perf record -e probe:get_sigframe_1 -aR sleep 1
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20150930164130.3733.26573.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is about binding, not type, we have just a letter in kallsyms that
should map both for the ELF type (STT_FUNC, etc) and to the ELF
symbol binding (STB_WEAK, STB_GLOBAL, etc), so rename it now before
introducing kallsyms2_elf_type()
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uu5vj343ms1q2wm55690on6v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And it is also a step in the direction of killing the separation of data
and text maps in map_groups.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rrds86kb3wx5wk8v38v56gw8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In places where we were using its open coded equivalent.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-khkdugcdoqy3tkszm3jdxgbe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_regs.c file does not get built on Powerpc as CONFIG_PERF_REGS
is false. So the weak definition for 'sample_regs_masks' doesn't get
picked up.
Adding perf_regs.o to util/Build unconditionally, exposes a redefinition
error for 'perf_reg_value()' function (due to the static inline version
in util/perf_regs.h). So use #ifdef HAVE_PERF_REGS_SUPPORT' around that
function.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: linuxppc-dev@ozlabs.org
Link: http://lkml.kernel.org/r/20150930182836.GA27858@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>