We were using it at 10 kHz, which doesn't work in machines where somehow
the max freq was auto reduced by the kernel:
[root@ssdandy ~]# perf test 19
19: Test software clock events have valid period values : FAILED!
[root@ssdandy ~]# perf test -v 19
19: Test software clock events have valid period values :
--- start ---
Couldn't open evlist: Invalid argument
---- end ----
Test software clock events have valid period values: FAILED!
[root@ssdandy ~]#
[root@ssdandy ~]# cat /proc/sys/kernel/perf_event_max_sample_rate
7000
Reducing it to 500 Hz should be good enough for this test and also
shouldn't affect what it is testing.
But warn the user if it fails, informing the knob and the freq tried.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-548rhj1uo6xbwnxa95kw3hqe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were not checking if we successfully opened the counters, i.e. if
sys_perf_event_open worked, when it doesn't in this test, we were
continuing anyway and then segfaulting when trying to access the file
descriptor array, that at that point had been freed in perf_evlist__open
error path:
[root@ssdandy ~]# perf test -v 19
19: Test software clock events have valid period values :
--- start ---
Segmentation fault (core dumped)
[root@ssdandy ~]#
Do the check and bail out instead.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6qy8ljkn0e9hm7bh7keo5z68@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If given sort keys are all elided there'll be no output except for the
overhead column - actually the TUI shows a noisy output. In this case
it'd be better to show up the sort keys rather than elide.
Before:
$ perf report -s comm -c perf
(...)
# Overhead
# ........
#
100.00%
After:
$ perf report -s comm -c perf
(...)
# Overhead Command
# ........ .......
#
100.00% perf
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383900822-14609-1-git-send-email-namhyung@kernel.org
[ Us curly braces around multi-line statements, as requested by Ingo Molnar ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Most uses of the evsel constructor are followed by a call to
perf_evlist__add with an idex of evlist->nr_entries, so make rename
the current constructor to perf_evsel__new_idx and remove the need
for passing the constructor for the common case.
We still need the new_idx variant because the way groups are handled,
with evsel->nr_members holding the number of entries in an evlist,
partitioning the evlist into sublists inside a single linked list.
This asks for a clarifying refactoring, but for now simplify the non
parser cases, so that tool writers don't have to bother with evsel idx
setting.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-zy9tskx6jqm2rmw7468zze2a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Each call to tui_progress__update() would forcibly refresh the entire
screen. This is somewhat inefficient and causes noticable flickering
during the startup of perf-report, especially on large/slow terminals.
It looks like the force-refresh in tui_progress__update() serves no
purpose other than to clear the screen so that the progress bar of a
previous operation does not subsume that of a subsequent operation. But
we can do just that in a much more efficient manner by clearing only the
region that a previous progress bar may have occupied before repainting
the new progress bar. Then the force-refresh could be removed with no
change in visuals.
This patch disables the slow force-refresh in tui_progress__update() and
instead calls SLsmg_fill_region() on the entire area that the progress
bar may occupy before repainting it. This change makes the startup of
perf-report much faster and appear much "smoother".
It turns out that this was a big bottleneck in the startup speed of
perf-report -- with this patch, perf-report starts up ~2x faster (1.1s
vs 0.55s) on my machines. (These numbers were measured by running "time
perf report" on an 8MB perf.data and pressing 'q' immediately.)
Signed-off-by: Patrick Palka <patrick@parcs.ath.cx>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1382747149-9716-1-git-send-email-patrick@parcs.ath.cx
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead do the lookups just when creating the tracepoints, initially for
the most common, raw_syscalls:sys_{enter,exit}.
It works by having evsel->priv have a per tracepoint structure with
entries for the fields, for direct access, with the offset and a
function to get the value from the sample, doing the swap if needed.
Using a simple workload that does M millions write syscalls, we go from:
# perf stat -i -e cycles /tmp/oldperf trace ./sc_hello 100 > /dev/null
Performance counter stats for '/tmp/oldperf trace ./sc_hello 100':
8,366,771,459 cycles
2.668025928 seconds time elapsed
# perf stat -i -e cycles perf trace ./sc_hello 100 > /dev/null
Performance counter stats for 'perf trace ./sc_hello 100':
8,345,187,650 cycles
2.631748425 seconds time elapsed
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-eyfhvoo510a5i10b27dnvm88@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When building perf out of tree:
$ make perf-tar-src-pkg
$ tar -xf perf-<ver>.tar -C /tmp
$ cd /tmp/perf<ver>
$ make -C tools/perf
you get this warning message:
make[1]: *** No rule to make target `kernelversion'. Stop.
Fix it by saving the perf version in the tar file and using that for the
out of tree builds.
v2: removed short form request and fixed up version string from usual output.
Signed-off-by: David Ahern <dsahern@gmail.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1383753335-25782-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the check for maximum allowed frequency rate defined in following
file:
/proc/sys/kernel/perf_event_max_sample_rate
When we cross the maximum value we fail and display detailed error
message with advise.
$ perf record -F 3000 ls
Maximum frequency rate (2000) reached.
Please use -F freq option with lower value or consider
tweaking /proc/sys/kernel/perf_event_max_sample_rate.
In case user does not specify the frequency and the default value cross
the maximum, we display warning and set the frequency value to the
current maximum.
$ perf record ls
Lowering default frequency rate to 2000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
Same messages are used for 'perf top'.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383660887-1734-4-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Attributes (struct perf_event_attr) are recorded separately in the
perf.data file. perf script uses them to set up output options.
However attributes can also be in the event stream, for example when the
input is a pipe (i.e. live mode). This patch makes perf script process
in-stream attributes in the same way as on-file attributes.
Here is an example:
Before this patch:
$ perf record uname | perf script
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB (null) (~655 samples) ]
:4220 4220 [-01] 2933367.838906: cycles:
:4220 4220 [-01] 2933367.838910: cycles:
:4220 4220 [-01] 2933367.838912: cycles:
:4220 4220 [-01] 2933367.838914: cycles:
:4220 4220 [-01] 2933367.838916: cycles:
:4220 4220 [-01] 2933367.838918: cycles:
uname 4220 [-01] 2933367.838938: cycles:
uname 4220 [-01] 2933367.839207: cycles:
After this patch:
$ perf record uname | perf script
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.015 MB (null) (~655 samples) ]
:4582 4582 2933425.707724: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707728: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707730: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707732: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707734: cycles: ffffffff81043ffa native_write_msr_safe ([kernel.kallsyms])
:4582 4582 2933425.707736: cycles: ffffffff81309a24 memcpy ([kernel.kallsyms])
uname 4582 2933425.707760: cycles: ffffffff8109c1c7 enqueue_task_fair ([kernel.kallsyms])
uname 4582 2933425.707978: cycles: ffffffff81308457 clear_page_c ([kernel.kallsyms])
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This new COMM infrastructure provides two features:
1) It keeps track of all comms lifecycle for a given thread. This way we
can associate a timeframe to any thread COMM, as long as
PERF_SAMPLE_TIME samples are joined to COMM and fork events.
As a result we should have more precise COMM sorted hists with seperated
entries for pre and post exec time after a fork.
2) It also makes sure that a given COMM string is not duplicated but
rather shared among the threads that refer to it. This way the threads
COMM can be compared against pointer values from the sort
infrastructure.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-hwjf70b2wve9m2kosxiq8bb3@git.kernel.org
[ Rename some accessor functions ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
[ Use __ as separator for class__method for private comm_str methods ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This way we can later delimit a lifecycle for the COMM and map a hist to
a precise COMM:timeslice couple.
PERF_RECORD_COMM and PERF_RECORD_FORK events that don't have
PERF_SAMPLE_TIME samples can only send 0 value as a timestamp and thus
should overwrite any previous COMM on a given thread because there is no
sensible way to keep track of all the comms lifecycles in a thread
without time informations.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6tyow99vgmmtt9qwr2u2lqd7@git.kernel.org
[ Made it cope with PERF_RECORD_MMAP2 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>