Commit Graph

400 Commits

Author SHA1 Message Date
Kan Liang
94ba462d69 perf diff: Support for different binaries
Currently, the perf diff only works with same binaries. That's because
it compares the symbol start address. It doesn't work if the perf.data
comes from different binaries. This patch matches the symbol names.

Actually, perf diff once intended to compare the symbol names.  The
commit as below can look for a pair by name.

604c5c9297 (perf diff: Change the default sort order to "dso,symbol")
However, at that time, perf diff used a global list of dsos. That means
the binaries which has same name can only be loaded once. That's a
problem for comparing different binaries.

For example, we have an old binary and an updated binary. They very
likely have same name and most of the functions, so only dsos from old
binary will be loaded. When processing the data from updated binary,
perf still use the symbol information from old binary. That's wrong.

Then the commit as below used IP to replace symbol name.
9c443dfdd3 ("perf diff: Fix support for all --sort combinations")
>From that time, perf diff starts to compare the symbol address.

The global dsos is discarded from a patch in 2010.
a1645ce12a ("perf: 'perf kvm' tool for monitoring guest performance
from host")
However, at that time, perf diff already compared by address. So perf
diff cannot work for different binaries as well.

This patch actually rolls back the perf diff to original design. The
document is also changed, so everybody knows the original design is to
compare the symbol names.

Here are some examples:

The only difference between example_v1.c and example_v2.c is the
location of f2 and f3. There is no change in behavior, but the previous
perf diff display the wrong differential profile.

example_v1.c
noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {

                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;

}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

example_v2.c
noinline void f2(void)
{
        volatile int a = 100, b, c;
        for (b = 0; b < 10000; b++)
                c = a * b;
}

noinline void f3(void)
{
        volatile int i;
        for (i = 0; i < 10000;) {
                if(i%2)
                        i++;
                else
                        i++;
        }
}

noinline void f1(void)
{
                f2();
                f3();
}

int main()
{
        int i;
        for (i = 0; i < 100000; i++)
                f1();
}

[lk@localhost perf_diff]$ gcc example_v1.c -o example
[lk@localhost perf_diff]$ perf record -o example_v1.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.813 MB example_v1.data (~35522 samples) ]

[lk@localhost perf_diff]$ gcc example_v2.c -o example
[lk@localhost perf_diff]$ perf record -o example_v2.data ./example
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.824 MB example_v2.data (~36015 samples) ]

Old perf diff result:

[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
 Event 'cycles'
 Baseline    Delta  Shared Object     Symbol
 ........  .......  ................  ...............................

                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%           example           [.] f2
    53.71%   -7.55%  example           [.] f3
            +53.81%  example           [.] f3
     0.02%           example           [.] main

New perf diff result:

[lk@localhost perf_diff]$ perf diff example_v1.data example_v2.data
                     [kernel.vmlinux]  [k] __perf_event_task_sched_out
     0.00%           [kernel.vmlinux]  [k] apic_timer_interrupt
                     [kernel.vmlinux]  [k] idle_cpu
                     [kernel.vmlinux]  [k] intel_pstate_timer_func
                     [kernel.vmlinux]  [k] native_read_msr_safe
     0.00%           [kernel.vmlinux]  [k] native_read_tsc
     0.00%           [kernel.vmlinux]  [k] native_write_msr_safe
                     [kernel.vmlinux]  [k] ntp_tick_length
     0.00%           [kernel.vmlinux]  [k] rb_erase
     0.00%           [kernel.vmlinux]  [k] tick_sched_timer
     0.00%           [kernel.vmlinux]  [k] unmap_single_vma
     0.00%           [kernel.vmlinux]  [k] update_wall_time
     0.00%           example           [.] f1
    46.24%   -0.08%  example           [.] f2
    53.71%   +0.11%  example           [.] f3
     0.02%           example           [.] main

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1423460384-11645-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-27 10:08:38 -03:00
Masami Hiramatsu
a50d11a10c perf buildid-cache: Add new buildid cache if update target is not cached
Add new buildid cache if the update target file is not cached.

This can happen when an old binary is replaced by new one after caching
the old one. In this case, user sees his operation just failed.

But it does not look straight, since user just pass the binary "path",
not "build-id".

  ----
  # ./perf buildid-cache --add ./perf
  (update ./perf to new binary)
  # ./perf buildid-cache --update ./perf
  ./perf wasn't in the cache
  #
  ----

This patch adds given new binary to cache if the new binary is
not cached. So we'll not see the above error.

  ----
  # ./perf buildid-cache --add ./perf
  (update ./perf to new binary)
  # ./perf buildid-cache --update ./perf
  #
  ----

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150226065440.23912.1494.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-27 10:08:37 -03:00
Jiri Olsa
edbe9817ae perf data: Add perf data to CTF conversion support
Adding 'perf data convert' to convert perf data file into different
format. This patch adds support for CTF format conversion.

To convert perf.data into CTF run:
  $ perf data convert --to-ctf=./ctf-data/
  [ perf data convert: Converted 'perf.data' into CTF data './ctf-data/' ]
  [ perf data convert: Converted and wrote 11.268 MB (100230 samples) ]

The command will create CTF metadata out of perf.data file (or one
specified via -i option) and then convert all sample events into single
CTF stream.

Each sample_type bit is translated into separated CTF event field apart
from following exceptions:

  PERF_SAMPLE_RAW          - added in next patch
  PERF_SAMPLE_READ         - TODO
  PERF_SAMPLE_CALLCHAIN    - TODO
  PERF_SAMPLE_BRANCH_STACK - TODO
  PERF_SAMPLE_REGS_USER    - TODO
  PERF_SAMPLE_STACK_USER   - TODO

  $ perf --debug=data-convert=2 data convert ...

The converted CTF data could be analyzed by CTF tools, like babletrace
or tracecompass [1].

  $ babeltrace ./ctf-data/
  [03:19:13.962125533] (+?.?????????) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
  [03:19:13.962130001] (+0.000004468) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 1 }
  [03:19:13.962131936] (+0.000001935) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 8 }
  [03:19:13.962133732] (+0.000001796) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 114 }
  [03:19:13.962135557] (+0.000001825) cycles: { }, { ip = 0xFFFFFFFF8105443A, tid = 20714, pid = 20714, period = 2087 }
  [03:19:13.962137627] (+0.000002070) cycles: { }, { ip = 0xFFFFFFFF81361938, tid = 20714, pid = 20714, period = 37582 }
  [03:19:13.962161091] (+0.000023464) cycles: { }, { ip = 0xFFFFFFFF8124218F, tid = 20714, pid = 20714, period = 600246 }
  [03:19:13.962517569] (+0.000356478) cycles: { }, { ip = 0xFFFFFFFF811A75DB, tid = 20714, pid = 20714, period = 1325731 }
  [03:19:13.969518008] (+0.007000439) cycles: { }, { ip = 0x34080917B2, tid = 20714, pid = 20714, period = 1144298 }

The following members to the ctf-environment were decided to be added to
distinguish and specify perf CTF data:

  - domain

    It says "kernel" because it contains a kernel trace (not to be
    confused with a user space like lttng-ust does)

  - tracer_name

    It says perf. This can be used to distinguish between lttng and perf
    CTF based trace.

  - version

    The kernel version from stream. In addition to release, this is what
    it looks like on a Debian kernel:

      release = "3.14-1-amd64";
      version = "3.14.0";

[1] http://projects.eclipse.org/projects/tools.tracecompass

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-4-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-25 16:13:12 -03:00
Jiri Olsa
2245bf1410 perf tools: Add new 'perf data' command
Adding new 'perf data' command to provide operations over data files.

The 'perf data convert' sub command is coming in following patch, but
there's possibility for other useful commands like 'perf data ls' (to
display perf data file in directory in ls style).

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremie Galarneau <jgalar@efficios.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1424470628-5969-3-git-send-email-jolsa@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-25 12:42:25 -03:00
Andi Kleen
85c273d2b6 perf record: Support recording running/enabled time
Add an option to perf record to record running/enabled time for read
events, similar to what stat does.

This is useful to understand multiplexing problems.

Right now the report support is not great, but at least report -D
already supports it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1424819620-16043-1-git-send-email-andi@firstfloor.org
[ Fixed the Documentation entry to match the OPT_BOOLEAN one ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-25 12:42:23 -03:00
Arnaldo Carvalho de Melo
77c92582a5 perf trace: Add man page entry for --event
Forgot to do it when adding the feature.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-mx152b6x9cgknhw91vsyjlnd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-22 22:22:07 -03:00
Arnaldo Carvalho de Melo
f078c3852c perf trace: Introduce --filter-pids
When tracing in X we get event loops due to the tracing activity, i.e.
updates to a gnome-terminal that generate syscalls for X.org, etc.

To get a more useful view of what is happening, syscall wise, system
wide, we need to filter those, like in:

 # ps ax|egrep '981|2296|1519' | grep -v egrep
   981 tty1 Ss+ 5:40 /usr/bin/Xorg :0 -background none ...
  1519 ?    Sl  2:22 /usr/bin/gnome-shell
  2296 ?    Sl  4:16 /usr/libexec/gnome-terminal-server
 #

 # trace -e write --filter-pids 981,2296,1519
    0.385 ( 0.021 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
    0.922 ( 0.014 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
 5006.525 ( 0.029 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 136) = 136
 5007.235 ( 0.023 ms): goa-daemon/2061 write(fd: 1</dev/null>, buf: 0x7fbeb017b000, count: 140) = 140
 5177.646 ( 0.018 ms): rtkit-daemon/782 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7f7eea70be88, count: 8) = 8
 8314.497 ( 0.004 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af7b0, count: 8) = 8
 8314.518 ( 0.002 ms): gsd-locate-poi/2084 write(fd: 5<anon_inode:[eventfd]>, buf: 0x7fffe96af0e0, count: 8) = 8
 ^C#

When this option is used the tracer pid is also filtered.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-f5qmiyy7c0uxdm21ncatpeek@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-22 22:21:52 -03:00
Ingo Molnar
8a26ce4e54 perf/core improvements and fixes:
User visible:
 
 - 'perf trace': Allow mixing with tracepoints and suppressing plain syscalls
   (Arnaldo Carvalho de Melo)
 
 Infrastructure:
 
 - Kconfig beachhead (Jiri Olsa)
 
 - Simplify nr_pages validity (Kaixu Xia)
 
 - Fixup header positioning in 'perf list' (Yunlong Song)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJU3mKaAAoJEBpxZoYYoA71qnYH/1h8zqbQosuy/7Mu2tgLROts
 2LSK8M+XD4RKdDVRLK95BIKmZfZkBjeOUE+PJIQ6/Mb1BQGBOmmGQ5oydLf2QUFw
 5zVAFS8gec7xGvQpITuZEplJQcqm24CHt7qxUwFlh1DnRzN8eRkW2tHZmr5mfOil
 hVpTQYpawRg/HIufDvlMU0Umv28JPQyRpfIF2TilkBxUT6KjYJK1QNuoNsgGS4ZL
 r8rEpijRNkbmQZXmIDfZzvlzMx2Bwf0wdGf/1Rod1f1HLD4252ZKc07JCujBpvji
 rK/oFj2hHx64r5HUQrOudlQ2B5VvlFKnWKnnb5EgL6gtM4moGhKjNHcUjFy1XLk=
 =8zWn
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - No need to explicitely enable evsels for workload started from perf, let it
    be enabled via perf_event_attr.enable_on_exec, removing some events that take
    place in the 'perf trace' before a workload is really started by it.
    (Arnaldo Carvalho de Melo)

  - Fix to handle optimized not-inlined functions in 'perf probe' (Masami Hiramatsu)

  - Update 'perf probe' man page (Masami Hiramatsu)

  - 'perf trace': Allow mixing with tracepoints and suppressing plain syscalls
    (Arnaldo Carvalho de Melo)

Infrastructure changes:

  - Introduce {trace_seq_do,event_format_}_fprintf functions to allow
    a default tracepoint field list printer to be used in tools that allows
    redirecting output to a file. (Arnaldo Carvalho de Melo)

  - The man page for pthread_attr_set_affinity_np states that _GNU_SOURCE
    must be defined before pthread.h, do it to fix the build in some
    systems (Josh Boyer)

  - Cleanups in 'perf buildid-cache' (Masami Hiramatsu)

  - Fix dso cache test case (Namhyung Kim)

  - Do Not rely on dso__data_read_offset() to open DSO (Namhyung Kim)

  - Make perf aware of tracefs (Steven Rostedt).

  - Fix build by defining STT_GNU_IFUNC for glibc 2.9 and older (Vinson Lee)

  - AArch64 symbol resolution fixes (Victor Kamensky)

  - Kconfig beachhead (Jiri Olsa)

  - Simplify nr_pages validity (Kaixu Xia)

  - Fixup header positioning in 'perf list' (Yunlong Song)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18 19:18:18 +01:00
Kan Liang
aad2b21c15 perf tools: Enable LBR call stack support
Currently, there are two call chain recording options, fp and dwarf.

Haswell has a new feature that utilizes the existing LBR facility to
record call chains. Kernel side LBR support code provides this as a
third option to record call chains. This patch enables the lbr call
stack support on the tooling side.

LBR call stack has some limitations:

 - It reuses current LBR facility, so LBR call stack and branch record
   can not be enabled at the same time.

 - It is only available for user-space callchains.

However, it also offers some advantages:

 - LBR call stack can work on user apps which don't have frame-pointers
   or dwarf debug info compiled. It is a good alternative when nothing
   else works.

Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Cody P Schafer <cody@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Jacob Shin <jacob.w.shin@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masanari Iida <standby24x7@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1420482185-29830-2-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-02-18 17:16:17 +01:00
Jiri Olsa
f819f703a4 perf build: Add build documentation
Adding file describing the basics of perf build process.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ibgf7vxyduwohlqqfayl11xb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-12 17:53:00 -03:00
Masami Hiramatsu
8b72805fd1 perf probe: Update man page
Update Documentation/perf-probe.txt to add descriptions of some newer
options.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150130093746.30575.8571.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-02-06 11:46:36 +01:00
Ingo Molnar
b3890e4704 Merge branch 'perf/hw_breakpoints' into perf/core
The new hw_breakpoint bits are now ready for v3.20, merge them
into the main branch, to avoid conflicts.

Conflicts:
	tools/perf/Documentation/perf-record.txt

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-28 15:48:59 +01:00
Cody P Schafer
f9ab9c196d perf tools: Document parameterized and symbolic events
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Cody P Schafer <dev@codyps.com>
Cc: Haren Myneni <hbabu@us.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1420679633-28856-5-git-send-email-sukadev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-21 13:24:33 -03:00
Arnaldo Carvalho de Melo
48000a1aed perf tools: Remove EOL whitespaces
Janitorial stuff: boredom moment.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-u70i7shys3kths4hzru72bha@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-21 13:24:31 -03:00
Stephane Eranian
67121f85e4 perf mem: Enable sampling loads and stores simultaneously
This patch modifies perf mem to default to sampling loads and stores
simultaneously. It could only do one or the other before yet there was
no hardware restriction preventing simultaneous collection. With this
patch, one run is sufficient to collect both.

It is still possible to sample only loads or stores by using the
-t option:
 $ perf mem -t load rec
 $ perf mem -t load rep
Or
 $ perf mem -t store rec
 $ perf mem -t store rep

The perf report TUI will show one event at a time. The store output will
contain a Weight column which will be empty.

In V2, we updated the man pages to reflect the change and also simplify
the initialization of the argv vector passed to the cmd_*() functions as
per LKML feedback.

In V3, we fixed typos in the changelog.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Fowles <rfowles@redhat.com>
Link: http://lkml.kernel.org/r/20141217152355.GA10053@thinkpad
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-01-21 13:24:31 -03:00
Jiri Olsa
99ce8e9fce perf tools: Add --buildid-dir option to set cache directory
Adding --buildid-dir to be able to set specific cache directory. It's
going to be handy for buildid tests coming in shortly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1417460789-13874-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-09 09:14:35 -03:00
Jacob Shin
3741eb9f8c perf tools: allow user to specify hardware breakpoint bp_len
Currently bp_len is given a default value of 4. Allow user to override it:

  $ perf stat -e mem:0x1000/8
                            ^
                            bp_len

If no value is given, it will default to 4 as it did before.

Signed-off-by: Jacob Shin <jacob.w.shin@gmail.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xiakaixu <xiakaixu@huawei.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2014-12-03 15:14:29 +01:00
Andi Kleen
fa94c36c29 perf report: Add --branch-history option
Add a --branch-history option to perf report that changes all the
settings necessary for using the branches in callstacks.

This is just a short cut to make this nicer to use, it does not enable
any functionality by itself.

v2: Change sort order. Rename option to --branch-history to
    be less confusing.
v3: Updates
v4: Fix conflict with newer perf base
v5: Port to latest tip
v6: Add more comments. Remove CCKEY_ADDRESS setting. Remove
    unnecessary branch_mode setting. Use a boolean.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-01 20:00:31 -03:00
Andi Kleen
8b7bad58ef perf callchain: Support handling complete branch stacks as histograms
Currently branch stacks can be only shown as edge histograms for
individual branches. I never found this display particularly useful.

This implements an alternative mode that creates histograms over
complete branch traces, instead of individual branches, similar to how
normal callgraphs are handled. This is done by putting it in front of
the normal callgraph and then using the normal callgraph histogram
infrastructure to unify them.

This way in complex functions we can understand the control flow that
lead to a particular sample, and may even see some control flow in the
caller for short functions.

Example (simplified, of course for such simple code this is usually not
needed), please run this after the whole patchkit is in, as at this
point in the patch order there is no --branch-history, that will be
added in a patch after this one:

tcall.c:

volatile a = 10000, b = 100000, c;

__attribute__((noinline)) f2()
{
	c = a / b;
}

__attribute__((noinline)) f1()
{
	f2();
	f2();
}
main()
{
	int i;
	for (i = 0; i < 1000000; i++)
		f1();
}

% perf record -b -g ./tsrc/tcall
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (~1923 samples) ]
% perf report --no-children --branch-history
...
    54.91%  tcall.c:6  [.] f2                      tcall
            |
            |--65.53%-- f2 tcall.c:5
            |          |
            |          |--70.83%-- f1 tcall.c:11
            |          |          f1 tcall.c:10
            |          |          main tcall.c:18
            |          |          main tcall.c:18
            |          |          main tcall.c:17
            |          |          main tcall.c:17
            |          |          f1 tcall.c:13
            |          |          f1 tcall.c:13
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:12
            |          |          f1 tcall.c:12
            |          |          f2 tcall.c:7
            |          |          f2 tcall.c:5
            |          |          f1 tcall.c:11
            |          |
            |           --29.17%-- f1 tcall.c:12
            |                     f1 tcall.c:12
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:11
            |                     f1 tcall.c:10
            |                     main tcall.c:18
            |                     main tcall.c:18
            |                     main tcall.c:17
            |                     main tcall.c:17
            |                     f1 tcall.c:13
            |                     f1 tcall.c:13
            |                     f2 tcall.c:7
            |                     f2 tcall.c:5
            |                     f1 tcall.c:12

The default output is unchanged.

This is only implemented in perf report, no change to record or anywhere
else.

This adds the basic code to report:

- add a new "branch" option to the -g option parser to enable this mode
- when the flag is set include the LBR into the callstack in machine.c.

The rest of the history code is unchanged and doesn't know the
difference between LBR entry and normal call entry.

- detect overlaps with the callchain
- remove small loop duplicates in the LBR

Current limitations:

- The LBR flags (mispredict etc.) are not shown in the history
and LBR entries have no special marker.
- It would be nice if annotate marked the LBR entries somehow
(e.g. with arrows)

v2: Various fixes.
v3: Merge further patches into this one. Fix white space.
v4: Improve manpage. Address review feedback.
v5: Rename functions. Better error message without -g. Fix crash without
    -b.
v6: Rebase
v7: Rebase. Use NO_ENTRY in memset.
v8: Port to latest tip. Move add_callchain_ip to separate
    patch. Skip initial entries in callchain. Minor cleanups.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1415844328-4884-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-12-01 20:00:31 -03:00
Stephane Eranian
4b6c51773d perf record: Add new -I option to sample interrupted machine state
Add -I/--intr-regs option to capture machine state registers at
interrupt.

Add the corresponding man page description

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411559322-16548-6-git-send-email-eranian@google.com
Cc: cebbert.lkml@gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-11-16 11:42:02 +01:00
Jiri Olsa
535aeaae7d perf script: Add period data column
Adding period data column to be displayed in perf script.  It's possible
to get period values using -f option, like:

  $ perf script -f comm,tid,time,period,ip,sym,dso
          :26019 26019 52414.329088:       3707  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329088:         44  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:       1987  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
          :26019 26019 52414.329093:          6  ffffffff8105443a native_write_msr_safe ([kernel.kallsyms])
              ls 26019 52414.329442:     537558        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.329442:       2099        3407c0639c _dl_map_object_from_fd (/usr/lib64/ld-2.17.so)
              ls 26019 52414.330181:    1242100        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.330181:       3774        34080917bb get_next_seq (/usr/lib64/libc-2.17.so)
              ls 26019 52414.331427:    1083662  ffffffff810c7dc2 update_curr ([kernel.kallsyms])
              ls 26019 52414.331427:        360  ffffffff810c7dc2 update_curr ([kernel.kallsyms])

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: "Jen-Cheng(Tommy) Huang" <tommy24@gatech.edu>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jen-Cheng(Tommy) Huang <tommy24@gatech.edu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408977943-16594-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-17 15:21:30 -03:00
Masanari Iida
96355f2cfb perf Documentation: Fix typos in perf/Documentation
This patch fix spelling typos found in tool/perf/Documentation.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/1410275930-17207-1-git-send-email-standby24x7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-10-15 17:39:02 -03:00
Avi Kivity
763122ade7 perf tools: Disable kernel symbol demangling by default
Some Linux symbols (for example __vt_event_wait) are interpreted by the
demangler as C++ mangled names, which of course they aren't.

Disable kernel symbol demangling by default to avoid this, and allow
enabling it with a new option --demangle-kernel for those who wish it.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Avi Kivity <avi@cloudius-systems.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1410581705-26968-1-git-send-email-avi@cloudius-systems.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-09-17 17:08:09 -03:00
Namhyung Kim
cf59002fde perf top: Add -w option for setting column width
Add -w/--column-widths option like perf report does so that users are
able to see symbols even with some very long C++ library/functions.

It can be a list separated by comma for each column.

  $ perf top -w 0,20,30

The value of 0 means there's no limit.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1406785662-5534-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-08-12 12:03:06 -03:00
Adrian Hunter
a7a2b8b4ce perf inject: Add --kallsyms parameter
Let perf inject take --kallsyms parameter the same as perf script and
perf report do.

That is needed for decoding Instruction Trace data using a copy of
/proc/kcore for the kernel object because the kallsyms path is used to
locate that copy.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406035081-14301-30-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-07-25 12:08:34 -03:00
Jiri Olsa
bbb2cea7e8 perf tools: Add --debug optionto set debug variable
Adding --debug option as a way to setup debug variables.  Starting with
support for verbose, more will come.

It's possible to use it now with report command:
  $ perf --debug verbose   ...
  $ perf --debug verbose=2 ...

I'll need this support to add separated debug variable for ordered
events change in order to separate debug output out of standard verbose
stream.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20140717105500.GG516@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-07-17 12:58:59 -03:00
Alexander Yarygin
3be8e2a0a5 perf kvm: Add stat support on s390
On s390, the vmexit event has a tree-like structure: between
exit_event_begin and exit_event_end several other events may happen and
with each of them refining the previous ones.

This patch adds a decoder for such events to the generic code and also
the files <asm/kvm_perf.h> and kvm-stat.c for s390.

Commands 'perf kvm stat record', 'report' and 'live' are supported.

Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1404397747-20939-5-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-07-16 17:57:33 -03:00
Stanislav Fomichev
d243144af0 perf timechart: Add more options to IO mode
--io-skip-eagain - don't show EAGAIN errors
--io-min-time    - make small io bursts visible
--io-merge-dist  - merge adjacent events

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/1404835423-23098-5-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-07-10 00:22:54 +02:00
Stanislav Fomichev
b97b59b93d perf timechart: Implement IO mode
Currently, timechart records only scheduler and CPU events (task switches,
running times, CPU power states, etc); this commit adds IO mode which
makes it possible to record IO (disk, network) activity. In this mode
perf timechart will generate SVG with IO charts (writes, reads, tx, rx, polls).

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/1404835423-23098-3-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-07-10 00:22:54 +02:00
Stanislav Fomichev
e281a9606d perf trace: Add possibility to switch off syscall events
Currently, we may either trace syscalls or syscalls+pagefaults.

We'd like to be able to trace *only* pagefaults and this commit
implements this feature.

Example:

  [root@zoo /]# echo 1 > /proc/sys/vm/drop_caches ; trace --no-syscalls -F -p `pidof xchat`
       0.000 ( 0.000 ms): xchat/4574 majfault [g_unichar_get_script+0x11] => /usr/lib64/libglib-2.0.so.0.3800.2@0xc403b (x.)
       0.202 ( 0.000 ms): xchat/4574 majfault [_cairo_hash_table_lookup+0x53] => 0x2280ff0 (?.)
      20.854 ( 0.000 ms): xchat/4574 majfault [gdk_cairo_set_source_pixbuf+0x110] => /usr/bin/xchat@0x6da1f (x.)
    1022.000 ( 0.000 ms): xchat/4574 majfault [__memcpy_sse2_unaligned+0x29] => 0x7ff5a8ca0400 (?.)
  ^C[root@zoo /]#

Below we can see malloc calls, 'trace' reading symbol tables in libraries to
resolve symbols, etc.

  [root@zoo /]# echo 1 > /proc/sys/vm/drop_caches ; trace --no-syscalls -F all --cpu 1 sleep 10
       0.000 ( 0.000 ms): chrome/26589 minfault [0x1b53129] => /tmp/perf-26589.map@0x33cbcbf7f000 (x.)
      96.477 ( 0.000 ms): libvirtd/947 minfault [copy_user_enhanced_fast_string+0x5] => 0x7f7685bba000 (?k)
     113.164 ( 0.000 ms): Xorg/1063 minfault [0x786da] => 0x7fce52882a3c (?.)
    7162.801 ( 0.000 ms): chrome/3747 minfault [0x8e1a89] => 0xfcaefed0008 (?.)
<SNIP>
    7773.138 ( 0.000 ms): chrome/3886 minfault [0x8e1a89] => 0xfcb0ce28008 (?.)
    7992.022 ( 0.000 ms): chrome/26574 minfault [0x1b5a708] => 0x3de7b5fc5000 (?.)
    8108.949 ( 0.000 ms): qemu-system-x8/4537 majfault [_int_malloc+0xee] => 0x7faffc466d60 (?.)
    8108.975 ( 0.000 ms): qemu-system-x8/4537 minfault [_int_malloc+0x102] => 0x7faffc466d60 (?.)
<SNIP>
    8148.174 ( 0.000 ms): qemu-system-x8/4537 minfault [_int_malloc+0x102] => 0x7faffc4eb500 (?.)
    8270.855 ( 0.000 ms): chrome/26245 minfault [do_bo_emit_reloc+0xdb] => 0x45d092bc004 (?.)
    8270.869 ( 0.000 ms): chrome/26245 minfault [do_bo_emit_reloc+0x108] => 0x45d09150000 (?.)
no symbols found in /usr/lib64/libspice-server.so.1.9.0, maybe install a debug package?
    8273.831 ( 0.000 ms): trace/20198 majfault [__memcmp_sse4_1+0xbc6] => /usr/lib64/libspice-server.so.1.9.0@0xdf000 (d.)
<SNIP>
    8275.121 ( 0.000 ms): trace/20198 minfault [dso__load+0x38] => 0x14fe756 (?.)
no symbols found in /usr/lib64/libelf-0.158.so, maybe install a debug package?
    8275.142 ( 0.000 ms): trace/20198 minfault [__memcmp_sse4_1+0xbc6] => /usr/lib64/libelf-0.158.so@0x0 (d.)
<SNIP>
  [root@zoo /]#

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1403799268-1367-6-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-06-26 17:48:07 -03:00
Stanislav Fomichev
598d02c5a0 perf trace: Add support for pagefault tracing
This patch adds optional pagefault tracing support to 'perf trace'.

Using -F/--pf option user can specify whether he wants minor, major or
all pagefault events to be traced. This patch adds only live mode,
record and replace will come in a separate patch.

Example output:

  1756272.905 ( 0.000 ms): curl/5937 majfault [0x7fa7261978b6] => /usr/lib/x86_64-linux-gnu/libkrb5.so.26.0.0@0x85288 (d.)
  1862866.036 ( 0.000 ms): wget/8460 majfault [__clear_user+0x3f] => 0x659cb4 (?k)

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1403799268-1367-3-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-06-26 16:07:43 -03:00
Davidlohr Bueso
b6f0629a94 perf bench: Add --repeat option
There are a number of benchmarks that do single runs and as a result
does not really help users gain a general idea of how the workload
performs. So the user must either manually do multiple runs or just use
single bogus results.

This option will enable users to specify the amount of runs (arbitrarily
defaulted to 10, to use the existing benchmarks default) through the
'--repeat' option.  Add it to perf-bench instead of implementing it
always in each specific benchmark.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1402942467-10671-2-git-send-email-davidlohr@hp.com
[ Kept the existing default of 10, changing it to something else should
  be done on separate patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-06-19 16:13:15 -03:00
Ingo Molnar
7184062b94 perf/core improvements:
User visible:
 
 . Improve 'perf probe' error messages, moving some diagnostic messages to
   only appear in --verbose mode and fixing up some error reporting related
   to variables and struct members. (Masami Hiramatsu)
 
 . Reflow 'perf timechart' man page. (Stanislav Fomichev)
 
 Developer stuff:
 
 . Be more precise when reporting missing libraries in a static tool build.
   (Arnaldo Carvalho de Melo)
 
 . Show error messages from the multiple make invoked from 'make build-test'.
   (Arnaldo Carvalho de Melo)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJTlxjYAAoJENZQFvNTUqpA5b4P/1Qs0S/HqAsVCqQe9143IxNS
 HY0NrhBGm05rbYga+Bvp6lp9xXf3F9hp7i3rFANgB68sHLEmi8DU9T5vmvrq9TIU
 +KT102re7eA/93rVQ+cvBqaosQVh8ia7O2tnr+FEhyBCNOIwTqtUI4g+9/IJB3h9
 0xxsYLR2SZtV9aSZKXdSjOZ0wh8l0D1VjuCQd5wqYvqQ8r+1nOImKX3Y02Byftns
 ZH/MkYtkmUbdFMdenRN2lJenDnIPji9AESPnTcZbXS23IIgnpOicgtRcrt9LVK4Y
 Ty+ooLXmf57uXkoFpM4DMybuyUGH3xw44TB0PqZuBJ1Psgdm5SzdJfLshUKptLFc
 XvxN8yaWSvOz2Bu/tS17o+PzXYdgk3Ar8UCWSYtkFDmfbaZC6RYzMfgHZnYsVlrf
 ZjcIviBqkbHpTFkV3PJZi6PnvKCiNUj2rA5rv9ltc2XPMgHEGhqT7lxGgh0iGd/O
 c8Wt/TjB6CRuMqk6N4Epb/yIIYbL01Ax3GdR1yw4exG7W75hLz+BBrT7P51Ivdg2
 Ke2ysjpbARamBY3XOxCqA3zfWlhHdH1PrBexEkEa1+4ALk0W8TtEhkNgw+ZEiT9H
 HbWXi9KwrNff0RAgzx2o9XiwO8iG/wLgO5AU0CNY9L2s7gosxE8BnSoPnvdVqhvl
 lt/m+f8SKYavUlHNxvC3
 =37tZ
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements from Arnaldo Carvalho de Melo:

User visible:

  * Improve 'perf probe' error messages, moving some diagnostic messages to
    only appear in --verbose mode and fixing up some error reporting related
    to variables and struct members. (Masami Hiramatsu)

  * Reflow 'perf timechart' man page. (Stanislav Fomichev)

Developer stuff:

  * Be more precise when reporting missing libraries in a static tool build.
    (Arnaldo Carvalho de Melo)

  * Show error messages from the multiple make invoked from 'make build-test'.
    (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-12 13:54:42 +02:00
Stanislav Fomichev
f48e00cead perf timechart: Reflow documentation
Move options away from examples.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Link: http://lkml.kernel.org/r/20140610095216.GO26511@stfomichev-desktop.yandex.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-06-10 10:03:18 -03:00
Don Zickus
9b32ba71ba perf tools: Add dcacheline sort
In perf's 'mem-mode', one can get access to a whole bunch of details specific to a
particular sample instruction.  A bunch of those details relate to the data
address.

One interesting thing you can do with data addresses is to convert them into a unique
cacheline they belong too.  Organizing these data cachelines into similar groups and sorting
them can reveal cache contention.

This patch creates an alogorithm based on various sample details that can help group
entries together into data cachelines and allows 'perf report' to sort on it.

The algorithm relies on having proper mmap2 support in the kernel to help determine
if the memory map the data address belongs to is private to a pid or globally shared.

The alogortithm is as follows:

o group cpumodes together
o group entries with discovered maps together
o sort on major, minor, inode and inode generation numbers
o if userspace anon, then sort on pid
o sort on cachelines based on data addresses

The 'dcacheline' sort option in 'perf report' only works in 'mem-mode'.

Sample output:

 #
 # Samples: 206  of event 'cpu/mem-loads/pp'
 # Total weight : 2534
 # Sort order   : dcacheline,pid
 #
 # Overhead       Samples                                                          Data Cacheline       Command:  Pid
 # ........  ............  ......................................................................  ..................
 #
    13.22%             1  [k] 0xffff88042f08ebc0                                                       swapper:    0
     9.27%             1  [k] 0xffff88082e8cea80                                                       swapper:    0
     3.59%             2  [k] 0xffffffff819ba180                                                       swapper:    0
     0.32%             1  [k] arch_trigger_all_cpu_backtrace_handler_na.23901+0xffffffffffffffe0       swapper:    0
     0.32%             1  [k] timekeeper_seq+0xfffffffffffffff8                                        swapper:    0

Note:  Added a '+1' to symlen size in hists__calc_col_len to prevent the next column
from prematurely tabbing over and mis-aligning.  Not sure what the problem is.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1401208087-181977-8-git-send-email-dzickus@redhat.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-09 13:34:49 +02:00
Don Zickus
75e906c960 perf report: Add mem-mode documentation to report command
Add mem-mode sorting types and mem-mode itself to perf-report documentation.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/r/1400526833-141779-5-git-send-email-dzickus@redhat.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-09 13:34:47 +02:00
Anshuman Khandual
3e39db4ae2 perf/documentation: Add description for conditional branch filter
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mpe@ellerman.id.au
Cc: benh@kernel.crashing.org
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1400743210-32289-4-git-send-email-khandual@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 12:30:27 +02:00
Namhyung Kim
1432ec342e perf top: Add --children option
The --children option is for showing accumulated overhead (period)
value as well as self overhead.  It should be used with one of -g or
--call-graph option.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-21-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:35:07 +02:00
Namhyung Kim
793aaaabb7 perf report: Add --children option
The --children option is for showing accumulated overhead (period)
value as well as self overhead.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arun Sharma <asharma@fb.com>
Tested-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1401335910-16832-16-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-06-01 14:35:04 +02:00
Namhyung Kim
6fe8c26d7a perf top: Add --fields option to specify output fields
The --fields option is to allow user setup output field in any order.
It can receive any sort keys and following (hpp) fields:

  overhead, overhead_sys, overhead_us, sample and period

If guest profiling is enabled, overhead_guest_{sys,us} will be
available too.

More more information, please see previous patch "perf report:
Add -F option to specify output fields"

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-15-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:36 +02:00
Namhyung Kim
a7d945bc91 perf report: Add -F option to specify output fields
The -F/--fields option is to allow user setup output field in any
order.  It can receive any sort keys and following (hpp) fields:

  overhead, overhead_sys, overhead_us, sample and period

If guest profiling is enabled, overhead_guest_{sys,us} will be
available too.

The output fields also affect sort order unless you give -s/--sort
option.  And any keys specified on -s option, will also be added to
the output field list automatically.

  $ perf report -F sym,sample,overhead
  ...
  #                     Symbol       Samples  Overhead
  # ..........................  ............  ........
  #
    [.] __cxa_atexit                       2     2.50%
    [.] __libc_csu_init                    4     5.00%
    [.] __new_exitfn                       3     3.75%
    [.] _dl_check_map_versions             1     1.25%
    [.] _dl_name_match_p                   4     5.00%
    [.] _dl_setup_hash                     1     1.25%
    [.] _dl_sysdep_start                   1     1.25%
    [.] _init                              5     6.25%
    [.] _setjmp                            6     7.50%
    [.] a                                  8    10.00%
    [.] b                                  8    10.00%
    [.] brk                                1     1.25%
    [.] c                                  8    10.00%

Note that, the example output above is captured after applying next
patch which fixes sort/comparing behavior.

Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-12-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:35 +02:00
Namhyung Kim
a2ce067e55 perf tools: Allow hpp fields to be sort keys
Add overhead{,_sys,_us,_guest_sys,_guest_us}, sample and period sort
keys so that they can be selected with --sort/-s option.

  $ perf report -s period,comm --stdio
  ...
  # Overhead        Period          Command
  # ........  ............  ...............
  #
      47.06%           152          swapper
      13.93%            45  qemu-system-arm
      12.38%            40         synergys
       3.72%            12          firefox
       2.48%             8            xchat

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1400480762-22852-9-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2014-05-21 11:45:34 +02:00
Namhyung Kim
8810f6ced7 perf diff: Add --percentage option
The --percentage option is for controlling overhead percentage
displayed.  It can only receive either of "relative" or "absolute" and
affects -c delta output only.

For more information, please see previous commit same thing done to
"perf report".

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-5-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:03 +02:00
Namhyung Kim
33db4568e1 perf top: Add --percentage option
The --percentage option is for controlling overhead percentage
displayed.  It can only receive either of "relative" or "absolute".
Move the parser callback function into a common location since it's
used by multiple commands now.

For more information, please see previous commit same thing done to
"perf report".

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:03 +02:00
Namhyung Kim
f214833054 perf report: Add --percentage option
The --percentage option is for controlling overhead percentage
displayed.  It can only receive either of "relative" or "absolute".

"relative" means it's relative to filtered entries only so that the
sum of shown entries will be always 100%.  "absolute" means it retains
the original value before and after the filter is applied.

  $ perf report -s comm
  # Overhead       Command
  # ........  ............
  #
      74.19%           cc1
       7.61%           gcc
       6.11%            as
       4.35%            sh
       4.14%          make
       1.13%        fixdep
  ...

  $ perf report -s comm -c cc1,gcc --percentage absolute
  # Overhead       Command
  # ........  ............
  #
      74.19%           cc1
       7.61%           gcc

  $ perf report -s comm -c cc1,gcc --percentage relative
  # Overhead       Command
  # ........  ............
  #
      90.69%           cc1
       9.31%           gcc

Note that it has zero effect if no filter was applied.

Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-16 17:16:03 +02:00
Ramkumar Ramachandra
95a2b3c0a9 perf bench: Update manpage to mention numa and futex
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1395964219-22173-3-git-send-email-artagnon@gmail.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-14 12:55:41 +02:00
Namhyung Kim
5e09714b0e perf top: Fix documentation of invalid -s option
On perf top, the -s option is used for --sort, but the man page
contains invalid documentation of -s option for --sym-annotate.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1395193578-27098-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
2014-04-14 12:54:59 +02:00
Andi Kleen
5b4398209d perf probe: Clarify x86 register naming for perf probe
Clarify how to specify x86 registers in perf probe. I recently ran into
this problem and had to figure it out from the source.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/1393596135-4227-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-03-14 11:20:44 -03:00
Andi Kleen
b639409704 perf mem: Clarify load-latency in documentation
Clarify in the documentation that 'perf mem report' reports use-latency,
not load/store-latency on Intel systems.

This often causes confusion with users.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1393596135-4227-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-03-14 11:20:44 -03:00
Arnaldo Carvalho de Melo
a6205a35ba perf record: Rename --initial-delay to --delay
To be consistent with the equivalent option in 'stat', also, for the
same reason, use -D as the one letter alias.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-p5yjnopajb3a8x0xha7yl5w8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-01-14 17:58:12 -03:00
Arnaldo Carvalho de Melo
509051ea84 perf record: Rename --no-delay to --no-buffering
That is how the option summary describes it and so that we can free
--delay to replace --initial-delay and then be consistent with stat's
--delay equivalent option.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-f8hd2010uhjl2zzb34hepbmi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-01-14 17:57:04 -03:00
Andi Kleen
6619a53ef7 perf record: Add --initial-delay option
perf stat has a --delay option to delay measuring the workload.

This is useful to skip measuring the startup phase of the program, which
is often very different from the main workload.

The same is useful for perf record when sampling.

--no-delay was already taken, so add a --initial-delay
to perf record too.
-D was already taken for record, so there is only a long option.

v2: Don't disable group members (Namhyung Kim)
v3: port to latest perf/core
    rename to --initial-delay to avoid conflict with --no-delay

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1389476307-2124-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-01-13 10:07:03 -03:00
Andi Kleen
8f3dd2b096 perf stat: Fix --delay option in man page
The --delay option was documented as --initial-delay in the manpage. Fix this.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1389132847-31982-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2014-01-13 10:06:24 -03:00
Stanislav Fomichev
e57a2dffbc perf timechart: Add --highlight option
This option highlights tasks (using different color) that run more than
given duration or tasks with given name.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Link: http://lkml.kernel.org/r/20131217155349.GA13021@stfomichev-desktop
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-17 16:33:55 -03:00
Stanislav Fomichev
c507999790 perf timechart: Add support for topology
Add -t switch to sort CPUs topologically.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Link: http://lkml.kernel.org/r/1385995056-20158-5-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-16 16:34:53 -03:00
Jiri Olsa
e90debddf8 perf script: Add --header/--header-only options
Currently the perf.data header is always displayed for stdio output,
which is no always useful.

Disabling header information by default and adding following options to
control header output:

  --header      - display header information
  --header-only - display header information only w/o further
                  processing

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-0ehaawv5xc83w6ag03c5hi10@git.kernel.org
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386583370-1699-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-10 16:51:07 -03:00
Jiri Olsa
5cfe2c82f3 perf report: Add --header/--header-only options
Currently the perf.data header is always displayed for stdio output,
which is no always useful.

Disabling header information by default and adding following options to
control header output:

  --header      - display header information (old default)
  --header-only - display header information only w/o further
                  processing, forces stdio output

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386583370-1699-2-git-send-email-jolsa@redhat.com
[ Added single line explaining talking about the new --header* options,
  to address David Ahern comment; better man page entry for the new options,
  from Namhyung Kim ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-10 16:49:02 -03:00
Dongsheng Yang
f113bee019 perf archive: Remove duplicated 'runs' in man page
Two 'runs' here breaks the sentence in Description of 'perf archive'
command.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/78a15a9f4f500b6074a1e25917d6e8251f894628.1386629050.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-09 15:21:45 -03:00
Dongsheng Yang
100b907350 perf kvm: Introduce option -v for perf kvm command.
As there is no -v option for perf kvm, the all debug message for perf
kvm will nerver be printed out to user.

Example:
	# perf kvm --guestmount /tmp/guestmount/ record -a
	Not enough memory for reading perf file header

It is confusing message for newbies such as me. With this patch applied,
we can use -v option to get the detail.

Example:
	# perf kvm --guestmount /tmp/guestmount/ record -a -v
	Can't access file /tmp/guestmount//15069/proc/kallsyms
	Not enough memory for reading perf file header

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Cc: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1386609311-23889-1-git-send-email-yangds.fnst@cn.fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-09 15:19:11 -03:00
Adrian Hunter
cc8fae1d81 perf script: Add an option to print the source line number
Add field 'srcline' that displays the source file name and line number
associated with the sample ip.  The information displayed is the same as
from addr2line.

 $ perf script -f comm,tid,pid,time,ip,sym,dso,symoff,srcline
            grep 10701/10701 2497321.421013:  ffffffff81043ffa native_write_msr_safe+0xa ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/arch/x86/include/asm/msr.h:95
            grep 10701/10701 2497321.421984:  ffffffff8165b6b3 _raw_spin_lock+0x13 ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/arch/x86/include/asm/spinlock.h:54
            grep 10701/10701 2497321.421990:  ffffffff810b64b3 tick_sched_timer+0x53 ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/kernel/time/tick-sched.c:840
            grep 10701/10701 2497321.421992:  ffffffff8106f63f run_timer_softirq+0x2f ([kernel.kallsyms])
  /usr/src/debug/kernel-3.9.fc17/linux-3.9.10-100.fc17.x86_64/kernel/timer.c:1372

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1386315778-11633-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-09 14:47:15 -03:00
Dongsheng Yang
8df0b4ad58 perf kvm: Update the 'record' man page entry for new --guest/--host behavior
As we have changed the default behavior of 'perf kvm' to --guest
enabled, the parts of the man page that covers the 'record' subcommand
are outdated.

This patch updates it to show the correct output with
--host/--guest/neither/both of them.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/3a3a9c1e05acb5a274d1d8369db5a4c6467d6276.1386197481.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-04 15:06:08 -03:00
Dongsheng Yang
316bd98a9a perf kvm: Fix spurious '=' use in man page
As option --host and --guest request no input for it, there should not
be a '=' after them in the man page sources.

And --output expects a filename as the input, so there should be a '='
after it.

This patch removes the needless '=' after --guest and --host, and adds a
'=' after --output in perf-kvm.txt.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/6124d9eb10a3f1f6b399d1db660110bc7a60fd6b.1386197481.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-04 15:04:44 -03:00
Dongsheng Yang
ed086d5b8a perf kvm: Add more detail about buildid-list in man page
As the buildid is read from /sys/kernel/notes, then if we use perf kvm
buildid-list with a perf data file captured by perf kvm record with
--guestkallsyms and --guestmodules, there is no result in output.

This patch add a explanation about it and add a limit of using perf kvm
buildid-list.

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/d605a805486340b53bc261aa64d7632ad0a8cf53.1386197481.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-12-04 15:02:47 -03:00
Namhyung Kim
ba1ddf42f3 perf script: Print mmap[2] events also
If --show-mmap-events option is given, also print internal MMAP and
MMAP2 events.  It would be helpful for debugging.

  $ perf script --show-mmap-events
  ...
           sleep  9486 [009] 3350640.335531: PERF_RECORD_MMAP 9486/9486: [0x400000(0x6000) @ 0]: x /usr/bin/sleep
           sleep  9486 [009] 3350640.335542: PERF_RECORD_MMAP 9486/9486: [0x3153a00000(0x223000) @ 0]: x /usr/lib64/ld-2.17.so
           sleep  9486 [009] 3350640.335553: PERF_RECORD_MMAP 9486/9486: [0x7fff8b5fe000(0x2000) @ 0x7fff8b5fe000]: x [vdso]
           sleep  9486 [009] 3350640.335643: PERF_RECORD_MMAP 9486/9486: [0x3153e00000(0x3c0000) @ 0]: x /usr/lib64/libc-2.17.so

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1385456066-26592-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:38 -03:00
Namhyung Kim
ad7ebb9a48 perf script: Print comm, fork and exit events also
If --show-task-events option is given, also print internal COMM, FORK
and EXIT events.  It would be helpful for debugging.

  $ perf script --show-task-events
  ...
         swapper     0 [009] 3350640.335261: sched:sched_switch: prev_comm=swapper/9
           sleep  9486 [009] 3350640.335509: PERF_RECORD_COMM: sleep:9486
           sleep  9486 [009] 3350640.335806: sched:sched_stat_runtime: comm=sleep pid=9486
         firefox  2635 [003] 3350641.275896: PERF_RECORD_FORK(2635:9487):(2635:2635)
         firefox  2635 [003] 3350641.275896: sched:sched_process_fork: comm=firefox pid=2635
           sleep  9486 [009] 3350641.336009: PERF_RECORD_EXIT(9486:9486):(9486:9486)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1385455873-25865-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:38 -03:00
Stanislav Fomichev
6f8d67fa0c perf timechart: Add backtrace support
Add -g flag to `perf timechart record` which saves callchain info in the
perf.data.

When generating SVG, add backtrace information to the figure details, so
now it's possible to see which code path woke up the task and why some
task went to sleep.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-8-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:37 -03:00
Stanislav Fomichev
367b3152d7 perf timechart: Add support for -P and -T in timechart recording
If we don't want either power or task events we may use -T or -P with
the `perf timechart record` command to filter out events while recording
to keep perf.data small.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-7-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:37 -03:00
Stanislav Fomichev
c87097d39d perf timechart: Add support for displaying only tasks related data
In order to make SVG smaller and faster to browse add possibility to
switch off power related information with -T switch.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-5-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:37 -03:00
Stanislav Fomichev
54874e3236 perf timechart: Add option to limit number of tasks
Add -n option to specify min. number of tasks to print.

Signed-off-by: Stanislav Fomichev <stfomichev@yandex-team.ru>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383323151-19810-3-git-send-email-stfomichev@yandex-team.ru
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:36 -03:00
Adrian Hunter
69e7e5b02b perf record: Default -t option to no inheritance
The change to per-cpu mmaps causes the -p, -t and -u options now to have
inheritance enabled by default.  Change that back to no inheritance but
for the -t option only.

Requested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384768557-23331-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:36 -03:00
Adrian Hunter
3aa5939d71 perf record: Make per-cpu mmaps the default.
This affects the -p, -t and -u options that previously defaulted to
per-thread mmaps.

Consequently add an option to select per-thread mmaps to support the old
behaviour.

Note that per-thread can be used with a workload-only (i.e. none of -p,
-t, -u, -a or -C is selected) to get a per-thread mmap with no
inheritance.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/5286271D.3020808@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:36 -03:00
David Ahern
bf80669e4f perf top: Make -g refer to callchains
In most commands -g is used for callchains. Make perf-top follow suit.
Move group to just --group with no short cut making it similar to
perf-record.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1384487490-6865-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-27 14:58:35 -03:00
Adrian Hunter
539e6bb71e perf record: Add an option to force per-cpu mmaps
By default, when tasks are specified (i.e. -p, -t or -u options)
per-thread mmaps are created.

Add an option to override that and force per-cpu mmaps.

Further comments by peterz:

So this option allows -t/-p/-u to create one buffer per cpu and attach
all the various thread/process/user tasks' their counters to that one
buffer?

As opposed to the current state where each such counter would have its
own buffer.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383313899-15987-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-14 16:10:27 -03:00
David Ahern
fd2eabaf16 perf trace: Add summary only option
Per request from Pekka make --summary a summary only option meaning do
not show the individual system calls. Add another option to see all
syscalls along with the summary. In addition use 's' and 'S' as
shortcuts for the options.

Requested-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Pekka Enberg <penberg@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1384273875-3751-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 16:24:38 -03:00
Ingo Molnar
aac898548d Merge branch 'perf/urgent' into perf/core
Conflicts:
	tools/perf/builtin-record.c
	tools/perf/builtin-top.c
	tools/perf/util/hist.h
2013-10-29 11:23:32 +01:00
Jiri Olsa
ae779a6309 perf top: Split -G and --call-graph
Splitting -G and --call-graph for record command, so we could use '-G'
with no option.

The '-G' option now takes NO argument and enables the configured unwind
method, which is currently the frame pointers method.

It will be possible to configure unwind method via config file in
upcoming patches.

All current '-G' arguments is overtaken by --call-graph option.

NOTE: The documentation for top --call-graph option
      was wrongly copied from report command.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1382797536-32303-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-28 16:06:00 -03:00
Jiri Olsa
09b0fd45ff perf record: Split -g and --call-graph
Splitting -g and --call-graph for record command, so we could use '-g'
with no option.

The '-g' option now takes NO argument and enables the configured unwind
method, which is currently the frame pointers method.

It will be possible to configure unwind method via config file in
upcoming patches.

All current '-g' arguments is overtaken by --call-graph option.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1382797536-32303-2-git-send-email-jolsa@redhat.com
[ reordered -g/--call-graph on --help and expanded the man page
  according to comments by David Ahern and Namhyung Kim ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-28 16:05:59 -03:00
Waiman Long
5dbb6e81d8 perf top: Add --max-stack option to limit callchain stack scan
When the callgraph function is enabled (-G), it may take a long time to
scan all the stack data and merge them accordingly.

This patch adds a new --max-stack option to perf-top to limit the depth
of callchain stack data to look at to reduce the time it takes for
perf-top to finish its processing. It reduces the amount of information
provided to the user in exchange for faster speed.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382107129-2010-5-git-send-email-Waiman.Long@hp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-21 17:36:25 -03:00
Waiman Long
91e9561742 perf report: Add --max-stack option to limit callchain stack scan
When callgraph data was included in the perf data file, it may take a
long time to scan all those data and merge them together especially if
the stored callchains are long and the perf data file itself is large,
like a Gbyte or so.

The callchain stack is currently limited to PERF_MAX_STACK_DEPTH (127).
This is a large value. Usually the callgraph data that developers are
most interested in are the first few levels, the rests are usually not
looked at.

This patch adds a new --max-stack option to perf-report to limit the
depth of callchain stack data to look at to reduce the time it takes for
perf-report to finish its processing. It trades the presence of trailing
stack information with faster speed.

The following table shows the elapsed time of doing perf-report on a
perf.data file of size 985,531,828 bytes.

  --max_stack   Elapsed Time    Output data size
  -----------   ------------    ----------------
  not set        88.0s          124,422,651
  64             87.5s          116,303,213
  32             87.2s          112,023,804
  16             86.6s           94,326,380
  8              59.9s           33,697,248
  4              40.7s           10,116,637
  -g none        27.1s            2,555,810

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1382107129-2010-4-git-send-email-Waiman.Long@hp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-21 17:36:25 -03:00
Arnaldo Carvalho de Melo
c522739d72 perf trace: Use vfs_getname hook if available
Initially it tries to find a probe:vfs_getname that should be setup
with:

 perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'

or with slight changes to cope with code flux in the getname_flags code.

In the future, if a "vfs:getname" tracepoint becomes available, then it
will be preferred.

This is not strictly required and more expensive method of reading the
/proc/pid/fd/ symlink will be used when the fd->path array entry is not
populated by a previous vfs_getname + open syscall ret sequence.

As with any other 'perf probe' probe the setup must be done just once
and the probe will be left inactive, waiting for users, be it 'perf
trace' of any other tool.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ujg8se8glq5izmu8cdkq15po@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-16 11:18:24 -03:00
Adrian Hunter
fc1b691d76 perf buildid-cache: Add ability to add kcore to the cache
kcore can be used to view the running kernel object code.  However,
kcore changes as modules are loaded and unloaded, and when the kernel
decides to modify its own code.  Consequently it is useful to create a
copy of kcore at a particular time.  Unlike vmlinux, kcore is not unique
for a given build-id.  And in addition, the kallsyms and modules files
are also needed.  The tool therefore creates a directory:

	~/.debug/[kernel.kcore]/<build-id>/<YYYYmmddHHMMSShh>

which contains: kcore, kallsyms and modules.

Note that the copied kcore contains only code sections.  See the
kcore_copy() function for how that is determined.

The tool will not make additional copies of kcore if there is already
one with the same modules at the same addresses.

Currently, perf tools will not look for kcore in the cache.  That is
addressed in another patch.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/525BF849.5030405@intel.com
[ renamed 'index' to 'idx' to avoid shadowing string.h symbol in f12,
  use at least one member initializer when initializing a struct to
  zeros, also to fix the build on f12 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-14 12:20:38 -03:00
David Ahern
bf2575c121 perf trace: Add summary option to dump syscall statistics
When enabled dumps a summary of all syscalls by task with the usual
statistics -- min, max, average and relative stddev. For example,

make - 26341 :       3344   [ 17.4% ]      0.000 ms

                read :   52    0.000     4.802     0.644   30.08
               write :   20    0.004     0.036     0.010   21.72
                open :   24    0.003     0.046     0.014   23.68
               close :   64    0.002     0.055     0.008   22.53
                stat : 2714    0.002     0.222     0.004    4.47
               fstat :   18    0.001     0.041     0.006   46.26
                mmap :   30    0.003     0.009     0.006    5.71
            mprotect :    8    0.006     0.039     0.016   32.16
              munmap :   12    0.007     0.077     0.020   38.25
                 brk :   48    0.002     0.014     0.004   10.18
        rt_sigaction :   18    0.002     0.002     0.002    2.11
      rt_sigprocmask :   60    0.002     0.128     0.010   32.88
              access :    2    0.006     0.006     0.006    0.00
                pipe :   12    0.004     0.048     0.013   35.98
               vfork :   34    0.448     0.980     0.692    3.04
              execve :   20    0.000     0.387     0.046   56.66
               wait4 :   34    0.017  9923.287   593.221   68.45
               fcntl :    8    0.001     0.041     0.013   48.79
            getdents :   48    0.002     0.079     0.013   19.62
              getcwd :    2    0.005     0.005     0.005    0.00
               chdir :    2    0.070     0.070     0.070    0.00
           getrlimit :    2    0.045     0.045     0.045    0.00
          arch_prctl :    2    0.002     0.002     0.002    0.00
           setrlimit :    2    0.002     0.002     0.002    0.00
              openat :   94    0.003     0.005     0.003    2.11

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1381289214-24885-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-14 10:28:50 -03:00
Ramkumar Ramachandra
d366c53e1d perf timechart: Add example in the documentation
While at it, update the synopsis to show both forms.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Link: http://lkml.kernel.org/r/1380791716-10325-1-git-send-email-artagnon@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-11 12:18:13 -03:00
Ingo Molnar
8a5411e9a3 perf tools: Implement summary output for 'make install'
'make install' used to show all the install lines, which is way too
verbose to be really informative to the user.

Implement summary output instead:

  comet:~/tip/tools/perf> make install
    BUILD:   Doing 'make -j12' parallel build
    SUBDIR   Documentation
    INSTALL  Documentation-man
    INSTALL  binaries
    INSTALL  libexec
    INSTALL  perf-archive
    INSTALL  perl-scripts
    INSTALL  python-scripts
    INSTALL  bash_completion-script
    INSTALL  tests

'make install V=1' will still show the old, detailed output.

Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1381312169-17354-5-git-send-email-mingo@kernel.org
[ Fixed conflict with libperf-gtk patches in acme/perf/core, cope with 'trace' alias ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-11 12:18:11 -03:00
Ingo Molnar
65fb09922d tools: Harmonize the various build messages in perf, lib-traceevent, lib-lk
The various build lines from libtraceevent and perf mix up during a
parallel build and produce unaligned output like:

    CC builtin-buildid-list.o
    CC builtin-buildid-cache.o
    CC builtin-list.o
  CC FPIC            trace-seq.o
    CC builtin-record.o
  CC FPIC            parse-filter.o
    CC builtin-report.o
    CC builtin-stat.o
  CC FPIC            parse-utils.o
  CC FPIC            kbuffer-parse.o
    CC builtin-timechart.o
    CC builtin-top.o
    CC builtin-script.o
  BUILD STATIC LIB   libtraceevent.a
    CC builtin-probe.o
    CC builtin-kmem.o
    CC builtin-lock.o

To solve this, harmonize all the build message alignments to be similar
to the kernel's kbuild output: prefixed by two spaces and 11-char wide.

After the patch the output looks pretty tidy, even if output lines get
mixed up:

  CC      builtin-annotate.o
  FLAGS:  * new build flags or cross compiler
  CC      builtin-bench.o
  AR      liblk.a
  CC      bench/sched-messaging.o
  CC FPIC event-parse.o
  CC      bench/sched-pipe.o
  CC FPIC trace-seq.o
  CC      bench/mem-memcpy.o
  CC      bench/mem-memset.o
  CC FPIC parse-filter.o
  CC      builtin-diff.o
  CC      builtin-evlist.o
  CC      builtin-help.o

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381312169-17354-3-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-11 12:18:08 -03:00
Ingo Molnar
8ec19c0eba perf tools: Implement summary output for 'make clean'
'make clean' used to show all the rm lines, which isn't really
informative in any way and spams the console.

Implement summary output:

  comet:~/tip/tools/perf> make clean
   CLEAN libtraceevent
   CLEAN liblk
   CLEAN config
   CLEAN core-objs
   CLEAN core-progs
   CLEAN core-gen
   CLEAN Documentation
   CLEAN python

'make clean V=1' will still show the old, detailed output.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381312169-17354-2-git-send-email-mingo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-11 12:18:06 -03:00
David Ahern
5e2485b1a2 perf trace: Add record option
The record option is a convience alias to include the -e raw_syscalls:*
argument to perf-record. All other options are passed to perf-record's
handler. Resulting data file can be analyzed by perf-trace -i.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1380395584-9025-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-11 12:17:49 -03:00
Jiri Olsa
27050f530d perf tools: Add possibility to specify mmap size
Adding possibility to specify mmap size via -m/--mmap-pages
by appending unit size character (B/K/M/G) to the
number, like:
  $ perf record -m 8K ls
  $ perf record -m 2M ls

The size is rounded up appropriately to follow perf
mmap restrictions.

If no unit is specified the number provides pages as
of now, like:
  $ perf record -m 8 ls

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1378031796-17892-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09 11:24:20 -03:00
Davidlohr Bueso
f37376cd72 perf lock: Account for lock average wait time
While perf-lock currently reports both the total wait time and the
number of contentions, it doesn't explicitly show the average wait time.
Having this value immediately in the report can be quite useful when
looking into performance issues.

Furthermore, allowing report to sort by averages is another handy
feature to have - and thus do not only print the value, but add it to
the lock_stat structure.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1378693159-8747-8-git-send-email-davidlohr@hp.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09 11:24:01 -03:00
Arnaldo Carvalho de Melo
50c95cbd70 perf trace: Add option to show process COMM
Enabled by default, disable with --no-comm, e.g.:

 181.821 (0.001 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: PEEK|TRUNC|CMSG_CLOEXEC ) = 20
 181.824 (0.001 ms): deja-dup-monit/10784 geteuid(                                                           ) = 1000
 181.825 (0.001 ms): deja-dup-monit/10784 getegid(                                                           ) = 1000
 181.834 (0.002 ms): deja-dup-monit/10784 recvmsg(fd: 8, msg: 0x7fff4342baf0, flags: CMSG_CLOEXEC            ) = 20
 181.836 (0.001 ms): deja-dup-monit/10784 geteuid(                                                           ) = 1000
 181.838 (0.001 ms): deja-dup-monit/10784 getegid(                                                           ) = 1000
 181.705 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK|TRUNC|CMSG_CLOEXEC) = 1256
 181.710 (0.002 ms): evolution-addr/10924 geteuid(                                                           ) = 1000
 181.712 (0.001 ms): evolution-addr/10924 getegid(                                                           ) = 1000
 181.727 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC           ) = 1256
 181.731 (0.001 ms): evolution-addr/10924 geteuid(                                                           ) = 1000
 181.734 (0.001 ms): evolution-addr/10924 getegid(                                                           ) = 1000
 181.908 (0.002 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: PEEK|TRUNC|CMSG_CLOEXEC) = 20
 181.913 (0.001 ms): evolution-addr/10924 geteuid(                                                           ) = 1000
 181.915 (0.001 ms): evolution-addr/10924 getegid(                                                           ) = 1000
 181.930 (0.003 ms): evolution-addr/10924 recvmsg(fd: 10, msg: 0x7fff17dc6990, flags: CMSG_CLOEXEC           ) = 20
 181.934 (0.001 ms): evolution-addr/10924 geteuid(                                                           ) = 1000
 181.937 (0.001 ms): evolution-addr/10924 getegid(                                                           ) = 1000
 220.718 (0.010 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL                ) = 200
 220.741 (0.000 ms): dbus-daemon/10711  ... [continued]: epoll_wait()) = 1
 220.759 (0.004 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC              ) = 200
 220.780 (0.002 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC              ) = 200
 220.788 (0.001 ms): dbus-daemon/10711 recvmsg(fd: 11, msg: 0x7ffff94594d0, flags: CMSG_CLOEXEC              ) = -1 EAGAIN Resource temporarily unavailable
 220.760 (0.004 ms): at-spi2-regist/10715 sendmsg(fd: 3, msg: 0x7fffdb8756c0, flags: NOSIGNAL                ) = 200
 220.771 (0.023 ms): perf/26347 open(filename: 0xf2e780, mode: 15918976                               ) = 19
 220.850 (0.002 ms): perf/26347 close(fd: 19                                                          ) = 0

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6be5jvnkdzjptdrebfn5263n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09 11:11:33 -03:00
David Ahern
4bb09192d3 perf trace: Add option to show full timestamp
Current timestamp shown for output is time relative to firt sample. This
patch adds an option to show the absolute perf_clock timestamp which is
useful when comparing output across commands (e.g., perf-trace to
perf-script).

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1378319865-55695-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-10-09 11:10:49 -03:00
Ingo Molnar
31f6be65e0 tools/perf: Fix double/triple-build of the feature detection logic during 'make install' et al
Linus reported the following perf build system bug:

  'Another annoyance during that make was that "make install" seems to
   want to re-make the thing I just built. That's absolutely horrible, [...]'

The thing that got re-built were 'only' the (numerous) feature checks,
not the whole project - but still it was mighty annoying as the feature
checks took 9+ seconds even on reasonably fast boxes.

Even with the autodep patches where feature detection is much faster
it wastes resources, wastes screen real estate and confuses users if
we execute feature detection twice.

There were two sources for these unnecessary re-builds of the feature
checks:

 - Unnecessary nested invocations of $(MAKE), apparently to be able
   to do conditional compilation dependent on documentation tools
   presence. Use straight dependencies instead, with no nesting.

 - A direct invocation of $(MAKE) to rebuild the PERF-VERSION-FILE.
   This is apparently done to be able to include it into the
   Makefile:

    -include $(OUTPUT)PERF-VERSION-FILE

   but that's entirely pointless for two reasons: 1) the version file
   gets regenerated by the initial build pass anyway, 2) including it
   is futile, given its contents:

    #define PERF_VERSION "3.12.rc3.g8510c7"

   'make' will interpret that as a comment line...

   So just remove this part of the doc-generation logic.

With these things fixed a 'make install' now rebuilds only what is needed.

A repeated 'make install' on an already built tree is super fast now,
it finishes in under 0.3 seconds:

  #
  #  After the patch:
  #

  $ time make install

  ...

  real    0m0.280s
  user    0m0.162s
  sys     0m0.054s

Prior all the autodep changes and prior this fix, a repeat 'make install'
took 24.1 seconds (!) on the same system:

  #
  #  Before the patches:
  #

  $ time make install

  ...

  real    0m24.109s
  user    0m21.171s
  sys     0m2.449s

Which almost entirely was caused by fixable build system fat.
We are now literally ~86 times faster.

A fresh rebuild and install now takes just 11.4 seconds:

  #
  #  After the patch:
  #

  $ make clean
  $ time make -j16 install

  ...

  real    0m11.457s
  user    1m43.411s
  sys     0m7.610s

Without the patches it took 27.8 seconds:

  #
  #  Before the patches:
  #

  $ make clean
  $ time make -j16 install

  ...

  real    0m27.801s
  user    1m59.242s
  sys     0m9.749s

So even in the complete rebuild case we are now ~2.5 times faster.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-x4qjnxjGrgxpribq8sdakfTp@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-09 08:48:51 +02:00
Andi Kleen
475eeab9f3 tools/perf: Add support for record transaction flags
Add support for recording and displaying the transaction flags.
They are essentially a new sort key. Also display them
in a nice way to the user.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-6-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:12 +02:00
Andi Kleen
0126d493b6 tools/perf/record: Add abort_tx,no_tx,in_tx branch filter options to perf record -j
Make perf record -j aware of the new in_tx,no_tx,abort_tx branch qualifiers.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:10 +02:00
Andi Kleen
f5d05bcec4 tools/perf: Support sorting by in_tx or abort branch flags
Extend the perf branch sorting code to support sorting by in_tx
or abort_tx qualifiers. Also print out those qualifiers.

This also fixes up some of the existing sort key documentation.

We do not support no_tx here, because it's simply not showing
the in_tx flag.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1379688044-14173-4-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:09 +02:00
Andi Kleen
4cabc3d1cb tools/perf/stat: Add perf stat --transaction
Add support to perf stat to print the basic transactional execution statistics:
Total cycles, Cycles in Transaction, Cycles in aborted transsactions
using the in_tx and in_tx_checkpoint qualifiers.
Transaction Starts and Elision Starts, to compute the average transaction
length.

This is a reasonable overview over the success of the transactions.

Also support architectures that have a transaction aborted cycles
counter like POWER8. Since that is awkward to handle in the kernel
abstract handle both cases here.

Enable with a new --transaction / -T option.

This requires measuring these events in a group, since they depend on each
other.

This is implemented by using TM sysfs events exported by the kernel

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnaldo Carvalho de Melo <acme@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1377128846-977-5-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-04 10:06:07 +02:00
David Ahern
6810fc915f perf trace: Add option to analyze events in a file versus live
Allows capture of raw_syscall:* events and analyzed at a later time.

v2: change -i option from inherit to input name for consistency with
    other perf commands

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1377750593-48046-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-29 17:42:34 -03:00
Arnaldo Carvalho de Melo
7c304ee0fc perf trace: Add --verbose option
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ain6q4u8g3bpnh18yhw24v2x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-26 17:25:43 -03:00
Arnaldo Carvalho de Melo
b059efdf52 perf trace: Support ! in -e expressions
So that we can ask for all but a set of syscalls to be traced.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-9j6hvap23qanyl96wx4mrj9k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-26 17:25:41 -03:00
David Ahern
ac9be8ee4e perf trace: Make command line arguments consistent with perf-record
Common arguments like thread id, CPU list, mmap pages, etc should be
consistent across perf commands.

v3: Updated man page
v2: rebased to latest core branch

Signed-off-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1377018945-21940-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-08-26 17:25:35 -03:00