Commit Graph

582 Commits

Author SHA1 Message Date
Changbin Du
d35fa1e75f perf report: Document +field style argument support for --field option
Commit 2f3f9bcf00 ("perf tools: Add +field argument support for
--field option") by Jiri Olsa <jolsa@kernel.org> introduced +field style
argument support for --field option.

This is useful but not updated documentation.  This add a little
description there.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170313083252.23644-1-changbin.du@intel.com
[ Slightly improved the phrase structure ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-13 11:44:44 -03:00
Namhyung Kim
dc23103278 perf ftrace: Add support for -a and -C option
The -a/--all-cpus and -C/--cpu option is for controlling tracing cpus.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170224011251.14946-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-03 19:07:17 -03:00
Namhyung Kim
a9af6be5bc perf ftrace: Add support for --pid option
The -p (--pid) option enables to trace existing process by its pid.

Committer notes:

Testing it:

Using the function_graph tracer on a process that is just waiting for user
input and thus will make 'perf ftrace' sit there waiting for that, then press
any key on that mutt session and see what happens:

  # perf ftrace -t function_graph -p `pidof mutt` | head -40
  2)   1.038 us    |  switch_mm_irqs_off();
  ------------------------------------------
  2)    <idle>-0    =>   mutt-3595
  ------------------------------------------

  2)               |              finish_task_switch() {
  2)               |                smp_irq_work_interrupt() {
  2)               |                  irq_enter() {
  2)   0.180 us    |                    rcu_irq_enter();
  2)   1.248 us    |                  }
  2)               |                  __wake_up() {
  2)   0.126 us    |                    _raw_spin_lock_irqsave();
  2)               |                    __wake_up_common() {
  2)               |                      pollwake() {
  2)               |                        default_wake_function() {
  2)               |                          try_to_wake_up() {
  2)   0.662 us    |                            _raw_spin_lock_irqsave();
  2)               |                            select_task_rq_fair() {
  2)   1.719 us    |                              effective_load.isra.41();
  2)   1.343 us    |                              effective_load.isra.41();
  2)               |                              select_idle_sibling() {
  2)   0.331 us    |                                idle_cpu();
  2)   1.458 us    |                              }
  2)   8.350 us    |                            }
  2)   0.200 us    |                            _raw_spin_lock();
  2)               |                            ttwu_do_activate() {
  2)               |                              activate_task() {
  2)   0.136 us    |                                update_rq_clock.part.77();
  2)               |                                enqueue_task_fair() {
  2)               |                                  enqueue_entity() {
  2)   0.146 us    |                                    update_curr();
  2)   0.330 us    |                                    account_entity_enqueue();
  2)   0.280 us    |                                    update_cfs_shares();
  2)   0.321 us    |                                    place_entity();
  2)   0.206 us    |                                    __enqueue_entity();
  2)   6.926 us    |                                  }
  2)               |                                  enqueue_entity() {
  2)   0.105 us    |                                    update_curr();
  2)   0.175 us    |                                    account_entity_enqueue();
  2)   0.531 us    |                                    update_cfs_shares();
 #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170224011251.14946-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-03 19:07:16 -03:00
Charles Baylis
7768f8dada perf tools: Allow sorting by symbol size
Add new sort key 'symbol_size' to allow user to sort by symbol size, or
(more usefully) display the symbol size using --fields=...,symbol_size.

Committer note:

Testing it together with the recently added -q, to remove the headers,
and using the '+' sign with -s, to add the symbol_size sort order to
the default, which is '-s/--sort comm,dso,symbol':

  # perf report -q -s +symbol_size | head -10
  10.39%  swapper       [kernel.vmlinux] [k] intel_idle               270
   3.45%  swapper       [kernel.vmlinux] [k] update_blocked_averages 1546
   2.61%  swapper       [kernel.vmlinux] [k] update_load_avg         1292
   2.36%  swapper       [kernel.vmlinux] [k] update_cfs_shares        240
   1.83%  swapper       [kernel.vmlinux] [k] __hrtimer_run_queues     606
   1.74%  swapper       [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187
   1.66%  swapper       [kernel.vmlinux] [k] apic_timer_interrupt     152
   1.60%  CPU 0/KVM     [kvm]            [k] kvm_set_msr_common      3046
   1.60%  gnome-shell   libglib-2.0.so.0 [.] g_slist_find              37
   1.46%  gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup      370
  #

Signed-off-by: Charles Baylis <charles.baylis@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1487943176-13840-1-git-send-email-charles.baylis@linaro.org
[ Use symbol__size(), remove needless %lld + (long long) casting ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-03-03 19:07:16 -03:00
Linus Torvalds
3f26b0c876 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes on the kernel and tooling side - nothing in particular
  stands out"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf/core: Fix the perf_cpu_time_max_percent check
  perf/core: Fix perf_event_enable_on_exec() timekeeping (again)
  perf/core: Remove confusing comment and move put_ctx()
  perf record: Honor --quiet option properly
  perf annotate: Add -q/--quiet option
  perf diff: Add -q/--quiet option
  perf report: Add -q/--quiet option
  perf utils: Check verbose flag properly
  perf utils: Add perf_quiet_option()
  perf record: Add -a as default target
  perf stat: Add -a as default target
  perf tools: Fail on using multiple bits long terms without value
  perf tools: Move new_term arguments into struct parse_events_term template
  perf build: Add special fixdep cleaning rule
  perf tools: Replace _SC_NPROCESSORS_CONF with max_present_cpu in cpu_topology_map
  perf header: Make build_cpu_topology skip offline/absent CPUs
  perf cpumap: Add cpu__max_present_cpu()
  perf session: Fix DEBUG=1 build with clang
  tools lib traceevent: It's preempt not prempt
  perf python: Filter out -specs=/a/b/c from the python binding cc options
  ...
2017-02-28 11:38:18 -08:00
Masahiro Yamada
9332ef9dbd scripts/spelling.txt: add "an user" pattern and fix typo instances
Fix typos and add the following to the scripts/spelling.txt:

  an user||a user
  an userspace||a userspace

I also added "userspace" to the list since it is a common word in Linux.
I found some instances for "an userfaultfd", but I did not add it to the
list.  I felt it is endless to find words that start with "user" such as
"userland" etc., so must draw a line somewhere.

Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:46 -08:00
Namhyung Kim
eddaef8896 perf annotate: Add -q/--quiet option
The -q/--quiet option is to suppress any message.  Sometimes users just
want to see the numbers and it can be used for that case.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-20 11:47:18 -03:00
Namhyung Kim
63b42fce86 perf diff: Add -q/--quiet option
The -q/--quiet option is to suppress any message.  Sometimes users just
want to see the numbers and it can be used for that case.

Committer notes:

Before:

  # perf diff | head -10
  Failed to open /tmp/perf-6678.map, continuing without symbols
  Failed to open /tmp/perf-6678.map, continuing without symbols
  Failed to open /tmp/perf-2646.map, continuing without symbols
  # Event 'cycles'
  #
  # Baseline  Delta Abs  Shared Object               Symbol
  # ........  .........  ..........................  ............................................
  #
       5.36%     -1.76%  [kernel.vmlinux]            [k] intel_idle
       2.80%     +1.48%  firefox                     [.] 0x00000000000101fe
      57.12%     -1.25%  libxul.so                   [.] 0x00000000009bea92
       1.36%     -1.11%  [kernel.vmlinux]            [k] __schedule
       4.26%     -1.00%  perf-6678.map               [.] 0x00007fac4b0e9320

After:

  # perf diff -q | head -10
       5.36%     -1.76%  [kernel.vmlinux]            [k] intel_idle
       2.80%     +1.48%  firefox                     [.] 0x00000000000101fe
      57.12%     -1.25%  libxul.so                   [.] 0x00000000009bea92
       1.36%     -1.11%  [kernel.vmlinux]            [k] __schedule
       4.26%     -1.00%  perf-6678.map               [.] 0x00007fac4b0e9320
       1.86%     +0.95%  [kernel.vmlinux]            [k] update_blocked_averages
       0.80%     -0.70%  [kernel.vmlinux]            [k] native_sched_clock
       0.74%     -0.58%  [kernel.vmlinux]            [k] native_write_msr
       0.76%     -0.56%  qemu-system-x86_64          [.] 0x00000000002395c0
                 +0.54%  libpulsecommon-10.0.so      [.] 0x000000000002d91b
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-20 11:47:08 -03:00
Namhyung Kim
27fafab59a perf report: Add -q/--quiet option
The -q/--quiet option is to suppress any message.  Sometimes users just
want to see the numbers and it can be used for that case.

Before:

  $ perf report | head -15
  Failed to open /lib/modules/3.19.3-3-ARCH/kernel/fs/ext4/ext4.ko.gz, continuing without symbols
  Failed to open /lib/modules/3.19.3-3-ARCH/kernel/fs/jbd2/jbd2.ko.gz, continuing without symbols
  Failed to open /tmp/perf-14507.map, continuing without symbols
  ...
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 39K of event 'cycles'
  # Event count (approx.): 30444796573
  #
  # Overhead  Command      Shared Object        Symbol
  # ........  ...........  ...................  .........................
  #
       9.28%  swapper	   [kernel.vmlinux]     [k] intel_idle
       5.64%  swapper	   [kernel.vmlinux]     [k] native_write_msr_safe
       1.93%  swapper	   [kernel.vmlinux]     [k] __switch_to
       1.89%  swapper	   [kernel.vmlinux]     [k] menu_select
       1.75%  sched-pipe   [kernel.vmlinux]     [k] __switch_to

After:

  $ perf report -q | head
       9.28%  swapper	   [kernel.vmlinux]     [k] intel_idle
       5.64%  swapper	   [kernel.vmlinux]     [k] native_write_msr_safe
       1.93%  swapper	   [kernel.vmlinux]     [k] __switch_to
       1.89%  swapper	   [kernel.vmlinux]     [k] menu_select
       1.75%  sched-pipe   [kernel.vmlinux]     [k] __switch_to
       1.67%  swapper	   [kernel.vmlinux]     [k] cpu_startup_entry
       1.48%  sched-pipe   [kernel.vmlinux]     [k] enqueue_entity
       1.46%  swapper	   [kernel.vmlinux]     [k] __schedule
       1.36%  swapper	   [kernel.vmlinux]     [k] native_read_tsc
       1.34%  sched-pipe   [kernel.vmlinux]     [k] __schedule

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-4-namhyung@kernel.org
[ Removed builtin-report.c verbose > 0 hunk added to the previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-20 11:46:58 -03:00
Jiri Olsa
483635a9d0 perf record: Add -a as default target
Running 'perf record' with no target (-a, -p, -t, etc) will now collect
system wide data.

Commiter notes:

Testing it:

  [root@jouet ~]# perf record
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.351 MB perf.data (366 samples) ]
  #

is equivalent to:

  # perf record -a
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.411 MB perf.data (978 samples) ]
  #

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170217170018.GA15389@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-17 17:32:38 -03:00
Jiri Olsa
0d79f8b931 perf stat: Add -a as default target
Boris asked for default -a option in case we monitor only uncore events.

While implementing that I thought it might be actually useful to make it
overall default.

Running 'perf stat' will now collect system wide data.

Committer note:

Testing it:

  # perf stat
  ^C
   Performance counter stats for 'system wide':

         3571.559178      cpu-clock (msec)          #    4.000 CPUs utilized
               3,346      context-switches          #    0.937 K/sec
                 277      cpu-migrations            #    0.078 K/sec
              57,271      page-faults               #    0.016 M/sec
       4,535,633,835      cycles                    #    1.270 GHz
       6,389,736,516      instructions              #    1.41  insn per cycle
       1,541,293,875      branches                  #  431.547 M/sec
          14,526,396      branch-misses             #    0.94% of all branches

         0.892950118 seconds time elapsed

  #

Requested-and-Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170217170034.GB15389@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-17 17:31:10 -03:00
Namhyung Kim
be57b3fd21 perf diff: Change default setting to "delta-abs"
The "delta-abs" compute method will show most changed entries on top.
So users can easily see how much effect between the data.  Note that it
also changes the default of -o option to 1 in order to apply the compute
method.  To see original-style (sorted by baseline) use -o 0 option.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170210161856.18422-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-13 14:29:38 -03:00
Namhyung Kim
4b35994abe perf diff: Add diff.compute config option
The diff.compute config variable is to set the default compute method of
perf diff command (-c option).  Possible values 'delta' (default),
'delta-abs', 'ratio' and 'wdiff'.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170210073614.24584-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-13 14:29:37 -03:00
Namhyung Kim
d49dd15d69 perf diff: Add diff.order config option
In many cases, I need to look at differences between two data so I often
used the -o option to sort the result base on the difference first.
It'd be nice to have a config option to set it by default.

The diff.order config option is to set the default value of -o/--order
option.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170210073614.24584-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-13 14:29:37 -03:00
Namhyung Kim
a1668c25a8 perf diff: Add 'delta-abs' compute method
The 'delta-abs' compute method is same as 'delta' but shows entries with
bigger absolute delta first instead of sorting numerically.  This is
only useful together with -o option.

Below is default output (-c delta):

  $ perf diff -o 1 -c delta | grep -v ^# | head
    42.22%   +4.97%  [kernel.kallsyms]  [k] cfb_imageblit
     0.62%   +1.23%  [kernel.kallsyms]  [k] mutex_lock
             +1.15%  [kernel.kallsyms]  [k] copy_user_generic_string
     2.40%   +0.95%  [kernel.kallsyms]  [k] bit_putcs
     0.31%   +0.79%  [kernel.kallsyms]  [k] link_path_walk
             +0.64%  [kernel.kallsyms]  [k] kmem_cache_alloc
     0.00%   +0.57%  [kernel.kallsyms]  [k] __rcu_read_unlock
             +0.45%  [kernel.kallsyms]  [k] alloc_set_pte
     0.16%   +0.45%  [kernel.kallsyms]  [k] menu_select
             +0.41%  ld-2.24.so         [.] do_lookup_x

Now with 'delta-abs' it shows entries have bigger delta value either
positive or negative.

  $ perf diff -o 1 -c delta-abs | grep -v ^# | head
    42.22%   +4.97%  [kernel.kallsyms]  [k] cfb_imageblit
    12.72%   -3.01%  [kernel.kallsyms]  [k] intel_idle
     9.72%   -1.31%  [unknown]          [.] 0x0000000000411343
     0.62%   +1.23%  [kernel.kallsyms]  [k] mutex_lock
     2.40%   +0.95%  [kernel.kallsyms]  [k] bit_putcs
     0.31%   +0.79%  [kernel.kallsyms]  [k] link_path_walk
     1.35%   -0.71%  [kernel.kallsyms]  [k] smp_call_function_single
     0.00%   +0.57%  [kernel.kallsyms]  [k] __rcu_read_unlock
     0.16%   +0.45%  [kernel.kallsyms]  [k] menu_select
     0.72%   -0.44%  [kernel.kallsyms]  [k] lookup_fast

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170210073614.24584-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-02-13 14:29:36 -03:00
Namhyung Kim
d01f4e8db2 perf ftrace: Introduce new 'ftrace' tool
The 'perf ftrace' command is a simple wrapper of kernel's ftrace
functionality.  It only supports single thread tracing currently and
just reads trace_pipe in text and then write it to stdout.

Committer notes:

Testing it:

  # perf ftrace -f function_graph usleep 123456
  <SNIP>
  2)               |  SyS_nanosleep() {
  2)               |    _copy_from_user() {
  <SNIP>
  2)   0.900 us    |      }
  2)   1.354 us    |    }
  2)               |    hrtimer_nanosleep() {
  2)   0.062 us    |      __hrtimer_init();
  2)               |      do_nanosleep() {
  2)               |        hrtimer_start_range_ns() {
  <SNIP>
  2)   5.025 us    |        }
  2)               |        schedule() {
  2)   0.125 us    |          rcu_note_context_switch();
  2)   0.057 us    |          _raw_spin_lock();
  2)               |          deactivate_task() {
  2)   0.369 us    |            update_rq_clock.part.77();
  2)               |            dequeue_task_fair() {
  <SNIP>
  2) + 22.453 us   |            }
  2) + 23.736 us   |          }
  2)               |          pick_next_task_fair() {
  <SNIP>
  2) + 47.167 us   |          }
  2)               |          pick_next_task_idle() {
  <SNIP>
  2)   4.462 us    |          }
  ------------------------------------------
  2)  usleep-20387  =>    <idle>-0
  ------------------------------------------

  2)   0.806 us    |  switch_mm_irqs_off();
  ------------------------------------------
  2)    <idle>-0    =>  usleep-20387
  ------------------------------------------

  2)   0.151 us    |          finish_task_switch();
  2) @ 123597.2 us |        }
  2)   0.037 us    |        _cond_resched();
  2)               |        hrtimer_try_to_cancel() {
  2)   0.064 us    |          hrtimer_active();
  2)   0.353 us    |        }
  2) @ 123605.3 us |      }
  2) @ 123606.2 us |    }
  2) @ 123608.3 us |  } /* SyS_nanosleep */
  2)               |  __do_page_fault() {
 <SNIP>

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeremy Eder <jeder@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>,
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-r1hgmsj4dxny8arn3o9mw512@git.kernel.org
[ Various foward port fixes, add man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26 11:43:01 -03:00
Jiri Olsa
190bacca16 perf c2c report: Coalesce by default only by pid,iaddr
It seems to be the most used argument for -c option so far.  In the
beginning when you want to have the overall process report, so it makes
sense to make it the default one.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-20 16:52:56 -03:00
Namhyung Kim
414e050c68 perf sched timehist: Add --state option
The --state option is to show task state when switched out.  The state
is printed as a single character like in the /proc but I added 'I' for
idle state rather than 'R'.

  $ perf sched timehist --state | head
  Samples do not have callchains.
      time cpu task name              wait time sch delay run time state
               [tid/pid]                 (msec)    (msec)   (msec)
  -------- --- ----------------------- -------- ------------------ -----
  1.753791 [3] <idle>                     0.000     0.000    0.000     I
  1.753834 [1] perf[27469]                0.000     0.000    0.000     S
  1.753904 [3] perf[27470]                0.000     0.006    0.112     S
  1.753914 [1] <idle>                     0.000     0.000    0.079     I
  1.753915 [3] migration/3[23]            0.000     0.002    0.011     S
  1.754287 [2] <idle>                     0.000     0.000    0.000     I
  1.754335 [2] transmission[1773/1739]    0.000     0.004    0.047     S

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523.31212-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17 11:36:44 -03:00
Michael Petlan
5c64f99b1d perf script: Fix man page about --dump-raw-trace option
The "--dump-raw-script" is not a valid option, replace it with the valid
one, "--dump-raw-trace"

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 133dc4c39c ("perf: Rename 'perf trace' to 'perf script'")
LPU-Reference: 728644547.14560155.1484320012612.JavaMail.zimbra@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16 14:59:15 -03:00
Jiri Olsa
bfacbe3bf2 perf record: Add switch-output time option argument
It's now possible to specify the threshold time for perf.data like:

  $ perf record --switch-output=30s ...

Once it's reached, the current data are dumped in to the
perf.data.<timestamp> file and session does on.

  $ perf record --switch-output=30s ...
  [ perf record: dump data: Woken up 44 times ]
  [ perf record: Dump perf.data.2017010213043746 ]
  ...

The time is expected to be a number with appended unit
character - s/m/h/d.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:02 -03:00
Jiri Olsa
dc0c6127c2 perf record: Add switch-output size option argument
It's now possible to specify the threshold size for perf.data like:

  $ perf record --switch-output=2G ...

Once it's reached, the current data are dumped in to the
perf.data.<timestamp> file and session does on.

  $ perf record --switch-output=2G ...
  [ perf record: dump data: Woken up 7244 times ]
  [ perf record: Dump perf.data.2017010214093746 ]
  ...

The size is expected to be a number with appended unit character -
B/K/M/G.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:02 -03:00
Arnaldo Carvalho de Melo
017037ff3d perf trace: Allow specifying list of syscalls and events in -e/--expr/--event
Makes it easier to specify both events and syscalls (to be formatter
strace-like), i.e. previously one would have to do:

  # perf trace -e nanosleep --event sched:sched_switch usleep 1

Now it is possible to do:

  # perf trace -e nanosleep,sched:sched_switch usleep 1
     0.000 ( 0.021 ms): usleep/17962 nanosleep(rqtp: 0x7ffdedd61ec0) ...
     0.021 (         ): sched:sched_switch:usleep:17962 [120] S ==> swapper/1:0 [120])
     0.000 ( 0.066 ms): usleep/17962  ... [continued]: nanosleep()) = 0
  #

The old style --expr and using both -e and --event continues to work.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ieg6bakub4657l9e6afn85r4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Arnaldo Carvalho de Melo
355637717d perf kallsyms: Introduce tool to look for extended symbol information on the running kernel
Its similar to doing grep on a /proc/kallsyms, but it also shows extra
information like the path to the kernel module and the unrelocated
addresses in it, to help in diagnosing problems.

It is also helps demonstrate the use of the symbols routines so that
tool writers can use them more effectively.

Using it:

  $ perf kallsyms e1000_xmit_frame netif_rx usb_stor_set_xfer_buf
  e1000_xmit_frame: [e1000e] /lib/modules/4.9.0+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 0xffffffffc046fc10-0xffffffffc0470bb0 (0x19c80-0x1ac20)
  netif_rx: [kernel] [kernel.kallsyms] 0xffffffff916f03a0-0xffffffff916f0410 (0xffffffff916f03a0-0xffffffff916f0410)
  usb_stor_set_xfer_buf: [usb_storage] /lib/modules/4.9.0+/kernel/drivers/usb/storage/usb-storage.ko 0xffffffffc057aea0-0xffffffffc057af19 (0xf10-0xf89)
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-79bk9pakujn4l4vq0f90klv3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11 16:48:01 -03:00
Jiri Olsa
60437ac02f perf record: Fix --switch-output documentation and comment
There's no --signal-trigger option, also adding the code comment into
record man page.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483431600-19887-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03 11:11:38 -03:00
Namhyung Kim
07235f84ec perf sched timehist: Add -I/--idle-hist option
The --idle-hist option is to analyze system idle state so which process
makes cpu to go idle.  If this option is specified, non-idle events will
be skipped and processes switching to/from idle will be shown.

This option is mostly useful when used with --summary(-only) option.  In
the idle-time summary view, idle time is accounted to previous thread
which is run before idle task.

The example output looks like following:

  Idle-time summary
                  comm parent sched-out idle-time min-idle avg-idle max-idle stddev migrations
                                (count)    (msec)   (msec)   (msec)   (msec)      %
  --------------------------------------------------------------------------------------------
        rcu_preempt[7]      2        95   550.872    0.011    5.798   23.146   7.63      0
       migration/1[16]      2         1    15.558   15.558   15.558   15.558   0.00      0
        khugepaged[39]      2         1     3.062    3.062    3.062    3.062   0.00      0
     kworker/0:1H[124]      2         2     4.728    0.611    2.364    4.116  74.12      0
  systemd-journal[167]      1         1     4.510    4.510    4.510    4.510   0.00      0
    kworker/u16:3[558]      2        13    74.737    0.080    5.749   12.960  21.96      0
   irq/34-iwlwifi[628]      2        21   118.403    0.032    5.638   23.990  24.00      0
    kworker/u17:0[673]      2         1     3.523    3.523    3.523    3.523   0.00      0
      dbus-daemon[722]      1         1     6.743    6.743    6.743    6.743   0.00      0
          ifplugd[741]      1         1    58.826   58.826   58.826   58.826   0.00      0
  wpa_supplicant[1490]      1         1    13.302   13.302   13.302   13.302   0.00      0
     wpa_actiond[1492]      1         2     4.064    0.168    2.032    3.896  91.72      0
         dockerd[1500]      1         1     0.055    0.055    0.055    0.055   0.00      0
  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161208144755.16673-6-namhyung@kernel.org
Link: http://lkml.kernel.org/r/20161213080632.19099-2-namhyung@kernel.org
[ Merged fix sent by Namhyumg, as posted in the second Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15 16:25:45 -03:00
Yannick Brosseau
108a7c103b perf tools: Explicitly document that --children is enabled by default
The fact that the --children option is enabled by default is buried deep
at the end of the help page, in the overhead calculation section. This
make it explicit right where the option is listed, following the same
way other default options are described

Signed-off-by: Yannick Brosseau <scientist@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20161202160732.29058-1-scientist@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07 12:00:35 -03:00
David Ahern
46690a8051 perf report: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

Using the perf.data file captured via 'perf kmem record':

  # perf report --header-only
  # ========
  # captured on: Tue Nov 29 16:01:53 2016
  # hostname : jouet
  # os release : 4.8.8-300.fc25.x86_64
  # perf version : 4.9.rc6.g5a6aca
  # arch : x86_64
  # nrcpus online : 4
  # nrcpus avail : 4
  # cpudesc : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
  # cpuid : GenuineIntel,6,61,4
  # total memory : 20254660 kB
  # cmdline : /home/acme/bin/perf kmem record usleep 1
  # event : name = kmem:kmalloc, , id = { 931980, 931981, 931982, 931983 }, type = 2, size = 112, config = 0x1b9, { sample_period, sample_freq } = 1, sample_typ
  # event : name = kmem:kmalloc_node, , id = { 931984, 931985, 931986, 931987 }, type = 2, size = 112, config = 0x1b7, { sample_period, sample_freq } = 1, sampl
  # event : name = kmem:kfree, , id = { 931988, 931989, 931990, 931991 }, type = 2, size = 112, config = 0x1b5, { sample_period, sample_freq } = 1, sample_type
  # event : name = kmem:kmem_cache_alloc, , id = { 931992, 931993, 931994, 931995 }, type = 2, size = 112, config = 0x1b8, { sample_period, sample_freq } = 1, s
  # event : name = kmem:kmem_cache_alloc_node, , id = { 931996, 931997, 931998, 931999 }, type = 2, size = 112, config = 0x1b6, { sample_period, sample_freq } =
  # event : name = kmem:kmem_cache_free, , id = { 932000, 932001, 932002, 932003 }, type = 2, size = 112, config = 0x1b4, { sample_period, sample_freq } = 1, sa
  # HEADER_CPU_TOPOLOGY info available, use -I to display
  # HEADER_NUMA_TOPOLOGY info available, use -I to display
  # pmu mappings: cpu = 4, intel_pt = 7, intel_bts = 6, uncore_arb = 13, cstate_pkg = 15, breakpoint = 5, uncore_cbox_1 = 12, power = 9, software = 1, uncore_im
  # HEADER_CACHE info available, use -I to display
  # missing features: HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT
  # ========
  #
  # # Looking at just the histogram entries for the first event:
  #
  # perf report  | head -33
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 40  of event 'kmem:kmalloc'
  # Event count (approx.): 40
  #
  # Overhead  Trace output
  # ........  ...............................................................................................................
  #
    37.50%  call_site=ffffffffb91ad3c7 ptr=0xffff88895fc05000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
    10.00%  call_site=ffffffffb9258416 ptr=0xffff888a1dc61f00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
     7.50%  call_site=ffffffffb9258416 ptr=0xffff888a2640ac00 bytes_req=240 bytes_alloc=256 gfp_flags=GFP_KERNEL|__GFP_ZERO
     2.50%  call_site=ffffffffb92759ba ptr=0xffff888a26776000 bytes_req=4096 bytes_alloc=4096 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb9276864 ptr=0xffff8886f6b82600 bytes_req=136 bytes_alloc=192 gfp_flags=GFP_KERNEL|__GFP_ZERO
     2.50%  call_site=ffffffffb9276903 ptr=0xffff888aefcf0460 bytes_req=32 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c98a00 bytes_req=392 bytes_alloc=512 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad0ce ptr=0xffff888756c9ba00 bytes_req=504 bytes_alloc=512 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad301 ptr=0xffff888a31747600 bytes_req=128 bytes_alloc=128 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb92ad511 ptr=0xffff888a9d26a2a0 bytes_req=28 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c11a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c12c0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1540 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c15e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c16e0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff88873e8c1c20 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb936a7fb ptr=0xffff888a9d26a2a0 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931240 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931980 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO
     2.50%  call_site=ffffffffb9373e66 ptr=0xffff8889f1931a00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_ATOMIC|__GFP_ZERO

  #
  # # And then limiting using the example for 'perf kmem stat --time' used
  # # in the previous changeset committer note we see that there were no
  # # kmem:kmalloc in that last part of the file, but there were some
  # # kmem:kmem_cache_alloc ones:
  #
  # perf report --time 20119.782088, --stdio
  #
  # Total Lost Samples: 0
  #
  # Samples: 0  of event 'kmem:kmalloc'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 0  of event 'kmem:kmalloc_node'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 0  of event 'kmem:kfree'
  # Event count (approx.): 0
  #
  # Overhead  Trace output
  # ........  ............
  #

  # Samples: 8  of event 'kmem:kmem_cache_alloc'
  # Event count (approx.): 8
  #
  # Overhead  Trace output
  # ........  ..................................................................................................................
  #
    75.00%  call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
    12.50%  call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
    12.50%  call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-7-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:10 -03:00
David Ahern
2a865bd8dd perf kmem: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

  # perf kmem record usleep 1
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Captured and wrote 1.540 MB perf.data (2049 samples) ]
  # perf evlist
  kmem:kmalloc
  kmem:kmalloc_node
  kmem:kfree
  kmem:kmem_cache_alloc
  kmem:kmem_cache_alloc_node
  kmem:kmem_cache_free
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #
  # # Use 'perf script' to get a first approach, select a chunk for then using
  # # with 'perf kmem stat --time'
  #
  # perf script | tail -15
    usleep 9889 [0] 20119.782088:  kmem:kmem_cache_free: (selinux_file_free_security+0x27) call_site=ffffffffb936aa07 ptr=0xffff888a1df49fc0
      perf 9888 [3] 20119.782088:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782089: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782090:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782090: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
    usleep 9889 [0] 20119.782091: kmem:kmem_cache_alloc: (__sigqueue_alloc+0x4a) call_site=ffffffffb90ad33a ptr=0xffff8889f071f6e0 bytes_req=160 bytes_alloc=160 gfp_flags=GFP_ATOMIC|__GFP_NOTRACK
      perf 9888 [3] 20119.782091:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782093:  kmem:kmem_cache_free: (__sigqueue_free.part.17+0x33) call_site=ffffffffb90ad3f3 ptr=0xffff8889f071f6e0
      perf 9888 [3] 20119.782098: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782098:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782099: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782100: kmem:kmem_cache_alloc: (alloc_buffer_head+0x21) call_site=ffffffffb9287cc1 ptr=0xffff8889b12722d8 bytes_req=104 bytes_alloc=104 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782101:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
      perf 9888 [3] 20119.782102: kmem:kmem_cache_alloc: (jbd2__journal_start+0x72) call_site=ffffffffb9333b42 ptr=0xffff888bdf1a39c0 bytes_req=48 bytes_alloc=48 gfp_flags=GFP_NOFS|__GFP_ZERO
      perf 9888 [3] 20119.782103:  kmem:kmem_cache_free: (jbd2_journal_stop+0x1a1) call_site=ffffffffb9334581 ptr=0xffff888bdf1a39c0
  #
  # # stats for the whole perf.data file, i.e. no interval specified
  #
  # perf kmem stat

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 172,628
  Total bytes allocated: 173,088
  Total bytes freed:     161,280
  Net total bytes allocated: 11,808
  Total bytes wasted on internal fragmentation: 460
  Internal fragmentation: 0.265761%
  Cross CPU allocations: 0/851
  #
  # # stats for an end open interval, after a certain time:
  #
  # perf kmem stat --time 20119.782088,

  SUMMARY (SLAB allocator)
  ========================
  Total bytes requested: 552
  Total bytes allocated: 552
  Total bytes freed:     448
  Net total bytes allocated: 104
  Total bytes wasted on internal fragmentation: 0
  Internal fragmentation: 0.000000%
  Cross CPU allocations: 0/8
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:03:02 -03:00
David Ahern
853b740711 perf sched timehist: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for time window and analyze a segment of interest within that window.

Committer notes:

Testing it:

  # perf sched record -a usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.593 MB perf.data (25 samples) ]
  #
  # perf sched timehist | head -18
  Samples do not have callchains.
          time    cpu   task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  --------------- ---------  ---------  --------
   19818.635579 [0002]  <idle>              0.000      0.000     0.000
   19818.635613 [0000]  perf[9116]          0.000      0.000     0.000
   19818.635676 [0000]  <idle>              0.000      0.000     0.063
   19818.635678 [0000]  rcuos/2[29]         0.000      0.002     0.001
   19818.635696 [0002]  perf[9117]          0.000      0.004     0.116
   19818.635702 [0000]  <idle>              0.001      0.000     0.024
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.636263 [0000]  usleep[9117]        0.005      0.000     0.560
   19818.636316 [0000]  <idle>              0.560      0.000     0.053
   19818.636358 [0002]  <idle>              0.129      0.000     0.649
   19818.636358 [0000]  usleep[9117]        0.053      0.002     0.042
  #

  # perf sched timehist --time 19818.635696,
  Samples do not have callchains.
           time    cpu  task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  ---------------  --------  --------- ---------
   19818.635696 [0002]  perf[9117]          0.000      0.120     0.000
   19818.635702 [0000]  <idle>              0.019      0.000     0.006
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.636263 [0000]  usleep[9117]        0.005      0.000     0.560
   19818.636316 [0000]  <idle>              0.560      0.000     0.053
   19818.636358 [0002]  <idle>              0.129      0.000     0.649
   19818.636358 [0000]  usleep[9117]        0.053      0.002     0.042
  #
  # perf sched timehist --time 19818.635696,19818.635709
  Samples do not have callchains.
           time    cpu  task name       wait time  sch delay  run time
                        [tid/pid]          (msec)     (msec)    (msec)
  ------------- ------  --------------- ---------  --------- ---------
   19818.635696 [0002]  perf[9117]          0.000      0.120     0.000
   19818.635702 [0000]  <idle>              0.019      0.000     0.006
   19818.635709 [0002]  migration/2[25]     0.000      0.003     0.012
   19818.635709 [0000]  usleep[9117]        0.005      0.000     0.006
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:52 -03:00
David Ahern
a91f4c473f perf script: Add option to specify time window of interest
Add option to allow user to control analysis window. e.g., collect data
for some amount of time and analyze a segment of interest within that
window.

Committer notes:

Testing it:

  # perf evlist -v
  cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  #
  # perf script --hide-call-graph | head -15
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370048:    126 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370049:   2701 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370051:  58823 cycles:ppp: ffffffffb90cd2e0 idle_cpu (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370059:      1 cycles:ppp: ffffffffb91a713a ctx_resched (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370062:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370076:      1 cycles:ppp: ffffffffb91a76c1 __perf_event_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370091:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
       perf 5124 [2] 9693.370095:      3 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  #
  # perf script --hide-call-graph --time ,9693.370048
    swapper    0 [0] 9693.370039:      1 cycles:ppp: ffffffffb90072ad x86_pmu_enable (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370044:      1 cycles:ppp: ffffffffb900ca1b intel_pmu_handle_irq (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [0] 9693.370046:      7 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
  # perf script --hide-call-graph --time 9693.370064,9693.370076
    swapper    0 [1] 9693.370064:     13 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370065:    250 cycles:ppp: ffffffffb902fd93 native_sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370067:   5269 cycles:ppp: ffffffffb902fe79 sched_clock (.../4.8.8-300.fc25.x86_64/vmlinux)
    swapper    0 [1] 9693.370069: 114602 cycles:ppp: ffffffffb90c1c5a atomic_notifier_call_chain (.../4.8.8-300.fc25.x86_64/vmlinux)
  #

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480439746-42695-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01 13:02:45 -03:00
David Ahern
64eff7d9c4 perf script: Add option to stop printing callchain
Allow user to specify list of symbols which cause the dump of callchains
to stop at that symbol.

Committer notes:

Testing it:

  # perf record -ag usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.177 MB perf.data (33 samples) ]
  #
  # # Without it:
  #
  # perf script
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
                 137f419 x86_64_start_kernel ([kernel.vmlinux].init.text)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  326978 flush_smp_call_function_queue (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  327413 generic_smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  249b37 smp_call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a04b2c call_function_single_interrupt (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  889427 cpuidle_enter (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e534a call_cpuidle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  2e5730 cpu_startup_entry (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  9f5167 rest_init (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                 137ffeb start_kernel ([kernel.vmlinux].init.text)
                 137f2ca x86_64_start_reservations ([kernel.vmlinux].init.text)
  #
  # # Using it to see just what are the calls from the 'remote_function' function:
  #
  # perf script --stop-bt remote_function
  swapper   0 [000]  9693.370039:          1 cycles:ppp:
                  2072ad x86_pmu_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

  swapper   0 [000]  9693.370044:          1 cycles:ppp:
                  20ca1b intel_pmu_handle_irq (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  205b0c perf_event_nmi_handler (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a14a nmi_handle (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a6b3 default_do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  22a83c do_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  a03fb1 end_repeat_nmi (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a29d7 perf_pmu_enable.part.90 (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a713a ctx_resched (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a76c1 __perf_event_enable (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a0390 event_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)
                  3a1cff remote_function (/usr/lib/debug/lib/modules/4.8.8-300.fc25.x86_64/vmlinux)

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1480104021-36275-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-29 13:06:19 -03:00
David Ahern
350f54fab2 perf sched timehist: Handle cpu migration events
Add handlers for sched:sched_migrate_task event. Total number of
migrations is added to summary display and -M/--migrations can be used
to show migration events.

Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1480091321-35591-1-git-send-email-dsa@cumulusnetworks.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-25 16:00:22 -03:00
David Ahern
a407b0678b perf sched timehist: Add -V/--cpu-visual option
The -V option provides a visual aid for sched switches by cpu:

  $ perf sched timehist -V
             time    cpu  0123456789abc  task name              b/n time  sch delay   run time
                                         [tid/pid]                (msec)     (msec)     (msec)
  --------------- ------  -------------  --------------------  ---------  ---------  ---------
  ...
   2412598.429696 [0009]           i     <idle>                    0.000      0.000      0.000
   2412598.429767 [0002]    s            perf[7219]                0.000      0.000      0.000
   2412598.429783 [0009]           s     perf[7220]                0.000      0.006      0.087
   2412598.429794 [0010]            i    <idle>                    0.000      0.000      0.000
   2412598.429795 [0009]           s     migration/9[53]           0.000      0.003      0.011
   2412598.430370 [0010]            s    sleep[7220]               0.011      0.000      0.576
   2412598.432584 [0003]     i           <idle>                    0.000      0.000      0.000
  ...

Committer notes:

'i' marks idle time, 's' are scheduler events.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-8-namhyung@kernel.org
[ Add documentation based on above commit message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:09 -03:00
David Ahern
6c973c9085 perf sched timehist: Add call graph options
If callchains were recorded they are appended to the line with a default stack depth of 5:

  1.874569 [0011] gcc[31949]       0.014 0.000 1.148 wait_for_completion_killable <- do_fork <- sys_vfork <- stub_vfork <- __vfork
  1.874591 [0010] gcc[31951]       0.000 0.000 0.024 __cond_resched <- _cond_resched <- wait_for_completion <- stop_one_cpu <- sched_exec
  1.874603 [0010] migration/10[59] 3.350 0.004 0.011 smpboot_thread_fn <- kthread <- ret_from_fork
  1.874604 [0011] <idle>           1.148 0.000 0.035 cpu_startup_entry <- start_secondary
  1.874723 [0005] <idle>           0.016 0.000 1.383 cpu_startup_entry <- start_secondary
  1.874746 [0005] gcc[31949]       0.153 0.078 0.022 do_wait sys_wait4 <- system_call_fastpath <- __GI___waitpid

 --no-call-graph can be used to not show the callchains. --max-stack is used
to control the number of frames shown (default of 5). -x/--excl options can
be used to collapse redundant callchains to get more relevant data on screen.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-7-namhyung@kernel.org
[ Add documentation based on above commit message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:09 -03:00
David Ahern
fc1469f1b2 perf sched timehist: Add -w/--wakeups option
The -w option is to show wakeup events with timehist.

  $ perf sched timehist -w
             time    cpu  task name              b/n time  sch delay   run time
                          [tid/pid]                (msec)     (msec)     (msec)
  --------------- ------  --------------------  ---------  ---------  ---------
   2412598.429689 [0002]  perf[7219]                                             awakened: perf[7220]
   2412598.429696 [0009]  <idle>                    0.000      0.000      0.000
   2412598.429767 [0002]  perf[7219]                0.000      0.000      0.000
   2412598.429780 [0009]  perf[7220]                                             awakened: migration/9[53]
  ...

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-6-namhyung@kernel.org
[ Add documentation based on above commit message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:08 -03:00
David Ahern
49394a2a24 perf sched timehist: Introduce timehist command
'perf sched timehist' provides an analysis of scheduling events.

Example usage:
    perf sched record -- sleep 1
    perf sched timehist

By default it shows the individual schedule events, including the wait
time (time between sched-out and next sched-in events for the task), the
task scheduling delay (time between wakeup and actually running) and run
time for the task:

            time    cpu  task name             wait time  sch delay   run time
                         [tid/pid]                (msec)     (msec)     (msec)
  -------------- ------  --------------------  ---------  ---------  ---------
    79371.874569 [0011]  gcc[31949]                0.014      0.000      1.148
    79371.874591 [0010]  gcc[31951]                0.000      0.000      0.024
    79371.874603 [0010]  migration/10[59]          3.350      0.004      0.011
    79371.874604 [0011]  <idle>                    1.148      0.000      0.035
    79371.874723 [0005]  <idle>                    0.016      0.000      1.383
    79371.874746 [0005]  gcc[31949]                0.153      0.078      0.022
...

Times are in msec.usec.

Committer note:

Add above explanation as the 'perf sched timehist' entry for 'man
perf-sched'.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20161116060634.28477-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:07 -03:00
Jiri Olsa
d940baccc9 perf c2c report: Display total HITMs on default
Currently we display the cacheline list sorted on remote HITMs by
default.

The problem is that they might not be always counted and 'perf c2c
report' displays an empty output. Thus it's more convenient to display
and sort the cacheline list based on the total of HITMs and have the
best change to see data in the default report run.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:05 -03:00
Jiri Olsa
b7ac4f9f3b perf c2c report: Add -f/--force option
Adding -f/--force option to go through ownership validation:

  $ sudo perf c2c report
  File perf.data not owned by current user or root (use -f to override)
  $
  $ sudo perf c2c report -f
  < c2c report output >
  $

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1479764011-10732-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-23 10:44:04 -03:00
Taeung Song
c6fc018a7a perf config: Add support setting variables in a config file
Add setting feature that can add config variables with their values to a
config file (i.e. user or system config file) or modify config key-value
pairs in a config file.  For the syntax examples:

    perf config [<file-option>] [section.name[=value] ...]

e.g. You can set the ui.show-headers to false with

    # perf config ui.show-headers=false

If you want to add or modify several config items, you can do like

    # perf config annotate.show_nr_jumps=false kmem.default=slab

Committer notes:

Testing it:

  $ perf config -l
  top.children=true
  report.children=false
  $
  $ perf config top.children=false
  $ perf config -l
  top.children=false
  report.children=false
  $
  $ perf config kmem.default=slab
  $ perf config -l
  top.children=false
  report.children=false
  kmem.default=slab
  $

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Nambong Ha <over3025@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wookje Kwon <aweee0@gmail.com>
Link: http://lkml.kernel.org/r/1478241862-31230-5-git-send-email-treeze.taeung@gmail.com
[ Combined patch with docs update with this one ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-14 13:08:11 -03:00
Taeung Song
909236083e perf config: Add support for getting config key-value pairs
Add a functionality getting specific config key-value pairs.
For the syntax examples,

    perf config [<file-option>] [section.name ...]

e.g. To query config items 'report.queue-size' and 'report.children', do

    # perf config report.queue-size report.children

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Nambong Ha <over3025@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wookje Kwon <aweee0@gmail.com>
Link: http://lkml.kernel.org/r/1478241862-31230-2-git-send-email-treeze.taeung@gmail.com
[ Combined patch with docs update with this one ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-14 12:52:17 -03:00
Arnaldo Carvalho de Melo
699c12a7cc perf intel-pt: Update documentation about context switch events
Since the unprivileged sched switch event was added in perf, PT doesn't
need need perf_event_paranoid=-1 anymore for per cpu decoding.

Add a note stating that that is only needed for kernels < 4.2.

Reported-by: Andi Kleen <ak@linux.intel.com>
Report-Link: http://lkml.kernel.org/r/http://lkml.kernel.org/n/tip-x2ybghpqxxn3zu0m8o7qi42r@git.kernel.org
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 45ac1403f5 ("perf: Add PERF_RECORD_SWITCH to indicate context switches")
Link: http://lkml.kernel.org/n/tip-x2ybghpqxxn3zu0m8o7qi42r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-11-11 13:18:52 -03:00
Alexis Berlemont
e36b7821a9 perf trace: Implement --delay
In the perf wiki todo-list[1], there is an entry regarding initial-delay
and 'perf trace'; the following small patch tries to fulfill this point.
It has been generated against the branch tip/perf/core.

It has only been implemented in the "trace__run" case.

Ex.:

  $ sudo strace -- ./perf trace --delay 5 sleep 1 2>&1
  ...
  fcntl(7, F_SETFL, O_RDONLY|O_NONBLOCK)  = 0
  ioctl(7, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
  ioctl(11, PERF_EVENT_IOC_SET_OUTPUT, 0x7) = 0
  fcntl(11, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
  ioctl(11, PERF_EVENT_IOC_ID, 0x7ffc8fd35718) = 0
  write(6, "\0", 1)                       = 1
  close(6)                                = 0
  nanosleep({0, 5000000}, NULL)           = 0  # DELAY OF 5 MS BEFORE ENABLING THE EVENTS
  ioctl(3, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ioctl(4, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ioctl(5, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ioctl(7, PERF_EVENT_IOC_ENABLE, 0)      = 0
  ...

[1]: https://perf.wiki.kernel.org/index.php/Todo

Signed-off-by: Alexis Berlemont <alexis.berlemont@gmail.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161010054328.4028-2-alexis.berlemont@gmail.com
[ Add entry to the manpage, cut'n'pasted from stat's and record's ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-24 11:07:43 -03:00
Stephane Eranian
b3151ea500 perf jit: Add jitdump format specification document
This patch adds a formal specification of the jitdump format. The goal
is to help jit runtime developers implement the jitdump support without
having to read the jvmti code.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-10-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-24 11:07:41 -03:00
Andi Kleen
84ee74affc perf record: Improve documentation of event parameters
- Some editing (params -> parameters)
- Point to the now more complete list of parameters in the perf list
manpage.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1476381433-22959-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-24 11:07:35 -03:00
Andi Kleen
224e2c977b perf script: Support insn and insnlen
When looking at Intel PT traces with perf script it is useful to have
some indication of the instruction. Dump the instruction bytes and
instruction length, which can be used for simple pattern analysis in
scripts.

% perf record -e intel_pt// foo
% perf script --itrace=i0ns -F ip,insn,insnlen
 ffffffff8101232f ilen: 5 insn: 0f 1f 44 00 00
 ffffffff81012334 ilen: 1 insn: 5b
 ffffffff81012335 ilen: 1 insn: 5d
 ffffffff81012336 ilen: 1 insn: c3
 ffffffff810123e3 ilen: 1 insn: 5b
 ffffffff810123e4 ilen: 2 insn: 41 5c
 ffffffff810123e6 ilen: 1 insn: 5d
 ffffffff810123e7 ilen: 1 insn: c3
 ffffffff810124a6 ilen: 2 insn: 31 c0
 ffffffff810124a8 ilen: 9 insn: 41 83 bc 24 a8 01 00 00 01
 ffffffff810124b1 ilen: 2 insn: 75 87
...

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1475847747-30994-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-24 11:07:30 -03:00
Jiri Olsa
af09b2d35e perf c2c report: Add --show-all option
Normally we limit the main list to contain only entries with HITM %
value > 0.0005, but it might be useful to display all captured entries.
Adding --show-all option for that.

Requested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-nokgjdwikbegec5jzj4mxhqc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-21 10:32:02 -03:00
Jiri Olsa
18f278d2dd perf c2c report: Add --no-source option
Add a possibility to disable source line column with new --no-source
option. It source line data could take lot of time to retrieve, so it
could be a performance burden for big data.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-8p6s2727fq8nbsm3it5gix3p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-21 10:32:01 -03:00
Jiri Olsa
465f27a3b2 perf c2c: Add man page and credits
Add man page for c2c command and credits to builtin-c2c.c file.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-twbp391v8v9f5idp584hlfov@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-21 10:32:01 -03:00
Nambong Ha
2ad8327fd0 perf top/report: Add tips about a list option
Add two tips that describe --list option of config sub-command and
explain how to choose particular config file location.

Signed-off-by: Nambong Ha <over3025@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <taeung@kosslab.kr>
Link: http://lkml.kernel.org/r/1475191562-3240-1-git-send-email-over3025@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-05 19:51:53 -03:00
Donghyun Kim
49343235d0 perf report/top: Add a tip about system-wide collection from all CPUs
Signed-off-by: Donghyun Kim <dongdong9335@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <taeung@kosslab.kr>
Link: http://lkml.kernel.org/r/1475187357-21882-1-git-send-email-dongdong9335@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-10-05 19:50:25 -03:00