Commit Graph

679178 Commits

Author SHA1 Message Date
Adrian Hunter
d5b1a5f660 x86/insn: perf tools: Add new ptwrite instruction
Add ptwrite to the op code map and the perf tools new instructions test.
To run the test:

  $ tools/perf/perf test "x86 ins"
  39: Test x86 instruction decoder - new instructions          : Ok

Or to see the details:

  $ tools/perf/perf test -v "x86 ins" 2>&1 | grep ptwrite

For information about ptwrite, refer the Intel SDM.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: http://lkml.kernel.org/r/1495180230-19367-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:58:04 -03:00
Colin Ian King
19f0edb980 perf jit: fix typo: "incalid" -> "invalid"
Trivial fix to typo in jvmti_close() warnx warning message.

Signed-off-by: Colin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170627124917.19151-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:55:06 -03:00
Arnaldo Carvalho de Melo
fef2a73516 perf tools: Kill die()
Finally can nuke this function, no more users.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eivvvzn8ie6w42gy3batxoy7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:49:13 -03:00
Arnaldo Carvalho de Melo
25ce4bb8c5 perf config: Do not die when parsing u64 or int config values
Just warn the user and ignore those values.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-tbf60nj3ierm6hrkhpothymx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:44:58 -03:00
Arnaldo Carvalho de Melo
62d94b00f8 perf tools: Replace error() with pr_err()
To consolidate the error reporting facility.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b41iot1094katoffdf19w9zk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:22:31 -03:00
Arnaldo Carvalho de Melo
b211d79ac1 perf tools: Remove warning()
Now everything uses pr_warning(), so ditch it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-hv8r0mgdhk73wtfq3zrhavgx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:13:20 -03:00
Arnaldo Carvalho de Melo
d2a74d53aa perf event-parse: Use pr_warning()
Convert sole user of warning() in this file to pr_warning(),
consolidating error reporting facilities.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3y7yf6v673ujl2rcs34tzv8n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:08:14 -03:00
Arnaldo Carvalho de Melo
4cf134e744 perf config: Use pr_warning()
warning() is going away, consolidating error reporting.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5r3636cwl4z1varo90mervai@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:03:17 -03:00
Arnaldo Carvalho de Melo
59913aabb0 perf help: Use pr_warning()
Complete the switch to using te pr_{warning,error,etc} error reporting
facilities.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3l9gr6237b4aqyo0rsspixe2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 11:01:17 -03:00
Arnaldo Carvalho de Melo
86e474ff87 perf help: Elliminate dup code for reporting
And switch from warning() to pr_warning(), to elliminate another
duplication: too many error reporting facilities.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-pkzcjrhek3uuqc4i5i9ealwd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 10:59:28 -03:00
Arnaldo Carvalho de Melo
881c362d34 perf help: Introduce exec_failed() to avoid code duplication
The warning(str_error_r(errno)) pattern can be replaced with a function,
do it.

And while at it use pr_warning(), we have way too many error reporting
facilities, time to drop some, starting with the one we got from the git
sources.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lbak5npj1ri1uuvf1en3c0p0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27 10:52:57 -03:00
Thomas Richter
19508c048a perf tests: Add platform dependency to test 15
This patch adds platform dependency into the test case 15
(perf_event_attr). It is based on a suggestion from Jiri Olsa.

Add a new optional attribute named 'arch' in the [config] section of the
test case file. It is a comma separated list of architecture names this
test can be executed on. For example:

  arch = x86_64,alpha,ppc

If this attribute is missing the test is executed on any platform.  This
does not break existing behavior.

The values listed for this attribute should be identical to uname -m
output.

If the list starts with an exclamation mark (!) the comparison is
inverted, for example for

  arch = !s390x,ppc

the test is not executed on s390x or ppc platforms.  The exclamation
mark must be at the beginnning of the list.

Here is an example debug output:

  [root@s35lp76]# fgrep arch tests/attr/test-stat-C2
  arch = x86_64,alpha,ppc
  [root@s35lp76]# PERF_TEST_ATTR=/tmp /usr/bin/python2 ./tests/attr.py \
    -d ./tests/attr/ -p ./perf -vvvvv -t test-stat-C1

provides the following output:

  running './tests/attr//test-stat-C1'
  test limitation 'x86_64,alpha,ppc' <--- new
    loading expected events
      Event event:base-stat
        fd = 1
        group_fd = -1
        .....

Here is the output when a test is skipped:

  [root@s35lp76]# fgrep arch tests/attr/test-stat-C1
  arch = !s390x
  [root@s35lp76]# PERF_TEST_ATTR=/tmp /usr/bin/python2 ./tests/attr.py \
    -d ./tests/attr/ -p ./perf -vvvvv -t test-stat-C1

provides the following output:

test limitation '!s390x' <--- new

skipped [s390x] './tests/attr//test-stat-C1' <--- new

The test is skipped with return code 0.

Suggested-and-Acked-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/20170622073625.86762-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-26 21:42:00 -03:00
Ingo Molnar
8e70e84091 perf/core improvements ad fixes:
New features:
 
 - Add support to measure SMI cost in 'perf stat' (Kan Liang)
 
 - Add support for unwinding callchains in powerpc with libdw (Paolo Bonzini)
 
 Fixes:
 
 - Fix message: cpu list option is -C not -c (Adrian Hunter)
 
 - Fix 'perf script' message: field list option is -F not -f (Adrian Hunter)
 
 - Intel PT fixes: (Adrian Hunter)
 
   o Fix missing stack clear
   o Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
   o Fix last_ip usage
   o Ensure never to set 'last_ip' when packet 'count' is zero
   o Clear FUP flag on error
   o Fix transactions_sample_type
 
 Infrastructure:
 
 - Intel PT cleanups/refactorings (Adrian Hunter)
 
   o Use FUP always when scanning for an IP
   o Add missing __fallthrough
   o Remove redundant initial_skip checks
   o Allow decoding with branch tracing disabled
   o Add default config for pass-through branch enable
   o Add documentation for new config terms
   o Add decoder support for ptwrite and power event packets
   o Add reserved byte to CBR packet payload
   o Add decoder support for CBR events
 
 - Move  find_process() to the only place that uses it, skimming some
   more fat from util.[ch] (Arnaldo Carvalho de Melo)
 
 - Do parameter validation earlier on fetch_kernel_version() (Arnaldo Carvalho de Melo)
 
 - Remove unused _ALL_SOURCE define (Arnaldo Carvalho de Melo)
 
 - Add sysfs__write_int function (Kan Liang)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZSouZAAoJENZQFvNTUqpAzIIP/jlEKCAym9EtGAmsOBxvAIiQ
 ocoV0g8/bRW+Dr8TGt7xbZNswBHCk/4e7ap+y9/NtME4KIwuZMovf8h/oImf+C1z
 ULiomjLEcJy9CQoWRDIe/kUK0LKmIVNVeNqIxisEukZmRhTW6GkpOeO3/BMALDbo
 YAoXkpSEeiJEx3OcD+yLlu6e2a0JJkJa0U2ujWxBjeNDatyWHKkM043eu4kbshvc
 YrfQUeQr4soWU45c/iWVOmRG+u9R2lfxK+T0cA1PNRYMJX70qXh0/jyGumv4iRUt
 KoKjGu2oh8WiE4xwImQrAkqEbuDwBbK3dZ71ZYsainR+RMphk25AD4KEM+5rjQnu
 DD++iGcKl5fAI+llzRxEjgSOM5CJ0iIufqHq2UB6ZB3NSL8sEUu7ehnbLA1lHF75
 ABDGStB6Tis16NRKaM0hCz2OykluEOr4+RhUOUY4Nk58+vzxCigVpFSQF1B8duWC
 WcjQP0w3GXkaKcXpq75fsDeLQC9UCrGCU2bYDuY34HrYroScIGCPsHow3qKV38ml
 lFUujzQHrHxImDNwFSRAsnNLEVIy/SsUunY9QlDCZysVwMC47/W6gpquQl8rfBMb
 KLiUA2ohLbYZsFUK5SaICWePtuu5IVcCGfEgrNqbnCYT9+9kPSFIPEPMXl+F+fAq
 UWPPqIBS4tmAA1PD9mz3
 =7oUP
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.13-20170621' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

 - Add support to measure SMI cost in 'perf stat' (Kan Liang)

 - Add support for unwinding callchains in powerpc with libdw (Paolo Bonzini)

Fixes:

 - Fix message: cpu list option is -C not -c (Adrian Hunter)

 - Fix 'perf script' message: field list option is -F not -f (Adrian Hunter)

 - Intel PT fixes: (Adrian Hunter)

   o Fix missing stack clear
   o Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
   o Fix last_ip usage
   o Ensure never to set 'last_ip' when packet 'count' is zero
   o Clear FUP flag on error
   o Fix transactions_sample_type

Infrastructure changes:

 - Intel PT cleanups/refactorings (Adrian Hunter)

   o Use FUP always when scanning for an IP
   o Add missing __fallthrough
   o Remove redundant initial_skip checks
   o Allow decoding with branch tracing disabled
   o Add default config for pass-through branch enable
   o Add documentation for new config terms
   o Add decoder support for ptwrite and power event packets
   o Add reserved byte to CBR packet payload
   o Add decoder support for CBR events

 - Move  find_process() to the only place that uses it, skimming some
   more fat from util.[ch] (Arnaldo Carvalho de Melo)

 - Do parameter validation earlier on fetch_kernel_version() (Arnaldo Carvalho de Melo)

 - Remove unused _ALL_SOURCE define (Arnaldo Carvalho de Melo)

 - Add sysfs__write_int function (Kan Liang)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-21 20:11:53 +02:00
Adrian Hunter
701516ae3d perf script: Fix message because field list option is -F not -f
Fix message because field list option is -F not -f.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-20-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:53 -03:00
Adrian Hunter
30795467e5 perf tools: Fix message because cpu list option is -C not -c
Fix message because cpu list option is -C not -c

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-19-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:53 -03:00
Adrian Hunter
2116074898 perf intel-pt: Fix transactions_sample_type
'transactions_sample_type' is needed to correctly inject transactions
samples but it was not being set. Set it from the event sample type.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:52 -03:00
Adrian Hunter
5da3b23b3b perf intel-pt: Remove redundant initial_skip checks
'initial_skip' is checked inside the sample synthesis functions which means
it is actually being done twice for 'instructions' and 'transactions'
samples. Remove the redundant checks.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:51 -03:00
Adrian Hunter
0a7c700d23 perf intel-pt: Add decoder support for CBR events
Add decoder support for informing the tools of changes to the core-to-bus
ratio (CBR).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:51 -03:00
Adrian Hunter
26fb2fb19c perf intel-pt: Add reserved byte to CBR packet payload
Future proof CBR packet decoding by passing through also the undefined
'reserved' byte in the packet payload.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:50 -03:00
Adrian Hunter
a472e65fc4 perf intel-pt: Add decoder support for ptwrite and power event packets
Add decoder support for PTWRITE, MWAIT, PWRE, PWRX and EXSTOP packets. This
patch only affects the decoder, so the tools still do not select or consume
the new information. That is added in subsequent patches.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:50 -03:00
Adrian Hunter
2bc60ffd66 perf intel-pt: Add documentation for new config terms
Add documentation for new config terms.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:49 -03:00
Adrian Hunter
9fd629f9a6 perf intel-pt: Add default config for pass-through branch enable
Branch tracing is enabled by default, so a fake config bit called 'pt'
(pass-through) was added to allow the 'branch enable' bit to have affect.
Add default config 'pt,branch' which will allow users to disable branch
tracing using 'branch=0' instead of having to specify 'pt,branch=0'.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:48 -03:00
Adrian Hunter
839598176b perf intel-pt: Allow decoding with branch tracing disabled
The kernel now supports the disabling of branch tracing, however the
decoder assumes branch tracing is always enabled. Pass through a parameter
to indicate whether branch tracing is enabled and use it to avoid cases
when the decoder is expecting branch packets. There are 2 such cases.
First, FUP packets which can bind to an IP even when there is no branch
tracing. Secondly, the decoder will try to use branch packets to find an IP
to start decoding or to recover from errors.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:48 -03:00
Adrian Hunter
04194207fe perf intel-pt: Add missing __fallthrough
perf tools uses __fallthrough. Add missing  __fallthrough to a switch
statement.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:47 -03:00
Adrian Hunter
6a558f12db perf intel-pt: Clear FUP flag on error
Sometimes a FUP packet is associated with a TSX transaction and a flag is
set to indicate that. Ensure that flag is cleared on any error condition
because at that point the decoder can no longer assume it is correct.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:47 -03:00
Adrian Hunter
622b7a47b8 perf intel-pt: Use FUP always when scanning for an IP
The decoder will try to use branch packets to find an IP to start decoding
or to recover from errors. Currently the FUP packet is used only in the
case of an overflow, however there is no reason for that to be a special
case. So just use FUP always when scanning for an IP.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:46 -03:00
Adrian Hunter
f952eaceb0 perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zero
Intel PT uses IP compression based on the last IP. For decoding purposes,
'last IP' is not updated when a branch target has been suppressed, which is
indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so
ensure never to set 'last_ip' when packet 'count' is zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:46 -03:00
Adrian Hunter
ee14ac0ef6 perf intel-pt: Fix last_ip usage
Intel PT uses IP compression based on the last IP. For decoding
purposes, 'last IP' is considered to be reset to zero whenever there is
a synchronization packet (PSB). The decoder wasn't doing that, and was
treating the zero value to mean that there was no last IP, whereas
compression can be done against the zero value. Fix by setting last_ip
to zero when a PSB is received and keep track of have_last_ip.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:45 -03:00
Adrian Hunter
ad7167a8cd perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IP
A value of zero is used to indicate that there is no IP. Ensure the
value is zero when the state is INTEL_PT_STATE_NO_IP.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:44 -03:00
Adrian Hunter
12b7080609 perf intel-pt: Fix missing stack clear
The return compression stack must be cleared whenever there is a PSB. Fix
one case where that was not happening.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:44 -03:00
Adrian Hunter
3f04d98e97 perf intel-pt: Improve sample timestamp
The decoder uses its current timestamp in samples. Usually that is a
timestamp that has already passed, but in some cases it is a timestamp
for a branch that the decoder is walking towards, and consequently
hasn't reached. Improve that situation by using the pkt_state to
determine when to use the current or previous timestamp.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:43 -03:00
Adrian Hunter
22c0689233 perf intel-pt: Move decoder error setting into one condition
Move decoder error setting into one condition.

Cc'ed to stable because later fixes depend on it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1495786658-18063-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:43 -03:00
Paolo Bonzini
a7f0fda085 perf unwind: Support for powerpc
Porting PPC to libdw only needs an architecture-specific hook to move
the register state from perf to libdw.

The ARM and x86 architectures already use libdw, and it is useful to
have as much common code for the unwinder as possible.  Mark Wielaard
has contributed a frame-based unwinder to libdw, so that unwinding works
even for binaries that do not have CFI information.  In addition,
libunwind is always preferred to libdw by the build machinery so this
cannot introduce regressions on machines that have both libunwind and
libdw installed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1496312681-20133-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:42 -03:00
Kan Liang
daefd0bc0b perf stat: Add support to measure SMI cost
Implementing a new --smi-cost mode in perf stat to measure SMI cost.

During the measurement, the /sys/device/cpu/freeze_on_smi will be set.

The measurement can be done with one counter (unhalted core cycles), and
two free running MSR counters (IA32_APERF and SMI_COUNT).

In practice, the percentages of SMI core cycles should be more useful
than absolute value. So the output will be the percentage of SMI core
cycles and SMI#. metric_only will be set by default.

SMI cycles% = (aperf - unhalted core cycles) / aperf

Here is an example output.

 Performance counter stats for 'sudo echo ':

SMI cycles%          SMI#
    0.1%              1

       0.010858678 seconds time elapsed

Users who wants to get the actual value can apply additional
--no-metric-only.

Signed-off-by: Kan Liang <Kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:35 -03:00
Kan Liang
3b00ea9386 tools lib api fs: Add sysfs__write_int function
Add sysfs__write_int() to ease up writing int to sysfs.  New interface
is:

  int sysfs__write_int(const char *entry, int value);

Also, introducing filename__write_int() which is useful for new helpers
to write sysctl values.

Signed-off-by: Kan Liang <Kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1495825538-5230-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21 11:35:27 -03:00
Arnaldo Carvalho de Melo
fd25bf8b8c perf tools: Remove unused _ALL_SOURCE define
Curious as to what this was for I looked at /usr/include/ and only some
python headers define this, and it ends up being to enable "extensions"
on some old OSes:

  /* Enable extensions on AIX 3, Interix */

I guess we can remove this one safely.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-omnundlxo2brs552bdl6m0j1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-20 12:30:07 -03:00
Arnaldo Carvalho de Melo
44b58e06e8 perf tools: Do parameter validation earlier on fetch_kernel_version()
While trying to reduce util.[ch] I noticed that fetch_kernel_version()
and fetch_ubuntu_kernel_version() do lots of operations only to check if
they are needed, i.e. it checks if the pointer where to return the
kernel version is NULL only after obtaining the kernel version from
/proc/version_signature or by parsing the results from uname().

Do it earlier not to confuse people reading this code in the future :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i94qwyekk4tzbu0b9ce1r1mz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-20 12:19:16 -03:00
Arnaldo Carvalho de Melo
2157f6ee18 perf evsel: Adopt find_process()
And make it static, nobody else uses it, if we ever need it in more
places we can carve a new source file for process related methods,
for now lets reduce util.{c,h} a tad more.

Link: http://lkml.kernel.org/n/tip-zgb28rllvypjibw52aaz9p15@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-20 12:05:38 -03:00
Ingo Molnar
007b811b40 perf/core improvements and fixes:
User visible:
 
 - Allow adding and removing fields to the default 'perf script' columns,
   using + or - as field prefixes to do so (Andi Kleen)
 
 - Display titles in left frame in the annotate browser (Jin Yao)
 
 - Allow resolving the DSO name with 'perf script -F brstack{sym,off},dso'
   (Mark Santaniello)
 
 - Support function filtering in 'perf ftrace' (Namhyung Kim)
 
 - Allow specifying function call depth in 'perf ftrace' (Namhyumg Kim)
 
 Infrastructure:
 
 - Adopt __noreturn, __printf, __scanf, noinline, __packed and __aligned
   __alignment__(()) markers, to make the tools/ source code base to be
   more compact and look more like kernel code (Arnaldo Carvalho de Melo)
 
 - Remove unnecessary check in annotate_browser_write() (Jin Yao)
 
 - Return arch from symbol__disassemble() so that callers, such as
   the annotate TUI browser to use arch specific formattings, such
   as the upcoming instruction micro-op fusion on Intel Core (Jin Yao)
 
 - Remove superfluous check before use in the coresight code base (Kim
   Phillips)
 
 - Remove unused SAMPLE_SIZE defines and BTS priv array (Kim Phillips)
 
 - Error handling fix/tidy ups in 'perf config' (Taeung Song)
 
 - Avoid error in the BPF proggie built with clang in 'perf test llvm'
   when PROFILE_ALL_BRANCHES is set (Wang Nan)
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJZSHfUAAoJENZQFvNTUqpA3UYP/1T4VPphm1CItrZJW6vzepev
 eaHWBmVPpjnSb2/bN5Blh1GKcmfAwuGJPYQx9fONXCMUL6DXIYTzYaSZ0jQ2tGSG
 bCpnUJ2ccmvaCs7kUPvnkgaANsuqBQW7RSDNzzy9059r8udybtlXkQw0eBRcHrpf
 S6ql3H0uAMlK9M+QiLSc/OiDkW93EqsKtYSCBTL9bMsxokSQY+K5mYo0mZFojS4p
 43Gs9JDt58nxydRcplEoNdvN11yS9M6KyxFjl/38swzdXH3/KLQ7ZHeqgOq4YDnI
 8Zpzbi6n/yUjeUDGXWNu19F1JgdSprgZ3plZo4m5doFuWc4A03WvoBefCvYc1Gxc
 5X9nxc3DjNsA8XJAsoGgB0B19wqgNzk5jgzhKMBOumre5gSUSP/uSOzbG7RQ+k+y
 mHXRFXUesNWo6TX065xdstr0tOd0fbkOA7mAkuYY5QyrTQcON2HKMUqvMGu86UfR
 StJPYRUFwLgjO2tQ967+LC3I+WQVZVwPMBAZ52cBSvYbj97H3dEm7SuBEqG/FrLp
 WEkibvcvEaVcfZuaZcgokodwIBDHeqWZYm+vHGaOZO2A0W6S5kvc2l2tf9uNbodp
 ye99FJdIsxv5eKTZmxps/W3Mjw245D5HF9i7Fy5vgoZgwiAi/vJ1kMHYxFd6LwW6
 O/wnSzNzXSPICqH1M6Rd
 =wK/k
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo-4.13-20170719' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

- Allow adding and removing fields to the default 'perf script' columns,
  using + or - as field prefixes to do so (Andi Kleen)

- Display titles in left frame in the annotate browser (Jin Yao)

- Allow resolving the DSO name with 'perf script -F brstack{sym,off},dso'
  (Mark Santaniello)

- Support function filtering in 'perf ftrace' (Namhyung Kim)

- Allow specifying function call depth in 'perf ftrace' (Namhyumg Kim)

Infrastructure changes:

- Adopt __noreturn, __printf, __scanf, noinline, __packed and __aligned
  __alignment__(()) markers, to make the tools/ source code base to be
  more compact and look more like kernel code (Arnaldo Carvalho de Melo)

- Remove unnecessary check in annotate_browser_write() (Jin Yao)

- Return arch from symbol__disassemble() so that callers, such as
  the annotate TUI browser to use arch specific formattings, such
  as the upcoming instruction micro-op fusion on Intel Core (Jin Yao)

- Remove superfluous check before use in the coresight code base (Kim
  Phillips)

- Remove unused SAMPLE_SIZE defines and BTS priv array (Kim Phillips)

- Error handling fix/tidy ups in 'perf config' (Taeung Song)

- Avoid error in the BPF proggie built with clang in 'perf test llvm'
  when PROFILE_ALL_BRANCHES is set (Wang Nan)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 10:49:08 +02:00
Ingo Molnar
2eb0fc9bfe Linux 4.12-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQEbBAABAgAGBQJZR92EAAoJEHm+PkMAQRiGT1oH+KW2FLrRaYxtut+KyGA6l7tc
 R/hFx1n9BibkjXeqD+y6/4SjRTe6/pT8Zkihv3/19eZ5algUWeQa0Hm+/455sl58
 IdIXx/pzuCO3kqR3//fP9ZFD657GNDsuQ0fYnZESItFwiWQtO1TNfZD6KQjkqBdI
 L3MVhDUVBZA2ZtPwC4ERei5/sToV9woykKoJ/A3+OkWjgX6w4SBimqgkSEFk4uE8
 xS0pycyDZci93rJlECi1UueewdODTKSmhwdC80qvGEiYXzsC6UFtaF0Fj66XO1AL
 UMjxqI/gkm5ZuCIjRsmPmJjgL5q1RT6UX/qtw9yn71XTmcgMiPW2/DF/v/OaTg==
 =XbW2
 -----END PGP SIGNATURE-----

Merge tag 'v4.12-rc6' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20 10:48:44 +02:00
Taeung Song
dfe1c6d7ef perf config: Refactor the code using 'ret' variable in cmd_config()
To simplify the code related to 'ret' variable in cmd_config(),
initialize 'ret' with -1 instead of 0 and use goto to perform resource
release at the end of the function, setting ret to zero just before the
out_err label, as usual in the kernel sources.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1497671202-20495-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:55 -03:00
Taeung Song
4f1fd74283 perf config: Check error cases of {show_spec, set}_config()
show_spec_config() and set_config() can be called multiple times
in the loop in cmd_config().

However, The error cases of them wasn't checked, so fix it.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1497671197-20450-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:54 -03:00
Namhyung Kim
1096c35aa8 perf ftrace: Add -D option for depth filter
The -D/--graph-depth option is to set max graph depth.  The following
example traces max 2-depth of page fault handler.

  $ sudo perf ftrace -G __do_page_fault -D 2 -- hello
   ...
   0)               |  __do_page_fault() {
   0)   0.063 us    |    down_read_trylock();
   0)   0.251 us    |    find_vma();
   0)   5.374 us    |    handle_mm_fault();
   0)   0.054 us    |    up_read();
   0)   7.463 us    |  }
   ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170618142302.25390-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:54 -03:00
Namhyung Kim
78b83e8b12 perf ftrace: Add option for function filtering
The -T/--trace-funcs and -N/--notrace-funcs options are to specify
functions to enable/disable tracing dynamically.

The -G/--graph-funcs and -g/--nograph-funcs options are to set filters
for function graph tracer.

For example, to trace fault handling functions only:

  $ sudo perf ftrace -T *fault hello
   0)               |  __do_page_fault() {
   0)               |    handle_mm_fault() {
   0)   2.117 us    |      __handle_mm_fault();
   0)   3.627 us    |    }
   0)   7.811 us    |  }
   0)               |  __do_page_fault() {
   0)               |    handle_mm_fault() {
   0)   2.014 us    |      __handle_mm_fault();
   0)   2.424 us    |    }
   0)   2.951 us    |  }
   ...

To trace all functions executed in __do_page_fault:

  $ sudo perf ftrace -G __do_page_fault hello
   2)               |  __do_page_fault() {
   3)   0.060 us    |    down_read_trylock();
   3)               |    find_vma() {
   3)   0.075 us    |      vmacache_find();
   3)   0.053 us    |      vmacache_update();
   3)   1.246 us    |    }
   3)               |    handle_mm_fault() {
   3)   0.063 us    |      __rcu_read_lock();
   3)   0.056 us    |      mem_cgroup_from_task();
   3)   0.057 us    |      __rcu_read_unlock();
   3)               |      __handle_mm_fault() {
   3)               |        filemap_map_pages() {
   3)   0.058 us    |          __rcu_read_lock();
   3)               |          alloc_set_pte() {
   ...

But don't want to show details in handle_mm_fault:

  $ sudo perf ftrace -G __do_page_fault -g handle_mm_fault hello
   3)               |  __do_page_fault() {
   3)   0.049 us    |    down_read_trylock();
   3)               |    find_vma() {
   3)   0.048 us    |      vmacache_find();
   3)   0.041 us    |      vmacache_update();
   3)   0.680 us    |    }
   3)   0.036 us    |    up_read();
   3)   4.547 us    |  } /* __do_page_fault */
   ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170618142302.25390-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:53 -03:00
Namhyung Kim
29681bc5bb perf ftrace: Move setup_pager before opening trace_pipe
The 'perf ftrace' command fails to reset tracer after finishing
recording like below:

  $ sudo perf ftrace -v hello
  write 'nop' to tracing/current_tracer failed: Device or resource busy
  ...

This is because the trace_pipe file is open in pager process.  Move the
pager setup to before opening the file.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Fixes: 583359646f ("perf ftrace: Use pager for displaying result")
Link: http://lkml.kernel.org/r/20170618142302.25390-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:52 -03:00
Namhyung Kim
e7bd9ba20a perf ftrace: Show error message when fails to set ftrace files
It'd be better for debugging to show an error message when it fails to
setup ftrace for some reason.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170618142302.25390-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:51 -03:00
Mark Santaniello
106dacd86f perf script: Support -F brstackoff,dso
The idea here is to make AutoFDO easier in cloud environment with ASLR.
It's easiest to show how this is useful by example. I built a small test
akin to "while(1) { do_nothing(); }" where the do_nothing function is
loaded from a dso:

  $ cat burncpu.cpp
  #include <dlfcn.h>

  int main() {
    void* handle = dlopen("./dso.so", RTLD_LAZY);
    if (!handle) return -1;

    typedef void (*fp)();
    fp do_nothing = (fp) dlsym(handle, "do_nothing");

    while(1) {
      do_nothing();
    }
  }

  $ cat dso.cpp
  extern "C" void do_nothing() {}

  $ cat build.sh
  #!/bin/bash
  g++ -shared dso.cpp -o dso.so
  g++ burncpu.cpp -o burncpu -ldl

I sampled the execution of this program with perf record -b.

Using the existing "brstack,dso", we get absolute addresses that are
affected by ASLR, and could be different on different hosts. The address
does not uniquely identify a branch/target in the binary:

  $ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0

Using the existing "brstacksym,dso" is a little better, because the
symbol plus offset and dso name *does* uniquely identify a branch/target
in the binary.  Ultimately, however, AutoFDO wants a simple offset into
the binary, so we'd have to undo all the work perf did to symbolize in
the first place:

  $ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0

With the new "brstackoff,dso" we get what we need: a simple offset into a
specific dso/binary that uniquely identifies a branch/target:
  $ perf script -F brstackoff,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  0x6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0

Signed-off-by: Mark Santaniello <marksan@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619163825.2012979-2-marksan@fb.com
[ Updated documentation about 'brstackoff' using text from above ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:46 -03:00
Mark Santaniello
55b9b50811 perf script: Support -F brstack,dso and brstacksym,dso
Perf script can report the dso for "addr" and "ip" fields.

This adds the same support for the "brstack" and "brstacksym" fields.
This can be helpful for AutoFDO: we can ignore LBR entries unless the
source and target address are both in the target module we are about to
build.

I built a small test akin to "while(1) { do_nothing(); }" where the
do_nothing function is loaded from a dso:

  $ cat burncpu.cpp
  #include <dlfcn.h>

  int main() {
    void* handle = dlopen("./dso.so", RTLD_LAZY);
    if (!handle) return -1;

    typedef void (*fp)();
    fp do_nothing = (fp) dlsym(handle, "do_nothing");

    while(1) {
      do_nothing();
    }
  }

  $ cat dso.cpp
  extern "C" void do_nothing() {}

  $ cat build.sh
  #!/bin/bash
  g++ -shared dso.cpp -o dso.so
  g++ burncpu.cpp -o burncpu -ldl

I sampled the execution with perf record -b.  Using the new perf script
functionality I can easily find cases where there was a transition from one
dso to another:

  $ perf record -a -b -- sleep 5
  [ perf record: Woken up 55 times to write data ]
  [ perf record: Captured and wrote 18.815 MB perf.data (43593 samples) ]

  $ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0

  $ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
  do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0

Signed-off-by: Mark Santaniello <marksan@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619163825.2012979-1-marksan@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 22:05:40 -03:00
Wang Nan
9b57fb7e35 perf test llvm: Avoid error when PROFILE_ALL_BRANCHES is set
The 'if' keyword is a define that expands to complex code when
CONFIG_PROFILE_ALL_BRANCHES is selected, which causes a 'perf test LLVM'
failure like:

  $ ./perf test LLVM
  35: LLVM search and compile                    :
  35.1: Basic BPF llvm compile                    : Ok
  35.2: kbuild searching                          : Ok
  35.3: Compile source for BPF prologue generation: FAILED!
  35.4: Compile source for BPF relocation         : Skip

The only affected test case is bpf-script-test-prologue.c
because it uses kernel headers and has 'if' inside.

This patch undefines 'if' to make it passes perf test.

More detailed analysis from a message in this thread, also by Wang:

The problem is caused by following relocation information:

  $ readelf -a ./llvmsubtest3
  ...
     [ 5] _ftrace_branch    PROGBITS         0000000000000000  00000260
          00000000000000a0  0000000000000000  WA       0     0     4
  ...
  Relocation section '.relfunc=null_lseek file->f_mode offset orig' at
  offset 0x490 contains 4 entries:
     Offset          Info           Type           Sym. Value    Sym. Name
  000000000038  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch
  0000000000b0  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch
  000000000128  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch
  0000000001c0  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch

  Relocation section '.rel_ftrace_branch' at offset 0x4d0 contains 8 entries:
     Offset          Info           Type           Sym. Value    Sym. Name
  000000000000  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
  000000000008  000100000001 unrecognized: 1       0000000000000015 .L.str
  000000000028  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
  000000000030  000100000001 unrecognized: 1       0000000000000015 .L.str
  000000000050  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
  000000000058  000100000001 unrecognized: 1       0000000000000015 .L.str
  000000000078  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
  000000000080  000100000001 unrecognized: 1       0000000000000015 .L.str
  ...

So I think the failure is because you enabled CONFIG_PROFILE_ALL_BRANCHES.

I can reproduce your buggy result by selecting
CONFIG_PROFILE_ALL_BRANCHES in my kbuild:

  $ ./perf test LLVM
  35: LLVM search and compile                    :
  35.1: Basic BPF llvm compile                    : Ok
  35.2: kbuild searching                          : Ok
  35.3: Compile source for BPF prologue generation: FAILED!
  35.4: Compile source for BPF relocation         : Skip

Simply undef CONFIG_PROFILE_ALL_BRANCHES in clang opts not working
because it is introduced by "#include <uapi/linux/fs.h>", which override
cmdline options. So I think the best way is to undefine 'if' inside BPF
script.

Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/20170620183203.2517-1-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 16:11:26 -03:00
Jin Yao
dcaa394807 perf annotate: Return arch from symbol__disassemble() and save it in browser
In annotate browser, we will add support to check fused instructions.
While this is x86-specific feature so we need the annotate browser to
know what the arch it runs on.

symbol__disassemble() has figured out the arch. This patch just lets the
arch return from symbol__disassemble and save the arch in annotate
browser.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1497840958-4759-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19 15:27:09 -03:00