c840cbfeff
Emitting a PSB+ can cause a CPU a slight delay. When doing timing analysis of code with Intel PT, it is useful to know if a timing bubble was caused by Intel PT or not. Add reporting of PSB events via perf script. PSB events are printed with the existing itrace 'p' option which also prints power and frequency changes. The PSB event contains the trace offset at which the PSB occurs, to allow easy reference back to the PSB+ packets. The PSB event timestamp is always the timestamp from the PSB+ TSC packet, and the ip is always the address from the PSB+ FUP packet. The code changes are non-trivial because the decoder must walk to the PSB+ FUP address before outputting the PSB event. Example: $ perf record -e intel_pt/cyc,psb_period=0/u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.046 MB perf.data ] $ perf script --itrace=p --ns perf 17981 [006] 25617.510820383: psb: psb offs: 0 0 [unknown] ([unknown]) perf 17981 [006] 25617.510820383: cbr: cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown]) uname 17981 [006] 25617.510889753: psb: psb offs: 0xb50 7f78c12a212e __GI___tunables_init+0xee (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.510899162: psb: psb offs: 0x12d0 7f78c128af1c dl_main+0x93c (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.510939242: psb: psb offs: 0x1a50 7f78c128eefc _dl_map_object_from_fd+0x13c (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.510981274: psb: psb offs: 0x21c8 7f78c1296307 _dl_relocate_object+0x927 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.510993034: psb: psb offs: 0x2948 7f78c12940e4 _dl_lookup_symbol_x+0x14 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.511003871: psb: psb offs: 0x30c8 7f78c12937b3 do_lookup_x+0x2f3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.511019854: psb: psb offs: 0x3850 7f78c1295eed _dl_relocate_object+0x50d (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.511029015: psb: psb offs: 0x4390 7f78c12a855a strcmp+0xf6a (/usr/lib/x86_64-linux-gnu/ld-2.31.so) uname 17981 [006] 25617.511064876: psb: psb offs: 0x4b10 0 [unknown] ([unknown]) uname 17981 [006] 25617.511080762: psb: psb offs: 0x5290 7f78c11db53d _dl_addr+0x13d (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511086035: psb: psb offs: 0x5a08 7f78c11db538 _dl_addr+0x138 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511091381: psb: psb offs: 0x6190 7f78c11db534 _dl_addr+0x134 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511096681: psb: psb offs: 0x6910 7f78c11db4c3 _dl_addr+0xc3 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511119520: psb: psb offs: 0x7090 7f78c10ada5e _nl_intern_locale_data+0x12e (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511126584: psb: psb offs: 0x7818 7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511132775: psb: psb offs: 0x8358 7f78c10c20c0 getenv+0xa0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511134598: psb: psb offs: 0x8ad0 7f78c10ada09 _nl_intern_locale_data+0xd9 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511135685: psb: psb offs: 0x9258 7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511138322: psb: psb offs: 0x99d0 7f78c11fffd9 __strncmp_avx2+0x39 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) uname 17981 [006] 25617.511158907: psb: psb offs: 0xa150 0 [unknown] ([unknown]) Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210205175350.23817-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
65 lines
2.3 KiB
Plaintext
65 lines
2.3 KiB
Plaintext
i synthesize instructions events
|
|
b synthesize branches events (branch misses for Arm SPE)
|
|
c synthesize branches events (calls only)
|
|
r synthesize branches events (returns only)
|
|
x synthesize transactions events
|
|
w synthesize ptwrite events
|
|
p synthesize power events (incl. PSB events for Intel PT)
|
|
o synthesize other events recorded due to the use
|
|
of aux-output (refer to perf record)
|
|
e synthesize error events
|
|
d create a debug log
|
|
f synthesize first level cache events
|
|
m synthesize last level cache events
|
|
M synthesize memory events
|
|
t synthesize TLB events
|
|
a synthesize remote access events
|
|
g synthesize a call chain (use with i or x)
|
|
G synthesize a call chain on existing event records
|
|
l synthesize last branch entries (use with i or x)
|
|
L synthesize last branch entries on existing event records
|
|
s skip initial number of events
|
|
q quicker (less detailed) decoding
|
|
|
|
The default is all events i.e. the same as --itrace=ibxwpe,
|
|
except for perf script where it is --itrace=ce
|
|
|
|
In addition, the period (default 100000, except for perf script where it is 1)
|
|
for instructions events can be specified in units of:
|
|
|
|
i instructions
|
|
t ticks
|
|
ms milliseconds
|
|
us microseconds
|
|
ns nanoseconds (default)
|
|
|
|
Also the call chain size (default 16, max. 1024) for instructions or
|
|
transactions events can be specified.
|
|
|
|
Also the number of last branch entries (default 64, max. 1024) for
|
|
instructions or transactions events can be specified.
|
|
|
|
Similar to options g and l, size may also be specified for options G and L.
|
|
On x86, note that G and L work poorly when data has been recorded with
|
|
large PEBS. Refer linkperf:perf-intel-pt[1] man page for details.
|
|
|
|
It is also possible to skip events generated (instructions, branches, transactions,
|
|
ptwrite, power) at the beginning. This is useful to ignore initialization code.
|
|
|
|
--itrace=i0nss1000000
|
|
|
|
skips the first million instructions.
|
|
|
|
The 'e' option may be followed by flags which affect what errors will or
|
|
will not be reported. Each flag must be preceded by either '+' or '-'.
|
|
The flags are:
|
|
o overflow
|
|
l trace data lost
|
|
|
|
If supported, the 'd' option may be followed by flags which affect what
|
|
debug messages will or will not be logged. Each flag must be preceded
|
|
by either '+' or '-'. The flags are:
|
|
a all perf events
|
|
|
|
If supported, the 'q' option may be repeated to increase the effect.
|