mirror of
https://github.com/torvalds/linux.git
synced 2024-12-06 11:01:43 +00:00
48d02a1d5c
Implement printing instruction sequences as hex dump for branch stacks.

This relies on the x86 instruction decoder used by the PT decoder to find the lengths of instructions to dump them individually. This is good enough for pattern matching.

This allows studying hot paths for individual samples, together with branch misprediction and cycle count / IPC information if available (on Skylake systems).

    % perf record -b ...
    % perf script -F brstackinsn
    ...
    read_hpet+67:
        ffffffff9905b843        insn: 74 ea        # PRED
        ffffffff9905b82f        insn: 85 c9
        ffffffff9905b831        insn: 74 12
        ffffffff9905b833        insn: f3 90
        ffffffff9905b835        insn: 48 8b 0f
        ffffffff9905b838        insn: 48 89 ca
        ffffffff9905b83b        insn: 48 c1 ea 20
        ffffffff9905b83f        insn: 39 f2
        ffffffff9905b841        insn: 89 d0
        ffffffff9905b843        insn: 74 ea        # PRED

Only works when no special branch filters are specified.

Occasionally the path does not reach up to the sample IP, as the LBRs may be frozen before executing a final jump. In this case we print a special message.

The instruction dumper piggybacks on the existing infrastructure from the Intel PT decoder.

An earlier iteration of this patch relied on a disassembler, but this version only uses the existing instruction decoder.

Committer note:

Added hint about how to get suitable perf.data files for use with '-F brstackinsn':

    $ perf record usleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
    $
    $ perf script -F brstackinsn
    Display of branch stack assembler requested, but non all-branch filter set
    Hint: run 'perf record -b ...'
    $

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
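The per-instruction hex dump described in the commit message can be illustrated with a minimal sketch. It is not the perf implementation: `format_insn` and `dump_range` are hypothetical names, and the `lens[]` array stands in for the instruction lengths that a real x86 decoder would report. Only the "insn: xx xx" line format and the 15-byte maximum come from the source.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define MAXINSN 15	/* longest x86 instruction, in bytes */

/*
 * Hypothetical helper: format len bytes as "insn: xx xx ...".
 * Returns out, or NULL if len is invalid or out is too small.
 */
static const char *format_insn(const uint8_t *buf, int len,
			       char *out, size_t outsz)
{
	size_t pos;
	int n, i;

	if (len <= 0 || len > MAXINSN)
		return NULL;
	n = snprintf(out, outsz, "insn:");
	if (n < 0 || (size_t)n >= outsz)
		return NULL;
	pos = (size_t)n;
	for (i = 0; i < len; i++) {
		n = snprintf(out + pos, outsz - pos, " %02x", buf[i]);
		if (n < 0 || (size_t)n >= outsz - pos)
			return NULL;
		pos += (size_t)n;
	}
	return out;
}

/*
 * Hypothetical walker: print one line per instruction in buf.
 * lens[] is an assumption standing in for the decoder's output.
 * Returns the number of instructions printed, or -1 on error.
 */
static int dump_range(FILE *f, uint64_t start, const uint8_t *buf,
		      int buflen, const int *lens, int nlens)
{
	char out[64];
	int off = 0, i;

	for (i = 0; i < nlens; i++) {
		if (off + lens[i] > buflen)
			break;	/* instruction would run past the buffer */
		if (!format_insn(buf + off, lens[i], out, sizeof(out)))
			return -1;
		fprintf(f, "\t%016" PRIx64 "\t%s\n", start + (uint64_t)off, out);
		off += lens[i];
	}
	return i;
}
```

Called with the 22 bytes from the read_hpet example, a start address of ffffffff9905b82f, and the lengths 2, 2, 2, 3, 3, 4, 2, 2, 2, `dump_range` would print nine lines, one per instruction, at the same addresses as the listing in the commit message.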
23 lines
392 B
C
#ifndef __PERF_DUMP_INSN_H
#define __PERF_DUMP_INSN_H 1

#define MAXINSN 15

#include <linux/types.h>

struct thread;

struct perf_insn {
	/* Initialized by callers: */
	struct thread *thread;
	u8	      cpumode;
	bool	      is64bit;
	int	      cpu;
	/* Temporary */
	char	      out[256];
};

const char *dump_insn(struct perf_insn *x, u64 ip,
		      u8 *inbuf, int inlen, int *lenp);
#endif
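How a caller might use this interface can be sketched as follows. This is an illustration under stated assumptions, not the kernel's code: `dump_insn_sketch` and `toy_insn_len` are hypothetical names, the toy length function recognizes only a few opcodes where the real code uses the x86 instruction decoder, and the `"<bad>"` return for undecodable input is this sketch's choice. The struct is repeated with userspace typedefs so the example compiles on its own.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint8_t u8;
typedef uint64_t u64;

struct thread;	/* opaque here, as in the header */

struct perf_insn {
	/* Initialized by callers: */
	struct thread *thread;
	u8 cpumode;
	bool is64bit;
	int cpu;
	/* Temporary */
	char out[256];
};

/*
 * Toy stand-in for a real decoder (assumption): knows only
 * nop (0x90), ret (0xc3) and the short conditional jumps 0x70-0x7f.
 * Returns the instruction length, or 0 if it cannot decode.
 */
static int toy_insn_len(const u8 *buf, int len)
{
	if (len < 1)
		return 0;
	if (buf[0] == 0x90 || buf[0] == 0xc3)
		return 1;
	if (buf[0] >= 0x70 && buf[0] <= 0x7f)
		return len >= 2 ? 2 : 0;
	return 0;
}

/*
 * Hypothetical sketch with the same signature as dump_insn():
 * formats one instruction into x->out and reports its length via lenp.
 * The returned string is only valid until the next call on the same x.
 */
static const char *dump_insn_sketch(struct perf_insn *x, u64 ip,
				    u8 *inbuf, int inlen, int *lenp)
{
	int len = toy_insn_len(inbuf, inlen);
	size_t pos;
	int i;

	(void)ip;	/* unused in this sketch */
	if (len == 0)
		return "<bad>";
	if (lenp)
		*lenp = len;
	pos = (size_t)snprintf(x->out, sizeof(x->out), "insn:");
	for (i = 0; i < len; i++)
		pos += (size_t)snprintf(x->out + pos, sizeof(x->out) - pos,
					" %02x", inbuf[i]);
	return x->out;
}
```

Keeping the output buffer inside `struct perf_insn` means the caller owns the storage and no allocation is needed per instruction, at the cost that each call overwrites the previous result.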