mainlining shenanigans
8cacac6ecd
Merge tag 'perf-core-for-mingo-5.5-20191122' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf report:

  Jin Yao:

  - Allow entering the annotation view (symbol source/assembly +
    overhead/cycles/etc column) from the 'perf report --total-cycles'
    interface.

    E.g.:

      # perf record --all-cpus --branch-any --all-kernel
      ^C[ perf record: Woken up 5 times to write data ]
      #
      # perf evlist -v
      cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_user: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
      #
      # perf report --total-cycles
      #
      # Samples: 78762 of event 'cycles'
      Sampled  Sampled  Avg      Avg
      Cycles%  Cycles   Cycles%  Cycles  [Program Block Range]                          Shared Object
        1.72%   95.8K    0.00%     254   [msr.h:105 -> msr.h:166]                       [kernel.vmlinux]
        1.56%  107.6K    0.00%     618   [compiler.h:199 -> common.c:301]               [kernel.vmlinux]
        0.83%   46.3K    0.00%     409   [entry_64.S:153 -> entry_64.S:175]             [kernel.vmlinux]
        0.83%   46.1K    0.00%      83   [jump_label.h:41 -> tsc.c:230]                 [kernel.vmlinux]
        0.64%   36.9K    0.01%    1.4K   [hda_intel.c:904 -> hda_intel.c:916]           [snd_hda_intel]
        0.57%   30.2K    0.00%     282   [file.c:710 -> file.c:730]                     [kernel.vmlinux]
        0.48%   25.8K    0.00%      82   [spinlock.c:158 -> spinlock.c:160]             [kernel.vmlinux]
        0.45%   23.7K    0.00%     369   [tick-broadcast.c:585 -> tick-broadcast.c:586] [kernel.vmlinux]
        0.44%   24.4K    0.00%      73   [msr.h:236 -> tsc.c:1088]                      [kernel.vmlinux]
        0.43%   22.7K    0.00%     144   [cpuidle.c:229 -> cpuidle.c:232]               [kernel.vmlinux]

    Then press 'A' or Enter on one of those lines, just like with 'perf top',
    say the top one: [msr.h:105 -> msr.h:166], then this shows up:

      Samples: 78K of event 'cycles', 4000 Hz, Event count (approx.): 78762
      native_write_msr  /lib/modules/5.4.0-rc8/build/vmlinux [Percent: local period]
      Percent│ IPC Cycle (Average IPC: 0.02, IPC Coverage: 50.0%)
             │
             │     Disassembly of section .text:
             │
             │     ffffffff8106c480 <native_write_msr>:
             │     __wrmsr():
             │             return EAX_EDX_VAL(val, low, high);
             │     }
             │
             │     static inline void notrace __wrmsr(unsigned int msr, u32 low, u32 high)
             │     {
             │             asm volatile("1: wrmsr\n"
       49.16 │0.02         mov    %edi,%ecx
             │0.02         mov    %esi,%eax
             │0.02         wrmsr
             │     arch_static_branch():
             │     #include <linux/stringify.h>
             │     #include <linux/types.h>
             │
             │     static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
             │     {
             │             asm_volatile_goto("1:"
        0.79 │0.02         nop
             │     native_write_msr():
             │     {
             │             __wrmsr(msr, low, high);
             │
             │             if (msr_tracepoint_active(__tracepoint_write_msr))
             │                     do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
             │     }
       50.05 │0.02  254  ← retq
             │             do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
             │       shl    $0x20,%rdx
             │       mov    %esi,%esi
             │       or     %rdx,%rsi
             │       xor    %edx,%edx
             │     → jmpq   do_trace_write_msr

    We need to improve this to show the source code line numbers in the
    annotation view, so one can go from that program block to the annotation
    view and see those source code line numbers straight away.

auxtrace/Intel PT:

  Adrian Hunter:

  - Add support for AUX area sampling. This requires new functionality that
    will land in 5.5; it's already in tip. This includes kernel capability
    querying so that it fails gracefully with older kernels, and dumping
    AUX area samples in 'perf report -D' and 'perf script'.

perf.data:

  Alexey Budankov:

  - Fix decompression of PERF_RECORD_COMPRESSED records.

core:

  Arnaldo Carvalho de Melo:

  - Use the 'dcacheline' cmp routine to find the right DSOs taking into
    account the 'maj', 'min', 'ino' and 'ino_generation', that got moved
    from 'struct map' to 'struct dso', where it belongs. This further
    reduces the size of 'struct map'; there is still more work to do to
    maybe get it to max one cacheline.

libtraceevent:

  Hewenliang:

  - Fix memory leakage in copy_filter_type().

  Sudip Mukherjee:

  - Fix header installation.

perf parse:

  Ian Rogers:

  - Fix potential memory leak when handling tracepoint errors, found using
    LLVM's libFuzzer.

perf probe:

  Colin Ian King:

  - Fix spelling mistake "addrees" -> "address".

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
---|---|---|
arch | ||
block | ||
certs | ||
crypto | ||
Documentation | ||
drivers | ||
fs | ||
include | ||
init | ||
ipc | ||
kernel | ||
lib | ||
LICENSES | ||
mm | ||
net | ||
samples | ||
scripts | ||
security | ||
sound | ||
tools | ||
usr | ||
virt | ||
.clang-format | ||
.cocciconfig | ||
.get_maintainer.ignore | ||
.gitattributes | ||
.gitignore | ||
.mailmap | ||
COPYING | ||
CREDITS | ||
Kbuild | ||
Kconfig | ||
MAINTAINERS | ||
Makefile | ||
README |
Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``. The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.