mainlining shenanigans
Go to file
James Clark 0323dea318 perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events()
Currently, timeless mode starts the decode on PERF_RECORD_EXIT, and
non-timeless mode starts decoding on the fist PERF_RECORD_AUX record.

This can cause the "data has no samples!" error if the first
PERF_RECORD_AUX record comes before the first (or any relevant)
PERF_RECORD_MMAP2 record because the mmaps are required by the decoder
to access the binary data.

This change pushes the start of non-timeless decoding to the very end of
parsing the file. The PERF_RECORD_EXIT event can't be used because it
might not exist in system-wide or snapshot modes.

I have not been able to find the exact cause for the events to be
intermittently in the wrong order in the basic scenario:

	perf record -e cs_etm/@tmc_etr0/u top

But it can be made to happen every time with the --delay option. This is
because "enable_on_exec" is disabled, which causes tracing to start
before the process to be launched is exec'd. For example:

	perf record -e cs_etm/@tmc_etr0/u --delay=1 top
	perf report -D | grep 'AUX\|MAP'

	0 16714475632740 0x520 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x30 flags: 0 []
	0 16714476494960 0x5d0 [0x40]: PERF_RECORD_AUX offset: 0x30 size: 0x30 flags: 0 []
	0 16714478208900 0x660 [0x40]: PERF_RECORD_AUX offset: 0x60 size: 0x30 flags: 0 []
	4294967295 16714478293340 0x700 [0x70]: PERF_RECORD_MMAP2 8712/8712: [0x557a460000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
	4294967295 16714478353020 0x770 [0x88]: PERF_RECORD_MMAP2 8712/8712: [0x7f86f72000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so

Another scenario in which decoding from the first aux record fails is a
workload that forks. Although the aux record comes after 'bash', it
comes before 'top', which is what we are interested in. For example:

	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
	perf report -D | grep 'AUX\|MAP'

	4294967295 16853946421300 0x510 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x558f280000(0x142000) @ 0 00:17 5213953 0]: r-xp /usr/bin/bash
	4294967295 16853946543560 0x580 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba6e000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
	4294967295 16853946628420 0x608 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba9e000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
	0 16853947067300 0x690 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x3a60 flags: 0 []
	...
	0 16853966602580 0x1758 [0x40]: PERF_RECORD_AUX offset: 0xc2470 size: 0x30 flags: 0 []
	4294967295 16853967119860 0x1818 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x5559e70000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
	4294967295 16853967181620 0x1888 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed06000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
	4294967295 16853967237180 0x1910 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed36000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]

A third scenario is when the majority of time is spent in a shared
library that is not loaded at startup. For example a dynamically loaded
plugin.

Testing
=======

Testing was done by checking if any samples that are present in the
old output are missing from the new output. Timestamps must be
stripped out with awk because now they are set to the last AUX sample,
rather than the first:

	./perf script $4 | awk '!($4="")' > new.script
	./perf-default script $4 | awk '!($4="")' > default.script
	comm -13 <(sort -u new.script) <(sort -u default.script)

Testing showed that the new output is a superset of the old. When lines
appear in the comm output, it is not because they are missing but
because [unknown] is now resolved to sensible locations. For example
last putp branch here now resolves to libtinfo, so it's not missing
from the output, but is actually improved:

Old:
	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 0 [unknown] ([unknown])
New:
	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 7f8ab39208 putp+0x0 (/lib/libtinfo.so.5.9)

In the following two modes, decoding now works and the "data has no
samples!" error is not displayed any more:

	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
	perf record -e cs_etm/@tmc_etr0/u --delay=1 top

In snapshot mode, there is also an improvement to decoding. Previously
samples for the 'kill' process that was used to send SIGUSR2 were
completely missing, because the process hadn't started yet. But now
there are additional samples present:

	perf record -e cs_etm/@tmc_etr0/u --snapshot -a
	perf script

		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      aaaabb612fb4 [unknown] (/usr/bin/stress)
		  kill 19644 [000] 161627.938153:    1000000    instructions:uH:      ffffae0ef210 [unknown] (/lib/aarch64-linux-gnu/ld-2.27.so)
		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      ffff9e754d40 random_r+0x20 (/lib/aarch64-linux-gnu/libc-2.27.so)

Also tested was the round trip of 'perf inject' followed by 'perf
report' which has the same differences and improvements.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210609130421.13934-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
arch Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
block sched: Change task_struct::state 2021-06-18 11:43:09 +02:00
certs Kbuild updates for v5.13 (2nd) 2021-05-08 10:00:11 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2021-06-28 16:04:56 -07:00
Documentation Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
drivers Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
fs dlm for 5.14 2021-06-29 20:30:00 -07:00
include 38 cifs/smb3 fixes including improvement to fallocate emulation, some DFS fixes. About 1/3 are to address Coverity warnings 2021-06-29 20:18:16 -07:00
init sched: Change task_struct::state 2021-06-18 11:43:09 +02:00
ipc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
kernel Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
lib Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
net Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
samples This was a reasonably active cycle for documentation; this pull includes: 2021-06-28 16:53:05 -07:00
scripts Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
security integrity-v5.14 2021-06-28 16:15:50 -07:00
sound spi: Updates for v5.14 2021-06-28 11:10:20 -07:00
tools perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events() 2021-07-01 16:14:37 -03:00
usr .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
virt Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap m68k updates for v5.14 2021-06-28 14:01:03 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: move Murali Karicheri to credits 2021-04-29 15:47:30 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS printk changes for 5.14 2021-06-29 12:07:18 -07:00
Makefile arm64 updates for 5.14 2021-06-28 14:04:24 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.