linux/tools
James Clark 83d1fc92d4 perf cs-etm: Split Coresight decode by aux records
Populate the auxtrace queues using AUX records rather than whole
auxtrace buffers so that the decoder is reset between each aux record.

This is similar to the auxtrace_queues__process_index() ->
auxtrace_queues__add_indexed_event() flow where
perf_session__peek_event() is used to read AUXTRACE events out of random
positions in the file based on the auxtrace index.

But now we loop over all PERF_RECORD_AUX events instead of AUXTRACE
buffers. For each PERF_RECORD_AUX event, we find the corresponding
AUXTRACE buffer using the index, and add a fragment of that buffer to
the auxtrace queues.
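
The lookup described above can be sketched as follows. This is a minimal
illustration, not the code from the patch: the struct names, field names and
helper functions are hypothetical stand-ins for the real perf structures, and
skipping zero-sized records is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one PERF_RECORD_AUX event. */
struct aux_record {
	uint64_t aux_offset;	/* start of the data in the AUX stream */
	uint64_t aux_size;	/* number of bytes of trace data */
};

/* Hypothetical stand-in for one AUXTRACE buffer found via the index. */
struct auxtrace_buf {
	uint64_t offset;	/* AUX stream offset of the first byte */
	uint64_t size;		/* size of the whole buffer */
	uint64_t file_offset;	/* where the buffer data starts in the file */
};

/* The part of an AUXTRACE buffer that would be queued for decoding. */
struct fragment {
	uint64_t file_offset;
	uint64_t size;
};

/* Fill *out and return true if the AUX record lies entirely inside buf. */
bool aux_fragment(const struct aux_record *aux,
		  const struct auxtrace_buf *buf, struct fragment *out)
{
	assert(aux && buf && out);

	/* Assumption of this sketch: empty records queue nothing. */
	if (aux->aux_size == 0)
		return false;

	if (aux->aux_offset < buf->offset ||
	    aux->aux_offset + aux->aux_size > buf->offset + buf->size)
		return false;

	out->file_offset = buf->file_offset + (aux->aux_offset - buf->offset);
	out->size = aux->aux_size;
	return true;
}

/* Walk the indexed buffers and find the one containing the AUX record. */
bool find_fragment(const struct aux_record *aux,
		   const struct auxtrace_buf *bufs, size_t nr_bufs,
		   struct fragment *out)
{
	for (size_t i = 0; i < nr_bufs; i++) {
		if (aux_fragment(aux, &bufs[i], out))
			return true;
	}
	return false;
}
```

Each fragment found this way would become its own queue entry, which is what
gives the decoder a reset point at every AUX record boundary.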

Nothing else in decoding was changed, apart from how the auxtrace queues
are populated. The result of decoding is identical to before, except in
cases where decoding previously failed completely because the decoder
was not reset.

This change is needed because AUX records are emitted any time tracing
is disabled, for example when the process is scheduled out. Because ETM
was disabled and enabled again, the decoder also needs to be reset to
force the search for a sync packet. Otherwise there would be fatal
decoding errors.
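
To illustrate why the reset matters, here is a toy model. It is not the ETM
protocol or the OpenCSD API: the sync marker value, the decoder struct and all
function names are invented for this sketch. The decoder discards bytes until
it sees a sync marker, and it is reset at the start of every record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SYNC 0xFF	/* hypothetical sync marker, not a real ETM packet */

struct toy_decoder {
	bool synced;	/* true once a sync marker has been seen */
	size_t decoded;	/* payload bytes decoded while in sync */
};

/* Drop the sync state so the next record must start with a sync search. */
void toy_reset(struct toy_decoder *d)
{
	assert(d != NULL);
	d->synced = false;
}

/* Decode one record: bytes before the first sync marker are discarded. */
void toy_decode(struct toy_decoder *d, const uint8_t *data, size_t len)
{
	assert(d != NULL);
	for (size_t i = 0; i < len; i++) {
		if (data[i] == TOY_SYNC) {
			d->synced = true;
			continue;
		}
		if (d->synced)
			d->decoded++;
	}
}

/* Decode several records, resetting the decoder before each one. */
size_t toy_decode_records(const uint8_t *const *recs, const size_t *lens,
			  size_t nr)
{
	struct toy_decoder d = { .synced = false, .decoded = 0 };

	for (size_t i = 0; i < nr; i++) {
		toy_reset(&d);
		toy_decode(&d, recs[i], lens[i]);
	}
	return d.decoded;
}
```

Without the call to toy_reset(), the bytes at the start of a second record
would be interpreted as a continuation of the first. In this toy model that
is the analogue of the fatal decoding errors described above.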

Testing
=======

Testing was done with the following script, to diff the decoding results
between the patched and un-patched versions of perf:

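	#!/bin/bash
	# Usage: compare.sh <patched perf> <default perf> <perf.data> [itrace option]
	set -ex

	# $4 is left unquoted so that it expands to nothing when omitted
	"$1" script -i "$3" $4 > split.script
	"$2" script -i "$3" $4 > default.script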

	diff split.script default.script | head -n 20

And it was run like this, with various itrace options depending on the
quantity of synthesised events:

	compare.sh ./perf-patched ./perf-default perf-per-cpu-2-threads.data --itrace=i100000ns

No changes in output were observed in the following scenarios:

* Simple per-cpu
	perf record -e cs_etm/@tmc_etr0/u top

* Per-thread, single thread
	perf record -e cs_etm/@tmc_etr0/u --per-thread ./threads_C

* Per-thread multiple threads (but only one thread collected data):
	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597

* Per-thread multiple threads (both threads collected data):
	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597

* Per-cpu explicit threads:
	perf record -e cs_etm/@tmc_etr0/u --pid 853,854

* System-wide (per-cpu):
	perf record -e cs_etm/@tmc_etr0/u -a

* No data collected (no aux buffers)
	Can happen with any command when run for a short period

* Containing truncated records
	Can happen with any command

* Containing aux records with 0 size
	Can happen with any command

* Snapshot mode (various files with and without buffer wrap)
	perf record -e cs_etm/@tmc_etr0/u -a --snapshot

Some differences were observed in the following scenario:

* Snapshot mode (with duplicate buffers)
	perf record -e cs_etm/@tmc_etr0/u -a --snapshot

Fewer samples are generated in snapshot mode if duplicate buffers
were gathered, because buffers with the same offset are now only added
once. The results differ but are more correct, and duplicate data is no
longer decoded.
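
The "added once per offset" behaviour can be pictured with the sketch below.
It is only an assumption about the shape of the logic, not the patch itself:
the queue struct, its fixed capacity and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUED 64	/* arbitrary capacity chosen for this sketch */

/* Hypothetical record of the buffer offsets that were already queued. */
struct seen_offsets {
	uint64_t offsets[MAX_QUEUED];
	size_t nr;
};

/*
 * Return true if the offset was newly recorded, false if a buffer with
 * the same offset was already queued (or the sketch's table is full).
 */
bool queue_offset_once(struct seen_offsets *q, uint64_t offset)
{
	assert(q != NULL);

	for (size_t i = 0; i < q->nr; i++) {
		if (q->offsets[i] == offset)
			return false;	/* duplicate buffer: skip it */
	}
	if (q->nr == MAX_QUEUED)
		return false;

	q->offsets[q->nr++] = offset;
	return true;
}
```

A caller would queue a buffer for decoding only when this returns true, so a
snapshot that captured the same buffer twice is decoded once.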

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210624164303.28632-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 14:42:36 -03:00
accounting
arch tools headers UAPI: Sync files changed by the memfd_secret new syscall 2021-07-14 10:05:35 -03:00
bootconfig bootconfig: Share the checksum function with tools 2021-06-10 13:41:26 -04:00
bpf Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-06-29 15:45:27 -07:00
build tools build: Fix quiet cmd indentation 2021-05-17 12:10:03 +09:00
cgroup tools/cgroup/slabinfo.py: updated to work on current kernel 2021-04-23 14:42:40 -07:00
debugging tools: Fix "the the" in a message in kernel-chktaint 2021-06-13 17:01:17 -06:00
edid
firewire
firmware
gpio tools: gpio-utils: fix various kernel-doc warnings 2021-03-26 14:56:19 +01:00
hv
iio iio: event_monitor: Enable events before monitoring 2021-03-25 19:13:52 +00:00
include tools headers: Remove broken definition of __LITTLE_ENDIAN 2021-07-14 14:39:36 -03:00
io_uring
kvm/kvm_stat tools/kvm_stat: Fix documentation typo 2021-05-07 06:06:22 -04:00
laptop
leds
lib libperf: Add tests for perf_evlist__set_leader() 2021-07-09 15:34:41 -03:00
memory-model tools/memory-model: Fix smp_mb__after_spinlock() spelling 2021-05-10 16:27:20 -07:00
objtool A single ELF format fix for a section flags mismatch bug that breaks 2021-06-28 11:35:55 -07:00
pci
pcmcia
perf perf cs-etm: Split Coresight decode by aux records 2021-07-14 14:42:36 -03:00
power tools/power/x86/intel-speed-select: v1.10 release 2021-06-18 15:29:32 +02:00
rcu tools/rcu: Add drgn script to dump number of RCU callbacks 2021-05-10 15:39:19 -07:00
scripts tools build: Fix quiet cmd indentation 2021-05-17 12:10:03 +09:00
spi spi: tools: make a symbolic link to the header file spi.h 2021-04-22 16:30:39 +01:00
testing Tracing fix for histograms and a clean up in ftrace 2021-07-09 11:15:09 -07:00
thermal/tmon tools: do not include scripts/Kbuild.include 2021-04-25 05:26:13 +09:00
time
tracing tools/latency-collector: Remove unneeded semicolon 2021-03-18 12:58:26 -04:00
usb treewide: remove editor modelines and cruft 2021-05-07 00:26:34 -07:00
virtio
vm tools/vm/page_owner_sort.c: check malloc() return 2021-06-29 10:53:47 -07:00
wmi
Makefile