linux

Author	SHA1	Message	Date
Mark Rutland	a12c72cc3e	arm: perf: factor out xscale pmu driver Now that the core arm perf code maintains no global state and all microarchitecture-specific PMU data can be fed in through the shared probe function, it's possible to use it as a library and get rid of the C file includes we have currently. This patch factors out the xscale-specific portions out into the xscale driver. For the moment this is always built if perf event support is enabled, but the preprocessor guards will leave behind an empty file. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2015-05-28 16:54:09 +01:00
Mark Rutland	1167925086	arm: perf: kill get_hw_events() Now that the arm pmu code is limited to CPU PMUs the get_hw_events() function is superfluous, as we'll always have a set of per-cpu pmu_hw_events structures. This patch removes the get_hw_events() function, replacing it with a percpu hw_events pointer. Uses of get_hw_events are updated to use this_cpu_ptr. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-10-30 12:17:00 +00:00
Mark Rutland	3d1ff755e3	arm: perf: clean up PMU names The perf userspace tools can't handle dashes or spaces in PMU names, which conflicts with the current naming scheme in the arm perf backend. This prevents these PMUs from being accessed by name from the perf tools. Additionally the ARMv6 pmus are named "v6", which does not fully distinguish them in the sys/bus/event_source namespace. This patch renames the PMUs consistently to a lower case form with underscores, e.g. "armv6_1176", "armv7_cortex_a9". This is both readily accepted by today's perf tool, and far easier to type than the (apparently unused) convention in use previously. The OProfile name conversion code is updated to handle this. Due to a copy-paste error involving two "xscale1" entries, "xscale2" has never been matched by the name OProfile name mapping. While we're updating names, this is corrected. Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Christopher Covington <cov@codeaurora.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> [sachin: fixed missing semicolons in armv6 backend] Signed-off-by: Sachin Kamat <sachin.kamat@samsung.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-07-02 15:48:25 +01:00
Mark Rutland	f929f5759f	arm: perf: xscale: condense event maps Now that we have macros for declaring fully invalid event maps, put them to work for the XScale PMU event maps. While this necessitates repeating common indices, we no longer need to refer to *_UNSUPPORTED events at all, and it makes it possible for the even maps to fit on a single page on a reasonably sized monitor. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2014-07-02 15:48:22 +01:00
Will Deacon	40c390c768	ARM: perf: don't pretend to support counting of L1I writes ARM has a harvard cache architecture and cannot write directly to the I-side. This patch removes the L1I write events from the cache map (which previously returned read events in many cases). Reported-by: Mike Williams <michael.williams@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2013-01-16 12:01:59 +00:00
Greg Kroah-Hartman	351a102dbf	ARM: drivers: remove __dev* attributes. CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitdata, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-03 15:57:04 -08:00
Sudeep KarkadaNagesha	ed6f2a5223	ARM: perf: consistently use struct perf_event in arm_pmu functions The arm_pmu functions have wildly varied parameters which can often be derived from struct perf_event. This patch changes the arm_pmu function prototypes so that struct perf_event pointers are passed in preference to fields that can be derived from the event. Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2012-11-09 11:37:25 +00:00
Sudeep KarkadaNagesha	513c99ce4e	ARM: perf: allocate CPU PMU dynamically at probe time Supporting multiple, heterogeneous CPU PMUs requires us to allocate the arm_pmu structures dynamically as the devices are probed. This patch removes the static structure definitions for each CPU PMU type and instead passes pointers to the PMU-specific init functions. Signed-off-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2012-11-09 11:37:25 +00:00
Will Deacon	6dbc002970	ARM: perf: prepare for moving CPU PMU code into separate file The CPU PMU code is tightly coupled with generic ARM PMU handling code. This makes it cumbersome when trying to add support for other ARM PMUs (e.g. interconnect, L2 cache controller, bus) as the generic parts of the code are not readily reusable. This patch cleans up perf_event.c so that reusable code is exposed via header files to other potential PMU drivers. The CPU code is consistently named to identify it as such and also to prepare for moving it into a separate file. Signed-off-by: Will Deacon <will.deacon@arm.com>	2012-08-23 11:35:52 +01:00
Will Deacon	04236f9fe0	ARM: perf: probe devicetree in preference to current CPU The CPU PMU is probed using the current cpuid information as part of the early_initcall initialising the architecture perf backend. For architectures without NMI (such as ARM), this does not need to be performed early and can be deferred to the driver probe callback. This also allows us to probe the devicetree in preference to parsing the current cpuid, which may be invalid on a big.LITTLE multi-cluster system. This patch defers the PMU probing and uses the devicetree information when available. Signed-off-by: Will Deacon <will.deacon@arm.com>	2012-08-23 11:35:52 +01:00
Will Deacon	4295b898f5	ARM: 7448/1: perf: remove arm_perf_pmu_ids global enumeration In order to provide PMU name strings compatible with the OProfile user ABI, an enumeration of all PMUs is currently used by perf to identify each PMU uniquely. Unfortunately, this does not scale well in the presence of multiple PMUs and creates a single, global namespace across all PMUs in the system. This patch removes the enumeration and instead uses the name string for the PMU to map onto the OProfile variant. perf_pmu_name is implemented for CPU PMUs, which is all that OProfile cares about anyway. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-07-09 17:41:10 +01:00
Robert Richter	fd0d000b2c	perf: Pass last sampling period to perf_sample_data_init() We always need to pass the last sample period to perf_sample_data_init(), otherwise the event distribution will be wrong. Thus, modifiyng the function interface with the required period as argument. So basically a pattern like this: perf_sample_data_init(&data, ~0ULL); data.period = event->hw.last_period; will now be like that: perf_sample_data_init(&data, ~0ULL, event->hw.last_period); Avoids unininitialized data.period and simplifies code. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-09 15:23:12 +02:00
Will Deacon	3f31ae1213	ARM: 7357/1: perf: fix overflow handling for xscale2 PMUs xscale2 PMUs indicate overflow not via the PMU control register, but by a separate overflow FLAG register instead. This patch fixes the xscale2 PMU code to use this register to detect to overflow and ensures that we clear any pending overflow when disabling a counter. Cc: <stable@vger.kernel.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-03-07 09:40:49 +00:00
Will Deacon	f6f5a30c83	ARM: 7356/1: perf: check that we have an event in the PMU IRQ handlers The PMU IRQ handlers in perf assume that if a counter has overflowed then perf must be responsible. In the paranoid world of crazy hardware, this could be false, so check that we do have a valid event before attempting to dereference NULL in the interrupt path. Cc: <stable@vger.kernel.org> Signed-off-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-03-07 09:40:49 +00:00
Will Deacon	5727347180	ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode On ARM, the PMU does not stop counting after an overflow and therefore IRQ latency affects the new counter value read by the kernel. This is significant for non-sampling runs where it is possible for the new value to overtake the previous one, causing the delta to be out by up to max_period events. Commit `a737823d` ("ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency") attempted to fix this problem by allowing interrupt handlers to pass an overflow flag to the event update function, causing the overflow calculation to assume that the counter passed through zero when going from prev to new. Unfortunately, this doesn't work when overflow occurs on the perf_task_tick path because we have the flag cleared and end up computing a large negative delta. This patch removes the overflow flag from armpmu_event_update and instead limits the sample_period to half of the max_period for non-sampling profiling runs. Cc: <stable@vger.kernel.org> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-03-07 09:40:48 +00:00
Will Deacon	0445e7a58e	ARM: perf: add support for stalled cycle ABI events Commit `8f622422` ("perf events: Add generic front-end and back-end stalled cycle event definitions") added two new ABI events for counting stalled cycles. This patch adds support for these new events to the ARM perf implementation. Cc: Jamie Iles <jamie@jamieiles.com> Cc: Jean Pihet <j-pihet@ti.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2011-12-02 15:16:16 +00:00
Mark Rutland	8be3f9a238	ARM: perf: remove cpu-related misnomers Currently struct cpu_hw_events stores data on events running on a PMU associated with a CPU. As this data is general enough to be used for system PMUs, this name is a misnomer, and may cause confusion when it is used for system PMUs. Additionally, 'armpmu' is commonly used as a parameter name for an instance of struct arm_pmu. The name is also used for a global instance which represents the CPU's PMU. As cpu_hw_events is now not tied to CPU PMUs, it is renamed to pmu_hw_events, with instances of it renamed similarly. As the global 'armpmu' is CPU-specfic, it is renamed to cpu_pmu. This should make it clearer which code is generic, and which is coupled with the CPU. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2011-08-31 10:50:12 +01:00
Mark Rutland	e1f431b57e	ARM: perf: refactor event mapping Currently mapping an event type to a hardware configuration value depends on the data being pointed to from struct arm_pmu. These fields (cache_map, event_map, raw_event_mask) are currently specific to CPU PMUs, and do not serve the general case well. This patch replaces the event map pointers on struct arm_pmu with a new 'map_event' function pointer. Small shim functions are used to reuse the existing common code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2011-08-31 10:50:09 +01:00
Mark Rutland	0f78d2d5cc	ARM: perf: lock PMU registers per-CPU Currently, a single lock serialises access to CPU PMU registers. This global locking is unnecessary as PMU registers are local to the CPU they monitor. This patch replaces the global lock with a per-CPU lock. As the lock is in struct cpu_hw_events, PMUs providing a single cpu_hw_events instance can be locked globally. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2011-08-31 10:50:07 +01:00
Mark Rutland	c47f8684ba	ARM: perf: remove active_mask Currently, pmu_hw_events::active_mask is used to keep track of which events are active in hardware. As we can stop counters and their interrupts, this is unnecessary. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>	2011-08-31 10:50:02 +01:00
Will Deacon	d2b41f7456	ARM: perf: index Xscale and ARMv6 event counters starting from zero Now that the ARMv7 PMU backend indexes event counters from zero, follow suit and do the same for ARMv6 and Xscale. Acked-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Jean Pihet <j-pihet@ti.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2011-08-31 10:18:00 +01:00
Mark Rutland	a6c93afed3	ARM: perf: de-const struct arm_pmu This patch removes const qualifiers from instances of struct arm_pmu, and functions initialising them, in preparation for generalising arm_pmu usage to system (AKA uncore) PMUs. This will allow for dynamically modifiable structures (locks, struct pmu) to be added as members of struct arm_pmu. Acked-by: Jamie Iles <jamie@jamieiles.com> Reviewed-by: Jean Pihet <j-pihet@ti.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2011-08-31 10:16:58 +01:00
Peter Zijlstra	89d6c0b5bd	perf, arch: Add generic NODE cache events Add a NODE level to the generic cache events which is used to measure local vs remote memory accesses. Like all other cache events, an ACCESS is HIT+MISS, if there is no way to distinguish between reads and writes do reads only etc.. The below needs filling out for !x86 (which I filled out with unsupported events). I'm fairly sure ARM can leave it like that since it doesn't strike me as an architecture that even has NUMA support. SH might have something since it does appear to have some NUMA bits. Sparc64, PowerPC and MIPS certainly want a good look there since they clearly are NUMA capable. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Cc: Anton Blanchard <anton@samba.org> Cc: David Daney <ddaney@caviumnetworks.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:38 +02:00
Peter Zijlstra	a8b0ca17b8	perf: Remove the nmi parameter from the swevent and overflow interface The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:35 +02:00
Will Deacon	a737823d37	ARM: 6835/1: perf: ensure overflows aren't missed due to IRQ latency If a counter overflows during a perf stat profiling run it may overtake the last known value of the counter: 0 prev new 0xffffffff \|----------\|-------\|----------------------\| In this case, the number of events that have occurred is (0xffffffff - prev) + new. Unfortunately, the event update code will not realise an overflow has occurred and will instead report the event delta as (new - prev) which may be considerably smaller than the real count. This patch adds an extra argument to armpmu_event_update which indicates whether or not an overflow has occurred. If an overflow has occurred then we use the maximum period of the counter to calculate the elapsed events. Acked-by: Jamie Iles <jamie@jamieiles.com> Reported-by: Ashwin Chaugule <ashwinc@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2011-03-26 10:06:09 +00:00
Will Deacon	961ec6daa7	ARM: 6521/1: perf: use raw_spinlock_t for pmu_lock For kernels built with PREEMPT_RT, critical sections protected by standard spinlocks are preemptible. This is not acceptable on perf as (a) we may be scheduled onto a different CPU whilst reading/writing banked PMU registers and (b) the latency when reading the PMU registers becomes unpredictable. This patch upgrades the pmu_lock spinlock to a raw_spinlock instead. Reported-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2010-12-04 11:18:08 +00:00
Will Deacon	4d6b7a779b	ARM: 6512/1: perf: fix warnings generated by sparse Russell reported a number of warnings coming from sparse when checking the ARM perf_event.c files: \| perf_event.c seems to also have problems too: \| \| CHECK arch/arm/kernel/perf_event.c \| arch/arm/kernel/perf_event.c:37:1: warning: symbol 'pmu_lock' was not declared. Should it be static? \| arch/arm/kernel/perf_event.c:70:1: warning: symbol 'cpu_hw_events' was not declared. Should it be static? \| arch/arm/kernel/perf_event.c:1006:1: warning: symbol 'armv6pmu_enable_event' was not declared. Should it be static? \| arch/arm/kernel/perf_event.c:1113:1: warning: symbol 'armv6pmu_stop' was not declared. Should it be static? \| arch/arm/kernel/perf_event.c:1956:6: warning: symbol 'armv7pmu_enable_event' was not declared. Should it be static? \| arch/arm/kernel/perf_event.c:3072:14: warning: incorrect type in argument 1 (different address spaces) \| arch/arm/kernel/perf_event.c:3072:14: expected void const volatile [noderef] <asn:1><noident> \| arch/arm/kernel/perf_event.c:3072:14: got struct frame_tail tail \| arch/arm/kernel/perf_event.c:3074:49: warning: incorrect type in argument 2 (different address spaces) \| arch/arm/kernel/perf_event.c:3074:49: expected void const [noderef] <asn:1>from \| arch/arm/kernel/perf_event.c:3074:49: got struct frame_tail tail This patch resolves these issues so we can live in silence again. Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2010-12-04 11:17:44 +00:00
Will Deacon	43eab87828	ARM: perf: separate PMU backends into multiple files The ARM perf_event.c file contains all PMU backends and, as new PMUs are introduced, will continue to grow. This patch follows the example of x86 and splits the PMU implementations into separate files which are then #included back into the main file. Compile-time guards are added to each PMU file to avoid compiling in code that is not relevant for the version of the architecture which we are targetting. Acked-by: Jean Pihet <j-pihet@ti.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2010-11-25 16:52:08 +00:00

28 Commits