Leverage the commit for ARM by Will Deacon:
- 446a5a8b1e
ARM: 6205/1: perf: ensure counter delta is treated as unsigned
Hardware performance counters on ARM are 32-bits wide but atomic64_t
variables are used to represent counter data in the hw_perf_event structure.
The armpmu_event_update function right-shifts a signed 64-bit delta variable
and adds the result to the event count. This can lead to shifting in sign-bits
if the MSB of the 32-bit counter value is set. This results in perf output
such as:
Performance counter stats for 'sleep 20':
18446744073460670464 cycles <-- 0xFFFFFFFFF12A6000
7783773 instructions # 0.000 IPC
465 context-switches
161 page-faults
1172393 branches
20.154242147 seconds time elapsed
This patch ensures that the delta value is treated as unsigned so that the
right shift sets the upper bits to zero.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2015/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is the MIPS part of the following commits by Frederic Weisbecker:
- f72c1a931e
perf: Factorize callchain context handling
Store the kernel and user contexts from the generic layer instead
of archs, this gathers some repetitive code.
- 56962b4449
perf: Generalize some arch callchain code
- Most archs use one callchain buffer per cpu, except x86 that needs
to deal with NMIs. Provide a default perf_callchain_buffer()
implementation that x86 overrides.
- Centralize all the kernel/user regs handling and invoke new arch
handlers from there: perf_callchain_user() / perf_callchain_kernel()
That avoid all the user_mode(), current->mm checks and so...
- Invert some parameters in perf_callchain_*() helpers: entry to the
left, regs to the right, following the traditional (dst, src).
- 70791ce9ba
perf: Generalize callchain_store()
callchain_store() is the same on every archs, inline it in
perf_event.h and rename it to perf_callchain_store() to avoid
any collision.
This removes repetitive code.
- c1a65932fd
perf: Drop unappropriate tests on arch callchains
Drop the TASK_RUNNING test on user tasks for callchains as
this check doesn't seem to make any sense.
Also remove the tests for !current that is not supposed to
happen and current->pid as this should be handled at the
generic level, with exclude_idle attribute.
Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2014/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is the MIPS part of the following commits by Peter Zijlstra:
- a4eaf7f146
perf: Rework the PMU methods
Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.
The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.
This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).
It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).
The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:
1) We disable the counter:
a) the pmu has per-counter enable bits, we flip that
b) we program a NOP event, preserving the counter state
2) We store the counter state and ignore all read/overflow events
For MIPSXX, the stopped state is implemented in the way of 1.b as above.
- 33696fc0d1
perf: Per PMU disable
Changes perf_disable() into perf_pmu_disable().
- 24cd7f54a0
perf: Reduce perf_disable() usage
Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.
- b0a873ebbf
perf: Register PMU implementations
Simple registration interface for struct pmu, this provides the
infrastructure for removing all the weak functions.
- 51b0fe3954
perf: Deconstify struct pmu
sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`
Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Cc: ddaney@caviumnetworks.com
Patchwork: http://patchwork.linux-mips.org/patch/2012/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch adds the mipsxx Perf-events support based on the skeleton.
Generic hardware events and cache events are now fully implemented for
the 24K/34K/74K/1004K cores. To support other cores in mipsxx (such as
R10000/SB1), the generic hardware event tables and cache event tables
need to be filled out. To support other CPUs which have different PMU
than mipsxx, such as RM9000 and LOONGSON2, the additional files
perf_event_$cpu.c need to be created.
Raw event is an important part of Perf-events. It helps the user collect
performance data for events that are not listed as the generic hardware
events and cache events but ARE supported by the CPU's PMU.
This patch also adds this feature for mipsxx 24K/34K/74K/1004K. For how to
use it, please refer to processor core software user's manual and the
comments for mipsxx_pmu_map_raw_event() for more details.
Please note that this is a "precise" implementation, which means the
kernel will check whether the requested raw events are supported by this
CPU and which hardware counters can be assigned for them.
To test the functionality of Perf-event, you may want to compile the tool
"perf" for your MIPS platform. You can refer to the following URL:
http://www.linux-mips.org/archives/linux-mips/2010-10/msg00126.html
You also need to customize the CFLAGS and LDFLAGS in tools/perf/Makefile
for your libs, includes, etc.
In case you encounter the boot failure in SMVP kernel on multi-threading
CPUs, you may take a look at:
http://www.linux-mips.org/git?p=linux-mti.git;a=commitdiff;h=5460815027d802697b879644c74f0e8365254020
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: linux-mips@linux-mips.org
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: jamie.iles@picochip.com
Cc: ddaney@caviumnetworks.com
Cc: matt@console-pimps.org
Patchwork: https://patchwork.linux-mips.org/patch/1689/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
create mode 100644 arch/mips/kernel/perf_event_mipsxx.c
This patch provides the skeleton of the HW perf event support. To enable
this feature, we can not choose the SMTC kernel; Oprofile should be
disabled; kernel performance events be selected. Then we can enable it in
Kernel type menu.
Oprofile for MIPS platforms initializes irq at arch init time. Currently
we do not change this logic to allow PMU reservation.
If a platform has EIC, we can use the irq base and perf counter irq offset
defines for the interrupt controller in specific init_hw_perf_events().
Based on this skeleton patch, the 3 different kinds of MIPS PMU, namely,
mipsxx/loongson2/rm9000, can be supported by adding corresponding lower
level C files at the bottom. The suggested names of these files are
perf_event_mipsxx.c/perf_event_loongson2.c/perf_event_rm9000.c. So, for
example, we can do this by adding "#include perf_event_mipsxx.c" at the
bottom of perf_event.c.
In addition, PMUs with 64bit counters are also considered in this patch.
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: linux-mips@linux-mips.org
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: jamie.iles@picochip.com
Cc: ddaney@caviumnetworks.com
Cc: matt@console-pimps.org
Patchwork: https://patchwork.linux-mips.org/patch/1688/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>