linux/arch/x86
Peter Zijlstra 9536c8d2da perf/x86: Optimize intel_pmu_pebs_fixup_ip()
There's been reports of high NMI handler overhead, highlighted by
such kernel messages:

  [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000
  [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs

Don Zickus analyzed the source of the overhead and reported:

 > While there are a few places that are causing latencies, for now I focused on
 > the longest one first.  It seems to be 'copy_user_from_nmi'
 >
 > intel_pmu_handle_irq ->
 >	intel_pmu_drain_pebs_nhm ->
 >		__intel_pmu_drain_pebs_nhm ->
 >			__intel_pmu_pebs_event ->
 >				intel_pmu_pebs_fixup_ip ->
 >					copy_from_user_nmi
 >
 > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of
 > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles
 > (there are some cases where only 10 iterations are needed to go that high
 > too, but in generall over 50 or so).  At this point copy_user_from_nmi
 > seems to account for over 90% of the nmi latency.

The solution to that is to avoid having to call copy_from_user_nmi() for
every instruction.

Since we already limit the max basic block size, we can easily
pre-allocate a piece of memory to copy the entire thing into in one
go.

Don reported this test result:

 > Your patch made a huge difference in improvement.  The
 > copy_from_user_nmi() no longer hits the million of cycles.  I still
 > have a batch of 100,000-300,000 cycles.  My longest NMI paths used
 > to be dominated by copy_from_user_nmi, now it is not (I have to dig
 > up the new hot path).

Reported-and-tested-by: Don Zickus <dzickus@redhat.com>
Cc: jmario@redhat.com
Cc: acme@infradead.org
Cc: dave.hansen@linux.intel.com
Cc: eranian@google.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-16 15:44:00 +02:00
..
boot Merge branch 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-04 09:38:10 -07:00
configs x86, platform, kvm, kconfig: Turn existing .config's into KVM-capable configs 2013-05-28 12:11:32 +02:00
crypto Reinstate "crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform framework" 2013-09-07 12:56:26 +10:00
ia32 Merge branch 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-04 11:08:32 -07:00
include compiler/gcc4: Add quirk for 'asm goto' miscompilation bug 2013-10-11 07:39:14 +02:00
kernel perf/x86: Optimize intel_pmu_pebs_fixup_ip() 2013-10-16 15:44:00 +02:00
kvm KVM: nVMX: fix shadow on EPT 2013-10-10 11:39:57 +02:00
lguest lguest: fix GPF in guest when using gdb. 2013-09-06 08:09:28 +09:30
lib Merge branch 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-04 11:08:32 -07:00
math-emu
mm x86: finish user fault error path with fatal signal 2013-09-12 15:38:01 -07:00
net x86: bpf_jit_comp: secure bpf jit against spraying attacks 2013-05-19 23:55:41 -07:00
oprofile oprofilefs_create_...() do not need superblock argument 2013-09-03 22:52:48 -04:00
pci Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero" 2013-10-04 16:15:29 -06:00
platform x86, efi: Don't map Boot Services on i386 2013-09-18 14:42:33 +01:00
power x86, asmlinkage, power: Make various symbols used by the suspend asm code visible 2013-08-06 14:21:03 -07:00
realmode x86, relocs: Refactor the relocs tool to merge 32- and 64-bit ELF 2013-04-16 16:02:58 -07:00
syscalls unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE 2013-05-09 13:46:38 -04:00
tools Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-04 08:39:38 -07:00
um um: prctl: Do not include linux/ptrace.h 2013-09-07 10:57:11 +02:00
vdso remove sched notifier for cross-cpu migrations 2013-07-18 12:29:30 +02:00
video
xen Bug-fixes: 2013-09-25 15:50:53 -07:00
.gitignore
Kbuild
Kconfig x86, build, pci: Fix PCI_MSI build on !SMP 2013-10-04 10:43:34 -07:00
Kconfig.cpu
Kconfig.debug Merge branch 'kconfig-diet' from Dave Hansen 2013-07-04 11:25:51 -07:00
Makefile x86, relocs: Move ELF relocation handling to C 2013-08-07 21:00:04 -07:00
Makefile_32.cpu
Makefile.um