linux/arch
Don Zickus 13beacee81 perf/x86/p4: Fix counter corruption when using lots of perf groups
On a P4 box stressing perf with:

   ./perf record -o perf.data ./perf stat -v ./perf bench all

it was noticed that a slew of unknown NMIs would pop out rather quickly.

Painfully debugging this ancient platform, led me to notice cross cpu counter
corruption.

The P4 machine is special in that it has 18 counters, half are used for cpu0
and the other half is for cpu1 (or all 18 if hyperthreading is disabled).  But
the splitting of the counters has to be actively managed by the software.

In this particular bug, one of the cpu0 specific counters was being used by
cpu1 and caused all sorts of random unknown nmis.

I am not entirely sure on the corruption path, but what happens is:

 o perf schedules a group with p4_pmu_schedule_events()
 o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused
   but for a different cpu, so it 'swaps' the config bits and returns the
   updated 'assign' array with a _new_ index.
 o perf schedules another group with p4_pmu_schedule_events()
 o inside p4_pmu_schedule_events(), it notices an hwc pointer is being reused
   (the same one as above) but for the _same_ cpu [BUG!!], so it updates the
   'assign' array to use the _old_ (wrong cpu) index because the _new_ index is in
   an earlier part of the 'assign' array (and hasn't been committed yet).
 o perf commits the transaction using the wrong index and corrupts the other cpu

The [BUG!!] is because the 'hwc->config' is updated but not the 'hwc->idx'.  So
the check for 'p4_should_swap_ts()' is correct the first time around but
incorrect the second time around (because hwc->config was updated in between).

I think the spirit of perf was to not modify anything until all the
transactions had a chance to 'test' if they would succeed, and if so, commit
atomically.  However, P4 breaks this spirit by touching the hwc->config
element.

So my fix is to continue the un-perf like breakage, by assigning hwc->idx to -1
on swap to tell follow up group scheduling to find a new index.

Of course if the transaction fails rolling this back will be difficult, but
that is not different than how the current code works. :-)  And I wasn't sure
how much effort to cleanup the code I should do for a platform that is almost
10 years old by now.

Hence the lazy fix.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1391024270-19469-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-02-09 13:17:23 +01:00
..
alpha alpha: fix broken network checksum 2014-01-31 09:21:55 -08:00
arc Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2014-01-30 16:58:05 -08:00
arm ARM: SoC fixes for 3.14-rc1 2014-02-02 11:11:06 -08:00
arm64 arm64: defconfig: Expand default enabled features 2014-02-07 17:17:28 +00:00
avr32 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
blackfin Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2014-01-31 09:31:14 -08:00
c6x lib: Add missing arch generic-y entries for asm-generic/hash.h 2013-12-17 21:26:19 -05:00
cris CRIS correction for 3.14 2014-01-28 09:01:14 -08:00
frv Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2014-01-30 16:58:05 -08:00
hexagon Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
ia64 [IA64] Wire up new sched_setattr and sched_getattr syscalls 2014-01-28 09:52:53 -08:00
m32r Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
m68k Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
metag Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
microblaze Microblaze patches for 3.14-rc1 2014-01-28 09:04:11 -08:00
mips MIPS: fpu.h: Fix build when CONFIG_BUG is not set 2014-02-06 13:42:43 +01:00
mn10300 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2014-01-30 16:58:05 -08:00
openrisc OpenRISC updates for 3.14 2014-01-30 17:08:41 -08:00
parisc execve: use 'struct filename *' for executable name passing 2014-02-05 12:54:53 -08:00
powerpc Second batch of KVM updates. Some minor x86 fixes, 2014-01-31 08:37:32 -08:00
s390 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2014-02-05 15:51:42 -08:00
score Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2014-01-31 09:31:14 -08:00
sh Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-01-25 11:17:34 -08:00
sparc sparc: Hook up sched_setattr and sched_getattr syscalls. 2014-01-29 00:45:06 -08:00
tile tile: remove compat_sys_lookup_dcookie declaration to fix compile error 2014-02-01 10:55:15 -08:00
um Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2014-01-26 11:06:16 -08:00
unicore32 arch/unicore32/kernel/early_printk.c:setup_early_printk: missing initialization 2014-01-27 21:02:39 -08:00
x86 perf/x86/p4: Fix counter corruption when using lots of perf groups 2014-02-09 13:17:23 +01:00
xtensa Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block 2014-01-30 11:19:05 -08:00
.gitignore
Kconfig stackprotector: Introduce CONFIG_CC_STACKPROTECTOR_STRONG 2013-12-20 09:38:40 +01:00