linux/arch/x86/kernel/cpu
HATAYAMA Daisuke b292d7a104 perf/x86/intel: ignore CondChgd bit to avoid false NMI handling
Currently, any NMI is falsely handled by a NMI handler of NMI watchdog
if CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR is set.

For example, we use external NMI to make system panic to get crash
dump, but in this case, the external NMI is falsely handled do to the
issue.

This commit deals with the issue simply by ignoring CondChgd bit.

Here is explanation in detail.

On x86 NMI watchdog uses performance monitoring feature to
periodically signal NMI each time performance counter gets overflowed.

intel_pmu_handle_irq() is called as a NMI_LOCAL handler from a NMI
handler of NMI watchdog, perf_event_nmi_handler(). It identifies an
owner of a given NMI by looking at overflow status bits in
MSR_CORE_PERF_GLOBAL_STATUS MSR. If some of the bits are set, then it
handles the given NMI as its own NMI.

The problem is that the intel_pmu_handle_irq() doesn't distinguish
CondChgd bit from other bits. Unlike the other status bits, CondChgd
bit doesn't represent overflow status for performance counters. Thus,
CondChgd bit cannot be thought of as a mark indicating a given NMI is
NMI watchdog's.

As a result, if CondChgd bit is set, any NMI is falsely handled by the
NMI handler of NMI watchdog. Also, if type of the falsely handled NMI
is either NMI_UNKNOWN, NMI_SERR or NMI_IO_CHECK, the corresponding
action is never performed until CondChgd bit is cleared.

I noticed this behavior on systems with Ivy Bridge processors: Intel
Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both systems,
CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR has already been set
in the beginning at boot. Then the CondChgd bit is immediately cleared
by next wrmsr to MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to remain
0.

On the other hand, on older processors such as Nehalem, Xeon E7540,
CondChgd bit is not set in the beginning at boot.

I'm not sure about exact behavior of CondChgd bit, in particular when
this bit is set. Although I read Intel System Programmer's Manual to
figure out that, the descriptions I found are:

  In 18.9.1:

  "The MSR_PERF_GLOBAL_STATUS MSR also provides a ¡sticky bit¢ to
   indicate changes to the state of performancmonitoring hardware"

  In Table 35-2 IA-32 Architectural MSRs

  63 CondChg: status bits of this register has changed.

These are different from the bahviour I see on the actual system as I
explained above.

At least, I think ignoring CondChgd bit should be enough for NMI
watchdog perspective.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20140625.103503.409316067.d.hatayama@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-02 08:35:55 +02:00
..
mcheck hwpoison: remove unused global variable in do_machine_check() 2014-06-04 16:54:11 -07:00
microcode x86, microcode: Add a disable chicken bit 2014-05-20 20:21:27 -07:00
mtrr mm, x86: Account for TLB flushes only when debugging 2014-01-25 09:10:41 +01:00
.gitignore
amd.c Rename TAINT_UNSAFE_SMP to TAINT_CPU_OUT_OF_SPEC 2014-03-20 16:28:09 -07:00
bugs_64.c
bugs.c x86: Get rid of ->hard_math and all the FPU asm fu 2013-06-06 14:32:04 -07:00
centaur.c x86: Remove CONFIG_X86_OOSTORE 2014-03-11 10:16:18 -07:00
common.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-06-12 19:18:49 -07:00
cpu.h x86/cpu: Track legacy CPU model data only on 32-bit kernels 2013-10-26 13:34:39 +02:00
cyrix.c x86: Delete non-required instances of include <linux/init.h> 2014-01-06 21:25:18 -08:00
hypervisor.c x86: Correctly detect hypervisor 2013-08-05 06:35:33 -07:00
intel_cacheinfo.c x86, intel, cacheinfo: Fix CPU hotplug callback registration 2014-03-20 13:43:43 +01:00
intel.c Merge branch 'x86-nuke-platforms-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-04-02 13:15:58 -07:00
Makefile Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-01-20 12:07:54 -08:00
match.c x86: align x86 arch with generic CPU modalias handling 2014-02-18 12:45:38 -08:00
mkcapflags.sh mkcapflags.pl: convert to mkcapflags.sh 2013-04-29 15:54:27 -07:00
mshyperv.c x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately 2014-04-14 11:49:55 -07:00
perf_event_amd_ibs.c kprobes, x86: Use NOKPROBE_SYMBOL() instead of __kprobes annotation 2014-04-24 10:26:38 +02:00
perf_event_amd_iommu.c perf/x86/amd: Do not print an error when the device is not present 2013-07-05 08:27:15 +02:00
perf_event_amd_iommu.h perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation 2013-06-19 13:04:53 +02:00
perf_event_amd_uncore.c x86, amd, uncore: Fix CPU hotplug callback registration 2014-03-20 13:43:43 +01:00
perf_event_amd.c perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node() 2013-09-02 08:42:49 +02:00
perf_event_intel_ds.c fix Haswell precise store data source encoding 2014-05-19 21:52:59 +09:00
perf_event_intel_lbr.c perf/x86: Add conditional branch filtering support 2014-06-05 12:30:23 +02:00
perf_event_intel_rapl.c perf/x86: Fix RAPL rdmsrl_safe() usage 2014-04-24 08:12:41 +02:00
perf_event_intel_uncore.c CPU hotplug notifiers registration fixes for 3.15-rc1 2014-04-07 14:55:46 -07:00
perf_event_intel_uncore.h perf/x86/uncore: add hrtimer to SNB uncore IMC PMU 2014-02-21 21:49:08 +01:00
perf_event_intel.c perf/x86/intel: ignore CondChgd bit to avoid false NMI handling 2014-07-02 08:35:55 +02:00
perf_event_knc.c x86: Constify a few items 2013-03-11 15:11:03 +01:00
perf_event_p4.c perf/x86/p4: Block PMIs on init to prevent a stream of unkown NMIs 2014-02-09 13:20:35 +01:00
perf_event_p6.c perf/x86/intel/p6: Add userspace RDPMC quirk for PPro 2014-02-09 13:08:24 +01:00
perf_event.c perf/x86: Use common PMU interrupt disabled code 2014-06-05 12:30:03 +02:00
perf_event.h perf/x86: Add a few more comments 2014-02-27 12:43:25 +01:00
perfctr-watchdog.c perf/x86: Add support for Intel Xeon-Phi Knights Corner PMU 2012-10-04 13:32:37 +02:00
powerflags.c update AMD powerflags comments 2013-05-28 12:02:10 +02:00
proc.c x86/cpu: Always print SMP information in /proc/cpuinfo 2013-11-06 08:13:56 +01:00
rdrand.c x86, rdrand: When nordrand is specified, disable RDSEED as well 2014-05-11 20:25:20 -07:00
scattered.c treewide: Fix common typo in "identify" 2013-10-14 15:31:06 +02:00
topology.c x86: delete __cpuinit usage from all x86 files 2013-07-14 19:36:56 -04:00
transmeta.c x86: Delete non-required instances of include <linux/init.h> 2014-01-06 21:25:18 -08:00
umc.c x86: Delete non-required instances of include <linux/init.h> 2014-01-06 21:25:18 -08:00
vmware.c x86: Correctly detect hypervisor 2013-08-05 06:35:33 -07:00