linux/arch/x86/kernel
Sean Christopherson 009bce1df0 x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted
Choo! Choo!  All aboard the Split Lock Express, with direct service to
Wreckage!

Skip split_lock_verify_msr() if the CPU isn't whitelisted as a possible
SLD-enabled CPU model to avoid writing MSR_TEST_CTRL.  MSR_TEST_CTRL
exists, and is writable, on many generations of CPUs.  Writing the MSR,
even with '0', can result in bizarre, undocumented behavior.

This fixes a crash on Haswell when resuming from suspend with a live KVM
guest.  Because APs use the standard SMP boot flow for resume, they will
go through split_lock_init() and the subsequent RDMSR/WRMSR sequence,
which runs even when sld_state==sld_off to ensure SLD is disabled.  On
Haswell (at least, my Haswell), writing MSR_TEST_CTRL with '0' will
succeed and _may_ take the SMT _sibling_ out of VMX root mode.

When KVM has an active guest, KVM performs VMXON as part of CPU onlining
(see kvm_starting_cpu()).  Because SMP boot is serialized, the resulting
flow is effectively:

  on_each_ap_cpu() {
     WRMSR(MSR_TEST_CTRL, 0)
     VMXON
  }

As a result, the WRMSR can disable VMX on a different CPU that has
already done VMXON.  This ultimately results in a #UD on VMPTRLD when
KVM regains control and attempt run its vCPUs.

The above voodoo was confirmed by reworking KVM's VMXON flow to write
MSR_TEST_CTRL prior to VMXON, and to serialize the sequence as above.
Further verification of the insanity was done by redoing VMXON on all
APs after the initial WRMSR->VMXON sequence.  The additional VMXON,
which should VM-Fail, occasionally succeeded, and also eliminated the
unexpected #UD on VMPTRLD.

The damage done by writing MSR_TEST_CTRL doesn't appear to be limited
to VMX, e.g. after suspend with an active KVM guest, subsequent reboots
almost always hang (even when fudging VMXON), a #UD on a random Jcc was
observed, suspend/resume stability is qualitatively poor, and so on and
so forth.

  kernel BUG at arch/x86/kvm/x86.c:386!
  CPU: 1 PID: 2592 Comm: CPU 6/KVM Tainted: G      D
  Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
  RIP: 0010:kvm_spurious_fault+0xf/0x20
  Call Trace:
   vmx_vcpu_load_vmcs+0x1fb/0x2b0
   vmx_vcpu_load+0x3e/0x160
   kvm_arch_vcpu_load+0x48/0x260
   finish_task_switch+0x140/0x260
   __schedule+0x460/0x720
   _cond_resched+0x2d/0x40
   kvm_arch_vcpu_ioctl_run+0x82e/0x1ca0
   kvm_vcpu_ioctl+0x363/0x5c0
   ksys_ioctl+0x88/0xa0
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x4c/0x170
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: dbaba47085 ("x86/split_lock: Rework the initialization flow of split lock detection")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200605192605.7439-1-sean.j.christopherson@intel.com
2020-06-30 14:09:31 +02:00
..
acpi mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
apic The X86 entry, exception and interrupt code rework 2020-06-13 10:05:47 -07:00
cpu x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted 2020-06-30 14:09:31 +02:00
fpu x86/fpu: Reset MXCSR to default in kernel_fpu_begin() 2020-06-29 10:02:00 +02:00
kprobes A few fixes and small cleanups for tracing: 2020-06-20 13:17:47 -07:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
alternative.c x86/int3: Inline bsearch() 2020-06-11 15:14:54 +02:00
amd_gart_64.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
amd_nb.c x86/amd_nb: Add AMD family 17h model 60h PCI IDs 2020-05-22 18:24:40 +02:00
apb_timer.c x86/apb_timer: Drop unused TSC calibration 2020-05-27 13:05:59 +02:00
aperture_64.c
apm_32.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 118 2019-05-24 17:39:02 +02:00
asm-offsets_32.c x86 entry code updates: 2020-03-30 19:14:28 -07:00
asm-offsets_64.c x86/entry: Remove DBn stacks 2020-06-11 15:15:23 +02:00
asm-offsets.c efi/x86: Avoid using code32_start 2020-03-08 09:58:17 +01:00
audit_64.c x86/audit: Fix a -Wmissing-prototypes warning for ia32_classify_syscall() 2020-05-19 18:03:07 +02:00
bootflag.c
check.c
cpuid.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 142 2019-05-30 11:25:17 -07:00
crash_core_32.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
crash_core_64.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
crash_dump_32.c
crash_dump_64.c fs/core/vmcore: Move sev_active() reference to x86 arch code 2019-08-09 22:52:10 +10:00
crash.c x86/crash: Use resource_size() 2020-01-09 14:40:03 +01:00
devicetree.c
doublefault_32.c x86/entry: Convert double fault exception to IDTENTRY_DF 2020-06-11 15:15:03 +02:00
dumpstack_32.c x86/32: Remove CONFIG_DOUBLEFAULT 2020-04-14 14:24:05 +02:00
dumpstack_64.c x86/entry: Remove DBn stacks 2020-06-11 15:15:23 +02:00
dumpstack.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
e820.c Rebase locking/kcsan to locking/urgent 2020-06-11 20:02:46 +02:00
early_printk.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
early-quirks.c x86/intel: Disable HPET on Intel Ice Lake platforms 2019-11-29 12:17:58 +01:00
ebda.c
eisa.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 243 2019-06-19 17:09:07 +02:00
espfix_64.c mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
ftrace_32.S x86: Change {JMP,CALL}_NOSPEC argument 2020-04-30 20:14:34 +02:00
ftrace_64.S x86/entry/64: Move non entry code into .text section 2020-06-11 15:14:37 +02:00
ftrace.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
head32.c
head64.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
head_32.S x86/boot: Remove KEEP_SEGMENTS support 2020-02-22 23:37:37 +01:00
head_64.S x86/entry: Remove the apic/BUILD interrupt leftovers 2020-06-11 15:15:16 +02:00
hpet.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
hw_breakpoint.c x86/entry: Optimize local_db_save() for virt 2020-06-11 15:15:22 +02:00
i8237.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
i8253.c x86/timer: Skip PIT initialization on modern chipsets 2019-06-29 11:35:35 +02:00
i8259.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
idt.c x86/idt: Consolidate idt functionality 2020-06-11 15:15:26 +02:00
ima_arch.c EFI updates for v5.7: 2020-02-26 15:21:22 +01:00
io_delay.c x86/io_delay: Define IO_DELAY macros in C instead of Kconfig 2019-05-24 08:46:06 +02:00
ioport.c x86/ioperm: Prevent a memory leak when fork fails 2020-05-28 21:36:20 +02:00
irq_32.c x86/irq: Rework handle_irq() for 64-bit 2020-06-11 15:15:12 +02:00
irq_64.c x86/entry/64: Move do_softirq_own_stack() to C 2020-06-11 15:15:07 +02:00
irq_work.c x86/entry: Convert various system vectors 2020-06-11 15:15:14 +02:00
irq.c x86/entry: Convert KVM vectors to IDTENTRY_SYSVEC* 2020-06-11 15:15:15 +02:00
irqflags.S x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_* 2019-10-18 11:58:33 +02:00
irqinit.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
itmt.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
jailhouse.c x86/jailhouse: Only enable platform UARTs if available 2019-10-10 15:43:59 +02:00
jump_label.c x86/jump_label: Move 'inline' keyword placement 2020-03-27 11:05:41 +01:00
kdebugfs.c x86/boot: Introduce setup_indirect 2019-11-12 16:21:15 +01:00
kexec-bzimage64.c efi/x86: Make fw_vendor, config_table and runtime sysfs nodes x86 specific 2020-02-23 21:59:42 +01:00
kgdb.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
ksysfs.c x86/boot: Introduce setup_indirect 2019-11-12 16:21:15 +01:00
kvm.c The X86 entry, exception and interrupt code rework 2020-06-13 10:05:47 -07:00
kvmclock.c x86/vdso: Use generic VDSO clock mode storage 2020-02-17 14:40:23 +01:00
ldt.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
machine_kexec_32.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
machine_kexec_64.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
Makefile Rebase locking/kcsan to locking/urgent 2020-06-11 20:02:46 +02:00
mmconf-fam10h_64.c
module.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
mpparse.c x86/boot: Fix memory leak in default_get_smp_config() 2019-07-16 23:13:48 +02:00
msr.c x86/msr: Restrict MSR access when the kernel is locked down 2019-08-19 21:54:16 -07:00
nmi_selftest.c
nmi.c x86/entry: Make NMI use IDTENTRY_RAW 2020-06-12 14:15:48 +02:00
paravirt_patch.c x86/paravirt: Standardize 'insn_buff' variable names 2019-04-29 16:05:49 +02:00
paravirt-spinlocks.c
paravirt.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
pci-dma.c dma-mapping updates for 5.5-rc1 2019-11-28 11:16:43 -08:00
pci-iommu_table.c
pci-swiotlb.c dma-mapping: fix filename references 2019-09-03 08:36:30 +02:00
pcspeaker.c
perf_regs.c perf/x86/regs: Check reserved bits 2019-06-24 19:19:24 +02:00
platform-quirks.c
pmem.c
probe_roms.c maccess: make get_kernel_nofault() check for minimal type compatibility 2020-06-18 12:10:37 -07:00
process_32.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
process_64.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
process.c A set of fixes and updates for x86: 2020-06-11 15:54:31 -07:00
process.h x86: Use the correct SPDX License Identifier in headers 2019-10-01 20:31:35 +02:00
ptrace.c mm: don't include asm/pgtable.h if linux/mm.h is already included 2020-06-09 09:39:13 -07:00
pvclock.c x86/vdso: Use generic VDSO clock mode storage 2020-02-17 14:40:23 +01:00
quirks.c remove ioremap_nocache and devm_ioremap_nocache 2020-01-06 09:45:59 +01:00
reboot_fixups_32.c
reboot.c A set of fixes and updates for x86: 2020-06-11 15:54:31 -07:00
relocate_kernel_32.S x86/asm: Annotate relocate_kernel_{32,64}.c 2019-10-18 09:53:19 +02:00
relocate_kernel_64.S x86/kexec: Make relocate_kernel_64.S objtool clean 2020-03-25 18:28:28 +01:00
resource.c
rtc.c
setup_percpu.c x86/mm: remove vmalloc faulting 2020-06-02 10:59:12 -07:00
setup.c x86/setup: Add an initrdmem= option to specify initrd physical address 2020-04-27 09:28:16 +02:00
signal_compat.c
signal.c Merge branch 'work.set_fs-exec' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-01 16:21:46 -07:00
smp.c x86/entry: Convert reschedule interrupt to IDTENTRY_SYSVEC_SIMPLE 2020-06-11 15:15:16 +02:00
smpboot.c mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
stacktrace.c x86 user stack frame reads: switch to explicit __get_user() 2020-02-15 17:26:26 -05:00
step.c
sys_ia32.c x86: switch cp_stat64() to unsafe_put_user() 2020-06-03 16:59:21 -04:00
sys_x86_64.c x86: Remove unneeded includes 2020-03-21 16:03:25 +01:00
sysfb_efi.c x86/sysfb_efi: Add quirks for some devices with swapped width and height 2019-07-22 10:47:11 +02:00
sysfb_simplefb.c x86/sysfb: Fix check for bad VRAM size 2020-01-20 10:57:53 +01:00
sysfb.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tboot.c mmap locking API: add MMAP_LOCK_INITIALIZER 2020-06-09 09:39:14 -07:00
time.c A set of fixes and updates for x86: 2020-06-11 15:54:31 -07:00
tls.c x86/tls: Fix possible spectre-v1 in do_get_thread_area() 2019-06-27 23:48:04 +02:00
tls.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 2019-05-30 11:29:21 -07:00
topology.c x86/smp: Replace cpu_up/down() with add/remove_cpu() 2020-03-25 12:59:35 +01:00
trace_clock.c
tracepoint.c x86/entry: Convert reschedule interrupt to IDTENTRY_SYSVEC_SIMPLE 2020-06-11 15:15:16 +02:00
traps.c maccess: rename probe_kernel_address to get_kernel_nofault 2020-06-18 11:14:40 -07:00
tsc_msr.c x86 timer updates: 2020-03-30 19:55:39 -07:00
tsc_sync.c x86: Fix a handful of typos 2020-02-16 20:58:06 +01:00
tsc.c x86/tsc: Add tsc_early_khz command line parameter 2020-05-21 23:07:00 +02:00
umip.c x86/umip: Make umip_insns static 2020-04-15 11:13:12 +02:00
unwind_frame.c x86/entry: Unbreak __irqentry_text_start/end magic 2020-06-11 15:15:29 +02:00
unwind_guess.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
unwind_orc.c x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks 2020-05-22 19:55:17 +02:00
uprobes.c x86/apic, x86/uprobes: Correct parameter names in kernel-doc comments 2019-10-27 09:00:28 +01:00
verify_cpu.S x86/asm: Annotate local pseudo-functions 2019-10-18 10:04:04 +02:00
vm86_32.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
vmlinux.lds.S The X86 entry, exception and interrupt code rework 2020-06-13 10:05:47 -07:00
vsmp_64.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 346 2019-06-05 17:37:08 +02:00
x86_init.c x86/kvm: Handle async page faults directly through do_page_fault() 2020-05-19 15:53:57 +02:00