linux/arch/x86
Eiichi Tsukata e649b3f018 KVM: x86: Fix APIC page invalidation race
Commit b1394e745b ("KVM: x86: fix APIC page invalidation") tried
to fix inappropriate APIC page invalidation by re-introducing arch
specific kvm_arch_mmu_notifier_invalidate_range() and calling it from
kvm_mmu_notifier_invalidate_range_start. However, the patch left a
possible race where the VMCS APIC address cache is updated *before*
it is unmapped:

  (Invalidator) kvm_mmu_notifier_invalidate_range_start()
  (Invalidator) kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD)
  (KVM VCPU) vcpu_enter_guest()
  (KVM VCPU) kvm_vcpu_reload_apic_access_page()
  (Invalidator) actually unmap page

Because of the above race, there can be a mismatch between the
host physical address stored in the APIC_ACCESS_PAGE VMCS field and
the host physical address stored in the EPT entry for the APIC GPA
(0xfee00000).  When this happens, the processor will not trap APIC
accesses, and will instead show the raw contents of the APIC-access page.
Because Windows periodically checks for unexpected modifications to
the LAPIC registers, this shows up as a BSOD crash with BugCheck
CRITICAL_STRUCTURE_CORRUPTION (109), which we are currently seeing in
https://bugzilla.redhat.com/show_bug.cgi?id=1751017.

The root cause of the issue is that kvm_arch_mmu_notifier_invalidate_range()
cannot guarantee that no additional references are taken to the pages in
the range before kvm_mmu_notifier_invalidate_range_end().  Fortunately,
this case is supported by the MMU notifier API, as documented in
include/linux/mmu_notifier.h:

	 * If the subsystem
	 * can't guarantee that no additional references are taken to
	 * the pages in the range, it has to implement the
	 * invalidate_range() notifier to remove any references taken
	 * after invalidate_range_start().

The fix therefore is to reload the APIC-access page field in the VMCS
from kvm_mmu_notifier_invalidate_range() instead of ..._range_start().
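
The ordering problem and the fix can be illustrated with a small
user-space toy model. This is only a sketch, not KVM code: struct toy_vm
and every toy_* helper are hypothetical names invented for illustration,
the addresses 0x1000 and 0x2000 are arbitrary, and the model collapses
"unmap, then map again at a new host physical address" into one step.
It replays the interleaving above deterministically, once with the
reload request raised at range_start time and once with it raised after
the unmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical types and helpers, not kernel code. */
struct toy_vm {
	uint64_t ept_hpa;       /* HPA the EPT entry for the APIC GPA points at */
	uint64_t vmcs_apic_hpa; /* HPA cached in the VMCS APIC-access field */
	bool reload_pending;    /* stands in for KVM_REQ_APIC_PAGE_RELOAD */
};

/* Invalidator side: ask the vCPU to refresh its cached address. */
static void toy_request_reload(struct toy_vm *vm)
{
	vm->reload_pending = true;
}

/* vCPU side: on guest entry, service a pending reload from the current mapping. */
static void toy_vcpu_enter_guest(struct toy_vm *vm)
{
	if (vm->reload_pending) {
		vm->vmcs_apic_hpa = vm->ept_hpa;
		vm->reload_pending = false;
	}
}

/* Invalidator side: the page is unmapped and ends up backed by new_hpa. */
static void toy_unmap_and_move(struct toy_vm *vm, uint64_t new_hpa)
{
	assert(new_hpa != vm->ept_hpa);
	vm->ept_hpa = new_hpa;
}

static bool toy_consistent(const struct toy_vm *vm)
{
	return vm->vmcs_apic_hpa == vm->ept_hpa;
}

/* Request raised at range_start: the vCPU can reload before the unmap. */
static bool toy_run_racy(void)
{
	struct toy_vm vm = { 0x1000, 0x1000, false };

	toy_request_reload(&vm);         /* range_start */
	toy_vcpu_enter_guest(&vm);       /* reload sees the old mapping */
	toy_unmap_and_move(&vm, 0x2000); /* actual unmap happens too late */
	toy_vcpu_enter_guest(&vm);       /* nothing pending: cache stays stale */
	return toy_consistent(&vm);
}

/* Request raised after the unmap: the reload sees the new mapping. */
static bool toy_run_fixed(void)
{
	struct toy_vm vm = { 0x1000, 0x1000, false };

	toy_vcpu_enter_guest(&vm);       /* range_start raises no request */
	toy_unmap_and_move(&vm, 0x2000);
	toy_request_reload(&vm);         /* raised from invalidate_range() */
	toy_vcpu_enter_guest(&vm);
	return toy_consistent(&vm);
}
```

In this model toy_run_racy() leaves the two addresses different, which
corresponds to the VMCS/EPT mismatch described above, while
toy_run_fixed() leaves them equal. Real vCPUs run concurrently with the
invalidator; the model only fixes one interleaving of each ordering to
make the difference visible.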

Cc: stable@vger.kernel.org
Fixes: b1394e745b ("KVM: x86: fix APIC page invalidation")
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=197951
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Message-Id: <20200606042627.61070-1-eiichi.tsukata@nutanix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-08 09:05:38 -04:00
..
boot Misc dependency fixes, plus a documentation update about memory protection keys support. 2020-06-01 13:45:59 -07:00
configs compiler: remove CONFIG_OPTIMIZE_INLINING entirely 2020-04-07 10:43:42 -07:00
crypto There are a lot of objtool changes in this cycle, all across the map: 2020-06-01 13:13:00 -07:00
entry ARM: 2020-06-03 15:13:47 -07:00
events perf/x86/rapl: Add AMD Fam17h RAPL support 2020-05-28 07:58:56 +02:00
hyperv mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
ia32 Merge branch 'work.set_fs-exec' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-01 16:21:46 -07:00
include Merge branch 'akpm' (patches from Andrew) 2020-06-03 20:24:15 -07:00
kernel x86/kvm: Remove defunct KVM_DEBUG_FS Kconfig 2020-06-04 14:12:36 -04:00
kvm KVM: x86: Fix APIC page invalidation race 2020-06-08 09:05:38 -04:00
lib X86 timer specific updates: 2020-06-03 10:18:09 -07:00
math-emu
mm Merge branch 'akpm' (patches from Andrew) 2020-06-03 20:24:15 -07:00
net bpf, i386: Remove unneeded conversion to bool 2020-05-07 16:29:14 +02:00
oprofile
pci
platform This tree cleans up various aspects of the UV platform support code, 2020-06-01 14:48:20 -07:00
power cpu/hotplug: Remove disable_nonboot_cpus() 2020-05-07 15:18:40 +02:00
purgatory
ras
realmode SPDX patches for 5.7-rc1. 2020-04-03 13:12:26 -07:00
tools .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
um take the dummy csum_and_copy_from_user() into net/checksum.h 2020-05-29 16:11:50 -04:00
video
xen More EFI changes for v5.8: 2020-05-25 15:11:14 +02:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
Kbuild
Kconfig x86/kvm: Remove defunct KVM_DEBUG_FS Kconfig 2020-06-04 14:12:36 -04:00
Kconfig.assembler x86/delay: Introduce TPAUSE delay 2020-05-07 16:06:20 +02:00
Kconfig.cpu
Kconfig.debug x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined 2020-06-03 20:09:50 -07:00
Makefile x86/boot/build: Make 'make bzlilo' not depend on vmlinux or $(obj)/bzImage 2020-04-21 18:10:28 +02:00
Makefile_32.cpu
Makefile.um