linux/arch/x86
Vitaly Kuznetsov 49097762fa Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU"
Guest crashes are observed on a Cascade Lake system when 'perf top' is
launched on the host, e.g.

 BUG: unable to handle kernel paging request at fffffe0000073038
 PGD 7ffa7067 P4D 7ffa7067 PUD 7ffa6067 PMD 7ffa5067 PTE ffffffffff120
 Oops: 0000 [#1] SMP PTI
 CPU: 1 PID: 1 Comm: systemd Not tainted 4.18.0+ #380
...
 Call Trace:
  serial8250_console_write+0xfe/0x1f0
  call_console_drivers.constprop.0+0x9d/0x120
  console_unlock+0x1ea/0x460

Call traces are different but the crash is imminent. The problem was
blindly bisected to the commit 041bc42ce2 ("KVM: VMX: Micro-optimize
vmexit time when not exposing PMU"). It was also confirmed that the
issue goes away if PMU is exposed to the guest.

With some instrumentation of the guest we can see what is being switched
(when we do atomic_switch_perf_msrs()):

 vmx_vcpu_run: switching 2 msrs
 vmx_vcpu_run: switching MSR38f guest: 70000000d host: 70000000f
 vmx_vcpu_run: switching MSR3f1 guest: 0 host: 2

The current guess is that PEBS (MSR_IA32_PEBS_ENABLE, 0x3f1) is to blame.
Regardless of whether PMU is exposed to the guest or not, PEBS needs to
be disabled upon switch.

This reverts commit 041bc42ce2.

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200619094046.654019-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-19 08:13:40 -04:00
..
boot Rebase locking/kcsan to locking/urgent 2020-06-11 20:02:46 +02:00
configs compiler: remove CONFIG_OPTIMIZE_INLINING entirely 2020-04-07 10:43:42 -07:00
crypto There are a lot of objtool changes in this cycle, all across the map: 2020-06-01 13:13:00 -07:00
entry The X86 entry, exception and interrupt code rework 2020-06-13 10:05:47 -07:00
events treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
hyperv x86/entry: Convert various hypervisor vectors to IDTENTRY_SYSVEC 2020-06-11 15:15:15 +02:00
ia32 Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-06-04 14:07:08 -07:00
include RAS updates from Borislav Petkov: 2020-06-13 10:21:00 -07:00
kernel RAS updates from Borislav Petkov: 2020-06-13 10:21:00 -07:00
kvm Revert "KVM: VMX: Micro-optimize vmexit time when not exposing PMU" 2020-06-19 08:13:40 -04:00
lib Rebase locking/kcsan to locking/urgent 2020-06-11 20:02:46 +02:00
math-emu
mm The X86 entry, exception and interrupt code rework 2020-06-13 10:05:47 -07:00
net bpf, i386: Remove unneeded conversion to bool 2020-05-07 16:29:14 +02:00
oprofile
pci Merge branch 'pci/virtualization' 2020-06-04 12:59:13 -05:00
platform x86/entry: Convert various system vectors 2020-06-11 15:15:14 +02:00
power mm: reorder includes after introduction of linux/pgtable.h 2020-06-09 09:39:13 -07:00
purgatory Merge branch 'x86/kdump' into locking/kcsan, to resolve conflicts 2020-03-21 09:24:41 +01:00
ras treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
realmode Rebase locking/kcsan to locking/urgent 2020-06-11 20:02:46 +02:00
tools .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
um mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
video
xen xen: Move xen_setup_callback_vector() definition to include/xen/hvm.h 2020-06-11 15:15:19 +02:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
Kbuild
Kconfig Kbuild updates for v5.8 (2nd) 2020-06-13 13:29:16 -07:00
Kconfig.assembler x86/delay: Introduce TPAUSE delay 2020-05-07 16:06:20 +02:00
Kconfig.cpu treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Kconfig.debug treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Makefile x86/boot/build: Make 'make bzlilo' not depend on vmlinux or $(obj)/bzImage 2020-04-21 18:10:28 +02:00
Makefile_32.cpu
Makefile.um