Use vcpu_{set,clear}_cpuid_feature() to toggle nested VMX support in the
vCPU CPUID module in the nVMX state test. Drop CPUID_VMX as there are
no longer any users.
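Purely as an illustration of the intended usage (not the literal diff),
the toggle reduces to something like:

  /* Sketch: hide, then re-expose, VMX in the vCPU's CPUID model. */
  vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_VMX);
  vcpu_set_cpuid_feature(vcpu, X86_FEATURE_VMX);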
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-29-seanjc@google.com
Use the vCPU's persistent CPUID array directly when manipulating the set
of exposed Hyper-V CPUID features. Drop set_cpuid() to route all future
modification through the vCPU helpers; the Hyper-V features test was the
last user.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-27-seanjc@google.com
Add a new helper, vcpu_clear_cpuid_entry(), to do an RMW operation on the
vCPU's CPUID model to clear a given CPUID entry, and use it to clear
KVM's paravirt feature instead of operating on kvm_get_supported_cpuid()'s
static "cpuid" variable. This also eliminates a user of
the soon-be-defunct set_cpuid() helper.
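Illustrative usage (a sketch, assuming the KVM PV leaf is identified by
KVM_CPUID_FEATURES):

  /* Hide all of KVM's paravirt features from the guest. */
  vcpu_clear_cpuid_entry(vcpu, KVM_CPUID_FEATURES);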
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-26-seanjc@google.com
Use vcpu_clear_cpuid_feature() to clear the MONITOR/MWAIT CPUID feature
bit in the MONITOR/MWAIT quirk test.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a helper to set a vCPU's guest.MAXPHYADDR, and use it in the test
that verifies the emulator returns an error on an unknown instruction
when KVM emulates in response to an EPT violation with a GPA that is
legal in hardware but illegal with respect to the guest's MAXPHYADDR.
Add a helper even though there's only a single user at this time. Before
its removal, mmu_role_test also stuffed guest.MAXPHYADDR, and the helper
provides a small amount of clarity.
More importantly, this eliminates a set_cpuid() user and an instance of
modifying kvm_get_supported_cpuid()'s static "cpuid".
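Sketch of the intended usage, assuming the helper is named
vcpu_set_cpuid_maxphyaddr() and takes the desired width in bits:

  /* Advertise a guest.MAXPHYADDR of 36 bits, below the host's. */
  vcpu_set_cpuid_maxphyaddr(vcpu, 36);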
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-25-seanjc@google.com
Use vm->pa_bits to generate the mask of physical address bits that are
reserved in page table entries. vm->pa_bits is set when the VM is
created, i.e. it's guaranteed to be valid when populating page tables.
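For reference, the mask is conceptually just the PTE bits at and above
vm->pa_bits (up through bit 51); a sketch:

  /* Physical address bits [51:pa_bits] are reserved in PTEs. */
  uint64_t rsvd_mask = ((1ULL << (52 - vm->pa_bits)) - 1) << vm->pa_bits;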
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-24-seanjc@google.com
Add helpers to get a specific CPUID entry for a given vCPU, and to toggle
a specific CPUID-based feature for a vCPU. The helpers will reduce the
amount of boilerplate code needed to tweak a vCPU's CPUID model, improve
code clarity, and most importantly move tests away from modifying the
static "cpuid" returned by kvm_get_supported_cpuid().
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-23-seanjc@google.com
Rename get_cpuid() to get_cpuid_entry() to better reflect its behavior.
Leave set_cpuid() as is to avoid unnecessary churn; that helper will soon
be removed entirely.
Opportunistically tweak the implementation to avoid using a temporary
variable in anticipation of tagging the input @cpuid with "const".
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-21-seanjc@google.com
Don't use a static variable for the Hyper-V supported CPUID array; the
helper unconditionally reallocates the array on every invocation (and all
callers free the array immediately after use). The array is intentionally
recreated and refilled because the set of supported CPUID features is
dependent on vCPU state, e.g. whether or not eVMCS has been enabled.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-20-seanjc@google.com
Cache a vCPU's CPUID information in "struct kvm_vcpu" to allow fixing the
mess where tests, often unknowingly, modify the global/static "cpuid"
allocated by kvm_get_supported_cpuid().
Add vcpu_init_cpuid() to handle stuffing an entirely different CPUID
model, e.g. during vCPU creation or when switching to the Hyper-V enabled
CPUID model. Automatically refresh the cache on vcpu_set_cpuid() so that
any adjustments made by KVM are always reflected in the cache. Drop
vcpu_get_cpuid() entirely to force tests to use the cache, and to allow
adding e.g. vcpu_get_cpuid_entry() in the future without creating a
conflicting set of APIs where vcpu_get_cpuid() does KVM_GET_CPUID2, but
vcpu_get_cpuid_entry() does not.
Opportunistically convert the VMX nested state test and KVM PV test to
manipulating the vCPU's CPUID (because it's easy), but use
vcpu_init_cpuid() for the Hyper-V features test and "emulator error" test
to effectively retain their current behavior as they're less trivial to
convert.
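Roughly, with the cached kvm_cpuid2 hanging off "struct kvm_vcpu" (field
name assumed), the set+refresh flow boils down to:

  void vcpu_set_cpuid(struct kvm_vcpu *vcpu)
  {
          TEST_ASSERT(vcpu->cpuid, "Must do vcpu_init_cpuid() first");

          /* Push the cached model to KVM... */
          vcpu_ioctl(vcpu, KVM_SET_CPUID2, vcpu->cpuid);

          /* ...and read it back so KVM's adjustments land in the cache. */
          vcpu_ioctl(vcpu, KVM_GET_CPUID2, vcpu->cpuid);
  }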
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-19-seanjc@google.com
Split out the computation of the effective size of a kvm_cpuid2 struct
from allocate_kvm_cpuid2(), and modify both to take an arbitrary number
of entries. Future commits will add caching of a vCPU's CPUID model, and
will (a) be able to precisely size the entries array, and (b) will need
to know the effective size of the struct in order to copy to/from the
cache.
Expose the helpers so that the Hyper-V Features test can use them in the
(somewhat distant) future. The Hyper-V test very, very subtly relies on
propagating CPUID info across vCPU instances, and will need to make a
copy of the previous vCPU's CPUID information when it switches to using
the per-vCPU cache. Alternatively, KVM could provide helpers to
duplicate and/or copy a kvm_cpuid2 instance, but each is literally a
single line of code if the helpers are exposed, and it's not like the
size of kvm_cpuid2 is secret knowledge.
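The size computation itself is the usual flexible-array math; as a sketch
(helper name assumed):

  static inline size_t kvm_cpuid2_size(int nr_entries)
  {
          return sizeof(struct kvm_cpuid2) +
                 sizeof(struct kvm_cpuid_entry2) * nr_entries;
  }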
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-18-seanjc@google.com
In the CPUID test, verify that KVM doesn't modify the kvm_cpuid2.entries
layout, i.e. that the order of entries and their flags is identical
between what the test provides via KVM_SET_CPUID2 and what KVM returns
via KVM_GET_CPUID2.
Asserting that the layouts match simplifies the test as there's no need
to iterate over both arrays.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-17-seanjc@google.com
Use kvm_cpu_has() in the steal-time test instead of open coding
equivalent functionality using kvm_get_supported_cpuid_entry().
Opportunistically define all of KVM's paravirt CPUID-based features.
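E.g. the paravirt features presumably end up encoded against leaf
0x40000001.EAX, along the lines of (encoding macro name and exact
spelling assumed here; bit numbers per KVM's CPUID ABI):

  #define X86_FEATURE_KVM_NOP_IO_DELAY  KVM_X86_CPU_FEATURE(0x40000001, 0, EAX, 1)
  #define X86_FEATURE_KVM_STEAL_TIME    KVM_X86_CPU_FEATURE(0x40000001, 0, EAX, 5)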
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-15-seanjc@google.com
Remove the MMU role test, which was made obsolete by KVM commit
feb627e8d6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"). The
ongoing costs of keeping the test updated far outweigh any benefits,
e.g. the test _might_ be useful as an example or for documentation
purposes, but otherwise the test is dead weight.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-14-seanjc@google.com
Use kvm_cpu_has() in the CR4/CPUID sync test instead of open coding
equivalent functionality using kvm_get_supported_cpuid_entry().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-13-seanjc@google.com
Use kvm_cpu_has() in the AMX test instead of open coding equivalent
functionality using kvm_get_supported_cpuid_entry() and
kvm_get_supported_cpuid_index().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-12-seanjc@google.com
Check for _both_ XTILE data and cfg support in the AMX test instead of
checking for _either_ feature. Practically speaking, no sane CPU or vCPU
will support one but not the other, but the effective "or" behavior is
subtle and technically incorrect.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-11-seanjc@google.com
Use kvm_cpu_has() in the XSS MSR test instead of open coding equivalent
functionality using kvm_get_supported_cpuid_index().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-10-seanjc@google.com
Drop a redundant vcpu_set_cpuid() from the PMU test. The vCPU's CPUID is
set to KVM's supported CPUID by vm_create_with_one_vcpu(), which was also
true back when the helper was named vm_create_default().
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-9-seanjc@google.com
Use kvm_cpu_has() in the PMU test to query PDCM support instead of open
coding equivalent functionality using kvm_get_supported_cpuid_index().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-8-seanjc@google.com
Use kvm_cpu_has() to check for nested VMX support, and drop the helpers
now that their functionality is trivial to implement.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-7-seanjc@google.com
Use kvm_cpu_has() to check for nested SVM support, and drop the helpers
now that their functionality is trivial to implement.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-6-seanjc@google.com
Use kvm_cpu_has() in the SEV migration test instead of open coding
equivalent functionality using kvm_get_supported_cpuid_entry().
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-5-seanjc@google.com
Add X86_FEATURE_* magic in the style of KVM-Unit-Tests' implementation,
where the CPUID function, index, output register, and output bit position
are embedded in the macro value. Add kvm_cpu_has() to query KVM's
supported CPUID and use it in set_sregs_test, which is the most prolific
user of manual feature querying.
Opportunistically rename calc_cr4_feature_bits() to
calc_supported_cr4_feature_bits() to better capture how the CR4 bits are
chosen.
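A sketch of the idea (field layout and encoding macro assumed for
illustration):

  struct kvm_x86_cpu_feature {
          u32 function;
          u16 index;
          u8  reg;
          u8  bit;
  };

  #define X86_FEATURE_XSAVE  KVM_X86_CPU_FEATURE(0x1, 0, ECX, 26)

  if (kvm_cpu_has(X86_FEATURE_XSAVE))
          cr4 |= X86_CR4_OSXSAVE;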
Link: https://lore.kernel.org/all/20210422005626.564163-1-ricarkol@google.com
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-4-seanjc@google.com
Rename X86_FEATURE_* macros to CPUID_* in various tests to free up the
X86_FEATURE_* names for KVM-Unit-Tests style CPUID automagic where the
function, leaf, register, and bit for the feature are embedded in its
macro value.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-3-seanjc@google.com
On x86-64, set KVM's supported CPUID as the vCPU's CPUID when recreating
a VM+vCPU to deduplicate code for state save/restore tests, and to
provide symmetry of sorts with respect to vm_create_with_one_vcpu(). The
extra KVM_SET_CPUID2 call is wasteful for Hyper-V, but ultimately is
nothing more than an expensive nop, and overriding the vCPU's CPUID with
the Hyper-V CPUID information is the only known scenario where a state
save/restore test wouldn't need/want the default CPUID.
Opportunistically use __weak for the default vm_compute_max_gfn(); the
attribute is provided by tools' compiler.h.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220614200707.3315957-2-seanjc@google.com
Fix filename reporting in guest asserts by ensuring the GUEST_ASSERT
macro records __FILE__ and substituting REPORT_GUEST_ASSERT for many
repetitive calls to TEST_FAIL.
Previously the filename was reported by using __FILE__ directly in the
selftest, which wrongly assumed it would always match the file where the
assertion failed.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reported-by: Ricardo Koller <ricarkol@google.com>
Fixes: 4e18bccc2e
Link: https://lore.kernel.org/r/20220615193116.806312-5-coltonlewis@google.com
[sean: convert more TEST_FAIL => REPORT_GUEST_ASSERT instances]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Write REPORT_GUEST_ASSERT macros to pair with GUEST_ASSERT to abstract
and make consistent all guest assertion reporting. Every report
includes an explanatory string, a filename, and a line number.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-4-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Increase UCALL_MAX_ARGS to 7 to allow GUEST_ASSERT_4 to pass 3 builtin
ucall arguments specified in guest_assert_builtin_args plus 4
user-specified arguments.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-3-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Enumerate GUEST_ASSERT arguments to avoid magic indices to ucall.args.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20220615193116.806312-2-coltonlewis@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Change a WARN_ON() to separate WARN_ON_ONCE()s if KVM has an outstanding
PIO or MMIO request without an associated callback, i.e. if KVM queued a
userspace I/O exit but didn't actually exit to userspace before moving
on to something else. Warning on every KVM_RUN risks spamming the kernel
if KVM gets into a bad state. Opportunistically split the WARNs so that
it's easier to triage failures when a WARN fires.
Deliberately do not use KVM_BUG_ON(), i.e. don't kill the VM. While the
WARN is all but guaranteed to fire if and only if there's a KVM bug, a
dangling I/O request does not present a danger to KVM (that flag is truly
truly consumed only in a single emulator path), and any such bug is
unlikely to be fatal to the VM (KVM essentially failed to do something it
shouldn't have tried to do in the first place). In other words, note the
bug, but let the VM keep running.
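I.e. the end state is along the lines of (sketch):

  /* Each dangling-request case gets its own one-shot WARN for triage. */
  WARN_ON_ONCE(vcpu->arch.pio.count);
  WARN_ON_ONCE(vcpu->mmio_needed);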
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220711232750.1092012-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
When injecting a #GP on LLDT/LTR due to a non-canonical LDT/TSS base, set
the error code to the selector. Intel's SDM says nothing about the #GP,
but AMD's APM explicitly states that both LLDT and LTR set the error code
to the selector, not zero.
Note, a non-canonical memory operand on LLDT/LTR does generate a #GP(0),
but the KVM code in question is specific to the base from the descriptor.
Fixes: e37a75a13c ("KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220711232750.1092012-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Wait to mark the TSS as busy during LTR emulation until after all fault
checks for the LTR have passed. Specifically, don't mark the TSS busy if
the new TSS base is non-canonical.
Opportunistically drop the one-off !seg_desc.PRESENT check for TR as the
only reason for the early check was to avoid marking a !PRESENT TSS as
busy, i.e. the common !PRESENT check is now done before setting the busy bit.
Fixes: e37a75a13c ("KVM: x86: Emulator ignores LDTR/TR extended base on LLDT/LTR")
Reported-by: syzbot+760a73552f47a8cd0fd9@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220711232750.1092012-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a "UD" clause to KVM_X86_QUIRK_MWAIT_NEVER_FAULTS to make it clear
that the quirk only controls the #UD behavior of MONITOR/MWAIT. KVM
doesn't currently enforce fault checks when MONITOR/MWAIT are supported,
but that could change in the future. SVM also has a virtualization hole
in that it checks all faults before intercepts, and so "never faults" is
already a lie when running on SVM.
Fixes: bfbcc81bb8 ("KVM: x86: Add a quirk for KVM's "MONITOR/MWAIT are NOPs!" behavior")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-4-seanjc@google.com
Do not use GCC's "A" constraint to load EAX:EDX in wrmsr_safe(). Per
GCC's documentation on x86-specific constraints, "A" will not actually
load a 64-bit value into EAX:EDX on x86-64.
  The a and d registers. This class is used for instructions that return
  double word results in the ax:dx register pair. Single word values will
  be allocated either in ax or dx. For example on i386 the following
  implements rdtsc:

    unsigned long long rdtsc (void)
    {
      unsigned long long tick;
      __asm__ __volatile__("rdtsc":"=A"(tick));
      return tick;
    }

  This is not correct on x86-64 as it would allocate tick in either ax or
  dx. You have to use the following variant instead:

    unsigned long long rdtsc (void)
    {
      unsigned int tickl, tickh;
      __asm__ __volatile__("rdtsc":"=a"(tickl),"=d"(tickh));
      return ((unsigned long long)tickh << 32)|tickl;
    }
Because a u64 fits in a single 64-bit register, using "A" for selftests,
which are 64-bit only, results in GCC loading the value into either RAX
or RDX instead of splitting it across EAX:EDX.
E.g.:
kvm_exit: reason MSR_WRITE rip 0x402919 info 0 0
kvm_msr: msr_write 40000118 = 0x60000000001 (#GP)
...
With "A":
48 8b 43 08 mov 0x8(%rbx),%rax
49 b9 ba da ca ba 0a movabs $0xabacadaba,%r9
00 00 00
4c 8d 15 07 00 00 00 lea 0x7(%rip),%r10 # 402f44 <guest_msr+0x34>
4c 8d 1d 06 00 00 00 lea 0x6(%rip),%r11 # 402f4a <guest_msr+0x3a>
0f 30 wrmsr
With "a"/"d":
48 8b 53 08 mov 0x8(%rbx),%rdx
89 d0 mov %edx,%eax
48 c1 ea 20 shr $0x20,%rdx
49 b9 ba da ca ba 0a movabs $0xabacadaba,%r9
00 00 00
4c 8d 15 07 00 00 00 lea 0x7(%rip),%r10 # 402fc3 <guest_msr+0xb3>
4c 8d 1d 06 00 00 00 lea 0x6(%rip),%r11 # 402fc9 <guest_msr+0xb9>
0f 30 wrmsr
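I.e. the fix is to split the value across EAX and EDX explicitly,
roughly (a sketch, not the exact hunk):

  static inline uint8_t wrmsr_safe(uint32_t msr, uint64_t val)
  {
          return kvm_asm_safe("wrmsr",
                              "a"(val & -1u), "d"(val >> 32), "c"(msr));
  }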
Fixes: 3b23054cd3 ("KVM: selftests: Add x86-64 support for exception fixup")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
[sean: use "& -1u", provide GCC blurb and link to documentation]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220714011115.3135828-1-seanjc@google.com
Provide valid inputs for RAX, RCX, and RDX when testing whether or not
KVM injects a #UD on MONITOR/MWAIT. SVM has a virtualization hole and
checks for _all_ faults before checking for intercepts, e.g. MONITOR with
an unsupported RCX will #GP before KVM gets a chance to intercept and
emulate.
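Illustratively, the guest now feeds architecturally valid operands
(sketch; buffer and variable names invented here):

  static uint8_t monitor_buf[4096];
  uint8_t fault;

  /* RAX = valid address, RCX = 0 (extensions), RDX = 0 (hints). */
  fault = kvm_asm_safe("monitor", "a"(monitor_buf), "c"(0), "d"(0));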
Fixes: 2325d4dd73 ("KVM: selftests: Add MONITOR/MWAIT quirk test")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-3-seanjc@google.com
Fix a copy+paste error in monitor_mwait_test by switching one of the two
"monitor" instructions to an "mwait". The intent of the test is very
much to verify the quirk handles both MONITOR and MWAIT.
Fixes: 2325d4dd73 ("KVM: selftests: Add MONITOR/MWAIT quirk test")
Reported-by: Yuan Yao <yuan.yao@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220711225753.1073989-2-seanjc@google.com
Read vcpu->vcpu_idx directly instead of bouncing through the one-line
wrapper, kvm_vcpu_get_idx(), and drop the wrapper. The wrapper is a
remnant of the original implementation and serves no purpose; remove it
(again) before it gains more users.
kvm_vcpu_get_idx() was removed in the not-too-distant past by commit
4eeef24241 ("KVM: x86: Query vcpu->vcpu_idx directly and drop its
accessor"), but was unintentionally re-introduced by commit a54d806688
("KVM: Keep memslots in tree-based structures instead of array-based ones"),
likely due to a rebase goof. The wrapper then managed to gain users in
KVM's Xen code.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20220614225615.3843835-1-seanjc@google.com
The result of gva_to_gpa() is a physical address, not a virtual address,
so it is odd that the UNMAPPED_GVA macro is used for a physical address
result. Replace UNMAPPED_GVA with INVALID_GPA and drop the UNMAPPED_GVA
macro.
No functional change intended.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/6104978956449467d3c68f1ad7f2c2f6d771d0ee.1656667239.git.houwenlong.hwl@antgroup.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
hang upon boot or shortly after when a non-default TSC frequency was
set for L1. The issue is observed on a host where TSC scaling is
supported. The problem appears to be that Windows doesn't use TSC
scaling for its guests, even when the feature is advertised, and KVM
filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
L1's VMCS. This leads to L2 running with the default frequency (matching
host's) while L1 is running with an altered one.
Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
it was set for L1. TSC_MULTIPLIER is already correctly computed and
written by prepare_vmcs02().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Fixes: d041b5ea93 ("KVM: nVMX: Enable nested TSC scaling")
Cc: stable@vger.kernel.org
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220712135009.952805-1-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
'vector' and 'trig_mode' fields of 'struct kvm_lapic_irq' are left
uninitialized in kvm_pv_kick_cpu_op(). While these fields are normally
not needed for APIC_DM_REMRD, they're still referenced by
__apic_accept_irq() for trace_kvm_apic_accept_irq(). Fully initialize
the structure to avoid consuming random stack memory.
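A designated initializer makes the zeroing explicit; the shape of the fix
is roughly:

  struct kvm_lapic_irq lapic_irq = {
          .delivery_mode  = APIC_DM_REMRD,
          .dest_mode      = APIC_DEST_PHYSICAL,
          .shorthand      = APIC_DEST_NOSHORT,
          .dest_id        = apicid,
          /* .vector, .trig_mode, etc. are implicitly zeroed. */
  };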
Fixes: a183b638b6 ("KVM: x86: make apic_accept_irq tracepoint more generic")
Reported-by: syzbot+d6caa905917d353f0d07@syzkaller.appspotmail.com
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20220708125147.593975-1-vkuznets@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Add a helper to update KVM's in-kernel local APIC in response to MCG_CAP
being changed by userspace to fix multiple bugs. First and foremost,
KVM needs to check that there's an in-kernel APIC prior to dereferencing
vcpu->arch.apic. Beyond that, any "new" LVT entries need to be masked,
and the APIC version register needs to be updated as it reports out the
number of LVT entries.
Fixes: 4b903561ec ("KVM: x86: Add Corrected Machine Check Interrupt (CMCI) emulation to lapic.")
Reported-by: syzbot+8cdad6430c24f396f158@syzkaller.appspotmail.com
Cc: Siddh Raman Pant <code@siddh.me>
Cc: Jue Wang <juew@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Initialize the number of LVT entries during APIC creation, else the field
will be incorrectly left '0' if userspace never invokes KVM_X86_SETUP_MCE.
Add and use a helper to calculate the number of entries even though
MCG_CMCI_P is not set by default in vcpu->arch.mcg_cap. Relying on that
to always be true is unnecessarily risky, and subtle/confusing as well.
Fixes: 4b903561ec ("KVM: x86: Add Corrected Machine Check Interrupt (CMCI) emulation to lapic.")
Reported-by: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Jue Wang <juew@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Merge a bug fix and cleanups for {g,s}et_msr_mce() using a base that
predates commit 281b52780b ("KVM: x86: Add emulation for
MSR_IA32_MCx_CTL2 MSRs."), which was written with the intention that it
be applied _after_ the bug fix and cleanups. The bug fix in particular
needs to be sent to stable trees; give them a stable hash to use.
Add helpers to identify CTL (control) and STATUS MCi MSR types instead of
open coding the checks using the offset. Using the offset is perfectly
safe, but unintuitive, as understanding what the code does requires
knowing that the offset calculation will not affect the lower three bits.
Opportunistically comment the STATUS logic to save readers a trip to
Intel's SDM or AMD's APM to understand the "data != 0" check.
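Sketch of the helpers (names assumed); the bank MSRs repeat as CTL,
STATUS, ADDR, MISC every four indices, so the type falls out of the low
two bits:

  static bool is_mci_control_msr(u32 msr)
  {
          return (msr & 3) == 0;
  }

  static bool is_mci_status_msr(u32 msr)
  {
          return (msr & 3) == 1;
  }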
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20220512222716.4112548-4-seanjc@google.com
Use an explicit case statement to grab the full range of MCx bank MSRs
in {g,s}et_msr_mce(), and manually check only the "end" (the number of
banks configured by userspace may be less than the max). The "default"
trick works, but is a bit odd now, and will be quite odd if/when support
for accessing MCx_CTL2 MSRs is added, which has near identical logic.
Hoist "offset" to function scope so as to avoid curly braces for the case
statement, and because MCx_CTL2 support will need the same variables.
Opportunistically clean up the comment about allowing bit 10 to be cleared
from bank 4.
No functional change intended.
Cc: Jue Wang <juew@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Link: https://lore.kernel.org/r/20220512222716.4112548-3-seanjc@google.com