linux

Author	SHA1	Message	Date
Josh Poimboeuf	d61684b56e	x86/crypto/sha512-avx: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/d36e9ea1c819d87fa89b3df3fa83e2a1ede18146.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:36 -05:00
Josh Poimboeuf	ce58466680	x86/crypto/sha256-avx2: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/8048e7444c49a8137f05265262b83dc50f8fb7f3.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:36 -05:00
Josh Poimboeuf	20114c899c	x86/crypto/sha1_avx2: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/fdaaf8670ed1f52f55ba9a6bbac98c1afddc1af6.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:35 -05:00
Josh Poimboeuf	35a0067d2c	x86/crypto/sha_ni: Standardize stack alignment prologue Use a more standard prologue for saving the stack pointer before realigning the stack. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/5033e1a79867dff1b18e1b4d0783c38897d3f223.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:35 -05:00
Josh Poimboeuf	2b02ed5548	x86/crypto/crc32c-pcl-intel: Standardize jump table Simplify the jump table code so that it resembles a compiler-generated table. This enables ORC unwinding by allowing objtool to follow all the potential code paths. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/5357a039def90b8ef6b5874ef12cda008ecf18ba.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:34 -05:00
Josh Poimboeuf	dabe5167a3	x86/crypto/camellia-aesni-avx2: Unconditionally allocate stack buffer A conditional stack allocation violates traditional unwinding requirements when a single instruction can have differing stack layouts. There's no benefit in allocating the stack buffer conditionally. Just do it unconditionally. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/85ac96613ee5784b6239c18d3f68b1f3c509caa3.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:34 -05:00
Josh Poimboeuf	e163be86ff	x86/crypto/aesni-intel_avx: Standardize stack alignment prologue Use RBP instead of R14 for saving the old stack pointer before realignment. This resembles what compilers normally do. This enables ORC unwinding by allowing objtool to understand the stack realignment. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/02d00a0903a0959f4787e186e2a07d271e1f63d4.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:34 -05:00
Josh Poimboeuf	ff5796b6db	x86/crypto/aesni-intel_avx: Fix register usage comments Fix register usage comments to match reality. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/8655d4513a0ed1eddec609165064153973010aa2.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:33 -05:00
Josh Poimboeuf	4f08300916	x86/crypto/aesni-intel_avx: Remove unused macros These macros are no longer used; remove them. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Link: https://lore.kernel.org/r/53f7136ea93ebdbca399959e6d2991ecb46e733e.1614182415.git.jpoimboe@redhat.com	2021-04-19 12:36:33 -05:00
Ben Gardon	24ae4cfaaa	KVM: x86/mmu: Allow enabling/disabling dirty logging under MMU read lock To reduce lock contention and interference with page fault handlers, allow the TDP MMU functions which enable and disable dirty logging to operate under the MMU read lock. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-12-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:06:04 -04:00
Ben Gardon	2db6f772b5	KVM: x86/mmu: Allow zapping collapsible SPTEs to use MMU read lock To reduce the impact of disabling dirty logging, change the TDP MMU function which zaps collapsible SPTEs to run under the MMU read lock. This way, page faults on zapped SPTEs can proceed in parallel with kvm_mmu_zap_collapsible_sptes. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-11-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:06:04 -04:00
Ben Gardon	6103bc0740	KVM: x86/mmu: Allow zap gfn range to operate under the mmu read lock To reduce lock contention and interference with page fault handlers, allow the TDP MMU function to zap a GFN range to operate under the MMU read lock. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-10-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:06:04 -04:00
Ben Gardon	c0e64238ac	KVM: x86/mmu: Protect the tdp_mmu_roots list with RCU Protect the contents of the TDP MMU roots list with RCU in preparation for a future patch which will allow the iterator macro to be used under the MMU lock in read mode. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-9-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:06:01 -04:00
Ben Gardon	fb10129335	KVM: x86/mmu: handle cmpxchg failure in kvm_tdp_mmu_get_root To reduce dependence on the MMU write lock, don't rely on the assumption that the atomic operation in kvm_tdp_mmu_get_root will always succeed. By not relying on that assumption, threads do not need to hold the MMU lock in write mode in order to take a reference on a TDP MMU root. In the root iterator, this change means that some roots might have to be skipped if they are found to have a zero refcount. This will still never happen as of this patch, but a future patch will need that flexibility to make the root iterator safe under the MMU read lock. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-8-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:25 -04:00
Ben Gardon	11cccf5c04	KVM: x86/mmu: Make TDP MMU root refcount atomic In order to parallelize more operations for the TDP MMU, make the refcount on TDP MMU roots atomic, so that a future patch can allow multiple threads to take a reference on the root concurrently, while holding the MMU lock in read mode. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-7-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:25 -04:00
Ben Gardon	cfc109979b	KVM: x86/mmu: Refactor yield safe root iterator Refactor the yield safe TDP MMU root iterator to be more amenable to changes in future commits which will allow it to be used under the MMU lock in read mode. Currently the iterator requires a complicated dance between the helper functions and different parts of the for loop which makes it hard to reason about. Moving all the logic into a single function simplifies the iterator substantially. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-6-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:24 -04:00
Ben Gardon	2bdb3d84ce	KVM: x86/mmu: Merge TDP MMU put and free root kvm_tdp_mmu_put_root and kvm_tdp_mmu_free_root are always called together, so merge the functions to simplify TDP MMU root refcounting / freeing. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-5-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:24 -04:00
Ben Gardon	4bba36d72b	KVM: x86/mmu: use tdp_mmu_free_sp to free roots Minor cleanup to deduplicate the code used to free a struct kvm_mmu_page in the TDP MMU. No functional change intended. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-4-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:24 -04:00
Ben Gardon	76eb54e7e7	KVM: x86/mmu: Move kvm_mmu_(get\|put)_root to TDP MMU The TDP MMU is almost the only user of kvm_mmu_get_root and kvm_mmu_put_root. There is only one use of put_root in mmu.c for the legacy / shadow MMU. Open code that one use and move the get / put functions to the TDP MMU so they can be extended in future commits. No functional change intended. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-3-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:24 -04:00
Ben Gardon	8ca6f063b7	KVM: x86/mmu: Re-add const qualifier in kvm_tdp_mmu_zap_collapsible_sptes kvm_tdp_mmu_zap_collapsible_sptes unnecessarily removes the const qualifier from its memlsot argument, leading to a compiler warning. Add the const annotation and pass it to subsequent functions. Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20210401233736.638171-2-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:23 -04:00
Sean Christopherson	e1eed5847b	KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if possible Let the TDP MMU yield when unmapping a range in response to a MMU notification, if yielding is allowed by said notification. There is no reason to disallow yielding in this case, and in theory the range being invalidated could be quite large. Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-11-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-19 09:05:23 -04:00
Maciej W. Rozycki	0ef3439cd8	x86/build: Disable HIGHMEM64G selection for M486SX Fix a regression caused by making the 486SX separately selectable in Kconfig, for which the HIGHMEM64G setting has not been updated and therefore has become exposed as a user-selectable option for the M486SX configuration setting unlike with original M486 and all the other settings that choose non-PAE-enabled processors: High Memory Support > 1. off (NOHIGHMEM) 2. 4GB (HIGHMEM4G) 3. 64GB (HIGHMEM64G) choice[1-3?]: With the fix in place the setting is now correctly removed: High Memory Support > 1. off (NOHIGHMEM) 2. 4GB (HIGHMEM4G) choice[1-2?]: [ bp: Massage commit message. ] Fixes: `87d6021b81` ("x86/math-emu: Limit MATH_EMULATION to 486SX compatibles") Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org # v5.5+ Link: https://lkml.kernel.org/r/alpine.DEB.2.21.2104141221340.44318@angie.orcam.me.uk	2021-04-19 14:02:12 +02:00
Jakub Kicinski	8203c7ce4e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/ethernet/stmicro/stmmac/stmmac_main.c - keep the ZC code, drop the code related to reinit net/bridge/netfilter/ebtables.c - fix build after move to net_generic Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-04-17 11:08:07 -07:00
Sean Christopherson	b4c5936c47	KVM: Kill off the old hva-based MMU notifier callbacks Yank out the hva-based MMU notifier APIs now that all architectures that use the notifiers have moved to the gfn-based APIs. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:08 -04:00
Sean Christopherson	3039bcc744	KVM: Move x86's MMU notifier memslot walkers to generic code Move the hva->gfn lookup for MMU notifiers into common code. Every arch does a similar lookup, and some arch code is all but identical across multiple architectures. In addition to consolidating code, this will allow introducing optimizations that will benefit all architectures without incurring multiple walks of the memslots, e.g. by taking mmu_lock if and only if a relevant range exists in the memslots. The use of __always_inline to avoid indirect call retpolines, as done by x86, may also benefit other architectures. Consolidating the lookups also fixes a wart in x86, where the legacy MMU and TDP MMU each do their own memslot walks. Lastly, future enhancements to the memslot implementation, e.g. to add an interval tree to track host address, will need to touch far less arch specific code. MIPS, PPC, and arm64 will be converted one at a time in future patches. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:06 -04:00
Paolo Bonzini	6c9dd6d262	KVM: constify kvm_arch_flush_remote_tlbs_memslot memslots are stored in RCU and there should be no need to change them. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:04 -04:00
Paolo Bonzini	dbb6964e4c	KVM: MMU: protect TDP MMU pages only down to required level When using manual protection of dirty pages, it is not necessary to protect nested page tables down to the 4K level; instead KVM can protect only hugepages in order to split them lazily, and delay write protection at 4K-granularity until KVM_CLEAR_DIRTY_LOG. This was overlooked in the TDP MMU, so do it there as well. Fixes: `a6a0b05da9` ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Cc: Ben Gardon <bgardon@google.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:04 -04:00
Maxim Levitsky	7e582ccbbd	KVM: x86: implement KVM_CAP_SET_GUEST_DEBUG2 Store the supported bits into KVM_GUESTDBG_VALID_MASK macro, similar to how arm does this. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401135451.1004564-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:02 -04:00
Maxim Levitsky	4020da3b9f	KVM: x86: pending exceptions must not be blocked by an injected event Injected interrupts/nmi should not block a pending exception, but rather be either lost if nested hypervisor doesn't intercept the pending exception (as in stock x86), or be delivered in exitintinfo/IDT_VECTORING_INFO field, as a part of a VMexit that corresponds to the pending exception. The only reason for an exception to be blocked is when nested run is pending (and that can't really happen currently but still worth checking for). Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401143817.1030695-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:02 -04:00
Maxim Levitsky	232f75d3b4	KVM: nSVM: call nested_svm_load_cr3 on nested state load While KVM's MMU should be fully reset by loading of nested CR0/CR3/CR4 by KVM_SET_SREGS, we are not in nested mode yet when we do it and therefore only root_mmu is reset. On regular nested entries we call nested_svm_load_cr3 which both updates the guest's CR3 in the MMU when it is needed, and it also initializes the mmu again which makes it initialize the walk_mmu as well when nested paging is enabled in both host and guest. Since we don't call nested_svm_load_cr3 on nested state load, the walk_mmu can be left uninitialized, which can lead to a NULL pointer dereference while accessing it if we happen to get a nested page fault right after entering the nested guest first time after the migration and we decide to emulate it, which leads to the emulator trying to access walk_mmu->gva_to_gpa which is NULL. Therefore we should call this function on nested state load as well. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401141814.1029036-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:01 -04:00
David Edmondson	8486039a6c	KVM: x86: dump_vmcs should include the autoload/autostore MSR lists When dumping the current VMCS state, include the MSRs that are being automatically loaded/stored during VM entry/exit. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-6-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:01 -04:00
David Edmondson	0702a3cbbf	KVM: x86: dump_vmcs should show the effective EFER If EFER is not being loaded from the VMCS, show the effective value by reference to the MSR autoload list or calculation. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-5-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:00 -04:00
David Edmondson	5518da62d4	KVM: x86: dump_vmcs should consider only the load controls of EFER/PAT When deciding whether to dump the GUEST_IA32_EFER and GUEST_IA32_PAT fields of the VMCS, examine only the VM entry load controls, as saving on VM exit has no effect on whether VM entry succeeds or fails. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-4-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:31:00 -04:00
David Edmondson	699e1b2e55	KVM: x86: dump_vmcs should not conflate EFER and PAT presence in VMCS Show EFER and PAT based on their individual entry/exit controls. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-3-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:59 -04:00
David Edmondson	d9e46d344e	KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is valid If the VM entry/exit controls for loading/saving MSR_EFER are either not available (an older processor or explicitly disabled) or not used (host and guest values are the same), reading GUEST_IA32_EFER from the VMCS returns an inaccurate value. Because of this, in dump_vmcs() don't use GUEST_IA32_EFER to decide whether to print the PDPTRs - always do so if the fields exist. Fixes: `4eb64dce8d` ("KVM: x86: dump VMCS on invalid entry") Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210318120841.133123-2-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:59 -04:00
Maxim Levitsky	adc2a23734	KVM: nSVM: improve SYSENTER emulation on AMD Currently to support Intel->AMD migration, if CPU vendor is GenuineIntel, we emulate the full 64 value for MSR_IA32_SYSENTER_{EIP\|ESP} msrs, and we also emulate the sysenter/sysexit instruction in long mode. (Emulator does still refuse to emulate sysenter in 64 bit mode, on the ground that the code for that wasn't tested and likely has no users) However when virtual vmload/vmsave is enabled, the vmload instruction will update these 32 bit msrs without triggering their msr intercept, which will lead to having stale values in kvm's shadow copy of these msrs, which relies on the intercept to be up to date. Fix/optimize this by doing the following: 1. Enable the MSR intercepts for SYSENTER MSRs iff vendor=GenuineIntel (This is both a tiny optimization and also ensures that in case the guest cpu vendor is AMD, the msrs will be 32 bit wide as AMD defined). 2. Store only high 32 bit part of these msrs on interception and combine it with hardware msr value on intercepted read/writes iff vendor=GenuineIntel. 3. Disable vmload/vmsave virtualization if vendor=GenuineIntel. (It is somewhat insane to set vendor=GenuineIntel and still enable SVM for the guest but well whatever). Then zero the high 32 bit parts when kvm intercepts and emulates vmload. Thanks a lot to Paulo Bonzini for helping me with fixing this in the most correct way. This patch fixes nested migration of 32 bit nested guests, that was broken because incorrect cached values of SYSENTER msrs were stored in the migration stream if L1 changed these msrs with vmload prior to L2 entry. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401111928.996871-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:59 -04:00
Maxim Levitsky	c1df4aac44	KVM: x86: add guest_cpuid_is_intel This is similar to existing 'guest_cpuid_is_amd_or_hygon' Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210401111928.996871-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:58 -04:00
Sean Christopherson	eba04b20e4	KVM: x86: Account a variety of miscellaneous allocations Switch to GFP_KERNEL_ACCOUNT for a handful of allocations that are clearly associated with a single task/VM. Note, there are a several SEV allocations that aren't accounted, but those can (hopefully) be fixed by using the local stack for memory. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210331023025.2485960-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:58 -04:00
Sean Christopherson	8727906fde	KVM: SVM: Do not allow SEV/SEV-ES initialization after vCPUs are created Reject KVM_SEV_INIT and KVM_SEV_ES_INIT if they are attempted after one or more vCPUs have been created. KVM assumes a VM is tagged SEV/SEV-ES prior to vCPU creation, e.g. init_vmcb() needs to mark the VMCB as SEV enabled, and svm_create_vcpu() needs to allocate the VMSA. At best, creating vCPUs before SEV/SEV-ES init will lead to unexpected errors and/or behavior, and at worst it will crash the host, e.g. sev_launch_update_vmsa() will dereference a null svm->vmsa pointer. Fixes: `1654efcbc4` ("KVM: SVM: Add KVM_SEV_INIT command") Fixes: `ad73109ae7` ("KVM: SVM: Provide support to launch and run an SEV-ES guest") Cc: stable@vger.kernel.org Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210331031936.2495277-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:58 -04:00
Sean Christopherson	9fa1521daa	KVM: SVM: Do not set sev->es_active until KVM_SEV_ES_INIT completes Set sev->es_active only after the guts of KVM_SEV_ES_INIT succeeds. If the command fails, e.g. because SEV is already active or there are no available ASIDs, then es_active will be left set even though the VM is not fully SEV-ES capable. Refactor the code so that "es_active" is passed on the stack instead of being prematurely shoved into sev_info, both to avoid having to unwind sev_info and so that it's more obvious what actually consumes es_active in sev_guest_init() and its helpers. Fixes: `ad73109ae7` ("KVM: SVM: Provide support to launch and run an SEV-ES guest") Cc: stable@vger.kernel.org Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210331031936.2495277-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:57 -04:00
Sean Christopherson	c36b16d29f	KVM: SVM: Use online_vcpus, not created_vcpus, to iterate over vCPUs Use the kvm_for_each_vcpu() helper to iterate over vCPUs when encrypting VMSAs for SEV, which effectively switches to use online_vcpus instead of created_vcpus. This fixes a possible null-pointer dereference as created_vcpus does not guarantee a vCPU exists, since it is updated at the very beginning of KVM_CREATE_VCPU. created_vcpus exists to allow the bulk of vCPU creation to run in parallel, while still correctly restricting the max number of max vCPUs. Fixes: `ad73109ae7` ("KVM: SVM: Provide support to launch and run an SEV-ES guest") Cc: stable@vger.kernel.org Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210331031936.2495277-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:57 -04:00
Sean Christopherson	8f8f52a45d	KVM: x86/mmu: Simplify code for aging SPTEs in TDP MMU Use a basic NOT+AND sequence to clear the Accessed bit in TDP MMU SPTEs, as opposed to the fancy ffs()+clear_bit() logic that was copied from the legacy MMU. The legacy MMU uses clear_bit() because it is operating on the SPTE itself, i.e. clearing needs to be atomic. The TDP MMU operates on a local variable that it later writes to the SPTE, and so doesn't need to be atomic or even resident in memory. Opportunistically drop unnecessary initialization of new_spte, it's guaranteed to be written before being accessed. Using NOT+AND instead of ffs()+clear_bit() reduces the sequence from: 0x0000000000058be6 <+134>: test %rax,%rax 0x0000000000058be9 <+137>: je 0x58bf4 <age_gfn_range+148> 0x0000000000058beb <+139>: test %rax,%rdi 0x0000000000058bee <+142>: je 0x58cdc <age_gfn_range+380> 0x0000000000058bf4 <+148>: mov %rdi,0x8(%rsp) 0x0000000000058bf9 <+153>: mov $0xffffffff,%edx 0x0000000000058bfe <+158>: bsf %eax,%edx 0x0000000000058c01 <+161>: movslq %edx,%rdx 0x0000000000058c04 <+164>: lock btr %rdx,0x8(%rsp) 0x0000000000058c0b <+171>: mov 0x8(%rsp),%r15 to: 0x0000000000058bdd <+125>: test %rax,%rax 0x0000000000058be0 <+128>: je 0x58beb <age_gfn_range+139> 0x0000000000058be2 <+130>: test %rax,%r8 0x0000000000058be5 <+133>: je 0x58cc0 <age_gfn_range+352> 0x0000000000058beb <+139>: not %rax 0x0000000000058bee <+142>: and %r8,%rax 0x0000000000058bf1 <+145>: mov %rax,%r15 thus eliminating several memory accesses, including a locked access. Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210331004942.2444916-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:57 -04:00
Sean Christopherson	6d9aafb96d	KVM: x86/mmu: Remove spurious clearing of dirty bit from TDP MMU SPTE Don't clear the dirty bit when aging a TDP MMU SPTE (in response to a MMU notifier event). Prematurely clearing the dirty bit could cause spurious PML updates if aging a page happened to coincide with dirty logging. Note, tdp_mmu_set_spte_no_acc_track() flows into __handle_changed_spte(), so the host PFN will be marked dirty, i.e. there is no potential for data corruption. Fixes: `a6a0b05da9` ("kvm: x86/mmu: Support dirty logging for the TDP MMU") Cc: Ben Gardon <bgardon@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210331004942.2444916-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:56 -04:00
Sean Christopherson	6dfbd6b5d5	KVM: x86/mmu: Drop trace_kvm_age_page() tracepoint Remove x86's trace_kvm_age_page() tracepoint. It's mostly redundant with the common trace_kvm_age_hva() tracepoint, and if there is a need for the extra details, e.g. gfn, referenced, etc... those details should be added to the common tracepoint so that all architectures and MMUs benefit from the info. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-19-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:56 -04:00
Sean Christopherson	5f7c292b89	KVM: Move prototypes for MMU notifier callbacks to generic code Move the prototypes for the MMU notifier callbacks out of arch code and into common code. There is no benefit to having each arch replicate the prototypes since any deviation from the invocation in common code will explode. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:55 -04:00
Sean Christopherson	aaaac889cf	KVM: x86/mmu: Use leaf-only loop for walking TDP SPTEs when changing SPTE Use the leaf-only TDP iterator when changing the SPTE in reaction to a MMU notifier. Practically speaking, this is a nop since the guts of the loop explicitly looks for 4k SPTEs, which are always leaf SPTEs. Switch the iterator to match age_gfn_range() and test_age_gfn() so that a future patch can consolidate the core iterating logic. No real functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:55 -04:00
Sean Christopherson	a3f15bda46	KVM: x86/mmu: Pass address space ID to TDP MMU root walkers Move the address space ID check that is performed when iterating over roots into the macro helpers to consolidate code. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:55 -04:00
Sean Christopherson	2b9663d8a1	KVM: x86/mmu: Pass address space ID to __kvm_tdp_mmu_zap_gfn_range() Pass the address space ID to TDP MMU's primary "zap gfn range" helper to allow the MMU notifier paths to iterate over memslots exactly once. Currently, both the legacy MMU and TDP MMU iterate over memslots when looking for an overlapping hva range, which can be quite costly if there are a large number of memslots. Add a "flush" parameter so that iterating over multiple address spaces in the caller will continue to do the right thing when yielding while a flush is pending from a previous address space. Note, this also has a functional change in the form of coalescing TLB flushes across multiple address spaces in kvm_zap_gfn_range(), and also optimizes the TDP MMU to utilize range-based flushing when running as L1 with Hyper-V enlightenments. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-6-seanjc@google.com> [Keep separate for loops to prepare for other incoming patches. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:54 -04:00
Sean Christopherson	1a61b7db7a	KVM: x86/mmu: Coalesce TLB flushes across address spaces for gfn range zap Gather pending TLB flushes across both address spaces when zapping a given gfn range. This requires feeding "flush" back into subsequent calls, but on the plus side sets the stage for further batching between the legacy MMU and TDP MMU. It also allows refactoring the address space iteration to cover the legacy and TDP MMUs without introducing truly ugly code. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:54 -04:00
Sean Christopherson	142ccde1f7	KVM: x86/mmu: Coalesce TLB flushes when zapping collapsible SPTEs Gather pending TLB flushes across both the legacy and TDP MMUs when zapping collapsible SPTEs to avoid multiple flushes if both the legacy MMU (for nested guests) and TDP MMU have mappings for the memslot. Note, this also optimizes the TDP MMU to flush only the relevant range when running as L1 with Hyper-V enlightenments. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-04-17 08:30:54 -04:00

... 4 5 6 7 8 ...

38422 Commits