Commit Graph

8483 Commits

Author SHA1 Message Date
Paolo Bonzini
7fd55a02a4 Merge tag 'kvmarm-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 5.16

- Simplification of the 'vcpu first run' by integrating it into
  KVM's 'pid change' flow

- Refactoring of the FP and SVE state tracking, also leading to
  a simpler state and less shared data between EL1 and EL2 in
  the nVHE case

- Tidy up the header file usage for the nvhe hyp object

- New HYP unsharing mechanism, finally allowing pages to be
  unmapped from the Stage-1 EL2 page-tables

- Various pKVM cleanups around refcounting and sharing

- A couple of vgic fixes for bugs that would trigger once
  the vcpu xarray rework is merged, but not sooner

- Add minimal support for ARMv8.7's PMU extension

- Rework kvm_pgtable initialisation ahead of the NV work

- New selftest for IRQ injection

- Teach selftests about the lack of default IPA space and
  page sizes

- Expand sysreg selftest to deal with Pointer Authentication

- The usual bunch of cleanups and doc update
2022-01-07 10:42:19 -05:00
Marc Zyngier
1c53a1ae36 Merge branch kvm-arm64/misc-5.17 into kvmarm-master/next
* kvm-arm64/misc-5.17:
  : .
  : Misc fixes and improvements:
  : - Add minimal support for ARMv8.7's PMU extension
  : - Constify kvm_io_gic_ops
  : - Drop kvm_is_transparent_hugepage() prototype
  : - Drop unused workaround_flags field
  : - Rework kvm_pgtable initialisation
  : - Documentation fixes
  : - Replace open-coded SCTLR_EL1.EE useage with its defined macro
  : - Sysreg list selftest update to handle PAuth
  : - Include cleanups
  : .
  KVM: arm64: vgic: Replace kernel.h with the necessary inclusions
  KVM: arm64: Fix comment typo in kvm_vcpu_finalize_sve()
  KVM: arm64: selftests: get-reg-list: Add pauth configuration
  KVM: arm64: Fix comment on barrier in kvm_psci_vcpu_on()
  KVM: arm64: Fix comment for kvm_reset_vcpu()
  KVM: arm64: Use defined value for SCTLR_ELx_EE
  KVM: arm64: Rework kvm_pgtable initialisation
  KVM: arm64: Drop unused workaround_flags vcpu field

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-01-04 17:16:15 +00:00
Marc Zyngier
ad7937dc77 Merge branch kvm-arm64/selftest/irq-injection into kvmarm-master/next
* kvm-arm64/selftest/irq-injection:
  : .
  : New tests from Ricardo Koller:
  : "This series adds a new test, aarch64/vgic-irq, that validates the injection of
  : different types of IRQs from userspace using various methods and configurations"
  : .
  KVM: selftests: aarch64: Add test for restoring active IRQs
  KVM: selftests: aarch64: Add ISPENDR write tests in vgic_irq
  KVM: selftests: aarch64: Add tests for IRQFD in vgic_irq
  KVM: selftests: Add IRQ GSI routing library functions
  KVM: selftests: aarch64: Add test_inject_fail to vgic_irq
  KVM: selftests: aarch64: Add tests for LEVEL_INFO in vgic_irq
  KVM: selftests: aarch64: Level-sensitive interrupts tests in vgic_irq
  KVM: selftests: aarch64: Add preemption tests in vgic_irq
  KVM: selftests: aarch64: Cmdline arg to set EOI mode in vgic_irq
  KVM: selftests: aarch64: Cmdline arg to set number of IRQs in vgic_irq test
  KVM: selftests: aarch64: Abstract the injection functions in vgic_irq
  KVM: selftests: aarch64: Add vgic_irq to test userspace IRQ injection
  KVM: selftests: aarch64: Add vGIC library functions to deal with vIRQ state
  KVM: selftests: Add kvm_irq_line library function
  KVM: selftests: aarch64: Add GICv3 register accessor library functions
  KVM: selftests: aarch64: Add function for accessing GICv3 dist and redist registers
  KVM: selftests: aarch64: Move gic_v3.h to shared headers

Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-01-04 14:03:43 +00:00
Marc Zyngier
f15dcf1b58 KVM: arm64: selftests: get-reg-list: Add pauth configuration
The get-reg-list test ignores the Pointer Authentication features,
which is a shame now that we have relatively common HW with this feature.

Define two new configurations (with and without PMU) that exercise the
KVM capabilities.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20211228121414.1013250-1-maz@kernel.org
2022-01-04 13:39:55 +00:00
Ricardo Koller
728fcc46d2 KVM: selftests: aarch64: Add test for restoring active IRQs
Add a test that restores multiple IRQs in active state, it does it by
writing into ISACTIVER from the guest and using KVM ioctls. This test
tries to emulate what would happen during a live migration: restore
active IRQs.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-18-ricarkol@google.com
2021-12-28 19:25:05 +00:00
Ricardo Koller
bebd8f3f86 KVM: selftests: aarch64: Add ISPENDR write tests in vgic_irq
Add injection tests that use writing into the ISPENDR register (to mark
IRQs as pending). This is typically used by migration code.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-17-ricarkol@google.com
2021-12-28 19:25:00 +00:00
Ricardo Koller
6a5a47188c KVM: selftests: aarch64: Add tests for IRQFD in vgic_irq
Add injection tests for the KVM_IRQFD ioctl into vgic_irq.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-16-ricarkol@google.com
2021-12-28 19:24:54 +00:00
Ricardo Koller
88209c104e KVM: selftests: Add IRQ GSI routing library functions
Add an architecture independent wrapper function for creating and
writing IRQ GSI routing tables. Also add a function to add irqchip
entries.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-15-ricarkol@google.com
2021-12-28 19:24:48 +00:00
Ricardo Koller
90f50acac9 KVM: selftests: aarch64: Add test_inject_fail to vgic_irq
Add tests for failed injections to vgic_irq. This tests that KVM can
handle bogus IRQ numbers.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-14-ricarkol@google.com
2021-12-28 19:24:41 +00:00
Ricardo Koller
6830fa9159 KVM: selftests: aarch64: Add tests for LEVEL_INFO in vgic_irq
Add injection tests for the LEVEL_INFO ioctl (level-sensitive specific)
into vgic_irq.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-13-ricarkol@google.com
2021-12-28 19:24:35 +00:00
Ricardo Koller
92f2cc4aa7 KVM: selftests: aarch64: Level-sensitive interrupts tests in vgic_irq
Add a cmdline arg for using level-sensitive interrupts (vs the default
edge-triggered). Then move the handler into a generic handler function
that takes the type of interrupt (level vs. edge) as an arg.  When
handling line-sensitive interrupts it sets the line to low after
acknowledging the IRQ.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-12-ricarkol@google.com
2021-12-28 19:24:28 +00:00
Ricardo Koller
0ad3ff4a6a KVM: selftests: aarch64: Add preemption tests in vgic_irq
Add tests for IRQ preemption (having more than one activated IRQ at the
same time).  This test injects multiple concurrent IRQs and handles them
without handling the actual exceptions.  This is done by masking
interrupts for the whole test.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-11-ricarkol@google.com
2021-12-28 19:24:22 +00:00
Ricardo Koller
8a35b2877d KVM: selftests: aarch64: Cmdline arg to set EOI mode in vgic_irq
Add a new cmdline arg to set the EOI mode for all vgic_irq tests.  This
specifies whether a write to EOIR will deactivate IRQs or not.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-10-ricarkol@google.com
2021-12-28 19:24:13 +00:00
Ricardo Koller
e5410ee280 KVM: selftests: aarch64: Cmdline arg to set number of IRQs in vgic_irq test
Add the ability to specify the number of vIRQs exposed by KVM (arg
defaults to 64). Then extend the KVM_IRQ_LINE test by injecting all
available SPIs at once (specified by the nr-irqs arg). As a bonus,
inject all SGIs at once as well.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-9-ricarkol@google.com
2021-12-28 19:24:06 +00:00
Ricardo Koller
e1cb399eed KVM: selftests: aarch64: Abstract the injection functions in vgic_irq
Build an abstraction around the injection functions, so the preparation
and checking around the actual injection can be shared between tests.
All functions are stored as pointers in arrays of kvm_inject_desc's
which include the pointer and what kind of interrupts they can inject.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-8-ricarkol@google.com
2021-12-28 19:23:49 +00:00
Ricardo Koller
50b020cdb7 KVM: selftests: aarch64: Add vgic_irq to test userspace IRQ injection
Add a new KVM selftest, vgic_irq, for testing userspace IRQ injection.  This
particular test injects an SPI using KVM_IRQ_LINE on GICv3 and verifies
that the IRQ is handled in the guest. The next commits will add more
types of IRQs and different modes.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-7-ricarkol@google.com
2021-12-28 19:23:43 +00:00
Ricardo Koller
e95def3a90 KVM: selftests: aarch64: Add vGIC library functions to deal with vIRQ state
Add a set of library functions for userspace code in selftests to deal
with vIRQ state (i.e., ioctl wrappers).

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-6-ricarkol@google.com
2021-12-28 19:23:35 +00:00
Ricardo Koller
227895ed6d KVM: selftests: Add kvm_irq_line library function
Add an architecture independent wrapper function for the KVM_IRQ_LINE
ioctl.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-5-ricarkol@google.com
2021-12-28 19:23:23 +00:00
Ricardo Koller
17ce617bf7 KVM: selftests: aarch64: Add GICv3 register accessor library functions
Add library functions for accessing GICv3 registers: DIR, PMR, CTLR,
ISACTIVER, ISPENDR.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-4-ricarkol@google.com
2021-12-28 19:23:13 +00:00
Ricardo Koller
745068367c KVM: selftests: aarch64: Add function for accessing GICv3 dist and redist registers
Add a generic library function for reading and writing GICv3 distributor
and redistributor registers. Then adapt some functions to use it; more
will come and use it in the next commit.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-3-ricarkol@google.com
2021-12-28 19:23:07 +00:00
Ricardo Koller
33a1ca736e KVM: selftests: aarch64: Move gic_v3.h to shared headers
Move gic_v3.h to the shared headers location. There are some definitions
that will be used in the vgic-irq test.

Signed-off-by: Ricardo Koller <ricarkol@google.com>
Acked-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109023906.1091208-2-ricarkol@google.com
2021-12-28 19:22:54 +00:00
Marc Zyngier
aa674de1dc KVM: selftests: arm64: Add support for various modes with 16kB page size
The 16kB page size is not a popular choice, due to only a few CPUs
actually implementing support for it. However, it can lead to some
interesting performance improvements given the right uarch choices.

Add support for this page size for various PA/VA combinations.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20211227124809.1335409-7-maz@kernel.org
2021-12-28 11:04:20 +00:00
Marc Zyngier
e7f58a6bd2 KVM: selftests: arm64: Add support for VM_MODE_P36V48_{4K,64K}
Some of the arm64 systems out there have an IPA space that is
positively tiny. Nonetheless, they make great KVM hosts.

Add support for 36bit IPA support with 4kB pages, which makes
some of the fruity machines happy. Whilst we're at it, add support
for 64kB pages as well, though these boxes have no support for it.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211227124809.1335409-6-maz@kernel.org
2021-12-28 11:04:20 +00:00
Marc Zyngier
2f41a61c54 KVM: selftests: arm64: Rework TCR_EL1 configuration
The current way we initialise TCR_EL1 is a bit cumbersome, as
we mix setting TG0 and IPS in the same swtch statement.

Split it into two statements (one for the base granule size, and
another for the IPA size), allowing new modes to be added in a
more elegant way.

No functional change intended.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20211227124809.1335409-5-maz@kernel.org
2021-12-28 11:04:20 +00:00
Marc Zyngier
0303ffdb9e KVM: selftests: arm64: Check for supported page sizes
Just as arm64 implemenations don't necessary support all IPA
ranges, they don't  all support the same page sizes either. Fun.

Create a dummy VM to snapshot the page sizes supported by the
host, and filter the supported modes.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20211227124809.1335409-4-maz@kernel.org
2021-12-28 11:04:20 +00:00
Marc Zyngier
357c628e12 KVM: selftests: arm64: Introduce a variable default IPA size
Contrary to popular belief, there is no such thing as a default
IPA size on arm64. Anything goes, and implementations are the
usual Wild West.

The selftest infrastructure default to 40bit IPA, which obviously
doesn't work for some systems out there.

Turn VM_MODE_DEFAULT from a constant into a variable, and let
guest_modes_append_default() populate it, depending on what
the HW can do. In order to preserve the current behaviour, we
still pick 40bits IPA as the default if it is available, and
the largest supported IPA space otherwise.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20211227124809.1335409-3-maz@kernel.org
2021-12-28 11:04:20 +00:00
Marc Zyngier
cb7c4f364a KVM: selftests: arm64: Initialise default guest mode at test startup time
As we are going to add support for a variable default mode on arm64,
let's make sure it is setup first by using a constructor that gets
called before the actual test runs.

Suggested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Link: https://lore.kernel.org/r/20211227124809.1335409-2-maz@kernel.org
2021-12-28 11:04:19 +00:00
Sean Christopherson
ab1ef34416 KVM: selftests: Add test to verify TRIPLE_FAULT on invalid L2 guest state
Add a selftest to attempt to enter L2 with invalid guests state by
exiting to userspace via I/O from L2, and then using KVM_SET_SREGS to set
invalid guest state (marking TR unusable is arbitrary chosen for its
relative simplicity).

This is a regression test for a bug introduced by commit c8607e4a08
("KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if
!from_vmentry"), which incorrectly set vmx->fail=true when L2 had invalid
guest state and ultimately triggered a WARN due to nested_vmx_vmexit()
seeing vmx->fail==true while attempting to synthesize a nested VM-Exit.

The is also a functional test to verify that KVM sythesizes TRIPLE_FAULT
for L2, which is somewhat arbitrary behavior, instead of emulating L2.
KVM should never emulate L2 due to invalid guest state, as it's
architecturally impossible for L1 to run an L2 guest with invalid state
as nested VM-Enter should always fail, i.e. L1 needs to do the emulation.
Stuffing state via KVM ioctl() is a non-architctural, out-of-band case,
hence the TRIPLE_FAULT being rather arbitrary.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211207193006.120997-5-seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20 08:06:55 -05:00
Andrew Jones
577e022b7b selftests: KVM: Fix non-x86 compiling
Attempting to compile on a non-x86 architecture fails with

include/kvm_util.h: In function ‘vm_compute_max_gfn’:
include/kvm_util.h:79:21: error: dereferencing pointer to incomplete type ‘struct kvm_vm’
  return ((1ULL << vm->pa_bits) >> vm->page_shift) - 1;
                     ^~

This is because the declaration of struct kvm_vm is in
lib/kvm_util_internal.h as an effort to make it private to
the test lib code. We can still provide arch specific functions,
though, by making the generic function symbols weak. Do that to
fix the compile error.

Fixes: c8cc43c1ea ("selftests: KVM: avoid failures due to reserved HyperTransport region")
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-Id: <20211214151842.848314-1-drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20 08:06:54 -05:00
Vitaly Kuznetsov
0b091a43d7 KVM: selftests: vmx_pmu_msrs_test: Drop tests mangling guest visible CPUIDs
Host initiated writes to MSR_IA32_PERF_CAPABILITIES should not depend
on guest visible CPUIDs and (incorrect) KVM logic implementing it is
about to change. Also, KVM_SET_CPUID{,2} after KVM_RUN is now forbidden
and causes test to fail.

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: feb627e8d6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211216165213.338923-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-19 19:35:29 +01:00
Sean Christopherson
10e7a099bf selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
Add an x86 selftest to verify that KVM doesn't WARN or otherwise explode
if userspace modifies RCX during a userspace exit to handle string I/O.
This is a regression test for a user-triggerable WARN introduced by
commit 3b27de2718 ("KVM: x86: split the two parts of emulator_pio_in").

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211025201311.1881846-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10 09:38:02 -05:00
Paolo Bonzini
c8cc43c1ea selftests: KVM: avoid failures due to reserved HyperTransport region
AMD proceessors define an address range that is reserved by HyperTransport
and causes a failure if used for guest physical addresses.  Avoid
selftests failures by reserving those guest physical addresses; the
rules are:

- On parts with <40 bits, its fully hidden from software.

- Before Fam17h, it was always 12G just below 1T, even if there was more
RAM above this location.  In this case we just not use any RAM above 1T.

- On Fam17h and later, it is variable based on SME, and is either just
below 2^48 (no encryption) or 2^43 (encryption).

Fixes: ef4c9f4f65 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
Cc: stable@vger.kernel.org
Cc: David Matlack <dmatlack@google.com>
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210805105423.412878-1-pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-10 07:42:20 -05:00
Maciej S. Szmigiero
ee3a4f6662 KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
INTERCEPT_x are bit positions, but the code was using the raw value of
INTERCEPT_VINTR (4) instead of BIT(INTERCEPT_VINTR).
This resulted in masking of bit 2 - that is, SMI instead of VINTR.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <49b9571d25588870db5380b0be1a41df4bbaaf93.1638486479.git.maciej.szmigiero@oracle.com>
2021-12-09 12:44:39 -05:00
Linus Torvalds
a51e3ac43d Merge tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireless, and wireguard.

  Mostly scattered driver changes this week, with one big clump in
  mv88e6xxx. Nothing of note, really.

  Current release - regressions:

   - smc: keep smc_close_final()'s error code during active close

  Current release - new code bugs:

   - iwlwifi: various static checker fixes (int overflow, leaks, missing
     error codes)

   - rtw89: fix size of firmware header before transfer, avoid crash

   - mt76: fix timestamp check in tx_status; fix pktid leak;

   - mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()

  Previous releases - regressions:

   - smc: fix list corruption in smc_lgr_cleanup_early

   - ipv4: convert fib_num_tclassid_users to atomic_t

  Previous releases - always broken:

   - tls: fix authentication failure in CCM mode

   - vrf: reset IPCB/IP6CB when processing outbound pkts, prevent
     incorrect processing

   - dsa: mv88e6xxx: fixes for various device errata

   - rds: correct socket tunable error in rds_tcp_tune()

   - ipv6: fix memory leak in fib6_rule_suppress

   - wireguard: reset peer src endpoint when netns exits

   - wireguard: improve resilience to DoS around incoming handshakes

   - tcp: fix page frag corruption on page fault which involves TCP

   - mpls: fix missing attributes in delete notifications

   - mt7915: fix NULL pointer dereference with ad-hoc mode

  Misc:

   - rt2x00: be more lenient about EPROTO errors during start

   - mlx4_en: update reported link modes for 1/10G"

* tag 'net-5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits)
  net: dsa: b53: Add SPI ID table
  gro: Fix inconsistent indenting
  selftests: net: Correct case name
  net/rds: correct socket tunable error in rds_tcp_tune()
  mctp: Don't let RTM_DELROUTE delete local routes
  net/smc: Keep smc_close_final rc during active close
  ibmvnic: drop bad optimization in reuse_tx_pools()
  ibmvnic: drop bad optimization in reuse_rx_pools()
  net/smc: fix wrong list_del in smc_lgr_cleanup_early
  Fix Comment of ETH_P_802_3_MIN
  ethernet: aquantia: Try MAC address from device tree
  ipv4: convert fib_num_tclassid_users to atomic_t
  net: avoid uninit-value from tcp_conn_request
  net: annotate data-races on txq->xmit_lock_owner
  octeontx2-af: Fix a memleak bug in rvu_mbox_init()
  net/mlx4_en: Fix an use-after-free bug in mlx4_en_try_alloc_resources()
  vrf: Reset IPCB/IP6CB when processing outbound pkts in vrf dev xmit
  net: qlogic: qlcnic: Fix a NULL pointer dereference in qlcnic_83xx_add_rings()
  net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed
  net: dsa: mv88e6xxx: Fix inband AN for 2500base-x on 88E6393X family
  ...
2021-12-02 11:22:06 -08:00
Li Zhijian
a05431b22b selftests: net: Correct case name
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, bind test
would not be run by default due to the wrong case names

Fixes: 34d0302ab8 ("selftests: Add ipv6 address bind tests to fcnal-test")
Fixes: 75b2b2b3db ("selftests: Add ipv4 address bind tests to fcnal-test")
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-02 12:19:08 +00:00
Linus Torvalds
f080815fdb Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Fix constant sign extension affecting TCR_EL2 and preventing
     running on ARMv8.7 models due to spurious bits being set

   - Fix use of helpers using PSTATE early on exit by always sampling it
     as soon as the exit takes place

   - Move pkvm's 32bit handling into a common helper

  RISC-V:

   - Fix incorrect KVM_MAX_VCPUS value

   - Unmap stage2 mapping when deleting/moving a memslot

  x86:

   - Fix and downgrade BUG_ON due to uninitialized cache

   - Many APICv and MOVE_ENC_CONTEXT_FROM fixes

   - Correctly emulate TLB flushes around nested vmentry/vmexit and when
     the nested hypervisor uses VPID

   - Prevent modifications to CPUID after the VM has run

   - Other smaller bugfixes

  Generic:

   - Memslot handling bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
  KVM: fix avic_set_running for preemptable kernels
  KVM: VMX: clear vmx_x86_ops.sync_pir_to_irr if APICv is disabled
  KVM: SEV: accept signals in sev_lock_two_vms
  KVM: SEV: do not take kvm->lock when destroying
  KVM: SEV: Prohibit migration of a VM that has mirrors
  KVM: SEV: Do COPY_ENC_CONTEXT_FROM with both VMs locked
  selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
  KVM: SEV: move mirror status to destination of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
  KVM: SEV: initialize regions_list of a mirror VM
  KVM: SEV: cleanup locking for KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
  KVM: SEV: do not use list_replace_init on an empty list
  KVM: x86: Use a stable condition around all VT-d PI paths
  KVM: x86: check PIR even for vCPUs with disabled APICv
  KVM: VMX: prepare sync_pir_to_irr for running with APICv disabled
  KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
  KVM: x86/mmu: Handle "default" period when selectively waking kthread
  KVM: MMU: shadow nested paging does not have PKU
  KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible path
  KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmapping
  KVM: X86: Use vcpu->arch.walk_mmu for kvm_mmu_invlpg()
  ...
2021-11-30 09:22:15 -08:00
Matthew Wilcox (Oracle)
d6e6a27d96 tools: Fix math.h breakage
Commit 98e1385ef2 ("include/linux/radix-tree.h: replace kernel.h with
the necessary inclusions") broke the radix tree test suite in two
different ways; first by including math.h which didn't exist in the
tools directory, and second by removing an implicit include of
spinlock.h before lockdep.h.  Fix both issues.

Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-30 09:14:42 -08:00
Paolo Bonzini
17d44a96f0 KVM: SEV: Prohibit migration of a VM that has mirrors
VMs that mirror an encryption context rely on the owner to keep the
ASID allocated.  Performing a KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
would cause a dangling ASID:

1. copy context from A to B (gets ref to A)
2. move context from A to L (moves ASID from A to L)
3. close L (releases ASID from L, B still references it)

The right way to do the handoff instead is to create a fresh mirror VM
on the destination first:

1. copy context from A to B (gets ref to A)
[later] 2. close B (releases ref to A)
3. move context from A to L (moves ASID from A to L)
4. copy context from L to M

So, catch the situation by adding a count of how many VMs are
mirroring this one's encryption context.

Fixes: 0b020f5af0 ("KVM: SEV: Add support for SEV-ES intra host migration")
Message-Id: <20211123005036.2954379-11-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:54:14 -05:00
Paolo Bonzini
dc79c9f4eb selftests: sev_migrate_tests: add tests for KVM_CAP_VM_COPY_ENC_CONTEXT_FROM
I am putting the tests in sev_migrate_tests because the failure conditions are
very similar and some of the setup code can be reused, too.

The tests cover both successful creation of a mirror VM, and error
conditions.

Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Message-Id: <20211123005036.2954379-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:54:13 -05:00
Maciej S. Szmigiero
81835ee113 KVM: selftests: page_table_test: fix calculation of guest_test_phys_mem
A kvm_page_table_test run with its default settings fails on VMX due to
memory region add failure:
> ==== Test Assertion Failure ====
>  lib/kvm_util.c:952: ret == 0
>  pid=10538 tid=10538 errno=17 - File exists
>     1  0x00000000004057d1: vm_userspace_mem_region_add at kvm_util.c:947
>     2  0x0000000000401ee9: pre_init_before_test at kvm_page_table_test.c:302
>     3   (inlined by) run_test at kvm_page_table_test.c:374
>     4  0x0000000000409754: for_each_guest_mode at guest_modes.c:53
>     5  0x0000000000401860: main at kvm_page_table_test.c:500
>     6  0x00007f82ae2d8554: ?? ??:0
>     7  0x0000000000401894: _start at ??:?
>  KVM_SET_USER_MEMORY_REGION IOCTL failed,
>  rc: -1 errno: 17
>  slot: 1 flags: 0x0
>  guest_phys_addr: 0xc0000000 size: 0x40000000

This is because the memory range that this test is trying to add
(0x0c0000000 - 0x100000000) conflicts with LAPIC mapping at 0x0fee00000.

Looking at the code it seems that guest_test_*phys*_mem variable gets
mistakenly overwritten with guest_test_*virt*_mem while trying to adjust
the former for alignment.
With the correct variable adjusted this test runs successfully.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <52e487458c3172923549bbcf9dfccfbe6faea60b.1637940473.git.maciej.szmigiero@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-30 03:12:13 -05:00
Jason A. Donenfeld
20ae1d6aa1 wireguard: device: reset peer src endpoint when netns exits
Each peer's endpoint contains a dst_cache entry that takes a reference
to another netdev. When the containing namespace exits, we take down the
socket and prevent future sockets from being created (by setting
creating_net to NULL), which removes that potential reference on the
netns. However, it doesn't release references to the netns that a netdev
cached in dst_cache might be taking, so the netns still might fail to
exit. Since the socket is gimped anyway, we can simply clear all the
dst_caches (by way of clearing the endpoint src), which will release all
references.

However, the current dst_cache_reset function only releases those
references lazily. But it turns out that all of our usages of
wg_socket_clear_peer_endpoint_src are called from contexts that are not
exactly high-speed or bottle-necked. For example, when there's
connection difficulty, or when userspace is reconfiguring the interface.
And in particular for this patch, when the netns is exiting. So for
those cases, it makes more sense to call dst_release immediately. For
that, we add a small helper function to dst_cache.

This patch also adds a test to netns.sh from Hangbin Liu to ensure this
doesn't regress.

Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Reported-by: Xiumei Mu <xmu@redhat.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Fixes: 900575aa33 ("wireguard: device: avoid circular netns references")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:45 -08:00
Li Zhijian
7e938beb83 wireguard: selftests: rename DEBUG_PI_LIST to DEBUG_PLIST
DEBUG_PI_LIST was renamed to DEBUG_PLIST since 8e18faeac3 ("lib/plist:
rename DEBUG_PI_LIST to DEBUG_PLIST").

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Fixes: 8e18faeac3 ("lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:40 -08:00
Jason A. Donenfeld
782c72af56 wireguard: selftests: actually test for routing loops
We previously removed the restriction on looping to self, and then added
a test to make sure the kernel didn't blow up during a routing loop. The
kernel didn't blow up, thankfully, but on certain architectures where
skb fragmentation is easier, such as ppc64, the skbs weren't actually
being discarded after a few rounds through. But the test wasn't catching
this. So actually test explicitly for massive increases in tx to see if
we have a routing loop. Note that the actual loop problem will need to
be addressed in a different commit.

Fixes: b673e24aad ("wireguard: socket: remove errant restriction on looping to self")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:29 -08:00
Jason A. Donenfeld
03ff1b1def wireguard: selftests: increase default dmesg log size
The selftests currently parse the kernel log at the end to track
potential memory leaks. With these tests now reading off the end of the
buffer, due to recent optimizations, some creation messages were lost,
making the tests think that there was a free without an alloc. Fix this
by increasing the kernel log size.

Fixes: 24b70eeeb4 ("wireguard: use synchronize_net rather than synchronize_rcu")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:29 -08:00
Linus Torvalds
c5c17547b7 Merge tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from netfilter.

  Current release - regressions:

   - r8169: fix incorrect mac address assignment

   - vlan: fix underflow for the real_dev refcnt when vlan creation
     fails

   - smc: avoid warning of possible recursive locking

  Current release - new code bugs:

   - vsock/virtio: suppress used length validation

   - neigh: fix crash in v6 module initialization error path

  Previous releases - regressions:

   - af_unix: fix change in behavior in read after shutdown

   - igb: fix netpoll exit with traffic, avoid warning

   - tls: fix splice_read() when starting mid-record

   - lan743x: fix deadlock in lan743x_phy_link_status_change()

   - marvell: prestera: fix bridge port operation

  Previous releases - always broken:

   - tcp_cubic: fix spurious Hystart ACK train detections for
     not-cwnd-limited flows

   - nexthop: fix refcount issues when replacing IPv6 groups

   - nexthop: fix null pointer dereference when IPv6 is not enabled

   - phylink: force link down and retrigger resolve on interface change

   - mptcp: fix delack timer length calculation and incorrect early
     clearing

   - ieee802154: handle iftypes as u32, prevent shift-out-of-bounds

   - nfc: virtual_ncidev: change default device permissions

   - netfilter: ctnetlink: fix error codes and flags used for kernel
     side filtering of dumps

   - netfilter: flowtable: fix IPv6 tunnel addr match

   - ncsi: align payload to 32-bit to fix dropped packets

   - iavf: fix deadlock and loss of config during VF interface reset

   - ice: avoid bpf_prog refcount underflow

   - ocelot: fix broken PTP over IP and PTP API violations

  Misc:

   - marvell: mvpp2: increase MTU limit when XDP enabled"

* tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
  net: dsa: microchip: implement multi-bridge support
  net: mscc: ocelot: correctly report the timestamping RX filters in ethtool
  net: mscc: ocelot: set up traps for PTP packets
  net: ptp: add a definition for the UDP port for IEEE 1588 general messages
  net: mscc: ocelot: create a function that replaces an existing VCAP filter
  net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP
  net: hns3: fix incorrect components info of ethtool --reset command
  net: hns3: fix one incorrect value of page pool info when queried by debugfs
  net: hns3: add check NULL address for page pool
  net: hns3: fix VF RSS failed problem after PF enable multi-TCs
  net: qed: fix the array may be out of bound
  net/smc: Don't call clcsock shutdown twice when smc shutdown
  net: vlan: fix underflow for the real_dev refcnt
  ptp: fix filter names in the documentation
  ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()
  nfc: virtual_ncidev: change default device permissions
  net/sched: sch_ets: don't peek at classes beyond 'nbands'
  net: stmmac: Disable Tx queues when reconfiguring the interface
  selftests: tls: test for correct proto_ops
  tls: fix replacing proto_ops
  ...
2021-11-26 12:58:53 -08:00
Vitaly Kuznetsov
908fa88e42 KVM: selftests: Make sure kvm_create_max_vcpus test won't hit RLIMIT_NOFILE
With the elevated 'KVM_CAP_MAX_VCPUS' value kvm_create_max_vcpus test
may hit RLIMIT_NOFILE limits:

 # ./kvm_create_max_vcpus
 KVM_CAP_MAX_VCPU_ID: 4096
 KVM_CAP_MAX_VCPUS: 1024
 Testing creating 1024 vCPUs, with IDs 0...1023.
 /dev/kvm not available (errno: 24), skipping test

Adjust RLIMIT_NOFILE limits to make sure KVM_CAP_MAX_VCPUS fds can be
opened. Note, raising hard limit ('rlim_max') requires CAP_SYS_RESOURCE
capability which is generally not needed to run kvm selftests (but without
raising the limit the test is doomed to fail anyway).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211123135953.667434-1-vkuznets@redhat.com>
[Skip the test if the hard limit can be raised. - Paolo]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 08:14:20 -05:00
Vitaly Kuznetsov
6c1186430a KVM: selftests: Avoid KVM_SET_CPUID2 after KVM_RUN in hyperv_features test
hyperv_features's sole purpose is to test access to various Hyper-V MSRs
and hypercalls with different CPUID data. As KVM_SET_CPUID2 after KVM_RUN
is deprecated and soon-to-be forbidden, avoid it by re-creating test VM
for each sub-test.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20211122175818.608220-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 08:14:19 -05:00
Paolo Bonzini
826bff439f selftests: sev_migrate_tests: free all VMs
Ensure that the ASID are freed promptly, which becomes more important
when more tests are added to this file.

Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 06:43:30 -05:00
Paolo Bonzini
4916ea8b06 selftests: fix check for circular KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM
KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM leaves the source VM in a dead state,
so migrating back to the original source VM fails the ioctl.  Adjust
the test.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-26 06:43:29 -05:00
Jakub Kicinski
f884a34262 selftests: tls: test for correct proto_ops
Previous patch fixes overriding callbacks incorrectly. Triggering
the crash in sendpage_locked would be more spectacular but it's
hard to get to, so take the easier path of proving this is broken
and call getname. We're currently getting IPv4 socket info on an
IPv6 socket.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25 19:28:17 -08:00