Commit Graph

948892 Commits

Author SHA1 Message Date
Sean Wang
f8d6379932 mt76: mt7663: fix the usage WoW with net detect support
mt7615_mcu_sched_scan_enable should be taken along with
mt7615_mcu_sched_scan_req to have proper scan plans initialization.

Fixes: bd39bd2f00c3 ("mt76: mt7663: introduce WoW with net detect support")
Co-developed-by: Wan-Feng Jiang <Wan-Feng.Jiang@mediatek.com>
Signed-off-by: Wan-Feng Jiang <Wan-Feng.Jiang@mediatek.com>
Co-developed-by: Soul Huang <Soul.Huang@mediatek.com>
Signed-off-by: Soul Huang <Soul.Huang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
2020-05-13 20:01:10 +02:00
Rob Herring
848685c25d ARM: vexpress: Don't select VEXPRESS_CONFIG
CONFIG_VEXPRESS_CONFIG has 'default y if ARCH_VEXPRESS', so selecting is
unnecessary. Selecting it also prevents setting CONFIG_VEXPRESS_CONFIG
to a module.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:47 -05:00
Rob Herring
70e4758aaa bus: vexpress-config: Support building as module
Enable building vexpress-config driver as a module.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:46 -05:00
Rob Herring
310f80d617 vexpress: Move setting master site to vexpress-config bus
There's only a single caller of vexpress_config_set_master() from
vexpress-sysreg.c. Let's just make the registers needed available to
vexpress-config and move all the code there. The registers needed aren't
used anywhere else either. With this, we can get rid of the private API
between these 2 drivers.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:46 -05:00
Rob Herring
a5a38765ac bus: vexpress-config: simplify config bus probing
The vexpress-config initialization is dependent on the vexpress-syscfg
driver probing. As vexpress-config was not a driver, deferred probe
could not be used and instead initcall ordering was relied upon. This is
fragile and doesn't work for modules.

Let's move the config bus init into the vexpress-syscfg probe. This
eliminates the initcall ordering requirement and the need to create a
struct device and the "vexpress-config" class.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:46 -05:00
Rob Herring
d06cfe3f12 bus: vexpress-config: Merge vexpress-syscfg into vexpress-config
The only thing that vexpress-syscfg does is provide a regmap to
vexpress-config bus child devices. There's little reason to have 2
components for this. The current structure with initcall ordering
requirements makes turning these components into modules more difficult.

So let's start to simplify things and merge vexpress-syscfg into
vexpress-config. There's no functional change in this commit and it's
still separate components until subsequent commits.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:46 -05:00
Rob Herring
7b9d428e05 mfd: vexpress-sysreg: Support building as a module
Enable building the vexpress-sysreg driver as a module.

As deferred probe between the vexpress components works now, we don't
need to create struct devices early with of_platform_device_create().

Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:45 -05:00
Rob Herring
0ea355ef78 mfd: vexpress-sysreg: Use devres API variants
Use the managed devm_gpiochip_add_data() and devm_mfd_add_devices()
instead of their unmanaged counterparts. With this, no .remove() hook is
needed for driver unbind.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:45 -05:00
Rob Herring
13fc767335 mfd: vexpress-sysreg: Drop unused syscon child devices
The "sys_id", "sys_misc" and "sys_procid" devices don't have a user
anywhere in the tree and do nothing more than create a syscon regmap for
a single register or 2. That's an overkill for creating child devices.
Let's just remove them.

Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:45 -05:00
Rob Herring
a229635f3b mfd: vexpress-sysreg: Drop selecting CONFIG_CLKSRC_MMIO
Nothing in the VExpress sysregs nor the MFD child drivers use
CONFIG_CLKSRC_MMIO. There's the 24MHz counter, but that's handled by
drivers/clocksource/timer-versatile.c which doesn't use
CONFIG_CLKSRC_MMIO either. So let's just drop CONFIG_CLKSRC_MMIO.

As the !ARCH_USES_GETTIMEOFFSET dependency was added for
CONFIG_CLKSRC_MMIO, that can be dropped, too.

Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:45 -05:00
Rob Herring
75b272bd09 clk: vexpress-osc: Support building as a module
Enable building the vexpress-osc clock driver as a module.

Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Michael Turquette <mturquette@baylibre.com>
Cc: linux-clk@vger.kernel.org
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-05-13 12:42:45 -05:00
Anson Huang
768062fd12 Input: imx_sc_key - use devm_add_action_or_reset() to handle all cleanups
Use devm_add_action_or_reset() to handle all cleanups of failure in
.probe and .remove, then .remove callback can be dropped.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Link: https://lore.kernel.org/r/1584082751-17047-1-git-send-email-Anson.Huang@nxp.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-05-13 10:34:58 -07:00
Brian Masney
2ecf9487a7 Input: remove msm-vibrator driver
The address referenced by this driver is within the Qualcomm Clock
namespace so let's drop the msm-vibrator bindings so that a more generic
solution can be used instead.  No one is currently using driver so this
won't affect any users.

Signed-off-by: Brian Masney <masneyb@onstation.org>
Link: https://lore.kernel.org/r/20200513013140.69935-3-masneyb@onstation.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-05-13 10:34:57 -07:00
Brian Masney
d364436337 dt-bindings: Input: remove msm-vibrator
The address referenced in this binding is within the Qualcomm Clock
namespace so let's drop the msm-vibrator bindings so that a more
generic solution can be used instead.  No one is currently using these
bindings so this won't affect any users.

Signed-off-by: Brian Masney <masneyb@onstation.org>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20200513013140.69935-2-masneyb@onstation.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-05-13 10:34:55 -07:00
Samu Nuutamo
333e22db22 hwmon: (da9052) Synchronize access with mfd
When tsi-as-adc is configured it is possible for in7[0123]_input read to
return an incorrect value if a concurrent read to in[456]_input is
performed. This is caused by a concurrent manipulation of the mux
channel without proper locking as hwmon and mfd use different locks for
synchronization.

Switch hwmon to use the same lock as mfd when accessing the TSI channel.

Fixes: 4f16cab19a ("hwmon: da9052: Add support for TSI channel")
Signed-off-by: Samu Nuutamo <samu.nuutamo@vincit.fi>
[rebase to current master, reword commit message slightly]
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2020-05-13 10:06:09 -07:00
David Manouchehri
eb4a6de496 thunderbolt: Update Kconfig to allow building on other architectures.
Thunderbolt 3 and USB4 shouldn't be x86 only.

Tested on a SolidRun HoneyComb (ARM Cortex-A72) with a
Gigabyte Titan Ridge Thunderbolt 3 PCIe card (JHL7540).

Signed-off-by: David Manouchehri <david.manouchehri@riseup.net>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2020-05-13 19:32:48 +03:00
Sean Christopherson
e93fd3b3e8 KVM: x86/mmu: Capture TDP level when updating CPUID
Snapshot the TDP level now that it's invariant (SVM) or dependent only
on host capabilities and guest CPUID (VMX).  This avoids having to call
kvm_x86_ops.get_tdp_level() when initializing a TDP MMU and/or
calculating the page role, and thus avoids the associated retpoline.

Drop the WARN in vmx_get_tdp_level() as updating CPUID while L2 is
active is legal, if dodgy.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200502043234.12481-11-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:14 -04:00
Sean Christopherson
0047fcade4 KVM: VMX: Move nested EPT out of kvm_x86_ops.get_tdp_level() hook
Separate the "core" TDP level handling from the nested EPT path to make
it clear that kvm_x86_ops.get_tdp_level() is used if and only if nested
EPT is not in use (kvm_init_shadow_ept_mmu() calculates the level from
the passed in vmcs12->eptp).  Add a WARN_ON() to enforce that the
kvm_x86_ops hook is not called for nested EPT.

This sets the stage for snapshotting the non-"nested EPT" TDP page level
during kvm_cpuid_update() to avoid the retpoline associated with
kvm_x86_ops.get_tdp_level() when resetting the MMU, a relatively
frequent operation when running a nested guest.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200502043234.12481-10-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:13 -04:00
Sean Christopherson
bd31fe495d KVM: VMX: Add proper cache tracking for CR0
Move CR0 caching into the standard register caching mechanism in order
to take advantage of the availability checks provided by regs_avail.
This avoids multiple VMREADs in the (uncommon) case where kvm_read_cr0()
is called multiple times in a single VM-Exit, and more importantly
eliminates a kvm_x86_ops hook, saves a retpoline on SVM when reading
CR0, and squashes the confusing naming discrepancy of "cache_reg" vs.
"decache_cr0_guest_bits".

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200502043234.12481-8-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:12 -04:00
Sean Christopherson
f98c1e7712 KVM: VMX: Add proper cache tracking for CR4
Move CR4 caching into the standard register caching mechanism in order
to take advantage of the availability checks provided by regs_avail.
This avoids multiple VMREADs and retpolines (when configured) during
nested VMX transitions as kvm_read_cr4_bits() is invoked multiple times
on each transition, e.g. when stuffing CR0 and CR3.

As an added bonus, this eliminates a kvm_x86_ops hook, saves a retpoline
on SVM when reading CR4, and squashes the confusing naming discrepancy
of "cache_reg" vs. "decache_cr4_guest_bits".

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200502043234.12481-7-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:10 -04:00
Sean Christopherson
0cc69204e7 KVM: nVMX: Unconditionally validate CR3 during nested transitions
Unconditionally check the validity of the incoming CR3 during nested
VM-Enter/VM-Exit to avoid invoking kvm_read_cr3() in the common case
where the guest isn't using PAE paging.  If vmcs.GUEST_CR3 hasn't yet
been cached (common case), kvm_read_cr3() will trigger a VMREAD.  The
VMREAD (~30 cycles) alone is likely slower than nested_cr3_valid()
(~5 cycles if vcpu->arch.maxphyaddr gets a cache hit), and the poor
exchange only gets worse when retpolines are enabled as the call to
kvm_x86_ops.cache_reg() will incur a retpoline (60+ cycles).

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200502043234.12481-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:09 -04:00
Sean Christopherson
56ba77a459 KVM: x86: Save L1 TSC offset in 'struct kvm_vcpu_arch'
Save L1's TSC offset in 'struct kvm_vcpu_arch' and drop the kvm_x86_ops
hook read_l1_tsc_offset().  This avoids a retpoline (when configured)
when reading L1's effective TSC, which is done at least once on every
VM-Exit.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200502043234.12481-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:04 -04:00
Sean Christopherson
1af1bb0562 KVM: nVMX: Skip IBPB when temporarily switching between vmcs01 and vmcs02
Skip the Indirect Branch Prediction Barrier that is triggered on a VMCS
switch when temporarily loading vmcs02 to synchronize it to vmcs12, i.e.
give copy_vmcs02_to_vmcs12_rare() the same treatment as
vmx_switch_vmcs().

Make vmx_vcpu_load() static now that it's only referenced within vmx.c.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200506235850.22600-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:03 -04:00
Sean Christopherson
5c911beff2 KVM: nVMX: Skip IBPB when switching between vmcs01 and vmcs02
Skip the Indirect Branch Prediction Barrier that is triggered on a VMCS
switch when running with spectre_v2_user=on/auto if the switch is
between two VMCSes in the same guest, i.e. between vmcs01 and vmcs02.
The IBPB is intended to prevent one guest from attacking another, which
is unnecessary in the nested case as it's the same guest from KVM's
perspective.

This all but eliminates the overhead observed for nested VMX transitions
when running with CONFIG_RETPOLINE=y and spectre_v2_user=on/auto, which
can be significant, e.g. roughly 3x on current systems.

Reported-by: Alexander Graf <graf@amazon.com>
Cc: KarimAllah Raslan <karahmed@amazon.de>
Cc: stable@vger.kernel.org
Fixes: 15d4507152 ("KVM/x86: Add IBPB support")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200501163117.4655-1-sean.j.christopherson@intel.com>
[Invert direction of bool argument. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:02 -04:00
Sean Christopherson
f27ad73a6e KVM: VMX: Use accessor to read vmcs.INTR_INFO when handling exception
Use vmx_get_intr_info() when grabbing the cached vmcs.INTR_INFO in
handle_exception_nmi() to ensure the cache isn't stale.  Bypassing the
caching accessor doesn't cause any known issues as the cache is always
refreshed by handle_exception_nmi_irqoff(), but the whole point of
adding the proper caching mechanism was to avoid such dependencies.

Fixes: 8791585837 ("KVM: VMX: Cache vmcs.EXIT_INTR_INFO using arch avail_reg flags")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200427171837.22613-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:15:01 -04:00
Paolo Bonzini
fede8076aa KVM: x86: handle wrap around 32-bit address space
KVM is not handling the case where EIP wraps around the 32-bit address
space (that is, outside long mode).  This is needed both in vmx.c
and in emulate.c.  SVM with NRIPS is okay, but it can still print
an error to dmesg due to integer overflow.

Reported-by: Nick Peterson <everdox@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:59 -04:00
Jason Yan
c4e115f08c kvm/eventfd: remove unneeded conversion to bool
The '==' expression itself is bool, no need to convert it to bool again.
This fixes the following coccicheck warning:

virt/kvm/eventfd.c:724:38-43: WARNING: conversion to bool not needed
here

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Message-Id: <20200420123805.4494-1-yanaijie@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:58 -04:00
Davidlohr Bueso
da4ad88cab kvm: Replace vcpu->swait with rcuwait
The use of any sort of waitqueue (simple or regular) for
wait/waking vcpus has always been an overkill and semantically
wrong. Because this is per-vcpu (which is blocked) there is
only ever a single waiting vcpu, thus no need for any sort of
queue.

As such, make use of the rcuwait primitive, with the following
considerations:

  - rcuwait already provides the proper barriers that serialize
  concurrent waiter and waker.

  - Task wakeup is done in rcu read critical region, with a
  stable task pointer.

  - Because there is no concurrency among waiters, we need
  not worry about rcuwait_wait_event() calls corrupting
  the wait->task. As a consequence, this saves the locking
  done in swait when modifying the queue. This also applies
  to per-vcore wait for powerpc kvm-hv.

The x86 tscdeadline_latency test mentioned in 8577370fb0
("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
latency is reduced by around 15-20% with this change.

Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-mips@vger.kernel.org
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Message-Id: <20200424054837.5138-6-dave@stgolabs.net>
[Avoid extra logic changes. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:56 -04:00
Davidlohr Bueso
191a43be61 rcuwait: Introduce rcuwait_active()
This call is lockless and thus should not be trusted blindly.
For example, the return value of rcuwait_wakeup() should be used to
track whether a process was woken up.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Message-Id: <20200424054837.5138-5-dave@stgolabs.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:54 -04:00
Davidlohr Bueso
5c21f7b322 rcuwait: Introduce prepare_to and finish_rcuwait
This allows further flexibility for some callers to implement
ad-hoc versions of the generic rcuwait_wait_event(). For example,
kvm will need this to maintain tracing semantics. The naming
is of course similar to what waitqueue apis offer.

Also go ahead and make use of rcu_assign_pointer() for both task
writes as it will make the __rcu sparse people happy - this will
be the special nil case, thus no added serialization.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Message-Id: <20200424054837.5138-4-dave@stgolabs.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:53 -04:00
Davidlohr Bueso
9d9a6ebfea rcuwait: Let rcuwait_wake_up() return whether or not a task was awoken
Propagating the return value of wake_up_process() back to the caller
can come in handy for future users, such as for statistics or
accounting purposes.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Message-Id: <20200424054837.5138-3-dave@stgolabs.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:52 -04:00
Davidlohr Bueso
c9d64a1b2d rcuwait: Fix stale wake call name in comment
The 'trywake' name was renamed to simply 'wake', update the comment.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Message-Id: <20200424054837.5138-2-dave@stgolabs.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:51 -04:00
Paolo Bonzini
c300ab9f08 KVM: x86: Replace late check_nested_events() hack with more precise fix
Add an argument to interrupt_allowed and nmi_allowed, to checking if
interrupt injection is blocked.  Use the hook to handle the case where
an interrupt arrives between check_nested_events() and the injection
logic.  Drop the retry of check_nested_events() that hack-a-fixed the
same condition.

Blocking injection is also a bit of a hack, e.g. KVM should do exiting
and non-exiting interrupt processing in a single pass, but it's a more
precise hack.  The old comment is also misleading, e.g. KVM_REQ_EVENT is
purely an optimization, setting it on every run loop (which KVM doesn't
do) should not affect functionality, only performance.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-13-sean.j.christopherson@intel.com>
[Extend to SVM, add SMI and NMI.  Even though NMI and SMI cannot come
 asynchronously right now, making the fix generic is easy and removes a
 special case. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:49 -04:00
Sean Christopherson
7ab0abdb55 KVM: VMX: Use vmx_get_rflags() to query RFLAGS in vmx_interrupt_blocked()
Use vmx_get_rflags() instead of manually reading vmcs.GUEST_RFLAGS when
querying RFLAGS.IF so that multiple checks against interrupt blocking in
a single run loop only require a single VMREAD.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-14-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:48 -04:00
Sean Christopherson
db43859280 KVM: VMX: Use vmx_interrupt_blocked() directly from vmx_handle_exit()
Use vmx_interrupt_blocked() instead of bouncing through
vmx_interrupt_allowed() when handling edge cases in vmx_handle_exit().
The nested_run_pending check in vmx_interrupt_allowed() should never
evaluate true in the VM-Exit path.

Hoist the WARN in handle_invalid_guest_state() up to vmx_handle_exit()
to enforce the above assumption for the !enable_vnmi case, and to detect
any other potential bugs with nested VM-Enter.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-12-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:47 -04:00
Sean Christopherson
3b82b8d7fd KVM: x86: WARN on injected+pending exception even in nested case
WARN if a pending exception is coincident with an injected exception
before calling check_nested_events() so that the WARN will fire even if
inject_pending_event() bails early because check_nested_events() detects
the conflict.  Bailing early isn't problematic (quite the opposite), but
suppressing the WARN is undesirable as it could mask a bug elsewhere in
KVM.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-11-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:46 -04:00
Paolo Bonzini
221e761090 KVM: nSVM: Preserve IRQ/NMI/SMI priority irrespective of exiting behavior
Short circuit vmx_check_nested_events() if an unblocked IRQ/NMI/SMI is
pending and needs to be injected into L2, priority between coincident
events is not dependent on exiting behavior.

Fixes: b518ba9fa6 ("KVM: nSVM: implement check_nested_events for interrupts")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:45 -04:00
Paolo Bonzini
fc6f7c03ad KVM: nSVM: Report interrupts as allowed when in L2 and exit-on-interrupt is set
Report interrupts as allowed when the vCPU is in L2 and L2 is being run with
exit-on-interrupts enabled and EFLAGS.IF=1 (either on the host or on the guest
according to VINTR).  Interrupts are always unblocked from L1's perspective
in this case.

While moving nested_exit_on_intr to svm.h, use INTERCEPT_INTR properly instead
of assuming it's zero (which it is of course).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:44 -04:00
Sean Christopherson
1cd2f0b0dd KVM: nVMX: Prioritize SMI over nested IRQ/NMI
Check for an unblocked SMI in vmx_check_nested_events() so that pending
SMIs are correctly prioritized over IRQs and NMIs when the latter events
will trigger VM-Exit.  This also fixes an issue where an SMI that was
marked pending while processing a nested VM-Enter wouldn't trigger an
immediate exit, i.e. would be incorrectly delayed until L2 happened to
take a VM-Exit.

Fixes: 64d6067057 ("KVM: x86: stubs for SMM support")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-10-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:43 -04:00
Sean Christopherson
15ff0b450b KVM: nVMX: Preserve IRQ/NMI priority irrespective of exiting behavior
Short circuit vmx_check_nested_events() if an unblocked IRQ/NMI is
pending and needs to be injected into L2, priority between coincident
events is not dependent on exiting behavior.

Fixes: b6b8a1451f ("KVM: nVMX: Rework interception of IRQs and NMIs")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-9-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:42 -04:00
Paolo Bonzini
cae96af184 KVM: SVM: Split out architectural interrupt/NMI/SMI blocking checks
Move the architectural (non-KVM specific) interrupt/NMI/SMI blocking checks
to a separate helper so that they can be used in a future patch by
svm_check_nested_events().

No functional change intended.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:40 -04:00
Sean Christopherson
1b660b6baa KVM: VMX: Split out architectural interrupt/NMI blocking checks
Move the architectural (non-KVM specific) interrupt/NMI blocking checks
to a separate helper so that they can be used in a future patch by
vmx_check_nested_events().

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-8-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:39 -04:00
Paolo Bonzini
55714cddbf KVM: nSVM: Move SMI vmexit handling to svm_check_nested_events()
Unlike VMX, SVM allows a hypervisor to take a SMI vmexit without having
any special SMM-monitor enablement sequence.  Therefore, it has to be
handled like interrupts and NMIs.  Check for an unblocked SMI in
svm_check_nested_events() so that pending SMIs are correctly prioritized
over IRQs and NMIs when the latter events will trigger VM-Exit.

Note that there is no need to test explicitly for SMI vmexits, because
guests always runs outside SMM and therefore can never get an SMI while
they are blocked.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:38 -04:00
Paolo Bonzini
bbdad0b5a7 KVM: nSVM: Report NMIs as allowed when in L2 and Exit-on-NMI is set
Report NMIs as allowed when the vCPU is in L2 and L2 is being run with
Exit-on-NMI enabled, as NMIs are always unblocked from L1's perspective
in this case.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:33 -04:00
Sean Christopherson
429ab576f3 KVM: nVMX: Report NMIs as allowed when in L2 and Exit-on-NMI is set
Report NMIs as allowed when the vCPU is in L2 and L2 is being run with
Exit-on-NMI enabled, as NMIs are always unblocked from L1's perspective
in this case.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-7-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:32 -04:00
Paolo Bonzini
a9fa7cb6aa KVM: x86: replace is_smm checks with kvm_x86_ops.smi_allowed
Do not hardcode is_smm so that all the architectural conditions for
blocking SMIs are listed in a single place.  Well, in two places because
this introduces some code duplication between Intel and AMD.

This ensures that nested SVM obeys GIF in kvm_vcpu_has_events.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:31 -04:00
Sean Christopherson
88c604b66e KVM: x86: Make return for {interrupt_nmi,smi}_allowed() a bool instead of int
Return an actual bool for kvm_x86_ops' {interrupt_nmi}_allowed() hook to
better reflect the return semantics, and to avoid creating an even
bigger mess when the related VMX code is refactored in upcoming patches.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:29 -04:00
Sean Christopherson
8081ad06b6 KVM: x86: Set KVM_REQ_EVENT if run is canceled with req_immediate_exit set
Re-request KVM_REQ_EVENT if vcpu_enter_guest() bails after processing
pending requests and an immediate exit was requested.  This fixes a bug
where a pending event, e.g. VMX preemption timer, is delayed and/or lost
if the exit was deferred due to something other than a higher priority
_injected_ event, e.g. due to a pending nested VM-Enter.  This bug only
affects the !injected case as kvm_x86_ops.cancel_injection() sets
KVM_REQ_EVENT to redo the injection, but that's purely serendipitous
behavior with respect to the deferred event.

Note, emulated preemption timer isn't the only event that can be
affected, it simply happens to be the only event where not re-requesting
KVM_REQ_EVENT is blatantly visible to the guest.

Fixes: f4124500c2 ("KVM: nVMX: Fully emulate preemption timer")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-4-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:28 -04:00
Sean Christopherson
d2060bd42e KVM: nVMX: Open a window for pending nested VMX preemption timer
Add a kvm_x86_ops hook to detect a nested pending "hypervisor timer" and
use it to effectively open a window for servicing the expired timer.
Like pending SMIs on VMX, opening a window simply means requesting an
immediate exit.

This fixes a bug where an expired VMX preemption timer (for L2) will be
delayed and/or lost if a pending exception is injected into L2.  The
pending exception is rightly prioritized by vmx_check_nested_events()
and injected into L2, with the preemption timer left pending.  Because
no window opened, L2 is free to run uninterrupted.

Fixes: f4124500c2 ("KVM: nVMX: Fully emulate preemption timer")
Reported-by: Jim Mattson <jmattson@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Peter Shier <pshier@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-3-sean.j.christopherson@intel.com>
[Check it in kvm_vcpu_has_events too, to ensure that the preemption
 timer is serviced promptly even if the vCPU is halted and L1 is not
 intercepting HLT. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:27 -04:00
Sean Christopherson
6ce347af14 KVM: nVMX: Preserve exception priority irrespective of exiting behavior
Short circuit vmx_check_nested_events() if an exception is pending and
needs to be injected into L2, priority between coincident events is not
dependent on exiting behavior.  This fixes a bug where a single-step #DB
that is not intercepted by L1 is incorrectly dropped due to servicing a
VMX Preemption Timer VM-Exit.

Injected exceptions also need to be blocked if nested VM-Enter is
pending or an exception was already injected, otherwise injecting the
exception could overwrite an existing event injection from L1.
Technically, this scenario should be impossible, i.e. KVM shouldn't
inject its own exception during nested VM-Enter.  This will be addressed
in a future patch.

Note, event priority between SMI, NMI and INTR is incorrect for L2, e.g.
SMI should take priority over VM-Exit on NMI/INTR, and NMI that is
injected into L2 should take priority over VM-Exit INTR.  This will also
be addressed in a future patch.

Fixes: b6b8a1451f ("KVM: nVMX: Rework interception of IRQs and NMIs")
Reported-by: Jim Mattson <jmattson@google.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Peter Shier <pshier@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200423022550.15113-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-13 12:14:25 -04:00