Commit Graph

43623 Commits

Author SHA1 Message Date
Linus Torvalds
e623175f64 - Drop __init annotation from two rtc functions which get called after
boot is done, in order to prevent a crash
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQ76ZYACgkQEsHwGGHe
 VUouEA//UfGRp9FysHX80FRIb9ZBwdVYZJDdf6CKaSqVQ+JLlT2cDZN3XrogtGG9
 1mxyRhmLXuh6D944o4NQIOsz4odXh9oNqNtIvqHz37YcMeJEtZlm/y6mKOJ4X8+E
 zbjlNyEsMvnqN5Y0Dr+fOSKgulhYLBOhFa+rgPqJqPOQ2/Ug4pCxJUrBhHxonPXQ
 rNjjG1kWyHWlGTh1CGtfESiYZUOuC5Ag4hahbB+VLMyW5iUALmUIfGnwpciuwEFx
 zNE0EgHW8al4e+UfH+Pa5mdyawMfur26l/EmA1yNEDbXH1uQNR1EUllD17ybfDnq
 MxicIGjpY8Cv1DZQzIAi1JFRVGcYyTIPGfn7P7pzF8C4Q9aOaOLr1MZhA8OA3D3o
 /Gb5Jn09uLKjn5u3axKjgs8pVAAmb3xWURLXxBILC2Mv9S7m1pgsbTmz3JmX3kpt
 6nqFXvxt6ZZ6xTLC5GM8RRyed4/gsPX3O7LuamjtSaswZqh0DnVXU64meShLzTph
 z2u5rTPOXeYxQE2C87bh0ghU/G2mywrH1wsK59OhDSzsKcJYD5HkG66aBFyCAkoV
 e0a565BC/lUn+6DjCK4ldTKv+QSbtwMyUg2iJxh/bhENROEI/0t/sUe2WYxQJCKn
 UaEIsJn0Cj4/iyAROt1nFRnZknYZgeGu587+GCBwtFL+asRyUhs=
 =Lrs2
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.3_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:

 - Drop __init annotation from two rtc functions which get called after
   boot is done, in order to prevent a crash

* tag 'x86_urgent_for_v6.3_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/rtc: Remove __init for runtime functions
2023-04-16 10:28:29 -07:00
Linus Torvalds
f0dd81db3e Kbuild fixes for v6.3 (3rd)
- Drop debug info from purgatory objects again
 
  - Document that kernel.org provides prebuilt LLVM toolchains
 
  - Give up handling untracked files for source package builds
 
  - Avoid creating corrupted cpio when KBUILD_BUILD_TIMESTAMP is given
    with a pre-epoch data.
 
  - Change panic_show_mem() to a macro to handle variable-length argument
 
  - Compress tarballs on-the-fly again
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmQ7zuUVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsG3/AP/12Y/zmiCY/R2NQmhYAGt05K9I5R
 M0mQbdNLlzmXfrXPEAUVWbUcEOZ1jFxqkXJwhf1kYHmJ+2MeS/n1d4A+4x4eGXDE
 a2WfB/G8FP5+T6+u/U/LDfMqVpbuAfBkZXu5QeceUlHyTY1Q1BAO9PUvNu+Df2P7
 i1SBEDVzaN3WE1ZOc3pjQf0tIQybyM/x9EjGrh4DpBVTQYpmmgxpCdyzVr7yJvQ3
 LFHy+OtX2WTqtgULUunH776pp59EGgoKLDSLck+wG7bdmYm5y/13uVGc1iBfGBAX
 RUV8qaP1ijB0BHZZnzsppUjZshSeS97sOUVv6hiwRdWgr0ISfZuVYN0E2YlhBcFV
 UUidlk1l1VOytM8/EDrYyHnTvmHm+glMp1FxRR48ymZr1PqVUxQcad0lPClylp1b
 Xc50C2wkFwa5a8RkY0aIihrVpnbHBSiPVHvaF01kFwNUor+VASpanR/xtTr4b88x
 OK9aImRII15CxoOZdWtvut4c0OHw4sbyzmCuXM/nyS6c5+yroM/QZZs+c2ja/QEv
 QNlDW54JzU6u+JE4O7W/gH3mqKH8ytL7Z5hmVECiiCYWp77IBCE8B+3dXVGfdGUg
 Wy8MgtvZMW0ZAseqBD6VmVXkQIizUAgpJvkJy7R6YTw52P6sT2P2WAcWUTi2FQTD
 FpUEf8eMi1UTDQgG
 =bFZ1
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Drop debug info from purgatory objects again

 - Document that kernel.org provides prebuilt LLVM toolchains

 - Give up handling untracked files for source package builds

 - Avoid creating corrupted cpio when KBUILD_BUILD_TIMESTAMP is given
   with a pre-epoch data.

 - Change panic_show_mem() to a macro to handle variable-length argument

 - Compress tarballs on-the-fly again

* tag 'kbuild-fixes-v6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: do not create intermediate *.tar for tar packages
  kbuild: do not create intermediate *.tar for source tarballs
  kbuild: merge cmd_archive_linux and cmd_archive_perf
  init/initramfs: Fix argument forwarding to panic() in panic_show_mem()
  initramfs: Check negative timestamp to prevent broken cpio archive
  kbuild: give up untracked files for source package builds
  Documentation/llvm: Add a note about prebuilt kernel.org toolchains
  purgatory: fix disabling debug info
2023-04-16 09:46:32 -07:00
Matija Glavinic Pecotic
775d3c514c x86/rtc: Remove __init for runtime functions
set_rtc_noop(), get_rtc_noop() are after booting, therefore their __init
annotation is wrong.

A crash was observed on an x86 platform where CMOS RTC is unused and
disabled via device tree. set_rtc_noop() was invoked from ntp:
sync_hw_clock(), although CONFIG_RTC_SYSTOHC=n, however sync_cmos_clock()
doesn't honour that.

  Workqueue: events_power_efficient sync_hw_clock
  RIP: 0010:set_rtc_noop
  Call Trace:
   update_persistent_clock64
   sync_hw_clock

Fix this by dropping the __init annotation from set/get_rtc_noop().

Fixes: c311ed6183 ("x86/init: Allow DT configured systems to disable RTC at boot time")
Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/59f7ceb1-446b-1d3d-0bc8-1f0ee94b1e18@nokia.com
2023-04-13 14:41:04 +02:00
Linus Torvalds
e62252bc55 pci-v6.3-fixes-2
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmQ1qFYUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxRJA/+JnDrZ8Ag8sIGgI7meuqWcLe3dHXQ
 7KaOGM/MP0ir8qJwiobuon2ml1lEU+iWZEcpFH9b9b5iurAfbvZkPhFAWZzJG3GW
 B6qEX5zaK4tNJHC8byK4pSat2pJLg8Q6xwA6c3H9264tj4Dc83dg2Y8hWWozbU2l
 xFaawRexCus7Xzd2VEfskwOpHQpSYkWCDtTFclIt1iZeelj6an/0FVdvYqvvIXi+
 QZjS80tMeZIcKJq06023KJX36lZHf0RGzYzQTyr7asiNnXBnQ1DfYIASwx1Nf6Y3
 9PpvoDVbUGnLDAB8THcOM9j9UOrhlbOJ+4b60J0b4nsT0ORn4rCrN05K0DIIXZi0
 CsP3qR2l04xYZFElWrvnA30KS76G7vq0OeAGpQsE0nB1PgydXuiXk5lniQhsrOuj
 xRiY7YeDFwiNzpjWsVtqkANvl6tA53ZUn+AMEPfIpdYbjnbLcSSF/qY3WCeKoftx
 KUg6fVOoXbbSctK5rXQnj2yhwQZSkm3DrG9Ol2i3E524vcMKs28QBBuUN+gtL3TP
 aD6nYP2Q0gItYAQCMtJHCBzfstrO5l9JSLM6xUHR/GwiaLCC9+QRdLJFcAxZvZJi
 buGs09B2Xz8AbgQ3AejEw7UXk+4JP19ZLyznBhLaaQQO9YOvDYlcn3Xw1QGvJGY/
 qsYKElgrLWSz+50=
 =2gjK
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

 - Provide pci_msix_can_alloc_dyn() stub when CONFIG_PCI_MSI unset to
   avoid build errors (Reinette Chatre)

 - Quirk AMD XHCI controller that loses MSI-X state in D3hot to avoid
   broken USB after hotplug or suspend/resume (Basavaraj Natikar)

 - Fix use-after-free in pci_bus_release_domain_nr() (Rob Herring)

* tag 'pci-v6.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: Fix use-after-free in pci_bus_release_domain_nr()
  x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot
  PCI/MSI: Provide missing stub for pci_msix_can_alloc_dyn()
2023-04-11 11:59:49 -07:00
Linus Torvalds
411eb01410 This pull request contains the following bug fix for UML:
- Build regression with older gcc versions
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmQzG18WHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wQiRD/9KhMHHduPimvPF8RPpUj6aOMjZ
 XM8qRw1xIy+Jq7YcMchyx5Xq75vG+VSFDDsn2dZUe36469BPtzT/2vRfuhDHoLSZ
 g8446Omi567xwdyMbIB7UdZQQd3+HVaPTSaMhIV6aYhz2IU3/zx4g8frS39/zC1q
 NUdiyBTEOMl6vJi1lmEq8F9XFlb4N/7CuEGhQA9TQmPbcJ8PM1BqBtnf99gsvqgl
 G3ap+hZPnJ3sqXDR2HIwvs66RQMXQwOoZ0NW1FUygEsEmFkYVXSZPzbXEi3oG/Kd
 l5cSGIBNVPY53hD9bU5bSqxZu1jWlV5S3A7UDNQ0SZPnK3Eh8yOvHfdz0X8BSXKd
 +f/8EJmsHmJcKanQ4sWKUBKUA6q+8KdjDsfJnYzBvyiUbeHlfejYf2AFuo3kavtM
 4erCLhgGw6hAXgNuEzygABwtMlnLaQ1dPARntPJPeG3G5bpGA05m9aqVNZnoRcWj
 2FO53tFSJVJRmHfIbYkK9n22znRniA/gx63HnQKcmOCmlItDm/GdChfyJY+eSJ61
 VyBm7tVg23A97oEtr0SlS+/iFp+LlxnP+g+MMxLC9P2//Alm9wkMoQ5CT7kAU9O5
 bHlW7c8w1dDFsENXyrR92RC/dls4C7wNUnw3V2qPDz2rDySbQHt2ILM9lWegMxSG
 wNdhjmC5xRZ0anVQag==
 =lVvD
 -----END PGP SIGNATURE-----

Merge tag 'uml-for-linus-6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

Pull UML fix from Richard Weinberger:

 - Build regression fix for older gcc versions

* tag 'uml-for-linus-6.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
  um: Only disable SSE on clang to work around old GCC bugs
2023-04-10 13:13:33 -07:00
Linus Torvalds
4ba115e269 - Add a new Intel Arrow Lake CPU model number
- Fix a confusion about how to check the version of the ACPI spec which
   supports a "online capable" bit in the MADT table which lead to
   a bunch of boot breakages with Zen1 systems and VMs
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQymF8ACgkQEsHwGGHe
 VUr2cQ//YjJTZpeZ0/azfqz3IvUEW9etR8NXFfBOiATjk6BfZPtjZizdwZ2GtFtk
 /P3XPDxa34dW4ifLBCHdGpabYF5OWUj44hB2nyf3PxDSGLt6EjLtnxQm+WtobwWh
 uQ8BBA6QFw0kHPqH7TI5xA/Z4m6AvIBKZpAsG6GG/K3jVLYUgNoCSxplV7OvC0/M
 gobjMLdoofOXvuevgX6K311YiZY4hniVUSSNdOy4oDNwDFSguqVwr3wjVC7Hod8t
 rl5EzNcabhMmy4Hf5OmCcU1WQdM+XS5GbulPW0PMliuvFE+2ImqVyO9G41MG+Gw0
 2qwhTsCA7R/Rm08eJh1ad5DmHdXc7M1ChC5s5ZTU25/9DDvRKxI7cHmIZxOxXCgy
 WJbP6LRw/9M7pBHatDjUTV7DdZnOkdsUK33dKWhiNZCB08EOVhcOVDq7EslMageH
 YhRmjpa2qfDpGOQpzwOZLgOpN3cz713QI7MBGFu0CBtbUR6ZKagQ0R/7As/vxuUo
 t5S8zMc/m+ra6til3MrP9QQvA0FA6Jo6H1DAZmglShtveO0eAGJCWaVbV8xppAIG
 DNGxk96OLJnigH0s0nAMHRQkJYM0yyKO02YkGbjGZhtLP53vDIiBqPD2PN5UJLO3
 U4ucZmmet/iExApEvbTbQ7uWfCEHLLDc3Kidh0fdoTDIC5JKu2U=
 =Nk8H
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.3_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Add a new Intel Arrow Lake CPU model number

 - Fix a confusion about how to check the version of the ACPI spec which
   supports a "online capable" bit in the MADT table which lead to a
   bunch of boot breakages with Zen1 systems and VMs

* tag 'x86_urgent_for_v6.3_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add model number for Intel Arrow Lake processor
  x86/acpi/boot: Correct acpi_is_processor_usable() check
  x86/ACPI/boot: Use FADT version to check support for online capable
2023-04-09 10:00:16 -07:00
Alyssa Ross
d83806c4c0 purgatory: fix disabling debug info
Since 32ef9e5054, -Wa,-gdwarf-2 is no longer used in KBUILD_AFLAGS.
Instead, it includes -g, the appropriate -gdwarf-* flag, and also the
-Wa versions of both of those if building with Clang and GNU as.  As a
result, debug info was being generated for the purgatory objects, even
though the intention was that it not be.

Fixes: 32ef9e5054 ("Makefile.debug: re-enable debug info for .S files")
Signed-off-by: Alyssa Ross <hi@alyssa.is>
Cc: stable@vger.kernel.org
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-04-08 19:36:53 +09:00
Basavaraj Natikar
f195fc1e97 x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot
The AMD [1022:15b8] USB controller loses some internal functional MSI-X
context when transitioning from D0 to D3hot. BIOS normally traps D0->D3hot
and D3hot->D0 transitions so it can save and restore that internal context,
but some firmware in the field can't do this because it fails to clear the
AMD_15B8_RCC_DEV2_EPF0_STRAP2 NO_SOFT_RESET bit.

Clear AMD_15B8_RCC_DEV2_EPF0_STRAP2 NO_SOFT_RESET bit before USB controller
initialization during boot.

Link: https://lore.kernel.org/linux-usb/Y%2Fz9GdHjPyF2rNG3@glanzmann.de/T/#u
Link: https://lore.kernel.org/r/20230329172859.699743-1-Basavaraj.Natikar@amd.com
Reported-by: Thomas Glanzmann <thomas@glanzmann.de>
Tested-by: Thomas Glanzmann <thomas@glanzmann.de>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org
2023-04-06 16:23:59 -05:00
Tony Luck
81515ecf15 x86/cpu: Add model number for Intel Arrow Lake processor
Successor to Lunar Lake.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230404174641.426593-1-tony.luck@intel.com
2023-04-05 13:36:26 +02:00
Linus Torvalds
76f598ba7d PPC:
* Hide KVM_CAP_IRQFD_RESAMPLE if XIVE is enabled
 
 s390:
 
 * Fix handling of external interrupts in protected guests
 
 x86:
 
 * Resample the pending state of IOAPIC interrupts when unmasking them
 
 * Fix usage of Hyper-V "enlightened TLB" on AMD
 
 * Small fixes to real mode exceptions
 
 * Suppress pending MMIO write exits if emulator detects exception
 
 Documentation:
 
 * Fix rST syntax
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQsXMQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOc6Qf/YMDcYxpPuYFZ69t9k4s0ubeGjGpr
 Qdzo6ZgKbfs4+nu89u2F5zvAftRSrEVQRpfLwmTXauM8el3s+EAEPlK/MLH6sSSo
 LG3eWoNxs7MUQ6gBWeQypY6opvz3W9hf8KYd+GY6TV7jwV5/YgH6674Yyd7T/bCm
 0/tUiil5Q7Da0w4GUY5m9gEqlbA9yjDBFySqqTZemo18NzLaPNpesKnm1hvv0QJG
 JRvOMmHwqAZgS0G5cyGEPZKY6Ga0v9QsAXc8mFC/Jtn9GMD2f47PYDRfd/7NSiVz
 ulgsk8zZkZIZ6nMbQ1t10txmGIbFmJ4iPPaATrMII/xHq+idgAlHU7zY1A==
 =QvFj
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "PPC:
   - Hide KVM_CAP_IRQFD_RESAMPLE if XIVE is enabled

  s390:
   - Fix handling of external interrupts in protected guests

  x86:
   - Resample the pending state of IOAPIC interrupts when unmasking them

   - Fix usage of Hyper-V "enlightened TLB" on AMD

   - Small fixes to real mode exceptions

   - Suppress pending MMIO write exits if emulator detects exception

  Documentation:
   - Fix rST syntax"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  docs: kvm: x86: Fix broken field list
  KVM: PPC: Make KVM_CAP_IRQFD_RESAMPLE platform dependent
  KVM: s390: pv: fix external interruption loop not always detected
  KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode
  KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
  KVM: x86: Suppress pending MMIO write exits if emulator detects exception
  KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking
  KVM: irqfd: Make resampler_list an RCU list
  KVM: SVM: Flush Hyper-V TLB when required
2023-04-04 11:29:37 -07:00
David Gow
a3046a618a um: Only disable SSE on clang to work around old GCC bugs
As part of the Rust support for UML, we disable SSE (and similar flags)
to match the normal x86 builds. This both makes sense (we ideally want a
similar configuration to x86), and works around a crash bug with SSE
generation under Rust with LLVM.

However, this breaks compiling stdlib.h under gcc < 11, as the x86_64
ABI requires floating-point return values be stored in an SSE register.
gcc 11 fixes this by only doing register allocation when a function is
actually used, and since we never use atof(), it shouldn't be a problem:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99652

Nevertheless, only disable SSE on clang setups, as that's a simple way
of working around everyone's bugs.

Fixes: 8849818679 ("rust: arch/um: Disable FP/SIMD instruction to match x86")
Reported-by: Roberto Sassu <roberto.sassu@huaweicloud.com>
Link: https://lore.kernel.org/linux-um/6df2ecef9011d85654a82acd607fdcbc93ad593c.camel@huaweicloud.com/
Tested-by: Roberto Sassu <roberto.sassu@huaweicloud.com>
Tested-by: SeongJae Park <sj@kernel.org>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Tested-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
2023-04-04 09:57:05 +02:00
Linus Torvalds
2d72ab2449 hyperv-fixes for 6.3-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmQqIpUTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXjtwCACaG8LkrLOa4EWwdVLOutxc/VSHhPzS
 FaCyzxaSNtFciSl/kOPsl2pmwy+c9QAri3wO9uyJ41R1oUfjy/+pX8TxYc1imOrh
 6vIMUntYW7t9ISoUbi7hDU1Nj3CX4KOXruOliLP3WM9mtGvaNL5INEDh9PV6bxIz
 xlP8JEoKTk0ecChOWZDWyDIE95MwgqRin8uEI0JUyE2mdegIrDC7SFvqT7XjV23O
 0gntPdoZCgBzWohaiRMKJHHNUbAC+1O2+1tzY0bONwHdpmRj5/V28e02iARF3bAE
 4TvTt3qrZU02epzMhkZPnTztyvp1vPzmpHaBHQD4pdNZP/D1b8ejm4mz
 =o+VM
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-fixes-signed-20230402' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Fix a bug in channel allocation for VMbus (Mohammed Gamal)

 - Do not allow root partition functionality in CVM (Michael Kelley)

* tag 'hyperv-fixes-signed-20230402' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: Block root partition functionality in a Confidential VM
  Drivers: vmbus: Check for channel allocation before looking up relids
2023-04-03 09:34:08 -07:00
Alexey Kardashevskiy
52882b9c7a KVM: PPC: Make KVM_CAP_IRQFD_RESAMPLE platform dependent
When introduced, IRQFD resampling worked on POWER8 with XICS. However
KVM on POWER9 has never implemented it - the compatibility mode code
("XICS-on-XIVE") misses the kvm_notify_acked_irq() call and the native
XIVE mode does not handle INTx in KVM at all.

This moved the capability support advertising to platforms and stops
advertising it on XIVE, i.e. POWER9 and later.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Anup Patel <anup@brainfault.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20220504074807.3616813-1-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-31 11:19:05 -04:00
Paolo Bonzini
85b475a450 A small fix that repairs the external loop detection code for PV
guests.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmQilhMACgkQ41TmuOI4
 ufhydRAAgldhL63tLa9PChGUkRtQLRdWKC8qJXIZUriF5d3fvQD8nfTRSvndwPEj
 ELg+MO63xhZEqyje4JVjNISc8sHXTZqFk43L+L2uamX2tlbw8ELAtiCTLZHKuCzz
 T3+vNB6Q103C94ZxsIhxnlhXAw69K0yv7H3P/SytBx6iDCvQ2k+BdAuSKylTytcU
 lfOGK3ADuvaL9PJSyM1ICCKqgdveEBI+tHo9h/eqgM8SmSNF1OahhjyDiiWtWTT6
 EpaTkce0fJypvCShCOMBnSHdPS6QZ4JclNdcDR6uz4VMT28acV6lNGRDkPYgBzta
 Wnx6+CETyePJt6NU5Nh8XLXstKQ7OUvCUSWNligkRBYYl0pr7jSA8u8xuGCaCCDs
 8ZOkMWcYLYKqH3Cvqb2rMR2MM9psrHwtLflC+38hGMV23Iy8YBnAQv/AixSy5yoP
 J50O1IlnNVhjsRuNxVAEwaybZP3pY11/iEwrhOVDyjdKlZGmbA0lK705bTh5z5YE
 XXGYDe97sPvztfrpPwDWdqoJPPDvxRvzrXROxhNK4vo5D/qGzobrvxkEkDcrAVqQ
 yoOj46zk/cieSCkiySURcQ+3qLYi/8+HeOtQcVMApK91GsselzM6vwSuKqFUmicq
 XrgsS7Gj2VNk4IMLLgevqQWfwvRRhNs/FiUI6lDd+wqqhX09Vl0=
 =Ysma
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-6.3-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

A small fix that repairs the external loop detection code for PV
guests.
2023-03-31 11:15:09 -04:00
Eric DeVolder
fed8d8773b x86/acpi/boot: Correct acpi_is_processor_usable() check
The logic in acpi_is_processor_usable() requires the online capable
bit be set for hotpluggable CPUs.  The online capable bit has been
introduced in ACPI 6.3.

However, for ACPI revisions < 6.3 which do not support that bit, CPUs
should be reported as usable, not the other way around.

Reverse the check.

  [ bp: Rewrite commit message. ]

Fixes: e2869bd7af ("x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC")
Suggested-by: Miguel Luis <miguel.luis@oracle.com>
Suggested-by: Boris Ostrovsky <boris.ovstrosky@oracle.com>
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: David R <david@unsolicited.net>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230327191026.3454-2-eric.devolder@oracle.com
2023-03-30 11:07:30 +02:00
Mario Limonciello
a74fabfbd1 x86/ACPI/boot: Use FADT version to check support for online capable
ACPI 6.3 introduced the online capable bit, and also introduced MADT
version 5.

Latter was used to distinguish whether the offset storing online capable
could be used. However ACPI 6.2b has MADT version "45" which is for
an errata version of the ACPI 6.2 spec.  This means that the Linux code
for detecting availability of MADT will mistakenly flag ACPI 6.2b as
supporting online capable which is inaccurate as it's an ACPI 6.3 feature.

Instead use the FADT major and minor revision fields to distinguish this.

  [ bp: Massage. ]

Fixes: aa06e20f1b ("x86/ACPI: Don't add CPUs that are not online capable")
Reported-by: Eric DeVolder <eric.devolder@oracle.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/943d2445-84df-d939-f578-5d8240d342cc@unsolicited.net
2023-03-30 10:50:30 +02:00
Sean Christopherson
80962ec912 KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode
Don't report an error code to L1 when synthesizing a nested VM-Exit and
L2 is in Real Mode.  Per Intel's SDM, regarding the error code valid bit:

  This bit is always 0 if the VM exit occurred while the logical processor
  was in real-address mode (CR0.PE=0).

The bug was introduced by a recent fix for AMD's Paged Real Mode, which
moved the error code suppression from the common "queue exception" path
to the "inject exception" path, but missed VMX's "synthesize VM-Exit"
path.

Fixes: b97f074583 ("KVM: x86: determine if an exception has an error code only when injecting it.")
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230322143300.2209476-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-27 10:15:11 -04:00
Sean Christopherson
6c41468c7c KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
When injecting an exception into a vCPU in Real Mode, suppress the error
code by clearing the flag that tracks whether the error code is valid, not
by clearing the error code itself.  The "typo" was introduced by recent
fix for SVM's funky Paged Real Mode.

Opportunistically hoist the logic above the tracepoint so that the trace
is coherent with respect to what is actually injected (this was also the
behavior prior to the buggy commit).

Fixes: b97f074583 ("KVM: x86: determine if an exception has an error code only when injecting it.")
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230322143300.2209476-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-27 10:15:10 -04:00
Sean Christopherson
0dc902267c KVM: x86: Suppress pending MMIO write exits if emulator detects exception
Clear vcpu->mmio_needed when injecting an exception from the emulator to
squash a (legitimate) warning about vcpu->mmio_needed being true at the
start of KVM_RUN without a callback being registered to complete the
userspace MMIO exit.  Suppressing the MMIO write exit is inarguably wrong
from an architectural perspective, but it is the least awful hack-a-fix
due to shortcomings in KVM's uAPI, not to mention that KVM already
suppresses MMIO writes in this scenario.

Outside of REP string instructions, KVM doesn't provide a way to resume
an instruction at the exact point where it was "interrupted" if said
instruction partially completed before encountering an MMIO access.  For
MMIO reads, KVM immediately exits to userspace upon detecting MMIO as
userspace provides the to-be-read value in a buffer, and so KVM can safely
(more or less) restart the instruction from the beginning.  When the
emulator re-encounters the MMIO read, KVM will service the MMIO by getting
the value from the buffer instead of exiting to userspace, i.e. KVM won't
put the vCPU into an infinite loop.

On an emulated MMIO write, KVM finishes the instruction before exiting to
userspace, as exiting immediately would ultimately hang the vCPU due to
the aforementioned shortcoming of KVM not being able to resume emulation
in the middle of an instruction.

For the vast majority of _emulated_ instructions, deferring the userspace
exit doesn't cause problems as very few x86 instructions (again ignoring
string operations) generate multiple writes.  But for instructions that
generate multiple writes, e.g. PUSHA (multiple pushes onto the stack),
deferring the exit effectively results in only the final write triggering
an exit to userspace.  KVM does support multiple MMIO "fragments", but
only for page splits; if an instruction performs multiple distinct MMIO
writes, the number of fragments gets reset when the next MMIO write comes
along and any previous MMIO writes are dropped.

Circling back to the warning, if a deferred MMIO write coincides with an
exception, e.g. in this case a #SS due to PUSHA underflowing the stack
after queueing a write to an MMIO page on a previous push, KVM injects
the exceptions and leaves the deferred MMIO pending without registering a
callback, thus triggering the splat.

Sweep the problem under the proverbial rug as dropping MMIO writes is not
unique to the exception scenario (see above), i.e. instructions like PUSHA
are fundamentally broken with respect to MMIO, and have been since KVM's
inception.

Reported-by: zhangjianguo <zhangjianguo18@huawei.com>
Reported-by: syzbot+760a73552f47a8cd0fd9@syzkaller.appspotmail.com
Reported-by: syzbot+8accb43ddc6bd1f5713a@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230322141220.2206241-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-27 10:13:53 -04:00
Dmytro Maluka
fef8f2b90e KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking
KVM irqfd based emulation of level-triggered interrupts doesn't work
quite correctly in some cases, particularly in the case of interrupts
that are handled in a Linux guest as oneshot interrupts (IRQF_ONESHOT).
Such an interrupt is acked to the device in its threaded irq handler,
i.e. later than it is acked to the interrupt controller (EOI at the end
of hardirq), not earlier.

Linux keeps such interrupt masked until its threaded handler finishes,
to prevent the EOI from re-asserting an unacknowledged interrupt.
However, with KVM + vfio (or whatever is listening on the resamplefd)
we always notify resamplefd at the EOI, so vfio prematurely unmasks the
host physical IRQ, thus a new physical interrupt is fired in the host.
This extra interrupt in the host is not a problem per se. The problem is
that it is unconditionally queued for injection into the guest, so the
guest sees an extra bogus interrupt. [*]

There are observed at least 2 user-visible issues caused by those
extra erroneous interrupts for a oneshot irq in the guest:

1. System suspend aborted due to a pending wakeup interrupt from
   ChromeOS EC (drivers/platform/chrome/cros_ec.c).
2. Annoying "invalid report id data" errors from ELAN0000 touchpad
   (drivers/input/mouse/elan_i2c_core.c), flooding the guest dmesg
   every time the touchpad is touched.

The core issue here is that by the time when the guest unmasks the IRQ,
the physical IRQ line is no longer asserted (since the guest has
acked the interrupt to the device in the meantime), yet we
unconditionally inject the interrupt queued into the guest by the
previous resampling. So to fix the issue, we need a way to detect that
the IRQ is no longer pending, and cancel the queued interrupt in this
case.

With IOAPIC we are not able to probe the physical IRQ line state
directly (at least not if the underlying physical interrupt controller
is an IOAPIC too), so in this patch we use irqfd resampler for that.
Namely, instead of injecting the queued interrupt, we just notify the
resampler that this interrupt is done. If the IRQ line is actually
already deasserted, we are done. If it is still asserted, a new
interrupt will be shortly triggered through irqfd and injected into the
guest.

In the case if there is no irqfd resampler registered for this IRQ, we
cannot fix the issue, so we keep the existing behavior: immediately
unconditionally inject the queued interrupt.

This patch fixes the issue for x86 IOAPIC only. In the long run, we can
fix it for other irqchips and other architectures too, possibly taking
advantage of reading the physical state of the IRQ line, which is
possible with some other irqchips (e.g. with arm64 GIC, maybe even with
the legacy x86 PIC).

[*] In this description we assume that the interrupt is a physical host
    interrupt forwarded to the guest e.g. by vfio. Potentially the same
    issue may occur also with a purely virtual interrupt from an
    emulated device, e.g. if the guest handles this interrupt, again, as
    a oneshot interrupt.

Signed-off-by: Dmytro Maluka <dmy@semihalf.com>
Link: https://lore.kernel.org/kvm/31420943-8c5f-125c-a5ee-d2fde2700083@semihalf.com/
Link: https://lore.kernel.org/lkml/87o7wrug0w.wl-maz@kernel.org/
Message-Id: <20230322204344.50138-3-dmy@semihalf.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-27 10:13:28 -04:00
Jeremi Piotrowski
e5c972c1fa KVM: SVM: Flush Hyper-V TLB when required
The Hyper-V "EnlightenedNptTlb" enlightenment is always enabled when KVM
is running on top of Hyper-V and Hyper-V exposes support for it (which
is always). On AMD CPUs this enlightenment results in ASID invalidations
not flushing TLB entries derived from the NPT. To force the underlying
(L0) hypervisor to rebuild its shadow page tables, an explicit hypercall
is needed.

The original KVM implementation of Hyper-V's "EnlightenedNptTlb" on SVM
only added remote TLB flush hooks. This worked out fine for a while, as
sufficient remote TLB flushes where being issued in KVM to mask the
problem. Since v5.17, changes in the TDP code reduced the number of
flushes and the out-of-sync TLB prevents guests from booting
successfully.

Split svm_flush_tlb_current() into separate callbacks for the 3 cases
(guest/all/current), and issue the required Hyper-V hypercall when a
Hyper-V TLB flush is needed. The most important case where the TLB flush
was missing is when loading a new PGD, which is followed by what is now
svm_flush_tlb_current().

Cc: stable@vger.kernel.org # v5.17+
Fixes: 1e0c7d4075 ("KVM: SVM: hyper-v: Remote TLB flush for SVM")
Link: https://lore.kernel.org/lkml/43980946-7bbf-dcef-7e40-af904c456250@linux.microsoft.com/
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20230324145233.4585-1-jpiotrowski@linux.microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-27 10:10:26 -04:00
Linus Torvalds
974fc94336 - Properly clear perf event status tracking in the AMD perf event
overflow handler
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQgQDAACgkQEsHwGGHe
 VUo1OA//YfWLXnc9CBN1S8EefXj0oHHogj3jkVMQLfWOoYmaknaXIHsOg8g4Kihs
 liUZO0tGWD4WdiUe/GpWmC0KIMtuZ8I1o0ML7LLdxpfexZbMLdlaqUHO68rlHDXt
 /nJ1mmcA3MJTkdDePIjpI1b8TM+4L8XWzi4XRxMXVbhquuQXGkcqWacq0/rh/0et
 /5iqzp88varuAwSHTrmX12keYEtRLsfL8+fy5WdQX7L9i1oiSU41IsZ95/wMmjbE
 +Y7t/EFKiIVH9+JdeM78HLAem+hIXZByNay6E6p+SveNtcjW1QRVviQ+QZ+c71hh
 0PQgeuJzArd0NRG7TWLUktCuONj6LZZ7E3apQyzXTPcg6H/IX9JeQv1dR9UGyjx7
 mAyrCSS9KzPiydiQLGlOQJeXCRwlNbo7fJKfX/uiCYErEhokQXztbLdHV3G/J/4v
 nuBQ2t50ZYLWbQoTZ8IOncrwTgDw7lSU8opNaE6G3P8Ut7BXy3f55YMRwKLg0WO7
 9X9AYhCMCtUeLwrTHX1SwPZToDI0huCJNTPy94ioSpGSU951CvgXCzX/ZPOsAFjs
 ndw3TQlI3xDGJETa2Q1fev08UjSHOieFlX5MQ+zLSIf6cSy4s7DeeOOBLXZouKnc
 jTx+31U1lhGTsqb43ac18CNjpslu67GnyCpzMjy3UymohQzV/nE=
 =dK66
 -----END PGP SIGNATURE-----

Merge tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fix from Borislav Petkov:

 - Properly clear perf event status tracking in the AMD perf event
   overflow handler

* tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/core: Always clear status for idx
2023-03-26 09:13:35 -07:00
Linus Torvalds
986c63741d - Add a AMX ptrace self test
- Prevent a false-positive warning when retrieving the (invalid) address of
   dynamic FPU features in their init state which are not saved in
   init_fpstate at all
 
 - Randomize per-CPU entry areas only when KASLR is enabled
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQgPFAACgkQEsHwGGHe
 VUrAfA//QyZE5JnH0Ber3upRlZ/dPSNKIaOX6DMLshGj7QDqs2utTnjc4pwaqGWD
 OWpPuAJvOo2+NsN4nfB12venasIzseXDBBhEw6a5kYx73QmFbZ4XswFBLl2Eh8we
 cFbqU4B8SQvFQaahZ4kRRHpsmNGEPYRvgh2lBjcKUJBUaCuu6KoqE9+I3t173Obc
 sPfkXmhintDjYIjKfllN78rsBq4uCCaOVu5u299ZFMdBakRtx0M7U3547+4hwoE3
 txP+VK+TPs8e64XJtCTem1br8HXNt/W5pC4IoQPnH8V+FLhUp1iIz6FpVHnJ7VMD
 9c8VL7e8BNXhKkQn8sSkSVUZV3xNP7n4MbKKbba3f6EWPZnI28WQ3w09LUte/1aa
 hHEHyjMVyJfUiAcfuE1gZflG1+TqT8GkQJ+hqG9+/iSCWftOMuhfsKCROCLGhltJ
 yYBoyR2ZC1ErSLIOvgYAEUIeZ9FkzreOU0Pit6P/5qaPu+EXw3uDzoZB0WQH40Z5
 PQwz04/s3idPwbfCZDOyNc7QZwxbGu1ESkdiTtCJmbBLW0MkWiBCnf/qZsK7PdD1
 Q2qmx86ewIo6QipJpGK9pqWuzwFYNEJJHn3P7T1CcYQnQb+61m+b6WeYozQCgyMF
 0dII6JulW98/WzjVgH6zUA0a0dicO7FM9H6iEGqlIcvxv0PuM7M=
 =eZTj
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Add a AMX ptrace self test

 - Prevent a false-positive warning when retrieving the (invalid)
   address of dynamic FPU features in their init state which are not
   saved in init_fpstate at all

 - Randomize per-CPU entry areas only when KASLR is enabled

* tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/x86/amx: Add a ptrace test
  x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
  x86/mm: Do not shuffle CPU entry areas without KASLR
2023-03-26 09:01:24 -07:00
Linus Torvalds
2495697422 xen: branch for v6.3-rc4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZB2KdAAKCRCAXGG7T9hj
 vseqAP9OlHO4qqfTWlSmYSPisfWwDT6CM7+K+4vWpMXFh3ZGuAEAhER0mNM1ikoB
 ZF7Ash778XPt2CaapQLsHtFZqJUn5gw=
 =ouzt
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - fix build warning

 - avoid concurrent accesses to the Xen PV console ring page

* tag 'for-linus-6.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/PVH: avoid 32-bit build warning when obtaining VGA console info
  hvc/xen: prevent concurrent accesses to the shared ring
2023-03-24 09:44:43 -07:00
Chang S. Bae
b158888402 x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
__copy_xstate_to_uabi_buf() copies either from the tasks XSAVE buffer
or from init_fpstate into the ptrace buffer. Dynamic features, like
XTILEDATA, have an all zeroes init state and are not saved in
init_fpstate, which means the corresponding bit is not set in the
xfeatures bitmap of the init_fpstate header.

But __copy_xstate_to_uabi_buf() retrieves addresses for both the tasks
xstate and init_fpstate unconditionally via __raw_xsave_addr().

So if the tasks XSAVE buffer has a dynamic feature set, then the
address retrieval for init_fpstate triggers the warning in
__raw_xsave_addr() which checks the feature bit in the init_fpstate
header.

Remove the address retrieval from init_fpstate for extended features.
They have an all zeroes init state so init_fpstate has zeros for them.
Then zeroing the user buffer for the init state is the same as copying
them from init_fpstate.

Fixes: 2308ee57d9 ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
Reported-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/kvm/20230221163655.920289-2-mizhang@google.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Mingwei Zhang <mizhang@google.com>
Link: https://lore.kernel.org/all/20230227210504.18520-2-chang.seok.bae%40intel.com
Cc: stable@vger.kernel.org
2023-03-22 10:59:13 -07:00
Michal Koutný
a3f547addc x86/mm: Do not shuffle CPU entry areas without KASLR
The commit 97e3d26b5e ("x86/mm: Randomize per-cpu entry area") fixed
an omission of KASLR on CPU entry areas. It doesn't take into account
KASLR switches though, which may result in unintended non-determinism
when a user wants to avoid it (e.g. debugging, benchmarking).

Generate only a single combination of CPU entry areas offsets -- the
linear array that existed prior randomization when KASLR is turned off.

Since we have 3f148f3318 ("x86/kasan: Map shadow for percpu pages on
demand") and followups, we can use the more relaxed guard
kasrl_enabled() (in contrast to kaslr_memory_enabled()).

Fixes: 97e3d26b5e ("x86/mm: Randomize per-cpu entry area")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20230306193144.24605-1-mkoutny%40suse.com
2023-03-22 10:42:47 -07:00
Jan Beulich
aadbd07ff8 x86/PVH: avoid 32-bit build warning when obtaining VGA console info
In the commit referenced below I failed to pay attention to this code
also being buildable as 32-bit. Adjust the type of "ret" - there's no
real need for it to be wider than 32 bits.

Fixes: 934ef33ee7 ("x86/PVH: obtain VGA console info in Dom0")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

Link: https://lore.kernel.org/r/2d2193ff-670b-0a27-e12d-2c5c4c121c79@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-03-22 16:59:46 +01:00
Breno Leitao
263f5ecaf7 perf/x86/amd/core: Always clear status for idx
The variable 'status' (which contains the unhandled overflow bits) is
not being properly masked in some cases, displaying the following
warning:

  WARNING: CPU: 156 PID: 475601 at arch/x86/events/amd/core.c:972 amd_pmu_v2_handle_irq+0x216/0x270

This seems to be happening because the loop is being continued before
the status bit being unset, in case x86_perf_event_set_period()
returns 0. This is also causing an inconsistency because the "handled"
counter is incremented, but the status bit is not cleaned.

Move the bit cleaning together above, together when the "handled"
counter is incremented.

Fixes: 7685665c39 ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20230321113338.1669660-1-leitao@debian.org
2023-03-21 14:43:05 +01:00
Linus Torvalds
c46a7d0473 - Flush out logged errors immediately after MCA banks configuration
changes over sysfs have been done instead of waiting until something
   else triggers the workqueue later - another error or the polling
   interval cycle is reached
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQXB9kACgkQEsHwGGHe
 VUqqjA//YbcRx2PFcZT5nnuQlb6bptsluCUrHOcJVT/1fe0ayrlvahuw/QtSXRH4
 Vwukc3+1cehp3CcSbHKAKOArTL7NV2tbk+EZQk+Ae+7QdRz/9TuEenL6ipCC1cr4
 Z3Bo3KZmHlBcoJaQDcQWWIL8TiYAPXdqXWksh8q+0pxDI2wuFguFBJI84j+AUZH+
 I4EDXLfzQn8RQZgiggEIez0aOIig74eaPfhHsNlqJJYG4x/EVgmRn9qJpYBGAeq6
 xQR6NvHUTjCCZAASI1QJ/IT5rXD17iey3J/gIw3QZEhotBCCDdk5vh8S8zqDStRF
 x3Za7qeC5m4HMfB/09v8HGeTitlaT0BYmM2CFOsru7I/qI+dJccDTwLmF8UY5Nj2
 G6454A7ZEQ13lhfAoDIeVFfoSkqyXNz+McTtOQ8/xDJ5hnuNJ4WtT7sWemWZlV5S
 l14xVFbojtGNmQygUGeL7cxl6h12Y9zFNwh1A5HzwH4EvywQJW7/35pxXEZIO3tl
 EioXKe1eSLcKoD9VAv8icmstpwJl1Gm5Xge1oyw8cyTW6d3hM8ZOEqdTAJvRkfG7
 LwPl3qC6Hrqhjc26WZ9pxmvR1hSYLWIidy6MlNeO9mf6wZR/ub+SmuHzy6n7TZl4
 pTsVver93ZgS1J8CJ0ohCK1jHs+2aLvh/6qiJvIw9lbgAZ2HKPo=
 =Q6Dq
 -----END PGP SIGNATURE-----

Merge tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS fix from Borislav Petkov:

 - Flush out logged errors immediately after MCA banks configuration
   changes over sysfs have been done instead of waiting until something
   else triggers the workqueue later - another error or the polling
   interval cycle is reached

* tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Make sure logged MCEs are processed after sysfs update
2023-03-19 09:57:53 -07:00
Linus Torvalds
4ac39c5910 - Check cmdline_find_option()'s return value before further processing
- Clear temporary storage in the resctrl code to prevent access to an
   unexistent MSR
 
 - Add a simple throttling mechanism to protect the hypervisor from potentially
   malicious SEV guests issuing requests in rapid succession.
 
   In order to not jeopardize the sanity of everyone involved in
   maintaining this code, the request issuing side has received
   a cleanup, split in more or less trivial, small and digestible pieces.
   Otherwise, the code was threatening to become an unmaintainable mess.
 
   Therefore, that cleanup is marked indirectly also for stable so that
   there's no differences between the upstream code and the stable
   variant when it comes down to backporting more there.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQW/64ACgkQEsHwGGHe
 VUoWzBAAl1KD4RR5EhrppOCl5mWtZmKUf+COag7RiqggXJyhCTXO+5N24dHcgoJB
 h60gY7Nxg0CpZVbkMDSpJIuclmlMkiCLgUeuvN6E5ofgb/ZSv9nDuCXPUtLQ962d
 T6071/v48G+2PVGm+PD1xAwP3065i3itVV/k6Xn8fxeXf/fq8L5eU5tADuFICI0b
 dKbd7U+TEQAh5E6BUwms2G1P0glJqqL37H22fTcyxI6D2T/UJLlc4+or5JmTofDa
 XJE/UHn+ZaGZYjhdr/BrlcxnY1jUTQH2K3wciADmNolkuCpDQJs6GgN98lXdhT34
 vyWQVokHGEKE8Va6m5wZX90eKraSc27/0d5ZlHz/rIJgVBxp/VvCzqLUZRvkRwwk
 k7bVOeZHe6P+b0QQl7uL9U2ff0sV/4PX0NLr+jzQdlA2ZYuTV6YgBDl7nAe1Tw/J
 gJViAvDbm26mlTG1wQrvw9M2P4AQIYpEmD4KPs7j2aQafUgtGqfTBwyeKHXdtMLJ
 TrkEISZZ8BVVvYghctN4R21IryUSnfq2eXxPwxUMh78SrO8sC23QJ5PVqKM/enF8
 azf/ZBgANidzqJ44k2Ow2bnO0ZTYZblvl3NUCMNa5SjQmzAEUzupHEKUgV10MFMR
 J3lspGU47BVeirFPWlCYKr+3Buwzur5xo5wCmrezxbN0fAo9k5M=
 =6UGz
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
 "There's a little bit more 'movement' in there for my taste but it
  needs to happen and should make the code better after it.

   - Check cmdline_find_option()'s return value before further
     processing

   - Clear temporary storage in the resctrl code to prevent access to an
     unexistent MSR

   - Add a simple throttling mechanism to protect the hypervisor from
     potentially malicious SEV guests issuing requests in rapid
     succession.

     In order to not jeopardize the sanity of everyone involved in
     maintaining this code, the request issuing side has received a
     cleanup, split in more or less trivial, small and digestible
     pieces. Otherwise, the code was threatening to become an
     unmaintainable mess.

     Therefore, that cleanup is marked indirectly also for stable so
     that there's no differences between the upstream code and the
     stable variant when it comes down to backporting more there"

* tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix use of uninitialized buffer in sme_enable()
  x86/resctrl: Clear staged_config[] before and after it is used
  virt/coco/sev-guest: Add throttling awareness
  virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
  virt/coco/sev-guest: Do some code style cleanups
  virt/coco/sev-guest: Carve out the request issuing logic into a helper
  virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
  virt/coco/sev-guest: Simplify extended guest request handling
  virt/coco/sev-guest: Check SEV_SNP attribute at probe time
2023-03-19 09:43:41 -07:00
Linus Torvalds
0eb392ec09 xen: branch for v6.3-rc3
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZBQKJwAKCRCAXGG7T9hj
 vuVgAQDhvr5mBFNqFxIfTnE8+oEsnYb0OgmR+9U3h+ECDB0P0gEAmR1fAee441YE
 2DWOAlvjmqoI2K8DTTabizXvm7x3bQk=
 =jcYl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - cleanup for xen time handling

 - enable the VGA console in a Xen PVH dom0

 - cleanup in the xenfs driver

* tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: remove unnecessary (void*) conversions
  x86/PVH: obtain VGA console info in Dom0
  x86/xen/time: cleanup xen_tsc_safe_clocksource
  xen: update arch/x86/include/asm/xen/cpuid.h
2023-03-17 10:45:49 -07:00
Michael Kelley
f8acb24aaf x86/hyperv: Block root partition functionality in a Confidential VM
Hyper-V should never specify a VM that is a Confidential VM and also
running in the root partition.  Nonetheless, explicitly block such a
combination to guard against a compromised Hyper-V maliciously trying to
exploit root partition functionality in a Confidential VM to expose
Confidential VM secrets. No known bug is being fixed, but the attack
surface for Confidential VMs on Hyper-V is reduced.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1678894453-95392-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2023-03-17 10:57:35 +00:00
Linus Torvalds
0ddc84d2dd ARM64:
* Address a rather annoying bug w.r.t. guest timer offsetting.  The
   synchronization of timer offsets between vCPUs was broken, leading to
   inconsistent timer reads within the VM.
 
 x86:
 
 * New tests for the slow path of the EVTCHNOP_send Xen hypercall
 
 * Add missing nVMX consistency checks for CR0 and CR4
 
 * Fix bug that broke AMD GATag on 512 vCPU machines
 
 Selftests:
 
 * Skip hugetlb tests if huge pages are not available
 
 * Sync KVM exit reasons
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQQhBMUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPJsAf/aqKQtRJH2YDHuS/OvlH546lgrPTY
 zc2S187N4OofqKvm8HWAJOPravGI4Lkc3Jvlq2jPnlwl66musfako5YGXyyJesIP
 9pc32jxwbhpHyp39tSTxlNbjE68E4Tau2iFa5n6fq/2BOEkZNGRhTDWPfbJV4yZO
 JpkaguNm1nuZfKnRNxaaYhJwbqPIBc8l+Y3Q3nw6QLZHaNoupsd2pY3c4SuTYFcW
 UxUaFtNkpXQxbwve0MWFLh/JztOzFhQcdMi3OSTBYZz32T0vncjXFDuARfKLNKyw
 FgwkHgs2/d35AgE0JEwz1u6+/RMHvUheG08zkp8//lINfNgF/Cka7Dz2uA==
 =B1LI
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM64:

   - Address a rather annoying bug w.r.t. guest timer offsetting. The
     synchronization of timer offsets between vCPUs was broken, leading
     to inconsistent timer reads within the VM.

  x86:

   - New tests for the slow path of the EVTCHNOP_send Xen hypercall

   - Add missing nVMX consistency checks for CR0 and CR4

   - Fix bug that broke AMD GATag on 512 vCPU machines

  Selftests:

   - Skip hugetlb tests if huge pages are not available

   - Sync KVM exit reasons"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Sync KVM exit reasons in selftests
  KVM: selftests: Add macro to generate KVM exit reason strings
  KVM: selftests: Print expected and actual exit reason in KVM exit reason assert
  KVM: selftests: Make vCPU exit reason test assertion common
  KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
  KVM: selftests: Use enum for test numbers in xen_shinfo_test
  KVM: selftests: Add helpers to make Xen-style VMCALL/VMMCALL hypercalls
  KVM: selftests: Move the guts of kvm_hypercall() to a separate macro
  KVM: SVM: WARN if GATag generation drops VM or vCPU ID information
  KVM: SVM: Modify AVIC GATag to support max number of 512 vCPUs
  KVM: SVM: Fix a benign off-by-one bug in AVIC physical table mask
  selftests: KVM: skip hugetlb tests if huge pages are not available
  KVM: VMX: Use tabs instead of spaces for indentation
  KVM: VMX: Fix indentation coding style issue
  KVM: nVMX: remove unnecessary #ifdef
  KVM: nVMX: add missing consistency checks for CR0 and CR4
  KVM: arm64: timers: Convert per-vcpu virtual offset to a global value
2023-03-16 11:32:12 -07:00
Nikita Zhandarovich
cbebd68f59 x86/mm: Fix use of uninitialized buffer in sme_enable()
cmdline_find_option() may fail before doing any initialization of
the buffer array. This may lead to unpredictable results when the same
buffer is used later in calls to strncmp() function.  Fix the issue by
returning early if cmdline_find_option() returns an error.

Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.

Fixes: aca20d5462 ("x86/mm: Add support to make use of Secure Memory Encryption")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230306160656.14844-1-n.zhandarovich@fintech.ru
2023-03-16 12:22:25 +01:00
Shawn Wang
0424a7dfe9 x86/resctrl: Clear staged_config[] before and after it is used
As a temporary storage, staged_config[] in rdt_domain should be cleared
before and after it is used. The stale value in staged_config[] could
cause an MSR access error.

Here is a reproducer on a system with 16 usable CLOSIDs for a 15-way L3
Cache (MBA should be disabled if the number of CLOSIDs for MB is less than
16.) :
	mount -t resctrl resctrl -o cdp /sys/fs/resctrl
	mkdir /sys/fs/resctrl/p{1..7}
	umount /sys/fs/resctrl/
	mount -t resctrl resctrl /sys/fs/resctrl
	mkdir /sys/fs/resctrl/p{1..8}

An error occurs when creating resource group named p8:
    unchecked MSR access error: WRMSR to 0xca0 (tried to write 0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
    Call Trace:
     <IRQ>
     __flush_smp_call_function_queue+0x11d/0x170
     __sysvec_call_function+0x24/0xd0
     sysvec_call_function+0x89/0xc0
     </IRQ>
     <TASK>
     asm_sysvec_call_function+0x16/0x20

When creating a new resource control group, hardware will be configured
by the following process:
    rdtgroup_mkdir()
      rdtgroup_mkdir_ctrl_mon()
        rdtgroup_init_alloc()
          resctrl_arch_update_domains()

resctrl_arch_update_domains() iterates and updates all resctrl_conf_type
whose have_new_ctrl is true. Since staged_config[] holds the same values as
when CDP was enabled, it will continue to update the CDP_CODE and CDP_DATA
configurations. When group p8 is created, get_config_index() called in
resctrl_arch_update_domains() will return 16 and 17 as the CLOSIDs for
CDP_CODE and CDP_DATA, which will be translated to an invalid register -
0xca0 in this scenario.

Fix it by clearing staged_config[] before and after it is used.

[reinette: re-order commit tags]

Fixes: 75408e4350 ("x86/resctrl: Allow different CODE/DATA configurations to be staged")
Suggested-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/2fad13f49fbe89687fc40e9a5a61f23a28d1507a.1673988935.git.reinette.chatre%40intel.com
2023-03-15 15:19:43 -07:00
Linus Torvalds
29db00c252 Tracing fixes for v6.3
- Do not allow histogram values to have modifies.
   Can cause a NULL pointer dereference if they do.
 
 - Warn if hist_field_name() is passed a NULL.
   Prevent the NULL pointer dereference mentioned above.
 
 - Fix invalid address look up race in lookup_rec()
 
 - Define ftrace_stub_graph conditionally to prevent linker errors
 
 - Always check if RCU is watching at all tracepoint locations
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZBDuTBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qsboAP4yfrFYvIIKM5EkzkEiPI+V2hdlA12x
 bt839jO5AWCmhAEAiY8FmKatpBJQKsiGqSOab8aHOMnhGFZwltCHAPa9PAI=
 =vtA2
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Do not allow histogram values to have modifies. They can cause a NULL
   pointer dereference if they do.

 - Warn if hist_field_name() is passed a NULL. Prevent the NULL pointer
   dereference mentioned above.

 - Fix invalid address look up race in lookup_rec()

 - Define ftrace_stub_graph conditionally to prevent linker errors

 - Always check if RCU is watching at all tracepoint locations

* tag 'trace-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Make tracepoint lockdep check actually test something
  ftrace,kcfi: Define ftrace_stub_graph conditionally
  ftrace: Fix invalid address access in lookup_rec() when index is 0
  tracing: Check field value in hist_field_name()
  tracing: Do not let histogram values have some modifiers
2023-03-14 17:07:54 -07:00
Jan Beulich
934ef33ee7 x86/PVH: obtain VGA console info in Dom0
A new platform-op was added to Xen to allow obtaining the same VGA
console information PV Dom0 is handed. Invoke the new function and have
the output data processed by xen_init_vga().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

Link: https://lore.kernel.org/r/8f315e92-7bda-c124-71cc-478ab9c5e610@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-03-14 15:20:51 +01:00
Sean Christopherson
c281794eaa KVM: SVM: WARN if GATag generation drops VM or vCPU ID information
WARN if generating a GATag given a VM ID and vCPU ID doesn't yield the
same IDs when pulling the IDs back out of the tag.  Don't bother adding
error handling to callers, this is very much a paranoid sanity check as
KVM fully controls the VM ID and is supposed to reject too-big vCPU IDs.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:07 -04:00
Suravee Suthikulpanit
5999715922 KVM: SVM: Modify AVIC GATag to support max number of 512 vCPUs
Define AVIC_VCPU_ID_MASK based on AVIC_PHYSICAL_MAX_INDEX, i.e. the mask
that effectively controls the largest guest physical APIC ID supported by
x2AVIC, instead of hardcoding the number of bits to 8 (and the number of
VM bits to 24).

The AVIC GATag is programmed into the AMD IOMMU IRTE to provide a
reference back to KVM in case the IOMMU cannot inject an interrupt into a
non-running vCPU.  In such a case, the IOMMU notifies software by creating
a GALog entry with the corresponded GATag, and KVM then uses the GATag to
find the correct VM+vCPU to kick.  Dropping bit 8 from the GATag results
in kicking the wrong vCPU when targeting vCPUs with x2APIC ID > 255.

Fixes: 4d1d7942e3 ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:06 -04:00
Sean Christopherson
3ec7a1b274 KVM: SVM: Fix a benign off-by-one bug in AVIC physical table mask
Define the "physical table max index mask" as bits 8:0, not 9:0.  x2AVIC
currently supports a max of 512 entries, i.e. the max index is 511, and
the inputs to GENMASK_ULL() are inclusive.  The bug is benign as bit 9 is
reserved and never set by KVM, i.e. KVM is just clearing bits that are
guaranteed to be zero.

Note, as of this writing, APM "Rev. 3.39-October 2022" incorrectly states
that bits 11:8 are reserved in Table B-1. VMCB Layout, Control Area.  I.e.
that table wasn't updated when x2AVIC support was added.

Opportunistically fix the comment for the max AVIC ID to align with the
code, and clean up comment formatting too.

Fixes: 4d1d7942e3 ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:06 -04:00
Rong Tao
53293cb81b KVM: VMX: Use tabs instead of spaces for indentation
Code indentation should use tabs where possible and miss a '*'.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Message-Id: <tencent_A492CB3F9592578451154442830EA1B02C07@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:55 -04:00
Rong Tao
06e1854728 KVM: VMX: Fix indentation coding style issue
Code indentation should use tabs where possible.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Message-Id: <tencent_31E6ACADCB6915E157CF5113C41803212107@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:55 -04:00
Paolo Bonzini
77900bffed KVM: nVMX: remove unnecessary #ifdef
nested_vmx_check_controls() has already run by the time KVM checks host state,
so the "host address space size" exit control can only be set on x86-64 hosts.
Simplify the condition at the cost of adding some dead code to 32-bit kernels.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:54 -04:00
Paolo Bonzini
112e66017b KVM: nVMX: add missing consistency checks for CR0 and CR4
The effective values of the guest CR0 and CR4 registers may differ from
those included in the VMCS12.  In particular, disabling EPT forces
CR4.PAE=1 and disabling unrestricted guest mode forces CR0.PG=CR0.PE=1.

Therefore, checks on these bits cannot be delegated to the processor
and must be performed by KVM.

Reported-by: Reima ISHII <ishiir@g.ecc.u-tokyo.ac.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:54 -04:00
Dionna Glaze
72f7754dcf virt/coco/sev-guest: Add throttling awareness
A potentially malicious SEV guest can constantly hammer the hypervisor
using this driver to send down requests and thus prevent or at least
considerably hinder other guests from issuing requests to the secure
processor which is a shared platform resource.

Therefore, the host is permitted and encouraged to throttle such guest
requests.

Add the capability to handle the case when the hypervisor throttles
excessive numbers of requests issued by the guest. Otherwise, the VM
platform communication key will be disabled, preventing the guest from
attesting itself.

Realistically speaking, a well-behaved guest should not even care about
throttling. During its lifetime, it would end up issuing a handful of
requests which the hardware can easily handle.

This is more to address the case of a malicious guest. Such guest should
get throttled and if its VMPCK gets disabled, then that's its own
wrongdoing and perhaps that guest even deserves it.

To the implementation: the hypervisor signals with SNP_GUEST_REQ_ERR_BUSY
that the guest requests should be throttled. That error code is returned
in the upper 32-bit half of exitinfo2 and this is part of the GHCB spec
v2.

So the guest is given a throttling period of 1 minute in which it
retries the request every 2 seconds. This is a good default but if it
turns out to not pan out in practice, it can be tweaked later.

For safety, since the encryption algorithm in GHCBv2 is AES_GCM, control
must remain in the kernel to complete the request with the current
sequence number. Returning without finishing the request allows the
guest to make another request but with different message contents. This
is IV reuse, and breaks cryptographic protections.

  [ bp:
    - Rewrite commit message and do a simplified version.
    - The stable tags are supposed to denote that a cleanup should go
      upfront before backporting this so that any future fixes to this
      can preserve the sanity of the backporter(s). ]

Fixes: d5af44dde5 ("x86/sev: Provide support for SNP guest request NAEs")
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org> # d6fd48eff7 ("virt/coco/sev-guest: Check SEV_SNP attribute at probe time")
Cc: <stable@kernel.org> # 970ab82374 (" virt/coco/sev-guest: Simplify extended guest request handling")
Cc: <stable@kernel.org> # c5a338274b ("virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()")
Cc: <stable@kernel.org> # 0fdb6cc7c8 ("virt/coco/sev-guest: Carve out the request issuing logic into a helper")
Cc: <stable@kernel.org> # d25bae7dc7 ("virt/coco/sev-guest: Do some code style cleanups")
Cc: <stable@kernel.org> # fa4ae42cc6 ("virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case")
Link: https://lore.kernel.org/r/20230214164638.1189804-2-dionnaglaze@google.com
2023-03-13 13:29:27 +01:00
Borislav Petkov (AMD)
fa4ae42cc6 virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
snp_issue_guest_request() checks the value returned by the hypervisor in
sw_exit_info_2 and returns a different error depending on it.

Convert those checks into a switch-case to make it more readable when
more error values are going to be checked in the future.

No functional changes.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230307192449.24732-8-bp@alien8.de
2023-03-13 12:55:34 +01:00
Borislav Petkov (AMD)
970ab82374 virt/coco/sev-guest: Simplify extended guest request handling
Return a specific error code - -ENOSPC - to signal the too small cert
data buffer instead of checking exit code and exitinfo2.

While at it, hoist the *fw_err assignment in snp_issue_guest_request()
so that a proper error value is returned to the callers.

  [ Tom: check override_err instead of err. ]

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230307192449.24732-4-bp@alien8.de
2023-03-13 11:27:10 +01:00
Borislav Petkov (AMD)
d6fd48eff7 virt/coco/sev-guest: Check SEV_SNP attribute at probe time
No need to check it on every ioctl. And yes, this is a common SEV driver
but it does only SNP-specific operations currently. This can be
revisited later, when more use cases appear.

No functional changes.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230307192449.24732-3-bp@alien8.de
2023-03-13 11:20:20 +01:00
Yazen Ghannam
4783b9cb37 x86/mce: Make sure logged MCEs are processed after sysfs update
A recent change introduced a flag to queue up errors found during
boot-time polling. These errors will be processed during late init once
the MCE subsystem is fully set up.

A number of sysfs updates call mce_restart() which goes through a subset
of the CPU init flow. This includes polling MCA banks and logging any
errors found. Since the same function is used as boot-time polling,
errors will be queued. However, the system is now past late init, so the
errors will remain queued until another error is found and the workqueue
is triggered.

Call mce_schedule_work() at the end of mce_restart() so that queued
errors are processed.

Fixes: 3bff147b18 ("x86/mce: Defer processing of early errors")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com
2023-03-12 21:12:21 +01:00
Linus Torvalds
d3d0cac69f - Disable XSAVES on AMD Zen1 and Zen2 machines due to an erratum. No
impact to anything as those machines will fallback to XSAVEC which is
   equivalent there.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQNtvAACgkQEsHwGGHe
 VUpGiRAAjlYpvaQK24s8MiQr3LBC0pKsgKstf1Jx5C+HspmS5JAdF83646kMOUKm
 MUGPfQwK1nN5kO0/fOlo4O6vhSIF2Ft/Xfrd/APZm6qJhR3pli9675NeF8fH2D5t
 Ypgtl6psRudkB3RUmE1cmHWbr9dMnHZZLnL6iA/qHYXCY3kaw96ncM6HjdnrjXRd
 OV2+N4dyhTet3MdUdw7dSr1uz75O5PQH/1FwR1V2zroF1sjImaIwQ7JN51hIITxw
 DzfTbfuJzdAqwfztBFG/yZ5K+DEoU5BemHHIuhq+X9/7GeLMd059DdnZuXSX8mcH
 jjzOa/E5r/PjYze0XRWT3RbI5fbSc1qhNbmj3kLNP3KE/F3S74n6FR58oLNqosVk
 zw1TYP8oocdjG1VxJdm5qndIzwHMSj3qkd+BSNZZ1fwINVLXtSDubtThkN/i+81+
 nqnMA8HFrcwy1bhwq4jd5dmP7tjlODATfeL4ZV6/6J1RX8Vwu+bjdy8PM+vJYJ0d
 pnFLT20cf6Or0MQHUssO+uh6oC3aQ6AxPWJcuUfbdSLYzjr2EObgCHXGZOhCjvhC
 CsALcmwnLh5XzwglzWoXyyv+tsJar63XYcPSEIt+gIfXpLf7ZbzcOSDLDkri6B3Z
 fCABGASFnoXr7ZYnGxH4L5WKWOk1W+pgpxyC4mnzD9oHtXIzUPU=
 =u6kj
 -----END PGP SIGNATURE-----

Merge tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:
 "A single erratum fix for AMD machines:

   - Disable XSAVES on AMD Zen1 and Zen2 machines due to an erratum. No
     impact to anything as those machines will fallback to XSAVEC which
     is equivalent there"

* tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Disable XSAVES on AMD family 0x17
2023-03-12 09:12:03 -07:00