Commit Graph

997813 Commits

Author SHA1 Message Date
Damien Le Moal
de3510e52b null_blk: fix command timeout completion handling
Memory backed or zoned null block devices may generate actual request
timeout errors due to the submission path being blocked on memory
allocation or zone locking. Unlike fake timeouts or injected timeouts,
the request submission path will call blk_mq_complete_request() or
blk_mq_end_request() for these real timeout errors, causing a double
completion and use after free situation as the block layer timeout
handler executes blk_mq_rq_timed_out() and __blk_mq_free_request() in
blk_mq_check_expired(). This problem often triggers a NULL pointer
dereference such as:

BUG: kernel NULL pointer dereference, address: 0000000000000050
RIP: 0010:blk_mq_sched_mark_restart_hctx+0x5/0x20
...
Call Trace:
  dd_finish_request+0x56/0x80
  blk_mq_free_request+0x37/0x130
  null_handle_cmd+0xbf/0x250 [null_blk]
  ? null_queue_rq+0x67/0xd0 [null_blk]
  blk_mq_dispatch_rq_list+0x122/0x850
  __blk_mq_do_dispatch_sched+0xbb/0x2c0
  __blk_mq_sched_dispatch_requests+0x13d/0x190
  blk_mq_sched_dispatch_requests+0x30/0x60
  __blk_mq_run_hw_queue+0x49/0x90
  process_one_work+0x26c/0x580
  worker_thread+0x55/0x3c0
  ? process_one_work+0x580/0x580
  kthread+0x134/0x150
  ? kthread_create_worker_on_cpu+0x70/0x70
  ret_from_fork+0x1f/0x30

This problem very often triggers when running the full btrfs xfstests
on a memory-backed zoned null block device in a VM with limited amount
of memory.

Avoid this by executing blk_mq_complete_request() in null_timeout_rq()
only for commands that are marked for a fake timeout completion using
the fake_timeout boolean in struct null_cmd. For timeout errors injected
through debugfs, the timeout handler will execute
blk_mq_complete_request()i as before. This is safe as the submission
path does not execute complete requests in this case.

In null_timeout_rq(), also make sure to set the command error field to
BLK_STS_TIMEOUT and to propagate this error through to the request
completion.

Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Reviewed-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210331225244.126426-1-damien.lemoal@wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-01 07:03:46 -06:00
Matthew Wilcox (Oracle)
2c7e57a027 idr test suite: Improve reporting from idr_find_test_1
Instead of just reporting an assertion failure, report enough information
that we can start diagnosing exactly went wrong.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01 07:50:42 -04:00
Matthew Wilcox (Oracle)
094ffbd1d8 idr test suite: Create anchor before launching throbber
The throbber could race with creation of the anchor entry and cause the
IDR to have zero entries in it, which would cause the test to fail.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01 07:50:19 -04:00
Matthew Wilcox (Oracle)
703586410d idr test suite: Take RCU read lock in idr_find_test_1
When run on a single CPU, this test would frequently access already-freed
memory.  Due to timing, this bug never showed up on multi-CPU tests.

Reported-by: Chris von Recklinghausen <crecklin@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01 07:44:48 -04:00
Matthew Wilcox (Oracle)
1bb4bd266c radix tree test suite: Register the main thread with the RCU library
Several test runners register individual worker threads with the
RCU library, but neglect to register the main thread, which can lead
to objects being freed while the main thread is in what appears to be
an RCU critical section.

Reported-by: Chris von Recklinghausen <crecklin@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-04-01 07:41:30 -04:00
Vitaly Kuznetsov
8cdddd182b ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()
Commit 496121c021 ("ACPI: processor: idle: Allow probing on platforms
with one ACPI C-state") broke CPU0 hotplug on certain systems, e.g.
I'm observing the following on AWS Nitro (e.g r5b.xlarge but other
instance types are affected as well):

 # echo 0 > /sys/devices/system/cpu/cpu0/online
 # echo 1 > /sys/devices/system/cpu/cpu0/online
 <10 seconds delay>
 -bash: echo: write error: Input/output error

In fact, the above mentioned commit only revealed the problem and did
not introduce it. On x86, to wakeup CPU an NMI is being used and
hlt_play_dead()/mwait_play_dead() loops are prepared to handle it:

	/*
	 * If NMI wants to wake up CPU0, start CPU0.
	 */
	if (wakeup_cpu0())
		start_cpu0();

cpuidle_play_dead() -> acpi_idle_play_dead() (which is now being called on
systems where it wasn't called before the above mentioned commit) serves
the same purpose but it doesn't have a path for CPU0. What happens now on
wakeup is:
 - NMI is sent to CPU0
 - wakeup_cpu0_nmi() works as expected
 - we get back to while (1) loop in acpi_idle_play_dead()
 - safe_halt() puts CPU0 to sleep again.

The straightforward/minimal fix is add the special handling for CPU0 on x86
and that's what the patch is doing.

Fixes: 496121c021 ("ACPI: processor: idle: Allow probing on platforms with one ACPI C-state")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-01 13:37:55 +02:00
Srinivas Kandagatla
adfc3ed7dc
ASoC: codecs: lpass-rx-macro: set npl clock rate correctly
NPL clock rate is twice the MCLK rate, so set this correctly to
avoid soundwire timeouts.

Fixes: af3d54b997 ("ASoC: codecs: lpass-rx-macro: add support for lpass rx macro")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210331171235.24824-2-srinivas.kandagatla@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-01 12:18:09 +01:00
Srinivas Kandagatla
b861106f3c
ASoC: codecs: lpass-tx-macro: set npl clock rate correctly
NPL clock rate is twice the MCLK rate, so set this correctly to
avoid soundwire timeouts.

Fixes: c39667ddcf ("ASoC: codecs: lpass-tx-macro: add support for lpass tx macro")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20210331171235.24824-1-srinivas.kandagatla@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-04-01 12:18:07 +01:00
Arnd Bergmann
89e21e1ad9 i.MX fixes for 5.12, round 2:
- Fix a system failure on imx6qdl-phytec-pfla02 board when booting from
   SD, by adding missing vmmc supply for SD interfaces.
 - Fix address typo in i.MX8MM/Q IOMUXC_SD1_DATA0_GPIO2_IO2 definition.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCgAyFiEEFmJXigPl4LoGSz08UFdYWoewfM4FAmBi5tIUHHNoYXduZ3Vv
 QGtlcm5lbC5vcmcACgkQUFdYWoewfM63qQf9H+AmuNEw3Sm9+kW+VH3u+7cBGY0r
 gkdV+hc+pabC/lzkvGTJhmncW2Y35BfzuEG6Bd6s6QEEPAqtqZ0fzDZlcS444b9Z
 e2hLPraKo/C51SCOoAmCUd5JA3to/ZVC+zg1ZiN92SrqgBm5e3we7xvp+Qa/Rzxs
 ZYmzll20U4gt9Dq2HX7dSLc8F/yq6EIGEMkXPKkkDUdWXxM4qbUpN0LlzWCV529f
 SNppkfeA1VfB9Kb8MrawvBRldN4j3T0SWhRFZfa6LqzJEP1dy+885u+4YknMdeJc
 ibpab/oEAzz/yiOiTBmtNCUBFEh3Xdiwh+0Y4T5nGhRd2kFWi2TBJB7hFg==
 =8P0W
 -----END PGP SIGNATURE-----

Merge tag 'imx-fixes-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes

i.MX fixes for 5.12, round 2:

- Fix a system failure on imx6qdl-phytec-pfla02 board when booting from
  SD, by adding missing vmmc supply for SD interfaces.
- Fix address typo in i.MX8MM/Q IOMUXC_SD1_DATA0_GPIO2_IO2 definition.

* tag 'imx-fixes-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
  ARM: dts: imx6: pbab01: Set vmmc supply for both SD interfaces
  arm64: dts: imx8mm/q: Fix pad control of SD1_DATA0

Link: https://lore.kernel.org/r/20210330090236.GQ22955@dragon
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01 11:34:06 +02:00
Arnd Bergmann
111a5a421f More fixes for omaps for v5.12-rc cycle
Two fixes for hangs, mmc slot order fix, and a voltage typo fix:
 
 - Remove unused duplicate sha2md5_fck clock node that can race with the
   OMAP4_SHA2MD5_CLKCTRL clock node for disable for unused clocks
 
 - Add aliases for omap4/5 mmc to put the slots back into the right
   order again
 
 - Fix typo for bionic voltage controllers that accidentally use mpu
   for all instances instead of mpu, core and iva
 
 - Fix random hangs for droid4 caused by missing fix from TI Android
   kernel tree to do a dummy smc call on cpuidle wakeup path
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAmBbHlwRHHRvbnlAYXRv
 bWlkZS5jb20ACgkQG9Q+yVyrpXM7bRAAhm539aZUQE0mXrLZTbVxTo4PfOaA4ToB
 3ZsoHFP2QK6RwulS6J7ebHLOVE6fMmVOj2UBpXMtTsNrBrI0k/7ziAgFiunTxZGa
 GpKma4AoNFjz3WLjkX7XpxlEH3W/oaIW6My5UQxn827m8oTqjN9mb/b0qxLu2zAp
 xc0sGM5t18A/v64Bx2OY2EimrieqzreNC5YUUKTH/CZnxnii6dla1Di6tZtT6iXw
 ARaqNM46qrd9iV1lfjncp0a2nfWAdlR4GJ2qXCKgLjs0J9T8xquUxda33zjRiXET
 /4pKJPVcU9jf1er839qk2gCoqzRhJhINQWxrzEBpj/ern4XR3Z0fQ6i0oT21roMr
 ho6mWKYudKd6k8fua2cWqKepaOoKVDhJYvvUN/3SvxV23rf8A26NddzrfZU/7H7S
 IQr1cg7vM2gKlFZJ3oGUCn9SL1DNDFHLvfJYYnSLW1dB04BQHvZjjHdxyXg5wg3P
 2ZUVr2dER6mF49kHvtRHu5avKw4d5KGodG/E645sZQe+g/sTQGZcIIVRhI4lsa4C
 VnpJfokokkGcouOoy5mipLO+gPIMNlhdp05hMGtPu/iKjo378VbS/07097k8DTaT
 dlS8V+lDAthS5aOf86RzsgLAcs2f0UvUNvgeXxYeN7uRxSFqE/sWg9J0pMtPISt7
 zyygoqP2oT4=
 =QQY6
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v5.12/fixes-rc4-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into arm/fixes

More fixes for omaps for v5.12-rc cycle

Two fixes for hangs, mmc slot order fix, and a voltage typo fix:

- Remove unused duplicate sha2md5_fck clock node that can race with the
  OMAP4_SHA2MD5_CLKCTRL clock node for disable for unused clocks

- Add aliases for omap4/5 mmc to put the slots back into the right
  order again

- Fix typo for bionic voltage controllers that accidentally use mpu
  for all instances instead of mpu, core and iva

- Fix random hangs for droid4 caused by missing fix from TI Android
  kernel tree to do a dummy smc call on cpuidle wakeup path

* tag 'omap-for-v5.12/fixes-rc4-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP4: PM: update ROM return address for OSWR and OFF
  ARM: OMAP4: Fix PMIC voltage domains for bionic
  ARM: dts: Fix moving mmc devices with aliases for omap4 & 5
  ARM: dts: Drop duplicate sha2md5_fck to fix clk_disable race

Link: https://lore.kernel.org/r/pull-1616584662-702939@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01 11:33:35 +02:00
Arnd Bergmann
70a6062cc2 This pull request contains Broadcom ARM-based SoC changes for 5.12,
please pull the following:
 
 - Florian reverts the adding of the second level interrupt controller
   for HDMI BSC interrupts since they collide with the main I2C
   controller (i2c-bcm2835).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEm+Rq3+YGJdiR9yuFh9CWnEQHBwQFAmAsbcMACgkQh9CWnEQH
 BwQzMA/+OKiEhSrnMvJ2OCrvy1FQHoexapmG+i9hMW2fO7Zd2CX9we2kAVU7HC3r
 aGCrYAQkUbyGROx+CjjRYQJDt8XP66tgD8vMgODxq2ujfbFXSOe5xlI9OBeDaSRm
 RveqOM4K/2+QkxBI/LbJdT2cZwsqxud4y8xuuw0/WDRI5FypQynuaxI+uDgkSVWb
 cuFJ9iJuBCAUE3PxqtQr5/nbDvnsjZO6Ib/GkVulvm4/YU5o1bFcSZ4Z+tD3ip8U
 o+uPdO1pfb7HlLkczPMwB6zI7SKWPOnsxIxGE0sgQGWoWYeE5HQqzryeNYokLaoj
 Ygm3xZp6HnRfngNJ6H1dN8h95Nd1DU+Jjz6Z+PXUfrekpzdhBtSQnyQ4sxi8Jhrf
 0jNxUvbJ625IduqV36xMWMW+WOKs4xLIBQ7FSqmlYjdURYBvpjHxoE1oUccBc7tc
 4lBbIIxaENB+bmAS/qKFsDZWQdjkNxoSiLpBb4w2VnNE/yBePL/z9czFeawqmPc7
 TIsxm8Rqozbt7btffSmZYJb7rNr//bMAR8gmmWmNvwAaJXRZdHFuPAbtONmTkXdQ
 yxuLtAjafHUM+j791QGAMIXwfWjfCc4ERSINuQoNm474bzCAQVsi0kUtcHrDFvQM
 UU+0brs3F3yGap3PkxsrId67DlgzUTCUm+/WiDzhQv9zh402f5U=
 =B9w6
 -----END PGP SIGNATURE-----

Merge tag 'arm-soc/for-5.12/devicetree-part2' of https://github.com/Broadcom/stblinux into arm/fixes

This pull request contains Broadcom ARM-based SoC changes for 5.12,
please pull the following:

- Florian reverts the adding of the second level interrupt controller
  for HDMI BSC interrupts since they collide with the main I2C
  controller (i2c-bcm2835).

* tag 'arm-soc/for-5.12/devicetree-part2' of https://github.com/Broadcom/stblinux:
  Revert "ARM: dts: bcm2711: Add the BSC interrupt controller"

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-04-01 11:33:08 +02:00
Vitaly Kuznetsov
55626ca9c6 selftests: kvm: Check that TSC page value is small after KVM_SET_CLOCK(0)
Add a test for the issue when KVM_SET_CLOCK(0) call could cause
TSC page value to go very big because of a signedness issue around
hv_clock->system_time.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210326155551.17446-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-01 05:14:19 -04:00
Vitaly Kuznetsov
77fcbe823f KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update()
When guest time is reset with KVM_SET_CLOCK(0), it is possible for
'hv_clock->system_time' to become a small negative number. This happens
because in KVM_SET_CLOCK handling we set 'kvm->arch.kvmclock_offset' based
on get_kvmclock_ns(kvm) but when KVM_REQ_CLOCK_UPDATE is handled,
kvm_guest_time_update() does (masterclock in use case):

hv_clock.system_time = ka->master_kernel_ns + v->kvm->arch.kvmclock_offset;

And 'master_kernel_ns' represents the last time when masterclock
got updated, it can precede KVM_SET_CLOCK() call. Normally, this is not a
problem, the difference is very small, e.g. I'm observing
hv_clock.system_time = -70 ns. The issue comes from the fact that
'hv_clock.system_time' is stored as unsigned and 'system_time / 100' in
compute_tsc_page_parameters() becomes a very big number.

Use 'master_kernel_ns' instead of get_kvmclock_ns() when masterclock is in
use and get_kvmclock_base_ns() when it's not to prevent 'system_time' from
going negative.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210331124130.337992-2-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-01 05:14:19 -04:00
Paolo Bonzini
a83829f56c KVM: x86: disable interrupts while pvclock_gtod_sync_lock is taken
pvclock_gtod_sync_lock can be taken with interrupts disabled if the
preempt notifier calls get_kvmclock_ns to update the Xen
runstate information:

   spin_lock include/linux/spinlock.h:354 [inline]
   get_kvmclock_ns+0x25/0x390 arch/x86/kvm/x86.c:2587
   kvm_xen_update_runstate+0x3d/0x2c0 arch/x86/kvm/xen.c:69
   kvm_xen_update_runstate_guest+0x74/0x320 arch/x86/kvm/xen.c:100
   kvm_xen_runstate_set_preempted arch/x86/kvm/xen.h:96 [inline]
   kvm_arch_vcpu_put+0x2d8/0x5a0 arch/x86/kvm/x86.c:4062

So change the users of the spinlock to spin_lock_irqsave and
spin_unlock_irqrestore.

Reported-by: syzbot+b282b65c2c68492df769@syzkaller.appspotmail.com
Fixes: 30b5c851af ("KVM: x86/xen: Add support for vCPU runstate information")
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-01 05:14:19 -04:00
Paolo Bonzini
c2c647f91a KVM: x86: reduce pvclock_gtod_sync_lock critical sections
There is no need to include changes to vcpu->requests into
the pvclock_gtod_sync_lock critical section.  The changes to
the shared data structures (in pvclock_update_vm_gtod_copy)
already occur under the lock.

Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-01 05:14:19 -04:00
Paolo Bonzini
6ebae23c07 Merge branch 'kvm-fix-svm-races' into kvm-master 2021-04-01 05:14:05 -04:00
Paolo Bonzini
3c346c0c60 KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit
Fixing nested_vmcb_check_save to avoid all TOC/TOU races
is a bit harder in released kernels, so do the bare minimum
by avoiding that EFER.SVME is cleared.  This is problematic
because svm_set_efer frees the data structures for nested
virtualization if EFER.SVME is cleared.

Also check that EFER.SVME remains set after a nested vmexit;
clearing it could happen if the bit is zero in the save area
that is passed to KVM_SET_NESTED_STATE (the save area of the
nested state corresponds to the nested hypervisor's state
and is restored on the next nested vmexit).

Cc: stable@vger.kernel.org
Fixes: 2fcf4876ad ("KVM: nSVM: implement on demand allocation of the nested state")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-01 05:11:35 -04:00
Paolo Bonzini
a58d9166a7 KVM: SVM: load control fields from VMCB12 before checking them
Avoid races between check and use of the nested VMCB controls.  This
for example ensures that the VMRUN intercept is always reflected to the
nested hypervisor, instead of being processed by the host.  Without this
patch, it is possible to end up with svm->nested.hsave pointing to
the MSR permission bitmap for nested guests.

This bug is CVE-2021-29657.

Reported-by: Felix Wilhelm <fwilhelm@google.com>
Cc: stable@vger.kernel.org
Fixes: 2fcf4876ad ("KVM: nSVM: implement on demand allocation of the nested state")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-01 05:09:31 -04:00
Dave Airlie
dcdb7aa452 Merge tag 'amd-drm-fixes-5.12-2021-03-31' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.12-2021-03-31:

amdgpu:
- Polaris idle power fix
- VM fix
- Vangogh S3 fix
- Fixes for non-4K page sizes

amdkfd:
- dqm fence memory corruption fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210401020057.17831-1-alexander.deucher@amd.com
2021-04-01 15:04:58 +10:00
Dave Airlie
7344c82777 Just one cleanup which drops of_gpio.h inclusion.
- This header file isn't used anymore so drop it.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJgYbbvAAoJEFc4NIkMQxK4KMYQALXOVKqGu8fgQXct3lyIn9cK
 lvFkrWv03v5DYdyHio7XR8egFzRMKw0XENYzAM0CAaUsVApOgi63puZrtyO5+taW
 ++Ai3oclCrGvwEWXhxx6jGEPUPaPPrD8sQF60+3bOxbAGDxHk99RhEHvYwIQbQzD
 5ZJa3V2K7xiBmnnXP2mlm5qNSVC0c8rRWtf5rXjRHcAfaWHy9gqScaXqwn/9KKZy
 Nrf5dsV3vxG9F+HdoWyKIdvhGfVrjdIldkLBtCn2P3n0aHJXAeqpoK1K6JpVIFtO
 mJPHwB9XwZZ/I2jxXBpATU70C50SAKgFd0bS5f6caZOZJHw1S+VkocgcOqLZA4Kz
 7QnaMJfef2MkSK/1I3XsLgissd+4GsnghNt8t5lF6+vNuTTEcqY+k/eL+2xpXzaF
 lKtXqTgL3+JKlCibYUK0ZCGa7M3cGGWjmSSUsXTo2jWY7GqlT5T6lB0YOp8fmr8P
 7IQEm3l1gNCgFpF8S2mbIHbWKRlv/Bm0oTbUixAPMojLYjIPC48YZ7qZZ5WwPcSz
 CB+y2F+QkIDR1Du8Ys8cacNgedVfSZz3SGdJ8pnpTI5UkzGWvgzqTwnqRumNmPUb
 uzAtmA9Fi43f/DWgNqsA5FPODtDin3ubU8/JwGKlNEYgJfrXYga4trINOAqCjKoJ
 sHgi5Ekd8ZzVsf9CPWRe
 =Rp8S
 -----END PGP SIGNATURE-----

Merge tag 'exynos-drm-fixes-for-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes

Just one cleanup which drops of_gpio.h inclusion.
- This header file isn't used anymore so drop it.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1617016858-14081-1-git-send-email-inki.dae@samsung.com
2021-04-01 13:31:11 +10:00
Xℹ Ruoyao
e3512fb670 drm/amdgpu: check alignment on CPU page for bo map
The page table of AMDGPU requires an alignment to CPU page so we should
check ioctl parameters for it.  Return -EINVAL if some parameter is
unaligned to CPU page, instead of corrupt the page table sliently.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-31 21:53:38 -04:00
Huacai Chen
566c6e25f9 drm/amdgpu: Set a suitable dev_info.gart_page_size
In Mesa, dev_info.gart_page_size is used for alignment and it was
set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU
driver requires an alignment on CPU pages.  So, for non-4KB page system,
gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE).

Signed-off-by: Rui Wang <wangr@lemote.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Link: https://github.com/loongson-community/linux-stable/commit/caa9c0a1
[Xi: rebased for drm-next, use max_t for checkpatch,
     and reworded commit message.]
Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang>
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1549
Tested-by: Dan Horák <dan@danny.cz>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-31 21:53:38 -04:00
Alex Deucher
6951c3e4a2 drm/amdgpu/vangogh: don't check for dpm in is_dpm_running when in suspend
Do the same thing we do for Renoir.  We can check, but since
the sbios has started DPM, it will always return true which
causes the driver to skip some of the SMU init when it shouldn't.

Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-31 21:53:38 -04:00
Qu Huang
e92049ae45 drm/amdkfd: dqm fence memory corruption
Amdgpu driver uses 4-byte data type as DQM fence memory,
and transmits GPU address of fence memory to microcode
through query status PM4 message. However, query status
PM4 message definition and microcode processing are all
processed according to 8 bytes. Fence memory only allocates
4 bytes of memory, but microcode does write 8 bytes of memory,
so there is a memory corruption.

Changes since v1:
  * Change dqm->fence_addr as a u64 pointer to fix this issue,
also fix up query_status and amdkfd_fence_wait_timeout function
uses 64 bit fence value to make them consistent.

Signed-off-by: Qu Huang <jinsdb@126.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-03-31 21:53:25 -04:00
Yufen Yu
3edf5346e4 block: only update parent bi_status when bio fail
For multiple split bios, if one of the bio is fail, the whole
should return error to application. But we found there is a race
between bio_integrity_verify_fn and bio complete, which return
io success to application after one of the bio fail. The race as
following:

split bio(READ)          kworker

nvme_complete_rq
blk_update_request //split error=0
  bio_endio
    bio_integrity_endio
      queue_work(kintegrityd_wq, &bip->bip_work);

                         bio_integrity_verify_fn
                         bio_endio //split bio
                          __bio_chain_endio
                             if (!parent->bi_status)

                               <interrupt entry>
                               nvme_irq
                                 blk_update_request //parent error=7
                                 req_bio_endio
                                    bio->bi_status = 7 //parent bio
                               <interrupt exit>

                               parent->bi_status = 0
                        parent->bi_end_io() // return bi_status=0

The bio has been split as two: split and parent. When split
bio completed, it depends on kworker to do endio, while
bio_integrity_verify_fn have been interrupted by parent bio
complete irq handler. Then, parent bio->bi_status which have
been set in irq handler will overwrite by kworker.

In fact, even without the above race, we also need to conside
the concurrency beteen mulitple split bio complete and update
the same parent bi_status. Normally, multiple split bios will
be issued to the same hctx and complete from the same irq
vector. But if we have updated queue map between multiple split
bios, these bios may complete on different hw queue and different
irq vector. Then the concurrency update parent bi_status may
cause the final status error.

Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20210331115359.1125679-1-yuyufen@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-31 19:18:04 -06:00
Linus Torvalds
d19cc4bfbf Add check of order < 0 before calling free_pages()
The function addresses that are traced by ftrace are stored in pages,
 and the size is held in a variable. If there's some error in creating
 them, the allocate ones will be freed. In this case, it is possible that
 the order of pages to be freed may end up being negative due to a size of
 zero passed to get_count_order(), and then that negative number will cause
 free_pages() to free a very large section. Make sure that does not happen.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYGR30BQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnbDAP9yEhTLcDRUi3VLWnEq19Dt4Lsg86Bf
 QRpbWG6Ze9EbZQEAgYAOe1fsNCNEIMXXh/4nlKVpKKH+vviS0ux9Z6uhpQQ=
 =Veyq
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ftrace fix from Steven Rostedt:
 "Add check of order < 0 before calling free_pages()

  The function addresses that are traced by ftrace are stored in pages,
  and the size is held in a variable. If there's some error in creating
  them, the allocate ones will be freed. In this case, it is possible
  that the order of pages to be freed may end up being negative due to a
  size of zero passed to get_count_order(), and then that negative
  number will cause free_pages() to free a very large section.

  Make sure that does not happen"

* tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Check if pages were allocated before calling free_pages()
2021-03-31 10:14:55 -07:00
Linus Torvalds
39192106d4 Pin control fixes for the v5.12 kernel cycle:
- Fix up some Intel GPIO base calculations.
 - Fix a register offset in the Microchip driver.
 - Fix suspend/resume bug in the Rockchip driver.
 - Default pull up strength in the Qualcomm LPASS
   driver.
 - Fix two pingroup offsets in the Qualcomm SC7280
   driver.
 - Fix SDC1 register offset in the Qualcomm SC7280
   driver.
 - Fix a nasty string concatenation in the
   Qualcomm SDX55 driver.
 - Check the REVID register to see if the device is
   real or virtualized during virtualization in the
   Intel driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmBkalgACgkQQRCzN7AZ
 XXMY/g/+OUZm+mT0DR0AFDW7UkMytns0Zgt7Zk9V6W5rytW7731LNCC5CRF1F3kI
 iZiutqNfe7uDWop5yo1ub8UqCxBtgrJADzXT4uTGid11jQrsfwUjZEqjlW1J4oHz
 Y4cfmLCPtuB4rA1GrISqYQ2s2YO2+kdFVdL8ZrhFQug5GEOJrtmjAYvTU1GshaLH
 9oxd7bg5QV2ZmzAQoH7tScqUi+60u7CqWLBJzuW6/qIBCBpYxc6NFLlfPbuqJbyb
 NuJeRl+e7OA2tQE5CR4ymQyuoz7iGCrgsDIOi7maADDTJQEW/RPG9565qzqQxCgv
 UFt1sqodgegEgXLRNJG6Y9zO1qktCzDHJGNFyH9EZ2FHVKox+Gicf8TxYM3ulHca
 az/qs+MXH32YdLr//lg6XvogAMjnl3nXeq9j1U7nunki4N93oxBhq89d5qBcyZSP
 /3D53HViq6EEcBnChAZ64SY21Ro7HNzE6x5bOSL2HPsWy9B8U5jRgAkHxPjd8/Vr
 LF5EPblI+bZc8fOxAi9ifMu9OROB+I5ZUm00zYzucJUKoMtrkAkObW9/GS88vpwN
 j6uAAr/WFlWaXpKynIxyME3zPuhoyZHFlSnI8LI3UTBiZtFsxoMWbVP1OVkbK6bc
 kUtQUPsE0T4OowYa+ulYet+US/LucmVn7KgTFGOodSKtpxHU7Mk=
 =+Gsn
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Some overly ripe fixes for the v5.12 kernel. I should have sent
  earlier but had my head stuck in GDB.

  All are driver fixes:

   - Fix up some Intel GPIO base calculations.

   - Fix a register offset in the Microchip driver.

   - Fix suspend/resume bug in the Rockchip driver.

   - Default pull up strength in the Qualcomm LPASS driver.

   - Fix two pingroup offsets in the Qualcomm SC7280 driver.

   - Fix SDC1 register offset in the Qualcomm SC7280 driver.

   - Fix a nasty string concatenation in the Qualcomm SDX55 driver.

   - Check the REVID register to see if the device is real or
     virtualized during virtualization in the Intel driver"

* tag 'pinctrl-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: intel: check REVID register value for device presence
  pinctrl: qcom: fix unintentional string concatenation
  pinctrl: qcom: sc7280: Fix SDC1_RCLK configurations
  pinctrl: qcom: sc7280: Fix SDC_QDSD_PINGROUP and UFS_RESET offsets
  pinctrl: qcom: lpass lpi: use default pullup/strength values
  pinctrl: rockchip: fix restore error in resume
  pinctrl: microchip-sgpio: Fix wrong register offset for IRQ trigger
  pinctrl: intel: Show the GPIO base calculation explicitly
2021-03-31 10:09:44 -07:00
Bastian Germann
7c0d6e4820
ASoC: sunxi: sun4i-codec: fill ASoC card owner
card->owner is a required property and since commit 81033c6b58 ("ALSA:
core: Warn on empty module") a warning is issued if it is empty. Add it.
This fixes following warning observed on Lamobo R1:

WARNING: CPU: 1 PID: 190 at sound/core/init.c:207 snd_card_new+0x430/0x480 [snd]
Modules linked in: sun4i_codec(E+) sun4i_backend(E+) snd_soc_core(E) ...
CPU: 1 PID: 190 Comm: systemd-udevd Tainted: G         C  E     5.10.0-1-armmp #1 Debian 5.10.4-1
Hardware name: Allwinner sun7i (A20) Family
Call trace:
 (snd_card_new [snd])
 (snd_soc_bind_card [snd_soc_core])
 (snd_soc_register_card [snd_soc_core])
 (sun4i_codec_probe [sun4i_codec])

Fixes: 45fb6b6f2a ("ASoC: sunxi: add support for the on-chip codec on early Allwinner SoCs")
Related: commit 3c27ea23ff ("ASoC: qcom: Set card->owner to avoid warnings")
Related: commit ec653df2a0 ("drm/vc4/vc4_hdmi: fill ASoC card owner")
Cc: linux-arm-kernel@lists.infradead.org
Cc: alsa-devel@alsa-project.org
Signed-off-by: Bastian Germann <bage@linutronix.de>
Link: https://lore.kernel.org/r/20210331151843.30583-1-bage@linutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
2021-03-31 17:59:43 +01:00
Paolo Bonzini
825e34d3c9 Merge commit 'kvm-tdp-fix-flushes' into kvm-master 2021-03-31 07:45:41 -04:00
Tetsuo Handa
5e46d1b78a reiserfs: update reiserfs_xattrs_initialized() condition
syzbot is reporting NULL pointer dereference at reiserfs_security_init()
[1], for commit ab17c4f021 ("reiserfs: fixup xattr_root caching")
is assuming that REISERFS_SB(s)->xattr_root != NULL in
reiserfs_xattr_jcreate_nblocks() despite that commit made
REISERFS_SB(sb)->priv_root != NULL && REISERFS_SB(s)->xattr_root == NULL
case possible.

I guess that commit 6cb4aff0a7 ("reiserfs: fix oops while creating
privroot with selinux enabled") wanted to check xattr_root != NULL
before reiserfs_xattr_jcreate_nblocks(), for the changelog is talking
about the xattr root.

  The issue is that while creating the privroot during mount
  reiserfs_security_init calls reiserfs_xattr_jcreate_nblocks which
  dereferences the xattr root. The xattr root doesn't exist, so we get
  an oops.

Therefore, update reiserfs_xattrs_initialized() to check both the
privroot and the xattr root.

Link: https://syzkaller.appspot.com/bug?id=8abaedbdeb32c861dc5340544284167dd0e46cde # [1]
Reported-and-tested-by: syzbot <syzbot+690cb1e51970435f9775@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 6cb4aff0a7 ("reiserfs: fix oops while creating privroot with selinux enabled")
Acked-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Jan Kara <jack@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-30 14:27:32 -07:00
Jens Axboe
82734c5b1b io_uring: drop sqd lock before handling signals for SQPOLL
Don't call into get_signal() with the sqd mutex held, it'll fail if we're
freezing the task and we'll get complaints on locks still being held:

====================================
WARNING: iou-sqp-8386/8387 still has locks held!
5.12.0-rc4-syzkaller #0 Not tainted
------------------------------------
1 lock held by iou-sqp-8386/8387:
 #0: ffff88801e1d2470 (&sqd->lock){+.+.}-{3:3}, at: io_sq_thread+0x24c/0x13a0 fs/io_uring.c:6731

 stack backtrace:
 CPU: 1 PID: 8387 Comm: iou-sqp-8386 Not tainted 5.12.0-rc4-syzkaller #0
 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:79 [inline]
  dump_stack+0x141/0x1d7 lib/dump_stack.c:120
  try_to_freeze include/linux/freezer.h:66 [inline]
  get_signal+0x171a/0x2150 kernel/signal.c:2576
  io_sq_thread+0x8d2/0x13a0 fs/io_uring.c:6748

Fold the get_signal() case in with the parking checks, as we need to drop
the lock in both cases, and since we need to be checking for parking when
juggling the lock anyway.

Reported-by: syzbot+796d767eb376810256f5@syzkaller.appspotmail.com
Fixes: dbe1bdbb39 ("io_uring: handle signals for IO threads like a normal thread")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-30 14:36:46 -06:00
Hans de Goede
3e759425cc ACPI: scan: Fix _STA getting called on devices with unmet dependencies
Commit 71da201f38 ("ACPI: scan: Defer enumeration of devices with
_DEP lists") dropped the following 2 lines from acpi_init_device_object():

	/* Assume there are unmet deps until acpi_device_dep_initialize() runs */
	device->dep_unmet = 1;

Leaving the initial value of dep_unmet at the 0 from the kzalloc(). This
causes the acpi_bus_get_status() call in acpi_add_single_object() to
actually call _STA, even though there maybe unmet deps, leading to errors
like these:

[    0.123579] ACPI Error: No handler for Region [ECRM] (00000000ba9edc4c)
               [GenericSerialBus] (20170831/evregion-166)
[    0.123601] ACPI Error: Region GenericSerialBus (ID=9) has no handler
               (20170831/exfldio-299)
[    0.123618] ACPI Error: Method parse/execution failed
               \_SB.I2C1.BAT1._STA, AE_NOT_EXIST (20170831/psparse-550)

Fix this by re-adding the dep_unmet = 1 initialization to
acpi_init_device_object() and modifying acpi_bus_check_add() to make sure
that dep_unmet always gets setup there, overriding the initial 1 value.

This re-fixes the issue initially fixed by
commit 63347db0af ("ACPI / scan: Use acpi_bus_get_status() to initialize
ACPI_TYPE_DEVICE devs"), which introduced the removed
"device->dep_unmet = 1;" statement.

This issue was noticed; and the fix tested on a Dell Venue 10 Pro 5055.

Fixes: 71da201f38 ("ACPI: scan: Defer enumeration of devices with _DEP lists")
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-30 21:36:20 +02:00
Linus Torvalds
6ac86aae89 s390 updates for 5.12-rc6
- fix incorrect initialization and update of vdso data pages, which
   results in incorrect tod clock steering, and that
   clock_gettime(CLOCK_MONOTONIC_RAW, ...) returns incorrect values.
 
 - update MAINTAINERS for s390 vfio drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmBjXH8ACgkQIg7DeRsp
 bsLSWg//QhUiFdeUvyMoUxUj7Pld5/VPwAdVyuYbWlaiAwzHQwdhSCEXQ0CoIJm2
 TWKpDA768KjYI8RowS+WxjGtGgDFtagtwAUm9UveiCuV4QJ5Zt/n+IRoXlUuwZWG
 9mjx2UcUHxPFGJ7WtAR5VI1MLxZDHeEHVjLh7bOugO/7388CZr4TnCoIJZRXk1sh
 m8WLr5gavBmYTA2PrmayuOkxOPMV56d5rRi/ZGiUYWHJJgskg+LDWcBCr4IER2am
 MgeCMIpVgs2o4z6O2xCRmJeOG4Ha1JgeI0+8OhkkWmmZCo/ZMrjW4rjP8s0nSSif
 PdKot7KaeDAVGCm11jv4V+YVN9HaVPFqGimNV7XuEXBZKnwh7N2yKK4qWvwRy+TJ
 Y1nBMUvzGw7IEl60FYPCPUryk9KEq2qAYc77jRcRi9H7ji0X3BMYVKr/9AT3sdeJ
 O7FkGSdEk7rfYzDiqRyisbT3rJKMwRy7i8sp7ap/m1qcgVbQlUSFoFNXSchF+2wP
 OCQuvLOX7RNsFP9YiXcqUTgFO6JkN0UbVfwyOH/SxTf9Pyj53wEmphURGiHTZkbr
 9TrjDmAU3MA52IGU9gYHLqQ9otCrDhj4H9Uuuo6IDJ3v1pVgdt87fIq3nHjOU8Si
 KxVQ7V86Z3bXpHYglohWG7AfltfqkdO7FFAj7zvKgvq06zqRmtQ=
 =5z4s
 -----END PGP SIGNATURE-----

Merge tag 's390-5.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - fix incorrect initialization and update of vdso data pages, which
   results in incorrect tod clock steering, and that
   clock_gettime(CLOCK_MONOTONIC_RAW, ...) returns incorrect values.

 - update MAINTAINERS for s390 vfio drivers

* tag 's390-5.12-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  MAINTAINERS: add backups for s390 vfio drivers
  s390/vdso: fix initializing and updating of vdso_data
  s390/vdso: fix tod_steering_delta type
  s390/vdso: copy tod_steering_delta value to vdso_data page
2021-03-30 10:54:22 -07:00
Thierry Reding
ac097aecfe drm/tegra: sor: Grab runtime PM reference across reset
The SOR resets are exclusively shared with the SOR power domain. This
means that exclusive access can only be granted temporarily and in order
for that to work, a rigorous sequence must be observed. To ensure that a
single consumer gets exclusive access to a reset, each consumer must
implement a rigorous protocol using the reset_control_acquire() and
reset_control_release() functions.

However, these functions alone don't provide any guarantees at the
system level. Drivers need to ensure that the only a single consumer has
access to the reset at the same time. In order for the SOR to be able to
exclusively access its reset, it must therefore ensure that the SOR
power domain is not powered off by holding on to a runtime PM reference
to that power domain across the reset assert/deassert operation.

This used to work fine by accident, but was revealed when recently more
devices started to rely on the SOR power domain.

Fixes: 11c632e1cf ("drm/tegra: sor: Implement acquire/release for reset")
Reported-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30 19:51:39 +02:00
Matthew Wilcox (Oracle)
7487de534d radix tree test suite: Fix compilation
Commit 4bba4c4bb0 added tools/include/linux/compiler_types.h which
includes linux/compiler-gcc.h.  Unfortunately, we had our own (empty)
compiler_types.h which overrode the one added by that commit, and
so we lost the definition of __must_be_array().  Removing our empty
compiler_types.h fixes the problem and reduces our divergence from the
rest of the tools.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-03-30 13:44:35 -04:00
Matthew Wilcox (Oracle)
df59d0a461 XArray: Add xa_limit_16b
A 16-bit limit is a more common limit than I had realised.  Make it
generally available.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-03-30 13:42:33 -04:00
Matthew Wilcox (Oracle)
3012110d71 XArray: Fix splitting to non-zero orders
Splitting an order-4 entry into order-2 entries would leave the array
containing pointers to 000040008000c000 instead of 000044448888cccc.
This is a one-character fix, but enhance the test suite to check this
case.

Reported-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-03-30 13:42:33 -04:00
Matthew Wilcox (Oracle)
12efebab09 XArray: Fix split documentation
I wrote the documentation backwards; the new order of the entry is stored
in the xas and the caller passes the old entry.

Reported-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2021-03-30 13:42:33 -04:00
Thierry Reding
a31500fe70 drm/tegra: dc: Restore coupling of display controllers
Coupling of display controllers used to rely on runtime PM to take the
companion controller out of reset. Commit fd67e9c6ed ("drm/tegra: Do
not implement runtime PM") accidentally broke this when runtime PM was
removed.

Restore this functionality by reusing the hierarchical host1x client
suspend/resume infrastructure that's similar to runtime PM and which
perfectly fits this use-case.

Fixes: fd67e9c6ed ("drm/tegra: Do not implement runtime PM")
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Reported-by: Paul Fertser <fercerpav@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30 19:40:43 +02:00
Mikko Perttunen
a24f98176d gpu: host1x: Use different lock classes for each client
To avoid false lockdep warnings, give each client lock a different
lock class, passed from the initialization site by macro.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30 19:37:20 +02:00
Dmitry Osipenko
f8fb97c915 drm/tegra: dc: Don't set PLL clock to 0Hz
RGB output doesn't allow to change parent clock rate of the display and
PCLK rate is set to 0Hz in this case. The tegra_dc_commit_state() shall
not set the display clock to 0Hz since this change propagates to the
parent clock. The DISP clock is defined as a NODIV clock by the tegra-clk
driver and all NODIV clocks use the CLK_SET_RATE_PARENT flag.

This bug stayed unnoticed because by default PLLP is used as the parent
clock for the display controller and PLLP silently skips the erroneous 0Hz
rate changes because it always has active child clocks that don't permit
rate changes. The PLLP isn't acceptable for some devices that we want to
upstream (like Samsung Galaxy Tab and ASUS TF700T) due to a display panel
clock rate requirements that can't be fulfilled by using PLLP and then the
bug pops up in this case since parent clock is set to 0Hz, killing the
display output.

Don't touch DC clock if pclk=0 in order to fix the problem.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2021-03-30 19:37:20 +02:00
Sean Christopherson
33a3164161 KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
Prevent the TDP MMU from yielding when zapping a gfn range during NX
page recovery.  If a flush is pending from a previous invocation of the
zapping helper, either in the TDP MMU or the legacy MMU, but the TDP MMU
has not accumulated a flush for the current invocation, then yielding
will release mmu_lock with stale TLB entries.

That being said, this isn't technically a bug fix in the current code, as
the TDP MMU will never yield in this case.  tdp_mmu_iter_cond_resched()
will yield if and only if it has made forward progress, as defined by the
current gfn vs. the last yielded (or starting) gfn.  Because zapping a
single shadow page is guaranteed to (a) find that page and (b) step
sideways at the level of the shadow page, the TDP iter will break its loop
before getting a chance to yield.

But that is all very, very subtle, and will break at the slightest sneeze,
e.g. zapping while holding mmu_lock for read would break as the TDP MMU
wouldn't be guaranteed to see the present shadow page, and thus could step
sideways at a lower level.

Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210325200119.1359384-4-seanjc@google.com>
[Add lockdep assertion. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:19:56 -04:00
Sean Christopherson
048f49809c KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
Honor the "flush needed" return from kvm_tdp_mmu_zap_gfn_range(), which
does the flush itself if and only if it yields (which it will never do in
this particular scenario), and otherwise expects the caller to do the
flush.  If pages are zapped from the TDP MMU but not the legacy MMU, then
no flush will occur.

Fixes: 29cf0f5007 ("kvm: x86/mmu: NX largepage recovery for TDP MMU")
Cc: stable@vger.kernel.org
Cc: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210325200119.1359384-3-seanjc@google.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:19:55 -04:00
Sean Christopherson
a835429cda KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap
When flushing a range of GFNs across multiple roots, ensure any pending
flush from a previous root is honored before yielding while walking the
tables of the current root.

Note, kvm_tdp_mmu_zap_gfn_range() now intentionally overwrites its local
"flush" with the result to avoid redundant flushes.  zap_gfn_range()
preserves and return the incoming "flush", unless of course the flush was
performed prior to yielding and no new flush was triggered.

Fixes: 1af4a96025 ("KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed")
Cc: stable@vger.kernel.org
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210325200119.1359384-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:19:55 -04:00
Siddharth Chandrasekaran
6fb3084ab5 KVM: make: Fix out-of-source module builds
Building kvm module out-of-source with,

    make -C $SRC O=$BIN M=arch/x86/kvm

fails to find "irq.h" as the include dir passed to cflags-y does not
prefix the source dir. Fix this by prefixing $(srctree) to the include
dir path.

Signed-off-by: Siddharth Chandrasekaran <sidcha@amazon.de>
Message-Id: <20210324124347.18336-1-sidcha@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:07:10 -04:00
Vitaly Kuznetsov
f982fb62a3 selftests: kvm: make hardware_disable_test less verbose
hardware_disable_test produces 512 snippets like
...
 main: [511] waiting semaphore
 run_test: [511] start vcpus
 run_test: [511] all threads launched
 main: [511] waiting 368us
 main: [511] killing child

and this doesn't have much value, let's print this info with pr_debug().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210323104331.1354800-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:07:10 -04:00
Vitaly Kuznetsov
1973cadd4c KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE
MSR_F15H_PERF_CTL0-5, MSR_F15H_PERF_CTR0-5 MSRs are only available when
X86_FEATURE_PERFCTR_CORE CPUID bit was exposed to the guest. KVM, however,
allows these MSRs unconditionally because kvm_pmu_is_valid_msr() ->
amd_msr_idx_to_pmc() check always passes and because kvm_pmu_set_msr() ->
amd_pmu_set_msr() doesn't fail.

In case of a counter (CTRn), no big harm is done as we only increase
internal PMC's value but in case of an eventsel (CTLn), we go deep into
perf internals with a non-existing counter.

Note, kvm_get_msr_common() just returns '0' when these MSRs don't exist
and this also seems to contradict architectural behavior which is #GP
(I did check one old Opteron host) but changing this status quo is a bit
scarier.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210323084515.1346540-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:07:10 -04:00
Dongli Zhang
ecaf088f53 KVM: x86: remove unused declaration of kvm_write_tsc()
kvm_write_tsc() was renamed and made static since commit 0c899c25d7
("KVM: x86: do not attempt TSC synchronization on guest writes"). Remove
its unused declaration.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Message-Id: <20210326070334.12310-1-dongli.zhang@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:07:09 -04:00
Haiwei Li
d632826f26 KVM: clean up the unused argument
kvm_msr_ignored_check function never uses vcpu argument. Clean up the
function and invokers.

Signed-off-by: Haiwei Li <lihaiwei@tencent.com>
Message-Id: <20210313051032.4171-1-lihaiwei.kernel@gmail.com>
Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:07:09 -04:00
Stefan Raspl
75f94ecbd0 tools/kvm_stat: Add restart delay
If this service is enabled and the system rebooted, Systemd's initial
attempt to start this unit file may fail in case the kvm module is not
loaded. Since we did not specify a delay for the retries, Systemd
restarts with a minimum delay a number of times before giving up and
disabling the service. Which means a subsequent kvm module load will
have kvm running without monitoring.
Adding a delay to fix this.

Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Message-Id: <20210325122949.1433271-1-raspl@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-30 13:07:09 -04:00