This patch fixes the following sparse error:
kernel/exit.c:627:25: error: incompatible types in comparison expression
And the following warning:
kernel/exit.c:626:40: warning: incorrect type in assignment
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
[christian.brauner@ubuntu.com: edit commit message]
Link: https://lore.kernel.org/r/20200130062028.4870-1-madhuparnabhowmik10@gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
- Fix compilation on 32bit
- Move VHE guest entry/exit into the VHE-specific entry code
- Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl5VsNMPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDLhUQAIsecO9IyYjy1J0Q5AxaKLL7NuKYlAaty2xX
uY6UkTfPNsEaHFXSGYXWPDxrmkgArp2wuy4WVQB59Om00+LE7h9kiz7+xKpcUy1G
UoHa5mzMlqoOeUIWO/oSU6LYHhYDnIpHTDco93YrscU4nNRevJZ/GVeuQeMblzuZ
Sg7cWc+0V43FXUt9Jw8BsNhXH/D0l0p3v86p7GZLcSfFAccO62YfOwC8J/znLPym
4S+O9RYQkCczvzFeQVYQwqImOAunaOb0OzERUbm8icOF6ekYGwywjrtlmAC/3q+q
1g/te1yfwQ8fpprWl4QSH0sQVdfAcxdDZqcWtN2LhNaEShZtNa5yKpsRGn1V0eAS
tIO8eexAKCXoASHrrwfSkizYjRAeDabmodBQmS50/isY9OdBE2tDel+BLrCjzBJ2
hABwEZ3Q78216EuoqsZqWaEUZ3ck0iSW3IcXglmHE4TC8Iq6dwskvOPjay+msHr9
dcHDCxFIN4jzv9QcpKN8LkxfmW0Us28bzap3OhKfrz0nv7b4n+j0q1xbKL1QnN/l
RcDPW0dQeXuX9vYMeYIUDQcV4IgTUkF6IPDCRW7KCApi98HfPTbrfQ97nir79zDp
pD8NXaNFr4PtxJoheYYia3sjZMt/fgfvP2dM32iOpsMu7W1FXdfQN7heNSc6MQmO
ciyhf/mj
=NpPo
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm fixes for 5.6, take #1
- Fix compilation on 32bit
- Move VHE guest entry/exit into the VHE-specific entry code
- Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
In older version of systemd(219), at boot time, udevadm is called with :
/usr/bin/udevadm trigger --type=devices --action=add"
This program generates an echo "add" in /sys/devices/system/cpu/cpu<x>/uevent,
leading to the "kvm: disabled by bios" message in case of your Bios disabled
the virtualization extensions.
On a modern system running up to 256 CPU threads, this pollutes the Kernel logs.
This patch offers to ratelimit this message to avoid any userspace program triggering
this uevent printing this message too often.
This patch is only a workaround but greatly reduce the pollution without
breaking the current behavior of printing a message if some try to instantiate
KVM on a system that doesn't support it.
Note that recent versions of systemd (>239) do not have trigger this behavior.
This patch will be useful at least for some using older systemd with recent Kernels.
Signed-off-by: Erwan Velu <e.velu@criteo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* pm-sleep:
PM / hibernate: fix typo "reserverd_size" -> "reserved_size"
Documentation: power: Drop reference to interface.rst
* pm-devfreq:
Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
struct cpufreq_policy is quite big and it is not a good idea
to allocate one on the stack. Just use cpufreq_cpu_get and
cpufreq_cpu_put which is even simpler.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Restrict -Werror to well-tested configurations and allow disabling it
via Kconfig.
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compile error with CONFIG_KVM_INTEL=y and W=1:
CC arch/x86/kvm/vmx/vmx.o
arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=]
68 | static const struct x86_cpu_id vmx_cpu_id[] = {
| ^~~~~~~~~~
cc1: all warnings being treated as errors
When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a
reference to the structure (or any code at all). This makes W=1 compiles
unhappy.
Wrap both in a #ifdef to avoid the issue.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
[Do the same for CONFIG_KVM_AMD. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nick Desaulniers Reported:
When building with:
$ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000
The following warning is observed:
arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in
function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=]
static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int
vector)
^
Debugging with:
https://github.com/ClangBuiltLinux/frame-larger-than
via:
$ python3 frame_larger_than.py arch/x86/kernel/kvm.o \
kvm_send_ipi_mask_allbutself
points to the stack allocated `struct cpumask newmask` in
`kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is
potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for
the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as
8192, making a single instance of a `struct cpumask` 1024 B.
This patch fixes it by pre-allocate 1 cpumask variable per cpu and use it for
both pv tlb and pv ipis..
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce some pv check helpers for consistency.
Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Even if APICv is disabled at startup, the backing page and ir_list need
to be initialized in case they are needed later. The only case in
which this can be skipped is for userspace irqchip, and that must be
done because avic_init_backing_page dereferences vcpu->arch.apic
(which is NULL for userspace irqchip).
Tested-by: rmuncrief@humanavance.com
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206579
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- bridge: huge rework to get rid of omap_dss custom display drivers
Driver Changes:
- hisilicon: some fixes related to modes it can deal with / default to
- virtio: shmem and gpu context fixes and enhancements
- sun4i: Support for LVDS on the A33
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXleoiAAKCRDj7w1vZxhR
xbT4AP4zzM/NP++JUReg/kevOHLPHdPH4sg1+EtC0F4nJ25MEgD/UqYafdB/DN+Y
GXCPiYyYbJ8HpCOMGZiLgPHGa37xJAA=
=6VIX
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-02-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.7
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- bridge: huge rework to get rid of omap_dss custom display drivers
Driver Changes:
- hisilicon: some fixes related to modes it can deal with / default to
- virtio: shmem and gpu context fixes and enhancements
- sun4i: Support for LVDS on the A33
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227113222.cdwzy4cvcqjtbmou@gilmour.lan
amdgpu:
- Drop DRIVER_USE_AGP
- Fix memory leak in GPU reset
- Resume fix for raven
radeon:
- Drop DRIVER_USE_AGP
i915:
- downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs
- shrinker fix
- pmu leak and double free fixes
- gvt user after free and virtual display reset fixes
- randconfig build fix
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJeWJ+uAAoJEAx081l5xIa+SW4P/3UNXTw6SaJbuIv0ffU6PpR6
GqynyKseoogfZ6xDNRSEYcamIkKaniRLli+TPvEFqjAbpEzu0bbaXlQDcoG3uPqm
jGxufQ4GvxIidbIbzoJA/TLl8UOGLJ+x6fx4EhtAS0VzE3dugI5yPZOyE+cboR2D
0DtvqB1Bmx580TMSIJlzw92Nfgh4n1K29h51lW5MpY+lEvnqOjk0aliunPeOi2wB
8tbABzB+pY6UTPNickb/SBWmwcem7ceA/xxX6YyKE89mhREQo1PLZI6Tt5YTdlQ4
mizHFEZT3H/JF/X67DmaAEADTM+BDMkpRXHLQlIetPdzAg4K85HeBcDQfHTPKKSg
qtibd4VtxYXV31Mt/UWOh4EtpYHRZtbC8D42jBqF2DFBRXiTImli6PQ/G3c9xZdg
sYPMWGVjh3HWNxnejO5Bi7na5jkWaY/ujzT/ERlF+EKmX8meKQb5SgVndcZkjATx
yMh7lszuSxmw//qIq741bAbQk3e8/AdNm6iDISSCN3X+JCZI6bMOYP3xW8FzlQSe
q69v/ckYlZoVqGwO/Qp5kmfMmZ6GaqHOkLTHQucJbgx5C6nhfF9jtjg+G5hhAO+l
OcKvZo4eTPjayQbBoogxWhO+PW+0NaZ/KsU6k16a27txCeEjPRGmYDbbX5M6GHOw
NaO7zx1LL/2j4o7fvkq8
=5C0g
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Just some fixes for this week: amdgpu, radeon and i915.
The main i915 one is a regression Gen7 (Ivybridge/Haswell), this moves
them back from trying to use the full-ppgtt support to the aliasing
version it used to use due to gpu hangs. Otherwise it's pretty quiet.
amdgpu:
- Drop DRIVER_USE_AGP
- Fix memory leak in GPU reset
- Resume fix for raven
radeon:
- Drop DRIVER_USE_AGP
i915:
- downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs
- shrinker fix
- pmu leak and double free fixes
- gvt user after free and virtual display reset fixes
- randconfig build fix"
* tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drm:
drm/radeon: Inline drm_get_pci_dev
drm/amdgpu: Drop DRIVER_USE_AGP
drm/i915: Avoid recursing onto active vma from the shrinker
drm/i915/pmu: Avoid using globals for PMU events
drm/i915/pmu: Avoid using globals for CPU hotplug state
drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt
drm/i915: fix header test with GCOV
amdgpu/gmc_v9: save/restore sdpif regs during S3
drm/amdgpu: fix memory leak during TDR test(v2)
drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime
drm/i915/gvt: Separate display reset from ALL_ENGINES reset
Pull networking fixes from David Miller:
1) Fix leak in nl80211 AP start where we leak the ACL memory, from
Johannes Berg.
2) Fix double mutex unlock in mac80211, from Andrei Otcheretianski.
3) Fix RCU stall in ipset, from Jozsef Kadlecsik.
4) Fix devlink locking in devlink_dpipe_table_register, from Madhuparna
Bhowmik.
5) Fix race causing TX hang in ll_temac, from Esben Haabendal.
6) Stale eth hdr pointer in br_dev_xmit(), from Nikolay Aleksandrov.
7) Fix TX hash calculation bounds checking wrt. tc rules, from Amritha
Nambiar.
8) Size netlink responses properly in schedule action code to take into
consideration TCA_ACT_FLAGS. From Jiri Pirko.
9) Fix firmware paths for mscc PHY driver, from Antoine Tenart.
10) Don't register stmmac notifier multiple times, from Aaro Koskinen.
11) Various rmnet bug fixes, from Taehee Yoo.
12) Fix vsock deadlock in vsock transport release, from Stefano
Garzarella.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
net: dsa: mv88e6xxx: Fix masking of egress port
mlxsw: pci: Wait longer before accessing the device after reset
sfc: fix timestamp reconstruction at 16-bit rollover points
vsock: fix potential deadlock in transport->release()
unix: It's CONFIG_PROC_FS not CONFIG_PROCFS
net: rmnet: fix packet forwarding in rmnet bridge mode
net: rmnet: fix bridge mode bugs
net: rmnet: use upper/lower device infrastructure
net: rmnet: do not allow to change mux id if mux id is duplicated
net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device()
net: rmnet: fix suspicious RCU usage
net: rmnet: fix NULL pointer dereference in rmnet_changelink()
net: rmnet: fix NULL pointer dereference in rmnet_newlink()
net: phy: marvell: don't interpret PHY status unless resolved
mlx5: register lag notifier for init network namespace only
unix: define and set show_fdinfo only if procfs is enabled
hinic: fix a bug of rss configuration
hinic: fix a bug of setting hw_ioctxt
hinic: fix a irq affinity bug
net/smc: check for valid ib_client_data
...
PowerVM systems running compatibility mode on a few Power8 revisions are
still vulnerable to the hardware defect that loses PMU exceptions arriving
prior to a context switch.
The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG
cpu_feature bit, nevertheless this bit also needs to be set for PowerVM
compatibility mode systems.
Fixes: 68f2f0d431 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG")
Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com>
Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200227134715.9715-1-desnesn@linux.ibm.com
de80f95ccb ("PCI: cadence: Move all files to per-device cadence
directory") moved files of the PCI cadence drivers, but did not update the
MAINTAINERS entry.
Since then, ./scripts/get_maintainer.pl --self-test complains:
warning: no file matches F: drivers/pci/controller/pcie-cadence*
Repair the MAINTAINERS entry.
Link: https://lore.kernel.org/r/20200221185402.4703-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Dm-zoned initializes reference counters of new chunk works with zero
value and refcount_inc() is called to increment the counter. However, the
refcount_inc() function handles the addition to zero value as an error
and triggers the warning as follows:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 7 PID: 1506 at lib/refcount.c:25 refcount_warn_saturate+0x68/0xf0
...
CPU: 7 PID: 1506 Comm: systemd-udevd Not tainted 5.4.0+ #134
...
Call Trace:
dmz_map+0x2d2/0x350 [dm_zoned]
__map_bio+0x42/0x1a0
__split_and_process_non_flush+0x14a/0x1b0
__split_and_process_bio+0x83/0x240
? kmem_cache_alloc+0x165/0x220
dm_process_bio+0x90/0x230
? generic_make_request_checks+0x2e7/0x680
dm_make_request+0x3e/0xb0
generic_make_request+0xcf/0x320
? memcg_drain_all_list_lrus+0x1c0/0x1c0
submit_bio+0x3c/0x160
? guard_bio_eod+0x2c/0x130
mpage_readpages+0x182/0x1d0
? bdev_evict_inode+0xf0/0xf0
read_pages+0x6b/0x1b0
__do_page_cache_readahead+0x1ba/0x1d0
force_page_cache_readahead+0x93/0x100
generic_file_read_iter+0x83a/0xe40
? __seccomp_filter+0x7b/0x670
new_sync_read+0x12a/0x1c0
vfs_read+0x9d/0x150
ksys_read+0x5f/0xe0
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9
...
After this warning, following refcount API calls for the counter all fail
to change the counter value.
Fix this by setting the initial reference counter value not zero but one
for the new chunk works. Instead, do not call refcount_inc() via
dmz_get_chunk_work() for the new chunks works.
The failure was observed with linux version 5.4 with CONFIG_REFCOUNT_FULL
enabled. Refcount rework was merged to linux version 5.5 by the
commit 168829ad09 ("Merge branch 'locking-core-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"). After this
commit, CONFIG_REFCOUNT_FULL was removed and the failure was observed
regardless of kernel configuration.
Linux version 4.20 merged the commit 092b564876 ("dm zoned: target: use
refcount_t for dm zoned reference counters"). Before this commit, dm
zoned used atomic_t APIs which does not check addition to zero, then this
fix is not necessary.
Fixes: 092b564876 ("dm zoned: target: use refcount_t for dm zoned reference counters")
Cc: stable@vger.kernel.org # 5.4+
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Verify the watermark upon resume - so that if the target is reloaded
with lower watermark, it will start the cleanup process immediately.
Fixes: 48debafe4f ("dm: add writecache target")
Cc: stable@vger.kernel.org # 4.18+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
The function dm_suspended returns true if the target is suspended.
However, when the target is being suspended during unload, it returns
false.
An example where this is a problem: the test "!dm_suspended(wc->ti)" in
writecache_writeback is not sufficient, because dm_suspended returns
zero while writecache_suspend is in progress. As is, without an
enhanced dm_suspended, simply switching from flush_workqueue to
drain_workqueue still emits warnings:
workqueue writecache-writeback: drain_workqueue() isn't complete after 10 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 100 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 200 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 300 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 400 tries
writecache_suspend calls flush_workqueue(wc->writeback_wq) - this function
flushes the current work. However, the workqueue may re-queue itself and
flush_workqueue doesn't wait for re-queued works to finish. Because of
this - the function writecache_writeback continues execution after the
device was suspended and then concurrently with writecache_dtr, causing
a crash in writecache_writeback.
We must use drain_workqueue - that waits until the work and all re-queued
works finish.
As a prereq for switching to drain_workqueue, this commit fixes
dm_suspended to return true after the presuspend hook and before the
postsuspend hook - just like during a normal suspend. It allows
simplifying the dm-integrity and dm-writecache targets so that they
don't have to maintain suspended flags on their own.
With this change use of drain_workqueue() can be used effectively. This
change was tested with the lvm2 testsuite and cryptsetup testsuite and
the are no regressions.
Fixes: 48debafe4f ("dm: add writecache target")
Cc: stable@vger.kernel.org # 4.18+
Reported-by: Corey Marthaler <cmarthal@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
We must set MSG_CMSG_COMPAT if we're in compatability mode, otherwise
the iovec import for these commands will not do the right thing and fail
the command with -EINVAL.
Found by running the test suite compiled as 32-bit.
Cc: stable@vger.kernel.org
Fixes: aa1fa28fc7 ("io_uring: add support for recvmsg()")
Fixes: 0fa03c624d ("io_uring: add support for sendmsg()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The algorithm pre-allocates a cm_id since allocation cannot be done while
holding the cm.lock spinlock, however it doesn't free it on one error
path, leading to a memory leak.
Fixes: 067b171b86 ("IB/cm: Share listening CM IDs")
Link: https://lore.kernel.org/r/20200221152023.GA8680@ziepe.ca
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add missing ~ to the usage of the mask.
Reported-by: Kevin Benson <Kevin.Benson@zii.aero>
Reported-by: Chris Healy <Chris.Healy@zii.aero>
Fixes: 5c74c54ce6 ("net: dsa: mv88e6xxx: Split monitor port configuration")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
During initialization the driver issues a reset to the device and waits
for 100ms before checking if the firmware is ready. The waiting is
necessary because before that the device is irresponsive and the first
read can result in a completion timeout.
While 100ms is sufficient for Spectrum-1 and Spectrum-2, it is
insufficient for Spectrum-3.
Fix this by increasing the timeout to 200ms.
Fixes: da382875c6 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can't just use the top bits of the last sync event as they could be
off-by-one every 65,536 seconds, giving an error in reconstruction of
65,536 seconds.
This patch uses the difference in the bottom 16 bits (mod 2^16) to
calculate an offset that needs to be applied to the last sync event to
get to the current time.
Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Acked-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some transports (hyperv, virtio) acquire the sock lock during the
.release() callback.
In the vsock_stream_connect() we call vsock_assign_transport(); if
the socket was previously assigned to another transport, the
vsk->transport->release() is called, but the sock lock is already
held in the vsock_stream_connect(), causing a deadlock reported by
syzbot:
INFO: task syz-executor280:9768 blocked for more than 143 seconds.
Not tainted 5.6.0-rc1-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor280 D27912 9768 9766 0x00000000
Call Trace:
context_switch kernel/sched/core.c:3386 [inline]
__schedule+0x934/0x1f90 kernel/sched/core.c:4082
schedule+0xdc/0x2b0 kernel/sched/core.c:4156
__lock_sock+0x165/0x290 net/core/sock.c:2413
lock_sock_nested+0xfe/0x120 net/core/sock.c:2938
virtio_transport_release+0xc4/0xd60 net/vmw_vsock/virtio_transport_common.c:832
vsock_assign_transport+0xf3/0x3b0 net/vmw_vsock/af_vsock.c:454
vsock_stream_connect+0x2b3/0xc70 net/vmw_vsock/af_vsock.c:1288
__sys_connect_file+0x161/0x1c0 net/socket.c:1857
__sys_connect+0x174/0x1b0 net/socket.c:1874
__do_sys_connect net/socket.c:1885 [inline]
__se_sys_connect net/socket.c:1882 [inline]
__x64_sys_connect+0x73/0xb0 net/socket.c:1882
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
To avoid this issue, this patch remove the lock acquiring in the
.release() callback of hyperv and virtio transports, and it holds
the lock when we call vsk->transport->release() in the vsock core.
Reported-by: syzbot+731710996d79d0d58fbc@syzkaller.appspotmail.com
Fixes: 408624af4c ("vsock: use local transport when it is loaded")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo says:
====================
net: rmnet: fix several bugs
This patchset is to fix several bugs in RMNET module.
1. The first patch fixes NULL-ptr-deref in rmnet_newlink().
When rmnet interface is being created, it uses IFLA_LINK
without checking NULL.
So, if userspace doesn't set IFLA_LINK, panic will occur.
In this patch, checking NULL pointer code is added.
2. The second patch fixes NULL-ptr-deref in rmnet_changelink().
To get real device in rmnet_changelink(), it uses IFLA_LINK.
But, IFLA_LINK should not be used in rmnet_changelink().
3. The third patch fixes suspicious RCU usage in rmnet_get_port().
rmnet_get_port() uses rcu_dereference_rtnl().
But, rmnet_get_port() is used by datapath.
So, rcu_dereference_bh() should be used instead of rcu_dereference_rtnl().
4. The fourth patch fixes suspicious RCU usage in
rmnet_force_unassociate_device().
RCU critical section should not be scheduled.
But, unregister_netdevice_queue() in the rmnet_force_unassociate_device()
would be scheduled.
So, the RCU warning occurs.
In this patch, the rcu_read_lock() in the rmnet_force_unassociate_device()
is removed because it's unnecessary.
5. The fifth patch fixes duplicate MUX ID case.
RMNET MUX ID is unique.
So, rmnet interface isn't allowed to be created, which have
a duplicate MUX ID.
But, only rmnet_newlink() checks this condition, rmnet_changelink()
doesn't check this.
So, duplicate MUX ID case would happen.
6. The sixth patch fixes upper/lower interface relationship problems.
When IFLA_LINK is used, the upper/lower infrastructure should be used.
Because it checks the maximum depth of upper/lower interfaces and it also
checks circular interface relationship, etc.
In this patch, netdev_upper_dev_link() is used.
7. The seventh patch fixes bridge related problems.
a) ->ndo_del_slave() doesn't work.
b) It couldn't detect circular upper/lower interface relationship.
c) It couldn't prevent stack overflow because of too deep depth
of upper/lower interface
d) It doesn't check the number of lower interfaces.
e) Panics because of several reasons.
These problems are actually the same problem.
So, this patch fixes these problems.
8. The eighth patch fixes packet forwarding issue in bridge mode
Packet forwarding is not working in rmnet bridge mode.
Because when a packet is forwarded, skb_push() for an ethernet header
is needed. But it doesn't call skb_push().
So, the ethernet header will be lost.
Change log:
- update commit logs.
- drop two patches in this patchset because of wrong target branch.
- ("net: rmnet: add missing module alias")
- ("net: rmnet: print error message when command fails")
- remove unneessary rcu_read_lock() in the third patch.
- use rcu_dereference_bh() instead of rcu_dereference in third patch.
- do not allow to add a bridge device if rmnet interface is already
bridge mode in the seventh patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Packet forwarding is not working in rmnet bridge mode.
Because when a packet is forwarded, skb_push() for an ethernet header
is needed. But it doesn't call skb_push().
So, the ethernet header will be lost.
Test commands:
modprobe rmnet
ip netns add nst
ip netns add nst2
ip link add veth0 type veth peer name veth1
ip link add veth2 type veth peer name veth3
ip link set veth1 netns nst
ip link set veth3 netns nst2
ip link add rmnet0 link veth0 type rmnet mux_id 1
ip link set veth2 master rmnet0
ip link set veth0 up
ip link set veth2 up
ip link set rmnet0 up
ip a a 192.168.100.1/24 dev rmnet0
ip netns exec nst ip link set veth1 up
ip netns exec nst ip a a 192.168.100.2/24 dev veth1
ip netns exec nst2 ip link set veth3 up
ip netns exec nst2 ip a a 192.168.100.3/24 dev veth3
ip netns exec nst2 ping 192.168.100.2
Fixes: 60d58f971c ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to attach a bridge interface to the rmnet interface,
"master" operation is used.
(e.g. ip link set dummy1 master rmnet0)
But, in the rmnet_add_bridge(), which is a callback of ->ndo_add_slave()
doesn't register lower interface.
So, ->ndo_del_slave() doesn't work.
There are other problems too.
1. It couldn't detect circular upper/lower interface relationship.
2. It couldn't prevent stack overflow because of too deep depth
of upper/lower interface
3. It doesn't check the number of lower interfaces.
4. Panics because of several reasons.
The root problem of these issues is actually the same.
So, in this patch, these all problems will be fixed.
Test commands:
modprobe rmnet
ip link add dummy0 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link add dummy1 master rmnet0 type dummy
ip link add dummy2 master rmnet0 type dummy
ip link del rmnet0
ip link del dummy2
ip link del dummy1
Splat looks like:
[ 41.867595][ T1164] general protection fault, probably for non-canonical address 0xdffffc0000000101I
[ 41.869993][ T1164] KASAN: null-ptr-deref in range [0x0000000000000808-0x000000000000080f]
[ 41.872950][ T1164] CPU: 0 PID: 1164 Comm: ip Not tainted 5.6.0-rc1+ #447
[ 41.873915][ T1164] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 41.875161][ T1164] RIP: 0010:rmnet_unregister_bridge.isra.6+0x71/0xf0 [rmnet]
[ 41.876178][ T1164] Code: 48 89 ef 48 89 c6 5b 5d e9 fc fe ff ff e8 f7 f3 ff ff 48 8d b8 08 08 00 00 48 ba 00 7
[ 41.878925][ T1164] RSP: 0018:ffff8880c4d0f188 EFLAGS: 00010202
[ 41.879774][ T1164] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000101
[ 41.887689][ T1164] RDX: dffffc0000000000 RSI: ffffffffb8cf64f0 RDI: 0000000000000808
[ 41.888727][ T1164] RBP: ffff8880c40e4000 R08: ffffed101b3c0e3c R09: 0000000000000001
[ 41.889749][ T1164] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 1ffff110189a1e3c
[ 41.890783][ T1164] R13: ffff8880c4d0f200 R14: ffffffffb8d56160 R15: ffff8880ccc2c000
[ 41.891794][ T1164] FS: 00007f4300edc0c0(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
[ 41.892953][ T1164] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.893800][ T1164] CR2: 00007f43003bc8c0 CR3: 00000000ca53e001 CR4: 00000000000606f0
[ 41.894824][ T1164] Call Trace:
[ 41.895274][ T1164] ? rcu_is_watching+0x2c/0x80
[ 41.895895][ T1164] rmnet_config_notify_cb+0x1f7/0x590 [rmnet]
[ 41.896687][ T1164] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[ 41.897611][ T1164] ? rmnet_unregister_bridge.isra.6+0xf0/0xf0 [rmnet]
[ 41.898508][ T1164] ? __module_text_address+0x13/0x140
[ 41.899162][ T1164] notifier_call_chain+0x90/0x160
[ 41.899814][ T1164] rollback_registered_many+0x660/0xcf0
[ 41.900544][ T1164] ? netif_set_real_num_tx_queues+0x780/0x780
[ 41.901316][ T1164] ? __lock_acquire+0xdfe/0x3de0
[ 41.901958][ T1164] ? memset+0x1f/0x40
[ 41.902468][ T1164] ? __nla_validate_parse+0x98/0x1ab0
[ 41.903166][ T1164] unregister_netdevice_many.part.133+0x13/0x1b0
[ 41.903988][ T1164] rtnl_delete_link+0xbc/0x100
[ ... ]
Fixes: 60d58f971c ("net: qualcomm: rmnet: Implement bridge mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Basically, duplicate mux id isn't be allowed.
So, the creation of rmnet will be failed if there is duplicate mux id
is existing.
But, changelink routine doesn't check duplicate mux id.
Test commands:
modprobe rmnet
ip link add dummy0 type dummy
ip link add rmnet0 link dummy0 type rmnet mux_id 1
ip link add rmnet1 link dummy0 type rmnet mux_id 2
ip link set rmnet1 type rmnet mux_id 1
Fixes: 23790ef120 ("net: qualcomm: rmnet: Allow to configure flags for existing devices")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rmnet_get_port() internally calls rcu_dereference_rtnl(),
which checks RTNL.
But rmnet_get_port() could be called by packet path.
The packet path is not protected by RTNL.
So, the suspicious RCU usage problem occurs.
Test commands:
modprobe rmnet
ip netns add nst
ip link add veth0 type veth peer name veth1
ip link set veth1 netns nst
ip link add rmnet0 link veth0 type rmnet mux_id 1
ip netns exec nst ip link add rmnet1 link veth1 type rmnet mux_id 1
ip netns exec nst ip link set veth1 up
ip netns exec nst ip link set rmnet1 up
ip netns exec nst ip a a 192.168.100.2/24 dev rmnet1
ip link set veth0 up
ip link set rmnet0 up
ip a a 192.168.100.1/24 dev rmnet0
ping 192.168.100.2
Splat looks like:
[ 146.630958][ T1174] WARNING: suspicious RCU usage
[ 146.631735][ T1174] 5.6.0-rc1+ #447 Not tainted
[ 146.632387][ T1174] -----------------------------
[ 146.633151][ T1174] drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c:386 suspicious rcu_dereference_check() !
[ 146.634742][ T1174]
[ 146.634742][ T1174] other info that might help us debug this:
[ 146.634742][ T1174]
[ 146.645992][ T1174]
[ 146.645992][ T1174] rcu_scheduler_active = 2, debug_locks = 1
[ 146.646937][ T1174] 5 locks held by ping/1174:
[ 146.647609][ T1174] #0: ffff8880c31dea70 (sk_lock-AF_INET){+.+.}, at: raw_sendmsg+0xab8/0x2980
[ 146.662463][ T1174] #1: ffffffff93925660 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x243/0x2150
[ 146.671696][ T1174] #2: ffffffff93925660 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x213/0x2940
[ 146.673064][ T1174] #3: ffff8880c19ecd58 (&dev->qdisc_running_key#7){+...}, at: ip_finish_output2+0x714/0x2150
[ 146.690358][ T1174] #4: ffff8880c5796898 (&dev->qdisc_xmit_lock_key#3){+.-.}, at: sch_direct_xmit+0x1e2/0x1020
[ 146.699875][ T1174]
[ 146.699875][ T1174] stack backtrace:
[ 146.701091][ T1174] CPU: 0 PID: 1174 Comm: ping Not tainted 5.6.0-rc1+ #447
[ 146.705215][ T1174] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 146.706565][ T1174] Call Trace:
[ 146.707102][ T1174] dump_stack+0x96/0xdb
[ 146.708007][ T1174] rmnet_get_port.part.9+0x76/0x80 [rmnet]
[ 146.709233][ T1174] rmnet_egress_handler+0x107/0x420 [rmnet]
[ 146.710492][ T1174] ? sch_direct_xmit+0x1e2/0x1020
[ 146.716193][ T1174] rmnet_vnd_start_xmit+0x3d/0xa0 [rmnet]
[ 146.717012][ T1174] dev_hard_start_xmit+0x160/0x740
[ 146.717854][ T1174] sch_direct_xmit+0x265/0x1020
[ 146.718577][ T1174] ? register_lock_class+0x14d0/0x14d0
[ 146.719429][ T1174] ? dev_watchdog+0xac0/0xac0
[ 146.723738][ T1174] ? __dev_queue_xmit+0x15fd/0x2940
[ 146.724469][ T1174] ? lock_acquire+0x164/0x3b0
[ 146.725172][ T1174] __dev_queue_xmit+0x20c7/0x2940
[ ... ]
Fixes: ceed73a2cf ("drivers: net: ethernet: qualcomm: rmnet: Initial implementation")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some bcm2711 revisions have different DMA constraints on the their PCIE
bus. The lower common denominator, being able to access the lower 3GB of
memory, is the default setting for now. Newer SoC revisions are able to
access the whole memory space.
Raspberry Pi 4's firmware is aware of this limitation and will correct
the PCIE's dma-ranges property if a pcie0 alias is available. So add
it.
Fixes: d5c8dc0d4c ("ARM: dts: bcm2711: Enable PCIe controller")
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Reviewed-by: Phil Elwell <phil@raspberrypi.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
This adds the missing properties to the PWR LED for the RPi 3 & 4 boards,
which are already set for the other boards. Without them we will lose
the LED state after suspend.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
- fix missed rebuild of DT schema check
- add some phony targets to PHONY
- fix comments and documents
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl5X+HIVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGMf8P/2uI3ECM0SiVGtaD33PwnoHb+Imn
XjOFyuQF37UFaAh5HKa0VYxk0/q5zZ6iYfaTfYdBusxpGhhMsAjIWsvhOaiPtqRA
fPL1jrOqQpeY9g37J6xpLyYIwTb2V/ShiA2CLFiQCjDYMHHDkGTw5lXHx7O/9aWu
82Xk4sTw2CrzgDH/hhXaoxIWo3jmqv9EoTEXoNlJKaOoKkjYfbhwzlDz4VvFY+sn
YbUL2qIq3ViYfKR4qvEDG69E7gAnXSXCXJfbBMpTaCL5ZkbwwkKylc2qgoSikJqS
NMZ9T5Eq9rGqLAt7m2wRZjVHjN4zX7yjudp8Fa901W6C7OBW+XNOIIhzSPKbKLYb
EtrQqhZvvmTN73RHHC5D/ZGzKRisEdnO6gCbh3/bg3Wpl0yJjM9tyQx8rqwMYlPc
3xMnuylKSrBj3t4HfBVoVZqwJPkq7OLtnEF9TC73FToCsLz2SBuuHlqqTStInhEQ
mP+2V8OyBAlLtmLRj8q27YqDfkUME65pGaZiPQGl3DfT/buhLJLuZqd+zLDzUsmE
boCgSDtcc/O9RClN6LpBHPRJwTSgj/h5ECqeOeGyaGoPtJ6DdNS0FHePoLk/tLhu
VylkMTOaOOvMx7VAmpTgtDyVGPSl3MNyCPSeaIw6jIHnpi0iVIuVWKt8dyPjbihT
GRjpR04OOmnPE+LB
=97Ti
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- fix missed rebuild of DT schema check
- add some phony targets to PHONY
- fix comments and documents
* tag 'kbuild-fixes-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: get rid of trailing slash from subdir- example
kbuild: add dt_binding_check to PHONY in a correct place
kbuild: add dtbs_check to PHONY
kbuild: remove unneeded semicolon at the end of cmd_dtb_check
kbuild: fix DT binding schema rule to detect command line changes
kbuild: remove wrong documentation about mandatory-y
kbuild: add comment for V=2 mode
Don't attempt to interpret the PHY specific status register unless
the PHY is indicating that the resolution is valid.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code causes problems when the unregistering netdevice could
be different then the registering one.
Since the check in mlx5_lag_netdev_event() does not allow any other
network namespace anyway, fix this by registerting the lag notifier
per init network namespace only.
Fixes: d48834f9d4 ("mlx5: Use dev_net netdevice notifier registrations")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Aya Levin <ayal@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull HID subsystem fixes from Jiri Kosina:
- syzkaller-reported error handling fixes in various drivers, from
various people
- increase of HID report buffer size to 8K, which is apparently needed
by certain modern devices
- a few new device-ID-specific fixes / quirks
- battery charging status reporting fix in logitech-hidpp, from Filipe
Laíns
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: hid-bigbenff: fix race condition for scheduled work during removal
HID: hid-bigbenff: call hid_hw_stop() in case of error
HID: hid-bigbenff: fix general protection fault caused by double kfree
HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override
HID: alps: Fix an error handling path in 'alps_input_configured()'
HID: hiddev: Fix race in in hiddev_disconnect()
HID: core: increase HID report buffer size to 8KiB
HID: core: fix off-by-one memset in hid_report_raw_event()
HID: apple: Add support for recent firmware on Magic Keyboards
HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock
HID: logitech-hidpp: BatteryVoltage: only read chargeStatus if extPower is active
Follow the pattern used with other *_show_fdinfo functions and only
define unix_show_fdinfo and set it in proto_ops if CONFIG_PROCFS
is set.
Fixes: 3c32da19a8 ("unix: Show number of pending scm files of receive queue in fdinfo")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Luo bin says:
====================
hinic: BugFixes
the bug fixed in patch #2 has been present since the first commit.
the bugs fixed in patch #1 and patch #3 have been present since the
following commits:
patch #1: 352f58b0d9 ("net-next/hinic: Set Rxq irq to specific cpu for NUMA")
patch #3: 421e952628 ("hinic: add rss support")
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
should use real receive queue number to configure hw rss
indirect table rather than maximal queue number
Signed-off-by: Luo bin <luobin9@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>