It's not enough to have per-falcon structures anymore, we have multiple
versions of some firmware now that have interface differences.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Will be passed to the FW loader function as an upper bound on the supported
FW version to attempt to load.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We have a need for this now with updated SEC2 LS FW images that have an
incompatible interface from the previous version.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In theory, IO scheduler belongs to request queue, and the request pool
of sched tags belongs to the request queue too.
However, the current tags allocation interfaces are re-used for both
driver tags and sched tags, and driver tags is definitely host wide,
and doesn't belong to any request queue, same with its request pool.
So we need tagset instance for freeing request of sched tags.
Meantime, blk_mq_free_tag_set() often follows blk_cleanup_queue() in case
of non-BLK_MQ_F_TAG_SHARED, this way requires that request pool of sched
tags to be freed before calling blk_mq_free_tag_set().
Commit 47cdee29ef ("block: move blk_exit_queue into __blk_release_queue")
moves blk_exit_queue into __blk_release_queue for simplying the fast
path in generic_make_request(), then causes oops during freeing requests
of sched tags in __blk_release_queue().
Fix the above issue by move freeing request pool of sched tags into
blk_cleanup_queue(), this way is safe becasue queue has been frozen and no any
in-queue requests at that time. Freeing sched tags has to be kept in queue's
release handler becasue there might be un-completed dispatch activity
which might refer to sched tags.
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Fixes: 47cdee29ef ("block: move blk_exit_queue into __blk_release_queue")
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull parisc fixes from Helge Deller:
- Fix crashes when accessing PCI devices on some machines like C240 and
J5000. The crashes were triggered because we replaced cache flushes
by nops in the alternative coding where we shouldn't for some
machines.
- Dave fixed a race in the usage of the sr1 space register when used to
load the coherence index.
- Use the hardware lpa instruction to to load the physical address of
kernel virtual addresses in the iommu driver code.
- The kernel may fail to link when CONFIG_MLONGCALLS isn't set. Solve
that by rearranging functions in the final vmlinux executeable.
- Some defconfig cleanups and removal of compiler warnings.
* 'parisc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix crash due alternative coding for NP iopdir_fdc bit
parisc: Use lpa instruction to load physical addresses in driver code
parisc: configs: Remove useless UEVENT_HELPER_PATH
parisc: Use implicit space register selection for loading the coherence index of I/O pdirs
parisc: Fix compiler warnings in float emulation code
parisc/slab: cleanup after /proc/slab_allocators removal
parisc: Allow building 64-bit kernel without -mlong-calls compiler option
parisc: Kconfig: remove ARCH_DISCARD_MEMBLOCK
Pull crypto fixes from Herbert Xu:
"This fixes a regression that breaks the jitterentropy RNG and a
potential memory leak in hmac"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: hmac - fix memory leak in hmac_init_tfm()
crypto: jitterentropy - change back to module_init()
- Fix some forgotten strings in a log debugging function
- Fix incorrect unit conversion in online fsck code
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlz1Ur0ACgkQ+H93GTRK
tOshcA//X+tEFZGF+oosQ3h/IJqr9pODZfVf76mQIpg/5IU7WUzONGjTsjiHyH93
zC2qw7l7m/3LyCSXfuJfuWOCyPXRRj7Dv+iUCbjmAh0OAn6Alpa1VN9jxABTeTuY
xeCoNdRBCg6wF2XLLVgEN+VEdrFc7F7x/4OyoW5dmmBQakNOWlVgLtQqO+7gK/hS
Qc+xw9ekKvkceHi//NXJTSXAZ0EfntzNZ/Fg41O2cztWV4oKxl2a4ej+j0g8/u9e
rabTP54RH8dNJIejRmWU3dxz/w7OHSOO84LW/Q4LMKykfhLV6lhPL25igy1eLNhY
OpcCWQio4ZKBwd8UoXCH0HA4dprusPipI1JIXp/YDHGFb0PumhkZaO/1xM0G5J+o
3n1A1m7MW16jXAwlwBq0A6UNUFGyMfhfQiCluikBwkpR5dhWQoLQ5/IWZxs7ibUy
0M5POwnYUsx7xcx3/TjUsLaGbBrqD/F4LXXGmxVJQa6Dt6B0ctyLDnjN08mNwR0e
4Kvck4yxUHrUoXGaOhtddZmloVcrY58CwwHPabwoA3qC39ktKvp5PPrTNPiao8Ew
oILteFrpQxPfg1AItXO/QMcGvj7ragHSA2O0XqD5Tm61jWErfpIAyRQ89oXT5GUt
Z52uOI164u/8T4S/HUvi9tqYuZ+xudG4krIClDCXBaww1W2xNjo=
=WFWW
-----END PGP SIGNATURE-----
Merge tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"Here are a couple more bug fixes for 5.2. Changes since last update:
- Fix some forgotten strings in a log debugging function
- Fix incorrect unit conversion in online fsck code"
* tag 'xfs-5.2-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: inode btree scrubber should calculate im_boffset correctly
xfs: fix broken log reservation debugging
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJc+SSIAAoJENW/n+sDE2U6Vx8P/jP9FW/JNMj4JeXVMPvSP4eB
raZDEUNksGnWu7/JAjoz5PruPZaeW+CoDq25oiB7MalMopEXZ4RRovhmrrZBXqP3
jI7/4yoHRJf5907ZFfbNHX3vL8up+I3Ej1jf8fsEpbTMlO95n5dtsTJIcuBpRB6L
Oq4M7sIrSU8FxRhOt+wBw042R3FWLYkNZ7mYV+VbiC/OXGSnOWj/uhZ04m83B39a
B0EylS0RIMVkS187+7gVxn5Rcyg0go+/Hi3pdMwBcWFOsVfAPI4Gr7n4F/CI4he5
lb4B6CMlQR0m1Nsqvvx+s3TUxlnAxuRJ9Kdam2T0emVvNkQTOpBvA6mHsk3WNfrl
6YE06wkpk29tLGD0a1cKwY4tiQrJ7isi2n/9pCw8NsWVJM2vMMILIBjNWFQ6wKOo
JnrKe3gjEe/lXN6JuZRZQaP2VkaRO9a86Op7WsPmt5LWAIyRFj0EQMJFay+dhjcl
N/MpAXzhuNrHGHpuf0X+9caWhLciAZmd3t/CbEaNCbHMlha7S9iFaIChy899gFob
L8ODwrHGozOQXSJzqOxLgGEos7BQLyrxnurBqe1aU3JHmQXLD32xQA69qTphvRf9
wlYSVm4eitg5QgGQPgwMilAb51O+3gQkDuas8uUS9LbRcnauhjNw4zabM4L8tOT2
fW4JhUxxw39fj+/GBtX2
=ilXy
-----END PGP SIGNATURE-----
Merge tag 'gfs2-v5.2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fix from Andreas Gruenbacher:
"A revert for a patch that turned out to be broken"
* tag 'gfs2-v5.2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
Revert "gfs2: Replace gl_revokes with a GLF flag"
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXPkV+wAKCRDh3BK/laaZ
PMGyAQDq6ry0bTRPIL52Ek+eRS/pi3bIsH96e22Q6W/NrAQEfwD+NtFZneAW/Tux
AuKIRWqS7UdqCjLurwMHfR9bOHrBDQI=
=z7pu
-----END PGP SIGNATURE-----
Merge tag 'ovl-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
"Here's one fix for a class of bugs triggered by syzcaller, and one
that makes xfstests fail less"
* tag 'ovl-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: doc: add non-standard corner cases
ovl: detect overlapping layers
ovl: support the FS_IOC_FS[SG]ETXATTR ioctls
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXPjJMAAKCRDh3BK/laaZ
PDzlAP9CgHZsgCVfB5afSb9rqY9Fdzr3LxSOwaCXavA5XGJAVQEAhjldnlMOjEvO
LrDEPG3zziJuQgCmMJ9xXoBYYjkCwgo=
=nff/
-----END PGP SIGNATURE-----
Merge tag 'fuse-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"This fixes a leaked inode lock in an error cleanup path and a data
consistency issue with copy_file_range().
It also adds a new flag for the WRITE request that allows userspace
filesystems to clear suid/sgid bits on the file if necessary"
* tag 'fuse-fixes-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: extract helper for range writeback
fuse: fix copy_file_range() in the writeback case
fuse: add FUSE_WRITE_KILL_PRIV
fuse: fallocate: fix return with locked inode
Stable bugfixes:
- SUNRPC: Fix regression in umount of a secure mount
- SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
- NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
- NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
Other bugfixes:
- xprtrdma: Use struct_size() in kzalloc()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlz4KzoACgkQ18tUv7Cl
QOt7OhAAvG+DVZ6V5+q4zvabKgoievlL56Ys4SaAp3+OlxC6VaiyQUDs/6U9C/xH
dmVbGYWdFXjqJE1JPXxmu0jOdRiZcnhIq+hiHNOK0qZOBCnE5zzZ1r1tdNY0GHQ2
JOkREqsXsaeUWuO0pCY7JOmzd5aU1XLhg1/8+9Z7gNwamMfwLkEqi7FGtXi+xsGz
gQVxMJlHsV2F21IKdKS0TJrcqr2okya/MnOQRbbMC2RT/MYNxDrhAPBJ1Shcx3HB
NlccAn4jhIL0bCPRvFPib6KrO01U0Ye/KECN8j2qHRT4QS2s0dsnnQ6f2tEs9mJ8
cRTVh1uniF6ZuDxSr6KIIN3mKA9DX2SK83H16ahAaRBLM8dwF+4MIr6gDdtJvsVw
nY0YDpnAaKFypuCPBV/jFu7fk97hul4ntymJGVeFdlqu/HtWs1Z1iM93DDVJbKr8
a3AND6woOQ2asvySPo+X66PKt79gofga4C+ZDuMfJax8+K9imqIyforgLrAmd/yL
sGAlLzenf6fmOB5C1bPTtrFFbs6XiHXMidDGwmm1kOZIDuN+O2TTwc24gyXx0IyJ
OhmjDn2CKmzS2WVPhetRgurzkdTigJu4PebC421qWSFhxlf/NfghW+rpM9su/hwv
/r9+bpdjZ8YD5FUJvxsX4NZLr+SWbNTzX/ARNdRFsGr0NLpG/50=
=rrp/
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-5.2-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client fixes from Anna Schumaker:
"These are mostly stable bugfixes found during testing, many during the
recent NFS bake-a-thon.
Stable bugfixes:
- SUNRPC: Fix regression in umount of a secure mount
- SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
- NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
- NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
Other bugfixes:
- xprtrdma: Use struct_size() in kzalloc()"
* tag 'nfs-for-5.2-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFSv4.1: Fix bug only first CB_NOTIFY_LOCK is handled
NFSv4.1: Again fix a race where CB_NOTIFY_LOCK fails to wake a waiter
SUNRPC: Fix a use after free when a server rejects the RPCSEC_GSS credential
SUNRPC fix regression in umount of a secure mount
xprtrdma: Use struct_size() in kzalloc()
In following sequences, child devices created while removing mdev parent
device can be left out, or it may lead to race of removing half
initialized child mdev devices.
issue-1:
--------
cpu-0 cpu-1
----- -----
mdev_unregister_device()
device_for_each_child()
mdev_device_remove_cb()
mdev_device_remove()
create_store()
mdev_device_create() [...]
device_add()
parent_remove_sysfs_files()
/* BUG: device added by cpu-0
* whose parent is getting removed
* and it won't process this mdev.
*/
issue-2:
--------
Below crash is observed when user initiated remove is in progress
and mdev_unregister_driver() completes parent unregistration.
cpu-0 cpu-1
----- -----
remove_store()
mdev_device_remove()
active = false;
mdev_unregister_device()
parent device removed.
[...]
parents->ops->remove()
/*
* BUG: Accessing invalid parent.
*/
This is similar race like create() racing with mdev_unregister_device().
BUG: unable to handle kernel paging request at ffffffffc0585668
PGD e8f618067 P4D e8f618067 PUD e8f61a067 PMD 85adca067 PTE 0
Oops: 0000 [#1] SMP PTI
CPU: 41 PID: 37403 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6
Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016
RIP: 0010:mdev_device_remove+0xfa/0x140 [mdev]
Call Trace:
remove_store+0x71/0x90 [mdev]
kernfs_fop_write+0x113/0x1a0
vfs_write+0xad/0x1b0
ksys_write+0x5a/0xe0
do_syscall_64+0x5a/0x210
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Therefore, mdev core is improved as below to overcome above issues.
Wait for any ongoing mdev create() and remove() to finish before
unregistering parent device.
This continues to allow multiple create and remove to progress in
parallel for different mdev devices as most common case.
At the same time guard parent removal while parent is being accessed by
create() and remove() callbacks.
create()/remove() and unregister_device() are synchronized by the rwsem.
Refactor device removal code to mdev_device_remove_common() to avoid
acquiring unreg_sem of the parent.
Fixes: 7b96953bc6 ("vfio: Mediated device Core driver")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
If device is removal is initiated by two threads as below, mdev core
attempts to create a syfs remove file on stale device.
During this flow, below [1] call trace is observed.
cpu-0 cpu-1
----- -----
mdev_unregister_device()
device_for_each_child
mdev_device_remove_cb
mdev_device_remove
user_syscall
remove_store()
mdev_device_remove()
[..]
unregister device();
/* not found in list or
* active=false.
*/
sysfs_create_file()
..Call trace
Now that mdev core follows correct device removal sequence of the linux
bus model, remove shouldn't fail in normal cases. If it fails, there is
no point of creating a stale file or checking for specific error status.
kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327
sysfs_create_file_ns+0x7f/0x90
kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted
5.1.0-rc6-vdevbus+ #6
kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b
08/09/2016
kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90
kernel: Call Trace:
kernel: remove_store+0xdc/0x100 [mdev]
kernel: kernfs_fop_write+0x113/0x1a0
kernel: vfs_write+0xad/0x1b0
kernel: ksys_write+0x5a/0xe0
kernel: do_syscall_64+0x5a/0x210
kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Currently, the process issuing a "start" command on the pktgen procfs
interface, acquires the pktgen thread lock and never release it, until
all pktgen threads are completed. The above can blocks indefinitely any
other pktgen command and any (even unrelated) netdevice removal - as
the pktgen netdev notifier acquires the same lock.
The issue is demonstrated by the following script, reported by Matteo:
ip -b - <<'EOF'
link add type dummy
link add type veth
link set dummy0 up
EOF
modprobe pktgen
echo reset >/proc/net/pktgen/pgctrl
{
echo rem_device_all
echo add_device dummy0
} >/proc/net/pktgen/kpktgend_0
echo count 0 >/proc/net/pktgen/dummy0
echo start >/proc/net/pktgen/pgctrl &
sleep 1
rmmod veth
Fix the above releasing the thread lock around the sleep call.
Additionally we must prevent racing with forcefull rmmod - as the
thread lock no more protects from them. Instead, acquire a self-reference
before waiting for any thread. As a side effect, running
rmmod pktgen
while some thread is running now fails with "module in use" error,
before this patch such command hanged indefinitely.
Note: the issue predates the commit reported in the fixes tag, but
this fix can't be applied before the mentioned commit.
v1 -> v2:
- no need to check for thread existence after flipping the lock,
pktgen threads are freed only at net exit time
-
Fixes: 6146e6a43b ("[PKTGEN]: Removes thread_{un,}lock() macros.")
Reported-and-tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
with fixes for adfs:
- factor out filename comparison, so we can be sure that adfs_compare()
(used for namei compare) and adfs_match() (used for lookup) have the
same behaviour.
- factor out filename lowering (which is not the same as tolower() which
will lower top-bit-set characters) to ensure that we have the same
behaviour when comparing filenames as when we hash them.
- factor out the object fixups, so we are applying all fixups to
directory objects in the same way, independent of the disk format.
- factor out the object name fixup (into the previously factored out
function) to ensure that filenames are appropriately translated -
for example, adfs allows '/' in filenames, which being the Unix path
separator, need to be translated to a different character, which is
normally '.' (DOS 8.3 filenames represent the . as a / on adfs, so
this is the expected reverse translation.)
- remove filename truncation; Al asked about this and apparently the
decision is to remove it. In any case, adfs's truncation was buggy,
so this rids us of that bug by removing the truncation feature.
- we now have only one location which adds the "filetype" suffix to the
filename, so there's no point that code being out of line.
- since we translate '/' into '.', an adfs filename of "/" or "//" would
end up being translated to "." and ".." which have special meanings.
In this case, change the first character to "^" to avoid these special
directory names being abused.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAXPD4oPTnkBvkraxkAQK35Q//bql8SDnF9sqy8YlVUmAgoIMdyQtiFuD7
5UgRBPDl4bIm2Y+mtuYjU/u3nq6tZit0HukybUvD5yA64Opy1Ahkf8OB/f0f6TTU
xjx55/D9QRj4loGxXHM6PuEO9GX+pIsYPFQoufPYq7hksQB2y1ETkjENk4W4PY2m
gvbtQqQmz/B+G7PZvrMsZQV/BwYF3vhP8S/qLbgl3PAbciVofruXtPJNRxgOL8ot
hbOIfT5x30YZpILzXqDZJq4mviWPru+FVJ1uIW1Nd5s8T/9seICxXMjFaMQJUSMn
oIHCzC1WlP4uRbjwmJ+lyLlEPyYrgYN3+H1FcIO0MTYfBXwYZrVdFLW4TtWBracc
8bRa+p9jeRe9jdlKpGaX12a4W7xQ3SmB2i8UFE+/epnBqPvuDPOM0h19XK+FclTH
fAg7Ej1uBC1ROkTlW4OXBsLakXvIIka859DQYQduVKDw8kTSH4QyTdAE7qvOGz/d
Y0XMeIUz+U1izVJ8ShHCVyttXnkBkC6Xwoc6RY3IET++Fu0hXCMdQMSf6Ta8zJjA
EUAdb3GLYdJTqX6Oy9NTUd42GPnZwR5KMUOJd6v9ETU7gLDRDyu9ILpgrF+vFaKj
Xnf7B+D4l1jdB5cU/MYzfzF7Ky80vDHjVr62PSvtb5X9F0pHOINgZJ5an8n80bDc
dxQq92h9hCI=
=mQU/
-----END PGP SIGNATURE-----
Merge tag 'for-rc-adfs' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ADFS cleanups/fixes from Russell King:
"As a result of some of Al Viro's great work, here are a few cleanups
with fixes for adfs:
- factor out filename comparison, so we can be sure that
adfs_compare() (used for namei compare) and adfs_match() (used for
lookup) have the same behaviour.
- factor out filename lowering (which is not the same as tolower()
which will lower top-bit-set characters) to ensure that we have the
same behaviour when comparing filenames as when we hash them.
- factor out the object fixups, so we are applying all fixups to
directory objects in the same way, independent of the disk format.
- factor out the object name fixup (into the previously factored out
function) to ensure that filenames are appropriately translated -
for example, adfs allows '/' in filenames, which being the Unix
path separator, need to be translated to a different character,
which is normally '.' (DOS 8.3 filenames represent the . as a / on
adfs, so this is the expected reverse translation.)
- remove filename truncation; Al asked about this and apparently the
decision is to remove it. In any case, adfs's truncation was buggy,
so this rids us of that bug by removing the truncation feature.
- we now have only one location which adds the "filetype" suffix to
the filename, so there's no point that code being out of line.
- since we translate '/' into '.', an adfs filename of "/" or "//"
would end up being translated to "." and ".." which have special
meanings. In this case, change the first character to "^" to avoid
these special directory names being abused"
* tag 'for-rc-adfs' of git://git.armlinux.org.uk/~rmk/linux-arm:
fs/adfs: fix filename fixup handling for "/" and "//" names
fs/adfs: move append_filetype_suffix() into adfs_object_fixup()
fs/adfs: remove truncated filename hashing
fs/adfs: factor out filename fixup
fs/adfs: factor out object fixups
fs/adfs: factor out filename case lowering
fs/adfs: factor out filename comparison
Use a safe strscpy call to copy the ethtool stat strings into the
relevant buffers, instead of a memcpy that will be accessing
out-of-bound data.
Fixes: 118d6298f6 ("net: mvpp2: add ethtool GOP statistics")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the following tests last for several hours, the problem will occur.
Server:
rds-stress -r 1.1.1.16 -D 1M
Client:
rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M -T 30
The following will occur.
"
Starting up....
tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us cpu
%
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
"
>From vmcore, we can find that clean_list is NULL.
>From the source code, rds_mr_flushd calls rds_ib_mr_pool_flush_worker.
Then rds_ib_mr_pool_flush_worker calls
"
rds_ib_flush_mr_pool(pool, 0, NULL);
"
Then in function
"
int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool,
int free_all, struct rds_ib_mr **ibmr_ret)
"
ibmr_ret is NULL.
In the source code,
"
...
list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail);
if (ibmr_ret)
*ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode);
/* more than one entry in llist nodes */
if (clean_nodes->next)
llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list);
...
"
When ibmr_ret is NULL, llist_entry is not executed. clean_nodes->next
instead of clean_nodes is added in clean_list.
So clean_nodes is discarded. It can not be used again.
The workqueue is executed periodically. So more and more clean_nodes are
discarded. Finally the clean_list is NULL.
Then this problem will occur.
Fixes: 1bc144b625 ("net, rds, Replace xlist in net/rds/xlist.h with llist")
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Olivier Matz says:
====================
ipv6: fix EFAULT on sendto with icmpv6 and hdrincl
The following code returns EFAULT (Bad address):
s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1);
sendto(ipv6_icmp6_packet, addr); /* returns -1, errno = EFAULT */
The problem is fixed in the second patch. The first one aligns the
code to ipv4, to avoid a race condition in the second patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The following code returns EFAULT (Bad address):
s = socket(AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);
setsockopt(s, SOL_IPV6, IPV6_HDRINCL, 1);
sendto(ipv6_icmp6_packet, addr); /* returns -1, errno = EFAULT */
The IPv4 equivalent code works. A workaround is to use IPPROTO_RAW
instead of IPPROTO_ICMPV6.
The failure happens because 2 bytes are eaten from the msghdr by
rawv6_probe_proto_opt() starting from commit 19e3c66b52 ("ipv6
equivalent of "ipv4: Avoid reading user iov twice after
raw_probe_proto_opt""), but at that time it was not a problem because
IPV6_HDRINCL was not yet introduced.
Only eat these 2 bytes if hdrincl == 0.
Fixes: 715f504b11 ("ipv6: add IPV6_HDRINCL option for raw sockets")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it was done in commit 8f659a03a0 ("net: ipv4: fix for a race
condition in raw_sendmsg") and commit 20b50d7997 ("net: ipv4: emulate
READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()") for ipv4, copy the
value of inet->hdrincl in a local variable, to avoid introducing a race
condition in the next commit.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 87fd125344 ("nvme-rdma: remove redundant reference between
ib_device and tagset") caused a kernel panic when disconnecting from an
inaccessible controller (disconnect during re-connection).
--
nvme nvme0: Removing ctrl: NQN "testnqn1"
nvme_rdma: nvme_rdma_exit_request: hctx 0 queue_idx 1
BUG: unable to handle kernel paging request at 0000000080000228
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
...
Call Trace:
blk_mq_exit_hctx+0x5c/0xf0
blk_mq_exit_queue+0xd4/0x100
blk_cleanup_queue+0x9a/0xc0
nvme_rdma_destroy_io_queues+0x52/0x60 [nvme_rdma]
nvme_rdma_shutdown_ctrl+0x3e/0x80 [nvme_rdma]
nvme_do_delete_ctrl+0x53/0x80 [nvme_core]
nvme_sysfs_delete+0x45/0x60 [nvme_core]
kernfs_fop_write+0x105/0x180
vfs_write+0xad/0x1a0
ksys_write+0x5a/0xd0
do_syscall_64+0x55/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fa215417154
--
The reason for this crash is accessing an already freed ib_device for
performing dma_unmap during exit_request commands. The root cause for
that is that during re-connection all the queues are destroyed and
re-created (and the ib_device is reference counted by the queues and
freed as well) but the tagset stays alive and all the DMA mappings (that
we perform in init_request) kept in the request context. The original
commit fixed a different bug that was introduced during bonding (aka nic
teaming) tests that for some scenarios change the underlying ib_device
and caused memory leakage and possible segmentation fault. This commit
is a complementary commit that also changes the wrong DMA mappings that
were saved in the request context and making the request sqe dma
mappings dynamic with the command lifetime (i.e. mapped in .queue_rq and
unmapped in .complete). It also fixes the above crash of accessing freed
ib_device during destruction of the tagset.
Fixes: 87fd125344 ("nvme-rdma: remove redundant reference between ib_device and tagset")
Reported-by: Jim Harris <james.r.harris@intel.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
The Number of Namespaces (nn) field in the identify controller data structure is
defined as u32 and the maximum allowed value in NVMe specification is
0xFFFFFFFEUL. This change fixes the possible overflow of the DIV_ROUND_UP()
operation used in nvme_scan_ns_list() by casting the nn to u64.
Signed-off-by: Jaesoo Lee <jalee@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
This patch addresses below two issues and prepares the code to address
3rd issue listed below.
1. mdev device is placed on the mdev bus before it is created in the
vendor driver. Once a device is placed on the mdev bus without creating
its supporting underlying vendor device, mdev driver's probe() gets
triggered. However there isn't a stable mdev available to work on.
create_store()
mdev_create_device()
device_register()
...
vfio_mdev_probe()
[...]
parent->ops->create()
vfio_ap_mdev_create()
mdev_set_drvdata(mdev, matrix_mdev);
/* Valid pointer set above */
Due to this way of initialization, mdev driver who wants to use the mdev,
doesn't have a valid mdev to work on.
2. Current creation sequence is,
parent->ops_create()
groups_register()
Remove sequence is,
parent->ops->remove()
groups_unregister()
However, remove sequence should be exact mirror of creation sequence.
Once this is achieved, all users of the mdev will be terminated first
before removing underlying vendor device.
(Follow standard linux driver model).
At that point vendor's remove() ops shouldn't fail because taking the
device off the bus should terminate any usage.
3. When remove operation fails, mdev sysfs removal attempts to add the
file back on already removed device. Following call trace [1] is observed.
[1] call trace:
kernel: WARNING: CPU: 2 PID: 9348 at fs/sysfs/file.c:327 sysfs_create_file_ns+0x7f/0x90
kernel: CPU: 2 PID: 9348 Comm: bash Kdump: loaded Not tainted 5.1.0-rc6-vdevbus+ #6
kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016
kernel: RIP: 0010:sysfs_create_file_ns+0x7f/0x90
kernel: Call Trace:
kernel: remove_store+0xdc/0x100 [mdev]
kernel: kernfs_fop_write+0x113/0x1a0
kernel: vfs_write+0xad/0x1b0
kernel: ksys_write+0x5a/0xe0
kernel: do_syscall_64+0x5a/0x210
kernel: entry_SYSCALL_64_after_hwframe+0x49/0xbe
Therefore, mdev core is improved in following ways.
1. Split the device registration/deregistration sequence so that some
things can be done between initialization of the device and hooking it
up to the bus respectively after deregistering it from the bus but
before giving up our final reference.
In particular, this means invoking the ->create() and ->remove()
callbacks in those new windows. This gives the vendor driver an
initialized mdev device to work with during creation.
At the same time, a bus driver who wish to bind to mdev driver also
gets initialized mdev device.
This follows standard Linux kernel bus and device model.
2. During remove flow, first remove the device from the bus. This
ensures that any bus specific devices are removed.
Once device is taken off the mdev bus, invoke remove() of mdev
from the vendor driver.
3. The driver core device model provides way to register and auto
unregister the device sysfs attribute groups at dev->groups.
Make use of dev->groups to let core create the groups and eliminate
code to avoid explicit groups creation and removal.
To ensure, that new sequence is solid, a below stack dump of a
process is taken who attempts to remove the device while device is in
use by vfio driver and user application.
This stack dump validates that vfio driver guards against such device
removal when device is in use.
cat /proc/21962/stack
[<0>] vfio_del_group_dev+0x216/0x3c0 [vfio]
[<0>] mdev_remove+0x21/0x40 [mdev]
[<0>] device_release_driver_internal+0xe8/0x1b0
[<0>] bus_remove_device+0xf9/0x170
[<0>] device_del+0x168/0x350
[<0>] mdev_device_remove_common+0x1d/0x50 [mdev]
[<0>] mdev_device_remove+0x8c/0xd0 [mdev]
[<0>] remove_store+0x71/0x90 [mdev]
[<0>] kernfs_fop_write+0x113/0x1a0
[<0>] vfs_write+0xad/0x1b0
[<0>] ksys_write+0x5a/0xe0
[<0>] do_syscall_64+0x5a/0x210
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<0>] 0xffffffffffffffff
This prepares the code to eliminate calling device_create_file() in
subsequent patch.
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When we call snd_soc_component_set_jack(component, NULL, NULL) we should
set rt274->jack to passed jack, so when interrupt is triggered it calls
snd_soc_jack_report(rt274->jack, ...) with proper value.
This fixes problem in machine where in register, we call
snd_soc_register(component, &headset, NULL), which just calls
rt274_mic_detect via callback.
Now when machine driver is removed "headset" will be gone, so we
need to tell codec driver that it's gone with:
snd_soc_register(component, NULL, NULL), but we also need to be able
to handle NULL jack argument here gracefully.
If we don't set it to NULL, next time the rt274_irq runs it will call
snd_soc_jack_report with first argument being invalid pointer and there
will be Oops.
Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
During the integration of HDaudio support, we changed the way in which
we get hdev in snd_hdac_ext_bus_device_init() to use one preallocated
with devm_kzalloc(), however it still left kfree(hdev) in
snd_hdac_ext_bus_device_exit(). It leads to oopses when trying to
rmmod and modprobe. Fix it, by just removing kfree call.
SOF also uses some of the snd_hdac_ functions for HDAudio support but
allocated the memory with kzalloc. A matching fix is provided
separately to align all users of the snd_hdac_ library.
Fixes: 6298542fa3 ("ALSA: hdac: remove memory allocation from snd_hdac_ext_bus_device_init")
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
There are already defined ppcap and ppcap interrupt functions, use
the already defined functions for easy code read.
Fixes: 8a300c8fb1 ("ASoC: SOF: Intel: Add HDA controller for Intel DSP")
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Zhu Yingjiang <yingjiang.zhu@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Kernel crashes when an ASoC component rebinding.
The dai_link->platforms has been reset to NULL by soc_cleanup_platform()
in soc_cleanup_card_resources() when un-registering component. However,
it has no chance to re-allocate the dai_link->platforms when registering
the component again.
Move the DAI pre-links initiation from snd_soc_register_card() to
snd_soc_instantiate_card() to make sure all DAI pre-links get initiated
when component rebinding.
As an example, by using the following commands:
- echo -n max98357a > /sys/bus/platform/drivers/max98357a/unbind
- echo -n max98357a > /sys/bus/platform/drivers/max98357a/bind
Got the error message:
"Unable to handle kernel NULL pointer dereference at virtual address".
The call trace:
snd_soc_is_matching_component+0x30/0x6c
soc_bind_dai_link+0x16c/0x240
snd_soc_bind_card+0x1e4/0xb10
snd_soc_add_component+0x270/0x300
snd_soc_register_component+0x54/0x6c
Signed-off-by: Tzung-Bi Shih <tzungbi@google.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.
Fixes: f403906da0 ("ASoC: Intel: cht_bsw_rt5672: platform name fixup support")
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.
Fixes: 4506db8043 ("ASoC: Intel: cht_bsw_nau8824: platform name fixup support")
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.
Fixes: e4bc6b1195 ("ASoC: Intel: bytcht_es8316: platform name fixup support")
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The platform override code uses devm_ functions to allocate memory for
the new name but the card device is not initialized. Fix by moving the
init earlier.
Fixes: 7e7e24d7c7 ("ASoC: Intel: cht_bsw_max98090_ti: platform name fixup support")
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Commit 73118ca8ba introduced a glock reference counting bug in
gfs2_trans_remove_revoke. Given that, replacing gl_revokes with a GLF flag is
no longer useful, so revert that commit.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Oded writes:
This tag contains the following fixes:
- Fix the code that checks whether we can use 2MB page size when mapping
memory in the ASIC's MMU. The current code had a "hole" which happened
in architectures other then x86-64.
- Fix the debugfs interface to read/write from/to the device using device
virtual addresses. There was a bug in the translation regarding
addresses that were mapped using 2MB page size.
- Fix a bug in the debug/profiling code, where the code didn't read the
full address but only the lower 32-bits of the address.
* tag 'misc-habanalabs-fixes-2019-06-06' of git://people.freedesktop.org/~gabbayo/linux:
habanalabs: Read upper bits of trace buffer from RWPHI
habanalabs: Fix virtual address access via debugfs for 2MB pages
habanalabs: fix bug in checking huge page optimization
Since GCC 9, the compiler warns about evolution of the
platform-specific ABI, in particular relating for the marshaling of
certain structures involving bitfields.
The kernel is a standalone binary, and of course nobody would be
so stupid as to expose structs containing bitfields as function
arguments in ABI. (Passing a pointer to such a struct, however
inadvisable, should be unaffected by this change. perf and various
drivers rely on that.)
So these warnings do more harm than good: turn them off.
We may miss warnings about future ABI drift, but that's too bad.
Future ABI breaks of this class will have to be debugged and fixed
the traditional way unless the compiler evolves finer-grained
diagnostics.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
According to the found documentation, data cache flushes and sync
instructions are needed on the PCX-U+ (PA8200, e.g. C200/C240)
platforms, while PCX-W (PA8500, e.g. C360) platforms aparently don't
need those flushes when changing the IO PDIR data structures.
We have no documentation for PCX-W+ (PA8600) and PCX-W2 (PA8700) CPUs,
but Carlo Pisani reported that his C3600 machine (PA8600, PCX-W+) fails
when the fdc instructions were removed. His firmware didn't set the NIOP
bit, so one may assume it's a firmware bug since other C3750 machines
had the bit set.
Even if documentation (as mentioned above) states that PCX-W (PA8500,
e.g. J5000) does not need fdc flushes, Sven could show that an Adaptec
29320A PCI-X SCSI controller reliably failed on a dd command during the
first five minutes in his J5000 when fdc flushes were missing.
Going forward, we will now NOT replace the fdc and sync assembler
instructions by NOPS if:
a) the NP iopdir_fdc bit was set by firmware, or
b) we find a CPU up to and including a PCX-W+ (PA8600).
This fixes the HPMC crashes on a C240 and C36XX machines. For other
machines we rely on the firmware to set the bit when needed.
In case one finds HPMC issues, people could try to boot their machines
with the "no-alternatives" kernel option to turn off any alternative
patching.
Reported-by: Sven Schnelle <svens@stackframe.org>
Reported-by: Carlo Pisani <carlojpisani@gmail.com>
Tested-by: Sven Schnelle <svens@stackframe.org>
Fixes: 3847dab774 ("parisc: Add alternative coding infrastructure")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 5.0+
Most I/O in the kernel is done using the kernel offset mapping.
However, there is one API that uses aliased kernel address ranges:
> The final category of APIs is for I/O to deliberately aliased address
> ranges inside the kernel. Such aliases are set up by use of the
> vmap/vmalloc API. Since kernel I/O goes via physical pages, the I/O
> subsystem assumes that the user mapping and kernel offset mapping are
> the only aliases. This isn't true for vmap aliases, so anything in
> the kernel trying to do I/O to vmap areas must manually manage
> coherency. It must do this by flushing the vmap range before doing
> I/O and invalidating it after the I/O returns.
For this reason, we should use the hardware lpa instruction to load the
physical address of kernel virtual addresses in the driver code.
I believe we only use the vmap/vmalloc API with old PA 1.x processors
which don't have a sba, so we don't hit this problem.
Tested on c3750, c8000 and rp3440.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Remove the CONFIG_UEVENT_HELPER_PATH because:
1. It is disabled since commit 1be01d4a57 ("driver: base: Disable
CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was
made default to 'n',
2. It is not recommended (help message: "This should not be used today
[...] creates a high system load") and was kept only for ancient
userland,
3. Certain userland specifically requests it to be disabled (systemd
README: "Legacy hotplug slows down the system and confuses udev").
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Helge Deller <deller@gmx.de>
We only support I/O to kernel space. Using %sr1 to load the coherence
index may be racy unless interrupts are disabled. This patch changes the
code used to load the coherence index to use implicit space register
selection. This saves one instruction and eliminates the race.
Tested on rp3440, c8000 and c3750.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org
Signed-off-by: Helge Deller <deller@gmx.de>
Fix a s/TIF_SECOMP/TIF_SECCOMP/ comment typo
Cc: Jiri Kosina <trivial@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org
Signed-off-by: George G. Davis <george_davis@mentor.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
We need to check whether drm_atomic_get_crtc_state() returns an error
pointer before dereferencing "crtc_st".
Fixes: 9e56030941 ("drm/komeda: Add komeda_plane/plane_helper_funcs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: "james qian wang (Arm Technology China)" <james.qian.wang@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/arm/display/komeda/komeda_plane.c: In function komeda_plane_atomic_check:
drivers/gpu/drm/arm/display/komeda/komeda_plane.c:49:22: warning: variable kcrtc set but not used [-Wunused-but-set-variable]
It is never used since introduction in
commit 9e56030941 ("drm/komeda: Add komeda_plane/plane_helper_funcs")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Some chips have attributes which exist on more than one page but the
attribute is not presently marked as paged. This causes the attributes
to be generated with the same label, which makes it impossible for
userspace to tell them apart.
Marking all such attributes as paged would result in the page suffix
being added regardless of whether they were present on more than one
page or not, which might break existing setups. Therefore, we add a
second check which treats the attribute as paged, even if not marked as
such, if it is present on multiple pages.
Fixes: b4ce237b7f ("hwmon: (pmbus) Introduce infrastructure to detect sensors and limit registers")
Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
update_lock is a mutex intended to protect write operations. It was not
taken, however, when _pmbus_write_word_data is called from
pmbus_set_samples() function which may cause problems especially when
some PMBUS_VIRT_* operation is implemented as a read-modify-write cycle.
This patch makes sure the lock is held during the operation.
Fixes: 49c4455dcc ("hwmon: (pmbus) Introduce PMBUS_VIRT_*_SAMPLES registers")
Signed-off-by: Krzysztof Adamski <krzysztof.adamski@nokia.com>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
[groeck: Declared and initialized missing 'data' variable]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>