The down condition should be the negation of the wake condition,
IOW when I moved it from:
if (cond && wake())
to
if (__netif_txq_completed_wake(cond))
Cond should have been negated. Flip it now.
This bug leads to occasional crashes with netconsole.
It may also lead to queue never waking up in case BQL is not enabled.
Reported-by: David Wei <davidhwei@meta.com>
Fixes: 08a096780d ("bnxt: use new queue try_stop/try_wake macros")
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20230607010826.960226-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZIDxUwAKCRDbK58LschI
g5hDAQD7ukrniCvMRNIm2yUZIGSxE4RvGiXptO4a0NfLck5R/wEAsfN2KUsPcPhW
HS37lVfx7VVXfj42+REf7lWLu4TXpwk=
=6mS/
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-06-07
We've added 7 non-merge commits during the last 7 day(s) which contain
a total of 12 files changed, 112 insertions(+), 7 deletions(-).
The main changes are:
1) Fix a use-after-free in BPF's task local storage, from KP Singh.
2) Make struct path handling more robust in bpf_d_path, from Jiri Olsa.
3) Fix a syzbot NULL-pointer dereference in sockmap, from Eric Dumazet.
4) UAPI fix for BPF_NETFILTER before final kernel ships,
from Florian Westphal.
5) Fix map-in-map array_map_gen_lookup code generation where elem_size was
not being set for inner maps, from Rhys Rustad-Elliott.
6) Fix sockopt_sk selftest's NETLINK_LIST_MEMBERSHIPS assertion,
from Yonghong Song.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add extra path pointer check to d_path helper
selftests/bpf: Fix sockopt_sk selftest
bpf: netfilter: Add BPF_NETFILTER bpf_attach_type
selftests/bpf: Add access_inner_map selftest
bpf: Fix elem_size not being set for inner maps
bpf: Fix UAF in task local storage
bpf, sockmap: Avoid potential NULL dereference in sk_psock_verdict_data_ready()
====================
Link: https://lore.kernel.org/r/20230607220514.29698-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
irq_cpu_rmap_release() calls cpu_rmap_put(), which may free the rmap.
So we need to clear the pointer to our glue structure in rmap before
doing that, not after.
Fixes: 4e0473f106 ("lib: cpu_rmap: Avoid use after free on rmap->obj array entries")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZHo0vwquhOy3FaXc@decadent.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eswitch vport is needed for eswitch manager when creating LAG,
to create egress rules. However, this was not handled when ECPF is
an eswitch manager.
Signed-off-by: Bodong Wang <bodong@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Commit bffaa91658 ("net/mlx5: E-Switch, Add control for inline mode")
added inline mode checking to esw_offloads_start() with a warning
printed out in case there is a problem. Tne inline mode checking was
done even after mlx5_eswitch_enable_locked() call failed, which is
pointless.
Later on, commit 8c98ee77d9 ("net/mlx5e: E-Switch, Add extack messages
to devlink callbacks") converted the error/warning prints to extack
setting, which caused that the inline mode check error to overwrite
possible previous extack message when mlx5_eswitch_enable_locked()
failed. User then gets confusing error message.
Fix this by skipping check of inline mode after
mlx5_eswitch_enable_locked() call failed.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Currently, a temp object is filled and used as a key for rhashtable_lookup.
Lookups will only works while key remains the first attribute in the
relevant rhashtable node object.
Fix this by passing a key, instead of a object containing the key.
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Remove unused definitions left after the removal
of the RX page cache feature.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add generated_pkt_steering_fail and handled_pkt_steering_fail to devlink
heatlth reporter.
generated_pkt_steering_fail indicates the number of packets dropped due to
illegal steering operation within the vport steering domain.
handled_pkt_steering_fail indicates the number of packets dropped due to
illegal steering operation, originated by the vport.
Also, update devlink reporter functionality documentation with the newly
exposed counters.
Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Now, after all preparation are done, enable 4 ports VF LAG
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
multiport eswitch LAG is not supported over more than two ports. Add a check in
order to block multiport eswitch LAG over such devices.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
multipath LAG is not supported over more than two ports. Add a check in
order to block multipath LAG over such configurations.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
mlx5_shared_fdb_supported() is used only in a single file. Change the
function to be static.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Shared FDB handling is using the assumption that shared FDB can only
be created from two devices.
In order to support shared FDB of more than two devices, iterate over
all LAG ports instead of hard coding only the first two LAG ports
whenever handling shared FDB.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Shared FDB LAG can only work if all eswitches are paired.
Also, whenever two eswitches are paired, devcom is marked as ready.
Therefore, in case of device with two eswitches, checking devcom was
sufficient. However, this is not correct for device with more than
two eswitches, which will be introduced in downstream patch.
Hence, check all eswitches are paired explicitly.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Introduce a generic APIs to iterate over all the devices which are part
of the LAG. This API replace mlx5_lag_get_peer_mdev() which retrieve
only a single peer device from the lag.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The cited patch introduce ib port for the slave device uplink in
case of multiport eswitch. However, this ib port didn't perform
anything when unloaded.
Unload the new ib port properly.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
- a fix for unbalanced open count for inhibited input devices
- fixups in Elantech PS/2 and Cyppress TTSP v5 drivers
- a quirk to soc_button_array driver to make it work with
Lenovo Yoga Book X90F / X90L
- a removal of erroneous entry from xpad driver.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQST2eWILY88ieB2DOtAj56VGEWXnAUCZH9ziwAKCRBAj56VGEWX
nJ0JAQDkzAz8sD97Ua07O4VtP/wginhrM8GRe0gHQd2Pp+r83AD8C7+4P79OA5K7
jZSy5FXmTQJYctUvvIiQA6KJC3pK7Aw=
=I4bq
-----END PGP SIGNATURE-----
Merge tag 'input-for-v6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- a fix for unbalanced open count for inhibited input devices
- fixups in Elantech PS/2 and Cyppress TTSP v5 drivers
- a quirk to soc_button_array driver to make it work with Lenovo
Yoga Book X90F / X90L
- a removal of erroneous entry from xpad driver
* tag 'input-for-v6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: xpad - delete a Razer DeathAdder mouse VID/PID entry
Input: psmouse - fix OOB access in Elantech protocol
Input: soc_button_array - add invalid acpi_index DMI quirk handling
Input: fix open count when closing inhibited device
Input: cyttsp5 - fix array length
Maxime Chevallier says:
====================
Followup fixes for the dwmac and altera lynx conversion
Here's yet another version of the cleanup series for the TSE PCS replacement
by PCS Lynx. It includes Kconfig fixups, some missing initialisations
and a slight rework suggested by Russell for the dwmac cleanup sequence,
along with more explicit zeroing of local structures as per MAciej's
review.
====================
Link: https://lore.kernel.org/r/20230607135941.407054-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Explicitly zero-ize the local mdio_regmap_config data, and explicitly
set the .autoscan parameter, as we only have a PCS on this bus.
Fixes: 5d1f3fe7d2 ("net: stmmac: dwmac-sogfpga: use the lynx pcs driver")
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Set the .autoscan flag to false on the regmap-mdio bus, to avoid using a
random uninitialized value. We don't want autoscan in this case as the
mdio device is a PCS and not a PHY.
Fixes: db48abbaa1 ("net: ethernet: altera-tse: Convert to mdio-regmap and use PCS Lynx")
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
So far, only the dwmac_socfpga variant of stmmac uses PCS Lynx. Use a
dedicated cleanup sequence for dwmac_socfpga instead of using the
generic stmmac one.
Fixes: 5d1f3fe7d2 ("net: stmmac: dwmac-sogfpga: use the lynx pcs driver")
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use the correct Kconfig dependency for altera_tse as PCS_ALTERA_TSE was
replaced by PCS_LYNX.
Fixes: db48abbaa1 ("net: ethernet: altera-tse: Convert to mdio-regmap and use PCS Lynx")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The regmap_config and mdio_regmap_config objects needs to be zeroed before
using them. This will cause spurious errors at probe time as config->pad_bits
is containing random uninitialized data.
Fixes: db48abbaa1 ("net: ethernet: altera-tse: Convert to mdio-regmap and use PCS Lynx")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski says:
====================
tools: ynl: generate code for the handshake family
Add necessary features and generate user space C code for serializing
/ deserializing messages of the handshake family.
In addition to basics already present in netdev and fou, handshake
has nested attrs and multi-attr u32.
====================
Link: https://lore.kernel.org/r/20230606194302.919343-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When parsing multi-attr we count the objects and then allocate
an array to hold the parsed objects. If an attr space has multiple
multi-attr objects, however, if parsing the first array fails
we'll leave the object count for the second even tho the second
array was never allocated.
This may cause crashes when freeing objects on error.
Count attributes to a variable on the stack and only set the count
in the object once the memory was allocated.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The handshake family needs support for MultiAttr scalars.
Right now we only support code gen for MultiAttr nested
types.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This is overdue and an oversight.
Add myself to this file deespite the fact that I'm trying to reduce the
number of entries in this file which have my name attached, but in the
hope that patches wont get picked up elsewhere completely unreviewed and
unnoticed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the beacon loss work that might cause a disconnect
and the CSA disconnect work to be wiphy work, so we hold
the wiphy lock for them.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Channel switch obviously must be handled per link, and we
have a (potential) deadlock when canceling that work. Use
the new delayed wiphy work to handle this instead and get
rid of the explicit timer that way too.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
SMPS requests are per link, and currently there's a potential
deadlock with canceling. Use the new wiphy work to handle SMPS
instead, so that the cancel cannot deadlock.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since we want to have wiphy_lock() for the unregistration
in the future, unregister also netdevs via cfg80211 now
to be able to hold the wiphy_lock() for it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
We'll need this later to convert other works that might
be cancelled from here, so convert this one first.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add a work abstraction at the cfg80211 level that will always
hold the wiphy_lock() for any work executed and therefore also
can be canceled safely (without waiting) while holding that.
This improves on what we do now as with the new wiphy works we
don't have to worry about locking while cancelling them safely.
Also, don't let such works run while the device is suspended,
since they'll likely need to interact with the device. Flush
them before suspend though.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Sending the wiphy out might cause calls to the driver,
notably get_txq_stats() and get_antenna(). These aren't
very important, since the normally have their own locks
and/or just send out static data, but if the contract
should be that the wiphy lock is always held, these are
also affected. Fix that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Missed this ioctl since it's in wext-sme.c where we
usually get via a front-level ioctl handler in the
other files, but it should also hold the wiphy lock
to align the locking contract towards the driver.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This is a driver callback, and the driver should be able
to assume that it's called with the wiphy lock held. Move
the call up so that's true, it has no other effect since
the device is already unregistering and we cannot reach
this function through other paths.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Most code paths in cfg80211 already hold the wiphy lock,
mostly by virtue of being called from nl80211, so make
the pmsr cleanup worker also hold it, aligning the
locking promises between different parts of cfg80211.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Most code paths in cfg80211 already hold the wiphy lock,
mostly by virtue of being called from nl80211, so make
the auto-disconnect worker also hold it, aligning the
locking promises between different parts of cfg80211.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There are a number of upcoming things in both the stack and
drivers that would otherwise conflict, so merge wireless to
wireless-next to be able to avoid those conflicts.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
kafs incorrectly passes a zero mtime (ie. 1st Jan 1970) to the server when
creating a file, dir or symlink because the mtime recorded in the
afs_operation struct gets passed to the server by the marshalling routines,
but the afs_mkdir(), afs_create() and afs_symlink() functions don't set it.
This gets masked if a file or directory is subsequently modified.
Fix this by filling in op->mtime before calling the create op.
Fixes: e49c7b2f6d ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Anastasios reported crash on stable 5.15 kernel with following
BPF attached to lsm hook:
SEC("lsm.s/bprm_creds_for_exec")
int BPF_PROG(bprm_creds_for_exec, struct linux_binprm *bprm)
{
struct path *path = &bprm->executable->f_path;
char p[128] = { 0 };
bpf_d_path(path, p, 128);
return 0;
}
But bprm->executable can be NULL, so bpf_d_path call will crash:
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
...
RIP: 0010:d_path+0x22/0x280
...
Call Trace:
<TASK>
bpf_d_path+0x21/0x60
bpf_prog_db9cf176e84498d9_bprm_creds_for_exec+0x94/0x99
bpf_trampoline_6442506293_0+0x55/0x1000
bpf_lsm_bprm_creds_for_exec+0x5/0x10
security_bprm_creds_for_exec+0x29/0x40
bprm_execve+0x1c1/0x900
do_execveat_common.isra.0+0x1af/0x260
__x64_sys_execve+0x32/0x40
It's problem for all stable trees with bpf_d_path helper, which was
added in 5.9.
This issue is fixed in current bpf code, where we identify and mark
trusted pointers, so the above code would fail even to load.
For the sake of the stable trees and to workaround potentially broken
verifier in the future, adding the code that reads the path object from
the passed pointer and verifies it's valid in kernel space.
Fixes: 6e22ab9da7 ("bpf: Add d_path helper")
Reported-by: Anastasios Papagiannis <tasos.papagiannnis@gmail.com>
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20230606181714.532998-1-jolsa@kernel.org
try_module_get will be called in tcf_proto_lookup_ops. So module_put needs
to be called to drop the refcount if ops don't implement the required
function.
Fixes: 9f407f1768 ("net: sched: introduce chain templates")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
txgbe_shutdown() relies on txgbe_dev_shutdown() to initialise
wake by passing it by reference. However, txgbe_dev_shutdown()
doesn't use this parameter at all.
wake is then passed uninitialised by txgbe_dev_shutdown()
to pci_wake_from_d3().
Resolve this problem by:
* Removing the unused parameter from txgbe_dev_shutdown()
* Removing the uninitialised variable wake from txgbe_dev_shutdown()
* Passing false to pci_wake_from_d3() - this assumes that
although uninitialised wake was in practice false (0).
I'm not sure that this counts as a bug, as I'm not sure that
it manifests in any unwanted behaviour. But in any case, the issue
was introduced by:
3ce7547e5b ("net: txgbe: Add build support for txgbe")
Flagged by Smatch as:
.../txgbe_main.c:486 txgbe_shutdown() error: uninitialized symbol 'wake'.
No functional change intended.
Compile tested only.
Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes following sparse errors:
net/sched/act_police.c:360:28: warning: dereference of noderef expression
net/sched/act_police.c:362:45: warning: dereference of noderef expression
net/sched/act_police.c:362:45: warning: dereference of noderef expression
net/sched/act_police.c:368:28: warning: dereference of noderef expression
net/sched/act_police.c:370:45: warning: dereference of noderef expression
net/sched/act_police.c:370:45: warning: dereference of noderef expression
net/sched/act_police.c:376:45: warning: dereference of noderef expression
net/sched/act_police.c:376:45: warning: dereference of noderef expression
Fixes: d1967e495a ("net_sched: act_police: add 2 new attributes to support police 64bit rate and peakrate")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>