Commit Graph

1316932 Commits

Author SHA1 Message Date
Petr Machata
15880bec9b selftests: net: fdb_notify: Add a test for FDB notifications
Check that only one notification is produced for various FDB edit
operations.

Regarding the ip_link_add() and ip_link_master() helpers: this pattern of
action plus corresponding defer is bound to come up often, and a dedicated
vocabulary to capture it will be handy. tunnel_create() and vlan_create()
from forwarding/lib.sh are somewhat opaque and perhaps too kitchen-sinky,
so I tried to go in the opposite direction with these ones, and wrapped
only the bare minimum to schedule a corresponding cleanup.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://patch.msgid.link/910c5880ae6d3b558d6889cbdba2be690c2615c6.1731589511.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:39:19 -08:00
Petr Machata
46f6569cf0 selftests: net: lib: Add kill_process
A number of selftests run processes in the background and need to kill them
afterwards. Instead of everyone open-coding the kill / wait / redirect
mantra, add a helper in net/lib.sh. Convert existing open-coded sites.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Link: https://patch.msgid.link/a9db102067d741c118f0bd93b10c75e2a34665ea.1731589511.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:39:19 -08:00
Petr Machata
af76b44318 selftests: net: lib: Move checks from forwarding/lib.sh here
For logging to be useful, something has to set RET and retmsg by calling
ret_set_ksft_status(). There is a suite of functions to that end in
forwarding/lib.sh: check_err, check_fail, et al. Move them to net/lib.sh so
that every net test can use them.

Existing lib.sh users might be using these same names for their functions.
However, lib.sh is always sourced near the top of the file (checked), and
any new definitions will simply override the ones provided by lib.sh.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://patch.msgid.link/f488a00dc85b8e0c1f3c71476b32b21b5189a847.1731589511.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:39:19 -08:00
Petr Machata
601d9d70a4 selftests: net: lib: Move tests_run from forwarding/lib.sh here
It would be good to use the same mechanism for scheduling and dispatching
general net tests as the many forwarding tests already use. To that end,
move tests_run() to net/lib.sh so that every net test can use it.

Existing lib.sh users might be using the name themselves. However, lib.sh is
always sourced near the top of the file (checked), and any new
definition will simply override the one provided by lib.sh.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://patch.msgid.link/a6fc083486493425b2c61185c327845b6ce3233a.1731589511.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:39:19 -08:00
Petr Machata
b219bcfcc9 selftests: net: lib: Move logging from forwarding/lib.sh here
Many net selftests invent their own logging helpers. These really should be
in a library sourced by these tests. Currently forwarding/lib.sh has a
suite of perfectly fine logging helpers, but sourcing a forwarding/ library
from a higher-level directory smells of layering violation. In this patch,
move the logging helpers to net/lib.sh so that every net test can use them.

Together with the logging helpers, it's also necessary to move
pause_on_fail(), and EXIT_STATUS and RET.

Existing lib.sh users might be using these same names for their functions
or variables. However, lib.sh is always sourced near the top of the
file (checked), and any new definitions will simply override the ones
provided by lib.sh.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://patch.msgid.link/edd3785a3bd72ffbe1409300989e993ee50ae98b.1731589511.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:39:19 -08:00
Petr Machata
42575ad5aa ndo_fdb_del: Add a parameter to report whether notification was sent
In a similar fashion to ndo_fdb_add, which was covered in the previous
patch, add the bool *notified argument to ndo_fdb_del. Callees that send a
notification on their own set the flag to true.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/06b1acf4953ef0a5ed153ef1f32d7292044f2be6.1731589511.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:39:18 -08:00
Petr Machata
4b42fbc6bd ndo_fdb_add: Add a parameter to report whether notification was sent
Currently when FDB entries are added to or deleted from a VXLAN netdevice,
the VXLAN driver emits one notification, including the VXLAN-specific
attributes. The core however always sends a notification as well, a generic
one. Thus two notifications are unnecessarily sent for these operations. A
similar situation comes up with the bridge driver, which also emits
notifications on its own:

 # ip link add name vx type vxlan id 1000 dstport 4789
 # bridge monitor fdb &
 [1] 1981693
 # bridge fdb add de:ad:be:ef:13:37 dev vx self dst 192.0.2.1
 de:ad:be:ef:13:37 dev vx dst 192.0.2.1 self permanent
 de:ad:be:ef:13:37 dev vx self permanent

In order to prevent this duplication, add a parameter to ndo_fdb_add,
bool *notified. The flag is primed to false, and if the callee sends a
notification on its own, it sets it to true, thus informing the core that
it should not generate another notification.
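
The flag handshake described above can be sketched in userspace C. This is
only an illustration under stated assumptions: core_fdb_add(),
fdb_add_self_notifying(), fdb_add_plain() and send_notification() are
hypothetical stand-ins, not the kernel's real functions or signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; these are not the kernel's signatures. */
static int notifications_sent;

static void send_notification(const char *who)
{
	printf("notification from %s\n", who);
	notifications_sent++;
}

/* A driver callback that notifies on its own (as VXLAN and bridge do). */
static int fdb_add_self_notifying(bool *notified)
{
	assert(notified != NULL);
	send_notification("driver");
	*notified = true;	/* tell the core not to notify again */
	return 0;
}

/* A driver callback that leaves notification to the core. */
static int fdb_add_plain(bool *notified)
{
	assert(notified != NULL);
	return 0;
}

/* Core path: prime the flag to false, notify only if the callee did not. */
static int core_fdb_add(int (*ndo_fdb_add)(bool *notified))
{
	bool notified = false;
	int err = ndo_fdb_add(&notified);

	if (!err && !notified)
		send_notification("core");
	return err;
}
```

Calling core_fdb_add() with either callback should produce exactly one
notification per operation, which is the behaviour the patch is after.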

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/cbf6ae8195e85cbf922f8058ce4eba770f3b71ed.1731589511.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:39:18 -08:00
Jakub Kicinski
2a8ce470c5 Merge branch 'modifying-format-and-renaming-goto-labels'
Justin Lai says:

====================
Modifying format and renaming goto labels

This patch set primarily involves modifying the enum rtase_registers
format and renaming the goto labels in rtase_init_one.
====================

Link: https://patch.msgid.link/20241114112549.376101-1-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:26:57 -08:00
Justin Lai
39007e1c1c rtase: Modify the content format of the enum rtase_registers
Remove unnecessary spaces.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241114112549.376101-3-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:26:55 -08:00
Justin Lai
fdb5379119 rtase: Modify the name of the goto label
Modify the name of the goto label in rtase_init_one().

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241114112549.376101-2-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:26:55 -08:00
Jakub Kicinski
bf3c76b4c4 Merge branch 'net-netpoll-improve-skb-pool-management'
Breno Leitao says:

====================
net: netpoll: Improve SKB pool management

The netpoll subsystem pre-allocates 32 SKBs in a pool for emergency use
during out-of-memory conditions. However, the current implementation has
several inefficiencies:

 * The SKB pool, once allocated, is never freed:
	 * Resources remain allocated even after netpoll users are removed
	 * Failed initialization can leave pool populated forever
 * The global pool design makes resource tracking difficult

This series addresses these issues through two patches:

Patch 1 ("net: netpoll: Individualize the skb pool"):
 - Replace global pool with per-user pools in netpoll struct

Patch 2 ("net: netpoll: flush skb pool during cleanup"):
 - Properly free pool resources during netconsole cleanup

These changes improve resource management and make the code more
maintainable.  As a side benefit, the improved structure would allow
netpoll to be modularized if desired in the future.

v2: https://lore.kernel.org/20241107-skb_buffers_v2-v2-0-288c6264ba4f@debian.org
v1: https://lore.kernel.org/20241025142025.3558051-1-leitao@debian.org
====================

Link: https://patch.msgid.link/20241114-skb_buffers_v2-v3-0-9be9f52a8b69@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:25:40 -08:00
Breno Leitao
6c59f16f17 net: netpoll: flush skb pool during cleanup
The netpoll subsystem maintains a pool of 32 pre-allocated SKBs per
instance, but these SKBs are not freed when the netpoll user is brought
down. This leads to memory waste as these buffers remain allocated but
unused.

Add skb_pool_flush() to properly clean up these SKBs when netconsole is
terminated, improving memory efficiency.
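
A rough userspace sketch of a per-instance pool with a flush step follows.
The names struct poller, struct buf, pool_refill() and pool_flush() are
hypothetical; only the pool depth of 32 comes from the text above, and
malloc()/free() stand in for skb allocation.

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_MAX 32	/* pool depth taken from the text above */

/* Hypothetical stand-ins for struct netpoll and its skb pool. */
struct buf {
	struct buf *next;
};

struct poller {
	struct buf *pool;	/* per-instance free list, not a global */
	int pool_len;
};

/* Top the pool up to POOL_MAX buffers; stop early if allocation fails. */
static void pool_refill(struct poller *p)
{
	while (p->pool_len < POOL_MAX) {
		struct buf *b = malloc(sizeof(*b));

		if (!b)
			break;
		b->next = p->pool;
		p->pool = b;
		p->pool_len++;
	}
}

/* Release every pooled buffer; intended for the instance's cleanup path. */
static void pool_flush(struct poller *p)
{
	while (p->pool) {
		struct buf *b = p->pool;

		p->pool = b->next;
		free(b);
		p->pool_len--;
	}
	assert(p->pool_len == 0);
}
```

Because each instance owns its list, flushing one instance leaves the
buffers of every other instance untouched.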

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20241114-skb_buffers_v2-v3-2-9be9f52a8b69@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:25:34 -08:00
Breno Leitao
221a9c1df7 net: netpoll: Individualize the skb pool
The current implementation of the netpoll system uses a global skb
pool, which can lead to inefficient memory usage and
waste when targets are disabled or no longer in use.

This can result in a significant amount of memory being unnecessarily
allocated and retained, potentially causing performance issues and
limiting the availability of resources for other system components.

Modify the netpoll system to assign a skb pool to each target instead of
using a global one.

This approach allows for more fine-grained control over memory
allocation and deallocation, ensuring that resources are only allocated
and retained as needed.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20241114-skb_buffers_v2-v3-1-9be9f52a8b69@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:25:34 -08:00
Colin Ian King
11ee317d88 octeontx2-pf: Fix spelling mistake "reprentator" -> "representor"
There is a spelling mistake in a NL_SET_ERR_MSG_MOD error message.
Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20241114102012.1868514-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:15:45 -08:00
Dmitry Safonov
e51edeaf35 net/netlink: Correct the comment on netlink message max cap
Since commit d35c99ff77 ("netlink: do not enter direct reclaim from
netlink_dump()") the cap is 32KiB.

Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Link: https://patch.msgid.link/20241113-tcp-md5-diag-prep-v2-5-00a2a7feb1fa@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 16:14:16 -08:00
Samuel Holland
1f181d1cda irqchip/riscv-aplic: Prevent crash when MSI domain is missing
If the APLIC driver is probed before the IMSIC driver, the parent MSI
domain will be missing, which causes a NULL pointer dereference in
msi_create_device_irq_domain().

Avoid this by deferring probe until the parent MSI domain is available. Use
dev_err_probe() to avoid printing an error message when returning
-EPROBE_DEFER.

Fixes: ca8df97fe6 ("irqchip/riscv-aplic: Add support for MSI-mode")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241114200133.3069460-1-samuel.holland@sifive.com
2024-11-16 00:45:37 +01:00
Jakub Kicinski
2532390448 Merge branch 'enic-use-all-the-resources-configured-on-vic'
Nelson Escobar says:

====================
enic: Use all the resources configured on VIC

Allow users to configure and use more than 8 rx queues and 8 tx queues
on the Cisco VIC.

This series changes the maximum number of tx and rx queues supported
from 8 to the hardware limit of 256, and allocates memory based on the
number of resources configured on the VIC.

v3: https://lore.kernel.org/20241108-remove_vic_resource_limits-v3-0-3ba8123bcffc@cisco.com
v2: https://lore.kernel.org/20241024-remove_vic_resource_limits-v2-0-039b8cae5fdd@cisco.com
v1: https://lore.kernel.org/20241022041707.27402-2-neescoba@cisco.com
====================

Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-0-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:48 -08:00
Nelson Escobar
a28ccf1d6c enic: Move kdump check into enic_adjust_resources()
Move the kdump check into enic_adjust_resources() so that everything
that modifies resources is in the same function.

Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-7-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:41 -08:00
Nelson Escobar
374f6c04df enic: Move enic resource adjustments to separate function
Move the enic resource adjustments out of enic_set_intr_mode() and into
its own function, enic_adjust_resources().

Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-6-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:41 -08:00
Nelson Escobar
cc94d6c4d4 enic: Adjust used MSI-X wq/rq/cq/interrupt resources in a more robust way
Instead of failing to use MSI-X if resources aren't configured exactly
right, use the resources we do have.  Since we could start using large
numbers of rq resources, we do limit the rq count to what
netif_get_num_default_rss_queues() recommends.

Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-5-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:41 -08:00
Nelson Escobar
a64e5492ca enic: Allocate arrays in enic struct based on VIC config
Allocate wq, rq, cq, intr, and napi arrays based on the number of
resources configured in the VIC.

Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-4-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:41 -08:00
Nelson Escobar
5aee332472 enic: Save resource counts we read from HW
Save the resource counts for wq, rq, cq, and interrupts in *_avail variables
so that we don't lose the information when adjusting the counts we are
actually using.

Report the wq_avail and rq_avail as the channel maximums in 'ethtool -l'
output.

Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-3-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:41 -08:00
Nelson Escobar
231646cb6a enic: Make MSI-X I/O interrupts come after the other required ones
The VIC hardware has a constraint that the MSI-X interrupt used for errors
be specified as a 7-bit number.  Before this patch, it was allocated after
the I/O interrupts, which would cause a problem if 128 or more I/O
interrupts are in use.

So make the required interrupts come before the I/O interrupts to
guarantee the error interrupt offset never exceeds 7 bits.
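
The arithmetic behind the constraint can be illustrated as follows. This is
a sketch with a hypothetical layout in which the error vector is either
placed right after the I/O vectors or at index 0; the function names are
made up for illustration and do not come from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define ERR_INTR_OFFSET_MAX 0x7f	/* largest value a 7-bit field can hold */

static bool fits_in_7_bits(unsigned int offset)
{
	return offset <= ERR_INTR_OFFSET_MAX;
}

/* Old ordering (hypothetical): I/O vectors first, error vector after them. */
static unsigned int err_offset_io_first(unsigned int n_io)
{
	return n_io;
}

/* New ordering (hypothetical): required vectors first, constant offset. */
static unsigned int err_offset_required_first(unsigned int n_io)
{
	(void)n_io;	/* the I/O count no longer affects the offset */
	return 0;
}

/* Pack the offset into a 7-bit field; callers must have checked it fits. */
static unsigned int pack_err_offset(unsigned int offset)
{
	assert(fits_in_7_bits(offset));
	return offset & ERR_INTR_OFFSET_MAX;
}
```

With the old ordering, 127 I/O vectors still fit but 128 do not; with the
new ordering the offset stays the same however many I/O vectors are in use.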

Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-2-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:40 -08:00
Nelson Escobar
b67609c931 enic: Create enic_wq/rq structures to bundle per wq/rq data
Bundling the wq/rq specific data into dedicated enic_wq/rq structures
cleans up the enic structure and simplifies future changes related to
wq/rq.

Co-developed-by: John Daley <johndale@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Co-developed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20241113-remove_vic_resource_limits-v4-1-a34cf8570c67@cisco.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:38:40 -08:00
Ziwei Xiao
8ffade77b6 gve: Flow steering trigger reset only for timeout error
When configuring flow steering rules, the driver is currently going
through a reset for all errors from the device. Instead, the driver
should only reset when there's a timeout error from the device.

Fixes: 57718b60df ("gve: Add flow steering adminq commands")
Cc: stable@vger.kernel.org
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241113175930.2585680-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:36:59 -08:00
Tarun Alle
025b2bbc5a net: phy: microchip_t1: Clause-45 PHY loopback support for LAN887x
Add support for Clause-45 PHY loopback to the Microchip LAN887x driver.

Signed-off-by: Tarun Alle <Tarun.Alle@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241114101951.382996-1-Tarun.Alle@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 15:22:10 -08:00
Rob Herring (Arm)
6bbdb903db dt-bindings: net: dsa: microchip,ksz: Drop undocumented "id"
"id" is not a documented property, so drop it.

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20241113225642.1783485-2-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:29:28 -08:00
Russell King (Oracle)
41ffcd9501 net: phy: fix phylib's dual eee_enabled
phylib has two eee_enabled members. Some parts of the code are using
phydev->eee_enabled, other parts are using phydev->eee_cfg.eee_enabled.
This leads to incorrect behaviour as their state goes out of sync.
ethtool --show-eee shows incorrect information, and --set-eee sometimes
doesn't take effect.

Fix this by only having one eee_enabled member - that in eee_cfg.

Fixes: 49168d1980 ("net: phy: Add phy_support_eee() indicating MAC support EEE")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/E1tBXAF-00341F-EQ@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:27:37 -08:00
Felix Maurer
0c0d0f42ff xsk: Free skb when TX metadata options are invalid
When a new skb is allocated for transmitting an xsk descriptor, i.e., for
every non-multibuf descriptor or the first frag of a multibuf descriptor,
but the descriptor is later found to have invalid options set for the TX
metadata, the new skb is never freed. This can leak skbs until the send
buffer is full which makes sending more packets impossible.

Fix this by freeing the skb in the error path if we are currently dealing
with the first frag, i.e., an skb allocated in this iteration of
xsk_build_skb.
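
The ownership rule in the fix can be sketched in userspace C. The names
struct pkt, pkt_alloc(), pkt_free() and build_pkt() are hypothetical and do
not mirror the real xsk_build_skb() code; the point is only that the error
path frees what the current call allocated and nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb; not the kernel structure. */
struct pkt {
	int nfrags;
};

static int live_pkts;	/* allocation counter used to spot leaks */

static struct pkt *pkt_alloc(void)
{
	struct pkt *p = calloc(1, sizeof(*p));

	if (p)
		live_pkts++;
	return p;
}

static void pkt_free(struct pkt *p)
{
	assert(p != NULL);
	free(p);
	live_pkts--;
}

/*
 * Build or extend a packet from one descriptor. 'pkt' is NULL for the first
 * frag and the in-progress packet for later frags. Returns NULL on error.
 */
static struct pkt *build_pkt(struct pkt *pkt, bool options_valid)
{
	bool first_frag = (pkt == NULL);

	if (first_frag) {
		pkt = pkt_alloc();
		if (!pkt)
			return NULL;
	}

	if (!options_valid) {
		/* Free only what this call allocated; the caller owns the rest. */
		if (first_frag)
			pkt_free(pkt);
		return NULL;
	}

	pkt->nfrags++;
	return pkt;
}
```

A failed first frag leaves live_pkts unchanged, whereas a failure on a later
frag leaves the in-progress packet for the caller to dispose of.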

Fixes: 48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
Reported-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/edb9b00fb19e680dff5a3350cd7581c5927975a8.1731581697.git.fmaurer@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:26:40 -08:00
Vadim Fedorenko
c7a21af711 bnxt_en: optimize gettimex64
The current implementation of gettimex64() makes at least 3 PCIe reads to
get the current PHC time. It takes at least 2.2us to get this value back to
userspace. At the same time, a cached value of the upper bits of the PHC is
already available for packet timestamps. This patch reuses the cached value
to speed up reading of PHC time.
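
One way to combine a cached upper part with a fresh low-word read is
sketched below. The 32/32 bit split, the function name extend_time() and
the assumption that the cache is refreshed at least once per wrap of the
low word are all illustrative guesses, not details taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical split: the device read returns only the low 32 bits, and
 * 'cached' is a full 64-bit value refreshed at least once per 2^32 ticks.
 */
static uint64_t extend_time(uint64_t cached, uint32_t low)
{
	uint64_t t = (cached & ~(uint64_t)UINT32_MAX) | low;

	/* The low word is behind the cached one, so it wrapped once since. */
	if (low < (uint32_t)cached)
		t += (uint64_t)1 << 32;

	assert(t >= cached);	/* holds as long as the cache is fresh enough */
	return t;
}
```

Only the low word has to come from the device in this scheme, so a single
read replaces the several reads mentioned above.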

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20241114114820.1411660-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:26:05 -08:00
Jakub Kicinski
8807850697 netfilter pull request 24-11-14
Merge tag 'nf-24-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Update .gitignore in selftest to skip conntrack_reverse_clash,
   from Li Zhijian.

2) Fix conntrack_dump_flush return values, from Guan Jing.

3) syzbot found that ipset's bitmap type does not properly check the
   bitmap's first ip, from Jeongjun Park.

* tag 'nf-24-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: ipset: add missing range check in bitmap_ip_uadt
  selftests: netfilter: Fix missing return values in conntrack_dump_flush
  selftests: netfilter: Add missing gitignore file
====================

Link: https://patch.msgid.link/20241114125723.82229-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:24:36 -08:00
Joe Damato
ed7231f56c netdev-genl: Hold rcu_read_lock in napi_set
Hold rcu_read_lock during netdev_nl_napi_set_doit, which calls
napi_by_id and requires rcu_read_lock to be held.

Closes: https://lore.kernel.org/netdev/719083c2-e277-447b-b6ea-ca3acb293a03@redhat.com/
Fixes: 1287c1ae0f ("netdev-genl: Support setting per-NAPI config values")
Signed-off-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20241114175600.18882-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:21:00 -08:00
Joe Damato
c53bf100f6 netdev-genl: Hold rcu_read_lock in napi_get
Hold rcu_read_lock in netdev_nl_napi_get_doit, which calls napi_by_id
and is required to be called under rcu_read_lock.

Cc: stable@vger.kernel.org
Fixes: 27f91aaf49 ("netdev-genl: Add netlink framework functions for napi")
Signed-off-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20241114175157.16604-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:20:19 -08:00
Jakub Kicinski
6cd663f03f bluetooth-next pull request for net-next:
Merge tag 'for-net-next-2024-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

 - btusb: add Foxconn 0xe0fc for Qualcomm WCN785x
 - btmtk: Fix ISO interface handling
 - Add quirk for ATS2851
 - btusb: Add RTL8852BE device 0489:e123
 - ISO: Do not emit LE PA/BIG Create Sync if previous is pending
 - btusb: Add USB HW IDs for MT7920/MT7925
 - btintel_pcie: Add handshake between driver and firmware
 - btintel_pcie: Add recovery mechanism
 - hci_conn: Use disable_delayed_work_sync
 - SCO: Use kref to track lifetime of sco_conn
 - ISO: Use kref to track lifetime of iso_conn
 - btnxpuart: Add GPIO support to power save feature
 - btusb: Add 0x0489:0xe0f3 and 0x13d3:0x3623 for Qualcomm WCN785x

* tag 'for-net-next-2024-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (51 commits)
  Bluetooth: MGMT: Add initial implementation of MGMT_OP_HCI_CMD_SYNC
  Bluetooth: fix use-after-free in device_for_each_child()
  Bluetooth: btintel: Direct exception event to bluetooth stack
  Bluetooth: hci_core: Fix calling mgmt_device_connected
  Bluetooth: hci_bcm: Use the devm_clk_get_optional() helper
  Bluetooth: ISO: Send BIG Create Sync via hci_sync
  Bluetooth: hci_conn: Remove alloc from critical section
  Bluetooth: ISO: Use kref to track lifetime of iso_conn
  Bluetooth: SCO: Use kref to track lifetime of sco_conn
  Bluetooth: HCI: Add IPC(11) bus type
  Bluetooth: btusb: Add 3 HWIDs for MT7925
  Bluetooth: btusb: Add new VID/PID 0489/e124 for MT7925
  Bluetooth: ISO: Update hci_conn_hash_lookup_big for Broadcast slave
  Bluetooth: ISO: Do not emit LE BIG Create Sync if previous is pending
  Bluetooth: ISO: Fix matching parent socket for BIS slave
  Bluetooth: ISO: Do not emit LE PA Create Sync if previous is pending
  Bluetooth: btrtl: Decrease HCI_OP_RESET timeout from 10 s to 2 s
  Bluetooth: btbcm: fix missing of_node_put() in btbcm_get_board_name()
  Bluetooth: btusb: Add new VID/PID 0489/e111 for MT7925
  Bluetooth: btmtk: adjust the position to init iso data anchor
  ...
====================

Link: https://patch.msgid.link/20241114214731.1994446-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:16:28 -08:00
Jakub Kicinski
26a3beee24 netfilter pull request 24-11-15
Merge tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Extended netlink error reporting if nfnetlink attribute parser fails,
   from Donald Hunter.

2) Fix incorrect request_module() format argument, from Simon Horman.

3) A series of patches to reduce memory consumption for set element
   transactions.
   Florian Westphal says:

"When doing a flush on a set or mass adding/removing elements from a
set, each element needs to allocate 96 bytes to hold the transactional
state.

In such cases, virtually all the information in struct nft_trans_elem
is the same.

Change nft_trans_elem to a flex-array, i.e. a single nft_trans_elem
can hold multiple set element pointers.

The number of elements that can be stored in one nft_trans_elem is limited
by the slab allocator; this series limits the compaction to at most 62
elements as it caps the reallocation to 2048 bytes of memory."

4) A series of patches to prepare the transition to dscp_t in .flowi_tos.
   From Guillaume Nault.

5) Support for bitwise operations with two source registers,
   from Jeremy Sowden.

* tag 'nf-next-24-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: bitwise: add support for doing AND, OR and XOR directly
  netfilter: bitwise: rename some boolean operation functions
  netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t.
  netfilter: nft_fib: Convert nft_fib4_eval() to dscp_t.
  netfilter: rpfilter: Convert rpfilter_mt() to dscp_t.
  netfilter: flow_offload: Convert nft_flow_route() to dscp_t.
  netfilter: ipv4: Convert ip_route_me_harder() to dscp_t.
  netfilter: nf_tables: allocate element update information dynamically
  netfilter: nf_tables: switch trans_elem to real flex array
  netfilter: nf_tables: prepare nft audit for set element compaction
  netfilter: nf_tables: prepare for multiple elements in nft_trans_elem structure
  netfilter: nf_tables: add nft_trans_commit_list_add_elem helper
  netfilter: bpf: Pass string literal as format argument of request_module()
  netfilter: nfnetlink: Report extack policy errors for batched ops
====================

Link: https://patch.msgid.link/20241115133207.8907-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-15 14:09:21 -08:00
Frederic Weisbecker
d8dfba2c60 Merge branches 'rcu/fixes', 'rcu/nocb', 'rcu/torture', 'rcu/stall' and 'rcu/srcu' into rcu/dev 2024-11-15 22:38:53 +01:00
Shuah Khan
c818d5c64c Documentation/CoC: spell out enforcement for unacceptable behaviors
The Code of Conduct committee's goal first and foremost is to bring about
change to ensure our community continues to foster respectful discussions.

In the interest of transparency, the CoC enforcement policy is formalized
for unacceptable behaviors.

Update the Code of Conduct Interpretation document with the enforcement
information.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Kees Cook <kees@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20241114205649.44179-1-skhan@linuxfoundation.org
2024-11-15 14:31:59 -07:00
Uladzislau Rezki (Sony)
c229d579d0 rcuscale: Remove redundant WARN_ON_ONCE() splat
There are two places in the error paths where WARN_ON_ONCE() is called
twice: once inside the if() condition and once more, unnecessarily,
inside the braces.

Remove the extra WARN_ON_ONCE() splat that is inside the braces.

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-15 22:24:41 +01:00
Uladzislau Rezki (Sony)
812a1c3b9f rcuscale: Do a proper cleanup if kfree_scale_init() fails
Smatch, a static analyzer for C, reports the following
warning:

   kernel/rcu/rcuscale.c:1215 rcu_scale_init()
   warn: inconsistent returns 'global &fullstop_mutex'.

The checker complains that the "fullstop_mutex" mutex is not
unlocked when the error path below is hit:

<snip>
...
    if (WARN_ON_ONCE(jiffies_at_lazy_cb - jif_start < 2 * HZ)) {
        pr_alert("ERROR: call_rcu() CBs are not being lazy as expected!\n");
        WARN_ON_ONCE(1);
        return -1;
        ^^^^^^^^^^
...
<snip>

This happens because "-1" is returned right away instead of
doing a proper unwinding.

Fix it by jumping to "unwind" label instead of returning -1.
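
The unwinding pattern described above can be sketched in plain C. This is
a simplified illustration, not the rcuscale code: demo_lock(),
demo_unlock() and demo_init() are hypothetical names, and a counter
stands in for the mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex: counts how often it is held. */
static int lock_depth;

static void demo_lock(void)
{
	lock_depth++;
}

static void demo_unlock(void)
{
	assert(lock_depth > 0);
	lock_depth--;
}

/*
 * Hypothetical init routine. Every failure after demo_lock() jumps to
 * the "unwind" label, so the lock is released on all paths.
 */
static int demo_init(bool check_fails)
{
	int ret = 0;

	demo_lock();

	if (check_fails) {
		ret = -1;
		goto unwind;	/* a bare "return -1;" here would leak the lock */
	}

	/* ... further setup would go here ... */

unwind:
	demo_unlock();
	return ret;
}
```

With a single exit label, the lock is balanced on both the success and
the failure path, which is the property the checker verifies.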

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Closes: https://lore.kernel.org/rcu/ZxfTrHuEGtgnOYWp@pc636/T/
Fixes: 084e04fff1 ("rcuscale: Add laziness and kfree tests")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-15 22:23:50 +01:00
Paul E. McKenney
9407f5c3ec srcu: Unconditionally record srcu_read_lock_lite() in ->srcu_reader_flavor
Currently, srcu_read_lock_lite() uses the SRCU_READ_FLAVOR_LITE bit in
->srcu_reader_flavor to communicate to the grace-period processing in
srcu_readers_active_idx_check() that the smp_mb() must be replaced by a
synchronize_rcu().  Unfortunately, ->srcu_reader_flavor is not updated
unless the kernel is built with CONFIG_PROVE_RCU=y.  Therefore in all
kernels built with CONFIG_PROVE_RCU=n, srcu_readers_active_idx_check()
incorrectly uses smp_mb() instead of synchronize_rcu() for srcu_struct
structures whose readers use srcu_read_lock_lite().

This commit therefore causes Tree SRCU srcu_read_lock_lite()
to unconditionally update ->srcu_reader_flavor so that
srcu_readers_active_idx_check() can make the correct choice.

Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/
Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-15 22:13:37 +01:00
Rob Herring (Arm)
28b513b5a6 Merge branch 'dt/linus' into dt/next
Pull-in kunit kconfig fix
2024-11-15 14:03:59 -06:00
Stephen Boyd
332857fdac of: Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
Some configurations want to enable CONFIG_KUNIT without enabling
CONFIG_OF_OVERLAY. The kunit overlay code already skips if
CONFIG_OF_OVERLAY isn't enabled, so this select here isn't really doing
anything besides making it easier to run the tests without them
skipping. Remove the select and move the config setting to the
drivers/of/.kunitconfig file so that the overlay tests can be run with
or without CONFIG_OF_OVERLAY set to test either behavior.

Fixes: 5c9dd72d83 ("of: Add a KUnit test for overlays and test managed APIs")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20241016212016.887552-1-sboyd@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-11-15 14:03:28 -06:00
Rafael J. Wysocki
d47a60e487 Merge branch 'acpi-misc'
Merge miscellaneous ACPI changes for 6.13-rc1:

 - Switch several ACPI platform drivers back to using struct
   platform_driver::remove() (Uwe Kleine-König).

 - Replace strcpy() with strscpy() in multiple places in the ACPI
   subsystem (Muhammad Qasim Abdul Majeed, Abdul Rahim).

* acpi-misc:
  ACPI: Switch back to struct platform_driver::remove()
  ACPI: scan: Use strscpy() instead of strcpy()
  ACPI: SBSHC: Use strscpy() instead of strcpy()
  ACPI: SBS: Use strscpy() instead of strcpy()
  ACPI: power: Use strscpy() instead of strcpy()
  ACPI: pci_root: Use strscpy() instead of strcpy()
  ACPI: pci_link: Use strscpy() instead of strcpy()
  ACPI: event: Use strscpy() instead of strcpy()
  ACPI: EC: Use strscpy() instead of strcpy()
  ACPI: APD: Use strscpy() instead of strcpy()
  ACPI: thermal: Use strscpy() instead of strcpy()
2024-11-15 20:52:02 +01:00
Rafael J. Wysocki
563c87f58f Merge branches 'acpi-processor', 'acpi-x86' and 'acpi-video'
Merge an ACPI processor driver update, ACPI x86-specific code updates,
and an ACPI backlight (video) driver quirk for 6.13-rc1:

 - Rearrange the processor_perflib code in the ACPI processor driver
   to avoid compiling x86-specific code on other architectures (Arnd
   Bergmann).

 - Add adev NULL check to acpi_quirk_skip_serdev_enumeration() and
   make UART skip quirks work on PCI UARTs without an UID (Hans de
   Goede).

 - Force native backlight handling on Apple MacbookPro11,2 and Air7,2 in
   the ACPI video driver (Jonathan Denose).

* acpi-processor:
  ACPI: processor_perflib: extend X86 dependency

* acpi-x86:
  ACPI: x86: Add adev NULL check to acpi_quirk_skip_serdev_enumeration()
  ACPI: x86: Make UART skip quirks work on PCI UARTs without an UID

* acpi-video:
  ACPI: video: force native for Apple MacbookPro11,2 and Air7,2
2024-11-15 20:46:13 +01:00
Rafael J. Wysocki
1c58e3a528 Merge branches 'acpi-battery', 'acpi-ec', 'acpi-pfr' and 'acpi-osl'
Merge updates of the ACPI battery and EC drivers, an ACPI Platform
Firmware Runtime (PFR) telemetry driver update and an ACPI OS support
layer change for 6.13-rc1:

 - Use DEFINE_SIMPLE_DEV_PM_OPS in the ACPI battery driver, make it use
   devm_ for initializing mutexes and allocating driver data, and make
   it check the register_pm_notifier() return value (Thomas Weißschuh,
   Andy Shevchenko).

 - Make the ACPI EC driver support compile-time conditional and allow
   ACPI to be built without CONFIG_HAS_IOPORT (Arnd Bergmann).

 - Remove a redundant error check from the pfr_telemetry driver (Colin
   Ian King).

* acpi-battery:
  ACPI: battery: Check for error code from devm_mutex_init() call
  ACPI: battery: use DEFINE_SIMPLE_DEV_PM_OPS
  ACPI: battery: initialize mutexes through devm_ APIs
  ACPI: battery: allocate driver data through devm_ APIs
  ACPI: battery: check result of register_pm_notifier()

* acpi-ec:
  ACPI: EC: make EC support compile-time conditional

* acpi-pfr:
  ACPI: pfr_telemetry: remove redundant error check on ret

* acpi-osl:
  ACPI: allow building without CONFIG_HAS_IOPORT
2024-11-15 20:45:14 +01:00
Linus Torvalds
e8bdb3c8be A Single RISC-V Fix for 6.12-rc8
* A fix for the CPU perf driver that avoids leaking CPU ID references on
   systems without snapshot support.

Merge tag 'riscv-for-linus-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fix from Palmer Dabbelt:

 - A fix for the CPU perf driver that avoids leaking CPU ID references
   on systems without snapshot support.

* tag 'riscv-for-linus-6.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  drivers: perf: Fix wrong put_cpu() placement
2024-11-15 11:44:32 -08:00
Alexander Aring
f74dacb4c8 dlm: fix recovery of middle conversions
In one special case, recovery is unable to reliably rebuild
lock state by simply recreating lkb structs as sent from the
lock holders.  That case is when the lkb's include conversions
between PR and CW modes.

The recovery code has always recognized this special case,
but the implementation has always been broken, and would set
invalid modes in recovered lkb's.  Unpredictable or bogus
errors could then be returned for further locking calls on
these locks.

This bug has gone unnoticed for so long due to some
combination of:
- applications never or infrequently converting between PR/CW
- recovery not occurring during these conversions
- if the recovery bug does occur, the caller may not notice,
  depending on what further locking calls are made, e.g. if
  the lock is simply unlocked it may go unnoticed

However, a core analysis from a recent gfs2 bug report points
to this broken code.

PR = Protected Read
CW = Concurrent Write
PR and CW are incompatible
PR and PR are compatible
CW and CW are compatible

Example 1

node C, resource R
granted: PR node A
granted: PR node B
granted: NL node C
granted: NL node D

- A sends convert PR->CW to C
- C fails before A gets a reply
- recovery occurs

At this point, A does not know if it still holds
the lock in PR, or if its conversion to CW was granted:
- If A's conversion to CW was granted, then another
  node's CW lock may also have been granted.
- If A's conversion to CW was not granted, it still
  holds a PR lock, and other nodes may also hold PR locks.

So, the new master of R cannot simply recreate the lock
from A using granted mode PR and requested mode CW.
The new master must look at all the recovered locks to
determine the correct granted modes, and ensure that all
the recovered locks are recreated in compatible states.

The correct lock recovery steps in this example are:
- node D becomes the new master of R
- node B sends D its lkb, granted PR
- node A sends D its lkb, convert PR->CW
- D determines the correct lock state is:
  granted: PR node B
  convert: PR->CW node A

The lkb sent by each node was recreated without
any change on the new master node.

Example 2

node C, resource R
granted: PR node A
granted: NL node C
granted: NL node D
waiting: CW node B

- A sends convert PR->CW to C
- C grants the conversion to CW for A
- C grants the waiting request for CW to B
- C sends granted message to B, but fails
  before it can send the granted message to A
- B receives the granted message from C

At this point:
- A believes it is converting PR->CW
- B believes it is holding a CW lock

The correct lock recovery steps in this example are:
- node D becomes the new master of R
- node A sends D its lkb, convert PR->CW
- node B sends D its lkb, granted CW
- D determines the correct lock state is:
  granted: CW node B
  granted: CW node A

The lkb sent by B is recreated without change,
but the lkb sent by A is changed because the
granted mode was not compatible.

Fixes to make this work correctly:

recover_convert_waiter: should not make any changes
to a converting lkb that is still waiting for a reply
message.  It was previously setting grmode to IV, which
is an invalid state, so the lkb would not be handled
correctly by other code.

receive_rcom_lock_args: was checking the wrong lkb field
(wait_type instead of status) to determine if the lkb is
being converted, and in need of inspection for this special
recovery.  It was also setting grmode to IV in the lkb,
causing it to be mishandled by other code.
Now, this function just puts the lkb, directly as sent,
onto the convert queue of the resource being recovered,
and corrects it in recover_conversion() later, if needed.

recover_conversion: the job of this function is to detect
and correct lkb states for the special PR/CW conversions.
The new code now checks for recovered lkbs on the granted
queue with grmode PR or CW, and takes the real grmode from
that.  Then it looks for lkbs on the convert queue with an
incompatible grmode (i.e. grmode PR when the real grmode is
CW, or v.v.)  These converting lkbs need to be fixed.
They are fixed by temporarily setting their grmode to NL,
so that grmodes are not incompatible and won't confuse other
locking code.  The converting lkb will then be granted at
the end of recovery, replacing the temporary NL grmode.
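
The correction step described for recover_conversion can be sketched in
userspace C. This is a simplified model under stated assumptions, not the
dlm code: struct demo_lkb, the enum names, modes_compat() and
demo_recover_conversion() are hypothetical, only the NL, CW and PR modes
are modelled, and the final grant at the end of recovery is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Only the modes used in the examples above; names are illustrative. */
enum mode { MODE_IV = -1, MODE_NL, MODE_CW, MODE_PR };

enum queue { Q_GRANTED, Q_CONVERT };

/* Hypothetical, trimmed-down lock block. */
struct demo_lkb {
	enum queue queue;
	enum mode grmode;	/* granted mode */
	enum mode rqmode;	/* requested mode of a converting lock */
};

/* PR/PR and CW/CW are compatible, PR/CW is not; NL never conflicts. */
static int modes_compat(enum mode a, enum mode b)
{
	if (a == MODE_NL || b == MODE_NL)
		return 1;
	return a == b;
}

static void demo_recover_conversion(struct demo_lkb *lkbs, size_t n)
{
	enum mode real = MODE_IV;
	size_t i;

	/* Pass 1: take the real granted mode from the granted queue. */
	for (i = 0; i < n; i++) {
		if (lkbs[i].queue == Q_GRANTED &&
		    (lkbs[i].grmode == MODE_PR || lkbs[i].grmode == MODE_CW)) {
			real = lkbs[i].grmode;
			break;
		}
	}
	if (real == MODE_IV)
		return;	/* nothing to correct against in this sketch */

	/* Pass 2: park incompatible converting locks at NL for now. */
	for (i = 0; i < n; i++) {
		if (lkbs[i].queue == Q_CONVERT &&
		    !modes_compat(lkbs[i].grmode, real))
			lkbs[i].grmode = MODE_NL;
	}
}
```

Fed with Example 2 (A converting PR->CW, B granted CW), the sketch
changes A's granted mode to NL and leaves B untouched. Fed with
Example 1 (B granted PR, A converting PR->CW), it changes nothing.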

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2024-11-15 13:39:36 -06:00
Jens Axboe
88d47f6293 Merge tag 'md-6.13-20241115' of https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux into for-6.13/block
Pull MD fixes from Song:

"This set contains a fix for a W=1 warning, by John Garry, and a
 MAINTAINERS update."

* tag 'md-6.13-20241115' of https://git.kernel.org/pub/scm/linux/kernel/git/mdraid/linux:
  MAINTAINERS: Update git tree for mdraid subsystem
  md/raid5: Increase r5conf.cache_name size
2024-11-15 12:37:33 -07:00
Jiri Olsa
fab974e648 libbpf: Fix memory leak in bpf_program__attach_uprobe_multi
Andrii reported a memory leak, detected by Coverity, on the error path
in bpf_program__attach_uprobe_multi. Fix it by moving the check
earlier, before the offsets allocation.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241115115843.694337-1-jolsa@kernel.org
2024-11-15 11:29:12 -08:00
Pavel Begunkov
d617b3147d io_uring: restore back registered wait arguments
Now we've got a more generic region registration API, place
IORING_ENTER_EXT_ARG_REG and re-enable it.

First, the user has to register a region with the
IORING_MEM_REGION_REG_WAIT_ARG flag set. It can only be done for a
ring in a disabled state, aka IORING_SETUP_R_DISABLED, to avoid races
with already running waiters. With that we should have stable constant
values for ctx->cq_wait_{size,arg} in io_get_ext_arg_reg() and hence no
READ_ONCE required.

The other API difference is that we're now passing byte offsets instead
of indexes. The user _must_ align all offsets / pointers to the native
word size; failing to do so may, but will not necessarily, lead to a
failure, usually returned as -EFAULT. liburing will hide these
details from users.
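
The alignment rule above can be checked on the caller's side with a small
helper. This is a sketch, not part of io_uring or liburing: WORD_SIZE,
word_aligned() and slot_offset() are hypothetical names, and it assumes
that "native word size" means sizeof(unsigned long).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the native word size is sizeof(unsigned long). */
#define WORD_SIZE sizeof(unsigned long)

/* True if a byte offset or pointer value is word aligned. */
static bool word_aligned(uintptr_t v)
{
	return (v % WORD_SIZE) == 0;
}

/*
 * Convert an index into a byte offset for an array of fixed-size slots.
 * A slot size that is a multiple of the word size keeps every resulting
 * offset aligned, provided the array itself starts on a word boundary.
 */
static size_t slot_offset(size_t idx, size_t slot_size)
{
	assert(slot_size % WORD_SIZE == 0);
	return idx * slot_size;
}
```

A caller that used to pass an index would pass slot_offset(idx,
slot_size) instead and could assert word_aligned() on it before
submitting.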

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/81822c1b4ffbe8ad391b4f9ad1564def0d26d990.1731689588.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-15 12:28:38 -07:00