mirror of
https://github.com/torvalds/linux.git
synced 2024-11-22 12:11:40 +00:00
f68cd02d51
77890 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Linus Torvalds
|
4477b39c32 |
minmax: add a few more MIN_T/MAX_T users
Commit
|
||
Linus Torvalds
|
1722389b0d |
A lot of networking people were at a conference last week, busy
catching COVID, so relatively short PR. Including fixes from bpf and netfilter. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaibxAACgkQMUZtbf5S IruuIRAAu96TiN/urPwmKznyb/Sk8x7p8iUzn6OvPS/TUlFUkURQtOh6M9uvbpN4 x/L//EWkMR0hY4SkBegoiXfb1GS0PjBdWTWUiROm5X9nVHqp5KRZAxWXhjFiS1BO BIYOT+JfCl7mQiPs90Mys/cEtYOggMBsCZQVIGw/iYoJLFREqxFSONwa0dG+tGMX jn9WNu4yCVDhJ/jtl2MaTsCNtYUaBUgYrKHJBfNGfJ2Lz/7rH9yFui2WSMlmOd/U QGeCb1DWURlShlCqY37wNinbFsxWkI5JN00ukTtwFAXLIaqc+zgHcIjrDjTJwK43 F4tKbJT3+bmehMU/h3Uo3c7DhXl7n9zDGiDtbCxnkykp0sFGJpjhDrWydo51c+YB qW5HaNrII2LiDicOVN8L29ylvKp7AEkClxgivEhZVGGk2f/szJRXfp9u3WBn5kAx 3paH55YN0DEsKbYbb1ZENEI1Vnc/4ff4PxZJCUNKwzcS8wCn1awqwcriK9TjS/cp fjilNFT4J3/uFrodHWTkx0jJT6UJFT0aF03qPLUH/J5kG+EVukOf1jBPInNdf1si 1j47SpblHUe86HiHphFMt32KZ210lJzWxh8uGma57Y2sB9makdLiK4etrFjkiMJJ Z8A3kGp3KpFjbuK4tHY25rp+5oxLNNOBNpay29lQrWtCL/NDcaQ= =9OsH -----END PGP SIGNATURE----- Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash" * tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) tun: add missing verification for short frame tap: add missing verification for short frame mISDN: Fix a use after free in hfcmulti_tx() gve: Fix an edge case for TSO skb validity check bnxt_en: update xdp_rxq_info in queue restart logic tcp: process the 3rd ACK with sk_socket for TFO/MPTCP selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling MAINTAINERS: make Breno the netconsole maintainer MAINTAINERS: Update bonding entry net: nexthop: Initialize all fields in dumped nexthops net: stmmac: Correct byte order of perfect_match selftests: forwarding: skip if kernel not support setting bridge fdb learning limit tipc: Return non-zero value from tipc_udp_addr2str() on error netfilter: nft_set_pipapo_avx2: disable softinterrupts ice: Fix recipe read procedure ice: Add a per-VF limit on number of FDIR filters net: bonding: correctly annotate RCU in bond_should_notify_peers() ... |
||
Linus Torvalds
|
b485625078 |
sysctl: treewide: constify the ctl_table argument of proc_handlers
Summary - const qualify struct ctl_table args in proc_handlers: This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified. -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmahVOcACgkQupfNUreW QU8QdgwAiDZMEQ6psvDfLGQlOvHyp21exZFC/i1OCrCq4BmTu83iYcojIKSaZ0q6 pfR9eYTVhNfAGsV2hQJvvUQPSVNESB/A7Z2zGLVQmZcOerAN17L2C7uA1SkbRBel P3yoMX6wruVCnBgQTcjktoBgpDqaOiqLTSh8Ko512qYOClLf22sCbbk+doiOwMF6 zbzh9bRo+qTuTgW30gwzRNml/lA1/XylSsM/cEFCxsc8k22lCneP+iWKDAfWtsr6 0GL+SvbBTyIEMFFvSw+0MUW30QctxaCYA1pBmn8nmc6eJ2yDly75/r8EWRr7zqDW SrRiXYDAzNcW3tiOdC2FW/J95VX62DiSQporVZ/OvahBKpQKGaLLNjx1EE3qGKIk ggnYiDmIO88ZBHoRbDDK9EdGGqrIZfRx9JeFD3bMOrwT3Ffxw30rwcYpzlOyjFz3 TIUeNthGXoQNRxZfTE+iybotKWcXnxCNe4U1Mvc2liY6+Z0jazDWfs+jV7CiuX1G Xk8jpTCR =JMg4 -----END PGP SIGNATURE----- Merge tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl constification from Joel Granados: "Treewide constification of the ctl_table argument of proc_handlers using a coccinelle script and some manual code formatting fixups. This is a prerequisite to moving the static ctl_table structs into read-only data section which will ensure that proc_handler function pointers cannot be modified" * tag 'constfy-sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: treewide: constify the ctl_table argument of proc_handlers |
||
Linus Torvalds
|
c2a96b7f18 |
Driver core changes for 6.11-rc1
Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. - minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZqH+aQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymoOQCfVBdLcBjEDAGh3L8qHRGMPy4rV2EAoL/r+zKm cJEYtJpGtWX6aAtugm9E =ZyJV -----END PGP SIGNATURE----- Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. - minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) ARM: sa1100: make match function take a const pointer sysfs/cpu: Make crash_hotplug attribute world-readable dio: Have dio_bus_match() callback take a const * zorro: make match function take a const pointer driver core: module: make module_[add|remove]_driver take a const * driver core: make driver_find_device() take a const * driver core: make driver_[create|remove]_file take a const * firmware_loader: fix soundness issue in `request_internal` firmware_loader: annotate doctests as `no_run` devres: Correct code style for functions that return a pointer type devres: Initialize an uninitialized struct member devres: Fix memory leakage caused by driver API devm_free_percpu() devres: Fix devm_krealloc() wasting memory driver core: platform: Switch to use kmemdup_array() driver core: have match() callback in struct bus_type take a const * MAINTAINERS: add Rust device abstractions to DRIVER CORE device: rust: improve safety comments MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER firmware: rust: improve safety comments ... |
||
Jakub Kicinski
|
f7578df913 |
bpf-for-netdev
-----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZqIl1AAKCRDbK58LschI g/MdAP9oyZV9/IZ6Y6Z1fWfio0SB+yJGugcwbFjWcEtNrzsqJQEAwipQnemAI4NC HBMfK2a/w7vhAFMXrP/SbkB/gUJJ7QE= =vovf -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-07-25 We've added 14 non-merge commits during the last 8 day(s) which contain a total of 19 files changed, 177 insertions(+), 70 deletions(-). The main changes are: 1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and BPF sockhash. Also add test coverage for this case, from Michal Luczaj. 2) Fix a segmentation issue when downgrading gso_size in the BPF helper bpf_skb_adjust_room(), from Fred Li. 3) Fix a compiler warning in resolve_btfids due to a missing type cast, from Liwei Song. 4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte boundary in the fexit_sleep BPF selftest, from Puranjay Mohan. 5) Fix a xsk regression to require a flag when actuating tx_metadata_len, from Stanislav Fomichev. 6) Fix function prototype BTF dumping in libbpf for prototypes that have no input arguments, from Andrii Nakryiko. 7) Fix stacktrace symbol resolution in perf script for BPF programs containing subprograms, from Hou Tao. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids bpf, events: Use prog to emit ksymbol event for main program selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected() af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash bpftool: Fix typo in usage help libbpf: Fix no-args func prototype BTF dumping syntax MAINTAINERS: Update powerpc BPF JIT maintainers MAINTAINERS: Update email address of Naveen selftests/bpf: fexit_sleep: Fix stack allocation for arm64 ==================== Link: https://patch.msgid.link/20240725114312.32197-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Matthieu Baerts (NGI0)
|
c166829268 |
tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
The 'Fixes' commit recently changed the behaviour of TCP by skipping the
processing of the 3rd ACK when a sk->sk_socket is set. The goal was to
skip tcp_ack_snd_check() in tcp_rcv_state_process() not to send an
unnecessary ACK in case of simultaneous connect(). Unfortunately, that
had an impact on TFO and MPTCP.
I started to look at the impact on MPTCP, because the MPTCP CI found
some issues with the MPTCP Packetdrill tests [1]. Then Paolo Abeni
suggested me to look at the impact on TFO with "plain" TCP.
For MPTCP, when receiving the 3rd ACK of a request adding a new path
(MP_JOIN), sk->sk_socket will be set, and point to the MPTCP sock that
has been created when the MPTCP connection got established before with
the first path. The newly added 'goto' will then skip the processing of
the segment text (step 7) and not go through tcp_data_queue() where the
MPTCP options are validated, and some actions are triggered, e.g.
sending the MPJ 4th ACK [2] as demonstrated by the new errors when
running a packetdrill test [3] establishing a second subflow.
This doesn't fully break MPTCP, mainly the 4th MPJ ACK that will be
delayed. Still, we don't want to have this behaviour as it delays the
switch to the fully established mode, and invalid MPTCP options in this
3rd ACK will not be caught any more. This modification also affects the
MPTCP + TFO feature as well, and being the reason why the selftests
started to be unstable the last few days [4].
For TFO, the existing 'basic-cookie-not-reqd' test [5] was no longer
passing: if the 3rd ACK contains data, and the connection is accept()ed
before receiving them, these data would no longer be processed, and thus
not ACKed.
One last thing about MPTCP, in case of simultaneous connect(), a
fallback to TCP will be done, which seems fine:
`../common/defaults.sh`
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_MPTCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <mss 1460, sackOK, TS val 100 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 < S 0:0(0) win 1000 <mss 1460, sackOK, TS val 407 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 > S. 0:0(0) ack 1 <mss 1460, sackOK, TS val 330 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 < S. 0:0(0) ack 1 win 65535 <mss 1460, sackOK, TS val 700 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
+0 > . 1:1(0) ack 1 <nop, nop, TS val 845707014 ecr 700, nop, nop, sack 0:1>
Simultaneous SYN-data crossing is also not supported by TFO, see [6].
Kuniyuki Iwashima suggested to restrict the processing to SYN+ACK only:
that's a more generic solution than the one initially proposed, and
also enough to fix the issues described above.
Later on, Eric Dumazet mentioned that an ACK should still be sent in
reaction to the second SYN+ACK that is received: not sending a DUPACK
here seems wrong and could hurt:
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <mss 1460, sackOK, TS val 1000 ecr 0,nop,wscale 8>
+0 < S 0:0(0) win 1000 <mss 1000, sackOK, nop, nop>
+0 > S. 0:0(0) ack 1 <mss 1460, sackOK, TS val 3308134035 ecr 0,nop,wscale 8>
+0 < S. 0:0(0) ack 1 win 1000 <mss 1000, sackOK, nop, nop>
+0 > . 1:1(0) ack 1 <nop, nop, sack 0:1> // <== Here
So in this version, the 'goto consume' is dropped, to always send an ACK
when switching from TCP_SYN_RECV to TCP_ESTABLISHED. This ACK will be
seen as a DUPACK -- with DSACK if SACK has been negotiated -- in case of
simultaneous SYN crossing: that's what is expected here.
Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9936227696 [1]
Link: https://datatracker.ietf.org/doc/html/rfc8684#fig_tokens [2]
Link: https://github.com/multipath-tcp/packetdrill/blob/mptcp-net-next/gtests/net/mptcp/syscalls/accept.pkt#L28 [3]
Link: https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&test=mptcp-connect-sh [4]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/server/basic-cookie-not-reqd.pkt#L21 [5]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/client/simultaneous-fast-open.pkt [6]
Fixes:
|
||
Stanislav Fomichev
|
d5e726d914 |
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
Julian reports that commit |
||
Fred Li
|
fa5ef65561 |
bpf: Fix a segment issue when downgrading gso_size
Linearize the skb when downgrading gso_size because it may trigger a
BUG_ON() later when the skb is segmented as described in [1,2].
Fixes:
|
||
Paolo Abeni
|
e6d08d7ecf |
netfilter pull request 24-07-24
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmagtw4ACgkQ1V2XiooU IOTlQA//V22NlrHpu6iZ7DfDarRAKYcLgAxnQSoE9xu5P2+OEye1KQmSXngmr5xm 4YK2rDpjEU393Myrf2Rqr1qAENeRgE99c5U0R+SU2zQ+9z/LBmZIpLC/FaOPr4rJ tFJMY9JyxQ/bp3KK+HPz23f/4FoiUQFeUmDwLSxV5uwhLCZ70tBs1492ATt9ESQe kR69NiNjPgk1hswKFrMCBmilbUMN2dYPJDJmTn2qEdlMnIeRr6o/wcFVZtYnFOyX L5aR6VwZ1WsNjzRnQodUMABkV3j6XNYUC5yjTX4mSvgVlEEoefbfiismZsC3j/dL 1IuYgHylnC+E6vHGQU6g5t/CLWcDEkJxfjvHfZrJlD1mycdT3PxSRWy0M2qNk6tq rYeznfwJMwX2WVWVAx3Ba2RmBe3AQkhnypfsPMHFkCf4TZ+peDxPeZzVqlyUsJwI jkLdnwZXXymd67iye/mtLpfG71dG4zQAoVGaX/6uS9BYlRepXO8uW0LF2f62u0kZ DkzavZtcdXYvwk+AW+g0/FsKjLcjnETqrOC2K06pP4aHcjnNB8olLg8Ut26/Y506 16OxvUUEbV5Xsd0gquAWJmEI4TFY2ADXJOO4x2nP7GqHkRUeKo73zxTgY/68NCF5 l+Ur4eH5mbcAfLc+2nYyZEc6IaCKdyGkVq8kBoRiK7V09s/UOV4= =49yO -----END PGP SIGNATURE----- Merge tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains a Netfilter fix for net: Patch #1 if FPU is busy, then pipapo set backend falls back to standard set element lookup. Moreover, disable bh while at this. From Florian Westphal. netfilter pull request 24-07-24 * tag 'nf-24-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_set_pipapo_avx2: disable softinterrupts ==================== Link: https://patch.msgid.link/20240724081305.3152-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
Joel Granados
|
78eb4ea25c |
sysctl: treewide: constify the ctl_table argument of proc_handlers
const qualify the struct ctl_table argument in the proc_handler function signatures. This is a prerequisite to moving the static ctl_table structs into .rodata data which will ensure that proc_handler function pointers cannot be modified. This patch has been generated by the following coccinelle script: ``` virtual patch @r1@ identifier ctl, write, buffer, lenp, ppos; identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)"; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos); @r2@ identifier func, ctl, write, buffer, lenp, ppos; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int write, void *buffer, size_t *lenp, loff_t *ppos) { ... } @r3@ identifier func; @@ int func( - struct ctl_table * + const struct ctl_table * ,int , void *, size_t *, loff_t *); @r4@ identifier func, ctl; @@ int func( - struct ctl_table *ctl + const struct ctl_table *ctl ,int , void *, size_t *, loff_t *); @r5@ identifier func, write, buffer, lenp, ppos; @@ int func( - struct ctl_table * + const struct ctl_table * ,int write, void *buffer, size_t *lenp, loff_t *ppos); ``` * Code formatting was adjusted in xfs_sysctl.c to comply with code conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler and xfs_deprecated_dointvec_minmax where adjusted. * The ctl_table argument in proc_watchdog_common was const qualified. This is called from a proc_handler itself and is calling back into another proc_handler, making it necessary to change it as part of the proc_handler migration. Co-developed-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Co-developed-by: Joel Granados <j.granados@samsung.com> Signed-off-by: Joel Granados <j.granados@samsung.com> |
||
Petr Machata
|
6d745cd0e9 |
net: nexthop: Initialize all fields in dumped nexthops
struct nexthop_grp contains two reserved fields that are not initialized by
nla_put_nh_group(), and carry garbage. This can be observed e.g. with
strace (edited for clarity):
# ip nexthop add id 1 dev lo
# ip nexthop add id 101 group 1
# strace -e recvmsg ip nexthop get id 101
...
recvmsg(... [{nla_len=12, nla_type=NHA_GROUP},
[{id=1, weight=0, resvd1=0x69, resvd2=0x67}]] ...) = 52
The fields are reserved and therefore not currently used. But as they are, they
leak kernel memory, and the fact they are not just zero complicates repurposing
of the fields for new ends. Initialize the full structure.
Fixes:
|
||
Shigeru Yoshida
|
fa96c6baef |
tipc: Return non-zero value from tipc_udp_addr2str() on error
tipc_udp_addr2str() should return non-zero value if the UDP media
address is invalid. Otherwise, a buffer overflow access can occur in
tipc_media_addr_printf(). Fix this by returning 1 on an invalid UDP
media address.
Fixes:
|
||
Florian Westphal
|
a16909ae99 |
netfilter: nft_set_pipapo_avx2: disable softinterrupts
We need to disable softinterrupts, else we get following problem:
1. pipapo_avx2 called from process context; fpu usable
2. preempt_disable() called, pcpu scratchmap in use
3. softirq handles rx or tx, we re-enter pipapo_avx2
4. fpu busy, fallback to generic non-avx version
5. fallback reuses scratch map and index, which are in use
by the preempted process
Handle this same way as generic version by first disabling
softinterrupts while the scratchmap is in use.
Fixes:
|
||
James Chapman
|
d587d82542 |
l2tp: make session IDR and tunnel session list coherent
Modify l2tp_session_register and l2tp_session_unhash so that the
session IDR and tunnel session lists remain coherent. To do so, hold
the session IDR lock and the tunnel's session list lock when making
any changes to either list.
Without this change, a rare race condition could hit the WARN_ON_ONCE
in l2tp_session_unhash if a thread replaced the IDR entry while
another thread was registering the same ID.
[ 7126.151795][T17511] WARNING: CPU: 3 PID: 17511 at net/l2tp/l2tp_core.c:1282 l2tp_session_delete.part.0+0x87e/0xbc0
[ 7126.163754][T17511] ? show_regs+0x93/0xa0
[ 7126.164157][T17511] ? __warn+0xe5/0x3c0
[ 7126.164536][T17511] ? l2tp_session_delete.part.0+0x87e/0xbc0
[ 7126.165070][T17511] ? report_bug+0x2e1/0x500
[ 7126.165486][T17511] ? l2tp_session_delete.part.0+0x87e/0xbc0
[ 7126.166013][T17511] ? handle_bug+0x99/0x130
[ 7126.166428][T17511] ? exc_invalid_op+0x35/0x80
[ 7126.166890][T17511] ? asm_exc_invalid_op+0x1a/0x20
[ 7126.167372][T17511] ? l2tp_session_delete.part.0+0x87d/0xbc0
[ 7126.167900][T17511] ? l2tp_session_delete.part.0+0x87e/0xbc0
[ 7126.168429][T17511] ? __local_bh_enable_ip+0xa4/0x120
[ 7126.168917][T17511] l2tp_session_delete+0x40/0x50
[ 7126.169369][T17511] pppol2tp_release+0x1a1/0x3f0
[ 7126.169817][T17511] __sock_release+0xb3/0x270
[ 7126.170247][T17511] ? __pfx_sock_close+0x10/0x10
[ 7126.170697][T17511] sock_close+0x1c/0x30
[ 7126.171087][T17511] __fput+0x40b/0xb90
[ 7126.171470][T17511] task_work_run+0x16c/0x260
[ 7126.171897][T17511] ? __pfx_task_work_run+0x10/0x10
[ 7126.172362][T17511] ? srso_alias_return_thunk+0x5/0xfbef5
[ 7126.172863][T17511] ? do_raw_spin_unlock+0x174/0x230
[ 7126.173348][T17511] do_exit+0xaae/0x2b40
[ 7126.173730][T17511] ? srso_alias_return_thunk+0x5/0xfbef5
[ 7126.174235][T17511] ? __pfx_lock_release+0x10/0x10
[ 7126.174690][T17511] ? srso_alias_return_thunk+0x5/0xfbef5
[ 7126.175190][T17511] ? do_raw_spin_lock+0x12c/0x2b0
[ 7126.175650][T17511] ? __pfx_do_exit+0x10/0x10
[ 7126.176072][T17511] ? _raw_spin_unlock_irq+0x23/0x50
[ 7126.176543][T17511] do_group_exit+0xd3/0x2a0
[ 7126.176990][T17511] __x64_sys_exit_group+0x3e/0x50
[ 7126.177456][T17511] x64_sys_call+0x1821/0x1830
[ 7126.177895][T17511] do_syscall_64+0xcb/0x250
[ 7126.178317][T17511] entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes:
|
||
Ido Schimmel
|
cc73bbab4b |
ipv4: Fix incorrect source address in Record Route option
The Record Route IP option records the addresses of the routers that
routed the packet. In the case of forwarded packets, the kernel performs
a route lookup via fib_lookup() and fills in the preferred source
address of the matched route.
The lookup is performed with the DS field of the forwarded packet, but
using the RT_TOS() macro which only masks one of the two ECN bits. If
the packet is ECT(0) or CE, the matched route might be different than
the route via which the packet was forwarded as the input path masks
both of the ECN bits, resulting in the wrong address being filled in the
Record Route option.
Fix by masking both of the ECN bits.
Fixes:
|
||
Linus Torvalds
|
527eff227d |
- In the series "treewide: Refactor heap related implementation",
Kuan-Wei Chiu has significantly reworked the min_heap library code and has taught bcachefs to use the new more generic implementation. - Yury Norov's series "Cleanup cpumask.h inclusion in core headers" reworks the cpumask and nodemask headers to make things generally more rational. - Kuan-Wei Chiu has sent along some maintenance work against our sorting library code in the series "lib/sort: Optimizations and cleanups". - More library maintainance work from Christophe Jaillet in the series "Remove usage of the deprecated ida_simple_xx() API". - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the series "nilfs2: eliminate the call to inode_attach_wb()". - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB command error". - Plus the usual shower of singleton patches all over the place. Please see the relevant changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2GvwAKCRDdBJ7gKXxA jlf/AP48xP5ilIHbtpAKm2z+MvGuTxJQ5VSC0UXFacuCbc93lAEA+Yo+vOVRmh6j fQF2nVKyKLYfSz7yqmCyAaHWohIYLgg= =Stxz -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - In the series "treewide: Refactor heap related implementation", Kuan-Wei Chiu has significantly reworked the min_heap library code and has taught bcachefs to use the new more generic implementation. - Yury Norov's series "Cleanup cpumask.h inclusion in core headers" reworks the cpumask and nodemask headers to make things generally more rational. - Kuan-Wei Chiu has sent along some maintenance work against our sorting library code in the series "lib/sort: Optimizations and cleanups". - More library maintainance work from Christophe Jaillet in the series "Remove usage of the deprecated ida_simple_xx() API". - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the series "nilfs2: eliminate the call to inode_attach_wb()". - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB command error". - Plus the usual shower of singleton patches all over the place. Please see the relevant changelogs for details. * tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits) ia64: scrub ia64 from poison.h watchdog/perf: properly initialize the turbo mode timestamp and rearm counter tsacct: replace strncpy() with strscpy() lib/bch.c: use swap() to improve code test_bpf: convert comma to semicolon init/modpost: conditionally check section mismatch to __meminit* init: remove unused __MEMINIT* macros nilfs2: Constify struct kobj_type nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro math: rational: add missing MODULE_DESCRIPTION() macro lib/zlib: add missing MODULE_DESCRIPTION() macro fs: ufs: add MODULE_DESCRIPTION() lib/rbtree.c: fix the example typo ocfs2: add bounds checking to ocfs2_check_dir_entry() fs: add kernel-doc comments to ocfs2_prepare_orphan_dir() coredump: simplify zap_process() selftests/fpu: add missing MODULE_DESCRIPTION() macro compiler.h: simplify data_race() macro build-id: require program headers to be right after ELF header resource: add missing MODULE_DESCRIPTION() ... |
||
Linus Torvalds
|
d7e78951a8 |
Notably this includes fixes for a s390 build breakage.
Including fixes from netfilter. Current release - new code bugs: - eth: fbnic: fix s390 build. - eth: airoha: fix NULL pointer dereference in airoha_qdma_cleanup_rx_queue() Previous releases - regressions: - flow_dissector: use DEBUG_NET_WARN_ON_ONCE - ipv4: fix incorrect TOS in route get reply - dsa: fix chip-wide frame size config in some drivers Previous releases - always broken: - netfilter: nf_set_pipapo: fix initial map fill - eth: gve: fix XDP TX completion handling when counters overflow Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmaafj4SHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkSsIP/jLODokNb/RcIuOYZVlgn2C5icMeZtqv 6BIliZeE1EcnMWqOnheHaJVklq17ChAYbj/GNN8zQeOQhPA2eFCzPPLl/UpF9/ik wuvk+QQ4EM3T/SwWLKmht9UJioVi66szl3vZ+ByJ4BgGiJWORW1dK/AYzyYrsplk BMpQTs/Q/ekzWJzBtX+1Cz8izX1gl+dXhMezSdg9cW4KaBu1Hqwny954HF6keUQn h7bbAWiOAb5bhqUzgCdgF4gXp+uaWEzG1a1kY1G86NjXC5H0G03+vl37RbY/1kYh lUa/3nfH2V7Sy+MYTvjQe66obVeQeOh/PhRMlbkEVphpKs8XtHfsP433iOMZZn2D Z2Yb4yICnpNwbPmK1fyFvOrY7zGp2yZ2dSDEhowGjcyoQPO6RPnBfvArl66phIOm DWfEq79dYa0eEnCM174BLZbsjcHeeYQX2b8lagUslC0Oel+ijUG26XeMqs5W3at9 QVcMV+aPE7pibpAq4M+qqkldv49QmXRsQai0BMa0+qEPyDKLe2nnf6L+TrDfN7Zf tpiiZZZGpcctZ1cOfyuebB37z/jtHJQ94IfLCD9RzpHlpOtCFycO18adrohhI58Q TaZf7U1CqC/UREckwE5F9ypQhdiQHEH84rgvKT3zFFRYftWTQmBCAUs3JzBPbYwO RO2GDFM/htOu =Q1xF -----END PGP SIGNATURE----- Merge tag 'net-6.11-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter. Notably this includes fixes for a s390 build breakage. Current release - new code bugs: - eth: fbnic: fix s390 build - eth: airoha: fix NULL pointer dereference in airoha_qdma_cleanup_rx_queue() Previous releases - regressions: - flow_dissector: use DEBUG_NET_WARN_ON_ONCE - ipv4: fix incorrect TOS in route get reply - dsa: fix chip-wide frame size config in some drivers Previous releases - always broken: - netfilter: nf_set_pipapo: fix initial map fill - eth: gve: fix XDP TX completion handling when counters overflow" * tag 'net-6.11-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: eth: fbnic: don't build the driver when skb has more than 21 frags net: dsa: b53: Limit chip-wide jumbo frame config to CPU ports net: dsa: mv88e6xxx: Limit chip-wide frame size config to CPU ports net: airoha: Fix NULL pointer dereference in airoha_qdma_cleanup_rx_queue() net: wwan: t7xx: add support for Dell DW5933e ipv4: Fix incorrect TOS in fibmatch route get reply ipv4: Fix incorrect TOS in route get reply net: flow_dissector: use DEBUG_NET_WARN_ON_ONCE driver core: auxiliary bus: Fix documentation of auxiliary_device net: airoha: fix error branch in airoha_dev_xmit and airoha_set_gdm_ports gve: Fix XDP TX completion handling when counters overflow ipvs: properly dereference pe in ip_vs_add_service selftests: netfilter: add test case for recent mismatch bug netfilter: nf_set_pipapo: fix initial map fill netfilter: ctnetlink: use helper function to calculate expect ID eth: fbnic: fix s390 build. |
||
Linus Torvalds
|
f4f92db439 |
virtio: features, fixes, cleanups
Several new features here: - Virtio find vqs API has been reworked (required to fix the scalability issue we have with adminq, which I hope to merge later in the cycle) - vDPA driver for Marvell OCTEON - virtio fs performance improvement - mlx5 migration speedups Fixes, cleanups all over the place. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmaXjQQPHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpnIsH/jVNqAQbe/vaBQdNMdnsA+P9A9unLbYRxYCQ tN73mQRIXKtnZHBRAEbMGq52HPYg8HlN2HJSgyNo6I6t8VD+PiOco7m+3GpmqEcW aXPOPl0BAbVoDgyutxRuuodP8Z61lBx0mG6iOxpzTXOPGlpQqtPCFHO8YnodqnPf tMix/5uAqgZKV2siCbw5DtzwEc0gDHU8qsD0/nyoS5nBDF9yh/ardr5P/qiyFDQH atCNYTOhIFU83pLAaw0fpCGbkt7gxf+5RpWVx3wkYww+/MwvYhsveRvQyaGbBz3n WDtET3SOtVTta98OAGIKCq/2z8f6mYXBP7vXapBgnJG3vwS/poQ= =LYua -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "Several new features here: - Virtio find vqs API has been reworked (required to fix the scalability issue we have with adminq, which I hope to merge later in the cycle) - vDPA driver for Marvell OCTEON - virtio fs performance improvement - mlx5 migration speedups Fixes, cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (56 commits) virtio: rename virtio_find_vqs_info() to virtio_find_vqs() virtio: remove unused virtio_find_vqs() and virtio_find_vqs_ctx() helpers virtio: convert the rest virtio_find_vqs() users to virtio_find_vqs_info() virtio_balloon: convert to use virtio_find_vqs_info() virtiofs: convert to use virtio_find_vqs_info() scsi: virtio_scsi: convert to use virtio_find_vqs_info() virtio_net: convert to use virtio_find_vqs_info() virtio_crypto: convert to use virtio_find_vqs_info() virtio_console: convert to use virtio_find_vqs_info() virtio_blk: convert to use virtio_find_vqs_info() virtio: rename find_vqs_info() op to find_vqs() virtio: remove the original find_vqs() op virtio: call virtio_find_vqs_info() from virtio_find_single_vq() directly virtio: convert find_vqs() op implementations to find_vqs_info() virtio_pci: convert vp_*find_vqs() ops to find_vqs_info() virtio: introduce virtio_queue_info struct and find_vqs_info() config op virtio: make virtio_find_single_vq() call virtio_find_vqs() virtio: make virtio_find_vqs() call virtio_find_vqs_ctx() caif_virtio: use virtio_find_single_vq() for single virtqueue finding vdpa/mlx5: Don't enable non-active VQs in .set_vq_ready() ... |
||
Linus Torvalds
|
4f40c636b2 |
NFS Client Updates for Linux 6.11
New Features: * Add support for large folios * Implement rpcrdma generic device removal notification * Add client support for attribute delegations * Use a LAYOUTRETURN during reboot recovery to report layoutstats and errors * Improve throughput for random buffered writes * Add NVMe support to pnfs/blocklayout Bugfixes: * Fix rpcrdma_reqs_reset() * Avoid soft lockups when using UDP * Fix an nfs/blocklayout premature PR key unregestration * Another fix for EXCHGID4_FLAG_USE_PNFS_DS for DS server * Do not extend writes to the entire folio * Pass explicit offset and count values to tracepoints * Fix a race to wake up sleeping SUNRPC sync tasks * Fix gss_status tracepoint output Cleanups: * Add missing MODULE_DESCRIPTION() macros * Add blocklayout / SCSI layout tracepoints * Remove asm-generic headers from xprtrdma verbs.c * Remove unused 'struct mnt_fhstatus' * Other delegation related cleanups * Other folio related cleanups * Other pNFS related cleanups * Other xprtrdma cleanups -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmaZgr0ACgkQ18tUv7Cl QOv8FxAAnUyYG7Kdbv+5Ko/SFv0imxCb5DQh2XC/hSHNrlKBlDnqe2PANXR9XocL mS0Wry5tZf/T+o+QoKv0HQUdWFlnqKzwclggrekf/lkioU1feWsLe2RzDl1iUh0V 6fwcCyWXW1mYX2CtCaDe+/ZFcoZOMD+bItNHt/RdDScSnS9Jd8GSyocsVKsqaBx6 3wub0FJ4UBgYNoX2T3YyK2JwvO9GLaKIQRJV74rjgPJKjcjhptbcb5MKBmOZrF95 UCcpl4CwvD9RTsSEp0B98UbAFFpk8Nw1tmHF3GmyG/nsrJomDuLKFvbsiq23eHUf XeULZIbjMEzU56vjoTglZA4s7JYx17D0vzdPGUqU4mLN3LPm5LtGLBg2uQoPw/xW 50euLU+ol36mfnQlBsuM/tAXgtoAcT63aNeNRNp8aOL47xA+PC6kWTBK9OaR5+x6 w+d22Dpy+riMk1TRaAVt0ANcENKELsWRFvxkuWCpQhVoQ1h8LigQJzeggEEK7Sa6 5u9H6wCTee2wz746uwA43koj1utuyrLq/5S+qEtCY1pbP3U0A+Gh0Xh00OXiYuzL TgRdksmiAL8cA51WjSrq6HhGLOUJAYLfbdKaVhW+fULxUVwzWhFFaFbbdiq/e4OR 0pfqls8UZWICE51GeTfalEidpKZgV/LxU3QOuVoalWBULyj/TeI= =avTW -----END PGP SIGNATURE----- Merge tag 'nfs-for-6.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client updates from Anna Schumaker: "New Features: - Add support for large folios - Implement rpcrdma generic device removal notification - Add client support for attribute delegations - Use a LAYOUTRETURN during reboot recovery to report layoutstats and errors - Improve throughput for random buffered writes - Add NVMe support to pnfs/blocklayout Bugfixes: - Fix rpcrdma_reqs_reset() - Avoid soft lockups when using UDP - Fix an nfs/blocklayout premature PR key unregestration - Another fix for EXCHGID4_FLAG_USE_PNFS_DS for DS server - Do not extend writes to the entire folio - Pass explicit offset and count values to tracepoints - Fix a race to wake up sleeping SUNRPC sync tasks - Fix gss_status tracepoint output Cleanups: - Add missing MODULE_DESCRIPTION() macros - Add blocklayout / SCSI layout tracepoints - Remove asm-generic headers from xprtrdma verbs.c - Remove unused 'struct mnt_fhstatus' - Other delegation related cleanups - Other folio related cleanups - Other pNFS related cleanups - Other xprtrdma cleanups" * tag 'nfs-for-6.11-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (63 commits) SUNRPC: Fixup gss_status tracepoint error output SUNRPC: Fix a race to wake a sync task nfs: split nfs_read_folio nfs: pass explicit offset/count to trace events nfs: do not extend writes to the entire folio nfs/blocklayout: add support for NVMe nfs: remove nfs_page_length nfs: remove the unused max_deviceinfo_size field from struct pnfs_layoutdriver_type nfs: don't reuse partially completed requests in nfs_lock_and_join_requests nfs: move nfs_wait_on_request to write.c nfs: fold nfs_page_group_lock_subrequests into nfs_lock_and_join_requests nfs: fold nfs_folio_find_and_lock_request into nfs_lock_and_join_requests nfs: simplify nfs_folio_find_and_lock_request nfs: remove nfs_folio_private_request nfs: remove dead code for the old swap over NFS implementation NFSv4.1 another fix for EXCHGID4_FLAG_USE_PNFS_DS for DS server nfs: Block on write congestion nfs: Properly initialize server->writeback nfs: Drop pointless check from nfs_commit_release_pages() nfs/blocklayout: SCSI layout trace points for reservation key reg/unreg ... |
||
Benjamin Coddington
|
ed0172af5d |
SUNRPC: Fix a race to wake a sync task
We've observed NFS clients with sync tasks sleeping in __rpc_execute waiting on RPC_TASK_QUEUED that have not responded to a wake-up from rpc_make_runnable(). I suspect this problem usually goes unnoticed, because on a busy client the task will eventually be re-awoken by another task completion or xprt event. However, if the state manager is draining the slot table, a sync task missing a wake-up can result in a hung client. We've been able to prove that the waker in rpc_make_runnable() successfully calls wake_up_bit() (ie- there's no race to tk_runstate), but the wake_up_bit() call fails to wake the waiter. I suspect the waker is missing the load of the bit's wait_queue_head, so waitqueue_active() is false. There are some very helpful comments about this problem above wake_up_bit(), prepare_to_wait(), and waitqueue_active(). Fix this by inserting smp_mb__after_atomic() before the wake_up_bit(), which pairs with prepare_to_wait() calling set_current_state(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> |
||
Paolo Abeni
|
a1b7dbca14 |
netfilter pull request 24-07-17
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmaYOsUACgkQ1V2XiooU IOSZfhAAiylt0bXfNLh2dGFpigNqejO0y7uJG/lNALA6kbhqkfdaXyiEGI+ZyIUM qtkpk0BUDEhCtpspOVLNtDYcZkKuH7zzt6MsA9BcNm/5AS4XPB56buPsZ94ic3Uk LZB8X/+q+gT+HC3RF4hf92rfJP6WKvt3mO8GGLstg0r6orGX6DywfwD+jCAIS/HS CgDot0SXS0MQZpTUoMugRpGF8wWBANLjUvLYNjIQhV5vLpRvX/Fal+St1+AsFnzq hdtMDOmqJHy0GNwrOdWX5muK+3X5Fg1rHUIgA1SqALd52lpsv62U9JHN6cvV8nqG h24u3v17rtHg82zbKz4pBVHwurDyIT3TM3Pmp8v+b8ZnYmWsyXLr3lqv3GU5Rx14 HWsEi+FL/ND2mfWF+GOn4JqkWy/97eZ78VkekuuWXfzuNAXedp7OUw7RgfhCl5Pr DZPus0rrnl3nCOeQyXGpo0RacFDOhKESa0EBjHS0+S8nv4vpzS+5XbiZctM0FlSu QhwZM2Q+M6Klr0LWkLYyS99s5Cra3z5z/Oud7xtnlBOp81r0O9YVBYopoV6MM0kV b6IHvOvs+5W+dj8RmH6Chq9D+EmFuYGwPoa5uv6zdSn/2ZeXqKeGq/HnlXuqct37 RCbuV8S1IzJi5EgtwbU0uxBclKkiyCtkhqSNrdjSHxLwf7wgI2s= =I/g9 -----END PGP SIGNATURE----- Merge tag 'nf-24-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Call nf_expect_get_id() to delete expectation by ID. By trial and error it is possible to leak the LSB of the expectation address on x86_64. This bug is a leftover when converting the existing code to use nf_expect_get_id(). 2) Incorrect initialization in pipapo set backend leads to packet mismatches. From Florian Westphal. 3) Extend netfilter's selftests to cover for the pipapo set backend, also from Florian. 4) Fix sparse warning in IPVS when adding service, from Chen Hanxiao. netfilter pull request 24-07-17 * tag 'nf-24-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: ipvs: properly dereference pe in ip_vs_add_service selftests: netfilter: add test case for recent mismatch bug netfilter: nf_set_pipapo: fix initial map fill netfilter: ctnetlink: use helper function to calculate expect ID ==================== Link: https://patch.msgid.link/20240717215214.225394-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
Ido Schimmel
|
f036e68212 |
ipv4: Fix incorrect TOS in fibmatch route get reply
The TOS value that is returned to user space in the route get reply is
the one with which the lookup was performed ('fl4->flowi4_tos'). This is
fine when the matched route is configured with a TOS as it would not
match if its TOS value did not match the one with which the lookup was
performed.
However, matching on TOS is only performed when the route's TOS is not
zero. It is therefore possible to have the kernel incorrectly return a
non-zero TOS:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get fibmatch 192.0.2.2 tos 0xfc
192.0.2.0/24 tos 0x1c dev dummy1 proto kernel scope link src 192.0.2.1
Fix by instead returning the DSCP field from the FIB result structure
which was populated during the route lookup.
Output after the patch:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get fibmatch 192.0.2.2 tos 0xfc
192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1
Extend the existing selftests to not only verify that the correct route
is returned, but that it is also returned with correct "tos" value (or
without it).
Fixes:
|
||
Ido Schimmel
|
338bb57e4c |
ipv4: Fix incorrect TOS in route get reply
The TOS value that is returned to user space in the route get reply is
the one with which the lookup was performed ('fl4->flowi4_tos'). This is
fine when the matched route is configured with a TOS as it would not
match if its TOS value did not match the one with which the lookup was
performed.
However, matching on TOS is only performed when the route's TOS is not
zero. It is therefore possible to have the kernel incorrectly return a
non-zero TOS:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get 192.0.2.2 tos 0xfc
192.0.2.2 tos 0x1c dev dummy1 src 192.0.2.1 uid 0
cache
Fix by adding a DSCP field to the FIB result structure (inside an
existing 4 bytes hole), populating it in the route lookup and using it
when filling the route get reply.
Output after the patch:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get 192.0.2.2 tos 0xfc
192.0.2.2 dev dummy1 src 192.0.2.1 uid 0
cache
Fixes:
|
||
Pablo Neira Ayuso
|
120f1c857a |
net: flow_dissector: use DEBUG_NET_WARN_ON_ONCE
The following splat is easy to reproduce upstream as well as in -stable kernels. Florian Westphal provided the following commit: |
||
Chen Hanxiao
|
cbd070a4ae |
ipvs: properly dereference pe in ip_vs_add_service
Use pe directly to resolve sparse warning:
net/netfilter/ipvs/ip_vs_ctl.c:1471:27: warning: dereference of noderef expression
Fixes:
|
||
Michal Luczaj
|
638f326043 |
af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
AF_UNIX socket tracks the most recent OOB packet (in its receive queue)
with an `oob_skb` pointer. BPF redirecting does not account for that: when
an OOB packet is moved between sockets, `oob_skb` is left outdated. This
results in a single skb that may be accessed from two different sockets.
Take the easy way out: silently drop MSG_OOB data targeting any socket that
is in a sockmap or a sockhash. Note that such silent drop is akin to the
fate of redirected skb's scm_fp_list (SCM_RIGHTS, SCM_CREDENTIALS).
For symmetry, forbid MSG_OOB in unix_bpf_recvmsg().
Fixes:
|
||
Linus Torvalds
|
586a7a8542 |
NFSD 6.11 Release Notes
This is a light release containing optimizations, code clean-ups, and minor bug fixes. This development cycle focused on work outside of upstream kernel development: 1. Continuing to build upstream CI for NFSD based on kdevops 2. Continuing to focus on the quality of NFSD in LTS kernels 3. Participation in IETF nfsv4 WG discussions about NFSv4 ACLs, directory delegation, and NFSv4.2 COPY offload Notable features in v6.11 that were not pulled through the NFSD tree include NFS server-side support for the new pNFS NVMe layout type [RFC9561]. Functional testing for pNFS block layouts like this one has been introduced to our kdevops CI harness. Work on improving the resolution of file attribute time stamps in local filesystems is also ongoing tree-wide. As always I am grateful to NFSD contributors, reviewers, testers, and bug reporters who participated during this cycle. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmaVM0cACgkQM2qzM29m f5fzOQ//c5CXIF3zCLIUofm5eZSP2zIszmHR75rVTEnf0Ehm2BJRF6VZiTvWXRpz bOuswxfV1Bds+TofbPIP8jqDcMp8NIXemdb6+QMwh4FDY4M8t1v6TRYt35L6Ulrq bSV81aRS622ofQ35sRzwxpGX6rB6YbB+5L4EKuxdEqRKSB8rCxQcjPy2qypcWlRC hEAGDe3IiVxTz4VQBpASRqbH9Udw/XEqIhv5c8aLtPvl8i+yWyV5m2G5FMRdBj49 u8rCLoPi/mON8TDs2U4pbhcdgfBWWvGS6woFp6qrqM0wzXIPLalWsPGK3DUtuFUg onxsClJXMWUvW4k4hbjiqosduLGY/kMeX62Lx1dCj/gktrJpU0GDNR/XbBhHU+hj UT2CL8AfedC4FQekdyJri/rDgPiTMsf8UE0lgtchHMUXH0ztrjaRxMGiIFMm5vCl dJBMGJfCkKR/+U1YrGRQI0tPL8CJKYI8klOEhLoOsCr/WC9p4nvvAzSg4W9mNK5P ni4+KU4f/bj8U0Ap2bUacTpXj6W8VcwJWeuHahVA1Slo+eqXO401hj4W88dQmm9O ZDR5h+6PI6KoL/KL6I4EyOv+sIEtW3s18cEWbSSu3N/CPuhSGTx8d2J201shJXRN uDdMkvbwv48x20pgD2oTkPrZbJHOL3BK5/WPBg7pwpfkoRrBAhY= =Xd5e -----END PGP SIGNATURE----- Merge tag 'nfsd-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd updates from Chuck Lever: "This is a light release containing optimizations, code clean-ups, and minor bug fixes. This development cycle focused on work outside of upstream kernel development: - Continuing to build upstream CI for NFSD based on kdevops - Continuing to focus on the quality of NFSD in LTS kernels - Participation in IETF nfsv4 WG discussions about NFSv4 ACLs, directory delegation, and NFSv4.2 COPY offload Notable features for v6.11 that do not come through the NFSD tree include NFS server-side support for the new pNFS NVMe layout type [RFC9561]. Functional testing for pNFS block layouts like this one has been introduced to our kdevops CI harness. Work on improving the resolution of file attribute time stamps in local filesystems is also ongoing tree-wide. As always I am grateful to NFSD contributors, reviewers, testers, and bug reporters who participated during this cycle" * tag 'nfsd-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: nfsd_file_lease_notifier_call gets a file_lease as an argument gss_krb5: Fix the error handling path for crypto_sync_skcipher_setkey MAINTAINERS: Add a bugzilla link for NFSD nfsd: new netlink ops to get/set server pool_mode sunrpc: refactor pool_mode setting code nfsd: allow passing in array of thread counts via netlink nfsd: make nfsd_svc take an array of thread counts sunrpc: fix up the special handling of sv_nrpools == 1 SUNRPC: Add a trace point in svc_xprt_deferred_close NFSD: Support write delegations in LAYOUTGET lockd: Use *-y instead of *-objs in Makefile NFSD: Fix nfsdcld warning svcrdma: Handle ADDR_CHANGE CM event properly svcrdma: Refactor the creation of listener CMA ID NFSD: remove unused structs 'nfsd3_voidargs' NFSD: harden svcxdr_dupstr() and svcxdr_tmpalloc() against integer overflows |
||
Florian Westphal
|
791a615b7a |
netfilter: nf_set_pipapo: fix initial map fill
The initial buffer has to be inited to all-ones, but it must restrict
it to the size of the first field, not the total field size.
After each round in the map search step, the result and the fill map
are swapped, so if we have a set where f->bsize of the first element
is smaller than m->bsize_max, those one-bits are leaked into future
rounds result map.
This makes pipapo find an incorrect matching results for sets where
first field size is not the largest.
Followup patch adds a test case to nft_concat_range.sh selftest script.
Thanks to Stefano Brivio for pointing out that we need to zero out
the remainder explicitly, only correcting memset() argument isn't enough.
Fixes:
|
||
Pablo Neira Ayuso
|
782161895e |
netfilter: ctnetlink: use helper function to calculate expect ID
Delete expectation path is missing a call to the nf_expect_get_id()
helper function to calculate the expectation ID, otherwise LSB of the
expectation object address is leaked to userspace.
Fixes:
|
||
Jiri Pirko
|
6c85d6b653 |
virtio: rename virtio_find_vqs_info() to virtio_find_vqs()
Since the original virtio_find_vqs() is no longer present, rename virtio_find_vqs_info() back to virtio_find_vqs(). Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240708074814.1739223-20-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> |
||
Jiri Pirko
|
c95e67bac4 |
virtio: convert the rest virtio_find_vqs() users to virtio_find_vqs_info()
Instead of passing separate names and callbacks arrays to virtio_find_vqs(), have one of virtual_queue_info structs and pass it to virtio_find_vqs_info(). Suggested-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Message-Id: <20240708074814.1739223-18-jiri@resnulli.us> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> |
||
Linus Torvalds
|
51835949dd |
Networking changes for 6.11. Not much excitement - a handful of large
patchsets (devmem among them) did not make it in time. Core & protocols ---------------- - Use local_lock in addition to local_bh_disable() to protect per-CPU resources in networking, a step closer for local_bh_disable() not to act as a big lock on PREEMPT_RT. - Use flex array for netdevice priv area, ensure its cache alignment. - Add a sysctl knob to allow user to specify a default rto_min at socket init time. Bit of a big hammer but multiple companies were independently carrying such patch downstream so clearly it's useful. - Support scheduling transmission of packets based on CLOCK_TAI. - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off using cpusets. - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address. - Allow configuration of multipath hash seed, to both allow synchronizing hashing of two routers, and preventing partial accidental sync. - Improve TCP compliance with RFC 9293 for simultaneous connect(). - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it. - Support sending supervision HSR frames with MAC addresses stored in ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled. - Introduce IPPROTO_SMC for selecting SMC when socket is created. - Allow UDP GSO transmit from devices with no checksum offload. - openvswitch: add packet sampling via psample, separating the sampled traffic from "upcall" packets sent to user space for forwarding. - nf_tables: shrink memory consumption for transaction objects. Things we sprinkled into general kernel code -------------------------------------------- - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for QCA6390). - Add IRQ information in sysfs for auxiliary bus. - Introduce guard definition for local_lock. - Add aligned flavor of __cacheline_group_{begin, end}() markings for grouping fields in structures. BPF --- - Notify user space (via epoll) when a struct_ops object is getting detached/unregistered. - Add new kfuncs for a generic, open-coded bits iterator. - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head. - Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible WRT BTF from modules. - Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs. - riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter. - Add the capability to offload the netfilter flowtable in XDP layer through kfuncs. Driver API ---------- - Allow users to configure IRQ tresholds between which automatic IRQ moderation can choose. - Expand Power Sourcing (PoE) status with power, class and failure reason. Support setting power limits. - Track additional RSS contexts in the core, make sure configuration changes don't break them. - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths. - Support updating firmware on SFP modules. Tests and tooling ----------------- - mptcp: use net/lib.sh to manage netns. - TCP-AO and TCP-MD5: replace debug prints used by tests with tracepoints. - openvswitch: make test self-contained (don't depend on OvS CLI tools). Drivers ------- - Ethernet high-speed NICs: - Broadcom (bnxt): - increase the max total outstanding PTP TX packets to 4 - add timestamping statistics support - implement netdev_queue_mgmt_ops - support new RSS context API - Intel (100G, ice, idpf): - implement FEC statistics and dumping signal quality indicators - support E825C products (with 56Gbps PHYs) - nVidia/Mellanox: - support HW-GRO - mlx4/mlx5: support per-queue statistics via netlink - obey the max number of EQs setting in sub-functions - AMD/Solarflare: - support new RSS context API - AMD/Pensando: - ionic: rework fix for doorbell miss to lower overhead and skip it on new HW - Wangxun: - txgbe: support Flow Director perfect filters - Ethernet NICs consumer, embedded and virtual: - Add driver for Tehuti Networks TN40xx chips - Add driver for Meta's internal NIC chips - Add driver for Ethernet MAC on Airoha EN7581 SoCs - Add driver for Renesas Ethernet-TSN devices - Google cloud vNIC: - flow steering support - Microsoft vNIC: - support page sizes other than 4KB on ARM64 - vmware vNIC: - support latency measurement (update to version 9) - VirtIO net: - support for Byte Queue Limits - support configuring thresholds for automatic IRQ moderation - support for AF_XDP Rx zero-copy - Synopsys (stmmac): - support for STM32MP13 SoC - let platforms select the right PCS implementation - TI: - icssg-prueth: add multicast filtering support - icssg-prueth: enable PTP timestamping and PPS - Renesas: - ravb: improve Rx performance 30-400% by using page pool, theaded NAPI and timer-based IRQ coalescing - ravb: add MII support for R-Car V4M - Cadence (macb): - macb: add ARP support to Wake-On-LAN - Cortina: - use phylib for RX and TX pause configuration - Ethernet switches: - nVidia/Mellanox: - support configuration of multipath hash seed - report more accurate max MTU - use page_pool to improve Rx performance - MediaTek: - mt7530: add support for bridge port isolation - Qualcomm: - qca8k: add support for bridge port isolation - Microchip: - lan9371/2: add 100BaseTX PHY support - NXP: - vsc73xx: implement VLAN operations - Ethernet PHYs: - aquantia: enable support for aqr115c - aquantia: add support for PHY LEDs - realtek: add support for rtl8224 2.5Gbps PHY - xpcs: add memory-mapped device support - add BroadR-Reach link mode and support in Broadcom's PHY driver - CAN: - add document for ISO 15765-2 protocol support - mcp251xfd: workaround for erratum DS80000789E, use timestamps to catch when device returns incorrect FIFO status - WiFi: - mac80211/cfg80211: - parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers - improvements for 6 GHz regulatory flexibility - multi-link improvements - support multiple radios per wiphy - remove DEAUTH_NEED_MGD_TX_PREP flag - Intel (iwlwifi): - bump FW API to 91 for BZ/SC devices - report 64-bit radiotap timestamp - enable P2P low latency by default - handle Transmit Power Envelope (TPE) advertised by AP - remove support for older FW for new devices - fast resume (keeping the device configured) - mvm: re-enable Multi-Link Operation (MLO) - aggregation (A-MSDU) optimizations - MediaTek (mt76): - mt7925 Multi-Link Operation (MLO) support - Qualcomm (ath10k): - LED support for various chipsets - Qualcomm (ath12k): - remove unsupported Tx monitor handling - support channel 2 in 6 GHz band - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA) - support dynamic VLAN - add panic handler for resetting the firmware state - DebugFS support for datapath statistics - WCN7850: support for Wake on WLAN - Microchip (wilc1000): - read MAC address during probe to make it visible to user space - suspend/resume improvements - TI (wl18xx): - support newer firmware versions - RealTek (rtw89): - preparation for RTL8852BE-VT support - Wake on WLAN support for WiFi 6 chips - 36-bit PCI DMA support - RealTek (rtlwifi): - RTL8192DU support - Broadcom (brcmfmac): - Management Frame Protection support (to enable WPA3) - Bluetooth: - qualcomm: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: add BCM4388 support - btintel: add support for BlazarU core - btintel: add support for Whale Peak2 - btnxpuart: add support for AW693 A1 chipset - btnxpuart: add support for IW615 chipset - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591 Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaWjBwACgkQMUZtbf5S IrvuSRAAkJuEzTRqgURBCe4eNEQde6mJJig7l2CKHwCbFiHZpRkFHf8qKbcGWbL6 uLW33SWnKtJVDhxVKWHLq635XW7BAa80YhqGw21GDi+mIEhWXZglHj3xbXNxsMfE 4eg/kG4BkfYWFmHaXOwVWV/mr7nXf6j7WmXNeXEi32ufE1j0OL+YlQenKnMj8yP2 j9JmYa2Chwppng1SblHmcjmGkdNVwFhStKeCG+2K7v06wdDH/QYBlbgUv9gw/cxp NlW//wgiaeX40U4O3kDwt9C+LDoh+0VrDDeVdQ+IsScLtY3PhAzEoKolFYTq2HSr I1JpoaHNnyNsJq3DZrACQ5WlH4yDn6C2EUB6dxNnFaI9F1ZPsi+7MTl6Sei1AklD TuQTj/lxOACBwW2Q77NU72uoxiIUauesGPHcnrAFuoCIEhZF0mso7k59BvrXhsOP QwcLbQdc1YHNkqv/Vc7NBY+ruMsYB+5Ubbhhj2p27dp/CWFIwxI29fze4dn2uhO6 ejHN3mbqwPdSzg12YJtM6Iq61Cnwo2eVSvhTxl+ZVSZtI4nu2arzR+y7QTYmNrXP 6tkgVN9UsWeLl2xJ8wyyqL5mcvNHP2rPXWZ2X56iTaa26m+UlleeQ7YRaYtQAAr0 Ec/vlDMX64SwHhd+qwE99DXGQf2g+KklHKSLsnajJUVrWFTlRI0= =opz8 -----END PGP SIGNATURE----- Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Not much excitement - a handful of large patchsets (devmem among them) did not make it in time. Core & protocols: - Use local_lock in addition to local_bh_disable() to protect per-CPU resources in networking, a step closer for local_bh_disable() not to act as a big lock on PREEMPT_RT - Use flex array for netdevice priv area, ensure its cache alignment - Add a sysctl knob to allow user to specify a default rto_min at socket init time. Bit of a big hammer but multiple companies were independently carrying such patch downstream so clearly it's useful - Support scheduling transmission of packets based on CLOCK_TAI - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off using cpusets - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address - Allow configuration of multipath hash seed, to both allow synchronizing hashing of two routers, and preventing partial accidental sync - Improve TCP compliance with RFC 9293 for simultaneous connect() - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace IKE daemon had to do this before, but the kernel can better keep track of it - Support sending supervision HSR frames with MAC addresses stored in ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled - Introduce IPPROTO_SMC for selecting SMC when socket is created - Allow UDP GSO transmit from devices with no checksum offload - openvswitch: add packet sampling via psample, separating the sampled traffic from "upcall" packets sent to user space for forwarding - nf_tables: shrink memory consumption for transaction objects Things we sprinkled into general kernel code: - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for QCA6390) [ Already merged separately - Linus ] - Add IRQ information in sysfs for auxiliary bus - Introduce guard definition for local_lock - Add aligned flavor of __cacheline_group_{begin, end}() markings for grouping fields in structures BPF: - Notify user space (via epoll) when a struct_ops object is getting detached/unregistered - Add new kfuncs for a generic, open-coded bits iterator - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head - Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible WRT BTF from modules - Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs - riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter - Add the capability to offload the netfilter flowtable in XDP layer through kfuncs Driver API: - Allow users to configure IRQ tresholds between which automatic IRQ moderation can choose - Expand Power Sourcing (PoE) status with power, class and failure reason. Support setting power limits - Track additional RSS contexts in the core, make sure configuration changes don't break them - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP data paths - Support updating firmware on SFP modules Tests and tooling: - mptcp: use net/lib.sh to manage netns - TCP-AO and TCP-MD5: replace debug prints used by tests with tracepoints - openvswitch: make test self-contained (don't depend on OvS CLI tools) Drivers: - Ethernet high-speed NICs: - Broadcom (bnxt): - increase the max total outstanding PTP TX packets to 4 - add timestamping statistics support - implement netdev_queue_mgmt_ops - support new RSS context API - Intel (100G, ice, idpf): - implement FEC statistics and dumping signal quality indicators - support E825C products (with 56Gbps PHYs) - nVidia/Mellanox: - support HW-GRO - mlx4/mlx5: support per-queue statistics via netlink - obey the max number of EQs setting in sub-functions - AMD/Solarflare: - support new RSS context API - AMD/Pensando: - ionic: rework fix for doorbell miss to lower overhead and skip it on new HW - Wangxun: - txgbe: support Flow Director perfect filters - Ethernet NICs consumer, embedded and virtual: - Add driver for Tehuti Networks TN40xx chips - Add driver for Meta's internal NIC chips - Add driver for Ethernet MAC on Airoha EN7581 SoCs - Add driver for Renesas Ethernet-TSN devices - Google cloud vNIC: - flow steering support - Microsoft vNIC: - support page sizes other than 4KB on ARM64 - vmware vNIC: - support latency measurement (update to version 9) - VirtIO net: - support for Byte Queue Limits - support configuring thresholds for automatic IRQ moderation - support for AF_XDP Rx zero-copy - Synopsys (stmmac): - support for STM32MP13 SoC - let platforms select the right PCS implementation - TI: - icssg-prueth: add multicast filtering support - icssg-prueth: enable PTP timestamping and PPS - Renesas: - ravb: improve Rx performance 30-400% by using page pool, theaded NAPI and timer-based IRQ coalescing - ravb: add MII support for R-Car V4M - Cadence (macb): - macb: add ARP support to Wake-On-LAN - Cortina: - use phylib for RX and TX pause configuration - Ethernet switches: - nVidia/Mellanox: - support configuration of multipath hash seed - report more accurate max MTU - use page_pool to improve Rx performance - MediaTek: - mt7530: add support for bridge port isolation - Qualcomm: - qca8k: add support for bridge port isolation - Microchip: - lan9371/2: add 100BaseTX PHY support - NXP: - vsc73xx: implement VLAN operations - Ethernet PHYs: - aquantia: enable support for aqr115c - aquantia: add support for PHY LEDs - realtek: add support for rtl8224 2.5Gbps PHY - xpcs: add memory-mapped device support - add BroadR-Reach link mode and support in Broadcom's PHY driver - CAN: - add document for ISO 15765-2 protocol support - mcp251xfd: workaround for erratum DS80000789E, use timestamps to catch when device returns incorrect FIFO status - WiFi: - mac80211/cfg80211: - parse Transmit Power Envelope (TPE) data in mac80211 instead of in drivers - improvements for 6 GHz regulatory flexibility - multi-link improvements - support multiple radios per wiphy - remove DEAUTH_NEED_MGD_TX_PREP flag - Intel (iwlwifi): - bump FW API to 91 for BZ/SC devices - report 64-bit radiotap timestamp - enable P2P low latency by default - handle Transmit Power Envelope (TPE) advertised by AP - remove support for older FW for new devices - fast resume (keeping the device configured) - mvm: re-enable Multi-Link Operation (MLO) - aggregation (A-MSDU) optimizations - MediaTek (mt76): - mt7925 Multi-Link Operation (MLO) support - Qualcomm (ath10k): - LED support for various chipsets - Qualcomm (ath12k): - remove unsupported Tx monitor handling - support channel 2 in 6 GHz band - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID Advertisements (EMA) - support dynamic VLAN - add panic handler for resetting the firmware state - DebugFS support for datapath statistics - WCN7850: support for Wake on WLAN - Microchip (wilc1000): - read MAC address during probe to make it visible to user space - suspend/resume improvements - TI (wl18xx): - support newer firmware versions - RealTek (rtw89): - preparation for RTL8852BE-VT support - Wake on WLAN support for WiFi 6 chips - 36-bit PCI DMA support - RealTek (rtlwifi): - RTL8192DU support - Broadcom (brcmfmac): - Management Frame Protection support (to enable WPA3) - Bluetooth: - qualcomm: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: add BCM4388 support - btintel: add support for BlazarU core - btintel: add support for Whale Peak2 - btnxpuart: add support for AW693 A1 chipset - btnxpuart: add support for IW615 chipset - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591" * tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits) eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering" tcp: Replace strncpy() with strscpy() wifi: ath12k: fix build vs old compiler tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child(). eth: fbnic: Write the TCAM tables used for RSS control and Rx to host eth: fbnic: Add L2 address programming eth: fbnic: Add basic Rx handling eth: fbnic: Add basic Tx handling eth: fbnic: Add link detection eth: fbnic: Add initial messaging to notify FW of our presence eth: fbnic: Implement Rx queue alloc/start/stop/free eth: fbnic: Implement Tx queue alloc/start/stop/free eth: fbnic: Allocate a netdevice and napi vectors with queues eth: fbnic: Add FW communication mechanism eth: fbnic: Add message parsing for FW messages eth: fbnic: Add register init to set PCIe/Ethernet device config eth: fbnic: Allocate core device specific structures and devlink interface eth: fbnic: Add scaffolding for Meta's NIC driver PCI: Add Meta Platforms vendor ID net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK ... |
||
Linus Torvalds
|
f8a8b94d06 |
sysctl changes for 6.11-rc1
Summary * Remove "->procname == NULL" check when iterating through sysctl table arrays Removing sentinels in ctl_table arrays reduces the build time size and runtime memory consumed by ~64 bytes per array. With all ctl_table sentinels gone, the additional check for ->procname == NULL that worked in tandem with the ARRAY_SIZE to calculate the size of the ctl_table arrays is no longer needed and has been removed. The sysctl register functions now returns an error if a sentinel is used. * Preparation patches for sysctl constification Constifying ctl_table structs prevents the modification of proc_handler function pointers as they would reside in .rodata. The ctl_table arguments in sysctl utility functions are const qualified in preparation for a future treewide proc_handler argument constification commit. * Misc fixes Increase robustness of set_ownership by providing sane default ownership values in case the callee doesn't set them. Bound check proc_dou8vec_minmax to avoid loading buggy modules and give sysctl testing module a name to avoid compiler complaints. Testing * This got push to linux-next in v6.10-rc2, so it has had more than a month of testing -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmaWdz4ACgkQupfNUreW QU/WKQwAkSuUz42yCQye77BK+Z8ANcTF1f3aI/wfv2nahq1GaSrNBpqUiXvEe9Tt KD2lM1PWiQfizVLIDPh96yxa5q69GQrPPOA/V1jwIXmk/HRpjjoONCFNNXVRCTls VCqDz/RatuXvzO35Yn87MnWnxv6PiX7X/zq/3WikVsUI381kvTgC6OwZxdFM52w4 ESwOa3LeOovtRnqV5dpHr6DCQKyd0N52nPxgXvaerjlsJsv7PlezN7z9YyLOOfmW xUD7X6LQcJq7HcEukaB6I9o2GQOi4yYXL2YOzed7qu9Thu+lasEoN3Bd7P+ilXkc JY6EXJ5o+d69PewKRuJ1QvD7wrHIkhNMNbMtvehNay124wAHDy3KtonFzyvlX4wE qCHBYc6rySJNhSqwVp9MoksOZfDM99pVIOs9YVIjc90Zzu5J7tORgYWRVOHTcAtj fd8nMdkK3+ZANapygFCyew6GueIzaqlQwveVgLGw4vc5L3ClknmURit3y487Pzdg B+BEVlsp =bs2G -----END PGP SIGNATURE----- Merge tag 'sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Remove "->procname == NULL" check when iterating through sysctl table arrays Removing sentinels in ctl_table arrays reduces the build time size and runtime memory consumed by ~64 bytes per array. With all ctl_table sentinels gone, the additional check for ->procname == NULL that worked in tandem with the ARRAY_SIZE to calculate the size of the ctl_table arrays is no longer needed and has been removed. The sysctl register functions now returns an error if a sentinel is used. - Preparation patches for sysctl constification Constifying ctl_table structs prevents the modification of proc_handler function pointers as they would reside in .rodata. The ctl_table arguments in sysctl utility functions are const qualified in preparation for a future treewide proc_handler argument constification commit. - Misc fixes Increase robustness of set_ownership by providing sane default ownership values in case the callee doesn't set them. Bound check proc_dou8vec_minmax to avoid loading buggy modules and give sysctl testing module a name to avoid compiler complaints. * tag 'sysctl-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: Warn on an empty procname element sysctl: Remove ctl_table sentinel code comments sysctl: Remove "child" sysctl code comments sysctl: Remove superfluous empty allocations from sysctl internals sysctl: Replace nr_entries with ctl_table_size in new_links sysctl: Remove check for sentinel element in ctl_table arrays mm profiling: Remove superfluous sentinel element from ctl_table locking: Remove superfluous sentinel element from kern_lockdep_table sysctl: Add module description to sysctl-testing sysctl: constify ctl_table arguments of utility function utsname: constify ctl_table arguments of utility function sysctl: move the extra1/2 boundary check of u8 to sysctl_check_table_array sysctl: always initialize i_uid/i_gid |
||
Kees Cook
|
a3bfc09506 |
tcp: Replace strncpy() with strscpy()
Replace the deprecated[1] uses of strncpy() in tcp_ca_get_name_by_key() and tcp_get_default_congestion_control(). The callers use the results as standard C strings (via nla_put_string() and proc handlers respectively), so trailing padding is not needed. Since passing the destination buffer arguments decays it to a pointer, the size can't be trivially determined by the compiler. ca->name is the same length in both cases, so strscpy() won't fail (when ca->name is NUL-terminated). Include the length explicitly instead of using the 2-argument strscpy(). Link: https://github.com/KSPP/linux/issues/90 [1] Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240714041111.it.918-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Kuniyuki Iwashima
|
3f45181358 |
tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
syzkaller reported KMSAN splat in tcp_create_openreq_child(). [0]
The uninit variable is tcp_rsk(req)->ao_keyid.
tcp_rsk(req)->ao_keyid is initialised only when tcp_conn_request() finds
a valid TCP AO option in SYN. Then, tcp_rsk(req)->used_tcp_ao is set
accordingly.
Let's not read tcp_rsk(req)->ao_keyid when tcp_rsk(req)->used_tcp_ao is
false.
[0]:
BUG: KMSAN: uninit-value in tcp_create_openreq_child+0x198b/0x1ff0 net/ipv4/tcp_minisocks.c:610
tcp_create_openreq_child+0x198b/0x1ff0 net/ipv4/tcp_minisocks.c:610
tcp_v4_syn_recv_sock+0x18e/0x2170 net/ipv4/tcp_ipv4.c:1754
tcp_check_req+0x1a3e/0x20c0 net/ipv4/tcp_minisocks.c:852
tcp_v4_rcv+0x26a4/0x53a0 net/ipv4/tcp_ipv4.c:2265
ip_protocol_deliver_rcu+0x884/0x1270 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x30f/0x530 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_local_deliver+0x230/0x4c0 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:460 [inline]
ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
ip_list_rcv_finish net/ipv4/ip_input.c:631 [inline]
ip_sublist_rcv+0x10f7/0x13e0 net/ipv4/ip_input.c:639
ip_list_rcv+0x952/0x9c0 net/ipv4/ip_input.c:674
__netif_receive_skb_list_ptype net/core/dev.c:5703 [inline]
__netif_receive_skb_list_core+0xd92/0x11d0 net/core/dev.c:5751
__netif_receive_skb_list net/core/dev.c:5803 [inline]
netif_receive_skb_list_internal+0xd8f/0x1350 net/core/dev.c:5895
gro_normal_list include/net/gro.h:515 [inline]
napi_complete_done+0x3f2/0x990 net/core/dev.c:6246
e1000_clean+0x1fa4/0x5e50 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
__napi_poll+0xd9/0x990 net/core/dev.c:6771
napi_poll net/core/dev.c:6840 [inline]
net_rx_action+0x90f/0x17e0 net/core/dev.c:6962
handle_softirqs+0x152/0x6b0 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu kernel/softirq.c:637 [inline]
irq_exit_rcu+0x5d/0x120 kernel/softirq.c:649
common_interrupt+0x83/0x90 arch/x86/kernel/irq.c:278
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
__msan_instrument_asm_store+0xd6/0xe0
arch_atomic_inc arch/x86/include/asm/atomic.h:53 [inline]
raw_atomic_inc include/linux/atomic/atomic-arch-fallback.h:992 [inline]
atomic_inc include/linux/atomic/atomic-instrumented.h:436 [inline]
page_ref_inc include/linux/page_ref.h:153 [inline]
folio_ref_inc include/linux/page_ref.h:160 [inline]
filemap_map_order0_folio mm/filemap.c:3596 [inline]
filemap_map_pages+0x11c7/0x2270 mm/filemap.c:3644
do_fault_around mm/memory.c:4879 [inline]
do_read_fault mm/memory.c:4912 [inline]
do_fault mm/memory.c:5051 [inline]
do_pte_missing mm/memory.c:3897 [inline]
handle_pte_fault mm/memory.c:5381 [inline]
__handle_mm_fault mm/memory.c:5524 [inline]
handle_mm_fault+0x3677/0x6f00 mm/memory.c:5689
do_user_addr_fault+0x1373/0x2b20 arch/x86/mm/fault.c:1338
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x54/0xc0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
Uninit was stored to memory at:
tcp_create_openreq_child+0x1984/0x1ff0 net/ipv4/tcp_minisocks.c:611
tcp_v4_syn_recv_sock+0x18e/0x2170 net/ipv4/tcp_ipv4.c:1754
tcp_check_req+0x1a3e/0x20c0 net/ipv4/tcp_minisocks.c:852
tcp_v4_rcv+0x26a4/0x53a0 net/ipv4/tcp_ipv4.c:2265
ip_protocol_deliver_rcu+0x884/0x1270 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x30f/0x530 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_local_deliver+0x230/0x4c0 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:460 [inline]
ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
ip_list_rcv_finish net/ipv4/ip_input.c:631 [inline]
ip_sublist_rcv+0x10f7/0x13e0 net/ipv4/ip_input.c:639
ip_list_rcv+0x952/0x9c0 net/ipv4/ip_input.c:674
__netif_receive_skb_list_ptype net/core/dev.c:5703 [inline]
__netif_receive_skb_list_core+0xd92/0x11d0 net/core/dev.c:5751
__netif_receive_skb_list net/core/dev.c:5803 [inline]
netif_receive_skb_list_internal+0xd8f/0x1350 net/core/dev.c:5895
gro_normal_list include/net/gro.h:515 [inline]
napi_complete_done+0x3f2/0x990 net/core/dev.c:6246
e1000_clean+0x1fa4/0x5e50 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
__napi_poll+0xd9/0x990 net/core/dev.c:6771
napi_poll net/core/dev.c:6840 [inline]
net_rx_action+0x90f/0x17e0 net/core/dev.c:6962
handle_softirqs+0x152/0x6b0 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu kernel/softirq.c:637 [inline]
irq_exit_rcu+0x5d/0x120 kernel/softirq.c:649
common_interrupt+0x83/0x90 arch/x86/kernel/irq.c:278
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
Uninit was created at:
__alloc_pages_noprof+0x82d/0xcb0 mm/page_alloc.c:4706
__alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
alloc_slab_page mm/slub.c:2265 [inline]
allocate_slab mm/slub.c:2428 [inline]
new_slab+0x2af/0x14e0 mm/slub.c:2481
___slab_alloc+0xf73/0x3150 mm/slub.c:3667
__slab_alloc mm/slub.c:3757 [inline]
__slab_alloc_node mm/slub.c:3810 [inline]
slab_alloc_node mm/slub.c:3990 [inline]
kmem_cache_alloc_noprof+0x53a/0x9f0 mm/slub.c:4009
reqsk_alloc_noprof net/ipv4/inet_connection_sock.c:920 [inline]
inet_reqsk_alloc+0x63/0x700 net/ipv4/inet_connection_sock.c:951
tcp_conn_request+0x339/0x4860 net/ipv4/tcp_input.c:7177
tcp_v4_conn_request+0x13b/0x190 net/ipv4/tcp_ipv4.c:1719
tcp_rcv_state_process+0x2dd/0x4a10 net/ipv4/tcp_input.c:6711
tcp_v4_do_rcv+0xbee/0x10d0 net/ipv4/tcp_ipv4.c:1932
tcp_v4_rcv+0x3fad/0x53a0 net/ipv4/tcp_ipv4.c:2334
ip_protocol_deliver_rcu+0x884/0x1270 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x30f/0x530 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_local_deliver+0x230/0x4c0 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:460 [inline]
ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline]
ip_list_rcv_finish net/ipv4/ip_input.c:631 [inline]
ip_sublist_rcv+0x10f7/0x13e0 net/ipv4/ip_input.c:639
ip_list_rcv+0x952/0x9c0 net/ipv4/ip_input.c:674
__netif_receive_skb_list_ptype net/core/dev.c:5703 [inline]
__netif_receive_skb_list_core+0xd92/0x11d0 net/core/dev.c:5751
__netif_receive_skb_list net/core/dev.c:5803 [inline]
netif_receive_skb_list_internal+0xd8f/0x1350 net/core/dev.c:5895
gro_normal_list include/net/gro.h:515 [inline]
napi_complete_done+0x3f2/0x990 net/core/dev.c:6246
e1000_clean+0x1fa4/0x5e50 drivers/net/ethernet/intel/e1000/e1000_main.c:3808
__napi_poll+0xd9/0x990 net/core/dev.c:6771
napi_poll net/core/dev.c:6840 [inline]
net_rx_action+0x90f/0x17e0 net/core/dev.c:6962
handle_softirqs+0x152/0x6b0 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu kernel/softirq.c:637 [inline]
irq_exit_rcu+0x5d/0x120 kernel/softirq.c:649
common_interrupt+0x83/0x90 arch/x86/kernel/irq.c:278
asm_common_interrupt+0x26/0x40 arch/x86/include/asm/idtentry.h:693
CPU: 0 PID: 239 Comm: modprobe Tainted: G B 6.10.0-rc7-01816-g852e42cc2dd4 #3 1107521f0c7b55c9309062382d0bda9f604dbb6d
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Fixes:
|
||
Linus Torvalds
|
3a56e24173 |
for-6.11/io_uring-20240714
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmaTgusQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpr+1EAC4I7pRAM341sfmhe/9QQKMM8VzGwy5Tlr1 AFLO3BujRTl6X8S9fQjIjN1coW6u4F42I19+vVlxqvB7CUnqt9VWpexEjxe4K0FR R+hIZW+fWV9K/eMrcsLcI7oReN5kIihHOzzy3wz0rENoGB5dCl6JAZMHDUCSqP0/ ZJJQ5ut8ah20Y/myHnzP5o4TfdE7nGo73Di2YoE2g3KqeX/dlAKW9+5hqKzzrHhM 2U25k/6KLy0ROzKpy2qW0QRE3pT5udoHLK2ue9+XwXF8JWVTlfVkHBzGY7NstyyT z07SEzW1q4xV1HdCwGDAU7cL2NJMRXSG0p2WZTm8QyaVTdsZQvEx08GLsVdLvFH5 Gg+oOaxVE+INzW+/Lwz7lFHgq6XEjdAlEAOXDtGkZoni6Rt6iCzFCW6RTf/guy8o Cub7tatMyegxai9+FTN/oFVoydRR0tsMf0OHrWnLOperh9CaxAwXvmKFeT/UTwiB KIuIOJop7aThJbiV42a/xwTrEjNMZRv6uVBBEtJX3rxpmIhqTbjcAv9rKMmgtLMk s6yX1MvYdOLhhEDyoUBX0dJdEETBf3KbnYIwi8kb4Sbkw/ZDgnkmSxFysom61wUF byAFEpah3ZFR8aES0uNKUE6UHK6i5qqp0Za/n6gA927E/WGCU9ndaS+01gyknog0 8FqFYwruHQ== =50CO -----END PGP SIGNATURE----- Merge tag 'for-6.11/io_uring-20240714' of git://git.kernel.dk/linux Pull io_uring updates from Jens Axboe: "Here are the io_uring updates queued up for 6.11. Nothing major this time around, various minor improvements and cleanups/fixes. This contains: - Add bind/listen opcodes. Main motivation is to support direct descriptors, to avoid needing a regular fd just for doing these two operations (Gabriel) - Probe fixes (Gabriel) - Treat io-wq work flags as atomics. Not fixing a real issue, but may as well and it silences a KCSAN warning (me) - Cleanup of rsrc __set_current_state() usage (me) - Add 64-bit for {m,f}advise operations (me) - Improve performance of data ring messages (me) - Fix for ring message overflow posting (Pavel) - Fix for freezer interaction with TWA_NOTIFY_SIGNAL. Not strictly an io_uring thing, but since TWA_NOTIFY_SIGNAL was originally added for faster task_work signaling for io_uring, bundling it with this pull (Pavel) - Add Pavel as a co-maintainer - Various cleanups (me, Thorsten)" * tag 'for-6.11/io_uring-20240714' of git://git.kernel.dk/linux: (28 commits) io_uring/net: check socket is valid in io_bind()/io_listen() kernel: rerun task_work while freezing in get_signal() io_uring/io-wq: limit retrying worker initialisation io_uring/napi: Remove unnecessary s64 cast io_uring/net: cleanup io_recv_finish() bundle handling io_uring/msg_ring: fix overflow posting MAINTAINERS: change Pavel Begunkov from io_uring reviewer to maintainer io_uring/msg_ring: use kmem_cache_free() to free request io_uring/msg_ring: check for dead submitter task io_uring/msg_ring: add an alloc cache for io_kiocb entries io_uring/msg_ring: improve handling of target CQE posting io_uring: add io_add_aux_cqe() helper io_uring: add remote task_work execution helper io_uring/msg_ring: tighten requirement for remote posting io_uring: Allocate only necessary memory in io_probe io_uring: Fix probe of disabled operations io_uring: Introduce IORING_OP_LISTEN io_uring: Introduce IORING_OP_BIND net: Split a __sys_listen helper for io_uring net: Split a __sys_bind helper for io_uring ... |
||
Jakub Kicinski
|
51b35d4f9d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in late fixes to prepare for the 6.11 net-next PR. Conflicts: |
||
Asbjørn Sloth Tønnesen
|
536b97acdd |
net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
NL_REQ_ATTR_CHECK() is used in fl_set_key_flags() to set extended attributes about the origin of an error, this patch propagates tca[TCA_OPTIONS] through. Before this patch: $ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/tc.yaml \ --do newtfilter --json '{ "chain": 0, "family": 0, "handle": 4, "ifindex": 22, "info": 262152, "kind": "flower", "options": { "flags": 0, "key-enc-flags": 8, "key-eth-type": 2048 }, "parent": 4294967283 }' Netlink error: Invalid argument nl_len = 68 (52) nl_flags = 0x300 nl_type = 2 error: -22 extack: {'msg': 'Missing flags mask', 'miss-type': 111} After this patch: [same cmd] Netlink error: Invalid argument nl_len = 76 (60) nl_flags = 0x300 nl_type = 2 error: -22 extack: {'msg': 'Missing flags mask', 'miss-type': 111, 'miss-nest': 56} Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20240713021911.1631517-14-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
706bf4f44c |
flow_dissector: set encapsulation control flags for non-IP
Make sure to set encapsulated control flags also for non-IP packets, such that it's possible to allow matching on e.g. TUNNEL_OAM on a geneve packet carrying a non-IP packet. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-13-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
db5271d50e |
flow_dissector: cleanup FLOW_DISSECTOR_KEY_ENC_FLAGS
Now that TCA_FLOWER_KEY_ENC_FLAGS is unused, as it's former data is stored behind TCA_FLOWER_KEY_ENC_CONTROL, then remove the last bits of FLOW_DISSECTOR_KEY_ENC_FLAGS. FLOW_DISSECTOR_KEY_ENC_FLAGS is unreleased, and have been in net-next since 2024-06-04. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-12-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
11036bd7a0 |
net/sched: cls_flower: rework TCA_FLOWER_KEY_ENC_FLAGS usage
This patch changes how TCA_FLOWER_KEY_ENC_FLAGS is used, so that it is used with TCA_FLOWER_KEY_FLAGS_* flags, in the same way as TCA_FLOWER_KEY_FLAGS is currently used. Where TCA_FLOWER_KEY_FLAGS uses {key,mask}->control.flags, then TCA_FLOWER_KEY_ENC_FLAGS now uses {key,mask}->enc_control.flags, therefore {key,mask}->enc_flags is now unused. As the generic fl_set_key_flags/fl_dump_key_flags() is used with encap set to true, then fl_{set,dump}_key_enc_flags() is removed. This breaks unreleased userspace API (net-next since 2024-06-04). Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-10-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
988f8723d3 |
net/sched: cls_flower: add tunnel flags to fl_{set,dump}_key_flags()
Prepare to set and dump the tunnel flags. This code won't see any of these flags yet, as these flags aren't allowed by the NLA_POLICY_MASK, and the functions doesn't get called with encap set to true yet. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-9-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
03afeb613b |
flow_dissector: set encapsulated control flags from tun_flags
Set the new FLOW_DIS_F_TUNNEL_* encapsulated control flags, based on if their counter-part is set in tun_flags. These flags are not userspace visible yet, as the code to dump encapsulated control flags will first be added, and later activated in the following patches. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-8-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
4d0aed380f |
flow_dissector: prepare for encapsulated control flags
Rename skb_flow_dissect_set_enc_addr_type() to skb_flow_dissect_set_enc_control(), and make it set both addr_type and flags in FLOW_DISSECTOR_KEY_ENC_CONTROL. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-7-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
0e83a7875d |
net/sched: cls_flower: add policy for TCA_FLOWER_KEY_FLAGS
This policy guards fl_set_key_flags() from seeing flags not used in the context of TCA_FLOWER_KEY_FLAGS. In order For the policy check to be performed with the correct endianness, then we also needs to change the attribute type to NLA_BE32 (Thanks Davide). TCA_FLOWER_KEY_FLAGS{,_MASK} already has a be32 comment in include/uapi/linux/pkt_cls.h. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-6-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Asbjørn Sloth Tønnesen
|
fcb4bb07a9 |
net/sched: cls_flower: prepare fl_{set,dump}_key_flags() for ENC_FLAGS
Prepare fl_set_key_flags/fl_dump_key_flags() for use with TCA_FLOWER_KEY_ENC_FLAGS{,_MASK}. This patch adds an encap argument, similar to fl_set_key_ip/ fl_dump_key_ip(), and determine the flower keys based on the encap argument, and use them in the rest of the two functions. Since these functions are so far, only called with encap set false, then there is no functional change. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Tested-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Davide Caratti <dcaratti@redhat.com> Link: https://patch.msgid.link/20240713021911.1631517-5-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Christophe JAILLET
|
0970bf676f |
llc: Constify struct llc_sap_state_trans
'struct llc_sap_state_trans' are not modified in this driver. Constifying this structure moves some data to a read-only section, so increase overall security. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 339 456 24 819 333 net/llc/llc_s_st.o After: ===== text data bss dec hex filename 683 144 0 827 33b net/llc/llc_s_st.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/9d17587639195ee94b74ff06a11ef97d1833ee52.1720973710.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Christophe JAILLET
|
70de41ef78 |
llc: Constify struct llc_conn_state_trans
'struct llc_conn_state_trans' are not modified in this driver. Constifying this structure moves some data to a read-only section, so increase overall security. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 13923 10896 32 24851 6113 net/llc/llc_c_st.o After: ===== text data bss dec hex filename 21859 3328 0 25187 6263 net/llc/llc_c_st.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/87cda89e4c9414e71d1a54bb1eb491b0e7f70375.1720973029.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Jakub Kicinski
|
cd9b6f4795 |
bluetooth-next pull request for net-next:
- qca: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: Add BCM4388 support - btintel: Add support for BlazarU core - btintel: Add support for Whale Peak2 - btnxpuart: Add support for AW693 A1 chipset - btnxpuart: Add support for IW615 chipset - btusb: Add Realtek RTL8852BE support ID 0x13d3:0x3591 -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmaVLq0ZHGx1aXoudm9u LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKTJXD/9AK+xa+zTPc9Y0HLY5rca3 lSqyVAqqWuvZ34GPo0qlH6L6w9bPVM+QiwtzfhD5OpN8E30k44HdoJQSIlv+sDrT 5xgAAJ5+8QSpxvyjnHhPwbAnKq23Gic+PKHVsgUtZcTSCImAdq8q+QsfLqrRNv9m zKgHBuDtl//uchfobi2LkwBQRGFalupfiFcvb/N/rE5Uley0wJ3nDrOY2kbZzl0l IuHg6uCNxxV1hr/tB0FtEfTr0otJas5vnMN2M3tG01lJ7xXUYVzzKuMMm+bRY62B uULIFDtrB9y5eX2IzjtXtNRmQNqYApBIDR2nl2PDSu5XlqdgG4Fg8xCZ1I6axQqK 6jza6xOcwSI0sGuFON7HNusL3/AMqjGuI7VUxbHgs+XaqJWvz/67pyWsGJ8n9NUU ba8CfTOBcOWgYbjxwfp8zdqO9MVwE42gkeTS6m6UWrjVdDMf0bi1xX2qUS3mZMMF 9tqP6pKRwWYxp3d/bcIFbnbljqIxok1K4Up4S36OgRSCA2c0kgq+bP7NPADS9pn/ avjGIlY5kSOC/hPUwtwvEA7mKmoAdQ3tmB97GG8wf5LwUwukbdSpk2m5kANPq798 uAu0yxQ6c71vz/EXfen2yy1+/REQYcH/PpVkPdooYcMBwzM3diwdGWJ9Ju1EK+Nb +toke/Zg0wjCM2JZDeotwA== =rQ2W -----END PGP SIGNATURE----- Merge tag 'for-net-next-2024-07-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - qca: use the power sequencer for QCA6390 - btusb: mediatek: add ISO data transmission functions - hci_bcm4377: Add BCM4388 support - btintel: Add support for BlazarU core - btintel: Add support for Whale Peak2 - btnxpuart: Add support for AW693 A1 chipset - btnxpuart: Add support for IW615 chipset - btusb: Add Realtek RTL8852BE support ID 0x13d3:0x3591 * tag 'for-net-next-2024-07-15' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (71 commits) Bluetooth: btmtk: Mark all stub functions as inline Bluetooth: hci_qca: Fix build error Bluetooth: hci_qca: use the power sequencer for wcn7850 and wcn6855 Bluetooth: hci_qca: make pwrseq calls the default if available Bluetooth: hci_qca: unduplicate calls to hci_uart_register_device() Bluetooth: hci_qca: schedule a devm action for disabling the clock dt-bindings: bluetooth: qualcomm: describe the inputs from PMU for wcn7850 Bluetooth: btnxpuart: Fix warnings for suspend and resume functions Bluetooth: btnxpuart: Add system suspend and resume handlers Bluetooth: btnxpuart: Add support for IW615 chipset Bluetooth: btnxpuart: Add support for AW693 A1 chipset Bluetooth: btintel: Add support for Whale Peak2 Bluetooth: btintel: Add support for BlazarU core Bluetooth: btusb: mediatek: add ISO data transmission functions Bluetooth: btmtk: move btusb_recv_acl_mtk to btmtk.c Bluetooth: btmtk: move btusb_mtk_[setup, shutdown] to btmtk.c Bluetooth: btmtk: move btusb_mtk_hci_wmt_sync to btmtk.c Bluetooth: btusb: add callback function in btusb suspend/resume Bluetooth: btmtk: rename btmediatek_data Bluetooth: btusb: mediatek: return error for failed reg access ... ==================== Link: https://patch.msgid.link/20240715142543.303944-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Jakub Kicinski
|
30b3560050 |
Merge branch 'net-make-timestamping-selectable'
First part of "net: Make timestamping selectable" from Kory Maincent. Change the driver-facing type already to lower rebasing pain. Link: https://lore.kernel.org/20240709-feature_ptp_netnext-v17-0-b5317f50df2a@bootlin.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> |