forked from Minki/linux
fa0706e977
15955 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Jakub Kicinski
|
bec13ba9ce |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says: ==================== netfilter: conntrack and nf_tables bug fixes The following patchset contains netfilter fixes for net. Broken since 5.19: A few ancient connection tracking helpers assume TCP packets cannot exceed 64kb in size, but this isn't the case anymore with 5.19 when BIG TCP got merged, from myself. Regressions since 5.19: 1. 'conntrack -E expect' won't display anything because nfnetlink failed to enable events for expectations, only for normal conntrack events. 2. partially revert change that added resched calls to a function that can be in atomic context. Both broken and fixed up by myself. Broken for several releases (up to original merge of nf_tables): Several fixes for nf_tables control plane, from Pablo. This fixes up resource leaks in error paths and adds more sanity checks for mutually exclusive attributes/flags. Kconfig: NF_CONNTRACK_PROCFS is very old and doesn't provide all info provided via ctnetlink, so it should not default to y. From Geert Uytterhoeven. Selftests: rework nft_flowtable.sh: it frequently indicated failure; the way it tried to detect an offload failure did not work reliably. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: testing: selftests: nft_flowtable.sh: rework test to detect offload failure testing: selftests: nft_flowtable.sh: use random netns names netfilter: conntrack: NF_CONNTRACK_PROCFS should no longer default to y netfilter: nf_tables: check NFT_SET_CONCAT flag if field_count is specified netfilter: nf_tables: disallow NFT_SET_ELEM_CATCHALL and NFT_SET_ELEM_INTERVAL_END netfilter: nf_tables: NFTA_SET_ELEM_KEY_END requires concat and interval flags netfilter: nf_tables: validate NFTA_SET_ELEM_OBJREF based on NFT_SET_OBJECT flag netfilter: nf_tables: really skip inactive sets when allocating name netfilter: nfnetlink: re-enable conntrack expectation events netfilter: nf_tables: fix scheduling-while-atomic splat netfilter: nf_ct_irc: cap packet search space to 4k netfilter: nf_ct_ftp: prefer skb_linearize netfilter: nf_ct_h323: cap packet size at 64k netfilter: nf_ct_sane: remove pseudo skb linearization netfilter: nf_tables: possible module reference underflow in error path netfilter: nf_tables: disallow NFTA_SET_ELEM_KEY_END with NFT_SET_ELEM_INTERVAL_END flag netfilter: nf_tables: use READ_ONCE and WRITE_ONCE for shared generation id access ==================== Link: https://lore.kernel.org/r/20220817140015.25843-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
David Howells
|
fc4aaf9fb3 |
net: Fix suspicious RCU usage in bpf_sk_reuseport_detach()
bpf_sk_reuseport_detach() calls __rcu_dereference_sk_user_data_with_flags()
to obtain the value of sk->sk_user_data, but that function is only usable
if the RCU read lock is held, and neither that function nor any of its
callers hold it.
Fix this by adding a new helper, __locked_read_sk_user_data_with_flags()
that checks to see if sk->sk_callback_lock() is held and use that here
instead.
Alternatively, making __rcu_dereference_sk_user_data_with_flags() use
rcu_dereference_checked() might suffice.
Without this, the following warning can be occasionally observed:
=============================
WARNING: suspicious RCU usage
6.0.0-rc1-build2+ #563 Not tainted
-----------------------------
include/net/sock.h:592 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
5 locks held by locktest/29873:
#0: ffff88812734b550 (&sb->s_type->i_mutex_key#9){+.+.}-{3:3}, at: __sock_release+0x77/0x121
#1: ffff88812f5621b0 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_close+0x1c/0x70
#2: ffff88810312f5c8 (&h->lhash2[i].lock){+.+.}-{2:2}, at: inet_unhash+0x76/0x1c0
#3: ffffffff83768bb8 (reuseport_lock){+...}-{2:2}, at: reuseport_detach_sock+0x18/0xdd
#4: ffff88812f562438 (clock-AF_INET){++..}-{2:2}, at: bpf_sk_reuseport_detach+0x24/0xa4
stack backtrace:
CPU: 1 PID: 29873 Comm: locktest Not tainted 6.0.0-rc1-build2+ #563
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Call Trace:
<TASK>
dump_stack_lvl+0x4c/0x5f
bpf_sk_reuseport_detach+0x6d/0xa4
reuseport_detach_sock+0x75/0xdd
inet_unhash+0xa5/0x1c0
tcp_set_state+0x169/0x20f
? lockdep_sock_is_held+0x3a/0x3a
? __lock_release.isra.0+0x13e/0x220
? reacquire_held_locks+0x1bb/0x1bb
? hlock_class+0x31/0x96
? mark_lock+0x9e/0x1af
__tcp_close+0x50/0x4b6
tcp_close+0x28/0x70
inet_release+0x8e/0xa7
__sock_release+0x95/0x121
sock_close+0x14/0x17
__fput+0x20f/0x36a
task_work_run+0xa3/0xcc
exit_to_user_mode_prepare+0x9c/0x14d
syscall_exit_to_user_mode+0x18/0x44
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes:
|
||
Alexander Mikhalitsyn
|
0ff4eb3d5e |
neighbour: make proxy_queue.qlen limit per-device
Right now we have a neigh_param PROXY_QLEN which specifies maximum length of neigh_table->proxy_queue. But in fact, this limitation doesn't work well because check condition looks like: tbl->proxy_queue.qlen > NEIGH_VAR(p, PROXY_QLEN) The problem is that p (struct neigh_parms) is a per-device thing, but tbl (struct neigh_table) is a system-wide global thing. It seems reasonable to make proxy_queue limit per-device based. v2: - nothing changed in this patch v3: - rebase to net tree Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@kernel.org> Cc: Yajun Deng <yajun.deng@linux.dev> Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Christian Brauner <brauner@kernel.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Cc: Konstantin Khorenko <khorenko@virtuozzo.com> Cc: kernel@openvz.org Cc: devel@openvz.org Suggested-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Linus Torvalds
|
7ebfc85e2c |
Including fixes from bluetooth, bpf, can and netfilter.
A little longer PR than usual but it's all fixes, no late features. It's long partially because of timing, and partially because of follow ups to stuff that got merged a week or so before the merge window and wasn't as widely tested. Maybe the Bluetooth fixes are a little alarming so we'll address that, but the rest seems okay and not scary. Notably we're including a fix for the netfilter Kconfig [1], your WiFi warning [2] and a bluetooth fix which should unblock syzbot [3]. Current release - regressions: - Bluetooth: - don't try to cancel uninitialized works [3] - L2CAP: fix use-after-free caused by l2cap_chan_put - tls: rx: fix device offload after recent rework - devlink: fix UAF on failed reload and leftover locks in mlxsw Current release - new code bugs: - netfilter: - flowtable: fix incorrect Kconfig dependencies [1] - nf_tables: fix crash when nf_trace is enabled - bpf: - use proper target btf when exporting attach_btf_obj_id - arm64: fixes for bpf trampoline support - Bluetooth: - ISO: unlock on error path in iso_sock_setsockopt() - ISO: fix info leak in iso_sock_getsockopt() - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP - ISO: fix memory corruption on iso_pinfo.base - ISO: fix not using the correct QoS - hci_conn: fix updating ISO QoS PHY - phy: dp83867: fix get nvmem cell fail Previous releases - regressions: - wifi: cfg80211: fix validating BSS pointers in __cfg80211_connect_result [2] - atm: bring back zatm uAPI after ATM had been removed - properly fix old bug making bonding ARP monitor mode not being able to work with software devices with lockless Tx - tap: fix null-deref on skb->dev in dev_parse_header_protocol - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some devices and breaks others - netfilter: - nf_tables: many fixes rejecting cross-object linking which may lead to UAFs - nf_tables: fix null deref due to zeroed list head - nf_tables: validate variable length element extension - bgmac: fix a BUG triggered by wrong bytes_compl - bcmgenet: indicate MAC is in charge of PHY PM Previous releases - always broken: - bpf: - fix bad pointer deref in bpf_sys_bpf() injected via test infra - disallow non-builtin bpf programs calling the prog_run command - don't reinit map value in prealloc_lru_pop - fix UAFs during the read of map iterator fd - fix invalidity check for values in sk local storage map - reject sleepable program for non-resched map iterator - mptcp: - move subflow cleanup in mptcp_destroy_common() - do not queue data on closed subflows - virtio_net: fix memory leak inside XDP_TX with mergeable - vsock: fix memory leak when multiple threads try to connect() - rework sk_user_data sharing to prevent psock leaks - geneve: fix TOS inheriting for ipv4 - tunnels & drivers: do not use RT_TOS for IPv6 flowlabel - phy: c45 baset1: do not skip aneg configuration if clock role is not specified - rose: avoid overflow when /proc displays timer information - x25: fix call timeouts in blocking connects - can: mcp251x: fix race condition on receive interrupt - can: j1939: - replace user-reachable WARN_ON_ONCE() with netdev_warn_once() - fix memory leak of skbs in j1939_session_destroy() Misc: - docs: bpf: clarify that many things are not uAPI - seg6: initialize induction variable to first valid array index (to silence clang vs objtool warning) - can: ems_usb: fix clang 14's -Wunaligned-access warning Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmL1TtkACgkQMUZtbf5S Iruz8Q/+O5xFFsjxuyZD0Mw9d3Jeo3ZI9PeeDvcYl5dZXVegpxqorujTFntxv1Ad JC8o5qqms3kO51d+W/yai6iDacEHX2YcJrupZve+vGvpOEVmBRY5O0E1AckJ18+u ItmjSVESkybUP5P08/An7Y0dMmj9Xb2z84dGkLe+n8lg6/fimo6Ki6yZjcOBOALu AYquMXUcnwztRMbTFjscbJjBd4xFMKZEtthljYtPdIReIN976wmMNYYx+jcPK7ha g39Kv6maklp4euerkGIJ/AMnOWHaOGCFjIaz7rr4444NDfrKdt/jeirUXJaz77Jo TJM2UOwgOeg6WZkSa3cmdq6UdjdkJ6LTe2CJFf1wJ1qfhAi+s8yWoszsM2Enp+66 c/mo9jTCMAjmgEJF11idZuz2S697/5j0hvbfM3ZPgNyNBgn8qxz/Z56fNOisx95u TkoKKFnGH+mcm/et+omBcyLBtBVK2+/6B6mpl6btf4DOkPn5KFYWHV67uV3ksHzQ ye+pnzidoIG0yKbRM2EQKXk7ELKROpl52xUHko93ZinMJt0Q7jBm7tZhJozNFEzi hWgUvpmNXgawzLYQcJ9jJmKw3PmYZnRhvYZB/1r91YamM28Hd58k9WfpWtUtjYJN N0X58L6JSnKPqzR70pcFppz6iBlh0tHdcEQGWhhKU5ScS3FDxGc= =C5Ck -----END PGP SIGNATURE----- Merge tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, bpf, can and netfilter. A little larger than usual but it's all fixes, no late features. It's large partially because of timing, and partially because of follow ups to stuff that got merged a week or so before the merge window and wasn't as widely tested. Maybe the Bluetooth fixes are a little alarming so we'll address that, but the rest seems okay and not scary. Notably we're including a fix for the netfilter Kconfig [1], your WiFi warning [2] and a bluetooth fix which should unblock syzbot [3]. Current release - regressions: - Bluetooth: - don't try to cancel uninitialized works [3] - L2CAP: fix use-after-free caused by l2cap_chan_put - tls: rx: fix device offload after recent rework - devlink: fix UAF on failed reload and leftover locks in mlxsw Current release - new code bugs: - netfilter: - flowtable: fix incorrect Kconfig dependencies [1] - nf_tables: fix crash when nf_trace is enabled - bpf: - use proper target btf when exporting attach_btf_obj_id - arm64: fixes for bpf trampoline support - Bluetooth: - ISO: unlock on error path in iso_sock_setsockopt() - ISO: fix info leak in iso_sock_getsockopt() - ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP - ISO: fix memory corruption on iso_pinfo.base - ISO: fix not using the correct QoS - hci_conn: fix updating ISO QoS PHY - phy: dp83867: fix get nvmem cell fail Previous releases - regressions: - wifi: cfg80211: fix validating BSS pointers in __cfg80211_connect_result [2] - atm: bring back zatm uAPI after ATM had been removed - properly fix old bug making bonding ARP monitor mode not being able to work with software devices with lockless Tx - tap: fix null-deref on skb->dev in dev_parse_header_protocol - revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some devices and breaks others - netfilter: - nf_tables: many fixes rejecting cross-object linking which may lead to UAFs - nf_tables: fix null deref due to zeroed list head - nf_tables: validate variable length element extension - bgmac: fix a BUG triggered by wrong bytes_compl - bcmgenet: indicate MAC is in charge of PHY PM Previous releases - always broken: - bpf: - fix bad pointer deref in bpf_sys_bpf() injected via test infra - disallow non-builtin bpf programs calling the prog_run command - don't reinit map value in prealloc_lru_pop - fix UAFs during the read of map iterator fd - fix invalidity check for values in sk local storage map - reject sleepable program for non-resched map iterator - mptcp: - move subflow cleanup in mptcp_destroy_common() - do not queue data on closed subflows - virtio_net: fix memory leak inside XDP_TX with mergeable - vsock: fix memory leak when multiple threads try to connect() - rework sk_user_data sharing to prevent psock leaks - geneve: fix TOS inheriting for ipv4 - tunnels & drivers: do not use RT_TOS for IPv6 flowlabel - phy: c45 baset1: do not skip aneg configuration if clock role is not specified - rose: avoid overflow when /proc displays timer information - x25: fix call timeouts in blocking connects - can: mcp251x: fix race condition on receive interrupt - can: j1939: - replace user-reachable WARN_ON_ONCE() with netdev_warn_once() - fix memory leak of skbs in j1939_session_destroy() Misc: - docs: bpf: clarify that many things are not uAPI - seg6: initialize induction variable to first valid array index (to silence clang vs objtool warning) - can: ems_usb: fix clang 14's -Wunaligned-access warning" * tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits) net: atm: bring back zatm uAPI dpaa2-eth: trace the allocated address instead of page struct net: add missing kdoc for struct genl_multicast_group::flags nfp: fix use-after-free in area_cache_get() MAINTAINERS: use my korg address for mt7601u mlxsw: minimal: Fix deadlock in ports creation bonding: fix reference count leak in balance-alb mode net: usb: qmi_wwan: Add support for Cinterion MV32 bpf: Shut up kern_sys_bpf warning. net/tls: Use RCU API to access tls_ctx->netdev tls: rx: device: don't try to copy too much on detach tls: rx: device: bound the frag walk net_sched: cls_route: remove from list when handle is 0 selftests: forwarding: Fix failing tests with old libnet net: refactor bpf_sk_reuseport_detach() net: fix refcount bug in sk_psock_get (2) selftests/bpf: Ensure sleepable program is rejected by hash map iter selftests/bpf: Add write tests for sk local storage map iterator selftests/bpf: Add tests for reading a dangling map iter fd bpf: Only allow sleepable program for resched-able iterator ... |
||
Jakub Kicinski
|
5c221f0af6 |
net: add missing kdoc for struct genl_multicast_group::flags
Multicast group flags were added in commit
|
||
Florian Westphal
|
0b2f3212b5 |
netfilter: nfnetlink: re-enable conntrack expectation events
To avoid allocation of the conntrack extension area when possible,
the default behaviour was changed to only allocate the event extension
if a userspace program is subscribed to a notification group.
Problem is that while 'conntrack -E' does enable the event allocation
behind the scenes, 'conntrack -E expect' does not: no expectation events
are delivered unless user sets
"net.netfilter.nf_conntrack_events" back to 1 (always on).
Fix the autodetection to also consider EXP type group.
We need to track the 6 event groups (3+3, new/update/destroy for events and
for expectations each) independently, else we'd disable events again
if an expectation group becomes empty while there is still an active
event group.
Fixes:
|
||
Maxim Mikityanskiy
|
94ce3b64c6 |
net/tls: Use RCU API to access tls_ctx->netdev
Currently, tls_device_down synchronizes with tls_device_resync_rx using RCU, however, the pointer to netdev is stored using WRITE_ONCE and loaded using READ_ONCE. Although such approach is technically correct (rcu_dereference is essentially a READ_ONCE, and rcu_assign_pointer uses WRITE_ONCE to store NULL), using special RCU helpers for pointers is more valid, as it includes additional checks and might change the implementation transparently to the callers. Mark the netdev pointer as __rcu and use the correct RCU helpers to access it. For non-concurrent access pass the right conditions that guarantee safe access (locks taken, refcount value). Also use the correct helper in mlx5e, where even READ_ONCE was missing. The transition to RCU exposes existing issues, fixed by this commit: 1. bond_tls_device_xmit could read netdev twice, and it could become NULL the second time, after the NULL check passed. 2. Drivers shouldn't stop processing the last packet if tls_device_down just set netdev to NULL, before tls_dev_del was called. This prevents a possible packet drop when transitioning to the fallback software mode. Fixes: |
||
Jakub Kicinski
|
fbe8870f72 |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says: ==================== bpf 2022-08-10 We've added 23 non-merge commits during the last 7 day(s) which contain a total of 19 files changed, 424 insertions(+), 35 deletions(-). The main changes are: 1) Several fixes for BPF map iterator such as UAFs along with selftests, from Hou Tao. 2) Fix BPF syscall program's {copy,strncpy}_from_bpfptr() to not fault, from Jinghao Jia. 3) Reject BPF syscall programs calling BPF_PROG_RUN, from Alexei Starovoitov and YiFei Zhu. 4) Fix attach_btf_obj_id info to pick proper target BTF, from Stanislav Fomichev. 5) BPF design Q/A doc update to clarify what is not stable ABI, from Paul E. McKenney. 6) Fix BPF map's prealloc_lru_pop to not reinitialize, from Kumar Kartikeya Dwivedi. 7) Fix bpf_trampoline_put to avoid leaking ftrace hash, from Jiri Olsa. 8) Fix arm64 JIT to address sparse errors around BPF trampoline, from Xu Kuohai. 9) Fix arm64 JIT to use kvcalloc instead of kcalloc for internal program address offset buffer, from Aijun Sun. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (23 commits) selftests/bpf: Ensure sleepable program is rejected by hash map iter selftests/bpf: Add write tests for sk local storage map iterator selftests/bpf: Add tests for reading a dangling map iter fd bpf: Only allow sleepable program for resched-able iterator bpf: Check the validity of max_rdwr_access for sock local storage map iterator bpf: Acquire map uref in .init_seq_private for sock{map,hash} iterator bpf: Acquire map uref in .init_seq_private for sock local storage map iterator bpf: Acquire map uref in .init_seq_private for hash map iterator bpf: Acquire map uref in .init_seq_private for array map iterator bpf: Disallow bpf programs call prog_run command. bpf, arm64: Fix bpf trampoline instruction endianness selftests/bpf: Add test for prealloc_lru_pop bug bpf: Don't reinit map value in prealloc_lru_pop bpf: Allow calling bpf_prog_test kfuncs in tracing programs bpf, arm64: Allocate program buffer using kvcalloc instead of kcalloc selftests/bpf: Excercise bpf_obj_get_info_by_fd for bpf2bpf bpf: Use proper target btf when exporting attach_btf_obj_id mptcp, btf: Add struct mptcp_sock definition when CONFIG_MPTCP is disabled bpf: Cleanup ftrace hash in bpf_trampoline_put BPF: Fix potential bad pointer dereference in bpf_sys_bpf() ... ==================== Link: https://lore.kernel.org/r/20220810190624.10748-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Hawkins Jiawei
|
2a0133723f |
net: fix refcount bug in sk_psock_get (2)
Syzkaller reports refcount bug as follows:
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 1 PID: 3605 at lib/refcount.c:19 refcount_warn_saturate+0xf4/0x1e0 lib/refcount.c:19
Modules linked in:
CPU: 1 PID: 3605 Comm: syz-executor208 Not tainted 5.18.0-syzkaller-03023-g7e062cda7d90 #0
<TASK>
__refcount_add_not_zero include/linux/refcount.h:163 [inline]
__refcount_inc_not_zero include/linux/refcount.h:227 [inline]
refcount_inc_not_zero include/linux/refcount.h:245 [inline]
sk_psock_get+0x3bc/0x410 include/linux/skmsg.h:439
tls_data_ready+0x6d/0x1b0 net/tls/tls_sw.c:2091
tcp_data_ready+0x106/0x520 net/ipv4/tcp_input.c:4983
tcp_data_queue+0x25f2/0x4c90 net/ipv4/tcp_input.c:5057
tcp_rcv_state_process+0x1774/0x4e80 net/ipv4/tcp_input.c:6659
tcp_v4_do_rcv+0x339/0x980 net/ipv4/tcp_ipv4.c:1682
sk_backlog_rcv include/net/sock.h:1061 [inline]
__release_sock+0x134/0x3b0 net/core/sock.c:2849
release_sock+0x54/0x1b0 net/core/sock.c:3404
inet_shutdown+0x1e0/0x430 net/ipv4/af_inet.c:909
__sys_shutdown_sock net/socket.c:2331 [inline]
__sys_shutdown_sock net/socket.c:2325 [inline]
__sys_shutdown+0xf1/0x1b0 net/socket.c:2343
__do_sys_shutdown net/socket.c:2351 [inline]
__se_sys_shutdown net/socket.c:2349 [inline]
__x64_sys_shutdown+0x50/0x70 net/socket.c:2349
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
During SMC fallback process in connect syscall, kernel will
replaces TCP with SMC. In order to forward wakeup
smc socket waitqueue after fallback, kernel will sets
clcsk->sk_user_data to origin smc socket in
smc_fback_replace_callbacks().
Later, in shutdown syscall, kernel will calls
sk_psock_get(), which treats the clcsk->sk_user_data
as psock type, triggering the refcnt warning.
So, the root cause is that smc and psock, both will use
sk_user_data field. So they will mismatch this field
easily.
This patch solves it by using another bit(defined as
SK_USER_DATA_PSOCK) in PTRMASK, to mark whether
sk_user_data points to a psock object or not.
This patch depends on a PTRMASK introduced in commit
|
||
Christophe JAILLET
|
84b709d310 |
ax88796: Fix some typo in a comment
s/by caused/be caused/ s/ax88786/ax88796/ Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/7db4b622d2c3e5af58c1d1f32b81836f4af71f18.1659801746.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Pablo Neira Ayuso
|
f323ef3a0d |
netfilter: nf_tables: disallow jump to implicit chain from set element
Extend struct nft_data_desc to add a flag field that specifies
nft_data_init() is being called for set element data.
Use it to disallow jump to implicit chain from set element, only jump
to chain via immediate expression is allowed.
Fixes:
|
||
Pablo Neira Ayuso
|
341b694160 |
netfilter: nf_tables: upfront validation of data via nft_data_init()
Instead of parsing the data and then validate that type and length are
correct, pass a description of the expected data so it can be validated
upfront before parsing it to bail out earlier.
This patch adds a new .size field to specify the maximum size of the
data area. The .len field is optional and it is used as an input/output
field, it provides the specific length of the expected data in the input
path. If then .len field is not specified, then obtained length from the
netlink attribute is stored. This is required by cmp, bitwise, range and
immediate, which provide no netlink attribute that describes the data
length. The immediate expression uses the destination register type to
infer the expected data type.
Relying on opencoded validation of the expected data might lead to
subtle bugs as described in
|
||
Pablo Neira Ayuso
|
34aae2c2fb |
netfilter: nf_tables: validate variable length element extension
Update template to validate variable length extensions. This patch adds
a new .ext_len[id] field to the template to store the expected extension
length. This is used to sanity check the initialization of the variable
length extension.
Use PTR_ERR() in nft_set_elem_init() to report errors since, after this
update, there are two reason why this might fail, either because of
ENOMEM or insufficient room in the extension field (EINVAL).
Kernels up until
|
||
Jiri Olsa
|
f1d41f7720 |
mptcp, btf: Add struct mptcp_sock definition when CONFIG_MPTCP is disabled
The btf_sock_ids array needs struct mptcp_sock BTF ID for the
bpf_skc_to_mptcp_sock helper.
When CONFIG_MPTCP is disabled, the 'struct mptcp_sock' is not
defined and resolve_btfids will complain with:
[...]
BTFIDS vmlinux
WARN: resolve_btfids: unresolved symbol mptcp_sock
[...]
Add an empty definition for struct mptcp_sock when CONFIG_MPTCP
is disabled.
Fixes:
|
||
Linus Torvalds
|
ea0c39260d |
9p-for-5.20
- a couple of fixes - add a tracepoint for fid refcounting - some cleanup/followup on fid lookup - some cleanup around req refcounting -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmLuKqkACgkQq06b7GqY 5nA+RBAAvuA6AjKQSvsNxHsqSFMahwoE3cCPlF/QlmAnnZa1DzmRI3kKTAuuDKkE Q4c7DjxzDJCWRTlkpNUCGZycHqpu1QQbfoTo43sIqob7W8xlajeemc5Fxqtw5sPM m+0SzN7vvNJpCr+6pxMXwwGHSZmZOvpFjwj7cUjhpF/V1WO8bNxfCGzQdF0hX1Vn 2HoBbFUOmacL9Z/pF3O/ZG9LwCFuRQsH0EmbFBlJy1WdDtlVHTXlzDaGQ7EGaY4D 17UR6iITsj9ozacLhvk094PLIc3/RHDGLrm3C4Ka3zmUI7BsiYWPDmai3Pu/DNqn JJ5sZkdrVowxyBbGxw8GpZ4YDJtGsU5XglFPdkw+ZazxhZLNEIstPXg0HTZybrOe GE+WskWB0qS+RpX0tnYEcX6qOHWm3/63Yq5NG6A9tLQUSFku02jS/bCQSLBrmGWW Js24IWvFSTvl6XytHeldYhJP618pNUxXRSqgYv96vT/LI3mUrIMN+IVBNPujO6p2 jIYXNoaqLoY3efXKW/WQmp7C/52ZP4ly4fOiz7qtHTQsCcIk8Xo6zwHtm/FkNEqc sMZdqgLrxKPNBAlT8iEtt//wU2fB7mFt988p0pc+5lAK5t0h67KZJV6vDwTAMObX wV6Ht+QhOHtJwO779fk8FhZPNRPYZYMutIyFNlRx+4gCpBuqJPc= =941k -----END PGP SIGNATURE----- Merge tag '9p-for-5.20' of https://github.com/martinetd/linux Pull 9p updates from Dominique Martinet: - a couple of fixes - add a tracepoint for fid refcounting - some cleanup/followup on fid lookup - some cleanup around req refcounting * tag '9p-for-5.20' of https://github.com/martinetd/linux: net/9p: Initialize the iounit field during fid creation net: 9p: fix refcount leak in p9_read_work() error handling 9p: roll p9_tag_remove into p9_req_put 9p: Add client parameter to p9_req_put() 9p: Drop kref usage 9p: Fix some kernel-doc comments 9p fid refcount: cleanup p9_fid_put calls 9p fid refcount: add a 9p_fid_ref tracepoint 9p fid refcount: add p9_fid_get/put wrappers 9p: Fix minor typo in code comment 9p: Remove unnecessary variable for old fids while walking from d_parent 9p: Make the path walk logic more clear about when cloning is required 9p: Track the root fid with its own variable during lookups |
||
Vladimir Oltean
|
06799a9085 |
net: bonding: replace dev_trans_start() with the jiffies of the last ARP/NS
The bonding driver piggybacks on time stamps kept by the network stack for the purpose of the netdev TX watchdog, and this is problematic because it does not work with NETIF_F_LLTX devices. It is hard to say why the driver looks at dev_trans_start() of the slave->dev, considering that this is updated even by non-ARP/NS probes sent by us, and even by traffic not sent by us at all (for example PTP on physical slave devices). ARP monitoring in active-backup mode appears to still work even if we track only the last TX time of actual ARP probes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Linus Torvalds
|
f86d1fbbe7 |
Networking changes for 6.0.
Core ---- - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router. - Network-side support for IO uring zero-copy send. - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic. - Rename reference tracking helpers to a more consistent netdev_* schema. - Adapt u64_stats_t type to address load/store tearing issues. - Refine debug helper usage to reduce the log noise caused by bots. BPF --- - Improve socket map performance, avoiding skb cloning on read operation. - Add support for 64 bits enum, to match types exposed by kernel. - Introduce support for sleepable uprobes program. - Introduce support for enum textual representation in libbpf. - New helpers to implement synproxy with eBPF/XDP. - Improve loop performances, inlining indirect calls when possible. - Removed all the deprecated libbpf APIs. - Implement new eBPF-based LSM flavor. - Add type match support, which allow accurate queries to the eBPF used types. - A few TCP congetsion control framework usability improvements. - Add new infrastructure to manipulate CT entries via eBPF programs. - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function. Protocols --------- - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention. - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support. - Add support to forciby close TIME_WAIT TCP sockets via user-space tools. - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy. - Support for changing the initial MTPCP subflow priority/backup status - Introduce virtually contingus buffers for sockets over RDMA, to cope better with memory pressure. - Extend CAN ethtool support with timestamping capabilities - Refactor CAN build infrastructure to allow building only the needed features. Driver API ---------- - Remove devlink mutex to allow parallel commands on multiple links. - Add support for pause stats in distributed switch. - Implement devlink helpers to query and flash line cards. - New helper for phy mode to register conversion. New hardware / drivers ---------------------- - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro. - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch. - Ethernet DSA driver for the Microchip LAN937x switch. - Ethernet PHY driver for the Aquantia AQR113C EPHY. - CAN driver for the OBD-II ELM327 interface. - CAN driver for RZ/N1 SJA1000 CAN controller. - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device. Drivers ------- - Intel Ethernet NICs: - i40e: add support for vlan pruning - i40e: add support for XDP framented packets - ice: improved vlan offload support - ice: add support for PPPoE offload - Mellanox Ethernet (mlx5) - refactor packet steering offload for performance and scalability - extend support for TC offload - refactor devlink code to clean-up the locking schema - support stacked vlans for bridge offloads - use TLS objects pool to improve connection rate - Netronome Ethernet NICs (nfp): - extend support for IPv6 fields mangling offload - add support for vepa mode in HW bridge - better support for virtio data path acceleration (VDPA) - enable TSO by default - Microsoft vNIC driver (mana) - add support for XDP redirect - Others Ethernet drivers: - bonding: add per-port priority support - microchip lan743x: extend phy support - Fungible funeth: support UDP segmentation offload and XDP xmit - Solarflare EF100: add support for virtual function representors - MediaTek SoC: add XDP support - Mellanox Ethernet/IB switch (mlxsw): - dropped support for unreleased H/W (XM router). - improved stats accuracy - unified bridge model coversion improving scalability (parts 1-6) - support for PTP in Spectrum-2 asics - Broadcom PHYs - add PTP support for BCM54210E - add support for the BCM53128 internal PHY - Marvell Ethernet switches (prestera): - implement support for multicast forwarding offload - Embedded Ethernet switches: - refactor OcteonTx MAC filter for better scalability - improve TC H/W offload for the Felix driver - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration - Other WiFi: - Microchip wilc1000: diable WEP support and enable WPA3 - Atheros ath10k: encapsulation offload support Old code removal: - Neterion vxge ethernet driver: this is untouched since more than 10 years. Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmLqN+oSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkB9kQAI9VqW0c3SfiTJnkVBEIovZ6Tnh5stD2 UYFkh1BdchLsYxi7W4XMpVPSzRztiTP87mIx5c/KvIzj+QNeWL1XWRJSPdI9HhTD pTAA/tM2OG7bqrbyQiKDNfpQdNl7+kk1RwnYd+f9RFl1QVuIJaYhmjVwrsN5xF/+ jUsotpROarM2dGFWiFwJbKhP2zMDT+6qEEahM8pEPggKhv8wRLYjany2cZVEe4e0 WGUpbINAS8gEKm0Ob922WaDfDrcK/N1Z0jNz/kMaENkK18Vvc7F6bCO0DzAawKX9 QZMMwm6mHp3EThflJAMAzCGIYiIcwLhykgdyj8rrjPhFrWbMD2Sdsbo21HOXU/8j u4aAhVl+d+h7emmbgBoJ8sycVJ7BQlXz7lX20sTgADv9xI4/dPhQ17CMRuwX6fXX JSrn6P6e1LTV5CEg6vrlSPnKPY6uhFn/cPw47FxCjRwJ9phVnp+8uZWQmf9Pz3yf Ok/tcj+juFbsmuOshHy2cbRkuNZNS0oRWlSTBo5795ZwOLSakMonR3L+ev2aOvzz DVrFp2Y/iIVwMSFdCbouYdYnhArPRhOAtCmZc2afY8aBN7aaMgrdTy3+mzUoHy3I FG3K+VuKpfi0vY4zn6ZoLZDIpyXIoJJ93RcSGltD32t3Dp1RaQMVEI4s45k05PVm 1nYpXKHA8qML =hxEG -----END PGP SIGNATURE----- Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Paolo Abeni: "Core: - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router. - Network-side support for IO uring zero-copy send. - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic. - Rename reference tracking helpers to a more consistent netdev_* schema. - Adapt u64_stats_t type to address load/store tearing issues. - Refine debug helper usage to reduce the log noise caused by bots. BPF: - Improve socket map performance, avoiding skb cloning on read operation. - Add support for 64 bits enum, to match types exposed by kernel. - Introduce support for sleepable uprobes program. - Introduce support for enum textual representation in libbpf. - New helpers to implement synproxy with eBPF/XDP. - Improve loop performances, inlining indirect calls when possible. - Removed all the deprecated libbpf APIs. - Implement new eBPF-based LSM flavor. - Add type match support, which allow accurate queries to the eBPF used types. - A few TCP congetsion control framework usability improvements. - Add new infrastructure to manipulate CT entries via eBPF programs. - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function. Protocols: - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention. - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support. - Add support to forciby close TIME_WAIT TCP sockets via user-space tools. - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy. - Support for changing the initial MTPCP subflow priority/backup status - Introduce virtually contingus buffers for sockets over RDMA, to cope better with memory pressure. - Extend CAN ethtool support with timestamping capabilities - Refactor CAN build infrastructure to allow building only the needed features. Driver API: - Remove devlink mutex to allow parallel commands on multiple links. - Add support for pause stats in distributed switch. - Implement devlink helpers to query and flash line cards. - New helper for phy mode to register conversion. New hardware / drivers: - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro. - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch. - Ethernet DSA driver for the Microchip LAN937x switch. - Ethernet PHY driver for the Aquantia AQR113C EPHY. - CAN driver for the OBD-II ELM327 interface. - CAN driver for RZ/N1 SJA1000 CAN controller. - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device. Drivers: - Intel Ethernet NICs: - i40e: add support for vlan pruning - i40e: add support for XDP framented packets - ice: improved vlan offload support - ice: add support for PPPoE offload - Mellanox Ethernet (mlx5) - refactor packet steering offload for performance and scalability - extend support for TC offload - refactor devlink code to clean-up the locking schema - support stacked vlans for bridge offloads - use TLS objects pool to improve connection rate - Netronome Ethernet NICs (nfp): - extend support for IPv6 fields mangling offload - add support for vepa mode in HW bridge - better support for virtio data path acceleration (VDPA) - enable TSO by default - Microsoft vNIC driver (mana) - add support for XDP redirect - Others Ethernet drivers: - bonding: add per-port priority support - microchip lan743x: extend phy support - Fungible funeth: support UDP segmentation offload and XDP xmit - Solarflare EF100: add support for virtual function representors - MediaTek SoC: add XDP support - Mellanox Ethernet/IB switch (mlxsw): - dropped support for unreleased H/W (XM router). - improved stats accuracy - unified bridge model coversion improving scalability (parts 1-6) - support for PTP in Spectrum-2 asics - Broadcom PHYs - add PTP support for BCM54210E - add support for the BCM53128 internal PHY - Marvell Ethernet switches (prestera): - implement support for multicast forwarding offload - Embedded Ethernet switches: - refactor OcteonTx MAC filter for better scalability - improve TC H/W offload for the Felix driver - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration - Other WiFi: - Microchip wilc1000: diable WEP support and enable WPA3 - Atheros ath10k: encapsulation offload support Old code removal: - Neterion vxge ethernet driver: this is untouched since more than 10 years" * tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits) doc: sfp-phylink: Fix a broken reference wireguard: selftests: support UML wireguard: allowedips: don't corrupt stack when detecting overflow wireguard: selftests: update config fragments wireguard: ratelimiter: use hrtimer in selftest net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ net: usb: ax88179_178a: Bind only to vendor-specific interface selftests: net: fix IOAM test skip return code net: usb: make USB_RTL8153_ECM non user configurable net: marvell: prestera: remove reduntant code octeontx2-pf: Reduce minimum mtu size to 60 net: devlink: Fix missing mutex_unlock() call net/tls: Remove redundant workqueue flush before destroy net: txgbe: Fix an error handling path in txgbe_probe() net: dsa: Fix spelling mistakes and cleanup code Documentation: devlink: add add devlink-selftests to the table of contents dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock net: ionic: fix error check for vlan flags in ionic_set_nic_features() net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr() nfp: flower: add support for tunnel offload without key ID ... |
||
Paolo Abeni
|
7c6327c77d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts: net/ax25/af_ax25.c |
||
Linus Torvalds
|
b349b1181d |
for-5.20/io_uring-2022-07-29
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLkm5gQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpmKMD/4l3QIrLbjYIxlfrzQcHbmYuUkbQtj3SbZg 6ejbnGVhCs1P9DdXH8MgE2BxgpiXQE0CqOK7vbSoo5ep2n2UTLI2DIxAl74SMIo7 0wmJXtUJySuViKr3NYVHqlN180MkQYddBz0nGElhkQBPBCMhW8CrtPCeURr/YyHp 2RxSYBXiUx2gRyig+klnp6oPEqelcBZJUyNHdA9yVrgl/RhB/t2rKj7D++8ukQM3 Zuyh8WIkTeTfUz9hdGG7fuCEdZN4DlO2CCEc7uy0cKi6VRCKH4hYUCqClJ+/cfd2 43dUI2O7B6D1t/ObFh8AGIDXBDqVA6ePQohQU6gooRkfQiBPKkc9d0ts4yIhRqca AjkzNM+0Eve3A01loJ8J84w8oZnvNpYEv5n8/sZVLWcyU3UIs0I88nC2OBiFtoRq d77CtFLwOTo+r3STtAhnZOqez90rhS6BqKtqlUP346PCuFItl6/MbGtwdTbLYEFj CVNIb2pERWSr2NxGv4lFyXaX/cRwruxojWH7yc3rRYjr4Ykevd1pe/fMGNiMAnKw 5em/3QU3qq0ZVcXLMihksKeHHFIQwGDRMuyuv/fktV10+yYXQ0t16WzkJT3aR8Xo cqs0r8+6Jnj3uYcOMzj/FoLcpEPr21hnwAtzLto1mG1Wh4JRn/D7Nx5zqxPLxcW+ NiU6VihPOw== =gxeV -----END PGP SIGNATURE----- Merge tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block Pull io_uring updates from Jens Axboe: - As per (valid) complaint in the last merge window, fs/io_uring.c has grown quite large these days. io_uring isn't really tied to fs either, as it supports a wide variety of functionality outside of that. Move the code to io_uring/ and split it into files that either implement a specific request type, and split some code into helpers as well. The code is organized a lot better like this, and io_uring.c is now < 4K LOC (me). - Deprecate the epoll_ctl opcode. It'll still work, just trigger a warning once if used. If we don't get any complaints on this, and I don't expect any, then we can fully remove it in a future release (me). - Improve the cancel hash locking (Hao) - kbuf cleanups (Hao) - Efficiency improvements to the task_work handling (Dylan, Pavel) - Provided buffer improvements (Dylan) - Add support for recv/recvmsg multishot support. This is similar to the accept (or poll) support for have for multishot, where a single SQE can trigger everytime data is received. For applications that expect to do more than a few receives on an instantiated socket, this greatly improves efficiency (Dylan). - Efficiency improvements for poll handling (Pavel) - Poll cancelation improvements (Pavel) - Allow specifiying a range for direct descriptor allocations (Pavel) - Cleanup the cqe32 handling (Pavel) - Move io_uring types to greatly cleanup the tracing (Pavel) - Tons of great code cleanups and improvements (Pavel) - Add a way to do sync cancelations rather than through the sqe -> cqe interface, as that's a lot easier to use for some use cases (me). - Add support to IORING_OP_MSG_RING for sending direct descriptors to a different ring. This avoids the usually problematic SCM case, as we disallow those. (me) - Make the per-command alloc cache we use for apoll generic, place limits on it, and use it for netmsg as well (me). - Various cleanups (me, Michal, Gustavo, Uros) * tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block: (172 commits) io_uring: ensure REQ_F_ISREG is set async offload net: fix compat pointer in get_compat_msghdr() io_uring: Don't require reinitable percpu_ref io_uring: fix types in io_recvmsg_multishot_overflow io_uring: Use atomic_long_try_cmpxchg in __io_account_mem io_uring: support multishot in recvmsg net: copy from user before calling __get_compat_msghdr net: copy from user before calling __copy_msghdr io_uring: support 0 length iov in buffer select in compat io_uring: fix multishot ending when not polled io_uring: add netmsg cache io_uring: impose max limit on apoll cache io_uring: add abstraction around apoll cache io_uring: move apoll cache to poll.c io_uring: consolidate hash_locked io-wq handling io_uring: clear REQ_F_HASH_LOCKED on hash removal io_uring: don't race double poll setting REQ_F_ASYNC_DATA io_uring: don't miss setting REQ_F_DOUBLE_POLL io_uring: disable multishot recvmsg io_uring: only trace one of complete or overflow ... |
||
Maxim Mikityanskiy
|
8eaa1d1108 |
net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
Striding RQ uses MTT page mapping, where each page corresponds to an XSK
frame. MTT pages have alignment requirements, and XSK frames don't have
any alignment guarantees in the unaligned mode. Frames with improper
alignment must be discarded, otherwise the packet data will be written
at a wrong address.
Fixes:
|
||
Eric Dumazet
|
2df91e397d |
net: rose: add netdev ref tracker to 'struct rose_sock'
This will help debugging netdevice refcount problems with CONFIG_NET_DEV_REFCNT_TRACKER=y Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tested-by: Bernard Pidoux <f6bvp@free.fr> Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Jakub Kicinski
|
5fc7c5887c |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says: ==================== bpf-next 2022-07-29 We've added 22 non-merge commits during the last 4 day(s) which contain a total of 27 files changed, 763 insertions(+), 120 deletions(-). The main changes are: 1) Fixes to allow setting any source IP with bpf_skb_set_tunnel_key() helper, from Paul Chaignon. 2) Fix for bpf_xdp_pointer() helper when doing sanity checking, from Joanne Koong. 3) Fix for XDP frame length calculation, from Lorenzo Bianconi. 4) Libbpf BPF_KSYSCALL docs improvements and fixes to selftests to accommodate s390x quirks with socketcall(), from Ilya Leoshkevich. 5) Allow/denylist and CI configs additions to selftests/bpf to improve BPF CI, from Daniel Müller. 6) BPF trampoline + ftrace follow up fixes, from Song Liu and Xu Kuohai. 7) Fix allocation warnings in netdevsim, from Jakub Kicinski. 8) bpf_obj_get_opts() libbpf API allowing to provide file flags, from Joe Burton. 9) vsnprintf usage fix in bpf_snprintf_btf(), from Fedor Tokarev. 10) Various small fixes and clean ups, from Daniel Müller, Rongguang Wei, Jörn-Thorben Hinz, Yang Li. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (22 commits) bpf: Remove unneeded semicolon libbpf: Add bpf_obj_get_opts() netdevsim: Avoid allocation warnings triggered from user space bpf: Fix NULL pointer dereference when registering bpf trampoline bpf: Fix test_progs -j error with fentry/fexit tests selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout bpftool: Don't try to return value from void function in skeleton bpftool: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE macro bpf: btf: Fix vsnprintf return value check libbpf: Support PPC in arch_specific_syscall_pfx selftests/bpf: Adjust vmtest.sh to use local kernel configuration selftests/bpf: Copy over libbpf configs selftests/bpf: Sort configuration selftests/bpf: Attach to socketcall() in test_probe_user libbpf: Extend BPF_KSYSCALL documentation bpf, devmap: Compute proper xdp_frame len redirecting frames bpf: Fix bpf_xdp_pointer return pointer selftests/bpf: Don't assign outer source IP to host bpf: Set flow flag to allow any source IP in bpf_tunnel_key geneve: Use ip_tunnel_key flow flags in route lookups ... ==================== Link: https://lore.kernel.org/r/20220729230948.1313527-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Mike Manning
|
944fd1aeac |
net: allow unbound socket for packets in VRF when tcp_l3mdev_accept set
The commit |
||
Andy Shevchenko
|
29192a170e |
firewire: net: Make use of get_unaligned_be48(), put_unaligned_be48()
Since we have a proper endianness converters for BE 48-bit data use them. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220726144906.5217-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Eric Dumazet
|
d7c4c9e075 |
ax25: fix incorrect dev_tracker usage
While investigating a separate rose issue [1], and enabling
CONFIG_NET_DEV_REFCNT_TRACKER=y, Bernard reported an orthogonal ax25 issue [2]
An ax25_dev can be used by one (or many) struct ax25_cb.
We thus need different dev_tracker, one per struct ax25_cb.
After this patch is applied, we are able to focus on rose.
[1] https://lore.kernel.org/netdev/fb7544a1-f42e-9254-18cc-c9b071f4ca70@free.fr/
[2]
[ 205.798723] reference already released.
[ 205.798732] allocated in:
[ 205.798734] ax25_bind+0x1a2/0x230 [ax25]
[ 205.798747] __sys_bind+0xea/0x110
[ 205.798753] __x64_sys_bind+0x18/0x20
[ 205.798758] do_syscall_64+0x5c/0x80
[ 205.798763] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 205.798768] freed in:
[ 205.798770] ax25_release+0x115/0x370 [ax25]
[ 205.798778] __sock_release+0x42/0xb0
[ 205.798782] sock_close+0x15/0x20
[ 205.798785] __fput+0x9f/0x260
[ 205.798789] ____fput+0xe/0x10
[ 205.798792] task_work_run+0x64/0xa0
[ 205.798798] exit_to_user_mode_prepare+0x18b/0x190
[ 205.798804] syscall_exit_to_user_mode+0x26/0x40
[ 205.798808] do_syscall_64+0x69/0x80
[ 205.798812] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 205.798827] ------------[ cut here ]------------
[ 205.798829] WARNING: CPU: 2 PID: 2605 at lib/ref_tracker.c:136 ref_tracker_free.cold+0x60/0x81
[ 205.798837] Modules linked in: rose netrom mkiss ax25 rfcomm cmac algif_hash algif_skcipher af_alg bnep snd_hda_codec_hdmi nls_iso8859_1 i915 rtw88_8821ce rtw88_8821c x86_pkg_temp_thermal rtw88_pci intel_powerclamp rtw88_core snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio coretemp snd_hda_intel kvm_intel snd_intel_dspcfg mac80211 snd_hda_codec kvm i2c_algo_bit drm_buddy drm_dp_helper btusb drm_kms_helper snd_hwdep btrtl snd_hda_core btbcm joydev crct10dif_pclmul btintel crc32_pclmul ghash_clmulni_intel mei_hdcp btmtk intel_rapl_msr aesni_intel bluetooth input_leds snd_pcm crypto_simd syscopyarea processor_thermal_device_pci_legacy sysfillrect cryptd intel_soc_dts_iosf snd_seq sysimgblt ecdh_generic fb_sys_fops rapl libarc4 processor_thermal_device intel_cstate processor_thermal_rfim cec snd_timer ecc snd_seq_device cfg80211 processor_thermal_mbox mei_me processor_thermal_rapl mei rc_core at24 snd intel_pch_thermal intel_rapl_common ttm soundcore int340x_thermal_zone video
[ 205.798948] mac_hid acpi_pad sch_fq_codel ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid i2c_i801 i2c_smbus r8169 xhci_pci ahci libahci realtek lpc_ich xhci_pci_renesas [last unloaded: ax25]
[ 205.798992] CPU: 2 PID: 2605 Comm: ax25ipd Not tainted 5.18.11-F6BVP #3
[ 205.798996] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CK3, BIOS 5.011 09/16/2020
[ 205.798999] RIP: 0010:ref_tracker_free.cold+0x60/0x81
[ 205.799005] Code: e8 d2 01 9b ff 83 7b 18 00 74 14 48 c7 c7 2f d7 ff 98 e8 10 6e fc ff 8b 7b 18 e8 b8 01 9b ff 4c 89 ee 4c 89 e7 e8 5d fd 07 00 <0f> 0b b8 ea ff ff ff e9 30 05 9b ff 41 0f b6 f7 48 c7 c7 a0 fa 4e
[ 205.799008] RSP: 0018:ffffaf5281073958 EFLAGS: 00010286
[ 205.799011] RAX: 0000000080000000 RBX: ffff9a0bd687ebe0 RCX: 0000000000000000
[ 205.799014] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 00000000ffffffff
[ 205.799016] RBP: ffffaf5281073a10 R08: 0000000000000003 R09: fffffffffffd5618
[ 205.799019] R10: 0000000000ffff10 R11: 000000000000000f R12: ffff9a0bc53384d0
[ 205.799022] R13: 0000000000000282 R14: 00000000ae000001 R15: 0000000000000001
[ 205.799024] FS: 0000000000000000(0000) GS:ffff9a0d0f300000(0000) knlGS:0000000000000000
[ 205.799028] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 205.799031] CR2: 00007ff6b8311554 CR3: 000000001ac10004 CR4: 00000000001706e0
[ 205.799033] Call Trace:
[ 205.799035] <TASK>
[ 205.799038] ? ax25_dev_device_down+0xd9/0x1b0 [ax25]
[ 205.799047] ? ax25_device_event+0x9f/0x270 [ax25]
[ 205.799055] ? raw_notifier_call_chain+0x49/0x60
[ 205.799060] ? call_netdevice_notifiers_info+0x52/0xa0
[ 205.799065] ? dev_close_many+0xc8/0x120
[ 205.799070] ? unregister_netdevice_many+0x13d/0x890
[ 205.799073] ? unregister_netdevice_queue+0x90/0xe0
[ 205.799076] ? unregister_netdev+0x1d/0x30
[ 205.799080] ? mkiss_close+0x7c/0xc0 [mkiss]
[ 205.799084] ? tty_ldisc_close+0x2e/0x40
[ 205.799089] ? tty_ldisc_hangup+0x137/0x210
[ 205.799092] ? __tty_hangup.part.0+0x208/0x350
[ 205.799098] ? tty_vhangup+0x15/0x20
[ 205.799103] ? pty_close+0x127/0x160
[ 205.799108] ? tty_release+0x139/0x5e0
[ 205.799112] ? __fput+0x9f/0x260
[ 205.799118] ax25_dev_device_down+0xd9/0x1b0 [ax25]
[ 205.799126] ax25_device_event+0x9f/0x270 [ax25]
[ 205.799135] raw_notifier_call_chain+0x49/0x60
[ 205.799140] call_netdevice_notifiers_info+0x52/0xa0
[ 205.799146] dev_close_many+0xc8/0x120
[ 205.799152] unregister_netdevice_many+0x13d/0x890
[ 205.799157] unregister_netdevice_queue+0x90/0xe0
[ 205.799161] unregister_netdev+0x1d/0x30
[ 205.799165] mkiss_close+0x7c/0xc0 [mkiss]
[ 205.799170] tty_ldisc_close+0x2e/0x40
[ 205.799173] tty_ldisc_hangup+0x137/0x210
[ 205.799178] __tty_hangup.part.0+0x208/0x350
[ 205.799184] tty_vhangup+0x15/0x20
[ 205.799188] pty_close+0x127/0x160
[ 205.799193] tty_release+0x139/0x5e0
[ 205.799199] __fput+0x9f/0x260
[ 205.799203] ____fput+0xe/0x10
[ 205.799208] task_work_run+0x64/0xa0
[ 205.799213] do_exit+0x33b/0xab0
[ 205.799217] ? __handle_mm_fault+0xc4f/0x15f0
[ 205.799224] do_group_exit+0x35/0xa0
[ 205.799228] __x64_sys_exit_group+0x18/0x20
[ 205.799232] do_syscall_64+0x5c/0x80
[ 205.799238] ? handle_mm_fault+0xba/0x290
[ 205.799242] ? debug_smp_processor_id+0x17/0x20
[ 205.799246] ? fpregs_assert_state_consistent+0x26/0x50
[ 205.799251] ? exit_to_user_mode_prepare+0x49/0x190
[ 205.799256] ? irqentry_exit_to_user_mode+0x9/0x20
[ 205.799260] ? irqentry_exit+0x33/0x40
[ 205.799263] ? exc_page_fault+0x87/0x170
[ 205.799268] ? asm_exc_page_fault+0x8/0x30
[ 205.799273] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 205.799277] RIP: 0033:0x7ff6b80eaca1
[ 205.799281] Code: Unable to access opcode bytes at RIP 0x7ff6b80eac77.
[ 205.799283] RSP: 002b:00007fff6dfd4738 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[ 205.799287] RAX: ffffffffffffffda RBX: 00007ff6b8215a00 RCX: 00007ff6b80eaca1
[ 205.799290] RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
[ 205.799293] RBP: 0000000000000001 R08: ffffffffffffff80 R09: 0000000000000028
[ 205.799295] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff6b8215a00
[ 205.799298] R13: 0000000000000000 R14: 00007ff6b821aee8 R15: 00007ff6b821af00
[ 205.799304] </TASK>
Fixes:
|
||
Vikas Gupta
|
08f588fa30 |
devlink: introduce framework for selftests
Add a framework for running selftests. Framework exposes devlink commands and test suite(s) to the user to execute and query the supported tests by the driver. Below are new entries in devlink_nl_ops devlink_nl_cmd_selftests_show_doit/dumpit: To query the supported selftests by the drivers. devlink_nl_cmd_selftests_run: To execute selftests. Users can provide a test mask for executing group tests or standalone tests. Documentation/networking/devlink/ path is already part of MAINTAINERS & the new files come under this path. Hence no update needed to the MAINTAINERS Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Tariq Toukan
|
7adc91e0c9 |
net/tls: Multi-threaded calls to TX tls_dev_del
Multiple TLS device-offloaded contexts can be added in parallel via concurrent calls to .tls_dev_add, while calls to .tls_dev_del are sequential in tls_device_gc_task. This is not a sustainable behavior. This creates a rate gap between add and del operations (addition rate outperforms the deletion rate). When running for enough time, the TLS device resources could get exhausted, failing to offload new connections. Replace the single-threaded garbage collector work with a per-context alternative, so they can be handled on several cores in parallel. Use a new dedicated destruct workqueue for this. Tested with mlx5 device: Before: 22141 add/sec, 103 del/sec After: 11684 add/sec, 11684 del/sec Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Jakub Kicinski
|
272ac32f56 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Ziyang Xuan
|
85f0173df3 |
ipv6/addrconf: fix a null-ptr-deref bug for ip6_ptr
Change net device's MTU to smaller than IPV6_MIN_MTU or unregister
device while matching route. That may trigger null-ptr-deref bug
for ip6_ptr probability as following.
=========================================================
BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134
Read of size 4 at addr 0000000000000308 by task ping6/263
CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14
Call trace:
dump_backtrace+0x1a8/0x230
show_stack+0x20/0x70
dump_stack_lvl+0x68/0x84
print_report+0xc4/0x120
kasan_report+0x84/0x120
__asan_load4+0x94/0xd0
find_match.part.0+0x70/0x134
__find_rr_leaf+0x408/0x470
fib6_table_lookup+0x264/0x540
ip6_pol_route+0xf4/0x260
ip6_pol_route_output+0x58/0x70
fib6_rule_lookup+0x1a8/0x330
ip6_route_output_flags_noref+0xd8/0x1a0
ip6_route_output_flags+0x58/0x160
ip6_dst_lookup_tail+0x5b4/0x85c
ip6_dst_lookup_flow+0x98/0x120
rawv6_sendmsg+0x49c/0xc70
inet_sendmsg+0x68/0x94
Reproducer as following:
Firstly, prepare conditions:
$ip netns add ns1
$ip netns add ns2
$ip link add veth1 type veth peer name veth2
$ip link set veth1 netns ns1
$ip link set veth2 netns ns2
$ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1
$ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2
$ip netns exec ns1 ifconfig veth1 up
$ip netns exec ns2 ifconfig veth2 up
$ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1
$ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1
Secondly, execute the following two commands in two ssh windows
respectively:
$ip netns exec ns1 sh
$while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done
$ip netns exec ns1 sh
$while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done
It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly,
then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check.
cpu0 cpu1
fib6_table_lookup
__find_rr_leaf
addrconf_notify [ NETDEV_CHANGEMTU ]
addrconf_ifdown
RCU_INIT_POINTER(dev->ip6_ptr, NULL)
find_match
ip6_ignore_linkdown
So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to
fix the null-ptr-deref bug.
Fixes:
|
||
Paolo Abeni
|
7d85e9cb40 |
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says: ==================== ice: PPPoE offload support Marcin Szycik says: Add support for dissecting PPPoE and PPP-specific fields in flow dissector: PPPoE session id and PPP protocol type. Add support for those fields in tc-flower and support offloading PPPoE. Finally, add support for hardware offload of PPPoE packets in switchdev mode in ice driver. Example filter: tc filter add dev $PF1 ingress protocol ppp_ses prio 1 flower pppoe_sid \ 1234 ppp_proto ip skip_sw action mirred egress redirect dev $VF1_PR Changes in iproute2 are required to use the new fields (will be submitted soon). ICE COMMS DDP package is required to create a filter in ice. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Add support for PPPoE hardware offload flow_offload: Introduce flow_match_pppoe net/sched: flower: Add PPPoE filter flow_dissector: Add PPPoE dissectors ==================== Link: https://lore.kernel.org/r/20220726203133.2171332-1-anthony.l.nguyen@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
Jakub Kicinski
|
5f10376b6b |
add missing includes and forward declarations to networking includes under linux/
Similarly to a recent include/net/ cleanup, this patch adds missing includes to networking headers under include/linux. All these problems are currently masked by the existing users including the missing dependency before the broken header. Link: https://lore.kernel.org/all/20220723045755.2676857-1-kuba@kernel.org/ v1 Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20220726215652.158167-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> |
||
Stefan Raspl
|
8b2fed8e27 |
net/smc: Pass on DMBE bit mask in IRQ handler
Make the DMBE bits, which are passed on individually in ism_move() as parameter idx, available to the receiver. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang < wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Stefan Raspl
|
0a2f4f9893 |
s390/ism: Cleanups
Reworked signature of the function to retrieve the system EID: No plausible reason to use a double pointer. And neither to pass in the device as an argument, as this identifier is by definition per system, not per device. Plus some minor consistency edits. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Wenjia Zhang < wenjia@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Jakub Kicinski
|
84c61fe1a7 |
tls: rx: do not use the standard strparser
TLS is a relatively poor fit for strparser. We pause the input every time a message is received, wait for a read which will decrypt the message, start the parser, repeat. strparser is built to delineate the messages, wrap them in individual skbs and let them float off into the stack or a different socket. TLS wants the data pages and nothing else. There's no need for TLS to keep cloning (and occasionally skb_unclone()'ing) the TCP rx queue. This patch uses a pre-allocated skb and attaches the skbs from the TCP rx queue to it as frags. TLS is careful never to modify the input skb without CoW'ing / detaching it first. Since we call TCP rx queue cleanup directly we also get back the benefit of skb deferred free. Overall this results in a 6% gain in my benchmarks. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Jakub Kicinski
|
3f92a64e44 |
tcp: allow tls to decrypt directly from the tcp rcv queue
Expose TCP rx queue accessor and cleanup, so that TLS can decrypt directly from the TCP queue. The expectation is that the caller can access the skb returned from tcp_recv_skb() and up to inq bytes worth of data (some of which may be in ->next skbs) and then call tcp_read_done() when data has been consumed. The socket lock must be held continuously across those two operations. Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Jiri Pirko
|
7b2d9a1a50 |
net: devlink: introduce nested devlink entity for line card
For the purpose of exposing device info and allow flash update which is going to be implemented in follow-up patches, introduce a possibility for a line card to expose relation to nested devlink entity. The nested devlink entity represents the line card. Example: $ devlink lc show pci/0000:01:00.0 lc 1 pci/0000:01:00.0: lc 1 state active type 16x100G nested_devlink auxiliary/mlxsw_core.lc.0 supported_types: 16x100G $ devlink dev show auxiliary/mlxsw_core.lc.0 auxiliary/mlxsw_core.lc.0 Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Luiz Augusto von Dentz
|
d0be8347c6 |
Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
This fixes the following trace which is caused by hci_rx_work starting up *after* the final channel reference has been put() during sock_close() but *before* the references to the channel have been destroyed, so instead the code now rely on kref_get_unless_zero/l2cap_chan_hold_unless_zero to prevent referencing a channel that is about to be destroyed. refcount_t: increment on 0; use-after-free. BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0 Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705 CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S W 4.14.234-00003-g1fb6d0bd49a4-dirty #28 Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT) Workqueue: hci0 hci_rx_work Call trace: dump_backtrace+0x0/0x378 show_stack+0x20/0x2c dump_stack+0x124/0x148 print_address_description+0x80/0x2e8 __kasan_report+0x168/0x188 kasan_report+0x10/0x18 __asan_load4+0x84/0x8c refcount_dec_and_test+0x20/0xd0 l2cap_chan_put+0x48/0x12c l2cap_recv_frame+0x4770/0x6550 l2cap_recv_acldata+0x44c/0x7a4 hci_acldata_packet+0x100/0x188 hci_rx_work+0x178/0x23c process_one_work+0x35c/0x95c worker_thread+0x4cc/0x960 kthread+0x1a8/0x1c4 ret_from_fork+0x10/0x18 Cc: stable@kernel.org Reported-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> |
||
Wojciech Drewek
|
6a21b0856d |
flow_offload: Introduce flow_match_pppoe
Allow to offload PPPoE filters by adding flow_rule_match_pppoe. Drivers can extract PPPoE specific fields from now on. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> |
||
Wojciech Drewek
|
46126db9c8 |
flow_dissector: Add PPPoE dissectors
Allow to dissect PPPoE specific fields which are: - session ID (16 bits) - ppp protocol (16 bits) - type (16 bits) - this is PPPoE ethertype, for now only ETH_P_PPP_SES is supported, possible ETH_P_PPP_DISC in the future The goal is to make the following TC command possible: # tc filter add dev ens6f0 ingress prio 1 protocol ppp_ses \ flower \ pppoe_sid 12 \ ppp_proto ip \ action drop Note that only PPPoE Session is supported. Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> |
||
Paul Chaignon
|
451ef36bd2 |
ip_tunnels: Add new flow flags field to ip_tunnel_key
This commit extends the ip_tunnel_key struct with a new field for the flow flags, to pass them to the route lookups. This new field will be populated and used in subsequent commits. Signed-off-by: Paul Chaignon <paul@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/f8bfd4983bd06685a59b1e3ba76ca27496f51ef3.1658759380.git.paul@isovalent.com |
||
Jakub Kicinski
|
2baf8ba532 |
wireless-next patches for v5.20
Third set of patches for v5.20. MLO work continues and we have a lot of stack changes due to that, including driver API changes. Not much driver patches except on mt76. Major changes: cfg80211/mac80211 * more prepartion for Wi-Fi 7 Multi-Link Operation (MLO) support, works with one link now * align with IEEE Draft P802.11be_D2.0 * hardware timestamps for receive and transmit mt76 * preparation for new chipset support * ACPI SAR support -----BEGIN PGP SIGNATURE----- iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmLe1k4RHGt2YWxvQGtl cm5lbC5vcmcACgkQbhckVSbrbZvxlQf8DrZIllhF0q/7Wry3JuG0gbNA+V2eI/lc OYrephsDBm/dvvyjcFWcdUzxoNk0k1+aOrx/09JijHFgCGKVwuK1+hxYVfjW2q43 9mHxJBo4NcMk1RDDM3paVuZ8QMHuYugbv2mQOZeAEq2XloAaqEM7wVE+bb4Mgtgx VAKS5du2igrSt83wl8BRMFb9MPAM1EQ3Cw7Ro5T4y+1Qm/hrBm6qWizSpqh9CXYx pDLR3pvQxiD4Axa0Uq3rUbyF4hLwciqSFOJvr2sI3q7b9YElA7wIi6NQzMkYJH6Z 7HW5K6UIQbblAaQkv2BLqpU1N6puTHUOAf5Md31vOAaOcGbSI5hbUA== =Cnxg -----END PGP SIGNATURE----- Merge tag 'wireless-next-2022-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v5.20 Third set of patches for v5.20. MLO work continues and we have a lot of stack changes due to that, including driver API changes. Not much driver patches except on mt76. Major changes: cfg80211/mac80211 - more prepartion for Wi-Fi 7 Multi-Link Operation (MLO) support, works with one link now - align with IEEE Draft P802.11be_D2.0 - hardware timestamps for receive and transmit mt76 - preparation for new chipset support - ACPI SAR support * tag 'wireless-next-2022-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (254 commits) wifi: mac80211: fix link data leak wifi: mac80211: mlme: fix disassoc with MLO wifi: mac80211: add macros to loop over active links wifi: mac80211: remove erroneous sband/link validation wifi: mac80211: mlme: transmit assoc frame with address translation wifi: mac80211: verify link addresses are different wifi: mac80211: rx: track link in RX data wifi: mac80211: optionally implement MLO multicast TX wifi: mac80211: expand ieee80211_mgmt_tx() for MLO wifi: nl80211: add MLO link ID to the NL80211_CMD_FRAME TX API wifi: mac80211: report link ID to cfg80211 on mgmt RX wifi: cfg80211: report link ID in NL80211_CMD_FRAME wifi: mac80211: add hardware timestamps for RX and TX wifi: cfg80211: add hardware timestamps to frame RX info wifi: cfg80211/nl80211: move rx management data into a struct wifi: cfg80211: add a function for reporting TX status with hardware timestamps wifi: nl80211: add RX and TX timestamp attributes wifi: ieee80211: add helper functions for detecting TM/FTM frames wifi: mac80211_hwsim: handle links for wmediumd/virtio wifi: mac80211: sta_info: fix link_sta insertion ... ==================== Link: https://lore.kernel.org/r/20220725174547.EA465C341C6@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
David S. Miller
|
e222dc8d84 |
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2022-07-20 1) Don't set DST_NOPOLICY in IPv4, a recent patch made this superfluous. From Eyal Birger. 2) Convert alg_key to flexible array member to avoid an iproute2 compile warning when built with gcc-12. From Stephen Hemminger. 3) xfrm_register_km and xfrm_unregister_km do always return 0 so change the type to void. From Zhengchao Shao. 4) Fix spelling mistake in esp6.c From Zhang Jiaming. 5) Improve the wording of comment above XFRM_OFFLOAD flags. From Petr Vaněk. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Kuniyuki Iwashima
|
0273954595 |
net: Fix data-races around sysctl_[rw]mem(_offset)?.
While reading these sysctl variables, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.
- .sysctl_rmem
- .sysctl_rwmem
- .sysctl_rmem_offset
- .sysctl_wmem_offset
- sysctl_tcp_rmem[1, 2]
- sysctl_tcp_wmem[1, 2]
- sysctl_decnet_rmem[1]
- sysctl_decnet_wmem[1]
- sysctl_tipc_rmem[1]
Fixes:
|
||
Dylan Yudaken
|
72c531f8ef |
net: copy from user before calling __get_compat_msghdr
this is in preparation for multishot receive from io_uring, where it needs to have access to the original struct user_msghdr. functionally this should be a no-op. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Dylan Yudaken <dylany@fb.com> Link: https://lore.kernel.org/r/20220714110258.1336200-3-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Matthias May
|
7074732c8f |
ip_tunnels: allow VXLAN/GENEVE to inherit TOS/TTL from VLAN
The current code allows for VXLAN and GENEVE to inherit the TOS respective the TTL when skb-protocol is ETH_P_IP or ETH_P_IPV6. However when the payload is VLAN encapsulated, then this inheriting does not work, because the visible skb-protocol is of type ETH_P_8021Q or ETH_P_8021AD. Instead of skb->protocol use skb_protocol(). Signed-off-by: Matthias May <matthias.may@westermo.com> Link: https://lore.kernel.org/r/20220721202718.10092-1-matthias.may@westermo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Jakub Kicinski
|
4a934eca7b |
bluetooth-next pull request for net-next:
- Add support for IM Networks PID 0x3568 - Add support for BCM4349B1 - Add support for CYW55572 - Add support for MT7922 VID/PID 0489/e0e2 - Add support for Realtek RTL8852C - Initial support for Isochronous Channels/ISO sockets - Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING quirk -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmLbPmsZHGx1aXoudm9u LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKSo7EACc8njgHO2pN8ncWvgu/gH8 0v1lRBoi+Tyzk5gZtdM0rIbE3t7tFqml3Kr0WsCkzV6CGgnqCw5i/MKZXCV8G4tG 0ZsY8y6NMiCFR6wQq3rdNS8NqPmlCHWm6yY2EISEM6qbtF8HwxQvXdkzznZPHgVG DR7i5fAVuGA6rIL+9NSG/TxHjJvq6Bmu3v9Uu/V062I7NrBMw9Jr0Ic1EaUqgYck QL8+653ZZMxYPxt978UekbQIYEp3YwZ5MTACtX86j2s5tlZKuivKTIZch1vSaOi1 1zC6up208+p2/4+Yq7FJ2kA6d2be3FD26oT1xymRhqiMakCRrHfdmTFpC7/J4ZgX /4mcIREkUoO2duim+91Hgt1Ww1vaD4joPwXD6AILbK1bdp0pi0gw47bQF8XO8uIh yPQqnoGWSJGD1VknPh5x7lGcAYQ3bgSg0L3TlQ4gN9Qc/emuC6UOQ9QPvwmyOilG ZrDn2p1Rpsoj8vVRTv6+CgKqLokXNUTPixCAaS4AIygRzhIzwReYYNMYUZgYMrTk Qf6bKqczEut1vq8NZiN3TTxLLOWHB5+3cdl/QRv4ZeoNXMXPmbtJ0y9Vud66GhZu vS6F+BNQapIkEbM6xrC/OepCgzrVoVn2BQuDO6SQA3pg4JxFeAl+V434Va3tqVeK 9h6GHrl8KePjh528NXbH3Q== =Ne87 -----END PGP SIGNATURE----- Merge tag 'for-net-next-2022-07-22' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add support for IM Networks PID 0x3568 - Add support for BCM4349B1 - Add support for CYW55572 - Add support for MT7922 VID/PID 0489/e0e2 - Add support for Realtek RTL8852C - Initial support for Isochronous Channels/ISO sockets - Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING quirk * tag 'for-net-next-2022-07-22' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (58 commits) Bluetooth: btusb: Detect if an ACL packet is in fact an ISO packet Bluetooth: btusb: Add support for ISO packets Bluetooth: ISO: Add broadcast support Bluetooth: Add initial implementation of BIS connections Bluetooth: Add BTPROTO_ISO socket type Bluetooth: Add initial implementation of CIS connections Bluetooth: hci_core: Introduce hci_recv_event_data Bluetooth: Convert delayed discov_off to hci_sync Bluetooth: Remove update_scan hci_request dependancy Bluetooth: Remove dead code from hci_request.c Bluetooth: btrtl: Fix typo in comment Bluetooth: MGMT: Fix holding hci_conn reference while command is queued Bluetooth: mgmt: Fix using hci_conn_abort Bluetooth: Use bt_status to convert from errno Bluetooth: Add bt_status Bluetooth: hci_sync: Split hci_dev_open_sync Bluetooth: hci_sync: Refactor remove Adv Monitor Bluetooth: hci_sync: Refactor add Adv Monitor Bluetooth: hci_sync: Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING Bluetooth: btusb: Remove HCI_QUIRK_BROKEN_ERR_DATA_REPORTING for fake CSR ... ==================== Link: https://lore.kernel.org/r/20220723002232.964796-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
Luiz Augusto von Dentz
|
f764a6c2c1 |
Bluetooth: ISO: Add broadcast support
This adds broadcast support for BTPROTO_ISO by extending the sockaddr_iso with a new struct sockaddr_iso_bc where the socket user can set the broadcast address when receiving, the SID and the BIS indexes it wants to synchronize. When using BTPROTO_ISO for broadcast the roles are: Broadcaster -> uses connect with address set to BDADDR_ANY: > tools/isotest -s 00:00:00:00:00:00 Broadcast Receiver -> uses listen with address set to broadcaster: > tools/isotest -d 00:AA:01:00:00:00 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> |
||
Luiz Augusto von Dentz
|
eca0ae4aea |
Bluetooth: Add initial implementation of BIS connections
This adds initial support for BIS/BIG which includes: == Broadcaster role: Setup a periodic advertising and create a BIG == > tools/isotest -s 00:00:00:00:00:00 isotest[63]: Connected [00:00:00:00:00:00] isotest[63]: QoS BIG 0x00 BIS 0x00 Packing 0x00 Framing 0x00] isotest[63]: Output QoS [Interval 10000 us Latency 10 ms SDU 40 PHY 0x02 RTN 2] isotest[63]: Sending ... isotest[63]: Number of packets: 1 isotest[63]: Socket jitter buffer: 80 buffer < HCI Command: LE Set Perio.. (0x08|0x003e) plen 7 ... > HCI Event: Command Complete (0x0e) plen 4 LE Set Periodic Advertising Parameters (0x08|0x003e) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Perio.. (0x08|0x003f) plen 7 ... > HCI Event: Command Complete (0x0e) plen 4 LE Set Periodic Advertising Data (0x08|0x003f) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Perio.. (0x08|0x0040) plen 2 ... > HCI Event: Command Complete (0x0e) plen 4 LE Set Periodic Advertising Enable (0x08|0x0040) ncmd 1 Status: Success (0x00) < HCI Command: LE Create B.. (0x08|0x0068) plen 31 ... > HCI Event: Command Status (0x0f) plen 4 LE Create Broadcast Isochronous Group (0x08|0x0068) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 21 LE Broadcast Isochronous Group Complete (0x1b) ... == Broadcast Receiver role: Create a PA Sync and BIG Sync == > tools/isotest -i hci1 -d 00:AA:01:00:00:00 isotest[66]: Waiting for connection 00:AA:01:00:00:00... < HCI Command: LE Periodic Advert.. (0x08|0x0044) plen 14 ... > HCI Event: Command Status (0x0f) plen 4 LE Periodic Advertising Create Sync (0x08|0x0044) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Extended Sca.. (0x08|0x0041) plen 8 ... > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Extended Sca.. (0x08|0x0042) plen 6 ... > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Enable (0x08|0x0042) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 29 LE Extended Advertising Report (0x0d) ... > HCI Event: LE Meta Event (0x3e) plen 16 LE Periodic Advertising Sync Established (0x0e) ... < HCI Command: LE Broadcast Isoch.. (0x08|0x006b) plen 25 ... > HCI Event: Command Status (0x0f) plen 4 LE Broadcast Isochronous Group Create Sync (0x08|0x006b) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 17 LE Broadcast Isochronous Group Sync Estabilished (0x1d) ... Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> |
||
Luiz Augusto von Dentz
|
ccf74f2390 |
Bluetooth: Add BTPROTO_ISO socket type
This introduces a new socket type BTPROTO_ISO which can be enabled with use of ISO Socket experiemental UUID, it can used to initiate/accept connections and transfer packets between userspace and kernel similarly to how BTPROTO_SCO works: Central -> uses connect with address set to destination bdaddr: > tools/isotest -s 00:AA:01:00:00:00 Peripheral -> uses listen: > tools/isotest -d Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> |
||
Luiz Augusto von Dentz
|
26afbd826e |
Bluetooth: Add initial implementation of CIS connections
This adds the initial implementation of CIS connections and introduces the ISO packets/links. == Central: Set CIG Parameters, create a CIS and Setup Data Path == > tools/isotest -s <address> < HCI Command: LE Extended Create... (0x08|0x0043) plen 26 ... > HCI Event: Command Status (0x0f) plen 4 LE Extended Create Connection (0x08|0x0043) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 31 LE Enhanced Connection Complete (0x0a) ... < HCI Command: LE Create Connected... (0x08|0x0064) plen 5 ... > HCI Event: Command Status (0x0f) plen 4 LE Create Connected Isochronous Stream (0x08|0x0064) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 29 LE Connected Isochronous Stream Established (0x19) ... < HCI Command: LE Setup Isochronou.. (0x08|0x006e) plen 13 ... > HCI Event: Command Complete (0x0e) plen 6 LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 Status: Success (0x00) Handle: 257 < HCI Command: LE Setup Isochronou.. (0x08|0x006e) plen 13 ... > HCI Event: Command Complete (0x0e) plen 6 LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 Status: Success (0x00) Handle: 257 == Peripheral: Accept CIS and Setup Data Path == > tools/isotest -d HCI Event: LE Meta Event (0x3e) plen 7 LE Connected Isochronous Stream Request (0x1a) ... < HCI Command: LE Accept Co.. (0x08|0x0066) plen 2 ... > HCI Event: LE Meta Event (0x3e) plen 29 LE Connected Isochronous Stream Established (0x19) ... < HCI Command: LE Setup Is.. (0x08|0x006e) plen 13 ... > HCI Event: Command Complete (0x0e) plen 6 LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 Status: Success (0x00) Handle: 257 < HCI Command: LE Setup Is.. (0x08|0x006e) plen 13 ... > HCI Event: Command Complete (0x0e) plen 6 LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 Status: Success (0x00) Handle: 257 Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> |