linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-04 18:13:04 +00:00

Author	SHA1	Message	Date
Pablo Neira Ayuso	9dd732e0bd	netfilter: nf_tables: memleak flow rule from commit path Abort path release flow rule object, however, commit path does not. Update code to destroy these objects before releasing the transaction. Fixes: `c9626a2cbd` ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-06-06 17:31:46 +02:00
Pablo Neira Ayuso	c271cc9feb	netfilter: nf_tables: release new hooks on unsupported flowtable flags Release the list of new hooks that are pending to be registered in case that unsupported flowtable flags are provided. Fixes: `78d9f48f7f` ("netfilter: nf_tables: add devices to existing flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-06-06 17:31:46 +02:00
Pablo Neira Ayuso	2c9e455977	netfilter: nf_tables: always initialize flowtable hook list in transaction The hook list is used if nft_trans_flowtable_update(trans) == true. However, initialize this list for other cases for safety reasons. Fixes: `78d9f48f7f` ("netfilter: nf_tables: add devices to existing flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-06-02 23:31:11 +02:00
Pablo Neira Ayuso	b6d9014a33	netfilter: nf_tables: delete flowtable hooks via transaction list Remove inactive bool field in nft_hook object that was introduced in `abadb2f865` ("netfilter: nf_tables: delete devices from flowtable"). Move stale flowtable hooks to transaction list instead. Deleting twice the same device does not result in ENOENT. Fixes: `abadb2f865` ("netfilter: nf_tables: delete devices from flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-06-02 09:49:49 +02:00
Pablo Neira Ayuso	ab5e5c062f	netfilter: nf_tables: use kfree_rcu(ptr, rcu) to release hooks in clean_net path Use kfree_rcu(ptr, rcu) variant instead as described by `ae089831ff` ("netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant"). Fixes: `f9a43007d3` ("netfilter: nf_tables: double hook unregistration in netns path") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-06-01 16:01:54 +02:00
Florian Westphal	282e5f8fe9	netfilter: nat: really support inet nat without l3 address When no l3 address is given, priv->family is set to NFPROTO_INET and the evaluation function isn't called. Call it too so l4-only rewrite can work. Also add a test case for this. Fixes: `a33f387ecd` ("netfilter: nft_nat: allow to specify layer 4 protocol NAT only") Reported-by: Yi Chen <yiche@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-06-01 15:53:39 +02:00
Eric Dumazet	0a375c8224	tcp: tcp_rtx_synack() can be called from process context Laurent reported the enclosed report [1] This bug triggers with following coditions: 0) Kernel built with CONFIG_DEBUG_PREEMPT=y 1) A new passive FastOpen TCP socket is created. This FO socket waits for an ACK coming from client to be a complete ESTABLISHED one. 2) A socket operation on this socket goes through lock_sock() release_sock() dance. 3) While the socket is owned by the user in step 2), a retransmit of the SYN is received and stored in socket backlog. 4) At release_sock() time, the socket backlog is processed while in process context. 5) A SYNACK packet is cooked in response of the SYN retransmit. 6) -> tcp_rtx_synack() is called in process context. Before blamed commit, tcp_rtx_synack() was always called from BH handler, from a timer handler. Fix this by using TCP_INC_STATS() & NET_INC_STATS() which do not assume caller is in non preemptible context. [1] BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180 caller is tcp_rtx_synack.part.0+0x36/0xc0 CPU: 10 PID: 2180 Comm: epollpep Tainted: G OE 5.16.0-0.bpo.4-amd64 #1 Debian 5.16.12-1~bpo11+1 Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021 Call Trace: <TASK> dump_stack_lvl+0x48/0x5e check_preemption_disabled+0xde/0xe0 tcp_rtx_synack.part.0+0x36/0xc0 tcp_rtx_synack+0x8d/0xa0 ? kmem_cache_alloc+0x2e0/0x3e0 ? apparmor_file_alloc_security+0x3b/0x1f0 inet_rtx_syn_ack+0x16/0x30 tcp_check_req+0x367/0x610 tcp_rcv_state_process+0x91/0xf60 ? get_nohz_timer_target+0x18/0x1a0 ? lock_timer_base+0x61/0x80 ? preempt_count_add+0x68/0xa0 tcp_v4_do_rcv+0xbd/0x270 __release_sock+0x6d/0xb0 release_sock+0x2b/0x90 sock_setsockopt+0x138/0x1140 ? __sys_getsockname+0x7e/0xc0 ? aa_sk_perm+0x3e/0x1a0 __sys_setsockopt+0x198/0x1e0 __x64_sys_setsockopt+0x21/0x30 do_syscall_64+0x38/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `168a8f5805` ("tcp: TCP Fast Open Server - main code path") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch> Acked-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20220530213713.601888-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-31 21:40:10 -07:00
Jakub Kicinski	b3c0a9efbe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Missing proper sanitization for nft_set_desc_concat_parse(). 2) Missing mutex in nf_tables pre_exit path. 3) Possible double hook unregistration from clean_net path. 4) Missing FLOWI_FLAG_ANYSRC flag in flowtable route lookup. Fix incorrect source and destination address in case of NAT. Patch from wenxu. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: flowtable: fix nft_flow_route source address for nat case netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag netfilter: nf_tables: double hook unregistration in netns path netfilter: nf_tables: hold mutex on netns pre_exit path netfilter: nf_tables: sanitize nft_set_desc_concat_parse() ==================== Link: https://lore.kernel.org/r/20220531215839.84765-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-31 20:58:24 -07:00
Guoju Fang	2e8728c955	net: sched: add barrier to fix packet stuck problem for lockless qdisc In qdisc_run_end(), the spin_unlock() only has store-release semantic, which guarantees all earlier memory access are visible before it. But the subsequent test_bit() has no barrier semantics so may be reordered ahead of the spin_unlock(). The store-load reordering may cause a packet stuck problem. The concurrent operations can be described as below, CPU 0 \| CPU 1 qdisc_run_end() \| qdisc_run_begin() . \| . ----> /* may be reorderd here */ \| . \| . \| . \| spin_unlock() \| set_bit() \| . \| smp_mb__after_atomic() ---- test_bit() \| spin_trylock() . \| . Consider the following sequence of events: CPU 0 reorder test_bit() ahead and see MISSED = 0 CPU 1 calls set_bit() CPU 1 calls spin_trylock() and return fail CPU 0 executes spin_unlock() At the end of the sequence, CPU 0 calls spin_unlock() and does nothing because it see MISSED = 0. The skb on CPU 1 has beed enqueued but no one take it, until the next cpu pushing to the qdisc (if ever ...) will notice and dequeue it. This patch fix this by adding one explicit barrier. As spin_unlock() and test_bit() ordering is a store-load ordering, a full memory barrier smp_mb() is needed here. Fixes: `a90c57f2ce` ("net: sched: fix packet stuck problem for lockless qdisc") Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com> Link: https://lore.kernel.org/r/20220528101628.120193-1-gjfang@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-31 20:39:28 -07:00
wenxu	97629b237a	netfilter: flowtable: fix nft_flow_route source address for nat case For snat and dnat cases, the saddr should be taken from reverse tuple. Fixes: `3412e16418` (netfilter: flowtable: nft_flow_route use more data for reverse route) Signed-off-by: wenxu <wenxu@chinatelecom.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-31 23:32:53 +02:00
wenxu	f1896d45fe	netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag The nf_flow_table gets route through ip_route_output_key. If the saddr is not local one, then FLOWI_FLAG_ANYSRC flag should be set. Without this flag, the route lookup for other_dst will fail. Fixes: `3412e16418` (netfilter: flowtable: nft_flow_route use more data for reverse route) Signed-off-by: wenxu <wenxu@chinatelecom.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-31 23:14:03 +02:00
Pablo Neira Ayuso	f9a43007d3	netfilter: nf_tables: double hook unregistration in netns path __nft_release_hooks() is called from pre_netns exit path which unregisters the hooks, then the NETDEV_UNREGISTER event is triggered which unregisters the hooks again. [ 565.221461] WARNING: CPU: 18 PID: 193 at net/netfilter/core.c:495 __nf_unregister_net_hook+0x247/0x270 [...] [ 565.246890] CPU: 18 PID: 193 Comm: kworker/u64:1 Tainted: G E 5.18.0-rc7+ #27 [ 565.253682] Workqueue: netns cleanup_net [ 565.257059] RIP: 0010:__nf_unregister_net_hook+0x247/0x270 [...] [ 565.297120] Call Trace: [ 565.300900] <TASK> [ 565.304683] nf_tables_flowtable_event+0x16a/0x220 [nf_tables] [ 565.308518] raw_notifier_call_chain+0x63/0x80 [ 565.312386] unregister_netdevice_many+0x54f/0xb50 Unregister and destroy netdev hook from netns pre_exit via kfree_rcu so the NETDEV_UNREGISTER path see unregistered hooks. Fixes: `767d1216bf` ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-31 23:13:10 +02:00
Pablo Neira Ayuso	3923b1e440	netfilter: nf_tables: hold mutex on netns pre_exit path clean_net() runs in workqueue while walking over the lists, grab mutex. Fixes: `767d1216bf` ("netfilter: nftables: fix possible UAF over chains from packet path in netns") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-31 23:13:10 +02:00
Pablo Neira Ayuso	fecf31ee39	netfilter: nf_tables: sanitize nft_set_desc_concat_parse() Add several sanity checks for nft_set_desc_concat_parse(): - validate desc->field_count not larger than desc->field_len array. - field length cannot be larger than desc->field_len (ie. U8_MAX) - total length of the concatenation cannot be larger than register array. Joint work with Florian Westphal. Fixes: `f3a2181e16` ("netfilter: nf_tables: Support for sets with multiple ranged fields") Reported-by: <zhangziming.zzm@antgroup.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-31 23:13:10 +02:00
Juergen Gross	09e545f738	xen/netback: fix incorrect usage of RING_HAS_UNCONSUMED_REQUESTS() Commit `6fac592cca` ("xen: update ring.h") missed to fix one use case of RING_HAS_UNCONSUMED_REQUESTS(). Reported-by: Jan Beulich <jbeulich@suse.com> Fixes: `6fac592cca` ("xen: update ring.h") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20220530113459.20124-1-jgross@suse.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-05-31 12:22:22 +02:00
Arun Ajith S	3e0b8f529c	net/ipv6: Expand and rename accept_unsolicited_na to accept_untracked_na RFC 9131 changes default behaviour of handling RX of NA messages when the corresponding entry is absent in the neighbour cache. The current implementation is limited to accept just unsolicited NAs. However, the RFC is more generic where it also accepts solicited NAs. Both types should result in adding a STALE entry for this case. Expand accept_untracked_na behaviour to also accept solicited NAs to be compliant with the RFC and rename the sysctl knob to accept_untracked_na. Fixes: `f9a2fb7331` ("net/ipv6: Introduce accept_unsolicited_na knob to implement router-side changes for RFC9131") Signed-off-by: Arun Ajith S <aajith@arista.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220530101414.65439-1-aajith@arista.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-05-31 11:36:57 +02:00
Hangbin Liu	4a1f14df55	bonding: show NS IPv6 targets in proc master info When adding bond new parameter ns_targets. I forgot to print this in bond master proc info. After updating, the bond master info will look like: ARP IP target/s (n.n.n.n form): 192.168.1.254 NS IPv6 target/s (XX::XX form): 2022::1, 2022::2 Fixes: `4e24be018e` ("bonding: add new parameter ns_targets") Reported-by: Li Liang <liali@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20220530062639.37179-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2022-05-31 10:50:02 +02:00
Viorel Suman	d7cd5e06c9	net: phy: at803x: disable WOL at probe Before `7beecaf7d5` ("net: phy: at803x: improve the WOL feature") patch "at803x_get_wol" implementation used AT803X_INTR_ENABLE_WOL value to set WAKE_MAGIC flag, and now AT803X_WOL_EN value is used for the same purpose. The problem here is that the values of these two bits are different after hardware reset: AT803X_INTR_ENABLE_WOL=0 after hardware reset, but AT803X_WOL_EN=1. So now, if called right after boot, "at803x_get_wol" will set WAKE_MAGIC flag, even if WOL function is not enabled by calling "at803x_set_wol" function. The patch disables WOL function on probe thus the behavior is consistent. Fixes: `7beecaf7d5` ("net: phy: at803x: improve the WOL feature") Signed-off-by: Viorel Suman <viorel.suman@nxp.com> Link: https://lore.kernel.org/r/20220527084935.235274-1-viorel.suman@oss.nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-30 21:21:25 -07:00
huhai	3a2cd89bfb	net: ipv4: Avoid bounds check warning Fix the following build warning when CONFIG_IPV6 is not set: In function ‘fortify_memcpy_chk’, inlined from ‘tcp_md5_do_add’ at net/ipv4/tcp_ipv4.c:1210:2: ./include/linux/fortify-string.h:328:4: error: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning] 328 \| __write_overflow_field(p_size_field, size); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: huhai <huhai@kylinos.cn> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220526101213.2392980-1-zhanggenjian@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-30 21:21:12 -07:00
David S. Miller	90343f5732	Merge branch 'sfc-fixes' Íñigo Huguet says: ==================== sfc: fix some efx_separate_tx_channels errors Trying to load sfc driver with modparam efx_separate_tx_channels=1 resulted in errors during initialization and not being able to use the NIC. This patches fix a few bugs and make it work again. v2: * added Martin's patch instead of a previous mine. Mine one solved some of the initialization errors, but Martin's solves them also in all possible cases. * removed whitespaces cleanup, as requested by Jakub ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-29 12:12:53 +01:00
Íñigo Huguet	c308dfd1b4	sfc: fix wrong tx channel offset with efx_separate_tx_channels tx_channel_offset is calculated in efx_allocate_msix_channels, but it is also calculated again in efx_set_channels because it was originally done there, and when efx_allocate_msix_channels was introduced it was forgotten to be removed from efx_set_channels. Moreover, the old calculation is wrong when using efx_separate_tx_channels because now we can have XDP channels after the TX channels, so n_channels - n_tx_channels doesn't point to the first TX channel. Remove the old calculation from efx_set_channels, and add the initialization of this variable if MSI or legacy interrupts are used, next to the initialization of the rest of the related variables, where it was missing. Fixes: `3990a8fffb` ("sfc: allocate channels for XDP tx queues") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-29 12:12:53 +01:00
Martin Habets	2e102b53f8	sfc: fix considering that all channels have TX queues Normally, all channels have RX and TX queues, but this is not true if modparam efx_separate_tx_channels=1 is used. In that cases, some channels only have RX queues and others only TX queues (or more preciselly, they have them allocated, but not initialized). Fix efx_channel_has_tx_queues to return the correct value for this case too. Messages shown at probe time before the fix: sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0 ------------[ cut here ]------------ netdevice: ens6f0np0: failed to initialise TXQ -1 WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc] [...] stripped RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc] [...] stripped Call Trace: efx_init_tx_queue+0xaa/0xf0 [sfc] efx_start_channels+0x49/0x120 [sfc] efx_start_all+0x1f8/0x430 [sfc] efx_net_open+0x5a/0xe0 [sfc] __dev_open+0xd0/0x190 __dev_change_flags+0x1b3/0x220 dev_change_flags+0x21/0x60 [...] stripped Messages shown at remove time before the fix: sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues sfc 0000:03:00.0 ens6f0np0: failed to flush queues Fixes: `8700aff089` ("sfc: fix channel allocation with brute force") Reported-by: Tianhao Zhao <tizhao@redhat.com> Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Tested-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-29 12:12:53 +01:00
Christophe JAILLET	18eeb4dea6	net: enetc: Use pci_release_region() to release some resources Some resources are allocated using pci_request_region(). It is more straightforward to release them with pci_release_region(). Fixes: `231ece36f5` ("enetc: Add mdio bus driver for the PCIe MDIO endpoint") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 20:20:42 +01:00
Hangbin Liu	5e1eeef69c	bonding: NS target should accept link local address When setting bond NS target, we use bond_is_ip6_target_ok() to check if the address valid. The link local address was wrongly rejected in bond_changelink(), as most time the user just set the ARP/NS target to gateway, while the IPv6 gateway is always a link local address when user set up interface via SLAAC. So remove the link local addr check when setting bond NS target. Fixes: `129e3c1bab` ("bonding: add new option ns_ip6_target") Reported-by: Li Liang <liali@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 15:29:56 +01:00
keliu	911799172d	net: nfc: Directly use ida_alloc()/free() Use ida_alloc()/ida_free() instead of deprecated ida_simple_get()/ida_simple_remove() . Signed-off-by: keliu <liuke94@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 15:28:47 +01:00
Yu Xiao	0649e4d634	nfp: only report pause frame configuration for physical device Only report pause frame configuration for physical device. Logical port of both PCI PF and PCI VF do not support it. Fixes: `9fdc5d85a8` ("nfp: update ethtool reporting of pauseframe control") Signed-off-by: Yu Xiao <yu.xiao@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 12:44:34 +01:00
Sean Anderson	d8064c1056	net: dpaa: Convert to SPDX identifiers This converts these files to use SPDX idenfifiers instead of license text. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Reviewed-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 12:43:31 +01:00
Eric Dumazet	1182576529	tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd syzbot got a new report [1] finally pointing to a very old bug, added in initial support for MTU probing. tcp_mtu_probe() has checks about starting an MTU probe if tcp_snd_cwnd(tp) >= 11. But nothing prevents tcp_snd_cwnd(tp) to be reduced later and before the MTU probe succeeds. This bug would lead to potential zero-divides. Debugging added in commit `4057037535` ("tcp: add accessors to read/set tp->snd_cwnd") has paid off :) While we are at it, address potential overflows in this code. [1] WARNING: CPU: 1 PID: 14132 at include/net/tcp.h:1219 tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712 Modules linked in: CPU: 1 PID: 14132 Comm: syz-executor.2 Not tainted 5.18.0-syzkaller-07857-gbabf0bb978e3 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tcp_snd_cwnd_set include/net/tcp.h:1219 [inline] RIP: 0010:tcp_mtup_probe_success+0x366/0x570 net/ipv4/tcp_input.c:2712 Code: 74 08 48 89 ef e8 da 80 17 f9 48 8b 45 00 65 48 ff 80 80 03 00 00 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 aa b0 c5 f8 <0f> 0b e9 16 fe ff ff 48 8b 4c 24 08 80 e1 07 38 c1 0f 8c c7 fc ff RSP: 0018:ffffc900079e70f8 EFLAGS: 00010287 RAX: ffffffff88c0f7f6 RBX: ffff8880756e7a80 RCX: 0000000000040000 RDX: ffffc9000c6c4000 RSI: 0000000000031f9e RDI: 0000000000031f9f RBP: 0000000000000000 R08: ffffffff88c0f606 R09: ffffc900079e7520 R10: ffffed101011226d R11: 1ffff1101011226c R12: 1ffff1100eadcf50 R13: ffff8880756e72c0 R14: 1ffff1100eadcf89 R15: dffffc0000000000 FS: 00007f643236e700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ab3f1e2a0 CR3: 0000000064fe7000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_clean_rtx_queue+0x223a/0x2da0 net/ipv4/tcp_input.c:3356 tcp_ack+0x1962/0x3c90 net/ipv4/tcp_input.c:3861 tcp_rcv_established+0x7c8/0x1ac0 net/ipv4/tcp_input.c:5973 tcp_v6_do_rcv+0x57b/0x1210 net/ipv6/tcp_ipv6.c:1476 sk_backlog_rcv include/net/sock.h:1061 [inline] __release_sock+0x1d8/0x4c0 net/core/sock.c:2849 release_sock+0x5d/0x1c0 net/core/sock.c:3404 sk_stream_wait_memory+0x700/0xdc0 net/core/stream.c:145 tcp_sendmsg_locked+0x111d/0x3fc0 net/ipv4/tcp.c:1410 tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1448 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] __sys_sendto+0x439/0x5c0 net/socket.c:2119 __do_sys_sendto net/socket.c:2131 [inline] __se_sys_sendto net/socket.c:2127 [inline] __x64_sys_sendto+0xda/0xf0 net/socket.c:2127 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f6431289109 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f643236e168 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f643139c100 RCX: 00007f6431289109 RDX: 00000000d0d0c2ac RSI: 0000000020000080 RDI: 000000000000000a RBP: 00007f64312e308d R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff372533af R14: 00007f643236e300 R15: 0000000000022000 Fixes: `5d424d5a67` ("[TCP]: MTU probing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 12:42:08 +01:00
Ke Liu	2f1de254a2	net: phy: Directly use ida_alloc()/free() Use ida_alloc()/ida_free() instead of deprecated ida_simple_get()/ida_simple_remove(). Signed-off-by: Ke Liu <liuke94@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 12:40:09 +01:00
Guangguan Wang	e225c9a5a7	net/smc: fixes for converting from "struct smc_cdc_tx_pend *" to "struct smc_wr_tx_pend_priv " "struct smc_cdc_tx_pend *" can not directly convert to "struct smc_wr_tx_pend_priv ". Fixes: `2bced6aefa` ("net/smc: put slot when connection is killed") Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-28 12:36:26 +01:00
Jakub Kicinski	9bae058ab5	Merge branch 'net-ipa-fix-page-free-in-two-spots' Alex Elder says: ==================== net: ipa: fix page free in two spots When a receive buffer is not wrapped in an SKB and passed to the network stack, the (compound) page gets freed within the IPA driver. This is currently quite rare. The pages are freed using __free_pages(), but they should instead be freed using page_put(). This series fixes this, in two spots. These patches work for the current linus/master branch, but won't apply cleanly to earlier stable branches. (Nevertheless, the fix is a trivial substitution everwhere __free_pages() is called.) ==================== Link: https://lore.kernel.org/r/20220526152314.1405629-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-27 18:29:54 -07:00
Alex Elder	70132763d5	net: ipa: fix page free in ipa_endpoint_replenish_one() Currently the (possibly compound) pages used for receive buffers are freed using __free_pages(). But according to this comment above the definition of that function, that's wrong: If you want to use the page's reference count to decide when to free the allocation, you should allocate a compound page, and use put_page() instead of __free_pages(). Convert the call to __free_pages() in ipa_endpoint_replenish_one() to use put_page() instead. Fixes: `6a606b9015` ("net: ipa: allocate transaction in replenish loop") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-27 18:29:50 -07:00
Alex Elder	155c0c90bc	net: ipa: fix page free in ipa_endpoint_trans_release() Currently the (possibly compound) page used for receive buffers are freed using __free_pages(). But according to this comment above the definition of that function, that's wrong: If you want to use the page's reference count to decide when to free the allocation, you should allocate a compound page, and use put_page() instead of __free_pages(). Convert the call to __free_pages() in ipa_endpoint_trans_release() to use put_page() instead. Fixes: `ed23f02680` ("net: ipa: define per-endpoint receive buffer size") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-27 18:29:50 -07:00
Alexandru Tachici	4dc160a52d	dt-bindings: net: Update ADIN PHY maintainers Update the dt-bindings maintainers section. Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Link: https://lore.kernel.org/r/20220526141318.77146-1-alexandru.tachici@analog.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-27 18:20:21 -07:00
Jakub Kicinski	6b51935a26	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2022-05-28 We've added 2 non-merge commits during the last 1 day(s) which contain a total of 2 files changed, 6 insertions(+), 10 deletions(-). The main changes are: 1) Fix ldx_probe_mem instruction in interpreter by properly zero-extending the bpf_probe_read_kernel() read content, from Menglong Dong. 2) Fix stacktrace_build_id BPF selftest given urandom_read has been renamed into urandom_read_iter in random driver, from Song Liu. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix probe read error in ___bpf_prog_run() selftests/bpf: fix stacktrace_build_id with missing kprobe/urandom_read ==================== Link: https://lore.kernel.org/r/20220527235042.8526-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-27 17:17:52 -07:00
Menglong Dong	caff1fa411	bpf: Fix probe read error in ___bpf_prog_run() I think there is something wrong with BPF_PROBE_MEM in ___bpf_prog_run() in big-endian machine. Let's make a test and see what will happen if we want to load a 'u16' with BPF_PROBE_MEM. Let's make the src value '0x0001', the value of dest register will become 0x0001000000000000, as the value will be loaded to the first 2 byte of DST with following code: bpf_probe_read_kernel(&DST, SIZE, (const void )(long) (SRC + insn->off)); Obviously, the value in DST is not correct. In fact, we can compare BPF_PROBE_MEM with LDX_MEM_H: DST = (SIZE *)(unsigned long) (SRC + insn->off); If the memory load is done by LDX_MEM_H, the value in DST will be 0x1 now. And I think this error results in the test case 'test_bpf_sk_storage_map' failing: test_bpf_sk_storage_map:PASS:bpf_iter_bpf_sk_storage_map__open_and_load 0 nsec test_bpf_sk_storage_map:PASS:socket 0 nsec test_bpf_sk_storage_map:PASS:map_update 0 nsec test_bpf_sk_storage_map:PASS:socket 0 nsec test_bpf_sk_storage_map:PASS:map_update 0 nsec test_bpf_sk_storage_map:PASS:socket 0 nsec test_bpf_sk_storage_map:PASS:map_update 0 nsec test_bpf_sk_storage_map:PASS:attach_iter 0 nsec test_bpf_sk_storage_map:PASS:create_iter 0 nsec test_bpf_sk_storage_map:PASS:read 0 nsec test_bpf_sk_storage_map:FAIL:ipv6_sk_count got 0 expected 3 $10/26 bpf_iter/bpf_sk_storage_map:FAIL The code of the test case is simply, it will load sk->sk_family to the register with BPF_PROBE_MEM and check if it is AF_INET6. With this patch, now the test case 'bpf_iter' can pass: $10 bpf_iter:OK Fixes: `2a02759ef5` ("bpf: Add support for BTF pointers to interpreter") Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jiang Biao <benbjiang@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20220524021228.533216-1-imagedong@tencent.com	2022-05-28 01:09:18 +02:00
Song Liu	59ed76fe2f	selftests/bpf: fix stacktrace_build_id with missing kprobe/urandom_read Kernel function urandom_read is replaced with urandom_read_iter. Therefore, kprobe on urandom_read is not working any more: [root@eth50-1 bpf]# ./test_progs -n 161 test_stacktrace_build_id:PASS:skel_open_and_load 0 nsec libbpf: kprobe perf_event_open() failed: No such file or directory libbpf: prog 'oncpu': failed to create kprobe 'urandom_read+0x0' \ perf event: No such file or directory libbpf: prog 'oncpu': failed to auto-attach: -2 test_stacktrace_build_id:FAIL:attach_tp err -2 161 stacktrace_build_id:FAIL Fix this by replacing urandom_read with urandom_read_iter in the test. Fixes: `1b388e7765` ("random: convert to using fops->read_iter()") Reported-by: Mykola Lysenko <mykolal@fb.com> Signed-off-by: Song Liu <song@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20220526191608.2364049-1-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-27 07:57:25 -07:00
Carlo Lobrano	2c262b21de	net: usb: qmi_wwan: add Telit 0x1250 composition Add support for Telit LN910Cx 0x1250 composition 0x1250: rmnet, tty, tty, tty, tty Signed-off-by: Carlo Lobrano <c.lobrano@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-27 12:12:40 +01:00
Raju Lakkaraju	79dfeb2916	net: lan743x: PCI11010 / PCI11414 fix Fix the MDIO interface declarations to reflect what is currently supported by the PCI11010 / PCI11414 devices (C22 for RGMII and C22_C45 for SGMII) Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-27 12:09:46 +01:00
David S. Miller	55919b32d1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following contain more Netfilter fixes for net: 1) syzbot warning in nfnetlink bind, from Florian. 2) Refetch conntrack after __nf_conntrack_confirm(), from Florian Westphal. 3) Move struct nf_ct_timeout back at the bottom of the ctnl_time, to where it before recent update, also from Florian. 4) Add NL_SET_BAD_ATTR() to nf_tables netlink for proper set element commands error reporting. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-27 11:08:12 +01:00
Pablo Neira Ayuso	b53c116642	netfilter: nf_tables: set element extended ACK reporting support Report the element that causes problems via netlink extended ACK for set element commands. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-27 11:16:38 +02:00
Florian Westphal	aeed55a08d	netfilter: cttimeout: fix slab-out-of-bounds read in cttimeout_net_exit syzbot reports: BUG: KASAN: slab-out-of-bounds in __list_del_entry_valid+0xcc/0xf0 lib/list_debug.c:42 [..] list_del include/linux/list.h:148 [inline] cttimeout_net_exit+0x211/0x540 net/netfilter/nfnetlink_cttimeout.c:617 No reproducer so far. Looking at recent changes in this area its clear that the free_head must not be at the end of the structure because nf_ct_timeout structure has variable size. Reported-by: <syzbot+92968395eedbdbd3617d@syzkaller.appspotmail.com> Fixes: `78222bacfc` ("netfilter: cttimeout: decouple unlink and free on netns destruction") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-27 11:16:38 +02:00
Florian Westphal	56b14ecec9	netfilter: conntrack: re-fetch conntrack after insertion In case the conntrack is clashing, insertion can free skb->_nfct and set skb->_nfct to the already-confirmed entry. This wasn't found before because the conntrack entry and the extension space used to free'd after an rcu grace period, plus the race needs events enabled to trigger. Reported-by: <syzbot+793a590957d9c1b96620@syzkaller.appspotmail.com> Fixes: `71d8c47fc6` ("netfilter: conntrack: introduce clash resolution on insertion race") Fixes: `2ad9d7747c` ("netfilter: conntrack: free extension area immediately") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-27 11:16:34 +02:00
Florian Westphal	ffd219efd9	netfilter: nfnetlink: fix warn in nfnetlink_unbind syzbot reports following warn: WARNING: CPU: 0 PID: 3600 at net/netfilter/nfnetlink.c:703 nfnetlink_unbind+0x357/0x3b0 net/netfilter/nfnetlink.c:694 The syzbot generated program does this: socket(AF_NETLINK, SOCK_RAW, NETLINK_NETFILTER) = 3 setsockopt(3, SOL_NETLINK, NETLINK_DROP_MEMBERSHIP, [1], 4) = 0 ... which triggers 'WARN_ON_ONCE(nfnlnet->ctnetlink_listeners == 0)' check. Instead of counting, just enable reporting for every bind request and check if we still have listeners on unbind. While at it, also add the needed bounds check on nfnl_group2type[] access. Reported-by: <syzbot+4903218f7fba0a2d6226@syzkaller.appspotmail.com> Reported-by: <syzbot+afd2d80e495f96049571@syzkaller.appspotmail.com> Fixes: `2794cdb0b9` ("netfilter: nfnetlink: allow to detect if ctnetlink listeners exist") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-05-27 11:16:33 +02:00
Miaoqian Lin	02ded5a173	net: dsa: mv88e6xxx: Fix refcount leak in mv88e6xxx_mdios_register of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when done. mv88e6xxx_mdio_register() pass the device node to of_mdiobus_register(). We don't need the device node after it. Add missing of_node_put() to avoid refcount leak. Fixes: `a3c53be55c` ("net: dsa: mv88e6xxx: Support multiple MDIO busses") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-27 08:02:33 +01:00
Miaoqian Lin	5dd89d2fc4	net: ethernet: ti: am65-cpsw-nuss: Fix some refcount leaks of_get_child_by_name() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. am65_cpsw_init_cpts() and am65_cpsw_nuss_probe() don't release the refcount in error case. Add missing of_node_put() to avoid refcount leak. Fixes: `b1f66a5bee` ("net: ethernet: ti: am65-cpsw-nuss: enable packet timestamping support") Fixes: `93a7653031` ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-27 08:00:40 +01:00
Dan Carpenter	e7e7104e2d	net: ethernet: mtk_eth_soc: out of bounds read in mtk_hwlro_get_fdir_entry() The "fsp->location" variable comes from user via ethtool_get_rxnfc(). Check that it is valid to prevent an out of bounds read. Fixes: `7aab747e55` ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-05-27 07:59:41 +01:00
Vincent Ray	a54ce37036	net: sched: fixed barrier to prevent skbuff sticking in qdisc backlog In qdisc_run_begin(), smp_mb__before_atomic() used before test_bit() does not provide any ordering guarantee as test_bit() is not an atomic operation. This, added to the fact that the spin_trylock() call at the beginning of qdisc_run_begin() does not guarantee acquire semantics if it does not grab the lock, makes it possible for the following statement : if (test_bit(__QDISC_STATE_MISSED, &qdisc->state)) to be executed before an enqueue operation called before qdisc_run_begin(). As a result the following race can happen : CPU 1 CPU 2 qdisc_run_begin() qdisc_run_begin() /* true / set(MISSED) . / returns false / . . / sees MISSED = 1 / . / so qdisc not empty / . __qdisc_run() . . . pfifo_fast_dequeue() ----> / may be done here / . \| . clear(MISSED) \| . . \| . smp_mb __after_atomic(); \| . . \| . / recheck the queue / \| . / nothing => exit / \| enqueue(skb1) \| . \| qdisc_run_begin() \| . \| spin_trylock() / fail / \| . \| smp_mb__before_atomic() / not enough / \| . ---- if (test_bit(MISSED)) return false; / exit */ In the above scenario, CPU 1 and CPU 2 both try to grab the qdisc->seqlock at the same time. Only CPU 2 succeeds and enters the bypass code path, where it emits its skb then calls __qdisc_run(). CPU1 fails, sets MISSED and goes down the traditionnal enqueue() + dequeue() code path. But when executing qdisc_run_begin() for the second time, after enqueuing its skbuff, it sees the MISSED bit still set (by itself) and consequently chooses to exit early without setting it again nor trying to grab the spinlock again. Meanwhile CPU2 has seen MISSED = 1, cleared it, checked the queue and found it empty, so it returned. At the end of the sequence, we end up with skb1 enqueued in the backlog, both CPUs out of __dev_xmit_skb(), the MISSED bit not set, and no __netif_schedule() called made. skb1 will now linger in the qdisc until somebody later performs a full __qdisc_run(). Associated to the bypass capacity of the qdisc, and the ability of the TCP layer to avoid resending packets which it knows are still in the qdisc, this can lead to serious traffic "holes" in a TCP connection. We fix this by replacing the smp_mb__before_atomic() / test_bit() / set_bit() / smp_mb__after_atomic() sequence inside qdisc_run_begin() by a single test_and_set_bit() call, which is more concise and enforces the needed memory barriers. Fixes: `89837eb4b2` ("net: sched: add barrier to ensure correct ordering for lockless qdisc") Signed-off-by: Vincent Ray <vray@kalrayinc.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220526001746.2437669-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-26 20:45:46 -07:00
Michael Walle	b58cdd4388	net: lan966x: check devm_of_phy_get() for -EDEFER_PROBE At the moment, if devm_of_phy_get() returns an error the serdes simply isn't set. While it is bad to ignore an error in general, there is a particular bug that network isn't working if the serdes driver is compiled as a module. In that case, devm_of_phy_get() returns -EDEFER_PROBE and the error is silently ignored. The serdes is optional, it is not there if the port is using RGMII, in which case devm_of_phy_get() returns -ENODEV. Rearrange the error handling so that -ENODEV will be handled but other error codes will abort the probing. Fixes: `d28d6d2e37` ("net: lan966x: add port module support") Signed-off-by: Michael Walle <michael@walle.cc> Link: https://lore.kernel.org/r/20220525231239.1307298-1-michael@walle.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-26 20:45:29 -07:00
Jakub Kicinski	4548ad7287	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix UAF when creating non-stateful expression in set. 2) Set limit cost when cloning expression accordingly, from Phil Sutter. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_limit: Clone packet limits' cost value netfilter: nf_tables: disallow non-stateful expression in sets earlier ==================== Link: https://lore.kernel.org/r/20220526205411.315136-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-26 20:43:06 -07:00

1 2 3 4 5 ...

1095707 Commits