linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-18 00:53:40 +00:00

Author	SHA1	Message	Date
Eric Dumazet	1ac8855744	inet: control sockets should not use current thread task_frag Because ICMP handlers run from softirq contexts, they must not use current thread task_frag. Previously, all sockets allocated by inet_ctl_sock_create() would use the per-socket page fragment, with no chance of recursion. Fixes: `98123866fc` ("Treewide: Stop corrupting socket's task_frag") Reported-by: syzbot+bebc6f1acdf4cbb79b03@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Benjamin Coddington <bcodding@redhat.com> Acked-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/20230103192736.454149-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-04 20:38:25 -08:00
Paolo Abeni	2c02d41d71	net/ulp: prevent ULP without clone op from entering the LISTEN status When an ULP-enabled socket enters the LISTEN status, the listener ULP data pointer is copied inside the child/accepted sockets by sk_clone_lock(). The relevant ULP can take care of de-duplicating the context pointer via the clone() operation, but only MPTCP and SMC implement such op. Other ULPs may end-up with a double-free at socket disposal time. We can't simply clear the ULP data at clone time, as TLS replaces the socket ops with custom ones assuming a valid TLS ULP context is available. Instead completely prevent clone-less ULP sockets from entering the LISTEN status. Fixes: `734942cc4e` ("tcp: ULP infrastructure") Reported-by: slipper <slipper.alive@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/4b80c3d1dbe3d0ab072f80450c202d9bc88b4b03.1672740602.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-04 20:37:41 -08:00
Caleb Sander	5401c3e099	qed: allow sleep in qed_mcp_trace_dump() By default, qed_mcp_cmd_and_union() delays 10us at a time in a loop that can run 500K times, so calls to qed_mcp_nvm_rd_cmd() may block the current thread for over 5s. We observed thread scheduling delays over 700ms in production, with stacktraces pointing to this code as the culprit. qed_mcp_trace_dump() is called from ethtool, so sleeping is permitted. It already can sleep in qed_mcp_halt(), which calls qed_mcp_cmd(). Add a "can sleep" parameter to qed_find_nvram_image() and qed_nvram_read() so they can sleep during qed_mcp_trace_dump(). qed_mcp_trace_get_meta_info() and qed_mcp_trace_read_meta(), called only by qed_mcp_trace_dump(), allow these functions to sleep. I can't tell if the other caller (qed_grc_dump_mcp_hw_dump()) can sleep, so keep b_can_sleep set to false when it calls these functions. An example stacktrace from a custom warning we added to the kernel showing a thread that has not scheduled despite long needing resched: [ 2745.362925,17] ------------[ cut here ]------------ [ 2745.362941,17] WARNING: CPU: 23 PID: 5640 at arch/x86/kernel/irq.c:233 do_IRQ+0x15e/0x1a0() [ 2745.362946,17] Thread not rescheduled for 744 ms after irq 99 [ 2745.362956,17] Modules linked in: ... 
[ 2745.363339,17] CPU: 23 PID: 5640 Comm: lldpd Tainted: P O 4.4.182+ #202104120910+6d1da174272d.61x [ 2745.363343,17] Hardware name: FOXCONN MercuryB/Quicksilver Controller, BIOS H11P1N09 07/08/2020 [ 2745.363346,17] 0000000000000000 ffff885ec07c3ed8 ffffffff8131eb2f ffff885ec07c3f20 [ 2745.363358,17] ffffffff81d14f64 ffff885ec07c3f10 ffffffff81072ac2 ffff88be98ed0000 [ 2745.363369,17] 0000000000000063 0000000000000174 0000000000000074 0000000000000000 [ 2745.363379,17] Call Trace: [ 2745.363382,17] <IRQ> [<ffffffff8131eb2f>] dump_stack+0x8e/0xcf [ 2745.363393,17] [<ffffffff81072ac2>] warn_slowpath_common+0x82/0xc0 [ 2745.363398,17] [<ffffffff81072b4c>] warn_slowpath_fmt+0x4c/0x50 [ 2745.363404,17] [<ffffffff810d5a8e>] ? rcu_irq_exit+0xae/0xc0 [ 2745.363408,17] [<ffffffff817c99fe>] do_IRQ+0x15e/0x1a0 [ 2745.363413,17] [<ffffffff817c7ac9>] common_interrupt+0x89/0x89 [ 2745.363416,17] <EOI> [<ffffffff8132aa74>] ? delay_tsc+0x24/0x50 [ 2745.363425,17] [<ffffffff8132aa04>] __udelay+0x34/0x40 [ 2745.363457,17] [<ffffffffa04d45ff>] qed_mcp_cmd_and_union+0x36f/0x7d0 [qed] [ 2745.363473,17] [<ffffffffa04d5ced>] qed_mcp_nvm_rd_cmd+0x4d/0x90 [qed] [ 2745.363490,17] [<ffffffffa04e1dc7>] qed_mcp_trace_dump+0x4a7/0x630 [qed] [ 2745.363504,17] [<ffffffffa04e2556>] ? qed_fw_asserts_dump+0x1d6/0x1f0 [qed] [ 2745.363520,17] [<ffffffffa04e4ea7>] qed_dbg_mcp_trace_get_dump_buf_size+0x37/0x80 [qed] [ 2745.363536,17] [<ffffffffa04ea881>] qed_dbg_feature_size+0x61/0xa0 [qed] [ 2745.363551,17] [<ffffffffa04eb427>] qed_dbg_all_data_size+0x247/0x260 [qed] [ 2745.363560,17] [<ffffffffa0482c10>] qede_get_regs_len+0x30/0x40 [qede] [ 2745.363566,17] [<ffffffff816c9783>] ethtool_get_drvinfo+0xe3/0x190 [ 2745.363570,17] [<ffffffff816cc152>] dev_ethtool+0x1362/0x2140 [ 2745.363575,17] [<ffffffff8109bcc6>] ? finish_task_switch+0x76/0x260 [ 2745.363580,17] [<ffffffff817c2116>] ? __schedule+0x3c6/0x9d0 [ 2745.363585,17] [<ffffffff810dbd50>] ? 
hrtimer_start_range_ns+0x1d0/0x370 [ 2745.363589,17] [<ffffffff816c1e5b>] ? dev_get_by_name_rcu+0x6b/0x90 [ 2745.363594,17] [<ffffffff816de6a8>] dev_ioctl+0xe8/0x710 [ 2745.363599,17] [<ffffffff816a58a8>] sock_do_ioctl+0x48/0x60 [ 2745.363603,17] [<ffffffff816a5d87>] sock_ioctl+0x1c7/0x280 [ 2745.363608,17] [<ffffffff8111f393>] ? seccomp_phase1+0x83/0x220 [ 2745.363612,17] [<ffffffff811e3503>] do_vfs_ioctl+0x2b3/0x4e0 [ 2745.363616,17] [<ffffffff811e3771>] SyS_ioctl+0x41/0x70 [ 2745.363619,17] [<ffffffff817c6ffe>] entry_SYSCALL_64_fastpath+0x1e/0x79 [ 2745.363622,17] ---[ end trace f6954aa440266421 ]--- Fixes: `c965db4446` ("qed: Add support for debug data collection") Signed-off-by: Caleb Sander <csander@purestorage.com> Acked-by: Alok Prasad <palok@marvell.com> Link: https://lore.kernel.org/r/20230103233021.1457646-1-csander@purestorage.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-04 20:32:56 -08:00
Jakub Kicinski	49d9601b81	bpf-for-netdev -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmO17/QACgkQ6rmadz2v bTpFWBAAk85+qkRYyagEoAywkxPvRye6oYu321v69YjS7PdxqQ+sKSMyLR5brEaW QiQSQP/elwcBRyQAf4L/Fe++TZndpvuZ6CB+KUidOowtarxXPllc/TgWeNgF7A5l TuX3eQSLNdF5itdzoJzmiEQuBCHHXPce2L6UfNlP897yaWgxhawlkVEW+w8iB/UQ ZQCNGoDpga917DheLLx2kId046v70CUgwm8CU1rcAeA7Zi7B518OCzLm9z37l2em UWxPLOuQVA5u4O635dTRVUyluRPOvF1pjDei2gCmvC2JJS0X5eZbEEXCsZ04UbLd pd3y0+2WVOjiJ6kaSg6/U47G0gAjHfUHTLAoPzDw4LvlaXzyWZxK8Kxr1wHZRiYX q/iW1rUaWhsUI8XqROUnL3TA9UbFoKvSQIIQ8JEQN9jcd6enfMuBkwEDmZZIL9rR rZp3qZj+sxqGcKRpxorW5buM4vslYuoUD8/+9Lc+hkQ7QMCeDqdNzKEWtSCo3qjT ktbmZpNjCNnDhjyFZGuPsPxxK8I4QnuxGGeJvZeLZFlQq0Ft82APqFKzctdNAwFI NER4ihF0ohXdACrdL4PwaNXY10MCboHrQwpUy7vd2PUOAH+VI8nudSuweVXmeYPf ldH3w2N0G/wEy0QOvEo7SUEx3DiGTUNzHvu/ts4fDwUBqrBrzEM= =vD3D -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Alexei Starovoitov says: ==================== bpf 2023-01-04 We've added 5 non-merge commits during the last 8 day(s) which contain a total of 5 files changed, 112 insertions(+), 18 deletions(-). The main changes are: 1) Always use maximal size for copy_array in the verifier to fix KASAN tracking, from Kees. 2) Fix bpf task iterator walking through dead tasks, from Kui-Feng. 3) Make sure livepatch and bpf fexit can coexist, from Chuang. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Always use maximal size for copy_array() selftests/bpf: add a test for iter/task_vma for short-lived processes bpf: keep a reference to the mm, in case the task is dead. selftests/bpf: Temporarily disable part of btf_dump:var_data test. bpf: Fix panic due to wrong pageattr of im->image ==================== Link: https://lore.kernel.org/r/20230104215500.79435-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-01-04 20:17:19 -08:00
Srivatsa S. Bhat (VMware)	558016722e	MAINTAINERS: Update maintainers for ptp_vmw driver Vivek has decided to transfer the maintainership of the VMware virtual PTP clock driver (ptp_vmw) to Srivatsa and Deep. Update the MAINTAINERS file to reflect this change, and also add Alexey as a reviewer for the driver. Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Acked-by: Vivek Thampi <vivek@vivekthampi.com> Acked-by: Deep Shah <sdeep@vmware.com> Acked-by: Alexey Makhalov <amakhalov@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-04 08:52:44 +00:00
Szymon Heidrich	c7dd13805f	usb: rndis_host: Secure rndis_query check against int overflow Variables off and len, typed as uint32 in the rndis_query function, are controlled by the incoming RNDIS response message, so their values may be manipulated. Setting off to an unexpectedly large value will cause the sum with len and 8 to overflow and pass the implemented validation step. Consequently the response pointer will refer to a location past the expected buffer boundaries, allowing information leakage, e.g. via the RNDIS_OID_802_3_PERMANENT_ADDRESS OID. Fixes: `ddda086240` ("USB: rndis_host, various cleanups") Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-03 09:24:41 +00:00
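The overflow described in this commit can be illustrated in isolation. The following is a minimal userspace sketch, not the driver's code: the function names and the `buflen` parameter are hypothetical, and only the 32-bit `off`/`len` values and the constant 8 come from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Naive check: the 32-bit sum can wrap around and slip under the limit. */
static bool reply_in_bounds_naive(uint32_t off, uint32_t len, uint32_t buflen)
{
	return (uint32_t)(8 + off + len) <= buflen;
}

/* Overflow-safe check: compare each term against the room that is left. */
static bool reply_in_bounds_safe(uint32_t off, uint32_t len, uint32_t buflen)
{
	if (buflen < 8)
		return false;
	if (off > buflen - 8)
		return false;
	return len <= buflen - 8 - off;
}
```

With `off = 0xFFFFFFF8` and `len = 0`, the naive sum wraps to 0 and is accepted, while the safe variant rejects it because no subtraction in it can underflow.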
Sean Anderson	7dc6183854	net: dpaa: Fix dtsec check for PCS availability We want to fail if the PCS is not available, not if it is available. Fix this condition. Fixes: `5d93cfcf73` ("net: dpaa: Convert to phylink") Reported-by: Christian Zigotzky <info@xenosoft.de> Signed-off-by: Sean Anderson <seanga2@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-03 09:23:17 +00:00
Geetha sowjanya	4af1b64f80	octeontx2-pf: Fix lmtst ID used in aura free Current code uses a per_cpu pointer to get the lmtst_id mapped to the core on which aura_free() is executed. Using the per_cpu pointer without disabling preemption can cause a mismatch between the lmtst_id and the core on which the pointer gets freed. This patch fixes the issue by disabling preemption around aura_free. Fixes: `ef6c8da71e` ("octeontx2-pf: cn10K: Reserve LMTST lines per core") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-03 09:19:03 +00:00
Daniil Tatianin	9c80796548	drivers/net/bonding/bond_3ad: return when there's no aggregator Otherwise we would dereference a NULL aggregator pointer when calling __set_agg_ports_ready on the line below. Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-03 09:17:12 +00:00
David S. Miller	d57609fad9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Use signed integer in ipv6_skip_exthdr() called from nf_confirm(). Reported by static analysis tooling, patch from Florian Westphal. 2) Missing set type checks in nf_tables: Validate that the set declaration matches an existing set type, otherwise bail out with EEXIST. Currently, nf_tables silently accepts the re-declaration with a different type but it bails out later with EINVAL when the user adds entries to the set. This fix is relatively large because it requires two preparation patches that are included in this batch. 3) Do not ignore updates of timeout and gc_interval parameters in existing sets. 4) Fix a hang when a 0/0 subnet is added to a hash:net,port,net type of ipset. Except hash:net,port,net and hash:net,iface, the set types don't support 0/0 and the auxiliary functions rely on this fact. So 0/0 needs special handling in hash:net,port,net, which was missing (hash:net,iface was not affected by this bug), from Jozsef Kadlecsik. 5) When adding/deleting a large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. This patch is a complete rework of the previous version in order to use a smaller internal batch limit and at the same time remove the external hard limit, so that an arbitrary number of elements can be added in one step. Also from Jozsef Kadlecsik. Except for patch #1, which fixes a bug introduced in the previous net-next development cycle, everything else has been broken for several releases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-03 09:12:22 +00:00
Jozsef Kadlecsik	5e29dc36bd	netfilter: ipset: Rework long task execution when adding/deleting entries When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. The patch `5f7b51bf09` ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") tried to fix it by limiting the max elements to process at all. However it was not enough, it is still possible that we get hung tasks. Lowering the limit is not reasonable, so the approach in this patch is as follows: rely on the method used at resizing sets and save the state when we reach a smaller internal batch limit, unlock/lock and proceed from the saved state. Thus we can avoid long continuous tasks and at the same time removed the limit to add/delete large number of elements in one step. The nfnl mutex is held during the whole operation which prevents one to issue other ipset commands in parallel. Fixes: `5f7b51bf09` ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-01-02 15:10:05 +01:00
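The "save the state at a batch limit, unlock/lock, and proceed" approach from this commit can be sketched generically. This is an illustrative userspace model only: `BATCH_LIMIT`, `struct range_state`, and both functions are hypothetical names, and the real ipset code differs in its data structures and locking.

```c
#include <stdint.h>

#define BATCH_LIMIT 4u /* hypothetical; tiny so the behaviour is easy to follow */

struct range_state {
	uint64_t next; /* first element not yet processed */
	uint64_t last; /* last element of the request, inclusive */
};

/* Process at most BATCH_LIMIT elements; return 1 if work remains, 0 when done. */
static int process_batch(struct range_state *st, uint64_t *processed)
{
	unsigned int i = 0;

	while (st->next <= st->last) {
		if (i == BATCH_LIMIT)
			return 1; /* progress is saved in st->next; caller retries */
		(*processed)++; /* stand-in for adding or deleting one element */
		st->next++;
		i++;
	}
	return 0;
}

/* Drive the batches until the whole range is handled; returns the batch count. */
static unsigned int process_range(uint64_t first, uint64_t last,
				  uint64_t *processed)
{
	struct range_state st = { .next = first, .last = last };
	unsigned int batches = 1;

	while (process_batch(&st, processed)) {
		/* a real implementation would drop and re-take its lock here */
		batches++;
	}
	return batches;
}
```

Because each call does a bounded amount of work, no single stretch of processing runs long enough to look like a hung task, yet the caller can still request an arbitrarily large range.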
Jozsef Kadlecsik	a31d47be64	netfilter: ipset: fix hash:net,port,net hang with /0 subnet The hash:net,port,net set type supports /0 subnets. However, commit `5f7b51bf09`, titled "netfilter: ipset: Limit the maximal range of consecutive elements to add/delete", did not take this into account, which resulted in an endless loop. The bug is actually older, but patch `5f7b51bf09` brings it out earlier. Handle /0 subnets properly in hash:net,port,net set types. Fixes: `5f7b51bf09` ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: Марк Коренберг <socketpair@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2023-01-02 15:09:02 +01:00
Horatiu Vultur	588ab2dc25	net: sparx5: Fix reading of the MAC address There is an issue with the checking of the return value of 'of_get_mac_address', which returns 0 on success and a negative value on failure. The driver interpreted the result the opposite way. Therefore, if there was a MAC address defined in the DT, the driver generated a random MAC address; otherwise it used address 0. Fix this by correctly checking the return value of 'of_get_mac_address'. Fixes: `b74ef9f9cb` ("net: sparx5: Do not use mac_addr uninitialized in mchp_sparx5_probe()") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-02 13:39:14 +00:00
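The "0 on success, negative on failure" convention that this fix relies on can be shown with a small userspace sketch. `lookup_mac()` is a hypothetical stand-in, not the kernel's `of_get_mac_address()`, and the fallback address is a placeholder rather than a randomly generated one.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical lookup: returns 0 on success, a negative errno on failure. */
static int lookup_mac(const uint8_t *dt_mac, uint8_t *out)
{
	if (!dt_mac)
		return -ENODEV;
	memcpy(out, dt_mac, MAC_LEN);
	return 0;
}

/* Returns 1 if the configured address was used, 0 if a fallback was chosen. */
static int choose_mac(const uint8_t *dt_mac, uint8_t *out)
{
	if (lookup_mac(dt_mac, out) == 0)
		return 1; /* success: keep the address that was read */

	/* failure: fall back to a locally administered placeholder address */
	memset(out, 0, MAC_LEN);
	out[0] = 0x02;
	return 0;
}
```

Testing the result with `== 0` (or `!ret`) for the success branch is what keeps the two paths from being swapped.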
Ido Schimmel	06bf629441	vxlan: Fix memory leaks in error path The memory allocated by vxlan_vnigroup_init() is not freed in the error path, leading to memory leaks [1]. Fix by calling vxlan_vnigroup_uninit() in the error path. The leaks can be reproduced by annotating gro_cells_init() with ALLOW_ERROR_INJECTION() and then running: # echo "100" > /sys/kernel/debug/fail_function/probability # echo "1" > /sys/kernel/debug/fail_function/times # echo "gro_cells_init" > /sys/kernel/debug/fail_function/inject # printf %#x -12 > /sys/kernel/debug/fail_function/gro_cells_init/retval # ip link add name vxlan0 type vxlan dstport 4789 external vnifilter RTNETLINK answers: Cannot allocate memory [1] unreferenced object 0xffff88810db84a00 (size 512): comm "ip", pid 330, jiffies 4295010045 (age 66.016s) hex dump (first 32 bytes): f8 d5 76 0e 81 88 ff ff 01 00 00 00 00 00 00 02 ..v............. 03 00 04 00 48 00 00 00 00 00 00 01 04 00 01 00 ....H........... backtrace: [<ffffffff81a3097a>] kmalloc_trace+0x2a/0x60 [<ffffffff82f049fc>] vxlan_vnigroup_init+0x4c/0x160 [<ffffffff82ecd69e>] vxlan_init+0x1ae/0x280 [<ffffffff836858ca>] register_netdevice+0x57a/0x16d0 [<ffffffff82ef67b7>] __vxlan_dev_create+0x7c7/0xa50 [<ffffffff82ef6ce6>] vxlan_newlink+0xd6/0x130 [<ffffffff836d02ab>] __rtnl_newlink+0x112b/0x18a0 [<ffffffff836d0a8c>] rtnl_newlink+0x6c/0xa0 [<ffffffff836c0ddf>] rtnetlink_rcv_msg+0x43f/0xd40 [<ffffffff83908ce0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839066af>] netlink_unicast+0x53f/0x810 [<ffffffff839072d8>] netlink_sendmsg+0x958/0xe70 [<ffffffff835c319f>] ____sys_sendmsg+0x78f/0xa90 [<ffffffff835cd6da>] ___sys_sendmsg+0x13a/0x1e0 [<ffffffff835cd94c>] __sys_sendmsg+0x11c/0x1f0 [<ffffffff8424da78>] do_syscall_64+0x38/0x80 unreferenced object 0xffff88810e76d5f8 (size 192): comm "ip", pid 330, jiffies 4295010045 (age 66.016s) hex dump (first 32 bytes): 04 00 00 00 00 00 00 00 db e1 4f e7 00 00 00 00 ..........O..... 
08 d6 76 0e 81 88 ff ff 08 d6 76 0e 81 88 ff ff ..v.......v..... backtrace: [<ffffffff81a3162e>] __kmalloc_node+0x4e/0x90 [<ffffffff81a0e166>] kvmalloc_node+0xa6/0x1f0 [<ffffffff8276e1a3>] bucket_table_alloc.isra.0+0x83/0x460 [<ffffffff8276f18b>] rhashtable_init+0x43b/0x7c0 [<ffffffff82f04a1c>] vxlan_vnigroup_init+0x6c/0x160 [<ffffffff82ecd69e>] vxlan_init+0x1ae/0x280 [<ffffffff836858ca>] register_netdevice+0x57a/0x16d0 [<ffffffff82ef67b7>] __vxlan_dev_create+0x7c7/0xa50 [<ffffffff82ef6ce6>] vxlan_newlink+0xd6/0x130 [<ffffffff836d02ab>] __rtnl_newlink+0x112b/0x18a0 [<ffffffff836d0a8c>] rtnl_newlink+0x6c/0xa0 [<ffffffff836c0ddf>] rtnetlink_rcv_msg+0x43f/0xd40 [<ffffffff83908ce0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839066af>] netlink_unicast+0x53f/0x810 [<ffffffff839072d8>] netlink_sendmsg+0x958/0xe70 [<ffffffff835c319f>] ____sys_sendmsg+0x78f/0xa90 Fixes: `f9c4bb0b24` ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-02 13:37:33 +00:00
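The error-path pattern behind this fix (undo the first init step when a later one fails) can be sketched in userspace. Everything here is hypothetical scaffolding: `struct dev_ctx`, the tracked allocator, and the `fail_second` switch only imitate the roles of vxlan_vnigroup_init(), gro_cells_init(), and fault injection.

```c
#include <stdlib.h>

struct dev_ctx {
	void *vnigroup; /* stand-in for state set up by the first init step */
	void *cells;    /* stand-in for state set up by the second init step */
};

static int live_allocs; /* outstanding allocations, to make leaks visible */

static void *tracked_alloc(int fail)
{
	void *p = fail ? NULL : malloc(16);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

/* Returns 0 on success, -1 on failure; on failure nothing stays allocated. */
static int dev_init(struct dev_ctx *ctx, int fail_second)
{
	ctx->vnigroup = tracked_alloc(0);
	if (!ctx->vnigroup)
		return -1;

	ctx->cells = tracked_alloc(fail_second);
	if (!ctx->cells)
		goto err_vnigroup;

	return 0;

err_vnigroup:
	tracked_free(ctx->vnigroup); /* the step the leaky version forgot */
	ctx->vnigroup = NULL;
	return -1;
}

static void dev_uninit(struct dev_ctx *ctx)
{
	tracked_free(ctx->cells);
	tracked_free(ctx->vnigroup);
	ctx->cells = NULL;
	ctx->vnigroup = NULL;
}
```

Forcing the second step to fail and then checking that `live_allocs` is back to zero plays the same role as the fail_function reproduction in the commit message.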
Randy Dunlap	43d253781f	net: sched: htb: fix htb_classify() kernel-doc Fix W=1 kernel-doc warning: net/sched/sch_htb.c:214: warning: expecting prototype for htb_classify(). Prototype was for HTB_DIRECT() instead by moving the HTB_DIRECT() macro above the function. Add kernel-doc notation for function parameters as well. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-02 13:34:48 +00:00
David S. Miller	819fcf4adf	Merge branch 'cls_drop-fix' Jamal Hadi Salim says: ==================== net: dont intepret cls results when asked to drop It is possible that an error in processing may occur in tcf_classify() which will result in res.classid being some garbage value. An example of such a code path is when the classifier goes into a loop due to bad policy. See patch 1/2 for a sample splat. While the core code reacts correctly and asks the caller to drop the packet (by returning TC_ACT_SHOT), some callers first interpret res.class as a pointer to memory and end up dropping the packet only after some activity with the pointer. There is a likelihood of this resulting in an exploit. So let's fix all the known qdiscs that behave this way. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-02 13:32:43 +00:00
Jamal Hadi Salim	caa4b35b43	net: sched: cbq: dont intepret cls results when asked to drop If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume that res.class contains a valid pointer Sample splat reported by Kyle Zeng [ 5.405624] 0: reclassify loop, rule prio 0, protocol 800 [ 5.406326] ================================================================== [ 5.407240] BUG: KASAN: slab-out-of-bounds in cbq_enqueue+0x54b/0xea0 [ 5.407987] Read of size 1 at addr ffff88800e3122aa by task poc/299 [ 5.408731] [ 5.408897] CPU: 0 PID: 299 Comm: poc Not tainted 5.10.155+ #15 [ 5.409516] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [ 5.410439] Call Trace: [ 5.410764] dump_stack+0x87/0xcd [ 5.411153] print_address_description+0x7a/0x6b0 [ 5.411687] ? vprintk_func+0xb9/0xc0 [ 5.411905] ? printk+0x76/0x96 [ 5.412110] ? cbq_enqueue+0x54b/0xea0 [ 5.412323] kasan_report+0x17d/0x220 [ 5.412591] ? cbq_enqueue+0x54b/0xea0 [ 5.412803] __asan_report_load1_noabort+0x10/0x20 [ 5.413119] cbq_enqueue+0x54b/0xea0 [ 5.413400] ? __kasan_check_write+0x10/0x20 [ 5.413679] __dev_queue_xmit+0x9c0/0x1db0 [ 5.413922] dev_queue_xmit+0xc/0x10 [ 5.414136] ip_finish_output2+0x8bc/0xcd0 [ 5.414436] __ip_finish_output+0x472/0x7a0 [ 5.414692] ip_finish_output+0x5c/0x190 [ 5.414940] ip_output+0x2d8/0x3c0 [ 5.415150] ? ip_mc_finish_output+0x320/0x320 [ 5.415429] __ip_queue_xmit+0x753/0x1760 [ 5.415664] ip_queue_xmit+0x47/0x60 [ 5.415874] __tcp_transmit_skb+0x1ef9/0x34c0 [ 5.416129] tcp_connect+0x1f5e/0x4cb0 [ 5.416347] tcp_v4_connect+0xc8d/0x18c0 [ 5.416577] __inet_stream_connect+0x1ae/0xb40 [ 5.416836] ? local_bh_enable+0x11/0x20 [ 5.417066] ? lock_sock_nested+0x175/0x1d0 [ 5.417309] inet_stream_connect+0x5d/0x90 [ 5.417548] ? 
__inet_stream_connect+0xb40/0xb40 [ 5.417817] __sys_connect+0x260/0x2b0 [ 5.418037] __x64_sys_connect+0x76/0x80 [ 5.418267] do_syscall_64+0x31/0x50 [ 5.418477] entry_SYSCALL_64_after_hwframe+0x61/0xc6 [ 5.418770] RIP: 0033:0x473bb7 [ 5.418952] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89 [ 5.420046] RSP: 002b:00007fffd20eb0f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a [ 5.420472] RAX: ffffffffffffffda RBX: 00007fffd20eb578 RCX: 0000000000473bb7 [ 5.420872] RDX: 0000000000000010 RSI: 00007fffd20eb110 RDI: 0000000000000007 [ 5.421271] RBP: 00007fffd20eb150 R08: 0000000000000001 R09: 0000000000000004 [ 5.421671] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 [ 5.422071] R13: 00007fffd20eb568 R14: 00000000004fc740 R15: 0000000000000002 [ 5.422471] [ 5.422562] Allocated by task 299: [ 5.422782] __kasan_kmalloc+0x12d/0x160 [ 5.423007] kasan_kmalloc+0x5/0x10 [ 5.423208] kmem_cache_alloc_trace+0x201/0x2e0 [ 5.423492] tcf_proto_create+0x65/0x290 [ 5.423721] tc_new_tfilter+0x137e/0x1830 [ 5.423957] rtnetlink_rcv_msg+0x730/0x9f0 [ 5.424197] netlink_rcv_skb+0x166/0x300 [ 5.424428] rtnetlink_rcv+0x11/0x20 [ 5.424639] netlink_unicast+0x673/0x860 [ 5.424870] netlink_sendmsg+0x6af/0x9f0 [ 5.425100] __sys_sendto+0x58d/0x5a0 [ 5.425315] __x64_sys_sendto+0xda/0xf0 [ 5.425539] do_syscall_64+0x31/0x50 [ 5.425764] entry_SYSCALL_64_after_hwframe+0x61/0xc6 [ 5.426065] [ 5.426157] The buggy address belongs to the object at ffff88800e312200 [ 5.426157] which belongs to the cache kmalloc-128 of size 128 [ 5.426955] The buggy address is located 42 bytes to the right of [ 5.426955] 128-byte region [ffff88800e312200, ffff88800e312280) [ 5.427688] The buggy address belongs to the page: [ 5.427992] page:000000009875fabc refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xe312 [ 5.428562] flags: 
0x100000000000200(slab) [ 5.428812] raw: 0100000000000200 dead000000000100 dead000000000122 ffff888007843680 [ 5.429325] raw: 0000000000000000 0000000000100010 00000001ffffffff ffff88800e312401 [ 5.429875] page dumped because: kasan: bad access detected [ 5.430214] page->mem_cgroup:ffff88800e312401 [ 5.430471] [ 5.430564] Memory state around the buggy address: [ 5.430846] ffff88800e312180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.431267] ffff88800e312200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc [ 5.431705] >ffff88800e312280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.432123] ^ [ 5.432391] ffff88800e312300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc [ 5.432810] ffff88800e312380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 5.433229] ================================================================== [ 5.433648] Disabling lock debugging due to kernel taint Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: Kyle Zeng <zengyhkyle@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-02 13:32:43 +00:00
Jamal Hadi Salim	a2965c7be0	net: sched: atm: dont intepret cls results when asked to drop If asked to drop a packet via TC_ACT_SHOT it is unsafe to assume res.class contains a valid pointer Fixes: `b0188d4dbe` ("[NET_SCHED]: sch_atm: Lindent") Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-02 13:32:43 +00:00
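The rule stated in these two qdisc fixes, check for the drop verdict before touching the classifier result, can be sketched as follows. The enum, structs, and `pick_flow()` are hypothetical stand-ins for illustration, not the sch_cbq or sch_atm code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the TC_ACT_* verdicts. */
enum verdict { ACT_OK = 0, ACT_SHOT = 2 };

struct cls_result {
	void *class_ptr; /* only meaningful when the verdict is not a drop */
};

struct flow {
	int id;
};

/* Returns the selected flow, or NULL if the packet must be dropped. */
static struct flow *pick_flow(int verdict, const struct cls_result *res)
{
	if (verdict == ACT_SHOT)
		return NULL; /* drop first; never look at res->class_ptr */

	return (struct flow *)res->class_ptr;
}
```

Ordering the check this way means a garbage pointer left behind by a failed classification is never cast or dereferenced.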
Michał Grzelak	91e2286160	dt-bindings: net: marvell,orion-mdio: Fix examples As stated in marvell-orion-mdio.txt deleted in commit `0781434af8` ("dt-bindings: net: orion-mdio: Convert to JSON schema") if 'interrupts' property is present, width of 'reg' should be 0x84. Otherwise, width of 'reg' should be 0x4. Fix 'examples:' and add constraints checking whether 'interrupts' property is present and validate it against fixed values in reg. Signed-off-by: Michał Grzelak <mig@semihalf.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 15:00:12 +00:00
Samuel Holland	a3542b0ccd	dt-bindings: net: sun8i-emac: Add phy-supply property This property has always been supported by the Linux driver; see commit `9f93ac8d40` ("net-next: stmmac: Add dwmac-sun8i"). In fact, the original driver submission includes the phy-supply code but no mention of it in the binding, so the omission appears to be accidental. In addition, the property is documented in the binding for the previous hardware generation, allwinner,sun7i-a20-gmac. Document phy-supply in the binding to fix devicetree validation for the 25+ boards that already use this property. Fixes: `0441bde003` ("dt-bindings: net-next: Add DT bindings documentation for Allwinner dwmac-sun8i") Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 14:55:15 +00:00
Alex Elder	d9d71a89f2	net: ipa: use proper endpoint mask for suspend It is now possible for a system to have more than 32 endpoints. As a result, registers related to endpoint suspend are parameterized, with 32 endpoints represented in each of one or more registers. In ipa_interrupt_suspend_control(), the IPA_SUSPEND_EN register offset is determined properly, but the bit mask used still assumes the number of endpoints won't exceed 32. This is a bug. Fix it. Fixes: `f298ba785e` ("net: ipa: add a parameter to suspend registers") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 12:01:14 +00:00
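Splitting an endpoint ID into a register index and a bit within that register can be sketched like this. It is a minimal illustration assuming 32 endpoints per register, as the message describes; `struct suspend_bit` and `suspend_bit_for()` are hypothetical names.

```c
#include <stdint.h>

struct suspend_bit {
	uint32_t unit; /* which 32-endpoint register holds the bit */
	uint32_t mask; /* bit within that register */
};

static struct suspend_bit suspend_bit_for(uint32_t endpoint_id)
{
	struct suspend_bit b;

	b.unit = endpoint_id / 32;
	/* Reduce modulo 32 so IDs of 32 and above still yield a valid shift. */
	b.mask = UINT32_C(1) << (endpoint_id % 32);
	return b;
}
```

Endpoint 35, for example, lands in register 1 at bit 3, whereas shifting by the raw ID would exceed the width of a 32-bit value.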
David S. Miller	1c429c1019	Merge branch 'selftests-fix' Po-Hsu Lin says: ==================== selftests: net: fix for arp_ndisc_evict_nocarrier test This patchset fixes a false-positive issue caused by the command in cleanup_v6() of the arp_ndisc_evict_nocarrier test. It also makes the test return a non-zero value for any failure reported in the test, so that we avoid false-negative results. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 11:59:53 +00:00
Po-Hsu Lin	1856628baa	selftests: net: return non-zero for failures reported in arp_ndisc_evict_nocarrier Return non-zero return value if there is any failure reported in this script during the test. Otherwise it can only reflect the status of the last command. Fixes: `f86ca07eb5` ("selftests: net: add arp_ndisc_evict_nocarrier") Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 11:59:53 +00:00
Po-Hsu Lin	9c4d7f45d6	selftests: net: fix cleanup_v6() for arp_ndisc_evict_nocarrier cleanup_v6() causes the arp_ndisc_evict_nocarrier script to exit with 255 (No such file or directory), even when the tests pass: # selftests: net: arp_ndisc_evict_nocarrier.sh # run arp_evict_nocarrier=1 test # RTNETLINK answers: File exists # ok # run arp_evict_nocarrier=0 test # RTNETLINK answers: File exists # ok # run all.arp_evict_nocarrier=0 test # RTNETLINK answers: File exists # ok # run ndisc_evict_nocarrier=1 test # ok # run ndisc_evict_nocarrier=0 test # ok # run all.ndisc_evict_nocarrier=0 test # ok not ok 1 selftests: net: arp_ndisc_evict_nocarrier.sh # exit=255 This is because it tries to modify the parameter for ipv4 instead. Also, the ipv6 tests (run_ndisc_evict_nocarrier_enabled() and run_ndisc_evict_nocarrier_disabled()) work on veth1; reflect this fact in cleanup_v6(). Fixes: `f86ca07eb5` ("selftests: net: add arp_ndisc_evict_nocarrier") Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 11:59:53 +00:00
Sean Anderson	6d4cfcf979	net: phy: Update documentation for get_rate_matching Now that phylink no longer calls phy_get_rate_matching with PHY_INTERFACE_MODE_NA, phys no longer need to support it. Remove the documentation mandating support. Fixes: `7642cc28fd` ("net: phylink: fix PHY validation with rate adaption") Signed-off-by: Sean Anderson <sean.anderson@seco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 11:56:22 +00:00
David S. Miller	d02b825618	Merge branch 'dsa-qca8k-fixes' Christian Marangi says: ==================== net: dsa: qca8k: multiple fix on mdio read/write Due to some problems in reading and interpreting the Documentation, some wrong assumptions were made. The error was reported and noticed only now because of how things are set up in the code flow. The first 2 patches fix mgmt eth, where the length calculation is very confusing and in steps of word size (the related commit description has an extensive explanation of how this mess works). The last 3 patches revert the broken mdio cache and apply a correct version that should still save some extra mdio accesses in the phy poll scenario. These 5 patches fix each related problem and apply what the Documentation actually says. Changes v2: - Add cover letter - Fix typo in revert patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 09:27:12 +00:00
Christian Marangi	a4165830ca	net: dsa: qca8k: improve mdio master read/write by using single lo/hi Improve mdio master read/write by using single mii read/write lo/hi. In a read and write we need to poll the mdio master regs in a busy loop to check for a specific bit present in the upper half of the reg. We can ignore the other half since it won't contain useful data. This will save an additional useless read for each read and write operation. In a read operation the returned data is present in the mdio master reg lower half. We can ignore the other half since it won't contain useful data. This will save an additional useless read for each read operation. In a read operation only the hi half of the mdio master reg needs to be set, as the lo half will be replaced by the result. This will save an additional useless write for each read operation. Tested-by: Ronald Wahl <ronald.wahl@raritan.com> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 09:27:12 +00:00
Christian Marangi	cfbd6de588	net: dsa: qca8k: introduce single mii read/write lo/hi It may be useful to read/write just the lo or hi half of a reg. This is especially useful for phy poll with the use of mdio master. The mdio master reg is composed by the first 16 bit related to setup and the other half with the returned data or data to write. Refactor the mii function to permit single mii read/write of lo or hi half of the reg. Tested-by: Ronald Wahl <ronald.wahl@raritan.com> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 09:27:12 +00:00
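Handling the lo and hi halves of a 32-bit register independently can be sketched with plain bit operations. These helpers are hypothetical and say nothing about which half the qca8k hardware uses for what; they only show the split and the single-half update.

```c
#include <stdint.h>

/* Extract the lower 16 bits of a 32-bit register value. */
static uint16_t reg_lo(uint32_t val)
{
	return (uint16_t)(val & 0xffffu);
}

/* Extract the upper 16 bits of a 32-bit register value. */
static uint16_t reg_hi(uint32_t val)
{
	return (uint16_t)(val >> 16);
}

/* Replace only the upper half, leaving the lower half untouched. */
static uint32_t reg_set_hi(uint32_t val, uint16_t hi)
{
	return (val & 0xffffu) | ((uint32_t)hi << 16);
}

/* Replace only the lower half, leaving the upper half untouched. */
static uint32_t reg_set_lo(uint32_t val, uint16_t lo)
{
	return (val & 0xffff0000u) | lo;
}
```

When each half is reached through its own bus access, touching only the half that matters is what saves the extra read or write mentioned in the series.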
Christian Marangi	03cb9e6d0b	Revert "net: dsa: qca8k: cache lo and hi for mdio write" This reverts commit `2481d206fa`. The Documentation is very confusing about the topic. The cache logic for hi and lo is wrong and actually miss some regs to be actually written. What the Documentation actually intended was that it's possible to skip writing hi OR lo if half of the reg is not needed to be written or read. Revert the change in favor of a better and correct implementation. Reported-by: Ronald Wahl <ronald.wahl@raritan.com> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 09:27:12 +00:00
Christian Marangi	d9dba91be7	net: dsa: tag_qca: fix wrong MGMT_DATA2 size It was discovered that MGMT_DATA2 can contain up to 28 bytes of data instead of the 12 bytes written in the Documentation by accounting the limit of 16 bytes declared in Documentation subtracting the first 4 byte in the packet header. Update the define with the real world value. Tested-by: Ronald Wahl <ronald.wahl@raritan.com> Fixes: `c2ee8181fd` ("net: dsa: tag_qca: add define for handling mgmt Ethernet packet") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 09:27:12 +00:00
Christian Marangi	9807ae6974	net: dsa: qca8k: fix wrong length value for mgmt eth packet The assumption that Documentation was right about how this value works was wrong. It was discovered that the length value of the mgmt header is in step of word size. As an example to process 4 byte of data the correct length to set is 2. To process 8 byte 4, 12 byte 6, 16 byte 8... Odd values will always return the next size on the ack packet. (length of 3 (6 byte) will always return 8 bytes of data) This means that a value of 15 (0xf) actually means reading/writing 32 bytes of data instead of 16 bytes. This behaviour is totally absent and not documented in the switch Documentation. In fact from Documentation the max value that mgmt eth can process is 16 byte of data while in reality it can process 32 bytes at once. To handle this we always round up the length after dividing it by word size. We check if the result is odd and we round another time to align to what the switch will provide in the ack packet. The workaround for the length limit of 15 is still needed as the length reg max value is 0xf(15) Reported-by: Ronald Wahl <ronald.wahl@raritan.com> Tested-by: Ronald Wahl <ronald.wahl@raritan.com> Fixes: `90386223f4` ("net: dsa: qca8k: add support for larger read/write size with mgmt Ethernet") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Cc: stable@vger.kernel.org # v5.18+ Signed-off-by: David S. Miller <davem@davemloft.net>	2023-01-01 09:27:12 +00:00
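The length rule described in the commit above can be sketched in a few lines of C. This is an illustration only, not the driver's code: the function name `mgmt_len_field`, the macro names, and the 2-byte word size (inferred from "4 byte of data the correct length to set is 2") are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed word size: the commit maps 4 bytes of data to a length of 2. */
#define SKETCH_MGMT_WORD_SIZE 2u
/* The length register holds at most 0xf, per the commit message. */
#define SKETCH_MGMT_LEN_MAX   0xfu

/* Hypothetical helper: convert a byte count into the header length field. */
static unsigned int mgmt_len_field(size_t len_bytes)
{
	unsigned int words;

	/* The commit states the switch handles at most 32 bytes at once. */
	assert(len_bytes > 0 && len_bytes <= 32);

	/* Round up after dividing by the word size. */
	words = (unsigned int)((len_bytes + SKETCH_MGMT_WORD_SIZE - 1) /
			       SKETCH_MGMT_WORD_SIZE);

	/* Odd values make the switch return the next size, so round again. */
	if (words % 2 != 0)
		words++;

	/* 32 bytes would need 16, but the field tops out at 15 (0xf). */
	if (words > SKETCH_MGMT_LEN_MAX)
		words = SKETCH_MGMT_LEN_MAX;

	return words;
}
```

Under these assumptions, a 6-byte request yields a field value of 4 (the switch would answer with 8 bytes anyway), and a 32-byte request yields 15.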
Miaoqian Lin	d039535850	net: phy: xgmiitorgmii: Fix refcount leak in xgmiitorgmii_probe of_phy_find_device() return device node with refcount incremented. Call put_device() to release it when not needed anymore. Fixes: `ab4e6ee578` ("net: phy: xgmiitorgmii: Check phy_driver ready before accessing") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:46:10 +00:00
David S. Miller	72f299b0ca	Merge branch 'ena-fixes' David Arinzon says: ==================== ENA driver bug fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:44 +00:00
David Arinzon	a8ee104f98	net: ena: Update NUMA TPH hint register upon NUMA node update The device supports a PCIe optimization hint, which indicates on which NUMA the queue is currently processed. This hint is utilized by PCIe in order to reduce its access time by accessing the correct NUMA resources and maintaining cache coherence. The driver calls the register update for the hint (called TPH - TLP Processing Hint) during the NAPI loop. Though the update is expected upon a NUMA change (when a queue is moved from one NUMA to the other), the current logic performs a register update when the queue is moved to a different CPU, but the CPU is not necessarily in a different NUMA. The changes include: 1. Performing the TPH update only when the queue has switched a NUMA node. 2. Moving the TPH update call to be triggered only when NAPI was scheduled from interrupt context, as opposed to a busy-polling loop. This is due to the fact that during busy-polling, the frequency of CPU switches for a particular queue is significantly higher, thus, the likelihood to switch NUMA is much higher. Therefore, providing the frequent updates to the device upon a NUMA update are unlikely to be beneficial. Fixes: `1738cd3ed3` ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:44 +00:00
David Arinzon	e712f3e492	net: ena: Set default value for RX interrupt moderation RX ring can be NULL in XDP use cases where only TX queues are configured. In this scenario, the RX interrupt moderation value sent to the device remains in its default value of 0. In this change, setting the default value of the RX interrupt moderation to be the same as of the TX. Fixes: `548c4940b9` ("net: ena: Implement XDP_TX action") Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:44 +00:00
David Arinzon	c7062aaee0	net: ena: Fix rx_copybreak value update Make the upper bound on rx_copybreak tighter, by making sure it is smaller than the minimum of mtu and ENA_PAGE_SIZE. With the current upper bound of mtu, rx_copybreak can be larger than a page. Such large rx_copybreak will not bring any performance benefit to the user and therefore makes no sense. In addition, the value update was only reflected in the adapter structure, but not applied for each ring, causing it to not take effect. Fixes: `1738cd3ed3` ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:44 +00:00
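The two parts of the rx_copybreak fix above (a tighter upper bound, and propagating the value to every ring) might look roughly like the following. All names here (`sketch_adapter`, `sketch_ring`, `sketch_set_rx_copybreak`, `SKETCH_PAGE_SIZE`, the ring count) are hypothetical stand-ins rather than the ENA driver's definitions, and treating the bound as inclusive is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* stand-in for ENA_PAGE_SIZE */
#define SKETCH_NUM_RINGS 4	/* arbitrary ring count for the example */

struct sketch_ring {
	uint32_t rx_copybreak;
};

struct sketch_adapter {
	uint32_t mtu;
	uint32_t rx_copybreak;
	struct sketch_ring rx_ring[SKETCH_NUM_RINGS];
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Returns 0 on success, -1 if the value exceeds min(mtu, page size). */
static int sketch_set_rx_copybreak(struct sketch_adapter *ad, uint32_t value)
{
	size_t i;

	assert(ad != NULL);

	if (value > min_u32(ad->mtu, SKETCH_PAGE_SIZE))
		return -1;

	/* Update the adapter copy and every ring, so the change takes effect. */
	ad->rx_copybreak = value;
	for (i = 0; i < SKETCH_NUM_RINGS; i++)
		ad->rx_ring[i].rx_copybreak = value;

	return 0;
}
```

With an MTU of 9000, a request for 5000 is rejected in this sketch because it exceeds the page size, even though it is below the MTU.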
David Arinzon	59811faa2c	net: ena: Use bitmask to indicate packet redirection Redirecting packets with XDP Redirect is done in two phases: 1. A packet is passed by the driver to the kernel using xdp_do_redirect(). 2. After finishing polling for new packets the driver lets the kernel know that it can now process the redirected packet using xdp_do_flush_map(). The packets' redirection is handled in the napi context of the queue that called xdp_do_redirect() To avoid calling xdp_do_flush_map() each time the driver first checks whether any packets were redirected, using xdp_flags \|= xdp_verdict; and if (xdp_flags & XDP_REDIRECT) xdp_do_flush_map() essentially treating XDP instructions as a bitmask, which isn't the case: enum xdp_action { XDP_ABORTED = 0, XDP_DROP, XDP_PASS, XDP_TX, XDP_REDIRECT, }; Given the current possible values of xdp_action, the current design doesn't have a bug (since XDP_REDIRECT = 100b), but it is still flawed. This patch makes the driver use a bitmask instead, to avoid future issues. Fixes: `a318c70ad1` ("net: ena: introduce XDP redirect implementation") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:44 +00:00
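To see why OR-ing enum values is fragile, the sketch below compares that approach with dedicated flag bits. The `SK_`-prefixed names are illustrative, `SK_XDP_HYPOTHETICAL_NEW` is an invented sixth verdict that does not exist in the kernel, and the flag layout is an assumption rather than the one the patch uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enum quoted in the commit, plus one invented value (5). */
enum sk_xdp_action {
	SK_XDP_ABORTED = 0,
	SK_XDP_DROP,
	SK_XDP_PASS,
	SK_XDP_TX,
	SK_XDP_REDIRECT,		/* 4 == 100b */
	SK_XDP_HYPOTHETICAL_NEW,	/* 5 == 101b, shares bit 2 with REDIRECT */
};

/* Assumed driver-private flag: one dedicated bit for redirects. */
#define SK_FLAG_REDIRECT (1u << 0)

/* Fragile: treats the enum values themselves as if they were bits. */
static bool fragile_needs_flush(const enum sk_xdp_action *v, int n)
{
	unsigned int acc = 0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		acc |= (unsigned int)v[i];
	return (acc & SK_XDP_REDIRECT) != 0;
}

/* Robust: maps each verdict to a flag before accumulating. */
static bool robust_needs_flush(const enum sk_xdp_action *v, int n)
{
	unsigned int acc = 0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++)
		if (v[i] == SK_XDP_REDIRECT)
			acc |= SK_FLAG_REDIRECT;
	return (acc & SK_FLAG_REDIRECT) != 0;
}
```

With today's five values both functions agree, which matches the commit's remark that the current design has no bug. Only the invented value 5 makes the fragile version report a redirect that never happened.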
David Arinzon	c7f5e34d90	net: ena: Account for the number of processed bytes in XDP The size of packets that were forwarded or dropped by XDP wasn't added to the total processed bytes statistic. Fixes: `548c4940b9` ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:43 +00:00
David Arinzon	9c9e539956	net: ena: Don't register memory info on XDP exchange Since the queues aren't destroyed when we only exchange XDP programs, there's no need to re-register them again. Fixes: `548c4940b9` ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:43 +00:00
David Arinzon	332b49ff63	net: ena: Fix toeplitz initial hash value On driver initialization, RSS hash initial value is set to zero, instead of the default value. This happens because we pass NULL as the RSS key parameter, which caused us to never initialize the RSS hash value. This patch fixes it by making sure the initial value is set, no matter what the value of the RSS key is. Fixes: `91a65b7d3e` ("net: ena: fix potential crash when rxfh key is NULL") Signed-off-by: Nati Koler <nkoler@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:43:43 +00:00
Po-Hsu Lin	1573c68820	selftests: net: fix cmsg_so_mark.sh test hang This cmsg_so_mark.sh test will hang on non-amd64 systems because of the infinite loop for argument parsing in cmsg_sender. Variable "o" in cs_parse_args() for taking getopt() should be an int, otherwise it will be 255 when getopt() returns -1 on non-amd64 system and thus causing infinite loop. Link: https://lore.kernel.org/lkml/CA+G9fYsM2k7mrF7W4V_TrZ-qDauWM394=8yEJ=-t1oUg8_40YA@mail.gmail.com/t/ Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:37:26 +00:00
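The hang described above comes from storing the result of getopt() in a `char`. On ABIs where plain `char` is unsigned, -1 is stored as 255 and the `!= -1` loop test never fails. The sketch below reproduces the effect on any platform by using an explicit `unsigned char`; the function names are invented for illustration and it does not call getopt() itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulates "char o = getopt(...)" on an ABI where plain char is unsigned. */
static bool sees_end_as_unsigned_char(int getopt_result)
{
	unsigned char o = (unsigned char)getopt_result;	/* -1 is stored as 255 */

	/* o is promoted to int 255 here, so it never compares equal to -1. */
	return (int)o == -1;
}

/* The fixed form: an int keeps -1 intact. */
static bool sees_end_as_int(int getopt_result)
{
	int o = getopt_result;

	return o == -1;
}

/* Small self-check showing the difference for the end-of-options value. */
static void sketch_selfcheck(void)
{
	assert(!sees_end_as_unsigned_char(-1));	/* loop would never terminate */
	assert(sees_end_as_int(-1));		/* loop terminates */
}
```

Declaring the variable as `int`, as the commit does, matches getopt()'s return type and avoids depending on the signedness of `char`.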
David S. Miller	a512807c24	mlx5-fixes-2022-12-28 Merge tag 'mlx5-fixes-2022-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-fixes-2022-12-28	2022-12-30 07:33:55 +00:00
Jiguang Xiao	d530ece70f	net: amd-xgbe: add missed tasklet_kill The driver does not call tasklet_kill in several places. Add the calls to fix it. Fixes: `85b85c8534` ("amd-xgbe: Re-issue interrupt if interrupt status not cleared") Signed-off-by: Jiguang Xiao <jiguang.xiao@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:33:05 +00:00
Jian Shen	fec7352117	net: hns3: refine the handling for VF heartbeat Currently, the PF check the VF alive by the KEEP_ALIVE mailbox from VF. VF keep sending the mailbox per 2 seconds. Once PF lost the mailbox for more than 8 seconds, it will regards the VF is abnormal, and stop notifying the state change to VF, include link state, vf mac, reset, even though it receives the KEEP_ALIVE mailbox again. It's unreasonable. This patch fixes it. PF will record the state change which need to notify VF when lost the VF's KEEP_ALIVE mailbox. And notify VF when receive the mailbox again. Introduce a new flag HCLGE_VPORT_STATE_INITED, used to distinguish the case whether VF driver loaded or not. For VF will query these states when initializing, so it's unnecessary to notify it in this case. Fixes: `aa5c4f175b` ("net: hns3: add reset handling for VF when doing PF reset") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Hao Lan <lanhao@huawei.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:32:00 +00:00
Uwe Kleine-König	af691c94d0	net: ethernet: freescale: enetc: Drop empty platform remove function A remove callback just returning 0 is equivalent to no remove callback at all. So drop the useless function. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:28:49 +00:00
Uwe Kleine-König	6b57bffa5f	net: ethernet: broadcom: bcm63xx_enet: Drop empty platform remove function A remove callback just returning 0 is equivalent to no remove callback at all. So drop the useless function. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:28:49 +00:00
David S. Miller	0798311cfd	Merge branch 'tcp-bhash2-fixes' Kuniyuki Iwashima says: =================== tcp: Fix bhash2 and TIME_WAIT regression. We forgot to add twsk to bhash2. Therefore TIME_WAIT sockets cannot prevent bind() to the same local address and port. Changes: v1: * Patch 1: * Add tw_bind2_node in inet_timewait_sock instead of moving sk_bind2_node from struct sock to struct sock_common. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:25:53 +00:00
Kuniyuki Iwashima	2c042e8e54	tcp: Add selftest for bind() and TIME_WAIT. bhash2 split the bind() validation logic into wildcard and non-wildcard cases. Let's add a test to catch future regression. Before the previous patch: # ./bind_timewait TAP version 13 1..2 # Starting 2 tests from 3 test cases. # RUN bind_timewait.localhost.1 ... # bind_timewait.c:87:1:Expected ret (0) == -1 (-1) # 1: Test terminated by assertion # FAIL bind_timewait.localhost.1 not ok 1 bind_timewait.localhost.1 # RUN bind_timewait.addrany.1 ... # OK bind_timewait.addrany.1 ok 2 bind_timewait.addrany.1 # FAILED: 1 / 2 tests passed. # Totals: pass:1 fail:1 xfail:0 xpass:0 skip:0 error:0 After: # ./bind_timewait TAP version 13 1..2 # Starting 2 tests from 3 test cases. # RUN bind_timewait.localhost.1 ... # OK bind_timewait.localhost.1 ok 1 bind_timewait.localhost.1 # RUN bind_timewait.addrany.1 ... # OK bind_timewait.addrany.1 ok 2 bind_timewait.addrany.1 # PASSED: 2 / 2 tests passed. # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:25:53 +00:00
Kuniyuki Iwashima	936a192f97	tcp: Add TIME_WAIT sockets in bhash2. Jiri Slaby reported regression of bind() with a simple repro. [0] The repro creates a TIME_WAIT socket and tries to bind() a new socket with the same local address and port. Before commit `28044fc1d4` ("net: Add a bhash2 table hashed by port and address"), the bind() failed with -EADDRINUSE, but now it succeeds. The cited commit should have put TIME_WAIT sockets into bhash2; otherwise, inet_bhash2_conflict() misses TIME_WAIT sockets when validating bind() requests if the address is not a wildcard one. The straight option is to move sk_bind2_node from struct sock to struct sock_common to add twsk to bhash2 as implemented as RFC. [1] However, the binary layout change in the struct sock could affect performances moving hot fields on different cachelines. To avoid that, we add another TIME_WAIT list in inet_bind2_bucket and check it while validating bind(). [0]: https://lore.kernel.org/netdev/6b971a4e-c7d8-411e-1f92-fda29b5b2fb9@kernel.org/ [1]: https://lore.kernel.org/netdev/20221221151258.25748-2-kuniyu@amazon.com/ Fixes: `28044fc1d4` ("net: Add a bhash2 table hashed by port and address") Reported-by: Jiri Slaby <jirislaby@kernel.org> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-12-30 07:25:52 +00:00
Kees Cook	45435d8da7	bpf: Always use maximal size for copy_array() Instead of counting on prior allocations to have sized allocations to the next kmalloc bucket size, always perform a krealloc that is at least ksize(dst) in size (which is a no-op), so the size can be correctly tracked by all the various allocation size trackers (KASAN, __alloc_size, etc). Reported-by: Hyunwoo Kim <v4bel@theori.io> Link: https://lore.kernel.org/bpf/20221223094551.GA1439509@ubuntu Fixes: `ceb35b666d` ("bpf/verifier: Use kmalloc_size_roundup() to match ksize() usage") Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Song Liu <song@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: bpf@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221223182836.never.866-kees@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-12-28 14:54:53 -08:00

1152876 Commits