linux

Author	SHA1	Message	Date
Menglong Dong	d25e481be0	net: tcp: use tcp_drop_reason() for tcp_data_queue_ofo() Replace tcp_drop() used in tcp_data_queue_ofo with tcp_drop_reason(). Following drop reasons are introduced: SKB_DROP_REASON_TCP_OFOMERGE Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	a7ec381049	net: tcp: use tcp_drop_reason() for tcp_data_queue() Replace tcp_drop() used in tcp_data_queue() with tcp_drop_reason(). Following drop reasons are introduced: SKB_DROP_REASON_TCP_ZEROWINDOW SKB_DROP_REASON_TCP_OLD_DATA SKB_DROP_REASON_TCP_OVERWINDOW SKB_DROP_REASON_TCP_OLD_DATA is used when the end_seq of the skb is less than the left edge of the receive window. (Maybe there is a better name?) Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	2a968ef60e	net: tcp: use tcp_drop_reason() for tcp_rcv_established() Replace tcp_drop() used in tcp_rcv_established() with tcp_drop_reason(). Following drop reasons are added: SKB_DROP_REASON_TCP_FLAGS Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	8eba65fa5f	net: tcp: use kfree_skb_reason() for tcp_v{4,6}_do_rcv() Replace kfree_skb() used in tcp_v4_do_rcv() and tcp_v6_do_rcv() with kfree_skb_reason(). Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	7a26dc9e7b	net: tcp: add skb drop reasons to tcp_add_backlog() Pass the address of drop_reason to tcp_add_backlog() to store the reason for the skb drop when it fails. Following drop reasons are introduced: SKB_DROP_REASON_SOCKET_BACKLOG Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	643b622b51	net: tcp: add skb drop reasons to tcp_v{4,6}_inbound_md5_hash() Pass the address of drop reason to tcp_v4_inbound_md5_hash() and tcp_v6_inbound_md5_hash() to store the reasons for skb drops when this function fails. Therefore, the drop reason can be passed to kfree_skb_reason() when the skb needs to be freed. Following drop reasons are added: SKB_DROP_REASON_TCP_MD5NOTFOUND SKB_DROP_REASON_TCP_MD5UNEXPECTED SKB_DROP_REASON_TCP_MD5FAILURE SKB_DROP_REASON_TCP_MD5* above correspond to LINUX_MIB_TCPMD5* Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	c0e3154d9c	net: tcp: use kfree_skb_reason() for tcp_v6_rcv() Replace kfree_skb() used in tcp_v6_rcv() with kfree_skb_reason(). Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	255f9034d3	net: tcp: add skb drop reasons to tcp_v4_rcv() Use kfree_skb_reason() for some paths in tcp_v4_rcv() that were missed before, including: SKB_DROP_REASON_SOCKET_FILTER SKB_DROP_REASON_XFRM_POLICY Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Menglong Dong	082116ffcb	net: tcp: introduce tcp_drop_reason() For TCP protocol, tcp_drop() is used to free the skb when it needs to be dropped. To make use of kfree_skb_reason() and pass the drop reason to it, introduce the function tcp_drop_reason(). Meanwhile, make tcp_drop() an inline call to tcp_drop_reason(). Reviewed-by: Mengen Sun <mengensun@tencent.com> Reviewed-by: Hao Peng <flyingpeng@tencent.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-20 13:55:31 +00:00
Volodymyr Mytnyk	48c77bdf72	net: prestera: acl: fix 'client_map' buff overflow smatch warnings: drivers/net/ethernet/marvell/prestera/prestera_acl.c:103 prestera_acl_chain_to_client() error: buffer overflow 'client_map' 3 <= 3 prestera_acl_chain_to_client(u32 chain_index, ...) ... u32 client_map[] = { PRESTERA_HW_COUNTER_CLIENT_LOOKUP_0, PRESTERA_HW_COUNTER_CLIENT_LOOKUP_1, PRESTERA_HW_COUNTER_CLIENT_LOOKUP_2 }; if (chain_index > ARRAY_SIZE(client_map)) ... Fixes: `fa5d824ce5` ("net: prestera: acl: add multi-chain support offload") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 18:56:50 +00:00
Ahmad Fatoum	173a272a9f	net: dsa: microchip: add ksz8563 to ksz9477 I2C driver The KSZ9477 SPI driver already has support for the KSZ8563. The same switch chip can also be managed via I2C and we have a KSZ9477 I2C driver, but that one lacks the relevant compatible entry. Add it. DT bindings already describe this compatible. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 18:56:03 +00:00
Dan Carpenter	7a11455f37	net/smc: unlock on error paths in __smc_setsockopt() These two error paths need to release_sock(sk) before returning. Fixes: `a6a6fe27ba` ("net/smc: Dynamic control handshake limitation by socket options") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 18:54:43 +00:00
Oleksij Rempel	a7f4f13a0a	net: dsa: microchip: ksz9477: export HW stats over stats64 interface Provide access to HW offloaded packets over stats64 interface. The rx/tx_bytes values needed some fixing since HW is accounting size of the Ethernet frame together with FCS. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 18:52:45 +00:00
David S. Miller	0d0350c471	Merge branch 'phylink-remove-pcs_poll' Russell King says: ==================== net: phylink: remove pcs_poll This small series removes the now unused pcs_poll members from DSA and phylink. "git grep pcs_poll drivers/net/ net/" on net-next confirms that the only places that reference this are in DSA core code and phylink code: drivers/net/phy/phylink.c: if (pl->config->pcs_poll || pcs->poll) drivers/net/phy/phylink.c: poll |= pl->config->pcs_poll; net/dsa/port.c: dp->pl_config.pcs_poll = ds->pcs_poll; ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:41:50 +00:00
Russell King (Oracle)	64b4a0f8b5	net: phylink: remove phylink_config's pcs_poll phylink_config's pcs_poll is no longer used, let's get rid of it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:41:50 +00:00
Russell King (Oracle)	ccfbf44d4c	net: dsa: remove pcs_poll With drivers converted over to using phylink PCS, there is no need for the struct dsa_switch member "pcs_poll" to exist anymore - there is a flag in the struct phylink_pcs which indicates whether this PCS needs to be polled which supersedes this. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:41:50 +00:00
Juhee Kang	e7f2742068	net: hsr: fix suspicious RCU usage warning in hsr_node_get_first() When hsr_create_self_node() calls hsr_node_get_first(), a suspicious RCU usage warning is raised. The warning occurs because the callers of hsr_node_get_first() use rcu_read_lock_bh() and other different synchronization mechanisms. Thus, this patch fixes it by replacing rcu_dereference() with rcu_dereference_bh_check(). The kernel test robot reports: [ 50.083470][ T3596] ============================= [ 50.088648][ T3596] WARNING: suspicious RCU usage [ 50.093785][ T3596] 5.17.0-rc3-next-20220208-syzkaller #0 Not tainted [ 50.100669][ T3596] ----------------------------- [ 50.105513][ T3596] net/hsr/hsr_framereg.c:34 suspicious rcu_dereference_check() usage! [ 50.113799][ T3596] [ 50.113799][ T3596] other info that might help us debug this: [ 50.113799][ T3596] [ 50.124257][ T3596] [ 50.124257][ T3596] rcu_scheduler_active = 2, debug_locks = 1 [ 50.132368][ T3596] 2 locks held by syz-executor.0/3596: [ 50.137863][ T3596] #0: ffffffff8d3357e8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x3be/0xb80 [ 50.147470][ T3596] #1: ffff88807ec9d5f0 (&hsr->list_lock){+...}-{2:2}, at: hsr_create_self_node+0x225/0x650 [ 50.157623][ T3596] [ 50.157623][ T3596] stack backtrace: [ 50.163510][ T3596] CPU: 1 PID: 3596 Comm: syz-executor.0 Not tainted 5.17.0-rc3-next-20220208-syzkaller #0 [ 50.173381][ T3596] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 50.183623][ T3596] Call Trace: [ 50.186904][ T3596] <TASK> [ 50.189844][ T3596] dump_stack_lvl+0xcd/0x134 [ 50.194640][ T3596] hsr_node_get_first+0x9b/0xb0 [ 50.199499][ T3596] hsr_create_self_node+0x22d/0x650 [ 50.204688][ T3596] hsr_dev_finalize+0x2c1/0x7d0 [ 50.209669][ T3596] hsr_newlink+0x315/0x730 [ 50.214113][ T3596] ? hsr_dellink+0x130/0x130 [ 50.218789][ T3596] ? rtnl_create_link+0x7e8/0xc00 [ 50.223803][ T3596] ? 
hsr_dellink+0x130/0x130 [ 50.228397][ T3596] __rtnl_newlink+0x107c/0x1760 [ 50.233249][ T3596] ? rtnl_setlink+0x3c0/0x3c0 [ 50.238043][ T3596] ? is_bpf_text_address+0x77/0x170 [ 50.243362][ T3596] ? lock_downgrade+0x6e0/0x6e0 [ 50.248219][ T3596] ? unwind_next_frame+0xee1/0x1ce0 [ 50.253605][ T3596] ? entry_SYSCALL_64_after_hwframe+0x44/0xae [ 50.259669][ T3596] ? __sanitizer_cov_trace_cmp4+0x1c/0x70 [ 50.265423][ T3596] ? is_bpf_text_address+0x99/0x170 [ 50.270819][ T3596] ? kernel_text_address+0x39/0x80 [ 50.275950][ T3596] ? __kernel_text_address+0x9/0x30 [ 50.281336][ T3596] ? unwind_get_return_address+0x51/0x90 [ 50.286975][ T3596] ? create_prof_cpu_mask+0x20/0x20 [ 50.292178][ T3596] ? arch_stack_walk+0x93/0xe0 [ 50.297172][ T3596] ? kmem_cache_alloc_trace+0x42/0x2c0 [ 50.302637][ T3596] ? rcu_read_lock_sched_held+0x3a/0x70 [ 50.308194][ T3596] rtnl_newlink+0x64/0xa0 [ 50.312524][ T3596] ? __rtnl_newlink+0x1760/0x1760 [ 50.317545][ T3596] rtnetlink_rcv_msg+0x413/0xb80 [ 50.322631][ T3596] ? rtnl_newlink+0xa0/0xa0 [ 50.327159][ T3596] netlink_rcv_skb+0x153/0x420 [ 50.331931][ T3596] ? rtnl_newlink+0xa0/0xa0 [ 50.336436][ T3596] ? netlink_ack+0xa80/0xa80 [ 50.341095][ T3596] ? netlink_deliver_tap+0x1a2/0xc40 [ 50.346532][ T3596] ? netlink_deliver_tap+0x1b1/0xc40 [ 50.351839][ T3596] netlink_unicast+0x539/0x7e0 [ 50.356633][ T3596] ? netlink_attachskb+0x880/0x880 [ 50.361750][ T3596] ? __sanitizer_cov_trace_const_cmp8+0x1d/0x70 [ 50.368003][ T3596] ? __sanitizer_cov_trace_const_cmp8+0x1d/0x70 [ 50.374707][ T3596] ? __phys_addr_symbol+0x2c/0x70 [ 50.379753][ T3596] ? __sanitizer_cov_trace_cmp8+0x1d/0x70 [ 50.385568][ T3596] ? __check_object_size+0x16c/0x4f0 [ 50.390859][ T3596] netlink_sendmsg+0x904/0xe00 [ 50.395715][ T3596] ? netlink_unicast+0x7e0/0x7e0 [ 50.400722][ T3596] ? __sanitizer_cov_trace_const_cmp4+0x1c/0x70 [ 50.407003][ T3596] ? 
netlink_unicast+0x7e0/0x7e0 [ 50.412119][ T3596] sock_sendmsg+0xcf/0x120 [ 50.416548][ T3596] __sys_sendto+0x21c/0x320 [ 50.421052][ T3596] ? __ia32_sys_getpeername+0xb0/0xb0 [ 50.426427][ T3596] ? lockdep_hardirqs_on_prepare+0x400/0x400 [ 50.432721][ T3596] ? __context_tracking_exit+0xb8/0xe0 [ 50.438188][ T3596] ? lock_downgrade+0x6e0/0x6e0 [ 50.443041][ T3596] ? lock_downgrade+0x6e0/0x6e0 [ 50.447902][ T3596] __x64_sys_sendto+0xdd/0x1b0 [ 50.452759][ T3596] ? lockdep_hardirqs_on+0x79/0x100 [ 50.457964][ T3596] ? syscall_enter_from_user_mode+0x21/0x70 [ 50.464150][ T3596] do_syscall_64+0x35/0xb0 [ 50.468565][ T3596] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 50.474452][ T3596] RIP: 0033:0x7f3148504e1c [ 50.479052][ T3596] Code: fa fa ff ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 20 fb ff ff 48 8b [ 50.498926][ T3596] RSP: 002b:00007ffeab5f2ab0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c [ 50.507342][ T3596] RAX: ffffffffffffffda RBX: 00007f314959d320 RCX: 00007f3148504e1c [ 50.515393][ T3596] RDX: 0000000000000048 RSI: 00007f314959d370 RDI: 0000000000000003 [ 50.523444][ T3596] RBP: 0000000000000000 R08: 00007ffeab5f2b04 R09: 000000000000000c [ 50.531492][ T3596] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000 [ 50.539455][ T3596] R13: 00007f314959d370 R14: 0000000000000003 R15: 0000000000000000 Fixes: `4acc45db71` ("net: hsr: use hlist_head instead of list_head for mac addresses") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reported-and-tested-by: syzbot+f0eb4f3876de066b128c@syzkaller.appspotmail.com Signed-off-by: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:39:45 +00:00
Christophe JAILLET	92c54a65e6	atm: nicstar: Use kcalloc() to simplify code Use kcalloc() instead of kmalloc_array() and a loop to set all the values of the array to NULL. While at it, remove a duplicated assignment to 'scq->num_entries'. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:38:12 +00:00
David S. Miller	32d51cef91	Merge branch 'dpaa2-eth-one-step-register' Radu Bulie says: ==================== Provide direct access to 1588 one step register DPAA2 MAC supports 1588 one step timestamping. If this option is enabled then for each transmitted PTP event packet, the 1588 SINGLE_STEP register is accessed to modify the following fields: -offset of the correction field inside the PTP packet -UDP checksum update bit, in case the PTP event packet has UDP encapsulation These values can change any time, because there may be multiple PTP clients connected, that receive various 1588 frame types: - L2 only frame - UDP / IPv4 - UDP / IPv6 - other The current implementation uses dpni_set_single_step_cfg to update the SINGLE_STEP register. Using an MC command on the Tx datapath for each transmitted 1588 message introduces high delays, leading to low throughput and consequently to a small number of supported PTP clients. Besides these, the nanosecond correction field from the PTP packet will contain the high delay from the driver which together with the originTimestamp will render timestamp values that are unacceptable in a GM clock implementation. This patch series replaces the dpni_set_single_step_cfg function call from the Tx datapath for 1588 messages (when one step timestamping is enabled) with a callback that either implements direct access to the SINGLE_STEP register, eliminating the overhead caused by the MC command that will need to be dispatched by the MC firmware through the MC command portal interface or falls back to the dpni_set_single_step_cfg in case the MC version does not have support for returning the single step register base address. In other words all the delay introduced by dpni_set_single_step_cfg function will be eliminated (if MC version has support for returning the base address of the single step register), improving the egress driver performance for PTP packets when single step timestamping is enabled. 
The first patch adds a new attribute that contains the base address of the SINGLE_STEP register. It will be used to directly update the register on the Tx datapath. The second patch updates the driver such that the SINGLE_STEP register is either accessed directly if MC version >= 10.32 or is accessed through dpni_set_single_step_cfg command when 1588 messages are transmitted. Changes in v2: - move global function pointer into the driver's private structure in 2/2 - move repetitive code outside the body of the callback functions in 2/2 - update function dpaa2_ptp_onestep_reg_update_method and remove goto statement from non error path in 2/2 Changes in v3: - remove static storage class specifier from within the structure in 2/2 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:27:17 +00:00
Radu Bulie	c4680c9785	dpaa2-eth: Update SINGLE_STEP register access DPAA2 MAC supports 1588 one step timestamping. If this option is enabled then for each transmitted PTP event packet, the 1588 SINGLE_STEP register is accessed to modify the following fields: -offset of the correction field inside the PTP packet -UDP checksum update bit, in case the PTP event packet has UDP encapsulation These values can change any time, because there may be multiple PTP clients connected, that receive various 1588 frame types: - L2 only frame - UDP / IPv4 - UDP / IPv6 - other The current implementation uses dpni_set_single_step_cfg to update the SINGLE_STEP register. Using an MC command on the Tx datapath for each transmitted 1588 message introduces high delays, leading to low throughput and consequently to a small number of supported PTP clients. Besides these, the nanosecond correction field from the PTP packet will contain the high delay from the driver which together with the originTimestamp will render timestamp values that are unacceptable in a GM clock implementation. This patch updates the Tx datapath for 1588 messages when single step timestamp is enabled and provides direct access to SINGLE_STEP register, eliminating the overhead caused by the dpni_set_single_step_cfg MC command. MC version >= 10.32 implements this functionality. If the MC version does not have support for returning the single step register base address, the driver will use dpni_set_single_step_cfg command for update operations. All the delay introduced by dpni_set_single_step_cfg function will be eliminated (if MC version has support for returning the base address of the single step register), improving the egress driver performance for PTP packets when single step timestamping is enabled. Before these changes the maximum throughput for 1588 messages with single step hardware timestamp enabled was around 2000pps. After the updates the throughput increased up to 32.82 Mbps / 46631.02 pps. 
Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:27:17 +00:00
Radu Bulie	9572594ecf	dpaa2-eth: Update dpni_get_single_step_cfg command dpni_get_single_step_cfg is an MC firmware command used for retrieving the contents of SINGLE_STEP 1588 register available in a DPMAC. This patch adds a new version of this command that returns as an extra argument the physical base address of the aforementioned register. The address will be used to directly modify the contents of the SINGLE_STEP register instead of invoking the MC command dpni_set_single_step_cfg. The former approach introduced huge delays on the TX datapath when one step PTP events were transmitted. This led to low throughput and high latencies observed in the PTP correction field. Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:27:16 +00:00
Eric Dumazet	8a4fc54b07	net: get rid of rtnl_lock_unregistering() After recent patches, and in particular commits `faab39f63c` ("net: allow out-of-order netdev unregistration") and `e5f80fcf86` ("ipv6: give an IPv6 dev to blackhole_netdev") we no longer need the barrier implemented in rtnl_lock_unregistering(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:24:03 +00:00
Volodymyr Mytnyk	b3ae2d350d	net: prestera: flower: fix destroy tmpl in chain Fix flower destroy template callback to release template only for specific tc chain instead of all chain templates. The issue was introduced by previous commit that introduced multi-chain support. Fixes: `fa5d824ce5` ("net: prestera: acl: add multi-chain support offload") Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:22:03 +00:00
Eric Dumazet	36a29fb6b2	bridge: switch br_net_exit to batch mode cleanup_net() is competing with other rtnl users. Instead of calling br_net_exit() for each netns, call br_net_exit_batch() once. This gives cleanup_net() ability to group more devices and call unregister_netdevice_many() only once for all bridge devices. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:20:12 +00:00
David S. Miller	a7cc3464e6	Merge branch 'mctp-i2c' Matt Johnston says: ==================== MCTP I2C driver This patch series adds a netdev driver providing MCTP transport over I2C. I think I've addressed all the points raised in v5. It now has mctp_i2c_unregister() to run things in the correct order, waiting for the worker thread and I2C rx to complete. Cheers, Matt -- v6: - Changed netdev register/unregister/free to avoid races. Ensure that netif functions are not used by irq handler/threads after unregister. - Fix incoming I2C hwaddr that was previously incorrect (left shifted 1 bit) - Add a check that byte_count wire header matches the length received - Renamed I2C driver to mctp-i2c-interface - Removed __func__ from print messages, added missing newlines - Removed sysfs mctp_current_mux file which was used for debug - Renamed curr_lock to sel_lock - Tidied comment formatting - Fix newline in Kconfig v5: - Fix incorrect format string v4: - Switch to __i2c_transfer() rather than __i2c_smbus_xfer(), drop 255 byte smbus patches - Use wait_event_idle() for the sleeping TX thread - Use dev_addr_set() v3: - Added Reviewed-bys for npcm7xx - Resend with net-next open v2: - Simpler Kconfig condition for i2c-mux dependency, from Randy Dunlap ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:18:50 +00:00
Matt Johnston	f5b8abf9fc	mctp i2c: MCTP I2C binding driver Provides MCTP network transport over an I2C bus, as specified in DMTF DSP0237. All messages between nodes are sent as SMBus Block Writes. Each I2C bus to be used for MCTP is flagged in devicetree by a 'mctp-controller' property on the bus node. Each flagged bus gets a mctpi2cX net device created based on the bus number. A 'mctp-i2c-controller' I2C client needs to be added under the adapter. In an I2C mux situation the mctp-i2c-controller node must be attached only to the root I2C bus. The I2C client will handle incoming I2C slave block write data for subordinate busses as well as its own bus. In configurations without devicetree a driver instance can be attached to a bus using the I2C slave new_device mechanism. The MCTP core will hold/release the MCTP I2C device while responses are pending (a 6 second timeout or once a socket is closed, response received etc). While held the MCTP I2C driver will lock the I2C bus so that the correct I2C mux remains selected while responses are received. (Ideally we would just lock the mux to keep the current bus selected for the response rather than a full I2C bus lock, but that isn't exposed in the I2C mux API) Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Reviewed-by: Wolfram Sang <wsa@kernel.org> # I2C transport parts Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:18:49 +00:00
Matt Johnston	6881e493b0	dt-bindings: net: New binding mctp-i2c-controller Used to define a local endpoint to communicate with MCTP peripherals attached to an I2C bus. This I2C endpoint can communicate with remote MCTP devices on the I2C bus. In the example I2C topology below (matching the second yaml example) we have MCTP devices on busses i2c1 and i2c6. MCTP-supporting busses are indicated by the 'mctp-controller' DT property on an I2C bus node. A mctp-i2c-controller I2C client DT node is placed at the top of the mux topology, since only the root I2C adapter will support I2C slave functionality. .-------. \|eeprom \| .------------. .------. /'-------' \| adapter \| \| mux --@0,i2c5------' \| i2c1 ----.*\| --@1,i2c6--.--. \|............\| \'------' \ \ ......... \| mctp-i2c- \| \ \ \ .mctpB . \| controller \| \ \ '.0x30 . \| \| \ ......... \ '.......' \| 0x50 \| \ .mctpA . \ ......... '------------' '.0x1d . '.mctpC . '.......' '.0x31 . '.......' (mctpX boxes above are remote MCTP devices not included in the DT at present, they can be hotplugged/probed at runtime. A DT binding for specific fixed MCTP devices could be added later if required) Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:18:49 +00:00
Mobashshera Rasool	4b340a5a72	net: ip6mr: add support for passing full packet on wrong mif This patch adds support for MRT6MSG_WRMIFWHOLE which is used to pass full packet and real vif id when the incoming interface is wrong. While the RP and FHR are setting up state we need to be sending the registers encapsulated with all the data inside otherwise we lose it. The RP then decapsulates it and forwards it to the interested parties. Currently with WRONGMIF we can only be sending empty register packets and will lose that data. This behaviour can be enabled by using MRT_PIM with val == MRT6MSG_WRMIFWHOLE. This doesn't prevent MRT6MSG_WRONGMIF from happening, it happens in addition to it, also it is controlled by the same throttling parameters as WRONGMIF (i.e. 1 packet per 3 seconds currently). Both messages are generated to keep backwards compatibility and avoid breaking someone who was enabling MRT_PIM with val == 4, since any positive val is accepted and treated the same. Signed-off-by: Mobashshera Rasool <mobash.rasool.linux@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 16:05:54 +00:00
Alexander Lobakin	7e1b54d077	i40e: remove dead stores on XSK hotpath The 'if (ntu == rx_ring->count)' block in i40e_alloc_rx_buffers_zc() was previously residing in the loop, but after introducing the batched interface it is used only to wrap-around the NTU descriptor, thus no more need to assign 'xdp'. 'cleaned_count' in i40e_clean_rx_irq_zc() was previously being incremented in the loop, but after commit `f12738b6ec` ("i40e: remove unnecessary cleaned_count updates") it gets assigned only once after it, so the initialization can be dropped. Fixes: `6aab0bb0c5` ("i40e: Use the xsk batched rx allocation interface") Fixes: `f12738b6ec` ("i40e: remove unnecessary cleaned_count updates") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-19 12:34:04 +00:00
Jakub Kicinski	bbcf340d9d	Merge branch 'add-checks-for-incoming-packet-addresses' Jeremy Kerr says: ==================== Add checks for incoming packet addresses This series adds a couple of checks for valid addresses on incoming MCTP packets. We introduce a couple of helpers in 1/2, and use them in the ingress path in 2/2. ==================== Link: https://lore.kernel.org/r/20220218042554.564787-1-jk@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 21:24:33 -08:00
Jeremy Kerr	86cdfd63f2	mctp: add address validity checking for packet receive This change adds some basic sanity checks for the source and dest headers of packets on initial receive. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 21:24:29 -08:00
Jeremy Kerr	cb196b7259	mctp: replace mctp_address_ok with more fine-grained helpers Currently, we have mctp_address_ok(), which checks if an EID is in the "valid" range of 8-254 inclusive. However, 0 and 255 may also be valid addresses, depending on context. 0 is the NULL EID, which may be set when physical addressing is used. 255 is valid as a destination address for broadcasts. This change renames mctp_address_ok to mctp_address_unicast, and adds similar helpers for broadcast and null EIDs, which will be used in an upcoming commit. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 21:24:28 -08:00
Jacques de Laval	47f0bd5032	net: Add new protocol attribute to IP addresses This patch adds a new protocol attribute to IPv4 and IPv6 addresses. Inspiration was taken from the protocol attribute of routes. User space applications like iproute2 can set/get the protocol with the Netlink API. The attribute is stored as an 8-bit unsigned integer. The protocol attribute is set by kernel for these categories: - IPv4 and IPv6 loopback addresses - IPv6 addresses generated from router announcements - IPv6 link local addresses User space may pass custom protocols, not defined by the kernel. Grouping addresses on their origin is useful in scenarios where you want to distinguish between addresses based on who added them, e.g. kernel vs. user space. Tagging addresses with a string label is an existing feature that could be used as a solution. Unfortunately the max length of a label is 15 characters, and for compatibility reasons the label must be prefixed with the name of the device followed by a colon. Since device names also have a max length of 15 characters, only -1 characters is guaranteed to be available for any origin tag, which is not that much. A reference implementation of user space setting and getting protocols is available for iproute2: `9a6ea18bd7` Signed-off-by: Jacques de Laval <Jacques.De.Laval@westermo.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220217150202.80802-1-Jacques.De.Laval@westermo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 21:20:06 -08:00
Jakub Kicinski	6e2e59eaee	Merge branch 'ionic-driver-updates' Shannon Nelson says: ==================== ionic: driver updates These are a couple of checkpatch cleanup patches, a bug fix, and something to alleviate memory pressure in tight places. ==================== Link: https://lore.kernel.org/r/20220217220252.52293-1-snelson@pensando.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 20:37:17 -08:00
Shannon Nelson	ecea8bb429	ionic: clean up comments and whitespace Fix up some checkpatch complaints that have crept in: doubled words words, mispellled words, doubled lines. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 20:37:14 -08:00
Shannon Nelson	799c230e93	ionic: prefer strscpy over strlcpy Replace strlcpy with strscpy to clean up a checkpatch complaint. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 20:37:14 -08:00
Brett Creeley	116dce0ff0	ionic: Use vzalloc for large per-queue related buffers Use vzalloc for per-queue info structs that don't need any DMA mapping to help relieve memory pressure found when used in our limited SOC environment. Signed-off-by: Brett Creeley <brett@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 20:37:14 -08:00
Shannon Nelson	12b1b997c0	ionic: catch transition back to RUNNING with fw_generation 0 In some graceful updates that get initially triggered by the RESET event, especially with older firmware, the fw_generation bits don't change but the fw_status is seen to go to 0 then back to 1. However, the driver didn't perform the restart, remained waiting for fw_generation to change, and got left in limbo. This is because the clearing of idev->fw_status_ready to 0 didn't happen correctly as it was buried in the transition trigger: since the transition down was triggered not here but in the RESET event handler, the clear to 0 didn't happen, so the transition back to 1 wasn't detected. Fix this particular case by bringing the setting of idev->fw_status_ready back out to where it was before. Fixes: `398d1e37f9` ("ionic: add FW_STOPPING state") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 20:37:14 -08:00
Eric Dumazet	86213f80da	net: avoid quadratic behavior in netdev_wait_allrefs_any() If the list of devices has N elements, netdev_wait_allrefs_any() is called N times, and linkwatch_forget_dev() is called N*(N-1)/2 times. Fix this by calling linkwatch_forget_dev() only once per device. Fixes: `faab39f63c` ("net: allow out-of-order netdev unregistration") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220218065430.2613262-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-02-18 07:28:31 -08:00
Eric Dumazet	086d49058c	ipv6: annotate some data-races around sk->sk_prot IPv6 has this hack changing sk->sk_prot when an IPv6 socket is 'converted' to an IPv4 one with IPV6_ADDRFORM option. This operation is only performed for TCP and UDP, knowing their 'struct proto' for the two network families are populated in the same way, and can not disappear while a reader might use and dereference sk->sk_prot. If we think about it all reads of sk->sk_prot while either socket lock or RTNL is not acquired should be using READ_ONCE(). Also note that other layers like MPTCP, XFRM, CHELSIO_TLS also write over sk->sk_prot. BUG: KCSAN: data-race in inet6_recvmsg / ipv6_setsockopt write to 0xffff8881386f7aa8 of 8 bytes by task 26932 on cpu 0: do_ipv6_setsockopt net/ipv6/ipv6_sockglue.c:492 [inline] ipv6_setsockopt+0x3758/0x3910 net/ipv6/ipv6_sockglue.c:1019 udpv6_setsockopt+0x85/0x90 net/ipv6/udp.c:1649 sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3489 __sys_setsockopt+0x209/0x2a0 net/socket.c:2180 __do_sys_setsockopt net/socket.c:2191 [inline] __se_sys_setsockopt net/socket.c:2188 [inline] __x64_sys_setsockopt+0x62/0x70 net/socket.c:2188 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff8881386f7aa8 of 8 bytes by task 26911 on cpu 1: inet6_recvmsg+0x7a/0x210 net/ipv6/af_inet6.c:659 ____sys_recvmsg+0x16c/0x320 ___sys_recvmsg net/socket.c:2674 [inline] do_recvmmsg+0x3f5/0xae0 net/socket.c:2768 __sys_recvmmsg net/socket.c:2847 [inline] __do_sys_recvmmsg net/socket.c:2870 [inline] __se_sys_recvmmsg net/socket.c:2863 [inline] __x64_sys_recvmmsg+0xde/0x160 net/socket.c:2863 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xffffffff85e0e980 -> 0xffffffff85e01580 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 26911 Comm: syz-executor.3 Not tainted 5.17.0-rc2-syzkaller-00316-g0457e5153e0e-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:53:28 +00:00
Cédric Le Goater	7ea0c16a74	net/ibmvnic: Cleanup workaround doing an EOI after partition migration There were a fair amount of changes to workaround a firmware bug leaving a pending interrupt after migration of the ibmvnic device : commit `2df5c60e19` ("net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode") commit `284f87d2f3` ("Revert "net/ibmvnic: Fix EOI when running in XIVE mode"") commit `11d49ce9f7` ("net/ibmvnic: Fix EOI when running in XIVE mode.") commit `f23e0643cd` ("ibmvnic: Clear pending interrupt after device reset") Here is the final one taking into account the XIVE interrupt mode. Cc: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Cc: Dany Madden <drt@linux.ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:47:48 +00:00
jeffreyji	aaae162aeb	teaming: deliver link-local packets with the link they arrive on skb is ignored if team port is disabled. We want the skb to be delivered if it's an link layer packet. Issue is already fixed for bonding in commit `b89f04c61e` ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") changelog: v2: change LLDP -> link layer in comments/commit descrip, comment format Signed-off-by: jeffreyji <jeffreyji@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:40:52 +00:00
David S. Miller	a3b355c778	Merge branch 'qca8k-phylink' Russell King says: ==================== net: dsa: qca8k: convert to phylink_pcs and mark as non-legacy This series adds support into DSA for the mac_select_pcs method, and converts qca8k to make use of this, eventually marking qca8k as non- legacy. Patch 1 adds DSA support for mac_select_pcs. Patch 2 and patch 3 moves code around in qca8k to make patch 4 more readable. Patch 4 does a simple conversion to phylink_pcs. Patch 5 moves the serdes configuration to phylink_pcs. Patch 6 marks qca8k as non-legacy. v2: fix dsa_phylink_mac_select_pcs() formatting and double-blank line in patch 5 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:33 +00:00
Russell King (Oracle)	d9cbacf057	net: dsa: qca8k: mark as non-legacy The qca8k driver does not make use of the speed, duplex, pause or advertisement in its phylink_mac_config() implementation, so it can be marked as a non-legacy driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:33 +00:00
Russell King (Oracle)	7544b3ff74	net: dsa: qca8k: move pcs configuration Move the PCS configuration to qca8k_pcs_config(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:33 +00:00
Russell King (Oracle)	9612a8f915	net: dsa: qca8k: convert to use phylink_pcs Convert the qca8k driver to use the phylink_pcs support to talk to the SGMII PCS. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:33 +00:00
Russell King (Oracle)	10728cd796	net: dsa: qca8k: move qca8k_phylink_mac_link_state() Move qca8k_phylink_mac_link_state() to separate the code movement from code changes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:32 +00:00
Russell King (Oracle)	3ce855f040	net: dsa: qca8k: move qca8k_setup() Move qca8k_setup() to be later in the file to avoid needing prototypes for called functions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:32 +00:00
Russell King (Oracle)	bde018222c	net: dsa: add support for phylink mac_select_pcs() Add DSA support for the phylink mac_select_pcs() method so DSA drivers can return provide phylink with the appropriate PCS for the PHY interface mode. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:28:32 +00:00
Tom Rix	8aba73ef44	net: ethernet: xilinx: cleanup comments Remove the second 'the'. Replacements: endiannes to endianness areconnected to are connected Mamagement to Management undoccumented to undocumented Xilink to Xilinx strucutre to structure Change kernel-doc comment style to c style for /* Management ... Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-02-18 11:11:10 +00:00

1074688 Commits