linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-22 04:02:20 +00:00

Author	SHA1	Message	Date
Eric Dumazet	4721031c35	net: move gro definitions to include/net/gro.h include/linux/netdevice.h became too big, move gro stuff into include/net/gro.h Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:16:54 +00:00
David S. Miller	6fcc06205c	Merge branch 'tcp-optimizations' Eric Dumazet says: ==================== tcp: optimizations for linux-5.17 Mostly small improvements in this series. The notable change is in "defer skb freeing after socket lock is released" in recvmsg() (and RX zerocopy). The idea is to leave skb freeing to the BH handler whenever possible, or at least to perform the freeing outside of the socket lock section, for much improved performance. This idea can probably be extended to other protocols. Tests on a 100Gbit NIC: max throughput for one TCP_STREAM flow, over 10 runs. MTU : 1500 (1428 bytes of TCP payload per MSS) Before: 55 Gbit After: 66 Gbit MTU : 4096+ (4096 bytes of TCP payload, plus TCP/IPv6 headers) Before: 82 Gbit After: 95 Gbit ==================== Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:35 +00:00
Eric Dumazet	43f51df417	net: move early demux fields close to sk_refcnt sk_rx_dst/sk_rx_dst_ifindex/sk_rx_dst_cookie are read in early demux, and currently span two cache lines. Moving them close to sk_refcnt makes more sense, as only one cache line is needed. New layout for this hot cache line is: struct sock { struct sock_common __sk_common; /* 0 0x88 */ /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */ struct dst_entry *sk_rx_dst; /* 0x88 0x8 */ int sk_rx_dst_ifindex; /* 0x90 0x4 */ u32 sk_rx_dst_cookie; /* 0x94 0x4 */ socket_lock_t sk_lock; /* 0x98 0x20 */ atomic_t sk_drops; /* 0xb8 0x4 */ int sk_rcvlowat; /* 0xbc 0x4 */ /* --- cacheline 3 boundary (192 bytes) --- */ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:35 +00:00
Eric Dumazet	29fbc26e6d	tcp: do not call tcp_cleanup_rbuf() if we have a backlog Under pressure, tcp recvmsg() has logic to process the socket backlog, but calls tcp_cleanup_rbuf() right before. Avoiding sending an ACK right before processing new segments makes a lot of sense, as this decreases the number of ACK packets, with no impact on effective ACK clocking. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:35 +00:00
Eric Dumazet	8bd172b787	tcp: check local var (timeo) before socket fields in one test Testing timeo before sk_err/sk_state/sk_shutdown makes more sense. Modern applications use non-blocking IO, while a socket is terminated only once during its life time. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:35 +00:00
Eric Dumazet	f35f821935	tcp: defer skb freeing after socket lock is released tcp recvmsg() (or rx zerocopy) spends a fair amount of time freeing skbs after their payload has been consumed. A typical ~64KB GRO packet has to release ~45 page references, eventually going to page allocator for each of them. Currently, this freeing is performed while socket lock is held, meaning that there is a high chance that BH handler has to queue incoming packets to tcp socket backlog. This can cause additional latencies, because the user thread has to process the backlog at release_sock() time, and while doing so, additional frames can be added by BH handler. This patch adds logic to defer these frees after socket lock is released, or directly from BH handler if possible. Being able to free these skbs from BH handler helps a lot, because this avoids the usual alloc/free asymmetry, when BH handler and user thread do not run on same cpu or NUMA node. One cpu can now be fully utilized for the kernel->user copy, and another cpu is handling BH processing and skb/page allocs/frees (assuming RFS is not forcing use of a single CPU) Tested: 100Gbit NIC Max throughput for one TCP_STREAM flow, over 10 runs MTU : 1500 Before: 55 Gbit After: 66 Gbit MTU : 4096+(headers) Before: 82 Gbit After: 95 Gbit Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:35 +00:00
Eric Dumazet	3df684c1a3	tcp: avoid indirect calls to sock_rfree TCP uses sk_eat_skb() when skbs can be removed from receive queue. However, the call to skb_orphan() from __kfree_skb() incurs an indirect call to sock_rfree(), which is more expensive than a direct call, especially for CONFIG_RETPOLINE=y. Add tcp_eat_recv_skb() function to make the call before __kfree_skb(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:35 +00:00
Eric Dumazet	b96c51bd3b	tcp: tp->urg_data is unlikely to be set Use some unlikely() hints in the fast path. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	7b6a893a59	tcp: annotate races around tp->urg_data tcp_poll() and tcp_ioctl() are reading tp->urg_data without socket lock owned. Also, it is faster to first check tp->urg_data in tcp_poll(), then tp->urg_seq == tp->copied_seq, because tp->urg_seq is located in a different/cold cache line. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	0307a0b74b	tcp: annotate data-races on tp->segs_in and tp->data_segs_in tcp_segs_in() can be called from BH, while socket spinlock is held but socket owned by user, eventually reading these fields from tcp_get_info() Found by code inspection, no need to backport this patch to older kernels. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	d2489c7b6d	tcp: add RETPOLINE mitigation to sk_backlog_rcv Use INDIRECT_CALL_INET() to avoid an indirect call when/if CONFIG_RETPOLINE=y Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	93afcfd1db	tcp: small optimization in tcp recvmsg() When reading large chunks of data, incoming packets might be added to the backlog from BH. tcp recvmsg() detects the backlog queue is not empty, and uses a release_sock()/lock_sock() pair to process this backlog. We now have __sk_flush_backlog() to perform this a bit faster. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	91b6d32563	net: cache align tcp_memory_allocated, tcp_sockets_allocated tcp_memory_allocated and tcp_sockets_allocated often share a common cache line, source of false sharing. Also take care of udp_memory_allocated and mptcp_sockets_allocated. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	6c302e799a	net: forward_alloc_get depends on CONFIG_MPTCP (struct proto)->sk_forward_alloc is currently only used by MPTCP. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	1ace2b4d2b	net: shrink struct sock by 8 bytes Move sk_bind_phc next to sk_peer_lock to fill a hole. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	1b31debca8	ipv6: shrink struct ipcm6_cookie gso_size can be moved after tclass, to use an existing hole. (8 bytes saved on 64bit arches) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	aba546565b	net: remove sk_route_nocaps Instead of using a full netdev_features_t, we can use a single bit, as sk_route_nocaps is only used to remove NETIF_F_GSO_MASK from sk->sk_route_cap. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	d0d598ca86	net: remove sk_route_forced_caps We were only using one bit, and we can replace it by sk_is_tcp() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	42f67eea3b	net: use sk_is_tcp() in more places Move sk_is_tcp() to include/net/sock.h and use it where we can. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	3735440200	tcp: small optimization in tcp_v6_send_check() For TCP flows, inet6_sk(sk)->saddr has the same value as sk->sk_v6_rcv_saddr. Using sk->sk_v6_rcv_saddr increases data locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	283c6b54bc	tcp: remove dead code in __tcp_v6_send_check() For some reason, I forgot to change __tcp_v6_send_check() at the same time I removed (ip_summed == CHECKSUM_PARTIAL) check in __tcp_v4_send_check() Fixes: `98be9b1209` ("tcp: remove dead code after CHECKSUM_PARTIAL adoption") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Eric Dumazet	d519f35096	tcp: minor optimization in tcp_add_backlog() If packet is going to be coalesced, sk_sndbuf/sk_rcvbuf values are not used. Defer their access to the point we need them. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-16 13:10:34 +00:00
Sean Anderson	3ad4b7c81a	net: macb: Fix several edge cases in validate There were several cases where validate() would return bogus supported modes with unusual combinations of interfaces and capabilities. For example, if state->interface was 10GBASER and the macb had HIGH_SPEED and PCS but not GIGABIT MODE, then 10/100 modes would be set anyway. In another case, SGMII could be enabled even if the mac was not a GEM (despite this being checked for later on in mac_config()). These inconsistencies make it difficult to refactor this function cleanly. There is still the open question of what exactly the requirements for SGMII and 10GBASER are, and what SGMII actually supports. If someone from Cadence (or anyone else with access to the GEM/MACB datasheet) could comment on this, it would be greatly appreciated. In particular, what is supported by Cadence vs. vendor extension/limitation? To address this, the current logic is split into three parts. First, we determine what we support, then we eliminate unsupported interfaces, and finally we set the appropriate link modes. There is still some cruft related to NA, but this can be removed in a future patch. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Reviewed-by: Parshuram Thombare <pthombar@cadence.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20211112190400.1937855-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-15 17:06:51 -08:00
Jakub Kicinski	a5bdc36354	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2021-11-15 We've added 72 non-merge commits during the last 13 day(s) which contain a total of 171 files changed, 2728 insertions(+), 1143 deletions(-). The main changes are: 1) Add btf_type_tag attributes to bring kernel annotations like __user/__rcu to BTF such that BPF verifier will be able to detect misuse, from Yonghong Song. 2) Big batch of libbpf improvements including various fixes, future proofing APIs, and adding a unified, OPTS-based bpf_prog_load() low-level API, from Andrii Nakryiko. 3) Add ingress_ifindex to BPF_SK_LOOKUP program type for selectively applying the programmable socket lookup logic to packets from a given netdev, from Mark Pashmfouroush. 4) Remove the 128M upper JIT limit for BPF programs on arm64 and add selftest to ensure exception handling still works, from Russell King and Alan Maguire. 5) Add a new bpf_find_vma() helper for tracing to map an address to the backing file such as shared library, from Song Liu. 6) Batch of various misc fixes to bpftool, fixing a memory leak in BPF program dump, updating documentation and bash-completion among others, from Quentin Monnet. 7) Deprecate libbpf bpf_program__get_prog_info_linear() API and migrate its users as the API is heavily tailored around perf and is non-generic, from Dave Marchevsky. 8) Enable libbpf's strict mode by default in bpftool and add a --legacy option as an opt-out for more relaxed BPF program requirements, from Stanislav Fomichev. 9) Fix bpftool to use libbpf_get_error() to check for errors, from Hengqi Chen. 
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (72 commits) bpftool: Use libbpf_get_error() to check error bpftool: Fix mixed indentation in documentation bpftool: Update the lists of names for maps and prog-attach types bpftool: Fix indent in option lists in the documentation bpftool: Remove inclusion of utilities.mak from Makefiles bpftool: Fix memory leak in prog_dump() selftests/bpf: Fix a tautological-constant-out-of-range-compare compiler warning selftests/bpf: Fix an unused-but-set-variable compiler warning bpf: Introduce btf_tracing_ids bpf: Extend BTF_ID_LIST_GLOBAL with parameter for number of IDs bpftool: Enable libbpf's strict mode by default docs/bpf: Update documentation for BTF_KIND_TYPE_TAG support selftests/bpf: Clarify llvm dependency with btf_tag selftest selftests/bpf: Add a C test for btf_type_tag selftests/bpf: Rename progs/tag.c to progs/btf_decl_tag.c selftests/bpf: Test BTF_KIND_DECL_TAG for deduplication selftests/bpf: Add BTF_KIND_TYPE_TAG unit tests selftests/bpf: Test libbpf API function btf__add_type_tag() bpftool: Support BTF_KIND_TYPE_TAG libbpf: Support BTF_KIND_TYPE_TAG ... ==================== Link: https://lore.kernel.org/r/20211115162008.25916-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-15 08:49:23 -08:00
Jakub Kicinski	2f6a470d65	Revert "Merge branch 'mctp-i2c-driver'" This reverts commit `71812af723`, reversing changes made to `cc0be1ad68`. Wolfram Sang says: Please revert. Besides the driver in net, it modifies the I2C core code. This has not been acked by the I2C maintainer (in this case me). So, please don't pull this in via the net tree. The question raised here (extending SMBus calls to 255 byte) is complicated because we need ABI backwards compatibility. Link: https://lore.kernel.org/all/YZJ9H4eM%2FM7OXVN0@shikoro/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-11-15 07:53:10 -08:00
David S. Miller	6d3b1b0699	Merge branch 'generic-phylink-validation' Russell King says: ==================== introduce generic phylink validation The various validate method implementations we have in phylink users have been quite repetitive but also prone to bugs. These patches introduce a generic implementation which relies solely on the supported_interfaces bitmap introduced during last cycle, and in the first patch, a bit array of MAC capabilities. MAC drivers are free to continue to do their own thing if they have special requirements - such as mvneta and mvpp2 which do not support 1000base-X without AN enabled. Most implementations currently in the kernel can be converted to call phylink_generic_validate() directly from the phylink MAC operations structure once they fill in the supported_interfaces and mac_capabilities members of phylink_config. This series introduces the generic implementation, and converts mvneta and mvpp2 to use it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:31:00 +00:00
Russell King (Oracle)	5038ffea0c	net: mvpp2: use phylink_generic_validate() Convert mvpp2 to use phylink_generic_validate() for the bulk of its validate() implementation. This network adapter has a restriction that for 802.3z links, autonegotiation must be enabled. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:31:00 +00:00
Russell King (Oracle)	02a0988b98	net: mvneta: use phylink_generic_validate() Convert mvneta to use phylink_generic_validate() for the bulk of its validate() implementation. This network adapter has a restriction that for 802.3z links, autonegotiation must be enabled. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:31:00 +00:00
Russell King (Oracle)	34ae2c09d4	net: phylink: add generic validate implementation Add a generic validate() implementation using the supported_interfaces and a bitmask of MAC pause/speed/duplex capabilities. This allows us to entirely eliminate many driver private validate() implementations. We expose the underlying phylink_get_linkmodes() function so that drivers which have special needs can still benefit from conversion. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:31:00 +00:00
Christophe Leroy	5cf46d8e74	net/wan/fsl_ucc_hdlc: fix sparse warnings CHECK drivers/net/wan/fsl_ucc_hdlc.c drivers/net/wan/fsl_ucc_hdlc.c:309:57: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:309:57: expected void [noderef] __iomem * drivers/net/wan/fsl_ucc_hdlc.c:309:57: got restricted __be16 * drivers/net/wan/fsl_ucc_hdlc.c:311:46: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:311:46: expected void [noderef] __iomem * drivers/net/wan/fsl_ucc_hdlc.c:311:46: got restricted __be32 * drivers/net/wan/fsl_ucc_hdlc.c:320:57: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:320:57: expected void [noderef] __iomem * drivers/net/wan/fsl_ucc_hdlc.c:320:57: got restricted __be16 * drivers/net/wan/fsl_ucc_hdlc.c:322:46: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:322:46: expected void [noderef] __iomem * drivers/net/wan/fsl_ucc_hdlc.c:322:46: got restricted __be32 * drivers/net/wan/fsl_ucc_hdlc.c:372:29: warning: incorrect type in assignment (different base types) drivers/net/wan/fsl_ucc_hdlc.c:372:29: expected unsigned short [usertype] drivers/net/wan/fsl_ucc_hdlc.c:372:29: got restricted __be16 [usertype] drivers/net/wan/fsl_ucc_hdlc.c:379:36: warning: restricted __be16 degrades to integer drivers/net/wan/fsl_ucc_hdlc.c:402:12: warning: incorrect type in assignment (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:402:12: expected struct qe_bd [noderef] __iomem *bd drivers/net/wan/fsl_ucc_hdlc.c:402:12: got struct qe_bd *curtx_bd drivers/net/wan/fsl_ucc_hdlc.c:425:20: warning: incorrect type in assignment (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:425:20: expected struct qe_bd [noderef] __iomem *[assigned] bd drivers/net/wan/fsl_ucc_hdlc.c:425:20: got struct qe_bd *tx_bd_base drivers/net/wan/fsl_ucc_hdlc.c:427:16: error: incompatible types in comparison expression (different address spaces): drivers/net/wan/fsl_ucc_hdlc.c:427:16: struct qe_bd [noderef] __iomem * drivers/net/wan/fsl_ucc_hdlc.c:427:16: struct qe_bd * drivers/net/wan/fsl_ucc_hdlc.c:462:33: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:506:41: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:528:33: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:552:38: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:596:67: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:611:41: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:851:38: warning: incorrect type in initializer (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:854:40: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:855:40: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:858:39: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:861:37: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:866:38: warning: incorrect type in initializer (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:868:21: warning: incorrect type in argument 1 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:870:40: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:871:40: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:873:39: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:993:57: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:995:46: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:1004:57: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:1006:46: warning: incorrect type in argument 2 (different address spaces) drivers/net/wan/fsl_ucc_hdlc.c:412:35: warning: dereference of noderef expression drivers/net/wan/fsl_ucc_hdlc.c:412:35: warning: dereference of noderef expression drivers/net/wan/fsl_ucc_hdlc.c:724:29: warning: dereference of noderef expression drivers/net/wan/fsl_ucc_hdlc.c:815:21: warning: dereference of noderef expression drivers/net/wan/fsl_ucc_hdlc.c:1021:29: warning: dereference of noderef expression Most of the warnings are due to DMA memory being incorrectly handled as IO memory. Fix it by doing direct read/write and doing proper dma_rmb() / dma_wmb(). Other problems are type mismatches or lack of use of IO accessors. Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reported-by: kernel test robot <lkp@intel.com> Link: https://lkml.org/lkml/2021/11/12/647 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:24:41 +00:00
Yihao Han	311107bdec	net: fddi: use swap() to make code cleaner Use the macro 'swap()' defined in 'include/linux/minmax.h' to avoid opencoding it. Signed-off-by: Yihao Han <hanyihao@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:19:17 +00:00
Guo Zhengkui	9ed941178c	hinic: use ARRAY_SIZE instead of ARRAY_LEN ARRAY_SIZE defined in <linux/kernel.h> is safer than self-defined macros to get size of an array such as ARRAY_LEN used here. Because ARRAY_SIZE uses __must_be_array(arr) to ensure arr is really an array. Reported-by: Alejandro Colomar <colomar.6.4.3@gmail.com> Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:18:09 +00:00
Jacky Chou	16b1c4e01c	net: usb: ax88179_178a: add TSO feature On low-efficiency embedded platforms, transmission performance is poor because each Bulk-out transfer carries a single packet. Adding the TSO feature improves transmission performance and reduces the number of interrupts caused by Bulk-out completion. For reference, see the module net: usb: aqc111. Signed-off-by: Jacky Chou <jackychou@asix.com.tw> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:16:51 +00:00
David S. Miller	71812af723	Merge branch 'mctp-i2c-driver' Matt Johnston says: ==================== MCTP I2C driver This patch series adds a netdev driver providing MCTP transport over I2C. It applies against net-next using recent MCTP changes there, though also has I2C core changes for review. I'll leave it to maintainers where it should be applied - please let me know if it needs to be submitted differently. The I2C patches were previously sent as RFC though the only feedback there was an ack to 255 bytes for aspeed. The dt-bindings patch went through review on the list. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:11:25 +00:00
Matt Johnston	80be9b2c0d	mctp i2c: MCTP I2C binding driver Provides MCTP network transport over an I2C bus, as specified in DMTF DSP0237. All messages between nodes are sent as SMBus Block Writes. Each I2C bus to be used for MCTP is flagged in devicetree by a 'mctp-controller' property on the bus node. Each flagged bus gets a mctpi2cX net device created based on the bus number. A 'mctp-i2c-controller' I2C client needs to be added under the adapter. In an I2C mux situation the mctp-i2c-controller node must be attached only to the root I2C bus. The I2C client will handle incoming I2C slave block write data for subordinate busses as well as its own bus. In configurations without devicetree a driver instance can be attached to a bus using the I2C slave new_device mechanism. The MCTP core will hold/release the MCTP I2C device while responses are pending (a 6 second timeout or once a socket is closed, response received etc). While held the MCTP I2C driver will lock the I2C bus so that the correct I2C mux remains selected while responses are received. (Ideally we would just lock the mux to keep the current bus selected for the response rather than a full I2C bus lock, but that isn't exposed in the I2C mux API) This driver requires I2C adapters that allow 255 byte transfers (SMBus 3.0) as the specification requires a minimum MTU of 68 bytes. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:11:25 +00:00
Matt Johnston	0b6141eb2b	dt-bindings: net: New binding mctp-i2c-controller Used to define a local endpoint to communicate with MCTP peripherals attached to an I2C bus. This I2C endpoint can communicate with remote MCTP devices on the I2C bus. In the example I2C topology below (matching the second yaml example) we have MCTP devices on busses i2c1 and i2c6. MCTP-supporting busses are indicated by the 'mctp-controller' DT property on an I2C bus node. A mctp-i2c-controller I2C client DT node is placed at the top of the mux topology, since only the root I2C adapter will support I2C slave functionality. [ASCII diagram of the example topology: adapter i2c1 hosts the mctp-i2c-controller client at 0x50 and feeds a mux whose ports are @0 (i2c5, with an eeprom) and @1 (i2c6); the remote MCTP devices shown are mctpA (0x1d), mctpB (0x30), and mctpC (0x31).] (mctpX boxes above are remote MCTP devices not included in the DT at present, they can be hotplugged/probed at runtime. A DT binding for specific fixed MCTP devices could be added later if required) Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:11:25 +00:00
Matt Johnston	3ef2de27a0	i2c: npcm7xx: Allow 255 byte block SMBus transfers 255 byte support has been tested on a npcm750 board Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Tali Perry <tali.perry1@gmail.com> Reviewed-by: Patrick Venture <venture@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:11:24 +00:00
Matt Johnston	1b2ba1f591	i2c: aspeed: Allow 255 byte block transfers 255 byte transfers have been tested on an AST2500 board Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Reviewed-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:11:24 +00:00
Matt Johnston	84a107e68b	i2c: dev: Handle 255 byte blocks for i2c ioctl I2C_SMBUS is limited to 32 bytes due to compatibility with the 32 byte i2c_smbus_data.block I2C_RDWR allows larger transfers if sufficient sized buffers are passed. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:11:24 +00:00
Matt Johnston	13cae4a104	i2c: core: Allow 255 byte transfers for SMBus 3.x SMBus 3.0 increased the maximum block transfer size from 32 bytes to 255 bytes. We increase the size of struct i2c_smbus_data's block[] member. i2c_smbus_xfer() and i2c_smbus_xfer_emulated() now support 255 byte block operations, other block functions remain limited to 32 bytes for compatibility with existing callers. We allow adapters to indicate support for the larger size with I2C_FUNC_SMBUS_V3_BLOCK. Most emulated drivers should be able to use 255 byte blocks by replacing I2C_SMBUS_BLOCK_MAX with I2C_SMBUS_V3_BLOCK_MAX though some will have hardware limitations that need testing. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:11:24 +00:00
Christophe JAILLET	cc0be1ad68	net: bridge: Slightly optimize 'find_portno()' The 'inuse' bitmap is local to this function. So we can use the non-atomic '__set_bit()' to save a few cycles. While at it, also remove some useless {}. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 14:07:06 +00:00
Harshit Mogalapalli	cb3ef7b000	net: sched: sch_netem: Refactor code in 4-state loss generator Fixed comments to match description with variable names and refactored code to match the convention as per [1]. To match the convention mapping is done as follows: State 3 - LOST_IN_BURST_PERIOD State 4 - LOST_IN_GAP_PERIOD [1] S. Salsano, F. Ludovici, A. Ordine, "Definition of a general and intuitive loss model for packet networks and its implementation in the Netem module in the Linux kernel" Fixes: `a6e2fe17eb` ("sch_netem: replace magic numbers with enumerate") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 13:23:23 +00:00
Uwe Kleine-König	e99fa4230f	net: dsa: vsc73xxx: Make vsc73xx_remove() return void vsc73xx_remove() returns zero unconditionally and no caller checks the returned value. So convert the function to return no value. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 13:15:07 +00:00
Ong Boon Leong	ac746c8520	net: stmmac: enhance XDP ZC driver level switching performance The previous stmmac_xdp_set_prog() implementation uses stmmac_release() and stmmac_open(), which tear down the PHY device and cause an undesirable autonegotiation, delaying every AF_XDP ZC setup. This patch introduces two new functions, stmmac_xdp_release() and stmmac_xdp_open(), that tear down just the DMA descriptors, buffers, NAPI processing, and IRQs and re-establish them accordingly. As a result of this enhancement, we get rid of the transient state introduced by the link auto-negotiation: $ ./xdpsock -i eth0 -t -z sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 634444 634560 sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 632330 1267072 sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 632438 1899584 sock0@eth0:0 txonly xdp-drv pps pkts 1.00 rx 0 0 tx 632502 2532160 Reported-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-11-15 12:56:34 +00:00
Hengqi Chen	e5043894b2	bpftool: Use libbpf_get_error() to check error Currently, LIBBPF_STRICT_ALL mode is enabled by default for bpftool, which means that in error cases some libbpf APIs return NULL pointers. This makes the IS_ERR check fail to detect such cases, resulting in a segfault. Use libbpf_get_error() instead like we do in libbpf itself. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211115012436.3143318-1-hengqi.chen@gmail.com	2021-11-14 18:38:13 -08:00
Andrii Nakryiko	c874dff452	Merge branch 'bpftool: miscellaneous fixes' Quentin Monnet says: ==================== This set contains several independent minor fixes for bpftool, its Makefile, and its documentation. Please refer to individual commits for details. ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-11-14 18:35:02 -08:00
Quentin Monnet	b06be5651f	bpftool: Fix mixed indentation in documentation Some paragraphs in bpftool's documentation have a mix of tabs and spaces for indentation. Let's make it consistent. This patch brings no change to the text content. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211110114632.24537-7-quentin@isovalent.com	2021-11-14 18:35:02 -08:00
Quentin Monnet	3811e2753a	bpftool: Update the lists of names for maps and prog-attach types To support the different BPF map or attach types, bpftool must remain up-to-date with the types supported by the kernel. Let's update the lists, by adding the missing Bloom filter map type and the perf_event attach type. Both missing items were found with test_bpftool_synctypes.py. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211110114632.24537-6-quentin@isovalent.com	2021-11-14 18:35:02 -08:00
Quentin Monnet	986dec18bb	bpftool: Fix indent in option lists in the documentation Mixed indentation levels in the lists of options in bpftool's documentation produce some unexpected results. For the "bpftool" man page, it prints a warning: $ make -C bpftool.8 GEN bpftool.8 <stdin>:26: (ERROR/3) Unexpected indentation. For other pages, there is no warning, but it results in a line break appearing in the option lists in the generated man pages. RST paragraphs should have a uniform indentation level. Let's fix it. Fixes: `c07ba629df` ("tools: bpftool: Update and synchronise option list in doc and help msg") Fixes: `8cc8c6357c` ("tools: bpftool: Document and add bash completion for -L, -B options") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211110114632.24537-5-quentin@isovalent.com	2021-11-14 18:35:02 -08:00
Quentin Monnet	48f5aef4c4	bpftool: Remove inclusion of utilities.mak from Makefiles Bpftool's Makefile, and the Makefile for its documentation, both include scripts/utilities.mak, but they use none of the items defined in this file. Remove the includes. Fixes: `71bb428fe2` ("tools: bpf: add bpftool") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211110114632.24537-3-quentin@isovalent.com	2021-11-14 18:34:09 -08:00
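
Two commits in the listing above (1ace2b4d2b and 1b31debca8) shrink a structure by moving a small field into an existing alignment hole. The sketch below illustrates only the general effect and is not the kernel's actual layout: `demo_before`, `demo_after`, the `ptr` member, and the helper functions are hypothetical, and the offsets in the comments assume a typical LP64 ABI with 8-byte, 8-byte-aligned pointers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical "before" layout: the small fields are separated by a
 * pointer, so each one drags padding along with it (LP64 assumed). */
struct demo_before {
	int16_t  tclass;    /* offset 0, 2 bytes, then a 6-byte hole  */
	void    *ptr;       /* offset 8, 8 bytes                      */
	uint16_t gso_size;  /* offset 16, 2 bytes, then tail padding  */
};

/* Hypothetical "after" layout: gso_size is moved into the hole that
 * follows tclass, so no extra storage is needed for it. */
struct demo_after {
	int16_t  tclass;    /* offset 0                               */
	uint16_t gso_size;  /* offset 2, inside the former hole       */
	void    *ptr;       /* offset 8                               */
};

/* Bytes saved by the reordering on the current ABI. */
static inline size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_before) - sizeof(struct demo_after);
}

/* Print sizes and the offset of the moved field for both layouts. */
static inline void demo_print_layouts(void)
{
	printf("before: size=%zu gso_size@%zu\n",
	       sizeof(struct demo_before),
	       offsetof(struct demo_before, gso_size));
	printf("after:  size=%zu gso_size@%zu\n",
	       sizeof(struct demo_after),
	       offsetof(struct demo_after, gso_size));
	printf("saved:  %zu bytes\n", demo_bytes_saved());
}
```

Calling `demo_print_layouts()` from a `main()` shows the difference on the build machine. On kernel sources, a tool such as pahole reports holes like these directly, which is where layout dumps like the one quoted in 43f51df417 typically come from.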

1057824 Commits
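
Commit 9ed941178c in the listing above prefers ARRAY_SIZE over a self-defined ARRAY_LEN because ARRAY_SIZE refuses to compile when handed something that is not an array. A rough userspace sketch of that idea follows. The `DEMO_*` macros, `demo_table`, and `demo_sum()` are hypothetical stand-ins rather than the kernel's definitions, and the check relies on GNU C extensions (`__typeof__`, `__builtin_types_compatible_p`, a struct containing only an unnamed bit-field) as accepted by GCC and Clang.

```c
#include <stddef.h>

/* Naive version: also compiles for a pointer argument, where it
 * silently yields sizeof(pointer) / sizeof(element). */
#define DEMO_ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* 1 if both expressions have compatible types (GNU C builtin). */
#define DEMO_SAME_TYPE(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* Evaluates to 0 for a real array. For a pointer, `a` and `&(a)[0]`
 * have the same type, the bit-field width becomes -1, and the build
 * fails. */
#define DEMO_MUST_BE_ARRAY(a) \
	(sizeof(struct { int : -!!DEMO_SAME_TYPE((a), &(a)[0]); }))

/* Checked element count: same value as the naive macro, plus the
 * compile-time guard above. */
#define DEMO_ARRAY_SIZE(a) \
	(sizeof(a) / sizeof((a)[0]) + DEMO_MUST_BE_ARRAY(a))

static const int demo_table[] = { 10, 20, 30, 40, 50 };

/* Sum every element, taking the loop bound from the checked macro. */
static inline long demo_sum(void)
{
	long sum = 0;

	for (size_t i = 0; i < DEMO_ARRAY_SIZE(demo_table); i++)
		sum += demo_table[i];
	return sum;
}
```

Passing a pointer, for example a function parameter declared as `const int *p`, to `DEMO_ARRAY_SIZE()` is rejected at compile time, whereas `DEMO_ARRAY_LEN(p)` would compile and return a misleading count.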