linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-02 00:51:44 +00:00

Author	SHA1	Message	Date
Edward Cree	3c9561c0a5	sfc: support TC decap rules matching on enc_ip_tos Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask. To avoid partially-overlapping Outer Rules (which can lead to undefined behaviour in the hardware), store extra "pseudo" entries in our encap_match hashtable, which are used to enforce that all Outer Rule entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6 equivalent) have the same ip_tos_mask. The "direct" encap_match entry takes a reference on the "pseudo", allowing it to be destroyed when all "direct" entries using it are removed. efx_tc_em_pseudo_type is an enum rather than just a bool because in future an additional pseudo-type will be added to support Conntrack offload. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 10:37:02 +01:00
Edward Cree	56beb35d85	sfc: populate enc_ip_tos matches in MAE outer rules Currently tc.c will block them before they get here, but following patch will change that. Use the extack message from efx_mae_check_encap_match_caps() instead of writing a new one, since there's now more being fed in than just an IP version. Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 10:37:02 +01:00
Edward Cree	28fa3ac487	sfc: release encap match in efx_tc_flow_free() When force-freeing leftover entries from our match_action_ht, call efx_tc_delete_rule(), which releases all the rule's resources, rather than open-coding it. The open-coded version was missing a call to release the rule's encap match (if any). It probably doesn't matter as everything's being torn down anyway, but it's cleaner this way and prevents further error messages potentially being logged by efx_tc_encap_match_free() later on. Move efx_tc_flow_free() further down the file to avoid introducing a forward declaration of efx_tc_delete_rule(). Fixes: `17654d84b4` ("sfc: add offloading of 'foreign' TC (decap) rules") Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 10:37:02 +01:00
wuych	d3616dc779	net: liquidio: lio_main: Remove unnecessary (void) conversions Pointer variables of void type do not require type cast. Signed-off-by: wuych <yunchuan@nfschina.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 10:27:59 +01:00
Alexander Mikhalitsyn	2598619e01	sctp: add bpf_bypass_getsockopt proto callback Implement ->bpf_bypass_getsockopt proto callback and filter out SCTP_SOCKOPT_PEELOFF, SCTP_SOCKOPT_PEELOFF_FLAGS and SCTP_SOCKOPT_CONNECTX3 socket options from running eBPF hook on them. SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT hook returns an error after success of the original handler sctp_getsockopt(...), userspace will receive an error from getsockopt syscall and will be not aware that fd was successfully installed into a fdtable. As pointed by Marcelo Ricardo Leitner it seems reasonable to skip bpf getsockopt hook for SCTP_SOCKOPT_CONNECTX3 sockopt too. Because internaly, it triggers connect() and if error is masked then userspace will be confused. This patch was born as a result of discussion around a new SCM_PIDFD interface: https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ Fixes: `0d01da6afc` ("bpf: implement getsockopt and setsockopt hooks") Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Christian Brauner <brauner@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Xin Long <lucien.xin@gmail.com> Cc: linux-sctp@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Suggested-by: Stanislav Fomichev <sdf@google.com> Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Acked-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 10:07:16 +01:00
David S. Miller	e7ea5080ef	Merge branch 'selftests-fcnal' Guillaume Nault says: ==================== selftests: fcnal: Test SO_DONTROUTE socket option. The objective is to cover kernel paths that use the RTO_ONLINK flag in .flowi4_tos. This way we'll be able to safely remove this flag in the future by properly setting .flowi4_scope instead. With these selftests in place, we can make sure this won't introduce regressions. For more context, the final objective is to convert .flowi4_tos to dscp_t, to ensure that ECN bits don't influence route and fib-rule lookups (see commit `a410a0cf98` ("ipv6: Define dscp_t and stop taking ECN bits into account in fib6-rules")). These selftests only cover IPv4, as SO_DONTROUTE has no effect on IPv6 sockets. v2: - Use two different nettest options for setting SO_DONTROUTE either on the server or on the client socket. - Use the above feature to run a single 'nettest -B' instance per test (instead of having two nettest processes for server and client). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:43:56 +01:00
Guillaume Nault	ceec9f2724	selftests: fcnal: Test SO_DONTROUTE on raw and ping sockets. Use ping -r to test the kernel behaviour with raw and ping sockets having the SO_DONTROUTE option. Since ipv4_ping_novrf() is called with different values of net.ipv4.ping_group_range, then it tests both raw and ping sockets (ping uses ping sockets if its user ID belongs to ping_group_range and raw sockets otherwise). With both socket types, sending packets to a neighbour (on link) host, should work. When the host is behind a router, sending should fail. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:43:56 +01:00
Guillaume Nault	a431327c4f	selftests: fcnal: Test SO_DONTROUTE on UDP sockets. Use nettest --client-dontroute to test the kernel behaviour with UDP sockets having the SO_DONTROUTE option. Sending packets to a neighbour (on link) host, should work. When the host is behind a router, sending should fail. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:43:56 +01:00
Guillaume Nault	dd017c72dd	selftests: fcnal: Test SO_DONTROUTE on TCP sockets. Use nettest --{client,server}-dontroute to test the kernel behaviour with TCP sockets having the SO_DONTROUTE option. Sending packets to a neighbour (on link) host, should work. When the host is behind a router, sending should fail. Client and server sockets are tested independently, so that we can cover different TCP kernel paths. SO_DONTROUTE also affects the syncookies path. So ipv4_tcp_dontroute() is made to work with or without syncookies, to cover both paths. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:43:56 +01:00
Guillaume Nault	aeefbb574c	selftests: Add SO_DONTROUTE option to nettest. Add --client-dontroute and --server-dontroute options to nettest. They allow to set the SO_DONTROUTE option to the client and server sockets respectively. This will be used by the following patches to test the SO_DONTROUTE kernel behaviour with TCP and UDP. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:43:56 +01:00
Simon Horman	c1bc7d73c9	bonding: Always assign be16 value to vlan_proto The type of the vlan_proto field is __be16. And most users of the field use it as such. In the case of setting or testing the field for the special VLAN_N_VID value, host byte order is used. Which seems incorrect. It also seems somewhat odd to store a VLAN ID value in a field that is otherwise used to store Ether types. Address this issue by defining BOND_VLAN_PROTO_NONE, a big endian value. 0xffff was chosen somewhat arbitrarily. What is important is that it doesn't overlap with any valid VLAN Ether types. I don't believe the problems described above are a bug because VLAN_N_VID in both little-endian and big-endian byte order does not conflict with any supported VLAN Ether types in big-endian byte order. Reported by sparse as: .../bond_main.c:2857:26: warning: restricted __be16 degrades to integer .../bond_main.c:2863:20: warning: restricted __be16 degrades to integer .../bond_main.c:2939:40: warning: incorrect type in assignment (different base types) .../bond_main.c:2939:40: expected restricted __be16 [usertype] vlan_proto .../bond_main.c:2939:40: got int No functional changes intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:36:16 +01:00
David S. Miller	deb2e484ba	Merge branch 'net-handshake-fixes' Chuck Lever says: ==================== Bug fixes for net/handshake Please consider these for merge via net-next. Paolo observed that there is a possible leak of sock->file. I haven't looked into that yet, but it seems to be separate from the fixes in this series, so no need to hold these up. Changes since v2: - Address Paolo comment regarding handshake_dup() Changes since v1: - Rework "Fix handshake_dup() ref counting" - Unpin sock->file when a handshake is cancelled ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:24:08 +01:00
Chuck Lever	eefca7ec51	net/handshake: Enable the SNI extension to work properly Enable the upper layer protocol to specify the SNI peername. This avoids the need for tlshd to use a DNS lookup, which can return a hostname that doesn't match the incoming certificate's SubjectName. Fixes: `2fd5532044` ("net/handshake: Add a kernel API for requesting a TLSv1.3 handshake") Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:24:08 +01:00
Chuck Lever	f921bd4100	net/handshake: Unpin sock->file if a handshake is cancelled If user space never calls DONE, sock->file's reference count remains elevated. Enable sock->file to be freed eventually in this case. Reported-by: Jakub Kacinski <kuba@kernel.org> Fixes: `3b3009ea8a` ("net/handshake: Create a NETLINK service for handling handshake requests") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:24:08 +01:00
Chuck Lever	e36a93e172	net/handshake: handshake_genl_notify() shouldn't ignore @flags Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: `3b3009ea8a` ("net/handshake: Create a NETLINK service for handling handshake requests") Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:24:08 +01:00
Chuck Lever	7301034026	net/handshake: Fix uninitialized local variable trace_handshake_cmd_done_err() simply records the pointer in @req, so initializing it to NULL is sufficient and safe. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Fixes: `3b3009ea8a` ("net/handshake: Create a NETLINK service for handling handshake requests") Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:24:07 +01:00
Chuck Lever	2200c1a870	net/handshake: Fix handshake_dup() ref counting If get_unused_fd_flags() fails, we ended up calling fput(sock->file) twice. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Suggested-by: Paolo Abeni <pabeni@redhat.com> Fixes: `3b3009ea8a` ("net/handshake: Create a NETLINK service for handling handshake requests") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:24:07 +01:00
Chuck Lever	b16d76fe9a	net/handshake: Remove unneeded check from handshake_dup() handshake_req_submit() now verifies that the socket has a file. Fixes: `3b3009ea8a` ("net/handshake: Create a NETLINK service for handling handshake requests") Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 09:24:07 +01:00
Yang Li	0fae884756	ipvlan: Remove NULL check before dev_{put, hold} The call netdev_{put, hold} of dev_{put, hold} will check NULL, so there is no need to check before using dev_{put, hold}, remove it to silence the warning: ./drivers/net/ipvlan/ipvlan_core.c:559:3-11: WARNING: NULL check before dev_{put, hold} functions is not needed. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4930 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 08:56:57 +01:00
Subbaraya Sundeep	48c0db05a1	octeontx2-pf: mcs: Offload extended packet number(XPN) feature The macsec hardware block supports XPN cipher suites also. Hence added changes to offload XPN feature. Changes include configuring SecY policy to XPN cipher suite, Salt and SSCI values. 64 bit packet number is passed instead of 32 bit packet number. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 08:54:43 +01:00
Uwe Kleine-König	7f88efc816	net: samsung: sxgbe: Make sxgbe_drv_remove() return void sxgbe_drv_remove() returned zero unconditionally, so it can be converted to return void without losing anything. The upside is that it becomes more obvious in its callers that there is no error to handle. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-12 08:49:49 +01:00
Philipp Rosenberger	995585ecdf	net: enc28j60: Use threaded interrupt instead of workqueue The Microchip ENC28J60 SPI Ethernet driver schedules a work item from the interrupt handler because accesses to the SPI bus may sleep. On PREEMPT_RT (which forces interrupt handling into threads) this old-fashioned approach unnecessarily increases latency because an interrupt results in first waking the interrupt thread, then scheduling the work item. So, a double indirection to handle an interrupt. Avoid by converting the driver to modern threaded interrupt handling. Signed-off-by: Philipp Rosenberger <p.rosenberger@kunbus.com> Signed-off-by: Zhi Han <hanzhi09@gmail.com> [lukas: rewrite commit message, linewrap request_threaded_irq() call] Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Link: https://lore.kernel.org/r/342380d989ce26bc49f0e5d45fbb0416a5f7809f.1683606193.git.lukas@wunner.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-11 18:00:37 -07:00
Jakub Kicinski	bc88ba0cad	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes. No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-11 09:06:55 -07:00
Linus Torvalds	6e27831b91	Networking fixes for 6.4-rc2, including fixes from netfilter Current release - regressions: - mtk_eth_soc: fix NULL pointer dereference Previous releases - regressions: - core: - skb_partial_csum_set() fix against transport header magic value - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs(). - annotate sk->sk_err write from do_recvmmsg() - add vlan_get_protocol_and_depth() helper - netlink: annotate accesses to nlk->cb_running - netfilter: always release netdev hooks from notifier Previous releases - always broken: - core: deal with most data-races in sk_wait_event() - netfilter: fix possible bug_on with enable_hooks=1 - eth: bonding: fix send_peer_notif overflow - eth: xpcs: fix incorrect number of interfaces - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRcxawSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkv6wQAJgfOBlDAkZNKHzwtMuFiLxECeEMWY9h wJCyiq0qXnz9p5ZqjdmTmA8B+jUp9VkpgN5Z3lid5hXDfzDrvXL1KGZW4pc4ooz9 GUzrp0EUzO5UsyrlZRS9vJ9mbCGN5M1ZWtWH93g8OzGJPRnLs0Q/Tr4IFTBVKzVb GmJPy/ZYWYDjnvx3BgewRDuYeH3Rt9lsIt4Pxq/E+D8W3ypvVM0m3GvrO5eEzMeu EfeilAdmJGJUufeoGguKt0hheqILS3kNCjQO25XS2Lq1OqetnR/wqTwXaaVxL2du Eb2ca7wKkihDpl2l8bQ3ss6vqM0HEpZ63Y2PJaNBS8ASdLsMq4n2L6j2JMfT8hWY RG3nJS7F2UFLyYmCJjNL1/I+Z9XeMyFKnHORzHK1dAkMlhd+8NauKWAxdjlxMbxX p1msyTl54bG0g6FrU/zAirCWNAAZYCPdZG/XvA/2Jj9mdy64OlGlv/QdJvfjcx+C L6nkwZfwXU7QUwKeeTfP8abte2SLrXIxkJrnNEAntPnFOSmd16+/yvQ8JVlbWTMd JugJrSAIxjOglIr/1fsnUuV+Ab+JDYQv/wkoyzvtcY2tjhTAHzgmTwwSfeYiCTJE rEbjyVvVgMcLTUIk/R9QC5/k6nX/7/KRDHxPOMBX4boOsuA0ARVjzt8uKRvv/7cS dRV98RwvCKvD =MoPD -----END PGP SIGNATURE----- Merge tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter. Current release - regressions: - mtk_eth_soc: fix NULL pointer dereference Previous releases - regressions: - core: - skb_partial_csum_set() fix against transport header magic value - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs(). - annotate sk->sk_err write from do_recvmmsg() - add vlan_get_protocol_and_depth() helper - netlink: annotate accesses to nlk->cb_running - netfilter: always release netdev hooks from notifier Previous releases - always broken: - core: deal with most data-races in sk_wait_event() - netfilter: fix possible bug_on with enable_hooks=1 - eth: bonding: fix send_peer_notif overflow - eth: xpcs: fix incorrect number of interfaces - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register" * tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits) af_unix: Fix data races around sk->sk_shutdown. af_unix: Fix a data race of sk->sk_receive_queue->qlen. net: datagram: fix data-races in datagram_poll() net: mscc: ocelot: fix stat counter register values ipvlan:Fix out-of-bounds caused by unclear skb->cb docs: networking: fix x25-iface.rst heading & index order gve: Remove the code of clearing PBA bit tcp: add annotations around sk->sk_shutdown accesses net: add vlan_get_protocol_and_depth() helper net: pcs: xpcs: fix incorrect number of interfaces net: deal with most data-races in sk_wait_event() net: annotate sk->sk_err write from do_recvmmsg() netlink: annotate accesses to nlk->cb_running kselftest: bonding: add num_grat_arp test selftests: forwarding: lib: add netns support for tc rule handle stats get Documentation: bonding: fix the doc of peer_notif_delay bonding: fix send_peer_notif overflow net: ethernet: mtk_eth_soc: fix NULL pointer dereference selftests: nft_flowtable.sh: check ingress/egress chain too selftests: nft_flowtable.sh: monitor result file sizes ...	2023-05-11 08:42:47 -05:00
Linus Torvalds	691e1eee1b	media fixes for v6.4-rc2 -----BEGIN PGP SIGNATURE----- iQIyBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmRcg4gACgkQCF8+vY7k 4RWPGA/0CbjmuSvikL7h2Cjlusp1HAvjhcHyXElKuob87ZlMq248eEQjwmeYkefG Oc1q3dHJR52BHwMH/9RhRfwKxFcSjKtb3Ed8R492OxZgIg9urrpb8YA4Z1r4qMaU Ehd3AS1a8Fe8pLWNxG4cxcs7Nl5giMTnBORMsxcaLx4miRwkfoNWBCY7L50yjOGA l2OYht6sqZMI+CPc4I+C7cgAVCflJBnIGGBoKZI+HSRdMpN1Ndown/O2vc5bZbGh mw2MKgMr3+QUxjMm1kebHFeYIgLtg2p95/kjlQlMEqgQiIdhGIMpv3KqZjeRqZZW 4WAawUMwZEmPXV01OgQtQQT79LYZYVLTxJewm8nRMBLsBbbYNCLLExzVdfY2cxxt aXHYnmLsL2wIicQIdNtq9NuptlVo6+zJpBpFQR5BfXT2VhBT/JF7iB7y5m6B269u Ah5Sv1ygB6JDrkxxPzdsRSavwxT4tMx5bzIUUvcg9D9AyfXgF/jxxK6MYuALWOv0 YUZD1wH4XqoRKITqTfts8kvu9+jDkh9ruBHBwDM/i+v9CNyIAQaU32Yolit8IGgx ArrDiGigzbD2Z9e8NVZFJy+15PQgpCUBOUagU5w/eszhBo25n7N2a8EjNSHQEzYb 7IHHVzjZzizhMLdJSWNAFolgmiL+DcZnuhC79rep+EdG7FIp1Q== =XxmV -----END PGP SIGNATURE----- Merge tag 'media/v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - fix some unused-variable warning in mtk-mdp3 - ignore unused suspend operations in nxp - some driver fixes in rcar-vin * tag 'media/v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: platform: mtk-mdp3: work around unused-variable warning media: nxp: ignore unused suspend operations media: rcar-vin: Select correct interrupt mode for V4L2_FIELD_ALTERNATE media: rcar-vin: Fix NV12 size alignment media: rcar-vin: Gen3 can not scale NV12	2023-05-11 08:35:52 -05:00
Paolo Abeni	285b2a4695	Merge branch 'net-mvneta-reduce-size-of-tso-header-allocation' Russell King says: ==================== net: mvneta: reduce size of TSO header allocation With reference to https://forum.turris.cz/t/random-kernel-exceptions-on-hbl-tos-7-0/18865/ https://github.com/openwrt/openwrt/pull/12375#issuecomment-1528842334 It appears that mvneta attempts an order-6 allocation for the TSO header memory. While this succeeds early on in the system's life time, trying order-6 allocations later can result in failure due to memory fragmentation. Firstly, the reason it's so large is that we take the number of transmit descriptors, and allocate a TSO header buffer for each, and each TSO header is 256 bytes. The driver uses a simple mechanism to determine the address - it uses the transmit descriptor index as an index into the TSO header memory. (The first obvious question is: do there need to be this many? Won't each TSO header always have at least one bit of data to go with it? In other words, wouldn't the maximum number of TSO headers that a ring could accept be the number of ring entries divided by 2?) There is no real need for this memory to be an order-6 allocation, since nothing in hardware requires this buffer to be contiguous. Therefore, this series splits this order-6 allocation up into 32 order-1 allocations (8k pages on 4k page platforms), each giving 32 TSO headers per page. In order to do this, these patches: 1) fix a horrible transmit path error-cleanup bug - the existing code unmaps from the first descriptor that was allocated at interface bringup, not the first descriptor that the packet is using, resulting in the wrong descriptors being unmapped. 2) since xdp support was added, we now have buf->type which indicates what this transmit buffer contains. Use this to mark TSO header buffers. 3) get rid of IS_TSO_HEADER(), instead using buf->type to determine whether this transmit buffer needs to be DMA-unmapped. 4) move tso_build_hdr() into mvneta_tso_put_hdr() to keep all the TSO header building code together. 5) split the TSO header allocation into chunks of order-1 pages. This has now been tested by the Turris folk and has been found to fix the allocation error. ==================== Link: https://lore.kernel.org/r/ZFtuhJOC03qpASt2@shell.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-05-11 13:05:20 +02:00
Russell King (Oracle)	33f4cefb26	net: mvneta: allocate TSO header DMA memory in chunks Now that we no longer need to check whether the DMA address is within the TSO header DMA memory range for the queue, we can allocate the TSO header DMA memory in chunks rather than one contiguous order-6 chunk, which can stress the kernel's memory subsystems to allocate. Instead, use order-1 (8k) allocations, which will result in 32 order-1 pages containing 32 TSO headers. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-05-11 13:05:15 +02:00
Russell King (Oracle)	d41eb55576	net: mvneta: move tso_build_hdr() into mvneta_tso_put_hdr() Move tso_build_hdr() into mvneta_tso_put_hdr() so that all the TSO header building code is in one place. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-05-11 13:05:07 +02:00
Russell King (Oracle)	f00ba4f41a	net: mvneta: use buf->type to determine whether to dma-unmap Now that we use a different buffer type for TSO headers, we can use buf->type to determine whether the original buffer was DMA-mapped or not. The rules are: MVNETA_TYPE_XDP_TX - from a DMA pool, no unmap is required MVNETA_TYPE_XDP_NDO - dma_map_single()'d MVNETA_TYPE_SKB - normal skbuff, dma_map_single()'d MVNETA_TYPE_TSO - from the TSO buffer area This means we only need to call dma_unmap_single() on the XDP_NDO and SKB types of buffer, and we no longer need the private IS_TSO_HEADER() which relies on the TSO region being contiguously allocated. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-05-11 13:05:07 +02:00
Russell King (Oracle)	b0bd1b07c3	net: mvneta: mark mapped and tso buffers separately Mark dma-mapped skbs and TSO buffers separately, so we can use buf->type to identify their differences. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-05-11 13:05:07 +02:00
Russell King (Oracle)	fef99e840d	net: mvneta: fix transmit path dma-unmapping on error The transmit code assumes that the transmit descriptors that are used begin with the first descriptor in the ring, but this may not be the case. Fix this by providing a new function that dma-unmaps a range of numbered descriptor entries, and use that to do the unmapping. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-05-11 13:05:07 +02:00
David Morley	ccce324dab	tcp: make the first N SYN RTO backoffs linear Currently the SYN RTO schedule follows an exponential backoff scheme, which can be unnecessarily conservative in cases where there are link failures. In such cases, it's better to aggressively try to retransmit packets, so it takes routers less time to find a repath with a working link. We chose a default value for this sysctl of 4, to follow the macOS and IOS backoff scheme of 1,1,1,1,1,2,4,8, ... MacOS and IOS have used this backoff schedule for over a decade, since before this 2009 IETF presentation discussed the behavior: https://www.ietf.org/proceedings/75/slides/tcpm-1.pdf This commit makes the SYN RTO schedule start with a number of linear backoffs given by the following sysctl: * tcp_syn_linear_timeouts This changes the SYN RTO scheme to be: init_rto_val for tcp_syn_linear_timeouts, exp backoff starting at init_rto_val For example if init_rto_val = 1 and tcp_syn_linear_timeouts = 2, our backoff scheme would be: 1, 1, 1, 2, 4, 8, 16, ... Signed-off-by: David Morley <morleyd@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Tested-by: David Morley <morleyd@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230509180558.2541885-1-morleyd.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-05-11 10:31:16 +02:00
M Chetan Kumar	8a690c1511	net: wwan: iosm: clean up unused struct members Below members are unused. - td_tag member defined in struct ipc_pipe. - adb_finish_timer & params defined in struct iosm_mux. Remove it to avoid unexpected usage. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/92ee483d79dfc871ed7408da8fec60b395ff3a9c.1683649868.git.m.chetan.kumar@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:15:01 -07:00
M Chetan Kumar	c930192572	net: wwan: iosm: remove unused enum definition ipc_time_unit enum is defined but not used. Remove it to avoid unexpected usage. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/8295a6138f13c686590ee4021384ee992f717408.1683649868.git.m.chetan.kumar@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:15:01 -07:00
M Chetan Kumar	796fb97a8c	net: wwan: iosm: remove unused macro definition IOSM_IF_ID_PAYLOAD is defined but not used. Remove it to avoid unexpected usage. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/0697e811cb7f10b4fd8f99e66bda1329efdd3d1d.1683649868.git.m.chetan.kumar@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:15:01 -07:00
Jakub Kicinski	cceac92678	netfilter pull request 23-05-10 -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmRbUnQACgkQ1V2XiooU IOQNVw//eoCiid3J4TuNdzHaBHXUlZvln1n5Z6K5fIz5ytrY1T7oKC8Y9StkzzWR 29ToFLOJt1iRAcgxsghWRIwUzNwpuUdqgd9cUZMHMQxT0BJItp6FXUql2+1LkF/I b/gnnb90zyE7lBS/VSRyOiqMiJlP+Som22d7Nn5k2KfTYEdXKwfzjsWAu3W3Sb0s Lv/MA9DE42qcwiZubmFmDtOtAunPJFZm3HgkcAVeXoNkBDrSfkvxLIMYG6VfFNhQ AkKMyzX293wpwVxfOuQfJr4QVlxAgOQUko+FqajoWMBtfA3yldZjJ8RC7c9Af9uI ciOP11vHBCG84KrTabC5kdqOcvadreDiM/oIvk57ztQhCr3e+po+vIz6Cv4p9I30 m5GXfgbtMRl6hM2S5lrRc5fNRkYJHE4aNvesFTGaLpK3LogusH1E9mH5jdjnbU42 TwkIe250qJJelNn9ZxS5Jt0BgyogNfeiA9lOXmaQBpYmwIahrjDf0g8z2I85nPhF PDukjSXsxi4uHwpSF5wFrlqkAPiEX4vC95uSUTVbbBgZrHhNJdGgu4FJSHRM4mVo 6Awxk2O2bvcDpXEfTBdD4EJF71bZ/aH2i8ddU1oupIl88O01TWTtkO4SYHqMX+tC fPnpGzBN8KJvHgeFxO4p0v1oelv2RtiITv1gi7YJOnq/+Vd2yy0= =HmfP -----END PGP SIGNATURE----- Merge tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter updates for net The following patchset contains Netfilter fixes for net: 1) Fix UAF when releasing netnamespace, from Florian Westphal. 2) Fix possible BUG_ON when nf_conntrack is enabled with enable_hooks, from Florian Westphal. 3) Fixes for nft_flowtable.sh selftest, from Boris Sukholitko. 4) Extend nft_flowtable.sh selftest to cover integration with ingress/egress hooks, from Florian Westphal. * tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: selftests: nft_flowtable.sh: check ingress/egress chain too selftests: nft_flowtable.sh: monitor result file sizes selftests: nft_flowtable.sh: wait for specific nc pids selftests: nft_flowtable.sh: no need for ps -x option selftests: nft_flowtable.sh: use /proc for pid checking netfilter: conntrack: fix possible bug_on with enable_hooks=1 netfilter: nf_tables: always release netdev hooks from notifier ==================== Link: https://lore.kernel.org/r/20230510083313.152961-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:08:58 -07:00
Jakub Kicinski	33dcee99e0	Merge branch 'af_unix-fix-two-data-races-reported-by-kcsan' Kuniyuki Iwashima says: ==================== af_unix: Fix two data races reported by KCSAN. KCSAN reported data races around these two fields for AF_UNIX sockets. * sk->sk_receive_queue->qlen * sk->sk_shutdown Let's annotate them properly. ==================== Link: https://lore.kernel.org/r/20230510003456.42357-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:06:55 -07:00
Kuniyuki Iwashima	e1d09c2c2f	af_unix: Fix data races around sk->sk_shutdown. KCSAN found a data race around sk->sk_shutdown where unix_release_sock() and unix_shutdown() update it under unix_state_lock(), OTOH unix_poll() and unix_dgram_poll() read it locklessly. We need to annotate the writes and reads with WRITE_ONCE() and READ_ONCE(). BUG: KCSAN: data-race in unix_poll / unix_release_sock write to 0xffff88800d0f8aec of 1 bytes by task 264 on cpu 0: unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631 unix_release+0x59/0x80 net/unix/af_unix.c:1042 __sock_release+0x7d/0x170 net/socket.c:653 sock_close+0x19/0x30 net/socket.c:1397 __fput+0x179/0x5e0 fs/file_table.c:321 ____fput+0x15/0x20 fs/file_table.c:349 task_work_run+0x116/0x1a0 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline] syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297 do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc read to 0xffff88800d0f8aec of 1 bytes by task 222 on cpu 1: unix_poll+0xa3/0x2a0 net/unix/af_unix.c:3170 sock_poll+0xcf/0x2b0 net/socket.c:1385 vfs_poll include/linux/poll.h:88 [inline] ep_item_poll.isra.0+0x78/0xc0 fs/eventpoll.c:855 ep_send_events fs/eventpoll.c:1694 [inline] ep_poll fs/eventpoll.c:1823 [inline] do_epoll_wait+0x6c4/0xea0 fs/eventpoll.c:2258 __do_sys_epoll_wait fs/eventpoll.c:2270 [inline] __se_sys_epoll_wait fs/eventpoll.c:2265 [inline] __x64_sys_epoll_wait+0xcc/0x190 fs/eventpoll.c:2265 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc value changed: 0x00 -> 0x03 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 222 Comm: dbus-broker Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: `3c73419c09` ("af_unix: fix 'poll for write'/ connected DGRAM sockets") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:06:53 -07:00
Kuniyuki Iwashima	679ed006d4	af_unix: Fix a data race of sk->sk_receive_queue->qlen. KCSAN found a data race of sk->sk_receive_queue->qlen where recvmsg() updates qlen under the queue lock and sendmsg() checks qlen under unix_state_sock(), not the queue lock, so the reader side needs READ_ONCE(). BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_wait_for_peer write (marked) to 0xffff888019fe7c68 of 4 bytes by task 49792 on cpu 0: __skb_unlink include/linux/skbuff.h:2347 [inline] __skb_try_recv_from_queue+0x3de/0x470 net/core/datagram.c:197 __skb_try_recv_datagram+0xf7/0x390 net/core/datagram.c:263 __unix_dgram_recvmsg+0x109/0x8a0 net/unix/af_unix.c:2452 unix_dgram_recvmsg+0x94/0xa0 net/unix/af_unix.c:2549 sock_recvmsg_nosec net/socket.c:1019 [inline] ____sys_recvmsg+0x3a3/0x3b0 net/socket.c:2720 ___sys_recvmsg+0xc8/0x150 net/socket.c:2764 do_recvmmsg+0x182/0x560 net/socket.c:2858 __sys_recvmmsg net/socket.c:2937 [inline] __do_sys_recvmmsg net/socket.c:2960 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0x153/0x170 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc read to 0xffff888019fe7c68 of 4 bytes by task 49793 on cpu 1: skb_queue_len include/linux/skbuff.h:2127 [inline] unix_recvq_full net/unix/af_unix.c:229 [inline] unix_wait_for_peer+0x154/0x1a0 net/unix/af_unix.c:1445 unix_dgram_sendmsg+0x13bc/0x14b0 net/unix/af_unix.c:2048 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg+0x148/0x160 net/socket.c:747 ____sys_sendmsg+0x20e/0x620 net/socket.c:2503 ___sys_sendmsg+0xc6/0x140 net/socket.c:2557 __sys_sendmmsg+0x11d/0x370 net/socket.c:2643 __do_sys_sendmmsg net/socket.c:2672 [inline] __se_sys_sendmmsg net/socket.c:2669 [inline] __x64_sys_sendmmsg+0x58/0x70 net/socket.c:2669 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc value changed: 0x0000000b -> 0x00000001 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 49793 Comm: syz-executor.0 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:06:53 -07:00
Eric Dumazet	5bca1d081f	net: datagram: fix data-races in datagram_poll() datagram_poll() runs locklessly, we should add READ_ONCE() annotations while reading sk->sk_err, sk->sk_shutdown and sk->sk_state. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230509173131.3263780-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-05-10 19:06:49 -07:00
Linus Torvalds	80e62bc848	MAINTAINERS: re-sort all entries and fields It's been a few years since we've sorted this thing, and the end result is that we've added MAINTAINERS entries in the wrong order, and a number of entries have their fields in non-canonical order too. So roll this boulder up the hill one more time by re-running ./scripts/parse-maintainers.pl --order on it. This file ends up being fairly painful for merge conflicts even normally, since unlike almost all other kernel files it's one of those "everybody touches the same thing", and re-ordering all entries is only going to make that worse. But the alternative is to never do it at all, and just let it all rot.. The rc2 week is likely the quietest and least painful time to do this. Requested-by: Randy Dunlap <rdunlap@infradead.org> Requested-by: Joe Perches <joe@perches.com> # "Please use --order" Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-05-10 20:04:04 -05:00
Linus Torvalds	d295b66a7b	\n -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmRcFLAACgkQnJ2qBz9k QNmSuQf+P+hFOHVpCb83S4TrNsfPC9M7Pv/M1Rw926p1gESNLez1ElAx2B35OIBZ y4ijqrSHgXbCsC3PvbxR9YfBPJZwIhtmKwjPC+9KlhAFB536Bvt/nt34fjRCWJiS nABnexegkirk7EQufAGVZQ1gtm1p0qSCA1B/l5AFl4KeLIin31U/08F2eE1OAo3K TmxhXzlC+JRXjf/X/X8mzJ8nz/hY5039BDvs7EpVdOsZcLFRQcCUygUCrOdV94uI 1wyzMoGNaiIsZHpHPf3NtphUoE9IFpZQJGXMTR4ehBgFYkkOHYxsjW+wn8YOmaks qzRRXb9w0WdgAXChDGu1xYXV6Wpjiw== =kqSe -----END PGP SIGNATURE----- Merge tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull inotify fix from Jan Kara: "A fix for possibly reporting invalid watch descriptor with inotify event" * tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: inotify: Avoid reporting event with invalid wd	2023-05-10 17:07:42 -05:00
Linus Torvalds	2a78769da3	gfs2 fix - Fix a NULL pointer dereference when mounting corrupted filesystems. -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmRbtSYUHGFncnVlbmJh QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTrQSg/6A7xD25DpU1CSL18CEKMxnViXPNLP IYiTtkByCsRkgQKR4KMtVgPD/lEyWbhpuZJzSF4mZkslpnngjL/o+Rj4Sds9GvPL +66CSJAqjpJUjgGDvAoiJcZYOejRyqZLCHzKF54h0oJ3k9JCd3xN86EX8vKGoc2e F/As+E1Wnf0PP4jiERDyFhhg95MNxFuNVJ/JJ7mwtWpumx+60SpGzzUp575AHcoe 4zHkVHlj6FAz2G8l8NADYAc0wMuhQ05g19psHzRDr2bB9zHBy9kWzXN0MYrhRvuY HuQ3CNDbXCJPXFjrNEmLV0f7Y8yCk/YNrYPbo4zV+AW0l7NzesqzrwkP/qgNfySD pnQJ/mQ76P/h728fln0pYvO8GnFUFoKKoCR27jH4F/Mntr4hgR4FZNDhvK6BMQNc PpipJcXWT8AUY45Sg3TLKPEhjp37K+abiMKfyTEFQRvLT83O6WyCUx2HYpZI7l87 /aTIsidlYkb2ZSeLeXEfBplRXwF7NQNvvRJCGQKHAN7UbTB1o2RVpC5bn6UIghb1 eQQZObs4AUfmzHthF2OaeJy08AiFidbOVLmbpZuj2iOWbJzmjJea8YT/GdntxhMT swmkHJqdmZEYoSKKvDmemT6l7qtOwijg7sTQ/d2D77lk8QPPIHdUw6g1OVvMqoNE oESpIA4O2fshlpU= =/s3O -----END PGP SIGNATURE----- Merge tag 'gfs2-v6.3-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Andreas Gruenbacher: - Fix a NULL pointer dereference when mounting corrupted filesystems * tag 'gfs2-v6.3-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Don't deref jdesc in evict	2023-05-10 16:48:35 -05:00
Bob Peterson	504a10d9e4	gfs2: Don't deref jdesc in evict On corrupt gfs2 file systems the evict code can try to reference the journal descriptor structure, jdesc, after it has been freed and set to NULL. The sequence of events is: init_journal() ... fail_jindex: gfs2_jindex_free(sdp); <------frees journals, sets jdesc = NULL if (gfs2_holder_initialized(&ji_gh)) gfs2_glock_dq_uninit(&ji_gh); fail: iput(sdp->sd_jindex); <--references jdesc in evict_linked_inode evict() gfs2_evict_inode() evict_linked_inode() ret = gfs2_trans_begin(sdp, 0, sdp->sd_jdesc->jd_blocks); <------references the now freed/zeroed sd_jdesc pointer. The call to gfs2_trans_begin is done because the truncate_inode_pages call can cause gfs2 events that require a transaction, such as removing journaled data (jdata) blocks from the journal. This patch fixes the problem by adding a check for sdp->sd_jdesc to function gfs2_evict_inode. In theory, this should only happen to corrupt gfs2 file systems, when gfs2 detects the problem, reports it, then tries to evict all the system inodes it has read in up to that point. Reported-by: Yang Lan <lanyang0908@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-05-10 17:15:18 +02:00
Linus Torvalds	ad2fd53a78	platform-drivers-x86 for v6.4-2 Highlights: - thinkpad_acpi: Fix profile (performance/bal/low-power) regression on T490 - misc. other small fixes / hw-id additions The following is an automated git shortlog grouped by driver: hp-wmi: - add micmute to hp_wmi_keymap struct intel_scu_pcidrv: - Add back PCI ID for Medfield platform/mellanox: - fix potential race in mlxbf-tmfifo driver platform/x86/intel-uncore-freq: - Return error on write frequency thinkpad_acpi: - Add profile force ability - Fix platform profiles on T490 touchscreen_dmi: - Add info for the Dexp Ursus KX210i - Add upside-down quirk for GDIX1002 ts on the Juno Tablet -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmRbcQ4UHGhkZWdvZWRl QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yFnQf9G83wo2B5SfjN7V7viYHEGB4InfQL +MbhwP+Efe++BkxwAzVMcVLSuYOsbqcrRBCiQkBVCyAF14LVVH3ymnn/upoqz9l4 K4JUZorGIEPgl0SpGP+41j+P93XYfcfMYPRetZA8ErBpitSfHVf9oAG8vTgaemda Qxo8DWs4R1WGdg8KZ7icZ7shXiyzIeGBnQB/5EzpBrv6v6EmiJsUZh5ZZASOidrg IXTp8LgLhkFbIQBgHoOyHoM7puRI7MRbxu5YEJ+wKumFi4KmuTET5iDlZFpmqvoH eAU/JUWIsQ3I1QwTKFSLFYkJZCZrIIJHew1Lvs9F+/x1c2vs1QjOG8RoEg== =SEuW -----END PGP SIGNATURE----- Merge tag 'platform-drivers-x86-v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Nothing special to report just various small fixes: - thinkpad_acpi: Fix profile (performance/bal/low-power) regression on T490 - misc other small fixes / hw-id additions" * tag 'platform-drivers-x86-v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/mellanox: fix potential race in mlxbf-tmfifo driver platform/x86: touchscreen_dmi: Add info for the Dexp Ursus KX210i platform/x86: touchscreen_dmi: Add upside-down quirk for GDIX1002 ts on the Juno Tablet platform/x86: thinkpad_acpi: Add profile force ability platform/x86: thinkpad_acpi: Fix platform profiles on T490 platform/x86: hp-wmi: add micmute to hp_wmi_keymap struct platform/x86/intel-uncore-freq: Return error on write frequency platform/x86: intel_scu_pcidrv: Add back PCI ID for Medfield	2023-05-10 09:36:42 -05:00
Colin Foster	cdc2e28e21	net: mscc: ocelot: fix stat counter register values Commit `d4c3676507` ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset") organized the stats counters for Ocelot chips, namely the VSC7512 and VSC7514. A few of the counter offsets were incorrect, and were caught by this warning: WARNING: CPU: 0 PID: 24 at drivers/net/ethernet/mscc/ocelot_stats.c:909 ocelot_stats_init+0x1fc/0x2d8 reg 0x5000078 had address 0x220 but reg 0x5000079 has address 0x214, bulking broken! Fix these register offsets. Fixes: `d4c3676507` ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset") Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-10 12:11:18 +01:00
Ilia.Gavrilov	059fa49202	sctp: fix a potential OOB access in sctp_sched_set_sched() The 'sched' index value must be checked before accessing an element of the 'sctp_sched_ops' array. Otherwise, it can lead to OOB access. Note that it's harmless since the 'sched' parameter is checked before calling 'sctp_sched_set_sched'. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-10 12:10:15 +01:00
wuych	6096bc0555	net: liquidio: lio_vf_main: Remove unnecessary (void) conversions Pointer variables of void type do not require type cast. Signed-off-by: wuych <yunchuan@nfschina.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-10 11:33:03 +01:00
Subbaraya Sundeep	bd9424efc4	macsec: Use helper macsec_netdev_priv for offload drivers Now macsec on top of vlan can be offloaded to macsec offloading devices so that VLAN tag is sent in clear text on wire i.e, packet structure is DMAC\|SMAC\|VLAN\|SECTAG. Offloading devices can simply enable NETIF_F_HW_MACSEC feature in netdev->vlan_features for this to work. But the logic in offloading drivers to retrieve the private structure from netdev needs to be changed to check whether the netdev received is real device or a vlan device and get private structure accordingly. This patch changes the offloading drivers to use helper macsec_netdev_priv instead of netdev_priv. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-10 11:32:09 +01:00
t.feng	90cbed5247	ipvlan:Fix out-of-bounds caused by unclear skb->cb If skb enqueue the qdisc, fq_skb_cb(skb)->time_to_send is changed which is actually skb->cb, and IPCB(skb_in)->opt will be used in __ip_options_echo. It is possible that memcpy is out of bounds and lead to stack overflow. We should clear skb->cb before ip_local_out or ip6_local_out. v2: 1. clean the stack info 2. use IPCB/IP6CB instead of skb->cb crash on stable-5.10(reproduce in kasan kernel). Stack info: [ 2203.651571] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x589/0x800 [ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task swapper/3/0 [ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1 [ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014 [ 2203.655475] Call Trace: [ 2203.655481] <IRQ> [ 2203.655501] dump_stack+0x9c/0xd3 [ 2203.655514] print_address_description.constprop.0+0x19/0x170 [ 2203.655530] __kasan_report.cold+0x6c/0x84 [ 2203.655586] kasan_report+0x3a/0x50 [ 2203.655594] check_memory_region+0xfd/0x1f0 [ 2203.655601] memcpy+0x39/0x60 [ 2203.655608] __ip_options_echo+0x589/0x800 [ 2203.655654] __icmp_send+0x59a/0x960 [ 2203.655755] nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4] [ 2203.655763] reject_tg+0x77/0x1bf [ipt_REJECT] [ 2203.655772] ipt_do_table+0x691/0xa40 [ip_tables] [ 2203.655821] nf_hook_slow+0x69/0x100 [ 2203.655828] __ip_local_out+0x21e/0x2b0 [ 2203.655857] ip_local_out+0x28/0x90 [ 2203.655868] ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan] [ 2203.655931] ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan] [ 2203.655967] ipvlan_queue_xmit+0xb3/0x190 [ipvlan] [ 2203.655977] ipvlan_start_xmit+0x2e/0xb0 [ipvlan] [ 2203.655984] xmit_one.constprop.0+0xe1/0x280 [ 2203.655992] dev_hard_start_xmit+0x62/0x100 [ 2203.656000] sch_direct_xmit+0x215/0x640 [ 2203.656028] __qdisc_run+0x153/0x1f0 [ 2203.656069] __dev_queue_xmit+0x77f/0x1030 [ 2203.656173] ip_finish_output2+0x59b/0xc20 [ 2203.656244] __ip_finish_output.part.0+0x318/0x3d0 [ 2203.656312] ip_finish_output+0x168/0x190 [ 2203.656320] ip_output+0x12d/0x220 [ 2203.656357] __ip_queue_xmit+0x392/0x880 [ 2203.656380] __tcp_transmit_skb+0x1088/0x11c0 [ 2203.656436] __tcp_retransmit_skb+0x475/0xa30 [ 2203.656505] tcp_retransmit_skb+0x2d/0x190 [ 2203.656512] tcp_retransmit_timer+0x3af/0x9a0 [ 2203.656519] tcp_write_timer_handler+0x3ba/0x510 [ 2203.656529] tcp_write_timer+0x55/0x180 [ 2203.656542] call_timer_fn+0x3f/0x1d0 [ 2203.656555] expire_timers+0x160/0x200 [ 2203.656562] run_timer_softirq+0x1f4/0x480 [ 2203.656606] __do_softirq+0xfd/0x402 [ 2203.656613] asm_call_irq_on_stack+0x12/0x20 [ 2203.656617] </IRQ> [ 2203.656623] do_softirq_own_stack+0x37/0x50 [ 2203.656631] irq_exit_rcu+0x134/0x1a0 [ 2203.656639] sysvec_apic_timer_interrupt+0x36/0x80 [ 2203.656646] asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 2203.656654] RIP: 0010:default_idle+0x13/0x20 [ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08 [ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256 [ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX: ffffffffaf290191 [ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI: ffff88811a3c4f60 [ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88811a3c4f63 [ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12: 0000000000000003 [ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15: 0000000000000000 [ 2203.656729] default_idle_call+0x5a/0x150 [ 2203.656735] cpuidle_idle_call+0x1c6/0x220 [ 2203.656780] do_idle+0xab/0x100 [ 2203.656786] cpu_startup_entry+0x19/0x20 [ 2203.656793] secondary_startup_64_no_verify+0xc2/0xcb [ 2203.657409] The buggy address belongs to the page: [ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11a388 [ 2203.658665] flags: 0x17ffffc0001000(reserved\|node=0\|zone=2\|lastcpupid=0x1fffff) [ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208 0000000000000000 [ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 [ 2203.658686] page dumped because: kasan: bad access detected To reproduce(ipvlan with IPVLAN_MODE_L3): Env setting: ======================================================= modprobe ipvlan ipvlan_default_mode=1 sysctl net.ipv4.conf.eth0.forwarding=1 iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j MASQUERADE ip link add gw link eth0 type ipvlan ip -4 addr add 20.0.0.254/24 dev gw ip netns add net1 ip link add ipv1 link eth0 type ipvlan ip link set ipv1 netns net1 ip netns exec net1 ip link set ipv1 up ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1 ip netns exec net1 route add default gw 20.0.0.254 ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10% ifconfig gw up iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with icmp-port-unreachable ======================================================= And then excute the shell(curl any address of eth0 can reach): for((i=1;i<=100000;i++)) do ip netns exec net1 curl x.x.x.x:8888 done ======================================================= Fixes: `2ad7bf3638` ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by: "t.feng" <fengtao40@huawei.com> Suggested-by: Florian Westphal <fw@strlen.de> Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-05-10 10:33:06 +01:00

1 2 3 4 5 ...

1185285 Commits