linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-23 12:42:02 +00:00

Author	SHA1	Message	Date
Jakub Kicinski	bf9b85562a	Merge branch 'net-devlink-move-netdev-notifier-block-to-dest-namespace-during-reload' Jiri Pirko says: ==================== net: devlink: move netdev notifier block to dest namespace during reload Patch #1 is just a dependency of patch #2, which is the actual fix. ==================== Link: https://lore.kernel.org/r/20221108132208.938676-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-09 13:46:05 -08:00
Jiri Pirko	15feb56e30	net: devlink: move netdev notifier block to dest namespace during reload The notifier block tracking netdev changes in devlink is registered during devlink_alloc() per-net, it is then unregistered in devlink_free(). When devlink moves from net namespace to another one, the notifier block needs to move along. Fix this by adding forgotten call to move the block. Reported-by: Ido Schimmel <idosch@idosch.org> Fixes: `02a68a47ea` ("net: devlink: track netdev with devlink_port assigned") Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-09 13:45:59 -08:00
Jiri Pirko	3e52fba03a	net: introduce a helper to move notifier block to different namespace Currently, net_dev() netdev notifier variant follows the netdev with per-net notifier from namespace to namespace. This is implemented by move_netdevice_notifiers_dev_net() helper. For devlink it is needed to re-register per-net notifier during devlink reload. Introduce a new helper called move_netdevice_notifier_net() and share the unregister/register code with existing move_netdevice_notifiers_dev_net() helper. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-09 13:45:59 -08:00
Michal Jaron	0e710a3ffd	iavf: Fix VF driver counting VLAN 0 filters VF driver mistakenly counts VLAN 0 filters, when no PF driver counts them. Do not count VLAN 0 filters, when VLAN_V2 is engaged. Counting those filters in, will affect filters size by -1, when sending batched VLAN addition message. Fixes: `968996c070` ("iavf: Fix VLAN_V2 addition/rejection") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-11-09 13:20:55 -08:00
Norbert Zulinski	f23df5220d	ice: Fix spurious interrupt during removal of trusted VF Previously, during removal of trusted VF when VF is down there was number of spurious interrupt equal to number of queues on VF. Add check if VF already has inactive queues. If VF is disabled and has inactive rx queues then do not disable rx queues. Add check in ice_vsi_stop_tx_ring if it's VF's vsi and if VF is disabled. Fixes: `efe4186000` ("ice: Fix memory corruption in VF driver") Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-11-09 13:20:38 -08:00
Linus Torvalds	f67dd6ce07	slab fixes for 6.1-rc4 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEjUuTAak14xi+SF7M4CHKc/GJqRAFAmNrulwACgkQ4CHKc/GJ qRDGWwf/bqkCffS+Eg8p3wrGEbhWb1pOWnshcPl9EttSlclIfwaby5+kHTjeKpGR r3nt2cRAtWH3gUbU32352TJJ97oobasFHk3aE7xorHYTQ5HVAycwiHi+6BqcEcNH MyH7rcOAnKV1GeE1NnX99CeOtCA0wOaO/kCAn9y1QvSifoxKaiixBodoov4CHuSt PPXcJU3Rgyo8pDzFya3BAScayTTNkr1MU18iacJwndhAyjWolL4tlVqoLgVsi/TA wHb80Moj0iPyEioxHW7OHLkoapCYr4mfB3AUUY2t91ZciFQEKfihmki2KJw2VOg5 XBU1iNezxMJhteNJc6JqXr90nsriAw== =p9yC -----END PGP SIGNATURE----- Merge tag 'slab-for-6.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: "Most are small fixups as described below. The !CONFIG_TRACING fix is a bit bigger and would normally be done in the next merge window as part of upcoming hardening changes. But we realized it can make the kmalloc waste tracking introduced in this window inaccurate, so decided to go with it now. Summary: - Remove !CONFIG_TRACING kmalloc() wrappers intended to save a function call, due to incompatilibity with recently introduced wasted space tracking and planned hardening changes. - A tracing parameter regression fix, by Kees Cook. - Two kernel-doc warning fixups, by Lukas Bulwahn and myself * tag 'slab-for-6.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm, slab: remove duplicate kernel-doc comment for ksize() mm/slab_common: Restore passing "caller" for tracing mm/slab: remove !CONFIG_TRACING variants of kmalloc_[node_]trace() mm/slab_common: repair kernel-doc for __ksize()	2022-11-09 13:07:50 -08:00
Roi Dayan	7f1a6d4b9e	net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actions esw_attr is only allocated if namespace is fdb. BUG: KASAN: slab-out-of-bounds in parse_tc_actions+0xdc6/0x10e0 [mlx5_core] Write of size 4 at addr ffff88815f185b04 by task tc/2135 CPU: 5 PID: 2135 Comm: tc Not tainted 6.1.0-rc2+ #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x57/0x7d print_report+0x170/0x471 ? parse_tc_actions+0xdc6/0x10e0 [mlx5_core] kasan_report+0xbc/0xf0 ? parse_tc_actions+0xdc6/0x10e0 [mlx5_core] parse_tc_actions+0xdc6/0x10e0 [mlx5_core] Fixes: `94d651739e` ("net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:43 -08:00
Roi Dayan	f4f4096b41	net/mlx5e: E-Switch, Fix comparing termination table instance The pkt_reformat pointer being saved under flow_act and not dest attribute in the termination table instance. Fix the comparison pointers. Also fix returning success if one pkt_reformat pointer is null and the other is not. Fixes: `249ccc3c95` ("net/mlx5e: Add support for offloading traffic from uplink to uplink") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:43 -08:00
Jianbo Liu	9e06430841	net/mlx5e: TC, Fix wrong rejection of packet-per-second policing In the bellow commit, we added support for PPS policing without removing the check which block offload of such cases. Fix it by removing this check. Fixes: `a8d52b024d` ("net/mlx5e: TC, Support offloading police action") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:43 -08:00
Roi Dayan	08912ea799	net/mlx5e: Fix tc acts array not to be dependent on enum order The tc acts array should not be dependent on kernel internal flow action id enum. Fix the array initialization. Fixes: `fad5479069` ("net/mlx5e: Add tc action infrastructure") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:43 -08:00
Maxim Mikityanskiy	8d4b475e9d	net/mlx5e: Fix usage of DMA sync API DMA sync functions should use the same direction that was used by DMA mapping. Use DMA_BIDIRECTIONAL for XDP_TX from regular RQ, which reuses the same mapping that was used for RX, and DMA_TO_DEVICE for XDP_TX from XSK RQ and XDP_REDIRECT, which establish a new mapping in this direction. On the RX side, use the same direction that was used when setting up the mapping (DMA_BIDIRECTIONAL for XDP, DMA_FROM_DEVICE otherwise). Also don't skip sync for device when establishing a DMA_FROM_DEVICE mapping for RX, as some architectures (ARM) may require invalidating caches before the device can use the mapping. It doesn't break the bugfix made in commit `0b7cfa4082` ("net/mlx5e: Fix page DMA map/unmap attributes"), since the bug happened on unmap. Fixes: `0b7cfa4082` ("net/mlx5e: Fix page DMA map/unmap attributes") Fixes: `b5503b994e` ("net/mlx5e: XDP TX forwarding support") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:43 -08:00
Maxim Mikityanskiy	f9c955b4fe	net/mlx5e: Add missing sanity checks for max TX WQE size The commit cited below started using the firmware capability for the maximum TX WQE size. This commit adds an important check to verify that the driver doesn't attempt to exceed this capability, and also restores another check mistakenly removed in the cited commit (a WQE must not exceed the page size). Fixes: `c27bd1718c` ("net/mlx5e: Read max WQEBBs on the SQ from firmware") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:43 -08:00
Shay Drory	7d167b4a4c	net/mlx5: fw_reset: Don't try to load device in case PCI isn't working In case PCI reads fail after unload, there is no use in trying to load the device. Fixes: `5ec697446f` ("net/mlx5: Add support for devlink reload action fw activate") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:43 -08:00
Chris Mi	e12de39c07	net/mlx5: E-switch, Set to legacy mode if failed to change switchdev mode No need to rollback to the other mode because probably will fail again. Just set to legacy mode and clear fdb table created flag. So that fdb table will not be cleared again. Fixes: `f019679ea5` ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:42 -08:00
Roy Novich	2808b37b59	net/mlx5: Allow async trigger completion execution on single CPU systems For a single CPU system, the kernel thread executing mlx5_cmd_flush() never releases the CPU but calls down_trylock(&cmd→sem) in a busy loop. On a single processor system, this leads to a deadlock as the kernel thread which executes mlx5_cmd_invoke() never gets scheduled. Fix this, by adding the cond_resched() call to the loop, allow the command completion kernel thread to execute. Fixes: `8e715cd613` ("net/mlx5: Set command entry semaphore up once got index free") Signed-off-by: Alexander Schmidt <alexschm@de.ibm.com> Signed-off-by: Roy Novich <royno@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:42 -08:00
Vlad Buslov	15f8f16895	net/mlx5: Bridge, verify LAG state when adding bond to bridge Mlx5 LAG is initialized asynchronously on a workqueue which means that for a brief moment after setting mlx5 UL representors as lower devices of a bond netdevice the LAG itself is not fully initialized in the driver. When adding such bond device to a bridge mlx5 bridge code will not consider it as offload-capable, skip creating necessary bookkeeping and fail any further bridge offload-related commands with it (setting VLANs, offloading FDBs, etc.). In order to make the error explicit during bridge initialization stage implement the code that detects such condition during NETDEV_PRECHANGEUPPER event and returns an error. Fixes: `ff9b752146` ("net/mlx5: Bridge, support LAG") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2022-11-09 10:30:42 -08:00
Jakub Kicinski	154ba79c9f	genetlink: correctly begin the iteration over policies The return value from genl_op_iter_init() only tells us if there are any policies but to begin the iteration (and therefore load the first entry) we need to call genl_op_iter_next(). Note that it's safe to call genl_op_iter_next() on a family with no ops, it will just return false. This may lead to various crashes, a warning in netlink_policy_dump_get_policy_idx() when policy is not found or.. no problem at all if the kmalloc'ed memory happens to be zeroed. Fixes: `b502b3185c` ("genetlink: use iterator in the op to policy map dumping") Link: https://lore.kernel.org/r/20221108204128.330287-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-09 10:26:51 -08:00
David S. Miller	27c064ae14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for net: 1) Fix deadlock in nfnetlink due to missing mutex release in error path, from Ziyang Xuan. 2) Clean up pending autoload module list from nf_tables_exit_net() path, from Shigeru Yoshida. 3) Fixes for the netfilter's reverse path selftest, from Phil Sutter. All of these bugs have been around for several releases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 14:57:42 +00:00
David S. Miller	3ca6c3b43c	rxrpc changes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmNq0Q8ACgkQ+7dXa6fL C2toRxAAmmvce10i3hcS6ke0PvB4gPu6ZSuaQWO3KxpP9jz6lV7M+cfFOh9N3neG uEe6ms4Kzt/BJIBm+aMdXW84648sV5vOqdrNGBOb2cJikaiTkj9x730klSdwOVr2 epEELoj/IEWZZz/d9U05uq26VUtnxsc/Enzkq/GIaENSVauYWaZXrHdKzrzUZYjk gEbspFSpQEJqu5slRl2XGos4tMHHvTIkehoLH9KM4YmC5WGf1kKYz/6v38PIhc/9 mEBsUqQlTVsUPNcOXWBY24HJKY91CBgowhbTQIxyJNydHPJYPVJ8U5nNp1g1CYmu URdvvX8IyIR0zX2RcVlc9vnWQ+p5NoTjxjwc1iKjnBsofCmqDucie6Iz2vis7Zl6 6s6N1FZSYQTX0fbBbf00efWaG/3I/ynRhcW+zM9NcozHzpRxyuptDlKSOVORXRG7 gy7+sID2y5dLqCg9ukTIx1y9Njt+uryosBOajCMaaAy0VgXEsETFO8UxbodUAu6N ubmPwGO42bY//c+fJWRAjT9tjhzp2fWK4rgrgd3VG4cYrjq2W21EMwyjzilVp2dM ZlvWoWJptIqEhPtWU8nf3i759XE+FOWKt9ns1FupKB+0msht1p2HBj88bue8TrKk CcV1dY9cohNzgRFXvXcgSLvSCioT31Q//mGmXWLif7teOXIUN4A= =q04p -----END PGP SIGNATURE----- Merge tag 'rxrpc-next-20221108' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs rxrpc changes David Howells says: ==================== rxrpc: Increasing SACK size and moving away from softirq, part 1 AF_RXRPC has some issues that need addressing: (1) The SACK table has a maximum capacity of 255, but for modern networks that isn't sufficient. This is hard to increase in the upstream code because of the way the application thread is coupled to the softirq and retransmission side through a ring buffer. Adjustments to the rx protocol allows a capacity of up to 8192, and having a ring sufficiently large to accommodate that would use an excessive amount of memory as this is per-call. (2) Processing ACKs in softirq mode causes the ACKs get conflated, with only the most recent being considered. Whilst this has the upside that the retransmission algorithm only needs to deal with the most recent ACK, it causes DATA transmission for a call to be very bursty because DATA packets cannot be transmitted in softirq mode. Rather transmission must be delegated to either the application thread or a workqueue, so there tend to be sudden bursts of traffic for any particular call due to scheduling delays. (3) All crypto in a single call is done in series; however, each DATA packet is individually encrypted so encryption and decryption of large calls could be parallelised if spare CPU resources are available. This is the first of a number of sets of patches that try and address them. The overall aims of these changes include: (1) To get rid of the TxRx ring and instead pass the packets round in queues (eg. sk_buff_head). On the Tx side, each ACK packet comes with a SACK table that can be parsed as-is, so there's no particular need to maintain our own; we just have to refer to the ACK. On the Rx side, we do need to maintain a SACK table with one bit per entry - but only if packets go missing - and we don't want to have to perform a complex transformation to get the information into an ACK packet. (2) To try and move almost all processing of received packets out of the softirq handler and into a high-priority kernel I/O thread. Only the transferral of packets would be left there. I would still use the encap_rcv hook to receive packets as there's a noticeable performance drop from letting the UDP socket put the packets into its own queue and then getting them out of there. (3) To make the I/O thread also do all the transmission. The app thread would be responsible for packaging the data into packets and then buffering them for the I/O thread to transmit. This would make it easier for the app thread to run ahead of the I/O thread, and would mean the I/O thread is less likely to have to wait around for a new packet to come available for transmission. (4) To logically partition the socket/UAPI/KAPI side of things from the I/O side of things. The local endpoint, connection, peer and call objects would belong to the I/O side. The socket side would not then touch the private internals of calls and suchlike and would not change their states. It would only look at the send queue, receive queue and a way to pass a message to cause an abort. (5) To remove as much locking, synchronisation, barriering and atomic ops as possible from the I/O side. Exclusion would be achieved by limiting modification of state to the I/O thread only. Locks would still need to be used in communication with the UDP socket and the AF_RXRPC socket API. (6) To provide crypto offload kernel threads that, when there's slack in the system, can see packets that need crypting and provide parallelisation in dealing with them. (7) To remove the use of system timers. Since each timer would then send a poke to the I/O thread, which would then deal with it when it had the opportunity, there seems no point in using system timers if, instead, a list of timeouts can be sensibly consulted. An I/O thread only then needs to schedule with a timeout when it is idle. (8) To use zero-copy sendmsg to send packets. This would make use of the I/O thread being the sole transmitter on the socket to manage the dead-reckoning sequencing of the completion notifications. There is a problem with zero-copy, though: the UDP socket doesn't handle running out of option memory very gracefully. With regard to this first patchset, the changes made include: (1) Some fixes, including a fallback for proc_create_net_single_write(), setting ack.bufferSize to 0 in ACK packets and a fix for rxrpc congestion management, which shouldn't be saving the cwnd value between calls. (2) Improvements in rxrpc tracepoints, including splitting the timer tracepoint into a set-timer and a timer-expired trace. (3) Addition of a new proc file to display some stats. (4) Some code cleanups, including removing some unused bits and unnecessary header inclusions. (5) A change to the recently added UDP encap_err_rcv hook so that it has the same signature as {ip,ipv6}_icmp_error(), and then just have rxrpc point its UDP socket's hook directly at those. (6) Definition of a new struct, rxrpc_txbuf, that is used to hold transmissible packets of DATA and ACK type in a single 2KiB block rather than using an sk_buff. This allows the buffer to be on a number of queues simultaneously more easily, and also guarantees that the entire block is in a single unit for zerocopy purposes and that the data payload is aligned for in-place crypto purposes. (7) ACK txbufs are allocated at proposal and queued for later transmission rather than being stored in a single place in the rxrpc_call struct, which means only a single ACK can be pending transmission at a time. The queue is then drained at various points. This allows the ACK generation code to be simplified. (8) The Rx ring buffer is removed. When a jumbo packet is received (which comprises a number of ordinary DATA packets glued together), it used to be pointed to by the ring multiple times, with an annotation in a side ring indicating which subpacket was in that slot - but this is no longer possible. Instead, the packet is cloned once for each subpacket, barring the last, and the range of data is set in the skb private area. This makes it easier for the subpackets in a jumbo packet to be decrypted in parallel. (9) The Tx ring buffer is removed. The side annotation ring that held the SACK information is also removed. Instead, in the event of packet loss, the SACK data attached an ACK packet is parsed. (10) Allocate an skcipher request when needed in the rxkad security class rather than caching one in the rxrpc_call struct. This deals with a race between externally-driven call disconnection getting rid of the skcipher request and sendmsg/recvmsg trying to use it because they haven't seen the completion yet. This is also needed to support parallelisation as the skcipher request cannot be used by two or more threads simultaneously. (11) Call udp_sendmsg() and udpv6_sendmsg() directly rather than going through kernel_sendmsg() so that we can provide our own iterator (zerocopy explicitly doesn't work with a KVEC iterator). This also lets us avoid the overhead of the security hook. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 14:03:49 +00:00
David S. Miller	5d041588e9	Merge branch 'wwan-iosm-fixes' M Chetan Kumar says: ==================== net: wwan: iosm: fixes This patch series contains iosm fixes. PATCH1: Fix memory leak in ipc_pcie_read_bios_cfg. PATCH2: Fix driver not working with INTEL_IOMMU disabled config. PATCH3: Fix invalid mux header type. PATCH4: Fix kernel build robot reported errors. Please refer to individual commit message for details. -- v2: * PATCH1: No Change * PATCH2: Kconfig change - Add dependency on PCI to resolve kernel build robot errors. * PATCH3: No Change * PATCH4: New (Fix kernel build robot errors) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 14:00:25 +00:00
M Chetan Kumar	980ec04a88	net: wwan: iosm: fix kernel test robot reported errors Include linux/vmalloc.h in iosm_ipc_coredump.c & iosm_ipc_devlink.c to resolve kernel test robot errors. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 14:00:25 +00:00
M Chetan Kumar	02d2d2ea4a	net: wwan: iosm: fix invalid mux header type Data stall seen during peak DL throughput test & packets are dropped by mux layer due to invalid header type in datagram. During initlization Mux aggregration protocol is set to default UL/DL size and TD count of Mux lite protocol. This configuration mismatch between device and driver is resulting in data stall/packet drops. Override the UL/DL size and TD count for Mux aggregation protocol. Fixes: `1f52d7b622` ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 14:00:25 +00:00
M Chetan Kumar	035e3befc1	net: wwan: iosm: fix driver not working with INTEL_IOMMU disabled With INTEL_IOMMU disable config or by forcing intel_iommu=off from grub some of the features of IOSM driver like browsing, flashing & coredump collection is not working. When driver calls DMA API - dma_map_single() for tx transfers. It is resulting in dma mapping error. Set the device DMA addressing capabilities using dma_set_mask() and remove the INTEL_IOMMU dependency in kconfig so that driver follows the platform config either INTEL_IOMMU enable or disable. Fixes: `f7af616c63` ("net: iosm: infrastructure") Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 14:00:25 +00:00
M Chetan Kumar	d38a648d2d	net: wwan: iosm: fix memory leak in ipc_pcie_read_bios_cfg ipc_pcie_read_bios_cfg() is using the acpi_evaluate_dsm() to obtain the wwan power state configuration from BIOS but is not freeing the acpi_object. The acpi_evaluate_dsm() returned acpi_object to be freed. Free the acpi_object after use. Fixes: `7e98d785ae` ("net: iosm: entry point") Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 14:00:25 +00:00
Andy Ren	bd039b5ea2	net/core: Allow live renaming when an interface is up Allow a network interface to be renamed when the interface is up. As described in the netconsole documentation [1], when netconsole is used as a built-in, it will bring up the specified interface as soon as possible. As a result, user space will not be able to rename the interface since the kernel disallows renaming of interfaces that are administratively up unless the 'IFF_LIVE_RENAME_OK' private flag was set by the kernel. The original solution [2] to this problem was to add a new parameter to the netconsole configuration parameters that allows renaming of the interface used by netconsole while it is administratively up. However, during the discussion that followed, it became apparent that we have no reason to keep the current restriction and instead we should allow user space to rename interfaces regardless of their administrative state: 1. The restriction was put in place over 20 years ago when renaming was only possible via IOCTL and before rtnetlink started notifying user space about such changes like it does today. 2. The 'IFF_LIVE_RENAME_OK' flag was added over 3 years ago in version 5.2 and no regressions were reported. 3. In-kernel listeners to 'NETDEV_CHANGENAME' do not seem to care about the administrative state of interface. Therefore, allow user space to rename running interfaces by removing the restriction and the associated 'IFF_LIVE_RENAME_OK' flag. Help in possible triage by emitting a message to the kernel log that an interface was renamed while UP. [1] https://www.kernel.org/doc/Documentation/networking/netconsole.rst [2] https://lore.kernel.org/netdev/20221102002420.2613004-1-andy.ren@getcruise.com/ Signed-off-by: Andy Ren <andy.ren@getcruise.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 13:08:12 +00:00
David S. Miller	b96c7b4cbe	Merge branch 'dsa-microchip-checking' Rakesh Sankaranarayanan says: ==================== net: dsa: microchip: ksz_pwrite status check for lan937x and irq and error checking updates for ksz series This patch series include following changes, - Add KSZ9563 inside ksz_switch_chips. As per current structure, KSZ9893 is reused inside ksz_switch_chips structure, but since there is a mismatch in number of irq's, new member added for KSZ9563 and sku detected based on Global Chip ID 4 Register. Compatible string from device tree mapped to KSZ9563 for spi and i2c mode probes. - Assign device interrupt during i2c probe operation. - Add error checking for ksz_pwrite inside lan937x_change_mtu. After v6.0, ksz_pwrite updated to have return type int instead of void, and lan937x_change_mtu still uses ksz_pwrite without status verification. - Add port_nirq as 3 for KSZ8563 switch family. - Use dev_err_probe() instead of dev_err() to have more standardized error formatting and logging. v1 -> v2: - Removed regmap validation patch from the series, planning to take up in future after checking for any better approach and studying the actual need for this change. - Resolved error reported in ksz8863_smi.c file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 13:06:01 +00:00
Rakesh Sankaranarayanan	9b18331706	net: dsa: microchip: add dev_err_probe in probe functions Probe functions uses normal dev_err() to check error conditions and print messages. Replace dev_err() with dev_err_probe() to have more standardized format and error logging. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 13:06:01 +00:00
Rakesh Sankaranarayanan	4630d1420f	net: dsa: microchip: ksz8563: Add number of port irq KSZ8563 have three port interrupts: PTP, PHY and ACL. Add port_nirq as 3 for KSZ8563 inside ksz_chip_data. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 13:06:01 +00:00
Rakesh Sankaranarayanan	e06999c3dc	net: dsa: microchip: add error checking for ksz_pwrite Add status validation for port register write inside lan937x_change_mtu. ksz_pwrite and ksz_pread api's are updated with return type int (Reference patch mentioned below). Update lan937x_change_mtu with status validation for ksz_pwrite16(). Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220826105634.3855578-6-o.rempel@pengutronix.de/ Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 13:06:01 +00:00
Rakesh Sankaranarayanan	a9c6db3bc9	net: dsa: microchip: add irq in i2c probe add device irq in i2c probe function. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 13:06:01 +00:00
Rakesh Sankaranarayanan	ef912fe443	net: dsa: microchip: add ksz9563 in ksz_switch_ops and select based on compatible string Add KSZ9563 inside ksz_switch_chips structure with port_nirq as 3. KSZ9563 use KSZ9893 switch parameters but port_nirq count is 3 for KSZ9563 whereas 2 for KSZ9893. Add KSZ9563 inside ksz_switch_chips as a separate member and from device tree map compatible string into KSZ9563 inside ksz_spi.c and ksz9477_i2c.c. Global Chip ID 1 and 2 registers read value 9893, select sku based on Global Chip ID 4 Register which read 0x1c for KSZ9563. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-09 13:06:01 +00:00
Phil Sutter	58bb78ce02	selftests: netfilter: Fix and review rpath.sh Address a few problems with the initial test script version: * On systems with ip6tables but no ip6tables-legacy, testing for ip6tables was disabled by accident. * Firewall setup phase did not respect possibly unavailable tools. * Consistently call nft via '$nft'. Fixes: `6e31ce831c` ("selftests: netfilter: Test reverse path filtering") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2022-11-09 10:29:57 +01:00
Yoshihiro Shimoda	380f9acdf7	net: ethernet: renesas: rswitch: Fix endless loop in error paths Coverity reported that the error path in rswitch_gwca_queue_alloc_skb() has an issue to cause endless loop. So, fix the issue by changing variables' types from u32 to int. After changed the types, rswitch_tx_free() should use rswitch_get_num_cur_queues() to calculate number of current queues. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1527147 ("Control flow issues") Fixes: `3590918b5d` ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://lore.kernel.org/r/20221107081021.2955122-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-08 17:49:05 -08:00
Nick Child	742c60e128	ibmveth: Reduce default tx queues to 8 Previously, the default number of transmit queues was 16. Due to resource concerns, set to 8 queues instead. Still allow the user to set more queues (max 16) if they like. Since the driver is virtualized away from the physical NIC, the purpose of multiple queues is purely to allow for parallel calls to the hypervisor. Therefore, there is no noticeable effect on performance by reducing queue count to 8. Fixes: `d926793c1d` ("ibmveth: Implement multi queue on xmit") Reported-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20221107203215.58206-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-08 17:46:24 -08:00
Zhengchao Shao	b06334919c	net: nixge: disable napi when enable interrupts failed in nixge_open() When failed to enable interrupts in nixge_open() for opening device, napi isn't disabled. When open nixge device next time, it will reports a invalid opcode issue. Fix it. Only be compiled, not be tested. Fixes: `492caffa8a` ("net: ethernet: nixge: Add support for National Instruments XGE netdev") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20221107101443.120205-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-08 17:44:02 -08:00
Eric Dumazet	07d120aa33	net: tun: call napi_schedule_prep() to ensure we own a napi A recent patch exposed another issue in napi_get_frags() caught by syzbot [1] Before feeding packets to GRO, and calling napi_complete() we must first grab NAPI_STATE_SCHED. [1] WARNING: CPU: 0 PID: 3612 at net/core/dev.c:6076 napi_complete_done+0x45b/0x880 net/core/dev.c:6076 Modules linked in: CPU: 0 PID: 3612 Comm: syz-executor408 Not tainted 6.1.0-rc3-syzkaller-00175-g1118b2049d77 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:napi_complete_done+0x45b/0x880 net/core/dev.c:6076 Code: c1 ea 03 0f b6 14 02 4c 89 f0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 24 04 00 00 41 89 5d 1c e9 73 fc ff ff e8 b5 53 22 fa <0f> 0b e9 82 fe ff ff e8 a9 53 22 fa 48 8b 5c 24 08 31 ff 48 89 de RSP: 0018:ffffc90003c4f920 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000030 RCX: 0000000000000000 RDX: ffff8880251c0000 RSI: ffffffff875a58db RDI: 0000000000000007 RBP: 0000000000000001 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888072d02628 R13: ffff888072d02618 R14: ffff888072d02634 R15: 0000000000000000 FS: 0000555555f13300(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055c44d3892b8 CR3: 00000000172d2000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> napi_complete include/linux/netdevice.h:510 [inline] tun_get_user+0x206d/0x3a60 drivers/net/tun.c:1980 tun_chr_write_iter+0xdb/0x200 drivers/net/tun.c:2027 call_write_iter include/linux/fs.h:2191 [inline] do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735 do_iter_write+0x182/0x700 fs/read_write.c:861 vfs_writev+0x1aa/0x630 fs/read_write.c:934 do_writev+0x133/0x2f0 fs/read_write.c:977 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f37021a3c19 Fixes: `1118b2049d` ("net: tun: Fix memory leaks of napi_get_frags") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Wang Yufen <wangyufen@huawei.com> Link: https://lore.kernel.org/r/20221107180011.188437-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-08 17:41:11 -08:00
Zhengchao Shao	519b58bbfa	net: marvell: prestera: fix memory leak in prestera_rxtx_switch_init() When prestera_sdma_switch_init() failed, the memory pointed to by sw->rxtx isn't released. Fix it. Only be compiled, not be tested. Fixes: `501ef3066c` ("net: marvell: prestera: Add driver for Prestera family ASIC devices") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Vadym Kochan <vadym.kochan@plvision.eu> Link: https://lore.kernel.org/r/20221108025607.338450-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-08 17:19:23 -08:00
Jakub Kicinski	2b01450328	linux-can-fixes-for-6.1-20221107 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEBsvAIBsPu6mG7thcrX5LkNig010FAmNpAhcTHG1rbEBwZW5n dXRyb25peC5kZQAKCRCtfkuQ2KDTXeaNB/4om4cfVvLAgYVnoOrsQgUUXaWRQAxl nrIdRZGOB4LvL5p+Y9cO4tivAQI8plOx10zxex0jJcMujRsY+xWqBHBRRaWTKreh kVLSBd7TBAbiDyIyU5vJNUgjMrRwnymfxl2VcFARBF42z+/BcK2hQrLE8Mj+IqVr 8adtyuCHvfsBZEXk1o0RWbaeR/tbvV53x2cmRiHFukZh2MBliEf6j5a/KmRWJSck +UKdydssDhHoJi3Hv4MdUdo7NcjJVLbXbUYGLlaYz9RJmb7gTbUx/kPGRygCUikJ q/G0k0IgpcdjZjAgDjFGF/PEPIK449sOeMVpE+mzdDgYU+XCGASvDkCL =3x/D -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-6.1-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== can 2022-11-07 The first patch is by Chen Zhongjin and adds a missing dev_remove_pack() to the AF_CAN protocol. Zhengchao Shao's patch fixes a potential NULL pointer deref in AF_CAN's can_rx_register(). The next patch is by Oliver Hartkopp and targets the CAN ISO-TP protocol, and fixes the state handling for echo TX processing. Oliver Hartkopp's patch for the j1939 protocol adds a missing initialization of the CAN headers inside outgoing skbs. Another patch by Oliver Hartkopp fixes an out of bounds read in the check for invalid CAN frames in the xmit callback of virtual CAN devices. This touches all non virtual device drivers as we decided to rename the function requiring that netdev_priv points to a struct can_priv. (Note: This patch will create a merge conflict with net-next where the pch_can driver has removed.) The last patch is by Geert Uytterhoeven and adds the missing ECC error checks for the channels 2-7 in the rcar_canfd driver. * tag 'linux-can-fixes-for-6.1-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: rcar_canfd: Add missing ECC error checks for channels 2-7 can: dev: fix skb drop check can: j1939: j1939_send_one(): fix missing CAN header initialization can: isotp: fix tx state handling for echo tx processing can: af_can: fix NULL pointer dereference in can_rx_register() can: af_can: can_exit(): add missing dev_remove_pack() of canxl_packet ==================== Link: https://lore.kernel.org/r/20221107133217.59861-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-08 15:22:33 -08:00
Yang Li	8e18be7610	lib: Fix some kernel-doc comments Make the description of @policy to @p in nla_policy_len() to clear the below warnings: lib/nlattr.c:660: warning: Function parameter or member 'p' not described in 'nla_policy_len' lib/nlattr.c:660: warning: Excess function parameter 'policy' description in 'nla_policy_len' Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2736 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20221107062623.6709-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-08 15:06:56 -08:00
Shigeru Yoshida	03c1f1ef15	netfilter: Cleanup nft_net->module_list from nf_tables_exit_net() syzbot reported a warning like below [1]: WARNING: CPU: 3 PID: 9 at net/netfilter/nf_tables_api.c:10096 nf_tables_exit_net+0x71c/0x840 Modules linked in: CPU: 2 PID: 9 Comm: kworker/u8:0 Tainted: G W 6.1.0-rc3-00072-g8e5423e991e8 #47 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 Workqueue: netns cleanup_net RIP: 0010:nf_tables_exit_net+0x71c/0x840 ... Call Trace: <TASK> ? __nft_release_table+0xfc0/0xfc0 ops_exit_list+0xb5/0x180 cleanup_net+0x506/0xb10 ? unregister_pernet_device+0x80/0x80 process_one_work+0xa38/0x1730 ? pwq_dec_nr_in_flight+0x2b0/0x2b0 ? rwlock_bug.part.0+0x90/0x90 ? _raw_spin_lock_irq+0x46/0x50 worker_thread+0x67e/0x10e0 ? process_one_work+0x1730/0x1730 kthread+0x2e5/0x3a0 ? kthread_complete_and_exit+0x40/0x40 ret_from_fork+0x1f/0x30 </TASK> In nf_tables_exit_net(), there is a case where nft_net->commit_list is empty but nft_net->module_list is not empty. Such a case occurs with the following scenario: 1. nfnetlink_rcv_batch() is called 2. nf_tables_newset() returns -EAGAIN and NFNL_BATCH_FAILURE bit is set to status 3. nf_tables_abort() is called with NFNL_ABORT_AUTOLOAD (nft_net->commit_list is released, but nft_net->module_list is not because of NFNL_ABORT_AUTOLOAD flag) 4. Jump to replay label 5. netlink_skb_clone() fails and returns from the function (this is caused by fault injection in the reproducer of syzbot) This patch fixes this issue by calling __nf_tables_abort() when nft_net->module_list is not empty in nf_tables_exit_net(). Fixes: `eb014de4fd` ("netfilter: nf_tables: autoload modules from the abort path") Link: https://syzkaller.appspot.com/bug?id=802aba2422de4218ad0c01b46c9525cc9d4e4aa3 [1] Reported-by: syzbot+178efee9e2d7f87f5103@syzkaller.appspotmail.com Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2022-11-08 23:16:14 +01:00
Ziyang Xuan	03832a32bf	netfilter: nfnetlink: fix potential dead lock in nfnetlink_rcv_msg() When type is NFNL_CB_MUTEX and -EAGAIN error occur in nfnetlink_rcv_msg(), it does not execute nfnl_unlock(). That would trigger potential dead lock. Fixes: `50f2db9e36` ("netfilter: nfnetlink: consolidate callback types") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Florian Westphal <fw@strlen.de>	2022-11-08 23:16:13 +01:00
Linus Torvalds	f141df3713	audit/stable-6.1 PR 20221107 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmNpimwUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMptRAAmDe4f1MokIzJUBQOrFOR8Zw2zWDc snW1dSRTudP4fjV2GX4XVb4X0YFtmnsBXk1GZJpWZWTMshdSxgvS/rnzOT6svqrk mDeAWVhtiPdyB5Xj3bFNnKz7vzvTmgCcsHJ0NqTwEk0nh2vS+NoIwsJRvNgEVmJb 8HN1uIFYzHF83Ij5+ejBaF/8Xvkc5kKjrhvs68R5YmOeH+9EuVi88S/FckF1HkWP 7WBmsD1bgDoU4UFIiri3w5FPrWQNqLcR7ZQISizCU3C8B9U84tCe5ifxLmNr3RmX 9UrT0THiZHd1iV+uDaaIfiHS+fpvpZn5CbSvPiPXYkybpoeUMNgz+pxTq12nR0eX xB6CGPUgT51R6qI0gQCCSazhXz4wUy0Jhkyo4hwruW/7bo11chft6Oktzt0Ij/Pd zyYT5ad+J3Ufub/QSIyo8yvq/oawlmxibMTuDo1mwCkMDDNEsqLJbYJESOBw3/0P XtbvZC5oChVur3ozBepeKV/B1trvPtmih/RaX3ARVDm6LTobCykFuUwu5bwPhyDz /iTpZbbez5jgVI1kJd3TCyCVYXwtalNbAH70XPvlygXKwQZpOM9LUrkggX8BSooT Rq+c2bUT0HZ9Yade4Aw55HlKOTbrzRKlaMWFtlBEz32EVq8tknDxUHnzVPv3zMA/ g8pjmS+k+xD/Ywk= =Ycwp -----END PGP SIGNATURE----- Merge tag 'audit-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fix from Paul Moore: "A small audit patch to fix an instance of undefined behavior in a shift operator caused when shifting a signed value too far, the same case as the lsm patch merged previously. While the fix is trivial and I can't imagine it causing a problem in a backport, I'm not explicitly marking it for stable on the off chance that there is some system out there which is relying on some wonky unexpected behavior which this patch could break; if it does break, IMO it's better that to happen in a minor or -rcX release and not in a stable backport" * tag 'audit-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: fix undefined behavior in bit shift for AUDIT_BIT	2022-11-08 12:30:29 -08:00
Linus Torvalds	f49b2d89fb	lsm/stable-6.1 PR 20221107 -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmNpiisUHHBhdWxAcGF1 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMFzg//WEQgUIwRkmOj2vQdXdvhSBj9T6Ec sEkh3nBhT0D++21mZeDQiDAUQC3jE3xr/vFwrkECR9xd9ap2Y/RxT5tCucBGOua+ trUHyqMbvH5Ec0lomUfVsDoez4GcrCtZ++p7TP8YXxgvjAbDSFtbDqVyfbWpV+E/ UV0nByDvhX+HQqGVJRDbK4d8JDQFccki/+SlaGnWtaKYA8CjHKkpTrhhkCVm3Fow iA/qg/sPLX1/5g7yHrhWaFy//MkFM1C7cmLq88nlR46OVVjHGlFqIGXbFclNZNnG cfLzvcPGuDZ9Ih7Pun3wESEDWxMlSXArNZzC12xIw3STTHiHP8fUEnw9bfpKzWUs K+3nu6gN+Mh7xRL7dw0ISqx7tQM/SJ7lF91zD7pIEvuXLMKXIfM3D7KyTkwmBVuz A6nZphAEmmY5R+ez88ry7c0FtNEEc1dST8rVjD8XStvFXxRNqIWOZ3Z2QjhA9SI9 Y/v8H2/VW7hsgGnyozVqmFJmY+x5ij2lge5TEnhfRvCi1xf25Rdii50+lYCdvnM1 v/IY2Xxeq+gvyew8XB6B13Gv6TUKKIgL7sOwMdwEB8Q1Lk2xegzZkl7thsGoVLZn zZny4+8WxFoGhu51V3EtuTCVWHJ884fPSM1PEntWo5+oe/6cJZ7/rcYRP6PtYfha 4usMsFgUJnzntmY= =1ZF/ -----END PGP SIGNATURE----- Merge tag 'lsm-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm fix from Paul Moore: "A small capability patch to fix an instance of undefined behavior in a shift operator caused when shifting a signed value too far. While the fix is trivial and I can't imagine it causing a problem in a backport, I'm not explicitly marking it for stable on the off chance that there is some system out there which is relying on some wonky unexpected behavior which this patch could break; if it does break, IMO it's better that to happen in a minor or -rcX release and not in a stable backport" * tag 'lsm-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: capabilities: fix undefined behavior in bit shift for CAP_TO_MASK	2022-11-08 12:22:02 -08:00
David Howells	30d95efe06	rxrpc: Allocate an skcipher each time needed rather than reusing In the rxkad security class, allocate the skcipher used to do packet encryption and decription rather than allocating one up front and reusing it for each packet. Reusing the skcipher precludes doing crypto in parallel. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org	2022-11-08 16:42:28 +00:00
David Howells	1fc4fa2ac9	rxrpc: Fix congestion management rxrpc has a problem in its congestion management in that it saves the congestion window size (cwnd) from one call to another, but if this is 0 at the time is saved, then the next call may not actually manage to ever transmit anything. To this end: (1) Don't save cwnd between calls, but rather reset back down to the initial cwnd and re-enter slow-start if data transmission is idle for more than an RTT. (2) Preserve ssthresh instead, as that is a handy estimate of pipe capacity. Knowing roughly when to stop slow start and enter congestion avoidance can reduce the tendency to overshoot and drop larger amounts of packets when probing. In future, cwind growth also needs to be constrained when the window isn't being filled due to being application limited. Reported-by: Simon Wilkinson <sxw@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org	2022-11-08 16:42:28 +00:00
David Howells	6869ddb87d	rxrpc: Remove the rxtx ring The Rx/Tx ring is no longer used, so remove it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org	2022-11-08 16:42:28 +00:00
David Howells	d57a3a1516	rxrpc: Save last ACK's SACK table rather than marking txbufs Improve the tracking of which packets need to be transmitted by saving the last ACK packet that we receive that has a populated soft-ACK table rather than marking packets. Then we can step through the soft-ACK table and look at the packets we've transmitted beyond that to determine which packets we might want to retransmit. We also look at the highest serial number that has been acked to try and guess which packets we've transmitted the peer is likely to have seen. If necessary, we send a ping to retrieve that number. One downside that might be a problem is that we can't then compare the previous acked/unacked state so easily in rxrpc_input_soft_acks() - which is a potential problem for the slow-start algorithm. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org	2022-11-08 16:42:28 +00:00
David Howells	4e76bd406d	rxrpc: Remove call->lock call->lock is no longer necessary, so remove it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org	2022-11-08 16:42:28 +00:00
David Howells	a4ea4c4776	rxrpc: Don't use a ring buffer for call Tx queue Change the way the Tx queueing works to make the following ends easier to achieve: (1) The filling of packets, the encryption of packets and the transmission of packets can be handled in parallel by separate threads, rather than rxrpc_sendmsg() allocating, filling, encrypting and transmitting each packet before moving onto the next one. (2) Get rid of the fixed-size ring which sets a hard limit on the number of packets that can be retained in the ring. This allows the number of packets to increase without having to allocate a very large ring or having variable-sized rings. [Note: the downside of this is that it's then less efficient to locate a packet for retransmission as we then have to step through a list and examine each buffer in the list.] (3) Allow the filler/encrypter to run ahead of the transmission window. (4) Make it easier to do zero copy UDP from the packet buffers. (5) Make it easier to do zero copy from userspace to the packet buffers - and thence to UDP (only if for unauthenticated connections). To that end, the following changes are made: (1) Use the new rxrpc_txbuf struct instead of sk_buff for keeping packets to be transmitted in. This allows them to be placed on multiple queues simultaneously. An sk_buff isn't really necessary as it's never passed on to lower-level networking code. (2) Keep the transmissable packets in a linked list on the call struct rather than in a ring. As a consequence, the annotation buffer isn't used either; rather a flag is set on the packet to indicate ackedness. (3) Use the RXRPC_CALL_TX_LAST flag to indicate that the last packet to be transmitted has been queued. Add RXRPC_CALL_TX_ALL_ACKED to indicate that all packets up to and including the last got hard acked. (4) Wire headers are now stored in the txbuf rather than being concocted on the stack and they're stored immediately before the data, thereby allowing zerocopy of a single span. (5) Don't bother with instant-resend on transmission failure; rather, leave it for a timer or an ACK packet to trigger. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org	2022-11-08 16:42:28 +00:00
David Howells	5d7edbc923	rxrpc: Get rid of the Rx ring Get rid of the Rx ring and replace it with a pair of queues instead. One queue gets the packets that are in-sequence and are ready for processing by recvmsg(); the other queue gets the out-of-sequence packets for addition to the first queue as the holes get filled. The annotation ring is removed and replaced with a SACK table. The SACK table has the bits set that correspond exactly to the sequence number of the packet being acked. The SACK ring is copied when an ACK packet is being assembled and rotated so that the first ACK is in byte 0. Flow control handling is altered so that packets that are moved to the in-sequence queue are hard-ACK'd even before they're consumed - and then the Rx window size in the ACK packet (rsize) is shrunk down to compensate (even going to 0 if the window is full). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org	2022-11-08 16:42:28 +00:00

... 3 4 5 6 7 ...

1138395 Commits