linux

Author	SHA1	Message	Date
Norbert Ciosek	b32cddd224	i40e: Fix endianness conversions Fixes the following sparse warnings: i40e_main.c:5953:32: warning: cast from restricted __le16 i40e_main.c:8008:29: warning: incorrect type in assignment (different base types) i40e_main.c:8008:29: expected unsigned int [assigned] [usertype] ipa i40e_main.c:8008:29: got restricted __le32 [usertype] i40e_main.c:8008:29: warning: incorrect type in assignment (different base types) i40e_main.c:8008:29: expected unsigned int [assigned] [usertype] ipa i40e_main.c:8008:29: got restricted __le32 [usertype] i40e_txrx.c:1950:59: warning: incorrect type in initializer (different base types) i40e_txrx.c:1950:59: expected unsigned short [usertype] vlan_tag i40e_txrx.c:1950:59: got restricted __le16 [usertype] l2tag1 i40e_txrx.c:1953:40: warning: cast to restricted __le16 i40e_xsk.c:448:38: warning: invalid assignment: |= i40e_xsk.c:448:38: left side has type restricted __le64 i40e_xsk.c:448:38: right side has type int Fixes: `2f4b411a3d` ("i40e: Enable cloud filters via tc-flower") Fixes: `2a508c64ad` ("i40e: fix VLAN.TCI == 0 RX HW offload") Fixes: `3106c580fb` ("i40e: Use batched xsk Tx interfaces to increase performance") Fixes: `8f88b3034d` ("i40e: Add infrastructure for queue channel support") Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-19 10:09:04 -08:00
Mateusz Palczewski	61c1e0eb83	i40e: Fix add TC filter for IPv6 Fix insufficient distinction between IPv4 and IPv6 addresses when creating a filter. IPv4 and IPv6 are kept in the same memory area. If IPv6 is added, then it's caught by IPv4 check, which leads to err -95. Fixes: `2f4b411a3d` ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-19 10:08:52 -08:00
Sylwester Dziedziuch	dc88126264	i40e: Fix VFs not created When creating VFs they were sometimes not getting resources. It was caused by not executing i40e_reset_all_vfs due to flag __I40E_VF_DISABLE being set on PF. Because of this IAVF was never able to finish setup sequence never getting reset indication from PF. Changed test_and_set_bit __I40E_VF_DISABLE in i40e_sync_filters_subtask to test_bit and removed clear_bit. This function should not set this bit it should only check if it hasn't been already set. Fixes: `a7542b8760` ("i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-18 10:21:17 -08:00
Mateusz Palczewski	28b1208e7a	i40e: Fix addition of RX filters after enabling FW LLDP agent Fix addition of VLAN filter for PF after enabling FW LLDP agent. Changing LLDP Agent causes FW to re-initialize per NVM settings. Remove default PF filter and move "Enable/Disable" to currently used reset flag. Without this patch PF would try to add MAC VLAN filter with default switch filter present. This causes AQ error and sets promiscuous mode on. Fixes: `c65e78f87f` ("i40e: Further implementation of LLDP") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-18 10:21:17 -08:00
Mateusz Palczewski	4cdb9f80dc	i40e: Fix overwriting flow control settings during driver loading During driver loading flow control settings were written to FW using a variable which was always zero, since it was being set only by ethtool. This behavior has been corrected and driver no longer overwrites the default FW/NVM settings. Fixes: `373149fc99` ("i40e: Decrease the scope of rtnl lock") Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-18 10:21:17 -08:00
Mateusz Palczewski	d2c788f739	i40e: Add zero-initialization of AQ command structures Zero-initialize AQ command data structures to comply with API specifications. Fixes: `2f4b411a3d` ("i40e: Enable cloud filters via tc-flower") Fixes: `f4492db16d` ("i40e: Add NPAR BW get and set functions") Signed-off-by: Andrzej Sawuła <andrzej.sawula@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-18 10:20:48 -08:00
Keita Suzuki	58cab46c62	i40e: Fix memory leak in i40e_probe Struct i40e_veb is allocated in function i40e_setup_pf_switch, and stored to an array field veb inside struct i40e_pf. However when i40e_setup_misc_vector fails, this memory leaks. Fix this by calling exit and teardown functions. Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-18 10:03:54 -08:00
Slawomir Laba	92c6058024	i40e: Fix flow for IPv6 next header (extension header) When a packet contains an IPv6 header with next header which is an extension header and not a protocol one, the kernel function skb_transport_header called with such sk_buff will return a pointer to the extension header and not to the TCP one. The above explained call caused a problem with packet processing for skb with encapsulation for tunnel with I40E_TX_CTX_EXT_IP_IPV6. The extension header was not skipped at all. The ipv6_skip_exthdr function does check if next header of the IPV6 header is an extension header and doesn't modify the l4_proto pointer if it points to a protocol header value so its safe to omit the comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can return value -1. This means that the skipping process failed and there is something wrong with the packet so it will be dropped. Fixes: `a3fd9d8876` ("i40e/i40evf: Handle IPv6 extension headers in checksum offload") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2021-02-18 10:03:54 -08:00
Vladimir Oltean	3af409ca27	net: enetc: fix destroyed phylink dereference during unbind The following call path suggests that calling unregister_netdev on an interface that is up will first bring it down. enetc_pf_remove -> unregister_netdev -> unregister_netdevice_queue -> unregister_netdevice_many -> dev_close_many -> __dev_close_many -> enetc_close -> enetc_stop -> phylink_stop However, enetc first destroys the phylink instance, then calls unregister_netdev. This is already dissimilar to the setup (and error path teardown path) from enetc_pf_probe, but more than that, it is buggy because it is invalid to call phylink_stop after phylink_destroy. So let's first unregister the netdev (and let the .ndo_stop events consume themselves), then destroy the phylink instance, then free the netdev. Fixes: `71b77a7a27` ("enetc: Migrate to PHYLINK and PCS_LYNX") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 15:05:07 -08:00
Shyam Sundar S K	9eab3fdb41	net: amd-xgbe: Fix network fluctuations when using 1G BELFUSE SFP Frequent link up/down events can happen when a Bel Fuse SFP part is connected to the amd-xgbe device. Try to avoid the frequent link issues by resetting the PHY as documented in Bel Fuse SFP datasheets. Fixes: `e722ec8237` ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:46 -08:00
Shyam Sundar S K	84fe68eb67	net: amd-xgbe: Reset link when the link never comes back Normally, auto negotiation and reconnect should be automatically done by the hardware. But there seems to be an issue where auto negotiation has to be restarted manually. This happens because of link training and so even though still connected to the partner the link never "comes back". This needs an auto-negotiation restart. Also, a change in xgbe-mdio is needed to get ethtool to recognize the link down and get the link change message. This change is only required in a backplane connection mode. Fixes: `abf0a1c2b2` ("amd-xgbe: Add support for SFP+ modules") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:45 -08:00
Shyam Sundar S K	186edbb510	net: amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warning The current driver calls netif_carrier_off() late in the link tear down which can result in a netdev watchdog timeout. Calling netif_carrier_off() immediately after netif_tx_stop_all_queues() avoids the warning. ------------[ cut here ]------------ NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220 Modules linked in: amd_xgbe(E) amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down CPU: 3 PID: 0 Comm: swapper/3 Tainted: G E Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019 RIP: 0010:dev_watchdog+0x20d/0x220 Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48 RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0 RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018 R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000 R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010 FS: 0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0 Call Trace: <IRQ> ? pfifo_fast_reset+0x100/0x100 call_timer_fn+0x2b/0x130 run_timer_softirq+0x3e8/0x440 ? enqueue_hrtimer+0x39/0x90 Fixes: `e722ec8237` ("amd-xgbe: Update the BelFuse quirk to support SGMII") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:45 -08:00
Shyam Sundar S K	30b7edc82e	net: amd-xgbe: Reset the PHY rx data path when mailbox command timeout Sometimes mailbox commands timeout when the RX data path becomes unresponsive. This prevents the submission of new mailbox commands to DXIO. This patch identifies the timeout and resets the RX data path so that the next message can be submitted properly. Fixes: `549b32af9f` ("amd-xgbe: Simplify mailbox interface rate change code") Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-16 14:09:45 -08:00
Sukadev Bhattiprolu	4a41c421f3	ibmvnic: serialize access to work queue on remove The work queue is used to queue reset requests like CHANGE-PARAM or FAILOVER resets for the worker thread. When the adapter is being removed the adapter state is set to VNIC_REMOVING and the work queue is flushed so no new work is added. However the check for adapter being removed is racy in that the adapter can go into REMOVING state just after we check and we might end up adding work just as it is being flushed (or after). The ->rwi_lock is already being used to serialize queue/dequeue work. Extend its usage to ensure there is no race when scheduling/flushing work. Fixes: `6954a9e419` ("ibmvnic: Flush existing work items before device removal") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Cc:Uwe Kleine-König <uwe@kleine-koenig.org> Cc:Saeed Mahameed <saeed@kernel.org> Reviewed-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 15:17:30 -08:00
Lijun Pan	7d3a7b9ea5	ibmvnic: skip send_request_unmap for timeout reset Timeout reset will trigger the VIOS to unmap it automatically, similarly to FAILOVER and MOBILITY events. If we unmap it on the Linux side, we will see errors like "30000003: Error 4 in REQUEST_UNMAP_RSP". So, don't call send_request_unmap for timeout reset. Fixes: `ed651a1087` ("ibmvnic: Updated reset handling") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 15:12:26 -08:00
Lijun Pan	42557dab78	ibmvnic: add memory barrier to protect long term buffer dma_rmb() barrier is added to load the long term buffer before copying it to socket buffer; and dma_wmb() barrier is added to update the long term buffer before it being accessed by VIOS (virtual i/o server). Fixes: `032c5e8284` ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Acked-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 15:12:01 -08:00
Heiner Kallweit	7ce189faa7	r8169: fix resuming from suspend on RTL8105e if machine runs on battery Armin reported that after referenced commit his RTL8105e is dead when resuming from suspend and machine runs on battery. This patch has been confirmed to fix the issue. Fixes: `e80bd76fbf` ("r8169: work around power-saving bug on some chip versions") Reported-by: Armin Wolf <W_Armin@gmx.de> Tested-by: Armin Wolf <W_Armin@gmx.de> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 14:56:07 -08:00
Ayush Sawal	2355a6773a	cxgb4/chtls/cxgbit: Keeping the max ofld immediate data size same in cxgb4 and ulds The max imm data size in cxgb4 is not the same as the max imm data size in chtls. This caused a mismatch in the output of is_ofld_imm() between cxgb4 and chtls. Fix this by keeping the max wreq size of imm data the same in both chtls and cxgb4, as MAX_IMM_OFLD_TX_DATA_WR_LEN. Since cxgb4's max imm data value for ofld packets is changed to MAX_IMM_OFLD_TX_DATA_WR_LEN, use the same value in cxgbit as well. Fixes: `36bedb3f2e` ("crypto: chtls - Inline TLS record Tx") Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-15 12:39:33 -08:00
Robert Hancock	57baf8cc70	net: axienet: Handle deferred probe on clock properly This driver is set up to use a clock mapping in the device tree if it is present, but still work without one for backward compatibility. However, if getting the clock returns -EPROBE_DEFER, then we need to abort and return that error from our driver initialization so that the probe can be retried later after the clock is set up. Move clock initialization to earlier in the process so we do not waste as much effort if the clock is not yet available. Switch to use devm_clk_get_optional and abort initialization on any error reported. Also enable the clock regardless of whether the controller is using an MDIO bus, as the clock is required in any case. Fixes: `09a0354cad` ("net: axienet: Use clock framework to get device clock rate") Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:36:31 -08:00
Dany Madden	a6f2fe5f10	ibmvnic: change IBMVNIC_MAX_IND_DESCS to 16 The supported indirect subcrq entries on Power8 is 16. Power9 supports 128. Redefined this value to 16 to minimize the driver from having to reset when migrating between Power9 and Power8. In our rx/tx performance testing, we found no performance difference between 16 and 128 at this time. Fixes: `f019fb6392` ("ibmvnic: Introduce indirect subordinate Command Response Queue buffer") Signed-off-by: Dany Madden <drt@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-12 17:17:05 -08:00
Moshe Shemesh	e1c3940c60	net/mlx5e: Check tunnel offload is required before setting SWP Check that tunnel offload is required before setting Software Parser offsets to get Geneve HW offload. In case of Geneve packet we check HW offload support of SWP in mlx5e_tunnel_features_check() and set features accordingly, this should be reflected in skb offload requested by the kernel and we should add the Software Parser offsets only if requested. Otherwise, in case HW doesn't support SWP for Geneve, data path will mistakenly try to offload Geneve SKBs with skb->encapsulation set, regardless of whether offload was requested or not on this specific SKB. Fixes: `e3cfc7e6b7` ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:16 -08:00
Oz Shlomo	a217313152	net/mlx5e: CT: manage the lifetime of the ct entry object The ct entry object is accessed by the ct add, del, stats and restore methods. In addition, it is referenced from several hash tables. The lifetime of the ct entry object was not managed which triggered race conditions as in the following kasan dump: [ 3374.973945] ================================================================== [ 3374.988552] BUG: KASAN: use-after-free in memcmp+0x4c/0x98 [ 3374.999590] Read of size 1 at addr ffff00036129ea55 by task ksoftirqd/1/15 [ 3375.016415] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G O 5.4.31+ #1 [ 3375.055301] Call trace: [ 3375.060214] dump_backtrace+0x0/0x238 [ 3375.067580] show_stack+0x24/0x30 [ 3375.074244] dump_stack+0xe0/0x118 [ 3375.081085] print_address_description.isra.9+0x74/0x3d0 [ 3375.091771] __kasan_report+0x198/0x1e8 [ 3375.099486] kasan_report+0xc/0x18 [ 3375.106324] __asan_load1+0x60/0x68 [ 3375.113338] memcmp+0x4c/0x98 [ 3375.119409] mlx5e_tc_ct_restore_flow+0x3a4/0x6f8 [mlx5_core] [ 3375.131073] mlx5e_rep_tc_update_skb+0x1d4/0x2f0 [mlx5_core] [ 3375.142553] mlx5e_handle_rx_cqe_rep+0x198/0x308 [mlx5_core] [ 3375.154034] mlx5e_poll_rx_cq+0x2a0/0x1060 [mlx5_core] [ 3375.164459] mlx5e_napi_poll+0x1d4/0xa78 [mlx5_core] [ 3375.174453] net_rx_action+0x28c/0x7a8 [ 3375.182004] __do_softirq+0x1b4/0x5d0 Manage the lifetime of the ct entry object by using synchronization mechanisms for concurrent access. Fixes: `ac991b48d4` ("net/mlx5e: CT: Offload established flows") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:15 -08:00
Shay Drory	edac23c2b3	net/mlx5: Disable devlink reload for lag devices Devlink reload can't be allowed on lag devices since reloading one lag device will cause traffic on the bond to get stuck. Users who wish to reload a lag device need to remove the device from the bond, and only then reload it. Fixes: `4383cfcc65` ("net/mlx5: Add devlink reload") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:15 -08:00
Shay Drory	7ab91f2b03	net/mlx5: Disallow RoCE on lag device In lag mode, setting roce enabled/disabled on a lag device has no effect, e.g. the bond device (roce/vf_lag) roce status remains unchanged. Therefore disable it and add an error message. Fixes: `cc9defcbb8` ("net/mlx5: Handle "enable_roce" devlink param") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:15 -08:00
Shay Drory	c70f8597fc	net/mlx5: Disallow RoCE on multi port slave device In dual port mode, setting roce enabled/disabled on the slave device has no effect, e.g. the slave device roce status remains unchanged. Therefore disable it and add an error message. Enabling or disabling roce on the master device affects both master and slave devices. Fixes: `cc9defcbb8` ("net/mlx5: Handle "enable_roce" devlink param") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:14 -08:00
Shay Drory	d89ddaae17	net/mlx5: Disable devlink reload for multi port slave device Devlink reload can't be allowed on a multi port slave device, because reload of slave device doesn't take effect. The right flow is to disable devlink reload for multi port slave device. Hence, disabling it in mlx5_core probing. Fixes: `4383cfcc65` ("net/mlx5: Add devlink reload") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:14 -08:00
Maxim Mikityanskiy	b850bbff96	net/mlx5e: kTLS, Use refcounts to free kTLS RX priv context wait_for_resync is unreliable - if it timeouts, priv_rx will be freed anyway. However, mlx5e_ktls_handle_get_psv_completion will be called sooner or later, leading to use-after-free. For example, it can happen if a CQ error happened, and ICOSQ stopped, but later on the queues are destroyed, and ICOSQ is flushed with mlx5e_free_icosq_descs. This patch converts the lifecycle of priv_rx to fully refcount-based, so that the struct won't be freed before the refcount goes to zero. Fixes: `0419d8c9d8` ("net/mlx5e: kTLS, Add kTLS RX resync support") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:13 -08:00
Maxim Mikityanskiy	ebf79b6be6	net/mlx5e: Fix CQ params of ICOSQ and async ICOSQ The commit mentioned below has split the parameters of ICOSQ and async ICOSQ, but it contained a typo: the CQ parameters were swapped for ICOSQ and async ICOSQ. Async ICOSQ is longer than the normal ICOSQ, and the CQ size must be the same as the size of the corresponding SQ, but due to this bug, the CQ of async ICOSQ was much shorter than async ICOSQ itself. It led to overflows of the CQ with such messages in dmesg, in particular, when running multiple kTLS-offloaded streams: mlx5_core 0000:08:00.0: cq_err_event_notifier:529:(pid 9422): CQ error on CQN 0x406, syndrome 0x1 mlx5_core 0000:08:00.0 eth2: mlx5e_cq_error_event: cqn=0x000406 event=0x04 This commit fixes the issue by using the corresponding parameters for ICOSQ and async ICOSQ. Fixes: `c293ac927f` ("net/mlx5e: Refactor build channel params") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:12 -08:00
Maxim Mikityanskiy	4d6e6b0c6d	net/mlx5e: Replace synchronize_rcu with synchronize_net The commit cited below switched from using napi_synchronize to synchronize_rcu to have a guarantee that it will finish in finite time. However, on average, synchronize_rcu takes more time than napi_synchronize. Given that it's called multiple times per channel on deactivation, it accumulates to a significant amount, which causes timeouts in some applications (for example, when using bonding with NetworkManager). This commit replaces synchronize_rcu with synchronize_net, which is faster when called under rtnl_lock, allowing to speed up the described flow. Fixes: `9c25a22dfb` ("net/mlx5e: Use synchronize_rcu to sync with NAPI") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:12 -08:00
Shay Drory	51d138c261	net/mlx5: Fix health error state handling Currently, when we discover a fatal error, we are queueing a work that will wait for a lock in order to enter the device to error state. Meanwhile, FW commands are still being processed, and gets timeouts. This can block the driver for few minutes before the work will manage to get the lock and enter to error state. Setting the device to error state before queueing health work, in order to avoid FW commands being processed while the work is waiting for the lock. Fixes: `c1d4d2e92a` ("net/mlx5: Avoid calling sleeping function by the health poll thread") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:11 -08:00
Maxim Mikityanskiy	65ba8594a2	net/mlx5e: Change interrupt moderation channel params also when channels are closed struct mlx5e_params contains fields ({rx,tx}_cq_moderation) that depend on two things: whether DIM is enabled and the state of a private flag (MLX5E_PFLAG_{RX,TX}_CQE_BASED_MODER). Whenever the DIM state changes, mlx5e_reset_{rx,tx}_moderation is called to update the fields, however, only if the channels are open. The flow where the channels are closed misses the required update of the fields. This commit moves the calls of mlx5e_reset_{rx,tx}_moderation, so that they run in both flows. Fixes: `ebeaf084ad` ("net/mlx5e: Properly set default values when disabling adaptive moderation") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:11 -08:00
Maxim Mikityanskiy	019f93bc4b	net/mlx5e: Don't change interrupt moderation params when DIM is enabled When mlx5e_ethtool_set_coalesce doesn't change DIM state (enabled/disabled), it calls mlx5e_set_priv_channels_coalesce unconditionally, which in turn invokes a firmware command to set interrupt moderation parameters. It shouldn't happen while DIM manages those parameters dynamically (it might even be happening at the same time). This patch fixes it by splitting mlx5e_set_priv_channels_coalesce into two functions (for RX and TX) and calling them only when DIM is disabled (for RX and TX respectively). Fixes: `cb3c7fd4f8` ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:10 -08:00
Raed Salem	e33f9f5f2d	net/mlx5e: Enable XDP for Connect-X IPsec capable devices This limitation was inherited by previous Innova (FPGA) IPsec implementation, it uses its private set of RQ handlers which does not support XDP, for Connect-X this is no longer true. Fix by keeping this limitation only for Innova IPsec supporting devices, as otherwise this limitation effectively wrongly blocks XDP for all future Connect-X devices for all flows even if IPsec offload is not used. Fixes: `2d64663cd5` ("net/mlx5: IPsec: Add HW crypto offload support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Alaa Hleihel <alaa@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:10 -08:00
Raed Salem	e4484d9df5	net/mlx5e: Enable striding RQ for Connect-X IPsec capable devices This limitation was inherited by previous Innova (FPGA) IPsec implementation, it uses its private set of RQ handlers which does not support striding rq, for Connect-X this is no longer true. Fix by keeping this limitation only for Innova IPsec supporting devices, as otherwise this limitation effectively wrongly blocks striding RQs for all future Connect-X devices for all flows even if IPsec offload is not used. Fixes: `2d64663cd5` ("net/mlx5: IPsec: Add HW crypto offload support") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:09 -08:00
Parav Pandit	0e22bfb7c0	net/mlx5e: E-switch, Fix rate calculation for overflow rate_bytes_ps is a 64-bit field. It is passed as a 32-bit field to apply_police_params(). Due to this, when the police rate is higher than 4Gbps, the 32-bit calculation ignores the carry. This results in incorrect rate configuration on the device. Fix it by performing a 64-bit calculation. Fixes: `fcb64c0f56` ("net/mlx5: E-Switch, add ingress rate support") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-02-11 18:50:09 -08:00
Ioana Ciornei	e12be9139c	dpaa2-eth: fix memory leak in XDP_REDIRECT If xdp_do_redirect() fails, the calling driver should handle recycling or freeing of the page associated with the frame. The dpaa2-eth driver didn't do either of them and just incremented a counter. Fix this by trying to DMA map back the page and recycle it or, if the mapping fails, just free it. Fixes: `d678be1dc1` ("dpaa2-eth: add XDP_REDIRECT support") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:15:15 -08:00
Tong Zhang	e185ea30df	enetc: auto select PHYLIB and MDIO_DEVRES FSL_ENETC_MDIO use symbols from PHYLIB (MDIO_BUS) and MDIO_DEVRES, however there are no dependency specified in Kconfig ERROR: modpost: "__mdiobus_register" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined! ERROR: modpost: "mdiobus_unregister" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined! ERROR: modpost: "devm_mdiobus_alloc_size" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined! add depends on MDIO_DEVRES && MDIO_BUS Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 18:12:47 -08:00
Nathan Rossi	8a28af7a3e	net: ethernet: aquantia: Handle error cleanup of start on open The aq_nic_start function can fail in a variety of cases which leaves the device in broken state. An example case where the start function fails is the request_threaded_irq which can be interrupted, resulting in a EINTR result. This can be manually triggered by bringing the link up (e.g. ip link set up) and triggering a SIGINT on the initiating process (e.g. Ctrl+C). This would put the device into a half configured state. Subsequently bringing the link up again would cause the napi_enable to BUG. In order to correctly clean up the failed attempt to start a device call aq_nic_stop. Signed-off-by: Nathan Rossi <nathan.rossi@digi.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 14:38:06 -08:00
Vasundhara Volam	db28b6c77f	bnxt_en: Fix devlink info's stored fw.psid version format. The running fw.psid version is in decimal format but the stored fw.psid is in hex format. This can mislead the user to reset the NIC to activate the stored version to become the running version. Fix it to display the stored fw.psid in decimal format. Fixes: `1388875b39` ("bnxt_en: Add stored FW version info to devlink info_get cb.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 14:36:22 -08:00
Edwin Peer	132e0b65dc	bnxt_en: reverse order of TX disable and carrier off A TX queue can potentially time out immediately after it is stopped if the last TX timestamp on that queue was more than 5 seconds ago and carrier is still up. Prevent these intermittent false TX timeouts by bringing down carrier first before calling netif_tx_disable(). Fixes: `c0c050c58d` ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 14:36:22 -08:00
Sukadev Bhattiprolu	d4083d3c00	ibmvnic: Set to CLOSED state even on error If set_link_state() fails for any reason, we still clean up the adapter state and cannot recover from a partial close anyway. So set the adapter to CLOSED state. That way if a new soft/hard reset is processed, the adapter will remain in the CLOSED state until the next ibmvnic_open(). Fixes: `01d9bd792d` ("ibmvnic: Reorganize device close") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Reported-by: Abdul Haleem <abdhalee@in.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-11 14:33:14 -08:00
Yufeng Mo	532cfc0df1	net: hns3: add a check for index in hclge_get_rss_key() The index is received from the VF; if it is used directly, an out-of-bounds access may result, so add a check for this index before using it in hclge_get_rss_key(). Fixes: `a638b1d8cc` ("net: hns3: fix get VF RSS issue") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-09 15:20:43 -08:00
Yufeng Mo	326334aad0	net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx() The tqp_index is received from the VF; if it is used directly, an out-of-bounds access may result, so add a check for this tqp_index before using it in hclge_get_ring_chain_from_mbx(). Fixes: `84e095d64e` ("net: hns3: Change PF to add ring-vect binding & resetQ to mailbox") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-09 15:20:43 -08:00
Yufeng Mo	67a69f84ca	net: hns3: add a check for queue_id in hclge_reset_vf_queue() The queue_id is received from the VF; if it is used directly, an out-of-bounds access may result, so add a check for this queue_id before using it in hclge_reset_vf_queue(). Fixes: `1a426f8b40` ("net: hns3: fix the VF queue reset flow error") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-09 15:20:43 -08:00
Vladimir Oltean	eb4733d7cf	net: dsa: felix: implement port flushing on .phylink_mac_link_down There are several issues which may be seen when the link goes down while forwarding traffic, all of which can be attributed to the fact that the port flushing procedure from the reference manual was not closely followed. With flow control enabled on both the ingress port and the egress port, it may happen when a link goes down that Ethernet packets are in flight. In flow control mode, frames are held back and not dropped. When there is enough traffic in flight (example: iperf3 TCP), then the ingress port might enter congestion and never exit that state. This is a problem, because it is the egress port's link that went down, and that has caused the inability of the ingress port to send packets to any other port. This is solved by flushing the egress port's queues when it goes down. There is also a problem when performing stream splitting for IEEE 802.1CB traffic (not yet upstream, but a sort of multicast, basically). There, if one port from the destination ports mask goes down, splitting the stream towards the other destinations will no longer be performed. This can be traced down to this line: ocelot_port_writel(ocelot_port, 0, DEV_MAC_ENA_CFG); which should have been instead, as per the reference manual: ocelot_port_rmwl(ocelot_port, 0, DEV_MAC_ENA_CFG_RX_ENA, DEV_MAC_ENA_CFG); Basically only DEV_MAC_ENA_CFG_RX_ENA should be disabled, but not DEV_MAC_ENA_CFG_TX_ENA - I don't have further insight into why that is the case, but apparently multicasting to several ports will cause issues if at least one of them doesn't have DEV_MAC_ENA_CFG_TX_ENA set. 
I am not sure what the state of the Ocelot VSC7514 driver is, but probably not as bad as Felix/Seville, since VSC7514 uses phylib and has the following in ocelot_adjust_link: if (!phydev->link) return; therefore the port is not really put down when the link is lost, unlike the DSA drivers which use .phylink_mac_link_down for that. Nonetheless, I put ocelot_port_flush() in the common ocelot.c because it needs to access some registers from drivers/net/ethernet/mscc/ocelot_rew.h which are not exported in include/soc/mscc/ and a bugfix patch should probably not move headers around. Fixes: `bdeced75b1` ("net: dsa: felix: Add PCS operations for PHYLINK") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-09 11:41:11 -08:00
Shay Agroskin	225353c070	net: ena: Update XDP verdict upon failure The verdict returned from ena_xdp_execute() is used to determine the fate of the RX buffer's page. In case of XDP Redirect/TX error the verdict should be set to XDP_ABORTED, otherwise the page won't be freed. Fixes: `a318c70ad1` ("net: ena: introduce XDP redirect implementation") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-06 15:07:29 -08:00
Sukadev Bhattiprolu	ef66a1eace	ibmvnic: Clear failover_pending if unable to schedule Normally we clear the failover_pending flag when processing the reset. But if we are unable to schedule a failover reset we must clear the flag ourselves. We could fail to schedule the reset if we are in PROBING state (eg: when booting via kexec) or because we could not allocate memory. Thanks to Cris Forno for helping isolate the problem and for testing. Fixes: `1d85049374` ("powerpc/vnic: Extend "failover pending" window") Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com> Tested-by: Cristobal Forno <cforno12@linux.ibm.com> Link: https://lore.kernel.org/r/20210203050802.680772-1-sukadev@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-06 10:36:22 -08:00
Mohammad Athari Bin Ismail	f317e2ea8c	net: stmmac: set TxQ mode back to DCB after disabling CBS When CBS is disabled, the mode_to_use parameter is not updated even though the operation mode of the Tx Queue is changed to Data Centre Bridging (DCB). Therefore, when the tc_setup_cbs() function is called to re-enable CBS, the operation mode of the Tx Queue remains at DCB, which causes CBS to fail to work. This patch updates the value of the mode_to_use parameter to MTL_QUEUE_DCB after the operation mode of the Tx Queue is changed to DCB in the stmmac_dma_qmode() callback function. Fixes: `1f705bc61a` ("net: stmmac: Add support for CBS QDISC") Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: Song, Yoong Siang <yoong.siang.song@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Link: https://lore.kernel.org/r/1612447396-20351-1-git-send-email-yoong.siang.song@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-05 20:00:19 -08:00
Camelia Groza	0a9946cca1	dpaa_eth: try to move the data in place for the A050385 erratum The XDP frame's headroom might be large enough to accommodate the xdpf backpointer as well as shifting the data to an aligned address. Try this first before resorting to allocating a new buffer and copying the data. Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-05 19:58:34 -08:00
Camelia Groza	c2b0e8455e	dpaa_eth: reduce data alignment requirements for the A050385 erratum The 256 byte data alignment is required for preventing DMA transaction splits when crossing 4K page boundaries. Since XDP deals only with page sized buffers or less, this restriction isn't needed. Instead, the data only needs to be aligned to 64 bytes to prevent DMA transaction splits. These lessened restrictions can increase performance by widening the pool of permitted data alignments and preventing unnecessary realignments. Fixes: `ae680bcbd0` ("dpaa_eth: implement the A050385 erratum workaround for XDP") Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-05 19:58:34 -08:00
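
The felix port-flushing commit (eb4733d7cf) turns on the difference between writing 0 to DEV_MAC_ENA_CFG and a read-modify-write that clears only DEV_MAC_ENA_CFG_RX_ENA. The following is a minimal userspace sketch of that difference, not the driver code: the register is modelled as a plain variable, the helper names (`port_write`, `port_rmw`, `disable_with_write`, `disable_with_rmw`) are hypothetical, and the bit positions are assumed for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MAC_ENA_CFG_TX_ENA (1u << 0)
#define MAC_ENA_CFG_RX_ENA (1u << 4)

/* Stand-in for one port's DEV_MAC_ENA_CFG register. */
struct fake_port {
    uint32_t mac_ena_cfg;
};

/* Plain write: every bit in the register is replaced by val. */
void port_write(struct fake_port *p, uint32_t val)
{
    p->mac_ena_cfg = val;
}

/* Read-modify-write: only the bits selected by mask are replaced by val. */
void port_rmw(struct fake_port *p, uint32_t val, uint32_t mask)
{
    assert((val & ~mask) == 0); /* val must not set bits outside mask */
    p->mac_ena_cfg = (p->mac_ena_cfg & ~mask) | val;
}

/* Behaviour described as the bug: RX and TX are both disabled. */
uint32_t disable_with_write(uint32_t initial)
{
    struct fake_port p = { .mac_ena_cfg = initial };

    port_write(&p, 0);
    return p.mac_ena_cfg;
}

/* Behaviour described as the fix: RX is disabled, TX is left untouched. */
uint32_t disable_with_rmw(uint32_t initial)
{
    struct fake_port p = { .mac_ena_cfg = initial };

    port_rmw(&p, 0, MAC_ENA_CFG_RX_ENA);
    return p.mac_ena_cfg;
}
```

Starting from a register with both enable bits set, `disable_with_write` leaves nothing set, while `disable_with_rmw` leaves the TX enable bit in place, which is the state the commit message says the reference manual calls for.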

36031 Commits