linux

Author	SHA1	Message	Date
Linus Torvalds	909b447dcc	Networking fixes for 5.11-rc6, including fixes from can, xfrm, wireless, wireless-drivers and netfilter trees. Nothing scary, Intel WiFi-related fixes seemed most notable to the users. Current release - regressions: - dsa: microchip: ksz8795: fix KSZ8794 port map again to program the CPU port correctly Current release - new code bugs: - iwlwifi: pcie: reschedule in long-running memory reads Previous releases - regressions: - iwlwifi: dbg: don't try to overwrite read-only FW data - iwlwifi: provide gso_type to GSO packets - octeontx2: make sure the buffer is 128 byte aligned - tcp: make TCP_USER_TIMEOUT accurate for zero window probes - xfrm: fix wraparound in xfrm_policy_addr_delta() - xfrm: fix oops in xfrm_replay_advance_bmp due to a race between CPUs in presence of packet reorder - tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN - wext: fix NULL-ptr-dereference with cfg80211's lack of commit() Previous releases - always broken: - igc: fix link speed advertising - stmmac: configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing - team: protect features update by RCU to avoid deadlock - xfrm: fix disable_xfrm sysctl when used on xfrm interfaces themselves - fec: fix temporary RMII clock reset on link up - can: dev: prevent potential information leak in can_fill_info() Misc: - mrp: fix bad packing of MRP test packet structures - uapi: fix big endian definition of ipv6_rpl_sr_hdr - add David Ahern to IPv4/IPv6 maintainers Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmATRs4ACgkQMUZtbf5S IrtOfQ//Vmn1WprrwLPf6/uOuBN0RAKHC+64IRIw2ahDuiB1QQV0c3ALRd42Xp8n qnoDMB/mUWdF/KjjJEKvwYyBuwBeQWLcpgTXi1HvvhxM13PVHjvyIp6hTAYYj+m4 KyWWzQZwezz0zKQ3wXFdZV4JuefXEgXvMx65o8nk+TsutHn6WK/E6ZnWTexoZ0pa 5Lab149mtoCdSpT3gr2x1aTqd9KYWaxfarYOUD1GY58BQyDFl4wj10MV3oE7xWPj /MKnSBvPx52ajbb+rUVhfFjBN1BmEjdze7cBMncJc5H+0X38R23ZaAlP3gecGaac hZ5C2wnSSvRR8KIvSEwbCArlpuyU+exacZXZ0vS6sfgqISKqoPv8erWvpxtLil3v YfwZVNPYG9RBwbnDVw1gLQIFn3lUqLhIPnJ8J2Ue6KUm7ur4fO566RjyPU3gkPdp 5Zj3Eh7hsB2EqOy4RdwnoI0QboWmlq9+wT11HCXPFyJ077JzVU0FzMSvJr4dgVSI 3D3ckmw+RSej4ib6G4xjpq1tPCFzdf9zlFoUPomRFTKgfJFaky5pEb/22C3bztp1 43fsv3PiwlQtoYP3pfQsRj+r6DikYwDL7A3lskWohIZXviY2wErKWViUcIXr5ULE BxYQq0NYMl4TgDkn525U9EFwVgJAvPAedhYxF7VKn3eHNODqWBo= =dwFD -----END PGP SIGNATURE----- Merge tag 'net-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes including fixes from can, xfrm, wireless, wireless-drivers and netfilter trees. Nothing scary, Intel WiFi-related fixes seemed most notable to the users. Current release - regressions: - dsa: microchip: ksz8795: fix KSZ8794 port map again to program the CPU port correctly Current release - new code bugs: - iwlwifi: pcie: reschedule in long-running memory reads Previous releases - regressions: - iwlwifi: dbg: don't try to overwrite read-only FW data - iwlwifi: provide gso_type to GSO packets - octeontx2: make sure the buffer is 128 byte aligned - tcp: make TCP_USER_TIMEOUT accurate for zero window probes - xfrm: fix wraparound in xfrm_policy_addr_delta() - xfrm: fix oops in xfrm_replay_advance_bmp due to a race between CPUs in presence of packet reorder - tcp: fix TLP timer not set when CA_STATE changes from DISORDER to OPEN - wext: fix NULL-ptr-dereference with cfg80211's lack of commit() Previous releases - always broken: - igc: fix link speed advertising - stmmac: configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing - team: protect features update by RCU to avoid deadlock - xfrm: fix disable_xfrm sysctl when used on xfrm interfaces themselves - fec: fix temporary RMII clock reset on link up - can: dev: prevent potential information leak in can_fill_info() Misc: - mrp: fix bad packing of MRP test packet structures - uapi: fix big endian definition of ipv6_rpl_sr_hdr - add David Ahern to IPv4/IPv6 maintainers" * tag 'net-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits) rxrpc: Fix memory leak in rxrpc_lookup_local mlxsw: spectrum_span: Do not overwrite policer configuration selftests: forwarding: Specify interface when invoking mausezahn stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing net: usb: cdc_ether: added support for Thales Cinterion PLSx3 modem family. ibmvnic: Ensure that CRQ entry read are correctly ordered MAINTAINERS: add missing header for bonding net: decnet: fix netdev refcount leaking on error path net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP can: dev: prevent potential information leak in can_fill_info() net: fec: Fix temporary RMII clock reset on link up net: lapb: Add locking to the lapb module team: protect features update by RCU to avoid deadlock MAINTAINERS: add David Ahern to IPv4/IPv6 maintainers net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset net/mlx5e: Revert parameters on errors when changing trust state without reset net/mlx5e: Correctly handle changing the number of queues when the interface is down net/mlx5e: Fix CT rule + encap slow path offload and deletion net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled ...	2021-01-28 15:24:43 -08:00
Ido Schimmel	b6f6881aaf	mlxsw: spectrum_span: Do not overwrite policer configuration The purpose of the delayed work in the SPAN module is to potentially update the destination port and various encapsulation parameters of SPAN agents that point to a VLAN device or a GRE tap. The destination port can change following the insertion of a new route, for example. SPAN agents that point to a physical port or the CPU port are static and never change throughout the lifetime of the SPAN agent. Therefore, skip over them in the delayed work. This fixes an issue where the delayed work overwrites the policer that was set on a SPAN agent pointing to the CPU. Modifying the delayed work to inherit the original policer configuration is error-prone, as the same will be needed for any new parameter. Fixes: `4039504e6a` ("mlxsw: spectrum_span: Allow setting policer on a SPAN agent") Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-28 13:09:01 -08:00
Paul Blakey	e2194a1744	net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable If a non nat tuple entry is inserted just to the regular tuples rhashtable (ct_tuples_ht) and not to natted tuples rhashtable (ct_nat_tuples_ht). Commit `bc562be967` ("net/mlx5e: CT: Save ct entries tuples in hashtables") mixed up the return labels and names sot that on cleanup or failure we still try to remove for the natted tuples rhashtable. Fix that by correctly checking if a natted tuples insertion before removing it. While here make it more readable. Fixes: `bc562be967` ("net/mlx5e: CT: Save ct entries tuples in hashtables") Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:39:04 -08:00
Maxim Mikityanskiy	8355060f5e	net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset Sometimes, channel params are changed without recreating the channels. It happens in two basic cases: when the channels are closed, and when the parameter being changed doesn't affect how channels are configured. Such changes invoke a hardware command that might fail. The whole operation should be reverted in such cases, but the code that restores the parameters' values in the driver was missing. This commit adds this handling. Fixes: `2e20a15120` ("net/mlx5e: Fail safe mtu and lro setting") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:39:01 -08:00
Maxim Mikityanskiy	912c9b5fcc	net/mlx5e: Revert parameters on errors when changing trust state without reset Trust state may be changed without recreating the channels. It happens when the channels are closed, and when channel parameters (min inline mode) stay the same after changing the trust state. Changing the trust state is a hardware command that may fail. The current code didn't restore the channel parameters to their old values if an error happened and the channels were closed. This commit adds handling for this case. Fixes: `6e0504c698` ("net/mlx5e: Change inline mode correctly when changing trust state") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:58 -08:00
Maxim Mikityanskiy	57ac4a31c4	net/mlx5e: Correctly handle changing the number of queues when the interface is down This commit addresses two issues related to changing the number of queues when the channels are closed: 1. Missing call to mlx5e_num_channels_changed to update real_num_tx_queues when the number of TCs is changed. 2. When mlx5e_num_channels_changed returns an error, the channel parameters must be reverted. Two Fixes: tags correspond to the first commits where these two issues were introduced. Fixes: `3909a12e79` ("net/mlx5e: Fix configuration of XPS cpumasks and netdev queues in corner cases") Fixes: `fa3748775b` ("net/mlx5e: Handle errors from netif_set_real_num_{tx,rx}_queues") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:56 -08:00
Paul Blakey	89e3946758	net/mlx5e: Fix CT rule + encap slow path offload and deletion Currently, if a neighbour isn't valid when offloading tunnel encap rules, we offload the original match and replace the original action with "goto slow path" action. For this we use a temporary flow attribute based on the original flow attribute and then change the action. Flow flags, which among those is the CT flag, are still shared for the slow path rule offload, so we end up parsing this flow as a CT + goto slow path rule. Besides being unnecessary, CT action offload saves extra information in the passed flow attribute, such as created ct_flow and mod_hdr, which is lost onces the temporary flow attribute is freed. When a neigh is updated and is valid, we offload the original CT rule with original CT action, which again creates a ct_flow and mod_hdr and saves it in the flow's original attribute. Then we delete the slow path rule with a temporary flow attribute based on original updated flow attribute, and we free the relevant ct_flow and mod_hdr. Then when tc deletes this flow, we try to free the ct_flow and mod_hdr on the flow's attribute again. To fix the issue, skip all furture proccesing (CT/Sample/Split rules) in offload/unoffload of slow path rules. Call trace: [ 758.850525] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000218 [ 758.952987] Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 758.964170] Modules linked in: act_csum(E) act_pedit(E) act_tunnel_key(E) act_ct(E) nf_flow_table(E) xt_nat(E) ip6table_filter(E) ip6table_nat(E) xt_comment(E) ip6_tables(E) xt_conntrack(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xt_addrtype(E) iptable_filter(E) iptable_nat(E) bpfilter(E) br_netfilter(E) bridge(E) stp(E) llc(E) xfrm_user(E) overlay(E) act_mirred(E) act_skbedit(E) rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) esp6_offload(E) esp6(E) esp4_offload(E) esp4(E) xfrm_algo(E) mlx5_ib(OE) ib_uverbs(OE) geneve(E) ip6_udp_tunnel(E) udp_tunnel(E) nfnetlink_cttimeout(E) nfnetlink(E) mlx5_core(OE) act_gact(E) cls_flower(E) sch_ingress(E) openvswitch(E) nsh(E) nf_conncount(E) nf_nat(E) mlxfw(OE) psample(E) nf_conntrack(E) nf_defrag_ipv4(E) vfio_mdev(E) mdev(E) ib_core(OE) mlx_compat(OE) crct10dif_ce(E) uio_pdrv_genirq(E) uio(E) i2c_mlx(E) mlxbf_pmc(E) sbsa_gwdt(E) mlxbf_gige(E) gpio_mlxbf2(E) mlxbf_pka(E) mlx_trio(E) mlx_bootctl(E) bluefield_edac(E) knem(O) [ 758.964225] ip_tables(E) mlxbf_tmfifo(E) ipv6(E) crc_ccitt(E) nf_defrag_ipv6(E) [ 759.154186] CPU: 5 PID: 122 Comm: kworker/u16:1 Tainted: G OE 5.4.60-mlnx.52.gde81e85 #1 [ 759.172870] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.5.0-2-gc1b5d64 Jan 4 2021 [ 759.195466] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [ 759.207344] pstate: a0000005 (NzCv daif -PAN -UAO) [ 759.217003] pc : mlx5_del_flow_rules+0x5c/0x160 [mlx5_core] [ 759.228229] lr : mlx5_del_flow_rules+0x34/0x160 [mlx5_core] [ 759.405858] Call trace: [ 759.410804] mlx5_del_flow_rules+0x5c/0x160 [mlx5_core] [ 759.421337] __mlx5_eswitch_del_rule.isra.43+0x5c/0x1c8 [mlx5_core] [ 759.433963] mlx5_eswitch_del_offloaded_rule_ct+0x34/0x40 [mlx5_core] [ 759.446942] mlx5_tc_rule_delete_ct+0x68/0x74 [mlx5_core] [ 759.457821] mlx5_tc_ct_delete_flow+0x160/0x21c [mlx5_core] [ 759.469051] mlx5e_tc_unoffload_fdb_rules+0x158/0x168 [mlx5_core] [ 759.481325] mlx5e_tc_encap_flows_del+0x140/0x26c [mlx5_core] [ 759.492901] mlx5e_rep_update_flows+0x11c/0x1ec [mlx5_core] [ 759.504127] mlx5e_rep_neigh_update+0x160/0x200 [mlx5_core] [ 759.515314] process_one_work+0x178/0x400 [ 759.523350] worker_thread+0x58/0x3e8 [ 759.530685] kthread+0x100/0x12c [ 759.537152] ret_from_fork+0x10/0x18 [ 759.544320] Code: 97ffef55 51000673 3100067f 54ffff41 (b9421ab3) [ 759.556548] ---[ end trace fab818bb1085832d ]--- Fixes: `4c3844d9e9` ("net/mlx5e: CT: Introduce connection tracking") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:53 -08:00
Maor Dickman	156878d0e6	net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled The cited commit introduce new CONFIG_MLX5_CLS_ACT kconfig variable to control compilation of TC hardware offloads implementation. When this configuration is disabled the driver is still wrongly reports in ethtool that hw-tc-offload is supported. Fixed by reporting hw-tc-offload is supported only when CONFIG_MLX5_CLS_ACT is enabled. Fixes: `d956873f90` ("net/mlx5e: Introduce kconfig var for TC support") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:51 -08:00
Daniel Jurgens	0aa128475d	net/mlx5: Maintain separate page trees for ECPF and PF functions Pages for the host PF and ECPF were stored in the same tree, so the ECPF pages were being freed along with the host PF's when the host driver unloaded. Combine the function ID and ECPF flag to use as an index into the x-array containing the trees to get a different tree for the host PF and ECPF. Fixes: `c6168161f6` ("net/mlx5: Add support for release all pages event") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:48 -08:00
Maxim Mikityanskiy	45c9a30835	net/mlx5e: Fix IPSEC stats When IPSEC offload isn't active, the number of stats is not zero, but the strings are not filled, leading to exposing stats with empty names. Fix this by using the same condition for NUM_STATS and FILL_STRS. Fixes: `0aab3e1b04` ("net/mlx5e: IPSec, Expose IPsec HW stat only for supporting HW") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:46 -08:00
Maor Dickman	48470a90a4	net/mlx5e: Reduce tc unsupported key print level "Unsupported key used:" appears in kernel log when flows with unsupported key are used, arp fields for example. OpenVSwitch was changed to match on arp fields by default that caused this warning to appear in kernel log for every arp rule, which can be a lot. Fix by lowering print level from warning to debug. Fixes: `e3a2b7ed01` ("net/mlx5e: Support offload cls_flower with drop action") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:43 -08:00
Pan Bian	258ed19f07	net/mlx5e: free page before return Instead of directly return, goto the error handling label to free allocated page. Fixes: `5f29458b77` ("net/mlx5e: Support dump callback in TX reporter") Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:41 -08:00
Parav Pandit	1fe3e3166b	net/mlx5e: E-switch, Fix rate calculation for overflow rate_bytes_ps is a 64-bit field. It passed as 32-bit field to apply_police_params(). Due to this when police rate is higher than 4Gbps, 32-bit calculation ignores the carry. This results in incorrect rate configurationn the device. Fix it by performing 64-bit calculation. Fixes: `fcb64c0f56` ("net/mlx5: E-Switch, add ingress rate support") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:38 -08:00
Roi Dayan	487c6ef81e	net/mlx5: Fix memory leak on flow table creation error flow When we create the ft object we also init rhltable in ft->fgs_hash. So in error flow before kfree of ft we need to destroy that rhltable. Fixes: `693c6883bb` ("net/mlx5: Add hash table for flow groups in flow table") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-26 15:38:36 -08:00
Parav Pandit	de641d74fb	Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion" This reverts commit `fbdd0049d9`. Due to commit in fixes tag, netdevice events were received only in one net namespace of mlx5_core_dev. Due to this when netdevice events arrive in net namespace other than net namespace of mlx5_core_dev, they are missed. This results in empty GID table due to RDMA device being detached from its net device. Hence, revert back to receive netdevice events in all net namespaces to restore back RDMA functionality in non init_net net namespace. The deadlock will have to be addressed in another patch. Fixes: `fbdd0049d9` ("RDMA/mlx5: Fix devlink deadlock on net namespace deletion") Link: https://lore.kernel.org/r/20210117092633.10690-1-leon@kernel.org Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-19 20:22:59 -04:00
Vadim Pasternak	b06ca3d5a4	mlxsw: core: Increase critical threshold for ASIC thermal zone Increase critical threshold for ASIC thermal zone from 110C to 140C according to the system hardware requirements. All the supported ASICs (Spectrum-1, Spectrum-2, Spectrum-3) could be still operational with ASIC temperature below 140C. With the old critical threshold value system can perform unjustified shutdown. All the systems equipped with the above ASICs implement thermal protection mechanism at firmware level and firmware could decide to perform system thermal shutdown in case the temperature is below 140C. So with the new threshold system will not meltdown, while thermal operating range will be aligned with hardware abilities. Fixes: `41e760841d` ("mlxsw: core: Replace thermal temperature trips with defines") Fixes: `a50c1e3565` ("mlxsw: core: Implement thermal zone") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 16:25:10 -08:00
Vadim Pasternak	57726ebe27	mlxsw: core: Add validation of transceiver temperature thresholds Validate thresholds to avoid a single failure due to some transceiver unreliability. Ignore the last readouts in case warning temperature is above alarm temperature, since it can cause unexpected thermal shutdown. Stay with the previous values and refresh threshold within the next iteration. This is a rare scenario, but it was observed at a customer site. Fixes: `6a79507cfe` ("mlxsw: core: Extend thermal module with per QSFP module thermal zones") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-09 16:25:10 -08:00
Dinghao Liu	5b0bb12c58	net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups When mlx5_create_flow_group() fails, ft->g should be freed just like when kvzalloc() fails. The caller of mlx5e_create_l2_table_groups() does not catch this issue on failure, which leads to memleak. Fixes: `33cfaaa8f3` ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:51 -08:00
Dinghao Liu	7a6eb072a9	net/mlx5e: Fix two double free cases mlx5e_create_ttc_table_groups() frees ft->g on failure of kvzalloc(), but such failure will be caught by its caller in mlx5e_create_ttc_table() and ft->g will be freed again in mlx5e_destroy_flow_table(). The same issue also occurs in mlx5e_create_ttc_table_groups(). Set ft->g to NULL after kfree() to avoid double free. Fixes: `7b3722fa9e` ("net/mlx5e: Support RSS for GRE tunneled packets") Fixes: `33cfaaa8f3` ("net/mlx5e: Split the main flow steering table") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Leon Romanovsky	4d8be21112	net/mlx5: Release devlink object if adev fails Add missed freeing previously allocated devlink object. Fixes: `a925b5e309` ("net/mlx5: Register mlx5 devices to auxiliary virtual bus") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Aya Levin	b1c0aca3d3	net/mlx5e: ethtool, Fix restriction of autoneg with 56G Prior to this patch, configuring speed to 50G with autoneg off over devices supporting 50G per lane failed. Support for 50G per lane introduced a new set of link-modes, on which driver always performed a speed validation as if only legacy link-modes were configured. Fix driver speed validation to force setting autoneg over 56G only if in legacy link-mode. Fixes: `3d7cadae51` ("net/mlx5e: ethtool, Fix analysis of speed setting") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Maor Dickman	e13ed0ac06	net/mlx5e: In skb build skip setting mark in switchdev mode sop_drop_qpn field in the cqe is used by two features, in SWITCHDEV mode to restore the chain id in case of a miss and in LEGACY mode to support skbedit mark action. In build RX skb, the skb mark field is set regardless of the configured mode which cause a corruption of the mark field in case of switchdev mode. Fix by overriding the mark value back to 0 in the representor tc update skb flow. Fixes: `8f1e0b97cc` ("net/mlx5: E-Switch, Mark miss packets with new chain id mapping") Signed-off-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:50 -08:00
Alaa Hleihel	25c904b59a	net/mlx5: E-Switch, fix changing vf VLANID Adding vf VLANID for the first time, or after having cleared previously defined VLANID works fine, however, attempting to change an existing vf VLANID clears the rules on the firmware, but does not add new rules for the new vf VLANID. Fix this by changing the logic in function esw_acl_egress_lgcy_setup() so that it will always configure egress rules. Fixes: `ea651a86d4` ("net/mlx5: E-Switch, Refactor eswitch egress acl codes") Signed-off-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:49 -08:00
Moshe Shemesh	b544011f0e	net/mlx5e: Fix SWP offsets when vlan inserted by driver In case WQE includes inline header the vlan is inserted by driver even if vlan offload is set. On geneve over vlan interface where software parser is used the SWP offsets should be updated according to the added vlan. Fixes: `e3cfc7e6b7` ("net/mlx5e: TX, Add geneve tunnel stateless offload support") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:49 -08:00
Oz Shlomo	eed38eeee7	net/mlx5e: CT: Use per flow counter when CT flow accounting is enabled Connection counters may be shared for both directions when the counter is used for connection aging purposes. However, if TC flow accounting is enabled then a unique counter is required per direction. Instantiate a unique counter per direction if the conntrack accounting extension is enabled. Use a shared counter when the connection accounting extension is disabled. Fixes: `1edae2335a` ("net/mlx5e: CT: Use the same counter for both directions") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:49 -08:00
Mark Zhang	0f2dcade69	net/mlx5: Use port_num 1 instead of 0 when delete a RoCE address In multi-port mode, FW reports syndrome 0x2ea48 (invalid vhca_port_number) if the port_num is not 1 or 2. Fixes: `80f09dfc23` ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:48 -08:00
Aya Levin	9c9be85f6b	net/mlx5e: Add missing capability check for uplink follow Expose firmware indication that it supports setting eswitch uplink state to follow (follow the physical link). Condition setting the eswitch uplink admin-state with this capability bit. Older FW may not support the uplink state setting. Fixes: `7d0314b11c` ("net/mlx5e: Modify uplink state on interface up/down") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:48 -08:00
Mark Zhang	abf8ef953a	net/mlx5: Check if lag is supported before creating one This patch fixes a memleak issue by preventing to create a lag and add PFs if lag is not supported. comm “python3”, pid 349349, jiffies 4296985507 (age 1446.976s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ……………. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ……………. backtrace: [<000000005b216ae7>] mlx5_lag_add+0x1d5/0×3f0 [mlx5_core] [<000000000445aa55>] mlx5e_nic_enable+0x66/0×1b0 [mlx5_core] [<00000000c56734c3>] mlx5e_attach_netdev+0x16e/0×200 [mlx5_core] [<0000000030439d1f>] mlx5e_attach+0x5c/0×90 [mlx5_core] [<0000000018fd8615>] mlx5e_add+0x1a4/0×410 [mlx5_core] [<0000000068bc504b>] mlx5_add_device+0x72/0×120 [mlx5_core] [<000000009fce51f9>] mlx5_register_device+0x77/0xb0 [mlx5_core] [<00000000d0d81ff3>] mlx5_load_one+0xc58/0×1eb0 [mlx5_core] [<0000000045077adc>] init_one+0x3ea/0×920 [mlx5_core] [<0000000043287674>] pci_device_probe+0xcd/0×150 [<00000000dafd3279>] really_probe+0x1c9/0×4b0 [<00000000f06bdd84>] driver_probe_device+0x5d/0×140 [<00000000e3d508b6>] device_driver_attach+0x4f/0×60 [<0000000084fba0f0>] bind_store+0xbf/0×120 [<00000000bf6622b3>] kernfs_fop_write+0x114/0×1b0 Fixes: `9b412cc35f` ("net/mlx5e: Add LAG warning if bond slave is not lag master") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-01-07 12:22:48 -08:00
Linus Torvalds	3913d00ac5	A treewide cleanup of interrupt descriptor (ab)use with all sorts of racy accesses, inefficient and disfunctional code. The goal is to remove the export of irq_to_desc() to prevent these things from creeping up again. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/ifgsTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoYm6EACAo8sObkuY3oWLagtGj1KHxon53oGZ VfDw2LYKM+rgJjDWdiyocyxQU5gtm6loWCrIHjH2adRQ4EisB5r8hfI8NZHxNMyq 8khUi822NRBfFN6SCpO8eW9o95euscNQwCzqi7gV9/U/BAKoDoSEYzS4y0YmJlup mhoikkrFiBuFXplWI0gbP4ihb8S/to2+kTL6o7eBoJY9+fSXIFR3erZ6f3fLjYZG CQUUysTywdDhLeDkC9vaesXwgdl2XnaPRwcQqmK8Ez0QYNYpawyILUHLD75cIHDu bHdK2ZoDv/wtad/3BoGTK3+wChz20a/4/IAnBIUVgmnSLsPtW8zNEOPWNNc0aGg+ rtafi5bvJ1lMoSZhkjLWQDOGU6vFaXl9NkC2fpF+dg1skFMT2CyLC8LD/ekmocon zHAPBva9j3m2A80hI3dUH9azo/IOl1GHG8ccM6SCxY3S/9vWSQChNhQDLe25xBEO VtKZS7DYFCRiL8mIy9GgwZWof8Vy2iMua2ML+W9a3mC9u3CqSLbCFmLMT/dDoXl1 oHnMdAHk1DRatA8pJAz83C75RxbAS2riGEqtqLEQ6OaNXn6h0oXCanJX9jdKYDBh z6ijWayPSRMVktN6FDINsVNFe95N4GwYcGPfagIMqyMMhmJDic6apEzEo7iA76lk cko28MDqTIK4UQ== =BXv+ -----END PGP SIGNATURE----- Merge tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "This is the second attempt after the first one failed miserably and got zapped to unblock the rest of the interrupt related patches. A treewide cleanup of interrupt descriptor (ab)use with all sorts of racy accesses, inefficient and disfunctional code. The goal is to remove the export of irq_to_desc() to prevent these things from creeping up again" * tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) genirq: Restrict export of irq_to_desc() xen/events: Implement irq distribution xen/events: Reduce irq_info:: Spurious_cnt storage size xen/events: Only force affinity mask for percpu interrupts xen/events: Use immediate affinity setting xen/events: Remove disfunct affinity spreading xen/events: Remove unused bind_evtchn_to_irq_lateeoi() net/mlx5: Use effective interrupt affinity net/mlx5: Replace irq_to_desc() abuse net/mlx4: Use effective interrupt affinity net/mlx4: Replace irq_to_desc() abuse PCI: mobiveil: Use irq_data_get_irq_chip_data() PCI: xilinx-nwl: Use irq_data_get_irq_chip_data() NTB/msi: Use irq_has_action() mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc pinctrl: nomadik: Use irq_has_action() drm/i915/pmu: Replace open coded kstat_irqs() copy drm/i915/lpe_audio: Remove pointless irq_to_desc() usage s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt() parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts() ...	2020-12-24 13:50:23 -08:00
Thomas Gleixner	ec7b37b6f0	net/mlx5: Use effective interrupt affinity Using the interrupt affinity mask for checking locality is not really working well on architectures which support effective affinity masks. The affinity mask is either the system wide default or set by user space, but the architecture can or even must reduce the mask to the effective set, which means that checking the affinity mask itself does not really tell about the actual target CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20201210194044.876342330@linutronix.de	2020-12-15 16:19:33 +01:00
Thomas Gleixner	6e745db4dd	net/mlx5: Replace irq_to_desc() abuse No driver has any business with the internals of an interrupt descriptor. Storing a pointer to it just to use yet another helper at the actual usage site to retrieve the affinity mask is creative at best. Just because C does not allow encapsulation does not mean that the kernel has no limits. Retrieve a pointer to the affinity mask itself and use that. It's still using an interface which is usually not for random drivers, but definitely less hideous than the previous hack. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Saeed Mahameed <saeedm@nvidia.com> Link: https://lore.kernel.org/r/20201210194044.769458162@linutronix.de	2020-12-15 16:19:33 +01:00
Thomas Gleixner	197d237077	net/mlx4: Use effective interrupt affinity Using the interrupt affinity mask for checking locality is not really working well on architectures which support effective affinity masks. The affinity mask is either the system wide default or set by user space, but the architecture can or even must reduce the mask to the effective set, which means that checking the affinity mask itself does not really tell about the actual target CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20201210194044.672935978@linutronix.de	2020-12-15 16:19:33 +01:00
Thomas Gleixner	80a62deedf	net/mlx4: Replace irq_to_desc() abuse No driver has any business with the internals of an interrupt descriptor. Storing a pointer to it just to use yet another helper at the actual usage site to retrieve the affinity mask is creative at best. Just because C does not allow encapsulation does not mean that the kernel has no limits. Retrieve a pointer to the affinity mask itself and use that. It's still using an interface which is usually not for random drivers, but definitely less hideous than the previous hack. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20201210194044.580936243@linutronix.de	2020-12-15 16:19:33 +01:00
Jiri Pirko	88a31b18b6	mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router In case the eXtended mezzanine is present on the system, use it for IPv4 router offload. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:55 -08:00
Jiri Pirko	dffd566136	mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3 Set a profile option to instruct FW to use 1/2 of KVH for XLT cache, not the whole one. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:55 -08:00
Jiri Pirko	2dfad87a24	mlxsw: spectrum_router_xm: Introduce basic XM cache flushing Upon route insertion and removal, it is needed to flush possibly cached entries from the XM cache. Extend XM op context to carry information needed for the flush. Implement the flush in delayed work since for HW design reasons there is a need to wait 50usec before the flush can be done. If during this time comes the same flush request, consolidate it to the first one. Implement this queued flushes by a hashtable. v2: * Fix GENMASK() high bit Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:55 -08:00
Jiri Pirko	069254662b	mlxsw: reg: Add Router LPM Cache Enable Register The RLPMCE allows disabling the LPM cache. Can be changed on the fly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:55 -08:00
Jiri Pirko	edb47f3d23	mlxsw: reg: Add Router LPM Cache ML Delete Register The RLCMLD register is used to bulk delete the XLT-LPM cache ML entries. This can be used by SW when L is increased or decreased, thus need to remove entries with old ML values. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	54ff9dbbb9	mlxsw: spectrum_router_xm: Implement L-value tracking for M-index There is a table that assigns L-value per M-index. The L is always the biggest from the currently inserted prefixes. Setup a hashtable to track the M-index information and the prefixes that are related to it. Ensure the L-value is always correctly set. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	e35e804648	mlxsw: reg: Add XM Router M Table Register The XRMT configures the M-Table for the XLT-LPM. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	e0bc244dcf	mlxsw: spectrum_router: Introduce per-ASIC XM initialization During the router init flow, call into XM code and initialize couple of items needed for XM functionality: 1) Query the capabilities and sizes. Check the XM device id. 2) Initialize the M-value. Note that currently the M-value is set fixed to 16 for IPv4. In future this may change to better cover the actual inserted routes. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	ec54677e55	mlxsw: reg: Add XM Lookup Table Query Register The XLTQ is used to query HW for XM-related info. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	087489dc27	mlxsw: reg: Add Router XLT M select Register The RXLTM configures and selects the M for the XM lookups. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	50779c3325	mlxsw: Ignore ports that are connected to eXtended mezanine Use the info stored in the bus_info struct about the eXtended mezanine connected ports and don't expose them. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	2ea3f4c7fa	mlxsw: pci: Obtain info about ports used by eXtended mezanine The output of boardinfo command was extended to contain information about XM. Indicates if is present and in case it is, tells which localports are used for the connection. So parse this info and store it in bus_info passed up to the driver. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	ff462103ca	mlxsw: spectrum_router: Introduce XM implementation of router low-level ops In order to offload entries to XM, implement a set of low-level functions to work with LPM trees in XM and also to pack and write FIB entries into XM. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:54 -08:00
Jiri Pirko	6100fbf13d	mlxsw: reg: Add Router XLT Enable Register The RXLTE enables XLT (eXtended Lookup Table) LPM lookups if a capable XM is present on the system. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:53 -08:00
Jiri Pirko	be6ba3b61e	mlxsw: reg: Add XM Direct Register The XMDR allows direct access to the XM device via the switch. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-14 19:09:53 -08:00
Jakub Kicinski	46d5e62dd3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net xdp_return_frame_bulk() needs to pass a xdp_buff to __xdp_return(). strlcpy got converted to strscpy but here it makes no functional difference, so just keep the right code. Conflicts: net/netfilter/nf_tables_api.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-11 22:29:38 -08:00
Zheng Yongjun	b18cac546b	net/mlx4: simplify the return expression of mlx4_init_srq_table() Simplify the return expression. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-12-10 13:00:51 -08:00

1 2 3 4 5 ...

7093 Commits