Commit Graph

36006 Commits

Author SHA1 Message Date
Shay Drory
d89ddaae17 net/mlx5: Disable devlink reload for multi port slave device
Devlink reload can't be allowed on a multi port slave device, because
reload of slave device doesn't take effect.

The right flow is to disable devlink reload for multi port slave
device. Hence, disabling it in mlx5_core probing.

Fixes: 4383cfcc65 ("net/mlx5: Add devlink reload")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:14 -08:00
Maxim Mikityanskiy
b850bbff96 net/mlx5e: kTLS, Use refcounts to free kTLS RX priv context
wait_for_resync is unreliable - if it timeouts, priv_rx will be freed
anyway. However, mlx5e_ktls_handle_get_psv_completion will be called
sooner or later, leading to use-after-free. For example, it can happen
if a CQ error happened, and ICOSQ stopped, but later on the queues are
destroyed, and ICOSQ is flushed with mlx5e_free_icosq_descs.

This patch converts the lifecycle of priv_rx to fully refcount-based, so
that the struct won't be freed before the refcount goes to zero.

Fixes: 0419d8c9d8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:13 -08:00
Maxim Mikityanskiy
ebf79b6be6 net/mlx5e: Fix CQ params of ICOSQ and async ICOSQ
The commit mentioned below has split the parameters of ICOSQ and async
ICOSQ, but it contained a typo: the CQ parameters were swapped for ICOSQ
and async ICOSQ. Async ICOSQ is longer than the normal ICOSQ, and the CQ
size must be the same as the size of the corresponding SQ, but due to
this bug, the CQ of async ICOSQ was much shorter than async ICOSQ
itself. It led to overflows of the CQ with such messages in dmesg, in
particular, when running multiple kTLS-offloaded streams:

mlx5_core 0000:08:00.0: cq_err_event_notifier:529:(pid 9422): CQ error
on CQN 0x406, syndrome 0x1
mlx5_core 0000:08:00.0 eth2: mlx5e_cq_error_event: cqn=0x000406
event=0x04

This commit fixes the issue by using the corresponding parameters for
ICOSQ and async ICOSQ.

Fixes: c293ac927f ("net/mlx5e: Refactor build channel params")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:12 -08:00
Maxim Mikityanskiy
4d6e6b0c6d net/mlx5e: Replace synchronize_rcu with synchronize_net
The commit cited below switched from using napi_synchronize to
synchronize_rcu to have a guarantee that it will finish in finite time.
However, on average, synchronize_rcu takes more time than
napi_synchronize. Given that it's called multiple times per channel on
deactivation, it accumulates to a significant amount, which causes
timeouts in some applications (for example, when using bonding with
NetworkManager).

This commit replaces synchronize_rcu with synchronize_net, which is
faster when called under rtnl_lock, allowing to speed up the described
flow.

Fixes: 9c25a22dfb ("net/mlx5e: Use synchronize_rcu to sync with NAPI")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:12 -08:00
Shay Drory
51d138c261 net/mlx5: Fix health error state handling
Currently, when we discover a fatal error, we are queueing a work that
will wait for a lock in order to enter the device to error state.
Meanwhile, FW commands are still being processed, and gets timeouts.
This can block the driver for few minutes before the work will manage
to get the lock and enter to error state.

Setting the device to error state before queueing health work, in order
to avoid FW commands being processed while the work is waiting for the
lock.

Fixes: c1d4d2e92a ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:11 -08:00
Maxim Mikityanskiy
65ba8594a2 net/mlx5e: Change interrupt moderation channel params also when channels are closed
struct mlx5e_params contains fields ({rx,tx}_cq_moderation) that depend
on two things: whether DIM is enabled and the state of a private flag
(MLX5E_PFLAG_{RX,TX}_CQE_BASED_MODER). Whenever the DIM state changes,
mlx5e_reset_{rx,tx}_moderation is called to update the fields, however,
only if the channels are open. The flow where the channels are closed
misses the required update of the fields. This commit moves the calls of
mlx5e_reset_{rx,tx}_moderation, so that they run in both flows.

Fixes: ebeaf084ad ("net/mlx5e: Properly set default values when disabling adaptive moderation")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:11 -08:00
Maxim Mikityanskiy
019f93bc4b net/mlx5e: Don't change interrupt moderation params when DIM is enabled
When mlx5e_ethtool_set_coalesce doesn't change DIM state
(enabled/disabled), it calls mlx5e_set_priv_channels_coalesce
unconditionally, which in turn invokes a firmware command to set
interrupt moderation parameters. It shouldn't happen while DIM manages
those parameters dynamically (it might even be happening at the same
time).

This patch fixes it by splitting mlx5e_set_priv_channels_coalesce into
two functions (for RX and TX) and calling them only when DIM is disabled
(for RX and TX respectively).

Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:10 -08:00
Raed Salem
e33f9f5f2d net/mlx5e: Enable XDP for Connect-X IPsec capable devices
This limitation was inherited by previous Innova (FPGA) IPsec
implementation, it uses its private set of RQ handlers which
does not support XDP, for Connect-X this is no longer true.

Fix by keeping this limitation only for Innova IPsec supporting devices,
as otherwise this limitation effectively wrongly blocks XDP for all
future Connect-X devices for all flows even if IPsec offload is not
used.

Fixes: 2d64663cd5 ("net/mlx5: IPsec: Add HW crypto offload support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:10 -08:00
Raed Salem
e4484d9df5 net/mlx5e: Enable striding RQ for Connect-X IPsec capable devices
This limitation was inherited by previous Innova (FPGA) IPsec
implementation, it uses its private set of RQ handlers which does
not support striding rq, for Connect-X this is no longer true.

Fix by keeping this limitation only for Innova IPsec supporting devices,
as otherwise this limitation effectively wrongly blocks striding RQs for
all future Connect-X devices for all flows even if IPsec offload is not
used.

Fixes: 2d64663cd5 ("net/mlx5: IPsec: Add HW crypto offload support")
Signed-off-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:09 -08:00
Parav Pandit
0e22bfb7c0 net/mlx5e: E-switch, Fix rate calculation for overflow
rate_bytes_ps is a 64-bit field. It passed as 32-bit field to
apply_police_params(). Due to this when police rate is higher
than 4Gbps, 32-bit calculation ignores the carry. This results
in incorrect rate configurationn the device.

Fix it by performing 64-bit calculation.

Fixes: fcb64c0f56 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-11 18:50:09 -08:00
Ioana Ciornei
e12be9139c dpaa2-eth: fix memory leak in XDP_REDIRECT
If xdp_do_redirect() fails, the calling driver should handle recycling
or freeing of the page associated with the frame. The dpaa2-eth driver
didn't do either of them and just incremented a counter.
Fix this by trying to DMA map back the page and recycle it or, if the
mapping fails, just free it.

Fixes: d678be1dc1 ("dpaa2-eth: add XDP_REDIRECT support")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 18:15:15 -08:00
Tong Zhang
e185ea30df enetc: auto select PHYLIB and MDIO_DEVRES
FSL_ENETC_MDIO use symbols from PHYLIB (MDIO_BUS) and MDIO_DEVRES,
however there are no dependency specified in Kconfig

ERROR: modpost: "__mdiobus_register" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined!
ERROR: modpost: "mdiobus_unregister" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined!
ERROR: modpost: "devm_mdiobus_alloc_size" [drivers/net/ethernet/freescale/enetc/fsl-enetc-mdio.ko] undefined!

add depends on MDIO_DEVRES && MDIO_BUS

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 18:12:47 -08:00
Nathan Rossi
8a28af7a3e net: ethernet: aquantia: Handle error cleanup of start on open
The aq_nic_start function can fail in a variety of cases which leaves
the device in broken state.

An example case where the start function fails is the
request_threaded_irq which can be interrupted, resulting in a EINTR
result. This can be manually triggered by bringing the link up (e.g. ip
link set up) and triggering a SIGINT on the initiating process (e.g.
Ctrl+C). This would put the device into a half configured state.
Subsequently bringing the link up again would cause the napi_enable to
BUG.

In order to correctly clean up the failed attempt to start a device call
aq_nic_stop.

Signed-off-by: Nathan Rossi <nathan.rossi@digi.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 14:38:06 -08:00
Vasundhara Volam
db28b6c77f bnxt_en: Fix devlink info's stored fw.psid version format.
The running fw.psid version is in decimal format but the stored
fw.psid is in hex format.  This can mislead the user to reset the
NIC to activate the stored version to become the running version.

Fix it to display the stored fw.psid in decimal format.

Fixes: 1388875b39 ("bnxt_en: Add stored FW version info to devlink info_get cb.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 14:36:22 -08:00
Edwin Peer
132e0b65dc bnxt_en: reverse order of TX disable and carrier off
A TX queue can potentially immediately timeout after it is stopped
and the last TX timestamp on that queue was more than 5 seconds ago with
carrier still up.  Prevent these intermittent false TX timeouts
by bringing down carrier first before calling netif_tx_disable().

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 14:36:22 -08:00
Sukadev Bhattiprolu
d4083d3c00 ibmvnic: Set to CLOSED state even on error
If set_link_state() fails for any reason, we still cleanup the adapter
state and cannot recover from a partial close anyway. So set the adapter
to CLOSED state. That way if a new soft/hard reset is processed, the
adapter will remain in the CLOSED state until the next ibmvnic_open().

Fixes: 01d9bd792d ("ibmvnic: Reorganize device close")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Reported-by: Abdul Haleem <abdhalee@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 14:33:14 -08:00
Yufeng Mo
532cfc0df1 net: hns3: add a check for index in hclge_get_rss_key()
The index is received from vf, if use it directly,
an out-of-bound issue may be caused, so add a check for
this index before using it in hclge_get_rss_key().

Fixes: a638b1d8cc ("net: hns3: fix get VF RSS issue")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09 15:20:43 -08:00
Yufeng Mo
326334aad0 net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx()
The tqp_index is received from vf, if use it directly,
an out-of-bound issue may be caused, so add a check for
this tqp_index before using it in hclge_get_ring_chain_from_mbx().

Fixes: 84e095d64e ("net: hns3: Change PF to add ring-vect binding & resetQ to mailbox")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09 15:20:43 -08:00
Yufeng Mo
67a69f84ca net: hns3: add a check for queue_id in hclge_reset_vf_queue()
The queue_id is received from vf, if use it directly,
an out-of-bound issue may be caused, so add a check for
this queue_id before using it in hclge_reset_vf_queue().

Fixes: 1a426f8b40 ("net: hns3: fix the VF queue reset flow error")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09 15:20:43 -08:00
Vladimir Oltean
eb4733d7cf net: dsa: felix: implement port flushing on .phylink_mac_link_down
There are several issues which may be seen when the link goes down while
forwarding traffic, all of which can be attributed to the fact that the
port flushing procedure from the reference manual was not closely
followed.

With flow control enabled on both the ingress port and the egress port,
it may happen when a link goes down that Ethernet packets are in flight.
In flow control mode, frames are held back and not dropped. When there
is enough traffic in flight (example: iperf3 TCP), then the ingress port
might enter congestion and never exit that state. This is a problem,
because it is the egress port's link that went down, and that has caused
the inability of the ingress port to send packets to any other port.
This is solved by flushing the egress port's queues when it goes down.

There is also a problem when performing stream splitting for
IEEE 802.1CB traffic (not yet upstream, but a sort of multicast,
basically). There, if one port from the destination ports mask goes
down, splitting the stream towards the other destinations will no longer
be performed. This can be traced down to this line:

	ocelot_port_writel(ocelot_port, 0, DEV_MAC_ENA_CFG);

which should have been instead, as per the reference manual:

	ocelot_port_rmwl(ocelot_port, 0, DEV_MAC_ENA_CFG_RX_ENA,
			 DEV_MAC_ENA_CFG);

Basically only DEV_MAC_ENA_CFG_RX_ENA should be disabled, but not
DEV_MAC_ENA_CFG_TX_ENA - I don't have further insight into why that is
the case, but apparently multicasting to several ports will cause issues
if at least one of them doesn't have DEV_MAC_ENA_CFG_TX_ENA set.

I am not sure what the state of the Ocelot VSC7514 driver is, but
probably not as bad as Felix/Seville, since VSC7514 uses phylib and has
the following in ocelot_adjust_link:

	if (!phydev->link)
		return;

therefore the port is not really put down when the link is lost, unlike
the DSA drivers which use .phylink_mac_link_down for that.

Nonetheless, I put ocelot_port_flush() in the common ocelot.c because it
needs to access some registers from drivers/net/ethernet/mscc/ocelot_rew.h
which are not exported in include/soc/mscc/ and a bugfix patch should
probably not move headers around.

Fixes: bdeced75b1 ("net: dsa: felix: Add PCS operations for PHYLINK")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-09 11:41:11 -08:00
Shay Agroskin
225353c070 net: ena: Update XDP verdict upon failure
The verdict returned from ena_xdp_execute() is used to determine the
fate of the RX buffer's page. In case of XDP Redirect/TX error the
verdict should be set to XDP_ABORTED, otherwise the page won't be freed.

Fixes: a318c70ad1 ("net: ena: introduce XDP redirect implementation")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06 15:07:29 -08:00
Sukadev Bhattiprolu
ef66a1eace ibmvnic: Clear failover_pending if unable to schedule
Normally we clear the failover_pending flag when processing the reset.
But if we are unable to schedule a failover reset we must clear the
flag ourselves. We could fail to schedule the reset if we are in PROBING
state (eg: when booting via kexec) or because we could not allocate memory.

Thanks to Cris Forno for helping isolate the problem and for testing.

Fixes: 1d85049374 ("powerpc/vnic: Extend "failover pending" window")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Tested-by: Cristobal Forno <cforno12@linux.ibm.com>
Link: https://lore.kernel.org/r/20210203050802.680772-1-sukadev@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06 10:36:22 -08:00
Mohammad Athari Bin Ismail
f317e2ea8c net: stmmac: set TxQ mode back to DCB after disabling CBS
When disable CBS, mode_to_use parameter is not updated even the operation
mode of Tx Queue is changed to Data Centre Bridging (DCB). Therefore,
when tc_setup_cbs() function is called to re-enable CBS, the operation
mode of Tx Queue remains at DCB, which causing CBS fails to work.

This patch updates the value of mode_to_use parameter to MTL_QUEUE_DCB
after operation mode of Tx Queue is changed to DCB in stmmac_dma_qmode()
callback function.

Fixes: 1f705bc61a ("net: stmmac: Add support for CBS QDISC")
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: Song, Yoong Siang <yoong.siang.song@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/1612447396-20351-1-git-send-email-yoong.siang.song@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 20:00:19 -08:00
Camelia Groza
0a9946cca1 dpaa_eth: try to move the data in place for the A050385 erratum
The XDP frame's headroom might be large enough to accommodate the
xdpf backpointer as well as shifting the data to an aligned address.

Try this first before resorting to allocating a new buffer and copying
the data.

Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:58:34 -08:00
Camelia Groza
c2b0e8455e dpaa_eth: reduce data alignment requirements for the A050385 erratum
The 256 byte data alignment is required for preventing DMA transaction
splits when crossing 4K page boundaries. Since XDP deals only with page
sized buffers or less, this restriction isn't needed. Instead, the data
only needs to be aligned to 64 bytes to prevent DMA transaction splits.

These lessened restrictions can increase performance by widening the pool
of permitted data alignments and preventing unnecessary realignments.

Fixes: ae680bcbd0 ("dpaa_eth: implement the A050385 erratum workaround for XDP")
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:58:34 -08:00
Camelia Groza
275a9c72b4 dpaa_eth: reserve space for the xdp_frame under the A050385 erratum
When the erratum workaround is triggered, the newly created xdp_frame
structure is stored at the start of the newly allocated buffer. Avoid
the structure from being overwritten by explicitly reserving enough
space in the buffer for storing it.

Account for the fact that the structure's size might increase in time by
aligning the headroom to DPAA_FD_DATA_ALIGNMENT bytes, thus guaranteeing
the data's alignment.

Fixes: ae680bcbd0 ("dpaa_eth: implement the A050385 erratum workaround for XDP")
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:58:34 -08:00
Vladimir Oltean
07bf34a50e net: enetc: initialize the RFS and RSS memories
Michael tried to enable Advanced Error Reporting through the ENETC's
Root Complex Event Collector, and the system started spitting out single
bit correctable ECC errors coming from the ENETC interfaces:

pcieport 0000:00:1f.0: AER: Multiple Corrected error received: 0000:00:00.0
fsl_enetc 0000:00:00.0: PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
fsl_enetc 0000:00:00.0:   device [1957:e100] error status/mask=00004000/00000000
fsl_enetc 0000:00:00.0:    [14] CorrIntErr
fsl_enetc 0000:00:00.1: PCIe Bus Error: severity=Corrected, type=Transaction Layer, (Receiver ID)
fsl_enetc 0000:00:00.1:   device [1957:e100] error status/mask=00004000/00000000
fsl_enetc 0000:00:00.1:    [14] CorrIntErr

Further investigating the port correctable memory error detect register
(PCMEDR) shows that these AER errors have an associated SOURCE_ID of 6
(RFS/RSS):

$ devmem 0x1f8010e10 32
0xC0000006
$ devmem 0x1f8050e10 32
0xC0000006

Discussion with the hardware design engineers reveals that on LS1028A,
the hardware does not do initialization of that RFS/RSS memory, and that
software should clear/initialize the entire table before starting to
operate. That comes as a bit of a surprise, since the driver does not do
initialization of the RFS memory. Also, the initialization of the
Receive Side Scaling is done only partially.

Even though the entire ENETC IP has a single shared flow steering
memory, the flow steering service should returns matches only for TCAM
entries that are within the range of the Station Interface that is doing
the search. Therefore, it should be sufficient for a Station Interface
to initialize all of its own entries in order to avoid any ECC errors,
and only the Station Interfaces in use should need initialization.

There are Physical Station Interfaces associated with PCIe PFs and
Virtual Station Interfaces associated with PCIe VFs. We let the PF
driver initialize the entire port's memory, which includes the RFS
entries which are going to be used by the VF.

Reported-by: Michael Walle <michael@walle.cc>
Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20210204134511.2640309-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04 20:21:39 -08:00
Raju Rangoju
3401e4aa43 cxgb4: Add new T6 PCI device id 0x6092
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Link: https://lore.kernel.org/r/20210202182511.8109-1-rajur@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-04 18:09:23 -08:00
Jakub Kicinski
5a4cb54675 Merge tag 'mlx5-fixes-2021-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2021-02-01

Please note the first patch in this series
("Fix function calculation for page trees") is fixing a regression
due to previous fix in net which you didn't include in your previous
rc pr. So I hope this series will make it into your next rc pr,
so mlx5 won't be broken in the next rc.

* tag 'mlx5-fixes-2021-02-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Release skb in case of failure in tc update skb
  net/mlx5e: Update max_opened_tc also when channels are closed
  net/mlx5: Fix leak upon failure of rule creation
  net/mlx5: Fix function calculation for page trees
====================

Link: https://lore.kernel.org/r/20210202070703.617251-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02 08:51:26 -08:00
Heiner Kallweit
cc9f07a838 r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set
So far phy_disconnect() is called before free_irq(). If CONFIG_DEBUG_SHIRQ
is set and interrupt is shared, then free_irq() creates an "artificial"
interrupt by calling the interrupt handler. The "link change" flag is set
in the interrupt status register, causing phylib to eventually call
phy_suspend(). Because the net_device is detached from the PHY already,
the PHY driver can't recognize that WoL is configured and powers down the
PHY.

Fixes: f1e911d5d0 ("r8169: add basic phylib support")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/fe732c2c-a473-9088-3974-df83cfbd6efd@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02 08:44:15 -08:00
Stefan Chulski
43f4a20a12 net: mvpp2: TCAM entry enable should be written after SRAM data
Last TCAM data contains TCAM enable bit.
It should be written after SRAM data before entry enabled.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Link: https://lore.kernel.org/r/1612172139-28343-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-02 08:42:38 -08:00
Maor Dickman
a34ffec8af net/mlx5e: Release skb in case of failure in tc update skb
In case of failure in tc update skb the packet is dropped
without freeing the skb.

Fixed by freeing the skb in case failure in tc update skb.

Fixes: d6d2778286 ("net/mlx5: E-Switch, Restore chain id on miss")
Fixes: c756909722 ("net/mlx5e: Add tc chains offload support for nic flows")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-01 23:02:02 -08:00
Maxim Mikityanskiy
5a2ba25a55 net/mlx5e: Update max_opened_tc also when channels are closed
max_opened_tc is used for stats, so that potentially non-zero stats
won't disappear when num_tc decreases. However, mlx5e_setup_tc_mqprio
fails to update it in the flow where channels are closed.

This commit fixes it. The new value of priv->channels.params.num_tc is
always checked on exit. In case of errors it will just be the old value,
and in case of success it will be the updated value.

Fixes: 05909babce ("net/mlx5e: Avoid reset netdev stats on configuration changes")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-01 23:02:02 -08:00
Maor Gottlieb
a5bfe6b467 net/mlx5: Fix leak upon failure of rule creation
When creation of a new rule that requires allocation of an FTE fails,
need to call to tree_put_node on the FTE in order to release its'
resource.

Fixes: cefc23554f ("net/mlx5: Fix FTE cleanup")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-01 23:02:01 -08:00
Daniel Jurgens
ed5e83a3c0 net/mlx5: Fix function calculation for page trees
The function calculation always results in a value of 0. This works
generally, but when the release all pages feature is enabled it will
result in crashes.

Fixes: 0aa128475d ("net/mlx5: Maintain separate page trees for ECPF and PF functions")
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-02-01 23:02:01 -08:00
Jakub Kicinski
188fa104f2 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-02-01

This series contains updates to igc and i40e drivers.

Kai-Heng Feng fixes igc to report unknown speed and duplex during suspend
as an attempted read will cause errors.

Kevin Lo sets the default value to -IGC_ERR_NVM instead of success for
writing shadow RAM as this could miss a timeout. Also propagates the return
value for Flow Control configuration to properly pass on errors for igc.

Aleksandr reverts commit 2ad1274fa3 ("i40e: don't report link up for a VF
who hasn't enabled queues") as this can cause link flapping.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues"
  igc: check return value of ret_val in igc_config_fc_after_link_up
  igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr
  igc: Report speed and duplex as unknown when device is runtime suspended
====================

Link: https://lore.kernel.org/r/20210201214618.852831-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-01 20:23:44 -08:00
Lijun Pan
5e9eff5dfa ibmvnic: device remove has higher precedence over reset
Returning -EBUSY in ibmvnic_remove() does not actually hold the
removal procedure since driver core doesn't care for the return
value (see __device_release_driver() in drivers/base/dd.c
calling dev->bus->remove()) though vio_bus_remove
(in arch/powerpc/platforms/pseries/vio.c) records the
return value and passes it on. [1]

During the device removal precedure, checking for resetting
bit is dropped so that we can continue executing all the
cleanup calls in the rest of the remove function. Otherwise,
it can cause latent memory leaks and kernel crashes.

[1] https://lore.kernel.org/linuxppc-dev/20210117101242.dpwayq6wdgfdzirl@pengutronix.de/T/#m48f5befd96bc9842ece2a3ad14f4c27747206a53
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 7d7195a026 ("ibmvnic: Do not process device remove during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20210129043402.95744-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-01 18:32:29 -08:00
Aleksandr Loktionov
f559a35604 i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues"
This reverts commit 2ad1274fa3

VF queues were not brought up when PF was brought up after being
downed if the VF driver disabled VFs queues during PF down.
This could happen in some older or external VF driver implementations.
The problem was that PF driver used vf->queues_enabled as a condition
to decide what link-state it would send out which caused the issue.

Remove the check for vf->queues_enabled in the VF link notify.
Now VF will always be notified of the current link status.
Also remove the queues_enabled member from i40e_vf structure as it is
not used anymore. Otherwise VNF implementation was broken and caused
a link flap.

The original commit was a workaround to avoid breaking existing VFs though
it's really a fault of the VF code not the PF. The commit should be safe to
revert as all of the VFs we know of have been fixed. Also, since we now
know there is a related bug in the workaround, removing it is preferred.

Fixes: 2ad1274fa3 ("i40e: don't report link up for a VF who hasn't enabled")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01 13:18:31 -08:00
Kevin Lo
b881145642 igc: check return value of ret_val in igc_config_fc_after_link_up
Check return value from ret_val to make error check actually work.

Fixes: 4eb8080143 ("igc: Add setup link functionality")
Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01 10:04:43 -08:00
Kevin Lo
ebc8d12506 igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr
This patch sets the default return value to -IGC_ERR_NVM in
igc_write_nvm_srwr. Without this change it wouldn't lead to a shadow RAM
write EEWR timeout.

Fixes: ab40561268 ("igc: Add NVM support")
Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01 10:04:43 -08:00
Kai-Heng Feng
2e99dedc73 igc: Report speed and duplex as unknown when device is runtime suspended
Similar to commit 165ae7a8fe ("igb: Report speed and duplex as unknown
when device is runtime suspended"), if we try to read speed and duplex
sysfs while the device is runtime suspended, igc will complain and
stops working:

[  123.449883] igc 0000:03:00.0 enp3s0: PCIe link lost, device now detached
[  123.450052] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  123.450056] #PF: supervisor read access in kernel mode
[  123.450058] #PF: error_code(0x0000) - not-present page
[  123.450059] PGD 0 P4D 0
[  123.450064] Oops: 0000 [#1] SMP NOPTI
[  123.450068] CPU: 0 PID: 2525 Comm: udevadm Tainted: G     U  W  OE     5.10.0-1002-oem #2+rkl2-Ubuntu
[  123.450078] RIP: 0010:igc_rd32+0x1c/0x90 [igc]
[  123.450080] Code: c0 5d c3 b8 fd ff ff ff c3 0f 1f 44 00 00 0f 1f 44 00 00 55 89 f0 48 89 e5 41 56 41 55 41 54 49 89 c4 53 48 8b 57 08 48 01 d0 <44> 8b 28 41 83 fd ff 74 0c 5b 44 89 e8 41 5c 41 5d 4

[  123.450083] RSP: 0018:ffffb0d100d6fcc0 EFLAGS: 00010202
[  123.450085] RAX: 0000000000000008 RBX: ffffb0d100d6fd30 RCX: 0000000000000000
[  123.450087] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff945a12716c10
[  123.450089] RBP: ffffb0d100d6fce0 R08: ffff945a12716550 R09: ffff945a09874000
[  123.450090] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000008
[  123.450092] R13: ffff945a12716000 R14: ffff945a037da280 R15: ffff945a037da290
[  123.450094] FS:  00007f3b34c868c0(0000) GS:ffff945b89200000(0000) knlGS:0000000000000000
[  123.450096] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  123.450098] CR2: 0000000000000008 CR3: 00000001144de006 CR4: 0000000000770ef0
[  123.450100] PKRU: 55555554
[  123.450101] Call Trace:
[  123.450111]  igc_ethtool_get_link_ksettings+0xd6/0x1b0 [igc]
[  123.450118]  __ethtool_get_link_ksettings+0x71/0xb0
[  123.450123]  duplex_show+0x74/0xc0
[  123.450129]  dev_attr_show+0x1d/0x40
[  123.450134]  sysfs_kf_seq_show+0xa1/0x100
[  123.450137]  kernfs_seq_show+0x27/0x30
[  123.450142]  seq_read+0xb7/0x400
[  123.450148]  ? common_file_perm+0x72/0x170
[  123.450151]  kernfs_fop_read+0x35/0x1b0
[  123.450155]  vfs_read+0xb5/0x1b0
[  123.450157]  ksys_read+0x67/0xe0
[  123.450160]  __x64_sys_read+0x1a/0x20
[  123.450164]  do_syscall_64+0x38/0x90
[  123.450168]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  123.450170] RIP: 0033:0x7f3b351fe142
[  123.450173] Code: c0 e9 c2 fe ff ff 50 48 8d 3d 3a ca 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[  123.450174] RSP: 002b:00007fffef2ec138 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  123.450177] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b351fe142
[  123.450179] RDX: 0000000000001001 RSI: 00005644c047f070 RDI: 0000000000000003
[  123.450180] RBP: 00007fffef2ec340 R08: 00005644c047f070 R09: 00007f3b352d9320
[  123.450182] R10: 00005644c047c010 R11: 0000000000000246 R12: 00005644c047cbf0
[  123.450184] R13: 00005644c047e6d0 R14: 0000000000000003 R15: 00007fffef2ec140
[  123.450189] Modules linked in: rfcomm ccm cmac algif_hash algif_skcipher af_alg bnep toshiba_acpi industrialio toshiba_haps hp_accel lis3lv02d btusb btrtl btbcm btintel bluetooth ecdh_generic ecc joydev input_leds nls_iso8859_1 snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_soc_hdac_hda snd_hda_codec_hdmi snd_sof_xtensa_dsp snd_sof_intel_hda snd_sof snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence snd_hda_codec snd_hda_core ath10k_pci snd_hwdep intel_rapl_msr intel_rapl_common ath10k_core soundwire_bus snd_soc_core x86_pkg_temp_thermal ath intel_powerclamp snd_compress ac97_bus snd_pcm_dmaengine mac80211 snd_pcm coretemp snd_seq_midi snd_seq_midi_event snd_rawmidi kvm_intel cfg80211 snd_seq snd_seq_device snd_timer mei_hdcp kvm libarc4 snd crct10dif_pclmul ghash_clmulni_intel aesni_intel
 mei_me dell_wmi
[  123.450266]  dell_smbios soundcore sparse_keymap dcdbas crypto_simd cryptd mei dell_uart_backlight glue_helper ee1004 wmi_bmof intel_wmi_thunderbolt dell_wmi_descriptor mac_hid efi_pstore acpi_pad acpi_tad intel_cstate sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log hid_generic usbhid hid i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec crc32_pclmul rc_core drm intel_lpss_pci i2c_i801 ahci igc intel_lpss i2c_smbus idma64 xhci_pci libahci virt_dma xhci_pci_renesas wmi video pinctrl_tigerlake
[  123.450335] CR2: 0000000000000008
[  123.450338] ---[ end trace 9f731e38b53c35cc ]---

The more generic approach will be wrap get_link_ksettings() with begin()
and complete() callbacks, and calls runtime resume and runtime suspend
routine respectively. However, igc is like igb, runtime resume routine
uses rtnl_lock() which upper ethtool layer also uses.

So to prevent a deadlock on rtnl, take a different approach, use
pm_runtime_suspended() to avoid reading register while device is runtime
suspended.

Fixes: 8c5ad0dae9 ("igc: Add ethtool support")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01 10:04:43 -08:00
Heiner Kallweit
8d520b4de3 r8169: work around RTL8125 UDP hw bug
It was reported that on RTL8125 network breaks under heavy UDP load,
e.g. torrent traffic ([0], from comment 27). Realtek confirmed a hw bug
and provided me with a test version of the r8125 driver including a
workaround. Tests confirmed that the workaround fixes the issue.
I modified the original version of the workaround to meet mainline
code style.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=209839

v2:
- rebased to net
v3:
- make rtl_skb_is_udp() more robust and use skb_header_pointer()
  to access the ip(v6) header
v4:
- remove dependency on ptp_classify.h
- replace magic number with offsetof(struct udphdr, len)

Fixes: f1bce4ad2f ("r8169: add support for RTL8125")
Tested-by: xplo <xplo.bn@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/6e453d49-1801-e6de-d5f7-d7e6c7526c8f@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-29 21:16:24 -08:00
Linus Torvalds
909b447dcc Merge tag 'net-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Networking fixes including fixes from can, xfrm, wireless,
  wireless-drivers and netfilter trees. Nothing scary, Intel
  WiFi-related fixes seemed most notable to the users.

  Current release - regressions:

   - dsa: microchip: ksz8795: fix KSZ8794 port map again to program the
     CPU port correctly

  Current release - new code bugs:

   - iwlwifi: pcie: reschedule in long-running memory reads

  Previous releases - regressions:

   - iwlwifi: dbg: don't try to overwrite read-only FW data

   - iwlwifi: provide gso_type to GSO packets

   - octeontx2: make sure the buffer is 128 byte aligned

   - tcp: make TCP_USER_TIMEOUT accurate for zero window probes

   - xfrm: fix wraparound in xfrm_policy_addr_delta()

   - xfrm: fix oops in xfrm_replay_advance_bmp due to a race between
     CPUs in presence of packet reorder

   - tcp: fix TLP timer not set when CA_STATE changes from DISORDER to
     OPEN

   - wext: fix NULL-ptr-dereference with cfg80211's lack of commit()

  Previous releases - always broken:

   - igc: fix link speed advertising

   - stmmac: configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA
     addressing

   - team: protect features update by RCU to avoid deadlock

   - xfrm: fix disable_xfrm sysctl when used on xfrm interfaces
     themselves

   - fec: fix temporary RMII clock reset on link up

   - can: dev: prevent potential information leak in can_fill_info()

  Misc:

   - mrp: fix bad packing of MRP test packet structures

   - uapi: fix big endian definition of ipv6_rpl_sr_hdr

   - add David Ahern to IPv4/IPv6 maintainers"

* tag 'net-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
  rxrpc: Fix memory leak in rxrpc_lookup_local
  mlxsw: spectrum_span: Do not overwrite policer configuration
  selftests: forwarding: Specify interface when invoking mausezahn
  stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing
  net: usb: cdc_ether: added support for Thales Cinterion PLSx3 modem family.
  ibmvnic: Ensure that CRQ entry read are correctly ordered
  MAINTAINERS: add missing header for bonding
  net: decnet: fix netdev refcount leaking on error path
  net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
  can: dev: prevent potential information leak in can_fill_info()
  net: fec: Fix temporary RMII clock reset on link up
  net: lapb: Add locking to the lapb module
  team: protect features update by RCU to avoid deadlock
  MAINTAINERS: add David Ahern to IPv4/IPv6 maintainers
  net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable
  net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset
  net/mlx5e: Revert parameters on errors when changing trust state without reset
  net/mlx5e: Correctly handle changing the number of queues when the interface is down
  net/mlx5e: Fix CT rule + encap slow path offload and deletion
  net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled
  ...
2021-01-28 15:24:43 -08:00
Ido Schimmel
b6f6881aaf mlxsw: spectrum_span: Do not overwrite policer configuration
The purpose of the delayed work in the SPAN module is to potentially
update the destination port and various encapsulation parameters of SPAN
agents that point to a VLAN device or a GRE tap. The destination port
can change following the insertion of a new route, for example.

SPAN agents that point to a physical port or the CPU port are static and
never change throughout the lifetime of the SPAN agent. Therefore, skip
over them in the delayed work.

This fixes an issue where the delayed work overwrites the policer
that was set on a SPAN agent pointing to the CPU. Modifying the delayed
work to inherit the original policer configuration is error-prone, as
the same will be needed for any new parameter.

Fixes: 4039504e6a ("mlxsw: spectrum_span: Allow setting policer on a SPAN agent")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28 13:09:01 -08:00
Voon Weifeng
7cfc4486e7 stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing
Fix an issue where dump stack is printed and Reset Adapter occurs when
PSE0 GbE or/and PSE1 GbE is/are enabled. EHL PSE0 GbE and PSE1 GbE use
32 bits DMA addressing whereas EHL PCH GbE uses 64 bits DMA addressing.

[   25.535095] ------------[ cut here ]------------
[   25.540276] NETDEV WATCHDOG: enp0s29f2 (intel-eth-pci): transmit queue 2 timed out
[   25.548749] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:443 dev_watchdog+0x259/0x260
[   25.558004] Modules linked in: 8021q bnep bluetooth ecryptfs snd_hda_codec_hdmi intel_gpy marvell intel_ishtp_loader intel_ishtp_hid iTCO_wdt mei_hdcp iTCO_vendor_support x86_pkg_temp_thermal kvm_intel dwmac_intel stmmac kvm igb pcs_xpcs irqbypass phylink snd_hda_intel intel_rapl_msr pcspkr dca snd_hda_codec i915 i2c_i801 i2c_smbus libphy intel_ish_ipc snd_hda_core mei_me intel_ishtp mei spi_dw_pci 8250_lpss spi_dw thermal dw_dmac_core parport_pc tpm_crb tpm_tis parport tpm_tis_core tpm intel_pmc_core sch_fq_codel uhid fuse configfs snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_sof_xtensa_dsp snd_sof snd_soc_acpi_intel_match snd_soc_acpi snd_intel_dspcfg ledtrig_audio snd_soc_core snd_compress ac97_bus snd_pcm snd_timer snd soundcore
[   25.633795] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G     U            5.11.0-rc4-intel-lts-MISMAIL5+ #5
[   25.644306] Hardware name: Intel Corporation Elkhart Lake Embedded Platform/ElkhartLake LPDDR4x T4 RVP1, BIOS EHLSFWI1.R00.2434.A00.2010231402 10/23/2020
[   25.659674] RIP: 0010:dev_watchdog+0x259/0x260
[   25.664650] Code: e8 3b 6b 60 ff eb 98 4c 89 ef c6 05 ec e7 bf 00 01 e8 fb e5 fa ff 89 d9 4c 89 ee 48 c7 c7 78 31 d2 9e 48 89 c2 e8 79 1b 18 00 <0f> 0b e9 77 ff ff ff 0f 1f 44 00 00 48 c7 47 08 00 00 00 00 48 c7
[   25.685647] RSP: 0018:ffffb7ca80160eb8 EFLAGS: 00010286
[   25.691498] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000103
[   25.699483] RDX: 0000000080000103 RSI: 00000000000000f6 RDI: 00000000ffffffff
[   25.707465] RBP: ffff985709ce0440 R08: 0000000000000000 R09: c0000000ffffefff
[   25.715455] R10: ffffb7ca80160cf0 R11: ffffb7ca80160ce8 R12: ffff985709ce039c
[   25.723438] R13: ffff985709ce0000 R14: 0000000000000008 R15: ffff9857068af940
[   25.731425] FS:  0000000000000000(0000) GS:ffff985864300000(0000) knlGS:0000000000000000
[   25.740481] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   25.746913] CR2: 00005567f8bb76b8 CR3: 00000001f8e0a000 CR4: 0000000000350ee0
[   25.754900] Call Trace:
[   25.757631]  <IRQ>
[   25.759891]  ? qdisc_put_unlocked+0x30/0x30
[   25.764565]  ? qdisc_put_unlocked+0x30/0x30
[   25.769245]  call_timer_fn+0x2e/0x140
[   25.773346]  run_timer_softirq+0x1f3/0x430
[   25.777932]  ? __hrtimer_run_queues+0x12c/0x2c0
[   25.783005]  ? ktime_get+0x3e/0xa0
[   25.786812]  __do_softirq+0xa6/0x2ef
[   25.790816]  asm_call_irq_on_stack+0xf/0x20
[   25.795501]  </IRQ>
[   25.797852]  do_softirq_own_stack+0x5d/0x80
[   25.802538]  irq_exit_rcu+0x94/0xb0
[   25.806475]  sysvec_apic_timer_interrupt+0x42/0xc0
[   25.811836]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[   25.817586] RIP: 0010:cpuidle_enter_state+0xd9/0x370
[   25.823142] Code: 85 c0 0f 8f 0a 02 00 00 31 ff e8 22 d5 7e ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 47 02 00 00 31 ff e8 7b a0 84 ff fb 45 85 f6 <0f> 88 ab 00 00 00 49 63 ce 48 2b 2c 24 48 89 c8 48 6b d1 68 48 c1
[   25.844140] RSP: 0018:ffffb7ca800f7e80 EFLAGS: 00000206
[   25.849996] RAX: ffff985864300000 RBX: 0000000000000003 RCX: 000000000000001f
[   25.857975] RDX: 00000005f2028ea8 RSI: ffffffff9ec5907f RDI: ffffffff9ec62a5d
[   25.865961] RBP: 00000005f2028ea8 R08: 0000000000000000 R09: 0000000000029d00
[   25.873947] R10: 000000137b0e0508 R11: ffff9858643294e4 R12: ffff9858643336d0
[   25.881935] R13: ffffffff9ef74b00 R14: 0000000000000003 R15: 0000000000000000
[   25.889918]  cpuidle_enter+0x29/0x40
[   25.893922]  do_idle+0x24a/0x290
[   25.897536]  cpu_startup_entry+0x19/0x20
[   25.901930]  start_secondary+0x128/0x160
[   25.906326]  secondary_startup_64_no_verify+0xb0/0xbb
[   25.911983] ---[ end trace b4c0c8195d0ba61f ]---
[   25.917193] intel-eth-pci 0000:00:1d.2 enp0s29f2: Reset adapter.

Fixes: 67c08ac414 ("net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID")
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Co-developed-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Link: https://lore.kernel.org/r/20210126100844.30326-1-mohammad.athari.ismail@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28 13:07:28 -08:00
Linus Torvalds
b0dfa64dcd Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Several recent regressions and some bug fixes:

   - Typo corrupting the max_recv_sge for cxgb4

   - Regression from re-using kernel enums as a HW AbI in vmw_pvrdma

   - Sleeping inside a spinlock in hns

   - Revert the attempt to fix devlink deadlocks as the fix is more buggy

   - Typo in sysfs_emit_at conversions

   - Revert the removal of VLAN support in rxe"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  Revert "RDMA/rxe: Remove VLAN code leftovers from RXE"
  RDMA/usnic: Fix misuse of sysfs_emit_at
  Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion"
  RDMA/hns: Use mutex instead of spinlock for ida allocation
  RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC
  RDMA/cxgb4: Fix the reported max_recv_sge value
2021-01-28 09:27:26 -08:00
Jakub Kicinski
44a674d6f7 Merge tag 'mlx5-fixes-2021-01-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2021-01-26

This series introduces some fixes to mlx5 driver.

* tag 'mlx5-fixes-2021-01-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable
  net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset
  net/mlx5e: Revert parameters on errors when changing trust state without reset
  net/mlx5e: Correctly handle changing the number of queues when the interface is down
  net/mlx5e: Fix CT rule + encap slow path offload and deletion
  net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled
  net/mlx5: Maintain separate page trees for ECPF and PF functions
  net/mlx5e: Fix IPSEC stats
  net/mlx5e: Reduce tc unsupported key print level
  net/mlx5e: free page before return
  net/mlx5e: E-switch, Fix rate calculation for overflow
  net/mlx5: Fix memory leak on flow table creation error flow
====================

Link: https://lore.kernel.org/r/20210126234345.202096-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-27 19:18:37 -08:00
Lijun Pan
e41aec79e6 ibmvnic: Ensure that CRQ entry read are correctly ordered
Ensure that received Command-Response Queue (CRQ) entries are
properly read in order by the driver. dma_rmb barrier has
been added before accessing the CRQ descriptor to ensure
the entire descriptor is read before processing.

Fixes: 032c5e8284 ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20210128013442.88319-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-27 19:12:02 -08:00
Jakub Kicinski
5ae3a25b32 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Anthony Nguyen says:

====================
Intel Wired LAN Driver Updates 2021-01-26

This series contains updates to the ice, i40e, and igc driver.

Henry corrects setting an unspecified protocol to IPPROTO_NONE instead of
0 for IPv6 flexbytes filters for ice.

Nick fixes the IPv6 extension header being processed incorrectly and
updates the netdev->dev_addr if it exists in hardware as it may have been
modified outside the ice driver.

Brett ensures a user cannot request more channels than available LAN MSI-X
and fixes the minimum allocation logic as it was incorrectly trying to use
more MSI-X than allocated for ice.

Stefan Assmann minimizes the delay between getting and using the VSI
pointer to prevent a possible crash for i40e.

Corinna Vinschen fixes link speed advertising for igc.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  igc: fix link speed advertising
  i40e: acquire VSI pointer only after VF is initialized
  ice: Fix MSI-X vector fallback logic
  ice: Don't allow more channels than LAN MSI-X available
  ice: update dev_addr in ice_set_mac_address even if HW filter exists
  ice: Implement flow for IPv6 next header (extension header)
  ice: fix FDir IPv6 flexbyte
====================

Link: https://lore.kernel.org/r/20210126221035.658124-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-27 17:44:52 -08:00
Laurent Badel
c730ab423b net: fec: Fix temporary RMII clock reset on link up
fec_restart() does a hard reset of the MAC module when the link status
changes to up. This temporarily resets the R_CNTRL register which controls
the MII mode of the ENET_OUT clock. In the case of RMII, the clock
frequency momentarily drops from 50MHz to 25MHz until the register is
reconfigured. Some link partners do not tolerate this glitch and
invalidate the link causing failure to establish a stable link when using
PHY polling mode. Since as per IEEE802.3 the criteria for link validity
are PHY-specific, what the partner should tolerate cannot be assumed, so
avoid resetting the MII clock by using software reset instead of hardware
reset when the link is up. This is generally relevant only if the SoC
provides the clock to an external PHY and the PHY is configured for RMII.

Signed-off-by: Laurent Badel <laurentbadel@eaton.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26 18:24:39 -08:00