Implement the previously introduced mlxsw core shared buffer API.
For Spectrum, this is done using the SBPR, SBCM and SBPM registers.
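For illustration, a pool configuration write in this scheme looks roughly
like the sketch below; the pack/write pattern follows the driver's register
infrastructure, but treat the exact helper signatures as assumptions:

    static int mlxsw_sp_sb_pr_write(struct mlxsw_sp *mlxsw_sp, u8 pool,
                                    enum mlxsw_reg_sbxx_dir dir,
                                    enum mlxsw_reg_sbpr_mode mode, u32 size)
    {
            char sbpr_pl[MLXSW_REG_SBPR_LEN];

            /* Pack the shared buffer pool record and write the SBPR
             * register through the core.
             */
            mlxsw_reg_sbpr_pack(sbpr_pl, pool, dir, mode, size);
            return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbpr), sbpr_pl);
    }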
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Needed in the following patch.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Although the device supports max_buff magic values 0 and 0xff, these are
not exposed to the user via devlink.
Therefore, adjust the default values to be within the configurable range.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As explained in commit ff6551ec0c ("mlxsw: spectrum: Correctly
configure headroom size"), control packets are directed to priority group
buffer 9 (PG9) in the ports' headroom buffers.
Since we don't want to drop control packets in case they can't be
admitted to the switch's shared buffer, we bind PG9 to a different
ingress pool from the one used by all other PGs.
Unlike other PGs, we currently don't expose the binding between PG9 and
a pool, and leave it fixed.
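A hedged sketch of that binding, one SBCM write per port (the pack
helper's exact signature is an assumption):

    static int mlxsw_sp_sb_cm_ctrl_bind(struct mlxsw_sp *mlxsw_sp,
                                        u8 local_port, u8 ctrl_pool,
                                        u32 min_buff, u32 max_buff)
    {
            char sbcm_pl[MLXSW_REG_SBCM_LEN];

            /* Bind PG9 (control packets) to its own ingress pool so
             * control traffic is not dropped when the ordinary shared
             * pool is exhausted.
             */
            mlxsw_reg_sbcm_pack(sbcm_pl, local_port, 9 /* PG9 */,
                                MLXSW_REG_SBXX_DIR_INGRESS,
                                min_buff, max_buff, ctrl_pool);
            return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(sbcm), sbcm_pl);
    }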
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since there is no congestion control for CPU port traffic, we can change
the CPU port TC binding to pool 0 with min_buff and max_buff zeroed.
Remove the initialization of egress pool 3 since it is no longer used
by default.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to achieve faster dumping of the current settings, and to make
it possible to get the pool mode without querying the hardware, cache
the configuration in the driver.
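A hypothetical shape for such a cache (names and layout are illustrative):
sets update both the hardware and the cached entry, while gets are served
purely from the cache:

    /* Illustrative cache entry; not necessarily the driver's layout. */
    struct mlxsw_sp_sb_pr {
            enum devlink_sb_threshold_type mode;
            u32 size;
    };

    /* Gets are answered from the cache; no register read required. */
    static struct mlxsw_sp_sb_pr *
    mlxsw_sp_sb_pr_get(struct mlxsw_sp *mlxsw_sp, u8 pool,
                       enum mlxsw_reg_sbxx_dir dir)
    {
            return &mlxsw_sp->sb.prs[dir][pool];
    }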
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Be consistent with the rest of the registers (pm, cm) and use "pr" here.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The structs are stored in arrays, so use the array index as the
pool/tc/prio index. With that, there is a need to maintain separate
arrays for ingress and egress.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Push them into helper functions.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a middle layer in the mlxsw core code to forward shared buffer calls
into the specific ASIC drivers.
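A sketch of one such trampoline in the core code, assuming the devlink
sb_pool_get callback signature of that era:

    static int mlxsw_devlink_sb_pool_get(struct devlink *devlink,
                                         unsigned int sb_index, u16 pool_index,
                                         struct devlink_sb_pool_info *pool_info)
    {
            struct mlxsw_core *mlxsw_core = devlink_priv(devlink);
            struct mlxsw_driver *driver = mlxsw_core->driver;

            /* Forward to the ASIC driver only if it implements the op. */
            if (!driver->sb_pool_get)
                    return -EOPNOTSUPP;
            return driver->sb_pool_get(mlxsw_core, sb_index,
                                       pool_index, pool_info);
    }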
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the ISS.CGIS bit is set, we already know that a gPTP interrupt has
happened, so the extra GIS register check at the end of ravb_ptp_interrupt()
seems superfluous. We can model the gPTP interrupt handler like all the other
dedicated interrupt handlers in the driver and make it *void*.
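A sketch of the resulting call site in the main interrupt handler
(treat the surrounding details as illustrative):

    /* ISS.CGIS already identifies the gPTP interrupt, so the
     * dedicated handler no longer needs to return a result.
     */
    if (iss & ISS_CGIS) {
            ravb_ptp_interrupt(ndev);       /* now returns void */
            result = IRQ_HANDLED;
    }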
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds support for the following via ethtool:
- UDP configuration of RSS based on 2-tuple/4-tuple.
- RSS hash key.
- RSS indirection table.
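In ethtool terms this maps onto the rxnfc/rxfh callbacks; a sketch of the
wiring (the qede_* handler names are assumptions, the ethtool_ops fields
are the standard ones):

    static const struct ethtool_ops qede_ethtool_ops = {
            /* ... other ops ... */
            .get_rxnfc           = qede_get_rxnfc,  /* 2-/4-tuple UDP RSS */
            .set_rxnfc           = qede_set_rxnfc,
            .get_rxfh_indir_size = qede_get_rxfh_indir_size,
            .get_rxfh_key_size   = qede_get_rxfh_key_size,
            .get_rxfh            = qede_get_rxfh,   /* key + indir table */
            .set_rxfh            = qede_set_rxfh,
    };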
Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds the required API for passing RSS-related configuration from qede.
Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inbox drivers don't need a versioning scheme to guarantee
compatibility, as both qed and qede are compiled from the same codebase.
Signed-off-by: Rahul Verma <rahul.verma@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add Byte Queue Limits (BQL) support to bcmgenet driver.
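BQL boils down to reporting queued and completed bytes per TX queue via
the generic helpers; a fragment-level sketch (ring_index and the
completion counters are illustrative):

    struct netdev_queue *txq = netdev_get_tx_queue(dev, ring_index);

    /* xmit path: account bytes handed to the hardware */
    netdev_tx_sent_queue(txq, skb->len);

    /* TX completion path: report finished packets/bytes */
    netdev_tx_completed_queue(txq, pkts_compl, bytes_compl);

    /* ring (re)init path: clear BQL state */
    netdev_tx_reset_queue(txq);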
Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcmgenet_isr1() and bcmgenet_isr0() run in hard IRQ context, so
we do not need to block IRQs again.
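In other words, the plain lock variants suffice in the hard-IRQ handlers;
a sketch of the change (field names illustrative):

    static irqreturn_t bcmgenet_isr1(int irq, void *dev_id)
    {
            struct bcmgenet_priv *priv = dev_id;

            /* Hard IRQ context: interrupts are already disabled on
             * this CPU, so spin_lock() is enough; no need for the
             * _irqsave variant here.
             */
            spin_lock(&priv->lock);
            /* ... read and clear interrupt status ... */
            spin_unlock(&priv->lock);

            return IRQ_HANDLED;
    }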
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By using napi_complete_done(), we allow fine tuning
of /sys/class/net/ethX/gro_flush_timeout for higher GRO aggregation
efficiency for a Gbit NIC.
Check commit 24d2e4a507 ("tg3: use napi_complete_done()") for details.
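A sketch of the resulting poll callback (helper and ring names follow the
driver's of that era; treat the details as illustrative):

    static int bcmgenet_rx_poll(struct napi_struct *napi, int budget)
    {
            struct bcmgenet_rx_ring *ring =
                    container_of(napi, struct bcmgenet_rx_ring, napi);
            unsigned int work_done;

            work_done = bcmgenet_desc_rx(ring, budget);
            if (work_done < budget) {
                    /* Reporting work_done lets the stack honor
                     * gro_flush_timeout, unlike plain napi_complete().
                     */
                    napi_complete_done(napi, work_done);
                    /* ... re-enable RX DONE interrupt for this ring ... */
            }
            return work_done;
    }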
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Petri Gynther <pgynther@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function compiles to 895 bytes of machine code.
Clearly, this isn't a time-critical function.
For one, it has a number of udelay(1) calls.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
On GMAC4.xx, each descriptor contains 2 buffers of 16KB each.
Initially, those 2 buffers were filled in dwmac4_rd_prepare_tx_desc, but
that is actually not needed. Indeed, the stmmac driver supports frames
only up to 9000 bytes (jumbo), so only one buffer is needed.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The QID field gets set to the MAC id. This made the DMA linked list queue
the traffic of each MAC on a different internal queue. However, during
long-term testing we found that this causes traffic stalls, as the
multi-queue setup requires a more complete initialisation which is not
part of the upstream driver yet.
This patch removes the code setting the QID field, resulting in all
traffic ending up in queue 0 which works without any special setup.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The worker always touches both netdevs; it is ethernet-core specific, not
MAC specific. We only need one worker, and it belongs in the ethernet
core struct.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver supports 2 MACs. Both run on the same DMA ring. If we hit a TX
timeout we need to stop both netdevs before restarting them again. If we
don't do this, mtk_stop() won't shut down DMA, and the consecutive call to
mtk_open() won't restart DMA and enable IRQs.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inside the TX path there is a lock inside the tx_map function. This is,
however, too late. The patch moves the lock to the start of the xmit
function, right before the free count check of the DMA ring happens.
If we do not do this, the code becomes racy, leading to TX stalls and
dropped packets. This happens because there are 2 netdevs running on the
same physical DMA ring.
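A sketch of the resulting ordering in the xmit path
(mtk_tx_ring_free_count() and needed_descs below are illustrative):

    static netdev_tx_t mtk_start_xmit(struct sk_buff *skb,
                                      struct net_device *dev)
    {
            struct mtk_mac *mac = netdev_priv(dev);
            struct mtk_eth *eth = mac->hw;

            /* Take the lock before the free-count check: both netdevs
             * share one DMA ring, so the check and the mapping must be
             * atomic with respect to the other netdev's xmit path.
             */
            spin_lock(&eth->page_lock);
            if (mtk_tx_ring_free_count(eth) < needed_descs) {
                    netif_stop_queue(dev);
                    spin_unlock(&eth->page_lock);
                    return NETDEV_TX_BUSY;
            }
            /* ... mtk_tx_map() and queueing run under the same lock ... */
            spin_unlock(&eth->page_lock);
            return NETDEV_TX_OK;
    }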
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver supports 2 MACs. Both run on the same DMA ring. If we go
above/below the TX ring's threshold value, we always need to wake/stop
the queues of both devices. Not doing so can cause TX stalls and packet
drops on one of the devices.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW reset is triggered in the mtk_hw_init() function. There is no need to
also reset the core during probe.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code used to also support the PDMA engine, which had 2 packet pointers
per descriptor. Because of this we had to divide the result by 2 and round
it up. This is no longer needed as the code only supports QDMA.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original commit failed to set watchdog_timeo. This patch sets
watchdog_timeo to HZ.
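The fix itself is a one-liner at netdev setup time (the netdev variable
name is illustrative):

    /* Give the stack a TX watchdog period so .ndo_tx_timeout fires. */
    netdev->watchdog_timeo = HZ;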
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The host_port field is constantly assigned 0 and this value has never
changed since the cpsw driver was introduced. Moreover,
if this field were assigned a non-zero value, it would break current
driver functionality.
Hence, there is no reason to continue maintaining this host_port
field; it can be removed, and the HOST_PORT_NUM and ALE_PORT_HOST
defines can be used instead.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ALE APIs expect to receive port masks as input values for the arguments
port_mask, untag, reg_mcast and unreg_mcast. But there are a few places
in the code where port masks are passed left-shifted by
cpsw_priv->host_port, like below:

    cpsw_ale_add_vlan(priv->ale, priv->data.default_vlan,
                      ALE_ALL_PORTS << priv->host_port,
                      ALE_ALL_PORTS << priv->host_port, 0, 0);

and cpsw still works just because priv->host_port == 0
and has never been changed.
Hence, fix the port_mask parameters in ALE API calls and drop
"<< priv->host_port" from all places where it's used to
shift a valid port mask.
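After the fix, the call above becomes:

    cpsw_ale_add_vlan(priv->ale, priv->data.default_vlan,
                      ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0);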
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some dual port cards, link speeds on both ports have to be compatible.
Firmware will inform the driver when a certain speed is no longer
supported if the other port has linked up at a certain speed. Add
logic to handle this event by logging a message and getting the
updated list of supported speeds.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some hypervisors (e.g. ESX) require the VF MAC address to be forwarded to
the PF for approval. In Linux PF, the call is not forwarded and the
firmware will simply check and approve the MAC address if the PF has not
previously administered a valid MAC address for this VF.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let firmware know that the driver is giving up control of the link so that
it can be shutdown if no management firmware is running.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10GBaseT devices must autonegotiate to determine master/slave clocking.
Disallow forced speed in ethtool .set_settings() for these devices.
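A sketch of the check in .set_settings(), using the pre-ksettings ethtool
API; is_media_10gbt() below is a hypothetical helper:

    static int bnxt_set_settings(struct net_device *dev,
                                 struct ethtool_cmd *cmd)
    {
            /* 10GBaseT PHYs must autonegotiate to resolve master/slave
             * clocking, so reject forced-speed requests on such media.
             */
            if (is_media_10gbt(dev) && cmd->autoneg != AUTONEG_ENABLE)
                    return -EINVAL;

            /* ... apply the requested settings ... */
            return 0;
    }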
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enables the use of multiple transmit and receive scrqs allowing the ibmvnic
driver to take advantage of multiqueue functionality. To achieve this, the
driver must implement the process of negotiating the maximum number of
queues allowed by the server. Initially, the driver will attempt to login
with the maximum number of tx and rx queues supported by the server. If
the server fails to allocate the requested number of scrqs, it will return
partial success in the login response. In this case, we must reinitiate
the login process from the request capabilities stage and attempt to login
requesting fewer scrqs.
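Conceptually the negotiation is a retry loop; a hedged sketch where all
names, fields and the PARTIAL_SUCCESS handling are hypothetical:

    int req_tx = max_tx_supported;
    int req_rx = max_rx_supported;
    int rc;

    do {
            rc = ibmvnic_login(adapter, req_tx, req_rx);
            if (rc == PARTIAL_SUCCESS) {
                    /* Server granted fewer SCRQs: redo the request-
                     * capabilities stage and retry with fewer queues.
                     */
                    req_tx = adapter->granted_tx_queues;
                    req_rx = adapter->granted_rx_queues;
            }
    } while (rc == PARTIAL_SUCCESS);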
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a copy & paste error and state the name of the SBPM register correctly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Same field, same values, so share the same enum.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of that, pass mlxsw_core and use a helper to get the driver priv
from driver code. Looks much cleaner that way.
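The usage pattern in driver code becomes (sketch; the helper name is as
introduced by this series):

    /* Recover the driver-private struct on demand instead of having
     * the core pass it around directly.
     */
    struct mlxsw_sp *mlxsw_sp = mlxsw_core_driver_priv(mlxsw_core);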
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of passing around driver priv, pass struct mlxsw_core *
directly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove devlink port reg/unreg from spectrum and switchx2 code and rather
do the common work in core. That also ensures code separation where
devlink is only used in core.c.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since most of the changes required for changing the MTU at runtime
have already been made, let's use them for ring size changes as
well.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ring resize will soon call these functions with values
different from the current configuration, so we need to
explicitly pass the ring count as a parameter.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When changing the MTU on a running device, first allocate new rings
and buffers and, once that succeeds, proceed with changing the MTU.
Allocating new rings is not really necessary for this
operation - it's done to keep the code simple and because
the size of the extra ring memory is quite small compared to
the size of the buffers.
The operation can still fail midway through if FW communication
times out. In that case we retry with the old MTU (rings).
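The flow is a prepare/commit pattern; a hedged sketch where the
nfp_net_rings_* helpers are illustrative, not the driver's actual names:

    static int nfp_net_change_mtu_live(struct nfp_net *nn, int new_mtu)
    {
            int err;

            /* Allocate replacement rings and buffers up front. */
            err = nfp_net_rings_prepare(nn, new_mtu);
            if (err)
                    return err;

            /* Swap rings in and reconfigure the FW; on a FW timeout,
             * retry the reconfiguration with the old MTU and rings.
             */
            err = nfp_net_rings_commit(nn, new_mtu);
            if (err)
                    err = nfp_net_rings_commit(nn, nn->netdev->mtu);
            return err;
    }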
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The free list buffer size needs to be propagated to a few functions
as a parameter and added to struct nfp_net_rx_ring, since soon
some of the functions will be reused to manage rings with
buffers of a size different than nn->fl_bufsz.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FW reconfiguration in .ndo_open()/.ndo_stop() should reset/
restore the queue state. Since we need IRQs to be disabled when
filling rings on the RX path, we have to move disable_irq() from
.ndo_open() all the way up to IRQ allocation.
nfp_net_start_vec() becomes trivial now, so it's inlined.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Divide .ndo_open() and .ndo_stop() into logical, callable
chunks. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nfp_net_[rt]x_ring_{alloc,free} should only allocate or free
ring resources without touching the device. Move setting
parameters in the BAR to separate functions. This will make
it possible to reuse alloc/free functions to allocate new
rings while the device is running.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We want .ndo_open() to have the following structure:
- allocate resources;
- configure HW/FW;
- enable the device from stack perspective.
Therefore filling RX rings needs to be moved to the beginning
of .ndo_open().
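A sketch of that target shape (the nfp_net_open_* helper names are
illustrative):

    static int nfp_net_netdev_open(struct net_device *netdev)
    {
            struct nfp_net *nn = netdev_priv(netdev);
            int err;

            /* 1. Allocate resources, including filling the RX rings
             *    with buffers, before touching the device.
             */
            err = nfp_net_open_alloc_resources(nn);
            if (err)
                    return err;

            /* 2. Configure HW/FW with the prepared rings. */
            err = nfp_net_open_configure_hw(nn);
            if (err)
                    goto err_free;

            /* 3. Enable the device from the stack's perspective. */
            netif_tx_start_all_queues(netdev);
            return 0;

    err_free:
            nfp_net_open_free_resources(nn);
            return err;
    }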
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate the allocation of buffers from giving them to the FW;
thanks to this, it will be possible to move allocation
earlier on the .ndo_open() path and to reuse buffers during
runtime reconfiguration.
Similar to the TX side, clean up the spill of functionality
from flush into freeing the ring. Unlike on the TX side,
RX ring reset does not free buffers from the ring.
Ring reset means only that the FW pointers are zeroed and
buffers on the ring must be placed in positions [0, cnt - 1).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>