This patch adds support for the switchdev_port_attr_get() and
ndo_get_phys_port_name() methods for the PF and the VF-reps.
Using this support a user application can deduce that the PF
(when in the ESWITCH_SWDEV mode) and it's VF-reps form a switch.
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces the RX/TX and a simple netdev implementation
for VF-reps. The VF-reps use the RX/TX rings of the PF. For each VF-rep
the PF driver issues a VFR_ALLOC FW cmd that returns "cfa_code"
and "cfa_action" values. The FW sets up the filter tables in such
a way that VF traffic by default (in absence of other rules)
gets punted to the parent PF. The cfa_code value in the RX-compl
informs the driver of the source VF. For traffic being transmitted
from the VF-rep, the TX BD is tagged with a cfa_action value that
informs the HW to punt it to the corresponding VF.
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is a part of a patch-set that introduces support for
VF-reps in the bnxt_en driver. The driver registers eswitch mode
get/set methods with the devlink interface that allow a user to
enable SRIOV switchdev mode. When enabled, the driver registers
a VF-rep netdev object for each VF with the stack. This can
essentially bring the VFs unders the management perview of the
hypervisor and applications such as OVS.
The next patch in the series, adds the RX/TX routines and a slim
netdev implementation for the VF-reps.
Signed-off-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In addition to the ETS weight, older firmware also requires the min_bw
parameter to be set for it to work properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Report DCB_CAP_DCBX_LLD_MANAGED only if the firmware DCBX agent is enabled
and running for PF or VF. Otherwise, if both LLDP and DCBX agents are
disabled in firmware, we report DCB_CAP_DCBX_LLD_HOST and allow host
IEEE DCB settings. This patch refines the current logic in the driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For debugging purpose, it is sometimes useful to disable periodic
port statistics updates, so that the firmware logs will not be
filled with statistics update messages.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of duplicating the logic multiple times. Also, it is unnecessary
to zero the buffer in .get_ethtool_stats() because it is already zeroed
by the caller.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To allow users to set the hardware bridging mode to VEB or VEPA. Only
single function PF can change the bridging mode.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Retrieve and store the hardware bridge mode, so that we can implement
ndo_bridge_{get|set)link methods in the next patch.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VF representors and PTP are added features in the new firmware spec.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the skb is attached to the first control block of a fragmented
skb it is possible that the skb could be freed when reclaiming that
control block before all fragments of the skb have been consumed by
the hardware and unmapped.
This commit introduces first_cb and last_cb pointers to the skb
control block used by the driver to keep track of which transmit
control blocks within a transmit ring are the first and last ones
associated with the skb.
It then splits the bcmgenet_free_cb() function into transmit
(bcmgenet_free_tx_cb) and receive (bcmgenet_free_rx_cb) versions
that can handle the unmapping of dma mapped memory and cleaning up
the corresponding control block structure so that the skb is only
freed after the last associated transmit control block is reclaimed.
Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case we fail to map a single fragment, we would be leaving the
transmit ring populated with stale entries.
This commit introduces the helper function bcmgenet_put_txcb()
which takes care of rewinding the per-ring write pointer back to
where we left.
It also consolidates the functionality of bcmgenet_xmit_single()
and bcmgenet_xmit_frag() into the bcmgenet_xmit() function to
make the unmapping of control blocks cleaner.
Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IDM operations are usually one time ops and should be done in
firmware itself. Driver is not supposed to touch IDM registers.
However, for some SoCs', driver is performing IDM read/writes.
So this patch masks IDM operations in case firmware is taking
care of IDM operations.
Signed-off-by: Abhishek Shah <abhishek.shah@broadcom.com>
Reviewed-by: Oza Oza <oza.oza@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return type for idm register write callback should be void as 'writel'
API is used for write operation. However, there no need to have 'return'
in this function.
Signed-off-by: Abhishek Shah <abhishek.shah@broadcom.com>
Reviewed-by: Oza Oza <oza.oza@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc notices that large queue numbers would overflow the queue name
string:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c: In function 'bnx2x_get_strings':
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:25: error: '%d' directive writing between 1 and 10 bytes into a region of size 5 [-Werror=format-overflow=]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:25: note: directive argument in the range [0, 2147483647]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:5: note: 'sprintf' output between 2 and 11 bytes into a destination of size 5
There is a hard limit in place that makes the number at most two
digits, so the code is fine. This changes it to use snprintf()
to truncate instead of overflowing, which shuts up that warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't populate various tables on the stack but make them static const.
Makes the object code smaller by nearly 200 bytes:
Before:
text data bss dec hex filename
113468 11200 0 124668 1e6fc bnx2x_ethtool.o
After:
text data bss dec hex filename
113129 11344 0 124473 1e639 bnx2x_ethtool.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF driver sets up a list of firmware commands from the VF driver that
needs to be forwarded to the PF for approval. This list is a 256-bit
bitmap. The code that sets up the bitmap falls apart on big-endian
architecture. __set_bit() does not work because it operates on long types
whereas the firmware interface is defined in u32 types, causing bits in
the wrong 32-bit word to be set.
Fix it by setting the proper bits on an array of u32.
Fixes: de68f5de56 ("bnxt_en: Fix bitmap declaration to work on 32-bit arches.")
Reported-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When changing channels from combined to rx/tx or vice versa, the code
uses the wrong "sh" parameter to determine if we are reserving rings
for shared or non-shared mode. It should be using the ethtool requested
"sh" parameter instead of the current "sh" parameter.
Fix it by passing the "sh" parameter to bnxt_reserve_rings(). For
ethtool, we will pass in the requested "sh" parameter.
Fixes: 391be5c273 ("bnxt_en: Implement new scheme to reserve tx rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
.ndo_get_stats64() may not be protected by RTNL and can race with
.ndo_stop() or other ethtool operations that can free the statistics
memory. Fix it by setting a new flag BNXT_STATE_READ_STATS and then
proceeding to read statistics memory only if the state is OPEN. The
close path that frees the memory clears the OPEN state and then waits
for the BNXT_STATE_READ_STATS to clear before proceeding to free the
statistics memory.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mc configuration changes bnx2x_config_mcast() can return 0 for
success, negative for failure and positive for benign reason preventing
its immediate work, e.g., when the command awaits the completion of
a previously sent command.
When removing all configured macs on a 578xx adapter, if a positive
value would be returned driver would errneously log it as an error.
Fixes: c7b7b483cc ("bnx2x: Don't flush multicast MACs")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY library now supports an "internal" phy-mode, thus making our
custom parsing code now unnecessary.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To handle netpoll properly, the driver must only handle TX packets
during NAPI. Handling RX events cause warnings and errors in
netpoll mode. The ndo_poll_controller() method should call
napi_schedule() directly so that a NAPI weight of zero will be used
during netpoll mode.
The bnxt_en driver supports 2 ring modes: combined, and separate rx/tx.
In separate rx/tx mode, the ndo_poll_controller() method will only
process the tx rings. In combined mode, the rx and tx completion
entries are mixed in the completion ring and we need to drop the rx
entries and recycle the rx buffers.
Add a function bnxt_force_rx_discard() to handle this in netpoll mode
when we see rx entries in combined ring mode.
Reported-by: Calvin Owens <calvinowens@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we get a TPA_END completion to handle a completed LRO packet, it
is possible that hardware would indicate errors. The current code is
not checking for the error condition. Define the proper error bits and
the macro to check for this error and abort properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to bnxt to report bpf_prog ID during XDP_QUERY_PROG.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
It seems like a historic accident that these return unsigned char *,
and in many places that means casts are required, more often than not.
Make these functions return void * and remove all the casts across
the tree, adding a (u8 *) cast only where the unsigned char pointer
was used directly, all done with the following spatch:
@@
expression SKB, LEN;
typedef u8;
identifier fn = { skb_push, __skb_push, skb_push_rcsum };
@@
- *(fn(SKB, LEN))
+ *(u8 *)fn(SKB, LEN)
@@
expression E, SKB, LEN;
identifier fn = { skb_push, __skb_push, skb_push_rcsum };
type T;
@@
- E = ((T *)(fn(SKB, LEN)))
+ E = fn(SKB, LEN)
@@
expression SKB, LEN;
identifier fn = { skb_push, __skb_push, skb_push_rcsum };
@@
- fn(SKB, LEN)[0]
+ *(u8 *)fn(SKB, LEN)
Note that the last part there converts from push(...)[0] to the
more idiomatic *(u8 *)push(...).
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the recently introduced helper to replace the pattern of
skb_put() && memset(), this transformation was done with the
following spatch:
@@
identifier p;
expression len;
expression skb;
@@
-p = skb_put(skb, len);
-memset(p, 0, len);
+p = skb_put_zero(skb, len);
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make return value void since function never return meaningfull value
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to push the chain index down to the drivers, so they have the
information to which chain the rule belongs. For now, no driver supports
multichain offload, so only chain 0 is supported. This is needed to
prevent chain squashes during offload for now. Later this will be used
to implement multichain offload.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When freeing VF's DMA mappings, an already NULLed pointer was checked
again due to an apparent copy&paste error. Consequently, the pf2vf
bulletin DMA mapping was not freed.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On SYSTEMPORT Lite, since we have the main interrupt source in the first
cell, the second cell is the Wake-on-LAN interrupt, yet the code was not
properly updated to fetch the second cell, and instead looked at the
third and non-existing cell for Wake-on-LAN.
Fixes: 44a4524c54 ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Apparently multi-cos isn't working for bnx2x quite some time -
driver implements ndo_select_queue() to allow queue-selection
for FCoE, but the regular L2 flow would cause it to modulo the
fallback's result by the number of queues.
The fallback would return a queue matching the needed tc
[via __skb_tx_hash()], but since the modulo is by the number of TSS
queues where number of TCs is not accounted, transmission would always
be done by a queue configured into using TC0.
Fixes: ada7c19e6d ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to write the doorbell if BQL has stopped the queue and
skb->xmit_more is set. Otherwise it is possible for the tx queue to
rot and cause tx timeout.
Fixes: 4d172f21ce ("bnxt_en: Implement xmit_more.")
Suggested-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the existing code, the local variable sh is hardcoded to true to
calculate default rings for shared ring configuration. It is better
to have the caller determine the value of sh.
Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not write the TX doorbell if skb->xmit_more is set unless the TX
queue is full.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Older chips require the doorbells to be written twice, but newer chips
do not. Add a new common function bnxt_db_write() to write all
doorbells appropriately depending on the chip. Eliminating the extra
doorbell on newer chips has a significant performance improvement
on pktgen.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add additional chip definitions and macros for all supported chips.
Add a new macro BNXT_CHIP_P4_PLUS for the newer generation of chips and
use the macro to properly determine the features supported by these
newer chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When bnxt_en gets a PCI shutdown call, we need to have a new callback
to inform the RDMA driver to do proper shutdown and removal.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new short message format is used on the new BCM57454 VFs. Each
firmware message is a fixed 16-byte message sent using the standard
firmware communication channel. The short message has a DMA address
pointing to the legacy long firmware message.
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include HWTSTAMP_FILTER_NTP_ALL in net_hwtstamp_validate() as a valid
filter and update drivers which can timestamp all packets, or which
explicitly list unsupported filters instead of using a default case, to
handle the filter.
CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise, all the host based DCBX settings from lldpad will fail if the
firmware DCBX agent is running.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current code, bnxt_dcb_init() is called too early before we
determine if the firmware DCBX agent is running or not. As a result,
we are not setting the DCB_CAP_DCBX_HOST and DCB_CAP_DCBX_LLD_MANAGED
flags properly to report to DCBNL.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is inline function to test if carrier present,
so it makes open-coded solution redundant.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the SPARC platform we need to use the DMA_ATTR_WEAK_ORDERING attribute
in our Rx path dma mapping in order to get the expected performance out
of the receive path. Adding it to the Tx path has little effect, so
that's not a part of this patch.
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Tushar Dave <tushar.n.dave@oracle.com>
Reviewed-by: Tom Saeger <tom.saeger@oracle.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>