This mode is to support XDP. In this mode, each rx ring is configured
with page sized buffers for linear placement of each packet. MTU will be
restricted to what the page sized buffers can support.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert the global constants BNXT_RX_OFFSET and BNXT_RX_DMA_OFFSET to
device parameters. This will make it easier to support XDP with
headroom support which requires different RX buffer offsets.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When driver is running in XDP mode, rx buffers are DMA mapped as
DMA_BIDIRECTIONAL. Add a field so the code will map/unmap rx buffers
according to this field.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To support XDP_TX, we need the RX buffer's DMA address to transmit the
packet. Convert the DMA address field to a permanent field in
bnxt_sw_rx_bd.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor refactoring of bnxt_rx_skb() so that it can easily be replaced by
a new function that handles packets in a single page. Also, use a
function pointer bp->rx_skb_func() to switch to a new function when
we add the new mode in the next patch.
Add a new field data_ptr that points to the packet data in the
bnxt_sw_rx_bd structure. The original data field is changed to void
pointer so that it can either hold the kmalloc'ed data or a page
pointer.
The last parameter of bnxt_rx_skb() which was the length parameter is
changed to include the payload offset of the packet in the upper 16 bit.
The offset is needed to support the rx page mode and is not used in
this existing function.
v3: Added a new data_ptr parameter to bp->rx_skb_func(). The caller
has the option to modify the starting address of the packet. This
will be needed when XDP with headroom support is added.
v2: Changed the name of the last parameter to offset_and_len to make the
code more clear.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
napi_complete_done() allows to opt-in for gro_flush_timeout,
added back in linux-3.19, commit 3b47d30396
("net: gro: add a per device gro flush timer")
This allows for more efficient GRO aggregation without
sacrifying latencies.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_get_port_module_status() calls bnxt_update_link() which expects
RTNL to be held. In bnxt_sp_task() that does not hold RTNL, we need to
call it with a prior call to bnxt_rtnl_lock_sp() and the call needs to
be moved to the end of bnxt_sp_task().
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_update_link() is called from multiple code paths. Most callers,
such as open, ethtool, already hold RTNL. Only the caller bnxt_sp_task()
does not. So it is a bug to take RTNL inside bnxt_update_link().
Fix it by removing the RTNL inside bnxt_update_link(). The function
now expects the caller to always hold RTNL.
In bnxt_sp_task(), call bnxt_rtnl_lock_sp() before calling
bnxt_update_link(). We also need to move the call to the end of
bnxt_sp_task() since it will be clearing the BNXT_STATE_IN_SP_TASK bit.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bnxt_sp_task(), we set a bit BNXT_STATE_IN_SP_TASK so that bnxt_close()
will synchronize and wait for bnxt_sp_task() to finish. Some functions
in bnxt_sp_task() require us to clear BNXT_STATE_IN_SP_TASK and then
acquire rtnl_lock() to prevent race conditions.
There are some bugs related to this logic. This patch refactors the code
to have common bnxt_rtnl_lock_sp() and bnxt_rtnl_unlock_sp() to handle
the RTNL and the clearing/setting of the bit. Multiple functions will
need the same logic. We also need to move bnxt_reset() to the end of
bnxt_sp_task(). Functions that clear BNXT_STATE_IN_SP_TASK must be the
last functions to be called in bnxt_sp_task(). The common scheme will
handle the condition properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the TPA GRO code path, initialize the tcp_opt_len variable to 0 so
that it will be correct for packets without TCP timestamps. The bug
caused the SKB fields to be incorrectly set up for packets without
TCP timestamps, leading to these packets being rejected by the stack.
Reported-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
Acked-by: Andy Gospodarek <andrew.gospodarek@broadocm.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the ulp_sriov_cfg callbacks when the number of VFs is changing. This
allows the RDMA driver to provision RDMA resources for the VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add LED blinking code to support ethtool -p on the PF.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit bdbd1eb59c ("bnxt_en: Handle no aggregation ring gracefully.")
introduced the BNXT_FLAG_NO_AGG_RINGS flag. For consistency,
bnxt_set_tpa_flags() should also clear TPA flags when there are no
aggregation rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC [M] drivers/net/ethernet/broadcom/bnxt/bnxt.o
drivers/net/ethernet/broadcom/bnxt/bnxt.c:4947:21: warning: ‘bnxt_get_max_func_rss_ctxs’ defined but not used [-Wunused-function]
static unsigned int bnxt_get_max_func_rss_ctxs(struct bnxt *bp)
^
CC [M] drivers/net/ethernet/broadcom/bnxt/bnxt.o
drivers/net/ethernet/broadcom/bnxt/bnxt.c:4956:21: warning: ‘bnxt_get_max_func_vnics’ defined but not used [-Wunused-function]
static unsigned int bnxt_get_max_func_vnics(struct bnxt *bp)
^
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In dev_get_stats() the statistic structure storage has already been
zeroed. Therefore network drivers do not need to call memset() again.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.
Fix all drivers with ndo_get_stats64 to have a void function.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code assumes that we will always have at least 2 rx rings, 1
will be used as an aggregation ring for TPA and jumbo page placements.
However, it is possible, especially on a VF, that there is only 1 rx
ring available. In this scenario, the current code will fail to initialize.
To handle it, we need to properly set up only 1 ring without aggregation.
Set a new flag BNXT_FLAG_NO_AGG_RINGS for this condition and add logic to
set up the chip to place RX data linearly into a single buffer per packet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the added support for the bnxt_re RDMA driver, both drivers can be
allocating completion rings in any order. The firmware does not know
which completion ring should be receiving async events. Add an
extra step to tell firmware the completion ring number for receiving
async events after bnxt_en allocates the completion rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to properly support TX rate limiting in SRIOV VF functions or
NPAR functions, firmware needs better control over tx ring allocations.
The new scheme requires the driver to reserve the number of tx rings
and to query to see if the requested number of tx rings is reserved.
The driver will use the new scheme when the firmware interface spec is
1.6.1 or newer.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Accept ipv6 flows in .ndo_rx_flow_steer() and support ETHTOOL_GRXCLSRULE
ipv6 flows.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Assign additional vnics to VFs whenever possible so that NTUPLE can be
supported on the VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing hardware RFS mode uses one hardware RSS context block
per ring just to calculate the RSS hash. This is very wasteful and
prevents VF functions from using it. The new hardware mode shares
the same hardware RSS context for RSS placement and RFS steering.
This allows VFs to enable RFS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add function bnxt_rfs_supported() that determines if the chip supports
RFS. Refactor the existing function bnxt_rfs_capable() that determines
if run-time conditions support RFS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new vnic RSS capability will enhance NTUPLE support, to be added
in subsequent patches.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call tcp_gro_complete() in the common code path instead of the chip-
specific method. The newer 5731x method is missing the call.
Signed-off-by: Michael Chan <michael.chan@broadcmo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The advertising field is closely related to the auto_link_speeds field.
The former is the user setting while the latter is the firmware setting.
Both should be u16. We should use the advertising field in
bnxt_get_link_ksettings because the auto_link_speeds field may not
be updated with the latest from the firmware yet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IRQ is disabled by writing to the completion ring doorbell. This
should be done before the hardware completion ring is freed for correctness.
The current code disables IRQs after all the completion rings are freed.
Fix it by calling bnxt_disable_int_sync() before freeing the completion
rings. Rearrange the code to avoid forward declaration.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better busy polling and GRO support. Do not re-arm IRQ if
napi_complete_done() returns false.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use native NAPI polling instead. The next patch will complete the work
by switching to use napi_complete_done()
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the network driver and RDMA driver operate on the same PCI function,
we need to create an interface to allow the RDMA driver to share resources
with the network driver.
1. Create a new bnxt_en_dev struct which will be returned by
bnxt_ulp_probe() upon success. After that, all calls from the RDMA driver
to bnxt_en will pass a pointer to this struct.
2. This struct contains additional function pointers to register, request
msix, send fw messages, register for async events.
3. If the RDMA driver wants to enable RDMA on the function, it needs to
call the function pointer bnxt_register_device(). A ulp_ops structure
is passed for RCU protected upcalls from bnxt_en to the RDMA driver.
4. The RDMA driver can call firmware APIs using the bnxt_send_fw_msg()
function pointer.
5. 1 stats context is reserved when the RDMA driver registers. MSIX
and completion rings are reserved when the RDMA driver calls
bnxt_request_msix() function pointer.
6. When the RDMA driver calls bnxt_unregister_device(), all RDMA resources
will be cleaned up.
v2: Fixed 2 uninitialized variable warnings.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver register function with firmware consists of passing version
information and registering for async events. To support the RDMA driver,
the async events that we need to register may change. Separate the
driver register function into 2 parts so that we can just update the
async events for the RDMA driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the device supports RDMA, we'll setup network default rings so that
there are enough minimum resources for RDMA, if possible. However, the
user can still increase network rings to the max if he wants. The actual
RDMA resources won't be reserved until the RDMA driver registers.
v2: Fix compile warning when BNXT_CONFIG_SRIOV is not set.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All available remaining completion rings not used by the PF should be
made available for the VFs so that there are enough rings in the VF to
support RDMA. The earlier workaround code of capping the rings by the
statistics context is removed.
When SRIOV is disabled, call a new function bnxt_restore_pf_fw_resources()
to restore FW resources. Later on we need to add some logic to account
for RDMA resources.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that MSIX is enabled in bnxt_init_one(), resources may be allocated by
the RDMA driver before the network device is opened. So we cannot do
function reset in bnxt_open() which will clear all the resources.
The proper place to do function reset now is in bnxt_init_one().
If we get AER, we'll do function reset as well.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To better support the new RDMA driver, we need to move pci_enable_msix()
from bnxt_open() to bnxt_init_one(). This way, MSIX vectors are available
to the RDMA driver whether the network device is up or down.
Part of the existing bnxt_setup_int_mode() function is now refactored into
a new bnxt_init_int_mode(). bnxt_init_int_mode() is called during
bnxt_init_one() to enable MSIX. The remaining logic in
bnxt_setup_int_mode() to map the IRQs to the completion rings is called
during bnxt_open().
v2: Fixed compile warning when CONFIG_BNXT_SRIOV is not set.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By refactoring existing code into this new function. The new function
will be used in subsequent patches.
v2: Fixed compile warning when CONFIG_BNXT_SRIOV is not set.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function bnxt_hwrm_stat_ctx_alloc() always returns 0, even if the call
to _hwrm_send_message() fails. It may be better to propagate the errors
to the caller of bnxt_hwrm_stat_ctx_alloc().
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188661
Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Report PFC statistics to ethtool -S and DCBNL.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support only IEEE DCBX initially. Add IEEE DCBNL ops and functions to
get and set the hardware DCBX parameters. The DCB code is conditional on
Kconfig CONFIG_BNXT_DCB.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest interface has the latest DCB command structs. Get and store the
max number of lossless TCs the hardware can support.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new function bnxt_setup_mq_tc() to handle MQPRIO. This new function
will be called during ETS setup when we add DCBNL in the next patch.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
udplite conflict is resolved by taking what 'net-next' did
which removed the backlog receive method assignment, since
it is no longer necessary.
Two entries were added to the non-priv ethtool operations
switch statement, one in 'net' and one in 'net-next, so
simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
When busy polling while a link is down (during a link-flap test), TX
timeouts were observed as well as the following messages in the ring
buffer:
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free tx failed. rc:-1
bnxt_en 0008:01:00.2 enP8p1s0f2d2: Resp cmpl intr err msg: 0x51
bnxt_en 0008:01:00.2 enP8p1s0f2d2: hwrm_ring_free rx failed. rc:-1
These were resolved by checking for link status and returning if link
was not up.
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Tested-by: Rob Miller <rob.miller@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Knowing that:
#define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_VXLAN (0x1UL << 0)
#define TUNNEL_DST_PORT_FREE_REQ_TUNNEL_TYPE_GENEVE (0x5UL << 0)
and that 'bnxt_hwrm_tunnel_dst_port_alloc()' is only called with one of
these 2 constants, the TUNNEL_DST_PORT_ALLOC_REQ_TUNNEL_TYPE_GENEVE can not
trigger.
Replace the bit test that overlap by an equality test, just as in
'bnxt_hwrm_tunnel_dst_port_free()' above.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All conflicts were simple overlapping changes except perhaps
for the Thunder driver.
That driver has a change_mtu method explicitly for sending
a message to the hardware. If that fails it returns an
error.
Normally a driver doesn't need an ndo_change_mtu method becuase those
are usually just range changes, which are now handled generically.
But since this extra operation is needed in the Thunder driver, it has
to stay.
However, if the message send fails we have to restore the original
MTU before the change because the entire call chain expects that if
an error is thrown by ndo_change_mtu then the MTU did not change.
Therefore code is added to nicvf_change_mtu to remember the original
MTU, and to restore it upon nicvf_update_hw_max_frs() failue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a missing synchronize_net() call to avoid potential use after free,
since we explicitly call napi_hash_del() to factorize the RCU grace
period.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The newer chips have proper support for 4-tuple UDP RSS.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some dual port NICs, the speed setting on one port can affect the
available speed on the other port. Add logic to detect these changes
and adjust the advertised speed settings when necessary.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new FORCE_LINK_DWN bit to shutdown link during close.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the physical link is down and the VF virtual link is set to "enable",
the current code does not always work. If the link is down but the
cable is attached, the firmware returns LINK_SIGNAL instead of
NO_LINK. The current code is treating LINK_SIGNAL as link up.
The fix is to treat link as down when the link_status != LINK.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The logic is missing the check on whether the tx and rx rings are sharing
completion rings or not.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is automatically done from netif_napi_add(), and we want to not
export napi_hash_add() anymore in the following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tg3: min_mtu 60, max_mtu 9000/1500
bnxt: min_mtu 60, max_mtu 9000
bnx2x: min_mtu 46, max_mtu 9600
- Fix up ETH_OVREHEAD -> ETH_OVERHEAD while we're in here, remove
duplicated defines from bnx2x_link.c.
bnx2: min_mtu 46, max_mtu 9000
- Use more standard ETH_* defines while we're at it.
bcm63xx_enet: min_mtu 46, max_mtu 2028
- compute_hw_mtu was made largely pointless, and thus merged back into
bcm_enet_change_mtu.
b44: min_mtu 60, max_mtu 1500
CC: netdev@vger.kernel.org
CC: Michael Chan <michael.chan@broadcom.com>
CC: Sony Chacko <sony.chacko@qlogic.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Dept-HSGLinuxNICDev@qlogic.com
CC: Siva Reddy Kallam <siva.kallam@broadcom.com>
CC: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving
the ability for user-space application to specify it for the VF, as an
option to support 802.1ad.
We adjusted IP Link tool to support this option.
For future use cases, the new UAPI supports multiple vlans. For now we
limit the list size to a single vlan in kernel.
Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older versions of IP Link tool.
Add a vlan protocol parameter to the ndo_set_vf_vlan callback.
We kept 802.1Q as the drivers' default vlan protocol.
Suitable ip link tool command examples:
Set vf vlan protocol 802.1ad:
ip link set eth0 vf 1 vlan 100 proto 802.1ad
Set vf to VST (802.1Q) mode:
ip link set eth0 vf 1 vlan 100 proto 802.1Q
Or by omitting the new parameter
ip link set eth0 vf 1 vlan 100
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_hwrm_fw_set_time() now returns -EOPNOTSUPP when built for kernel
without RTC_LIB. Setting the firmware time is not critical to the
successful completion of the firmware update process.
Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VF link state can be changed via the 'ip link set' cmd.
Currently, the new link state does not take effect immediately.
The fix is for the PF to send a link change async event to the
designated VF after a VF link state change. This async event will
trigger the VF to update the link status.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Restart autoneg if autoneg is enabled.
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware has a limitation that it won't pass host to BMC loopback
packets below 52-bytes.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After generating the random MAC address for VF, call the firmware to
approve it. This step serves 2 purposes. Some hypervisor (e.g. ESX)
wants to approve the MAC address. 2nd, the call will setup the
proper forwarding database in the internal switch.
We need to unlock the hwrm_cmd_lock mutex before calling bnxt_approve_mac().
We can do that because we are at the end of the function and all the
previous firmware response data has been copied.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-arrange the code so that the generation of the random MAC address for
the VF is at the end of the function. The next patch will add one more step
to call bnxt_approve_mac() to get the firmware to approve the random MAC
address.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing code is inconsistent in reporting and accepting the combined
channel count. bnxt_get_channels() reports maximum combined as the
maximum rx count. bnxt_set_channels() accepts combined count that
cannot be bigger than max rx or max tx.
For example, if max rx = 2 and max tx = 1, we report max supported
combined to be 2. But if the user tries to set combined to 2, it will
fail because 2 is bigger than max tx which is 1.
Fix the code to be consistent. Max allowed combined = max(max_rx, max_tx).
We will accept a combined channel count <= max(max_rx, max_tx).
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using Ethtool flashdev command, entire NVM package (*.pkg) files
may now be staged into the "update" area of the NVM and subsequently
verified and installed by the firmware using the newly introduced
command: NVM_INSTALL_UPDATE.
We also introduce use of the new firmware command FW_SET_TIME so that the
NVM-resident package installation log contains valid time-stamps.
Signed-off-by: Rob Swindell <Rob.Swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove "Single-port/Dual-port" from the device names. Dual-port devices
will appear as 2 separate devices, so no need to call each a dual-port
device. Use a more generic name for VF devices belonging to the same
chip fanmily. Add some remaining NPAR device IDs.
Signed-off-by: David Christensen <david.christensen@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And remove redundant definitions of the same flags.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a code path where we are calling __iowrite64_copy() on
an address that is not 64-bit aligned. This causes an exception on
some architectures such as arm64. Fix that code path by using
__iowrite32_copy().
Reported-by: JD Zheng <jiandong.zheng@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add 5741X/5731X NPAR device IDs and dual media SFP/10GBase-T device IDs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there are not enough resources to enable ntuple filtering,
log a warning message.
v2: Use single message and add missing newline.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the destination MAC address in the ntuple filter structure. The
current code assumes that the destination MAC address is always the MAC
address of the NIC. This may not be true if there are macvlans, for
example. Add destination MAC address checking and configure the filter
correctly using the correct index for the destination MAC address.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
txr->dev_state was not consistently manipulated with the acquisition of
the per-queue lock, after further inspection the lock does not seem
necessary, either the value is read as BNXT_DEV_STATE_CLOSING or 0.
Reported-by: coverity (CID 1339583)
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A bridge device in NS2 has the same device ID as the ethernet controller.
Add check to avoid probing the bridge device.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate special vnic for dropping packets not matching the RX filters.
First vnic is for normal RX packets and the driver will drop all
packets on the 2nd vnic.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate napi for special vnic, packets arriving on this
napi will simply be dropped and the buffers will be replenished back
to the HW.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware is unable to drop rx packets not matching the RX filters. To
workaround it, we create a special VNIC and configure the hardware to
direct all packets not matching the filters to it. We then setup the
driver to drop packets received on this VNIC.
This patch creates the infrastructure for this VNIC, reserves a
completion ring, and rx rings. Only shared completion ring mode is
supported. The next 2 patches add a NAPI to handle packets from this
VNIC and the setup of the VNIC.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nitro A0 has a hardware bug in the rx path. The workaround is to create
a special COS context as a path for non-RSS (non-IP) packets. Without this
workaround, the chip may stall when receiving RSS and non-RSS packets.
Add infrastructure to allow 2 contexts (RSS and CoS) per VNIC. Allocate
and configure the CoS context for Nitro A0.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nitro is the embedded version of the ethernet controller in the North
Star 2 SoC. Add basic code to recognize the chip ID and disable
the features (ntuple, TPA, ring and port statistics) not supported on
Nitro A0.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rc is not initialized so it can contain garbage if it is not
set by the call to bnxt_read_sfp_module_eeprom_info. Ensure
garbage is not returned by initializing rc to 0.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This code generates as static checker warning because htons(ETH_P_IPV6)
is always true. From the context it looks like the && was intended to
be !=.
Fixes: 94758f8de0 ('bnxt_en: Add GRO logic for BCM5731X chips.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The allowable range is 0.25 seconds to 1 second interval. Default is
1 second.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is useful for multi-function devices.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With a default VLAN, the VF has its own VLAN domain and it can receive
all traffic within that domain.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For correctness, the MRU enables bit must be set when passing the
MRU to firmware during vnic configuration.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to the Ethtool FLASHDEV command handler for additional
firmware types to cover all the on-chip processors.
Signed-off-by: Rob Swindell <rob.swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon successful mgmt processor firmware update, request a self
reset upon next PCIe reset (e.g. system reboot).
Signed-off-by: Rob Swindell <rob.swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To support Secure Firmware Update, we must be able to allocate
a staging area in the Flash. This patch adds support for the
"update" type to tell firmware to do that.
Signed-off-by: Rob Swindell <rob.swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling the firmware to do function reset on the PF will kill all the VFs.
To prevent that, we call function reset on the 1st PF open before any VF
can be activated. On subsequent PF opens (with possibly some active VFs),
a bit has been set and we'll skip the function reset. VF driver will
always do function reset on every open. If there is an AER event, we will
always do function reset.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Returning 0 for doing nothing is confusing to the user.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The port number for GENEVE is hard coded into the bnxt driver. This is the
kind of thing we want to avoid going forward. For now I will integrate
this back into the port notifier so that we can change the GENEVE port
number if we need to in the future.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ends up doing several things. First it updates the driver to
make use of the new unified UDP tunnel offload notifier functions. In
addition I updated the code so that we can work around the bits that were
checking for if VXLAN was enabled since we are now using a notifier based
setup.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To fully support 25G and 50G link settings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some cards do not support autoneg. The current code does not prevent the
user from enabling autoneg with ethtool on such cards, causing confusion.
Firmware provides the autoneg capability information and we just need to
store it in the support_auto_speeds field in bnxt_link_info struct.
The ethtool set_settings() call will check this field before proceeding
with autoneg.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add bnxt_gro_func_5731x() to handle GRO packets for this chip. The
completion structures used in the new chip have new data to help determine
the header offsets. The offsets can be off by 4 if the packet is an
internal loopback packet (e.g. from one VF to another VF). Some additional
logic is added to adjust the offsets if it is a loopback packet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer chips require different logic to handle GRO packets. So refactor
the code so that we can call different functions depending on the chip.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define all the supported chip numbers and chip categories. Store the
chip_num returned by firmware. If the call to get the version and chip
number fails, we should abort.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NPAR type is read from bnxt_hwrm_func_qcfg. Do not allow changing link
parameters if in NPAR mode sinc ethe port is shared among multiple
partitions. The link parameters are set up by firmware.
Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the VF driver gets this event, the VF configuration has changed (such
as default VLAN). The VF driver will initiate a silent reset to pick up
the new configuration.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a default VLAN is added to the VF, the VF driver needs to reset to
pick up the default VLAN ID. We can use the same tx timeout reset logic
to do that, without the debug output. This new function, with the
silent parameter to suppress debug output will now serve both purposes.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF can setup a default VLAN for a VF. The default VLAN tag is
automatically inserted and stripped without the knowledge of the
stack running on the VF. The VF driver needs to know that default
VLAN is enabled as VLAN acceleration on the RX side is no longer
supported. Call netdev_update_features() to fix up the VLAN features
as necessary. Also, VLAN strip mode must be enabled to strip out
the default VLAN tag.
Only allow VF default VLAN to be set if the firmware spec is >= 1.2.1.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since both CTAG and STAG rx acceleration must be enabled together, we
only need to check one feature flag (NETIF_F_HW_VLAN_CTAG_RX) before
calling __vlan_hwaccel_put_tag().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware can only be set to strip or not strip both the VLAN CTAG and
STAG. It cannot strip one and not strip the other. Add logic to
bnxt_fix_features() to toggle both feature flags when the user is toggling
one of them.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the is_push flag in the software BD before the tx data is pushed to
the chip. It is possible to get the tx interrupt as soon as the tx data
is pushed. The tx handler will not handle the event properly if the
is_push flag is not set and it will crash.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
NETIF_F_GSO_IPXIP6. These are used to described IP in IP
tunnel and what the outer protocol is. The inner protocol
can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
are removed (these are both instances of SKB_GSO_IPXIP4).
SKB_GSO_IPXIP6 will be used when support for GSO with IP
encapsulation over IPv6 is added.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the weaker but more appropriate dma_rmb() to order the reading of
the completion ring.
Suggested-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code is more complicated than necessary and can only report
unsupported SFP+ module if it is plugged in after the device is up.
Rename bnxt_port_module_event() to bnxt_get_port_module_status(). We
already have the current module_status in the link_info structure, so
just check that and report any unsupported SFP+ module status. Delete
the unnecessary last_port_module_event. Call this function at the
end of bnxt_open to report unsupported module already plugged in.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The len value in the hwrm error message is wrong. Use the properly adjusted
value in the variable len.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code has 2 problems:
1. The maximum wait time is not long enough. It is about 60% of the
duration specified by the firmware. It is calling usleep_range(600, 800)
for every 1 msec we are supposed to wait.
2. The granularity of the delay is too coarse. Many simple firmware
commands finish in 25 usec or less.
We fix these 2 issues by multiplying the original 1 msec loop counter by
40 and calling usleep_range(25, 40) for each iteration.
There is also a second delay loop to wait for the last DMA word to
complete. This delay loop should be a very short 5 usec wait.
This change results in much faster bring-up/down time:
Before the patch:
time ip link set p4p1 up
real 0m0.120s
user 0m0.001s
sys 0m0.009s
After the patch:
time ip link set p4p1 up
real 0m0.030s
user 0m0.000s
sys 0m0.010s
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The chip supports 4K/8K/64K page sizes for the rings and we try to
match it to the CPU PAGE_SIZE. The current page size limits for the rings
are based on 4K/8K page size. If the page size is 64K, these limits are
too large. Reduce them appropriately.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add code to log a message during driver load indicating PCIe link
speed and width.
The log message will look like this:
bnxt_en 0000:86:00.0 eth0: PCIe: Speed 8.0GT/s Width x8
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to fetch the SFP EEPROM settings from the firmware
and display it via the ethtool -m command. We support SFP+ and QSFP
modules.
v2: Fixed a bug in bnxt_get_module_eeprom() found by Ben Hutchings.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When there is only 1 MSI-X vector or in INTA mode, tx and rx pre-set
max channel parameters are shown incorrectly in ethtool -l. With only 1
vector, bnxt_get_max_rings() will return -ENOMEM. bnxt_get_channels
should check this return value, and set max_rx/max_tx to 0 if it is
non-zero.
Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nf_conntrack_core.c fix in 'net' is not relevant in 'net-next'
because we no longer have a per-netns conntrack hash.
The ip_gre.c conflict as well as the iwlwifi ones were cases of
overlapping changes.
Conflicts:
drivers/net/wireless/intel/iwlwifi/mvm/tx.c
net/ipv4/ip_gre.c
net/netfilter/nf_conntrack_core.c
Signed-off-by: David S. Miller <davem@davemloft.net>
Add detection and recovery code when the hardware returned opaque value
does not match the expected consumer index. Once the issue is detected,
we skip the processing of all RX and LRO/GRO packets. These completion
entries are discarded without sending the SKB to the stack and without
producing new buffers. The function will be reset from a workqueue.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a rare hardware bug that can cause a bad opaque value in the RX
or TPA completion. When this happens, the hardware may have used the
same buffer twice for 2 rx packets. In addition, the driver will also
crash later using the bad opaque as the index into the ring.
The rx opaque value is predictable and is always monotonically increasing.
The workaround is to keep track of the expected next opaque value and
compare it with the one returned by hardware during RX and TPA start
completions. If they miscompare, we will not process any more RX and
TPA completions and exit NAPI. We will then schedule a workqueue to
reset the function.
This patch adds the logic to keep track of the next rx consumer index.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In netdevice.h we removed the structure in net-next that is being
changes in 'net'. In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.
The mlx5 conflicts have to do with vxlan support dependencies.
Signed-off-by: David S. Miller <davem@davemloft.net>
The multicast/all-multicast internal flags are not properly restored
after device reset. This could lead to unreliable multicast operations
after an ethtool configuration change for example.
Call bnxt_mc_list_updated() and setup the vnic->mask in bnxt_init_chip()
to fix the issue.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The code determines if the next ring entry is valid before proceeding
further to read the rest of the entry. The CPU can re-order and read
the rest of the entry first, possibly reading a stale entry, if DMA
of a new entry happens right after reading it. This issue can be
readily seen on a ppc64 system, causing it to crash.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch assumes that the bnxt hardware will ignore existing IPv4/v6
header fields for length and checksum as well as the length and checksum
fields for outer UDP and GRE headers.
I have been told by Michael Chan that this is working. Though this might
be somewhat redundant for IPv6 as they are forcing the checksum to be
computed for all IPv6 frames that are offloaded. A follow-up patch may be
necessary in order to fix this as it is essentially mangling the outer IPv6
headers to add a checksum where none was requested.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
net/ipv4/ip_gre.c
Minor conflicts between tunnel bug fixes in net and
ipv6 tunnel cleanups in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
If PAGE_SIZE is bigger than BNXT_RX_PAGE_SIZE, that means the native CPU
page is bigger than the maximum length of the RX BD. Divide the page
into multiple 32K buffers for the aggregation ring.
Add an offset field in the bnxt_sw_rx_agg_bd struct to keep track of the
page offset of each buffer. Since each page can be referenced by multiple
buffer entries, call get_page() as needed to get the proper reference
count.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RX BD length field of this device is 16-bit, so the largest buffer
size is 65535. For LRO and GRO, we allocate native CPU pages for the
aggregation ring buffers. It won't work if the native CPU page size is
64K or bigger.
We fix this by defining BNXT_RX_PAGE_SIZE to be native CPU page size
up to 32K. Replace PAGE_SIZE with BNXT_RX_PAGE_SIZE in all appropriate
places related to the rx aggregation ring logic.
The next patch will add additional logic to divide the page into 32K
chunks for aggrgation ring buffers if PAGE_SIZE is bigger than
BNXT_RX_PAGE_SIZE.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only MSI-X can be used on a VF. The driver should fail initialization
if it cannot successfully enable MSI-X.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On some dual port cards, link speeds on both ports have to be compatible.
Firmware will inform the driver when a certain speed is no longer
supported if the other port has linked up at a certain speed. Add
logic to handle this event by logging a message and getting the
updated list of supported speeds.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some hypervisors (e.g. ESX) require the VF MAC address to be forwarded to
the PF for approval. In Linux PF, the call is not forwarded and the
firmware will simply check and approve the MAC address if the PF has not
previously administered a valid MAC address for this VF.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let firmware know that the driver is giving up control of the link so that
it can be shutdown if no management firmware is running.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10GBaseT devices must autonegotiate to determine master/slave clocking.
Disallow forced speed in ethtool .set_settings() for these devices.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If autoneg is off, we should always report the speed and duplex settings
even if it is link down so the user knows the current settings. The
unknown speed and duplex should only be used for autoneg when link is
down.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check that the forced speed is a valid speed supported by firmware.
If not supported, return -EINVAL.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the PORT_CONN_NOT_ALLOWED async event handling logic. The driver
will print an appropriate warning to reflect the SFP+ module enforcement
policy done in the firmware.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the driver only sets bit 0 of the async_event_fwd fields.
To be compatible with the latest spec, we need to set the
appropriate event bits handled by the driver. We should be handling
link change and PF driver unload events, so these 2 bits should be
set.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow users to get|set EEE parameters.
v2: Added comment for preserving the tx_lpi_timer value in get_eee.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Add bnxt_hwrm_set_eee() function to setup EEE firmware parameters based
on the bp->eee settings.
2. The new function bnxt_eee_config_ok() will check if EEE parameters need
to be modified due to autoneg changes.
3. bnxt_hwrm_set_link() has added a new parameter to update EEE. If the
parameter is set, it will call bnxt_hwrm_set_eee().
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get EEE capability and the initial EEE settings from firmware.
Add "EEE is active | not active" to link up dmesg.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the new AUTONEG_PAUSE bit in the new interface to better
control autoneg flow control settings, independent of RX and TX
advertisement settings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use new field names in API structs and stop using deprecated fields
auto_link_speed and auto_duplex in phy_cfg/phy_qcfg structs.
Update copyright year to 2016.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To report flow control tx/rx settings accurately regardless of autoneg
setting, we should use link_info->req_flow_ctrl. Before this patch,
the reported settings were only correct when autoneg was on.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The typo caused the wrong flow control bit to be set.
Reported by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The size of every padded firmware message is specified in the first
HWRM_VER_GET response message. Use this value to pad every message
after that.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing code does the following:
allocate completion ring
initialize completion ring doorbell
disable interrupts on this completion ring by writing to the doorbell
We can have a race where firmware sends an asynchronous event to the host
after completion ring allocation and before doorbell is initialized.
When this happens driver can crash while ringing the doorbell using
uninitialized value as part of handling the IRQ/napi request.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add pci_error_handler callbacks to support for pcie advanced error
recovery.
Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the more useful port statistics in ethtool -S for the PF device.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include some of the port error counters (e.g. crc) in ->ndo_get_stats64()
for the PF device.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gather periodic port statistics if the device is PF and link is up. This
is triggered in bnxt_timer() every one second to request firmware to DMA
the counters.
Signed-off-by: Michael Chan <michael.chan@broadocm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow all autoneg speeds aupported by firmware to be advertised. If
the advertising parameter is 0, then all supported speeds will be
advertised.
Remove BNXT_ALL_COPPER_ETHTOOL_SPEED which is no longer used as all
supported speeds can be advertised.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The supported bits and advertising bits in ethtool have the same
definitions. The same is true for the firmware bits. So use the
common function to handle the conversion for both supported and
advertising bits.
v2: Don't use parentheses on function return.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And report actual pause settings to ETHTOOL_GPAUSEPARAM to let ethtool
resolve the actual pause settings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include the conversion of pause bits and add one extra call layer so
that the same refactored function can be reused to get the link partner
advertisement bits.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several cases of overlapping changes, as well as one instance
(vxlan) of a bug fix in 'net' overlapping with code movement
in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
I added this check in setup_tc to multiple drivers,
if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
Unfortunately restricting to TC_H_ROOT like this breaks the old
instantiation of mqprio to setup a hardware qdisc. This patch
relaxes the test to only check the type to make it equivalent
to the check before I broke it. With this the old instantiation
continues to work.
A good smoke test is to setup mqprio with,
# tc qdisc add dev eth4 root mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7
Fixes: e4c6734eaa ("net: rework ndo tc op to consume additional qdisc handle paramete")
Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
Reported-by: Jake Keller <jacob.e.keller@intel.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is used to send NVM_FIND_DIR_ENTRY messages which can return error
if the entry is not found. This is normal and the error message will
cause unnecessary alarm, so silence it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new function bnxt_do_send_msg() to do essentially the same thing
with an additional paramter to silence error response messages. All
current callers will set silent to false.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For everything to fit, we remove the PHY microcode version and replace it
with the firmware package version in the fw_version string.
Signed-off-by: Rob Swindell <swindell@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use appropriate firmware request header structure to prepare the
firmware messages. This avoids the unnecessary conversion of the
fields to 32-bit fields. Add appropriate endian conversion when
printing out the message fields in dmesg so that they appear correct
in the log.
Reported-by: Rob Swindell <swindell@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this patch, we used a hardcoded value of 500 msec as the default
value for firmware message response timeout. For better portability with
future hardware or debug platforms, use the value provided by firmware in
the first response and store it for all susequent messages. Redefine the
macro HWRM_CMD_TIMEOUT to the stored value. Since we don't have the
value yet in the first message, use the 500 ms default if the stored value
is zero.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When tx and rx rings don't share the same completion ring, tx coalescing
parameters can be set differently from the rx coalescing parameters.
Otherwise, use rx coalescing parameters on shared completion rings.
Adjust rx coalescing default values to lower interrupt rate.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a function to set all the coalescing parameters. The function can
be used later to set both rx and tx coalescing parameters.
v2: Fixed function parameters formatting requested by DaveM.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don't convert these to internal hardware tick values before storing
them. This avoids the confusion of ethtool -c returning slightly
different values than the ones set using ethtool -C when we convert
hardware tick values back to micro seconds. Add better comments for
the hardware settings.
Also, rename the current set of coalescing fields with rx_ prefix.
The next patch will add support of tx coalescing values.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During remove_one() when SRIOV is enabled, the PF driver
should broadcast PF driver unload notification to all
VFs that are attached to VMs. Upon receiving the PF
driver unload notification, the VF driver should print
a warning message to message log. Certain operations on the
VF may not succeed after the PF has unloaded.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow the VF to setup its own MAC address if the PF has not administratively
set it for the VF. To do that, we should always store the MAC address
from the firmware. There are 2 cases:
1. The MAC address is valid. This MAC address is assigned by the PF and
it needs to override the current VF MAC address.
2. The MAC address is zero. The VF will use a random MAC address by default.
By storing this 0 MAC address in the VF structure, it will allow the VF
user to change the MAC address later using ndo_set_mac_address() when
it sees that the stored MAC address is 0.
v2: Expanded descriptions and added more comments.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The arithmetic to zero pad the last 64-bit word in the push buffer is not
correct.
1. It should be pdata + length to get to the end.
2. 'pdata' is void pointer and passing it to PTR_ALIGN() will cast the
aligned pointer to void. Pass 'end' which is u64 pointer to PTR_ALIGN()
instead so that the aligned pointer - 1 is the last 64-bit pointer to data.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/phy/bcm7xxx.c
drivers/net/phy/marvell.c
drivers/net/vxlan.c
All three conflicts were cases of simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
If we fail to update the PHY, we should print a warning and continue.
The current code to exit is buggy as it has not freed up the NIC
resources yet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bnxt_update_phy_setting() to check the correct parameters when
determining whether to update the PHY. Requested line speed/duplex should
only be checked for forced speed mode. This avoids unnecessary link
interruptions when loading the driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When shutting down the NIC, we shutdown async event processing before
freeing all the rings. If there is a link change event during reset, the
driver may miss it and the link state may be incorrect after the NIC is
re-opened. Poll the link at the end of __bnxt_open_nic() to get the
correct link status.
Signed-off-by Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates setup_tc so we can pass additional parameters into
the ndo op in a generic way. To do this we provide structured union
and type flag.
This lets each classifier and qdisc provide its own set of attributes
without having to add new ndo ops or grow the signature of the
callback.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_setup_tc() op was added to support drivers offloading tx
qdiscs however only support for mqprio was ever added. So we
only ever added support for passing the number of traffic classes
to the driver.
This patch generalizes the ndo_setup_tc op so that a handle can
be provided to indicate if the offload is for ingress or egress
or potentially even child qdiscs.
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current default tx ring size of 512 causes an extra page to be
allocated for the tx ring with only 1 entry in it. Reduce it to
511. The default rx ring size is also reduced to 511 to use less
memory by default.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tx push is supported for small packets to reduce DMA latency. The
following bugs are fixed in this patch:
1. Fix the definition of the push BD which is different from the DMA BD.
2. The push buffer has to be zero padded to the next 64-bit word boundary
or tx checksum won't be correct.
3. Increase the tx push packet threshold to 164 bytes (192 bytes with the BD)
so that small tunneled packets are within the threshold.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
20G is not supported by production hardware and only the 40GbaseCR4 standard
is supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cleanup bnxt_probe_phy() to cleanly separate 2 code blocks for autoneg
on and off. Autoneg flow control is possible only if autoneg is enabled.
In bnxt_get_settings(), Pause and Asym_Pause are always supported.
Only the advertisement bits change depending on the ethtool -A setting
in auto mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Determine autoneg on|off setting from link_info->autoneg. Using the
firmware returned setting can be misleading if autoneg is changed and
there hasn't been a phy update from the firmware.
2. If autoneg is disabled, link_info->autoneg should be set to 0 to
indicate both speed and flow control autoneg are disabled.
3. To enable autoneg flow control, speed autoneg must be enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ring index j is not wrapped properly at the end of the ring, causing
it to reference pointers past the end of the ring. For proper loop
termination and to access the ring properly, we need to increment j and
mask it before referencing the ring entry.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This hardware counter is misleading as it counts dropped packets that
don't match the hardware filters for unicast/broadcast/multicast. We
will still report this counter in ethtool -S for diagnostics purposes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use completion ring for ring free response from firmware. The response
will be the last entry in the ring and we can free the ring after getting
the response. This will guarantee no spurious DMA to freed memory.
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newly added code in the bnxt driver uses a couple of variables that
are never initialized when CONFIG_BNXT_SRIOV is not set, and gcc
correctly warns about that:
In file included from include/linux/list.h:8:0,
from include/linux/module.h:9,
from drivers/net/ethernet/broadcom/bnxt/bnxt.c:10:
drivers/net/ethernet/broadcom/bnxt/bnxt.c: In function 'bnxt_get_max_rings':
include/linux/kernel.h:794:26: warning: 'cp' may be used uninitialized in this function [-Wmaybe-uninitialized]
include/linux/kernel.h:794:26: warning: 'tx' may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/net/ethernet/broadcom/bnxt/bnxt.c:5730:11: warning: 'rx' may be used uninitialized in this function [-Wmaybe-uninitialized]
drivers/net/ethernet/broadcom/bnxt/bnxt.c:5736:6: note: 'rx' was declared here
This changes the condition so that we fall back to using the PF
data if VF is not available, and always initialize the variables
to something useful.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6e6c5a57fb ("bnxt_en: Modify bnxt_get_max_rings() to support shared or non shared rings.")
Acked-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use HWRM_FW_RESET command to request a self-reset of the embedded
processor(s) after successfully applying a firmware update. For boot
processor, the self-reset is currently deferred until the next PCIe reset.
Signed-off-by: Rob Swindell <swindell@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For future compatibility, zero pad all messages that the driver sends
to the firmware to 128 bytes. If these messages are extended in the
future with new byte enables, zero padding these messages now will
guarantee future compatibility.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver can support either all combined or all rx/tx rings. The
default is combined, but the user can now select rx/tx rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify ring memory allocation and MSIX setup to support shared or
non shared rings and do the proper mapping. Default is still to
use shared rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add logic to calculate how many shared or non shared rings can be
supported. Default is to use shared rings.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to support dedicated or shared completion rings, the ring
indexing and mapping are re-structured as below:
1. bp->grp_info[] array index is 1:1 with bp->bnapi[] array index and
completion ring index.
2. rx rings 0 to n will be mapped to completion rings 0 to n.
3. If tx and rx rings share completion rings, then tx rings 0 to m will
be mapped to completion rings 0 to m.
4. If tx and rx rings use dedicated completion rings, then tx rings 0 to
m will be mapped to completion rings n + 1 to n + m.
5. Each tx or rx ring will use the corresponding completion ring index
for doorbell mapping and MSIX mapping.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each bnxt_napi structure may no longer be having both an rx ring and
a tx ring. Check for a valid ring before using it.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, an rx and a tx ring are always paired with a completion ring.
We want to restructure it so that it is possible to have a dedicated
completion ring for tx or rx only.
The bnxt hardware uses a completion ring for rx and tx events. The driver
has to process the completion ring entries sequentially for the rx and tx
events. Using a dedicated completion ring for rx only or tx only has these
benefits:
1. A burst of rx packets can cause delay in processing tx events if the
completion ring is shared. If tx queue is stopped by BQL, this can cause
delay in re-starting the tx queue.
2. A completion ring is sized according to the rx and tx ring size rounded
up to the nearest power of 2. When the completion ring is shared, it is
sized by adding the rx and tx ring sizes and then rounded to the next power
of 2, often with a lot of wasted space.
3. Using dedicated completion ring, we can adjust the tx and rx coalescing
parameters independently for rx and tx.
The first step is to separate the rx and tx ring structures from the
bnxt_napi struct.
In this patch, an rx ring and a tx ring will point to the same bnxt_napi
struct to share the same completion ring. No change in ring assignment
and mapping yet.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By adding 3 separate functions to dump the different ring states.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added the PCI IDs for the BCM57301 and BCM57402 controllers.
Signed-off-by: David Christensen <davidch@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This interface will be forward compatible with future changes.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer firmware will return the ring group resource when we call
hwrm_func_qcaps(). To be compatible with older firmware, use the
number of tx rings as the number of ring groups if the older firmware
returns 0. When determining how many rx rings we can support, take
the ring group resource in account as well in _bnxt_get_max_rings().
Divide and assign the ring groups to VFs.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to keep track of all resources, such as rx rings, tx rings,
cmpl rings, rss contexts, stats contexts, vnics, after we have
divided them for the VFs. Otherwise, subsequent ring changes on
the PF may not work correctly.
We adjust all max resources in struct bnxt_pf_info after they have been
assigned to the VFs. There is no need to keep the separate
max_pf_tx_rings and max_pf_rx_rings.
When SR-IOV is disabled, we call bnxt_hwrm_func_qcaps() to restore the
max resources for the PF.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Use local variable pf for repeated access to this pointer.
2. The 2nd argument num_vfs was unnecessarily declared as pointer to int.
This function doesn't change num_vfs so change the argument to int.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware resources required to enable NTUPLE varies depending on
how many rx channels are configured. We need to make sure we have the
resources before we enable NTUPLE. Add bnxt_rfs_capable() to do the
checking.
In addition, we need to do the same checking in ndo_fix_features(). As
the rx channels are changed using ethtool -L, we call
netdev_update_features() to make the necessary adjustment for NTUPLE.
Calling netdev_update_features() in netif_running() state but before
calling bnxt_open_nic() would be a problem. To make this work,
bnxt_set_features() has to be modified to test for BNXT_STATE_OPEN for
the true hardware state instead of checking netif_running().
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If hardware completes single segment rx frames, don't bother setting
up all the GRO related fields. Pass the SKB up as a normal frame.
Reviewed-by: vasundhara volam <vvolam@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also, no need to check for bp->rx_nr_rings as it is always >= 1. If the
allocation fails, it is not a fatal error and we can still proceed.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rx_l4_csum_error is now incremented only when offload is enabled
Signed-off-by: Satish Baddipadige <sbaddipa@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NC-SI firmware of type apeFW (10) is now supported.
Signed-off-by: Rob Swindell <swindell@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the unnecessary "if" statement before the "for" statement:
if (x) {
for (i = 0; i < x; i++)
...
}
Also, change the ring free function to return void as it only returns 0.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During remove_one, the driver should issue hwrm_func_drv_unrgtr
command to inform firmware that this function has been unloaded.
This is to let firmware keep track of driver present/absent state
when driver is gracefully unloaded. A keep alive timer is needed
later to keep track of driver state during abnormal shutdown.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/geneve.c
Here we had an overlapping change, where in 'net' the extraneous stats
bump was being removed whilst in 'net-next' the final argument to
udp_tunnel6_xmit_skb() was being changed.
Signed-off-by: David S. Miller <davem@davemloft.net>
The reset logic calls bnxt_close_nic() and bnxt_open_nic() under rtnl_lock
from bnxt_sp_task. BNXT_STATE_IN_SP_TASK must be cleared before calling
bnxt_close_nic() to avoid deadlock.
v2: Fixed white space error. Thanks Dave.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When implementing driver reset from tx_timeout in the next patch,
bnxt_close_nic() will be called from the sp_task workqueue. Calling
cancel_work() on sp_task will hang the workqueue.
Instead, set a new bit BNXT_STATE_IN_SP_TASK when bnxt_sp_task() is running.
bnxt_close_nic() will wait for BNXT_STATE_IN_SP_TASK to clear before
proceeding.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows multiple independent bits to be set for various states.
Subsequent patches to implement tx timeout reset will require this.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The declaration of the bitmap vf_req_snif_bmap using fixed array of
unsigned long will only work on 64-bit archs. Use DECLARE_BITMAP instead
which will work on all archs.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/renesas/ravb_main.c
kernel/bpf/syscall.c
net/ipv4/ipmr.c
All three conflicts were cases of overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Call bnxt_cfg_rx_mode() in bnxt_init_chip() to setup uc_list and
mc_list mac address filters. Before the patch, uc_list is not
setup again after chip reset (such as ethtool ring size change)
and macvlans don't work any more after that.
Modify bnxt_cfg_rx_mode() to return error codes appropriately so
that the init chip sequence can detect any failures.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For PF, the bp->pf.mac_addr always holds the permanent MAC
addr assigned by the HW. For VF, the bp->vf.mac_addr always
holds the administrator assigned VF MAC addr. The random
generated VF MAC addr should never get stored to bp->vf.mac_addr.
This way, when the VF wants to change the MAC address, we can tell
if the adminstrator has already set it and disallow the VF from
changing it.
v2: Fix compile error if CONFIG_BNXT_SRIOV is not set.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing ndo_set_mac_address only copies the new MAC addr
and didn't set the new MAC addr to the HW. The correct way is
to delete the existing default MAC filter from HW and add
the new one. Because of RFS filters are also dependent on the
default mac filter l2 context, the driver must go thru
close_nic() to delete the default MAC and RFS filters, then
open_nic() to set the default MAC address to HW.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NAPI drivers no longer need to observe a particular protocol
to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)
napi_hash_add() and napi_hash_del() are automatically called
from core networking stack, respectively from
netif_napi_add() and netif_napi_del()
This patch depends on free_netdev() and netif_napi_del() being
called from process context, which seems to be the norm.
Drivers might still prefer to call napi_hash_del() on their
own, since they might combine all the rcu grace periods into
a single one, knowing their NAPI structures lifetime, while
core networking stack has no idea of a possible combining.
Once this patch proves to not bring serious regressions,
we will cleanup drivers to either remove napi_hash_del()
or provide appropriate rcu grace periods combining.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of always calling pci_sriov_disable() in remove_one(),
the driver should detect whether VFs are currently assigned
to the VMs. If the VFs are active in VMs, then it should not
disable SRIOV as it is catastrophic to the VMs. Instead,
it just leaves the VFs alone and continues to unload the PF.
The user can then cleanup the VMs even after the PF driver
has been unloaded.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Assign the return value from bitmap_find_free_region() to an integer
variable and check for negative error codes first, before assigning
the bit ID to the unsigned sw_id field.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to use offset 0x4014 for reading CAG interrupt status,
the actual CAG register must be mapped to GRC bar0 window #4.
Otherwise, the driver is reading garbage. This patch corrects
this issue.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The profile ID in the completion record needs to be ANDed with the
profile ID mask of 0x1f. This bug was causing the SKB hash type
and the gso_type to be wrong in some cases.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the sp event bits to be bit positions instead of bit values since
the bit helper functions are expecting the former.
Signed-off-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_gro_skb() has unused variables when CONFIG_INET is not set. We
really cannot support hardware GRO if CONFIG_INET is not set, so
compile out bnxt_gro_skb() completely and define BNXT_FLAG_GRO to be 0
if CONFIG_INET is not set. This will effectively always disable
hardware GRO if CONFIG_INET is not set.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct bnxt_pf_info needs to be always defined. Move bnxt_update_vf_mac()
to bnxt_sriov.c and add some missing #ifdef CONFIG_BNXT_SRIOV.
Reported-by: Jim Hull <jim.hull@hpe.com>
Tested-by: Jim Hull <jim.hull@hpe.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Broadcom ethernet driver for the new family of NetXtreme-C/E
ethernet devices.
v5:
- Removed empty blank lines at end of files (noted by David Miller).
- Moved busy poll helper functions to bnxt.h to at least make the
.c file look less cluttered with #ifdef (noted by Stephen Hemminger).
v4:
- Broke up 2 long message strings with "\n" (suggested by John Linville)
- Constify an array of strings (suggested by Stephen Hemminger)
- Improve bnxt_vf_pciid() (suggested by Stephen Hemminger)
- Use PCI_VDEVICE() to populate pci_device_id table for more compact
source.
v3:
- Fixed 2 more sparse warnings.
- Removed some unused structures in .h files.
v2:
- Fixed all kbuild test robot reported warnings.
- Fixed many of the checkpatch.pl errors and warnings.
- Fixed the Kconfig description (noted by Dmitry Kravkov).
Acked-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Jeffrey Huang <huangjw@broadcom.com>
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>