Commit Graph

1141 Commits

Author SHA1 Message Date
Jakub Kicinski
c9ca5c3aab bnxt: implement ethtool::get_fec_stats
Report corrected bits.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-15 17:08:29 -07:00
Sriharsha Basavapatna
ac797ced1f bnxt_en: Free and allocate VF-Reps during error recovery.
During firmware recovery, VF-Rep configuration in the firmware is lost.
Fix it by freeing and (re)allocating VF-Reps in FW at relevant points
during the error recovery process.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12 13:20:38 -07:00
Michael Chan
90f4fd0296 bnxt_en: Refactor __bnxt_vf_reps_destroy().
Add a new helper function __bnxt_free_one_vf_rep() to free one VF rep.
We also reintialize the VF rep fields to proper initial values so that
the function can be used without freeing the VF rep data structure.  This
will be used in subsequent patches to free and recreate VF reps after
error recovery.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12 13:20:38 -07:00
Sriharsha Basavapatna
ea2d37b2b3 bnxt_en: Refactor bnxt_vf_reps_create().
Add a new function bnxt_alloc_vf_rep() to allocate a VF representor.
This function will be needed in subsequent patches to recreate the
VF reps after error recovery.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12 13:20:38 -07:00
Vasundhara Volam
190eda1a9d bnxt_en: Invalidate health register mapping at the end of probe.
After probe is successful, interface may not be bought up in all
the cases and health register mapping could be invalid if firmware
undergoes reset. Fix it by invalidating the health register at the
end of probe. It will be remapped during ifup.

Fixes: 43a440c400 ("bnxt_en: Improve the status_reliable flag in bp->fw_health.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12 13:20:38 -07:00
Michael Chan
17e1be342d bnxt_en: Treat health register value 0 as valid in bnxt_try_reover_fw().
The retry loop in bnxt_try_recover_fw() should not abort when the
health register value is 0.  It is a valid value that indicates the
firmware is booting up.

Fixes: 861aae786f ("bnxt_en: Enhance retry of the first message to the firmware.")
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12 13:20:38 -07:00
David S. Miller
241949e488 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-03-24

The following pull-request contains BPF updates for your *net-next* tree.

We've added 37 non-merge commits during the last 15 day(s) which contain
a total of 65 files changed, 3200 insertions(+), 738 deletions(-).

The main changes are:

1) Static linking of multiple BPF ELF files, from Andrii.

2) Move drop error path to devmap for XDP_REDIRECT, from Lorenzo.

3) Spelling fixes from various folks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 16:30:46 -07:00
Michael Chan
861aae786f bnxt_en: Enhance retry of the first message to the firmware.
Two enhancements:

1. Read the health status first before sending the first
HWRM_VER_GET message to firmware instead of the other way around.
This guarantees we got the accurate health status before we attempt
to send the message.

2. We currently only retry sending the first HWRM_VER_GET message to
the firmware if the firmware is in the process of booting.  If the
firmware is in error state and is doing core dump for example, the
driver should also retry if the health register has the RECOVERING
flag set.  This flag indicates the firmware will undergo recovery
soon.  Modify the retry logic to retry for this case as well.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Vasundhara Volam
bae8a00379 bnxt_en: Remove the read of BNXT_FW_RESET_INPROG_REG after firmware reset.
Once the chip goes through reset, the register mapping may be lost
and any read of the mapped health registers may return garbage value
until the registers are mapped again in the init path.

Reading BNXT_FW_RESET_INPROG_REG after firmware reset will likely
return garbage value due to the above reason.  Reading this register
is for information purpose only so remove it.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Michael Chan
2924ad95cb bnxt_en: Set BNXT_STATE_FW_RESET_DET flag earlier for the RDMA driver.
During ifup, if the driver detects that firmware has gone through a
reset, it will go through a re-probe sequence.  If the RDMA driver is
loaded, the re-probe sequence includes calling the RDMA driver to stop.
We need to set the BNXT_STATE_FW_RESET_DET flag earlier so that it is
visible to the RDMA driver.  The RDMA driver's stop sequence is
different if firmware has gone through a reset.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: P B S Naresh Kumar <nareshkumar.pbs@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Scott Branden
15a7deb895 bnxt_en: check return value of bnxt_hwrm_func_resc_qcaps
Check return value of call to bnxt_hwrm_func_resc_qcaps in
bnxt_hwrm_if_change and return failure on error.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Edwin Peer
a2f3835cc6 bnxt_en: don't fake firmware response success when PCI is disabled
The original intent here is to allow commands during reset to succeed
without error when the device is disabled, to ensure that cleanup
completes normally during NIC close, where firmware is not necessarily
expected to respond.

The problem with faking success during reset's PCI disablement is that
unrelated ULP commands will also see inadvertent success during reset
when failure would otherwise be appropriate. It is better to return
a different error result such that reset related code can detect
this unique condition and ignore as appropriate.

Note, the pci_disable_device() when firmware is fatally wounded in
bnxt_fw_reset_close() does not need to be addressed, as subsequent
commands are already expected to fail due to the BNXT_NO_FW_ACCESS()
check in bnxt_hwrm_do_send_msg().

Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Pavan Chebbi
80a9641f09 bnxt_en: Improve wait for firmware commands completion
In situations where FW has crashed, the bnxt_hwrm_do_send_msg() call
will have to wait until timeout for each firmware message.  This
generally takes about half a second for each firmware message.  If we
try to unload the driver n this state, the unload sequence will take
a long time to complete.

Improve this by checking the health register if it is available and
abort the wait for the firmware response if the register shows that
firmware is not healthy.  The very first message HWRM_VER_GET is
excluded from this check because that message is used to poll for
firmware to come out of reset during error recovery.

Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Michael Chan
43a440c400 bnxt_en: Improve the status_reliable flag in bp->fw_health.
In order to read the firmware health status, we first need to determine
the register location and then the register may need to be mapped.
There are 2 code paths to do this.  The first one is done early as a
best effort attempt by the function bnxt_try_map_fw_health_reg().  The
second one is done later in the function bnxt_map_fw_health_regs()
after establishing communications with the firmware.  We currently
only set fw_health->status_reliable if we can successfully set up the
health register in the first code path.

Improve the scheme by setting the fw_health->status_reliable flag if
either (or both) code paths can successfully set up the health
register.  This flag is relied upon during run-time when we need to
check the health status.  So this will make it work better.

During ifdown, if the health register is mapped, we need to invalidate
the health register mapping because a potential fw reset will reset
the mapping.  Similarly, we need to do the same after firmware reset
during recovery.  We'll remap it during ifup.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-22 13:07:28 -07:00
Lorenzo Bianconi
fdc13979f9 bpf, devmap: Move drop error path to devmap for XDP_REDIRECT
We want to change the current ndo_xdp_xmit drop semantics because it will
allow us to implement better queue overflow handling. This is working
towards the larger goal of a XDP TX queue-hook. Move XDP_REDIRECT error
path handling from each XDP ethernet driver to devmap code. According to
the new APIs, the driver running the ndo_xdp_xmit pointer, will break tx
loop whenever the hw reports a tx error and it will just return to devmap
caller the number of successfully transmitted frames. It will be devmap
responsibility to free dropped frames.

Move each XDP ndo_xdp_xmit capable driver to the new APIs:

- veth
- virtio-net
- mvneta
- mvpp2
- socionext
- amazon ena
- bnxt
- freescale (dpaa2, dpaa)
- xen-frontend
- qede
- ice
- igb
- ixgbe
- i40e
- mlx5
- ti (cpsw, cpsw-new)
- tun
- sfc

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Link: https://lore.kernel.org/bpf/ed670de24f951cfd77590decf0229a0ad7fd12f6.1615201152.git.lorenzo@kernel.org
2021-03-18 16:38:51 +01:00
Edwin Peer
20d7d1c5c9 bnxt_en: reliably allocate IRQ table on reset to avoid crash
The following trace excerpt corresponds with a NULL pointer dereference
of 'bp->irq_tbl' in bnxt_setup_inta() on an Aarch64 system after many
device resets:

    Unable to handle kernel NULL pointer dereference at ... 000000d
    ...
    pc : string+0x3c/0x80
    lr : vsnprintf+0x294/0x7e0
    sp : ffff00000f61ba70 pstate : 20000145
    x29: ffff00000f61ba70 x28: 000000000000000d
    x27: ffff0000009c8b5a x26: ffff00000f61bb80
    x25: ffff0000009c8b5a x24: 0000000000000012
    x23: 00000000ffffffe0 x22: ffff000008990428
    x21: ffff00000f61bb80 x20: 000000000000000d
    x19: 000000000000001f x18: 0000000000000000
    x17: 0000000000000000 x16: ffff800b6d0fb400
    x15: 0000000000000000 x14: ffff800b7fe31ae8
    x13: 00001ed16472c920 x12: ffff000008c6b1c9
    x11: ffff000008cf0580 x10: ffff00000f61bb80
    x9 : 00000000ffffffd8 x8 : 000000000000000c
    x7 : ffff800b684b8000 x6 : 0000000000000000
    x5 : 0000000000000065 x4 : 0000000000000001
    x3 : ffff0a00ffffff04 x2 : 000000000000001f
    x1 : 0000000000000000 x0 : 000000000000000d
    Call trace:
    string+0x3c/0x80
    vsnprintf+0x294/0x7e0
    snprintf+0x44/0x50
    __bnxt_open_nic+0x34c/0x928 [bnxt_en]
    bnxt_open+0xe8/0x238 [bnxt_en]
    __dev_open+0xbc/0x130
    __dev_change_flags+0x12c/0x168
    dev_change_flags+0x20/0x60
    ...

Ordinarily, a call to bnxt_setup_inta() (not in trace due to inlining)
would not be expected on a system supporting MSIX at all. However, if
bnxt_init_int_mode() does not end up being called after the call to
bnxt_clear_int_mode() in bnxt_fw_reset_close(), then the driver will
think that only INTA is supported and bp->irq_tbl will be NULL,
causing the above crash.

In the error recovery scenario, we call bnxt_clear_int_mode() in
bnxt_fw_reset_close() early in the sequence. Ordinarily, we will
call bnxt_init_int_mode() in bnxt_hwrm_if_change() after we
reestablish communication with the firmware after reset.  However,
if the sequence has to abort before we call bnxt_init_int_mode() and
if the user later attempts to re-open the device, then it will cause
the crash above.

We fix it in 2 ways:

1. Check for bp->irq_tbl in bnxt_setup_int_mode(). If it is NULL, call
bnxt_init_init_mode().

2. If we need to abort in bnxt_hwrm_if_change() and cannot complete
the error recovery sequence, set the BNXT_STATE_ABORT_ERR flag.  This
will cause more drastic recovery at the next attempt to re-open the
device, including a call to bnxt_init_int_mode().

Fixes: 3bc7d4a352 ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-26 15:50:23 -08:00
Vasundhara Volam
d20cd74521 bnxt_en: Fix race between firmware reset and driver remove.
The driver's error recovery reset sequence can take many seconds to
complete and only the critical sections are protected by rtnl_lock.
A recent change has introduced a regression in this sequence.

bnxt_remove_one() may be called while the recovery is in progress.
Normally, unregister_netdev() would cause bnxt_close_nic() to be
called and this would cause the error recovery to safely abort
with the BNXT_STATE_ABORT_ERR flag set in bnxt_close_nic().

Recently, we added bnxt_reinit_after_abort() to allow the user to
reopen the device after an aborted recovery.  This causes the
regression in the scenario described above because we would
attempt to re-open even after the netdev has been unregistered.

Fix it by checking the netdev reg_state in
bnxt_reinit_after_abort() and abort if it is unregistered.

Fixes: 6882c36cf8 ("bnxt_en: attempt to reinitialize after aborted reset")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-26 15:50:23 -08:00
David S. Miller
d489ded1a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-02-16 17:51:13 -08:00
Michael Chan
f4d95c3c19 bnxt_en: Improve logging of error recovery settings information.
We currently only log the error recovery settings if it is enabled.
In some cases, firmware disables error recovery after it was
initially enabled.  Without logging anything, the user will not be
aware of this change in setting.

Log it when error recovery is disabled.  Also, change the reset count
value from hexadecimal to decimal.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan
df97b34d3a bnxt_en: Reply to firmware's echo request async message.
This is a new async message that the firmware can send to check if it
can communicate with the driver.  This is an added error detection
scheme that firmware can use if it suspects errors in the PCIe
interface.  When the driver receives this async message, it will reply
back echoing some data in the async message.  If the firmware is not
getting the reply with the proper data after some retries, error
recovery will kick in.

Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan
41435c3940 bnxt_en: Initialize "context kind" field for context memory blocks.
If firmware provides the offset to the "context kind" field of the
relevant context memory blocks, we'll initialize just that field for
each block instead of initializing all of context memory.

Populate the bnxt_mem_init structure with the proper offset returned
by firmware.  If it is older firmware and the information is not
available, we set the offset to an invalid value and fall back to
the old behavior of initializing every byte.  Otherwise, we initialize
only the "context kind" byte at the offset.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:51 -08:00
Michael Chan
e9696ff33c bnxt_en: Add context memory initialization infrastructure.
Currently, the driver calls memset() to set all relevant context memory
used by the chip to the initial value.  This can take many milliseconds
with the potentially large number of context pages allocated for the
chip.

To make this faster, we only need to initialize the "context kind" field
of each block of context memory.  This patch sets up the infrastructure
to do that with the bnxt_mem_init structure.  In the next patch, we'll
add the logic to obtain the offset of the "context kind" from the
firmware.  This patch is not changing the current behavior of calling
memset() to initialize all relevant context memory.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Michael Chan
dab62e7c2d bnxt_en: Implement faster recovery for firmware fatal error.
During some fatal firmware error conditions, the PCI config space
register 0x2e which normally contains the subsystem ID will become
0xffff.  This register will revert back to the normal value after
the chip has completed core reset.  If we detect this condition,
we can poll this config register immediately for the value to revert.
Because we use config read cycles to poll this register, there is no
possibility of Master Abort if we happen to read it during core reset.
This speeds up recovery significantly as we don't have to wait for the
conservative min_time before polling MMIO to see if the firmware has
come out of reset.  As soon as this register changes value we can
proceed to re-initialize the device.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Edwin Peer
be6d755f3d bnxt_en: selectively allocate context memories
Newer devices may have local context memory instead of relying on the
host for backing store. In these cases, HWRM_FUNC_BACKING_STORE_QCAPS
will return a zero entry size to indicate contexts for which the host
should not allocate backing store.

Selectively allocate context memory based on device capabilities and
only enable backing store for the appropriate contexts.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Michael Chan
31f67c2ee0 bnxt_en: Update firmware interface spec to 1.10.2.16.
The main changes are the echo request/response from firmware for error
detection and the NO_FCS feature to transmit frames without FCS.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-14 17:27:50 -08:00
Vasundhara Volam
db28b6c77f bnxt_en: Fix devlink info's stored fw.psid version format.
The running fw.psid version is in decimal format but the stored
fw.psid is in hex format.  This can mislead the user to reset the
NIC to activate the stored version to become the running version.

Fix it to display the stored fw.psid in decimal format.

Fixes: 1388875b39 ("bnxt_en: Add stored FW version info to devlink info_get cb.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 14:36:22 -08:00
Edwin Peer
132e0b65dc bnxt_en: reverse order of TX disable and carrier off
A TX queue can potentially immediately timeout after it is stopped
and the last TX timestamp on that queue was more than 5 seconds ago with
carrier still up.  Prevent these intermittent false TX timeouts
by bringing down carrier first before calling netif_tx_disable().

Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-11 14:36:22 -08:00
Michael Chan
871127e6ab bnxt_en: Convert to use netif_level() helpers.
Use the various netif_level() helpers to simplify the C code.  This was
suggested by Joe Perches.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1611642024-3166-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26 17:25:59 -08:00
Michael Chan
0da65f4932 bnxt_en: Do not process completion entries after fatal condition detected.
Once the firmware fatal condition is detected, we should cease
comminication with the firmware and hardware quickly even if there
are many completion entries in the completion rings.  This will
speed up the recovery process and prevent further I/Os that may
cause further exceptions.

Do not proceed in the NAPI poll function if fatal condition is
detected.  Call napi_complete() and return without arming interrupts.
Cleanup of all rings and reset are imminent.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:05 -08:00
Michael Chan
5863b10aa8 bnxt_en: Consolidate firmware reset event logging.
Combine the three netdev_warn() calls into a single call, printed at
the NETIF_MSG_HW log level.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
4f036b2e75 bnxt_en: Improve firmware fatal error shutdown sequence.
In the event of a fatal firmware error, firmware will notify the host
and then it will proceed to do core reset when it sees that all functions
have disabled Bus Master.  To prevent Master Aborts and other hard
errors, we need to quiesce all activities in addition to disabling Bus
Master before the chip goes into core reset.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
38290e3729 bnxt_en: Modify bnxt_disable_int_sync() to be called more than once.
In the event of a fatal firmware error, we want to disable IRQ early
in the recovery sequence.  This change will allow it to be called
safely again as part of the normal shutdown sequence.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
e340a5c4fb bnxt_en: Add a new BNXT_STATE_NAPI_DISABLED flag to keep track of NAPI state.
Up until now, we don't need to keep track of this state because NAPI
is always enabled once and disabled once during bring up and shutdown.
For better error recovery in subsequent patches, we want to quiesce
the device earlier during fatal error conditions.  The normal shutdown
sequence will disable NAPI again and the flag will prevent disabling
NAPI twice.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Michael Chan
339eeb4bd9 bnxt_en: Add bnxt_fw_reset_timeout() helper.
This code to check if we have reached the maximum wait time after
firmware reset is used multiple times.  Add a helper function to
do this.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Vasundhara Volam
5d06eb5cb1 bnxt_en: Retry open if firmware is in reset.
Firmware may be in the middle of reset when the driver tries to do ifup.
In that case, firmware will return a special error code and the driver
will retry 10 times with 50 msecs delay after each retry.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Edwin Peer
6882c36cf8 bnxt_en: attempt to reinitialize after aborted reset
Drawing a hard line on aborted resets prevents a NIC open in
some scenarios that may otherwise be recoverable. For example,
if a firmware recovery happened while a PF was down and an
attempt was made to bring up an associated VF in this state,
then it was impossible to ever bring up this VF without a
rebind or reload of its driver.

Attempt to reinitialize the firmware when an aborted reset (or
failed init after a reset) is discovered during open - it may
succeed. Also take care to allow the user to retry opening the
NIC even after an aborted reset.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:04 -08:00
Edwin Peer
a44daa8fcb bnxt_en: log firmware debug notifications
Firmware is capable of generating asynchronous debug notifications.
The event data is opaque to the driver and is simply logged. Debug
notifications can be enabled by turning on hardware status messages
using the ethtool msglvl interface.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Vasundhara Volam
881d8353b0 bnxt_en: Add an upper bound for all firmware command timeouts.
The timeout period for firmware messages is passed to the driver
from the firmware in the response of the first command.  This
timeout period is multiplied by a factor for certain long
running commands such as NVRAM commands.  In some cases, the
timeout period can become really long and it can cause hung task
warnings if firmware has crashed or is not responding.  To avoid
such long delays, cap all firmware commands to a max timeout value
of 40 seconds.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Vasundhara Volam
3e3c09b0e9 bnxt_en: Move reading VPD info after successful handshake with fw.
If firmware is in reset or in bad state, it won't be able to return
VPD data.  Move bnxt_vpd_read_info() until after bnxt_fw_init_one_p1()
successfully returns.  By then we would have established proper
communications with the firmware.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Michael Chan
d1cbd1659c bnxt_en: Retry sending the first message to firmware if it is under reset.
The first HWRM_VER_GET message to firmware during probe may timeout if
firmware is under reset.  This can happen during hot-plug for example.
On P5 and newer chips, we can check if firmware is in the boot stage by
reading a status register.  Retry 5 times if the status register shows
that firmware is not ready and not in error state.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Edwin Peer
b187e4bae0 bnxt_en: handle CRASH_NO_MASTER during bnxt_open()
Add missing support for handling NO_MASTER crashes while ports are
administratively down (ifdown). On some SoC platforms, the driver
needs to assist the firmware to recover from a crash via OP-TEE.
This is performed in a similar fashion to what is done during driver
probe.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Michael Chan
fe1b853572 bnxt_en: Define macros for the various health register states.
Define macros to check for the various states in the lower 16 bits of
the health register.  Replace the C code that checks for these values
with the newly defined macros.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Michael Chan
16db632304 bnxt_en: Update firmware interface to 1.10.2.11.
Updates to backing store APIs, QoS profiles, and push buffer initial
index support.

Since the new HWRM_FUNC_BACKING_STORE_CFG message size has increased,
we need to add some compat. logic to fall back to the smaller legacy
size if firmware cannot accept the larger message size.  The new fields
added to the structure are not used yet.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-25 19:20:03 -08:00
Jakub Kicinski
2d9116be76 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2021-01-16

1) Extend atomic operations to the BPF instruction set along with x86-64 JIT support,
   that is, atomic{,64}_{xchg,cmpxchg,fetch_{add,and,or,xor}}, from Brendan Jackman.

2) Add support for using kernel module global variables (__ksym externs in BPF
   programs) retrieved via module's BTF, from Andrii Nakryiko.

3) Generalize BPF stackmap's buildid retrieval and add support to have buildid
   stored in mmap2 event for perf, from Jiri Olsa.

4) Various fixes for cross-building BPF sefltests out-of-tree which then will
   unblock wider automated testing on ARM hardware, from Jean-Philippe Brucker.

5) Allow to retrieve SOL_SOCKET opts from sock_addr progs, from Daniel Borkmann.

6) Clean up driver's XDP buffer init and split into two helpers to init per-
   descriptor and non-changing fields during processing, from Lorenzo Bianconi.

7) Minor misc improvements to libbpf & bpftool, from Ian Rogers.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
  perf: Add build id data in mmap2 event
  bpf: Add size arg to build_id_parse function
  bpf: Move stack_map_get_build_id into lib
  bpf: Document new atomic instructions
  bpf: Add tests for new BPF atomic operations
  bpf: Add bitwise atomic instructions
  bpf: Pull out a macro for interpreting atomic ALU operations
  bpf: Add instructions for atomic_[cmp]xchg
  bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
  bpf: Move BPF_STX reserved field check into BPF_STX verifier code
  bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
  bpf: x86: Factor out a lookup table for some ALU opcodes
  bpf: x86: Factor out emission of REX byte
  bpf: x86: Factor out emission of ModR/M for *(reg + off)
  tools/bpftool: Add -Wall when building BPF programs
  bpf, libbpf: Avoid unused function warning on bpf_tail_call_static
  selftests/bpf: Install btf_dump test cases
  selftests/bpf: Fix installation of urandom_read
  selftests/bpf: Move generated test files to $(TEST_GEN_FILES)
  selftests/bpf: Fix out-of-tree build
  ...
====================

Link: https://lore.kernel.org/r/20210116012922.17823-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15 17:57:26 -08:00
Jakub Kicinski
1d9f03c0a1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-14 18:34:50 -08:00
Pavan Chebbi
6874877518 bnxt_en: Clear DEFRAG flag in firmware message when retry flashing.
When the FW tells the driver to retry the INSTALL_UPDATE command after
it has cleared the NVM area, the driver is not clearing the previously
used ALLOWED_TO_DEFRAG flag. As a result the FW tries to defrag the NVM
area a second time in a loop and can fail the request.

Fixes: 1432c3f6a6 ("bnxt_en: Retry installing FW package under NO_SPACE error condition.")
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12 20:05:35 -08:00
Michael Chan
869c4d5eb1 bnxt_en: Improve stats context resource accounting with RDMA driver loaded.
The function bnxt_get_ulp_stat_ctxs() does not count the stats contexts
used by the RDMA driver correctly when the RDMA driver is freeing the
MSIX vectors.  It assumes that if the RDMA driver is registered, the
additional stats contexts will be needed.  This is not true when the
RDMA driver is about to unregister and frees the MSIX vectors.

This slight error leads to over accouting of the stats contexts needed
after the RDMA driver has unloaded.  This will cause some firmware
warning and error messages in dmesg during subsequent config. changes
or ifdown/ifup.

Fix it by properly accouting for extra stats contexts only if the
RDMA driver is registered and MSIX vectors have been successfully
requested.

Fixes: c027c6b4e9 ("bnxt_en: get rid of num_stat_ctxs variable")
Reviewed-by: Yongping Zhang <yongping.zhang@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12 20:05:35 -08:00
Lorenzo Bianconi
be9df4aff6 net, xdp: Introduce xdp_prepare_buff utility routine
Introduce xdp_prepare_buff utility routine to initialize per-descriptor
xdp_buff fields (e.g. xdp_buff pointers). Rely on xdp_prepare_buff() in
all XDP capable drivers.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/bpf/45f46f12295972a97da8ca01990b3e71501e9d89.1608670965.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-08 13:39:24 -08:00
Lorenzo Bianconi
43b5169d83 net, xdp: Introduce xdp_init_buff utility routine
Introduce xdp_init_buff utility routine to initialize xdp_buff fields
const over NAPI iterations (e.g. frame_sz or rxq pointer). Rely on
xdp_init_buff in all XDP capable drivers.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Shay Agroskin <shayagr@amazon.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Marcin Wojtas <mw@semihalf.com>
Link: https://lore.kernel.org/bpf/7f8329b6da1434dc2b05a77f2e800b29628a8913.1608670965.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-08 13:39:24 -08:00
Jakub Kicinski
30bfce1094 net: remove ndo_udp_tunnel_* callbacks
All UDP tunnel port management is now routed via udp_tunnel_nic
infra directly. Remove the old callbacks.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-07 12:53:29 -08:00