In case of MDD events on VF, don't clog kernel log with unlimited VF MDD
events message "VF 0 has had 1018 MDD events since last boot" - limit
events log message to 30, based on the observation in some experimentation
with sending malicious packet once, and number of events reported before
device stopped observing MDD events.
Also removed defunct macro "ICE_DFLT_NUM_MDD_EVENTS_ALLOWED" for tracking
number of MDD events allowed before disabling the interface...
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When a VSI is accessed inside the ice_for_each_vsi macro in the rebuild
path (ice_vsi_rebuild_all() and ice_vsi_replay_all()), it is referred to
as pf->vsi[i]. Introduce local variables to improve readability.
Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add some verbose debugging for dyndbg to help us when
we are having issues with link and/or PHY.
While there, shorten some strings used by locals that
were causing long line wrapping.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
1. ndo_open and ndo_stop are implemented by ice_open and ice_stop
respectively. When enabling/disabling VSIs, just call
ice_open/ice_stop instead of ndo_open/ndo_stop.
2. Rework logic around rtnl_lock/rtnl_unlock
3. In ice_ena_vsi, remove an unnecessary stack variable and return
0 instead of err when __ICE_NEEDS_RESTART is not set.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There was a bug in the previous code which never traverses all the
children to get the first node of the requested layer. Add a sibling
head pointer to point the first node of each layer per TC. This helps
traverse easier and quicker and also removes the recursion.
Signed-off-by: Victor Raj <victor.raj@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the issue where port and PFC statistics counters are
incrementing at the wrong port with 4x25G cards.
Read the GLPRT port registers using lport parameter instead of pf_id to
update the statistics otherwise the pf_ids are flipped for ports 2 and 3
when read from the HW register PF_FUNC_RID and this is expected as per
hardware specification.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If a VSI is not using a unicast filter or did not configure that
particular unicast filter, driver should not allow it to be removed
by the rogue VSI.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
VSI, especially VF could request to add or remove filter for another VSI,
driver should really guide such request and disallow it.
However, instead of returning error for such malicious request, driver
can simply return success.
In addition, we are not tracking number of MAC filters configured per
VF correctly - and this leads to issue updating VF MAC filters whenever
they were removed and re-configured via bringing VF interface down and
up. Also, since VF could send request to update multiple MAC filters at
once, driver should program those filters individually in the switch, in
order to determine which action resulted to error, and communicate
accordingly to the VF.
So, with this changes, we now track number of filters added right from
when VF resources allocation is done, and could properly add filters for
both trusted and non_trusted VFs, without MAC filters mis-match issue in
the switch...
Also refactor code, so that driver can use new function to add or remove
MAC filters.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Users expect ethtool statistics to be updated on-demand when invoking
'ethtool -S <iface>' instead of providing a snapshot of statistics taken
once a second (the frequency of the watchdog task where stats are currently
updated). Update stats every time 'ethtool -S <iface>' is run.
Also, fix an indentation style issue and an unnecessary local variable
initialization in ice_get_ethtool_stats() discovered while investigating
the subject issue.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move the assignment to local variables after validation.
Remove unnecessary checks in ice_vc_process_vf_msg() as the respective
functions are now performing the checks.
Signed-off-by: "Amruth G.P" <amruth.gouda.parameshwarappa@intel.com>
Signed-off-by: Nitesh B Venkatesh <nitesh.b.venkatesh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver should never clear the auto_fec_enable bit.
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When checking the PHY for status, by specification, the driver
should be using "topology" mode when querying the module type.
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In some circumstances, VF devices can be deactivated while a message is
in-flight. In that case, a series of scary error message will be
printed in the log. Since these are actually harmless, check for this
case and suppress them. No harm, no foul.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current flag name of "enable-fw-lldp" is a bit cumbersome.
Change priv-flag name to "fw-lldp-agent" with a value of on or
off. This is more straight-forward in meaning.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The virtchnl interface provides a mechanism for a VF driver to request
head writeback support. This feature is deprecated as of AVF 1.0, but
older versions of a VF driver may still attempt to request the mode.
Since the ice hardware does not support head writeback, we should not
accept Tx queue configuration which attempts to enable it.
Currently, the driver simply assumes that the headwb_enabled bit will
never be set.
If a VF driver does request head writeback, the configuration will
return successfully, even though head writeback is not enabled. This
leaves the VF driver in a non functional state since it is assuming to
be operating in head writeback mode.
Fix the PF driver to reject any attempt to setup headwb_enabled.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In rebuild DCB desired_dcbx_cfg was copy to local_dcbx_cfg, but
if DCBX mode is IEEE desired_dcbx_cfg is not initialized by DCBX
config from FW. Change logic to copy config value only if mode is
set to CEE.
If driver copy desired_dcbx_cfg to local_dcbx_cfg in IEEE mode there
is problem with globr. System is frozen after two or more globr.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When a port is not cabled, but DCBx is enabled in the
firmware, the status of DCBx will be NOT_STARTED. This
is a valid state for FW enabled and should not be
treated as a is_fw_lldp true automatically.
Add the code to treat NOT_STARTED as another valid state.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we will call synchronize_irq() from the host for VF's. This is
not correct, so don't allow it.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently, only the DCBx status is taken into account to
determine if FW LLDP is possible. But there are NVM version
coming out with DCBx enabled, and FW LLDP disabled. This
is causing errors where the driver sees that DCBx is not
disabled, and then tries to register for LLDP MIB change
events, and fails.
Change the logic to detect both DCBx and LLDP states in the
FW engine.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For control packets (i.e. LLDP packets) to be able to egress
from the main VSI, a bit has to be set in the TX_descriptor.
This should only be done for the main VSI and only if the
FW LLDP agent is disabled. A bit to allow this also has to
be set in the VSI context.
Add the logic to add the necessary bits in the VSI context
for the PF_VSI and the TX_descriptors for control packets
egressing the PF_VSI.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The dev_kfree_skb() function performs also input parameter validation.
Thus the test around the shown calls is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed a bug where driver was breaking out of the loop and
reporting an error without retrying first.
Signed-off-by: Marcin Formela <marcin.formela@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a function to read NVM module data and uses it to
read current LLDP agent configuration from NVM API version 1.8.
Signed-off-by: Sylwia Wnuczko <sylwia.wnuczko@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Driver waits after issuing a reset. When a reset takes too long a driver
gives up. Implemented by invoking PF reset in a loop. After defined
number of unsuccessful PF reset trials it returns error.
Without this patch PF reset fails when NIC is in recovery mode.
So make i40e_set_mac_type() public. i40e driver requires i40e_set_mac_type()
to be public. It is required for recovery mode handling. Without this patch
recovery mode could not be detected in i40e_probe().
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes function i40e_update_dcb_config(). Instead of
i40e_update_dcb_config() we use i40e_init_dcb(), which implements the
correct NVM read.
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add update to the VSI pointer passed to the i40e_set_vf_mac function.
If VF is in reset state the driver waits in i40e_set_vf_mac function
for the reset to be complete, yet after reset the vsi pointer
that was passed into this function is no longer valid.
The patch updates local VSI pointer directly from pf->vsi array,
by using the id stored in VF pointer (lan_vsi_idx).
Without this commit the driver might occasionally invoke general
protection fault in kernel and disable the OS entirely.
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The stats structure for the VEB switch statistics is reset periodically,
but the tc_stats are not reset at the same time.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Upcoming FW increment API version to 1.9 due to Extend PHY access AQ
command support. SW is ready for that support as well.
Signed-off-by: Piotr Azarewicz <piotr.azarewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Function check_recovery_mode had wrong if statement.
Now we check proper FWS1B register values, which are responsible for
the recovery mode. Recovery mode has 4 values for x710 and 2 for x722.
That's why we need 6 different flags which are defined in the code.
Now in the if statement, we recognize type of mac address
and register value.
Without those changes driver could show wrong state.
Signed-off-by: Adrian Podlawski <adrian.podlawski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds "drop mode" parameter to set mac config AQ command.
This bit controls the behavior when a no-drop packet is blocking a TC
queue.
0 – The PF driver is notified.
1 – The blocking packet is dropped and then the PF driver is notified.
Signed-off-by: Sylwia Wnuczko <sylwia.wnuczko@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes following error reported by cppcheck:
(error) Shifting signed 32-bit value by 31 bits is undefined behaviour
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When i40e_configure_tx_ring(vsi->tx_rings[i]) returns an error, we should
exit from i40e_vsi_configure_tx and return the error, instead of continuing
to check whether xdp is enable, and configure the xdp transmit ring.
Signed-off-by: huhai <huhai@kylinos.cn>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Similar to the ixgbe issue fixed in:
655c914145 ("ixgbe: Check DDM existence in transceiver before access)
i40e has the same issue when reading eeprom from SFP's module that comply
with SFF-8472 but not implement the Digital Diagnostic Monitoring (DDM)
interface described in it. The existence of such area is specified by bit
6 of byte 92, set to 1 if implemented.
Without this patch, due to not checking this bit i40e fails to read SFP
module's eeprom with the follow message:
ethtool -m enP51p1s0f0
Cannot get Module EEPROM data: Input/output error
Because it fails to read the additional 256 bytes in which it was assumed
to exist the DDM data.
Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The functions i40e_aq_get_phy_abilities_resp() and i40e_set_fc() both
have giant structure on the stack, which makes each one use stack frames
larger than 500 bytes.
As clang decides one function into the other, we get a warning for
exceeding the frame size limit on 32-bit architectures:
drivers/net/ethernet/intel/i40e/i40e_common.c:1654:23: error: stack frame size of 1116 bytes in function 'i40e_set_fc' [-Werror,-Wframe-larger-than=]
When building with gcc, the inlining does not happen, but i40e_set_fc()
calls i40e_aq_get_phy_abilities_resp() anyway, so they add up on the
kernel stack just as much.
The parts that actually use large stacks don't overlap, so make sure
each one is a separate function, and mark them as noinline_for_stack to
prevent the compilers from combining them again.
Fixes: 0a862b43ac ("i40e/i40evf: Add module_types and update_link_info")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When we fail to add/delete MAC filters in the VF, the print doesn't
distinguish between the two. Fix that by printing whether or not we
failed to add/delete the MAC filter respectively.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
These queue variables are being assigned values that are type u16.
Change the local variables to match these types. Since these
represent queue counts, they should never be negative.
Signed-off-by: Pawel Kaminski <pawel.kaminski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In order to use some of the VF resources definition in the SR-IOV specific
virtchnl header file, this patch moves applicable code to
ice_virtchnl_pf.h file accordingly... and they should have been defined in
the destination file originally.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we use the ICE_MBXQ_LEN for both the Mailbox send and receive
queues that are used to communicate with VFs. This is fine for the send
queue because the PF driver will lock the queue for every single send,
but for the Mailbox receive queue every VF is posting to its Mailbox
send queue and the hardware is then handing the message to the PF on its
Mailbox receive queue. This becomes a problem with many VFs because it
seems to overburden the Mailbox receive queue on the PF. Fix this by
increasing the Mailbox receive queue for the PF to 512 entries.
The number 512 was determined based on the number of VFs supported by
the device. We can have a total of 256 VFs so in the worst case this
allows the VFs to put 2 messages in the PFs Mailbox receive queue at the
same time.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently there are a couple places where the VF is waiting too long when
checking the status of registers. This is causing the AVF driver to
spin for longer than necessary in the __IAVF_STARTUP state. Sometimes
it causes the AVF to go into the __IAVF_COMM_FAILED, which may retrigger
the __IAVF_STARTUP state. Try to reduce the chance of this happening by
removing unnecessary wait times in VF bringup/resets.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Register access for GLINT_DYN_CTL and GLINT_VECT2FUNC should be within
the PF space and not the absolute device space.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
During rebuild ice_ena_vsi() is called to recover the VSI state.
This function assumes the PF VSI is always to be enabled, however,
it's possible that during reset/rebuild the interface can be
brought down. If this occurs, we can attempt to bring up the PF
VSI on a downed interface which can lead to various crashes. If
the interface is not running, do not bring up the associated VSI.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In some circumstances, the hardware will hand us a receive descriptor
which has no data attached, but is otherwise valid. The receive code was
improperly ignoring these descriptors, which result in an infinite loop.
To fix this, change the receive code to process all descriptors,
regardless of the size of the associated data. Add checks to the
memory-handling functions to allow for zero size.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the set local MIB AQ call failures in the DCB rebuild path
by setting the defaults for the ETS recommended DCB configuration. Also,
willing bits for the DCB configuration needs to be set correctly. Resets
works fine in IEEE mode as the ETS recommended DCB configuration is
populated but not in CEE mode.
Without this patch, PFR causes the kernel hang in CEE mode.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently when busy polling is enabled we aren't setting/enabling
WB_ON_ITR in the driver. This doesn't break the driver, but it does
cause issues. If we don't enable WB_ON_ITR mode we will still get
write-backs from hardware during polling when a cache line has been
filled, but if a cache line is not filled we will not get the
write-back because WB_ON_ITR is not set. Fix this by enabling
WB_ON_ITR in the driver when interrupts are disabled.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When ETHTOOL_GLINKSETTINGS is defined get pause param pause->autoneg
reports SW configured setting, however when not defined get pause param
pause->autoneg reports the link status. Set pause param needs to compare
pause->autoneg with the same source as get pause param to block the user
from changing autoneg with the set pause param option, or the user
may be incorrectly blocked from changing Rx|Tx pause settings.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch restructures how VFs are configured, and resources allocated.
Instead of freeing resources that were never allocated, and resetting
empty VFs that have never been created - the new flow will just allocate
resources for number of requested VFs based on the availability.
During VFs initialization process, global interrupt is disabled, and
rearmed after getting MSIX vectors for VFs. This allows immediate mailbox
communications, instead of delaying it till later and VFs.
PF communications resulted to using polling instead of actual interrupt.
The issue manifested when creating higher number of VFs (128 VFs) per PF.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we divide budget by the number of Rx queues per Rx ring
container in ice_napi_poll even if there is only 1. This is an
unnecessary divide for the normal case of 1 Rx ring per Rx ring
container. Fix this by using an unlikely() call in the case where we
actually need to divide.
Also, we will always set budget_per_ring even if there are no Rx rings
in the Rx ring container so we don't need to initialize it to 0.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently in ice_get_tx_pending we try to read a Tx ring's tail. This is
then compared with the software based head (next_to_clean) to determine
if we have pending work. This will never work because reading of the Tx
ring's tail is no longer supported. Fix this by using the software based
tail (next_to_use) to determine if there is pending work.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for the need_wakeup feature of AF_XDP. If the
application has told the kernel that it might sleep using the new bind
flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has
no more buffers on the NIC Rx ring and yield to the application. For
Tx, it will set the flag if it has no outstanding Tx completion
interrupts and return to the application.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch adds support for the need_wakeup feature of AF_XDP. If the
application has told the kernel that it might sleep using the new bind
flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has
no more buffers on the NIC Rx ring and yield to the application. For
Tx, it will set the flag if it has no outstanding Tx completion
interrupts and return to the application.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new
ndo provides the same functionality as before but with the addition of
a new flags field that is used to specifiy if Rx, Tx or both should be
woken up. The previous ndo only woke up Tx, as implied by the
name. The i40e and ixgbe drivers (which are all the supported ones)
are updated with this new interface.
This new ndo will be used by the new need_wakeup functionality of XDP
sockets that need to be able to wake up both Rx and Tx driver
processing.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe_service_task() calls unregister_netdev() under rtnl_lock().
But unregister_netdev() internally calls rtnl_lock().
So deadlock would occur.
Fixes: 59dd45d550 ("ixgbe: firmware recovery mode")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2019-08-05
This series contains updates to i40e driver only.
Dmitrii adds missing statistic counters for VEB and VEB TC's.
Slawomir adds support for logging the "Disable Firmware LLDP" flag
option and its current status.
Jake fixes an issue where VF's being notified of their link status
before their queues are enabled which was causing issues. So always
report link status down when the VF queues are not enabled. Also adds
future proofing when statistics are added or removed by adding checks to
ensure the data pointer for the strings lines up with the expected
statistics count.
Czeslaw fixes the advertised mode reported in ethtool for FEC, where the
"None BaseR RS" was always being displayed no matter what the mode it
was in. Also added logging information when the PF is entering or
leaving "allmulti" (or promiscuous) mode. Fixed up the logging logic
for VF's when leaving multicast mode to not include unicast as well.
v2: drop Aleksandr's patch (previously patch #2 in the series) to
display the VF MAC address that is set by the VF while community
feedback is addressed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes unicast log when VF is leaving multicast mode.
Added check of vf->vf_states &
I40E_VF_STATE_MC_PROMISC/I40E_VF_STATE_UC_PROMISC.
Without this commit, leaving multicast mode logs "unset unicast"
in dmsg.
Signed-off-by: Czeslaw Zagorski <czeslawx.zagorski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Similar to i40e_get_ethtool_stats, add a goto to verify that the data
pointer for the strings lines up with the expected stats count. This
helps ensure that bugs are not introduced when adding stats.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add log when PF is entering and leaving allmulti mode. The
change of PF state is visible in dmesg now. Without this commit,
entering and leaving allmulti mode is not logged in dmesg.
Signed-off-by: Czeslaw Zagorski <czeslawx.zagorski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Updates visual effect for advertised mode after setting desired mode.
The mode appears in advertised FEC mode correctly, when ethtool
interface command is called. Without this commit advertised FEC
is displayed regardless of the settings as "None BaseR RS".
Signed-off-by: Czeslaw Zagorski <czeslawx.zagorski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Found a code comment that needed TLC to correct their formatting.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Commit d3d657a908 ("i40e: update VFs of link state after
GET_VF_RESOURCES") modified the PF driver to notify a VF of
its link status immediately after it requests resources.
This was intended to fix reporting on VF drivers, so that they would
properly report link status.
However, some older VF drivers do not respond well to receiving a link
up notification before queues are enabled. This can cause their state
machine to think that it is safe to send traffic. This results in a Tx
hang on the VF.
More recent versions of the old i40evf and all versions of iavf are
resilient to these early link status messages. However, if a VM happens
to run an older version of the VF driver, this can be problematic.
Record whether the PF has actually enabled queues for the VF. When
reporting link status, always report link down if the queues aren't
enabled. In this way, the VF driver will never receive a link up
notification until after its queues are enabled.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add logging for disable-fw-lldp flag by ethtool. Added check
for I40E_FLAG_DISABLE_FW_LLDP and logging state in dmesg.
Without this commit there was no clear statement in dmesg
about FW LLDP state in dmesg.
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes missing call of i40e_update_veb_stats() in function
i40e_get_ethtool_stats() to update stats data of VEB and VEB TC
counters before they are written into ethtool buffer.
Before the patch ethtool counters may fell behind interface counters.
Signed-off-by: Dmitrii Golovanov <dmitrii.golovanov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix assignment of the FM10K_FAULT_ADDR_LO register into fault->address
by using a bit-wise |= operation. Without this, the low address is
completely overwriting the high potion of the address. This caused the
fault to incorrectly return only the lower 32 bits of the fault address.
This issue was detected by cppcheck and resolves the following warnings
produced by that tool:
[fm10k_pf.c:1668] -> [fm10k_pf.c:1670]: (style) Variable
'fault->address' is reassigned a value before the old one has been used.
[fm10k_pf.c:1669] -> [fm10k_pf.c:1670]: (style) Variable
'fault->address' is reassigned a value before the old one has been used.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver currently uses a macro to decide whether we should use
NON_Q_VECTORS_PF or NON_Q_VECTORS_VF.
However, we also define NON_Q_VECTORS_VF to the same value as
NON_Q_VECTORS_PF. This means that the macro NON_Q_VECTORS(hw) will
always return the same value.
Let's just remove this macro, and replace it directly with an enum value
on the enum non_q_vectors.
This was detected by cppcheck and fixes the following warnings when
building with BUILD=KERNEL
[fm10k_ethtool.c:1123]: (style) Same value in both branches of ternary
operator.
[fm10k_ethtool.c:1142]: (style) Same value in both branches of ternary
operator.
[fm10k_main.c:1826]: (style) Same value in both branches of ternary
operator.
[fm10k_main.c:1849]: (style) Same value in both branches of ternary
operator.
[fm10k_main.c:1858]: (style) Same value in both branches of ternary
operator.
[fm10k_pci.c:901]: (style) Same value in both branches of ternary
operator.
[fm10k_pci.c:1040]: (style) Same value in both branches of ternary
operator.
[fm10k_pci.c:1726]: (style) Same value in both branches of ternary
operator.
[fm10k_pci.c:1763]: (style) Same value in both branches of ternary
operator.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Several functions in the fm10k driver have specific function templates,
as they are used as function pointers. The parameters in these functions
are not always used. Explicitly mark unused parameters with the
__always_unused macro, so that the compiler will not warn about them
when building with the -Wunused-parameter warning enabled.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The page_addr variable is a void pointer. Incrementing it before calling
prefetch is technically undefined. Fix this by casting it to a u8*
pointer before incrementing it. This ensures that we increment the
pointer value in byte units, instead of relying on this undefined
behavior.
This was detected by cppcheck, and resolves the following warning
produced by that tool:
[fm10k_main.c:328]: (portability) 'page_addr' is of type 'void *'. When
using void pointers in calculations, the behaviour is undefined.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the fm10k_handle_resume function, return 0 explicitly at the end of
the function instead of returning the err value.
This was detected by cppcheck and resolves the following style warning
produced by that tool:
[fm10k_pci.c:2768] -> [fm10k_pci.c:2787]: (warning) Identical condition
'err', second condition is always false
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The local variable 'size' in fm10k_dfwd_add_station is initialized, but
is always re-assigned immediately before use. Remove this unnecessary
initialization.
This was detected by cppcheck and resolves the following warning
produced by that tool:
[fm10k_netdev.c:1466]: (style) Variable 'size' is assigned a value that is never used.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The local variable err in several functions in the fm10k_netdev.c file
is initialized with a value that is never used. The err value is
immediately re-assigned in all cases where it will be checked. Remove
the unnecessary initializers.
This was detected by cppcheck and resolves the following warnings
produced by that tool:
[fm10k_netdev.c:999] -> [fm10k_netdev.c:1004]: (style) Variable 'err' is
reassigned a value before the old one has been used.
[fm10k_netdev.c:1019] -> [fm10k_netdev.c:1024]: (style) Variable 'err'
is reassigned a value before the old one has been used.
[fm10k_netdev.c:64]: (style) Variable 'err' is assigned a value that is
never used.
[fm10k_netdev.c:131]: (style) Variable 'err' is assigned a value that
is never used.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The err variable in the fm10k_tlv_attr_parse function is initialized
with zero. However, the function never reads err without first assigning
it from a function call. Remove this unnecessary initialization.
This was detected by cppcheck and resolves the following warning
produced by that tool:
[fm10k_tlv.c:498]: (style) Variable 'err' is assigned a value that is
never used.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the ring local variable in the fm10k_assign_l2_accel
function.
This was detected by cppcheck and resolves the following warning
produced by that tool:
[fm10k_netdev.c:1447]: (style) The scope of the variable 'ring' can be
reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the result local variable in the
fm10k_iov_msg_lport_state_pf function.
This was detected by cppcheck and resolves the following warning
produced by that tool:
[fm10k_pf.c:1435]: (style) The scope of the variable 'result' can be
reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The msg variable in the fm10k_mbx_validate_msg_size and
fm10k_sm_mbx_transmit functions is only used within the do {} loop
scope. Reduce its scope only to where it is used.
This was detected by cppcheck, and resolves the following warnings
produced by that tool:
[fm10k_mbx.c:299]: (style) The scope of the variable 'msg' can be reduced.
[fm10k_mbx.c:2004]: (style) The scope of the variable 'msg' can be reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the local loop variable in the
fm10k_check_hang_subtask function.
This was detected by cppcheck and resolves the following warning
produced by that tool:
[driver/fm10k_pci.c:852]: (style) The scope of the variable 'i' can be
reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the local variable err in the fm10k_detach_subtask
function.
This was detected by cppcheck and resolves the following warning
produced by that tool:
[fm10k_pci.c:403]: (style) The scope of the variable 'err' can be reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The tx_buffer local variable in the function fm10k_clean_tx_ring is not
used except inside a smaller block scope. Reduce the scope to its point
of use.
This was detected by cppcheck and resolves the following style warning
produced by that tool:
[fm10k_netdev.c:179]: (style) The scope of the variable 'tx_buffer' can
be reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the q_idx local variable in the fm10k_cache_ring_qos
function.
This was detected by cppcheck and resolves the following style warning
produced by that tool:
[fm10k_main.c:2016]: (style) The scope of the variable 'q_idx' can be
reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the local err variable in the fm10k_iov_alloc_data
function.
This was detected by cppcheck and resolves the following style warning
produced by that tool:
[fm10k_iov.c:426]: (style) The scope of the variable 'err' can be reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the qv vector pointer local variable in the
fm10k_set_coalesce function.
This was detected by cppcheck and resolves the following style warning
produced by that tool:
[fm10k_ethtool.c:658]: (style) The scope of the variable 'qv' can be
reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the char *p local variable to only the block where
it is used.
This was detected by cppcheck and resolves the following style warning
produced by that tool:
[fm10k_ethtool.c:229]: (style) The scope of the variable 'p' can be
reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the scope of the err local variable in the fm10k_dcbnl_ieee_setets
function.
This was detected using cppcheck, and resolves the following style
warning:
[fm10k_dcbnl.c:37]: (style) The scope of the variable 'err' can be reduced.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update driver version to 0.7.5
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
As a result of refactoring of VF VSIs interrupts code, there is no
need to track its configuration status again with ICE_VF_STATE_CFG_INTR
flag - In fact, it is not being checked anywhere in the code right now, so
this patch removes the dead code as applicable to the flag.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This flag is not needed and is called every time we re-enable interrupts
in the hotpath so remove it. Also remove ice_vsi_req_irq() because it
was a wrapper function for ice_vsi_req_irq_msix() whose sole purpose was
checking the ICE_FLAG_MSIX_ENA flag.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since Tx rings are being managed by FW/NVM, Tx rings might have not been
set up or driver had already wiped them off - In that case, call to
disable LAN Tx queue is being returned as not in existence. This patch
makes sure we don't return unnecessary error for such scenario.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently if the call to ice_alloc_mapped_page() fails we jump to the
no_buf label, possibly call ice_release_rx_desc(), and return true
indicating that there is more work to do. In the success case we just
fall out of the while loop, possibly call ice_alloc_mapped_page(), and
return false saying we exhausted cleaned_count. This flow can be
improved by breaking if ice_alloc_mapped_page() fails and then the flow
outside of the while loop is the same for the failure and success case.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we are not reporting dropped counts at the port level to
ethtool or netlink. This was found when debugging Rx dropped issues
and the total packets sent did not equal the total packets received
minus the rx_dropped, which was very confusing. To determine dropped
counts at the port level we need to read the PRTRPB_RDPC register.
To fix reporting we will store the dropped counts in the PF's
rx_discards. This will be reported to netlink by storing it in the
PF VSI's rx_missed_errors signaling that the receiver missed the
packet. Also, we will report this to ethtool in the rx_dropped.nic
field.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In case there is a request from a VF to change its number of queues, and
the request was successful, we need to update number of queues
configured on the VF before updating corresponding VSI for that VF,
especially LAN Tx queue tree and TC update, otherwise, we would continued
to use old value of vf->num_vf_qs for allocated Tx/Rx queues...
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch uses allocated number of Tx queues per VSI to set up its
scheduling tree instead of using total number of available Tx queues.
Only PF VSIs have total number of allocated Tx queues equal to number
of available Tx queues, other VSIs have different number of queues
configured.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we bump the Rx tail and release/give buffers to hardware every
16 descriptors. This causes us to bump Rx tail up to 4 times per
napi_poll call. Also we are always bumping tail on an odd index and this
is a problem because hardware ignores the lower 3 bits in the QRX_TAIL
register. This is making it so hardware sees tail bumps only every 8
descriptors. Instead lets only bump Rx tail once per napi_poll if
the value aligns with hardware's expectations of the lower 3 bits being
cleared. Also only release/give Rx buffers once per napi_poll call.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds code to clear VFs enable status until reset is completed,
and Tx/Rx rings are setup. Without this patch, the code flow request Tx
queues to be disabled after reset, especially PFR - where VF VSI Tx rings
have already been wiped off in the NVM and result to adminq error based on
the call to disable Tx LAN queue in ice_reset_all_vfs function call.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The firmware reports an error when trying to configure a port with no
media. Instead of always configuring the port, check for media before
attempting to configure it. In the absence of media, turn off link and
poll for media to become available before re-enabling link.
Move ice_force_phys_link_state() up to avoid forward declaration.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ice_init_all_ctrlq and ice_shutdown_all_ctrlq functions create and
destroy the locks used to protect the send and receive process of each
control queue.
This is problematic, as the driver may use these functions to shutdown
and re-initialize the control queues at run time. For example, it may do
this in response to a device reset.
If the driver failed to recover from a reset, it might leave the control
queues offline. In this case, the locks will no longer be initialized.
A later call to ice_sq_send_cmd will then attempt to acquire a lock that
has been destroyed.
It is incorrect behavior to access a lock that has been destroyed.
Indeed, ice_aq_send_cmd already tries to avoid accessing an offline
control queue, but the check occurs inside the lock.
The root of the problem is that the locks are destroyed at run time.
Modify ice_init_all_ctrlq and ice_shutdown_all_ctrlq such that they no
longer create or destroy the locks.
Introduce new functions, ice_create_all_ctrlq and ice_destroy_all_ctrlq.
Call these functions in ice_init_hw and ice_deinit_hw.
Now, the control queue locks will remain valid for the life of the
driver, and will not be destroyed until the driver unloads.
This also allows removing a duplicate check of the sq.count and
rq.count values when shutting down the controlqs. The ice_shutdown_ctrlq
function already checks this value under the lock. Previously
commit dec64ff10e ("ice: use [sr]q.count when checking if queue is
initialized") needed this check to happen outside the lock, because it
prevented duplicate attempts at destroying the locks.
The driver may now safely use ice_init_all_ctrlq and
ice_shutdown_all_ctrlq while handling reset events, without causing the
locks to be invalid.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we are always setting prefena to 0. This is causing the
hardware to only fetch descriptors when there are none free in the cache
for a received packet instead of prefetching when it has used the last
descriptor regardless of incoming packets. Fix this by allowing the
hardware to prefetch Rx descriptors.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When interrupt tracking was refactored, during rebuild, the call to
ice_vsi_setup_vector_base() was inadvertently removed from the PF VSI
instead of being removed from the VF VSI. During reset, the failure to
properly setup the vector base generates a call trace. Correct this so
that resets/rebuilds properly complete.
Fixes: cbe66bfee6 ("ice: Refactor interrupt tracking")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently, ice_stat_update32 and ice_stat_update40 will limit the
value of the software statistic to 32 or 40 bits wide, depending on
which register is being read.
This means that if a driver is running for a long time, the displayed
software register values will roll over to zero at 40 bits or 32 bits.
This occurs because the functions directly assign the difference between
the previous value and current value of the hardware statistic.
Instead, add this value to the current software statistic, and then
update the previous value.
In this way, each time ice_stat_update40 or ice_stat_update32 are
called, they will increment the software tracking value by the
difference of the hardware register from its last read. The software
tracking value will correctly count up until it overflows a u64.
The only requirement is that the ice_stat_update functions be called at
least once each time the hardware register overflows.
While we're fixing ice_stat_update40, modify it to use rd64 instead of
two calls to rd32. Additionally, drop the now unnecessary hireg
function parameter.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for reporting link partner advertising when
ETHTOOL_GLINKSETTINGS defined. Get pause param reports the Tx/Rx
pause configured, and then ethtool issues ETHTOOL_GSET ioctl and
ice_get_settings_link_up reports the negotiated Tx/Rx pause. Negotiated
pause frame report per IEEE 802.3-2005 table 288-3.
$ ethtool --show-pause ens6f0
Pause parameters for ens6f0:
Autonegotiate: on
RX: on
TX: on
RX negotiated: on
TX negotiated: on
$ ethtool ens6f0
Settings for ens6f0:
Supported ports: [ FIBRE ]
Supported link modes: 25000baseCR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: Yes
Supported FEC modes: None BaseR RS
Advertised link modes: 25000baseCR/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: None BaseR RS
Link partner advertised link modes: Not reported
Link partner advertised pause frame use: Symmetric
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 25000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
When ETHTOOL_GLINKSETTINGS is not defined, get pause param reports the
negotiated Tx/Rx pause.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2019-07-24
This series contains updates to igc and e1000e client drivers only.
Sasha provides a couple of cleanups to remove code that is not needed
and reduce structure sizes. Updated the MAC reset flow to use the
device reset flow instead of a port reset flow. Added addition device
id's that will be supported.
Kai-Heng Feng provides a workaround for a possible stalled packet issue
in our ICH devices due to a clock recovery from the PCH being too slow.
v2: removed the last patch in the series that supposedly fixed a MAC/PHY
de-sync potential issue while waiting for additional information from
hardware engineers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The linux-next commit "net: Rename skb_frag_t size to bv_len" [1]
introduced a compilation error on powerpc as it forgot to deal with the
renaming from "size" to "bv_len" for ixgbevf.
[1] https://lore.kernel.org/netdev/20190723030831.11879-1-willy@infradead.org/T/#md052f1c7de965ccd1bdcb6f92e1990a52298eac5
In file included from ./include/linux/cache.h:5,
from ./include/linux/printk.h:9,
from ./include/linux/kernel.h:15,
from ./include/linux/list.h:9,
from ./include/linux/module.h:9,
from
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:12:
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function
'ixgbevf_xmit_frame_ring':
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:51: error:
'skb_frag_t' {aka 'struct bio_vec'} has no member named 'size'
count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
^
./include/uapi/linux/kernel.h:13:40: note: in definition of macro
'__KERNEL_DIV_ROUND_UP'
#define __KERNEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
^
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:4138:12: note: in
expansion of macro 'TXD_USE_COUNT'
count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size);
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: David S. Miller <davem@davemloft.net>
This works around a possible stalled packet issue, which may occur due to
clock recovery from the PCH being too slow, when the LAN is transitioning
from K1 at 1G link speed.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204057
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for more SKUs.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use Device Reset flow instead of Port Reset flow.
This flow performs a reset of the entire controller device,
resulting in a state nearly approximating the state
following a power-up reset or internal PCIe reset,
except for system PCI configuration.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch comes to clean up the device specification structure.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Polarity and cable length fields is not applicable for the i225 device.
This patch comes to clean up PHY information structure.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for unifying the skb_frag and bio_vec, use the fine
accessors which already exist and use skb_frag_t instead of
struct skb_frag_struct.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 8c0d3a02c1 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.
Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The audience for the Kernel driver-model is clearly Kernel hackers.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> # ice driver changes
Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups. Because of this, there is going
to be some merge issues with your tree at the moment, I'll follow up
with the expected resolutions to make it easier for you.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups (will cause build warnings
with s390 and coresight drivers in your tree)
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse
easier due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of merge
issues that Stephen has been patient with me for. Other than the merge
issues, functionality is working properly in linux-next :)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXSgpnQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykcwgCfS30OR4JmwZydWGJ7zK/cHqk+KjsAnjOxjC1K
LpRyb3zX29oChFaZkc5a
=XrEZ
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse easier
due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"
* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...
And any other existing fields in this structure that refer to tc.
Specifically:
* tc_cls_flower_offload_flow_rule() to flow_cls_offload_flow_rule().
* TC_CLSFLOWER_* to FLOW_CLS_*.
* tc_cls_common_offload to tc_cls_common_offload.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates flow_block_cb_setup_simple() to use the flow block API.
Several drivers are also adjusted to use it.
This patch introduces the per-driver list of flow blocks to account for
blocks that are already in use.
Remove tc_block_offload alias.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most drivers do the same thing to set up the flow block callbacks, this
patch adds a helper function to do this.
This preparation patch reduces the number of changes to adapt the
existing drivers to use the flow block callback API.
This new helper function takes a flow block list per-driver, which is
set to NULL until this driver list is used.
This patch also introduces the flow_block_command and
flow_block_binder_type enumerations, which are renamed to use
FLOW_BLOCK_* in follow up patches.
There are three definitions (aliases) in order to reduce the number of
updates in this patch, which go away once drivers are fully adapted to
use this flow block API.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-07-03
The following pull-request contains BPF updates for your *net-next* tree.
There is a minor merge conflict in mlx5 due to 8960b38932 ("linux/dim:
Rename externally used net_dim members") which has been pulled into your
tree in the meantime, but resolution seems not that bad ... getting current
bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed
just so he's aware of the resolution below:
** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
<<<<<<< HEAD
static int mlx5e_open_cq(struct mlx5e_channel *c,
struct dim_cq_moder moder,
struct mlx5e_cq_param *param,
struct mlx5e_cq *cq)
=======
int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder,
struct mlx5e_cq_param *param, struct mlx5e_cq *cq)
>>>>>>> e5a3e259ef
Resolution is to take the second chunk and rename net_dim_cq_moder into
dim_cq_moder. Also the signature for mlx5e_open_cq() in ...
drivers/net/ethernet/mellanox/mlx5/core/en.h +977
... and in mlx5e_open_xsk() ...
drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64
... needs the same rename from net_dim_cq_moder into dim_cq_moder.
** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:
<<<<<<< HEAD
int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix));
struct dim_cq_moder icocq_moder = {0, 0};
struct net_device *netdev = priv->netdev;
struct mlx5e_channel *c;
unsigned int irq;
=======
struct net_dim_cq_moder icocq_moder = {0, 0};
>>>>>>> e5a3e259ef
Take the second chunk and rename net_dim_cq_moder into dim_cq_moder
as well.
Let me know if you run into any issues. Anyway, the main changes are:
1) Long-awaited AF_XDP support for mlx5e driver, from Maxim.
2) Addition of two new per-cgroup BPF hooks for getsockopt and
setsockopt along with a new sockopt program type which allows more
fine-grained pass/reject settings for containers. Also add a sock_ops
callback that can be selectively enabled on a per-socket basis and is
executed for every RTT to help tracking TCP statistics, both features
from Stanislav.
3) Follow-up fix from loops in precision tracking which was not propagating
precision marks and as a result verifier assumed that some branches were
not taken and therefore wrongly removed as dead code, from Alexei.
4) Fix BPF cgroup release synchronization race which could lead to a
double-free if a leaf's cgroup_bpf object is released and a new BPF
program is attached to the one of ancestor cgroups in parallel, from Roman.
5) Support for bulking XDP_TX on veth devices which improves performance
in some cases by around 9%, from Toshiaki.
6) Allow for lookups into BPF devmap and improve feedback when calling into
bpf_redirect_map() as lookup is now performed right away in the helper
itself, from Toke.
7) Add support for fq's Earliest Departure Time to the Host Bandwidth
Manager (HBM) sample BPF program, from Lawrence.
8) Various cleanups and minor fixes all over the place from many others.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2019-06-28
This series contains a smorgasbord of updates to many of the Intel
drivers.
Gustavo A. R. Silva updates the ice and iavf drivers to use the
strcut_size() helper where possible.
Miguel increases the pause and refresh time for flow control in the
e1000e driver during reset for certain devices.
Dann Frazier fixes a potential NULL pointer dereference in ixgbe driver
when using non-IPSec enabled devices.
Colin Ian King fixes a potential overflow during a shift in the ixgbe
driver. Also fixes a potential NULL pointer dereference in the iavf
driver by adding a check.
Venkatesh Srinivas converts the e1000 driver to use dma_wmb() instead of
wmb() for doorbell writes to avoid SFENCEs in the transmit and receive
paths.
Arjan updates the e1000e driver to improve boot time by over 100 msec by
reducing the usleep ranges suring system startup.
Artem updates the igb driver register dump in ethtool, first prepares
the register dump for future additions of registers in the dump, then
secondly, adds the RR2DCDELAY register to the dump. When dealing with
time-sensitive networks, this register is helpful in determining your
latency from the device to the ring.
Alex fixes the ixgbevf driver to use the current cached link state,
rather than trying to re-check the value from the PF.
Harshitha adds support for MACVLAN offloads in i40e by using channels as
MACVLAN interfaces.
Detlev Casanova updates the e1000e driver to use delayed work instead of
timers to run the watchdog.
Vitaly fixes an issue in e1000e, where when disconnecting and
reconnecting the physical cable connection, the NIC enters a DMoff
state. This state causes a mismatch in link and duplexing, so check the
PCIm function state and perform a PHY reset when in this state to
resolve the issue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to commit: 5d8682588605 ("[misc] mei: me: allow runtime
pm for platform with D0i3")
When disconnecting the cable and reconnecting it the NIC
enters DMoff state. This caused wrong link indication
and duplex mismatch. This bug is described in:
https://bugzilla.redhat.com/show_bug.cgi?id=1689436
Checking PCIm function state and performing PHY reset after a
timeout in watchdog task solves this issue.
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use delayed work instead of timers to run the watchdog of the e1000e
driver.
Simplify the code with one less middle function.
Signed-off-by: Detlev Casanova <detlev.casanova@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables macvlan offloads for i40e. The idea is to use
channels as macvlan interfaces. The channels are VSIs of
type VMDQ. When the first macvlan is created, the maximum number of
channels possible are created. From then on, as a macvlan interface
is created, a macvlan filter is added to these already created
channels (VSIs).
This patch utilizes subordinate device traffic classes to make queue
groups(channels) available for an upper device like a macvlan.
Steps to configure macvlan offloads:
1. ethtool -K ethx l2-fwd-offload on
2. ip link add link ethx name macvlan1 type macvlan
3. ip addr add <address> dev macvlan1
4. ip link set macvlan1 up
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Change the ethtool link settings call to just read the cached state out of
the adapter structure instead of trying to recheck the value from the PF.
Doing this should prevent excessive reading of the mailbox.
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Reviewed-by: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A recent commit efa14c3985 ("iavf: allow null RX descriptors") added
a null pointer sanity check on rx_buffer, however, rx_buffer is being
dereferenced before that check, which implies a null pointer dereference
bug can potentially occur. Fix this by only dereferencing rx_buffer
until after the null pointer check.
Addresses-Coverity: ("Dereference before null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds the RR2DCDELAY register to the ethtool registers dump.
RR2DCDELAY exists on I210 and I211 Intel Gigabit Ethernet chips and it stands
for "Read Request To Data Completion Delay". Here is how this register is
described in the I210 datasheet:
"This field captures the maximum PCIe split time in 16 ns units, which is the
maximum delay between the read request to the first data completion. This is
giving an estimation of the PCIe round trip time."
In other words, whenever I210 reads from the host memory (e.g., fetches a
descriptor from the ring), the chip measures every PCI DMA read transaction and
captures the maximum value. So it ends up containing the longest DMA
transaction time.
This register is very useful for troubleshooting and research purposes. If you
are dealing with time-sensitive networks, this register can help you get
an idea of your "I210-to-ring" latency. This helps answering questions like
"should I have PCIe ASPM enabled?" or "should I enable deep C-states?" on
my system.
It is safe to read this register at any point, reading it has no effect on
the I210 chip functionality.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch has no functional impact and it is just a preparation
for the following patch. It removes an early return from the
'igb_get_regs()' function by moving the 82576-only registers
dump into an "if" block. With this preparation, we can dump more
non-82576 registers at the end of this function.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This aligns the iavf_debug() macro with the other Intel drivers.
Add the bus number, bus_id field to i40e_bus_info so output shows
each physical port(i.e func) in following format:
[[[[<domain>]:]<bus>]:][<slot>][.[<func>]]
domains are numbered from 0 to ffff), bus (0-ff), slot (0-1f) and
function (0-7).
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
The e1000e driver is a great user of the usleep_range() API,
and has nice ranges that in principle help power management.
However the ranges that are used only during system startup are
very long (and can add easily 100 msec to the boot time) while
the power savings of such long ranges is irrelevant due to the
one-off, boot only, nature of these functions.
This patch shrinks some of the longest ranges to be shorter
(while still using a power friendly 1 msec range); this saves
100msec+ of boot time on my BDW NUCs
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes, in particular in the
context in which this code is being used.
So, replace code of the following form:
sizeof(struct virtchnl_ether_addr_list) + (count * sizeof(struct virtchnl_ether_addr))
with:
struct_size(veal, list, count)
and so on...
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
e1000 writes to doorbells to post transmit descriptors and fill the
receive ring. After writing descriptors to memory but before
writing to doorbells, use dma_wmb() rather than wmb(). wmb() is more
heavyweight than necessary for a device to see descriptor writes.
On x86, this avoids SFENCEs before doorbell writes in both the
Tx and Rx paths. On ARM, this converts DSB ST -> DMB OSHST.
Tested: 82576EB / x86; QEMU (qemu emulates an 8257x)
Signed-off-by: Venkatesh Srinivas <venkateshs@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The u32 variable rem is being shifted using u32 arithmetic however
it is being passed to div_u64 that expects the expression to be a u64.
The 32 bit shift may potentially overflow, so cast rem to a u64 before
shifting to avoid this. Also remove comment about overflow.
Addresses-Coverity: ("Unintentional integer overflow")
Fixes: cd45832069 ("ixgbe: implement support for SDP/PPS output on X550 hardware")
Fixes: 68d9676fc0 ("ixgbe: fix PTP SDP pin setup on X540 hardware")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Suggested-by: Tim Pepper <timothy.c.pepper@linux.intel.com>
Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
size = sizeof(struct foo) + count * sizeof(struct boo);
instance = alloc(size, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
size = struct_size(instance, entry, count);
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If a packet which is utilizing the launchtime feature (via SO_TXTIME socket
option) also requests the hardware transmit timestamp, the hardware
timestamp is not delivered to the userspace. This is because the value in
skb->tstamp is mistaken as the software timestamp.
Applications, like ptp4l, request a hardware timestamp by setting the
SOF_TIMESTAMPING_TX_HARDWARE socket option. Whenever a new timestamp is
detected by the driver (this work is done in igb_ptp_tx_work() which calls
igb_ptp_tx_hwtstamps() in igb_ptp.c[1]), it will queue the timestamp in the
ERR_QUEUE for the userspace to read. When the userspace is ready, it will
issue a recvmsg() call to collect this timestamp. The problem is in this
recvmsg() call. If the skb->tstamp is not cleared out, it will be
interpreted as a software timestamp and the hardware tx timestamp will not
be successfully sent to the userspace. Look at skb_is_swtx_tstamp() and the
callee function __sock_recv_timestamp() in net/socket.c for more details.
Signed-off-by: Vedang Patel <vedang.patel@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some drivers want to access the data transmitted in order to implement
acceleration features of the NICs. It is also useful in AF_XDP TX flow.
Change the xsk_umem_consume_tx API to return the whole xdp_desc, that
contains the data pointer, length and DMA address, instead of only the
latter two. Adapt the implementation of i40e and ixgbe to this change.
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct virtchnl_iwarp_qvlist_info {
...
struct virtchnl_iwarp_qv_info qv_info[1];
};
size = sizeof(struct virtchnl_iwarp_qvlist_info) + (sizeof(struct virtchnl_iwarp_qv_info) * count;
instance = kzalloc(size, GFP_KERNEL);
and
struct virtchnl_vf_resource {
...
struct virtchnl_vsi_resource vsi_res[1];
};
size = sizeof(struct virtchnl_vf_resource) + sizeof(struct virtchnl_vsi_resource) * count;
instance = kzalloc(size, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, qv_info, count), GFP_KERNEL);
and
instance = kzalloc(struct_size(instance, vsi_res, count), GFP_KERNEL);
Notice that, in the first case above, variable size is not necessary, hence it
is removed.
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It was found that the string that prints our copyright was
not up to date. Updating to reflect our copyright.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Changing descriptor count via 'ethtool -G' is not persistent across resets.
When PF reset occurs, we roll back to the default value of vsi->num_desc,
which is used then in i40e_alloc_rings to set descriptor count. XDP does a
PF reset so when user has changed the descriptor count and load XDP
program, the default count will be back there.
To fix this:
* introduce new VSI members - num_tx_desc and num_rx_desc in favour of
num_desc
* set them in i40e_set_ringparam to user's values
* set them to default values in i40e_set_num_rings_in_vsi only when they
don't have previous values
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes reading f/w LLDP agent status at DCB init time.
It's done by removing direct NVM reading in i40e_update_dcb_config()
and checking whether f/w LLDP agent is disabled via
I40E_FLAG_DISABLE_FW_LLDP flag in i40e_init_pf_dcb(). The function
i40e_update_dcb_config() in i40e_main.c is a temporary solution which
will be later renamed to i40e_init_dcb() in the i40e_dcb module. Also
logging was extended to make visible if f/w LLDP agent is running or not
and always log a message when DCB was not initialized. Without this
patch for new f/w versions f/w LLDP agent status was always read
from NVM as disabled and DCB initialization failed without
clear reason in logs.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Generate log entry when TC0 is created or deleted.
Log entry is generated during main VSI setup.
Before there was no log info about adding or deleting TC0.
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix for missing "Supported link modes" and "Advertised link modes"
info in ethtool after changed speed on X722 devices with BASE-T PHY
with FW API version >= 1.7.
The same FW API version on X710 and X722 does not mean the same
feature set so the change was needed as mac type of the device
should also be checked instead of FW API version only.
Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes 'NIC Link is Up, Unknown bps' message in dmesg
for 2.5Gb/5Gb speeds. This problem is fixed by adding constants
for VIRTCHNL_LINK_SPEED_2_5GB and VIRTCHNL_LINK_SPEED_5GB cases
in the i40e_virtchnl_link_speed() function.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The next call to ixgbevf_update_itr will continue to dynamically
update ITR.
Copy from commit bdbeefe8ea ("ixgbe: fix possible divide by zero in
ixgbe_update_itr")
Signed-off-by: Young Xiao <92siuyang@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some transceivers may comply with SFF-8472 but not implement the Digital
Diagnostic Monitoring (DDM) interface described in it. The existence of
such area is specified by bit 6 of byte 92, set to 1 if implemented.
Currently, due to not checking this bit ixgbe fails trying to read SFP
module's eeprom with the follow message:
ethtool -m enP51p1s0f0
Cannot get Module EEPROM data: Input/output error
Because it fails to read the additional 256 bytes in which it was assumed
to exist the DDM data.
This issue was noticed using a Mellanox Passive DAC PN 01FT738. The eeprom
data was confirmed by Mellanox as correct and present in other Passive
DACs in from other manufacturers.
Signed-off-by: "Mauro S. M. Rodrigues" <maurosr@linux.vnet.ibm.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Convert the various documents at the driver-model, preparing
them to be part of the driver-api book.
The conversion is actually:
- add blank lines and identation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> # ice
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some circumstances, the hardware can hand us a null receive
descriptor, with no data attached but otherwise valid. Unfortunately,
the driver was ill-equipped to handle such an event, and would stop
processing packets at that point.
To fix this, use the Descriptor Done bit instead of the size to
determine whether or not a descriptor is ready to be processed. Add some
checks to allow for unused buffers.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add call to iavf_add_cloud_filter and iavf_del_cloud_filter from
iavf_process_aq_command to clear aq_required
IAVF_FLAG_AQ_ADD_CLOUD_FILTER and IAVF_FLAG_AQ_DEL_CLOUD_FILTER bits.
aq_required IAVF_FLAG_AQ_DEL_CLOUD_FILTER bit is being set in
iavf_down and iavf_delete_clsflower, and are never cleared.
aq_required IAVF_FLAG_AQ_ADD_CLOUD_FILTER bit is being set in
iavf_handle_reset and iavf_configure_clsflower, and are never
cleared.
Since the aq_required is not zero, iavf_watchdog_task is setting the
queue_delayed_work to 20 msec instead of the longer delay.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cleanup of init state machine, move state specific
code to separate functions and rewrite the
iavf_init_task() function.
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Refactor the watchdog state machine implementation.
Add the additional state __IAVF_COMM_FAILED to process
the PF communication fails. Prepare the watchdog state machine
to integrate with init state machine.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove the watchdog timer, instead declare watchdog task
as delayed work and use dedicated workqueue to service driver
tasks. The dedicated driver workqueue iavf_wq is common
for all driver instances.
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move the commands processing outside the watchdog_task()
function. This reduce length and complexity of the function
which is mainly designed to process the watchdog state machine.
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There was a calculation error in virtchnl regarding the valid
length which was fixed recently and a corresponding change needs
to go into the code while we enable ADq.
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
iavf_add_vlan() is being called in atomic context
so kzalloc() needs GFP_ATOMIC. This patch fixes it.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On some hardware/driver/architecture combinations, it may take longer
than 200msec for all close operations to be completed, causing a
spurious error message to be logged.
Increase the timeout value to 500msec to avoid this erroneous error.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The counter variable in iavf_clean_tx_irq starts out negative and climbs
to 0. So allocating it as u16 is actually a really bad idea that just
happens to work because the value underflows and overflows consistently
on most architectures.
Replace the u16 with an int so signed math works as expected.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes how VLAN tag are being populated and programmed into
the HW - Instead of start adding VF VLAN tag from the last member of the
element list, start from the first member of the list, until number of
allowed VLAN tags is exhausted in the HW.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch fixes the following warning:
drivers/net/ethernet/intel/i40e/i40e_xsk.c: In function ‘i40e_run_xdp_zc’:
drivers/net/ethernet/intel/i40e/i40e_xsk.c:217:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
bpf_warn_invalid_xdp_action(act);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/intel/i40e/i40e_xsk.c:218:2: note: here
case XDP_ABORTED:
^~~~
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Driver updated pf->flags before calling i40e_aq_start_lldp().
This patch moved down updating pf->flags down so flags will be
updated only in case of successful i40e_aq_start_lldp() call.
Also was introduced is_reset_needed local flag to avoid unnecessary h/w
reset in case 40e_aq_start_lldp() didn't change lldp state.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The tx_errors statistic was being calculated twice in
i40e_update_eth_stats.
This appears to be as of commit 201db2898f2c ("i40e: add missing VSI
statistics", 2014-03-25).
Remove the extra i40e_stat_update32 call for GLV_TEPC.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the problem with a kernel panic occurring when trying
to bind the i40e driver to a non-i40e port. The problem is fixed by
checking if the BAR size in the device is large enough by reading the
highest register.
Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Driver did not check response on LLDP flag change and always returned
SUCCESS.
This patch now checks for an error and returns an error code and has
additional information in the log.
Signed-off-by: Piotr Marczak <piotr.marczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Change some data to unsigned int instead of integer when we compare.
Check LUT values in VIRTCHNL_OP_CONFIG_RSS_LUT handler.
Also enhance error/warning messages to print the real values of
I40E_MAX_VF_QUEUES, I40E_MAX_VF_VSI and I40E_DEFAULT_QUEUES_PER_VF
instead of plain text.
Refactor code to comply with 'check first then assign' policy.
Remove duplicate checks for VIRTCHNL_OP_CONFIG_RSS_KEY and
VIRTCHNL_OP_CONFIG_RSS_LUT opcodes in i40e_vc_process_vf_msg(). We have
the very same checks inside the handlers already.
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch makes it possible to log only AQ descriptors, without the
entire AQ message buffers being dumped too. It should greatly reduce
kernel log size in cases where a full AQ dump is not needed.
Selection is made by setting flags in hw->debug_mask.
Additionally, some debug messages that preceded an AQ dump have been
moved to I40E_DEBUG_AQ_COMMAND class, which seems more appropriate.
Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add bounds check for ch[] array.
Use ARRAY_SIZE() to ensure that idx is within the range.
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The counter variable in i40e_clean_tx_irq starts out negative and climbs
to 0. So it should not be defined as a u16. This was working by accident
due to the fact the u16 overflows and underflows predictably.
Replace the u16 with int, which is signed and can handle the negativity.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add veb array access boundary checks.
Ensure veb array index is smaller than I40E_MAX_VEB.
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch lets untrusted VF to create up to 16 VLANs.
It was implemented by increasing I40E_VC_MAX_VLAN_PER_VF up to 16.
Without this patch untrusted VF could create only up to 8 VLANs.
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds functions stubs to support EEE on/off.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The PF driver state flag __I40E_VIRTCHNL_OP_PENDING needs to be
checked and set at the beginning of i40e_ndo_set_vf_mac. Otherwise,
if there are error conditions before it, the flag will be cleared
unexpectedly by this function to cause potential race conditions.
Hence move the check to the top of this function.
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VF configuration returned in i40e_ndo_get_vf_config is
already stored by the PF. There is no dependency on any
specific state of the VF to return the configuration.
Drop the check against I40E_VF_STATE_INIT since it is not
needed.
Signed-off-by: Lihong Yang <lihong.yang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ixgbevf_write_msg_read_ack fails, return its error code upstream
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Similar to the X540 hardware, enable support for generating a 1pps
output signal on SDP0.
This support is slightly different to the X540 hardware, because of the
register layout changes. First, the system time register is now
represented in 'cycles' and 'billions of cycles'. Second, we need to
also program the TSSDP register, as well as the ESDP register. Third,
the clock output uses only FREQOUT, instead of a full 64bit value for
the output clock period. Finally, we have to use the ST0 bit instead of
the SYNCLK bit in the TSAUXC register.
This support should work even for the hardware with a higher frequency
clock, as it carefully takes into account the multiply and shift of the
cycle counter used.
We also set the pps configuration to 1, since we now support generating
a pulse per second output.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove references to IXGBE_ETH_P_LLD and use ETH_P_LLDP instead.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove references to I40E_ETH_P_LLDP and use ETH_P_LLDP instead.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This function was missing a documentation comment. Add one now.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ixgbe_ptp.c file sometimes uses hw_cc as the local variable for the
cycle counter in ixgbe_ptp_read_X550. However, we use just 'cc' as
a local variable for this by convention else where in the file.
Convert this lone usage of 'hw_cc' into just the shorter 'cc' name to
match the other read functions in the file.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The function ixgbe_ptp_setup_sdp_X540 attempts to program a software
defined pin, in order to generate a pulse-per-second output on SDP 0.
It does work to generate the output, but does not align the output on
the full second. Additionally, it does not take into account the
cyclecounter multiplier. This leads to somewhat confusing code which is
likely to be incorrect if blindly copied to another hardware type.
Update this code to account for the cyclecounter multiplier, and to
directly use timecounter_read.
This change ensures that the SDP output will align properly on a full
second, and makes the intent of the calculations a bit more clear.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously we waited for a whole 15 seconds before we cleared the Tx
timestamp state. This is astronomically long compared to the worst case
timings expected by our devices. In addition, this is longer than the
wait in ptp4l when it detects a fault (caused by missing Tx timestamps).
Thus, reduce the timer to only 1 second, which is well after the maximum
expected delay. This should reduce user frustration when a timestamp
does get dropped for some reason.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The total_packets count at ixgbe_clean_xdp_tx_irq is
always zero when testing with xdpsock -t -N. Set the gso_segs
to 1 to make the tx packet count correct.
Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The tx bytecount is done twice. When running
'./xdpsock -t -N -i eth3' and 'ip -s link show dev eth3'
The avg packet size is 120 instead of 60. So remove the
extra one.
Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
As current implementation of netdev already contains and provides
umems for us, we no longer have the need to contain these
structures in ixgbe_adapter.
Refactor the code to operate on netdev-provided umems.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Here, we add a bitmap to the ixgbe_adapter that tracks if a
certain queue pair has been "zero-copy enabled" via the ndo_bpf.
The bitmap is used in ixgbe_xsk_umem, and enables zero-copy if
and only if XDP is enabled, the corresponding qid in the bitmap
is set, and the umem is non-NULL;
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some small things were missed with recent name changes
from i40e to iavf. Having a separate patch allows to
correct the small misses in one place.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Renaming remaining defines from i40e to iavf
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There were a couple of erroneously missed i40e names to
update to iavf left after the larger chunks. Updated them
separately so now they should all be aligned as iavf.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The defines in iavf_client.h were still vastly i40e, and they
should be iavf.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
rename the flags inside of iavf_status from I40E_*
to IAVF_*
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update the old variables and flags marked as i40e to match
the iavf name of the driver.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update the old i40e function names to be iavf
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of typedefing the enum iavf_status_code with iavf_status,
just shorten the enum itself and get rid of typedef.
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
With the rename of the iavf driver, there were some
files that were missed in renaming. Update these to
be iavf as well.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
size = sizeof(struct foo) + count * sizeof(struct boo);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
size = struct_size(instance, entry, count);
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
size = sizeof(struct foo) + count * sizeof(struct boo);
instance = kzalloc(size, GFP_KERNEL)
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL)
Notice that, in this case, variable bufsz is not necessary, hence it
is removed.
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Limiting RSS queues number to online CPUs number in order to
avoid issues with creating misconfigured RSS queues.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clang warns:
In file included from drivers/net/ethernet/intel/iavf/iavf_main.c:4:
In file included from drivers/net/ethernet/intel/iavf/iavf.h:37:
In file included from drivers/net/ethernet/intel/iavf/iavf_type.h:8:
drivers/net/ethernet/intel/iavf/iavf_osdep.h:49:18: warning: 'format' attribute argument not supported: gnu_printf [-Wignored-attributes]
__attribute__ ((format(gnu_printf, 3, 4)));
^
1 warning generated.
We can convert from gnu_printf to printf without any side effects for
two reasons:
1. All iavf_debug instances use standard printf formats, as pointed out
by Miguel Ojeda at the below link, meaning gnu_printf is not strictly
required.
2. However, GCC has aliased printf to gnu_printf on Linux since at least
2010 based on git history.
From gcc/c-family/c-format.c:
/* Attributes such as "printf" are equivalent to those such as
"gnu_printf" unless this is overridden by a target. */
static const target_ovr_attr gnu_target_overrides_format_attributes[] =
{
{ "gnu_printf", "printf" },
{ "gnu_scanf", "scanf" },
{ "gnu_strftime", "strftime" },
{ "gnu_strfmon", "strfmon" },
{ NULL, NULL }
};
The mentioned override only happens on Windows (mingw32). Changing from
gnu_printf to printf is a no-op for GCC and stops Clang from warning.
Link: https://github.com/ClangBuiltLinux/linux/issues/111
Suggested-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch mostly capitalizes abbreviations in code comments. Fixed some
typos and removed some unnecessary newlines as well.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In ice_print_link_msg, add cases for 50GB and 100GB speeds. This
results in the right speed being reported on load, instead of
"Unknownbps".
When VF link if forced (in ice_set_pfe_link_forced), report
max speed 100GB.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace the use of the ICE_DBG_AQ_MSG bit when dumping firmware logging
messages with a separate distinct type ICE_DBG_FW_LOG. This is useful
so that developers may enable ICE_DBG_FW_LOG and get firmware logging
messages, without also dumping AdminQ messages at the same time.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add some details to the function header for ice_deinit_hw.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The define describing the bits for the struct field should be below
the field itself.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current specification has updates to the command formats for
manage MAC opcodes (opcodes 0x0107 and 0x0108) and get PHY caps
(opcode 0x0600). Update the code to reflect this.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For style consistency, use continue instead of an else block in
ice_pf_dcb_recfg.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Change minimum number of descriptor count from 32 to 64. This is to have
a feature parity with previous Intel NIC drivers.
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add call to configure dropping egress LLDP packets in ice_vsi_setup
and remove the rule in ice_vsi_release.
Add calls to add/remove rule to route LLDP packets to default VSI when
FW LLDP engine is disabled/enabled and remove rule if applied during
ice_vsi_release.
In the function ice_add_eth_mac(), there is a line that hard codes the
filter info flag to TX. This is incorrect as this flag will be set by
the calling function that built the list of filters to add. So remove
the hard coded value.
This patch also contains a fix to stop treating the DCBx state of
"Not Started" as an error state that kicks DCB in SW mode. This will
address having non-cabled interfaces automatically go into SW mode
with the FW engine running.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Do not allocate memory for the Get PHY Abilities command data buffer when
it is not necessary, change one local variable to another to reduce the
number of de-references, reduce the scope of some local variables, and
reorder the code and change exit points to get rid of an unnecessary goto
label.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_cfg_vsi_lan returns a value of type enum ice_status. So
use a local of the same type to capture the return value.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds driver support for Forward Error Correction (FEC)
and ethtool handlers to set/get FEC params.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for virtchnl_vector_map.[rxq|txq]_map to use bitmap to
associate indicated queues with the specified vector. This support is
needed since the Windows AVF driver calls VIRTCHNL_OP_CONFIG_IRQ_MAP for
each vector and used the bitmap to indicate the associated queues.
Updated ice_vc_dis_qs_msg to not subtract one from
virtchnl_irq_map_info.num_vectors, and changed the VSI vector index to
the vector id. This change supports the Windows AVF driver which maps
one vector at a time and sets num_vectors to one. Using vectors_id to
index the vector array .
Add check for vector_id zero, and return VIRTCHNL_STATUS_ERR_PARAM
if vector_id is zero and there are rings associated with that vector.
Vector_id zero is for the OICR.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Consolidate adding unicast and broadcast MAC filters in a single new
function ice_init_mac_fltr.
Move ice_napi_del to ice_lib.c
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently in ice_free_vf_res() we are writing to the VFINT_DYN_CTLN
register in the PF's function space to disable all VF's interrupts. This
is incorrect because this register is only for use in the VF's function
space. This becomes obvious when seeing that the valid indices used for
the VFINT_DYN_CTLN register is from 0-63, which is the maximum number of
interrupts for a VF (not including the OICR interrupt). Fix this by
writing to the GLINT_DYN_CTL register for each VF. We can do this
because we keep track of each VF's first_vector_idx inside of the PF's
function space and the number of interrupts given to each VF.
Also in ice_free_vfs() we were disabling Rx/Tx queues after calling
pci_disable_sriov(). One part of disabling the Tx queues causes the PF
driver to trigger a software interrupt, which causes the VF's napi
routine to run. This doesn't currently work because pci_disable_sriov()
causes iavf_remove() to be called which disables interrupts. Fix this by
disabling Rx/Tx queues prior to pci_disable_sriov().
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add a new function ice_trigger_sw_intr to trigger interrupts.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Call ice_vsi_cfg_rss_lut_key only if RSS is enabled.
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In order to initialize the current status of the FW logging,
this patch adds ice_get_fw_log_cfg. The function retrieves
the current setting of the FW logging from HW and updates the
ice_hw structure accordingly.
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove duplicate define for ICE_INVAL_Q_HANDLE. Move defines to the
top of the file.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the path for re-enabling FW LLDP engine, there is
a call to register for LLDP MIB change events. This
call is redundant, in that the call to ice_pf_dcb_cfg
will already register the driver for these events. Also,
the call as it stands now is too early in the flow before
before DCB is configured.
Remove the redundant call.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Change the message level of the MTU change log message from debug to
info.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don't use the mdd_detected variable as an exit condition for this loop;
the first VF to NOT have an MDD event will cause the loop to terminate.
Instead just look at all of the VFs, but don't disable them. This
prevents proper release of resources if the VFs are rebooted or the VF
driver reloaded. Instead, just log a message and call out repeat
offenders.
To make it clear what we are doing, use a differently-named variable in
the loop.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we have two MSI-x (IRQ) trackers, one for OS requested MSI-x
entries (sw_irq_tracker) and one for hardware MSI-x vectors
(hw_irq_tracker). Generally the sw_irq_tracker has less entries than the
hw_irq_tracker because the hw_irq_tracker has entries equal to the max
allowed MSI-x per PF and the sw_irq_tracker is mainly the minimum (non
SR-IOV portion of the vectors, kernel granted IRQs). All of the non
SR-IOV portions of the driver (i.e. LAN queues, RDMA queues, OICR, etc.)
take at least one of each type of tracker resource. SR-IOV only grabs
entries from the hw_irq_tracker. There are a few issues with this approach
that can be seen when doing any kind of device reconfiguration (i.e.
ethtool -L, SR-IOV, etc.). One of them being, any time the driver creates
an ice_q_vector and associates it to a LAN queue pair it will grab and
use one entry from the hw_irq_tracker and one from the sw_irq_tracker.
If the indices on these does not match it will cause a Tx timeout, which
will cause a reset and then the indices will match up again and traffic
will resume. The mismatched indices come from the trackers not being the
same size and/or the search_hint in the two trackers not being equal.
Another reason for the refactor is the co-existence of features with
SR-IOV. If SR-IOV is enabled and the interrupts are taken from the end
of the sw_irq_tracker then other features can no longer use this space
because the hardware has now given the remaining interrupts to SR-IOV.
This patch reworks how we track MSI-x vectors by removing the
hw_irq_tracker completely and instead MSI-x resources needed for SR-IOV
are determined all at once instead of per VF. This can be done because
when creating VFs we know how many are wanted and how many MSI-x vectors
each VF needs. This also allows us to start using MSI-x resources from
the end of the PF's allowed MSI-x vectors so we are less likely to use
entries needed for other features (i.e. RDMA, L2 Offload, etc).
This patch also reworks the ice_res_tracker structure by removing the
search_hint and adding a new member - "end". Instead of having a
search_hint we will always search from 0. The new member, "end", will be
used to manipulate the end of the ice_res_tracker (specifically
sw_irq_tracker) during runtime based on MSI-x vectors needed by SR-IOV.
In the normal case, the end of ice_res_tracker will be equal to the
ice_res_tracker's num_entries.
The sriov_base_vector member was added to the PF structure. It is used
to represent the starting MSI-x index of all the needed MSI-x vectors
for all SR-IOV VFs. Depending on how many MSI-x are needed, SR-IOV may
have to take resources from the sw_irq_tracker. This is done by setting
the sw_irq_tracker->end equal to the pf->sriov_base_vector. When all
SR-IOV VFs are removed then the sw_irq_tracker->end is reset back to
sw_irq_tracker->num_entries. The sriov_base_vector, along with the VF's
number of MSI-x (pf->num_vf_msix), vf_id, and the base MSI-x index on
the PF (pf->hw.func_caps.common_cap.msix_vector_first_id), is used to
calculate the first HW absolute MSI-x index for each VF, which is used
to write to the VPINT_ALLOC[_PCI] and GLINT_VECT2FUNC registers to
program the VFs MSI-x PCI configuration bits. Also, the sriov_base_vector
is used along with VF's num_vf_msix, vf_id, and q_vector->v_idx to
determine the MSI-x register index (used for writing to GLINT_DYN_CTL)
within the PF's space.
Interrupt changes removed any references to hw_base_vector, hw_oicr_idx,
and hw_irq_tracker. Only sw_base_vector, sw_oicr_idx, and sw_irq_tracker
variables remain. Change all of these by removing the "sw_" prefix to
help avoid confusion with these variables and their use.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a handler for ethtool selftest. Selftest includes
testing link, interrupts, eeprom, registers and packet loopback.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_cfg_itr() sets the ITR granularity and default ITR values for the
PF's interrupt vectors. For VF's this will be done in the AVF driver
flow. Fix this by not calling ice_cfg_itr() for SR-IOV.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently we set the default number of Rx descriptors per
queue to the system's page size divided by the number of bytes per
descriptor. For 4K page size systems this is resulting in 128 Rx
descriptors per queue. This is causing more dropped packets than desired
in the default configuration. Fix this by setting the minimum default
Rx descriptor count per queue to 512.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some static analysis tools can complain when doing a bitop assignment using
operands of different sizes. Fix that.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Implement the toggling of rx-vlan-filter; enable|disable VLAN
pruning based on on|off, respectively.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clear PXE mode AQ call (opcode 0x0110) is now supported in FW. So
remove the direct register write to GLLAN_RCTL_0.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix a checkpatch "LINE_SPACING: Please don't use multiple blank lines"
issue that has snuck in to the code.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The default flow control settings for the i225 device is both
'rx' and 'tx' pause frames. There is no depend on the NVM value.
This patch comes to fix this and clean up the driver code.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds flow control settings. This is required to
enable the legacy flow control support.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Driver does not want to keep packets in Tx queue when link is lost.
But present code only reset NIC to flush them, but does not prevent
queuing new packets. Moreover reset sequence itself could generate
new packets via netconsole and NIC falls into endless reset loop.
This patch wakes Tx queue only when NIC is ready to send packets.
This is proper fix for problem addressed by commit 0f9e980bf5
("e1000e: fix cyclic resets at link up with active tx").
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
Tested-by: Joseph Yasi <joe.yasi@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This reverts commit 0f9e980bf5.
That change cased false-positive warning about hardware hang:
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
e1000e 0000:00:1f.6 eth0: Detected Hardware Unit Hang:
TDH <0>
TDT <1>
next_to_use <1>
next_to_clean <0>
buffer_info[next_to_clean]:
time_stamp <fffba7a7>
next_to_watch <0>
jiffies <fffbb140>
next_to_watch.status <0>
MAC Status <40080080>
PHY Status <7949>
PHY 1000BASE-T Status <0>
PHY Extended Status <3000>
PCI Status <10>
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Besides warning everything works fine.
Original issue will be fixed property in following patch.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reported-by: Joseph Yasi <joe.yasi@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203175
Tested-by: Joseph Yasi <joe.yasi@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Tested-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Enables a resend request after the completion timeout workaround is not
relevant for i225 device. This patch is clean code relevant this
workaround.
Minor cosmetic fixes, replace the 'spaces' with 'tabs'
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Few function pointers from phy_operations structure were unused.
This patch cleans those.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Collision threshold and threshold's shift has been defined twice.
This patch comes to fix that.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch fixes the following warning:
drivers/net/ethernet/intel/igb/e1000_82575.c: In function ‘igb_get_invariants_82575’:
drivers/net/ethernet/intel/igb/e1000_82575.c:636:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (igb_sgmii_uses_mdio_82575(hw)) {
^
drivers/net/ethernet/intel/igb/e1000_82575.c:642:2: note: here
case E1000_CTRL_EXT_LINK_MODE_PCIE_SERDES:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch fixes the following warning:
drivers/net/ethernet/intel/igb/igb_main.c: In function ‘__igb_notify_dca’:
drivers/net/ethernet/intel/igb/igb_main.c:6694:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (dca_add_requester(dev) == 0) {
^
drivers/net/ethernet/intel/igb/igb_main.c:6701:2: note: here
case DCA_PROVIDER_REMOVE:
^~~~
Warning level 3 was used: -Wimplicit-fallthrough=3
Notice that, in this particular case, the code comment is modified
in accordance with what GCC is expecting to find.
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Failed in read the HW register is very serious for igb/igc driver,
as its hw_addr will be set to NULL and cause the adapter be seen as
"REMOVED".
We saw the error only a few times in the MTBF test for suspend/resume,
but can hardly get any useful info to debug.
Adding WARN() so that we can get the necessary information about
where and how it happens, and use it for root causing and fixing
this "PCIe link lost issue"
This affects igb, igc.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Recent versions of sparse warn about casting pointers to/from restricted
endian types in the Linux driver. Silence those with the compiler
attribute __force macro from the Linux kernel to force casts to/from
restricted endian types.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the driver is calling ice_napi_del() and then
unregister_netdev(). The call to unregister_netdev() will result in a
call to ice_stop() and then ice_vsi_close(). This is where we call
napi_disable() for all the MSI-X vectors. This flow is reversed so make
the changes to ensure napi_disable() happens prior to napi_del().
Before calling napi_del() and free_netdev() make sure
unregister_netdev() was called. This is done by making sure the
__ICE_DOWN bit is set in the vsi->state for the interested VSI.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ice_vf struct can be used hundreds of times in our
driver so it pays to use less memory per struct.
ice_vf prior to this commit:
/* size: 112, cachelines: 2, members: 25 */
/* sum members: 101, holes: 4, sum holes: 8 */
/* bit holes: 2, sum bit holes: 11 bits */
/* padding: 3 */
/* last cacheline: 48 bytes */
ice_vf after this commit:
/* size: 104, cachelines: 2, members: 25 */
/* sum members: 100, holes: 3, sum holes: 4 */
/* bit holes: 1, sum bit holes: 3 bits */
/* last cacheline: 40 bytes */
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We can use bit fields to store boolean values and when the
bit fields are next to each other, the compiler will combine them
(as long as the size holds enough).
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use more efficient structure ordering by using the pahole tool
and a lot of code inspection to get hot cache lines to have
packed data (no holes if possible) and adjacent warm data.
ice_ring prior to this change:
/* size: 192, cachelines: 3, members: 23 */
/* sum members: 158, holes: 4, sum holes: 12 */
/* padding: 22 */
ice_ring after this change:
/* size: 192, cachelines: 3, members: 25 */
/* sum members: 162, holes: 1, sum holes: 1 */
/* padding: 29 */
ice_tx_buf prior to this change:
/* size: 48, cachelines: 1, members: 7 */
/* sum members: 38, holes: 2, sum holes: 6 */
/* padding: 4 */
/* last cacheline: 48 bytes */
ice_tx_buf after this change:
/* size: 40, cachelines: 1, members: 7 */
/* sum members: 38, holes: 1, sum holes: 2 */
/* last cacheline: 40 bytes */
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fixes ethtool -S reported stats in ice driver to match
format and nomenclature of the ixgbe driver.
Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently if ice_reset_all_vfs() fails in ice_alloc_vfs() we fail to
free some resources, reset variables, and return an error value.
Fix this by adding another unroll case to free the pf->vf array, set
the pf->num_alloc_vfs to 0, and return an error code.
Without this, if ice_reset_all_vfs() fails in ice_alloc_vfs() we will
not be able to do SRIOV without hard rebooting the system because
rmmod'ing the driver does not work.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the LLDP MIB change event handling code by removing
the workarounds in the current code. Added ice_dcb_need_recfg() to
print the DCB configuration changes detected via MIB change event.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
User requested link modes affect what is returned as an advertised
link mode. If no modes have been requested, we are not advertising
any link modes. Advertise what we are capable of supporting if no
link modes have been requested.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When disabling and enabling VSIs, there are a couple of flows
that recursively acquire the RTNL lock which causes a deadlock.
Fix that.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_parse_caps is used to parse both device and function capabilities.
Currently, capabilities are printed with a cryptic "HW caps" prefix,
which makes it difficult to distinguish whether the capabilities being
printed are device or function capabilities.
This patch makes a change to add a "func cap" prefix when printing
function capabilities, and a "dev cap" prefix when printing device
capabilities.
This patch also changes some of the capability print strings for
consistency.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix checkpatch warning "WARNING:BRACES: braces {} are not necessary
for single statement blocks"
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Commit 3463688e6ced ("ice: Add more validation in ice_vc_cfg_irq_map_msg")
added an assignment of vsi making the assignment during declaration
unnecessary.
Also, cleanup the declaration and assignment of irqmap_info to not use two
lines in the variable declaration section.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Implement LLDP persistence across reboots, start and stop of LLDP agent.
Add additional parameter to ice_aq_start_lldp and ice_aq_stop_lldp.
Also change the ethtool private flag from "disable-fw-lldp" to
"enable-fw-lldp". This change will flip the boolean logic of the
functionality of the flag (on = enable, off = disable). The change
in name and functionality is to differentiate between the
pre-persistence and post-persistence states.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull networking updates from David Miller:
"Highlights:
1) Support AES128-CCM ciphers in kTLS, from Vakul Garg.
2) Add fib_sync_mem to control the amount of dirty memory we allow to
queue up between synchronize RCU calls, from David Ahern.
3) Make flow classifier more lockless, from Vlad Buslov.
4) Add PHY downshift support to aquantia driver, from Heiner
Kallweit.
5) Add SKB cache for TCP rx and tx, from Eric Dumazet. This reduces
contention on SLAB spinlocks in heavy RPC workloads.
6) Partial GSO offload support in XFRM, from Boris Pismenny.
7) Add fast link down support to ethtool, from Heiner Kallweit.
8) Use siphash for IP ID generator, from Eric Dumazet.
9) Pull nexthops even further out from ipv4/ipv6 routes and FIB
entries, from David Ahern.
10) Move skb->xmit_more into a per-cpu variable, from Florian
Westphal.
11) Improve eBPF verifier speed and increase maximum program size,
from Alexei Starovoitov.
12) Eliminate per-bucket spinlocks in rhashtable, and instead use bit
spinlocks. From Neil Brown.
13) Allow tunneling with GUE encap in ipvs, from Jacky Hu.
14) Improve link partner cap detection in generic PHY code, from
Heiner Kallweit.
15) Add layer 2 encap support to bpf_skb_adjust_room(), from Alan
Maguire.
16) Remove SKB list implementation assumptions in SCTP, your's truly.
17) Various cleanups, optimizations, and simplifications in r8169
driver. From Heiner Kallweit.
18) Add memory accounting on TX and RX path of SCTP, from Xin Long.
19) Switch PHY drivers over to use dynamic featue detection, from
Heiner Kallweit.
20) Support flow steering without masking in dpaa2-eth, from Ioana
Ciocoi.
21) Implement ndo_get_devlink_port in netdevsim driver, from Jiri
Pirko.
22) Increase the strict parsing of current and future netlink
attributes, also export such policies to userspace. From Johannes
Berg.
23) Allow DSA tag drivers to be modular, from Andrew Lunn.
24) Remove legacy DSA probing support, also from Andrew Lunn.
25) Allow ll_temac driver to be used on non-x86 platforms, from Esben
Haabendal.
26) Add a generic tracepoint for TX queue timeouts to ease debugging,
from Cong Wang.
27) More indirect call optimizations, from Paolo Abeni"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1763 commits)
cxgb4: Fix error path in cxgb4_init_module
net: phy: improve pause mode reporting in phy_print_status
dt-bindings: net: Fix a typo in the phy-mode list for ethernet bindings
net: macb: Change interrupt and napi enable order in open
net: ll_temac: Improve error message on error IRQ
net/sched: remove block pointer from common offload structure
net: ethernet: support of_get_mac_address new ERR_PTR error
net: usb: smsc: fix warning reported by kbuild test robot
staging: octeon-ethernet: Fix of_get_mac_address ERR_PTR check
net: dsa: support of_get_mac_address new ERR_PTR error
net: dsa: sja1105: Fix status initialization in sja1105_get_ethtool_stats
vrf: sit mtu should not be updated when vrf netdev is the link
net: dsa: Fix error cleanup path in dsa_init_module
l2tp: Fix possible NULL pointer dereference
taprio: add null check on sched_nest to avoid potential null pointer dereference
net: mvpp2: cls: fix less than zero check on a u32 variable
net_sched: sch_fq: handle non connected flows
net_sched: sch_fq: do not assume EDT packets are ordered
net: hns3: use devm_kcalloc when allocating desc_cb
net: hns3: some cleanup for struct hns3_enet_ring
...
Remove mmiowb() from the kernel memory barrier API and instead, for
architectures that need it, hide the barrier inside spin_unlock() when
MMIO has been performed inside the critical section.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzMFaUACgkQt6xw3ITB
YzRICQgAiv7wF/yIbBhDOmCNCAKDO59chvFQWxXWdGk/aAB56kwKAMXJgLOvlMG/
VRuuLyParTFQETC3jaxKgnO/1hb+PZLDt2Q2KqixtjIzBypKUPWvK2sf6THhSRF1
GK0DBVUd1rCrWrR815+SPb8el4xXtdBzvAVB+Fx35PXVNpdRdqCkK+EQ6UnXGokm
rXXHbnfsnquBDtmb4CR4r2beH+aNElXbdt0Kj8VcE5J7f7jTdW3z6Q9WFRvdKmK7
yrsxXXB2w/EsWXOwFp0SLTV5+fgeGgTvv8uLjDw+SG6t0E0PebxjNAflT7dPrbYL
WecjKC9WqBxrGY+4ew6YJP70ijLBCw==
=aC8m
-----END PGP SIGNATURE-----
Merge tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull mmiowb removal from Will Deacon:
"Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb())
Remove mmiowb() from the kernel memory barrier API and instead, for
architectures that need it, hide the barrier inside spin_unlock() when
MMIO has been performed inside the critical section.
The only relatively recent changes have been addressing review
comments on the documentation, which is in a much better shape thanks
to the efforts of Ben and Ingo.
I was initially planning to split this into two pull requests so that
you could run the coccinelle script yourself, however it's been plain
sailing in linux-next so I've just included the whole lot here to keep
things simple"
* tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (23 commits)
docs/memory-barriers.txt: Update I/O section to be clearer about CPU vs thread
docs/memory-barriers.txt: Fix style, spacing and grammar in I/O section
arch: Remove dummy mmiowb() definitions from arch code
net/ethernet/silan/sc92031: Remove stale comment about mmiowb()
i40iw: Redefine i40iw_mmiowb() to do nothing
scsi/qla1280: Remove stale comment about mmiowb()
drivers: Remove explicit invocations of mmiowb()
drivers: Remove useless trailing comments from mmiowb() invocations
Documentation: Kill all references to mmiowb()
riscv/mmiowb: Hook up mmwiob() implementation to asm-generic code
powerpc/mmiowb: Hook up mmwiob() implementation to asm-generic code
ia64/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
mips/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
sh/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
m68k/io: Remove useless definition of mmiowb()
nds32/io: Remove useless definition of mmiowb()
x86/io: Remove useless definition of mmiowb()
arm64/io: Remove useless definition of mmiowb()
ARM/io: Remove useless definition of mmiowb()
mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors
...
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-05-04
This series contains updates to the ice driver only.
Jesse updated the driver to make more functions consistent in their use
of a local variable for vsi->back. Updates the driver to use bit fields
when possible to avoid wasting lots of storage space to store single bit
values. Optimized the driver to be more memory efficient by moving
structure members around that are not in are hot path.
Michal updates the driver to disable the VF if malicious device driver
(MDD) event is detected by the hardware. Adds checks to validate the
messages coming from the VF driver. Tightens up the sniffing of the
driver so that transmit traffic so that VF's cannot see what is on other
VSIs.
Tony fixed the driver so that receive stripping state won't change every
time transmit insertion is changed. Cleanup the __always_unused
attribute, now that the variable is being used. Fixed the function
which evaluates setting of features to ensure that can evaluate and set
multiple features in a single function call.
Akeem fixes the driver so that we do not attempt to remove a VLAN filter
that does not exist. Adds support for adding a ethertype based filter
rule on VSI and describe it in a very long run-on sentence. :-)
Bruce cleans up static analysis warnings by removing a local variable
initialization that is not needed.
Brett makes the allocate/deallocate more consistent in all the driver
flows for VSI q_vectors. In addition, makes setting/getting coalesce
settings more consistent throughout the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Added freeing the old allocation of vf->qvlist_info in function
i40e_config_iwarp_qvlist before overwriting it with
the new allocation.
Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fixed possible memory leak in i40e_vc_add_cloud_filter function:
cfilter is being allocated and in some error conditions
the function returns without freeing the memory.
Fix of integer truncation from u16 (type of queue_id value) to u8
when calling i40e_vc_isvalid_queue_id function.
Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
size = sizeof(struct foo) + count * sizeof(struct boo);
instance = kzalloc(size, GFP_KERNEL)
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL)
Notice that, in this case, variable size is not necessary, hence it
is removed.
This code was detected with the help of Coccinelle.
Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The reason of this revert is unexpected issue found in NVM Update tool
during NVM image downgrade. The implementation is no longer needed
since the QV tools are already aware of new FW double ShadowRAM dump
mechanism.
This patch reverts ShadowRAM checksum calculation change introduced in
commit 9d12f0c4e436 ("i40e: Revert ShadowRAM checksum calculation change")
Signed-off-by: Maciej Paczkowski <maciej.paczkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Patch is adding missing input validation on VF message handling
by the PF to the functions with opcodes:
VIRTCHNL_OP_CONFIG_VSI_QUEUES = 6
VIRTCHNL_OP_CONFIG_IRQ_MAP = 7,
VIRTCHNL_OP_DISABLE_QUEUES = 9,
VIRTCHNL_OP_CONFIG_PROMISCUOUS_MODE = 14,
Signed-off-by: Martyna Szapar <martyna.szapar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
New device ids are created to support X710 backplane and SFP+ cards.
This patch adds in i40e driver support for 2.5GbaseT and 5GbaseT speed.
It's implemented by checking I40E_CAP_PHY_TYPE_2_5GBASE_T,
I40E_CAP_PHY_TYPE_5GBASE_T bits from f/w and setting corresponding bits
in ethtool link ksettings supported and advertising masks.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In this patch fixed wrong truncation method from u16 to u8 during
validation.
It was changed by changing u8 to u32 parameter in method declaration
and arguments were changed to u32.
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Field num_vectors from struct virtchnl_iwarp_qvlist_info should not be
larger than num_msix_vectors_vf in the hw struct. The iwarp uses the
same set of vectors as the LAN VF driver.
Signed-off-by: Sergey Nemov <sergey.nemov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the typo in I40E_CAP_PHY_TYPE mode link code.
It was fixed by changing 40000baseLR4_Full to 40000baseKR4_Full
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a bug where changing VLAN to 0 was not set until VF
restart.
Now we are setting pvid info to 0 when we have to change VLAN to 0.
Without this change when VF VLAN was changed to 0 nothing happened until
VF restart. For changing to VLAN different than 0 it worked correctly.
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
New device ids are created to support X710/XXV710 N3000 cards.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes a bug where the promiscuous mode was not being
kept when the VF switched to a new VLAN.
Now we are config two times a promiscuous mode when we switch VLAN.
Without this change when we change VF VLAN we still receive
all the packets from previous VLAN and only unicast from new VLAN.
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Delete code that add default Tx rule on PF. With this rule PF can see
Tx VF traffic that should go outside. For traffic from VF to another
VF default Tx rule on PF doesn't apply because of lower priority than
VF mac rule.
With this change on PF in promisc mode we can see only Rx traffic that
doesn't match any other rule (mac etc.). We can't see Tx traffic from
other VSI.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move a bunch of members around to make more efficient use of
memory, eliminating holes where possible. None of these members
are hot path so cache line alignment is not very important here.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver was converted to not use bool, but it was
neglected that the bools should have been converted to bit fields
as bit fields in software structures are ok, as long as they
use the correct kinds of unsigned types. This avoids
wasting lots of storage space to store single bit values.
One of the change hunks moves a variable lport out of
a group of "combinable" bit fields because all bits of
the u8 lport are valid and the variable can be packed in the
struct in struct holes.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds function to program VSI with ethertype based filter rule,
so that all flow control frames would be disallowed from being transmitted
to the client, in order to prevent malicious VSI, especially VF from
sending out PAUSE or PFC frames, and then control other VSIs traffic.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Set features can have multiple features turned on|off in a single
call. Grouping these all in an if/else means after one condition
is met, other conditions/features will not be evaluated. Break
the if/else statements by feature to ensure all features will be
handled properly.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The variable netdev is being used in this function; remove the
__always_unused attribute from it.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A recent version of cppcheck falsely reports-
Variable ip.hdr is assigned a value that is never used.
ip is a union so the pointer ip.hdr is actually used when referenced as
ip.v4 and ip.v6. Silence these false reports when using cppcheck with the
--inline-suppr command-line option.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently if the driver has an uneven amount of Rx/Tx queues
setting the coalesce settings through ethtool will result in
an error. This is happening because in the setting coalesce
flow we are reporting an error if either Rx or Tx fails.
Also, the flow for setting/getting per_q_coalesce and
setting/getting coalesce settings for the entire device
is different.
Fix these issues by adding one function, ice_set_q_coalesce(),
and another, ice_get_q_coalesce(), that both getting/setting
per_q and entire device coalesce can use. This makes handling
the error cases generic between the two flows and simplifies
__ice_set_coalesce() and __ice_get_coalesce().
Also, add a header comment to __ice_set_coalesce().
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently when probing/removing the driver we allocate/deallocate
each vsi->q_vectors array in ice_vsi_alloc_arrays() and
ice_vsi_free_arrays() respectively. However, we don't do this
during the reset and VSI rebuild flow. This is inconsistent
and unnecessary to have a difference between the two flows.
This patch makes the change to always allocate/deallocate the
vsi->q_vectors array regardless of the driver flow we are in.
Also, update the comment for ice_vsi_free_arrays() to be more
descriptive.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The local variable speed does not need to be initialized and can cause some
static analysis tools to complain the initial assigned value is never used.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add few checks to validate msg from iavf driver.
Test if we have got enough q_vectors allocated in VSI connected with VF.
Add masks for itr_indx and msix_indx to avoid writing to reserved fieldi
of QINT. Clear q_vector->num_ring_rx/tx, without it we can increment this
value every time we send irq map msg from VF. So after second call this
value will be incorrect.
Decrement num_vectors from msg, because last vector in iavf msg is misc
vector (we don't set map for it).
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In case of non-trusted VFs, it is possible to program VLAN filter far
less than what is requested by the VF originally, thereby makes number of
VLAN elements being tracked by VF different from actual VLAN tags. This
patch makes sure that we are not attempting to remove VLAN filter that
does not exist.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When Tx insertion is set, we are not accounting for the state of Rx
stripping. This causes Rx stripping to be enabled any time Tx
insertion is changed, even when it's supposed to be disabled.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Disable VF if any malicious device driver (MDD) event is detected by
hardware. Track vf->num_mdd_events for information about VF MDD events.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Many times in our functions we have a local variable pf, which is
equivalent to vsi->back. Just use pf consistently instead of vsi->back
where available.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-05-02
This series contains updates to the ice driver only.
Anirudh introduces the framework to store queue specific information in
the VSI queue contexts. This will allow future changes to update the
structure to hold queue specific information.
Akeem adds additional check so that if there is no queue to disable when
attempting to disable a queue, return a configuration error without
acquiring the lock. Fixed an issue with non-trusted VFs being able to
add more than the permitted number of VLANs.
Bruce removes unreachable code and updated the function to return void
since it would never return anything but success.
Brett provides most of the changes in the series, starting with reducing
the scope of the error variable used and improved the debug message if
we fail to configure the receive queue. Updates the driver to use a
macro instead of using the same 'for' loop throughout the driver which
helps with readability. Fixed an issue where users were led to believe
they could set rx-usecs-high value, yet the changes to this value would
not stick because it was not yet implemented to allow changes to this
value, so implement the missing code to change the value. Found we had
unnecessary wait when disabling queues, so remove it. I,proved a
wasteful addition operation in our hot path by adding a member to the
ice_q_vector structure and the necessary changes to use the member which
stores the calculated vector hardware index. Refactored the link event
flow to make it cleaner and more clear.
Maciej updates the array index when stopping transmit rings, so that
process every ring the VSI, not just the rings in a given transmit
class.
Paul adds support for setting 52 byte RSS hash keys.
Md Fahad cleaned up a runtime change to the PFINT_OICR_ENA register,
since the interrupt handlers will handle resetting the bit, if
necessary.
Tony adds a missing PHY type, which was causing warning message about an
unrecognized PHY.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces "recovery mode" to the i40e driver. It is
part of a new Any2Any idea of upgrading the firmware. In this
approach, it is required for the driver to have support for
"transition firmware", that is used for migrating from structured
to flat firmware image. In this new, very basic mode, i40e driver
must be able to handle particular IOCTL calls from the NVM Update
Tool and run a small set of AQ commands.
These additional AQ commands are part of the interface used by
the NVMUpdate tool. The NVMUpdate tool contains all of the
necessary logic to reference these new AQ commands. The end user
experience remains the same, they are using the NVMUpdate tool to
update the NVM contents.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Piotr Marczak <piotr.marczak@intel.com>
Tested-by: Don Buchholz <donald.buchholz@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Printing each devices PCI vendor and device ID has the advantage of
easily revealing what hardware we're dealing with exactly. It's no
longer necessary to match the PCI bus information to the lspci output.
Helps with bug reports where no lspci output is available.
Output before
i40e 0000:08:00.0: fw 6.1.49420 api 1.7 nvm 6.80 0x80003c64 1.2007.0
and after
i40e 0000:08:00.0: fw 6.1.49420 api 1.7 nvm 6.80 0x80003c64 1.2007.0 [8086:1572] [8086:0004]
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A refactor of the i40e_vc_config_promiscuous_mode_msg function moved
the check for un-trusted VF into another function. We have to lie to
an un-trusted VF that its request to set promiscuous mode is
successful even when it is not because we don't want the VF to find
out its trust status this way. With the refactor, we were running into
a case where even though we were not setting promiscuous mode for an
un-trusted VF, we still printed a misleading message that it was
successful.
This patch fixes that by ensuring that a success message is printed
on the host side only when the promiscuous mode change has been
successful.
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Just bumping the version number appropriately.
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The function i40e_validate_cloud_filter checks that the destination and
source port numbers are valid by attempting to ensure that the number is
non-zero and no larger than 0xFFFF. However, the types for the dst_port
and src_port variable are __be16 which by definition cannot be larger
than 0xFFFF
Since these values cannot be larger than 2 bytes, the check to see if
they exceed 0xFFFF is meaningless.
One might consider these checks as some sort of defensive coding, in
case the type was later changed. However, these checks also byte-swap
the value before comparison using be16_to_cpu, which will truncate the
values to 16bits anyways. Additionally, changing the type would require
updating the opcodes to support new data layout of these virtchnl
commands.
Remove the check to silence the -Wtype-limits warning that was added to
GCC 8.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This code implements driver code changes necessary for LLDP
Agent support. Modified i40e_aq_start_lldp() and
i40e_aq_stop_lldp() adding false parameter whether LLDP state
should be persistent across power cycles.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add assignments for advertising 40GBase_LR4, 40GBase_CR4 and fibre
Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Due to changes in FW the SW is required to perform double SR dump in
some cases.
Implementation adds two new steps to update nvm checksum function:
* recalculate checksum and check if checksum in NVM is correct
* if checksum in NVM is not correct then update it again
Signed-off-by: Maciej Paczkowski <maciej.paczkowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
VF's attempt to delete vlan 0 when a port vlan is configured is harmless
in this case pf driver just does nothing. If vf will try to remove
other vlans when a port vlan is configured it will still produce error
as before.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
TX MDD events reported on the PF are the result of the
PF misconfiguring a descriptor and not because of "bad actions"
by anything else. No need to reset now because if it
results in a Tx hang, the Tx hang check will take care of it.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes the driver behavior when detecting a VF MDD event.
It now disables the VF after one event, which indicates a hw detected
problem in the VF. Before this change, the PF would allow a couple of
events before doing the reset.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
dev_err makes more sense than dev_info when this call fails.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the link event flow works, but can be much better.
Refactor the link event flow to make it cleaner and more clear
on what is going on.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The PHY type ICE_PHY_TYPE_LOW_25G_AUI_C2C is missing from
ice_get_settings_link_up() which is causing a warning
message for unrecognized PHY. Add the PHY type to
correctly set the settings and avoid the warning message.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Every time we want to re-enable interrupts and/or write to a register
that requires an interrupt vector's hardware index we do the following:
vsi->hw_base_vector + q_vector->v_idx
This is a wasteful operation, especially in the hot path. Fix this by
adding a u16 reg_idx member to the ice_q_vector structure and make the
necessary changes to make this work.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Runtime change of PFINT_OICR_ENA register is unnecessary.
The handlers should always clear the atomic bit for each
task as they start, because it will make sure that any late
interrupt will either 1) re-set the bit, or 2) be handled
directly in the "already running" task handler.
Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes issue with non trusted VFs being able to add more than
permitted number of VLANs by adding a check in ice_vc_process_vlan_msg.
Also don't return an error in this case as the VF does not need to know
that it is not trusted.
Also rework ice_vsi_kill_vlan to use the right types.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In ice_vsi_ctrl_rx_rings() we are unnecessarily waiting for
QRX_CTRL_QENA_REQ and QRX_CTRL_QENA_STAT to be the same value prior to
disabling each Rx queue. There is no reason to do this so remove
this wait loop as we already have a wait loop after disabling/enabling
the Rx queue through the QRX_CTRL register to make sure it gets
successfully disabled/enabled.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the driver allows rx-usecs-high values to be set,
but when querying the device for rx-usecs-high the value
does not stick. This is because it was not yet implemented.
Add code to allow the user to change rx-usecs-high and
use this to set the q_vector's intrl value.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support to set 52 byte RSS hash key.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are many places in the code where we do the following:
for (i = 0; i < vsi->num_q_vectors; i++)
Instead use the macro mentioned in the commit title:
ice_for_each_q_vector(vsi, i)
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When stopping Tx rings, we use 'i' as an ring array index for looking up
whether the ice_ring exists and have assigned a q_vector. This checks
rings only within a given TC and we need to go through every ring in
VSI. Use 'q_idx' instead.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce scope of the variable 'err' to inside the for loop instead
of using it as a second looping conditional. Also while here,
improve the debug message if we fail to configure a Rx queue.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Static analysis points out the default case in the switch statement in
ice_get_itr_intrl_gran() is an infeasible condition causing the default
case statement to be unreachable. Remove it and since the function no
longer returns anything but success, change it to just return void and
update the only call to it accordingly.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If there is no queue to disable, return appropriate configuration error
earlier without acquiring the lock.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch introduces a framework to store queue specific information
in VSI queue contexts. Currently VSI queue context (represented by
struct ice_q_ctx) only has q_handle as a member. In future patches,
this structure will be updated to hold queue specific information.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update all users of eth_get_headlen to pass network device, fetch
network namespace from it and pass it down to the flow dissector.
This commit is a noop until administrator inserts BPF flow dissector
program.
Cc: Maxim Krasnyansky <maxk@qti.qualcomm.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: Yisen Zhuang <yisen.zhuang@huawei.com>
Cc: Salil Mehta <salil.mehta@huawei.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Currently when calculating how much to increment ITR by inside of
ice_update_itr() we do some estimations and intermediate
calculations. Instead of doing estimations, just do the
calculation directly. This allows for a more accurate value and it
makes it easier for the next person to understand and update.
Also, remove the dividing the ITR value by 2 when latency
driven because the ITR values are already so low for 100Gbps
speed. This should help get to the desired ITR value faster.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update driver version to 0.7.4
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds code to start or stop LLDP and DCBX in firmware through
use of ethtool private flags.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch introduces a new function ice_dcb_rebuild which reinitializes
DCB after a reset.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a new function ice_update_dcb_stats to get DCB stats
from the hardware and ethtool support for displaying these stats.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch introduces a new function ice_tx_prepare_vlan_flags_dcb to
insert 802.1p priority information into the VLAN header
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a new function ice_vsi_cfg_dcb_rings which updates a
VSI's rings based on DCB traffic class information.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support to process LLDP MIB change notifications sent
by the firmware.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the firmware doesn't support LLDP or DCBX, the driver should switch
to "software LLDP mode". This patch adds support for the same.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a new function ice_pf_dcb_cfg (and related helpers)
which applies the DCB configuration obtained from the firmware. As
part of this, VSIs/netdevs are updated with traffic class information.
This patch requires a bit of a refactor of existing code.
1. For a MIB change event, the associated VSI is closed and brought up
again. The gap between closing and opening the VSI can cause a race
condition. Fix this by grabbing the rtnl_lock prior to closing the
VSI and then only free it after re-opening the VSI during a MIB
change event.
2. ice_sched_query_elem is used in ice_sched.c and with this patch, in
ice_dcb.c as well. However, ice_dcb.c is not built when CONFIG_DCB is
unset. This results in namespace warnings (ice_sched.o: Externally
defined symbols with no external references) when CONFIG_DCB is unset.
To avoid this move ice_sched_query_elem from ice_sched.c to
ice_common.c.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch introduces a new top level function ice_init_dcb (and
related lower level helper functions) which continues the DCB init
flow.
This function uses ice_get_dcb_cfg to get, parse and store the DCB
configuration. Once this is done, it sets itself up to be notified
by the firmware on LLDP MIB change events.
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch introduces a skeleton for ice_init_pf_dcb, the top level
function for DCB initialization. Subsequent patches will add to this
DCB init flow.
In this patch, ice_init_pf_dcb checks if DCB is a supported capability.
If so, an admin queue call to start the LLDP and DCBx in firmware is
issued. If not, an error is reported. Note that we don't fail the driver
init if DCB init fails.
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bump driver version to 0.7.3
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Capitalize abbreviations and spell out some that aren't obvious.
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes typos in code comments.
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes an error code for an admin queue
head overrun to use I40E_ERR_ADMIN_QUEUE_FULL instead
of I40E_ERR_QUEUE_EMPTY.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the problem with the driver being able to add only 7
multicast MAC address filters instead of 16. The problem is fixed by
changing the maximum number of MAC address filters to 16+1+1 (two extra
are needed because the driver uses 1 for unicast MAC address and 1 for
broadcast).
Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Defined the advertised link mode field for 40000baseSR4_Full for
use with ethtool.
Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Added the API version in the error message for clarity.
Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A new FW has been released, which uses API version 1.8.
Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Removed misleading messages when untrusted VF tries to
add more addresses than NIC limit
Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Modify the i40e_init_dcb to return the correct error when LLDP or DCBX
is not in operational state.
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On some hardware LEDs would not blink after command 'ethtool -p {eth-port}'
in certain circumstances. Now, function does not care about the activity
of the LED (though still preserves its state) but forcibly executes
identification blinking and then restores the LED state.
Signed-off-by: Piotr Marczak <piotr.marczak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the case where PTP is running on the hardware clock, but the kernel
system time is not being synced, a device reset can mess up the clock
time.
This occurs because we reset the clock time based on the kernel time
every reset. This causes us to potentially completely reset the PTP
time, and can cause unexpected behavior in programs like ptp4l.
Avoid this by saving the PTP time prior to device reset, and then
restoring using that time after the reset.
Directly restoring the PTP time we saved isn't perfect, because time
should have continued running, but the clock will essentially be stopped
during the reset. This is still better than the current solution of
assuming that the PTP HW clock is synced to the CLOCK_REALTIME.
We can do even better, by saving the ktime and calculating
a differential, using ktime_get(). This is based on CLOCK_MONOTONIC, and
allows us to get a fairly precise measure of the time difference between
saving and restoring the time.
Using this, we can update the saved PTP time, and use that as the value
to write to the hardware clock registers. This, of course is not perfect.
However, it does help ensure that the PTP time is restored as close as
feasible to the time it should have been if the reset had not occurred.
During device initialization, continue using the system time as the
source for the creation of the PTP clock, since this is the best known
current time source at driver load.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Modifying the VLAN stripping options when a port VLAN is configured
will break traffic for the VSI, and conceptually doesn't make sense,
so don't allow this.
Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch introduces DDP (Dynamic Device Personalization) which allows
loading profiles that change the way internal parser interprets processed
frames. To load DDP profiles it utilizes ethtool flash feature. The files
with recipes must be located in /var/lib/firmware directory. Afterwards
the recipe can be loaded by invoking:
ethtool -f <if_name> <file_name> 100
ethtool -f <if_name> - 100
See further details of this feature in the i40e documentation, or
visit
https://www.intel.com/content/www/us/en/architecture-and-technology/ethernet/dynamic-device-personalization-brief.html
The driver shall verify DDP profile can be loaded in accordance with
the rules:
* Package with Group ID 0 are exclusive and can only be loaded the first.
* Packages with Group ID 0x01-0xFE can only be loaded simultaneously
with the packages from the same group.
* Packages with Group ID 0xFF are compatible with all other packages.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Added a new local variable in the i40e_setup_tc function named
old_queue_pairs so num_queue_pairs can be restored to the correct
value in case configuring queue channels fails. Additionally, moved
the exit label in the i40e_setup_tc function so the if (need_reset)
block can be executed.
Also, fixed data packing in the i40e_setup_tc function.
Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
mmiowb() is now implied by spin_unlock() on architectures that require
it, so there is no reason to call it from driver code. This patch was
generated using coccinelle:
@mmiowb@
@@
- mmiowb();
and invoked as:
$ for d in drivers include/linux/qed sound; do \
spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done
NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation. If you've ended up bisecting to this commit, you can
reintroduce the mmiowb() calls using wmb() instead, which should restore
the old behaviour on all architectures other than some esoteric ia64
systems.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Minor comment merge conflict in mlx5.
Staging driver has a fixup due to the skb->xmit_more changes
in 'net-next', but was removed in 'net'.
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two reasons for this.
First, the xmit_more flag conceptually doesn't fit into the skb, as
xmit_more is not a property related to the skb.
Its only a hint to the driver that the stack is about to transmit another
packet immediately.
Second, it was only done this way to not have to pass another argument
to ndo_start_xmit().
We can place xmit_more in the softnet data, next to the device recursion.
The recursion counter is already written to on each transmit. The "more"
indicator is placed right next to it.
Drivers can use the netdev_xmit_more() helper instead of skb->xmit_more
to check the "more packets coming" hint.
skb->xmit_more is retained (but always 0) to not cause build breakage.
This change takes care of the simple s/skb->xmit_more/netdev_xmit_more()/
conversions. Remaining drivers are converted in the next patches.
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit f3fef2b6e1 ("i40e: Remove umem from VSI") a regression was
introduced; When the VSI was reset, the setup code would try to enable
AF_XDP ZC unconditionally (as long as there was a umem placed in the
netdev._rx struct). Here, we add a bitmap to the VSI that tracks if a
certain queue pair has been "zero-copy enabled" via the ndo_bpf. The
bitmap is used in i40e_xsk_umem, and enables zero-copy if and only if
XDP is enabled, the corresponding qid in the bitmap is set and the
umem is non-NULL.
Fixes: f3fef2b6e1 ("i40e: Remove umem from VSI")
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The i40e_xsk_umem function was explicitly inlined in i40e.h. There is
no reason for that, so move it to i40e_main.c instead.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current check for WoL on i40e is broken. Code comment says only
magic packet is supported, so only check for that.
Fixes: 540a152da7 (i40e/ixgbe/igb: fail on new WoL flag setting WAKE_MAGICSECURE)
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ixgbe ignores errors returned from mdiobus_register() and leaves
adapter->mii_bus non-NULL and MDIO bus state as MDIOBUS_ALLOCATED.
This triggers a BUG from mdiobus_unregister() during ixgbe_remove() call.
Fixes: 8fa10ef012 ("ixgbe: register a mdiobus")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The runtime_suspend device callbacks are not supposed to save
configuration state or change the power state. Commit fb29f76cc566
("igb: Fix an issue that PME is not enabled during runtime suspend")
changed the driver to not save configuration state during runtime
suspend, however the driver callback still put the device into a
low-power state. This causes a warning in the pci pm core and results in
pci_pm_runtime_suspend not calling pci_save_state or pci_finish_runtime_suspend.
Fix this by not changing the power state either, leaving that to pci pm
core, and make the same change for suspend callback as well.
Also move a couple of defines into the appropriate header file instead
of inline in the .c file.
Fixes: fb29f76cc566 ("igb: Fix an issue that PME is not enabled during runtime suspend")
Signed-off-by: Arvind Sankar <niveditas98@gmail.com>
Reviewed-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>