This change makes certain that any packet we attempt to transmit will meet
minimum size requirements for the hardware.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to just cleanup the logic in ixgbe_change_mtu since we
are making it unnecessarily complex due to a workaround required for 82599
when SR-IOV is enabled.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch replaces the existing Rx hot-path in the ixgbe driver with a new
implementation that is based on performing a double buffered receive. The
ixgbe driver already had something similar in place for its' packet split
path, however in that case we were still receiving the header for the
packet into the sk_buff. The big change here is the entire receive path
will receive into pages only, and then pull the header out of the page and
copy it into the sk_buff data. There are several motivations behind this
approach.
First, this allows us to avoid several cache misses as we were taking a
set of cache misses for allocating the sk_buff and then another set for
receiving data into the sk_buff. We are able to avoid these misses on
receive now as we allocate the sk_buff when data is available.
Second we are able to see a considerable performance gain when an IOMMU is
enabled because we are no longer unmapping every buffer on receive.
Instead we can delay the unmap until we are unable to use the page, and
instead we can simply call sync_single_range on the half of the page that
contains new data.
Finally we are able to drop a considerable amount of code from the driver
as we no longer have to support 2 different receive modes, packet split and
one buffer. This allows us to optimize the Rx path further since less
branching is required.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This allows the NIC to receive all frames available, including
those with bad FCS, ethernet control frames, and more.
Tested by sending frames with bad FCS.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Including bad FCS, used generate frames with bad FCS
to test other system's handling of RX of bad packets.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This allows the NIC to receive all frames available, including
those with bad FCS, un-matched vlans, ethernet control frames,
and more.
Tested by sending frames with bad FCS.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Including bad FCS, used generate frames with bad FCS
to test other system's handling of RX of bad packets.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Disabling and enabling DCB can cause FCoE hardware initialization to
occur on the incorrect traffic class when the up2tc mapping has not
yet been reconfigured.
Fix this by using the DCB configuration maps that are correct
and will be pushed at mqprio after DCB driver setup completes
successfully.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Marcus Dennis <marcusx.e.dennis@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There was a race condition in the reset path where the RX buffer
could become corrupted during Fdir configuration.This is due to
a HW bug.The fix right now is to lock the buffer while we do the
fdir configuration.Since we were using similar workaround for another bug,
I moved the existing code to a function and reused it.HW team also recommended
that IXGBE_MAX_SECRX_POLL value be changed from 30 to 40.The erratum for this
bug will be published in the next release 82599 Spec Update
Signed-off-by: Atita Shirwaikar <atita.shirwaikar@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
using the form min((int)var, ver)) is replaced by min_t(int, ...)
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is clearly a typeo where we are not checking the return value from
get_link_capabilities but should. This patch corrects that.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There isn't much point in using variables to store the values of eitr_low
and eitr_high since they are not user changeable. As such I am replacing
them with the constants 10 and 20 in order to avoid any confusion on what
the values actually are.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A previous fix had gone though and disabled relaxed ordering for Rx
descriptor read fetching. This was not necessary as this functions
correctly and has no ill effects on the system.
In addition several of the defines used for the DCA control registers were
incorrect in that they indicated descriptor effects when they actually had
an impact on either data or header write back. As such I have update these
to correctly reflect either DATA or HEAD.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use the current logging styles.
Remove unnecessary _DEBUG_DRIVER_ and PFX, use pr_debug.
Coalesce format.
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it a bit easier to do the loopback frame creating and
testing. Previously we were doing an and to drop the last bit, and then
dividing the frame_size by 2 in order to get locations for frame bytes and
testing. Instead we can simplify it by just shifting the register one bit
to the right and using that for the frame offsets.
This change also replaces all instances of rx_buffer_info with just
rx_buffer since that is closer to the name of the actual structure being
used and can save a few extra characters.
In addition I have updated the logic for cleaning up a test frame so that
we pass an rx_buffer instead of the sk_buff. The main motivation behind
this is changes that will replace the sk_buff with just a page in the
future.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since there are multiple spots where we have to cycle through all of the
rings on a q_vector it makes sense to just add a function for iterating
through all of them.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch makes the rings a part of the q_vector directly instead of
indirectly. Specifically on x86 systems this helps to avoid any cache
set conflicts between the q_vector, the tx_rings, and the rx_rings as the
critical stride is 4K and in order to cross that boundary you would need to
have over 15 rings on a single q_vector.
In addition this allows for smarter allocations when Flow Director is
enabled. Previously Flow Director would set the irq_affinity hints based
on the CPU and was still using a node interleaving approach which on some
systems would end up with the two values mismatched. With the new approach
we can set the affinity for the irq_vector and use the CPU for that
affinity to determine the node value for the node and the rings.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is a minor cleanup to address the unnecessary use of
napi_schedule_prep in ixgbe_intr and to also remove a blank line that is
not needed since it is separating a comment from the line it is explaining.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The old code had several errors in how it was determining the vector
budget. In order to simplify things this patch updates the code so that it
will attempt to always allocated paired Rx/Tx vectors instead of attempting
to allocate individual vectors when the number of queues is less than the
number of CPUs.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change corrects an issue in which Adaptive Interrupt Moderation was
not changing values due to the fact that we were performing an and
operation on the resultant value that was causing the value to never change
from the default 20K interrupts per second.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to address the fact that the tx_itr_setting was
dropping to 0 when no separate Tx vectors were provided. This had resulted
in the driver incorrectly configuring the Tx ring with a WTHRESH of 1 in
order to avoid Tx hangs even though that was not necessary. This change
makes it so that we instead take a look at the Tx ring's q_vector to
determine if the ring will have an ITR value less than 8us.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change moves several frequently accessed items together into one cache
line in order to reduce cache misses in the hot-path.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There isn't any need to clear the status bits in the descriptors due to the
fact that the eop_desc provides enough information for us to know
that we have cleaned to the last packet that the software has put on the
ring. The status bits are cleared as a part of putting the frame on the
ring so as long as we do not read the descriptor bit prior to reading the
value eop_desc we should be able to guarantee that we will not clean beyond
the end of the current data stream.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We are seeing dev_watchdog hangs on several drivers. I suspect this is due
to the __QUEUE_STATE_STACK_XOFF bit being set prior to a reset for link
change, and then not being cleared by netdev_tx_reset_queue. This change
corrects that.
In addition we were seeing dev_watchdog hangs on igb after running the
ethtool tests. We found this to be due to the fact that the ethtool test
runs the same logic as ndo_start_xmit, but we were never clearing the XOFF
flag since the loopback test in ethtool does not do byte queue accounting.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This adds support for byte queue limits (BQL).
Based on patch from Eric Dumazet for igb.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
As noted by Ben Hutchings and David Miller, work limits for NAPI
should not be tied to interrupt moderation parameters. This
should be handled by NAPI, possibly through sysfs.
Neil Horman & Stephen Hemminger are working on a solution for
NAPI currently. In the meantime, remove this tie between
work limits and interrupt moderation.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
A bug was introduced with the following patch:
Commmit bdbc063129
Author: Eric Dumazet <eric.dumazet@gmail.com>
igb: Add support for byte queue limits.
The ethtool offline tests will cause a perpetual link flap, this
is because the tests also need to account for byte queue limits (BQL).
CC: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
{g|s}etnumtcs() today returns a u8 that is only used by the DCB code
to verify no error occurred. Today the driver implementations return
negative error codes which end up being non-zero so the logic works
out but triggers some sparse warnings.
To fix the sparse warnings convert the return value to an int.
CC: Eilon Greenstein <eilong@broadcom.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
dcb netlink code calls setup_tc to init hardware traffic classes
to use for DCB. At some call sites the return values are not
checked for errors and in one case may return -EINVAL back to
the net/dcbnl.c caller which is expecting a u8.
This fixes some smatch hits and although failures are never
seen in practive its best to check return codes.
Reported-by: Dan Carenter <dan.carpenter@oracle.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The old code would += the total errors every time
stats were gathered. Instead, keep a count of short-pkt
and long-pkt counters and then simply add them together
for the rx-over-length stat.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch corrects several comments that are either incorrect or formatted
incorrectly for multiline comments.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Correct spelling error caught with codespell.py.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is meant to address several minor issues in
ixgbe_xmit_frame_ring. Specifically it adds a comment explaining the TXSW
flag, and correctly wraps a line over 80 characters.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The e1000_link_stall_workaround_lv() function is always called in non-
atomic context so it should use msleep instead of mdelay. Also, remove
unnecessary #include <linux/delay.h>.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This allows the NIC to receive packets with bad FCS
and other errors. Good for sniffing packets on flakey
networks.
v4: Only flax rx-over-length errors if pkt is beyond
maximum expected packet size, not just beyond the MTU.
This matches the existing logic for this counter.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This can aid with testing the RX logic for bad
CRCs.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This allows e100 to be configured to append the
Ethernet FCS to the skb.
Useful for sniffing networks.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rename e1000e_reload_nvm() to e1000e_reload_nvm_generic() to signify the
function is used for more than one MAC-family type, and set and use it as a
MAC ops function pointer to be consistent with the driver design.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove reference to non-existant function.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rename e1000e_config_collision_dist() to
e1000e_config_collision_dist_generic() to signify the function is used for
more than one MAC-family type, and set and use it as a MAC ops function
pointer to be consistent with the driver design.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Call the MAC ops setup_physical_interface function pointer instead of the
MAC-family-specific function to conform to the rest of the driver design.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace e1000_check_reset_block() inline function with calls to the PHY ops
check_reset_block function pointer.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace e1000_check_mng_mode() inline function with calls to the MAC ops
check_mng_mode function pointer.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rename e1000e_setup_link() to e1000e_setup_link_generic() to signify the
function is used for more than one MAC-family type. The 82571-family has
a custom setup_link function which also calls the generic function. The
ich8lan-family has a custom function which should just be called via the
function pointer. The 80003es2lan-family just uses the generic function.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>