Implement new interface for firmware commands to access some PHYs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The firmware version method and functions are not used anywhere, so
remove them all.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are two problems with EEPROM access. One is that it needs to
hold the semaphore until the entire response is read or else the
response can be corrupted by other firmware accesses. The second
problem is that acquiring and releasing the semaphore is slow, so
it should be taken and released once when multiple EEPROM accesses
will be done.
Both of these issues can be solved by adding a new function,
ixgbe_hic_unlocked, to issue firmware commands that will assume
that the caller has acquired the needed semaphore.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch ensures that the advertised link speeds are configured
for X553 KR/KX backplane. Without this patch the link remains at
1G when resuming from low power after being downshifted by LPLU.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Rx timestamp does not work on 82599 and X540 because bitwise operation
of RX_HWTSTAMP flags is incorrect and ixgbe_ptp_rx_hwtstamp() is never
called. This patch fixes it to enable Rx timestamp on 82599 and X540.
Without this fix:
ptp4l[278.730]: selected /dev/ptp8 as PTP clock
ptp4l[278.733]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[278.733]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[278.834]: port 1: received SYNC without timestamp
ptp4l[278.835]: port 1: new foreign master 1c3947.fffe.60f9cc-1
ptp4l[279.834]: port 1: received SYNC without timestamp
ptp4l[280.834]: port 1: received SYNC without timestamp
ptp4l[281.834]: port 1: received SYNC without timestamp
ptp4l[282.834]: port 1: received SYNC without timestamp
ptp4l[282.835]: selected best master clock 1c3947.fffe.60f9cc
ptp4l[282.835]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[283.834]: port 1: received SYNC without timestamp
With this fix:
ptp4l[239.154]: selected /dev/ptp8 as PTP clock
ptp4l[239.157]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[239.157]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[240.989]: port 1: new foreign master 1c3947.fffe.60f9cc-1
ptp4l[244.989]: selected best master clock 1c3947.fffe.60f9cc
ptp4l[244.989]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[246.977]: master offset -899583339542096 s0 freq +0 path delay 16222
ptp4l[247.977]: master offset -899583339617265 s1 freq -75169 path delay 16177
ptp4l[248.977]: master offset -130 s2 freq -75299 path delay 16177
ptp4l[248.977]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[249.977]: master offset -9 s2 freq -75217 path delay 16177
ptp4l[250.977]: master offset 88 s2 freq -75123 path delay 16132
Fixes: a9763f3cb5 ("ixgbe: Update PTP to support X550EM_x devices")
Signed-off-by: Yusuke Suzuki <yus-suzuki@uf.jp.nec.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make sure that we free the IRQs in ixgbe_io_error_detected() when
responding to an PCIe AER error and also restore them when the
interface recovers from it.
Previously it was possible to trigger BUG_ON() check in free_msix_irqs()
in the case where we call ixgbe_remove() after a failed recovery from
AER error because the interrupts were not freed.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are two methods for setting mac addresses in a Macvlan, that
differentiate themselves in the function macvlan_set_mac_Address.
If the macvlan mode is passthru, then we use the dev_set_mac_address
method, otherwise we use the dev_uc api via macvlan_sync_addresses.
The latter method (which would stem from using any non-passthru mode,
like bridge, or vepa), calls down into the driver in a path that terminates
in ixgbevf_set_uc_addr_vf, which sends a IXGBE_VF_SET_MACVLAN message,
which causes the pf to spawn the noted error message. This occurs because
it appears that the guest is trying to delete the mac address of the macvlan
before adding another.
The other path in macvlan_set_mac_address uses dev_set_mac_address, which
calls into ixgbevf_set_mac which uses the IXGBE_VF_SET_MAC_ADDR to the
pf to set the macvlan mac address.
The discrepancy here is in the handlers. The handler function for
IXGBE_VF_SET_MAC_ADDR (ixgbe_set_vf_mac_addr) has a check for
the vfinfo[].trusted bit to allow the operation if the vf is trusted.
In comparison, the IXGBE_VF_SET_MACVLAN message handler
(ixgbe_set_vf_macvlan_msg) has no such check of the trusted bit.
Signed-off-by: Ken Cox <jkc@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When an interface is part of a namespace it is possible that
ixgbe_close() may be called while __ixgbe_shutdown() is running
which ends up in a double free WARN and/or a BUG in free_msi_irqs().
To handle this situation we extend the rtnl_lock() to protect the
call to netif_device_detach() and ixgbe_clear_interrupt_scheme()
in __ixgbe_shutdown() and check for netif_device_present()
to avoid clearing the interrupts second time in ixgbe_close();
Also extend the rtnl lock in ixgbe_resume() to netif_device_attach().
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
BaseT adapters that are capable of supporting 100Mb are not reporting this
capability. This patch corrects the reporting so that 100Mb is shown as
supported on those adapters.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A retry count of 10 is likely to run into problems on X550 devices that
have to detect and reset unresponsive CS4227 devices. So, reduce the I2C
retry count to 3 for X550 and above. This should avoid any possible
regressions in existing devices.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is an extension of commit 003287e0f0 ("ixgbevf: Correct parameter
sent to LED function"); add bounds checking to x540 functions to ensure the
index is valid.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The indirection table was reported incorrectly for X550 and newer
where we can support up to 64 RSS queues.
Reported-by Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The generic PHY reset check we had previously is not sufficient for the
ixgbe_phy_x550em_ext_t PHY type. Check 1.CC02.0 instead - same as
ixgbe_init_ext_t_x550().
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some x550 devices require the driver version reported to its firmware; this
patch sends the driver version string to the firmware through the host
interface command for x550 devices.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
FEC is configured by the NVM and the driver should not be
overriding it.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is no point in having an extra type for extra confusion. u64 is
unambiguous.
Conversion was done with the following coccinelle script:
@rem@
@@
-typedef u64 cycle_t;
@fix@
typedef cycle_t;
@@
-cycle_t
+u64
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Couple conflicts resolved here:
1) In the MACB driver, a bug fix to properly initialize the
RX tail pointer properly overlapped with some changes
to support variable sized rings.
2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
overlapping with a reorganization of the driver to support
ACPI, OF, as well as PCI variants of the chip.
3) In 'net' we had several probe error path bug fixes to the
stmmac driver, meanwhile a lot of this code was cleaned up
and reorganized in 'net-next'.
4) The cls_flower classifier obtained a helper function in
'net-next' called __fl_delete() and this overlapped with
Daniel Borkamann's bug fix to use RCU for object destruction
in 'net'. It also overlapped with Jiri's change to guard
the rhashtable_remove_fast() call with a check against
tc_skip_sw().
5) In mlx4, a revert bug fix in 'net' overlapped with some
unrelated changes in 'net-next'.
6) In geneve, a stale header pointer after pskb_expand_head()
bug fix in 'net' overlapped with a large reorganization of
the same code in 'net-next'. Since the 'net-next' code no
longer had the bug in question, there was nothing to do
other than to simply take the 'net-next' hunks.
Signed-off-by: David S. Miller <davem@davemloft.net>
In the case of IPIP and SIT tunnel frames the outer transport header
offset is actually set to the same offset as the inner transport header.
This results in the lco_csum call not doing any checksum computation over
the inner IPv4/v6 header data.
In order to account for that I am updating the code so that we determine
the location to start the checksum ourselves based on the location of the
IPv4 header and the length.
Fixes: b83e30104b ("ixgbe/ixgbevf: Add support for GSO partial")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For some Tx paths (e.g., tpacket_snd()), ixgbe_atr may be
passed down an sk_buff that has the network and transport
header in the paged data, so it needs to make sure these
headers are available in the headlen bytes to calculate the
l4_proto.
This patch expect that network and transport headers are
already available in the non-paged header dat. The assumption
is that the caller has set this up if l4_proto based Tx
steering is desired.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Commit 9f12df906c ("ixgbe: Store VXLAN port number in network order")
incorrectly checks for hdr.ipv4->protocol != IPPROTO_UDP
in ixgbe_atr(). This check should be for "==" instead.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We were using an old Alpha version of the X550 phy ID. This was leading
to unnecessary queries of the PHY. I removed the old ID (which shouldn't
be on any HW) and add the two that are.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch add X553 FW ALEF support for B0. ALEF is the new unified
FW. This contains updated register defines for ALEF speed
configuration. Likewise it also removes the AN_CNTL_8 usage from
the native SFI flow as it is no longer supported by FW.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix an issue where set_phy_power was NULL for X550 copper devices
because get_invariants was called before hw->device_id was set.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Introduce ixgbe_link_operations struct with the following changes:
read_i2c_combined => read_link
read_i2c_combined_unlocked => read_link_unlocked
write_i2c_combined => write_link
write_i2c_combined_unlocked => write_link_unlocked
This will allow X550EM_a to override these methods for MDIO access
while X550EM_x provides methods to use I2C combined access. This
also adds a new structure, ixgbe_link_info, to hold information
about the link. Initially this is just method pointers and a bus
address.
The functions involved in combined I2C accesses were moved from
ixgbe_phy.c to ixgbe_x550.c. The underlying functions that carry
out the combined I2C accesses were left in ixgbe_phy.c because
they share some functions with other I2C methods.
v2 - set hw->link.ops in probe.
v3 - check ii->link_ops before setting it since we don't have it
for all devices.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove SFP ixfi code since there is no HW that currently supports it.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The msix_entries memory can be freed by a previous suspend or
remove, so don't crash on close when it isn't there.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds X553 flow control auto negotiation for fiber and
backplain. To enable this new function pointers were added as well
as creating a function to dynamically set function pointer we can't
define only on MAC type.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Read the PHY register twice in order to get the correct value for
autoneg_status.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Replace some ixgbe specific MDIO defines with their equivalent
from the kernel.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch updates ixgbe_setup_phy_link_generic to set/unset
auto-negotiation for all speeds. This ensures that unsupported
speeds are unset. This is necessary since the PHY NVM may
advertise unsupported speeds.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support to get the LED link active via the LEDCTL
register. If the LEDCTL register does not have LED link active
(LED mode field = 0x0100) set then default LED link active returned.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
X553 doesn't need all the initialization that X552 did for iXFI. This
patch will allow native SPI SFP+ to work with X553 devices. Future
patches will add additional configuration as needed.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mostly simple overlapping changes.
For example, David Ahern's adjacency list revamp in 'net-next'
conflicted with an adjacency list traversal bug fix in 'net'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix NULL pointer dereference in the case where a macvlan interface is
brought up while the PF is still down:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffffa0170fb2>] ixgbe_alloc_rx_buffers+0x42/0x1a0 [ixgbe]
Call Trace:
[<ffffffffa017336b>] ixgbe_configure_rx_ring+0x2eb/0x3d0 [ixgbe]
[<ffffffffa0173811>] ixgbe_fwd_ring_up+0xd1/0x380 [ixgbe]
[<ffffffffa0179709>] ixgbe_fwd_add+0x149/0x230 [ixgbe]
[<ffffffffa0113480>] macvlan_open+0x260/0x2b0 [macvlan]
Reported-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Convert ixgbe users to new dev walk API. This is just a code conversion;
no functional change is intended.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e100: min_mtu 68, max_mtu 1500
- remove e100_change_mtu entirely, is identical to old eth_change_mtu,
and no longer serves a purpose. No need to set min_mtu or max_mtu
explicitly, as ether_setup() will already set them to 68 and 1500.
e1000: min_mtu 46, max_mtu 16110
e1000e: min_mtu 68, max_mtu varies based on adapter
fm10k: min_mtu 68, max_mtu 15342
- remove fm10k_change_mtu entirely, does nothing now
i40e: min_mtu 68, max_mtu 9706
i40evf: min_mtu 68, max_mtu 9706
igb: min_mtu 68, max_mtu 9216
- There are two different "max" frame sizes claimed and both checked in
the driver, the larger value wasn't relevant though, so I've set max_mtu
to the smaller of the two values here to retain identical behavior.
igbvf: min_mtu 68, max_mtu 9216
- Same issue as igb duplicated
ixgb: min_mtu 68, max_mtu 16114
- Also remove pointless old == new check, as that's done in dev_set_mtu
ixgbe: min_mtu 68, max_mtu 9710
ixgbevf: min_mtu 68, max_mtu dependent on hardware/firmware
- Some hw can only handle up to max_mtu 1504 on a vf, others 9710
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These accessors are used in various drivers that support tc offloading,
to detect properties of a given 'tc_action'.
'is_tcf_mirred_redirect' tests that the action is TCA_EGRESS_REDIR.
'is_tcf_mirred_mirror' tests that the action is TCA_EGRESS_MIRROR.
As a prep towards supporting INGRESS redir/mirror, rename these
predicates to reflect their true meaning:
s/is_tcf_mirred_redirect/is_tcf_mirred_egress_redirect/
s/is_tcf_mirred_mirror/is_tcf_mirred_egress_mirror/
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Ido Schimmel <idosch@mellanox.com>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2016-09-23
This series contains updates to ixgbe and ixgbevf.
Emil provides several changes, first simplifies the logic for setting VLAN
filtering by checking the VMDQ flag and the old 82598 MAC, instead of
having to maintain a list of MAC types. Then made two functions static
that are used only within the file, a by-product is sparse is now happy.
Added spinlocks to make sure that the MTU configuration is handled
properly. Fixed an issue where when SR-IOV is enabled while the
ixgbevf driver is loaded would result in all mailbox requests being
rejected by ixgbe, so call ixgbe_sriov_reinit() before pci_enable_sriov()
to ensure mailbox requests are properly handled.
Mark resolves a NULL pointer issue by simply setting the read and write
*_ref_mdi pointers for x550em_a devices. Then clearly indicates within
ethtool that all MACs support pause frames and made sure that the
advertising is set to the requested mode. Fixed an issue where
MDIO_PRTAD_NONE was not being used consistently to indicate no PHY
address.
Alex fixes an issue, where the support for multiple queues when SR-IOV
is enabled was added but the support was not reported. With that, fix
an issue where the hardware redirection table could support more queues
then the PF currently has when SR-IOV is enabled, so use the RSS mask to
trim off the bits that are not used. Lastly, instead of limiting the
VFs if we do not use 4 queues for RSS in the PF, we can instead just limit
the RSS queues used to a power of 2. We can now support use cases where
VFs are using more queues than the PF is currently using and can support
RSS if so desired.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving
the ability for user-space application to specify it for the VF, as an
option to support 802.1ad.
We adjusted IP Link tool to support this option.
For future use cases, the new UAPI supports multiple vlans. For now we
limit the list size to a single vlan in kernel.
Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older versions of IP Link tool.
Add a vlan protocol parameter to the ndo_set_vf_vlan callback.
We kept 802.1Q as the drivers' default vlan protocol.
Suitable ip link tool command examples:
Set vf vlan protocol 802.1ad:
ip link set eth0 vf 1 vlan 100 proto 802.1ad
Set vf to VST (802.1Q) mode:
ip link set eth0 vf 1 vlan 100 proto 802.1Q
Or by omitting the new parameter
ip link set eth0 vf 1 vlan 100
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enabling SRIOV while the ixgbevf driver is loaded will result in all
mailbox requests from ixgbevf_open() being rejected by ixgbe because
adapter->clear_to_send is set to false on reset.
Call ixgbe_sriov_reinit() before pci_enable_sriov() to make sure that
mailbox requests are handled from the time ixgbevf is loaded.
Reported-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of limiting the VFs if we don't use 4 queues for RSS in the PF we
can instead just limit the RSS queues used to a power of 2. By doing this
we can support use cases where VFs are using more queues than the PF is
currently using and can support RSS if so desired.
The only limitation on this is that we cannot support 3 queues of RSS in
the PF or VF. In either of these cases we should fall back to 2 queues in
order to be able to use the power of 2 masking provided by the psrtype
register.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The hardware redirection table can support more queues then the PF
currently has when SR-IOV is enabled. In order to account for this use the
RSS mask to trim of the bits that are not used.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The maximum queue count reported was 1, however support for multiple queues
with SR-IOV was added some time ago so we should report support for it to
the user so that they can select multiple queues if they so desire.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The value MDIO_PRTAD_NONE should be used to indicate no PHY address.
Not 0, not 0xFFFF. Use the MDIO_PRTAD_NONE value consistently.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
All the MACs supported by ixgbe support pause frames, so indicate
that support in ethtool. Also set advertising according to requested
mode.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Set the read_reg_mdi and write_reg_mdi method pointers for
X550EM_A_10G_T devices to resolve jumping to NULL.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
These functions are only used in ixgbe_x550.c.
Fixes a warning when compiling with -Wmissing-prototypes
Reported-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Simplify the logic for setting VLNCTRL.VFE by checking the VMDQ flag
and 82598 MAC instead of having to maintain a list of MAC types.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Drivers must be ready to accept NULL from ptp_clock_register() if the
PTP clock subsystem is configured out.
This patch documents that and ensures that all drivers cope well
with a NULL return.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Eugenia Emantayev <eugenia@mellanox.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IS_ENABLED() macro checks if a Kconfig symbol has been enabled either
built-in or as a module, use that macro instead of open coding the same.
Using the macro makes the code more readable by helping abstract away some
of the Kconfig built-in and module enable details.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove a useless log message and improve the logic for setting
a PHY address from the contents of the MNG_IF_SEL register.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the new copper device X557.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This shouldn't matter as nothing should be attached still to be
consisted control MDIO speed for these devices as well.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X557 devices use a different interface to the LED for the port.
This patch reflect that change.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add geneve Rx offload support for x550em_a.
The implementation follows the vxlan code with the lower 16 bits of
the VXLANCTRL register holding the UDP port for VXLAN and the upper
for Geneve.
Disabled NFS filters in the RFCTL register which allows us to simplify
the check for VXLAN and Geneve packets in ixgbe_rx_checksum().
Removed vxlan from the name of the callback functions and replaced it
with udp_tunnel which is more in line with the new API.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ethtool reports backplane type interfaces as 1000/10000baseT link modes.
This has been corrected to report the media as KR, KX or KX4 based on the backplane interface present.
Signed-off-by: Veola Nazareth <veola.nazareth@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The RAR entry for the SAN MAC address was being cleared when we were
clearing the VMDq pool bits. In order to prevent this we need to add
an extra check to protect the SAN MAC from being cleared.
Fixes: 6e982aeae ("ixgbe: Clear stale pool mappings")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use atomic bitwise operations when setting and checking reset
requests. This should help with possible races in the service task.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Following a write the TXDCTL.ENABLE bit is set only when the Tx queue
is actually enabled, which may not happen during the configure phase even
if we waited for it. Make this check debug only since this is causing
confusion with users who notice the warning in dmesg.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
As pointed out by Jamal, an action could be shared by
multiple filters, so we can't use list to chain them
any more after we get rid of the original tc_action.
Instead, we could just save pointers to these actions
in tcf_exts, since they are refcount'ed, so convert
the list to an array of pointers.
The "ugly" part is the action API still accepts list
as a parameter, I just introduce a helper function to
convert the array of pointers to a list, instead of
relying on the C99 feature to iterate the array.
Fixes: a85a970af2 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back when I submitted the GSO code I messed up and dropped the support for
disabling the VLAN tag filtering via the feature bit. This patch
re-enables the use of the NETIF_F_HW_VLAN_CTAG_FILTER to enable/disable the
VLAN filtering independent of toggling promiscuous mode.
Fixes: b83e30104b ("ixgbe/ixgbevf: Add support for GSO partial")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When I was adding the code for enabling VLAN promiscuous mode with SR-IOV
enabled I had inadvertently left the VLNCTRL.VFE bit unchanged as I has
assumed there was code in another path that was setting it when we enabled
SR-IOV. This wasn't the case and as a result we were just disabling VLAN
filtering for all the VFs apparently.
Also the previous patches were always clearing CFIEN which was always set
to 0 by the hardware anyway so I am dropping the redundant bit clearing.
Fixes: 1636956491 ("ixgbe: Add support for VLAN promiscuous with SR-IOV")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2016-07-22
This series contains updates to ixgbe and ixgbevf only.
Emil fixes the NACK check in ixgbevf_set_uc_addr_vf() for instances where
the index is not equal to zero. Fixes an issue where mac->ops.setup_fc
can be NULL for backplanes which can cause the driver to crash on load.
Don fixes the second parameter of the LED functions, which is the index to
the LED we are interested in affecting. Fixed variable to store register
reads to unsigned integer. Adds support for the new x553 hardware into
ixgbevf. Fixed a missing rtnl lock around ixgbevf_reinit_locked().
Fixed an issue where in ixgbevf_reset_subtask() was not verifying that
the port has been removed. Cleans up the initial crosstalk fix, since
the SFP that indicates the presence of a SFP+ module changes between
hardware types.
Babu Moger fixes typo in freeing IRQ, since the array subscript increments
after the execution of the statement.
Wei Yongjun adds the missing destroy_workqueue() before returning from
ixgbe_init_module() in the error handling case.
Tony adds range checking for setting the MTU from the VF, where the PF can
return a NACK but this was not passed on to the VF, so propagate the
results from the PF to the VF so errors can be reported. Consolidates
mailbox read and write functions, since the recent changes to
ixgbevf_write_msg_read_ack(), other functions are performing the same
operations done here.
Colin Ian King removes a redundant check on ret_val, since ret_val has
not changed since the previous check.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch address a few issues with the initial crosstalk fix. Most
important of which is the SDP that indicates the presents of a SFP+
module changes between HW types. With this change that is taken in
to consideration
It also moves the check closer to the base code that checks link. This
makes it so we only need to do the check in one spot.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The last check on ret_val is redundant since ret_val has not changed
since the previous check, so remove it as it is extraneous.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add the missing destroy_workqueue() before return from
ixgbe_init_module() in the error handling case.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
mac->ops.setup_fc can be null for backplanes which can cause the driver
to crash on load.
Reported-by: Patrick McLean <patrickm@gaikai.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The array subscript increments after the execution of the statement.
So there is no issue here. However it helps to read the code better.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
I noticed this variable used for register reads wasn't an unsigned
so this patch corrects that. I don't believe this was causing any
issue as is but this is more consistent with the rest of the driver.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The second parameter of these functions is the index to the led we
are interested in affecting. However we were mistakenly passing
the offset in the register. This patch corrects that and adds some
bonds checking which would hopefully make bugs like this more noticeable
in the future.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the function ixgbe_poll() returns 0 when it clean completely
the rx rings, but this foul budget accounting in core code.
Fix this returning the actual work done, capped to weight - 1, since
the core doesn't allow to return the full budget when the driver modifies
the napi status
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some comments weren't updated to reflect the renaming of ndo's and the
change of arguments.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When setting spoofing, both VLAN and MAC need to be set together.
This change resolves an issue where MAC-VLANs on the VF fail to pass
traffic due to spoofed packets.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update ixgbe_ethtool_get_ts_info() to show that x550 supports hardware
timestamping of all packets.
Reported-by: Guy Harris <guy@alum.mit.edu>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
For u32 classifier filters, avoid overwriting existing filter
in a hardware location without removing it first, to clean up
inconsistencies due to duplicate values for filter location.
Verified with the following filters:
Create child hash tables:
handle 1: u32 divisor 1
handle 2: u32 divisor 1
Link to the child hash table from parent hash table:
handle 800:0:11 u32 ht 800: link 1: \
offset at 0 mask 0f00 shift 6 plus 0 eat \
match ip protocol 6 ff match ip dst 15.0.0.1/32
handle 800:0:12 u32 ht 800: link 2: \
offset at 0 mask 0f00 shift 6 plus 0 eat \
match ip protocol 17 ff match ip dst 16.0.0.1/32
Add filter into child hash table:
handle 1:0:3 u32 ht 1: \
match tcp src 22 ffff action drop
Add another filter to the same location:
handle 2:0:3 u32 ht 2: \
match tcp src 33 ffff action drop
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On deleting filters which are links to a child hash table, the filters
in the child hash table must be cleared from the hardware if there
is no link between the parent and child hash table.
Verified with the following filters:
Create a child hash table:
handle 1: u32 divisor 1
Link to the child hash table from parent hash table:
handle 800:0:10 u32 ht 800: link 1: \
offset at 0 mask 0f00 shift 6 plus 0 eat \
match ip protocol 6 ff match ip dst 15.0.0.1/32
Add filters into child hash table:
handle 1:0:2 u32 ht 1: \
match tcp src 22 ffff action drop
handle 1:0:3 u32 ht 1: \
match tcp src 33 ffff action drop
Delete link filter from parent hash table:
handle 800:0:10 u32
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that we do have pci_request_mem_regions() and pci_release_mem_regions()
at hand, use it in the Intel ethernet drivers.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: David S. Miller <davem@davemloft.net>
This change replaces the network device operations for adding or removing a
VXLAN port with operations that are more generically defined to be used for
any UDP offload port but provide a type. As such by just adding a line to
verify that the offload type is VXLAN we can maintain the same
functionality.
In addition I updated the socket address family check so that instead of
excluding IPv6 we instead abort of type is not IPv4. This makes much more
sense as we should only be supporting IPv4 outer addresses on this
hardware.
The last change is that I pulled the rtnl_lock/unlock into the conditional
statement for IXGBE_FLAG2_VXLAN_REREG_NEEDED. The motivation behind this
is to avoid unneeded bouncing of the mutex which will just slow down the
handling of this call anyway.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for offloading IPXIP6 type packets that represent
either IPv4 or IPv6 encapsulated inside of an IPv6 outer IP header. In
addition with this change we should also be able to support FOU
encapsulated traffic with outer IPv6 headers.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
NETIF_F_GSO_IPXIP6. These are used to described IP in IP
tunnel and what the outer protocol is. The inner protocol
can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
are removed (these are both instances of SKB_GSO_IPXIP4).
SKB_GSO_IPXIP6 will be used when support for GSO with IP
encapsulation over IPv6 is added.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like at some point I somehow transposed the location of setting
the VLAN features in netdev->features and the configuration of the
vlan_features. As a result the driver is now generating a warning about
vlan_features being setup incorrectly.
This patch corrects that by placing the update of netdev->features to
include the VLAN features so that it is after the point where we write
netdev->features into netdev->vlan_features.
Fixes: b83e30104b ("ixgbe/ixgbevf: Add support for GSO partial")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Swap the parameters in GENMASK in order to generate the correct mask.
This change fixes Tx hangs when enabling SRIOV.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2016-05-04
This series contains updates to ixgbe, ixgbevf and traffic class helpers.
Sridhar adds helper functions to the tc_mirred header to access tcf_mirred
information and then implements them for ixgbe to enable redirection to
a SRIOV VF or an offloaded MACVLAN device queue via tc 'mirred' action.
Amritha adds support to set filters with multiple header fields (L3,L4)
to match on.
KY Srinivasan from Microsoft add Hyper-V support into ixgbevf.
Emil adds 82599 sub-device IDs that were missing from the list of parts
that support WoL. Then simplified the logic we use to determine WoL
support by reading the EEPROM bits for MACs X540 and newer.
Preethi cleaned up duplicate and unused device IDs. Fixed our ethtool
stat reporting where we were ignoring higher 32 bits of stats registers,
so fill out 64 bit stat values into two 32 bit words.
Babu Moger from Oracle improves VF performance issues on SPARC.
Alex Duyck cleans up some of the Hyper-V implementation from KY so that
we can just use function pointers instead of having to identify if a
given VF is running on a Linux or Windows PF.
Usha makes sure that DCB and FCoE is disabled for X550EM_x/a MACs and
cleans up the DCB initialization in the process.
Tony cleans up the API for ixgbevf_update_xcast_mode() so we do not
have to pass in the netdev parameter, since it was never used in the
function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
a trans_start struct member exists twice:
- in struct net_device (legacy)
- in struct netdev_queue
Instead of open-coding dev->trans_start usage to obtain the current
trans_start value, use dev_trans_start() instead.
This is not exactly the same, as dev_trans_start also considers
the trans_start values of the netdev queues owned by the device
and provides the most recent one.
For legacy devices this doesn't matter as dev_trans_start can cope
with netdev trans_start values of 0 (they are ignored).
This is a prerequisite to eventual removal of dev->trans_start.
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds IXGBE_FLAG_DCB_CAPABLE flag that is set
for all MACs other than X550EM_x and x550em_a. DCB and
FCoE is disabled for these MACS. DCB initialization
code is moved to a separate function.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Ronald Bynoe <ronald.j.bynoe@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Revise populating few registers in ixgbe_get_regs() and macro
definitions.
Before applying patch:
$ du -k objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
8572 objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
After applying patch:
$ du -k objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
8568 objs/drivers/net/ethernet/intel/ixgbe/ixgbe.ko
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code was ignoring higher 32 bits of stats registers. This patch
correctly fills out 64 bit value in two 32 bit words.
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove duplicate and unused device ID definitions.
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change aims to simplify the logic we use to determine WOL
support by reading the EEPROM bits for MACs X540 and newer.
Also some cleanups in ixgbe_wol_supported() - changed return type to
bool and removed redundant return variable by simply using return after
the checks.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We had some 82599 subdevice IDs missing from the list of parts that
support WoL.
Reported-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Adds support to set filters with multiple header fields (L3,L4)to match on.
This is achieved in the following order:
1. Create a leaf hash table for the next header.
2. Create a link to the leaf hash table from the base hash table with
matches on next header type and current header fields.
3. Add filter in leaf hash table with match on next header fields and
action.
Verified with the following filters :
Match TCP and DIP:
handle 1: u32 divisor 1
u32 ht 800: order 1 link 1: \
offset at 0 mask 0f00 shift 6 plus 0 eat \
match ip protocol 6 ff match ip dst 10.0.0.1/32
match tcp src 28 ffff action drop
Delete the filter:
Match on DIP, SIP, UDP (SPort, DPort):
handle 2: u32 divisor 1
u32 ht 800: order 2 link 2: \
offset at 0 mask 0f00 shift 6 plus 0 eat \
match ip dst 15.0.0.2/32 match ip protocol 17 ff \
match ip src 15.0.0.1/32
match udp src 30 ffff match udp dst 32 ffff action drop
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables 'redirect' to a SRIOV VF or a offloaded macvlan
device queue via tc 'mirred' action.
Verified with the following script that creates SRIOV VFs, offloaded
macvlan and adds tc u32 filters with redirect action to the associated
netdevs.
# add ingress qdisc.
tc qdisc add dev p4p1 ingress
# enable hw tc offload.
ethtool -K p4p1 hw-tc-offload on
# create 4 sriov VFs and bring up the first one.
echo 4 > /sys/class/net/p4p1/device/sriov_numvfs
sleep 1
ip link set p4p1 up
ip link set p4p1_0 up
# create a offloaded macvlan device and bring it up.
ethtool -K p4p1 l2-fwd-offload on
ip link add link p4p1 name mvlan_1 type macvlan
ip link set mvlan_1 up
# add u32 filter with action to redirect to VF netdev
tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 800:0:1 u32 ht 800: \
match ip src 192.168.1.3/32 \
action mirred egress redirect dev p4p1_0
# add u32 filter with action to redirect to macvlan netdev
tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 800:0:2 u32 ht 800: \
match ip src 192.168.2.3/32 \
action mirred egress redirect dev mvlan_1
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The newly added x550em_a support causes a link failure on ARM because of
an overly long time passed into udelay():
ERROR: "__bad_udelay" [drivers/net/ethernet/intel/ixgbe/ixgbe.ko] undefined!
There are multiple variants of the ixgbe_acquire_swfw_sync_*() function,
and the other ones all use msleep(), so we can safely assume that all
callers are allowed to sleep, which makes msleep() a better replacement
than mdelay().
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 49425dfc74 ("ixgbe: Add support for x550em_a 10G MAC type")
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for partial GSO segmentation in the case of
tunnels. Specifically with this change the driver an perform segmentation
as long as the frame either has IPv6 inner headers, or we are allowed to
mangle the IP IDs on the inner header. This is needed because we will not
be modifying any fields from the start of the start of the outer transport
header to the start of the inner transport header as we are treating them
like they are just a block of IP options.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Make use of GENMASK instead of open coding the equivalent operation
incorrectly.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Several areas of ixgbe were written before widespread usage of the
BIT(n) macro. With the impending release of GCC 6 and its associated new
warnings, some usages such as (1 << 31) have been noted within the ixgbe
driver source. Fix these wholesale and prevent future issues by simply
using BIT macro instead of hand coded bit shifts.
Also fix a few shifts that are shifting values into place by using the
'u' prefix to indicate unsigned. It doesn't strictly matter in these
cases because we're not shifting by too large a value, but these are all
unsigned values and should be indicated as such.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It is possible on some systems that crosstalk could lead to link flap
on empty SFP+ cages. A new NVM bit was defined to let SW know it
needs to implement the work around which consists of verifying that
there is a module in the cage before acting on the LSC.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Somehow the wrong fc_setup function was used for x550em_a, so
correct that. Also set setup_link to NULL as its value is
determined later, just like it is with X550EM_x.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use a new register to wait for previous register writes to complete
before issuing a register read. This is needed when slower links
are in use.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This field is used to record the RX queue index for a redirect action
passed via ring_cookie field in struct ethtool_rx_flow_spec which is
a u64 value.
For ex: after adding a filter rule to redirect to a VF using ethtool
# echo 4 > /sys/class/net/p4p1/device/sriov_numvfs
# ethtool -N p4p1 flow-type ip4 src-ip 192.168.0.1 action 0x100000000
querying for the rule shows the Action as 'Direct to queue 0'
# ethtool -n p4p1
4 RX rings available
Total 1 rules
Filter: 2045
Rule Type: Raw IPv4
Src IP addr: 192.168.0.1 mask: 0.0.0.0
Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Protocol: 0 mask: 0xff
L4 bytes: 0x0 mask: 0xffffffff
VLAN EtherType: 0x0 mask: 0xffff
VLAN: 0x0 mask: 0xffff
User-defined: 0x0 mask: 0xffffffffffffffff
Action: Direct to queue 0
With this fix, ethtool will report the right queue index even for VFs.
Action: Direct to queue 4294967296
Here 4294967296 corresponds to 0x100000000.
We need to update 'ethtool' to report the queue index as a Hex value so
that it is more user friendly and matches with the 'action' value that
is passed when adding the rule.
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
X550EM_a/x did not have a default value for mac->ops.setup_link which
was causing link issues for backplane devices.
This patch sets mac->ops.setup_link to ixgbe_setup_mac_link_X540 for
X550EM_a/x which is also default for X550. This will result in
mac->ops.setup_link calling the link setup function for the respective
PHY type in case we do not need a special function to deal with it.
Reported-by: Ken Cox <jkc@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously the PF driver would only set VLAN spoof checking if
the VF had created VLANs. This was done by setting and checking
a counter (vlan_count) whenever a VLAN was created by the VF.
However it is possible for the vlan_count to be !=0 while there are
no VLANs assigned to the VF due to the count incrementing every
time a VLAN 0 is added on ifdown/up, which resulted in VLAN spoofing
always being set for those VFs.
This patch cleans up the logic by unconditionally setting VLAN based on
how the VF is configured (via ip link set ethX vf Y spoofchk on/off).
This change also resolves an issue where the VLAN spoofing can remain
set even after being disabled by the user due to the driver enabling
VLAN spoof checking every time a VLAN is added to the VF, but would
only allow changes in the setting if vlan_count != 0.
Also default_vf_vlan_id and vlans_enabled were removed from the
vf_data_storage structure since they are not being used in the driver.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Consolidate the logic behind configuring spoof checking:
Move the setting of the MAC, VLAN and Ethertype spoof checking into
ixgbe_ndo_set_vf_spoofchk().
Change ixgbe_set_mac_anti_spoofing() to set MAC spoofing per VF similar
to the VLAN and Ethertype functions - this allows us to call the helper
functions in ixgbe_ndo_set_vf_spoofchk() for all spoof check types and
only disable MAC spoof checking when creating MACVLAN.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
vxlan_get_rx_port requires rtnl_lock to be held.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Shannon Nelson <shannon.nelson@intel.com>
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update ixgbe version number.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for x550em_a-based KR backplane devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for an SGMII backplane interface.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for SFPs with an external retimer.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move code that controls MDIO speed into a new function because
there will be more MACs that need the control.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Read the IXGBE_NW_MNG_IF_SEL register and use it to set interface
attributes.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Read the instance number from EEPROM and save it for later use.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now x550em_a devices will use a new method for PHY access that will
get the firmware token for each access.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for x550em_a 10G MAC type to the ixgbe driver. The new
MAC includes new firmware commands that need to be used to control
PHY and IOSF access, so that support is also added. The interface
supported is a native SFP+ interface.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Provide method pointers and use them to access IOSF-attached
devices. A new MAC will introduce a new access method.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add definitions for a x550em_a 10G MAC device with a native SFP
interface.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for a single-port X550 device.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch enables bulk free in Tx cleanup for ixgbevf and cleans up the
boolean logic in the polling routines for ixgbe and ixgbevf in the hopes of
avoiding any mix-ups similar to what occurred with i40e and i40evf.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We need to take the manageability semaphore when issuing firmware
commands to avoid problems. With this in place, the semaphore is
no longer taken in the ixgbe_set_fw_drv_ver_generic function, since
it will now always be taken by the ixgbe_host_interface_command
function.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up the interface for issuing firmware commands to use a
void * instead of a u32 *. This eliminates a number of casts.
Also clean up ixgbe_host_interface_command in a few other ways,
eliminating comparisons with 0, redundant parens and minor
formatting issues.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The function ixgbe_host_interface_command actually uses a multiple
of word sized buffer to do its business, but only checks against
the actual length passed in. This means that on read operations it
could be possible to modify locations beyond the length passed in.
Change the check to round up in the same way, just to avoid any
possible hazard.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since the lan_id and func fields only ever hold small values, make
them u8 to avoid casts used to silence warnings.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
I noticed the SRAMREL registers are not referenced for any device,
so delete the definitions.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we can use the ethtool rx-vlan-filter flag to
toggle Rx VLAN filtering on and off. This is basically just an extension
of the existing VLAN promisc work in that it just adds support for the
additional ethtool flag.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Added support to match on UDP fields in the transport layer.
Extended core logic to support multiple headers.
Verified with the following filters :
handle 1: u32 divisor 1
u32 ht 800: order 1 link 1: \
offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff
u32 ht 1: order 2 \
match tcp src 1024 ffff match tcp dst 23 ffff action drop
handle 2: u32 divisor 1
u32 ht 800: order 3 link 2: \
offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 17 ff
u32 ht 2: order 4 \
match udp src 1025 ffff match udp dst 24 ffff action drop
Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It is possible on some HW that a system reset could occur when we are
holding the SWFW semaphore lock. So next time the driver was loaded we
would see it incorrectly as locked. This patch will recover from that state
by: Attempting to acquire the semaphore and then regardless of whether or
not it was acquire we immediately release it. This will force us into
a known good state.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This commit adds a callback which allows to adjust the maximum transmit
bitrate the card can output. This makes it possible to get a smooth
traffic instead of the default burst-y behaviour when trying to output
e.g. a video stream.
Much of the logic needed to get a correct bcnrc_val was taken from the
ixgbe_set_vf_rate_limit() function.
Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Xeon D KR backplane is different from other backplanes,
in that we can't use auto-negotiation to determine the
mode. Instead, use whatever the user configured.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for generic Tx checksums to the ixgbe driver. It
turns out this is actually pretty easy after going over the datasheet as we
were doing a number of steps we didn't need to.
In order to perform a Tx checksum for an L4 header we need to fill in the
following fields in the Tx descriptor:
MACLEN (maximum of 127), retrieved from:
skb_network_offset()
IPLEN (maximum of 511), retrieved from:
skb_checksum_start_offset() - skb_network_offset()
TUCMD.L4T indicates offset and if checksum or crc32c, based on:
skb->csum_offset
The added advantage to doing this is that we can support inner checksum
offloads for tunnels and MPLS while still being able to transparently
insert VLAN tags.
I also took the opportunity to clean-up many of the feature flag
configuration bits to make them a bit more consistent between drivers.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This commit converts commit c762dff24c ("ixgbe: Look up MAC address in
Open Firmware or IDPROM") to use eth_platform_get_mac_address()
added by commit c7f5d10549 ("net: Add eth_platform_get_mac_address()
helper.")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The source for the ops structure contents are const, so make them
so. Copy them in place with structure assignments instead of memcpys.
Make the mbx_ops accessed by reference instead of making a copy of
the source structure. Update copyright date on the touched files.
Reported-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We were adding VLAN 0 twice each time we restored the VLAN configuration.
Instead of doing it twice we can just start working through the active
VLANs from ID 1 on and skip the double write.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
While doing the work on igb I realized there were a few cases where we were
still adding VLANs to the VLVF entries for the PF when they were not
needed. This patch cleans that up so that the only time we add a PF entry
to the VLVF is either for VLAN 0 or if the PF has requested a VLAN that a VF
is already using.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When running certain routing protocols like VRRP, VF guests need the
ability to set the unicast address of the interface. Extend the new ndo
trust feature to let the hypervisor trust a guest to set/update its own
unicast address.
Signed-off-by: Chas Williams <3chas3@gmail.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It seem to be non intentionally changed to Tx in
commit adc810900a ("ixgbe: Refactor busy poll socket code to address
multiple issues")
Lock is taken from ixgbe_low_latency_recv, and there under this
lock we use ixgbe_clean_rx_irq so it looks wrong for me to increment
Tx counter.
Yield stats can be shown through ethtool:
ethtool -S enp129s0 | grep yield
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix support for 16 bit source/dest port matches in ixgbe model.
u32 uses a single 32-bit key value for both source and destination ports
starting at offset 0. So replace the 2 functions with a single function
that takes this key value/mask to program both source and dest ports.
Verified with the following filter:
#tc qdisc add dev p4p1 ingress
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 1: u32 divisor 1
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 800:0:10 u32 ht 800: link 1: \
offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 1:0:10 u32 ht 1: \
match tcp src 1024 ffff match tcp dst 80 ffff action drop
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 1:0:11 u32 ht 1: \
match tcp src 1025 ffff action drop
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 1:0:12 u32 ht 1: \
match tcp dst 81 ffff action drop
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove the incorrect check for mask in ixgbe_configure_clsu32 and
drop the 'mask' field that is not required in struct ixgbe_mat_field
Verified with the following filters:
#tc qdisc add dev p4p1 ingress
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 800:0:1 u32 ht 800: \
match ip dst 10.0.0.1/8 match ip src 10.0.0.2/8 action drop
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 800:0:2 u32 ht 800: \
match ip dst 11.0.0.1/16 match ip src 11.0.0.2/16 action drop
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 800:0:3 u32 ht 800: \
match ip dst 12.0.0.1/24 match ip src 12.0.0.2/24 action drop
#tc filter add dev p4p1 parent ffff: protocol ip prio 99 \
handle 800:0:4 u32 ht 800: \
match ip dst 13.0.0.1/32 match ip src 13.0.0.2/32 action drop
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Check for handle ids when adding/deleting hash nodes OR adding/deleting
filter entries and limit them to max number of links or header nodes
supported(IXGBE_MAX_LINK_HANDLE).
Start from bit 0 when setting hash table bit-map.(adapter->tables)
Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This function is only used in ixgbe_main.c
Resolves a "missing prototype" warning when building the driver with W=1
Reported-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Calling dev_close() causes IFF_UP to be cleared which will remove the
interfaces routes and some addresses. That's probably not what the user
intended when running the offline selftest. Besides this does not happen
if the interface is brought down before the test, so the current
behaviour is inconsistent.
Instead call the net_device_ops ndo_stop function directly and avoid
touching IFF_UP at all.
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use udelay instead of usleep_range because this can be called while
a lock is held.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ATR code was assuming that it would be able to use tcp_hdr for
every TCP frame that came through. However this isn't the case as it
is possible for a frame to arrive that is TCP but sent through something
like a raw socket. As a result the driver was setting up bad filters in
which tcp_hdr was really pointing to the network header so the data was
all invalid.
In order to correct this I have added a bit of parsing logic that will
determine the TCP header location based off of the network header and
either the offset in the case of the IPv4 header, or a walk through the
IPv6 extension headers until it encounters the header that indicates
IPPROTO_TCP. In addition I have added checks to verify that the lowest
protocol provided is recognized as IPv4 or IPv6 to help mitigate raw
sockets using ETH_P_ALL from having ATR applied to them.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The VXLAN port number should be stored in network order instead of in host
order as it is accessed from the hot-path in ATR. This way we can avoid
having to do any byte swaps in order to validate the port number.
I moved the vxlan_port value into a hole in the read-mostly region of the
adapter struct. This way it should be in a warm cache-line instead of in
some isolated region in memory when it needs to be accessed.
In addition I went through and stripped a bunch of unneeded ifdef flags
since having an extra variable present doesn't really hurt anything and
makes the code easier to read. I also went through and dropped the
NETIF_F_RXCSUM flag which was being set in hw_encap_features but provides
no value as the flag is not evaluated in the Rx path.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
commit c9f53e63c2 ("ixgbe: Refactor MAC address configuration code")
introduced code that doesn't set HW register RAR0 to default mac address
but FF:FF:FF:FF:FF:FF. Due to this, ixgbe HW discards all incoming packets
that doesn't have destination mac address equals to FF:FF:FF:FF:FF:FF.
This commit sets RAR0 correctly to default HW mac address.
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Tested-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Pull networking updates from David Miller:
"Highlights:
1) Support more Realtek wireless chips, from Jes Sorenson.
2) New BPF types for per-cpu hash and arrap maps, from Alexei
Starovoitov.
3) Make several TCP sysctls per-namespace, from Nikolay Borisov.
4) Allow the use of SO_REUSEPORT in order to do per-thread processing
of incoming TCP/UDP connections. The muxing can be done using a
BPF program which hashes the incoming packet. From Craig Gallek.
5) Add a multiplexer for TCP streams, to provide a messaged based
interface. BPF programs can be used to determine the message
boundaries. From Tom Herbert.
6) Add 802.1AE MACSEC support, from Sabrina Dubroca.
7) Avoid factorial complexity when taking down an inetdev interface
with lots of configured addresses. We were doing things like
traversing the entire address less for each address removed, and
flushing the entire netfilter conntrack table for every address as
well.
8) Add and use SKB bulk free infrastructure, from Jesper Brouer.
9) Allow offloading u32 classifiers to hardware, and implement for
ixgbe, from John Fastabend.
10) Allow configuring IRQ coalescing parameters on a per-queue basis,
from Kan Liang.
11) Extend ethtool so that larger link mode masks can be supported.
From David Decotigny.
12) Introduce devlink, which can be used to configure port link types
(ethernet vs Infiniband, etc.), port splitting, and switch device
level attributes as a whole. From Jiri Pirko.
13) Hardware offload support for flower classifiers, from Amir Vadai.
14) Add "Local Checksum Offload". Basically, for a tunneled packet
the checksum of the outer header is 'constant' (because with the
checksum field filled into the inner protocol header, the payload
of the outer frame checksums to 'zero'), and we can take advantage
of that in various ways. From Edward Cree"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
bonding: fix bond_get_stats()
net: bcmgenet: fix dma api length mismatch
net/mlx4_core: Fix backward compatibility on VFs
phy: mdio-thunder: Fix some Kconfig typos
lan78xx: add ndo_get_stats64
lan78xx: handle statistics counter rollover
RDS: TCP: Remove unused constant
RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
net: smc911x: convert pxa dma to dmaengine
team: remove duplicate set of flag IFF_MULTICAST
bonding: remove duplicate set of flag IFF_MULTICAST
net: fix a comment typo
ethernet: micrel: fix some error codes
ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
bpf, dst: add and use dst_tclassid helper
bpf: make skb->tc_classid also readable
net: mvneta: bm: clarify dependencies
cls_bpf: reset class and reuse major in da
ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
ldmvsw: Add ldmvsw.c driver code
...
The success of CMA allocation largely depends on the success of
migration and key factor of it is page reference count. Until now, page
reference is manipulated by direct calling atomic functions so we cannot
follow up who and where manipulate it. Then, it is hard to find actual
reason of CMA allocation failure. CMA allocation should be guaranteed
to succeed so finding offending place is really important.
In this patch, call sites where page reference is manipulated are
converted to introduced wrapper function. This is preparation step to
add tracepoint to each page reference manipulation function. With this
facility, we can easily find reason of CMA allocation failure. There is
no functional change in this patch.
In addition, this patch also converts reference read sites. It will
help a second step that renames page._count to something else and
prevents later attempt to direct access to it (Suggested by Andrew).
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I added this check in setup_tc to multiple drivers,
if (handle != TC_H_ROOT || tc->type != TC_SETUP_MQPRIO)
Unfortunately restricting to TC_H_ROOT like this breaks the old
instantiation of mqprio to setup a hardware qdisc. This patch
relaxes the test to only check the type to make it equivalent
to the check before I broke it. With this the old instantiation
continues to work.
A good smoke test is to setup mqprio with,
# tc qdisc add dev eth4 root mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 0@0 1@1 2@2 3@3 4@4 5@5 6@6 7@7
Fixes: e4c6734eaa ("net: rework ndo tc op to consume additional qdisc handle paramete")
Reported-by: Singh Krishneil <krishneil.k.singh@intel.com>
Reported-by: Jake Keller <jacob.e.keller@intel.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the original series drivers would get offload requests for cls_u32
rules even if the feature bit is disabled. This meant the driver had
to do a boiler plate check on the feature bit before adding/deleting
the rule.
This patch lifts the check into the core code and removes it from the
driver specific case.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 9d35cf062e ("net: ixgbe: add minimal parser details for ixgbe")
Reported-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I incorrectly used __u32 types where we should be using u32 types when
I added the ixgbe_model.h file.
Fixes: 9d35cf062e ("net: ixgbe: add minimal parser details for ixgbe")
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch ensures ixgbe will not try to offload hash tables from the
u32 module. The device class does not currently support this so until
it is enabled just abort on these tables.
Interestingly the more flexible your hardware is the less code you
need to implement to guard against these cases.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds initial support for offloading the u32 tc classifier. This
initial implementation only implements a few base matches and actions
to illustrate the use of the infrastructure patches.
However it is an interesting subset because it handles the u32 next
hdr logic to correctly map tcp packets from ip headers using the ihl
and protocol fields. After this is accepted we can extend the match
and action fields easily by updating the model header file.
Also only the drop action is supported initially.
Here is a short test script,
#tc qdisc add dev eth4 ingress
#tc filter add dev eth4 parent ffff: protocol ip \
u32 ht 800: order 1 \
match ip dst 15.0.0.1/32 match ip src 15.0.0.2/32 action drop
<-- hardware has dst/src ip match rule installed -->
#tc filter del dev eth4 parent ffff: prio 49152
#tc filter add dev eth4 parent ffff: protocol ip prio 99 \
handle 1: u32 divisor 1
#tc filter add dev eth4 protocol ip parent ffff: prio 99 \
u32 ht 800: order 1 link 1: \
offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff
#tc filter add dev eth4 parent ffff: protocol ip \
u32 ht 1: order 3 match tcp src 23 ffff action drop
<-- hardware has tcp src port rule installed -->
#tc qdisc del dev eth4 parent ffff:
<-- hardware cleaned up -->
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds an ixgbe data structure that is used to determine what
headers:fields can be matched and in what order they are supported.
For hardware devices this can be a bit tricky because typically
only pre-programmed (firmware, ucode, rtl) parse graphs will be
supported and we don't yet have an interface to change these from
the OS. So its sort of a you get whatever your friendly vendor
provides affair at the moment.
In the future we can add the get routines and set routines to
update this data structure. One interesting thing to note here
is the data structure here identifies ethernet, ip, and tcp
fields without having to hardcode them as enumerations or use
other identifiers.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates setup_tc so we can pass additional parameters into
the ndo op in a generic way. To do this we provide structured union
and type flag.
This lets each classifier and qdisc provide its own set of attributes
without having to add new ndo ops or grow the signature of the
callback.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_setup_tc() op was added to support drivers offloading tx
qdiscs however only support for mqprio was ever added. So we
only ever added support for passing the number of traffic classes
to the driver.
This patch generalizes the ndo_setup_tc op so that a handle can
be provided to indicate if the offload is for ingress or egress
or potentially even child qdiscs.
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is an opportunity to bulk free SKBs during reclaiming of
resources after DMA transmit completes in ixgbe_clean_tx_irq. Thus,
bulk freeing at this point does not introduce any added latency.
Simply use napi_consume_skb() which were recently introduced. The
napi_budget parameter is needed by napi_consume_skb() to detect if it
is called from netpoll.
Benchmarking IPv4-forwarding, on CPU i7-4790K @4.2GHz (no turbo boost)
Single CPU/flow numbers: before: 1982144 pps -> after : 2064446 pps
Improvement: +82302 pps, -20 nanosec, +4.1%
(SLUB and GCC version 5.1.1 20150618 (Red Hat 5.1.1-4))
Joint work with Alexander Duyck.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Right now ATR is not handling IPv6 extended headers, so ATR is not
being performed on such packets. Fix that by skipping extended
headers when they are present. This also fixes a problem where
the ATR code was not checking that the inner protocol was actually
TCP before setting up the signature rules. Since the protocol check
is intimately involved with the extended header processing as well,
this all gets fixed together.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When FCoE is enabled with SR-IOV on the X550 NIC the hardware
generates MDD events.
This patch fixes these by setting the expected values in the
Tx context descriptors for FCoE/FIP frames and adding a flush
after writing the RDLEN register.
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Check whether the FCOE support is enabled for the devices to get the
FDMI HBA attributes information instead of checking each device id.
Also, add Model string information for X550.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If an outer UDP checksum is set, pass the skb up with CHECKSUM_NONE
so that the stack will check the checksum. Do not increment an
error counter, because we don't know that there is an actual error.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In ixgbe_get_settings() the link status and speed of the interface
are determined based on a read from the LINKS register via the call
to mac.ops.check.link(). This can cause issues where external drivers
may end up with unknown speed when calling ethtool_get_setings().
Instead of calling the mac.ops.check_link() we can report the speed
from the adapter structure which is populated by the driver.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
PFC is configuration is skipped for X550 devices due to a incorrect
device id check, fixing that to include X550 PFC configuration.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use fcoe_ddp_xid from netdev as this is correctly set for different
device IDs to avoid DDP skip error on X550 as "xid=0x20b out-of-range"
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently credit_refill and credit_max could be zero for a TC and that
is causing Tx hang for CEE mode configuration, so to fix that have at
min credit assigned to a TC and that is as what IEEE mode already does.
Change-ID: If652c133093a21e530f4e9eab09097976f57fb12
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When I had rewritten the code for ixgbe_clear_vf_vlans() it looks like I
had transitioned back and forth between using word as an offset and using
word as a register offset. As a result I honestly don't see how the code
was working before other than the fact that resetting the VLANs on the VF
like didn't do much to clear them.
Another issue found is that the mask was using a divide instead of a
modulus. As a result the mask bit was incorrectly being set to either bit
0 or 1 based on the value of the VF being tested. As a result the wrong
VFs were having their VLANs cleared if they were enabled.
I have updated the code so that word represents the offset in the array.
This way we can use the modulus and xor operations and they will make sense
instead of being performed on a 4 byte aligned value.
I replaced the statement "(word % 2) ^ 1" with "~word % 2" in order to
reduce the line length as the line exceeded 80 characters with the register
name inserted. The two should be equivalent so the change should be safe.
Reported-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X550EM_x revision check needs to check a value, not just a bit.
Use a mask and check the value. Also remove the redundant check
inside the ixgbe_enter_lplu_t_x550em, because it can only be called
when both the mac type and revision check pass.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
X550 allows for up to 64 RSS queues, but the driver can have max
of 63 (-1 MSIX vector for link).
On systems with >= 64 CPUs the driver will set the redirection table
for all 64 queues which will result in packets being dropped.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up minor redundancy in the setting of hw_enc_features that
makes it appears that X550 uniquely has more encapsulation features
than other devices. The driver only supports one more feature, so
make it look that way. No longer set NETIF_F_SG since that is set
by the register_netdev call. Thanks to Alex Duyck for noticing this
slight confusion.
Reported-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ethtool reports backplane type interfaces as 1000/10000baseT link modes.
This has been corrected to report the media as KR, KX or KX4 based on the
backplane interface present.
Signed-off-by: Veola Nazareth <veola.nazareth@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add missing QSFP PHY types to allow for more accurate reporting of
port settings.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Conflicts:
drivers/net/geneve.c
Here we had an overlapping change, where in 'net' the extraneous stats
bump was being removed whilst in 'net-next' the final argument to
udp_tunnel6_xmit_skb() was being changed.
Signed-off-by: David S. Miller <davem@davemloft.net>
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
set of features for offloading all checksums. This is a mask of the
checksum offload related features bits. It is incorrect to set both
NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6 at the same time for
features of a device.
This patch:
- Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
NETIF_F_ALL_CSUM is being used as a mask).
- Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
use NEITF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SCTP checksum is really a CRC and is very different from the
standards 1's complement checksum that serves as the checksum
for IP protocols. This offload interface is also very different.
Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
differences. The term CSUM should be reserved in the stack to refer
to the standard 1's complement IP checksum.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some X550 devices can connect at 2.5Gbps during fail-over, but only
with certain link partners. Also setting the advertised speed will
not work so we do not report it as supported to avoid confusion.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch guarantees that the VFs do not have access to VLANs that they
were not supposed to. What this patch does is add code so that we delete
the previous port VLAN after adding a new one, and if we reset the VF we
clear all of the filters associated with it.
Previously the code was leaving all previous VLANs mapped to the VF and
they didn't get deleted unless the VF specifically requested it or if the
PF itself was reset.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch makes certain that we clear the pool mappings added when we
configure default MAC addresses for the interface. Without this we run the
risk of leaking an address into pool 0 which really belongs to VF 0 when
SR-IOV is enabled.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is a follow-on for enabling VLAN promiscuous and allowing the PF
to add VLANs without adding a VLVF entry. What this patch does is go
through and free the VLVF registers if they are not needed as the VLAN
belongs only to the PF which is the default pool.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for VLAN promiscuous with SR-IOV enabled.
The code prior to this patch was only adding the PF to VLANs that the VF
had added. As such enabling promiscuous mode would actually not add any
additional VLAN filters so visibility was limited. This lead to a number
of issues as the bridge and OVS would expect us to accept all VLAN tagged
packets when promiscuous mode was enabled, and instead we would filter out
most if not all depending on the configuration of the PF.
With this patch what we do is set all the bits in the VFTA and all of the
VLVF bits associated with the pool belonging to the PF. By doing this the
PF is guaranteed to receive all VLAN tagged traffic associated with the RAR
filters assigned to the PF. In addition we will clean up those same bits
in the event of promiscuous mode being disabled.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is meant to reduce the complexity of the search function used
for finding a VLVF entry associated with a given VLAN ID. The previous
code was searching from bottom to top. I reordered it to search from top
to bottom. In addition I pulled an AND statement out of the loop and
instead replaced it with an OR statement outside the loop. This should
help to reduce the overall size and complexity of the function.
There was also some formatting I cleaned up in regards to whitespace and
such.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for bypassing the VLVF entry creation when the PF
is adding a new VLAN. The advantage to doing this is that we can then save
the VLVF entries for the VFs which must have them in order to function,
versus the PF which can fall back on the default pool entry.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch addresses several issues within the VLVF and VLVFB
configuration
First was the fact that code was overly complicated with multiple
conditional paths depending on if we adding or removing and which bit we
were going to add or remove. Instead of messing with all that I have
simplified it by using (vid / 32) and (1 - vid / 32) to identify our
register and the other vlvfb register.
Second was the fact that we were likely leaking a few packets into the PF
in cases where we were deleting an entry and the VFTA filter for that entry
as the ordering was such that we deleted the pool and then the VLAN filter
instead of the other way around. I have updated that by adding a check for
no bits being set and if that occurs we clear things up in the proper
order.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In order to clear the way for upcoming work I thought it best to drop the
level of indent in the ixgbe_set_vfta_generic function. Most of the code
is held in the virtualization specific section. So the easiest approach is
to just add a jump label and jump past the bulk of the code if it is not
enabled.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch simplifies the logic for setting the VFTA register by removing
the number of conditional checks needed. Instead we just use some boolean
logic to generate vfta_delta, and if that is set then we xor the vfta by
that value and write it back.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code for checking the PF bit in ixgbe_set_vf_vlan_msg was using the
wrong offset and as a result it was pulling the VLAN off of the PF even if
there were VFs numbered greater than 40 that still had the VLAN enabled.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add a check to make certain mac_table was actually allocated and is not
NULL. If it is NULL return -ENOMEM and allow the probe routine to fail
rather then causing a NULL pointer dereference further down the line.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Enabling SR-IOV and then bringing the interface up was resulting in the PF
MAC addresses getting into a bad state. Specifically the MAC address was
enabled for both VF 0 and the PF. This resulted in some odd behaviors such
as VF 0 receiving a copy of the PFs traffic, which in turn enables the
ability for VF 0 to spoof the PF.
A workaround for this issue appears to be to bring up the interface first
and then enable SR-IOV as this way the reset is then triggered in the
existing code.
In order to correct this I have added a change to ixgbe_setup_tc where if
the interface is down we still will at least call ixgbe_reset so that the
MAC addresses for the device are reset to the correct pools.
Steps to reproduce issue:
modprobe ixgbe
echo 7 > /sys/bus/pci/devices/0000\:01\:00.1/sriov_numvfs
ifconfig enp1s0f1 up
ethregs -s 1:00.1 | grep MPSAR | grep -v 00000000
Result:
MPSAR[0] 00000081
MPSAR[254] 00000001
Expected Result, behavior after patch:
MPSAR[0] 00000080
MPSAR[254] 00000080
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of inhibiting PHY power control when manageability is
present, only inhibit turning PHY power off when manageability
is present. Consequently, PHY power will always be turned on when
requested. Without this patch, some systems with X540 or X550
devices in some conditions will never get link.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for recognizing problems with the first version
of this patch and recognizing how to coalesce error conditions
into a single location.
Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Save VF device pointers and take references to speed accesses used
to monitor the device behavior to avoid slot resets. The saved
information avoids lock contention during the search used to access
each of the VFs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
According to the datasheets, the driver should wait for the master
disable bit to read as being set before checking the status
register for master disable.
Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ixgbe driver was violating the specification in the datasheet
by not waiting 1ms before checking for the reset bit clearing. This
is called out for devices supported by ixgbe, so implement the
required delay.
Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X550EM_x devices handle clocking differently, so update the
PTP implementation to accommodate them. This involves significant
changes to ixgbe's PTP code to accommodate the new range of
behaviors including things like non-power-of-2 clock wrapping.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we allow the PF to make use of all free RAR
entries for FDB use if needed.
Previously the code limited us to 16 unicast entries, however this was
shared between MACVLAN which wasn't limited and the FDB code which was. So
instead of treating the FDB code as a second class citizen I have updated
it so that it has access to just as many entries as the MACVLAN filters.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change replaces the ixgbe_write_uc_addr_list call in ixgbe_set_rx_mode
with a call to __dev_uc_sync instead. This works much better with the MAC
addr list code that was already in place and solves an issue in which you
couldn't remove an FDB address without having to reset the port.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the process of tracking down a memory leak when adding/removing FDB
entries I had to go through the MAC address configuration code for ixgbe.
In the process of doing so I found a number of issues that impacted
readability and performance. This change updates the code in general to
clean it up so it becomes clear what each step is doing. From what I can
tell there a couple of bugs cleaned up in this code.
First is the fact that the MAC addresses were being double counted for the
PF. As a result once entries up to 63 had been used you could no longer
add additional filters.
A simple test case for this:
for i in `seq 0 96`
do
ip link add link ens8 name mv$i type macvlan
ip link set dev mv$i up
done
Test script:
ethregs -s 0:8.0 | grep -e "RAH" | grep 8000....$
When things are working correctly RAL/H registers 1 - 97 will be consumed.
In the failing case it will stop at 63 and prevent any further filters from
being added.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFS repeatedly.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The newer copper PHY implementation used with newer X550EM_x
devices uses a different thermal alarm type than the earlier
one. Make changes to support both types.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes KR PHY reset from ixgbe_init_phy_ops_x550em,
since this function is meant to initialize function pointers for
the detected PHY type. Internal PHY reset was moved to
ixgbe_setup_internal_phy_t_x550em which will now detect which
mode the internal PHY operates in and set it up as required.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Testing has now shown that the diagnostic code used with the CS4227
is no longer needed, so remove it.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ixgbe_intr and ixgbe/ixgbevf_msix_clean_rings functions run from hard
interrupt context or with interrupts already disabled in netpoll.
They can use napi_schedule_irqoff() instead of napi_schedule()
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
KR auto-neg mode is what we will be using going forward. The SW
interface for this mode is different that what was used for iXFI.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch corrects an issue in which the polling routine would increase
the budget for Rx to at least 1 per queue if multiple queues were present.
This would result in Rx packets being processed when the budget was 0 which
is meant to indicate that no Rx can be handled.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The commit dfaf891dd3 ("ixgbe: Refactor the RSS configuration code")
introduced a few kernel-doc errors:
1) The function name is missing;
2) The format is wrong;
3) The short description is redundant.
Fix all the above for the correct execution of the kernel doc.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Delete a redundant include of net/vxlan.h.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove unneeded NULL test.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@ expression x; @@
-if (x != NULL)
\(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
NAPI drivers no longer need to observe a particular protocol
to benefit from busy polling (CONFIG_NET_RX_BUSY_POLL=y)
napi_hash_add() and napi_hash_del() are automatically called
from core networking stack, respectively from
netif_napi_add() and netif_napi_del()
This patch depends on free_netdev() and netif_napi_del() being
called from process context, which seems to be the norm.
Drivers might still prefer to call napi_hash_del() on their
own, since they might combine all the rcu grace periods into
a single one, knowing their NAPI structures lifetime, while
core networking stack has no idea of a possible combining.
Once this patch proves to not bring serious regressions,
we will cleanup drivers to either remove napi_hash_del()
or provide appropriate rcu grace periods combining.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We would like to automatically provide busy polling support
to all NAPI drivers, without them having to implement anything.
skb_mark_napi_id() can be called from napi_gro_receive() and
napi_get_frags().
Few drivers are still calling skb_mark_napi_id() because
they use netif_receive_skb(). They should eventually call
napi_gro_receive() instead. I will leave this to drivers
maintainers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The limitation of the number of multicast address for VF is not enough
for the large scale server with SR-IOV feature. IPv6 requires the multicast
MAC address for each IP address to handle the Neighbor Solicitation
message. We couldn't assign over 30 IPv6 addresses to a single VF.
This patch introduces the new mailbox API, IXGBE_VF_UPDATE_XCAST_MODE,
to update multicast mode of VF. This adds 3 modes;
- NONE only L2 exact match addresses or Flow Director enabled
- MULTI BAM and ROMPE set
- ALLMULTI BAM, ROMPE and MPE set
If a guest VF user wants over 30 MAC multicast addresses, set IFF_ALLMULTI
to request PF to update xcast mode to enable VF multicast promiscuous mode.
On the other hand, enabling VF multicast promiscuous mode may affect
security and performance in the network of the NIC. Only trusted VF can
enable multicast promiscuous mode. The behavior of untrusted VF is the
same as previous version.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Implements the new netdev op to trust VF in ixgbe.
The administrator can turn on and off VF trusted by ip command which
supports trust message.
# ip link set dev eth0 vf 1 trust on
or
# ip link set dev eth0 vf 1 trust off
Send a ping to reset VF on changing the status of trusting.
VF driver will reconfigure its features on reset.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
As per Eric Dumazet's previous patches:
(see commit (24d2e4a507) - tg3: use napi_complete_done())
Quoting verbatim:
Using napi_complete_done() instead of napi_complete() allows
us to use /sys/class/net/ethX/gro_flush_timeout
GRO layer can aggregate more packets if the flush is delayed a bit,
without having to set too big coalescing parameters that impact
latencies.
</end quote>
Tested
configuration: low latency via ethtool -C ethx adaptive-rx off
rx-usecs 10 adaptive-tx off tx-usecs 15
workload: streaming rx using netperf TCP_MAERTS
igb:
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
...
Interim result: 941.48 10^6bits/s over 1.000 seconds ending at 1440193171.589
Alignment Offset Bytes Bytes Recvs Bytes Sends
Local Remote Local Remote Xfered Per Per
Recv Send Recv Send Recv (avg) Send (avg)
8 8 0 0 1176930056 1475.36 797726 16384.00 71905
MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.1 () port 0 AF_INET : demo
...
Interim result: 941.49 10^6bits/s over 0.997 seconds ending at 1440193142.763
Alignment Offset Bytes Bytes Recvs Bytes Sends
Local Remote Local Remote Xfered Per Per
Recv Send Recv Send Recv (avg) Send (avg)
8 8 0 0 1175182320 50476.00 23282 16384.00 71816
i40e:
Hard to test because the traffic is incoming so fast (24Gb/s) that GRO
always receives 87kB, even at the highest interrupt rate.
Other drivers were only compile tested.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Many drivers initialize uselessly n_priv_flags, n_stats, testinfo_len,
eedump_len & regdump_len fields in their .get_drvinfo() ethtool op.
It's not necessary as these fields is filled in ethtool_get_drvinfo().
v2: removed unused variable
v3: removed another unused variable
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only call the internal_setup_link method when it is provided. This
check is required for newer version parts.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If the reset never completes, it is necessary to retake the
semaphore before returning, because the caller will release
the semaphore.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch disables LRO by default in favor of GRO.
LRO is incompatible with forwarding and is disabled when forwarding
is turned on which makes the default offloads of the driver
inconsistent. LRO can still be enabled via ethtool.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch makes sure that flow control packets initiated by the VF are
dropped and reported as spoofed.
Flow control packets can be used to limit the throughput or as DOS
attack when generated from a VF. Flow control is not supported per VF
hence any pause frames generated from a VF are considered malicious.
Also cleaned up indentation and some redundant comments.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
With the addition of X550em_x SFP+ support, the driver is now
functionally equivalent to what will be the 4.2.1 driver when
released, so change the version to match.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X540 thermal interrupt (IXGBE_EIMS_TS) is not an SDP, so it
doesn't need to be enabled in ixgbe_setup_gpie(). In fact the
value is simply not for the GPIE register at all.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The 82599 and X540 datasheets require that FCRTH be "set" for Tx
switching (VM-to-VM loopback) but it did not previously specify what
the value should be set to. It has now been determined that
the correct value is RXPBSIZE - (24*1024).
This setting is also required for later devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
A logic error here results in the adapter_stopped flag only being
cleared when ixgbe_setup_fc returns an error. Correct the logic.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change does two things. First, it makes it so that we always
set the relaxed ordering bits related to the DCA registers even if
DCA is not enabled. Second, it moves the configuration out of the
ixgbe_down function and into the ixgbe_configure function before
enabling the Rx and Tx rings. This ensures that DCA is configured
correctly before starting to process packets.
Thanks to Alex Duyck for this fix.
CC: Alex Duyck <aduyck@mirantis.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add new device ID for X550EM device with SFPs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch skips the PCI transactions pending check in
ixgbe_disable_pcie_master. This is done to addresses a known HW
issue where the PCI transactions pending bit sticks high when there
are pending transactions. HW engineering instructed to workaround
this issue by wait and then continue with our reset flow.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch sets RDRXCTL.PSP when the driver is in SRIOV mode which
enables padding of small packets.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Setting the X550* RDRXCTL register should fall through into X540
and 82599, not 82598.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The timeout path is supposed to release the semaphore, so do that.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Take control of an I2C mux that selects which SFP is attached to
the I2C bus. The control of the mux is captured in the taking and
releasing of the related semaphore. Because only port 1 can control
the mux, port 1 always leaves the mux set to select port 0.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the frequency of polling for SFP modules. Because the
service task sometimes runs at high rates, we can poll for
SFPs too often. When an SFP is not present, the I2C timeouts
that result are very costly. So, prevent SFP polling from
being done more than once every two seconds. To reduce latency,
the poll time is cleared in a couple of cases to permit the
next service task execution to poll the SFP module.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since SFP+ can be used with some X550 devices, permit them to be
detected.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On some hardware platforms, the CS4227 does not initialize properly.
Detect those cases and reset it appropriately.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Configures the CS4227 correctly for both 1G and 10G operation,
by moving the code to ixgbe_setup_mac_link_sfp_x550em(). It
needs to be in this function because we need both the module
type and the speed, and this is the only function in the init
flow that knows the speed. In contrast,
ixgbe_setup_sfp_modules_X550em() does not know the speed, so we
can't do anything useful here. This is a fundamental difference
from the previous flow, and is due to the way the CS4227 is
implemented.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds X550EM_x SFP+ dual-speed support. 82599 fiber link
code was moved from ixgbe_82599.c to ixgbe_common.c for use by
X550EM. SFP MAC link code is added to x550EM.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Reduce the number of retries during PHY detection. This reduces
pauses when no SFP is present. Once an SFP is detected, the normal
retry count will be used.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clear the destination location for I2C data initially so that
the received data will not be affected by previous attempts.
This could have returned wrong data in certain retry sequences.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Set the bit banging mode in the hardware when performing bit banging
I2C operations on X550. Also control the output enable on both the
clock and data lines.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The lan_id is being set after a previous I2C eeprom access which
makes no sense because it needs to be set before any access. Move
the setting to before the access.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Most I2C accesses take and release semaphores for each access. Now
there is a reason to perform multiple I2C operations under the same
holding of the semaphore, so provide unlocked I2C methods for that
purpose.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Provide I2C combined operations on X550EM, not X550 devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for the SFP insertion interrupt on X550EM devices with
SFPs.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When an SFP not present error is returned by the reset_hw method,
accept it and go on, since an SFP can still be inserted. Previously
it was only accepted for 82598 devices.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
X550 has HW support for SCTP flow director filters SCTP mask. This
patch adds it like we do for UDP and TCP.
Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is part of the future enablement of X550 SFP+ support. This
HW uses different SDP so the interrupts need to be set up accordingly.
Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch updates the lowest limit for adaptive interrupt interrupt
moderation to roughly 12K interrupts per second.
The way I came about reaching 12K as the desired interrupt rate is by
testing with UDP flows. Specifically I had a simple test that ran a
netperf UDP_STREAM test at varying sizes. What I found was as the packet
sizes increased the performance fell steadily behind until we were only
able to receive at ~4Gb/s with a message size of 65507. A bit of digging
found that we were dropping packets for the socket in the network stack,
and looking at things further what I found was I could solve it by increasing
the interrupt rate, or increasing the rmem_default/rmem_max. What I found was
that when the interrupt coalescing resulted in more data being processed
per interrupt than could be stored in the socket buffer we started losing
packets and the performance dropped. So I reached 12K based on the
following math.
rmem_default = 212992
skb->truesize = 2994
212992 / 2994 = 71.14 packets to fill the buffer
packet rate at 1514 packet size is 812744pps
71.14 / 812744 = 87.9us to fill socket buffer
From there it was just a matter of choosing the interrupt rate and
providing a bit of wiggle room which is why I decided to go with 12K
interrupts per second as that uses a value of 84us.
The data below is based on VM to VM over a direct assigned ixgbe interface.
The test run was:
netperf -H <ip> -t UDP_STREAM"
Socket Message Elapsed Messages CPU Service
Size Size Time Okay Errors Throughput Util Demand
bytes bytes secs # # 10^6bits/sec % SS us/KB
Before:
212992 65507 60.00 1100662 0 9613.4 10.89 0.557
212992 60.00 473474 4135.4 11.27 0.576
After:
212992 65507 60.00 1100413 0 9611.2 10.73 0.549
212992 60.00 974132 8508.3 11.69 0.598
Using bare metal the data is similar but not as dramatic as the throughput
increases from about 8.5Gb/s to 9.5Gb/s.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the .remove() callback for a PF is called, SR-IOV support for the
device is disabled, which requires unbinding and removing the VFs.
The VFs may be in-use either by the host kernel or userspace, such as
assigned to a VM through vfio-pci. In this latter case, the VFs may
be removed either by shutting down the VM or hot-unplugging the
devices from the VM. Unfortunately in the case of a Windows 2012 R2
guest, hot-unplug is broken due to the ordering of the PF driver
teardown. Disabling SR-IOV prior to unregister_netdev() avoids this
issue.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add checks for systems that don't have SFP's to avoid incorrectly
acting on interrupts that are falsely interpreted as SFP events.
This also includes a modified check generating the EICR mask to be
more forward-looking.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Resolve warnings resulting from redundant initialization of the
get_bus_info field in the mac_ops_X550* structures.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When unbinding an SR-IOV device with VFs configured from ixgbe, the
driver behaves in one of two ways. If max_vfs was specified, the
SR-IOV state is disabled, removing the VFs. The occurs regardless of
whether the VF count was later modified through sysfs. If however
max_vfs is zero, such as by not specifying the module parameter, the
VFs persist after the PF is unbound from ixgbe. If the PF is then
bound to vfio-pci to be assigned to a VM, the PF is non-functional.
>From the comment, commit da36b64736 ("ixgbe: Implement PCI SR-IOV
sysfs callback operation") clearly intended this alternate behavior,
but probably didn't realize the PF doesn't work in this mode.
This bimodal behavior is confusing to users and results in a state
where the PF is broken for other uses unless the user sets
sriov_numvfs to zero prior to unbinding the device. Remove this
behavior so that VFs are removed and the PF is functional for other
uses after unbind, regardless of the way VFs are enabled.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Now that we can do 2.5G link speed, we need to be able to report it.
Also change the nested triadic involved in creating the log message
to instead use a simpler switch statement to set a string pointer.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>