Use RFS infrastructure and flow steering in HW to keep CPU
affinity of rx interrupts and application per TCP stream.
A flow steering filter is added to the HW whenever the RFS
ndo callback is invoked by core networking code.
Because the invocation takes place in interrupt context, the
actual setup of HW is done using workqueue. Whenever new filter
is added, the driver checks for expiry of existing filters.
Since there's window in time between the point where the core
RFS code invoked the ndo callback, to the point where the HW
is configured from the workqueue context, the 2nd, 3rd etc
packets from that stream will cause the net core to invoke
the callback again and again.
To prevent inefficient/double configuration of the HW, the filters
are kept in a database which is indexed using hash function to enable
fast access.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable callers of mlx4_assign_eq to supply a pointer to cpu_rmap.
If supplied, the assigned IRQ is tracked using rmap infrastructure.
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define this macro is one common place instead of duplicating it over the code
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change is just meant to defragment the flags as there are several hole
that have been introduced since several features, or the flags for them,
have been removed.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
All of our hardware supports RSS even if it is only for a single queue. So
instead of toting around the RSS enable flag I am updating the code so that
all devices are enabled and if we want to disable RSS it is indicated via
the RSS mask.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change essentially makes it so that we can enable almost all of the
features all at once. This patch allows for the combination of SR-IOV,
DCB, and FCoE in the case of the x540. It also beefs up the SR-IOV by
adding support for RSS to the PF.
The testing matrix gets to be very complex for this patch as there are a
number of different features and subsets for queueing options. I tried to
narrow these down a bit by restricting the PF to only supporting 4TC DCB
when it is enabled in addition to SR-IOV.
Cc: Greg Rose <gregory.v.rose@intel.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change allows all pools from the default pool forward to be enabled vi
ixgbe_configure_virtualization. This is needed as we are planning to use
queues belonging to adjacent pools for FCoE when SR-IOV and FCoE are both
enabled.
In addition this patch contains some minor formatting changes as there were
a few spots that seemed to be in need of some cleanup.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In ixgbevf_get_ringparam we could run into a NULL pointer dereference
if the rings were not allocated when we attempted the call. To prevent
that we can just access the tx/rx_ring_count values instead of attempting
to access the rings to get the count.
This change corrects a memory leak and memory corruption in
ixgbevf_set_ringparam.
The memory leak was due to us not freeing the resources from the ring
before overwriting them. This change corrects the memory leak by making
certain to call ixgbe_free_tx/rx_resources on the rings prior to freeing
them.
The memory corruption was because we were replacing the rings but not
updating the q_vectors. It addresses the memory corruption by leaving the
rings in place and instead just copying the contents of the new rings into
the existing rings.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is a good bit of redundancy between the Tx checksum and segmentation
offloads. In order to reduce some of this I am moving the code for
creating a context descriptor into a separate function.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change adds the netdev to the ring structure. This allows for a
quicker transition from ring to netdev without having to go from ring to
adapter to netdev.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver is going back one step from its' previous location before
bumping tail. This is incorrect. We should just be writing the value of
next_to_use into the tail register.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We have had an issue when using ixgbe+ixgbevf and 802.1 VLAN tagging.
When attaching a VLAN to a VF, frames with a 802.1q priority appeared
untagged on the VF hence not reaching the VLAN, where frames with
priority 0 where tagged as expected and seen by the VLAN device.
This seems due to the way ixgbevf is looking up the full tag
(prio+cfi+vlan) against the adapter active_vlans, as a condition to mark
the skb tagged.
Signed-off-by: Pascal Bouchareine <pascal@gandi.net>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cloned patch of Eric Dumazet for bonding.
Some workloads greatly benefit of IFF_XMIT_DST_RELEASE capability
on output net device, avoiding dirtying dst refcount.
team currently disables IFF_XMIT_DST_RELEASE unconditionally.
If all ports have the IFF_XMIT_DST_RELEASE bit set, then
team dev can also have it in its priv_flags.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enabling runtime PM support for davinci mdio driver
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the netpoll function to support netconsole. Tested and works
fine on my "JMC250 PCI Express Gigabit Ethernet Controller" (PCI ID 0250).
Signed-off-by: Peter Wu <lekensteyn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
add OF support for the davinci_emac driver.
Signed-off-by: Heiko Schocher <hs@denx.de>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Cc: netdev@vger.kernel.org
Cc: davinci-linux-open-source@linux.davincidsp.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: devicetree-discuss@lists.ozlabs.org
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Wolfgang Denk <wd@denx.de>
Cc: Anatoly Sivov <mm05@mail.ru>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The condition is always true so WOL will never work.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers should pull only ethernet header from page frag
to skb->head.
Pulling 64 bytes is too much for TCP (without options) on IPv4.
However, it makes sense to pull all the frame if it fits the
128 bytes bloc allocated for skb->head, to free one page per
small incoming frame.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Po-Yu Chuang <ratbert@faraday-tech.com>
Acked-by: Yan-Pai Chen <yanpai.chen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sold by O2 (telefonica germany) under the name "LTE4G"
Tested-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return -ENOTSUPP if the initialization fails because the
device is configured for a mode that is not supported by the driver.
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some workloads greatly benefit of IFF_XMIT_DST_RELEASE capability
on output net device, avoiding dirtying dst refcount.
bonding currently disables IFF_XMIT_DST_RELEASE unconditionally.
If all slaves have the IFF_XMIT_DST_RELEASE bit set, then
bonding master can also have it in its priv_flags
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The usbnet API use the device ID table to store a pointer to
a minidriver. Setting a generic pointer for dynamic device
IDs will in most cases make them work as expected. usbnet
will otherwise treat the dynamic IDs as blacklisted. That is
rarely useful.
There is no standard class describing devices supported by
this driver, and most vendors don't even provide enough
information to allow vendor specific wildcard matching. The
result is that most of the supported devices must be
explicitly listed in the device table. Allowing dynamic IDs
to work both simplifies testing and verification of new
devices, and provides a way for end users to use a device
before the ID is added to the driver.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for 64 bit stats to Broadcom b44 ethernet driver.
Signed-off-by: Kevin Groeneveld <kgroeneveld@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ability of driver to transmit packets depends on logical state
of the link. Ignore physical link status.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lancer FW has added new capability checks for VFs.
Driver should only use those capabilities which are allowed for VFs.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jerr Kirsher says:
====================
This series contains updates to ixgbe & ixgbevf.
...
Alexander Duyck (6):
ixgbe: Ping the VFs on link status change to trigger link change
ixgbe: Handle failures in the ixgbe_setup_rx/tx_resources calls
ixgbe: Move configuration of set_real_num_rx/tx_queues into open
ixgbe: Update the logic for ixgbe_cache_ring_dcb and DCB RSS
configuration
ixgbe: Cleanup logic for MRQC and MTQC configuration
ixgbevf: Update descriptor macros to accept pointers and drop _ADV
suffix
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Francois Romieu says:
====================
Francois Romieu (1):
r8169: verbose error message.
Hayes Wang (1):
r8169: remove rtl_ocpdr_cond.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings says:
====================
1. Fix potential badness when running a self-test with SR-IOV enabled.
2. Fix calculation of some interface statistics that could run backward.
3. Miscellaneous cleanup.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This change updates the descriptor macros to accept pointers, updates the
name to drop the _ADV suffix, and include the IXGBEVF name in the macro.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to make the code much more readable for MTQC and MRQC
configuration.
The big change is that I simplified much of the logic so that we are
essentially handling just 4 cases and their variants. In the cases where
RSS is disabled we are actually just programming the RETA table with all
1s resulting in a single queue RSS. In the case of SR-IOV I am treating
that as a subset of VMDq. This all results int he following configuration
for the hardware:
DCB
En Dis
VMDq En VMDQ/DCB VMDq/RSS
Dis DCB/RSS RSS
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change cleans up some of the logic in an attempt to try and simplify
things for how we are configuring DCB w/ RSS.
In this patch I basically did 3 things. I updated the logic for getting
the first register index. I applied the fact that all TCs get the same
number of queues to simplify the looping logic in caching the DCB ring
register. Finally I updated how we configure the RQTC register to match
the fact that all TCs are assigned the same number of queues.
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It makes much more sense for us to configure the real number of Tx and Rx
queues in the ixgbe_open call than it does in ixgbe_set_num_queues. By
setting the number in ixgbe_open we can avoid a number of unecessary
updates and only have to make the calls once.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Previously we were exiting without cleaning up the memory internally on the
ixgbe_setup_rx_resources and ixgbe_setup_tx_resources calls. Instead of
forcing the caller to clean things up for us we should instead just unwind
the rings and free the memory as we go. This way we can more gracefully
clean up the rings in the event of an allocation failure.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the link status changes on the PF we need to notify the VFs. In order
to do this we should ping all of the VFs in order to trigger a link status
change on them as well.
This fixes issues in which the PF would reset, but the VF didn't because the
NAK flag was not set in the VF mailbox.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It is not needed for mac_ocp_{write / read}. Actually bit 31 of OCPDR
does not change and r8168_mac_ocp_read always returns ~0.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Tested-by: Francois Romieu <romieu@fr.zoreil.com>
It's done in very similar way this is done in bonding and bridge.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some interface statistics are computed in such a way that they can
sometimes decrease (and even underflow). Since the computed value
will never be greater than the true value, we fix this by only storing
the computed value when it increases.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Currently VF queues and drivers may remain active during this test.
This could cause memory corruption or spurious test failures.
Therefore we reset the port/function before running these tests on
Siena.
On Falcon this doesn't work: we have to do some additional
initialisation before some blocks will work again. So refactor the
reset/register-test sequence into an efx_nic_type method so
efx_selftest() doesn't have to consider such quirks.
In the process, fix another minor bug: Siena does not have an
'invisible' reset and the self-test currently fails to push the PHY
configuration after resetting. Passing RESET_TYPE_ALL to
efx_reset_{down,up}() fixes this.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Fix CID 113952 in Coverity report on Linux.
This is the one instance where we don't, and shouldn't, check the
return code from efx_mcdi_rpc(). It wasn't immediately obvious to me
why we didn't, so I think an explanation is in order.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Fix CID 102619 in the Coverity report on Linux.
efx_end_loopback() iterates over an array of skb pointers of which
some may be null (if efx_begin_loopback() failed). It should not use
dev_kfree_skb_irq(), which requires non-null pointers. In practice
this is safe because it does not run in interrupt context and
therefore always ends up calling dev_kfree_skb(), which does allow
null pointers. But we should make that explicit.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Fix CID 113703 in the Coverity report on Linux.
ethtool stats names are limited to 32 bytes including a null
terminator. Use strlcpy() to ensure that we will always include the
null terminator even if a source string becomes longer than this.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
There is nothing in the VLAN driver or core VLAN support that
invalidates the TCP and IP header offsets.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
tso_state::packet_space is always set in tso_start_packet(); the
value set in tso_start() is not used, and is also incorrect.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
With some gcc versions & optimisations, the compiler will warn that
'depth' in efx_filter_insert_filter() may be used without being
initialised, although this is not the case.
This is related to inlining of efx_filter_search(), which only has
one caller since commit 8db182f4a8
('sfc: Remove now-unused filter function').
Shut the compiler up by initialising it to 0.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
The interrupt registers accessed in ixgbevf are more similar to the igb
style registers than they are to the ixgbe style registers. As such we
would be better off setting up the code for the EICS, EIMS, EICS, EIAM, and
EIAC like we do in igb instead of ixgbe.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Sibai Li <sibai.li@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>