Commit Graph

574299 Commits

Author SHA1 Message Date
Zhang Shengju
7009212b15 macvlan: convert to use IFF_NO_QUEUE
Use IFF_NO_QUEUE to indicate that a device can run without a qdisc.

Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 15:19:44 -05:00
David S. Miller
f551588595 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2016-02-16

This series contains updates to i40e/i40evf only.

Shannon adds flags to MAC allocation requests to signify that the MAC VLAN
filters should come from the shared resource pool.  Added a new "set switch
config" admin queue command and the new Cisco VXLAN-GPE cloud tunnel type
for the admin queue commands.  Added more detail to the NVM update debug
message in order to see the full ethtool request data.  Also added a few
more bits of netdev data into the debugfs output for dump VSI.

Pandi fixes the width of two datatypes which were being declared a different
size from what they are assigned.

Anjali fixes an issue where we were not doing write-back on interrupt
throttle for legacy case in x722.

Mitch adds a counter for ARQ overflows since sometimes an ever-growing
number indicates that something bad is happening.  Also added 20G speed for
Tx bandwidth calculations.

Jesse refactors the DCB function based on a community suggestion to change
the multi-level if statement into a switch statement.  Cleans up VF device
IDs in the PF, since it does not need to know them.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:48:50 -05:00
David S. Miller
86f447783f Merge branch 'tc-hw-offload'
John Fastabend says:

====================
tc offload for cls_u32 on ixgbe

This extends the setup_tc framework so it can support more than just
the mqprio offload and push other classifiers and qdiscs into the
hardware. The series here targets the u32 classifier and ixgbe
driver. I worked out the u32 classifier because it is protocol
oblivious and aligns with multiple hardware devices I have access
to. I did an initial implementation on ixgbe because (a) I have one
in my box (b) its a stable driver and (c) it is relatively simple
compared to the other devices I have here but still has enough
flexibility to exercise the features of cls_u32.

I intentionally limited the scope of this series to the basic
feature set. Specifically this uses a 'big hammer' feature bit
to do the offload or not. If the bit is set you get offloaded rules
if it is not then rules will not be offloaded. If we can agree on
this patch series there are some more patches on my queue we can
talk about to make the offload decision per rule using flags similar
to how we do l2 mac updates. Additionally the error strategy can
be improved to be hard aborting, log and continue, etc. I think
these are nice to have improvements but shouldn't block this series.

Also by adding get_parse_graph and set_parse_graph attributes as
in my previous flow_api work we can build programmable devices
and programmatically learn when rules can or can not be loaded
into the hardware. Again future work.

... v3 includes a couple style fixups suggested by Jiri and a
quick fix to ixgbe to check features flag and not dev_features
flag the latter being always set.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:37 -05:00
John Fastabend
db956ae882 net: ixgbe: abort with cls u32 divisor groups greater than 1
This patch ensures ixgbe will not try to offload hash tables from the
u32 module. The device class does not currently support this so until
it is enabled just abort on these tables.

Interestingly the more flexible your hardware is the less code you
need to implement to guard against these cases.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:37 -05:00
John Fastabend
b82b17d929 net: ixgbe: add support for tc_u32 offload
This adds initial support for offloading the u32 tc classifier. This
initial implementation only implements a few base matches and actions
to illustrate the use of the infrastructure patches.

However it is an interesting subset because it handles the u32 next
hdr logic to correctly map tcp packets from ip headers using the ihl
and protocol fields. After this is accepted we can extend the match
and action fields easily by updating the model header file.

Also only the drop action is supported initially.

Here is a short test script,

 #tc qdisc add dev eth4 ingress
 #tc filter add dev eth4 parent ffff: protocol ip \
	u32 ht 800: order 1 \
	match ip dst 15.0.0.1/32 match ip src 15.0.0.2/32 action drop

<-- hardware has dst/src ip match rule installed -->

 #tc filter del dev eth4 parent ffff: prio 49152
 #tc filter add dev eth4 parent ffff: protocol ip prio 99 \
	handle 1: u32 divisor 1
 #tc filter add dev eth4 protocol ip parent ffff: prio 99 \
	u32 ht 800: order 1 link 1: \
	offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff
 #tc filter add dev eth4 parent ffff: protocol ip \
	u32 ht 1: order 3 match tcp src 23 ffff action drop

<-- hardware has tcp src port rule installed -->

 #tc qdisc del dev eth4 parent ffff:

<-- hardware cleaned up -->

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:36 -05:00
John Fastabend
9d35cf062e net: ixgbe: add minimal parser details for ixgbe
This adds an ixgbe data structure that is used to determine what
headers:fields can be matched and in what order they are supported.

For hardware devices this can be a bit tricky because typically
only pre-programmed (firmware, ucode, rtl) parse graphs will be
supported and we don't yet have an interface to change these from
the OS. So its sort of a you get whatever your friendly vendor
provides affair at the moment.

In the future we can add the get routines and set routines to
update this data structure. One interesting thing to note here
is the data structure here identifies ethernet, ip, and tcp
fields without having to hardcode them as enumerations or use
other identifiers.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:36 -05:00
John Fastabend
3b01cf56da net: tc: helper functions to query action types
This is a helper function drivers can use to learn if the
action type is a drop action.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:36 -05:00
John Fastabend
1c78c64e9c net: add tc offload feature flag
Its useful to turn off the qdisc offload feature at a per device
level. This gives us a big hammer to enable/disable offloading.
More fine grained control (i.e. per rule) may be supported later.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:36 -05:00
John Fastabend
a1b7c5fd7f net: sched: add cls_u32 offload hooks for netdevs
This patch allows netdev drivers to consume cls_u32 offloads via
the ndo_setup_tc ndo op.

This works aligns with how network drivers have been doing qdisc
offloads for mqprio.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:36 -05:00
John Fastabend
16e5cc6471 net: rework setup_tc ndo op to consume general tc operand
This patch updates setup_tc so we can pass additional parameters into
the ndo op in a generic way. To do this we provide structured union
and type flag.

This lets each classifier and qdisc provide its own set of attributes
without having to add new ndo ops or grow the signature of the
callback.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:35 -05:00
John Fastabend
e4c6734eaa net: rework ndo tc op to consume additional qdisc handle parameter
The ndo_setup_tc() op was added to support drivers offloading tx
qdiscs however only support for mqprio was ever added. So we
only ever added support for passing the number of traffic classes
to the driver.

This patch generalizes the ndo_setup_tc op so that a handle can
be provided to indicate if the offload is for ingress or egress
or potentially even child qdiscs.

CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17 09:47:35 -05:00
Catherine Sullivan
82f399c935 i40e/i40evf: Bump i40e to 1.4.11 and i40evf to 1.4.7
Bump.

Change-ID: I21aa520a3c8c5f4f562a98019bf8b76b3706c480
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 19:36:08 -08:00
Jesse Brandeburg
d17038d687 i40e: trivial: remove unnecessary local var
Probe routine already has too many locals, just convert one
used for kzalloc into a kcalloc, eliminating the local.

Change-ID: I349049872b71f858cbeb91ad7836e6767fc7b7d1
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Anjali Singhai <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 19:28:51 -08:00
Jesse Brandeburg
2eccf1d611 i40e: remove VF device IDs from PF
The PF doesn't need to know about the VF's device IDs, so remove them.

Change-ID: I62cf0e0fffa1ace586e58e00bc271b10ae440f05
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 19:24:26 -08:00
Shannon Nelson
de1017f76a i40e: add netdev info to VSI dump
Add a few more bits of netdev data into the debugfs output for dump VSI.
For now, we'll add the features, hw_features, vlan_features, and flags
bitflags and the state. More could be added later if needed.

Also, tweak a couple nearby output lines for output readability.

Change-ID: I9fb5a9da75c9ad7679498ce9ac3ba24d065ddd2e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Reviewed-by: Brandeburg, Jesse <jesse.brandeburg@intel.com>
Reviewed-by: Wyborny, Carolyn <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 19:19:58 -08:00
Mitch Williams
509a447ae8 i40evf: enable bus master after reset
If the VF is reset via VFLR, the device will be knocked out of bus
master mode, and the driver will fail to recover from the reset. Fix
this by enabling bus mastering after every reset. In a non-VFLR case,
the bus master bit will not be disabled, and this call will have no effect.

Change-ID: Id515859ac7a691db478222228add6d149e96801a
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 19:14:56 -08:00
Shannon Nelson
1d73b2db4b i40e: add a little more to an NVM update debug message
Add a little more detail to an NVM update debug message in order to
see the full ethtool request data.

Change-ID: Iab10437cb32d6fddc67ee347e7c0b42511e152cd
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Kevin Scott <kevin.c.scott@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 19:08:17 -08:00
Jesse Brandeburg
dd54a1ada9 i40e: refactor DCB function
This is a simple refactor suggested by the community
to change a multi-level if statement into a switch.

Change-ID: I831cf3c40426022220aa9b43990022d22dfd50db
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 19:04:08 -08:00
Mitch Williams
07f169c3e9 i40e: add 20G speed for Tx bandwidth calculations
When calculating TX bandwidth for VFs, we need to know the link speed to
make sure we don't allocate more bandwidth than is available. Add 20G
link speed to the switch statement so we can support devices that link
at that speed.

Change-ID: I5409f6139d549e5832777db9c22ca0664e0c5f8b
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 18:59:32 -08:00
Mitch Williams
1d0a4ada84 i40e: add counter for arq overflows
Sometimes, ARQ overflows are a big deal and tell us that the
firmware/hardware/driver/something is having problems. But normally
they're no big deal. To assist in assessing this, add a counter to
our Ethtool stats. A handful of ARQ overflows during VF init is no
problem. A large, ever-growing number indicates that Something Bad is
happening.

Change-ID: Ie5348bfbc8a54a890559cb00279c28d976a55096
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 18:54:53 -08:00
Anjali Singhai Jain
a3d772a392 i40e: fix write-back-on-itr to work with legacy itr
We were not doing write-back on interrupt throttle for Legacy case in X722.
This patch fixes that, so we do WB_ON_ITR for Legacy as well. Plus the issue
that we should still be setting NO_ITR if we are touching the DYN_CTLN register
since we do not want to change ITR setting here.

Change-ID: I5db8491ee1544118a389db839cecc93e1bbc480e
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 18:50:15 -08:00
Pandi Maharajan
071c859b87 i40e: Store lan_vsi_idx and lan_vsi_id in the right size
lan_vsi_idx and lan_vsi_id are assigned to u16 data sized variables but
declared in u8. This patch fixes the width of the datatype.

Change-ID: If4bcbcc7d32f2b287c51cb33d17879691258dce2
Signed-off-by: Pandi Kumar Maharajan <pandi.maharajan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 18:40:26 -08:00
Shannon Nelson
97b884fecd i40e: Bump AQ minor version to 1.5 for new FW features
Bump AQ minor version to 1.5 for new FW features.

Change-ID: I5a790f7f519a2a8921aaa1c5663727dd1897ffec
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Kevin Scott <kevin.c.scott@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 18:35:53 -08:00
Shannon Nelson
6774faf964 i40e: AQ thermal sensor control struct
Add the new AQ command and struct for managing a thermal sensor.

Change-ID: I6f5631839a0f3dca352a6c222f1269a960e2310a
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 18:31:50 -08:00
David S. Miller
547b9ca879 Merge branch 'ip-sysctl-namespaceify'
Nikolay Borisov says:

====================
Namespacify various ip sysctl knobs

This series continues namespacifying more net related knobs.
The focus here is on ip options. Patches 1,3,4,5 namespacify
the respective sysctl knobs. Patch 2 moves some igmp code to the
correct file (and function) and also adds some #ifdef guards to
silence compilation warnings.

Finally, patch 5 exposes the ip fragmentation related sysctls
since all of the knobs are namespaced.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:42:55 -05:00
Nikolay Borisov
52a773d645 net: Export ip fragment sysctl to unprivileged users
Now that all the ip fragmentation related sysctls are namespaceified
there is no reason to hide them anymore from "root" users inside
containers.

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:42:54 -05:00
Nikolay Borisov
0fbf4cb27e ipv4: namespacify ip fragment max dist sysctl knob
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:42:54 -05:00
Nikolay Borisov
e21145a987 ipv4: namespacify ip_early_demux sysctl knob
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:42:54 -05:00
Nikolay Borisov
287b7f38fd ipv4: Namespacify ip_dynaddr sysctl knob
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:42:54 -05:00
Nikolay Borisov
dcd87999d4 igmp: net: Move igmp namespace init to correct file
When igmp related sysctl were namespacified their initializatin was
erroneously put into the tcp socket namespace constructor. This
patch moves the relevant code into the igmp namespace constructor to
keep things consistent.

Also sprinkle some #ifdefs to silence warnings

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:42:54 -05:00
Nikolay Borisov
fa50d974d1 ipv4: Namespaceify ip_default_ttl sysctl knob
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:42:54 -05:00
David S. Miller
6cd21d7941 Major changes:
wl12xx
 
 * add device tree support for SPI
 
 mwifiex
 
 * add debugfs file to read chip information
 * add MSIx support for newer pcie chipsets (8997 onwards)
 * add schedule scan support
 * add WoWLAN net-detect support
 * firmware dump support for w8997 chipset
 
 iwlwifi
 
 * continue the work on multiple Rx queues
 * add support for beacon storing used in low power states
 * use the regular firmware image of WoWLAN
 * fix 8000 devices for Big Endian machines
 * more firmware debug hooks
 * add support for P2P Client snoozing
 * make the beacon filtering for AP mode configurable
 * fix transmit queues overflow with LSO
 
 libertas
 
 * add support for setting power save via cfg80211
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJWvbW+AAoJEG4XJFUm622b1lIH/RmFoJNbJka9/CrDse5ndb3A
 TBRDGvinOjubHn9Hn8Baz7u0huXvBpWoN/YI4hqjKTW4YsXfN8cB1IJakzohLFCF
 I9kUhITYGrK7hKDwkBsop2qxwJsrkAzv/XNzi+yGqxlj/IqZp+pYBmZTb9H6G963
 lb8hJPhx+t2aCBPvDdkAd5n67kdq0C35At+a1KB1CYFYKdYO5y5QeDLYtb+5zrWT
 /Pwel7gRdgcf4TpaUZs++mDf+agcFo+JjG+olLrZNw1BDthA8VC46agMyfffrt4l
 dRZIS07fUDd0zf4HJMWV05l8kKdEwqrRAuoBOnxGI611GNN94M3NWJDxU/MgBos=
 =zE9m
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-next-for-davem-2016-02-12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
Major changes:

wl12xx

* add device tree support for SPI

mwifiex

* add debugfs file to read chip information
* add MSIx support for newer pcie chipsets (8997 onwards)
* add schedule scan support
* add WoWLAN net-detect support
* firmware dump support for w8997 chipset

iwlwifi

* continue the work on multiple Rx queues
* add support for beacon storing used in low power states
* use the regular firmware image of WoWLAN
* fix 8000 devices for Big Endian machines
* more firmware debug hooks
* add support for P2P Client snoozing
* make the beacon filtering for AP mode configurable
* fix transmit queues overflow with LSO

libertas

* add support for setting power save via cfg80211
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:38:29 -05:00
Arnd Bergmann
27090cbdc3 net: phy: spi_ks8995: include linux/gpio/consumer.h
The ks8995 phy driver just started using gpiod_* functions, which are
declared in linux/gpio/consumer.h, not linux/gpio.h, resulting in a
build failure in randconfig builds that do not have CONFIG_GPIOLIB
enabled:

drivers/net/phy/spi_ks8995.c: In function 'ks8995_probe':
drivers/net/phy/spi_ks8995.c:477:3: error: implicit declaration of function 'gpiod_set_value' [-Werror=implicit-function-declaration]

This changes the header inclusion so it builds in all configurations.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: cd6f288cba ("net: phy: spi_ks8995: add support for resetting switch using GPIO")
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:37:04 -05:00
Shannon Nelson
2fcc1a401e i40e: AQ Add VXLAN-GPE tunnel type
Add the new Cisco VXLAN-GPE cloud tunnel type for the Add Cloud Filter
and UDP tunnel AQ commands.

Change-ID: I2c093c7d79726c7fca08a36e5c63581a905da3d2
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Kevin Scott <kevin.c.scott@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 17:32:30 -08:00
Eric Dumazet
cd9b266095 tcp: add tcpi_min_rtt and tcpi_notsent_bytes to tcp_info
tcpi_min_rtt reports the minimal rtt observed by TCP stack for the flow,
in usec unit. Might be ~0U if not yet known.

tcpi_notsent_bytes reports the amount of bytes in the write queue that
were not yet sent.

This is done in a single patch to not add a temporary 32bit padding hole
in tcp_info.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:27:35 -05:00
David S. Miller
4cba259f19 Merge branch 'unified-tunnel-dst-caching'
Paolo Abeni says:

====================
net: unify dst caching for tunnel devices

This patch series try to unify the dst cache implementations currently
present in the kernel, namely in ip_tunnel.c and ip6_tunnel.c, introducing a
new generic implementation, replacing the existing ones, and then using
the new implementation in other tunnel devices which currently lack it.

The new dst implementation is compiled, as built-in, only if any device using
it is enabled.

Caching the dst for the tunnel remote address gives small, but measurable,
performance improvement when tunneling over ipv4 (in the 2%-4% range) and
significant ones when tunneling over ipv6 (roughly 60% when no
fragmentation/segmentation take place and the tunnel local address
is not specified).

v2:
- move the vxlan dst_cache usage inside the device lookup functions
- fix usage after free for lwt tunnel moving the dst cache storage inside
  the dst_metadata,
- sparse codying style cleanup
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:49 -05:00
Paolo Abeni
3c1cb4d260 net/ipv4: add dst cache support for gre lwtunnels
In case of UDP traffic with datagram length below MTU this
gives about 4% performance increase

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:49 -05:00
Paolo Abeni
468dfffcd7 geneve: add dst caching support
use generic dst implementation for both plain geneve devices and
lwtunnels.

In case of UDP traffic with datagram length below MTU this give
about 2% performance increase for plain geneve tunnel over ipv4,
about 65% performance increase for ipv6 tunnel.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:49 -05:00
Paolo Abeni
d71785ffc7 net: add dst_cache to ovs vxlan lwtunnel
In case of UDP traffic with datagram length
below MTU this give about 2% performance increase
when tunneling over ipv4 and about 60% when tunneling
over ipv6

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:48 -05:00
Paolo Abeni
0c1d70af92 net: use dst_cache for vxlan device
In case of UDP traffic with datagram length
below MTU this give about 3% performance increase
when tunneling over ipv4 and about 70% when
tunneling over ipv6.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:48 -05:00
Paolo Abeni
e09acddf87 ip_tunnel: replace dst_cache with generic implementation
The current ip_tunnel cache implementation is prone to a race
that will cause the wrong dst to be cached on cuncurrent dst cache
miss and ip tunnel update via netlink.

Replacing with the generic implementation fix the issue.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:48 -05:00
Paolo Abeni
607f725f6f net: replace dst_cache ip6_tunnel implementation with the generic one
This also fix a potential race into the existing tunnel code, which
could lead to the wrong dst to be permanenty cached:

CPU1:					CPU2:
  <xmit on ip6_tunnel>
  <cache lookup fails>
  dst = ip6_route_output(...)
					<tunnel params are changed via nl>
					dst_cache_reset() // no effect,
							// the cache is empty
  dst_cache_set() // the wrong dst
	// is permanenty stored
	// into the cache

With the new dst implementation the above race is not possible
since the first cache lookup after dst_cache_reset will fail due
to the timestamp check

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:48 -05:00
Paolo Abeni
911362c70d net: add dst_cache support
This patch add a generic, lockless dst cache implementation.
The need for lock is avoided updating the dst cache fields
only in per cpu scope, and requiring that the cache manipulation
functions are invoked with the local bh disabled.

The refresh_ts and reset_ts fields are used to ensure the cache
consistency in case of cuncurrent cache update (dst_cache_set*) and
reset operation (dst_cache_reset).

Consider the following scenario:

CPU1:                                   	CPU2:
  <cache lookup with emtpy cache: it fails>
  <get dst via uncached route lookup>
						<related configuration changes>
                                        	dst_cache_reset()
  dst_cache_set()

The dst entry set passed to dst_cache_set() should not be used
for later dst cache lookup, because it's obtained using old
configuration values.

Since the refresh_ts is updated only on dst_cache lookup, the
cached value in the above scenario will be discarded on the next
lookup.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Suggested-and-acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:21:48 -05:00
David S. Miller
64f63d59ba Merge branch 'bnx2x-next'
Yuval Mintz says:

====================
bnx2x: driver updates

This series contains several changes - the biggest change is the
addition of Geneve NDO support [allows device to perform RSS according
to inner-headers of encapsulated packet, similar to what it does for
vxlan]. It also extends dcbx support, as well as introducing some minor
changes.

Dave,

Please consider applying this series to `net-next'.
[Do notice patch #3 fails checkpatch due to consistency with existing
HSI]
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:12:17 -05:00
Yuval Mintz
e56270f635 bnx2x: Warn about grc timeouts in register dump
There are several scenarios where taking a register dump from a device
might log benign GRC timeout attentions to system logs.
Most common of those is when taking the dump from a 2-port device.

Sadly, there's no easy way to mask the problematic attentions during the
flow - Changing this behvaior would require a firmware update.
For now, simply warn users to ignore the warnings.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:12:16 -05:00
Yuval Mintz
e5d3a51cef bnx2x: extend DCBx support
This adds support for default application priority.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:12:16 -05:00
Yuval Mintz
9c73267d2e bnx2x: Add support for single-port DCBx
Driver is currently looking at shared information for determining whether
DCBx can be supported for a given port.
On 4-port devices, up-to-date management firmware can support DCBx on
each port of a given engine independently - but that would cause bnx2x to
misinterpert the support and assume DCBx is supported on both.

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:12:16 -05:00
Yuval Mintz
883ce97d25 bnx2x: Add Geneve inner-RSS support
This adds the ability to perform RSS hashing based on encapsulated
headers for a geneve-encapsulated packet.

This also changes the Vxlan implementation in bnx2x to be uniform
for both vxlan and geneve [from configuration perspective].

Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:12:16 -05:00
Yuval Mintz
445204644b bnx2x: Remove unneccessary EXPORT_SYMBOL
bnx2x_schedule_sp_rtnl is exported by bnx2x, although no other module
uses it.

Reported-by: Benjamin Poirier <bpoirier@suse.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-16 20:12:16 -05:00
Shannon Nelson
fa5623a6e6 i40e: AQ Add set_switch_config
Add the new Set Switch Config AdminQ command, and mark the L2 Filter
bit as deprecated in the Add VEB command.

Change-ID: I5b24790f14c56f0ddf3f70df1e486844146b039f
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Kevin Scott <kevin.c.scott@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2016-02-16 17:00:14 -08:00