Commit Graph

797745 Commits

Author SHA1 Message Date
Ido Schimmel
40051c4dca vxlan: Allow changing ageing time
In a similar fashion to the bridge device, allow changing the ageing
time of the VxLAN device by scheduling its timer to fire if the ageing
time changed.

One use case is selftests where learning / ageing of VxLAN FDB entries
is tested. The default ageing time is 5 minutes, which is too long for a
simple selftest.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 17:10:31 -08:00
Petr Machata
5728ae0d17 vxlan: Add hardware FDB learning
In order to allow devices to signal learning events to VXLAN, introduce
two new switchdev messages: SWITCHDEV_VXLAN_FDB_ADD_TO_BRIDGE and
SWITCHDEV_VXLAN_FDB_DEL_TO_BRIDGE.

Listen to these notifications in the vxlan driver. The FDB entries
learned this way have an NTF_EXT_LEARNED flag, and only entries marked
as such can be unlearned by the _DEL_ event. They are also immediately
marked as offloaded. This is the same behavior that the bridge driver
observes.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 17:10:31 -08:00
Petr Machata
0ec566aacc vxlan: Don't override user-added entries with ext-learned ones
When an external learning event collides with an user-added entry, the
user-added entry shouldn't be taken over. Otherwise on an unlearn event
the entry would be completely lost, even though the user added it by
hand.

Therefore skip update of FDB flags and state for these cases. This is in
accordance with the bridge behavior.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 17:10:30 -08:00
Petr Machata
45598c1cee vxlan: Mark user-added FDB entries
The VXLAN driver needs to differentiate between FDB entries learned by
the VXLAN driver, and those added by the user. The latter ones shouldn't
be taken over by external learning events. This is in accordance with
bridge behavior.

Therefore, extend the flags bitfield to 16 bits and add a new private
NTF flag to mark the user-added entries.

This seems preferable to adding a dedicated boolean, because passing the
flag, unlike passing e.g. a true, makes it clear what the meaning of the
bit is.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 17:10:30 -08:00
Petr Machata
0e6160f3f5 vxlan: vxlan_fdb_notify(): Make switchdev notification configurable
In a following patch, vxlan is extended to allow hardware FDB learning.
For FDB entries learned this way, switchdev notifications should not be
sent again, because the driver already knows about these entries.

To that end, add an argument vxlan_fdb_notify() to determine whether
the switchdev notifications should be sent. Propagate the argument to
all call sites transitively, eventually passing true in all root calls.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 17:10:30 -08:00
Petr Machata
5572c81560 vxlan: __vxlan_fdb_delete(): Drop unused argument vid
This argument is necessary for vxlan_fdb_delete(), the API of which is
prescribed by ndo_fdb_del, but __vxlan_fdb_delete() doesn't need it.
Therefore drop it.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 17:10:30 -08:00
Andrea Claudi
d59da3fbfe net: lpc_eth: fix trivial comment typo
Fix comment typo rxfliterctrl -> rxfilterctrl

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 16:17:32 -08:00
David S. Miller
1e33f01599 Merge branch 'VLAN-tag-handling-cleanup'
Michał Mirosław says:

====================
VLAN tag handling cleanup

This is a cleanup set after VLAN_TAG_PRESENT removal. The CFI bit
handling is made similar to how other tag fields are used.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 15:41:31 -08:00
Michał Mirosław
6c0fbd7262 mlx5: use skb_vlan_tag_get_prio()
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 15:41:30 -08:00
Michał Mirosław
fb1e3df002 benet: use skb_vlan_tag_get_prio()
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 15:41:30 -08:00
Michał Mirosław
98ba780e4c net/hyperv: use skb_vlan_tag_*() helpers
Replace open-coded bitfield manipulation with skb_vlan_tag_*() helpers.
This also enables correctly passing of VLAN.CFI bit.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 15:41:30 -08:00
Michał Mirosław
a2e768b861 net/vlan: introduce skb_vlan_tag_get_cfi() helper
Abstract CFI/DEI bit access consistently with other VLAN tag fields.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-21 15:41:30 -08:00
David S. Miller
11c6c0c228 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-11-20

This series contains updates to the ice driver only.

Akeem updates the driver to determine whether or not to do
auto-negotiation based on the VSI state.

Bruce cleans up the control queue code to remove duplicate code.  Take
advantage of some compiler optimizations by making some structures
constant, and also note that they cannot be modified.  Cleaned up
formatting issues and code comment that needed clarification.  Fixed a
potential NULL pointer dereference by adding a check.

Jaroslaw adds a check to verify if memory was allocated or not.

Yashaswini Raghuram fixes the driver to ensure we are not enabling the
LAN_EN flag if the MAC in the MAC-VLAN is a unicast MAC, so that the
unicast packets are not forwarded to the wire.

Dave fixes the return value of ice_napi_poll() to be more useful in
returning the work that was done and should only return 0 when no work
was done.

Anirudh does code comment cleanup, to make more consistent.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:59:27 -08:00
David S. Miller
51428fd661 Merge branch 'dsa-microchip-Modify-KSZ9477-DSA-driver-in-preparation-to-add-other-KSZ-switch-drivers'
Tristram Ha says:

====================
net: dsa: microchip: Modify KSZ9477 DSA driver in preparation to add other KSZ switch drivers

This series of patches is to modify the original KSZ9477 DSA driver so
that other KSZ switch drivers can be added and use the common code.

There are several steps to accomplish this achievement.  First is to
rename some function names with a prefix to indicate chip specific
function.  Second is to move common code into header that can be shared.
Last is to modify tag_ksz.c so that it can handle many tail tag formats
used by different KSZ switch drivers.

ksz_common.c will contain the common code used by all KSZ switch drivers.
ksz9477.c will contain KSZ9477 code from the original ksz_common.c.
ksz9477_spi.c is renamed from ksz_spi.c.
ksz9477_reg.h is renamed from ksz_9477_reg.h.
ksz_common.h is added to provide common code access to KSZ switch
drivers.
ksz_spi.h is added to provide common SPI access functions to KSZ SPI
drivers.

v4
- Patches were removed to concentrate on changing driver structure without
adding new code.

v3
- The phy_device structure is used to hold port link information
- A structure is passed in ksz_xmit and ksz_rcv instead of function pointer
- Switch offload forwarding is supported

v2
- Initialize reg_mutex before use
- The alu_mutex is only used inside chip specific functions

v1
- Each patch in the set is self-contained
- Use ksz9477 prefix to indicate KSZ9477 specific code
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:57:12 -08:00
Tristram Ha
84bd190819 net: dsa: microchip: rename ksz_9477_reg.h to ksz9477_reg.h
Rename ksz_9477_reg.h to ksz9477_reg.h for consistency as the product
name is always KSZ####.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:57:12 -08:00
Tristram Ha
c2e866911e net: dsa: microchip: break KSZ9477 DSA driver into two files
Break KSZ9477 DSA driver into two files in preparation to add more KSZ
switch drivers.
Add common functions in ksz_common.h so that other KSZ switch drivers
can access code in ksz_common.c.
Add ksz_spi.h for common functions used by KSZ switch SPI drivers.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:57:12 -08:00
Tristram Ha
74a7194f15 net: dsa: microchip: rename ksz_spi.c to ksz9477_spi.c
Rename ksz_spi.c to ksz9477_spi.c and update Kconfig in preparation to add
more KSZ switch drivers.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:57:12 -08:00
Tristram Ha
353592781d net: dsa: microchip: rename some functions with ksz9477 prefix
Rename some functions with ksz9477 prefix to separate chip specific code
from common code.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:57:12 -08:00
Tristram Ha
9bc981c355 net: dsa: microchip: clean up code
Clean up code according to patch check suggestions.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:57:12 -08:00
Tristram Ha
5b79c72e96 net: dsa: microchip: replace license with GPL
Replace license with GPL.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 20:57:11 -08:00
Bruce Allan
f25dad19ba ice: Fix possible NULL pointer de-reference
A recent update to smatch is causing it to report the error "we previously
assumed 'm_entry->vsi_list_info' could be null". Fix that.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Anirudh Venkataramanan
d337f2afb7 ice: Use Tx|Rx in comments
In code comments, use Tx|Rx instead of tx|rx

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Anirudh Venkataramanan
df17b7e02f ice: Cosmetic formatting changes
1. Fix several cases of double spacing
2. Fix typos
3. Capitalize abbreviations

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Bruce Allan
2c5492de87 ice: Cleanup short function signatures
Function signatures that do not exceed 80-characters should be on a single
line.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Bruce Allan
bc0c6fab8a ice: Cleanup ice_tx_timeout()
Clean up number of formatting issues and a comment that could use
clarification.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Dave Ertman
e0c9fd9b77 ice: Fix return value from NAPI poll
ice_napi_poll is hard-coded to return zero when it's done. It should
instead return the work done (if any work was done). The only time it
should return zero is if an interrupt or poll is handled and no work
is performed. So change the return value to be the minimum of work
done or budget-1.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Bruce Allan
55aa141ed9 ice: Constify global structures that can/should be
Indicate these structs should not be modified and take advantage of some
compiler optimizations by making these structs const.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Yashaswini Raghuram Prathivadi Bhayankaram
6a7e699369 ice: Do not set LAN_EN for MAC-VLAN filters
In the action fields for a MAC-VLAN filter, do not set the LAN_EN flag
if the MAC in the MAC-VLAN is unicast MAC. The unicast packets that
match should not be forwarded to the wire.

Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:04 -08:00
Jaroslaw Ilgiewicz
5fb597d7c8 ice: Pass the return value of ice_init_def_sw_recp()
Added check of return value for ice_init_def_sw_recp().
Now we know if memory was correctly allocated.

Signed-off-by: Jaroslaw Ilgiewicz <jaroslaw.ilgiewicz@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:03 -08:00
Bruce Allan
7afdbc903a ice: Cleanup duplicate control queue code
1. Assigning the register offset and mask values contains duplicate code
   that can easily be replaced with a macro.

2. Separate functions for freeing send queue and receive queue rings are
   not needed; replace with a single function that uses a pointer to the
   struct ice_ctl_q_ring structure as a parameter instead of a pointer to
   the struct ice_ctl_q_info structure.

3. Initializing register settings for both send queue and receive queue
   contains duplicate code that can easily be replaced with a helper
   function.

4. Separate functions for freeing send queue and receive queue buffers are
   not needed; duplicate code can easily be replaced with a macro.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:03 -08:00
Akeem G Abodunrin
d38b08834f ice: Do autoneg based on VSI state
If VSI state is up, we should do autoneg with link up, otherwise
with link down.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-20 11:39:03 -08:00
Xue Chaojing
b1a2004841 net-next/hinic: fix a bug in rx data flow
In rx_alloc_pkts(), there is a loop call of tasklet, which causes
100% cpu utilization, even no packets are being received. This patch
fixes this bug.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:38:08 -08:00
Xue Chaojing
9ea72dc943 net-next/hinic:fix a bug in set mac address
In add_mac_addr(), if the MAC address is a muliticast address,
it will not be set, which causes the network card fail to receive
the multicast packet. This patch fixes this bug.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:38:08 -08:00
Xue Chaojing
4a61abb100 net-next/hinic:add rx checksum offload for HiNIC
In order to improve performance, this patch adds rx checksum offload
for the HiNIC driver. Performance test(Iperf) shows more than 80%
improvement in TCP streams.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:38:08 -08:00
Xue Chaojing
ebda9b46ce net-next/hinic:replace multiply and division operators
To improve performance, this patch uses bit operations to replace
multiply and division operators.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:38:08 -08:00
Vadim Pasternak
a421ce088a mlxsw: core: Extend cooling device with cooling levels
Extend cooling device with cooling levels vector to allow more
flexibility of PWM setting.

Thermal zone algorithm operates with the numerical states for PWM
setting. Each state is the index, defined in range from 0 to 10 and it's
mapped to the relevant duty cycle value, which is written to PWM
controller. With the current definition fan speed is set to 0% for state
0, 10% for state 1, and so on up to 100% for the maximum state 10.

Some systems have limitation for the PWM speed minimum. For such systems
PWM setting speed to 0% will just disable the ability to increase speed
anymore and such device will be stall on zero speed.  Cooling levels
allow to configure state vector according to the particular system
requirements. For example, if PWM speed is not allowed to be below 30%,
cooling levels could be configured as 30%, 30%, 30%, 30%, 40%, 50% and
so on.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:31:15 -08:00
Arjun Vynipadath
b539ea60f5 cxgb4/cxgb4vf: Fix mac_hlist initialization and free
Null pointer dereference seen when cxgb4vf driver is unloaded
without bringing up any interfaces, moving mac_hlist initialization
to driver probe and free the mac_hlist in remove to fix the issue.

Fixes: 24357e06ba ("cxgb4vf: fix memleak in mac_hlist initialization")
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:29:36 -08:00
Eric Dumazet
ade9628ed0 tcp: drop dst in tcp_add_backlog()
Under stress, softirq rx handler often hits a socket owned by the user,
and has to queue the packet into socket backlog.

When this happens, skb dst refcount is taken before we escape rcu
protected region. This is done from __sk_add_backlog() calling
skb_dst_force().

Consumer will have to perform the opposite costly operation.

AFAIK nothing in tcp stack requests the dst after skb was stored
in the backlog. If this was the case, we would have had failures
already since skb_dst_force() can end up clearing skb dst anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:25:47 -08:00
David S. Miller
b2c8510063 ipv4: Don't try to print ASCII of link level header in martian dumps.
This has no value whatsoever.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 10:15:36 -08:00
Eric Dumazet
6b015a523f net_sched: sch_fq: avoid calling ktime_get_ns() if not needed
There are two cases were we can avoid calling ktime_get_ns() :

1) Queue is empty.
2) Internal queue is not empty.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-20 09:51:32 -08:00
David S. Miller
6133e78f41 Merge branch 'gred-add-offload-support'
Jakub Kicinski says:

====================
gred: add offload support

This series adds support for GRED offload in the nfp driver.  So
far we have only supported the RED Qdisc offload, but we need a
way to differentiate traffic types e.g. based on DSCP marking.

It may seem like PRIO+RED is a good match for this job, however,
(a) we don't need strict priority behaviour of PRIO, and (b) PRIO
uses the legacy way of mapping ToS fields to bands, which is quite
awkward and limitting.

The less commonly used GRED Qdisc is a better much for the scenario,
it allows multiple sets of RED parameters and queue lengths to be
maintained with a single FIFO queue.  This is exactly how nfp offload
behaves.  We use a trivial u32 classifier to assign packets to virtual
queues.

There is also the minor advantage that GRED can't have its child
changed, therefore limitting ways in which the configuration of SW
path can diverge from HW offload.

Last patch of the series adds support for (G)RED in non-ECN mode,
where packets are dropped instead of marked.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
340a4864d5 nfp: abm: add support for more threshold actions
Original FW only allowed us to perform ECN marking.  Newer releases
also support plain old drop.  Add the ability to configure drop
policy.  This is particularly useful in combination with GRED,
because different bands can have different ECN marking setting.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
174ab544e3 nfp: abm: add cls_u32 offload for simple band classification
Use offload of very simple u32 filters to direct packets to GRED
bands based on the DSCP marking.  No u32 hashing is supported,
just plain simple filters matching on ToS or Priority with
appropriate mask device can support.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
6a80240571 nfp: abm: add functions to update DSCP -> virtual queue map
Learn how to set the DSCP map.  FW uses a packed array which
geometry depends on the number of supported priorities and
virtual queues.  Write code to assemble this map and to communicate
the setting to the FW via mailbox.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
14780c3429 nfp: abm: calculate PRIO map len and check mailbox size
In preparation for PRIO offload calculate how long the prio map
for FW will be and make sure the configuration can be performed
via the vNIC mailbox.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
068ceb3555 net: sched: cls_u32: add res to offload information
In case of egress offloads the class/flowid assigned by the filter
may be very important for offloaded Qdisc selection.  Provide this
info to drivers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
f3d6372064 nfp: abm: add GRED offload
Add support for GRED offload.  It behaves much like RED, but
can apply different parameters to different bands.  GRED operates
pretty much exactly like our HW/FW with a single FIFO and different
RED state instances.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
990b50a53a nfp: abm: wrap RED parameters in bands
Wrap RED parameters and stats into a structure, and a 1-element
array.  Upcoming GRED offload will add the support for more bands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
e49efd5288 net: sched: gred: support reporting stats from offloads
Allow drivers which offload GRED to report back statistics.  Since
A lot of GRED stats is fairly ad hoc in nature pass to drivers the
standard struct gnet_stats_basic/gnet_stats_queue pairs, and
untangle the values in the core.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:46 -08:00
Jakub Kicinski
890d8d23ec net: sched: gred: add basic Qdisc offload
Add basic offload for the GRED Qdisc.  Inform the drivers any
time Qdisc or virtual queue configuration changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-11-19 18:53:45 -08:00