In a similar fashion to the bridge device, allow changing the ageing
time of the VxLAN device by scheduling its timer to fire if the ageing
time changed.
One use case is selftests where learning / ageing of VxLAN FDB entries
is tested. The default ageing time is 5 minutes, which is too long for a
simple selftest.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to allow devices to signal learning events to VXLAN, introduce
two new switchdev messages: SWITCHDEV_VXLAN_FDB_ADD_TO_BRIDGE and
SWITCHDEV_VXLAN_FDB_DEL_TO_BRIDGE.
Listen to these notifications in the vxlan driver. The FDB entries
learned this way have an NTF_EXT_LEARNED flag, and only entries marked
as such can be unlearned by the _DEL_ event. They are also immediately
marked as offloaded. This is the same behavior that the bridge driver
observes.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an external learning event collides with an user-added entry, the
user-added entry shouldn't be taken over. Otherwise on an unlearn event
the entry would be completely lost, even though the user added it by
hand.
Therefore skip update of FDB flags and state for these cases. This is in
accordance with the bridge behavior.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VXLAN driver needs to differentiate between FDB entries learned by
the VXLAN driver, and those added by the user. The latter ones shouldn't
be taken over by external learning events. This is in accordance with
bridge behavior.
Therefore, extend the flags bitfield to 16 bits and add a new private
NTF flag to mark the user-added entries.
This seems preferable to adding a dedicated boolean, because passing the
flag, unlike passing e.g. a true, makes it clear what the meaning of the
bit is.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a following patch, vxlan is extended to allow hardware FDB learning.
For FDB entries learned this way, switchdev notifications should not be
sent again, because the driver already knows about these entries.
To that end, add an argument vxlan_fdb_notify() to determine whether
the switchdev notifications should be sent. Propagate the argument to
all call sites transitively, eventually passing true in all root calls.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This argument is necessary for vxlan_fdb_delete(), the API of which is
prescribed by ndo_fdb_del, but __vxlan_fdb_delete() doesn't need it.
Therefore drop it.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław says:
====================
VLAN tag handling cleanup
This is a cleanup set after VLAN_TAG_PRESENT removal. The CFI bit
handling is made similar to how other tag fields are used.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace open-coded bitfield manipulation with skb_vlan_tag_*() helpers.
This also enables correctly passing of VLAN.CFI bit.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Abstract CFI/DEI bit access consistently with other VLAN tag fields.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-11-20
This series contains updates to the ice driver only.
Akeem updates the driver to determine whether or not to do
auto-negotiation based on the VSI state.
Bruce cleans up the control queue code to remove duplicate code. Take
advantage of some compiler optimizations by making some structures
constant, and also note that they cannot be modified. Cleaned up
formatting issues and code comment that needed clarification. Fixed a
potential NULL pointer dereference by adding a check.
Jaroslaw adds a check to verify if memory was allocated or not.
Yashaswini Raghuram fixes the driver to ensure we are not enabling the
LAN_EN flag if the MAC in the MAC-VLAN is a unicast MAC, so that the
unicast packets are not forwarded to the wire.
Dave fixes the return value of ice_napi_poll() to be more useful in
returning the work that was done and should only return 0 when no work
was done.
Anirudh does code comment cleanup, to make more consistent.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tristram Ha says:
====================
net: dsa: microchip: Modify KSZ9477 DSA driver in preparation to add other KSZ switch drivers
This series of patches is to modify the original KSZ9477 DSA driver so
that other KSZ switch drivers can be added and use the common code.
There are several steps to accomplish this achievement. First is to
rename some function names with a prefix to indicate chip specific
function. Second is to move common code into header that can be shared.
Last is to modify tag_ksz.c so that it can handle many tail tag formats
used by different KSZ switch drivers.
ksz_common.c will contain the common code used by all KSZ switch drivers.
ksz9477.c will contain KSZ9477 code from the original ksz_common.c.
ksz9477_spi.c is renamed from ksz_spi.c.
ksz9477_reg.h is renamed from ksz_9477_reg.h.
ksz_common.h is added to provide common code access to KSZ switch
drivers.
ksz_spi.h is added to provide common SPI access functions to KSZ SPI
drivers.
v4
- Patches were removed to concentrate on changing driver structure without
adding new code.
v3
- The phy_device structure is used to hold port link information
- A structure is passed in ksz_xmit and ksz_rcv instead of function pointer
- Switch offload forwarding is supported
v2
- Initialize reg_mutex before use
- The alu_mutex is only used inside chip specific functions
v1
- Each patch in the set is self-contained
- Use ksz9477 prefix to indicate KSZ9477 specific code
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename ksz_9477_reg.h to ksz9477_reg.h for consistency as the product
name is always KSZ####.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Break KSZ9477 DSA driver into two files in preparation to add more KSZ
switch drivers.
Add common functions in ksz_common.h so that other KSZ switch drivers
can access code in ksz_common.c.
Add ksz_spi.h for common functions used by KSZ switch SPI drivers.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename ksz_spi.c to ksz9477_spi.c and update Kconfig in preparation to add
more KSZ switch drivers.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename some functions with ksz9477 prefix to separate chip specific code
from common code.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up code according to patch check suggestions.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace license with GPL.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A recent update to smatch is causing it to report the error "we previously
assumed 'm_entry->vsi_list_info' could be null". Fix that.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In code comments, use Tx|Rx instead of tx|rx
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Function signatures that do not exceed 80-characters should be on a single
line.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up number of formatting issues and a comment that could use
clarification.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ice_napi_poll is hard-coded to return zero when it's done. It should
instead return the work done (if any work was done). The only time it
should return zero is if an interrupt or poll is handled and no work
is performed. So change the return value to be the minimum of work
done or budget-1.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Indicate these structs should not be modified and take advantage of some
compiler optimizations by making these structs const.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the action fields for a MAC-VLAN filter, do not set the LAN_EN flag
if the MAC in the MAC-VLAN is unicast MAC. The unicast packets that
match should not be forwarded to the wire.
Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Added check of return value for ice_init_def_sw_recp().
Now we know if memory was correctly allocated.
Signed-off-by: Jaroslaw Ilgiewicz <jaroslaw.ilgiewicz@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
1. Assigning the register offset and mask values contains duplicate code
that can easily be replaced with a macro.
2. Separate functions for freeing send queue and receive queue rings are
not needed; replace with a single function that uses a pointer to the
struct ice_ctl_q_ring structure as a parameter instead of a pointer to
the struct ice_ctl_q_info structure.
3. Initializing register settings for both send queue and receive queue
contains duplicate code that can easily be replaced with a helper
function.
4. Separate functions for freeing send queue and receive queue buffers are
not needed; duplicate code can easily be replaced with a macro.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If VSI state is up, we should do autoneg with link up, otherwise
with link down.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In rx_alloc_pkts(), there is a loop call of tasklet, which causes
100% cpu utilization, even no packets are being received. This patch
fixes this bug.
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In add_mac_addr(), if the MAC address is a muliticast address,
it will not be set, which causes the network card fail to receive
the multicast packet. This patch fixes this bug.
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to improve performance, this patch adds rx checksum offload
for the HiNIC driver. Performance test(Iperf) shows more than 80%
improvement in TCP streams.
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To improve performance, this patch uses bit operations to replace
multiply and division operators.
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend cooling device with cooling levels vector to allow more
flexibility of PWM setting.
Thermal zone algorithm operates with the numerical states for PWM
setting. Each state is the index, defined in range from 0 to 10 and it's
mapped to the relevant duty cycle value, which is written to PWM
controller. With the current definition fan speed is set to 0% for state
0, 10% for state 1, and so on up to 100% for the maximum state 10.
Some systems have limitation for the PWM speed minimum. For such systems
PWM setting speed to 0% will just disable the ability to increase speed
anymore and such device will be stall on zero speed. Cooling levels
allow to configure state vector according to the particular system
requirements. For example, if PWM speed is not allowed to be below 30%,
cooling levels could be configured as 30%, 30%, 30%, 30%, 40%, 50% and
so on.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Null pointer dereference seen when cxgb4vf driver is unloaded
without bringing up any interfaces, moving mac_hlist initialization
to driver probe and free the mac_hlist in remove to fix the issue.
Fixes: 24357e06ba ("cxgb4vf: fix memleak in mac_hlist initialization")
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Under stress, softirq rx handler often hits a socket owned by the user,
and has to queue the packet into socket backlog.
When this happens, skb dst refcount is taken before we escape rcu
protected region. This is done from __sk_add_backlog() calling
skb_dst_force().
Consumer will have to perform the opposite costly operation.
AFAIK nothing in tcp stack requests the dst after skb was stored
in the backlog. If this was the case, we would have had failures
already since skb_dst_force() can end up clearing skb dst anyway.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two cases were we can avoid calling ktime_get_ns() :
1) Queue is empty.
2) Internal queue is not empty.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
gred: add offload support
This series adds support for GRED offload in the nfp driver. So
far we have only supported the RED Qdisc offload, but we need a
way to differentiate traffic types e.g. based on DSCP marking.
It may seem like PRIO+RED is a good match for this job, however,
(a) we don't need strict priority behaviour of PRIO, and (b) PRIO
uses the legacy way of mapping ToS fields to bands, which is quite
awkward and limitting.
The less commonly used GRED Qdisc is a better much for the scenario,
it allows multiple sets of RED parameters and queue lengths to be
maintained with a single FIFO queue. This is exactly how nfp offload
behaves. We use a trivial u32 classifier to assign packets to virtual
queues.
There is also the minor advantage that GRED can't have its child
changed, therefore limitting ways in which the configuration of SW
path can diverge from HW offload.
Last patch of the series adds support for (G)RED in non-ECN mode,
where packets are dropped instead of marked.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Original FW only allowed us to perform ECN marking. Newer releases
also support plain old drop. Add the ability to configure drop
policy. This is particularly useful in combination with GRED,
because different bands can have different ECN marking setting.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use offload of very simple u32 filters to direct packets to GRED
bands based on the DSCP marking. No u32 hashing is supported,
just plain simple filters matching on ToS or Priority with
appropriate mask device can support.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Learn how to set the DSCP map. FW uses a packed array which
geometry depends on the number of supported priorities and
virtual queues. Write code to assemble this map and to communicate
the setting to the FW via mailbox.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for PRIO offload calculate how long the prio map
for FW will be and make sure the configuration can be performed
via the vNIC mailbox.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of egress offloads the class/flowid assigned by the filter
may be very important for offloaded Qdisc selection. Provide this
info to drivers.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for GRED offload. It behaves much like RED, but
can apply different parameters to different bands. GRED operates
pretty much exactly like our HW/FW with a single FIFO and different
RED state instances.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wrap RED parameters and stats into a structure, and a 1-element
array. Upcoming GRED offload will add the support for more bands.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow drivers which offload GRED to report back statistics. Since
A lot of GRED stats is fairly ad hoc in nature pass to drivers the
standard struct gnet_stats_basic/gnet_stats_queue pairs, and
untangle the values in the core.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add basic offload for the GRED Qdisc. Inform the drivers any
time Qdisc or virtual queue configuration changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>