1) In the case where rate == priv->pkt_rate_low == priv->pkt_rate_high,
mlx4_en_auto_moderation() does a divide by zero.
2) We want to properly change the moderation parameters if rx_frames
was changed (like in ethtool -C eth0 rx-frames 16)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All rx and rx netdev interrupts are handled by respectively
by mlx4_en_rx_irq() and mlx4_en_tx_irq() which simply schedule a NAPI.
But mlx4_eq_int() also fires a tasklet to service all items that were
queued via mlx4_add_cq_to_tasklet(), but this handler was not called
unless user cqe was handled.
This is very confusing, as "mpstat -I SCPU ..." show huge number of
tasklet invocations.
This patch saves this overhead, by carefully firing the tasklet directly
from mlx4_add_cq_to_tasklet(), removing four atomic operations per IRQ.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current behaviour of "mirred redirect" action (forward) offload is a bit
odd. For matched packets the action forwards them to the desired
destination, but it also lets the packet duplicates to go the original
way down (bridge, router, etc). That is more like "mirred mirror".
Fix this by using PBS type which behaves exactly like "mirred redirect".
Note that PBS does not support loopback mode.
Fixes: 4cda7d8d70 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using a reader-writer lock in fast path is silly, when we can
instead use RCU or a seqlock.
For mlx4 hwstamp clock, a seqlock is the way to go, removing
two atomic operations and false sharing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Point back the unregister IPv6 mc table to the bc table.
It is done since IPv6 mcast snooping is not supported for Spectrum yet.
Reported-by: Jiri Pirko <jiri@mellanox.com>
Fixes: 71c365bdc4 ("mlxsw: spectrum: Separate bc and mc floods")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Tested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When called by HW offloading drivers, the TC action (e.g
net/sched/act_mirred.c) code uses this_cpu logic, e.g
_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)
per the kernel documention, preemption should be disabled, add that.
Before the fix, when running with CONFIG_PREEMPT set, we get a
BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793
asserion from the TC action (mirred) stats_update callback.
Fixes: aad7e08d39 ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW does not understand ETH_P_ALL. So treat this special case differently
and translate to 0/0 key/mask. That will allow HW to match all ethertypes.
Fixes: 7aa0f5aa90 ("mlxsw: spectrum: Implement TC flower offload")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a function to update mc_disabled from switchdev attr
SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function mlxsw_sp_port_orig_get returns the vport from the physical
port if needed, based on the original device.
This patch addresses the case where the original device is a bridge.
If it is vlan unaware bridge, it returns the matching vport. If it is vlan
aware bridge, there is no matching vport, and it returns the original port.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The decision whether to flood a multicast packet to a port dependent
on three flags: mc_disabled, mc_router_port, mc_flood.
If mc_disabled is on, the port will be flooded according to mc_flood,
otherwise, according to mc_router_port. To accomplish that, add those
flags into the mlxsw_sp_port struct and update the mc flood table
accordingly.
Update mc_router_port by switchdev attribute
SWITCHDEV_ATTR_ID_PORT_MC_ROUTER_PORT.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Break the bm (broadcast-multicast) into two tables, one for broadcast
(and link local multicast that behaves like bc) and one for unknown
multicasts.
Add a bool into mlxsw_sp_port named mc_flood that reflect the value this
port should have in the mc flood table (currently, always 1);
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A user that wants many bridges will use 1.Q bridge which are scalable.
One can have as many 1.Q bridges as vfids.
This patch sets their number to 1k, which is a reasonably large number.
This change is done here because the next patches will add a new flood
table, and without it, it will increase the overall size of the flood
tables dramatically.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, there is a per port flood update function only for the UC
table. Make the function more generic by changing the table type to be
an input.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the flood set function can't operate on only one table, but
sets both uc_flood and mb_flood together.
This patch creates a function that sets the flood state per table.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon the reception of an ENTRY_REPLACE notification, resolve the FIB
node corresponding to the prefix and length and insert the new route
before the first matching entry.
Since the notification also signals the deletion of the replaced route,
delete it from the driver's cache.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a new route is appended, it's placed after existing routes sharing
the same parameters (prefix, length, table ID, TOS and priority).
While the device supports only one route with the same prefix and length
in a single table, it's important to correctly place the appended route
in the driver's cache, as when a route is deleted the next one is
programmed into the device.
Following the reception of an ENTRY_APPEND notification, resolve the
FIB node corresponding to the prefix and length and correctly place the
new entry in its entry list.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the device, routes are indexed in a routing table based on the prefix
and its length. This is in contrast to the kernel's FIB where several
FIB aliases can exist with these parameters being identical. In such
cases, the routes will be sorted by table ID (LOCAL first, then MAIN),
TOS and finally priority (metric).
During lookup, these routes will be evaluated in order. In case the
packet's TOS field is non-zero and a FIB alias with a matching TOS is
found, then it's selected. Otherwise, the lookup defaults to the route
with TOS 0 (if it exists). However, if the requested scope is narrower
than the one found, then the lookup continues.
To best reflect the kernel's datapath we should take the above into
account. Given a prefix and its length, the reflected route will always
be the first one in the FIB alias list. However, if the route has a
non-zero TOS then its action will be converted to trap instead of
forward, since we currently don't support TOS-based routing. If this
turns out to be a real issue, we can add support for that using
policy-based switching.
The route's scope can be effectively ignored as any packet being routed
by the device would've been looked-up using the widest scope (UNIVERSE).
To achieve that we need to do two changes. Firstly, we need to create
another struct (FIB node) that will hold the list of FIB entries sharing
the same prefix and length. This struct will be hashed using these two
parameters.
Secondly, we need to change the route reflection to match the above
logic, so that the first FIB entry in the list will be programmed into
the device while the rest will remain in the driver's cache in case of
subsequent changes.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel resolves the nexthops for a given route using
FIB_LOOKUP_IGNORE_LINKSTATE which means a notification can be sent for a
route with one of its nexthops being LINKDOWN.
In case IGNORE_ROUTES_WITH_LINKDOWN is set for the nexthop netdev, then
we shouldn't reflect the nexthop to the device's table.
Once the nexthop netdev's carrier goes up we'll be notified using NH_ADD
and reflect it to the device.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the last IP address is removed from a netdev, its RIF is deleted.
However, if user didn't first remove neighbours and nexthops using this
interface, then they would still be present in the device's tables.
Therefore, whenever a RIF is deleted, make sure all the neighbours and
nexthops (adjacency entries) using it are removed from the relevant
tables as well.
The action associated with any route using this RIF would be refreshed,
most likely to trap. If the kernel decides to remove the route (f.e.,
because all the nexthops are now DEAD), then an event would be sent,
causing the route to be removed from the device.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a packet hits a multipath route in the device's routing table, a
hash is computed over its headers, which is then used to select the
appropriate nexthop from the device's adjacency table.
There are situations in which the kernel removes a nexthop from a
multipath route (e.g., no carrier) and the device should do the same.
Upon the reception of NH_{ADD,DEL} events, add or remove a nexthop from
the device's adjacency table and refresh all the routes using the
nexthop group. If all the nexthops of a multipath route are invalid,
then any packet hitting the route would be trapped to the CPU for
forwarding.
If all the nexthops are DEAD, then the kernel would remove the route
entirely. On the other hand, if all the nexthops are merely LINKDOWN,
then the kernel would keep the route and forward any incoming packet
using a different route.
While the last case might sound like a problem, it's expected that a
routing daemon running in user space would remove such a route from the
FIB as it's dumped with the DEAD flag set.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device can have one of three actions associated with a route:
1) Remote - packets continue to the adjacency table
2) Local - packets continue to the neighbour table
3) Trap - packets continue to the CPU
The first two actions can also trap packets to the CPU, but they do so
using a different trap ID, which has a lower traffic class and less
allotted bandwidth.
We currently use the third action for both RTN_{LOCAL,BROADCAST} routes
and RTN_UNICAST routes not pointing to the switch ports.
However, packets that merely need to be forwarded by the switch are
likely not control packets and can be therefore scheduled towards the
CPU using a lower traffic class.
Achieve the above by assigning the third action only to local and
broadcast routes and have any other route use either of the first two
actions, based on whether the route is gatewayed or not.
This will also allow us to refresh routes using the local action and
have them trap packets when their RIF is no longer valid following a
NH_DEL event.
One side effect of this patch is that we no longer give special
treatment to multipath routes using both switch and non-switch ports
towards their nexthops. If at least one of the nexthops can be resolved,
then the device will forward the packets instead of trapping them.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous patch introduced a generic function to determine whether a
route should be offloaded or not. Make use of it here.
In the future we're going to add more conditions to this test (e.g.,
whether TOS is non-zero), so it makes sense to centralize it instead of
open coding it in a few places.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently set the RTNH_F_OFFLOAD flag for all routes using remote
action, but this isn't always correct. If none of the nexthops
associated with a gatewayed route can be offloaded into the device, then
any packet hitting it would be trapped to the CPU and forwarded by the
kernel.
Solve this by pushing the setting of the offload flag to after the route
was programmed into the device, thereby allowing us to take all the
parameters into account.
This change will also help us further in the patchset, when we refresh
routes following the reception of NH_{ADD,DEL} events.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nexthop init and de-init functions both have symmetric parts
concerned with the reflection of the neighbour entry into the device's
adjacency table, in case it's used by a gatewayed route.
These sections of code also need to be called when a nexthop is marked
as valid / invalid following NH_{ADD,DEL} events. Break these out into
appropriate functions, so that they could be invoked following the
reception of above events.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the previous changes, the FIB info is embedded in every nexthop
group struct, which in turn is embedded in every FIB entry struct.
We can therefore safely remove the FIB info from the entry struct. This
has the added advantage of making the router-related structs more
generic and suitable for use with IPv6 offloads.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now, the only FIB entries that were associated with a nexthop
group were routes to remote networks where all the nexthop devices had a
valid router interface (RIF). This is in contrast to the FIB code,
where all the routes are associated with a FIB info. The same design
choice needs to be applied to the driver's cache.
Based on the NH_{ADD,DEL} events which will be added later in the
patchset, we need to be able to change the action (forward / trap)
associated with all the routes using the nexthop group. However, if we
can't link between the nexthop and the routes using it, then the above
is impossible.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The next patch is going to generalize the way in which we store routes.
Instead of attaching a nexthop group only to gatewayed routes, one will
be attached to each route, in a similar way to the way the FIB code
stores its routes.
The above means that any function operating on a nexthop group cannot
assume the group represents only gatewayed nexthops. One such function
is the one that refreshes a nexthop group and updates the adjacency
table following nexthop changes.
For a nexthop group that doesn't represent any gateways this function
would essentially be a NOP, but it would be useful if it did update the
action associated with any route using it. This will allow us to later
consolidate code paths when a nexthop changes following NH_{ADD,DEL}
events.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently use the scope of the FIB info to distinguish between a
direct unicast route and a gatewayed one. However, the kernel is
perfectly happy to configure a route with scope UNIVERSE to a directly
connected network.
Instead, we can rely on the first nexthop's scope to check if the route
is gatewayed or not.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Later in the patchset we'll add the NH_{ADD,DEL} events which will let
us know when a nexthop is considered to be dead. Based on these events
we need to be able to add or remove the nexthop from the device's
tables.
Therefore, store the private nexthop structs in a hash table and use the
kernel's fib_nh struct as the key, so that we'll be able to easily find
them when the events are received.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when we're notified about a new RTN_UNICAST route we perform
a lookup on the nexthop group list looking for a group with a matching
configuration to that found in the FIB info. This is quite inefficient.
Instead, we can simply rely on the kernel to consolidate several FIB
configurations into the same FIB info and use the FIB info as the key
for our private nexthop group struct.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we invalidate a nexthop we should also invalidate its neighbour
entry pointer as it might be destroyed later on. This makes the nexthop
de-init function symmetric with its init and also ensures nobody will
try to access the neighbour entry.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No rollback is needed since the chain is in consistent state and
mlxsw_afa_block_destroy() will take care of putting it away. So remove
the one we have now which is wrong. Also move the set of 'finished' flag
to the beginning of the function, because the block is certainly unusable
for future action addition no matter if the function succeeds or not.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 4cda7d8d70 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conflict was an interaction between a bug fix in the
netvsc driver in 'net' and an optimization of the RX path
in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
This series includes some updates to mlx5 core and ethernet driver.
We got one patch from Or to fix some static checker warnings.
2nd patche from Dan came to add the support for 128B cache line
in the HCA, which will configures the hardware to use 128B alignment only
on systems with 128B cache lines, otherwise it will be kept as the current
default of 64B.
From me three patches to support no inline copy on TX on ConnectX-5 and
later HCAs. Starting with two small infrastructure changes and
refactoring patches followed by two patches to add the actual support for
both xmit ndo and XDP xmit routines.
Last patch is a simple fix to return a mistakenly removed pointer from the
SQ structure, which was remove in previous submission of mlx5 4K UAR.
Saeed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJYmQKaAAoJEEg/ir3gV/o+RBMH/RGHNw3yPB2MyWo28V3eabw+
xl/SymiNOUgmq03ULYoc6xJpi9RCya7m/Kyce1M/M1gSz6LXubG2IDw9QsKV8lnc
+5rwHCKjop6MdR3khsgqvWqGiKfQN0+QON5MjlPZB3/4u8qFcjauhfXpiX9naMO5
aB/Sm9zRPwRnsEhy2AwPyZqOxe5boZzHqmZxpthIgPMtqbpBYNkTkooljsj/KqXf
AO3y/mdGykELPF3lIHTE4X9zixx5s6MrlAYX2uGUrAojs2WVIBsq3iXI/J8X9zs/
lg7to15WoMttR66vRZ120U6tx17OMmoxuAp+bmgZumabi/wDAZGSy5ELbH28WlY=
=F+t/
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2017-01-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2017-01-31
This series includes some updates to mlx5 core and ethernet driver.
We got one patch from Or to fix some static checker warnings.
2nd patche from Dan came to add the support for 128B cache line
in the HCA, which will configures the hardware to use 128B alignment only
on systems with 128B cache lines, otherwise it will be kept as the current
default of 64B.
From me three patches to support no inline copy on TX on ConnectX-5 and
later HCAs. Starting with two small infrastructure changes and
refactoring patches followed by two patches to add the actual support for
both xmit ndo and XDP xmit routines.
Last patch is a simple fix to return a mistakenly removed pointer from the
SQ structure, which was remove in previous submission of mlx5 4K UAR.
Saeed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx4 may schedule napi from a workqueue. Afterwards, softirqs are not run
in a deterministic time frame and the following message may be logged:
NOHZ: local_softirq_pending 08
The problem is the same as what was described in commit ec13ee8014
("virtio_net: invoke softirqs after __napi_schedule") and this patch
applies the same fix to mlx4.
Fixes: 07841f9d94 ("net/mlx4_en: Schedule napi when RX buffers allocation fails")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When PSAMPLE is a loadable module, spectrum must not be built-in:
drivers/net/built-in.o: In function `mlxsw_sp_rx_listener_sample_func':
spectrum.c:(.text+0xe357e): undefined reference to `psample_sample_packet'
This adds a Kconfig dependency to enforce usable configurations.
Fixes: 98d0f7b9ac ("mlxsw: spectrum: Add packet sample offloading support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit abeffce ("net/mlx5e: Fix a -Wmaybe-uninitialized warning"), I fixed a
gcc warning for the ipv4 offload handling. Now we get the same warning for the
added ipv6 support:
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:815:40: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized]
We can apply the same workaround here as well.
Fixes: ce99f6b97f ("net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a "||" vs "|" typo here so we test 0x1 instead of 0x6.
Fixes: 1f8176f735 ("net/mlx4_en: Check the enabling pptx/pprx flags in SET_PORT wrapper flow")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We periodically ask the neighbouring system to try and resolve
neighbours that are used for nexthops, but aren't currently resolved.
However, 'nud_state' is protected by the neighbour lock, so we shouldn't
access it without taking it. Instead, we can simply check the
'connected' field of the neighbour entry, which we update upon
NEIGH_UPDATE events.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We only add neighbour entries that are also used for nexthops to
'nexthop_neighs_list', so when iterating over this list there's no need
to check that the entry is indeed used for nexthops.
Remove the redundant check.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now we had two interfaces for neighbour related configuration:
ndo_neigh_{construct,destroy} and NEIGH_UPDATE netevents. The ndos were
used to add and remove neighbours from the driver's cache, whereas the
netevent was used to reflect the neighbours into the device's tables.
However, if the NUD state of a neighbour isn't NUD_VALID or if the
neighbour is dead, then there's really no reason for us to keep it
inside our cache. The only exception to this rule are neighbours that
are also used for nexthops, which we periodically refresh to get them
resolved.
We can therefore eliminate the ndo entry point into the driver and
simplify the code, making it similar to the FIB reflection, which is
based solely on events. This also helps us avoid a locking issue, in
which the RIF cache was traversed without proper locking during
insertion into the neigh entry cache.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 33b1341cd1 ("mlxsw: spectrum_router: Fix handling of
neighbour structure") we no longer use destination IP for neighbour
lookup, so remove it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently associate each neighbour entry with a work item, so it's
not possible to have multiple events queued for the same neighbour
entry. However, this is about to be changed so that the neighbour entry
is only resolved when the work item is scheduled.
The above can result in a mismatch between the kernel's and the device's
neighbour table, unless the associated work items are processed in the
order in which they were submitted.
Do that by migrating the NEIGH_UPDATE work items to be processed in the
ordered workqueue which was recently introduced in mlxsw in commit
a3832b3189 ("mlxsw: core: Create an ordered workqueue for FIB
offload").
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We always use zero delay before queueing a work on the ordered workqueue
('mlxsw_owq'), so use work_struct directly instead of delayable work.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4K Uar series modified the mlx5e driver to use the new bfreg API,
and mistakenly removed the sq->uar_map iomem data path dedicated
pointer, which was meant to be read from xmit path for cache locality
utilization.
Fix that by returning that pointer to the SQ struct.
Fixes: 7309cb4ad71e ("IB/mlx5: Support 4k UAR for libmlx5")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
ConnectX-5 and later HW generations will report min inline mode ==
MLX5_INLINE_MODE_NONE, which means driver is not required to copy packet
headers to inline fields of TX WQE.
Avoid copy to inline segment in XDP TX routine when HW inline mode doesn't
require it.
This will improve CPU utilization and boost XDP TX performance.
Tested with xdp2 single flow:
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
HCA: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
Before: 7.4Mpps
After: 7.8Mpps
Improvement: 5%
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>