Flexible array members should be denoted using [] instead of [0], else
gcc will not warn when they are no longer at the end of a struct.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
in IPoIB case we can't see a VF broadcast address for but
can see for PF
Before:
11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
link/infiniband
80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 MAC 14:80:00:00:66:fe, spoof checking off, link-state disable,
trust off, query_rss off
...
After:
11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
link/infiniband
80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband
80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof
checking off, link-state disable, trust off, query_rss off
v1->v2: add the IFLA_VF_BROADCAST constant
v2->v3: put IFLA_VF_BROADCAST at the end
to avoid KABI breakage and set NLA_REJECT
dev_setlink
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
in the case of IPoIB with SRIOV enabled hardware
ip link show command incorrecly prints
0 instead of a VF hardware address.
Before:
11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
link/infiniband
80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable,
trust off, query_rss off
...
After:
11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
link/infiniband
80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
vf 0 link/infiniband
80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof
checking off, link-state disable, trust off, query_rss off
v1->v2: just copy an address without modifing ifla_vf_mac
v2->v3: update the changelog
Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
Acked-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel says:
====================
mlxsw: Improve IPv6 route insertion rate
Unlike IPv4, an IPv6 multipath route in the kernel is composed from
multiple sibling routes, each representing a single nexthop.
Therefore, an addition of a multipath route with N nexthops translates
to N in-kernel notifications. This is inefficient for device drivers
that need to program the route to the underlying device. Each time a new
nexthop is appended, a new nexthop group needs to be constructed and the
old one deleted.
This patchset improves the situation by sending a single notification
for a multipath route addition / deletion instead of one per-nexthop.
When adding thousands of multipath routes with 16 nexthops, I measured
an improvement of about x10 in the insertion rate.
Patches #1-#3 add a flag that indicates that in-kernel notifications
need to be suppressed and extend the IPv6 FIB notification info with
information about the number of sibling routes that are being notified.
Patches #4-#5 adjust the two current listeners to these notifications to
ignore notifications about IPv6 multipath routes.
Patches #6-#7 adds add / delete notifications for IPv6 multipath routes.
Patches #8-#14 do the same for mlxsw.
Patch #15 finally removes the limitations added in patches #4-#5 and
stops the kernel from sending a notification for each added / deleted
nexthop.
Patch #16 adds test cases.
v2 (David Ahern):
* Remove patch adjusting netdevsim to consume resources for each
fib6_info. Instead, consume one resource for the entire multipath
route
* Remove 'multipath_rt' usage in patch #10
* Remove 'multipath_rt' from 'struct fib6_entry_notifier_info' in patch
#15. The member is only removed in this patch to prevent drivers from
processing multipath routes twice during the series
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Test that the offload indication for unicast routes is correctly set in
different scenarios. IPv4 support will be added in the future.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both listeners - mlxsw and netdevsim - of IPv6 FIB notifications are now
ready to handle IPv6 multipath notifications.
Therefore, stop ignoring such notifications in both drivers and stop
sending notification for each added / deleted nexthop.
v2:
* Remove 'multipath_rt' from 'struct fib6_entry_notifier_info'
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow the driver to create an IPv6 multipath route in one go by passing
an array of sibling routes and iterating over them.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the functions that take care of populating IPv6 nexthop
groups only add / delete a single nexthop.
Prepare them to handle multiple routes in one notification by passing an
array of routes and adding / deleting all of them.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prepare the driver to handle multiple routes in a single notification by
passing an array of routes to the functions that actually add / delete a
route.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, IPv6 replace notifications were only sent from
fib6_add_rt2node(). The function only emitted such notifications if a
route actually replaced another route.
A previous patch added another call site in ip6_route_multipath_add()
from which such notification can be emitted even if a route was merely
added and did not replace another route.
Adjust the driver to take this into account and potentially set the
'replace' flag to 'false' if the notified route did not replace an
existing route.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prepare the driver to process IPv6 multipath notifications by passing an
array of 'struct fib6_info' instead of just one route.
A reference is taken on each sibling route in order to prevent them from
being freed until they are processed by the workqueue.
v2:
* Remove 'multipath_rt' usage
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function mlxsw_sp_router_fib6_event() takes care of preparing the
needed information for the work item that actually inserts the route
into the device.
When processing an IPv6 multipath route, the function will need to
allocate an array to store pointers to all the sibling routes.
Change the function's signature to return an error code and adjust the
single call site.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
No such notifications are sent by the IPv6 code, so remove them.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If all the nexthops of a multipath route are being deleted, send one
notification for the entire route, instead of one per-nexthop.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emit a notification when a multipath routes is added or replace.
Note that unlike the replace notifications sent from fib6_add_rt2node(),
it is possible we are sending a 'FIB_EVENT_ENTRY_REPLACE' when a route
was merely added and not replaced.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a similar fashion to previous patch, have netdevsim ignore IPv6
multipath notifications for now.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 multipath notifications are about to be sent, but mlxsw is not
ready to process them, so ignore them.
The limitation will be lifted by a subsequent patch which will also stop
the kernel from sending a notification for each nexthop.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the IPv6 FIB notifier info with number of sibling routes being
notified.
This will later allow listeners to process one notification for a
multipath routes instead of N, where N is the number of nexthops.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The struct includes a 'skip_notify' flag that indicates if netlink
notifications to user space should be suppressed. As explained in commit
3b1137fe74 ("net: ipv6: Change notifications for multipath add to
RTA_MULTIPATH"), this is useful to suppress per-nexthop RTM_NEWROUTE
notifications when an IPv6 multipath route is added / deleted. Instead,
one notification is sent for the entire multipath route.
This concept is also useful for in-kernel notifications. Sending one
in-kernel notification for the addition / deletion of an IPv6 multipath
route - instead of one per-nexthop - provides a significant increase in
the insertion / deletion rate to underlying devices.
Add a 'skip_notify_kernel' flag to suppress in-kernel notifications.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some fields were not documented. Add documentation.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2019-06-17
This series contains updates to the iavf driver only.
Akeem updates the driver to change how VLAN tags are being populated and
programmed into the hardware by starting from the first member of the
list until the number of allowed VLAN tags is exhausted.
Mitch fixed the variable type since the variable counter starts out
negative and climbs to zero, so use a signed integer instead of
unsigned. Also increase the timeout to avoid erroneous errors. Fixed
the driver to be able to handle when the hardware hands us a null
receive descriptor with no data attached, yet is still valid.
Aleksandr fixes the driver to use GFP_ATOMIC when allocating memory in
atomic context.
Avinash updates the driver to fix a calculation error in virtchnl
regarding the valid length.
Jakub does some refactoring of the commands processing the watchdog
state machine to reduce the length and complexity of the function. Also
decalre watchdog task as delayed work and use a dedicated work queue to
service the driver tasks.
Paul updated the iavf_process_aq_command to call the necessary functions
to be able to clear cloud filter bits that need to be cleared.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Compilation on 32-bit ARM fails after commit 992aa864dc ("mlxsw:
spectrum_ptp: Add implementation for physical hardware clock operations")
because of 64-bit division:
arm-linux-gnueabi-ld:
drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.o: in function
`mlxsw_sp1_ptp_phc_settime': spectrum_ptp.c:(.text+0x39c): undefined
reference to `__aeabi_uldivmod'
Fix by using div_u64().
Fixes: 992aa864dc ("mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations")
Signed-off-by: Shalom Toledo <shalomt@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fred Klassen says:
====================
UDP GSO audit tests
Updates to UDP GSO selftests ot optionally stress test CMSG
subsytem, and report the reliability and performance of both
TX Timestamping and ZEROCOPY messages.
====================
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ensure that failure on any individual test results in an overall
failure of the test script.
Signed-off-by: Fred Klassen <fklassen@appneta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Audit tests count the total number of messages sent and compares
with total number of CMSG received on error queue. Example:
udp gso zerocopy timestamp audit
udp rx: 1599 MB/s 1166414 calls/s
udp tx: 1615 MB/s 27395 calls/s 27395 msg/s
udp rx: 1634 MB/s 1192261 calls/s
udp tx: 1633 MB/s 27699 calls/s 27699 msg/s
udp rx: 1633 MB/s 1191358 calls/s
udp tx: 1631 MB/s 27678 calls/s 27678 msg/s
Summary over 4.000 seconds...
sum udp tx: 1665 MB/s 82772 calls (27590/s) 82772 msgs (27590/s)
Tx Timestamps: 82772 received 0 errors
Zerocopy acks: 82772 received
Errors are thrown if CMSG count does not equal send count,
example:
Summary over 4.000 seconds...
sum tcp tx: 7451 MB/s 493706 calls (123426/s) 493706 msgs (123426/s)
./udpgso_bench_tx: Unexpected number of Zerocopy completions: 493706 expected 493704 received
Also reduce individual test time from 4 to 3 seconds so that
overall test time does not increase significantly.
v3: Enhancements as per Willem de Bruijn <willemb@google.com>
- document -P option for TCP audit
Signed-off-by: Fred Klassen <fklassen@appneta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This enhancement adds options that facilitate load testing with
additional TX CMSG options, and to optionally print results of
various send CMSG operations.
These options are especially useful in isolating situations
where error-queue messages are lost when combined with other
CMSG operations (e.g. SO_ZEROCOPY).
New options:
-a - count all CMSG messages and match to sent messages
-T - add TX CMSG that requests TX software timestamps
-H - similar to -T except request TX hardware timestamps
-P - call poll() before reading error queue
-v - print detailed results
v2: Enhancements as per Willem de Bruijn <willemb@google.com>
- Updated control and buffer parameters for recvmsg
- poll() parameter cleanup
- fail on bad audit results
- remove TOS options
- improved reporting
v3: Enhancements as per Willem de Bruijn <willemb@google.com>
- add SOF_TIMESTAMPING_OPT_TSONLY to eliminate MSG_TRUNC
- general code cleanup
Signed-off-by: Fred Klassen <fklassen@appneta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull vfs fixes from Al Viro:
"MS_MOVE regression fix + breakage in fsmount(2) (also introduced in
this cycle, along with fsmount(2) itself).
I'm still digging through the piles of mail, so there might be more
fixes to follow, but these two are obvious and self-contained, so
there's no point delaying those..."
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/namespace: fix unprivileged mount propagation
vfs: fsmount: add missing mntget()
Florian Westphal says:
====================
net: ipv4: remove erroneous advancement of list pointer
Tariq reported a soft lockup on net-next that Mellanox was able to
bisect to 2638eb8b50 ("net: ipv4: provide __rcu annotation for ifa_list").
While reviewing above patch I found a regression when addresses have a
lifetime specified.
Second patch extends rtnetlink.sh to trigger crash
(without first patch applied).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This exercises kernel code path that deal with addresses that have
a limited lifetime.
Without previous fix, this triggers following crash on net-next:
BUG: KASAN: null-ptr-deref in check_lifetime+0x403/0x670
Read of size 8 at addr 0000000000000010 by task kworker [..]
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Causes crash when lifetime expires on an adress as garbage is
dereferenced soon after.
This used to look like this:
for (ifap = &ifa->ifa_dev->ifa_list;
*ifap != NULL; ifap = &(*ifap)->ifa_next) {
if (*ifap == ifa) ...
but this was changed to:
struct in_ifaddr *tmp;
ifap = &ifa->ifa_dev->ifa_list;
tmp = rtnl_dereference(*ifap);
while (tmp) {
tmp = rtnl_dereference(tmp->ifa_next); // Bogus
if (rtnl_dereference(*ifap) == ifa) {
...
ifap = &tmp->ifa_next; // Can be NULL
tmp = rtnl_dereference(*ifap); // Dereference
}
}
Remove the bogus assigment/list entry skip.
Fixes: 2638eb8b50 ("net: ipv4: provide __rcu annotation for ifa_list")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to a reversed dependency, it is possible to build
the lower ptp driver as a loadable module and the actual
driver using it as built-in, causing a link error:
drivers/net/dsa/sja1105/sja1105_spi.o: In function `sja1105_static_config_upload':
sja1105_spi.c:(.text+0x6f0): undefined reference to `sja1105_ptp_reset'
drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x2d4): undefined reference to `sja1105et_ptp_cmd'
drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x604): undefined reference to `sja1105pqrs_ptp_cmd'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_remove':
sja1105_main.c:(.text+0x8d4): undefined reference to `sja1105_ptp_clock_unregister'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_rxtstamp_work':
sja1105_main.c:(.text+0x964): undefined reference to `sja1105_tstamp_reconstruct'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_setup':
sja1105_main.c:(.text+0xb7c): undefined reference to `sja1105_ptp_clock_register'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_port_deferred_xmit':
sja1105_main.c:(.text+0x1fa0): undefined reference to `sja1105_ptpegr_ts_poll'
sja1105_main.c:(.text+0x1fc4): undefined reference to `sja1105_tstamp_reconstruct'
drivers/net/dsa/sja1105/sja1105_main.o:(.rodata+0x5b0): undefined reference to `sja1105_get_ts_info'
Change the Makefile logic to always build the ptp module
the same way as the rest. Another option would be to
just add it to the same module and remove the exports,
but I don't know if there was a good reason to keep them
separate.
Fixes: bb77f36ac2 ("net: dsa: sja1105: Add support for the PTP clock")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When building without CONFIG_OF, we get a harmless build warning:
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_phy_setup':
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:973:22: error: unused variable 'node' [-Werror=unused-variable]
struct device_node *node = priv->plat->phy_node;
Reword it so we always use the local variable, by making it the
fwnode pointer instead of the device_node.
Fixes: 74371272f9 ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"Lots of bug fixes here:
1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer.
2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John
Crispin.
3) Use after free in psock backlog workqueue, from John Fastabend.
4) Fix source port matching in fdb peer flow rule of mlx5, from Raed
Salem.
5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet.
6) Network header needs to be set for packet redirect in nfp, from
John Hurley.
7) Fix udp zerocopy refcnt, from Willem de Bruijn.
8) Don't assume linear buffers in vxlan and geneve error handlers,
from Stefano Brivio.
9) Fix TOS matching in mlxsw, from Jiri Pirko.
10) More SCTP cookie memory leak fixes, from Neil Horman.
11) Fix VLAN filtering in rtl8366, from Linus Walluij.
12) Various TCP SACK payload size and fragmentation memory limit fixes
from Eric Dumazet.
13) Use after free in pneigh_get_next(), also from Eric Dumazet.
14) LAPB control block leak fix from Jeremy Sowden"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits)
lapb: fixed leak of control-blocks.
tipc: purge deferredq list for each grp member in tipc_group_delete
ax25: fix inconsistent lock state in ax25_destroy_timer
neigh: fix use-after-free read in pneigh_get_next
tcp: fix compile error if !CONFIG_SYSCTL
hv_sock: Suppress bogus "may be used uninitialized" warnings
be2net: Fix number of Rx queues used for flow hashing
net: handle 802.1P vlan 0 packets properly
tcp: enforce tcp_min_snd_mss in tcp_mtu_probing()
tcp: add tcp_min_snd_mss sysctl
tcp: tcp_fragment() should apply sane memory limits
tcp: limit payload size of sacked skbs
Revert "net: phylink: set the autoneg state in phylink_phy_change"
bpf: fix nested bpf tracepoints with per-cpu data
bpf: Fix out of bounds memory access in bpf_sk_storage
vsock/virtio: set SOCK_DONE on peer shutdown
net: dsa: rtl8366: Fix up VLAN filtering
net: phylink: set the autoneg state in phylink_phy_change
net: add high_order_alloc_disable sysctl/static key
tcp: add tcp_tx_skb_cache sysctl
...
In some circumstances, the hardware can hand us a null receive
descriptor, with no data attached but otherwise valid. Unfortunately,
the driver was ill-equipped to handle such an event, and would stop
processing packets at that point.
To fix this, use the Descriptor Done bit instead of the size to
determine whether or not a descriptor is ready to be processed. Add some
checks to allow for unused buffers.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add call to iavf_add_cloud_filter and iavf_del_cloud_filter from
iavf_process_aq_command to clear aq_required
IAVF_FLAG_AQ_ADD_CLOUD_FILTER and IAVF_FLAG_AQ_DEL_CLOUD_FILTER bits.
aq_required IAVF_FLAG_AQ_DEL_CLOUD_FILTER bit is being set in
iavf_down and iavf_delete_clsflower, and are never cleared.
aq_required IAVF_FLAG_AQ_ADD_CLOUD_FILTER bit is being set in
iavf_handle_reset and iavf_configure_clsflower, and are never
cleared.
Since the aq_required is not zero, iavf_watchdog_task is setting the
queue_delayed_work to 20 msec instead of the longer delay.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cleanup of init state machine, move state specific
code to separate functions and rewrite the
iavf_init_task() function.
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Refactor the watchdog state machine implementation.
Add the additional state __IAVF_COMM_FAILED to process
the PF communication fails. Prepare the watchdog state machine
to integrate with init state machine.
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Remove the watchdog timer, instead declare watchdog task
as delayed work and use dedicated workqueue to service driver
tasks. The dedicated driver workqueue iavf_wq is common
for all driver instances.
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move the commands processing outside the watchdog_task()
function. This reduce length and complexity of the function
which is mainly designed to process the watchdog state machine.
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There was a calculation error in virtchnl regarding the valid
length which was fixed recently and a corresponding change needs
to go into the code while we enable ADq.
Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
iavf_add_vlan() is being called in atomic context
so kzalloc() needs GFP_ATOMIC. This patch fixes it.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On some hardware/driver/architecture combinations, it may take longer
than 200msec for all close operations to be completed, causing a
spurious error message to be logged.
Increase the timeout value to 500msec to avoid this erroneous error.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The counter variable in iavf_clean_tx_irq starts out negative and climbs
to 0. So allocating it as u16 is actually a really bad idea that just
happens to work because the value underflows and overflows consistently
on most architectures.
Replace the u16 with an int so signed math works as expected.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes how VLAN tag are being populated and programmed into
the HW - Instead of start adding VF VLAN tag from the last member of the
element list, start from the first member of the list, until number of
allowed VLAN tags is exhausted in the HW.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When propagating mounts across mount namespaces owned by different user
namespaces it is not possible anymore to move or umount the mount in the
less privileged mount namespace.
Here is a reproducer:
sudo mount -t tmpfs tmpfs /mnt
sudo --make-rshared /mnt
# create unprivileged user + mount namespace and preserve propagation
unshare -U -m --map-root --propagation=unchanged
# now change back to the original mount namespace in another terminal:
sudo mkdir /mnt/aaa
sudo mount -t tmpfs tmpfs /mnt/aaa
# now in the unprivileged user + mount namespace
mount --move /mnt/aaa /opt
Unfortunately, this is a pretty big deal for userspace since this is
e.g. used to inject mounts into running unprivileged containers.
So this regression really needs to go away rather quickly.
The problem is that a recent change falsely locked the root of the newly
added mounts by setting MNT_LOCKED. Fix this by only locking the mounts
on copy_mnt_ns() and not when adding a new mount.
Fixes: 3bd045cc9c ("separate copying and locking mount tree on cross-userns copies")
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Tested-by: Christian Brauner <christian@brauner.io>
Acked-by: Christian Brauner <christian@brauner.io>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
sys_fsmount() needs to take a reference to the new mount when adding it
to the anonymous mount namespace. Otherwise the filesystem can be
unmounted while it's still in use, as found by syzkaller.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: syzbot+99de05d099a170867f22@syzkaller.appspotmail.com
Reported-by: syzbot+7008b8b8ba7df475fdc8@syzkaller.appspotmail.com
Fixes: 93766fbd26 ("vfs: syscall: Add fsmount() to create a mount for a superblock")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Currently user is unable to delete the filter. See following example:
$ tc filter add dev ens16np1 ingress pref 1 handle 1 matchall action drop
$ tc filter show dev ens16np1 ingress
filter protocol all pref 1 matchall chain 0
filter protocol all pref 1 matchall chain 0 handle 0x1
in_hw
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1
$ tc filter del dev ens16np1 ingress pref 1 handle 1 matchall action drop
RTNETLINK answers: Operation not supported
Implement tcf_proto_ops->delete() op and allow user to delete the filter.
Reported-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pointer ae_dev is null checked however, prior to that it is dereferenced
when assigned pointer ops. Fix this by assigning pointer ops after ae_dev
has been null checked.
Addresses-Coverity: ("Dereference before null check")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kevin Darbyshire-Bryant says:
====================
net: sched: act_ctinfo: fixes
This is first attempt at sending a small series. Order is important
because one bug (policy validation) prevents us from encountering the
more important 'OOPS' generating bug in action creation. Fix the OOPS
first.
Confession time: Until very recently, development of this module has
been done on 'net-next' tree to 'clean compile' level with run-time
testing on backports to 4.14 & 4.19 kernels under openwrt. It turns out
that sched: action: based code has been under more active change than I
realised.
During the back & forward porting during development & testing, the
critical ACT_P_CREATED return code got missed despite being in the 4.14
& 4.19 backports. I have now gone through the init functions, using
act_csum as reference with a fine toothed comb and am happy they do the
same things.
This issue hadn't been caught till now due to another issue caused by
new strict nla_parse_nested function failing parsing validation before
action creation.
Thanks to Marcelo Leitner <marcelo.leitner@gmail.com> for flagging
extack deficiency (fixed in 733f0766c3 sched: act_ctinfo: use extack
error reporting) which led to b424e432e7 ("netlink: add validation of
NLA_F_NESTED flag") and 8cb081746c ("netlink: make validation more
configurable for future strictness”) which led to the policy validation
fix, which then led to the action creation fix both contained in this
series.
If I ever get to a developer conference please feel free to
tar/feather/apply cone of shame.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>