pktgen create threads for all online cpus and bond these threads to
relevant cpu repecivtily. when this thread firstly be woken up, it
will compare cpu currently running with the cpu specified at the time
of creation and if the two cpus are not equal, BUG_ON() will take effect
causing panic on the system.
Notice that these threads could be migrated to other cpus before start
running because of the cpu hotplug after these threads have created. so the
BUG_ON() used here seems unreasonable and we can replace it with WARN_ON()
to just printf a warning other than panic the system.
Signed-off-by: Di Zhu <zhudi21@huawei.com>
Link: https://lore.kernel.org/r/20210125124229.19334-1-zhudi21@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For IPv4, default route is learned via DHCPv4 and user is allowed to change
metric using config etc/network/interfaces. But for IPv6, default route can
be learned via RA, for which, currently a fixed metric value 1024 is used.
Ideally, user should be able to configure metric on default route for IPv6
similar to IPv4. This patch adds sysctl for the same.
Logs:
For IPv4:
Config in etc/network/interfaces:
auto eth0
iface eth0 inet dhcp
metric 4261413864
IPv4 Kernel Route Table:
$ ip route list
default via 172.21.47.1 dev eth0 metric 4261413864
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.]
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K 0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m
i.e. User can prefer Default Router learned via Routing Protocol in IPv4.
Similar behavior is not possible for IPv6, without this fix.
After fix [for IPv6]:
sudo sysctl -w net.ipv6.conf.eth0.net.ipv6.conf.eth0.ra_defrtr_metric=1996489705
IP monitor: [When IPv6 RA is received]
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 pref high
Kernel IPv6 routing table
$ ip -6 route list
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high
FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.]
Codes: K - kernel route, C - connected, S - static, R - RIPng,
O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
> - selected route, * - FIB route
S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m
If the metric is changed later, the effect will be seen only when next IPv6
RA is received, because the default route must be fully controlled by RA msg.
Below metric is changed from 1996489705 to 1996489704.
$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704
net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704
IP monitor:
[On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric]
Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high
Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Emil Renner Berthing says:
====================
net: usbnet: convert to new tasklet API
This converts the usbnet driver to use the new tasklet API introduced in
commit 12cc923f1c ("tasklet: Introduce new initialization API")
It is split into two commits for ease of reviewing.
====================
Link: https://lore.kernel.org/r/20210123173221.5855-1-esmil@mailme.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This converts the driver to use the new tasklet API introduced in
commit 12cc923f1c ("tasklet: Introduce new initialization API")
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Initialize tasklet using tasklet_init() rather than open-coding it.
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rasmus Villemoes says:
====================
net: dsa: mv88e6xxx: remove some 6250-specific methods
v2:
- resend now that the bug-fix patch (87fe04367d, "net: dsa:
mv88e6xxx: also read STU state in mv88e6250_g1_vtu_getnext") is in
net and also merged to net-next.
- include various tags in patch 1.
- add second similar patch for loadpurge.
====================
Link: https://lore.kernel.org/r/20210125150449.115032-1-rasmus.villemoes@prevas.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Apart from the mask used to get the high bits of the fid,
mv88e6185_g1_vtu_loadpurge() and mv88e6250_g1_vtu_loadpurge() are
identical. Since the entry->fid passed in should never exceed the
number of databases, we can simply use the former as-is as replacement
for the latter.
Suggested-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mv88e6250_g1_vtu_getnext is almost identical to
mv88e6185_g1_vtu_getnext, except for the 6250 only having 64 databases
instead of 256. We can reduce code duplication by simply masking off
the extra two garbage bits when assembling the fid from VTU op [3:0]
and [11:8].
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add selftests for kernel behavior with regard to various classes of
unallocated/reserved IPv4 addresses, checking whether or not these
addresses can be assigned as unicast addresses on links and used in
routing.
Expect the current kernel behavior at the time of this patch. That is:
* 0/8 and 240/4 may be used as unicast, with the exceptions of 0.0.0.0
and 255.255.255.255;
* the lowest address in a subnet may only be used as a broadcast address;
* 127/8 may not be used as unicast (the route_localnet option, which is
disabled by default, still leaves it treated slightly specially);
* 224/4 may not be used as unicast.
Signed-off-by: Seth David Schoen <schoen@loyalty.org>
Suggested-by: John Gilmore <gnu@toad.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210126040834.GR24989@frotz.zork.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When TLS is a module, the built-in bonding driver may cause a
link error:
x86_64-linux-ld: drivers/net/bonding/bond_main.o: in function `bond_start_xmit':
bond_main.c:(.text+0xc451): undefined reference to `tls_validate_xmit_skb'
Add a dependency to avoid the problem.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210125113209.2248522-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix the messed up indentation in br_multicast_eht_set_entry_lookup().
Fixes: baa74d39ca ("net: bridge: multicast: add EHT source set handling functions")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20210125082040.13022-1-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan says:
====================
bnxt_en: Error recovery improvements.
This series contains a number of improvements in the area of error
recovery. Most error recovery scenarios are tightly coordinated with
the firmware. A number of patches add retry logic to establish
connection with the firmware if there are indications that the
firmware is still alive and will likely transition back to the
normal state. Some patches speed up the recovery process and make
it more reliable. There are some cleanup patches as well.
====================
Link: https://lore.kernel.org/r/1611558501-11022-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Once the firmware fatal condition is detected, we should cease
comminication with the firmware and hardware quickly even if there
are many completion entries in the completion rings. This will
speed up the recovery process and prevent further I/Os that may
cause further exceptions.
Do not proceed in the NAPI poll function if fatal condition is
detected. Call napi_complete() and return without arming interrupts.
Cleanup of all rings and reset are imminent.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Combine the three netdev_warn() calls into a single call, printed at
the NETIF_MSG_HW log level.
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the event of a fatal firmware error, firmware will notify the host
and then it will proceed to do core reset when it sees that all functions
have disabled Bus Master. To prevent Master Aborts and other hard
errors, we need to quiesce all activities in addition to disabling Bus
Master before the chip goes into core reset.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the event of a fatal firmware error, we want to disable IRQ early
in the recovery sequence. This change will allow it to be called
safely again as part of the normal shutdown sequence.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Up until now, we don't need to keep track of this state because NAPI
is always enabled once and disabled once during bring up and shutdown.
For better error recovery in subsequent patches, we want to quiesce
the device earlier during fatal error conditions. The normal shutdown
sequence will disable NAPI again and the flag will prevent disabling
NAPI twice.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This code to check if we have reached the maximum wait time after
firmware reset is used multiple times. Add a helper function to
do this.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Firmware may be in the middle of reset when the driver tries to do ifup.
In that case, firmware will return a special error code and the driver
will retry 10 times with 50 msecs delay after each retry.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Drawing a hard line on aborted resets prevents a NIC open in
some scenarios that may otherwise be recoverable. For example,
if a firmware recovery happened while a PF was down and an
attempt was made to bring up an associated VF in this state,
then it was impossible to ever bring up this VF without a
rebind or reload of its driver.
Attempt to reinitialize the firmware when an aborted reset (or
failed init after a reset) is discovered during open - it may
succeed. Also take care to allow the user to retry opening the
NIC even after an aborted reset.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Firmware is capable of generating asynchronous debug notifications.
The event data is opaque to the driver and is simply logged. Debug
notifications can be enabled by turning on hardware status messages
using the ethtool msglvl interface.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The timeout period for firmware messages is passed to the driver
from the firmware in the response of the first command. This
timeout period is multiplied by a factor for certain long
running commands such as NVRAM commands. In some cases, the
timeout period can become really long and it can cause hung task
warnings if firmware has crashed or is not responding. To avoid
such long delays, cap all firmware commands to a max timeout value
of 40 seconds.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If firmware is in reset or in bad state, it won't be able to return
VPD data. Move bnxt_vpd_read_info() until after bnxt_fw_init_one_p1()
successfully returns. By then we would have established proper
communications with the firmware.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The first HWRM_VER_GET message to firmware during probe may timeout if
firmware is under reset. This can happen during hot-plug for example.
On P5 and newer chips, we can check if firmware is in the boot stage by
reading a status register. Retry 5 times if the status register shows
that firmware is not ready and not in error state.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add missing support for handling NO_MASTER crashes while ports are
administratively down (ifdown). On some SoC platforms, the driver
needs to assist the firmware to recover from a crash via OP-TEE.
This is performed in a similar fashion to what is done during driver
probe.
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Define macros to check for the various states in the lower 16 bits of
the health register. Replace the C code that checks for these values
with the newly defined macros.
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Updates to backing store APIs, QoS profiles, and push buffer initial
index support.
Since the new HWRM_FUNC_BACKING_STORE_CFG message size has increased,
we need to add some compat. logic to fall back to the smaller legacy
size if firmware cannot accept the larger message size. The new fields
added to the structure are not used yet.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
DENG Qingfang says:
====================
dsa: add MT7530 GPIO support
MT7530's LED controller can be used as GPIO controller.
Add support for it.
====================
Link: https://lore.kernel.org/r/20210125044322.6280-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
MT7530's LED controller can drive up to 15 LED/GPIOs.
Add support for GPIO control and allow users to use its GPIOs by
setting gpio-controller property in device tree.
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add device tree binding to support MT7530 GPIO controller.
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
MT762x HW, except for MT7628, supports frame length up to 2048
(maximum length on GDM), so allow setting MTU up to 2030.
Also set the default frame length to the hardware default 1518.
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210125042046.5599-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
coccicheck suggested using PTR_ERR_OR_ZERO() and looking at the code.
Fix the following coccicheck warnings:
./net/bridge/br_multicast.c:1295:7-13: WARNING: PTR_ERR_OR_ZERO can be
used.
Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Zhong <abaci-bugfix@linux.alibaba.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/1611542381-91178-1-git-send-email-abaci-bugfix@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support SPI and sequence number fields of
ESP/AH header to be hashed for RSS. By default
ESP/AH fields are not considered for RSS and
needs to be set explicitly as below:
ethtool -U eth0 rx-flow-hash esp4 sdfn
or
ethtool -U eth0 rx-flow-hash ah4 sdfn
or
ethtool -U eth0 rx-flow-hash esp6 sdfn
or
ethtool -U eth0 rx-flow-hash ah6 sdfn
To disable hashing of ESP fields:
ethtool -U eth0 rx-flow-hash esp4 sd
or
ethtool -U eth0 rx-flow-hash ah4 sd
or
ethtool -U eth0 rx-flow-hash esp6 sd
or
ethtool -U eth0 rx-flow-hash ah6 sd
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Link: https://lore.kernel.org/r/1611378552-13288-1-git-send-email-sundeep.lkml@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When working on the PCI VPD code I also tested with a Broadcom BCM95719
card. tg3 uses internal NVRAM access with this card, so I forced it to
PCI VPD mode for testing. PCI VPD access fails
(i + PCI_VPD_LRDT_TAG_SIZE + j > len) because only TG3_NVM_VPD_LEN (256)
bytes are read, but PCI VPD has 400 bytes on this card.
So add a constant TG3_NVM_PCI_VPD_MAX_LEN that defines the maximum
PCI VPD size. The actual VPD size is returned by pci_read_vpd().
In addition it's not worth looping over pci_read_vpd(). If we miss the
125ms timeout per VPD dword read then definitely something is wrong,
and if the tg3 module loading is killed then there's also not much
benefit in retrying the VPD read.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/cb9e9113-0861-3904-87e0-d4c4ab3c8860@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The switch has support for the 802.1Qbv Time Aware Shaper (TAS). Traffic
schedules may be configured individually on each front port. Each port has eight
egress queues. The traffic is mapped to a traffic class respectively via the PCP
field of a VLAN tagged frame.
The TAPRIO Qdisc already implements that. Therefore, this interface can simply
be reused. Add .port_setup_tc() accordingly.
The activation of a schedule on a port is split into two parts:
* Programming the necessary gate control list (GCL)
* Setup delayed work for starting the schedule
The hardware supports starting a schedule up to eight seconds in the future. The
TAPRIO interface provides an absolute base time. Therefore, periodic delayed
work is leveraged to check whether a schedule may be started or not.
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The 'wwan' devtype is meant for devices that require additional
configuration to be used, like WWAN specific APN setup over AT/QMI
commands, rmnet link creation, etc. This is the case for MHI (Modem
host Interface) netdev which targets modem/WWAN endpoints.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/1611328554-1414-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexander Lobakin says:
====================
udp: allow forwarding of plain (non-fraglisted) UDP GRO packets
This series allows to form UDP GRO packets in cases without sockets
(for forwarding). To not change the current datapath, this is
performed only when the new corresponding netdev feature is enabled
via Ethtool (and fraglisted GRO is disabled).
Prior to this point, only fraglisted UDP GRO was available. Plain UDP
GRO shows better forwarding performance when a target NIC is capable
of GSO UDP offload.
Since v3 [2]:
- rename introduced netdev feature to reflect that it targets
forwarding and don't touch fraglisted GRO at all (Willem de Bruijn).
Since v2 [1]:
- convert to a series;
- new: add new netdev_feature to explicitly enable/disable UDP GRO
when there is no socket, defaults to off (Paolo Abeni).
Since v1 [0]:
- drop redundant 'if (sk)' check (Alexander Duyck);
- add a ref in the commit message to one more commit that was
an important step for UDP GRO forwarding.
[0] https://lore.kernel.org/netdev/20210112211536.261172-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210113103232.4761-1-alobakin@pm.me
[2] https://lore.kernel.org/netdev/20210118193122.87271-1-alobakin@pm.me
====================
Link: https://lore.kernel.org/r/20210122181909.36340-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 9fd1ff5d2a ("udp: Support UDP fraglist GRO/GSO.") actually
not only added a support for fraglisted UDP GRO, but also tweaked
some logics the way that non-fraglisted UDP GRO started to work for
forwarding too.
Commit 2e4ef10f58 ("net: add GSO UDP L4 and GSO fraglists to the
list of software-backed types") added GSO UDP L4 to the list of
software GSO to allow virtual netdevs to forward them as is up to
the real drivers.
Tests showed that currently forwarding and NATing of plain UDP GRO
packets are performed fully correctly, regardless if the target
netdevice has a support for hardware/driver GSO UDP L4 or not.
Add the last element and allow to form plain UDP GRO packets if
we are on forwarding path, and the new NETIF_F_GRO_UDP_FWD is
enabled on a receiving netdevice.
If both NETIF_F_GRO_FRAGLIST and NETIF_F_GRO_UDP_FWD are set,
fraglisted GRO takes precedence. This keeps the current behaviour
and is generally more optimal for now, as the number of NICs with
hardware USO offload is relatively small.
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduce a new netdev feature, NETIF_F_GRO_UDP_FWD, to allow user
to turn UDP GRO on and off for forwarding.
Defaults to off to not change current datapath.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Richard Cochran says:
====================
Remove unneeded PHY time stamping option.
The NETWORK_PHY_TIMESTAMPING configuration option adds additional
checks into the networking hot path, and it is only needed by two
rather esoteric devices, namely the TI DP83640 PHYTER and the ZHAW
InES 1588 IP core. Very few end users have these devices, and those
that do have them are building specialized embedded systems.
Unfortunately two unrelated drivers depend on this option, and two
defconfigs enable it. It is probably my fault for not paying enough
attention in reviews.
This series corrects the gratuitous use of NETWORK_PHY_TIMESTAMPING.
====================
Link: https://lore.kernel.org/r/cover.1611198584.git.richardcochran@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The mvpp2 is an Ethernet driver, and it implements MAC style time
stamping of PTP frames. It has no need of the expensive option to
enable PHY time stamping. Remove the incorrect dependency.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The mv88e6xxx is a DSA driver, and it implements DSA style time
stamping of PTP frames. It has no need of the expensive option to
enable PHY time stamping. Remove the bogus dependency.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Brandon Streiff <brandon.streiff@ni.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder says:
====================
net: ipa: NAPI poll updates
While reviewing the IPA NAPI polling code in detail I found two
problems. This series fixes those, and implements a few other
improvements to this part of the code.
The first two patches are minor bug fixes that avoid extra passes
through the poll function. The third simplifies code inside the
polling loop a bit.
The last two update how interrupts are disabled; previously it was
possible for another I/O completion condition to be recorded before
NAPI got scheduled.
====================
Link: https://lore.kernel.org/r/20210121114821.26495-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently in gsi_isr_ieob(), event ring IEOB interrupts are disabled
one at a time. The loop disables the IEOB interrupt for all event
rings represented in the event mask. Instead, just disable them all
at once.
Disable them all *before* clearing the interrupt condition. This
guarantees we'll schedule NAPI for each event once, before another
IEOB interrupt could be signaled.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rename gsi_irq_ieob_disable() to be gsi_irq_ieob_disable_one().
Introduce a new function gsi_irq_ieob_disable() that takes a mask of
events to disable rather than a single event id. This will be used
in the next patch.
Rename gsi_irq_ieob_enable() to be gsi_irq_ieob_enable_one() to be
consistent.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>