Add a new rtnl flag (RTNL_FLAG_BULK_DEL_SUPPORTED) which is used to
verify that the delete operation allows bulk object deletion. Also emit
a warning if anyone tries to set it for non-delete kind.
Suggested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new delete request modifier called NLM_F_BULK which, when
supported, would cause the request to delete multiple objects. The flag
is a convenient way to signal that a multiple delete operation is
requested which can be gradually added to different delete requests. In
order to make sure older kernels will error out if the operation is not
supported instead of doing something unintended we have to break a
required condition when implementing support for this flag, f.e. for
neighbors we will omit the mandatory mac address attribute.
Initially it will be used to add flush with filtering support for bridge
fdbs, but it also opens the door to add similar support to others.
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a helper which extracts the msg type's kind using the kind mask (0x3).
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add rtnl kind names instead of using raw values. We'll need to
check for DEL kind later to validate bulk flag support.
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Grygorii Strashko says:
====================
net: ethernet: ti: enable bc/mc storm prevention support
This series first adds supports for the ALE feature to rate limit number ingress
broadcast(BC)/multicast(MC) packets per/sec which main purpose is BC/MC storm
prevention.
And then enables corresponding support for ingress broadcast(BC)/multicast(MC)
packets rate limiting for TI CPSW switchdev and AM65x/J221E CPSW_NUSS drivers by
implementing HW offload for simple tc-flower with policer action with matches
on dst_mac/mask:
- ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate
limiting (exact match)
- 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC
packets rate limiting
The CPSW supports MC/BC packets rate limiting in packets/sec and affects
all ingress MC/BC packets and serves as BC/MC storm prevention feature.
Examples:
- BC rate limit to 1000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police pkts_rate 1000 pkts_burst 1 drop
- MC rate limit to 20000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
action police rate pkts_rate 20000 pkts_burst 1 drop
pkts_burst - not used.
The solution inspired patch from Vladimir Oltean [1].
Changes in v3:
- comments applied
- policer validation added
Changes in v2:
- switch to packet-per-second policing introduced by
commit 2ffe039528 ("net/sched: act_police: add support for packet-per-second policing") [2]
v2: https://patchwork.kernel.org/project/netdevbpf/cover/20211101170122.19160-1-grygorii.strashko@ti.com/
v1: https://patchwork.kernel.org/project/netdevbpf/cover/20201114035654.32658-1-grygorii.strashko@ti.com/
[1] https://lore.kernel.org/patchwork/patch/1217254/
[2] https://patchwork.kernel.org/project/netdevbpf/cover/20210312140831.23346-1-simon.horman@netronome.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables support for ingress broadcast(BC)/multicast(MC) packets
rate limiting in TI CPSW switchdev driver (the corresponding ALE support
was added in previous patch) by implementing HW offload for simple
tc-flower with policer action with matches on dst_mac:
- ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate
limiting (exact match)
- 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC
packets rate limiting
The CPSW supports MC/BC packets rate limiting in packets/sec and affects
all ingress MC/BC packets and serves as BC/MC storm prevention feature.
Examples:
- BC rate limit to 1000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police pkts_rate 1000 pkts_burst 1 drop
- MC rate limit to 20000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
action police rate pkts_rate 10000 pkts_burst 1 drop
pkts_burst - not used.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables support for ingress broadcast(BC)/multicast(MC) packets
rate limiting in TI AM65x CPSW driver (the corresponding ALE support was
added in previous patch) by implementing HW offload for simple tc-flower
with policer action with matches on dst_mac/mask:
- ff:ff:ff:ff:ff:ff/ff:ff:ff:ff:ff:ff has to be used for BC packets rate
limiting (exact match)
- 01:00:00:00:00:00/01:00:00:00:00:00 fixed value has to be used for MC
packets rate limiting
The CPSW supports MC/BC packets rate limiting in packets/sec and affects
all ingress MC/BC packets and serves as BC/MC storm prevention feature.
Examples:
- BC rate limit to 1000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police pkts_rate 1000 pkts_burst 1 drop
- MC rate limit to 20000pps:
tc qdisc add dev eth0 clsact
tc filter add dev eth0 ingress flower skip_sw dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 \
action police rate pkts_rate 20000 pkts_burst 1 drop
pkts_burst - not used.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CPSW ALE supports feature to rate limit number ingress
broadcast(BC)/multicast(MC) packets per/sec which main purpose is BC/MC
storm prevention.
The ALE BC/MC packet rate limit configuration consist of two parts:
- global
ALE_CONTROL.ENABLE_RATE_LIMIT bit 0 which enables rate limiting globally
ALE_PRESCALE.PRESCALE specifies rate limiting interval
- per-port
ALE_PORTCTLx.BCASTMCAST/_LIMIT specifies number of BC/MC packets allowed
per rate limiting interval.
When port.BCASTMCAST/_LIMIT is 0 rate limiting is disabled for Port.
When BC/MC packet rate limiting is enabled the number of allowed packets
per/sec is defined as:
number_of_packets/sec = (Fclk / ALE_PRESCALE) * port.BCASTMCAST/_LIMIT
Hence, the ALE_PRESCALE configuration is common for all ports the 1ms
interval is selected and configured during ALE initialization while
port.BCAST/MCAST_LIMIT are configured per-port.
This allows to achieve:
- min number_of_packets = 1000 when port.BCAST/MCAST_LIMIT = 1
- max number_of_packets = 1000 * 255 = 255000
when port.BCAST/MCAST_LIMIT = 0xFF
The ALE_CONTROL.ENABLE_RATE_LIMIT can also be enabled once during ALE
initialization as rate limiting enabled by non zero port.BCASTMCAST/_LIMIT
values.
This patch implements above logic in ALE and adds new ALE APIs
cpsw_ale_rx_ratelimit_bc();
cpsw_ale_rx_ratelimit_mc();
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As there are now no users of phylink_helper_basex_speed(), we can
remove this obsolete functionality.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The __mtk_foe_entry_clear() function frees "entry" so we have to use
the _safe() version of hlist_for_each_entry() to prevent a use after
free.
Fixes: 33fc42de33 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using pm_runtime_resume_and_get is more appropriate
for simplifing code
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Address the following coccicheck warning:
net/ipv6/exthdrs.c:620:44-45: WARNING opportunity for swap()
by using swap() for the swapping of variable values and drop
the tmp (`addr`) variable that is not needed any more.
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add boilerplate test loop in test to run all tests
in fib_rule_tests.sh
Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rely on standard cci-control-port property to identify CCI port
reference.
Update mt7622 dts binding.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-04-12
This series contains updates to i40e and ice drivers.
Joe Damato adds TSO support for MPLS packets on i40e and ice drivers. He
also adds tracking and reporting of tx_stopped statistic for i40e.
Nabil S. Alramli adds reporting of tx_restart to ethtool for i40e.
Mateusz adds new device id support for i40e.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
tls: rx: random refactoring part 3
TLS Rx refactoring. Part 3 of 3. This set is mostly around rx_list
and async processing. The last two patches are minor optimizations.
A couple of features to follow.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
TLS 1.3 and ChaChaPoly don't carry IV in the packet.
The code before this change would copy out iv_size
worth of whatever followed the TLS header in the packet
and then for TLS 1.3 | ChaCha overwrite that with
the sequence number. Waste of cycles especially
with TLS 1.2 being close to dead and TLS 1.3 being
the common case.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IVs are 8 or 16 bytes, no point reading out the exact value
for quantities this small.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Propagating EINPROGRESS thru multiple layers of functions is
error prone. Use darg->async as an in/out argument, like we
use darg->zc today. On input it tells the code if async is
allowed, on output if it took place.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
async crypto handler will report the socket error no need
to report it again. We can, however, let the data we already
copied be reported to user space but we need to make sure
the error will be reported next time around.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
process_rx_list() only fails if it can't copy data to user
space. There is no point recording the error onto sk->sk_err
or giving up on the data which was read partially. Treat
the return value like a normal socket partial read.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If crypto didn't always invoke our callback for async
we'd not be clearing skb->sk and would crash in the
skb core when freeing it. This if must be dead code.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Async crypto never worked with TLS 1.3 and was explicitly disabled in
commit 8497ded2d1 ("net/tls: Disable async decrytion for tls1.3").
There's no need for us to handle TLS 1.3 padding in the async cb.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move counting TlsDecryptErrors to tls_do_decryption()
where differences between sync and async crypto are
reconciled.
No functional changes, this code just always gave
me a pause.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
rx_list is protected by the socket lock, no need to take
the built-in spin lock on accesses.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
platform_get_irq() return negative value on failure, so null check of
return value is incorrect. Fix it by comparing whether it is less than
zero.
Fixes: 9055a2f591 ("ixp4xx_eth: make ptp support a platform driver")
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220412085126.2532924-1-lv.ruyi@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Using pm_runtime_resume_and_get() to replace pm_runtime_get_sync and
pm_runtime_put_noidle. This change is just to simplify the code, no
actual functional changes.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20220412082847.2532584-1-chi.minghao@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
XRFM is no longer needed for configuring FOU tunnels
(CONFIG_NET_FOU_IP_TUNNELS), remove from Kconfig.
Also remove the xrfm.h dependency in fou.c. It was
added in '23461551c006 ("fou: Support for foo-over-udp RX path")'
for depencies of udp_del_offload and udp_offloads, which were removed in
'd92283e338f6 ("fou: change to use UDP socket GRO")'.
Built and installed kernel and setup GUE/FOU tunnels.
Signed-off-by: Coco Li <lixiaoyan@google.com>
Link: https://lore.kernel.org/r/20220411213717.3688789-1-lixiaoyan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for Ethernet Connection X722 for 10GbE SFP+ cards.
Make possible for the driver to bind to the card.
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add vsi.tx_restart to the i40e driver ethtool statistics
Signed-off-by: Nabil S. Alramli <dev@nalramli.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Track TX queue stop events and export the new stat with ethtool.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Attempt to add mpls+tso support.
I don't have ice hardware available to test myself, but I just implemented
this feature in i40e and thought it might be useful to implement for ice
while this is fresh in my brain.
Hoping some one at intel will be able to test this on my behalf.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Ido Schimmel says:
====================
mlxsw: Extend device registers for line cards support
This patch set prepares mlxsw for line cards support by extending device
registers with a slot index, which allows accessing components found on
a line card at a given slot. Currently, only slot index 0 (main board)
is used.
No user visible changes that I am aware of.
====================
Link: https://lore.kernel.org/r/20220411144657.2655752-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add new field 'max_modules_per_slot' to provide maximum number of
modules that can be connected per slot. This field will always be zero,
if 'slot_index' in query request is set to non-zero value, otherwise
value in this field will provide maximum modules number, which can be
equipped on device inserted at any slot.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pass the slot index down to PMAOS pack helper alongside with the module.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend MGPIR (Management General Peripheral Information Register) with
new fields specifying the slot number and number of the slots available
on system. The purpose of these fields is:
- to support access to MPGIR register on modular system for getting the
number of cages, equipped on the line card, inserted at specified
slot. In case slot number is set zero, MGPIR will provide the
information for the main board. For Top of the Rack (non-modular)
system it will provide the same as before.
- to provide the number of slots supported by system. This data is
relevant only in case slot number is set zero.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend PMMP (Port Module Memory Map Properties Register) with new
field specifying the slot number. The purpose of this field is to
enable overriding the cable/module memory map advertisement.
For non-modular systems the 'module' number uniquely identifies the
transceiver location. For modular systems the transceivers are
identified by two indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'module', specifying cage transceiver within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend MCION (Management Cable IO and Notifications Register) with new
field specifying the slot number. The purpose of this field is to
support access to MCION register for query cage transceiver on modular
system.
For non-modular systems the 'module' number uniquely identifies the
transceiver location. For modular systems the transceivers are
identified by two indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'module', specifying cage transceiver within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend MCIA (Management Cable Info Access Register) with new field
specifying the slot number. The purpose of this field is to support
access to MCIA register for reading cage cable information on modular
system. For non-modular systems the 'module' number uniquely identifies
the transceiver location. For modular systems the transceivers are
identified by two indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'module', specifying cage transceiver within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend MTBR (Management Temperature Bulk Register) with new field
specifying the slot number. The purpose of this field is to support
access to MTBR register for reading temperature sensors on modular
system. For non-modular systems the 'sensor_index' uniquely identifies
the cage sensors. For modular systems the sensors are identified by two
indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'sensor_index', specifying cage sensor within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend MTMP (Management Temperature Register) with new field specifying
the slot index. The purpose of this field is to support access to MTMP
register for reading temperature sensors on modular systems.
For non-modular systems the 'sensor_index' uniquely identifies the cage
sensors, while 'slot_index' is always 0. For modular systems the
sensors are identified by:
- 'slot_index', specifying the slot index, where line card is located;
- 'sensor_index', specifying cage sensor within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The internal recvmsg() functions have two parameters 'flags' and 'noblock'
that were merged inside skb_recv_datagram(). As a follow up patch to commit
f4b41f062c ("net: remove noblock parameter from skb_recv_datagram()")
this patch removes the separate 'noblock' parameter for recvmsg().
Analogue to the referenced patch for skb_recv_datagram() the 'flags' and
'noblock' parameters are unnecessarily split up with e.g.
err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
or in
err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg,
sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
instead of simply using only flags all the time and check for MSG_DONTWAIT
where needed (to preserve for the formerly separated no(n)block condition).
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The strings are only used in efx_common.c so the definitions
can be static in there.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>