Commit Graph

1091314 Commits

Author SHA1 Message Date
Jakub Kicinski
0a806ecc40 Merge branch 'bnxt_en-bug-fixes'
Michael Chan says:

====================
bnxt_en: Bug fixes

This patch series includes 3 fixes:
 - Fix an occasional VF open failure.
 - Fix a PTP spinlock usage before initialization
 - Fix unnecesary RX packet drops under high TX traffic load.
====================

Link: https://lore.kernel.org/r/1651540392-2260-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 17:41:35 -07:00
Michael Chan
195af57914 bnxt_en: Fix unnecessary dropping of RX packets
In bnxt_poll_p5(), we first check cpr->has_more_work.  If it is true,
we are in NAPI polling mode and we will call __bnxt_poll_cqs() to
continue polling.  It is possible to exhanust the budget again when
__bnxt_poll_cqs() returns.

We then enter the main while loop to check for new entries in the NQ.
If we had previously exhausted the NAPI budget, we may call
__bnxt_poll_work() to process an RX entry with zero budget.  This will
cause packets to be dropped unnecessarily, thinking that we are in the
netpoll path.  Fix it by breaking out of the while loop if we need
to process an RX NQ entry with no budget left.  We will then exit
NAPI and stay in polling mode.

Fixes: 389a877a3b ("bnxt_en: Process the NQ under NAPI continuous polling.")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 17:41:32 -07:00
Michael Chan
2b156fb57d bnxt_en: Initiallize bp->ptp_lock first before using it
bnxt_ptp_init() calls bnxt_ptp_init_rtc() which will acquire the ptp_lock
spinlock.  The spinlock is not initialized until later.  Move the
bnxt_ptp_init_rtc() call after the spinlock is initialized.

Fixes: 24ac1ecd52 ("bnxt_en: Add driver support to use Real Time Counter for PTP")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 17:41:32 -07:00
Somnath Kotur
13ba794397 bnxt_en: Fix possible bnxt_open() failure caused by wrong RFS flag
bnxt_open() can fail in this code path, especially on a VF when
it fails to reserve default rings:

bnxt_open()
  __bnxt_open_nic()
    bnxt_clear_int_mode()
    bnxt_init_dflt_ring_mode()

RX rings would be set to 0 when we hit this error path.

It is possible for a subsequent bnxt_open() call to potentially succeed
with a code path like this:

bnxt_open()
  bnxt_hwrm_if_change()
    bnxt_fw_init_one()
      bnxt_fw_init_one_p3()
        bnxt_set_dflt_rfs()
          bnxt_rfs_capable()
            bnxt_hwrm_reserve_rings()

On older chips, RFS is capable if we can reserve the number of vnics that
is equal to RX rings + 1.  But since RX rings is still set to 0 in this
code path, we may mistakenly think that RFS is supported for 0 RX rings.

Later, when the default RX rings are reserved and we try to enable
RFS, it would fail and cause bnxt_open() to fail unnecessarily.

We fix this in 2 places.  bnxt_rfs_capable() will always return false if
RX rings is not yet set.  bnxt_init_dflt_ring_mode() will call
bnxt_set_dflt_rfs() which will always clear the RFS flags if RFS is not
supported.

Fixes: 20d7d1c5c9 ("bnxt_en: reliably allocate IRQ table on reset to avoid crash")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 17:41:31 -07:00
Jakub Kicinski
f43f0cd2d9 Merge tag 'wireless-next-2022-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:

====================
wireless-next patches for v5.19

First set of patches for v5.19 and this is a big one. We have two new
drivers, a change in mac80211 STA API affecting most drivers and
ath11k getting support for WCN6750. And as usual lots of fixes and
cleanups all over.

Major changes:

new drivers
 - wfx: silicon labs devices
 - plfxlc: pureLiFi X, XL, XC devices

mac80211
 - host based BSS color collision detection
 - prepare sta handling for IEEE 802.11be Multi-Link Operation (MLO) support

rtw88
 - support TP-Link T2E devices

rtw89
 - support firmware crash simulation
 - preparation for 8852ce hardware support

ath11k
 - Wake-on-WLAN support for QCA6390 and WCN6855
 - device recovery (firmware restart) support for QCA6390 and WCN6855
 - support setting Specific Absorption Rate (SAR) for WCN6855
 - read country code from SMBIOS for WCN6855/QCA6390
 - support for WCN6750

wcn36xx
 - support for transmit rate reporting to user space

* tag 'wireless-next-2022-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (228 commits)
  rtw89: 8852c: rfk: add DPK
  rtw89: 8852c: rfk: add IQK
  rtw89: 8852c: rfk: add RX DCK
  rtw89: 8852c: rfk: add RCK
  rtw89: 8852c: rfk: add TSSI
  rtw89: 8852c: rfk: add LCK
  rtw89: 8852c: rfk: add DACK
  rtw89: 8852c: rfk: add RFK tables
  plfxlc: fix le16_to_cpu warning for beacon_interval
  rtw88: remove a copy of the NAPI_POLL_WEIGHT define
  carl9170: tx: fix an incorrect use of list iterator
  wil6210: use NAPI_POLL_WEIGHT for napi budget
  ath10k: remove a copy of the NAPI_POLL_WEIGHT define
  ath11k: Add support for WCN6750 device
  ath11k: Datapath changes to support WCN6750
  ath11k: HAL changes to support WCN6750
  ath11k: Add QMI changes for WCN6750
  ath11k: Fetch device information via QMI for WCN6750
  ath11k: Add register access logic for WCN6750
  ath11k: Add HW params for WCN6750
  ...
====================

Link: https://lore.kernel.org/r/20220503153622.C1671C385A4@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 17:27:51 -07:00
Jakub Kicinski
58caed3dac netdev: reshuffle netif_napi_add() APIs to allow dropping weight
Most drivers should not have to worry about selecting the right
weight for their NAPI instances and pass NAPI_POLL_WEIGHT.
It'd be best if we didn't require the argument at all and selected
the default internally.

This change prepares the ground for such reshuffling, allowing
for a smooth transition. The following API should remain after
the next release cycle:
  netif_napi_add()
  netif_napi_add_weight()
  netif_napi_add_tx()
  netif_napi_add_tx_weight()
Where the _weight() variants take an explicit weight argument.
I opted for a _weight() suffix rather than a __ prefix, because
we use __ in places to mean that caller needs to also issue a
synchronize_net() call.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20220502232703.396351-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 17:26:10 -07:00
Vladimir Oltean
7d4e91e064 selftests: forwarding: add basic QoS classification test for Ocelot switches
Test basic (port-default, VLAN PCP and IP DSCP) QoS classification for
Ocelot switches. Advanced QoS classification using tc filters is covered
by tc_flower_chains.sh in the same directory.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220502155424.4098917-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 17:24:45 -07:00
Sergey Shtylyov
5ef9b803a4 smsc911x: allow using IRQ0
The AlphaProject AP-SH4A-3A/AP-SH4AD-0A SH boards use IRQ0 for their SMSC
LAN911x Ethernet chip, so the networking on them must have been broken by
commit 965b2aa78f ("net/smsc911x: fix irq resource allocation failure")
which filtered out 0 as well as the negative error codes -- it was kinda
correct at the time, as platform_get_irq() could return 0 on of_irq_get()
failure and on the actual 0 in an IRQ resource.  This issue was fixed by
me (back in 2016!), so we should be able to fix this driver to allow IRQ0
usage again...

When merging this to the stable kernels, make sure you also merge commit
e330b9a6bb ("platform: don't return 0 from platform_get_irq[_byname]()
on error") -- that's my fix to platform_get_irq() for the DT platforms...

Fixes: 965b2aa78f ("net/smsc911x: fix irq resource allocation failure")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/656036e4-6387-38df-b8a7-6ba683b16e63@omp.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:57:33 -07:00
Jakub Kicinski
2201124dbb Merge branch 'mptcp-userspace-path-manager-prerequisites'
Mat Martineau says:

====================
mptcp: Userspace path manager prerequisites

This series builds upon the path manager mode selection changes merged
in 4994d4fa99 ("Merge branch 'mptcp-path-manager-mode-selection'") to
further modify the path manager code in preparation for adding the new
netlink commands to announce/remove advertised addresses and
create/destroy subflows of an MPTCP connection. The third and final
patch series for the userspace path manager will implement those
commands as discussed in
https://lore.kernel.org/netdev/23ff3b49-2563-1874-fa35-3af55d3088e7@linux.intel.com/#r

Patches 1, 5, and 7 remove some internal constraints on path managers
(in general) without changing in-kernel PM behavior.

Patch 2 adds a self test to validate MPTCP address advertisement ack
behavior.

Patches 3, 4, and 6 add new attributes to existing MPTCP netlink events
and track internal state for populating those attributes.
====================

Link: https://lore.kernel.org/r/20220502205237.129297-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:55:00 -07:00
Kishen Maloor
304ab97f4c mptcp: allow ADD_ADDR reissuance by userspace PMs
This change allows userspace PM implementations to reissue ADD_ADDR
announcements (if necessary) based on their chosen policy.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:54:55 -07:00
Kishen Maloor
41b3c69bf9 mptcp: expose server_side attribute in MPTCP netlink events
This change records the 'server_side' attribute of MPTCP_EVENT_CREATED
and MPTCP_EVENT_ESTABLISHED events to inform their recipient about the
Client/Server role of the running MPTCP application.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/246
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:54:55 -07:00
Kishen Maloor
70c708e826 mptcp: establish subflows from either end of connection
This change updates internal logic to permit subflows to be
established from either the client or server ends of MPTCP
connections. This symmetry and added flexibility may be
harnessed by PM implementations running on either end in
creating new subflows.

The essence of this change lies in not relying on the
"server_side" flag (which continues to be available if needed).

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:54:55 -07:00
Kishen Maloor
d1ace2d9ab mptcp: reflect remote port (not 0) in ANNOUNCED events
Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP
SHOULD attempt to connect to the specified address on the same port
as the port that is already in use by the subflow on which the
ADD_ADDR signal was sent.

To facilitate that, this change reflects the specific remote port in
use by that subflow in MPTCP_EVENT_ANNOUNCED events.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:54:55 -07:00
Kishen Maloor
8a34839220 mptcp: store remote id from MP_JOIN SYN/ACK in local ctx
This change reads the addr id assigned to the remote endpoint
of a subflow from the MP_JOIN SYN/ACK message and stores it
in the related subflow context. The remote id was not being
captured prior to this change, and will now provide a consistent
view of remote endpoints and their ids as seen through netlink
events.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:54:54 -07:00
Mat Martineau
b3b71bf915 selftests: mptcp: ADD_ADDR echo test with missing userspace daemon
Check userspace PM behavior to ensure ADD_ADDR echoes are only sent when
there is an active userspace daemon. If the daemon is restarting or
hasn't loaded yet, the missing echo will cause the peer to retransmit
the ADD_ADDR - and hopefully the daemon will be ready to receive it at
that later time.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:54:54 -07:00
Kishen Maloor
4d25247d3a mptcp: bypass in-kernel PM restrictions for non-kernel PMs
Current limits on the # of addresses/subflows must apply only to
in-kernel PM managed sockets. Thus this change removes such
restrictions on connections overseen by non-kernel (e.g. userspace)
PMs. This change also ensures that the kernel does not record stats
inside struct mptcp_pm_data updated along kernel code paths when exercised
via non-kernel PMs.

Additionally, address announcements are acknolwedged and subflow
requests are honored only when it's deemed that	a userspace path
manager	is active at the time.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:54:54 -07:00
Matthew Hagan
2069624dac net: sfp: Add tx-fault workaround for Huawei MA5671A SFP ONT
As noted elsewhere, various GPON SFP modules exhibit non-standard
TX-fault behaviour. In the tested case, the Huawei MA5671A, when used
in combination with a Marvell mv88e6085 switch, was found to
persistently assert TX-fault, resulting in the module being disabled.

This patch adds a quirk to ignore the SFP_F_TX_FAULT state, allowing the
module to function.

Change from v1: removal of erroneous return statment (Andrew Lunn)

Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220502223315.1973376-1-mnhagan88@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03 16:53:39 -07:00
Linus Torvalds
107c948d1d Merge tag 'seccomp-v5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp selftest fix from Kees Cook:

 - Avoid using stdin for read syscall testing (Jann Horn)

* tag 'seccomp-v5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  selftests/seccomp: Don't call read() on TTY from background pgrp
2022-05-03 15:47:19 -07:00
Linus Torvalds
ef8e4d3c2a Merge tag 'hwmon-for-v5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Work around a hardware problem in the delta-ahe50dc-fan driver

 - Explicitly disable PEC in PMBus core if not enabled

 - Fix negative temperature values in f71882fg driver

 - Fix warning on removal of adt7470 driver

 - Fix CROSSHAIR VI HERO name in asus_wmi_sensors driver

 - Fix build warning seen in xdpe12284 driver if
   CONFIG_SENSORS_XDPE122_REGULATOR is disabled

 - Fix type of 'ti,n-factor' in ti,tmp421 driver bindings

* tag 'hwmon-for-v5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus) delta-ahe50dc-fan: work around hardware quirk
  hwmon: (pmbus) disable PEC if not enabled
  hwmon: (f71882fg) Fix negative temperature
  dt-bindings: hwmon: ti,tmp421: Fix type for 'ti,n-factor'
  hwmon: (adt7470) Fix warning on module removal
  hwmon: (asus_wmi_sensors) Fix CROSSHAIR VI HERO name
  hwmon: (xdpe12284) Fix build warning seen if CONFIG_SENSORS_XDPE122_REGULATOR is disabled
2022-05-03 09:51:52 -07:00
Tetsuo Handa
3a58f13a88 net: rds: acquire refcount on TCP sockets
syzbot is reporting use-after-free read in tcp_retransmit_timer() [1],
for TCP socket used by RDS is accessing sock_net() without acquiring a
refcount on net namespace. Since TCP's retransmission can happen after
a process which created net namespace terminated, we need to explicitly
acquire a refcount.

Link: https://syzkaller.appspot.com/bug?extid=694120e1002c117747ed [1]
Reported-by: syzbot <syzbot+694120e1002c117747ed@syzkaller.appspotmail.com>
Fixes: 26abe14379 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Fixes: 8a68173691 ("net: sk_clone_lock() should only do get_net() if the parent is not a kernel socket")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+694120e1002c117747ed@syzkaller.appspotmail.com>
Link: https://lore.kernel.org/r/a5fb1fc4-2284-3359-f6a0-e4e390239d7b@I-love.SAKURA.ne.jp
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 13:22:50 +02:00
Marc Kleine-Budde
f5c2174a37 selftests/net: so_txtime: usage(): fix documentation of default clock
The program uses CLOCK_TAI as default clock since it was added to the
Linux repo. In commit:
| 040806343b ("selftests/net: so_txtime multi-host support")
a help text stating the wrong default clock was added.

This patch fixes the help text.

Fixes: 040806343b ("selftests/net: so_txtime multi-host support")
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220502094638.1921702-3-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 13:18:26 +02:00
Marc Kleine-Budde
97926d5a84 selftests/net: so_txtime: fix parsing of start time stamp on 32 bit systems
This patch fixes the parsing of the cmd line supplied start time on 32
bit systems. A "long" on 32 bit systems is only 32 bit wide and cannot
hold a timestamp in nano second resolution.

Fixes: 040806343b ("selftests/net: so_txtime multi-host support")
Cc: Carlos Llamas <cmllamas@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20220502094638.1921702-2-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 13:18:26 +02:00
Paolo Abeni
2b68abf933 Merge tag 'mlx5-updates-2022-05-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-updates-2022-05-02

1) Trivial Misc updates to mlx5 driver

2) From Mark Bloch: Flow steering, general steering refactoring/cleaning

An issue with flow steering deletion flow (when creating a rule without
dests) turned out to be easy to fix but during the fix some issue
with the flow steering creation/deletion flows have been found.

The following patch series tries to fix long standing issues with flow
steering code and hopefully preventing silly future bugs.

  A) Fix an issue where a proper dest type wasn't assigned.
  B) Refactor and fix dests enums values, refactor deletion
     function and do proper bookkeeping of dests.
  C) Change mlx5_del_flow_rules() to delete rules when there are no
     no more rules attached associated with an FTE.
  D) Don't call hard coded deletion function but use the node's
     defined one.
  E) Add a WARN_ON() to catch future bugs when an FTE with dests
     is deleted.

* tag 'mlx5-updates-2022-05-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: fs, an FTE should have no dests when deleted
  net/mlx5: fs, call the deletion function of the node
  net/mlx5: fs, delete the FTE when there are no rules attached to it
  net/mlx5: fs, do proper bookkeeping for forward destinations
  net/mlx5: fs, add unused destination type
  net/mlx5: fs, jump to exit point and don't fall through
  net/mlx5: fs, refactor software deletion rule
  net/mlx5: fs, split software and IFC flow destination definitions
  net/mlx5e: TC, set proper dest type
  net/mlx5e: Remove unused mlx5e_dcbnl_build_rep_netdev function
  net/mlx5e: Drop error CQE handling from the XSK RX handler
  net/mlx5: Print initializing field in case of timeout
  net/mlx5: Delete redundant default assignment of runtime devlink params
  net/mlx5: Remove useless kfree
  net/mlx5: use kvfree() for kvzalloc() in mlx5_ct_fs_smfs_matcher_create
====================

Link: https://lore.kernel.org/r/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 12:43:41 +02:00
Paolo Abeni
f4f1fd7646 Merge branch 'mlxsw-remove-size-limitations-on-egress-descriptor-buffer'
Ido Schimmel says:

====================
mlxsw: Remove size limitations on egress descriptor buffer

Petr says:

Spectrum machines have two resources related to keeping packets in an
internal buffer: bytes (allocated in cell-sized units) for packet payload,
and descriptors, for keeping headers. Currently, mlxsw only configures the
bytes part of the resource management.

Spectrum switches permit a full parallel configuration for the descriptor
resources, including port-pool and port-TC-pool quotas. By default, these
are all configured to use pool 14, with an infinite quota. The ingress pool
14 is then infinite in size.

However, egress pool 14 has finite size by default. The size is chip
dependent, but always much lower than what the chip actually permits. As a
result, we can easily construct workloads that exhaust the configured
descriptor limit.

Going forward, mlxsw will have to fix this issue properly by maintaining
descriptor buffer sizes, TC bindings, and quotas that match the
architecture recommendation. Short term, fix the issue by configuring the
egress descriptor pool to be infinite in size as well. This will maintain
the same configuration philosophy, but will unlock all chip resources to be
usable.

In this patchset, patch #1 first adds the "desc" field into the pool
configuration register. Then in patch #2, the new field is used to
configure both ingress and egress pool 14 as infinite.

In patches #3 and #4, add a selftest that verifies that a large burst
can be absorbed by the shared buffer. This test specifically exercises a
scenario where descriptor buffer is the limiting factor and the test
fails without the above patches.
====================

Link: https://lore.kernel.org/r/20220502084926.365268-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 12:10:52 +02:00
Petr Machata
1d267aa869 selftests: mlxsw: Add a test for soaking up a burst of traffic
Add a test that sends 1Gbps of traffic through the switch, into which it
then injects a burst of traffic and tests that there are no drops.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 12:10:50 +02:00
Petr Machata
1531cc632d selftests: forwarding: lib: Add start_traffic_pktsize() helpers
Add two helpers, start_traffic_pktsize() and start_tcp_traffic_pktsize(),
that allow explicit overriding of packet size. Change start_traffic() and
start_tcp_traffic() to dispatch through these helpers with the default
packet size.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 12:10:50 +02:00
Petr Machata
c864769add mlxsw: Configure descriptor buffers
Spectrum machines have two resources related to keeping packets in an
internal buffer: bytes (allocated in cell-sized units) for packet payload,
and descriptors, for keeping metadata. Currently, mlxsw only configures the
bytes part of the resource management.

Spectrum switches permit a full parallel configuration for the descriptor
resources, including port-pool and port-TC-pool quotas. By default, these
are all configured to use pool 14, with an infinite quota. The ingress pool
14 is then infinite in size.

However, egress pool 14 has finite size by default. The size is chip
dependent, but always much lower than what the chip actually permits. As a
result, we can easily construct workloads that exhaust the configured
descriptor limit.

Fix the issue by configuring the egress descriptor pool to be infinite in
size as well. This will maintain the configuration philosophy of the
default configuration, but will unlock all chip resources to be usable.

In the code, include both the configuration of ingress and ingress, mostly
for clarity.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 12:10:50 +02:00
Petr Machata
135433b30a mlxsw: reg: Add "desc" field to SBPR
SBPR, or Shared Buffer Pools Register, configures and retrieves the shared
buffer pools and configuration. The desc field determines whether the
configuration relates to the byte pool or the descriptor pool.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 12:10:49 +02:00
Ido Schimmel
3122257c02 selftests: mirror_gre_bridge_1q: Avoid changing PVID while interface is operational
In emulated environments, the bridge ports enslaved to br1 get a carrier
before changing br1's PVID. This means that by the time the PVID is
changed, br1 is already operational and configured with an IPv6
link-local address.

When the test is run with netdevs registered by mlxsw, changing the PVID
is vetoed, as changing the VID associated with an existing L3 interface
is forbidden. This restriction is similar to the 8021q driver's
restriction of changing the VID of an existing interface.

Fix this by taking br1 down and bringing it back up when it is fully
configured.

With this fix, the test reliably passes on top of both the SW and HW
data paths (emulated or not).

Fixes: 239e754af8 ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20220502084507.364774-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 11:21:14 +02:00
Paolo Abeni
45c77fb418 Merge branch 'emaclite-improve-error-handling-and-minor-cleanup'
Radhey Shyam Pandey says:

====================
emaclite: improve error handling and minor cleanup

This patchset does error handling for of_address_to_resource() and also
removes "Don't advertise 1000BASE-T" and auto negotiation.

Changes for v3:
- Resolve git apply conflicts for 2/2 patch.

Changes for v2:
- Added Andrew's reviewed by tag in 1/2 patch.
- Move ret to down to align with reverse xmas tree style in 2/2 patch.
- Also add fixes tag in 2/2 patch.
- Specify tree name in subject prefix.
====================

Link: https://lore.kernel.org/r/1651476470-23904-1-git-send-email-radhey.shyam.pandey@xilinx.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 11:07:38 +02:00
Shravya Kumbham
7a6bc33ab5 net: emaclite: Add error handling for of_address_to_resource()
check the return value of of_address_to_resource() and also add
missing of_node_put() for np and npp nodes.

Fixes: e0a3bc6544 ("net: emaclite: Support multiple phys connected to one MDIO bus")
Addresses-Coverity: Event check_return value.
Signed-off-by: Shravya Kumbham <shravya.kumbham@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 11:07:32 +02:00
Shravya Kumbham
b800528b97 net: emaclite: Don't advertise 1000BASE-T and do auto negotiation
In xemaclite_open() function we are setting the max speed of
emaclite to 100Mb using phy_set_max_speed() function so,
there is no need to write the advertising registers to stop
giga-bit speed and the phy_start() function starts the
auto-negotiation so, there is no need to handle it separately
using advertising registers. Remove the phy_read and phy_write
of advertising registers in xemaclite_open() function.

Signed-off-by: Shravya Kumbham <shravya.kumbham@xilinx.com>
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 11:07:32 +02:00
Paolo Abeni
cb636b3e37 Merge branch 'use-standard-sysctl-macro'
Tonghao Zhang says:

====================
use standard sysctl macro

From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

This patchset introduce sysctl macro or replace var
with macro.
====================

Link: https://lore.kernel.org/r/20220501035524.91205-1-xiangxia.m.yue@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 10:15:09 +02:00
Tonghao Zhang
57b19468b3 selftests/sysctl: add sysctl macro test
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 10:15:07 +02:00
Tonghao Zhang
4c7f24f857 net: sysctl: introduce sysctl SYSCTL_THREE
This patch introdues the SYSCTL_THREE.

KUnit:
[00:10:14] ================ sysctl_test (10 subtests) =================
[00:10:14] [PASSED] sysctl_test_api_dointvec_null_tbl_data
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_maxlen_unset
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_len_is_zero
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_read_but_position_set
[00:10:14] [PASSED] sysctl_test_dointvec_read_happy_single_positive
[00:10:14] [PASSED] sysctl_test_dointvec_read_happy_single_negative
[00:10:14] [PASSED] sysctl_test_dointvec_write_happy_single_positive
[00:10:14] [PASSED] sysctl_test_dointvec_write_happy_single_negative
[00:10:14] [PASSED] sysctl_test_api_dointvec_write_single_less_int_min
[00:10:14] [PASSED] sysctl_test_api_dointvec_write_single_greater_int_max
[00:10:14] =================== [PASSED] sysctl_test ===================

./run_kselftest.sh -c sysctl
...
ok 1 selftests: sysctl: sysctl.sh

Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 10:15:06 +02:00
Tonghao Zhang
bd8a53675c net: sysctl: use shared sysctl macro
This patch replace two, four and long_one to SYSCTL_XXX.

Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-05-03 10:15:06 +02:00
Kalle Valo
f39af96d35 Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
ath.git patches for v5.19. Major changes:

ath11k

* support setting Specific Absorption Rate (SAR) for WCN6855

* read country code from SMBIOS for WCN6855/QCA6390

* support for WCN6750
2022-05-03 08:38:03 +03:00
Ping-Ke Shih
da4cea16cb rtw89: 8852c: rfk: add DPK
DPK is short for digital pre-distortion calibration. It can adjusts digital
waveform according to PA linear characteristics dynamically to enhance
TX EVM.

Do this calibration when we are going to run on AP channel. To prevent
power offset out of boundary, it monitors thermal and set proper boundary
to register.

8852c needs two backup buffers, so we enlarge the array. But, 8852a still
needs only one, so it only uses first element (index zero).

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-9-pkshih@realtek.com
2022-05-03 08:32:03 +03:00
Ping-Ke Shih
2da8109d98 rtw89: 8852c: rfk: add IQK
IQ signal calibration is a very important calibration to yield good RF
performance. We do this calibration only if we are going to run on AP
channel. During scanning phase, without this calibration RF performance
is still acceptable because it transmits with low data rate at this phase.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-8-pkshih@realtek.com
2022-05-03 08:32:03 +03:00
Ping-Ke Shih
ac91be9756 rtw89: 8852c: rfk: add RX DCK
RX DCK is receiver DC calibration. Do this calibration when bringing up
interface and going to run on AP channel.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-7-pkshih@realtek.com
2022-05-03 08:32:03 +03:00
Ping-Ke Shih
30052c5a1c rtw89: 8852c: rfk: add RCK
RCK is synchronize RC calibration. It needs to be triggered only once when
interface is going to up.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-6-pkshih@realtek.com
2022-05-03 08:32:02 +03:00
Ping-Ke Shih
e5efc4d55c rtw89: 8852c: rfk: add TSSI
TSSI is transmitter signal strength indication, which is a close-loop
hardware circuit to feedback actual transmitting power as a reference for
next transmission.

When we setup channel to connect an AP, it does full calibration. When
switching bands or channels, it needs to reset hardware status to prevent
use wrong feedback of previous transmission.

To do TX power compensation reflecting current temperature, it loads tables
of compensation values into registers according to channel and band group.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-5-pkshih@realtek.com
2022-05-03 08:32:02 +03:00
Ping-Ke Shih
fb8177d729 rtw89: 8852c: rfk: add LCK
LCK is short fro LC Tank calibration. Do this calibration once driver
loads RF parameters table. Since the characteristic can be changed by
temperature, we do this calibration again if difference of thermal value
is over a threshold.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-4-pkshih@realtek.com
2022-05-03 08:32:02 +03:00
Ping-Ke Shih
76599a8d0b rtw89: 8852c: rfk: add DACK
DACK (digital-to-analog converters calibration) is used to calibrate DAC
to output analog signals as expected.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-3-pkshih@realtek.com
2022-05-03 08:32:02 +03:00
Ping-Ke Shih
ec424639d4 rtw89: 8852c: rfk: add RFK tables
These tables are used by RFK (RF calibration) to set parameters. These
parameters can trigger certain calibration, or configure/reset settings
before and after RF calibrations.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502235408.15052-2-pkshih@realtek.com
2022-05-03 08:32:02 +03:00
Srinivasan Raju
ccc915e7dd plfxlc: fix le16_to_cpu warning for beacon_interval
Fix the following sparse warnings:
drivers/net/wireless/purelifi/plfxlc/chip.c:36:31: sparse: expected unsigned short [usertype] beacon_interval
drivers/net/wireless/purelifi/plfxlc/chip.c:36:31: sparse: got restricted __le16 [usertype]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Srinivasan Raju <srini.raju@purelifi.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220502150133.6052-1-srini.raju@purelifi.com
2022-05-03 08:28:32 +03:00
Mark Bloch
3a09fae035 net/mlx5: fs, an FTE should have no dests when deleted
When deleting an FTE it should have no dests, which means
fte->dests_size should be 0. Add a WARN_ON() to catch bugs
where the proper tracking wasn't done.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-02 21:21:16 -07:00
Mark Bloch
72191a4cd5 net/mlx5: fs, call the deletion function of the node
Don't call del_hw_fte() directly, instead use the hardware deletion
function set. This is just a small cleanup and doesn't change anything
as for an FTE the deletion function is already set to del_hw_fte().

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-02 21:21:15 -07:00
Mark Bloch
7b0c633859 net/mlx5: fs, delete the FTE when there are no rules attached to it
When an FTE has no children is means all the rules where removed
and the FTE can be deleted regardless of the dests_size value.
While dests_size should be 0 when there are no children
be extra careful not to leak memory or get firmware syndrome
if the proper bookkeeping of dests_size wasn't done.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-02 21:21:15 -07:00
Mark Bloch
a30c8b9025 net/mlx5: fs, do proper bookkeeping for forward destinations
Keep track after destinations that are forward destinations.
When a forward destinations is removed from an FTE check if
the actions bits need to be updated.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-02 21:21:14 -07:00