Commit Graph

43444 Commits

Author SHA1 Message Date
Linus Torvalds
a5088ee725 Thermal control updates for 6.1-rc1
- Rework the device tree initialization, convert the drivers to the
    new API and remove the old OF code (Daniel Lezcano).
 
  - Fix return value to -ENODEV when searching for a specific thermal
    zone which does not exist (Daniel Lezcano).
 
  - Fix the return value inspection in of_thermal_zone_find() (Dan
    Carpenter).
 
  - Fix kernel panic when KASAN is enabled as it detects use after
    free when unregistering a thermal zone (Daniel Lezcano).
 
  - Move the set_trip ops inside the therma sysfs code (Daniel Lezcano).
 
  - Remove unnecessary error message as it is already shown in the
    underlying function (Jiapeng Chong).
 
  - Rework the monitoring path and move the locks upper in the call
    stack to fix some potentials race windows (Daniel Lezcano).
 
  - Fix lockdep_assert() warning introduced by the lock rework (Daniel
    Lezcano).
 
  - Do not lock thermal zone mutex in the user space governor (Rafael
    Wysocki).
 
  - Revert the Mellanox 'hotter thermal zone' feature because it is
    already handled in the thermal framework core code (Daniel Lezcano).
 
  - Increase maximum number of trip points in the thermal core (Sumeet
    Pawnikar).
 
  - Replace strlcpy() with unused retval with strscpy() in the core
    thermal control code (Wolfram Sang).
 
  - Use module_pci_driver() macro in the int340x processor_thermal
    driver (Shang XiaoJing).
 
  - Use get_cpu() instead of smp_processor_id() in the intel_powerclamp
    thermal driver to prevent it from crashing and remove unused
    accounting for IRQ wakes from it (Srinivas Pandruvada).
 
  - Consolidate priv->data_vault checks in int340x_thermal (Rafael
    Wysocki).
 
  - Check the policy first in cpufreq_cooling_register() (Xuewen Yan).
 
  - Drop redundant error message from da9062-thermal (zhaoxiao).
 
  - Drop of_match_ptr() from thermal_mmio (Jean Delvare).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OzUSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx8CMP/1kNnDUjtCBIvl/OJEM0yVlEGJNppNu9
 dCNQwuftAJRt7dBB292clKW15ef1fFAHrW/eOy+z62k6k/tfqnMLEK38hpS4pIaL
 PZRjRfpp2CDmBBVTJ0I2nNC9h0zZY4tQK+JCII1M2UmgtLaGbp3NuIu2Ga4i5bNR
 IYE3QvVz/g2/pRNa/BTsJySje/q7wv231Vd5jg4czg58EyntBGO4gmWNuXyZ6OkF
 ijzcu7K5Tkll6DT+Paw8bGcPCyjBtoBhv2A6HtsC3cYfc2tY9BVf3DG2HlG0qTAU
 pIZsLOlUozzJScVbu8ScKj1hlP/iCKcPgVjP4l+U57EtcY/ULBqrQtkDVtGedNOZ
 wa5a1VUsZDrigWRpj1u5SsDWmbAod5uK5X/3naUfH7XtomhqikfaZK7Euek6kfDN
 SEhZaBycAFWlI/L5cG2sd6BBcmWlBqkaHiQ0zEv2YlALY5SkNOUQrrq2A/1LP1Ae
 67xbzbWFXyR8DAq3YyfnLpqgcJcSiyGxmdKZ1u2XHudXeJHKdAB7xlcJz+hfWpYU
 Rd1ZTB3ATA/IMG1rLO2dKgaMmyQCWPG6oXtJjTH0g3sXm0U6K14dwEU1ey9u/Li6
 DmI4cWaXf8WoRvb3rkEcJliayUed4U/ohU+z1bInSE8EuCBuyMLRhoJdi3ZhunOC
 PAyKcHg8fLy9
 =Ufw7
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "The most significant part of this update is the thermal control DT
  initialization rework from Daniel Lezcano and the following conversion
  of drivers to use the new API introduced by it

  Apart from that, the maximum number of trip points in a thermal zone
  is increased and there are some fixes and code cleanups

  Specifics:

   - Rework the device tree initialization, convert the drivers to the
     new API and remove the old OF code (Daniel Lezcano)

   - Fix return value to -ENODEV when searching for a specific thermal
     zone which does not exist (Daniel Lezcano)

   - Fix the return value inspection in of_thermal_zone_find() (Dan
     Carpenter)

   - Fix kernel panic when KASAN is enabled as it detects use after free
     when unregistering a thermal zone (Daniel Lezcano)

   - Move the set_trip ops inside the therma sysfs code (Daniel Lezcano)

   - Remove unnecessary error message as it is already shown in the
     underlying function (Jiapeng Chong)

   - Rework the monitoring path and move the locks upper in the call
     stack to fix some potentials race windows (Daniel Lezcano)

   - Fix lockdep_assert() warning introduced by the lock rework (Daniel
     Lezcano)

   - Do not lock thermal zone mutex in the user space governor (Rafael
     Wysocki)

   - Revert the Mellanox 'hotter thermal zone' feature because it is
     already handled in the thermal framework core code (Daniel Lezcano)

   - Increase maximum number of trip points in the thermal core (Sumeet
     Pawnikar)

   - Replace strlcpy() with unused retval with strscpy() in the core
     thermal control code (Wolfram Sang)

   - Use module_pci_driver() macro in the int340x processor_thermal
     driver (Shang XiaoJing)

   - Use get_cpu() instead of smp_processor_id() in the intel_powerclamp
     thermal driver to prevent it from crashing and remove unused
     accounting for IRQ wakes from it (Srinivas Pandruvada)

   - Consolidate priv->data_vault checks in int340x_thermal (Rafael
     Wysocki)

   - Check the policy first in cpufreq_cooling_register() (Xuewen Yan)

   - Drop redundant error message from da9062-thermal (zhaoxiao)

   - Drop of_match_ptr() from thermal_mmio (Jean Delvare)"

* tag 'thermal-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (55 commits)
  thermal: core: Increase maximum number of trip points
  thermal: int340x: processor_thermal: Use module_pci_driver() macro
  thermal: intel_powerclamp: Remove accounting for IRQ wakes
  thermal: intel_powerclamp: Use get_cpu() instead of smp_processor_id() to avoid crash
  thermal: int340x_thermal: Consolidate priv->data_vault checks
  thermal: cpufreq_cooling: Check the policy first in cpufreq_cooling_register()
  thermal: Drop duplicate words from comments
  thermal: move from strlcpy() with unused retval to strscpy()
  thermal: da9062-thermal: Drop redundant error message
  thermal/drivers/thermal_mmio: Drop of_match_ptr()
  thermal: gov_user_space: Do not lock thermal zone mutex
  Revert "mlxsw: core: Add the hottest thermal zone detection"
  thermal/core: Fix lockdep_assert() warning
  thermal/core: Move the mutex inside the thermal_zone_device_update() function
  thermal/core: Move the thermal zone lock out of the governors
  thermal/governors: Group the thermal zone lock inside the throttle function
  thermal/core: Rework the monitoring a bit
  thermal/core: Rearm the monitoring only one time
  thermal/drivers/qcom/spmi-adc-tm5: Remove unnecessary print function dev_err()
  thermal/of: Remove old OF code
  ...
2022-10-03 15:33:38 -07:00
Rafael J. Wysocki
a7ae50fc34 Merge branch 'thermal-core'
Merge core thermal control changes for 6.1-rc1:

 - Increase maximum number of trip points in the thermal core (Sumeet
   Pawnikar).

 - Replace strlcpy() with unused retval with strscpy() in the core
   thermal control code (Wolfram Sang).

 - Do not lock thermal zone mutex in the user space governor (Rafael
   Wysocki)

 - Rework the device tree initialization, convert the drivers to the
   new API and remove the old OF code (Daniel Lezcano)

 - Fix return value to -ENODEV when searching for a specific thermal
   zone which does not exist (Daniel Lezcano)

 - Fix the return value inspection in of_thermal_zone_find() (Dan
   Carpenter)

 - Fix kernel panic when KASAN is enabled as it detects use after
   free when unregistering a thermal zone (Daniel Lezcano)

 - Move the set_trip ops inside the therma sysfs code (Daniel Lezcano)

 - Remove unnecessary error message as it is already showed in the
   underlying function (Jiapeng Chong)

 - Rework the monitoring path and move the locks upper in the call
   stack to fix some potentials race windows (Daniel Lezcano)

 - Fix lockdep_assert() warning introduced by the lock rework (Daniel
   Lezcano)

 - Revert the Mellanox 'hotter thermal zone' feature because it is
   already handled in the thermal framework core code (Daniel Lezcano)

* thermal-core: (47 commits)
  thermal: core: Increase maximum number of trip points
  thermal: move from strlcpy() with unused retval to strscpy()
  thermal: gov_user_space: Do not lock thermal zone mutex
  Revert "mlxsw: core: Add the hottest thermal zone detection"
  thermal/core: Fix lockdep_assert() warning
  thermal/core: Move the mutex inside the thermal_zone_device_update() function
  thermal/core: Move the thermal zone lock out of the governors
  thermal/governors: Group the thermal zone lock inside the throttle function
  thermal/core: Rework the monitoring a bit
  thermal/core: Rearm the monitoring only one time
  thermal/drivers/qcom/spmi-adc-tm5: Remove unnecessary print function dev_err()
  thermal/of: Remove old OF code
  thermal/core: Move set_trip_temp ops to the sysfs code
  thermal/drivers/samsung: Switch to new of thermal API
  regulator/drivers/max8976: Switch to new of thermal API
  Input: sun4i-ts - switch to new of thermal API
  iio/drivers/sun4i_gpadc: Switch to new of thermal API
  hwmon/drivers/core: Switch to new of thermal API
  hwmon: pm_bus: core: Switch to new of thermal API
  ata/drivers/ahci_imx: Switch to new of thermal API
  ...
2022-10-03 20:38:43 +02:00
Jakub Kicinski
3e1308a7c8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
ice: xsk: ZC changes

Maciej Fijalkowski says:

This set consists of two fixes to issues that were either pointed out on
indirectly (John was reviewing AF_XDP selftests that were testing ice's
ZC support) mailing list or were directly reported by customers.

First patch allows user space to see done descriptor in CQ even after a
single frame being transmitted and second patch removes the need for
having HW rings sized to power of 2 number of descriptors when used
against AF_XDP.

I also forgot to mention that due to the current Tx cleaning algorithm,
4k HW ring was broken and these two patches bring it back to life, so we
kill two birds with one stone.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: xsk: drop power of 2 ring size restriction for AF_XDP
  ice: xsk: change batched Tx descriptor cleaning
====================

Link: https://lore.kernel.org/r/20220927164112.4011983-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:04:43 -07:00
Daniel Golle
c9da02bfb1 net: ethernet: mtk_eth_soc: fix mask of RX_DMA_GET_SPORT{,_V2}
The bitmasks applied in RX_DMA_GET_SPORT and RX_DMA_GET_SPORT_V2 macros
were swapped. Fix that.

Reported-by: Chen Minqiang <ptpt52@gmail.com>
Fixes: 160d3a9b19 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support")
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/YzMW+mg9UsaCdKRQ@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:03:57 -07:00
Vladimir Oltean
276d37eb44 net: mscc: ocelot: fix tagged VLAN refusal while under a VLAN-unaware bridge
Currently the following set of commands fails:

$ ip link add br0 type bridge # vlan_filtering 0
$ ip link set swp0 master br0
$ bridge vlan
port              vlan-id
swp0              1 PVID Egress Untagged
$ bridge vlan add dev swp0 vid 10
Error: mscc_ocelot_switch_lib: Port with more than one egress-untagged VLAN cannot have egress-tagged VLANs.

Dumping ocelot->vlans, one can see that the 2 egress-untagged VLANs on swp0 are
vid 1 (the bridge PVID) and vid 4094, a PVID used privately by the driver for
VLAN-unaware bridging. So this is why bridge vid 10 is refused, despite
'bridge vlan' showing a single egress untagged VLAN.

As mentioned in the comment added, having this private VLAN does not impose
restrictions to the hardware configuration, yet it is a bookkeeping problem.

There are 2 possible solutions.

One is to make the functions that operate on VLAN-unaware pvids:
- ocelot_add_vlan_unaware_pvid()
- ocelot_del_vlan_unaware_pvid()
- ocelot_port_setup_dsa_8021q_cpu()
- ocelot_port_teardown_dsa_8021q_cpu()
call something different than ocelot_vlan_member_(add|del)(), the latter being
the real problem, because it allocates a struct ocelot_bridge_vlan *vlan which
it adds to ocelot->vlans. We don't really *need* the private VLANs in
ocelot->vlans, it's just that we have the extra convenience of having the
vlan->portmask cached in software (whereas without these structures, we'd have
to create a raw ocelot_vlant_rmw_mask() procedure which reads back the current
port mask from hardware).

The other solution is to filter out the private VLANs from
ocelot_port_num_untagged_vlans(), since they aren't what callers care about.
We only need to do this to the mentioned function and not to
ocelot_port_num_tagged_vlans(), because private VLANs are never egress-tagged.

Nothing else seems to be broken in either solution, but the first one requires
more rework which will conflict with the net-next change  36a0bf4435 ("net:
mscc: ocelot: set up tag_8021q CPU ports independent of user port affinity"),
and I'd like to avoid that. So go with the other one.

Fixes: 54c3198460 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220927122042.1100231-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 19:03:28 -07:00
Maciej Fijalkowski
b3056ae2b5 ice: xsk: drop power of 2 ring size restriction for AF_XDP
We had multiple customers in the past months that reported commit
296f13ff38 ("ice: xsk: Force rings to be sized to power of 2")
makes them unable to use ring size of 8160 in conjunction with AF_XDP.
Remove this restriction.

Fixes: 296f13ff38 ("ice: xsk: Force rings to be sized to power of 2")
CC: Alasdair McWilliam <alasdair.mcwilliam@outlook.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27 09:01:01 -07:00
Maciej Fijalkowski
29322791bc ice: xsk: change batched Tx descriptor cleaning
AF_XDP Tx descriptor cleaning in ice driver currently works in a "lazy"
way - descriptors are not cleaned immediately after send. We rather hold
on with cleaning until we see that free space in ring drops below
particular threshold. This was supposed to reduce the amount of
unnecessary work related to cleaning and instead of keeping the ring
empty, ring was rather saturated.

In AF_XDP realm cleaning Tx descriptors implies producing them to CQ.
This is a way of letting know user space that particular descriptor has
been sent, as John points out in [0].

We tried to implement serial descriptor cleaning which would be used in
conjunction with batched cleaning but it made code base more convoluted
and probably harder to maintain in future. Therefore we step away from
batched cleaning in a current form in favor of an approach where we set
RS bit on every last descriptor from a batch and clean always at the
beginning of ice_xmit_zc().

This means that we give up a bit of Tx performance, but this doesn't
hurt l2fwd scenario which is way more meaningful than txonly as this can
be treaten as AF_XDP based packet generator. l2fwd is not hurt due to
the fact that Tx side is much faster than Rx and Rx is the one that has
to catch Tx up.

FWIW Tx descriptors are still produced in a batched way.

[0]: https://lore.kernel.org/bpf/62b0a20232920_3573208ab@john.notmuch/

Fixes: 126cdfe100 ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27 08:11:02 -07:00
Junxiao Chang
49725ffc15 net: stmmac: power up/down serdes in stmmac_open/release
This commit fixes DMA engine reset timeout issue in suspend/resume
with ADLink I-Pi SMARC Plus board which dmesg shows:
...
[   54.678271] PM: suspend exit
[   54.754066] intel-eth-pci 0000:00:1d.2 enp0s29f2: PHY [stmmac-3:01] driver [Maxlinear Ethernet GPY215B] (irq=POLL)
[   54.755808] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-0
...
[   54.780482] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-7
[   55.784098] intel-eth-pci 0000:00:1d.2: Failed to reset the dma
[   55.784111] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_hw_setup: DMA engine initialization failed
[   55.784115] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_open: Hw setup failed
...

The issue is related with serdes which impacts clock.  There is
serdes in ADLink I-Pi SMARC board ethernet controller. Please refer to
commit b9663b7ca6 ("net: stmmac: Enable SERDES power up/down sequence")
for detial. When issue is reproduced, DMA engine clock is not ready
because serdes is not powered up.

To reproduce DMA engine reset timeout issue with hardware which has
serdes in GBE controller, install Ubuntu. In Ubuntu GUI, click
"Power Off/Log Out" -> "Suspend" menu, it disables network interface,
then goes to sleep mode. When it wakes up, it enables network
interface again. Stmmac driver is called in this way:

1. stmmac_release: Stop network interface. In this function, it
   disables DMA engine and network interface;
2. stmmac_suspend: It is called in kernel suspend flow. But because
   network interface has been disabled(netif_running(ndev) is
   false), it does nothing and returns directly;
3. System goes into S3 or S0ix state. Some time later, system is
   waken up by keyboard or mouse;
4. stmmac_resume: It does nothing because network interface has
   been disabled;
5. stmmac_open: It is called to enable network interace again. DMA
   engine is initialized in this API, but serdes is not power on so
   there will be DMA engine reset timeout issue.

Similarly, serdes powerdown should be added in stmmac_release.
Network interface might be disabled by cmd "ifconfig eth0 down",
DMA engine, phy and mac have been disabled in ndo_stop callback,
serdes should be powered down as well. It doesn't make sense that
serdes is on while other components have been turned off.

If ethernet interface is in enabled state(netif_running(ndev) is true)
before suspend/resume, the issue couldn't be reproduced  because serdes
could be powered up in stmmac_resume.

Because serdes_powerup is added in stmmac_open, it doesn't need to be
called in probe function.

Fixes: b9663b7ca6 ("net: stmmac: Enable SERDES power up/down sequence")
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon@intel.com>
Tested-by: Jimmy JS Chen <jimmyjs.chen@adlinktech.com>
Tested-by: Looi, Hong Aun <hong.aun.looi@intel.com>
Link: https://lore.kernel.org/r/20220923050448.1220250-1-junxiao.chang@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27 10:38:11 +02:00
Peng Wu
4774db8dfc net/mlxbf_gige: Fix an IS_ERR() vs NULL bug in mlxbf_gige_mdio_probe
The devm_ioremap() function returns NULL on error, it doesn't return
error pointers.

Fixes: 3a1a274e93 ("mlxbf_gige: compute MDIO period based on i1clk")
Signed-off-by: Peng Wu <wupeng58@huawei.com>
Link: https://lore.kernel.org/r/20220923023640.116057-1-wupeng58@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 13:20:23 -07:00
Rafael Mendonca
c635ebe8d9 cxgb4: fix missing unlock on ETHOFLD desc collect fail path
The label passed to the QDESC_GET for the ETHOFLD TXQ, RXQ, and FLQ, is the
'out' one, which skips the 'out_unlock' label, and thus doesn't unlock the
'uld_mutex' before returning. Additionally, since commit 5148e5950c
("cxgb4: add EOTID tracking and software context dump"), the access to
these ETHOFLD hardware queues should be protected by the 'mqprio_mutex'
instead.

Fixes: 2d0cb84dd9 ("cxgb4: add ETHOFLD hardware queue support")
Fixes: 5148e5950c ("cxgb4: add EOTID tracking and software context dump")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Reviewed-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Link: https://lore.kernel.org/r/20220922175109.764898-1-rafaelmendsr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 13:17:53 -07:00
Sasha Levin
6052a4c11f Revert "net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()"
This reverts commit fe2c9c61f6.

On Tue, Sep 13, 2022 at 05:48:58PM +0100, Russell King (Oracle) wrote:
>What happens if this is built as a module, and the module is loaded,
>binds (and creates the directory), then is removed, and then re-
>inserted?  Nothing removes the old directory, so doesn't
>debugfs_create_dir() fail, resulting in subsequent failure to add
>any subsequent debugfs entries?
>
>I don't think this patch should be backported to stable trees until
>this point is addressed.

Revert until a proper fix is available as the original behavior was
better.

Cc: Marcin Wojtas <mw@semihalf.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: stable@kernel.org
Reported-by: Russell King <linux@armlinux.org.uk>
Fixes: fe2c9c61f6 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220923234736.657413-1-sashal@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 10:06:34 -07:00
Andy Moreton
b7ca8d5f56 sfc: correct filter_table_remove method for EF10 PFs
A previous patch added a wrapper function to take a lock around
 efx_mcdi_filter_table_remove(), but only changed EF10 VFs' method table
 to call it.  Change it in the PF method table too.

Fixes: 77eb40749d ("sfc: move table locking into filter_table_{probe,remove} methods")
Signed-off-by: Andy Moreton <andy.moreton@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220922211218.814-1-ecree@xilinx.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-23 20:57:14 -07:00
Radhey Shyam Pandey
f22bd29ba1 net: macb: Fix ZynqMP SGMII non-wakeup source resume failure
When GEM is in SGMII mode and disabled as a wakeup source, the power
management controller can power down the entire full power domain(FPD)
if none of the FPD devices are in use.

Incase of FPD off, there are below ethernet link up issues on non-wakeup
suspend/resume. To fix it add phy_exit() in suspend and phy_init() in the
resume path which reinitializes PS GTR SGMII lanes.

$ echo +20 > /sys/class/rtc/rtc0/wakealarm
$ echo mem > /sys/power/state

After resume:

$ ifconfig eth0 up
xilinx-psgtr fd400000.phy: lane 0 (type 10, protocol 5): PLL lock timeout
phy phy-fd400000.phy.0: phy poweron failed --> -110
xilinx-psgtr fd400000.phy: lane 0 (type 10, protocol 5): PLL lock timeout
SIOCSIFFLAGS: Connection timed out
phy phy-fd400000.phy.0: phy poweron failed --> -110

Fixes: 8b73fa3ae0 ("net: macb: Added ZynqMP-specific initialization")
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-23 12:32:49 +01:00
Linus Torvalds
504c25cb76 Including fixes from wifi, netfilter and can.
A handful of awaited fixes here - revert of the FEC changes,
 bluetooth fix, fixes for iwlwifi spew.
 
 We added a warning in PHY/MDIO code which is triggering on
 a couple of platforms in a false-positive-ish way. If we can't
 iron that out over the week we'll drop it and re-add for 6.1.
 
 I've added a new "follow up fixes" section for fixes to fixes
 in 6.0-rcs but it may actually give the false impression that
 those are problematic or that more testing time would have
 caught them. So likely a one time thing.
 
 Follow up fixes:
 
  - nf_tables_addchain: fix nft_counters_enabled underflow
 
  - ebtables: fix memory leak when blob is malformed
 
  - nf_ct_ftp: fix deadlock when nat rewrite is needed
 
 Current release - regressions:
 
  - Revert "fec: Restart PPS after link state change"
  - Revert "net: fec: Use a spinlock to guard `fep->ptp_clk_on`"
 
  - Bluetooth: fix HCIGETDEVINFO regression
 
  - wifi: mt76: fix 5 GHz connection regression on mt76x0/mt76x2
 
  - mptcp: fix fwd memory accounting on coalesce
 
  - rwlock removal fall out:
    - ipmr: always call ip{,6}_mr_forward() from RCU read-side
      critical section
    - ipv6: fix crash when IPv6 is administratively disabled
 
  - tcp: read multiple skbs in tcp_read_skb()
 
  - mdio_bus_phy_resume state warning fallout:
    - eth: ravb: fix PHY state warning splat during system resume
    - eth: sh_eth: fix PHY state warning splat during system resume
 
 Current release - new code bugs:
 
  - wifi: iwlwifi: don't spam logs with NSS>2 messages
 
  - eth: mtk_eth_soc: enable XDP support just for MT7986 SoC
 
 Previous releases - regressions:
 
  - bonding: fix NULL deref in bond_rr_gen_slave_id
 
  - wifi: iwlwifi: mark IWLMEI as broken
 
 Previous releases - always broken:
 
  - nf_conntrack helpers:
    - irc: tighten matching on DCC message
    - sip: fix ct_sip_walk_headers
    - osf: fix possible bogus match in nf_osf_find()
 
  - ipvlan: fix out-of-bound bugs caused by unset skb->mac_header
 
  - core: fix flow symmetric hash
 
  - bonding, team: unsync device addresses on ndo_stop
 
  - phy: micrel: fix shared interrupt on LAN8814
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmMsj3EACgkQMUZtbf5S
 IrsUgQ//eXxuUZeGTg7cgJKPFJelrZ3iL16B1+s2qX94GPIqXRAShgC78iM7IbSe
 y3vR/7YVE7sKXm88wnLefMQVXPp0cE2p0+8++E/j4zcRZsM5sHb2+d3gW6nos2ed
 U8Ldm7LzWUNt/o1ZHDqZWBSoreFkmbFyHO6FVPCuH11tFUJqxJ/SP860mwo6tbuT
 HOoVphKis41IMEXCgybs2V0DAQewba0gejzAmySDy8epNhOj2F4Vo6aadnUCI68U
 HrIFYe2wiEi6MZDsB9zpRXc9seb6ZBKbBjgQnTK7MwfBEQCzxtR2lkNobJM1WbdL
 nYwHBOJ16yX0BnlSpUEepv6iJYY5Q7FS35Wk3Rq5Mik6DaEir6vVSBdRxHpYOkO2
 KPIyyMMAA5E8mAtqH3PcpnwDK+9c3KlZYYKXxIp2IjQm87DpOZJFynwsC3Crmbzo
 C7UTMav2nkHljoapMLUwzqyw2ip+Qo14XA043FDPUru1sXY9CY6q50XZa5GmrNKh
 xyaBdp4Ckj1kOuXUR9jz3Rq8skOZ8lNGHtiCdgPZitWhNKW1YORJihC7/9zdieCR
 1gOE7Dpz/MhVmFn2e8S5O3TkU5lXfALfPDJi4QiML5VLHXd/nCE5sHPiOBWcoo4w
 2djKbIGpLRnO6qMs4NkWNmPbG+/ouvpM+lewqn+xU4TGyn/NTbI=
 =wrep
 -----END PGP SIGNATURE-----

Merge tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from wifi, netfilter and can.

  A handful of awaited fixes here - revert of the FEC changes, bluetooth
  fix, fixes for iwlwifi spew.

  We added a warning in PHY/MDIO code which is triggering on a couple of
  platforms in a false-positive-ish way. If we can't iron that out over
  the week we'll drop it and re-add for 6.1.

  I've added a new "follow up fixes" section for fixes to fixes in
  6.0-rcs but it may actually give the false impression that those are
  problematic or that more testing time would have caught them. So
  likely a one time thing.

  Follow up fixes:

   - nf_tables_addchain: fix nft_counters_enabled underflow

   - ebtables: fix memory leak when blob is malformed

   - nf_ct_ftp: fix deadlock when nat rewrite is needed

  Current release - regressions:

   - Revert "fec: Restart PPS after link state change" and the related
     "net: fec: Use a spinlock to guard `fep->ptp_clk_on`"

   - Bluetooth: fix HCIGETDEVINFO regression

   - wifi: mt76: fix 5 GHz connection regression on mt76x0/mt76x2

   - mptcp: fix fwd memory accounting on coalesce

   - rwlock removal fall out:
      - ipmr: always call ip{,6}_mr_forward() from RCU read-side
        critical section
      - ipv6: fix crash when IPv6 is administratively disabled

   - tcp: read multiple skbs in tcp_read_skb()

   - mdio_bus_phy_resume state warning fallout:
      - eth: ravb: fix PHY state warning splat during system resume
      - eth: sh_eth: fix PHY state warning splat during system resume

  Current release - new code bugs:

   - wifi: iwlwifi: don't spam logs with NSS>2 messages

   - eth: mtk_eth_soc: enable XDP support just for MT7986 SoC

  Previous releases - regressions:

   - bonding: fix NULL deref in bond_rr_gen_slave_id

   - wifi: iwlwifi: mark IWLMEI as broken

  Previous releases - always broken:

   - nf_conntrack helpers:
      - irc: tighten matching on DCC message
      - sip: fix ct_sip_walk_headers
      - osf: fix possible bogus match in nf_osf_find()

   - ipvlan: fix out-of-bound bugs caused by unset skb->mac_header

   - core: fix flow symmetric hash

   - bonding, team: unsync device addresses on ndo_stop

   - phy: micrel: fix shared interrupt on LAN8814"

* tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
  selftests: forwarding: add shebang for sch_red.sh
  bnxt: prevent skb UAF after handing over to PTP worker
  net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
  net: sched: fix possible refcount leak in tc_new_tfilter()
  net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
  udp: Use WARN_ON_ONCE() in udp_read_skb()
  selftests: bonding: cause oops in bond_rr_gen_slave_id
  bonding: fix NULL deref in bond_rr_gen_slave_id
  net: phy: micrel: fix shared interrupt on LAN8814
  net/smc: Stop the CLC flow if no link to map buffers on
  ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient
  net: atlantic: fix potential memory leak in aq_ndev_close()
  can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
  can: gs_usb: gs_can_open(): fix race dev->can.state condition
  can: flexcan: flexcan_mailbox_read() fix return value for drop = true
  net: sh_eth: Fix PHY state warning splat during system resume
  net: ravb: Fix PHY state warning splat during system resume
  netfilter: nf_ct_ftp: fix deadlock when nat rewrite is needed
  netfilter: ebtables: fix memory leak when blob is malformed
  netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()
  ...
2022-09-22 10:58:13 -07:00
Jakub Kicinski
c31f26c8f6 bnxt: prevent skb UAF after handing over to PTP worker
When reading the timestamp is required bnxt_tx_int() hands
over the ownership of the completed skb to the PTP worker.
The skb should not be used afterwards, as the worker may
run before the rest of our code and free the skb, leading
to a use-after-free.

Since dev_kfree_skb_any() accepts NULL make the loss of
ownership more obvious and set skb to NULL.

Fixes: 83bb623c96 ("bnxt_en: Transmit and retrieve packet timestamps")
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20220921201005.335390-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22 07:33:17 -07:00
Liang He
3aac7ada64 net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
In prestera_port_sfp_bind(), there are two refcounting bugs:
(1) we should call of_node_get() before of_find_node_by_name() as
it will automaitcally decrease the refcount of 'from' argument;
(2) we should call of_node_put() for the break of the iteration
for_each_child_of_node() as it will automatically increase and
decrease the 'child'.

Fixes: 52323ef754 ("net: marvell: prestera: add phylink support")
Signed-off-by: Liang He <windhl@126.com>
Reviewed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Link: https://lore.kernel.org/r/20220921133245.4111672-1-windhl@126.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22 07:23:58 -07:00
Sean Anderson
878e240571 net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
There is a separate receive path for small packets (under 256 bytes).
Instead of allocating a new dma-capable skb to be used for the next packet,
this path allocates a skb and copies the data into it (reusing the existing
sbk for the next packet). There are two bytes of junk data at the beginning
of every packet. I believe these are inserted in order to allow aligned DMA
and IP headers. We skip over them using skb_reserve. Before copying over
the data, we must use a barrier to ensure we see the whole packet. The
current code only synchronizes len bytes, starting from the beginning of
the packet, including the junk bytes. However, this leaves off the final
two bytes in the packet. Synchronize the whole packet.

To reproduce this problem, ping a HME with a payload size between 17 and
214

	$ ping -s 17 <hme_address>

which will complain rather loudly about the data mismatch. Small packets
(below 60 bytes on the wire) do not have this issue. I suspect this is
related to the padding added to increase the minimum packet size.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220920235018.1675956-1-seanga2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22 06:44:28 -07:00
Jakub Kicinski
624aea6bed Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-09-20 (ice)

Michal re-sets TC configuration when changing number of queues.

Mateusz moves the check and call for link-down-on-close to the specific
path for downing/closing the interface.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: Fix interface being down after reset with link-down-on-close flag on
  ice: config netdev tc before setting queues number
====================

Link: https://lore.kernel.org/r/20220920205344.1860934-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21 18:39:47 -07:00
Larysa Zaremba
114f398d48 ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient
The original patch added the static branch to handle the situation,
when assigning an XDP TX queue to every CPU is not possible,
so they have to be shared.

However, in the XDP transmit handler ice_xdp_xmit(), an error was
returned in such cases even before static condition was checked,
thus making queue sharing still impossible.

Fixes: 22bf877e52 ("ice: introduce XDP_TX fallback path")
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Link: https://lore.kernel.org/r/20220919134346.25030-1-larysa.zaremba@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21 17:33:29 -07:00
Jakub Kicinski
f64780e3cc Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-09-19 (iavf, i40e)

Norbert adds checking of buffer size for Rx buffer checks in iavf.

Michal corrects setting of max MTU in iavf to account for MTU data provided
by PF, fixes i40e to set VF max MTU, and resolves lack of rate limiting
when value was less than divisor for i40e.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  i40e: Fix set max_tx_rate when it is lower than 1 Mbps
  i40e: Fix VF set max MTU size
  iavf: Fix set max MTU size with port VLAN and jumbo frames
  iavf: Fix bad page state
====================

Link: https://lore.kernel.org/r/20220919223428.572091-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-21 17:28:35 -07:00
Jianglei Nie
65e5d27df6 net: atlantic: fix potential memory leak in aq_ndev_close()
If aq_nic_stop() fails, aq_ndev_close() returns err without calling
aq_nic_deinit() to release the relevant memory and resource, which
will lead to a memory leak.

We can fix it by deleting the if condition judgment and goto statement to
call aq_nic_deinit() directly after aq_nic_stop() to fix the memory leak.

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-21 12:50:57 +01:00
Geert Uytterhoeven
6a1dbfefda net: sh_eth: Fix PHY state warning splat during system resume
Since commit 744d23c71a ("net: phy: Warn about incorrect
mdio_bus_phy_resume() state"), a warning splat is printed during system
resume with Wake-on-LAN disabled:

	WARNING: CPU: 0 PID: 626 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xe4

As the Renesas SuperH Ethernet driver already calls phy_{stop,start}()
in its suspend/resume callbacks, it is sufficient to just mark the MAC
responsible for managing the power state of the PHY.

Fixes: fba863b816 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/c6e1331b9bef61225fa4c09db3ba3e2e7214ba2d.1663598886.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 17:05:50 -07:00
Geert Uytterhoeven
4924c0cdce net: ravb: Fix PHY state warning splat during system resume
Since commit 744d23c71a ("net: phy: Warn about incorrect
mdio_bus_phy_resume() state"), a warning splat is printed during system
resume with Wake-on-LAN disabled:

        WARNING: CPU: 0 PID: 1197 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xc8

As the Renesas Ethernet AVB driver already calls phy_{stop,start}() in
its suspend/resume callbacks, it is sufficient to just mark the MAC
responsible for managing the power state of the PHY.

Fixes: fba863b816 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/8ec796f47620980fdd0403e21bd8b7200b4fa1d4.1663598796.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 17:05:46 -07:00
Mateusz Palczewski
8ac7132704 ice: Fix interface being down after reset with link-down-on-close flag on
When performing a reset on ice driver with link-down-on-close flag on
interface would always stay down. Fix this by moving a check of this
flag to ice_stop() that is called only when user wants to bring
interface down.

Fixes: ab4ab73fc1 ("ice: Add ethtool private flag to make forcing link down optional")
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Petr Oros <poros@redhat.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20 13:30:51 -07:00
Michal Swiatkowski
122045ca77 ice: config netdev tc before setting queues number
After lowering number of tx queues the warning appears:
"Number of in use tx queues changed invalidating tc mappings. Priority
traffic classification disabled!"
Example command to reproduce:
ethtool -L enp24s0f0 tx 36 rx 36

Fix this by setting correct tc mapping before setting real number of
queues on netdev.

Fixes: 0754d65bd4 ("ice: Add infrastructure for mqprio support via ndo_setup_tc")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-20 13:30:51 -07:00
Vladimir Oltean
5641c751fe net: enetc: deny offload of tc-based TSN features on VF interfaces
TSN features on the ENETC (taprio, cbs, gate, police) are configured
through a mix of command BD ring messages and port registers:
enetc_port_rd(), enetc_port_wr().

Port registers are a region of the ENETC memory map which are only
accessible from the PCIe Physical Function. They are not accessible from
the Virtual Functions.

Moreover, attempting to access these registers crashes the kernel:

$ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs
pci 0000:00:01.0: [1957:ef00] type 00 class 0x020001
fsl_enetc_vf 0000:00:01.0: Adding to iommu group 15
fsl_enetc_vf 0000:00:01.0: enabling device (0000 -> 0002)
fsl_enetc_vf 0000:00:01.0 eno0vf0: renamed from eth0
$ tc qdisc replace dev eno0vf0 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
	sched-entry S 0x7f 900000 sched-entry S 0x80 100000 flags 0x2
Unable to handle kernel paging request at virtual address ffff800009551a08
Internal error: Oops: 96000007 [#1] PREEMPT SMP
pc : enetc_setup_tc_taprio+0x170/0x47c
lr : enetc_setup_tc_taprio+0x16c/0x47c
Call trace:
 enetc_setup_tc_taprio+0x170/0x47c
 enetc_setup_tc+0x38/0x2dc
 taprio_change+0x43c/0x970
 taprio_init+0x188/0x1e0
 qdisc_create+0x114/0x470
 tc_modify_qdisc+0x1fc/0x6c0
 rtnetlink_rcv_msg+0x12c/0x390

Split enetc_setup_tc() into separate functions for the PF and for the
VF drivers. Also remove enetc_qos.o from being included into
enetc-vf.ko, since it serves absolutely no purpose there.

Fixes: 34c6adf197 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220916133209.3351399-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:27:10 -07:00
Vladimir Oltean
fed38e64d9 net: enetc: move enetc_set_psfp() out of the common enetc_set_features()
The VF netdev driver shouldn't respond to changes in the NETIF_F_HW_TC
flag; only PFs should. Moreover, TSN-specific code should go to
enetc_qos.c, which should not be included in the VF driver.

Fixes: 79e499829f ("net: enetc: add hw tc hw offload features for PSPF capability")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220916133209.3351399-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:27:09 -07:00
Íñigo Huguet
589c6eded1 sfc/siena: fix null pointer dereference in efx_hard_start_xmit
Like in previous patch for sfc, prevent potential (but unlikely) NULL
pointer dereference.

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/20220915141958.16458-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:21:28 -07:00
Íñigo Huguet
974bb793ad sfc/siena: fix TX channel offset when using legacy interrupts
As in previous commit for sfc, fix TX channels offset when
efx_siena_separate_tx_channels is false (the default)

Fixes: 25bde571b4 ("sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/20220915141653.15504-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:21:13 -07:00
Francesco Dolcini
01b825f997 Revert "net: fec: Use a spinlock to guard fep->ptp_clk_on"
This reverts commit b353b241f1, this is
creating multiple issues, just not ready to be merged yet.

Link: https://lore.kernel.org/all/CAHk-=wj1obPoTu1AHj9Bd_BGYjdjDyPP+vT5WMj8eheb3A9WHw@mail.gmail.com/
Link: https://lore.kernel.org/all/20220907143915.5w65kainpykfobte@pengutronix.de/
Fixes: b353b241f1 ("net: fec: Use a spinlock to guard `fep->ptp_clk_on`")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 12:16:58 +02:00
Francesco Dolcini
7b15515fc1 Revert "fec: Restart PPS after link state change"
This reverts commit f79959220f, this is
creating multiple issues, just not ready to be merged yet.

Link: https://lore.kernel.org/all/20220905180542.GA3685102@roeck-us.net/
Link: https://lore.kernel.org/all/CAHk-=wj1obPoTu1AHj9Bd_BGYjdjDyPP+vT5WMj8eheb3A9WHw@mail.gmail.com/
Fixes: f79959220f ("fec: Restart PPS after link state change")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 12:16:58 +02:00
Shailend Chand
8ccac4edc8 gve: Fix GFP flags when allocing pages
Use GFP_ATOMIC when allocating pages out of the hotpath,
continue to use GFP_KERNEL when allocating pages during setup.

GFP_KERNEL will allow blocking which allows it to succeed
more often in a low memory enviornment but in the hotpath we do
not want to allow the allocation to block.

Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Signed-off-by: Shailend Chand <shailend@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Link: https://lore.kernel.org/r/20220913000901.959546-1-jeroendb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:31:06 -07:00
Vadim Fedorenko
ae8ffba8ba bnxt_en: fix flags to check for supported fw version
The warning message of unsupported FW appears every time RX timestamps
are disabled on the interface. The patch fixes the flags to correct set
for the check.

Fixes: 66ed81dced ("bnxt_en: Enable packet timestamping for all RX packets")
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20220915234932.25497-1-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:22:06 -07:00
Íñigo Huguet
0a242eb291 sfc: fix null pointer dereference in efx_hard_start_xmit
Trying to get the channel from the tx_queue variable here is wrong
because we can only be here if tx_queue is NULL, so we shouldn't
dereference it. As the above comment in the code says, this is very
unlikely to happen, but it's wrong anyway so let's fix it.

I hit this issue because of a different bug that caused tx_queue to be
NULL. If that happens, this is the error message that we get here:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [...]
  RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220914111135.21038-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:09:55 -07:00
Íñigo Huguet
f232af4295 sfc: fix TX channel offset when using legacy interrupts
In legacy interrupt mode the tx_channel_offset was hardcoded to 1, but
that's not correct if efx_sepparate_tx_channels is false. In that case,
the offset is 0 because the tx queues are in the single existing channel
at index 0, together with the rx queue.

Without this fix, as soon as you try to send any traffic, it tries to
get the tx queues from an uninitialized channel getting these errors:
  WARNING: CPU: 1 PID: 0 at drivers/net/ethernet/sfc/tx.c:540 efx_hard_start_xmit+0x12e/0x170 [sfc]
  [...]
  RIP: 0010:efx_hard_start_xmit+0x12e/0x170 [sfc]
  [...]
  Call Trace:
   <IRQ>
   dev_hard_start_xmit+0xd7/0x230
   sch_direct_xmit+0x9f/0x360
   __dev_queue_xmit+0x890/0xa40
  [...]
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [...]
  RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]
  [...]
  Call Trace:
   <IRQ>
   dev_hard_start_xmit+0xd7/0x230
   sch_direct_xmit+0x9f/0x360
   __dev_queue_xmit+0x890/0xa40
  [...]

Fixes: c308dfd1b4 ("sfc: fix wrong tx channel offset with efx_separate_tx_channels")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220914103648.16902-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:09:43 -07:00
Lorenzo Bianconi
5e69163d3b net: ethernet: mtk_eth_soc: enable XDP support just for MT7986 SoC
Disable page_pool/XDP support for MT7621 SoC in order fix a regression
introduce adding XDP for MT7986 SoC. There is no a real use case for XDP
on MT7621 since it is a low-end cpu. Moreover this patch reduces the
memory footprint.

Tested-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Fixes: 23233e577e ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/2bf31e27b888c43228b0d84dd2ef5033338269e2.1663074002.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 16:30:54 -07:00
Haiyang Zhang
6fd2c68da5 net: mana: Add rmb after checking owner bits
Per GDMA spec, rmb is necessary after checking owner_bits, before
reading EQ or CQ entries.

Add rmb in these two places to comply with the specs.

Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reported-by: Sinan Kaya <Sinan.Kaya@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1662928805-15861-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 16:23:19 -07:00
Michal Jaron
198eb7e1b8 i40e: Fix set max_tx_rate when it is lower than 1 Mbps
While converting max_tx_rate from bytes to Mbps, this value was set to 0,
if the original value was lower than 125000 bytes (1 Mbps). This would
cause no transmission rate limiting to occur. This happened due to lack of
check of max_tx_rate against the 1 Mbps value for max_tx_rate and the
following division by 125000. Fix this issue by adding a helper
i40e_bw_bytes_to_mbits() which sets max_tx_rate to minimum usable value of
50 Mbps, if its value is less than 1 Mbps, otherwise do the required
conversion by dividing by 125000.

Fixes: 5ecae4120a ("i40e: Refactor VF BW rate limiting")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-19 14:13:06 -07:00
Michal Jaron
372539def2 i40e: Fix VF set max MTU size
Max MTU sent to VF is set to 0 during memory allocation. It cause
that max MTU on VF is changed to IAVF_MAX_RXBUFFER and does not
depend on data from HW.

Set max_mtu field in virtchnl_vf_resource struct to inform
VF in GET_VF_RESOURCES msg what size should be max frame.

Fixes: dab86afdbb ("i40e/i40evf: Change the way we limit the maximum frame size for Rx")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-19 14:13:06 -07:00
Michal Jaron
399c98c4dc iavf: Fix set max MTU size with port VLAN and jumbo frames
After setting port VLAN and MTU to 9000 on VF with ice driver there
was an iavf error
"PF returned error -5 (IAVF_ERR_PARAM) to our request 6".

During queue configuration, VF's max packet size was set to
IAVF_MAX_RXBUFFER but on ice max frame size was smaller by VLAN_HLEN
due to making some space for port VLAN as VF is not aware whether it's
in a port VLAN. This mismatch in sizes caused ice to reject queue
configuration with ERR_PARAM error. Proper max_mtu is sent from ice PF
to VF with GET_VF_RESOURCES msg but VF does not look at this.

In iavf change max_frame from IAVF_MAX_RXBUFFER to max_mtu
received from pf with GET_VF_RESOURCES msg to make vf's
max_frame_size dependent from pf. Add check if received max_mtu is
not in eligible range then set it to IAVF_MAX_RXBUFFER.

Fixes: dab86afdbb ("i40e/i40evf: Change the way we limit the maximum frame size for Rx")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-19 14:12:57 -07:00
David Thompson
182447b121 mlxbf_gige: clear MDIO gateway lock after read
The MDIO gateway (GW) lock in BlueField-2 GIGE logic is
set after read.  This patch adds logic to make sure the
lock is always cleared at the end of each MDIO transaction.

Fixes: f92e1869d7 ("Add Mellanox BlueField Gigabit Ethernet driver")
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: David Thompson <davthompson@nvidia.com>
Link: https://lore.kernel.org/r/20220902164247.19862-1-davthompson@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 14:11:48 -07:00
Norbert Zulinski
66039eb901 iavf: Fix bad page state
Fix bad page state, free inappropriate page in handling dummy
descriptor. iavf_build_skb now has to check not only if rx_buffer is
NULL but also if size is zero, same thing in iavf_clean_rx_irq.
Without this patch driver would free page that will be used
by napi_build_skb.

Fixes: a9f49e0060 ("iavf: Fix handling of dummy receive descriptors")
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-19 14:03:45 -07:00
David S. Miller
21be1ad637 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-09-08 (ice, iavf)

This series contains updates to ice and iavf drivers.

Dave removes extra unplug of auxiliary bus on reset which caused a
scheduling while atomic to be reported for ice.

Ding Hui defers setting of queues for TCs to ensure valid configuration
and restores old config if invalid for ice.

Sylwester fixes a check of setting MAC address to occur after result is
received from PF for iavf driver.

Brett changes check of ring tail to use software cached value as not all
devices have access to register tail for iavf driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-16 12:16:44 +01:00
Oleksandr Mazur
9124dbcc2d net: marvell: prestera: add support for for Aldrin2
Aldrin2 (98DX8525) is a Marvell Prestera PP, with 100G support.

Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>

V2:
  - retarget to net tree instead of net-next;
  - fix missed colon in patch subject ('net marvell' vs 'net: mavell');
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-16 11:53:48 +01:00
Linus Torvalds
e839a75601 hyperv-fixes for v6.0-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmMfrHUTHHdlaS5saXVA
 a2VybmVsLm9yZwAKCRB2FHBfkEGgXhocCAC7OUoyEeZxSkrX7FJbPj1XA7b2ansi
 bJX6onBOCaJalswQ83b+GU8IrmAHMMv8V7fa/v3dFf0pvKoBPR9WxizyV9hSLAmx
 3WGuj1Wzr3oJb+bWfVIzE3qpWezNtNsIiP6oKT9Q07wmUkN4877Ikm6JhgkDP94C
 20fZZouhaYV4njkM/w7QBYxTueQHSWYxjJDBiBc31CN5ChwY/2gkrAHfM9vq7krX
 MNtaGy75d2EQYKc4OlAjxHseFo6wnMThibZ3JStX/KASV/e02DlRTU3UYUpIrNCQ
 gbU+bbeeLN3BdCrMkZlFvXtGmTQJEBGaEBGd8qsT6pM6jsZT49tlYdz+
 =e1dD
 -----END PGP SIGNATURE-----

Merge tag 'hyperv-fixes-signed-20220912' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Fix an error handling issue in DRM driver (Christophe JAILLET)

 - Fix some issues in framebuffer driver (Vitaly Kuznetsov)

 - Two typo fixes (Jason Wang, Shaomin Deng)

 - Drop unnecessary casting in kvp tool (Zhou Jie)

* tag 'hyperv-fixes-signed-20220912' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region
  Drivers: hv: Always reserve framebuffer region for Gen1 VMs
  PCI: Move PCI_VENDOR_ID_MICROSOFT/PCI_DEVICE_ID_HYPERV_VIDEO definitions to pci_ids.h
  tools: hv: kvp: remove unnecessary (void*) conversions
  Drivers: hv: remove duplicate word in a comment
  tools: hv: Remove an extraneous "the"
  drm/hyperv: Fix an error handling path in hyperv_vmbus_probe()
2022-09-12 18:33:55 -04:00
Linus Torvalds
0099baa879 v6.0 second rc pull request
Many bug fixes in several drivers:
 
 - Fix misuse of the DMA API in rtrs
 
 - Several irdma issues: hung task due to SQ flushing, incorrect capability
   reporting to userspace, improper error handling for MW corners, touching
   an uninitialized SGL for during invalidation.
 
 - hns was using the wrong page size limits for the HW, an incorrect
   calculation of wqe_shift causing WQE corruption, and mis computed
   a timer id.
 
 - Fix a crash in SRP triggered by blktests
 
 - Fix compiler errors by calling virt_to_page() with the proper type in
   siw
 
 - Userspace triggerable deadlock in ODP
 
 - mlx5 could use the wrong profile due to some driver loading races,
   counters were not working in some device configurations, and a crash on
   error unwind.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCYxtj4QAKCRCFwuHvBreF
 YQNdAQDOAoXv3PCZikmyu4zmjzVdeUUXEig5RU3MgFdCimo99gEA8t+2/pHmnSTB
 vn7cxuKMpJydAmLVFJPZxaOEuaBdegQ=
 =/eYF
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Many bug fixes in several drivers:

   - Fix misuse of the DMA API in rtrs

   - Several irdma issues: hung task due to SQ flushing, incorrect
     capability reporting to userspace, improper error handling for MW
     corners, touching an uninitialized SGL for during invalidation.

   - hns was using the wrong page size limits for the HW, an incorrect
     calculation of wqe_shift causing WQE corruption, and mis computed a
     timer id.

   - Fix a crash in SRP triggered by blktests

   - Fix compiler errors by calling virt_to_page() with the proper type
     in siw

   - Userspace triggerable deadlock in ODP

   - mlx5 could use the wrong profile due to some driver loading races,
     counters were not working in some device configurations, and a
     crash on error unwind"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/irdma: Report RNR NAK generation in device caps
  RDMA/irdma: Use s/g array in post send only when its valid
  RDMA/irdma: Return correct WC error for bind operation failure
  RDMA/irdma: Return error on MR deregister CQP failure
  RDMA/irdma: Report the correct max cqes from query device
  MAINTAINERS: Update maintainers of HiSilicon RoCE
  RDMA/mlx5: Fix UMR cleanup on error flow of driver init
  RDMA/mlx5: Set local port to one when accessing counters
  RDMA/mlx5: Rely on RoCE fw cap instead of devlink when setting profile
  IB/core: Fix a nested dead lock as part of ODP flow
  RDMA/siw: Pass a pointer to virt_to_page()
  RDMA/srp: Set scmnd->result only when scmnd is not NULL
  RDMA/hns: Remove the num_qpc_timer variable
  RDMA/hns: Fix wrong fixed value of qp->rq.wqe_shift
  RDMA/hns: Fix supported page size
  RDMA/cma: Fix arguments order in net device validation
  RDMA/irdma: Fix drain SQ hang with no completion
  RDMA/rtrs-srv: Pass the correct number of entries for dma mapped SGL
  RDMA/rtrs-clt: Use the right sg_cnt after ib_dma_map_sg
2022-09-09 14:46:44 -04:00
Brett Creeley
809f23c042 iavf: Fix cached head and tail value for iavf_get_tx_pending
The underlying hardware may or may not allow reading of the head or tail
registers and it really makes no difference if we use the software
cached values. So, always used the software cached values.

Fixes: 9c6c12595b ("i40e: Detection and recovery of TX queue hung logic moved to service_task from tx_timeout")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Co-developed-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-08 13:22:25 -07:00
Sylwester Dziedziuch
f66b98c868 iavf: Fix change VF's mac address
Previously changing mac address gives false negative because
ip link set <interface> address <MAC> return with
RTNLINK: Permission denied.
In iavf_set_mac was check if PF handled our mac set request,
even before filter was added to list.
Because this check returns always true and it never waits for
PF's response.

Move iavf_is_mac_handled to wait_event_interruptible_timeout
instead of false. Now it will wait for PF's response and then
check if address was added or rejected.

Fixes: 35a2443d09 ("iavf: Add waiting for response from PF in set mac")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Co-developed-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-08 13:22:25 -07:00
Ding Hui
a509702cac ice: Fix crash by keep old cfg when update TCs more than queues
There are problems if allocated queues less than Traffic Classes.

Commit a632b2a4c9 ("ice: ethtool: Prohibit improper channel config
for DCB") already disallow setting less queues than TCs.

Another case is if we first set less queues, and later update more TCs
config due to LLDP, ice_vsi_cfg_tc() will failed but left dirty
num_txq/rxq and tc_cfg in vsi, that will cause invalid pointer access.

[   95.968089] ice 0000:3b:00.1: More TCs defined than queues/rings allocated.
[   95.968092] ice 0000:3b:00.1: Trying to use more Rx queues (8), than were allocated (1)!
[   95.968093] ice 0000:3b:00.1: Failed to config TC for VSI index: 0
[   95.969621] general protection fault: 0000 [#1] SMP NOPTI
[   95.969705] CPU: 1 PID: 58405 Comm: lldpad Kdump: loaded Tainted: G     U  W  O     --------- -t - 4.18.0 #1
[   95.969867] Hardware name: O.E.M/BC11SPSCB10, BIOS 8.23 12/30/2021
[   95.969992] RIP: 0010:devm_kmalloc+0xa/0x60
[   95.970052] Code: 5c ff ff ff 31 c0 5b 5d 41 5c c3 b8 f4 ff ff ff eb f4 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 89 d1 <8b> 97 60 02 00 00 48 8d 7e 18 48 39 f7 72 3f 55 89 ce 53 48 8b 4c
[   95.970344] RSP: 0018:ffffc9003f553888 EFLAGS: 00010206
[   95.970425] RAX: dead000000000200 RBX: ffffea003c425b00 RCX: 00000000006080c0
[   95.970536] RDX: 00000000006080c0 RSI: 0000000000000200 RDI: dead000000000200
[   95.970648] RBP: dead000000000200 R08: 00000000000463c0 R09: ffff888ffa900000
[   95.970760] R10: 0000000000000000 R11: 0000000000000002 R12: ffff888ff6b40100
[   95.970870] R13: ffff888ff6a55018 R14: 0000000000000000 R15: ffff888ff6a55460
[   95.970981] FS:  00007f51b7d24700(0000) GS:ffff88903ee80000(0000) knlGS:0000000000000000
[   95.971108] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   95.971197] CR2: 00007fac5410d710 CR3: 0000000f2c1de002 CR4: 00000000007606e0
[   95.971309] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   95.971419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   95.971530] PKRU: 55555554
[   95.971573] Call Trace:
[   95.971622]  ice_setup_rx_ring+0x39/0x110 [ice]
[   95.971695]  ice_vsi_setup_rx_rings+0x54/0x90 [ice]
[   95.971774]  ice_vsi_open+0x25/0x120 [ice]
[   95.971843]  ice_open_internal+0xb8/0x1f0 [ice]
[   95.971919]  ice_ena_vsi+0x4f/0xd0 [ice]
[   95.971987]  ice_dcb_ena_dis_vsi.constprop.5+0x29/0x90 [ice]
[   95.972082]  ice_pf_dcb_cfg+0x29a/0x380 [ice]
[   95.972154]  ice_dcbnl_setets+0x174/0x1b0 [ice]
[   95.972220]  dcbnl_ieee_set+0x89/0x230
[   95.972279]  ? dcbnl_ieee_del+0x150/0x150
[   95.972341]  dcb_doit+0x124/0x1b0
[   95.972392]  rtnetlink_rcv_msg+0x243/0x2f0
[   95.972457]  ? dcb_doit+0x14d/0x1b0
[   95.972510]  ? __kmalloc_node_track_caller+0x1d3/0x280
[   95.972591]  ? rtnl_calcit.isra.31+0x100/0x100
[   95.972661]  netlink_rcv_skb+0xcf/0xf0
[   95.972720]  netlink_unicast+0x16d/0x220
[   95.972781]  netlink_sendmsg+0x2ba/0x3a0
[   95.975891]  sock_sendmsg+0x4c/0x50
[   95.979032]  ___sys_sendmsg+0x2e4/0x300
[   95.982147]  ? kmem_cache_alloc+0x13e/0x190
[   95.985242]  ? __wake_up_common_lock+0x79/0x90
[   95.988338]  ? __check_object_size+0xac/0x1b0
[   95.991440]  ? _copy_to_user+0x22/0x30
[   95.994539]  ? move_addr_to_user+0xbb/0xd0
[   95.997619]  ? __sys_sendmsg+0x53/0x80
[   96.000664]  __sys_sendmsg+0x53/0x80
[   96.003747]  do_syscall_64+0x5b/0x1d0
[   96.006862]  entry_SYSCALL_64_after_hwframe+0x65/0xca

Only update num_txq/rxq when passed check, and restore tc_cfg if setup
queue map failed.

Fixes: a632b2a4c9 ("ice: ethtool: Prohibit improper channel config for DCB")
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Reviewed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-08 13:22:11 -07:00
Dave Ertman
23c6191903 ice: Don't double unplug aux on peer initiated reset
In the IDC callback that is accessed when the aux drivers request a reset,
the function to unplug the aux devices is called.  This function is also
called in the ice_prepare_for_reset function. This double call is causing
a "scheduling while atomic" BUG.

[  662.676430] ice 0000:4c:00.0 rocep76s0: cqp opcode = 0x1 maj_err_code = 0xffff min_err_code = 0x8003

[  662.676609] ice 0000:4c:00.0 rocep76s0: [Modify QP Cmd Error][op_code=8] status=-29 waiting=1 completion_err=1 maj=0xffff min=0x8003

[  662.815006] ice 0000:4c:00.0 rocep76s0: ICE OICR event notification: oicr = 0x10000003

[  662.815014] ice 0000:4c:00.0 rocep76s0: critical PE Error, GLPE_CRITERR=0x00011424

[  662.815017] ice 0000:4c:00.0 rocep76s0: Requesting a reset

[  662.815475] BUG: scheduling while atomic: swapper/37/0/0x00010002

[  662.815475] BUG: scheduling while atomic: swapper/37/0/0x00010002
[  662.815477] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill 8021q garp mrp stp llc vfat fat rpcrdma intel_rapl_msr intel_rapl_common sunrpc i10nm_edac rdma_ucm nfit ib_srpt libnvdimm ib_isert iscsi_target_mod x86_pkg_temp_thermal intel_powerclamp coretemp target_core_mod snd_hda_intel ib_iser snd_intel_dspcfg libiscsi snd_intel_sdw_acpi scsi_transport_iscsi kvm_intel iTCO_wdt rdma_cm snd_hda_codec kvm iw_cm ipmi_ssif iTCO_vendor_support snd_hda_core irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hwdep snd_seq snd_seq_device rapl snd_pcm snd_timer isst_if_mbox_pci pcspkr isst_if_mmio irdma intel_uncore idxd acpi_ipmi joydev isst_if_common snd mei_me idxd_bus ipmi_si soundcore i2c_i801 mei ipmi_devintf i2c_smbus i2c_ismt ipmi_msghandler acpi_power_meter acpi_pad rv(OE) ib_uverbs ib_cm ib_core xfs libcrc32c ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helpe
 r ttm
[  662.815546]  nvme nvme_core ice drm crc32c_intel i40e t10_pi wmi pinctrl_emmitsburg dm_mirror dm_region_hash dm_log dm_mod fuse
[  662.815557] Preemption disabled at:
[  662.815558] [<0000000000000000>] 0x0
[  662.815563] CPU: 37 PID: 0 Comm: swapper/37 Kdump: loaded Tainted: G S         OE     5.17.1 #2
[  662.815566] Hardware name: Intel Corporation D50DNP/D50DNP, BIOS SE5C6301.86B.6624.D18.2111021741 11/02/2021
[  662.815568] Call Trace:
[  662.815572]  <IRQ>
[  662.815574]  dump_stack_lvl+0x33/0x42
[  662.815581]  __schedule_bug.cold.147+0x7d/0x8a
[  662.815588]  __schedule+0x798/0x990
[  662.815595]  schedule+0x44/0xc0
[  662.815597]  schedule_preempt_disabled+0x14/0x20
[  662.815600]  __mutex_lock.isra.11+0x46c/0x490
[  662.815603]  ? __ibdev_printk+0x76/0xc0 [ib_core]
[  662.815633]  device_del+0x37/0x3d0
[  662.815639]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[  662.815674]  ice_schedule_reset+0x3c/0xd0 [ice]
[  662.815693]  irdma_iidc_event_handler.cold.7+0xb6/0xd3 [irdma]
[  662.815712]  ? bitmap_find_next_zero_area_off+0x45/0xa0
[  662.815719]  ice_send_event_to_aux+0x54/0x70 [ice]
[  662.815741]  ice_misc_intr+0x21d/0x2d0 [ice]
[  662.815756]  __handle_irq_event_percpu+0x4c/0x180
[  662.815762]  handle_irq_event_percpu+0xf/0x40
[  662.815764]  handle_irq_event+0x34/0x60
[  662.815766]  handle_edge_irq+0x9a/0x1c0
[  662.815770]  __common_interrupt+0x62/0x100
[  662.815774]  common_interrupt+0xb4/0xd0
[  662.815779]  </IRQ>
[  662.815780]  <TASK>
[  662.815780]  asm_common_interrupt+0x1e/0x40
[  662.815785] RIP: 0010:cpuidle_enter_state+0xd6/0x380
[  662.815789] Code: 49 89 c4 0f 1f 44 00 00 31 ff e8 65 d7 95 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 64 02 00 00 31 ff e8 ae c5 9c ff fb 45 85 f6 <0f> 88 12 01 00 00 49 63 d6 4c 2b 24 24 48 8d 04 52 48 8d 04 82 49
[  662.815791] RSP: 0018:ff2c2c4f18edbe80 EFLAGS: 00000202
[  662.815793] RAX: ff280805df140000 RBX: 0000000000000002 RCX: 000000000000001f
[  662.815795] RDX: 0000009a52da2d08 RSI: ffffffff93f8240b RDI: ffffffff93f53ee7
[  662.815796] RBP: ff5e2bd11ff41928 R08: 0000000000000000 R09: 000000000002f8c0
[  662.815797] R10: 0000010c3f18e2cf R11: 000000000000000f R12: 0000009a52da2d08
[  662.815798] R13: ffffffff94ad7e20 R14: 0000000000000002 R15: 0000000000000000
[  662.815801]  cpuidle_enter+0x29/0x40
[  662.815803]  do_idle+0x261/0x2b0
[  662.815807]  cpu_startup_entry+0x19/0x20
[  662.815809]  start_secondary+0x114/0x150
[  662.815813]  secondary_startup_64_no_verify+0xd5/0xdb
[  662.815818]  </TASK>
[  662.815846] bad: scheduling from the idle thread!
[  662.815849] CPU: 37 PID: 0 Comm: swapper/37 Kdump: loaded Tainted: G S      W  OE     5.17.1 #2
[  662.815852] Hardware name: Intel Corporation D50DNP/D50DNP, BIOS SE5C6301.86B.6624.D18.2111021741 11/02/2021
[  662.815853] Call Trace:
[  662.815855]  <IRQ>
[  662.815856]  dump_stack_lvl+0x33/0x42
[  662.815860]  dequeue_task_idle+0x20/0x30
[  662.815863]  __schedule+0x1c3/0x990
[  662.815868]  schedule+0x44/0xc0
[  662.815871]  schedule_preempt_disabled+0x14/0x20
[  662.815873]  __mutex_lock.isra.11+0x3a8/0x490
[  662.815876]  ? __ibdev_printk+0x76/0xc0 [ib_core]
[  662.815904]  device_del+0x37/0x3d0
[  662.815909]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[  662.815937]  ice_schedule_reset+0x3c/0xd0 [ice]
[  662.815961]  irdma_iidc_event_handler.cold.7+0xb6/0xd3 [irdma]
[  662.815979]  ? bitmap_find_next_zero_area_off+0x45/0xa0
[  662.815985]  ice_send_event_to_aux+0x54/0x70 [ice]
[  662.816011]  ice_misc_intr+0x21d/0x2d0 [ice]
[  662.816033]  __handle_irq_event_percpu+0x4c/0x180
[  662.816037]  handle_irq_event_percpu+0xf/0x40
[  662.816039]  handle_irq_event+0x34/0x60
[  662.816042]  handle_edge_irq+0x9a/0x1c0
[  662.816045]  __common_interrupt+0x62/0x100
[  662.816048]  common_interrupt+0xb4/0xd0
[  662.816052]  </IRQ>
[  662.816053]  <TASK>
[  662.816054]  asm_common_interrupt+0x1e/0x40
[  662.816057] RIP: 0010:cpuidle_enter_state+0xd6/0x380
[  662.816060] Code: 49 89 c4 0f 1f 44 00 00 31 ff e8 65 d7 95 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 64 02 00 00 31 ff e8 ae c5 9c ff fb 45 85 f6 <0f> 88 12 01 00 00 49 63 d6 4c 2b 24 24 48 8d 04 52 48 8d 04 82 49
[  662.816063] RSP: 0018:ff2c2c4f18edbe80 EFLAGS: 00000202
[  662.816065] RAX: ff280805df140000 RBX: 0000000000000002 RCX: 000000000000001f
[  662.816067] RDX: 0000009a52da2d08 RSI: ffffffff93f8240b RDI: ffffffff93f53ee7
[  662.816068] RBP: ff5e2bd11ff41928 R08: 0000000000000000 R09: 000000000002f8c0
[  662.816070] R10: 0000010c3f18e2cf R11: 000000000000000f R12: 0000009a52da2d08
[  662.816071] R13: ffffffff94ad7e20 R14: 0000000000000002 R15: 0000000000000000
[  662.816075]  cpuidle_enter+0x29/0x40
[  662.816077]  do_idle+0x261/0x2b0
[  662.816080]  cpu_startup_entry+0x19/0x20
[  662.816083]  start_secondary+0x114/0x150
[  662.816087]  secondary_startup_64_no_verify+0xd5/0xdb
[  662.816091]  </TASK>
[  662.816169] bad: scheduling from the idle thread!

The correct place to unplug the aux devices for a reset is in the
prepare_for_reset function, as this is a common place for all reset flows.
It also has built in protection from being called twice in a single reset
instance before the aux devices are replugged.

Fixes: f9f5301e7e ("ice: Register auxiliary device to provide RDMA")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Helena Anna Dubel <helena.anna.dubel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-08 11:21:41 -07:00