Commit Graph

41985 Commits

Author SHA1 Message Date
Dany Madden
aeaf59b787 Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"
This reverts commit 723ad91613

When client requests channel or ring size larger than what the server
can support the server will cap the request to the supported max. So,
the client would not be able to successfully request resources that
exceed the server limit.

Fixes: 723ad91613 ("ibmvnic: Add ethtool private flag for driver-defined queue limits")
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 09:46:18 -07:00
Vladimir Oltean
66a2f5ef68 net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASK
The Time-Specified Departure feature is indeed mutually exclusive with
TX IP checksumming in ENETC, but TX checksumming in itself is broken and
was removed from this driver in commit 82728b91f1 ("enetc: Remove Tx
checksumming offload code").

The blamed commit declared NETIF_F_HW_CSUM in dev->features to comply
with software TSO's expectations, and still did the checksumming in
software by calling skb_checksum_help(). So there isn't any restriction
for the Time-Specified Departure feature.

However, enetc_setup_tc_txtime() doesn't understand that, and blindly
looks for NETIF_F_CSUM_MASK.

Instead of checking for things which can literally never happen in the
current code base, just remove the check and let the driver offload
tc-etf qdiscs.

Fixes: acede3c5da ("net: enetc: declare NETIF_F_HW_CSUM and do it in software")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220427203017.1291634-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 09:45:48 -07:00
Leon Romanovsky
f049efc7f7 ixgbe: ensure IPsec VF<->PF compatibility
The VF driver can forward any IPsec flags and such makes the function
is not extendable and prone to backward/forward incompatibility.

If new software runs on VF, it won't know that PF configured something
completely different as it "knows" only XFRM_OFFLOAD_INBOUND flag.

Fixes: eda0333ac2 ("ixgbe: add VF IPsec management")
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220427173152.443102-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 09:40:02 -07:00
Yang Yingliang
d2b52ec056 net: fec: add missing of_node_put() in fec_enet_init_stop_mode()
Put device node in error path in fec_enet_init_stop_mode().

Fixes: 8a448bf832 ("net: ethernet: fec: move GPR register offset and bit into DT")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220426125231.375688-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27 17:53:48 -07:00
Manish Chopra
af68656d66 bnx2x: fix napi API usage sequence
While handling PCI errors (AER flow) driver tries to
disable NAPI [napi_disable()] after NAPI is deleted
[__netif_napi_del()] which causes unexpected system
hang/crash.

System message log shows the following:
=======================================
[ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures.
[ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)'
[ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports:
'need reset'
[ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports:
'need reset'
[ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset'
[ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow:
[ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows:
[ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow:
[ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows:
[ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset'
[ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->slot_reset()
[ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing...
[ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset
--> driver unload

Uninterruptible tasks
=====================
crash> ps | grep UN
     213      2  11  c000000004c89e00  UN   0.0       0      0  [eehd]
     215      2   0  c000000004c80000  UN   0.0       0      0
[kworker/0:2]
    2196      1  28  c000000004504f00  UN   0.1   15936  11136  wickedd
    4287      1   9  c00000020d076800  UN   0.0    4032   3008  agetty
    4289      1  20  c00000020d056680  UN   0.0    7232   3840  agetty
   32423      2  26  c00000020038c580  UN   0.0       0      0
[kworker/26:3]
   32871   4241  27  c0000002609ddd00  UN   0.1   18624  11648  sshd
   32920  10130  16  c00000027284a100  UN   0.1   48512  12608  sendmail
   33092  32987   0  c000000205218b00  UN   0.1   48512  12608  sendmail
   33154   4567  16  c000000260e51780  UN   0.1   48832  12864  pickup
   33209   4241  36  c000000270cb6500  UN   0.1   18624  11712  sshd
   33473  33283   0  c000000205211480  UN   0.1   48512  12672  sendmail
   33531   4241  37  c00000023c902780  UN   0.1   18624  11648  sshd

EEH handler hung while bnx2x sleeping and holding RTNL lock
===========================================================
crash> bt 213
PID: 213    TASK: c000000004c89e00  CPU: 11  COMMAND: "eehd"
  #0 [c000000004d477e0] __schedule at c000000000c70808
  #1 [c000000004d478b0] schedule at c000000000c70ee0
  #2 [c000000004d478e0] schedule_timeout at c000000000c76dec
  #3 [c000000004d479c0] msleep at c0000000002120cc
  #4 [c000000004d479f0] napi_disable at c000000000a06448
                                        ^^^^^^^^^^^^^^^^
  #5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x]
  #6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x]
  #7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc
  #8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8
  #9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64

And the sleeping source code
============================
crash> dis -ls c000000000a06448
FILE: ../net/core/dev.c
LINE: 6702

   6697  {
   6698          might_sleep();
   6699          set_bit(NAPI_STATE_DISABLE, &n->state);
   6700
   6701          while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
* 6702                  msleep(1);
   6703          while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state))
   6704                  msleep(1);
   6705
   6706          hrtimer_cancel(&n->timer);
   6707
   6708          clear_bit(NAPI_STATE_DISABLE, &n->state);
   6709  }

EEH calls into bnx2x twice based on the system log above, first through
bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes
the following call chains:

bnx2x_io_error_detected()
  +-> bnx2x_eeh_nic_unload()
       +-> bnx2x_del_all_napi()
            +-> __netif_napi_del()

bnx2x_io_slot_reset()
  +-> bnx2x_netif_stop()
       +-> bnx2x_napi_disable()
            +->napi_disable()

Fix this by correcting the sequence of NAPI APIs usage,
that is delete the NAPI after disabling it.

Fixes: 7fa6f34081 ("bnx2x: AER revised")
Reported-by: David Christensen <drc@linux.vnet.ibm.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Link: https://lore.kernel.org/r/20220426153913.6966-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27 17:50:32 -07:00
Jacob Keller
b668f4cd71 ice: fix use-after-free when deinitializing mailbox snapshot
During ice_sriov_configure, if num_vfs is 0, we are being asked by the
kernel to remove all VFs.

The driver first de-initializes the snapshot before freeing all the VFs.
This results in a use-after-free BUG detected by KASAN. The bug occurs
because the snapshot can still be accessed until all VFs are removed.

Fix this by freeing all the VFs first before calling
ice_mbx_deinit_snapshot.

[  +0.032591] ==================================================================
[  +0.000021] BUG: KASAN: use-after-free in ice_mbx_vf_state_handler+0x1c3/0x410 [ice]
[  +0.000315] Write of size 28 at addr ffff889908eb6f28 by task kworker/55:2/1530996

[  +0.000029] CPU: 55 PID: 1530996 Comm: kworker/55:2 Kdump: loaded Tainted: G S        I       5.17.0-dirty #1
[  +0.000022] Hardware name: Dell Inc. PowerEdge R740/0923K0, BIOS 1.6.13 12/17/2018
[  +0.000013] Workqueue: ice ice_service_task [ice]
[  +0.000279] Call Trace:
[  +0.000012]  <TASK>
[  +0.000011]  dump_stack_lvl+0x33/0x42
[  +0.000030]  print_report.cold.13+0xb2/0x6b3
[  +0.000028]  ? ice_mbx_vf_state_handler+0x1c3/0x410 [ice]
[  +0.000295]  kasan_report+0xa5/0x120
[  +0.000026]  ? __switch_to_asm+0x21/0x70
[  +0.000024]  ? ice_mbx_vf_state_handler+0x1c3/0x410 [ice]
[  +0.000298]  kasan_check_range+0x183/0x1e0
[  +0.000019]  memset+0x1f/0x40
[  +0.000018]  ice_mbx_vf_state_handler+0x1c3/0x410 [ice]
[  +0.000304]  ? ice_conv_link_speed_to_virtchnl+0x160/0x160 [ice]
[  +0.000297]  ? ice_vsi_dis_spoofchk+0x40/0x40 [ice]
[  +0.000305]  ice_is_malicious_vf+0x1aa/0x250 [ice]
[  +0.000303]  ? ice_restore_all_vfs_msi_state+0x160/0x160 [ice]
[  +0.000297]  ? __mutex_unlock_slowpath.isra.15+0x410/0x410
[  +0.000022]  ? ice_debug_cq+0xb7/0x230 [ice]
[  +0.000273]  ? __kasan_slab_alloc+0x2f/0x90
[  +0.000022]  ? memset+0x1f/0x40
[  +0.000017]  ? do_raw_spin_lock+0x119/0x1d0
[  +0.000022]  ? rwlock_bug.part.2+0x60/0x60
[  +0.000024]  __ice_clean_ctrlq+0x3a6/0xd60 [ice]
[  +0.000273]  ? newidle_balance+0x5b1/0x700
[  +0.000026]  ? ice_print_link_msg+0x2f0/0x2f0 [ice]
[  +0.000271]  ? update_cfs_group+0x1b/0x140
[  +0.000018]  ? load_balance+0x1260/0x1260
[  +0.000022]  ? ice_process_vflr_event+0x27/0x130 [ice]
[  +0.000301]  ice_service_task+0x136e/0x1470 [ice]
[  +0.000281]  process_one_work+0x3b4/0x6c0
[  +0.000030]  worker_thread+0x65/0x660
[  +0.000023]  ? __kthread_parkme+0xe4/0x100
[  +0.000021]  ? process_one_work+0x6c0/0x6c0
[  +0.000020]  kthread+0x179/0x1b0
[  +0.000018]  ? kthread_complete_and_exit+0x20/0x20
[  +0.000022]  ret_from_fork+0x22/0x30
[  +0.000026]  </TASK>

[  +0.000018] Allocated by task 10742:
[  +0.000013]  kasan_save_stack+0x1c/0x40
[  +0.000018]  __kasan_kmalloc+0x84/0xa0
[  +0.000016]  kmem_cache_alloc_trace+0x16c/0x2e0
[  +0.000015]  intel_iommu_probe_device+0xeb/0x860
[  +0.000015]  __iommu_probe_device+0x9a/0x2f0
[  +0.000016]  iommu_probe_device+0x43/0x270
[  +0.000015]  iommu_bus_notifier+0xa7/0xd0
[  +0.000015]  blocking_notifier_call_chain+0x90/0xc0
[  +0.000017]  device_add+0x5f3/0xd70
[  +0.000014]  pci_device_add+0x404/0xa40
[  +0.000015]  pci_iov_add_virtfn+0x3b0/0x550
[  +0.000016]  sriov_enable+0x3bb/0x600
[  +0.000013]  ice_ena_vfs+0x113/0xa79 [ice]
[  +0.000293]  ice_sriov_configure.cold.17+0x21/0xe0 [ice]
[  +0.000291]  sriov_numvfs_store+0x160/0x200
[  +0.000015]  kernfs_fop_write_iter+0x1db/0x270
[  +0.000018]  new_sync_write+0x21d/0x330
[  +0.000013]  vfs_write+0x376/0x410
[  +0.000013]  ksys_write+0xba/0x150
[  +0.000012]  do_syscall_64+0x3a/0x80
[  +0.000012]  entry_SYSCALL_64_after_hwframe+0x44/0xae

[  +0.000028] Freed by task 10742:
[  +0.000011]  kasan_save_stack+0x1c/0x40
[  +0.000015]  kasan_set_track+0x21/0x30
[  +0.000016]  kasan_set_free_info+0x20/0x30
[  +0.000012]  __kasan_slab_free+0x104/0x170
[  +0.000016]  kfree+0x9b/0x470
[  +0.000013]  devres_destroy+0x1c/0x20
[  +0.000015]  devm_kfree+0x33/0x40
[  +0.000012]  ice_mbx_deinit_snapshot+0x39/0x70 [ice]
[  +0.000295]  ice_sriov_configure+0xb0/0x260 [ice]
[  +0.000295]  sriov_numvfs_store+0x1bc/0x200
[  +0.000015]  kernfs_fop_write_iter+0x1db/0x270
[  +0.000016]  new_sync_write+0x21d/0x330
[  +0.000012]  vfs_write+0x376/0x410
[  +0.000012]  ksys_write+0xba/0x150
[  +0.000012]  do_syscall_64+0x3a/0x80
[  +0.000012]  entry_SYSCALL_64_after_hwframe+0x44/0xae

[  +0.000024] Last potentially related work creation:
[  +0.000010]  kasan_save_stack+0x1c/0x40
[  +0.000016]  __kasan_record_aux_stack+0x98/0xa0
[  +0.000013]  insert_work+0x34/0x160
[  +0.000015]  __queue_work+0x20e/0x650
[  +0.000016]  queue_work_on+0x4c/0x60
[  +0.000015]  nf_nat_masq_schedule+0x297/0x2e0 [nf_nat]
[  +0.000034]  masq_device_event+0x5a/0x60 [nf_nat]
[  +0.000031]  raw_notifier_call_chain+0x5f/0x80
[  +0.000017]  dev_close_many+0x1d6/0x2c0
[  +0.000015]  unregister_netdevice_many+0x4e3/0xa30
[  +0.000015]  unregister_netdevice_queue+0x192/0x1d0
[  +0.000014]  iavf_remove+0x8f9/0x930 [iavf]
[  +0.000058]  pci_device_remove+0x65/0x110
[  +0.000015]  device_release_driver_internal+0xf8/0x190
[  +0.000017]  pci_stop_bus_device+0xb5/0xf0
[  +0.000014]  pci_stop_and_remove_bus_device+0xe/0x20
[  +0.000016]  pci_iov_remove_virtfn+0x19c/0x230
[  +0.000015]  sriov_disable+0x4f/0x170
[  +0.000014]  ice_free_vfs+0x9a/0x490 [ice]
[  +0.000306]  ice_sriov_configure+0xb8/0x260 [ice]
[  +0.000294]  sriov_numvfs_store+0x1bc/0x200
[  +0.000015]  kernfs_fop_write_iter+0x1db/0x270
[  +0.000016]  new_sync_write+0x21d/0x330
[  +0.000012]  vfs_write+0x376/0x410
[  +0.000012]  ksys_write+0xba/0x150
[  +0.000012]  do_syscall_64+0x3a/0x80
[  +0.000012]  entry_SYSCALL_64_after_hwframe+0x44/0xae

[  +0.000025] The buggy address belongs to the object at ffff889908eb6f00
               which belongs to the cache kmalloc-96 of size 96
[  +0.000016] The buggy address is located 40 bytes inside of
               96-byte region [ffff889908eb6f00, ffff889908eb6f60)

[  +0.000026] The buggy address belongs to the physical page:
[  +0.000010] page:00000000b7e99a2e refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1908eb6
[  +0.000016] flags: 0x57ffffc0000200(slab|node=1|zone=2|lastcpupid=0x1fffff)
[  +0.000024] raw: 0057ffffc0000200 ffffea0069d9fd80 dead000000000002 ffff88810004c780
[  +0.000015] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[  +0.000009] page dumped because: kasan: bad access detected

[  +0.000016] Memory state around the buggy address:
[  +0.000012]  ffff889908eb6e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[  +0.000014]  ffff889908eb6e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[  +0.000014] >ffff889908eb6f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[  +0.000011]                                   ^
[  +0.000013]  ffff889908eb6f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[  +0.000013]  ffff889908eb7000: fa fb fb fb fb fb fb fb fc fc fc fc fa fb fb fb
[  +0.000012] ==================================================================

Fixes: 0891c89674 ("ice: warn about potentially malicious VFs")
Reported-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-26 09:26:48 -07:00
Petr Oros
b537752e6c ice: wait 5 s for EMP reset after firmware flash
We need to wait 5 s for EMP reset after firmware flash. Code was extracted
from OOT driver (ice v1.8.3 downloaded from sourceforge). Without this
wait, fw_activate let card in inconsistent state and recoverable only
by second flash/activate. Flash was tested on these fw's:
From -> To
 3.00 -> 3.10/3.20
 3.10 -> 3.00/3.20
 3.20 -> 3.00/3.10

Reproducer:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
    fw_activate
[root@host ~]# ip link show ens7f0
71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
    altname enp202s0f0

dmesg after flash:
[   55.120788] ice: Copyright (c) 2018, Intel Corporation.
[   55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway
[   55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0
[   55.603629] ice 0000:ca:00.0: Get PHY capability failed.
[   55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5
[   55.647348] ice 0000:ca:00.0: PTP init successful
[   55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8
[   55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode.
[   55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware
[   55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
Reboot doesn’t help, only second flash/activate with OOT or patched
driver put card back in consistent state.

After patch:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
    fw_activate
[root@host ~]# ip link show ens7f0
19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
    altname enp202s0f0

Fixes: 399e27dbbd ("ice: support immediate firmware activation via devlink reload")
Signed-off-by: Petr Oros <poros@redhat.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-26 09:26:48 -07:00
Ivan Vecera
77d64d285b ice: Protect vf_state check by cfg_lock in ice_vc_process_vf_msg()
Previous patch labelled "ice: Fix incorrect locking in
ice_vc_process_vf_msg()"  fixed an issue with ignored messages
sent by VF driver but a small race window still left.

Recently caught trace during 'ip link set ... vf 0 vlan ...' operation:

[ 7332.995625] ice 0000:3b:00.0: Clearing port VLAN on VF 0
[ 7333.001023] iavf 0000:3b:01.0: Reset indication received from the PF
[ 7333.007391] iavf 0000:3b:01.0: Scheduling reset task
[ 7333.059575] iavf 0000:3b:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 3
[ 7333.059626] ice 0000:3b:00.0: Invalid message from VF 0, opcode 3, len 4, error -1

Setting of VLAN for VF causes a reset of the affected VF using
ice_reset_vf() function that runs with cfg_lock taken:

1. ice_notify_vf_reset() informs IAVF driver that reset is needed and
   IAVF schedules its own reset procedure
2. Bit ICE_VF_STATE_DIS is set in vf->vf_state
3. Misc initialization steps
4. ice_sriov_post_vsi_rebuild() -> ice_vf_set_initialized() and that
   clears ICE_VF_STATE_DIS in vf->vf_state

Step 3 is mentioned race window because IAVF reset procedure runs in
parallel and one of its step is sending of VIRTCHNL_OP_GET_VF_RESOURCES
message (opcode==3). This message is handled in ice_vc_process_vf_msg()
and if it is received during the mentioned race window then it's
marked as invalid and error is returned to VF driver.

Protect vf_state check in ice_vc_process_vf_msg() by cfg_lock to avoid
this race condition.

Fixes: e6ba5273d4 ("ice: Fix race conditions between virtchnl handling and VF ndo ops")
Tested-by: Fei Liu <feliu@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-26 09:26:42 -07:00
Ivan Vecera
aaf461af72 ice: Fix incorrect locking in ice_vc_process_vf_msg()
Usage of mutex_trylock() in ice_vc_process_vf_msg() is incorrect
because message sent from VF is ignored and never processed.

Use mutex_lock() instead to fix the issue. It is safe because this
mutex is used to prevent races between VF related NDOs and
handlers processing request messages from VF and these handlers
are running in ice_service_task() context. Additionally move this
mutex lock prior ice_vc_is_opcode_allowed() call to avoid potential
races during allowlist access.

Fixes: e6ba5273d4 ("ice: Fix race conditions between virtchnl handling and VF ndo ops")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-26 09:26:33 -07:00
Jonathan Lemon
acac0541d1 net: bcmgenet: hide status block before TX timestamping
The hardware checksum offloading requires use of a transmit
status block inserted before the outgoing frame data, this was
updated in '9a9ba2a4aaaa ("net: bcmgenet: always enable status blocks")'

However, skb_tx_timestamp() assumes that it is passed a raw frame
and PTP parsing chokes on this status block.

Fix this by calling __skb_pull(), which hides the TSB before calling
skb_tx_timestamp(), so an outgoing PTP packet is parsed correctly.

As the data in the skb has already been set up for DMA, and the
dma_unmap_* calls use a separately stored address, there is no
no effective change in the data transmission.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220424165307.591145-1-jonathan.lemon@gmail.com
Fixes: d03825fba4 ("net: bcmgenet: add skb_tx_timestamp call")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-26 10:45:46 +02:00
Vladimir Oltean
1fcb8fb352 net: mscc: ocelot: don't add VID 0 to ocelot->vlans when leaving VLAN-aware bridge
DSA, through dsa_port_bridge_leave(), first notifies the port of the
fact that it left a bridge, then, if that bridge was VLAN-aware, it
notifies the port of the change in VLAN awareness state, towards
VLAN-unaware mode.

So ocelot_port_vlan_filtering() can be called when ocelot_port->bridge
is NULL, and this makes ocelot_add_vlan_unaware_pvid() create a struct
ocelot_bridge_vlan with a vid of 0 and an "untagged" setting of true on
that port.

In a way this structure correctly reflects the reality, but by design,
VID 0 (OCELOT_STANDALONE_PVID) was not meant to be kept in the bridge
VLAN list of the driver, but managed separately.

Having OCELOT_STANDALONE_PVID in ocelot->vlans makes us trip up on
several sanity checks that did not expect to have this VID there.
For example, after we leave a VLAN-aware bridge and we re-join it, we
can no longer program egress-tagged VLANs to hardware:

 # ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
 # ip link set swp0 master br0
 # ip link set swp0 nomaster
 # ip link set swp0 master br0
 # bridge vlan add dev swp0 vid 100
Error: mscc_ocelot_switch_lib: Port with more than one egress-untagged VLAN cannot have egress-tagged VLANs.

But this configuration is in fact supported by the hardware, since we
could use OCELOT_PORT_TAG_NATIVE. According to its comment:

/* all VLANs except the native VLAN and VID 0 are egress-tagged */

yet when assessing the eligibility for this mode, we do not check for
VID 0 in ocelot_port_uses_native_vlan(), instead we just ensure that
ocelot_port_num_untagged_vlans() == 1. This is simply because VID 0
doesn't have a bridge VLAN structure.

The way I identify the problem is that ocelot_port_vlan_filtering(false)
only means to call ocelot_add_vlan_unaware_pvid() when we dynamically
turn off VLAN awareness for a bridge we are under, and the PVID changes
from the bridge PVID to a reserved PVID based on the bridge number.

Since OCELOT_STANDALONE_PVID is statically added to the VLAN table
during ocelot_vlan_init() and never removed afterwards, calling
ocelot_add_vlan_unaware_pvid() for it is not intended and does not serve
any purpose.

Fix the issue by avoiding the call to ocelot_add_vlan_unaware_pvid(vid=0)
when we're resetting VLAN awareness after leaving the bridge, to become
a standalone port.

Fixes: 54c3198460 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:47:55 +01:00
Vladimir Oltean
9323ac3670 net: mscc: ocelot: ignore VID 0 added by 8021q module
Both the felix DSA driver and ocelot switchdev driver declare
dev->features & NETIF_F_HW_VLAN_CTAG_FILTER under certain circumstances*,
so the 8021q module will add VID 0 to our RX filter when the port goes
up, to ensure 802.1p traffic is not dropped.

We treat VID 0 as a special value (OCELOT_STANDALONE_PVID) which
deliberately does not have a struct ocelot_bridge_vlan associated with
it. Instead, this gets programmed to the VLAN table in ocelot_vlan_init().

If we allow external calls to modify VID 0, we reach the following
situation:

 # ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up
 # ip link set swp0 master br0
 # ip link set swp0 up # this adds VID 0 to ocelot->vlans with untagged=false
bridge vlan
port              vlan-id
swp0              1 PVID Egress Untagged # the bridge also adds VID 1
br0               1 PVID Egress Untagged
 # bridge vlan add dev swp0 vid 100 untagged
Error: mscc_ocelot_switch_lib: Port with egress-tagged VLANs cannot have more than one egress-untagged (native) VLAN.

This configuration should have been accepted, because
ocelot_port_manage_port_tag() should select OCELOT_PORT_TAG_NATIVE.
Yet it isn't, because we have an entry in ocelot->vlans which says
VID 0 should be egress-tagged, something the hardware can't do.

Fix this by suppressing additions/deletions on VID 0 and managing this
VLAN exclusively using OCELOT_STANDALONE_PVID.

*DSA toggles it when the port becomes VLAN-aware by joining a VLAN-aware
bridge. Ocelot declares it unconditionally for some reason.

Fixes: 54c3198460 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:47:55 +01:00
Dan Carpenter
9810c58c70 net: lan966x: fix a couple off by one bugs
The lan966x->ports[] array has lan966x->num_phys_ports elements.  These
are assigned in lan966x_probe().  That means the > comparison should be
changed to >=.

The first off by one check is harmless but the second one could lead to
an out of bounds access and a crash.

Fixes: 5ccd66e01c ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:25:37 +01:00
Peng Wu
e85f8a9f16 net: hns: Add missing fwnode_handle_put in hns_mac_init
In one of the error paths of the device_for_each_child_node() loop
in hns_mac_init, add missing call to fwnode_handle_put.

Signed-off-by: Peng Wu <wupeng58@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:06:53 +01:00
Jian Shen
c59d606296 net: hns3: add return value for mailbox handling in PF
Currently, there are some querying mailboxes sent from VF to PF,
and VF will wait the PF's handling result. For mailbox
HCLGE_MBX_GET_QID_IN_PF and HCLGE_MBX_GET_RSS_KEY, it may fail
when the input parameter is invalid, but the prototype of their
handler function is void. In this case, PF always return success
to VF, which may cause the VF get incorrect result.

Fixes it by adding return value for these function.

Fixes: 63b1279d99 ("net: hns3: check queue id range before using")
Fixes: 532cfc0df1 ("net: hns3: add a check for index in hclge_get_rss_key()")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:48:49 +01:00
Jian Shen
7d413735cb net: hns3: add validity check for message data length
Add validity check for message data length in function
hclge_send_mbx_msg(), avoid unexpected overflow.

Fixes: dde1a86e93 ("net: hns3: Add mailbox support to PF driver")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:48:49 +01:00
Jie Wang
48009e9972 net: hns3: modify the return code of hclge_get_ring_chain_from_mbx
Currently, function hclge_get_ring_chain_from_mbx will return -ENOMEM if
ring_num is bigger than HCLGE_MBX_MAX_RING_CHAIN_PARAM_NUM. It is better to
return -EINVAL for the invalid parameter case.

So this patch fixes it by return -EINVAL in this abnormal branch.

Fixes: 5d02a58dae ("net: hns3: fix for buffer overflow smatch warning")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:48:49 +01:00
Peng Li
123521b6b2 net: hns3: fix error log of tx/rx tqps stats
The comments in function hclge_comm_tqps_update_stats is not right,
so fix it.

Fixes: 287db5c40d ("net: hns3: create new set of common tqp stats APIs for PF and VF reuse")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:48:49 +01:00
Hao Chen
1ec1968e4e net: hns3: align the debugfs output to the left
For debugfs node rx/tx_queue_info and rx/tx_bd_info, their output info is
aligned to the right, it's not aligned with output of other debugfs node,
so uniform their output info.

Fixes: 907676b130 ("net: hns3: use tx bounce buffer for small packets")
Fixes: e44c495d95 ("net: hns3: refactor queue info of debugfs")
Fixes: 77e9184869 ("net: hns3: refactor dump bd info of debugfs")
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:48:49 +01:00
Jian Shen
e98365afc1 net: hns3: clear inited state and stop client after failed to register netdev
If failed to register netdev, it needs to clear INITED state and stop
client in case of cause problem when concurrency with uninitialized
process of driver.

Fixes: a289a7e5c1 ("net: hns3: put off calling register_netdev() until client initialize complete")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:48:49 +01:00
Dinh Nguyen
5fd1fe4807 net: ethernet: stmmac: fix write to sgmii_adapter_base
I made a mistake with the commit a6aaa00324 ("net: ethernet: stmmac:
fix altr_tse_pcs function when using a fixed-link"). I should have
tested against both scenario of having a SGMII interface and one
without.

Without the SGMII PCS TSE adpater, the sgmii_adapter_base address is
NULL, thus a write to this address will fail.

Cc: stable@vger.kernel.org
Fixes: a6aaa00324 ("net: ethernet: stmmac: fix altr_tse_pcs function when using a fixed-link")
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lore.kernel.org/r/20220420152345.27415-1-dinguyen@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-22 16:31:56 -07:00
Kevin Hao
234901de2b net: stmmac: Use readl_poll_timeout_atomic() in atomic state
The init_systime() may be invoked in atomic state. We have observed the
following call trace when running "phc_ctl /dev/ptp0 set" on a Intel
Agilex board.
  BUG: sleeping function called from invalid context at drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c:74
  in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 381, name: phc_ctl
  preempt_count: 1, expected: 0
  RCU nest depth: 0, expected: 0
  Preemption disabled at:
  [<ffff80000892ef78>] stmmac_set_time+0x34/0x8c
  CPU: 2 PID: 381 Comm: phc_ctl Not tainted 5.18.0-rc2-next-20220414-yocto-standard+ #567
  Hardware name: SoCFPGA Agilex SoCDK (DT)
  Call trace:
   dump_backtrace.part.0+0xc4/0xd0
   show_stack+0x24/0x40
   dump_stack_lvl+0x7c/0xa0
   dump_stack+0x18/0x34
   __might_resched+0x154/0x1c0
   __might_sleep+0x58/0x90
   init_systime+0x78/0x120
   stmmac_set_time+0x64/0x8c
   ptp_clock_settime+0x60/0x9c
   pc_clock_settime+0x6c/0xc0
   __arm64_sys_clock_settime+0x88/0xf0
   invoke_syscall+0x5c/0x130
   el0_svc_common.constprop.0+0x4c/0x100
   do_el0_svc+0x7c/0xa0
   el0_svc+0x58/0xcc
   el0t_64_sync_handler+0xa4/0x130
   el0t_64_sync+0x18c/0x190

So we should use readl_poll_timeout_atomic() here instead of
readl_poll_timeout().

Also adjust the delay time to 10us to fix a "__bad_udelay" build error
reported by "kernel test robot <lkp@intel.com>". I have tested this on
Intel Agilex and NXP S32G boards, there is no delay needed at all.
So the 10us delay should be long enough for most cases.

Fixes: ff8ed73786 ("net: stmmac: use readl_poll_timeout() function in init_systime()")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 11:10:27 +01:00
Vladimir Oltean
4cf35a2b62 net: mscc: ocelot: fix broken IP multicast flooding
When the user runs:
bridge link set dev $br_port mcast_flood on

this command should affect not only L2 multicast, but also IPv4 and IPv6
multicast.

In the Ocelot switch, unknown multicast gets flooded according to
different PGIDs according to its type, and PGID_MC only handles L2
multicast. Therefore, by leaving PGID_MCIPV4 and PGID_MCIPV6 at their
default value of 0, unknown IP multicast traffic is never flooded.

Fixes: 421741ea56 ("net: mscc: ocelot: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220415151950.219660-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-19 10:33:33 +02:00
Manuel Ullmann
cbe6c3a8f8 net: atlantic: invert deep par in pm functions, preventing null derefs
This will reset deeply on freeze and thaw instead of suspend and
resume and prevent null pointer dereferences of the uninitialized ring
0 buffer while thawing.

The impact is an indefinitely hanging kernel. You can't switch
consoles after this and the only possible user interaction is SysRq.

BUG: kernel NULL pointer dereference
RIP: 0010:aq_ring_rx_fill+0xcf/0x210 [atlantic]
aq_vec_init+0x85/0xe0 [atlantic]
aq_nic_init+0xf7/0x1d0 [atlantic]
atl_resume_common+0x4f/0x100 [atlantic]
pci_pm_thaw+0x42/0xa0

resolves in aq_ring.o to

```
0000000000000ae0 <aq_ring_rx_fill>:
{
/* ... */
 baf:	48 8b 43 08          	mov    0x8(%rbx),%rax
 		buff->flags = 0U; /* buff is NULL */
```

The bug has been present since the introduction of the new pm code in
8aaa112a57 ("net: atlantic: refactoring pm logic") and was hidden
until 8ce8427169 ("net: atlantic: changes for multi-TC support"),
which refactored the aq_vec_{free,alloc} functions into
aq_vec_{,ring}_{free,alloc}, but is technically not wrong. The
original functions just always reinitialized the buffers on S3/S4. If
the interface is down before freezing, the bug does not occur. It does
not matter, whether the initrd contains and loads the module before
thawing.

So the fix is to invert the boolean parameter deep in all pm function
calls, which was clearly intended to be set like that.

First report was on Github [1], which you have to guess from the
resume logs in the posted dmesg snippet. Recently I posted one on
Bugzilla [2], since I did not have an AQC device so far.

#regzbot introduced: 8ce8427169
#regzbot from: koo5 <kolman.jindrich@gmail.com>
#regzbot monitor: https://github.com/Aquantia/AQtion/issues/32

Fixes: 8aaa112a57 ("net: atlantic: refactoring pm logic")
Link: https://github.com/Aquantia/AQtion/issues/32 [1]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215798 [2]
Cc: stable@vger.kernel.org
Reported-by: koo5 <kolman.jindrich@gmail.com>
Signed-off-by: Manuel Ullmann <labre@posteo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 13:34:36 +01:00
Horatiu Vultur
d08ed85256 net: lan966x: Make sure to release ptp interrupt
When the lan966x driver is removed make sure to remove also the ptp_irq
IRQ.

Fixes: e85a96e48e ("net: lan966x: Add support for ptp interrupts")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20220413195716.3796467-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:29:38 -07:00
David Ahern
db53cd3d88 net: Handle l3mdev in ip_tunnel_init_flow
Ido reported that the commit referenced in the Fixes tag broke
a gre use case with dummy devices. Add a check to ip_tunnel_init_flow
to see if the oif is an l3mdev port and if so set the oif to 0 to
avoid the oif comparison in fib_lookup_good_nhc.

Fixes: 40867d74c3 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:27:30 -07:00
Jakub Kicinski
f3226eed54 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-04-13

This series contains updates to igc and e1000e drivers.

Sasha removes waiting for hardware semaphore as it could cause an
infinite loop and changes usleep_range() calls done under atomic
context to udelay() for igc. For e1000e, he changes some variables from
u16 to u32 to prevent possible overflow of values.

Vinicius disables PTM when going to suspend as it is causing hang issues
on some platforms for igc.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  e1000e: Fix possible overflow in LTR decoding
  igc: Fix suspending when PTM is active
  igc: Fix BUG: scheduling while atomic
  igc: Fix infinite loop in release_swfw_sync
====================

Link: https://lore.kernel.org/r/20220413170814.2066855-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-15 14:25:42 -07:00
Stephen Hemminger
da367ac74a net: restore alpha order to Ethernet devices in config
The displayed list of Ethernet devices in make menuconfig
has gotten out of order. This is mostly due to changes in vendor
names etc, but also because of new Microsoft entry in wrong place.

This restores so that the display is in order even if the names
of the sub directories are not.

Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-15 11:56:16 +01:00
Jianglei Nie
7c8881b779 ice: Fix memory leak in ice_get_orom_civd_data()
A memory chunk was allocated for orom_data in ice_get_orom_civd_data()
by vzmalloc(). But when ice_read_flash_module() fails, the allocated
memory is not freed, which will lead to a memory leak.

We can fix it by freeing the orom_data when ce_read_flash_module() fails.

Fixes: af18d8866c ("ice: reduce time to read Option ROM CIVD data")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-14 08:31:01 -07:00
Wojciech Drewek
d201665147 ice: fix crash in switchdev mode
Below steps end up with crash:
- modprobe ice
- devlink dev eswitch set $PF1_PCI mode switchdev
- echo 64 > /sys/class/net/$PF1/device/sriov_numvfs
- rmmod ice

Calling ice_eswitch_port_start_xmit while the process of removing
VFs is in progress ends up with NULL pointer dereference.
That's because PR netdev is not released but some resources
are already freed. Fix it by checking if ICE_VF_DIS bit is set.

Call trace:
[ 1379.595146] BUG: kernel NULL pointer dereference, address: 0000000000000040
[ 1379.595284] #PF: supervisor read access in kernel mode
[ 1379.595410] #PF: error_code(0x0000) - not-present page
[ 1379.595535] PGD 0 P4D 0
[ 1379.595657] Oops: 0000 [#1] PREEMPT SMP PTI
[ 1379.595783] CPU: 4 PID: 974 Comm: NetworkManager Kdump: loaded Tainted: G           OE     5.17.0-rc8_mrq_dev-queue+ #12
[ 1379.595926] Hardware name: Intel Corporation S1200SP/S1200SP, BIOS S1200SP.86B.03.01.0042.013020190050 01/30/2019
[ 1379.596063] RIP: 0010:ice_eswitch_port_start_xmit+0x46/0xd0 [ice]
[ 1379.596292] Code: c7 c8 09 00 00 e8 9a c9 fc ff 84 c0 0f 85 82 00 00 00 4c 89 e7 e8 ca 70 fe ff 48 8b 7d 58 48 89 c3 48 85 ff 75 5e 48 8b 53 20 <8b> 42 40 85 c0 74 78 8d 48 01 f0 0f b1 4a 40 75 f2 0f b6 95 84 00
[ 1379.596456] RSP: 0018:ffffaba0c0d7bad0 EFLAGS: 00010246
[ 1379.596584] RAX: ffff969c14c71680 RBX: ffff969c14c71680 RCX: 000100107a0f0000
[ 1379.596715] RDX: 0000000000000000 RSI: ffff969b9d631000 RDI: 0000000000000000
[ 1379.596846] RBP: ffff969c07b46500 R08: ffff969becfca8ac R09: 0000000000000001
[ 1379.596977] R10: 0000000000000004 R11: ffffaba0c0d7bbec R12: ffff969b9d631000
[ 1379.597106] R13: ffffffffc08357a0 R14: ffff969c07b46500 R15: ffff969b9d631000
[ 1379.597237] FS:  00007f72c0e25c80(0000) GS:ffff969f13500000(0000) knlGS:0000000000000000
[ 1379.597414] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1379.597562] CR2: 0000000000000040 CR3: 000000012b316006 CR4: 00000000003706e0
[ 1379.597713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1379.597863] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1379.598015] Call Trace:
[ 1379.598153]  <TASK>
[ 1379.598294]  dev_hard_start_xmit+0xd9/0x220
[ 1379.598444]  sch_direct_xmit+0x8a/0x340
[ 1379.598592]  __dev_queue_xmit+0xa3c/0xd30
[ 1379.598739]  ? packet_parse_headers+0xb4/0xf0
[ 1379.598890]  packet_sendmsg+0xa15/0x1620
[ 1379.599038]  ? __check_object_size+0x46/0x140
[ 1379.599186]  sock_sendmsg+0x5e/0x60
[ 1379.599330]  ____sys_sendmsg+0x22c/0x270
[ 1379.599474]  ? import_iovec+0x17/0x20
[ 1379.599622]  ? sendmsg_copy_msghdr+0x59/0x90
[ 1379.599771]  ___sys_sendmsg+0x81/0xc0
[ 1379.599917]  ? __pollwait+0xd0/0xd0
[ 1379.600061]  ? preempt_count_add+0x68/0xa0
[ 1379.600210]  ? _raw_write_lock_irq+0x1a/0x40
[ 1379.600369]  ? ep_done_scan+0xc9/0x110
[ 1379.600494]  ? _raw_spin_unlock_irqrestore+0x25/0x40
[ 1379.600622]  ? preempt_count_add+0x68/0xa0
[ 1379.600747]  ? _raw_spin_lock_irq+0x1a/0x40
[ 1379.600899]  ? __fget_light+0x8f/0x110
[ 1379.601024]  __sys_sendmsg+0x49/0x80
[ 1379.601148]  ? release_ds_buffers+0x50/0xe0
[ 1379.601274]  do_syscall_64+0x3b/0x90
[ 1379.601399]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1379.601525] RIP: 0033:0x7f72c1e2e35d

Fixes: f5396b8a66 ("ice: switchdev slow path")
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reported-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-14 08:19:54 -07:00
Maciej Fijalkowski
aacca7a83b ice: allow creating VFs for !CONFIG_NET_SWITCHDEV
Currently for !CONFIG_NET_SWITCHDEV kernel builds it is not possible to
create VFs properly as call to ice_eswitch_configure() returns
-EOPNOTSUPP for us. This is because CONFIG_ICE_SWITCHDEV depends on
CONFIG_NET_SWITCHDEV.

Change the ice_eswitch_configure() implementation for
!CONFIG_ICE_SWITCHDEV to return 0 instead -EOPNOTSUPP and let
ice_ena_vfs() finish its work properly.

CC: Grzegorz Nitka <grzegorz.nitka@intel.com>
Fixes: 1a1c40df2e ("ice: set and release switchdev environment")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-14 08:19:54 -07:00
Maciej Fijalkowski
d1fc4c6fea ice: xsk: check if Rx ring was filled up to the end
__ice_alloc_rx_bufs_zc() checks if a number of the descriptors to be
allocated would cause the ring wrap. In that case, driver will issue two
calls to xsk_buff_alloc_batch() - one that will fill the ring up to the
end and the second one that will start with filling descriptors from the
beginning of the ring.

ice_fill_rx_descs() is a wrapper for taking care of what
xsk_buff_alloc_batch() gave back to the driver. It works in a best
effort approach, so for example when driver asks for 64 buffers,
ice_fill_rx_descs() could assign only 32. Such case needs to be checked
when ring is being filled up to the end, because in that situation ntu
might not reached the end of the ring.

Fix the ring wrap by checking if nb_buffs_extra has the expected value.
If not, bump ntu and go directly to tail update.

Fixes: 3876ff525d ("ice: xsk: Handle SW XDP ring wrap and bump tail more often")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Shwetha Nagaraju <Shwetha.nagaraju@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-14 08:08:40 -07:00
Jeremy Linton
2df3fc4a84 net: bcmgenet: Revert "Use stronger register read/writes to assure ordering"
It turns out after digging deeper into this bug, that it was being
triggered by GCC12 failing to call the bcmgenet_enable_dma()
routine. Given that a gcc12 fix has been merged [1] and the genet
driver now works properly when built with gcc12, this commit should
be reverted.

[1]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=aabb9a261ef060cf24fd626713f1d7d9df81aa57

Fixes: 8d3ea3d402 ("net: bcmgenet: Use stronger register read/writes to assure ordering")
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220412210420.1129430-1-jeremy.linton@arm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-14 09:13:07 +02:00
Sasha Neftin
04ebaa1cfd e1000e: Fix possible overflow in LTR decoding
When we decode the latency and the max_latency, u16 value may not fit
the required size and could lead to the wrong LTR representation.

Scaling is represented as:
scale 0 - 1         (2^(5*0)) = 2^0
scale 1 - 32        (2^(5 *1))= 2^5
scale 2 - 1024      (2^(5 *2)) =2^10
scale 3 - 32768     (2^(5 *3)) =2^15
scale 4 - 1048576   (2^(5 *4)) = 2^20
scale 5 - 33554432  (2^(5 *4)) = 2^25
scale 4 and scale 5 required 20 and 25 bits respectively.
scale 6 reserved.

Replace the u16 type with the u32 type and allow corrected LTR
representation.

Cc: stable@vger.kernel.org
Fixes: 44a13a5d99 ("e1000e: Fix the max snoop/no-snoop latency for 10M")
Reported-by: James Hutchinson <jahutchinson99@googlemail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215689
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Tested-by: James Hutchinson <jahutchinson99@googlemail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-13 09:18:27 -07:00
Vinicius Costa Gomes
822f52e7ef igc: Fix suspending when PTM is active
Some mainboard/CPU combinations, in particular, Alder Lake-S with a
W680 mainboard, have shown problems (system hangs usually, no kernel
logs) with suspend/resume when PCIe PTM is enabled and active. In some
cases, it could be reproduced when removing the igc module.

The best we can do is to stop PTM dialogs from the downstream/device
side before the interface is brought down. PCIe PTM will be re-enabled
when the interface is being brought up.

Fixes: a90ec84837 ("igc: Add support for PTP getcrosststamp()")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-13 09:17:58 -07:00
Sasha Neftin
c80a29f0fe igc: Fix BUG: scheduling while atomic
Replace usleep_range() method with udelay() method to allow atomic contexts
in low-level MDIO access functions.

The following issue can be seen by doing the following:
$ modprobe -r bonding
$ modprobe -v bonding max_bonds=1 mode=1 miimon=100 use_carrier=0
$ ip link set bond0 up
$ ifenslave bond0 eth0 eth1

[  982.357308] BUG: scheduling while atomic: kworker/u64:0/9/0x00000002
[  982.364431] INFO: lockdep is turned off.
[  982.368824] Modules linked in: bonding sctp ip6_udp_tunnel udp_tunnel mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core nfp tls sunrpc intel_rapl_msr iTCO_wdt iTCO_vendor_support mxm_wmi dcdbas intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate intel_uncore pcspkr lpc_ich mei_me ipmi_ssif mei ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter xfs libcrc32c sr_mod cdrom sd_mod t10_pi sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ahci libahci crc32c_intel libata i2c_algo_bit tg3 megaraid_sas igc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: bonding]
[  982.437941] CPU: 25 PID: 9 Comm: kworker/u64:0 Kdump: loaded Tainted: G        W        --------- -  - 4.18.0-348.el8.x86_64+debug #1
[  982.451333] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 2.7.0 12/005/2017
[  982.459791] Workqueue: bond0 bond_mii_monitor [bonding]
[  982.465622] Call Trace:
[  982.468355]  dump_stack+0x8e/0xd0
[  982.472056]  __schedule_bug.cold.60+0x3a/0x60
[  982.476919]  __schedule+0x147b/0x1bc0
[  982.481007]  ? firmware_map_remove+0x16b/0x16b
[  982.485967]  ? hrtimer_fixup_init+0x40/0x40
[  982.490625]  schedule+0xd9/0x250
[  982.494227]  schedule_hrtimeout_range_clock+0x10d/0x2c0
[  982.500058]  ? hrtimer_nanosleep_restart+0x130/0x130
[  982.505598]  ? hrtimer_init_sleeper_on_stack+0x90/0x90
[  982.511332]  ? usleep_range+0x88/0x130
[  982.515514]  ? recalibrate_cpu_khz+0x10/0x10
[  982.520279]  ? ktime_get+0xab/0x1c0
[  982.524175]  ? usleep_range+0x88/0x130
[  982.528355]  usleep_range+0xdd/0x130
[  982.532344]  ? console_conditional_schedule+0x30/0x30
[  982.537987]  ? igc_put_hw_semaphore+0x17/0x60 [igc]
[  982.543432]  igc_read_phy_reg_gpy+0x111/0x2b0 [igc]
[  982.548887]  igc_phy_has_link+0xfa/0x260 [igc]
[  982.553847]  ? igc_get_phy_id+0x210/0x210 [igc]
[  982.558894]  ? lock_acquire+0x34d/0x890
[  982.563187]  ? lock_downgrade+0x710/0x710
[  982.567659]  ? rcu_read_unlock+0x50/0x50
[  982.572039]  igc_check_for_copper_link+0x106/0x210 [igc]
[  982.577970]  ? igc_config_fc_after_link_up+0x840/0x840 [igc]
[  982.584286]  ? rcu_read_unlock+0x50/0x50
[  982.588661]  ? lock_release+0x591/0xb80
[  982.592939]  ? lock_release+0x591/0xb80
[  982.597220]  igc_has_link+0x113/0x330 [igc]
[  982.601887]  ? lock_downgrade+0x710/0x710
[  982.606362]  igc_ethtool_get_link+0x6d/0x90 [igc]
[  982.611614]  bond_check_dev_link+0x131/0x2c0 [bonding]
[  982.617350]  ? bond_time_in_interval+0xd0/0xd0 [bonding]
[  982.623277]  ? rcu_read_lock_held+0x62/0xc0
[  982.627944]  ? rcu_read_lock_sched_held+0xe0/0xe0
[  982.633198]  bond_mii_monitor+0x314/0x2500 [bonding]
[  982.638738]  ? lock_contended+0x880/0x880
[  982.643214]  ? bond_miimon_link_change+0xa0/0xa0 [bonding]
[  982.649336]  ? lock_acquire+0x34d/0x890
[  982.653615]  ? lock_downgrade+0x710/0x710
[  982.658089]  ? debug_object_deactivate+0x221/0x340
[  982.663436]  ? rcu_read_unlock+0x50/0x50
[  982.667811]  ? debug_print_object+0x2b0/0x2b0
[  982.672672]  ? __switch_to_asm+0x41/0x70
[  982.677049]  ? __switch_to_asm+0x35/0x70
[  982.681426]  ? _raw_spin_unlock_irq+0x24/0x40
[  982.686288]  ? trace_hardirqs_on+0x20/0x195
[  982.690956]  ? _raw_spin_unlock_irq+0x24/0x40
[  982.695818]  process_one_work+0x8f0/0x1770
[  982.700390]  ? pwq_dec_nr_in_flight+0x320/0x320
[  982.705443]  ? debug_show_held_locks+0x50/0x50
[  982.710403]  worker_thread+0x87/0xb40
[  982.714489]  ? process_one_work+0x1770/0x1770
[  982.719349]  kthread+0x344/0x410
[  982.722950]  ? kthread_insert_work_sanity_check+0xd0/0xd0
[  982.728975]  ret_from_fork+0x3a/0x50

Fixes: 5586838fe9 ("igc: Add code for PHY support")
Reported-by: Corinna Vinschen <vinschen@redhat.com>
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Corinna Vinschen <vinschen@redhat.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-13 07:58:32 -07:00
Sasha Neftin
907862e9ae igc: Fix infinite loop in release_swfw_sync
An infinite loop may occur if we fail to acquire the HW semaphore,
which is needed for resource release.
This will typically happen if the hardware is surprise-removed.
At this stage there is nothing to do, except log an error and quit.

Fixes: c0071c7aa5 ("igc: Add HW initialization code")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-13 07:58:32 -07:00
Dylan Hung
3d25045245 net: ftgmac100: access hardware register after clock ready
AST2600 MAC register 0x58 is writable only when the MAC clock is
enabled.  Usually, the MAC clock is enabled by the bootloader so
register 0x58 is set normally when the bootloader is involved.  To make
ast2600 ftgmac100 work without the bootloader, postpone the register
write until the clock is ready.

Fixes: 137d23cea1 ("net: ftgmac100: Fix Aspeed ast2600 TX hang issue")
Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13 12:43:55 +01:00
Horatiu Vultur
269219321e net: lan966x: Stop processing the MAC entry is port is wrong.
Currently when getting a new MAC is learn, the HW generates an
interrupt. So then the SW will check the new entry and checks if it
arrived on a correct port. If it didn't just generate a warning.
But this could still crash the system. Therefore stop processing that
entry when an issue is seen.

Fixes: 5ccd66e01c ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 20:47:15 -07:00
Horatiu Vultur
d7a947d289 net: lan966x: Fix when a port's upper is changed.
On lan966x it is not allowed to have foreign interfaces under a bridge
which already contains lan966x ports. So when a port leaves the bridge
it would call switchdev_bridge_port_unoffload which eventually will
notify the other ports that bridge left the vlan group but that is not
true because the bridge is still part of the vlan group.

Therefore when a port leaves the bridge, stop generating replays because
already the HW cleared after itself and the other ports don't need to do
anything else.

Fixes: cf2f60897e ("net: lan966x: Add support to offload the forwarding.")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 20:47:15 -07:00
Horatiu Vultur
6476f90aef net: lan966x: Fix IGMP snooping when frames have vlan tag
In case an IGMP frame has a vlan tag, then the function
lan966x_hw_offload couldn't figure out that is a IGMP frame. Therefore
the SW thinks that the frame was already forward by the HW which is not
true.
Extend lan966x_hw_offload to pop the vlan tag if are any and then check
for IGMP frames.

Fixes: 47aeea0d57 ("net: lan966x: Implement the callback SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED ")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 20:47:15 -07:00
Horatiu Vultur
eb9c0d671e net: lan966x: Update lan966x_ptp_get_nominal_value
The clk_per_cfg register represents the value added to the system clock
for each clock cycle. The issue is that the default value is wrong,
meaning that in case the DUT was a grandmaster then everone in the
network was too slow. In case there was a grandmaster, then there is no
issue because the DUT will configure clk_per_cfg register based on the
master frequency.

Fixes: d096459494 ("net: lan966x: Add support for ptp clocks")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 20:47:15 -07:00
Jakub Kicinski
65745da171 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-04-08

Alexander fixes a use after free issue with aRFS for ice driver.

Mateusz reverts a commit that introduced issues related to device
resets for iavf driver.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  Revert "iavf: Fix deadlock occurrence during resetting VF interface"
  ice: arfs: fix use-after-free when freeing @rx_cpu_rmap
====================

Link: https://lore.kernel.org/r/20220408163411.2415552-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 18:34:44 -07:00
Tomas Melin
5ad7f18cd8 net: macb: Restart tx only if queue pointer is lagging
commit 4298388574 ("net: macb: restart tx after tx used bit read")
added support for restarting transmission. Restarting tx does not work
in case controller asserts TXUBR interrupt and TQBP is already at the end
of the tx queue. In that situation, restarting tx will immediately cause
assertion of another TXUBR interrupt. The driver will end up in an infinite
interrupt loop which it cannot break out of.

For cases where TQBP is at the end of the tx queue, instead
only clear TX_USED interrupt. As more data gets pushed to the queue,
transmission will resume.

This issue was observed on a Xilinx Zynq-7000 based board.
During stress test of the network interface,
driver would get stuck on interrupt loop within seconds or minutes
causing CPU to stall.

Signed-off-by: Tomas Melin <tomas.melin@vaisala.com>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220407161659.14532-1-tomas.melin@vaisala.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 18:18:07 -07:00
Lv Ruyi
1a7eb80d17 dpaa_eth: Fix missing of_node_put in dpaa_get_ts_info()
Both of of_get_parent() and of_parse_phandle() return node pointer with
refcount incremented, use of_node_put() on it to decrease refcount
when done.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 12:02:33 +01:00
Dinh Nguyen
a6aaa00324 net: ethernet: stmmac: fix altr_tse_pcs function when using a fixed-link
When using a fixed-link, the altr_tse_pcs driver crashes
due to null-pointer dereference as no phy_device is provided to
tse_pcs_fix_mac_speed function. Fix this by adding a check for
phy_dev before calling the tse_pcs_fix_mac_speed() function.

Also clean up the tse_pcs_fix_mac_speed function a bit. There is
no need to check for splitter_base and sgmii_adapter_base
because the driver will fail if these 2 variables are not
derived from the device tree.

Fixes: fb3bbdb859 ("net: ethernet: Add TSE PCS support to dwmac-socfpga")
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 10:41:00 +01:00
Kai-Heng Feng
8d3a6c37d5 net: atlantic: Avoid out-of-bounds indexing
UBSAN warnings are observed on atlantic driver:
[ 294.432996] UBSAN: array-index-out-of-bounds in /build/linux-Qow4fL/linux-5.15.0/drivers/net/ethernet/aquantia/atlantic/aq_nic.c:484:48
[ 294.433695] index 8 is out of range for type 'aq_vec_s *[8]'

The ring is dereferenced right before breaking out the loop, to prevent
that from happening, only use the index in the loop to fix the issue.

BugLink: https://bugs.launchpad.net/bugs/1958770
Tested-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Link: https://lore.kernel.org/r/20220408022204.16815-1-kai.heng.feng@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08 20:34:31 -07:00
Vadim Pasternak
d452088cdf mlxsw: i2c: Fix initialization error flow
Add mutex_destroy() call in driver initialization error flow.

Fixes: 6882b0aee1 ("mlxsw: Introduce support for I2C bus")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220407070703.2421076-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08 14:37:28 -07:00
Mateusz Palczewski
7d59706dbe Revert "iavf: Fix deadlock occurrence during resetting VF interface"
This change caused a regression with resetting while changing network
namespaces. By clearing the IFF_UP flag, the kernel now thinks it has
fully closed the device.

This reverts commit 0cc318d2e8.

Fixes: 0cc318d2e8 ("iavf: Fix deadlock occurrence during resetting VF interface")
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-08 09:08:37 -07:00
Alexander Lobakin
d7442f512b ice: arfs: fix use-after-free when freeing @rx_cpu_rmap
The CI testing bots triggered the following splat:

[  718.203054] BUG: KASAN: use-after-free in free_irq_cpu_rmap+0x53/0x80
[  718.206349] Read of size 4 at addr ffff8881bd127e00 by task sh/20834
[  718.212852] CPU: 28 PID: 20834 Comm: sh Kdump: loaded Tainted: G S      W IOE     5.17.0-rc8_nextqueue-devqueue-02643-g23f3121aca93 #1
[  718.219695] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0012.070720200218 07/07/2020
[  718.223418] Call Trace:
[  718.227139]
[  718.230783]  dump_stack_lvl+0x33/0x42
[  718.234431]  print_address_description.constprop.9+0x21/0x170
[  718.238177]  ? free_irq_cpu_rmap+0x53/0x80
[  718.241885]  ? free_irq_cpu_rmap+0x53/0x80
[  718.245539]  kasan_report.cold.18+0x7f/0x11b
[  718.249197]  ? free_irq_cpu_rmap+0x53/0x80
[  718.252852]  free_irq_cpu_rmap+0x53/0x80
[  718.256471]  ice_free_cpu_rx_rmap.part.11+0x37/0x50 [ice]
[  718.260174]  ice_remove_arfs+0x5f/0x70 [ice]
[  718.263810]  ice_rebuild_arfs+0x3b/0x70 [ice]
[  718.267419]  ice_rebuild+0x39c/0xb60 [ice]
[  718.270974]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[  718.274472]  ? ice_init_phy_user_cfg+0x360/0x360 [ice]
[  718.278033]  ? delay_tsc+0x4a/0xb0
[  718.281513]  ? preempt_count_sub+0x14/0xc0
[  718.284984]  ? delay_tsc+0x8f/0xb0
[  718.288463]  ice_do_reset+0x92/0xf0 [ice]
[  718.292014]  ice_pci_err_resume+0x91/0xf0 [ice]
[  718.295561]  pci_reset_function+0x53/0x80
<...>
[  718.393035] Allocated by task 690:
[  718.433497] Freed by task 20834:
[  718.495688] Last potentially related work creation:
[  718.568966] The buggy address belongs to the object at ffff8881bd127e00
                which belongs to the cache kmalloc-96 of size 96
[  718.574085] The buggy address is located 0 bytes inside of
                96-byte region [ffff8881bd127e00, ffff8881bd127e60)
[  718.579265] The buggy address belongs to the page:
[  718.598905] Memory state around the buggy address:
[  718.601809]  ffff8881bd127d00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[  718.604796]  ffff8881bd127d80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
[  718.607794] >ffff8881bd127e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[  718.610811]                    ^
[  718.613819]  ffff8881bd127e80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
[  718.617107]  ffff8881bd127f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc

This is due to that free_irq_cpu_rmap() is always being called
*after* (devm_)free_irq() and thus it tries to work with IRQ descs
already freed. For example, on device reset the driver frees the
rmap right before allocating a new one (the splat above).
Make rmap creation and freeing function symmetrical with
{request,free}_irq() calls i.e. do that on ifup/ifdown instead
of device probe/remove/resume. These operations can be performed
independently from the actual device aRFS configuration.
Also, make sure ice_vsi_free_irq() clears IRQ affinity notifiers
only when aRFS is disabled -- otherwise, CPU rmap sets and clears
its own and they must not be touched manually.

Fixes: 28bf26724f ("ice: Implement aRFS")
Co-developed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-04-08 09:08:36 -07:00