linux/drivers/net/ethernet
Mohamed Khalfella 330a699ecb igb: Do not bring the device up after non-fatal error
Commit 004d25060c ("igb: Fix igb_down hung on surprise removal")
changed igb_io_error_detected() to ignore non-fatal PCIe errors in order
to avoid a hung task that can happen when igb_down() is called multiple
times. This caused an issue when processing transient non-fatal errors:
igb_io_resume(), which is called after igb_io_error_detected(), assumes
that the device was brought down by igb_io_error_detected() if the
interface is up. This resulted in a panic with the stack trace below.

[ T3256] igb 0000:09:00.0 haeth0: igb: haeth0 NIC Link is Down
[  T292] pcieport 0000:00:1c.5: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0
[  T292] igb 0000:09:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
[  T292] igb 0000:09:00.0:   device [8086:1537] error status/mask=00004000/00000000
[  T292] igb 0000:09:00.0:    [14] CmpltTO
[  200.105524,009][  T292] igb 0000:09:00.0: AER:   TLP Header: 00000000 00000000 00000000 00000000
[  T292] pcieport 0000:00:1c.5: AER: broadcast error_detected message
[  T292] igb 0000:09:00.0: Non-correctable non-fatal error reported.
[  T292] pcieport 0000:00:1c.5: AER: broadcast mmio_enabled message
[  T292] pcieport 0000:00:1c.5: AER: broadcast resume message
[  T292] ------------[ cut here ]------------
[  T292] kernel BUG at net/core/dev.c:6539!
[  T292] invalid opcode: 0000 [#1] PREEMPT SMP
[  T292] RIP: 0010:napi_enable+0x37/0x40
[  T292] Call Trace:
[  T292]  <TASK>
[  T292]  ? die+0x33/0x90
[  T292]  ? do_trap+0xdc/0x110
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? do_error_trap+0x70/0xb0
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? exc_invalid_op+0x4e/0x70
[  T292]  ? napi_enable+0x37/0x40
[  T292]  ? asm_exc_invalid_op+0x16/0x20
[  T292]  ? napi_enable+0x37/0x40
[  T292]  igb_up+0x41/0x150
[  T292]  igb_io_resume+0x25/0x70
[  T292]  report_resume+0x54/0x70
[  T292]  ? report_frozen_detected+0x20/0x20
[  T292]  pci_walk_bus+0x6c/0x90
[  T292]  ? aer_print_port_info+0xa0/0xa0
[  T292]  pcie_do_recovery+0x22f/0x380
[  T292]  aer_process_err_devices+0x110/0x160
[  T292]  aer_isr+0x1c1/0x1e0
[  T292]  ? disable_irq_nosync+0x10/0x10
[  T292]  irq_thread_fn+0x1a/0x60
[  T292]  irq_thread+0xe3/0x1a0
[  T292]  ? irq_set_affinity_notifier+0x120/0x120
[  T292]  ? irq_affinity_notify+0x100/0x100
[  T292]  kthread+0xe2/0x110
[  T292]  ? kthread_complete_and_exit+0x20/0x20
[  T292]  ret_from_fork+0x2d/0x50
[  T292]  ? kthread_complete_and_exit+0x20/0x20
[  T292]  ret_from_fork_asm+0x11/0x20
[  T292]  </TASK>

To fix this issue, igb_io_resume() checks whether the interface is
running and the device is not down. If both hold, igb_io_error_detected()
did not bring the device down, so there is no need to bring it up.
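
The check described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the driver
code: every type and function name below (model_adapter, model_up(),
model_io_resume(), and so on) is hypothetical, the "down" flag merely
stands in for the driver's device-down state, and the double NAPI enable
is counted instead of triggering a BUG. It assumes the panic in the trace
comes from igb_up() enabling NAPI while it is already enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are NOT the igb driver's types. */
enum err_severity { ERR_NONFATAL, ERR_FATAL };

struct model_adapter {
    bool running;       /* stands in for "the interface is running" */
    bool down;          /* stands in for "the device is down" */
    bool napi_enabled;  /* models the NAPI enabled state */
    int  double_enable; /* counts enables of already-enabled NAPI */
};

/* Models bringing the device up; a second enable is the assumed bug. */
static void model_up(struct model_adapter *a)
{
    if (a->napi_enabled)
        a->double_enable++; /* the kernel would hit BUG here */
    a->napi_enabled = true;
    a->down = false;
}

/* Models bringing the device down. */
static void model_down(struct model_adapter *a)
{
    a->napi_enabled = false;
    a->down = true;
}

/* Models the error_detected step: non-fatal errors are ignored. */
static void model_error_detected(struct model_adapter *a,
                                 enum err_severity sev)
{
    if (sev == ERR_NONFATAL)
        return; /* device is left up and untouched */
    if (a->running)
        model_down(a);
}

/* Models the resume step, with or without the check from this commit. */
static void model_io_resume(struct model_adapter *a, bool with_fix)
{
    if (!a->running)
        return;
    if (with_fix && !a->down)
        return; /* device was never brought down: nothing to do */
    model_up(a);
}

/* Runs one error/resume cycle on an interface that starts up and
 * running; returns how many times NAPI was enabled twice. */
static int model_scenario(enum err_severity sev, bool with_fix)
{
    struct model_adapter a = {
        .running = true, .down = false,
        .napi_enabled = true, .double_enable = 0,
    };

    model_error_detected(&a, sev);
    model_io_resume(&a, with_fix);
    return a.double_enable;
}

/* Usage example: only the non-fatal path without the check misbehaves. */
static void model_self_check(void)
{
    assert(model_scenario(ERR_NONFATAL, false) == 1);
    assert(model_scenario(ERR_NONFATAL, true) == 0);
    assert(model_scenario(ERR_FATAL, false) == 0);
    assert(model_scenario(ERR_FATAL, true) == 0);
}
```

In this model the fatal path is unchanged by the check, because the
device really was brought down before resume runs. Calling
model_self_check() from a main() exercises all four combinations.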

Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com>
Fixes: 004d25060c ("igb: Fix igb_down hung on surprise removal")
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:39:21 -07:00
3com net: annotate data-races around dev->if_port 2024-05-08 18:51:30 -07:00
8390 net: ethernet: 8390: ne2k-pci: remove unused struct 'ne2k_pci_card' 2024-05-28 15:21:12 +02:00
actions
adaptec net: ethernet: starfire: remove unused structs 2024-05-28 15:21:04 +02:00
adi net: ethernet: adi: adin1110: Fix some error handling path in adin1110_read_fifo() 2024-10-07 16:49:43 -07:00
aeroflex
agere net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
alacritech net: alacritech: Partially revert "net: alacritech: Switch to use dev_err_probe()" 2024-09-03 15:28:57 -07:00
allwinner
alteon net: alteon: Convert tasklet API to new bottom half workqueue mechanism 2024-07-31 18:59:46 -07:00
altera net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
amazon net: ena: Extend customer metrics reporting support 2024-09-12 18:01:17 -07:00
amd amd-xgbe: Remove setting of RX software timestamp 2024-09-09 17:44:40 -07:00
apm net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
apple net: apple: bmac: Use IRQF_NO_AUTOEN flag in request_irq() 2024-09-12 20:35:04 -07:00
aquantia net: atlantic: convert comma to semicolon 2024-09-06 18:05:53 -07:00
arc net: ethernet: arc: remove emac_arc driver 2024-06-21 10:07:17 +01:00
asix
atheros net: ag71xx: remove dead code path 2024-09-13 19:53:47 -07:00
broadcom move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
brocade bna: adjust 'name' buf size of bna_tcb and bna_ccb structures 2024-07-12 01:56:48 +01:00
cadence net: macb: Use predefined PCI vendor ID constant 2024-09-13 20:08:53 -07:00
calxeda net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
cavium net: thunderx: Remove setting of RX software timestamp 2024-09-09 17:44:41 -07:00
chelsio cxgb4: Remove setting of RX software timestamp 2024-09-06 09:34:18 +01:00
cirrus net: cirrus: use u8 for addr to calm down sparse 2024-09-23 06:58:37 +00:00
cisco enic: Report some per queue statistics in ethtool 2024-09-13 21:17:12 -07:00
cortina net: ethernet: cortina: Implement .set_pauseparam() 2024-06-01 16:07:29 -07:00
davicom net: dm9051: fix module autoloading 2024-08-27 14:26:04 -07:00
dec move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
dlink net: ethernet: dlink: replace deprecated macro 2024-08-14 12:20:55 +01:00
emulex be2net: Remove unused declarations 2024-09-03 15:38:22 -07:00
engleder tsnep: Remove setting of RX software timestamp 2024-09-03 15:17:48 -07:00
ezchip
faraday Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-09-12 17:11:24 -07:00
freescale Including fixes from ieee802154, bluetooth and netfilter. 2024-10-03 09:44:00 -07:00
fujitsu
fungible net/funeth: Remove setting of RX software timestamp 2024-09-09 17:44:41 -07:00
google gve: Remove unused declaration gve_rx_alloc_rings() 2024-08-19 17:48:20 -07:00
hisilicon net: hns3: Remove setting of RX software timestamp 2024-09-03 15:17:48 -07:00
huawei net: hinic: use ethtool_sprintf/puts 2024-08-13 11:59:37 +02:00
i825xx
ibm ibmvnic: Inspect header requirements before using scrq direct 2024-10-04 12:04:09 -07:00
intel igb: Do not bring the device up after non-fatal error 2024-10-08 14:39:21 -07:00
litex
marvell Updates for timers and timekeeping: 2024-09-17 07:25:37 +02:00
mediatek net: airoha: Update tx cpu dma ring idx at the end of xmit loop 2024-10-07 17:29:22 -07:00
mellanox mlx5-fixes-2024-09-25 2024-10-02 17:14:53 -07:00
meta move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
micrel net: ks8851: Fix potential TX stall after interface reopen 2024-07-11 11:52:29 +02:00
microchip net: microchip: Make FDMA config symbol invisible 2024-10-01 11:31:43 +02:00
microsoft dma-mapping updates for linux 6.12 2024-09-19 11:12:49 +02:00
moxa
mscc net: mscc: ocelot: Remove setting of RX software timestamp 2024-09-09 17:44:41 -07:00
myricom net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
natsemi net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
neterion net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
netronome move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
ni net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
nvidia net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
nxp
oki-semi net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
packetengines move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
pasemi netdev_features: convert NETIF_F_LLTX to dev->lltx 2024-09-03 11:36:43 +02:00
pensando ionic: Allow XDP program to be hot swapped 2024-09-09 19:18:15 -07:00
qlogic qlcnic: make read-only const array key static 2024-09-11 16:00:54 -07:00
qualcomm netdev_features: convert NETIF_F_LLTX to dev->lltx 2024-09-03 11:36:43 +02:00
rdc
realtek move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
renesas net: ravb: Fix R-Car RX frame size limit 2024-09-24 11:55:13 +02:00
rocker netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local 2024-09-03 11:36:43 +02:00
samsung net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
seeq net: seeq: Fix use after free vulnerability in ether3 Driver Due to Race Condition 2024-09-19 15:17:30 +02:00
sfc sfc: Don't invoke xdp_do_flush() from netpoll. 2024-10-03 15:42:00 -07:00
sgi
silan
sis net: annotate data-races around dev->if_port 2024-05-08 18:51:30 -07:00
smsc move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
socionext
stmicro Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled" 2024-10-07 16:47:07 -07:00
sun net: sunvnet: use ethtool_sprintf/puts 2024-08-12 13:25:38 +01:00
sunplus
synopsys net: dwc-xlgmac: fix missing MODULE_DESCRIPTION() warning 2024-06-17 18:05:38 -07:00
tehuti netdev_features: convert NETIF_F_LLTX to dev->lltx 2024-09-03 11:36:43 +02:00
ti net: ethernet: ti: am65-cpsw: avoid devm_alloc_etherdev, fix module removal 2024-10-08 10:30:30 +02:00
toshiba netdev_features: convert NETIF_F_LLTX to dev->lltx 2024-09-03 11:36:43 +02:00
tundra
vertexcom net: vertexcom: mse102x: Use ETH_ZLEN 2024-08-29 11:39:35 -07:00
via net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
wangxun i2c-for-6.12-rc1 2024-09-23 14:34:19 -07:00
wiznet
xilinx net: xilinx: axienet: Fix packet counting 2024-09-19 13:00:46 +02:00
xircom net: annotate data-races around dev->if_port 2024-05-08 18:51:30 -07:00
xscale ixp4xx_eth: Remove setting of RX software timestamp 2024-09-09 17:44:42 -07:00
dnet.c
dnet.h
ec_bhf.c
ethoc.c
fealnx.c
jme.c net: ethernet: use ip_hdrlen() instead of bit shift 2024-08-11 04:41:15 +01:00
jme.h
Kconfig net: ethernet: oa_tc6: implement register write operation 2024-09-11 20:53:42 -07:00
korina.c
lantiq_etop.c net: ethernet: lantiq_etop: fix memory disclosure 2024-10-01 10:58:07 +02:00
lantiq_xrx200.c net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00
Makefile net: ethernet: oa_tc6: implement register write operation 2024-09-11 20:53:42 -07:00
oa_tc6.c net: ethernet: oa_tc6: add helper function to enable zero align rx frame 2024-09-11 20:53:45 -07:00