linux/drivers
Shay Drory 42ea9f1b5c net/mlx5: drain health workqueue in case of driver load error
If a work item is still queued on the health WQ when the driver is torn down
in the driver load error flow, the health work will try to read dev->iseg,
which was already unmapped in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().
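
The ordering problem can be sketched in a small userspace model. This is not the mlx5 code: every name below (struct model_dev, model_health_work(), model_drain_health_wq(), model_pci_init(), model_pci_close()) is hypothetical, the workqueue is reduced to a single pending flag, and the mapping is modeled with a heap allocation. It only illustrates why the drain has to come before the unmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the device; models only the fields discussed. */
struct model_dev {
	uint32_t *iseg;            /* models the mapped initialization segment */
	int health_work_pending;   /* models a work item queued on the health WQ */
	uint32_t last_health_read; /* last value the health work read via iseg */
};

/* Models the health work: it reads through iseg, so iseg must be mapped. */
static void model_health_work(struct model_dev *dev)
{
	/* In the kernel, a read after unmap is the page fault in the trace. */
	assert(dev->iseg != NULL);
	dev->last_health_read = dev->iseg[0];
}

/* Models draining: run any pending work to completion before returning. */
static void model_drain_health_wq(struct model_dev *dev)
{
	if (dev->health_work_pending) {
		model_health_work(dev);
		dev->health_work_pending = 0;
	}
}

/* Models mapping the device; returns 0 on success, -1 on failure. */
static int model_pci_init(struct model_dev *dev)
{
	dev->iseg = calloc(1, sizeof(*dev->iseg));
	if (!dev->iseg)
		return -1;
	dev->iseg[0] = 0x1234; /* arbitrary value for the work to read */
	dev->health_work_pending = 0;
	dev->last_health_read = 0;
	return 0;
}

/* Models the fixed close path: drain first, only then unmap. */
static void model_pci_close(struct model_dev *dev)
{
	model_drain_health_wq(dev);
	free(dev->iseg);
	dev->iseg = NULL;
}
```

In this model, swapping the two steps in model_pci_close() would leave the pending work to run against a NULL iseg, which is the userspace analogue of the fault shown in the trace below.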

Trace of the error:
BUG: unable to handle page fault for address: ffffb5b141c18014
PF: supervisor read access in kernel mode
PF: error_code(0x0000) - not-present page
PGD 1fe95d067 P4D 1fe95d067 PUD 1fe95e067 PMD 1b7823067 PTE 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 6755 Comm: kworker/u128:2 Not tainted 5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ? mlx5_health_try_recover+0x4d/0x270 [mlx5_core]
 mlx5_fw_fatal_reporter_recover+0x16/0x20 [mlx5_core]
 devlink_health_reporter_recover+0x1c/0x50
 devlink_health_report+0xfb/0x240
 mlx5_fw_fatal_reporter_err_work+0x65/0xd0 [mlx5_core]
 process_one_work+0x1fb/0x4e0
 ? process_one_work+0x16b/0x4e0
 worker_thread+0x4f/0x3d0
 kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0
 ? kthread_cancel_delayed_work_sync+0x20/0x20
 ret_from_fork+0x1f/0x30
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache 8021q garp mrp stp llc ipmi_devintf ipmi_msghandler rpcrdma rdma_ucm ib_iser rdma_cm ib_umad iw_cm ib_ipoib libiscsi scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core mlx5_core sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 mlxfw crypto_simd cryptd glue_helper input_leds hyperv_fb intel_rapl_perf joydev serio_raw pci_hyperv pci_hyperv_mini mac_hid hv_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 hv_utils hid_generic hv_storvsc ptp hid_hyperv hid hv_netvsc hyperv_keyboard pps_core scsi_transport_fc psmouse hv_vmbus i2c_piix4 floppy pata_acpi
CR2: ffffb5b141c18014
---[ end trace b12c5503157cad24 ]---
RIP: 0010:ioread32be+0x30/0x40
Code: 00 77 27 48 81 ff 00 00 01 00 76 07 0f b7 d7 ed 0f c8 c3 55 48 c7 c6 3b ee d5 9f 48 89 e5 e8 67 fc ff ff b8 ff ff ff ff 5d c3 <8b> 07 0f c8 c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 81 fe ff ff 03
RSP: 0018:ffffb5b14c56fd78 EFLAGS: 00010292
RAX: ffffb5b141c18000 RBX: ffff8e9f78a801c0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8e9f7ecd7628 RDI: ffffb5b141c18014
RBP: ffffb5b14c56fd90 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8e9f372a2c30 R11: ffff8e9f87f4bc40 R12: ffff8e9f372a1fc0
R13: ffff8e9f78a80000 R14: ffffffffc07136a0 R15: ffff8e9f78ae6f20
FS:  0000000000000000(0000) GS:ffff8e9f7ecc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffb5b141c18014 CR3: 00000001c8f82006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38
in_atomic(): 0, irqs_disabled(): 1, pid: 6755, name: kworker/u128:2
INFO: lockdep is turned off.
CPU: 3 PID: 6755 Comm: kworker/u128:2 Tainted: G      D           5.2.0-net-next-mlx5-hv_stats-over-last-worked-hyperv #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  04/28/2016
Workqueue: mlx5_healtha050:00:02.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
Call Trace:
 dump_stack+0x63/0x88
 ___might_sleep+0x10a/0x130
 __might_sleep+0x4a/0x80
 exit_signals+0x33/0x230
 ? blocking_notifier_call_chain+0x16/0x20
 do_exit+0xb1/0xc30
 ? kthread+0x10d/0x140
 ? process_one_work+0x4e0/0x4e0

Fixes: 52c368dc3d ("net/mlx5: Move health and page alloc init to mdev_init")
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-06-11 15:37:48 -07:00
accessibility
acpi pci-v5.8-changes 2020-06-06 11:01:58 -07:00
amba ARM: SoC changes for v5.8 2020-06-04 19:47:11 -07:00
android
ata for-5.8/block-2020-06-01 2020-06-02 15:29:19 -07:00
atm
auxdisplay
base Driver core patches for 5.8-rc1 2020-06-07 10:53:36 -07:00
bcma
block RDMA 5.8 merge window pull request 2020-06-05 14:05:57 -07:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
bus Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
cdrom Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
char Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
clk Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
clocksource clocksource/drivers/timer-versatile: Clear OF_POPULATED flag 2020-05-23 00:03:25 +02:00
connector connector/cn_proc: Protect send_msg() with a local lock 2020-05-28 10:31:10 +02:00
counter
cpufreq ARM/SoC: drivers for v5.7 2020-06-04 19:56:20 -07:00
cpuidle powerpc updates for 5.8 2020-06-05 12:39:30 -07:00
crypto Crypto/chcr: Checking cra_refcnt before unregistering the algorithms 2020-06-10 17:05:02 -07:00
dax device-dax: add memory via add_memory_driver_managed() 2020-06-04 19:06:23 -07:00
dca
devfreq PM / devfreq: Use lockdep asserts instead of manual checks for locked mutex 2020-05-28 18:02:40 +09:00
dio
dma dmaengine: tegra210-adma: Fix an error handling path in 'tegra_adma_probe()' 2020-05-19 22:26:01 +05:30
dma-buf drm pull for 5.8-rc1 2020-06-02 15:04:15 -07:00
edac Merge branches 'edac-i10nm' and 'edac-misc' into edac-updates-for-5.8 2020-06-01 11:39:15 +02:00
eisa
extcon extcon: arizona: Fix runtime PM imbalance on error 2020-05-29 17:36:02 +09:00
firewire
firmware Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
fpga Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
fsi
gnss
gpio USB/PHY driver updates for 5.8-rc1 2020-06-07 09:42:16 -07:00
gpu TTY/Serial driver updates for 5.8-rc1 2020-06-07 09:52:36 -07:00
greybus
hid Merge branches 'for-5.7/upstream-fixes', 'for-5.8/apple', 'for-5.8/asus', 'for-5.8/core', 'for-5.8/intel-ish', 'for-5.8/logitech', 'for-5.8/mcp2221' and 'for-5.8/multitouch' into for-linus 2020-06-03 22:23:52 +02:00
hsi
hv hyperv-next for 5.8 2020-06-03 15:00:05 -07:00
hwmon hwmon: Add Baikal-T1 PVT sensor driver 2020-05-28 07:59:45 -07:00
hwspinlock
hwtracing Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
i2c This is the bulk of GPIO changes for the v5.8 kernel cycle. 2020-06-05 14:00:30 -07:00
i3c
ide
idle
iio Staging/IIO driver patches for 5.8-rc1 2020-06-07 10:45:08 -07:00
infiniband RDMA 5.8 merge window pull request 2020-06-05 14:05:57 -07:00
input powerpc updates for 5.8 2020-06-05 12:39:30 -07:00
interconnect interconnect changes for 5.8 2020-05-22 09:14:03 +02:00
iommu dma-mapping updates for 5.8, part 1 2020-06-06 11:43:23 -07:00
ipack
irqchip irqchip: Fix "Loongson HyperTransport Vector support" driver build on all non-MIPS platforms 2020-06-01 09:48:52 +02:00
isdn
leds LEDs pull request for 5.8-rc1. 2020-06-04 11:03:45 -07:00
lightnvm for-5.8/block-2020-06-01 2020-06-02 15:29:19 -07:00
macintosh powerpc updates for 5.8 2020-06-05 12:39:30 -07:00
mailbox
mcb
md - Largest change for this cycle is the DM zoned target's metadata 2020-06-05 15:45:03 -07:00
media media updates for v5.8-rc1 2020-06-03 20:59:38 -07:00
memory Merge branch 'baikal/drivers' into arm/drivers 2020-05-28 14:18:11 +02:00
memstick
message
mfd This is the bulk of GPIO changes for the v5.8 kernel cycle. 2020-06-05 14:00:30 -07:00
misc Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
mmc Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
most
mtd for-5.8/block-2020-06-01 2020-06-02 15:29:19 -07:00
mux
net net/mlx5: drain health workqueue in case of driver load error 2020-06-11 15:37:48 -07:00
nfc Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-06-01 12:00:10 -07:00
ntb NTB: perf: Fix race condition when run with ntb_test 2020-06-05 20:02:09 -04:00
nubus
nvdimm nvdimm: use bio_{start,end}_io_acct 2020-05-27 05:21:23 -06:00
nvme RDMA 5.8 merge window pull request 2020-06-05 14:05:57 -07:00
nvmem nvmem: qfprom: remove incorrect write support 2020-05-27 11:09:26 +02:00
of Driver core patches for 5.8-rc1 2020-06-07 10:53:36 -07:00
opp
oprofile
parisc
parport Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
pci pci-v5.8-changes 2020-06-06 11:01:58 -07:00
pcmcia pci-v5.8-changes 2020-06-06 11:01:58 -07:00
perf arm64 updates for 5.8 2020-06-01 15:18:27 -07:00
phy USB: changes for v5.8 merge window 2020-05-25 13:28:20 +02:00
pinctrl This is the bulk of pin control changes for the v5.8 2020-06-07 16:13:43 -07:00
platform chrome platform changes for 5.8 2020-06-04 10:54:45 -07:00
pnp
power ARM: SoC changes for v5.8 2020-06-04 19:47:11 -07:00
powercap powercap: RAPL: remove unused local MSR define 2020-05-25 10:59:29 +02:00
pps
ps3
ptp ptp_clock: Let the ADJ_OFFSET interface respect the ADJ_NANO flag for PHC devices. 2020-05-25 17:55:17 -07:00
pwm
rapidio rapidio: convert get_user_pages() --> pin_user_pages() 2020-06-04 19:06:26 -07:00
ras
regulator Merge remote-tracking branch 'regulator/for-5.8' into regulator-linus 2020-06-01 13:01:44 +01:00
remoteproc
reset Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
rpmsg
rtc RTC for 5.8 2020-06-07 16:11:23 -07:00
s390 SCSI misc on 20200605 2020-06-05 15:11:50 -07:00
sbus
scsi SCSI misc on 20200605 2020-06-05 15:11:50 -07:00
sfi
sh
siox
slimbus
soc Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
soundwire soundwire: intel: use a single module 2020-05-20 19:24:55 +05:30
spi Char/Misc driver patches for 5.8-rc1 2020-06-07 10:59:32 -07:00
spmi
ssb
staging Staging/IIO driver patches for 5.8-rc1 2020-06-07 10:45:08 -07:00
target SCSI misc on 20200605 2020-06-05 15:11:50 -07:00
tc
tee tee: fix crypto select 2020-05-28 12:38:00 +02:00
thermal
thunderbolt USB/PHY driver updates for 5.8-rc1 2020-06-07 09:42:16 -07:00
tty Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc 2020-06-07 17:11:41 -07:00
uio
usb USB/PHY driver updates for 5.8-rc1 2020-06-07 09:42:16 -07:00
vdpa
vfio VFIO updates for v5.8-rc1 2020-06-05 13:51:49 -07:00
vhost SCSI misc on 20200605 2020-06-05 15:11:50 -07:00
video powerpc updates for 5.8 2020-06-05 12:39:30 -07:00
virt
virtio
visorbus
vlynq
vme
w1 w1: omap-hdq: print dev_err if irq flags are not cleared 2020-05-27 12:18:49 +02:00
watchdog linux-watchdog 5.8-rc1 tag 2020-06-04 10:50:22 -07:00
xen
zorro
Kconfig
Makefile