linux/drivers/usb/core
Douglas Anderson ed62ca2f4f USB: core: Avoid race of async_completed() w/ usbdev_release()
While running reboot tests w/ a specific set of USB devices (and
slub_debug enabled), I found that once every few hours my device would
be crashed with a stack that looked like this:

[   14.012445] BUG: spinlock bad magic on CPU#0, modprobe/2091
[   14.012460]  lock: 0xffffffc0cb055978, .magic: ffffffc0, .owner: cryption contexts: %lu/%lu
[   14.012460] /1025536097, .owner_cpu: 0
[   14.012466] CPU: 0 PID: 2091 Comm: modprobe Not tainted 4.4.79 #352
[   14.012468] Hardware name: Google Kevin (DT)
[   14.012471] Call trace:
[   14.012483] [<....>] dump_backtrace+0x0/0x160
[   14.012487] [<....>] show_stack+0x20/0x28
[   14.012494] [<....>] dump_stack+0xb4/0xf0
[   14.012500] [<....>] spin_dump+0x8c/0x98
[   14.012504] [<....>] spin_bug+0x30/0x3c
[   14.012508] [<....>] do_raw_spin_lock+0x40/0x164
[   14.012515] [<....>] _raw_spin_lock_irqsave+0x64/0x74
[   14.012521] [<....>] __wake_up+0x2c/0x60
[   14.012528] [<....>] async_completed+0x2d0/0x300
[   14.012534] [<....>] __usb_hcd_giveback_urb+0xc4/0x138
[   14.012538] [<....>] usb_hcd_giveback_urb+0x54/0xf0
[   14.012544] [<....>] xhci_irq+0x1314/0x1348
[   14.012548] [<....>] usb_hcd_irq+0x40/0x50
[   14.012553] [<....>] handle_irq_event_percpu+0x1b4/0x3f0
[   14.012556] [<....>] handle_irq_event+0x4c/0x7c
[   14.012561] [<....>] handle_fasteoi_irq+0x158/0x1c8
[   14.012564] [<....>] generic_handle_irq+0x30/0x44
[   14.012568] [<....>] __handle_domain_irq+0x90/0xbc
[   14.012572] [<....>] gic_handle_irq+0xcc/0x18c

Investigation using kgdb() found that the wait queue that was passed
into wake_up() had been freed (it was filled with slub_debug poison).

I analyzed and instrumented the code and reproduced.  My current
belief is that this is happening:

1. async_completed() is called (from IRQ).  Moves "as" onto the
   completed list.
2. On another CPU, proc_reapurbnonblock_compat() calls
   async_getcompleted().  Blocks on spinlock.
3. async_completed() releases the lock; keeps running; gets blocked
   midway through wake_up().
4. proc_reapurbnonblock_compat() => async_getcompleted() gets the
   lock; removes "as" from completed list and frees it.
5. usbdev_release() is called.  Frees "ps".
6. async_completed() finally continues running wake_up().  ...but
   wake_up() has a pointer to the freed "ps".

The instrumentation that led me to believe this was based on adding
some trace_printk() calls in a select few functions and then using
kdb's "ftdump" at crash time.  The trace follows (NOTE: in the trace
below I cheated a little bit and added a udelay(1000) in
async_completed() after releasing the spinlock because I wanted it to
trigger quicker):

<...>-2104   0d.h2 13759034us!: async_completed at start: as=ffffffc0cc638200
mtpd-2055    3.... 13759356us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3d..1 13759362us : async_getcompleted after list_del_init: as=ffffffc0cc638200
mtpd-2055    3.... 13759371us+: proc_reapurbnonblock_compat: free_async(ffffffc0cc638200)
mtpd-2055    3.... 13759422us+: async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759479us : usbdev_release at start: ps=ffffffc0cc042080
mtpd-2055    3.... 13759487us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759497us!: usbdev_release after kfree(ps): ps=ffffffc0cc042080
<...>-2104   0d.h2 13760294us : async_completed before wake_up(): as=ffffffc0cc638200

To fix this problem we can just move the wake_up() under the ps->lock.
There should be no issues there that I'm aware of.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28 11:17:57 +02:00
..
buffer.c usb: separate out sysdev pointer from usb_bus 2017-03-23 08:20:21 +01:00
config.c usb-core: Add LINEAR_FRAME_INTR_BINTERVAL USB quirk 2017-03-14 17:07:31 +08:00
devices.c usb: fix some references for /proc/bus/usb 2017-04-18 16:54:19 +02:00
devio.c USB: core: Avoid race of async_completed() w/ usbdev_release() 2017-08-28 11:17:57 +02:00
driver.c usb: hub: Do not attempt to autosuspend disconnected devices 2017-03-23 08:13:22 +01:00
endpoint.c usb: patches for v4.10 merge window 2016-11-18 16:02:15 +01:00
file.c USB: Proper handling of Race Condition when two USB class drivers try to call init_usb_class simultaneously 2017-03-29 11:55:25 +02:00
generic.c USB: core: add missing license information to some files 2016-10-29 12:51:56 -04:00
hcd-pci.c USB / PCI / PM: Allow the PCI core to do the resume cleanup 2017-06-15 00:55:43 +02:00
hcd.c Merge 4.13-rc5 into usb-next 2017-08-14 14:50:58 -07:00
hub.c usb: Increase root hub reset signaling time to prevent retry 2017-08-16 15:26:26 -07:00
hub.h usb: Support USB 3.1 extended port status request 2016-01-24 20:16:52 -08:00
Kconfig docs-rst: fix usb cross-references 2017-04-11 14:41:29 -06:00
ledtrig-usbport.c usb: Convert to using %pOF instead of full_name 2017-07-22 15:56:53 +02:00
Makefile usb: add CONFIG_USB_PCI for system have both PCI HW and non-PCI based USB HW 2017-03-17 13:16:56 +09:00
message.c usb: get rid of some ReST doc build errors 2017-04-11 14:40:48 -06:00
notify.c USB: core: add missing license information to some files 2016-10-29 12:51:56 -04:00
of.c USB: of: document reference taken by child-lookup helper 2017-06-13 11:07:32 +02:00
otg_whitelist.h usb: core: use IS_ENABLED() instead of checking for built-in or module 2016-09-02 14:36:33 +02:00
port.c Revert "USB / PM: Allow USB devices to remain runtime-suspended when sleeping" 2016-05-02 08:44:31 -07:00
quirks.c usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter 2017-08-10 11:50:55 -07:00
sysfs.c usb: Convert to using %pOF instead of full_name 2017-07-22 15:56:53 +02:00
urb.c USB: core: replace %p with %pK 2017-05-17 11:27:41 +02:00
usb-acpi.c usb: optimize acpi companion search for usb port devices 2017-06-03 18:02:58 +09:00
usb.c USB: of: fix root-hub device-tree node handling 2017-06-13 11:07:32 +02:00
usb.h USB: core: add missing license information to some files 2016-10-29 12:51:56 -04:00