Go to file
Douglas Anderson ed62ca2f4f USB: core: Avoid race of async_completed() w/ usbdev_release()
While running reboot tests w/ a specific set of USB devices (and
slub_debug enabled), I found that once every few hours my device would
be crashed with a stack that looked like this:

[   14.012445] BUG: spinlock bad magic on CPU#0, modprobe/2091
[   14.012460]  lock: 0xffffffc0cb055978, .magic: ffffffc0, .owner: cryption contexts: %lu/%lu
[   14.012460] /1025536097, .owner_cpu: 0
[   14.012466] CPU: 0 PID: 2091 Comm: modprobe Not tainted 4.4.79 #352
[   14.012468] Hardware name: Google Kevin (DT)
[   14.012471] Call trace:
[   14.012483] [<....>] dump_backtrace+0x0/0x160
[   14.012487] [<....>] show_stack+0x20/0x28
[   14.012494] [<....>] dump_stack+0xb4/0xf0
[   14.012500] [<....>] spin_dump+0x8c/0x98
[   14.012504] [<....>] spin_bug+0x30/0x3c
[   14.012508] [<....>] do_raw_spin_lock+0x40/0x164
[   14.012515] [<....>] _raw_spin_lock_irqsave+0x64/0x74
[   14.012521] [<....>] __wake_up+0x2c/0x60
[   14.012528] [<....>] async_completed+0x2d0/0x300
[   14.012534] [<....>] __usb_hcd_giveback_urb+0xc4/0x138
[   14.012538] [<....>] usb_hcd_giveback_urb+0x54/0xf0
[   14.012544] [<....>] xhci_irq+0x1314/0x1348
[   14.012548] [<....>] usb_hcd_irq+0x40/0x50
[   14.012553] [<....>] handle_irq_event_percpu+0x1b4/0x3f0
[   14.012556] [<....>] handle_irq_event+0x4c/0x7c
[   14.012561] [<....>] handle_fasteoi_irq+0x158/0x1c8
[   14.012564] [<....>] generic_handle_irq+0x30/0x44
[   14.012568] [<....>] __handle_domain_irq+0x90/0xbc
[   14.012572] [<....>] gic_handle_irq+0xcc/0x18c

Investigation using kgdb() found that the wait queue that was passed
into wake_up() had been freed (it was filled with slub_debug poison).

I analyzed and instrumented the code and reproduced.  My current
belief is that this is happening:

1. async_completed() is called (from IRQ).  Moves "as" onto the
   completed list.
2. On another CPU, proc_reapurbnonblock_compat() calls
   async_getcompleted().  Blocks on spinlock.
3. async_completed() releases the lock; keeps running; gets blocked
   midway through wake_up().
4. proc_reapurbnonblock_compat() => async_getcompleted() gets the
   lock; removes "as" from completed list and frees it.
5. usbdev_release() is called.  Frees "ps".
6. async_completed() finally continues running wake_up().  ...but
   wake_up() has a pointer to the freed "ps".

The instrumentation that led me to believe this was based on adding
some trace_printk() calls in a select few functions and then using
kdb's "ftdump" at crash time.  The trace follows (NOTE: in the trace
below I cheated a little bit and added a udelay(1000) in
async_completed() after releasing the spinlock because I wanted it to
trigger quicker):

<...>-2104   0d.h2 13759034us!: async_completed at start: as=ffffffc0cc638200
mtpd-2055    3.... 13759356us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3d..1 13759362us : async_getcompleted after list_del_init: as=ffffffc0cc638200
mtpd-2055    3.... 13759371us+: proc_reapurbnonblock_compat: free_async(ffffffc0cc638200)
mtpd-2055    3.... 13759422us+: async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759479us : usbdev_release at start: ps=ffffffc0cc042080
mtpd-2055    3.... 13759487us : async_getcompleted before spin_lock_irqsave
mtpd-2055    3.... 13759497us!: usbdev_release after kfree(ps): ps=ffffffc0cc042080
<...>-2104   0d.h2 13760294us : async_completed before wake_up(): as=ffffffc0cc638200

To fix this problem we can just move the wake_up() under the ps->lock.
There should be no issues there that I'm aware of.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-28 11:17:57 +02:00
2017-08-13 16:01:32 -07:00

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
Description
mainlining shenanigans
Readme 5.1 GiB
Languages
C 97.7%
Assembly 1.1%
Shell 0.4%
Makefile 0.3%
Python 0.2%
Other 0.1%