The problem seems to be that if a new device is detected
while we have already removed the shared HCD, then many of the
xhci operations (e.g. xhci_alloc_dev(), xhci_setup_device())
hang as command never completes.
I don't think XHCI can operate without the shared HCD as we've
already called xhci_halt() in xhci_only_stop_hcd() when shared HCD
goes away. We need to prevent new commands from being queued
not only when HCD is dying but also when HCD is halted.
The following lockup was detected while testing the otg state
machine.
[ 178.199951] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
[ 178.205799] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1
[ 178.214458] xhci-hcd xhci-hcd.0.auto: hcc params 0x0220f04c hci version 0x100 quirks 0x00010010
[ 178.223619] xhci-hcd xhci-hcd.0.auto: irq 400, io mem 0x48890000
[ 178.230677] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 178.237796] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 178.245358] usb usb1: Product: xHCI Host Controller
[ 178.250483] usb usb1: Manufacturer: Linux 4.0.0-rc1-00024-g6111320 xhci-hcd
[ 178.257783] usb usb1: SerialNumber: xhci-hcd.0.auto
[ 178.267014] hub 1-0:1.0: USB hub found
[ 178.272108] hub 1-0:1.0: 1 port detected
[ 178.278371] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller
[ 178.284171] xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2
[ 178.294038] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[ 178.301183] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 178.308776] usb usb2: Product: xHCI Host Controller
[ 178.313902] usb usb2: Manufacturer: Linux 4.0.0-rc1-00024-g6111320 xhci-hcd
[ 178.321222] usb usb2: SerialNumber: xhci-hcd.0.auto
[ 178.329061] hub 2-0:1.0: USB hub found
[ 178.333126] hub 2-0:1.0: 1 port detected
[ 178.567585] dwc3 48890000.usb: usb_otg_start_host 0
[ 178.572707] xhci-hcd xhci-hcd.0.auto: remove, state 4
[ 178.578064] usb usb2: USB disconnect, device number 1
[ 178.586565] xhci-hcd xhci-hcd.0.auto: USB bus 2 deregistered
[ 178.592585] xhci-hcd xhci-hcd.0.auto: remove, state 1
[ 178.597924] usb usb1: USB disconnect, device number 1
[ 178.603248] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[ 190.597337] INFO: task kworker/u4:0:6 blocked for more than 10 seconds.
[ 190.604273] Not tainted 4.0.0-rc1-00024-g6111320 #1058
[ 190.610228] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 190.618443] kworker/u4:0 D c05c0ac0 0 6 2 0x00000000
[ 190.625120] Workqueue: usb_otg usb_otg_work
[ 190.629533] [<c05c0ac0>] (__schedule) from [<c05c10ac>] (schedule+0x34/0x98)
[ 190.636915] [<c05c10ac>] (schedule) from [<c05c1318>] (schedule_preempt_disabled+0xc/0x10)
[ 190.645591] [<c05c1318>] (schedule_preempt_disabled) from [<c05c23d0>] (mutex_lock_nested+0x1ac/0x3fc)
[ 190.655353] [<c05c23d0>] (mutex_lock_nested) from [<c046cf8c>] (usb_disconnect+0x3c/0x208)
[ 190.664043] [<c046cf8c>] (usb_disconnect) from [<c0470cf0>] (_usb_remove_hcd+0x98/0x1d8)
[ 190.672535] [<c0470cf0>] (_usb_remove_hcd) from [<c0485da8>] (usb_otg_start_host+0x50/0xf4)
[ 190.681299] [<c0485da8>] (usb_otg_start_host) from [<c04849a4>] (otg_set_protocol+0x5c/0xd0)
[ 190.690153] [<c04849a4>] (otg_set_protocol) from [<c0484b88>] (otg_set_state+0x170/0xbfc)
[ 190.698735] [<c0484b88>] (otg_set_state) from [<c0485740>] (otg_statemachine+0x12c/0x470)
[ 190.707326] [<c0485740>] (otg_statemachine) from [<c0053c84>] (process_one_work+0x1b4/0x4a0)
[ 190.716162] [<c0053c84>] (process_one_work) from [<c00540f8>] (worker_thread+0x154/0x44c)
[ 190.724742] [<c00540f8>] (worker_thread) from [<c0058f88>] (kthread+0xd4/0xf0)
[ 190.732328] [<c0058f88>] (kthread) from [<c000e810>] (ret_from_fork+0x14/0x24)
[ 190.739898] 5 locks held by kworker/u4:0/6:
[ 190.744274] #0: ("%s""usb_otg"){.+.+.+}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
[ 190.752799] #1: ((&otgd->work)){+.+.+.}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
[ 190.761326] #2: (&otgd->fsm.lock){+.+.+.}, at: [<c048562c>] otg_statemachine+0x18/0x470
[ 190.769934] #3: (usb_bus_list_lock){+.+.+.}, at: [<c0470ce8>] _usb_remove_hcd+0x90/0x1d8
[ 190.778635] #4: (&dev->mutex){......}, at: [<c046cf8c>] usb_disconnect+0x3c/0x208
[ 190.786700] INFO: task kworker/1:0:14 blocked for more than 10 seconds.
[ 190.793633] Not tainted 4.0.0-rc1-00024-g6111320 #1058
[ 190.799567] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 190.807783] kworker/1:0 D c05c0ac0 0 14 2 0x00000000
[ 190.814457] Workqueue: usb_hub_wq hub_event
[ 190.818866] [<c05c0ac0>] (__schedule) from [<c05c10ac>] (schedule+0x34/0x98)
[ 190.826252] [<c05c10ac>] (schedule) from [<c05c4e40>] (schedule_timeout+0x13c/0x1ec)
[ 190.834377] [<c05c4e40>] (schedule_timeout) from [<c05c19f0>] (wait_for_common+0xbc/0x150)
[ 190.843062] [<c05c19f0>] (wait_for_common) from [<bf068a3c>] (xhci_setup_device+0x164/0x5cc [xhci_hcd])
[ 190.852986] [<bf068a3c>] (xhci_setup_device [xhci_hcd]) from [<c046b7f4>] (hub_port_init+0x3f4/0xb10)
[ 190.862667] [<c046b7f4>] (hub_port_init) from [<c046eb64>] (hub_event+0x704/0x1018)
[ 190.870704] [<c046eb64>] (hub_event) from [<c0053c84>] (process_one_work+0x1b4/0x4a0)
[ 190.878919] [<c0053c84>] (process_one_work) from [<c00540f8>] (worker_thread+0x154/0x44c)
[ 190.887503] [<c00540f8>] (worker_thread) from [<c0058f88>] (kthread+0xd4/0xf0)
[ 190.895076] [<c0058f88>] (kthread) from [<c000e810>] (ret_from_fork+0x14/0x24)
[ 190.902650] 5 locks held by kworker/1:0/14:
[ 190.907023] #0: ("usb_hub_wq"){.+.+.+}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
[ 190.915454] #1: ((&hub->events)){+.+.+.}, at: [<c0053bf4>] process_one_work+0x124/0x4a0
[ 190.924070] #2: (&dev->mutex){......}, at: [<c046e490>] hub_event+0x30/0x1018
[ 190.931768] #3: (&port_dev->status_lock){+.+.+.}, at: [<c046eb50>] hub_event+0x6f0/0x1018
[ 190.940558] #4: (&bus->usb_address0_mutex){+.+.+.}, at: [<c046b458>] hub_port_init+0x58/0xb10
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the xHCI host controller has died (ie, device removed) or suffered
other serious fatal error (STS_FATAL), then xhci_irq should handle this
condition with IRQ_HANDLED instead of -ESHUTDOWN.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Isoc TDs usually consist of one TRB, sometimes two. When all goes well we
receive only one success event for a TD, and move the dequeue pointer to
the next TD.
This fails if the TD consists of two TRBs and we get a transfer error
on the first TRB, we will then see two events for that TD.
Fix this by making sure the event we get is for the last TRB in that TD
before moving the dequeue pointer to the next TD. This will resolve some
of the uvc and dvb issues with the
"ERROR Transfer event TRB DMA ptr not part of current TD" error message
Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This part 2 pull request contains only the patches
which make sure everybody on linux uses the same
resume timeout value.
Signed-off-by: Felipe Balbi <balbi@ti.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVJEivAAoJEIaOsuA1yqREV6QP/1oJVdBxzOeQVXyGZsZT4k0E
T8SKmTIJmaCdSjgazSXAhqF7IR992sTM7urHZ2hww3tftODNVHvfFkgDlSguHwkr
GV95YpNv99ash7spjod3xVOt8lDvrt6FHs7Foj4b0yQZ1yXUZsOH4j8G4gMYOgTd
bq1mO2/NaaM7xiddzPMCGpSkL6NcsWrurJ8JiOlBXaUIkSR+Cwe9+8vzEDrxfSM+
6yBTLQCdvR0ammqV0YMNsF9dGLO8yiDotC6b104i3yCqpAZnyj0RJQ3riH7ESxsG
eSKkCXXGY0OWDGDcnWTskYEHJ7iDfg5MrlzBJyneMjauGKNeIoJRRYWP/f6NSV9m
GHxRV1ITKC61Dtt3w+Oy2a16UpAbq7/w1q/SkQl4YtJ8l0P1PUb+TfOZpTOsCLLF
cLXUKJufGWhuGbo+e2m6GizNCsDi5LlfdZcjIjVKRXpyiZI0peqyGj16mTm70UUY
/A9nsLtE7c6bqHMTcFQFRx4d2iHmkilyzQqOvx4ulo1rovNMafgmZJF7Kq0LY7ZO
Uj4s2rEAjqbELbOiEIf8HzGP+J/QuECG4SHyPEQsiG0drYyoCOqGR1S3kV7aJgFz
krl4zTvXR8EOk2IpPa8PDzmMr7YCYzlXOxzmSoJiZhqqpLH7cHYkDiRTmyxn+eB3
lEjEzBA8J9CG4iRIFjk3
=RqsO
-----END PGP SIGNATURE-----
Merge tag 'usb-for-v4.1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-testing
Felipe writes:
usb: generic resume timeout for v4.1
This part 2 pull request contains only the patches
which make sure everybody on linux uses the same
resume timeout value.
Signed-off-by: Felipe Balbi <balbi@ti.com>
Make sure we're using the new macro, so our
resume signaling will always pass certification.
Cc: <stable@vger.kernel.org> # v3.10+
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Some toggling operation in xHCI driver still use conditional toggling:
ring->cycle_state = (ring->cycle_state ? 0 : 1);
Use XOR to invert the cycle state instead of a conditional toggle to unify
cycle state toggling operation in xHCI driver.
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 27082e2654 ("xhci: Clear the host side toggle manually")
Turns out this fix to enable soft resetting endpoints wasn't mature enough.
It caused regression with some usb DVB-T devices and needs some more tuning
to get the endpiont ring pointers set correctly.
The original commit was tagged for stable 3.18, and should be reverted
from there as well.
Cc: stable <stable@vger.kernel.org> # v3.18
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a control transfer has a short data stage, the xHCI controller generates
two transfer events: a COMP_SHORT_TX event that specifies the untransferred
amount, and a COMP_SUCCESS event. But when the data stage is not short, only the
COMP_SUCCESS event occurs. Therefore, xhci-hcd must set urb->actual_length to
urb->transfer_buffer_length while processing the COMP_SUCCESS event, unless
urb->actual_length was set already by a previous COMP_SHORT_TX event.
The driver checks this by seeing whether urb->actual_length == 0, but this alone
is the wrong test, as it is entirely possible for a short transfer to have an
urb->actual_length = 0.
This patch changes the xhci driver to rely on a new td->urb_length_set flag,
which is set to true when a COMP_SHORT_TX event is received and the URB length
updated at that stage.
This fixes a bug which affected the HSO plugin, which relies on URBs with
urb->actual_length == 0 to halt re-submitting the RX URB in the control
endpoint.
Cc: <stable@vger.kernel.org>
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Main benefit of this is to get xhci connected USB scanners to work.
Some devices use a clear endpoint halt request as a 'soft reset' even if
the endpoint is not halted. This will clear the toggle and sequence on the
device side. xHCI however refuses to reset a non-halted endpoint, so instead
we need to issue a configure endpoint command on xHCI to clear its host side
toggle and sequence, and get it in sync with the device side.
Tested-by: Mike Mammarella <mikem@crystalorb.net>
Cc: <stable@vger.kernel.org> # v3.18
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helps debugging to know the unhandled event type.
Also make the debug message grepable
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some parameters are not used by functions in xhci-mem.c, just
remove it.
Changes compared to v1:
- Rebase to the latest usb-next branch
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Parameter 'xhci' is no longer be used in function xhci_handshake(),
just remove it.
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Endpoints halted on errors, and endpoints stopped manually both used
the same ep->stopped_td to store the halted or stopped td. this causes
confusion and possible races.
There is no longer a need to use the ep->stopped_td variable to store
the halted TD. A halted endpoint is handled immediately and we can pass
it to the handling function directly.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a device is halted and reuturns a STALL, then the halted endpoint
needs to be cleared both on the host and device side. The host
side halt is cleared by issueing a xhci reset endpoint command. The device side
is cleared with a ClearFeature(ENDPOINT_HALT) request, which should
be issued by the device driver if a URB reruen -EPIPE.
Previously we cleared the host side halt after the device side was cleared.
To make sure the host side halt is cleared in time we want to issue the
reset endpoint command immedialtely when a STALL status is encountered.
Otherwise we end up not following the specs and not returning -EPIPE
several times in a row when trying to transfer data to a halted endpoint.
Fixes: bcef3fd (USB: xhci: Handle errors that cause endpoint halts.)
Cc: <stable@vger.kernel.org> # v2.6.33+
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A halted endpoint ring must first be reset, then move the ring
dequeue pointer past the problematic TRB. If we start the ring too
early after reset, but before moving the dequeue pointer we
will end up executing the same problematic TRB again.
As we always issue a set transfer dequeue command after a reset
endpoint command we can skip starting endpoint rings at reset endpoint
command completion.
Without this fix we end up trying to handle the same faulty TD for
contol endpoints. causing timeout, and failing testusb ctrl_out write
tests.
Fixes: e9df17e (USB: xhci: Correct assumptions about number of rings per endpoint.)
Cc: <stable@vger.kernel.org> #v2.6.35
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lately (with the use of uas / bulk-streams) we have been seeing several
cases where this error triggers (which should never happen).
Add some extra logging to make debugging these errors easier.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Even though a Set TR deq ptr command operates on a ring, and an endpoint
can have multiple rings, we can have only one Set TR deq ptr command pending.
When an endpoint with streams halts or is stopped to unlink urbs, there
will only be at most one ring active / one td being executed (the td
stopped_td points to).
So when we reset the endpoint (for a halt), or the stop command completes, we
will queue one Set TR deq ptr command at most, cancelled urbs on other stream
rings then the one being executed will have there trbs turned to nops, and
once the hcd gets around to execute that stream ring they will be simply
skipped.
So the SET_DEQ_PENDING flag in the endpoint is sufficient protection against
starting the endpoing before all stream rings are cleaned up.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Even if the stream for which the command was intended has been freed in the
mean time. This ensures that things start rolling again after an unlink / halt.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
xhci_queue_new_dequeue_state is the only caller of queue_set_tr_deq
and queue_set_tr_deq checks for SET_DEQ_PENDING, where as
xhci_queue_new_dequeue_state sets it which is inconsistent.
Simply fold the 2 into one is a nice cleanup and fixes the inconsistency.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are multiple reasons for this:
1) This fixes a missing check for xhci_alloc_command failing in
xhci_handle_cmd_stop_ep()
2) This adds a warning when we cannot set the new dequeue state because of
xhci_alloc_command failing
3) It puts the allocation of the command after the sanity checks in
queue_set_tr_deq(), avoiding leaking the command if those fail
4) Since queue_set_tr_deq now owns the command it can free it if queue_command
fails
5) It reduces code duplication
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we manually need to move the TR dequeue pointer we need to set the
correct cycle bit as well. Previously we used the trb pointer from the
last event received as a base, but this was changed in
commit 1f81b6d22a ("usb: xhci: Prefer endpoint context dequeue pointer")
to use the dequeue pointer from the endpoint context instead
It turns out some Asmedia controllers advance the dequeue pointer
stored in the endpoint context past the event triggering TRB, and
this messed up the way the cycle bit was calculated.
Instead of adding a quirk or complicating the already hard to follow cycle bit
code, the whole cycle bit calculation is now simplified and adapted to handle
event and endpoint context dequeue pointer differences.
Fixes: 1f81b6d22a ("usb: xhci: Prefer endpoint context dequeue pointer")
Reported-by: Maciej Puzio <mx34567@gmail.com>
Reported-by: Evan Langlois <uudruid74@gmail.com>
Reviewed-by: Julius Werner <jwerner@chromium.org>
Tested-by: Maciej Puzio <mx34567@gmail.com>
Tested-by: Evan Langlois <uudruid74@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When using a Renesas uPD720231 chipset usb-3 uas to sata bridge with a 120G
Crucial M500 ssd, model string: Crucial_ CT120M500SSD1, together with a
the integrated Intel xhci controller on a Haswell laptop:
00:14.0 USB controller [0c03]: Intel Corporation 8 Series USB xHCI HC [8086:9c31] (rev 04)
The following error gets logged to dmesg:
xhci error: Transfer event TRB DMA ptr not part of current TD
Treating COMP_STOP the same as COMP_STOP_INVAL when no event_seg gets found
fixes this.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The transfer burst count (TBC) field in xhci 1.0 hosts should be set
to the number of bursts needed to transfer all packets in a isoc TD.
Supported values are 0-2 (1 to 3 bursts per service interval).
Formula for TBC calculation is given in xhci spec section 4.11.2.3:
TBC = roundup( Transfer Descriptor Packet Count / Max Burst Size +1 ) - 1
This patch should be applied to stable kernels since 3.0 that contain
the commit 5cd43e33b9
"xhci 1.0: Set transfer burst count field."
Cc: stable@vger.kernel.org # 3.0
Suggested-by: ShiChun Ma <masc2008@qq.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Command completion events normally include command completion status,
SLOT_ID, and a pointer to the original command. Reset device command
completion SLOT_ID may be zero according to xhci specs 4.6.11.
VIA controllers set the SLOT_ID to zero, triggering a WARN_ON in the
command completion handler.
Use the SLOT ID found from the original command instead.
This patch should be applied to stable kernels since 3.13 that contain
the commit 20e7acb13f
"xhci: use completion event's slot id rather than dig it out of command"
Cc: stable@vger.kernel.org # 3.13
Reported-by: Saran Neti <sarannmr@gmail.com>
Tested-by: Saran Neti <sarannmr@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use one timer to control command timeout.
start/kick the timer every time a command is completed and a
new command is waiting, or a new command is added to a empty list.
If the timer runs out, then tag the current command as "aborted", and
start the xhci command abortion process.
Previously each function that submitted a command had its own timer.
If that command timed out, a new command structure for the
command was created and it was put on a cancel_cmd_list list,
then a pci write to abort the command ring was issued.
when the ring was aborted, it checked if the current command
was the one to be canceled, later when the ring was stopped the
driver got ownership of the TRBs in the command ring,
compared then to the TRBs in the cancel_cmd_list,
and turned them into No-ops.
Now, instead, at timeout we tag the status of the command in the
command queue to be aborted, and start the ring abortion.
Ring abortion stops the command ring and gives control of the
commands to us.
All the aborted commands are now turned into No-ops.
If the ring is already stopped when the command times outs its not possible
to start the ring abortion, in this case the command is turnd to No-op
right away.
All these changes allows us to remove the entire cancel_cmd_list code.
The functions waiting for a command to finish no longer have their own timeouts.
They will wait either until the command completes normally,
or until the whole command abortion is done.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Remove the per-device command list and handle_cmd_in_cmd_wait_list()
and use the completion and status variables found in the
command structure in the global command list.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Create a list to store command structures, add a structure to it every time
a command is submitted, and remove it from the list once we get a
command completion event matching the command.
Callers that wait for completion will free their command structures themselves.
The other command structures are freed in the command completion event handler.
Also add a check that prevents queuing commands if host is dying
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To create a global command queue we require that each command put on the
command ring is submitted with a command structure.
Functions that queue commands and wait for completion need to allocate a command
before submitting it, and free it once completed. The following command queuing
functions need to be modified.
xhci_configure_endpoint()
xhci_address_device()
xhci_queue_slot_control()
xhci_queue_stop_endpoint()
xhci_queue_new_dequeue_state()
xhci_queue_reset_ep()
xhci_configure_endpoint()
xhci_configure_endpoint() could already be called with a command structure,
and only xhci_check_maxpacket and xhci_check_bandwidth did not do so. These
are changed and a command structure is now required. This change also simplifies
the configure endpoint command completion handling and the "goto bandwidth_change"
handling code can be removed.
In some cases the command queuing function is called in interrupt context.
These commands needs to be allocated atomically, and they can't wait for
completion. These commands will in this patch be freed directly after queuing,
but freeing will be moved to the command completion event handler in a later
patch once we get the global command queue up.(Just so that we won't leak
memory in the middle of the patch set)
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have observed a rare cycle state desync bug after Set TR Dequeue
Pointer commands on Intel LynxPoint xHCs (resulting in an endpoint that
doesn't fetch new TRBs and thus an unresponsive USB device). It always
triggers when a previous Set TR Dequeue Pointer command has set the
pointer to the final Link TRB of a segment, and then another URB gets
enqueued and cancelled again before it can be completed. Further
investigation showed that the xHC had returned the Link TRB in the TRB
Pointer field of the Transfer Event (CC == Stopped -- Length Invalid),
but when xhci_find_new_dequeue_state() later accesses the Endpoint
Context's TR Dequeue Pointer field it is set to the first TRB of the
next segment.
The driver expects those two values to be the same in this situation,
and uses the cycle state of the latter together with the address of the
former. This should be fine according to the XHCI specification, since
the endpoint ring should be stopped when returning the Transfer Event
and thus should not advance over the Link TRB before it gets restarted.
However, real-world XHCI implementations apparently don't really care
that much about these details, so the driver should follow a more
defensive approach to try to work around HC spec violations.
This patch removes the stopped_trb variable that had been used to store
the TRB Pointer from the last Transfer Event of a stopped TRB. Instead,
xhci_find_new_dequeue_state() now relies only on the Endpoint Context,
requiring a small amount of additional processing to find the virtual
address corresponding to the TR Dequeue Pointer. Some other parts of the
function were slightly rearranged to better fit into this model.
This patch should be backported to kernels as old as 2.6.31 that contain
the commit ae63674714 "USB: xhci: URB
cancellation support."
Signed-off-by: Julius Werner <jwerner@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the host controller stops responding to commands, we need to kill all
the URBs that were queued to all endpoints. The current code would only
kill URBs that had been queued to the endpoint rings. ep->ring is set
to NULL if streams has been enabled for the endpoint, which means URBs
submitted with a non-zero stream_id would never get killed. Fix this.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
In preparation for fixing this function for streams endpoints, refactor
code in the command watchdog timeout function into two new functions.
One kills all URBs on a ring (either stream or endpoint), the other
kills all URBs associated with an endpoint. Fix a split string while
we're at it.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This fixes TR dequeue validation failing on Intel XHCI controllers with the
following warning:
Mismatch between completed Set TR Deq Ptr command & xHCI internal state.
Interestingly enough reading the deq ptr from the ep ctx after a
TR Deq Ptr command does work on a Nec XHCI controller, it seems the Nec
writes the ptr to both the ep and stream contexts when streams are used.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Nec XHCI controllers don't seem to care, but without this Intel XHCI
controllers reject Set TR dequeue commands with a COMP_TRB_ERR, leading
to the following warning:
WARN Set TR Deq Ptr cmd invalid because of stream ID configuration
And very shortly after this the system completely freezes.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This changes debug messages and warnings in xhci-ring.c
to be on a single line so grep can find them. grep must
have precedence over the 80 column limit.
[Sarah fixed two checkpatch.pl issues with split lines
introduced by this commit.]
Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This reverts commit 35773dac5f. It's a
hack that caused regressions in the usb-storage and userspace USB
drivers that use usbfs and libusb. Commit 70cabb7d992f "xhci 1.0: Limit
arbitrarily-aligned scatter gather." should fix the issues seen with the
ax88179_178a driver on xHCI 1.0 hosts, without causing regressions.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: stable@vger.kernel.org # 3.12
This reverts commit d6c9ea9069.
We are ripping out commit 35773dac5f "usb:
xhci: Link TRB must not occur within a USB payload burst" because it's a
hack that caused regressions in the usb-storage and userspace USB
drivers that use usbfs and libusb. This commit attempted to fix the
issues with that patch.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: stable@vger.kernel.org # 3.12
This reverts commit e8b373326d. Many xHCI
host controllers can only handle 32-bit addresses, and writing 64-bits
at a time causes them to fail. Reading 64-bits at a time may also cause
them to return 0xffffffff, so revert this commit as well.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This reverts commit 7dd09a1af2.
Many xHCI host controllers can only handle 32-bit addresses, and writing
64-bits at a time causes them to fail. Rafał reports that USB devices
simply do not enumerate, and reverting this patch helps. Branimir
reports that his host controller doesn't respond to an Enable Slot
command and dies:
[ 75.576160] xhci_hcd 0000:03:00.0: Timeout while waiting for a slot
[ 88.991634] xhci_hcd 0000:03:00.0: Stopped the command ring failed, maybe the host is dead
[ 88.991748] xhci_hcd 0000:03:00.0: Abort command ring failed
[ 88.991845] xhci_hcd 0000:03:00.0: HC died; cleaning up
[ 93.985489] xhci_hcd 0000:03:00.0: Timeout while waiting for a slot
[ 93.985494] xhci_hcd 0000:03:00.0: Abort the command ring, but the xHCI is dead.
[ 98.982586] xhci_hcd 0000:03:00.0: Timeout while waiting for a slot
[ 98.982591] xhci_hcd 0000:03:00.0: Abort the command ring, but the xHCI is dead.
[ 103.979696] xhci_hcd 0000:03:00.0: Timeout while waiting for a slot
[ 103.979702] xhci_hcd 0000:03:00.0: Abort the command ring, but the xHCI is dead
Signed-off-by: Sarah Sharp <sarah.a.sharp@intel.com>
Reported-by: Rafał Miłecki <zajec5@gmail.com>
Reported-by: Branimir Maksimovic <branimir.maksimovic@gmail.com>
Cc: Xenia Ragiadakou <burzalodowa@gmail.com>
Currently prepare_ring() returns -ENOMEM if the urb won't fit into a
single ring segment. usb_sg_wait() treats this error as a temporary
condition and will keep retrying until something else goes wrong.
The number of retries should be limited in usb_sg_wait(), but also
prepare_ring() should not return an error code that suggests it might
be worth retrying. Change it to -EINVAL.
Reported-by: jidanni@jidanni.org
References: http://bugs.debian.org/733907
Fixes: 35773dac5f ('usb: xhci: Link TRB must not occur within a USB payload burst')
Cc: stable <stable@vger.kernel.org> # 3.12
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Happy Holidays, Greg!
Here's four patches to be queued to usb-next for 3.14.
One adds a module parameter to the xHCI driver to allow users to enable
xHCI quirks without recompiling their kernel, which you've already said
is fine. The second patch is a bug fix for new usbtest code that's only
in usb-next. The third patch is simple cleanup.
The last patch is a non-urgent bug fix for xHCI platform devices. The
bug has been in the code since 3.9. You've been asking me to hold off
on non-urgent bug fixes after -rc4/-rc5, so it can go into usb-next, and
be backported to stable once 3.14 is out.
These have all been tested over the past week. I did run across one
oops, but it turned out to be a bug in 3.12, and therefore not related
to any of these patches.
Please queue these for usb-next and 3.14.
Thanks,
Sarah Sharp
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJStMtLAAoJEBMGWMLi1Gc5uxMQAIXOL38n53yfNETU+DJMLaWB
+DY/RJjJ0zAZuRL9emOvR+BYEsQpd2q6aT3F2MryjJT/dhEKwJgHLTLOv0EQEWS7
dfLXX/NKNI+iRNyJkSATriytWMmMwv13kTXkreu99ZAw2PWWt7mM4BPHaWJ6sz5G
MjWS3v6LE59zjZatdzZyg4wlgUeNmO8cc4CV8YSqv0rFiVrKBL8IJsxwgQkWXI+2
J19hZcEpgeVMPo7aPlkMGoS1Ze9SxTviALBVLBWwwR028UfWCELFWsSFGVfJ6RgJ
/3DsE+jhQBn/1Y1o6hrlU/arDj7N/iJ3Gz5Ru5l+BtJw56fdaI/ToaHzzIDarONE
DGLiziDIknlSPsuX6X81kqXqROz1Zt624aqLipvqGCk0FMrLZz5BMaPEslkmW4Wb
/TQRX6KnTVzK4uMjv5yNVaGtnyoTStJeRE7dIF4/9e2YDLo4SCmm2Y9necr4C3Ls
7FzT8t6m9F74ZHAkmPpKFlEkYTYOy3yv/KBlhHF/OVFo9FAxAwUEY69QVrvhaHVX
GhLANc4NOuJb1eIwQarVkub2+lLI3N9zwEWliepKKUTQ8OTFDkDFM/+bwDHd4RzD
PO23wuFHVLj9N2BZbIAV2OkDyJLU2FOl+ZiEG6NUDXssihiZj4AVfIRiOU1c3EMc
g27X+N4FI2fnpqvAYq/a
=PRbx
-----END PGP SIGNATURE-----
Merge tag 'for-usb-next-2013-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci into usb-next
Sarah writes:
xhci: Cleanups, non-urgent fixes for 3.14.
Happy Holidays, Greg!
Here's four patches to be queued to usb-next for 3.14.
One adds a module parameter to the xHCI driver to allow users to enable
xHCI quirks without recompiling their kernel, which you've already said
is fine. The second patch is a bug fix for new usbtest code that's only
in usb-next. The third patch is simple cleanup.
The last patch is a non-urgent bug fix for xHCI platform devices. The
bug has been in the code since 3.9. You've been asking me to hold off
on non-urgent bug fixes after -rc4/-rc5, so it can go into usb-next, and
be backported to stable once 3.14 is out.
These have all been tested over the past week. I did run across one
oops, but it turned out to be a bug in 3.12, and therefore not related
to any of these patches.
Please queue these for usb-next and 3.14.
Thanks,
Sarah Sharp
This patch remove unused variable 'addr' in inc_deq() and inc_enq().
Signed-off-by: Lin Wang <lin.x.wang@intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Change the default enumeration scheme for xhci attached non-SuperSpeed
devices from:
Reset
SetAddress [xhci address-device BSR = 0]
GetDescriptor(8)
GetDescriptor(18)
...to:
Reset
[xhci address-device BSR = 1]
GetDescriptor(64)
Reset
SetAddress [xhci address-device BSR = 0]
GetDescriptor(18)
...as some devices misbehave when encountering a SetAddress command
prior to GetDescriptor. There are known legacy devices that require
this scheme, but testing has found at least one USB3 device that fails
enumeration when presented with this ordering. For now, follow the ehci
case and enable 'new scheme' by default for non-SuperSpeed devices.
To support this enumeration scheme on xhci the AddressDevice operation
needs to be performed twice. The first instance of the command enables
the HC's device and slot context info for the device, but omits sending
the device a SetAddress command (BSR == block set address request).
Then, after GetDescriptor completes, follow up with the full
AddressDevice+SetAddress operation.
As mentioned before, this ordering of events with USB3 devices causes an
extra state transition to be exposed to xhci. Previously USB3 devices
would transition directly from 'enabled' to 'addressed' and never need
to underrun responses to 'get descriptor'. We do see the 64-byte
descriptor fetch the correct data, but the following 18-byte descriptor
read after the reset gets:
bLength = 0
bDescriptorType = 0
bcdUSB = 0
bDeviceClass = 0
bDeviceSubClass = 0
bDeviceProtocol = 0
bMaxPacketSize0 = 9
instead of:
bLength = 12
bDescriptorType = 1
bcdUSB = 300
bDeviceClass = 0
bDeviceSubClass = 0
bDeviceProtocol = 0
bMaxPacketSize0 = 9
which results in the discovery process looping until falling back to
'old scheme' enumeration.
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: David Moore <david.moore@gmail.com>
Suggested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Function xhci_write_64() is used to write 64bit xHC registers residing in MMIO.
On 32bit systems, xHC registers need to be written with 32bit accesses by
writing first the lower 32bits and then the higher 32bits. The header file
asm-generic/io-64-nonatomic-lo-hi.h ensures that on 32bit systems writeq() will
will write 64bit registers in 32bit chunks with low-high order.
Replace all calls to xhci_write_64() with calls to writeq().
This is done to reduce code duplication since 64bit low-high write logic
is already implemented and to take advantage of inherent "atomic" 64bit
write operations on 64bit systems.
Signed-off-by: Xenia Ragiadakou <burzalodowa@gmail.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Function xhci_read_64() is used to read 64bit xHC registers residing in MMIO.
On 32bit systems, xHC registers need to be read with 32bit accesses by
reading first the lower 32bits and then the higher 32bits.
Replace all calls to xhci_read_64() with calls to readq() and include
asm-generic/io-64-nonatomic-lo-hi.h header file, so that if the system
is not 64bit, readq() will read registers in 32bit chunks with low-high order.
This is done to reduce code duplication since 64bit low-high read logic
is already implemented and to take advantage of inherent "atomic" 64bit
read operations on 64bit systems.
Signed-off-by: Xenia Ragiadakou <burzalodowa@gmail.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Function xhci_writel() is used to write a 32bit value in xHC registers residing
in MMIO address space. It takes as first argument a pointer to the xhci_hcd
although it does not use it. xhci_writel() internally simply calls writel().
This creates an illusion that xhci_writel() is an xhci specific function that
has to be called in a context where a pointer to xhci_hcd is available.
Remove xhci_writel() wrapper function and replace its calls with calls to
writel() to make the code more straight-forward.
Signed-off-by: Xenia Ragiadakou <burzalodowa@gmail.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>