Add more debugging to the irq handler, slot context initialization, ring
operations, URB cancellation, and MMIO writes.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The Event Handler Busy bit in the event ring dequeue pointer is write 1 to
clear. Fix the interrupt service routine to clear that bit after the
event handler has run.
xhci_set_hc_event_deq() is designed to update the event ring dequeue pointer
without changing any of the four reserved bits in the lower nibble. The event
handler busy (EHB) bit is write one to clear, so the new value must always
contain a zero in that bit in order to preserve the EHB value.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There are several xHCI data structures that use two 32-bit fields to
represent a 64-bit address. Since some architectures don't support 64-bit
PCI writes, the fields need to be written in two 32-bit writes. The xHCI
specification says that if a platform is incapable of generating 64-bit
writes, software must write the low 32-bits first, then the high 32-bits.
Hardware that supports 64-bit addressing will wait for the high 32-bit
write before reading the revised value, and hardware that only supports
32-bit writes will ignore the high 32-bit write.
Previous xHCI code represented 64-bit addresses with two u32 values. This
lead to buggy code that would write the 32-bits in the wrong order, or
forget to write the upper 32-bits. Change the two u32s to one u64 and
create a function call to write all 64-bit addresses in the proper order.
This new function could be modified in the future if all platforms support
64-bit writes.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The xHCI functions to queue an URB onto the hardware rings must be called
with the xhci spinlock held. Those functions will allocate memory, and
take a gfp_t memory flags argument. We must pass them the GFP_ATOMIC
flag, since we don't want the memory allocation to attempt to sleep while
waiting for more memory to become available.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
When an endpoint on a device under an xHCI host controller stalls, the
host controller driver must let the hardware know that the USB core has
successfully cleared the halt condition. The HCD submits a Reset Endpoint
Command, which will clear the toggle bit for USB 2.0 devices, and set the
sequence number to zero for USB 3.0 devices.
The xHCI urb_enqueue will accept new URBs while the endpoint is halted,
and will queue them to the hardware rings. However, the endpoint doorbell
will not be rung until the Reset Endpoint Command completes.
Don't queue a reset endpoint command for root hubs. khubd clears halt
conditions on the roothub during the initialization process, but the roothub
isn't a real device, so the xHCI host controller doesn't need to know about the
cleared halt.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Narrow down time spent holding the xHCI spinlock so that it's only used to
protect the xHCI rings, not as mutual exclusion. Stop allocating memory
while holding the spinlock and calling xhci_alloc_virt_device() and
xhci_endpoint_init().
The USB core should have locking in it to prevent device state to be
manipulated by more than one kernel thread. E.g. you can't free a device
while you're in the middle of setting a new configuration. So removing
the locks from the sections where xhci_alloc_dev() and
xhci_reset_bandwidth() touch xHCI's representation of the device should be
OK.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Mask off the lower 16 bits of the interrupt control register, instead of
masking off the upper 16 bits. The interrupt moderation interval field is
the lower 16 bytes, and is set to 0x4000 (1ms) by default. The previous
code was adding 40 us to the default value, instead of setting it to 40
us. This makes performance really bad.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The packed attribute allows gcc to muck with the alignment of data
structures, which may lead to byte-wise writes that break atomicity of
writes. Packed should only be used when the compile may add undesired
padding to the structure. Each element of the structure will be aligned
by C based on its size and the size of the elements around it. E.g. a u64
would be aligned on an 8 byte boundary, the next u32 would be aligned on a
four byte boundary, etc.
Since most of the xHCI structures contain only u32 bit values, removing
the packed attribute for them should be harmless. (A future patch will
change some of the twin 32-bit address fields to one 64-bit field, but all
those places have an even number of 32-bit fields before them, so the
alignment should be correct.) Add BUILD_BUG_ON statements to check that
the compiler doesn't add padding to the data structures that have a
hardware-defined layout.
While we're modifying the registers, change the name of intr_reg to
xhci_intr_reg to avoid global conflicts.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make sure the error path in xhci_urb_enqueue() releases the spinlock
before it returns. Reported by Oliver in
http://marc.info/?l=linux-usb&m=124091637311832&w=2
Reported-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Drop spinlock in xhci_irq() error path.
This fixes the issue reported by Oliver Neukum on this thread:
http://marc.info/?l=linux-usb&m=124090924401444&w=2
Remove unnecessary register read reported by Viral Mehta:
http://marc.info/?l=linux-usb&m=124091326007398&w=2
Reported-by: Oliver Neukum <oliver@neukum.org>
Reported-by: Viral Mehta <viral.mehta@einfochips.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Make all globally visible functions start with xhci_ and mark functions as
static if they're only called within the same C file. Fix some long lines
while we're at it.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The 0.95 xHCI spec says that if the xHCI HW support 64-bit addressing, you
must write the whole 64-bit address as one atomic operation, or write the
low 32 bits, and then the high 32 bits. I had the register writes
swapped in some places.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Turns out someone never built this code on a 64bit platform.
Someone owes me a beer...
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The former is way to generic for a global symbol.
Fixes this build error:
drivers/usb/built-in.o: In function `.handle_event': (.text+0x67dd0): multiple definition of `.handle_event'
drivers/pcmcia/built-in.o:(.text+0xcfcc): first defined here
drivers/usb/built-in.o: In function `handle_event': (.opd+0x5bc8): multiple definition of `handle_event'
drivers/pcmcia/built-in.o:(.opd+0xed0): first defined here
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add URB cancellation support to the xHCI host controller driver. This
currently supports cancellation for endpoints that do not have streams
enabled.
An URB is represented by a number of Transaction Request Buffers (TRBs),
that are chained together to make one (or more) Transaction Descriptors
(TDs) on an endpoint ring. The ring is comprised of contiguous segments,
linked together with Link TRBs (which may or may not be chained into a TD).
To cancel an URB, we must stop the endpoint ring, make the hardware skip
over the TDs in the URB (either by turning them into No-op TDs, or by
moving the hardware's ring dequeue pointer past the last TRB in the last
TD), and then restart the ring.
There are times when we must drop the xHCI lock during this process, like
when we need to complete cancelled URBs. We must ensure that additional
URBs can be marked as cancelled, and that new URBs can be enqueued (since
the URB completion handlers can do either). The new endpoint ring
variables cancels_pending and state (which can only be modified while
holding the xHCI lock) ensure that future cancellation and enqueueing do
not interrupt any pending cancellation code.
To facilitate cancellation, we must keep track of the starting ring
segment, first TRB, and last TRB for each URB. We also need to keep track
of the list of TDs that have been marked as cancelled, separate from the
list of TDs that are queued for this endpoint. The new variables and
cancellation list are stored in the xhci_td structure.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Allow device drivers to submit URBs to bulk endpoints on devices under an
xHCI host controller. Share code between the control and bulk enqueueing
functions when it makes sense.
To get the best performance out of bulk transfers, SuperSpeed devices must
have the bMaxBurst size copied from their endpoint companion controller
into the xHCI device context. This allows the host controller to "burst"
up to 16 packets before it has to wait for the device to acknowledge the
first packet.
The buffers in Transfer Request Blocks (TRBs) can cross page boundaries,
but they cannot cross 64KB boundaries. The buffer must be broken into
multiple TRBs if a 64KB boundary is crossed.
The sum of buffer lengths in all the TRBs in a Transfer Descriptor (TD)
cannot exceed 64MB. To work around this, the enqueueing code must enqueue
multiple TDs. The transfer event handler may incorrectly give back the
URB in this case, if it gets a transfer event that points somewhere in the
first TD. FIXME later.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since the xHCI host controller hardware (xHC) has an internal schedule, it
needs a better representation of what devices are consuming bandwidth on
the bus. Each device is represented by a device context, with data about
the device, endpoints, and pointers to each endpoint ring.
We need to update the endpoint information for a device context before a
new configuration or alternate interface setting is selected. We setup an
input device context with modified endpoint information and newly
allocated endpoint rings, and then submit a Configure Endpoint Command to
the hardware.
The host controller can reject the new configuration if it exceeds the bus
bandwidth, or the host controller doesn't have enough internal resources
for the configuration. If the command fails, we still have the older
device context with the previous configuration. If the command succeeds,
we free the old endpoint rings.
The root hub isn't a real device, so always say yes to any bandwidth
changes for it.
The USB core will enable, disable, and then enable endpoint 0 several
times during the initialization sequence. The device will always have an
endpoint ring for endpoint 0 and bandwidth allocated for that, unless the
device is disconnected or gets a SetAddress 0 request. So we don't pay
attention for when xhci_check_bandwidth() is called for a re-add of
endpoint 0.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Allow device drivers to enqueue URBs to control endpoints on devices under
an xHCI host controller. Each control transfer is represented by a
series of Transfer Descriptors (TDs) written to an endpoint ring. There
is one TD for the Setup phase, (optionally) one TD for the Data phase, and
one TD for the Status phase.
Enqueue these TDs onto the endpoint ring that represents the control
endpoint. The host controller hardware will return an event on the event
ring that points to the (DMA) address of one of the TDs on the endpoint
ring. If the transfer was successful, the transfer event TRB will have a
completion code of success, and it will point to the Status phase TD.
Anything else is considered an error.
This should work for control endpoints besides the default endpoint, but
that hasn't been tested.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
xHCI needs to get a "Slot ID" from the host controller and allocate other
data structures for every USB device. Make usb_alloc_dev() and
usb_release_dev() allocate and free these device structures. After
setting up the xHC device structures, usb_alloc_dev() must wait for the
hardware to respond to an Enable Slot command. usb_alloc_dev() fires off
a Disable Slot command and does not wait for it to complete.
When the USB core wants to choose an address for the device, the xHCI
driver must issue a Set Address command and wait for an event for that
command.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add functionality for getting port status and hub descriptor for xHCI root
hubs. This is WIP because the USB 3.0 hub descriptor is different from
the USB 2.0 hub descriptor. For now, we lie about the root hub descriptor
because the changes won't effect how the core talks to the root hub.
Later we will need to add the USB 3.0 hub descriptor for real hubs, and
this code might change.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
xHCI host controllers can optionally implement a no-op test. This
simple test ensures the OS has correctly setup all basic data structures
and can correctly respond to interrupts from the host controller
hardware.
There are two rings exercised by the no-op test: the command ring, and
the event ring.
The host controller driver writes a no-op command TRB to the command
ring, and rings the doorbell for the command ring (the first entry in
the doorbell array). The hardware receives this event, places a command
completion event on the event ring, and fires an interrupt.
The host controller driver sees the interrupt, and checks the event ring
for TRBs it can process, and sees the command completion event. (See
the rules in xhci-ring.c for who "owns" a TRB. This is a simplified set
of rules, and may not contain all the details that are in the xHCI 0.95
spec.)
A timer fires every 60 seconds to debug the state of the hardware and
command and event rings. This timer only runs if
CONFIG_USB_XHCI_HCD_DEBUGGING is 'y'.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Allocate basic xHCI host controller data structures. For every xHC, there
is a command ring, an event ring, and a doorbell array.
The doorbell array is used to notify the host controller that work has
been enqueued onto one of the rings. The host controller driver enqueues
commands on the command ring. The HW enqueues command completion events
on the event ring and interrupts the system (currently using PCI
interrupts, although the xHCI HW will use MSI interrupts eventually).
All rings and the doorbell array must be allocated by the xHCI host
controller driver.
Each ring is comprised of one or more segments, which consists of 16-byte
Transfer Request Blocks (TRBs) that can be chained to form a Transfer
Descriptor (TD) that represents a multiple-buffer request. Segments are
linked into a ring using Link TRBs, which means they are dynamically
growable.
The producer of the ring enqueues a TD by writing one or more TRBs in the
ring and toggling the TRB cycle bit for each TRB. The consumer knows it
can process the TRB when the cycle bit matches its internal consumer cycle
state for the ring. The consumer cycle state is toggled an odd amount of
times in the ring.
An example ring (a ring must have a minimum of 16 TRBs on it, but that's
too big to draw in ASCII art):
chain cycle
bit bit
------------------------
| TD A TRB 1 | 1 | 1 |<------------- <-- consumer dequeue ptr
------------------------ | consumer cycle state = 1
| TD A TRB 2 | 1 | 1 | |
------------------------ |
| TD A TRB 3 | 0 | 1 | segment 1 |
------------------------ |
| TD B TRB 1 | 1 | 1 | |
------------------------ |
| TD B TRB 2 | 0 | 1 | |
------------------------ |
| Link TRB | 0 | 1 |----- |
------------------------ | |
| |
chain cycle | |
bit bit | |
------------------------ | |
| TD C TRB 1 | 0 | 1 |<---- |
------------------------ |
| TD D TRB 1 | 1 | 1 | |
------------------------ |
| TD D TRB 2 | 1 | 1 | segment 2 |
------------------------ |
| TD D TRB 3 | 1 | 1 | |
------------------------ |
| TD D TRB 4 | 1 | 1 | |
------------------------ |
| Link TRB | 1 | 1 |----- |
------------------------ | |
| |
chain cycle | |
bit bit | |
------------------------ | |
| TD D TRB 5 | 1 | 1 |<---- |
------------------------ |
| TD D TRB 6 | 0 | 1 | |
------------------------ |
| TD E TRB 1 | 0 | 1 | segment 3 |
------------------------ |
| | 0 | 0 | | <-- producer enqueue ptr
------------------------ |
| | 0 | 0 | |
------------------------ |
| Link TRB | 0 | 0 |---------------
------------------------
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add PCI initialization code to take control of the xHCI host controller
away from the BIOS, halt, and reset the host controller. The xHCI spec
says that BIOSes must give up the host controller within 5 seconds.
Add some host controller glue functions to handle hardware initialization
and memory allocation for the host controller. The current xHCI
prototypes use PCI interrupts, but the xHCI spec requires MSI-X
interrupts. Add code to support MSI-X interrupts, but use the PCI
interrupts for now.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>