linux/drivers/pci
Jarod Wilson 1469d17dd3 PCI: pciehp: Handle invalid data when reading from non-existent devices
It's platform-dependent, but an MMIO read to a non-existent PCI device
generally returns data with all bits set.  This happens when the host
bridge or Root Complex times out waiting for a response from the device and
fabricates return data to complete the CPU's read.
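
As a rough illustration of the all-bits-set pattern described above, the
check below treats an all-ones value as "possibly no device".  This is a
minimal user-space sketch, not code from this commit; the helper names
read16_looks_absent() and read32_looks_absent() are hypothetical.  All ones
can also be a legitimate register value, so a real driver has to decide per
register whether the test is meaningful.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel APIs): report whether a value read
 * from a device register matches the all-ones pattern that a host
 * bridge or Root Complex typically fabricates when the read times out.
 */
static bool read16_looks_absent(uint16_t val)
{
	return val == (uint16_t)~0U;	/* 0xffff */
}

static bool read32_looks_absent(uint32_t val)
{
	return val == ~(uint32_t)0;	/* 0xffffffff */
}
```

With these helpers, the MAC_TX_MODE=ffffffff value in the tg3 log further
down would be flagged by read32_looks_absent().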

One example, reported in the bugzilla below, involved this hierarchy:

  pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
  pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
  pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
  pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
  pci 0000:06:00.0: PCI bridge to [bus 07]    Thunderbolt Downstream Port
  pci 0000:07:00.0: BCM57762 NIC

Unplugging the Thunderbolt switch and the NIC below it resulted in this:

  pciehp 0000:03:03.0: Surprise Removal
  tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
  pciehp 0000:06:00.0: unloading service driver pciehp
  pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
  pciehp 0000:06:00.0: Switch interrupt received
  pciehp 0000:06:00.0: Latch open on Slot
  pciehp 0000:06:00.0: Attention button interrupt received
  pciehp 0000:06:00.0: Button pressed on Slot
  pciehp 0000:06:00.0: Presence/Notify input change
  pciehp 0000:06:00.0: Card present on Slot
  pciehp 0000:06:00.0: Power fault interrupt received
  pciehp 0000:06:00.0: Data Link Layer State change
  pciehp 0000:06:00.0: Link Up event

The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
remove methods.

Since the NIC was already gone, tg3 received 0xffffffff when it tried to
read from the device.  The resulting timeout is a tg3 issue and not of
interest here.

Similarly, since the 06:00.0 Thunderbolt switch was already gone,
pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
thought that was valid status showing that many events had happened: the
latch had been opened, the attention button had been pressed, a card was
now present, and the link was now up.  These are all wrong, of course, but
pciehp went on to try to power up and enumerate devices below the
non-existent bridge:

  pciehp 0000:06:00.0: PCI slot - powering on due to button press
  pciehp 0000:06:00.0: Surprise Insertion
  pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
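
The decoding problem can be sketched in a few lines.  The bit values below
are local copies of the Slot Status bit definitions as I understand them
from the PCIe spec and pci_regs.h; the SLTSTA_* names, the SLTSTA_EVENTS
mask, and slot_events() are made up for this example and are not the code
added by this patch.  Masking 0xffff with the event bits yields 0x11f,
which matches the "intr_loc 11f" line in the log above.

```c
#include <stdint.h>

/* Local copies of Slot Status event bits, for illustration only */
#define SLTSTA_ABP	0x0001	/* Attention Button Pressed */
#define SLTSTA_PFD	0x0002	/* Power Fault Detected */
#define SLTSTA_MRLSC	0x0004	/* MRL Sensor Changed */
#define SLTSTA_PDC	0x0008	/* Presence Detect Changed */
#define SLTSTA_CC	0x0010	/* Command Completed */
#define SLTSTA_DLLSC	0x0100	/* Data Link Layer State Changed */

#define SLTSTA_EVENTS	(SLTSTA_ABP | SLTSTA_PFD | SLTSTA_MRLSC | \
			 SLTSTA_PDC | SLTSTA_CC | SLTSTA_DLLSC)

/*
 * Hypothetical helper: return the pending event bits from a Slot Status
 * value, or 0 if the value is all ones and therefore most likely
 * fabricated for a device that is no longer there.
 */
static uint16_t slot_events(uint16_t sltsta)
{
	if (sltsta == (uint16_t)~0U)
		return 0;		/* no response from device */

	return sltsta & SLTSTA_EVENTS;
}
```

Without the all-ones test, slot_events(0xffff) would report every event at
once, which is the spurious button press, presence change, power fault, and
link-up sequence shown in the log.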

[bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-08-10 14:24:09 -05:00
..
host Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-07-01 15:19:35 -07:00
hotplug PCI: pciehp: Handle invalid data when reading from non-existent devices 2015-08-10 14:24:09 -05:00
pcie PCI/ASPM: Simplify Clock Power Management setting 2015-06-10 14:00:21 -05:00
access.c PCI: Add generic config accessors 2015-01-22 13:59:45 -06:00
ats.c PCI: Removed unused parts of Page Request Interface support 2014-01-10 14:00:47 -07:00
bus.c PCI: Add pci_bus_addr_t 2015-05-29 17:21:45 -05:00
host-bridge.c Merge branch 'pci/misc' into next 2015-04-10 08:27:18 -05:00
hotplug-pci.c PCI: Remove unnecessary __ref annotations 2014-04-29 17:36:44 -06:00
htirq.c x86/htirq: Use hierarchical irqdomain to manage Hypertransport interrupts 2015-04-24 15:36:50 +02:00
iov.c PCI: Add pcibios_iov_resource_alignment() interface 2015-03-31 13:02:36 +11:00
irq.c PCI: Fix whitespace, capitalization, and spelling errors 2013-11-14 11:28:18 -07:00
Kconfig PCI: Add pci_bus_addr_t 2015-05-29 17:21:45 -05:00
Makefile PCI: Remove PCI ioapic driver 2014-12-16 14:08:14 +01:00
msi.c PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI 2015-05-07 09:52:21 -05:00
of.c
pci-acpi.c ACPI / PM: Rework device power management to follow ACPI 6 2015-05-16 01:55:35 +02:00
pci-driver.c ACPI and power management updates for v3.20-rc1 2015-02-10 15:09:41 -08:00
pci-label.c PCI: Make a shareable UUID for PCI firmware ACPI _DSM 2015-04-08 14:39:30 -05:00
pci-stub.c PCI: Whitespace cleanup 2014-06-10 20:20:19 -06:00
pci-sysfs.c PCI: Don't read past the end of sysfs "driver_override" buffer 2015-02-24 17:35:37 -06:00
pci.c Merge branches 'pci/aspm', 'pci/enumeration', 'pci/hotplug', 'pci/misc', 'pci/msi', 'pci/resource' and 'pci/virtualization' into next 2015-06-12 15:26:45 -05:00
pci.h Merge branches 'pci/aspm', 'pci/enumeration', 'pci/hotplug', 'pci/misc', 'pci/msi', 'pci/resource' and 'pci/virtualization' into next 2015-06-12 15:26:45 -05:00
probe.c PCI: Hold pci_slot_mutex while searching bus->slots list 2015-07-30 16:19:53 -05:00
proc.c PCI: Whitespace cleanup 2014-06-10 20:20:19 -06:00
quirks.c PCI changes for the v4.2 merge window: 2015-06-23 13:41:24 -07:00
remove.c PCI: Export symbols required for loadable host driver modules 2015-04-08 14:17:10 -05:00
rom.c PCI: Fix infinite loop with ROM image of size 0 2015-01-23 17:42:59 -06:00
search.c PCI: Delete unnecessary NULL pointer checks 2014-11-10 21:02:17 -07:00
setup-bus.c PCI: Preserve resource size during alignment reordering 2015-06-01 17:56:32 -05:00
setup-irq.c PCI: Export symbols required for loadable host driver modules 2015-04-08 14:17:10 -05:00
setup-res.c PCI: Mark invalid BARs as unassigned 2015-03-12 18:52:12 -05:00
slot.c PCI: Hold pci_slot_mutex while searching bus->slots list 2015-07-30 16:19:53 -05:00
syscall.c PCI: Whitespace cleanup 2014-06-10 20:20:19 -06:00
vc.c PCI: Use dev->has_secondary_link to find downstream PCIe links 2015-05-29 15:35:26 -05:00
vpd.c
xen-pcifront.c xen: features and cleanups for 4.2-rc0 2015-07-01 11:53:46 -07:00