linux/drivers/pci/hotplug
Lukas Wunner 3e487d2e4a PCI: pciehp: Fix indefinite wait on sysfs requests
David Hoyer reports that powering pciehp slots up or down via sysfs may
hang: the call to wait_event() in pciehp_sysfs_enable_slot() and
pciehp_sysfs_disable_slot() does not return because ctrl->ist_running
remains true.

This flag, which was introduced by commit 157c1062fc ("PCI: pciehp: Avoid
returning prematurely from sysfs requests"), signifies that the IRQ thread
pciehp_ist() is running.  It is set to true at the top of pciehp_ist() and
reset to false at the end.  However, there are two additional return
statements in pciehp_ist() before which the commit neglected to reset the
flag to false and wake up waiters for the flag.
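
The shape of the problem can be illustrated with a small userspace model.
This is only a sketch, not the driver code: struct model_ctrl,
model_ist_buggy() and the MODEL_IRQ_* values are hypothetical stand-ins,
and the wakeups counter merely represents waking the sysfs requester.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_irqreturn { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

struct model_ctrl {
	unsigned int pending_events;  /* models ctrl->pending_events */
	bool ist_running;             /* models ctrl->ist_running */
	int wakeups;                  /* counts wake-ups of the requester */
};

/* Models the pre-fix shape: the early return skips the teardown. */
enum model_irqreturn model_ist_buggy(struct model_ctrl *ctrl)
{
	unsigned int events;

	ctrl->ist_running = true;

	/* Take all pending events, leaving none behind. */
	events = ctrl->pending_events;
	ctrl->pending_events = 0;
	if (!events)
		return MODEL_IRQ_NONE;  /* flag stays true, nobody is woken */

	/* ... event handling would happen here ... */

	ctrl->ist_running = false;
	ctrl->wakeups++;
	return MODEL_IRQ_HANDLED;
}
```

Calling model_ist_buggy() once with an event pending and then again with
none leaves ist_running set to true after the second call, which is the
state a waiter would be stuck on.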

That omission opens up the following race when powering up the slot:

* pciehp_ist() runs because a PCI_EXP_SLTSTA_PDC event was requested
  by pciehp_sysfs_enable_slot()

* pciehp_ist() turns on slot power via the following call stack:
  pciehp_handle_presence_or_link_change() -> pciehp_enable_slot() ->
  __pciehp_enable_slot() -> board_added() -> pciehp_power_on_slot()

* after slot power is turned on, the link comes up, resulting in a
  PCI_EXP_SLTSTA_DLLSC event

* the IRQ handler pciehp_isr() stores the event in ctrl->pending_events
  and returns IRQ_WAKE_THREAD

* the IRQ thread is already woken (it's bringing up the slot), but the
  genirq code remembers to re-run the IRQ thread after it has finished
  (such that it can deal with the new event) by setting IRQTF_RUNTHREAD
  via __handle_irq_event_percpu() -> __irq_wake_thread()

* the IRQ thread removes PCI_EXP_SLTSTA_DLLSC from ctrl->pending_events
  via board_added() -> pciehp_check_link_status() in order to deal with
  presence and link flaps per commit 6c35a1ac3d ("PCI: pciehp:
  Tolerate initially unstable link")

* after pciehp_ist() has successfully brought up the slot, it resets
  ctrl->ist_running to false and wakes up the sysfs requester

* the genirq code re-runs pciehp_ist(), which sets ctrl->ist_running
  to true but then returns with IRQ_NONE because ctrl->pending_events
  is empty

* pciehp_sysfs_enable_slot() is finally woken but notices that
  ctrl->ist_running is true, hence continues waiting
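
The sequence above can be replayed deterministically in userspace.  The
following is a simplified sketch under stated assumptions: the RACE_EV_*
bit values, struct race_state and replay_enable_race() are hypothetical,
the steps run sequentially rather than concurrently, and the requester's
wait condition is assumed to be "no pending events and IRQ thread not
running".

```c
#include <stdbool.h>

/* Hypothetical bit values; they only stand in for the real event bits. */
#define RACE_EV_PDC   0x1u  /* models PCI_EXP_SLTSTA_PDC */
#define RACE_EV_DLLSC 0x2u  /* models PCI_EXP_SLTSTA_DLLSC */

struct race_state {
	unsigned int pending_events;  /* models ctrl->pending_events */
	bool ist_running;             /* models ctrl->ist_running */
	bool rerun_thread;            /* models IRQTF_RUNTHREAD */
};

/*
 * Replays the steps in order and returns true if the assumed wait
 * condition of the sysfs requester holds at the end.
 */
bool replay_enable_race(bool reset_flag_on_early_return)
{
	struct race_state s = { 0, false, false };

	/* The sysfs request queues a presence change; the thread starts. */
	s.pending_events |= RACE_EV_PDC;
	s.ist_running = true;
	s.pending_events = 0;  /* thread takes the events it will handle */

	/* Link comes up: the hardirq handler stores DLLSC, asks for a re-run. */
	s.pending_events |= RACE_EV_DLLSC;
	s.rerun_thread = true;

	/* The bring-up path drops the link change from the pending events. */
	s.pending_events &= ~RACE_EV_DLLSC;

	/* The first run finishes its teardown. */
	s.ist_running = false;

	/* The genirq code re-runs the thread. */
	if (s.rerun_thread) {
		s.rerun_thread = false;
		s.ist_running = true;
		if (s.pending_events == 0) {
			/* Early IRQ_NONE exit; pre-fix, the flag is left set. */
			if (reset_flag_on_early_return)
				s.ist_running = false;
		} else {
			s.pending_events = 0;
			s.ist_running = false;
		}
	}

	return s.pending_events == 0 && !s.ist_running;
}
```

In this model replay_enable_race(false) returns false, i.e. the requester
keeps waiting, whereas replay_enable_race(true) returns true.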

The only way to get the hung task going again is to trigger a hotplug
event which brings down the slot, e.g. by yanking out the card.

The same race exists when powering down the slot because remove_board()
likewise clears link or presence changes in ctrl->pending_events per commit
3943af9d01 ("PCI: pciehp: Ignore Link State Changes after powering off a
slot") and thereby may cause a re-run of pciehp_ist() which returns with
IRQ_NONE without resetting ctrl->ist_running to false.

Fix by adding a goto label before the teardown steps at the end of
pciehp_ist() and jumping to that label from the two return statements which
currently neglect to reset the ctrl->ist_running flag.
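
The goto-label pattern can be sketched as follows.  This is a userspace
illustration, not the patch itself: struct sketch_ctrl, sketch_ist_fixed()
and the SKETCH_IRQ_* values are hypothetical, only one of the two early
exits is modeled, and the wakeups counter stands in for waking the
requester.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum sketch_irqreturn { SKETCH_IRQ_NONE, SKETCH_IRQ_HANDLED };

struct sketch_ctrl {
	unsigned int pending_events;  /* models ctrl->pending_events */
	bool ist_running;             /* models ctrl->ist_running */
	int wakeups;                  /* counts wake-ups of the requester */
};

/* Every exit path funnels through the same teardown label. */
enum sketch_irqreturn sketch_ist_fixed(struct sketch_ctrl *ctrl)
{
	enum sketch_irqreturn ret;
	unsigned int events;

	ctrl->ist_running = true;

	events = ctrl->pending_events;
	ctrl->pending_events = 0;
	if (!events) {
		ret = SKETCH_IRQ_NONE;
		goto out;  /* previously a bare return */
	}

	/* ... event handling would happen here ... */
	ret = SKETCH_IRQ_HANDLED;
out:
	ctrl->ist_running = false;  /* teardown now runs on all paths */
	ctrl->wakeups++;
	return ret;
}
```

With a single exit label, a run that finds no pending events still clears
the flag and wakes the waiter before returning.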

Fixes: 157c1062fc ("PCI: pciehp: Avoid returning prematurely from sysfs requests")
Link: https://lore.kernel.org/r/cca1effa488065cb055120aa01b65719094bdcb5.1584530321.git.lukas@wunner.de
Reported-by: David Hoyer <David.Hoyer@netapp.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Cc: stable@vger.kernel.org	# v4.19+
2020-03-31 10:22:18 -05:00
acpi_pcihp.c PCI: shpchp: Separate existence of SHPC and permission to use it 2018-06-26 15:38:28 -05:00
acpiphp_core.c Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
acpiphp_glue.c ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge 2019-11-13 17:01:59 -06:00
acpiphp_ibm.c PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
acpiphp.h Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
cpci_hotplug_core.c PCI: Remove unnecessary returns 2019-08-30 14:00:34 -05:00
cpci_hotplug_pci.c PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
cpci_hotplug.h PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
cpcihp_generic.c PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
cpcihp_zt5550.c PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
cpcihp_zt5550.h PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
cpqphp_core.c PCI: Remove unnecessary returns 2019-08-30 14:00:34 -05:00
cpqphp_ctrl.c PCI: Remove unnecessary returns 2019-08-30 14:00:34 -05:00
cpqphp_nvram.c PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
cpqphp_nvram.h PCI: Remove unnecessary returns 2019-08-30 14:00:34 -05:00
cpqphp_pci.c Merge branch 'pci/spdx' into next 2018-02-01 11:40:07 -06:00
cpqphp_sysfs.c PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
cpqphp.h PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
ibmphp_core.c PCI: ibmphp: Turn semaphores into completions or mutexes 2019-01-29 17:15:36 -06:00
ibmphp_ebda.c PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
ibmphp_hpc.c PCI: ibmphp: Turn semaphores into completions or mutexes 2019-01-29 17:15:36 -06:00
ibmphp_pci.c Merge branch 'pci/spdx' into next 2018-02-01 11:40:07 -06:00
ibmphp_res.c PCI: Mark expected switch fall-through 2019-08-08 15:13:34 -05:00
ibmphp.h PCI: ibmphp: Turn semaphores into completions or mutexes 2019-01-29 17:15:36 -06:00
Kconfig PCI: Fix indentation 2019-11-21 15:06:47 -06:00
Makefile PCI/hotplug: remove the sgi_hotplug driver 2019-08-16 11:33:56 -07:00
pci_hotplug_core.c PCI: hotplug: Drop hotplug_slot_info 2018-09-18 17:52:15 -05:00
pciehp_core.c PCI: pciehp: Prevent deadlock on disconnect 2019-11-12 17:17:42 -06:00
pciehp_ctrl.c PCI: pciehp: Prevent deadlock on disconnect 2019-11-12 17:17:42 -06:00
pciehp_hpc.c PCI: pciehp: Fix indefinite wait on sysfs requests 2020-03-31 10:22:18 -05:00
pciehp_pci.c PCI: pciehp: Log messages with pci_dev, not pcie_device 2019-05-09 16:45:20 -05:00
pciehp.h PCI: pciehp: Disable in-band presence detect when possible 2020-02-20 22:44:30 -06:00
pnv_php.c pci/hotplug/pnv-php: Wrap warnings in macro 2020-01-23 21:31:17 +11:00
rpadlpar_core.c PCI: Remove unnecessary returns 2019-08-30 14:00:34 -05:00
rpadlpar_sysfs.c pci-v4.16-changes 2018-02-06 09:59:40 -08:00
rpadlpar.h PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
rpaphp_core.c pci-v5.5-changes 2019-12-03 13:58:22 -08:00
rpaphp_pci.c PCI: hotplug: Drop hotplug_slot_info 2018-09-18 17:52:15 -05:00
rpaphp_slot.c PCI: rpaphp: Get/put device node reference during slot alloc/dealloc 2019-04-10 16:07:12 -05:00
rpaphp.h PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
s390_pci_hpc.c PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
shpchp_core.c PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
shpchp_ctrl.c PCI: hotplug: Drop hotplug_slot_info 2018-09-18 17:52:15 -05:00
shpchp_hpc.c PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
shpchp_pci.c Merge branch 'pci/spdx' into next 2018-02-01 11:40:07 -06:00
shpchp_sysfs.c PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate 2018-01-28 15:49:06 -06:00
shpchp.h PCI: hotplug: Embed hotplug_slot 2018-09-18 17:52:15 -05:00
TODO PCI: hotplug: Document TODOs 2018-09-18 17:52:15 -05:00