linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-02 09:01:34 +00:00

Author	SHA1	Message	Date
Logan Gunthorpe	aaca43fda7	PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support To support peer-to-peer traffic on a segment of the PCI hierarchy, we must disable the ACS redirect bits for select PCI bridges. The bridges must be selected before the devices are discovered by the kernel and the IOMMU groups created. Therefore, add a kernel command line parameter to specify devices which must have their ACS bits disabled. The new parameter takes a list of devices separated by a semicolon. Each device specified will have its ACS redirect bits disabled. This is similar to the existing 'resource_alignment' parameter. The ACS Request P2P Request Redirect, P2P Completion Redirect and P2P Egress Control bits are disabled, which is sufficient to always allow passing P2P traffic uninterrupted. The bits are set after the kernel (optionally) enables the ACS bits itself. It is also done regardless of whether the kernel or platform firmware sets the bits. If the user tries to disable the ACS redirect for a device without the ACS capability, print a warning to dmesg. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: reorder to add the generic code first and move the device-specific quirk to subsequent patches] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-08-09 17:37:19 -05:00
Logan Gunthorpe	45db33709c	PCI: Allow specifying devices using a base bus and path of devfns When specifying PCI devices on the kernel command line using a bus/device/function address, bus numbers can change when adding or replacing a device, changing motherboard firmware, or applying kernel parameters like "pci=assign-buses". When bus numbers change, it's likely the command line tweak will be applied to the wrong device. Therefore, it is useful to be able to specify devices with a base bus number and the path of devfns needed to get to it, similar to the "device scope" structure in the Intel VT-d spec, Section 8.3.1. Thus, we add an option to specify devices in the following format: [<domain>:]<bus>:<device>.<func>[/<device>.<func>]* The path can be any segment within the PCI hierarchy of any length and determined through the use of 'lspci -t'. When specified this way, it is less likely that a renumbered bus will result in a valid device specification and the tweak won't be applied to the wrong device. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: use "device" instead of "slot" in documentation since that's the usual language in the PCI specs] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-08-09 16:24:39 -05:00
Logan Gunthorpe	07d8d7e57c	PCI: Make specifying PCI devices in kernel parameters reusable Separate out the code to match a PCI device with a string (typically originating from a kernel parameter) from the pci_specified_resource_alignment() function into its own helper function. While we are at it, this change fixes the kernel style of the function (fixing a number of long lines and extra parentheses). Additionally, make the analogous change to the kernel parameter documentation: Separate the description of how to specify a PCI device into its own section at the head of the "pci=" parameter. This patch should have no functional alterations. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> [bhelgaas: use "device" instead of "slot" in documentation since that's the usual language in the PCI specs] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stephen Bates <sbates@raithlin.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Christian König <christian.koenig@amd.com>	2018-08-09 16:23:06 -05:00
Bjorn Helgaas	bd2e9567db	PCI: Hide ACS quirk declarations inside PCI core Move declarations for these functions: pci_dev_specific_acs_enabled() pci_dev_specific_enable_acs() from include/linux/pci.h to drivers/pci/pci.h because nothing outside the PCI core needs to use them. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-08-09 16:19:52 -05:00
Alex Williamson	51ba09452d	PCI: Delay after FLR of Intel DC P3700 NVMe Add a device-specific reset for Intel DC P3700 NVMe device which exhibits a timeout failure in drivers waiting for the ready status to update after NVMe enable if the driver interacts with the device too soon after FLR. As this has been observed in device assignment scenarios, resolve this with a device-specific reset quirk to add an additional, heuristically determined, delay after the FLR completes. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1592654 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-08-09 15:20:52 -05:00
Alex Williamson	ffb0863426	PCI: Disable Samsung SM961/PM961 NVMe before FLR The Samsung SM961/PM961 (960 EVO) sometimes fails to return from FLR with the PCI config space reading back as -1. A reproducible instance of this behavior is resolved by clearing the enable bit in the NVMe configuration register and waiting for the ready status to clear (disabling the NVMe controller) prior to FLR. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1542494 Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-08-09 15:18:33 -05:00
Alex Williamson	2d2917f774	PCI: Export pcie_has_flr() pcie_flr() suggests pcie_has_flr() to ensure that PCIe FLR support is present prior to calling. pcie_flr() is exported while pcie_has_flr() is not. Resolve this. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-08-09 15:18:27 -05:00
Thomas Petazzoni	f23d0d449c	PCI: mvebu: Drop bogus comment above mvebu_pcie_map_registers() This comment has been there since the driver was introduced, but seems to be a leftover from previous iterations of the driver. Indeed, we do not lookup in a list to find the register ranges that matches the given port/lane, as the "reg" property is in each sub-node representing a PCI port. There is no lookup involved at all. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-08-08 15:58:18 +01:00
Thomas Petazzoni	42342073e3	PCI: mvebu: Convert to use pci_host_bridge directly Rather than using the ARM-specific pci_common_init_dev() API, use the pci_host_bridge logic directly. Unfortunately, we can't use devm_of_pci_get_host_bridge_resources(), because the DT binding for describing PCIe apertures for this PCI controller is a bit special, and we cannot retrieve them from the 'ranges' property. Therefore, we still have some special code to handle this. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-08-08 15:57:50 +01:00
Thomas Petazzoni	5a553d6ba1	PCI: mvebu: Use resource_size() to remap I/O space Instead of hardcoding the remapping of IO_SPACE_LIMIT - SZ_64K, use resource_size(). However, we cannot use just IO_SPACE_LIMIT, because pci_ioremap_io() has a bug and doesn't allow remapping the last 64 KB before IO_SPACE_LIMIT, so we ensure that we do not exceed this limit. When the pci_ioremap_io() issue is fixed, this work around can be dropped. Note that this workaround already existed, since we were mapping only up to IO_SPACE_LIMIT - SZ_64K. Suggested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> [lorenzo.pieralisi@arm.com: tweaked the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-08-08 15:51:06 +01:00
Thomas Petazzoni	ee1604381a	PCI: mvebu: Only remap I/O space if configured If there is no PCI I/O aperture configured in the Device Tree, it does not make sense to create the virtual mapping for the PCI I/O space, since we will anyway not create the MBus window that will allow to access it. Therefore, do the pci_ioremap_io() only if necessary. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-08-08 15:50:30 +01:00
Thomas Petazzoni	dfd0309fd7	PCI: mvebu: Fix I/O space end address calculation pcie->realio.end should be the address of last byte of the area, therefore using resource_size() of another resource is not correct, we must substract 1 to get the address of the last byte. Fixes: `11be65472a` ("PCI: mvebu: Adapt to the new device tree layout") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-08-08 15:50:04 +01:00
Thomas Petazzoni	6554f95019	PCI: mvebu: Remove redundant platform_set_drvdata() call This is already done earlier in mvebu_pcie_probe(). Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-08-08 15:49:52 +01:00
Bjorn Helgaas	ce29af2a50	PCI: Remove unnecessary include of <linux/pci-aspm.h> Several PCI core files include pci-aspm.h even though they don't need anything provided by that file. Remove the unnecessary includes of it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org>	2018-08-06 14:32:22 -05:00
Andy Shevchenko	36131ce9a0	PCI/ASPM: Convert to use sysfs_match_string() helper The sysfs_match_string() helper returns index of the matching string in an array. Use it in pcie_aspm_set_policy() to simplify the code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> [bhelgaas: squash sysfs_match_string() fix into original patch for issue Reported-by: Heiner Kallweit <hkallweit1@gmail.com>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-08-06 14:30:34 -05:00
Christoph Hellwig	34dbc9c658	PCI/xilinx: Depend on OF instead of the ARCH There isn't a hard dependency of the Xilinx AXI-PCIe host bridge on any architecture. For example: at SiFive we map RISC-V cores to Xilinx FPGAs and connect the Xilinx IP via a TileLink adapter, so the RISC-V Linux port will need to be able to enable PCIE_XILINX in order to have PCIe support. This patch decouples the PCIE_XILINX support from ARCH. Instead it just depends on OF, which is the only true dependency. Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com> [hch: switch to OF instead of OF_PCI now that the latter is gone] Signed-off-by: Christoph Hellwig <hch@lst.de> [lorenzo.pieralisi@arm.com: trimmed the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-08-06 11:39:08 +01:00
Thomas Gleixner	f2701b77bb	Merge 4.18-rc7 into master to pick up the KVM dependcy Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2018-08-05 16:39:29 +02:00
Nicolai Stange	447ae31667	x86: Don't include linux/irq.h from asm/hardirq.h The next patch in this series will have to make the definition of irq_cpustat_t available to entering_irq(). Inclusion of asm/hardirq.h into asm/apic.h would cause circular header dependencies like asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/topology.h linux/smp.h asm/smp.h or linux/gfp.h linux/mmzone.h asm/mmzone.h asm/mmzone_64.h asm/smp.h asm/apic.h asm/hardirq.h linux/irq.h linux/irqdesc.h linux/kobject.h linux/sysfs.h linux/kernfs.h linux/idr.h linux/gfp.h and others. This causes compilation errors because of the header guards becoming effective in the second inclusion: symbols/macros that had been defined before wouldn't be available to intermediate headers in the #include chain anymore. A possible workaround would be to move the definition of irq_cpustat_t into its own header and include that from both, asm/hardirq.h and asm/apic.h. However, this wouldn't solve the real problem, namely asm/harirq.h unnecessarily pulling in all the linux/irq.h cruft: nothing in asm/hardirq.h itself requires it. Also, note that there are some other archs, like e.g. arm64, which don't have that #include in their asm/hardirq.h. Remove the linux/irq.h #include from x86' asm/hardirq.h. Fix resulting compilation errors by adding appropriate #includes to .c files as needed. Note that some of these .c files could be cleaned up a bit wrt. to their set of #includes, but that should better be done from separate patches, if at all. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2018-08-05 09:53:13 +02:00
Linus Torvalds	ef46808b79	pci-v4.18-fixes-5 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltjEoMUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vw3gQ/+IK9Poej5yIKG06gtbtLjflq9wRXW ZP72U9P8gpfI/L/0T+uAHVpp5ewjD8tGCgsEhS/OzU7eu/fyr5V5TnqjfFE++fi4 JnQrl4J14y8miP/iYQGf+6CwGYsHM4TnVa/yo525FocZBqYWKcLAPtYuhcXNPWNC iOKa3wii1KErJnjwCU0mR1a45dP5vJC/TCEcDD1ZQYJPxDrfB9lwwAo+rXFyPjMe TBJso1bLOzd3XTuyQqiP/HBPdtGi024eb3JdK16K273EPKoFp9Iw8irpP4hZKPqK 60wbTc4kSlo7Sj9OOdTXH1XW2xe/xDJimyyJjsVz8Rg6y8DN6Vg0aXEazm/xFmRC wxP4/U5ciivhvmCrbqB6DPqcBtu83Yxh7sZZPKw4LGU8Dk75QDhlVAmE2+vH51u8 Owt11GAjj++6EHCjeTms+bvAL/HFWeGru/A2NZE93QEy/tq3UtCLXGkni5Av1vWj u2bQPVqia1s7xPdRR3lwCOeCa7yU/LwqpNm8w0uF/0oCRKs42Ao4pU/sLYQGihoo rwqAEBKYA6TI6L7C4TmCue4cegRvK0Mhn8wZdGrJhJTKu+1gtIs0ph1bwtMin28d U8zX6Zhi9/xQSJN7iARbEcdgCZtePKkLIMSiILKMmAh+bjAs3HzLMs2czQ2g7YKU +71+CqJfgQXmxjI= =pwJS -----END PGP SIGNATURE----- Merge tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix integer overflow in new mobiveil driver (Dan Carpenter) - Fix race during NVMe removal/rescan (Hari Vyas) * tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Fix is_added/is_busmaster race condition PCI: mobiveil: Avoid integer overflow in IB_WIN_SIZE	2018-08-02 10:59:19 -07:00
Bjorn Helgaas	944d58595b	PCI/AER: Remove duplicate PCI_EXP_AER_FLAGS definition PCI_EXP_AER_FLAGS was defined twice (with identical definitions), once under #ifdef CONFIG_ACPI_APEI, and again at the top level. This looks like my merge error from these commits: `fd3362cb73` ("PCI/AER: Squash aerdrv_core.c into aerdrv.c") `41cbc9eb1a` ("PCI/AER: Squash ecrc.c into aerdrv.c") Remove the duplicate PCI_EXP_AER_FLAGS definition. Fixes: `41cbc9eb1a` ("PCI/AER: Squash ecrc.c into aerdrv.c") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-31 16:26:09 -05:00
Lukas Wunner	4e6a13356f	PCI: pciehp: Deduplicate presence check on probe & resume On driver probe and on resume from system sleep, pciehp checks the Presence Detect State bit in the Slot Status register to bring up an occupied slot or bring down an unoccupied slot. Both code paths are identical, so deduplicate them per Mika's request. On probe, an additional check is performed to disable power of an unoccupied slot. This can e.g. happen if power was enabled by BIOS. It cannot happen once pciehp has taken control, hence is not necessary on resume: The Slot Control register is set to the same value that it had on suspend by pci_restore_state(), so if the slot was occupied, power is enabled and if it wasn't, power is disabled. Should occupancy have changed during the system sleep transition, power is adjusted by bringing up or down the slot per the paragraph above. To allow for deduplication of the presence check, move the power check to pcie_init(). This seems safer anyway, because right now it is performed while interrupts are already enabled, and although I can't think of a scenario where pciehp_power_off_slot() and the IRQ thread collide, it does feel brittle. However this means that pcie_init() may now write to the Slot Control register before the IRQ is requested. If both the CCIE and HPIE bits happen to be set, pcie_wait_cmd() will wait for an interrupt (instead of polling the Command Completed bit) and eventually emit a timeout message. Additionally, if a level-triggered INTx interrupt is used, the user may see a spurious interrupt splat. Avoid by disabling interrupts before disabling power. (Normally the HPIE and CCIE bits should be clear on probe, but conceivably they may already have been set e.g. by BIOS.) Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2018-07-31 13:27:24 -05:00
Lukas Wunner	8bb46b079d	PCI: pciehp: Avoid implicit fallthroughs in switch statements Per Mika's request, add an explicit break to the last case of switch statements everywhere in pciehp to be more defensive towards future amendments. Per Gustavo's request, mark all non-empty implicit fallthroughs with a comment to silence warnings triggered by -Wimplicit-fallthrough=2. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Gustavo A. R. Silva <gustavo@embeddedor.com>	2018-07-31 13:26:33 -05:00
Hari Vyas	44bda4b7d2	PCI: Fix is_added/is_busmaster race condition When a PCI device is detected, pdev->is_added is set to 1 and proc and sysfs entries are created. When the device is removed, pdev->is_added is checked for one and then device is detached with clearing of proc and sys entries and at end, pdev->is_added is set to 0. is_added and is_busmaster are bit fields in pci_dev structure sharing same memory location. A strange issue was observed with multiple removal and rescan of a PCIe NVMe device using sysfs commands where is_added flag was observed as zero instead of one while removing device and proc,sys entries are not cleared. This causes issue in later device addition with warning message "proc_dir_entry" already registered. Debugging revealed a race condition between the PCI core setting the is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue setting the is_busmaster bit in pci_set_master(). As these fields are not handled atomically, that clears the is_added bit. Move the is_added bit to a separate private flag variable and use atomic functions to set and retrieve the device addition state. This avoids the race because is_added no longer shares a memory location with is_busmaster. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283 Signed-off-by: Hari Vyas <hari.vyas@broadcom.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au>	2018-07-31 11:27:54 -05:00
Lukas Wunner	47a8e237ed	PCI: Whitelist Thunderbolt ports for runtime D3 Thunderbolt controllers can be runtime suspended to D3cold to save ~1.5W. This requires that runtime D3 is allowed on its PCIe ports, so whitelist them. The 2015 BIOS cutoff that we've instituted for runtime D3 on PCIe ports is unnecessary on Thunderbolt because we know that even the oldest controller, Light Ridge (2010), is able to suspend its ports to D3 just fine -- specifically including its hotplug ports. And the power saving should be afforded to machines even if their BIOS predates 2015. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Andreas Noever <andreas.noever@gmail.com>	2018-07-31 11:09:36 -05:00
Lukas Wunner	eb3b5bf1a8	PCI: Whitelist native hotplug ports for runtime D3 Previously we blacklisted PCIe hotplug ports for runtime D3 because: (a) Ports handled by the firmware must not be transitioned to D3 by the OS behind the firmware's back: https://bugzilla.kernel.org/show_bug.cgi?id=53811 (b) Ports handled natively by the OS lacked runtime D3 support in the pciehp driver. We've just rectified the latter, so allow users to manually enable and test it by passing pcie_port_pm=force on the command line. Vendors are thus put in a position to validate hotplug ports for runtime D3 and perhaps we can someday enable it by default, but with a BIOS cutoff date. Ashok Raj tested runtime D3 on hotplug ports of a SkyLake Xeon-SP in 2017 and encountered Hardware Error NMIs, so this feature clearly cannot be enabled for everyone yet: https://lkml.kernel.org/r/20170503180426.GA4058@otc-nc-03 While at it, remove an erroneous code comment I added with `97a90aee5d` ("PCI: Consolidate conditions to allow runtime PM on PCIe ports") which claims that parents of a hotplug port must stay awake lest interrupts cannot be delivered. That has turned out to be wrong at least for Thunderbolt hotplug ports. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org>	2018-07-31 11:09:36 -05:00
Lukas Wunner	82c3fbff6e	PCI: sysfs: Resume to D0 on function reset When performing a function reset via sysfs, the device's config space is accessed in places such as pcie_flr() and its MMIO space is accessed e.g. in reset_ivb_igd(), so ensure accessibility by resuming the device to D0. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org>	2018-07-31 11:09:36 -05:00
Lukas Wunner	4417aa45c1	PCI: pciehp: Resume parent to D0 on config space access Ensure accessibility of a hotplug port's config space when accessed via sysfs by resuming its parent to D0. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org>	2018-07-31 11:09:36 -05:00
Lukas Wunner	8350307454	PCI: pciehp: Resume to D0 on enable/disable pciehp's IRQ thread ensures accessibility of the port by runtime resuming its parent to D0. However when the slot is enabled/disabled, the port itself needs to be in D0 because its secondary bus is accessed in: pciehp_check_link_status(), pciehp_configure_device() (both called from board_added()) and pciehp_unconfigure_device() (called from remove_board()). Thus, acquire a runtime PM ref on enable/disablement of the slot. Yinghai Lu additionally discovered that some SkyLake servers feature a Power Controller for their PCIe hotplug ports (PCIe r3.1, sec 6.7.1.8) which requires the port to be in D0 when invoking pciehp_power_on_slot() (likewise called from board_added()). If slot power is turned on while in D3hot, link training later fails: https://lkml.kernel.org/r/20170205073454.GA253@wunner.de The spec is silent about such a requirement, but it seems prudent to assume that any hotplug port with a Power Controller may need this. The present commit holds a runtime PM ref whenever slot power is turned on and off, but it doesn't keep the port in D0 as long as slot power is on. If vendors determine that's necessary, they need to amend pciehp to acquire a runtime PM ref in pciehp_power_on_slot() and release one in pciehp_power_off_slot(). Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org>	2018-07-31 11:09:36 -05:00
Lukas Wunner	6b08c3854c	PCI: pciehp: Support interrupts sent from D3hot If a hotplug port is able to send an interrupt, one would naively assume that it is accessible at that moment. After all, if it wouldn't be accessible, i.e. if its parent is in D3hot and the link to the hotplug port is thus down, how should an interrupt come through? It turns out that assumption is wrong at least for Thunderbolt: Even though its parents are in D3hot, a Thunderbolt hotplug port is able to signal interrupts. Because the port's config space is inaccessible and resuming the parents may sleep, the hard IRQ handler has to defer runtime resuming the parents and reading the Slot Status register to the IRQ thread. If the hotplug port uses a level-triggered INTx interrupt, it needs to be masked until the IRQ thread has cleared the signaled events. For simplicity, this commit also masks edge-triggered MSI/MSI-X interrupts. Note that if the interrupt is shared (which can only happen for INTx), other devices are starved from receiving interrupts until the IRQ thread is scheduled, has runtime resumed the hotplug port's parents and has read and cleared the Slot Status register. That delay is dominated by the 10 ms D3hot->D0 transition time of each parent port. The worst case is a Thunderbolt downstream port at the end of a daisy chain: There may be up to six Thunderbolt controllers in-between it and the root port, each comprising an upstream and downstream port, plus its own upstream port. That's 13 x 10 = 130 ms. Possible mitigations are polling the interrupt while it's disabled or reducing the d3_delay of Thunderbolt ports if possible. Open code masking of the interrupt instead of requesting it with the IRQF_ONESHOT flag to minimize the period during which it is masked. (IRQF_ONESHOT unmasks the IRQ only after the IRQ thread has finished.) PCIe r4.0 sec 6.7.3.4 states that "If wake generation is required by the associated form factor specification, a hotplug capable Downstream Port must support generation of a wakeup event (using the PME mechanism) on hotplug events that occur when the system is in a sleep state or the Port is in device state D1, D2, or D3Hot." This would seem to imply that PME needs to be enabled on the hotplug port when it is runtime suspended. pci_enable_wake() currently doesn't enable PME on bridges, it may be necessary to add an exemption for hotplug bridges there. On "Light Ridge" Thunderbolt controllers, the PME_Status bit is not set when an interrupt occurs while the hotplug port is in D3hot, even if PME is enabled. (I've tested this on a Mac and we hardcode the OSC_PCI_EXPRESS_PME_CONTROL bit to 0 on Macs in negotiate_os_control(), modifying it to 1 didn't change the behavior.) (Side note: Section 6.7.3.4 also states that "PME and Hot-Plug Event interrupts (when both are implemented) always share the same MSI or MSI-X vector". That would only seem to apply to Root Ports, however the section never mentions Root Ports, only Downstream Ports. This is explained in the definition of "Downstream Port" in the "Terms and Acronyms" section of the PCIe Base Spec: "The Ports on a Switch that are not the Upstream Port are Downstream Ports. All Ports on a Root Complex are Downstream Ports.") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Yinghai Lu <yinghai@kernel.org>	2018-07-31 11:08:56 -05:00
Lukas Wunner	469e764c4a	PCI: pciehp: Obey compulsory command delay after resume Upon resume from system sleep, the Slot Control register is written via: pci_pm_resume_noirq() pci_pm_default_resume_early() pci_restore_state() pci_restore_pcie_state() PCIe r4.0, sec 6.7.3.2 says that after "issuing a write transaction that targets any portion of the Port's Slot Control register, [...] software must wait for [the] command to complete before issuing the next command". pciehp currently fails to enforce that rule after the above-mentioned write. Fix it. (Moving restoration of the Slot Control register to pciehp doesn't seem to make sense because the other PCIe hotplug drivers may need it as well.) Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-31 11:07:59 -05:00
Lukas Wunner	7903782460	PCI: pciehp: Clear spurious events earlier on resume Thunderbolt hotplug ports that were occupied before system sleep resume with their downstream link in "off" state. Only after the Thunderbolt controller has reestablished the PCIe tunnels does the link go up. As a result, a spurious Presence Detect Changed and/or Data Link Layer State Changed event occurs. The events are not immediately acted upon because tunnel reestablishment happens in the ->resume_noirq phase, when interrupts are still disabled. Also, notification of events may initially be disabled in the Slot Control register when coming out of system sleep and is reenabled in the ->resume_noirq phase through: pci_pm_resume_noirq() pci_pm_default_resume_early() pci_restore_state() pci_restore_pcie_state() It is not guaranteed that the events are acted upon at all: PCIe r4.0, sec 6.7.3.4 says that "a port may optionally send an MSI when there are hot-plug events that occur while interrupt generation is disabled, and interrupt generation is subsequently enabled." Note the "optionally". If an MSI is sent, pciehp will gratuitously turn the slot off and back on once the ->resume_early phase has commenced. If an MSI is not sent, the extant, unacknowledged events in the Slot Status register will prevent future notification of presence or link changes. Commit `13c65840fe` ("PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume") fixed the latter by clearing the events in the ->resume phase. Move this to the ->resume_noirq phase to also fix the gratuitous disable/enablement of the slot. The commit further restored the Slot Control register in the ->resume phase, but that's dispensable because as shown above it's already been done in the ->resume_noirq phase. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>	2018-07-31 11:07:59 -05:00
Lukas Wunner	6ccb127ba6	PCI: portdrv: Deduplicate PM callback iterator Replace suspend_iter() and resume_iter() with a single function pm_iter() to allow addition of port service callbacks for further power management phases without having to add another iterator each time. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-31 11:07:59 -05:00
Lukas Wunner	5b3f7b7d06	PCI: pciehp: Avoid slot access during reset The ->reset_slot callback introduced by commits: `2e35afaefe` ("PCI: pciehp: Add reset_slot() method") and `06a8d89af5` ("PCI: pciehp: Disable link notification across slot reset") disables notification of Presence Detect Changed and Data Link Layer State Changed events for the duration of a secondary bus reset. However a bus reset not only triggers these events, but may also clear the Presence Detect State bit in the Slot Status register and the Data Link Layer Link Active bit in the Link Status register momentarily. According to Sinan Kaya: "I know for a fact that bus reset clears the Data Link Layer Active bit as soon as link goes down. It gets set again following link up. Presence detect depends on the HW implementation. QDT root ports don't change presence detect for instance since nobody actually removed the card. If an implementation supports in-band presence detect, the answer is yes. As soon as the link goes down, presence detect bit will get cleared until recovery." https://lkml.kernel.org/r/42e72f83-3b24-f7ef-e5bc-290fae99259a@codeaurora.org In-band presence detect is also covered in Table 4-15 in PCIe r4.0, sec 4.2.6. pciehp should therefore ensure that any parts of the driver that access those bits do not run concurrently to a bus reset. The only precaution the commits took to that effect was to halt interrupt polling. They made no effort to drain the slot workqueue, cancel an outstanding Attention Button work, or block slot enable/disable requests via sysfs and in the ->probe hook. Now that pciehp is converted to enable/disable the slot exclusively from the IRQ thread, the only places accessing the two above-mentioned bits are the IRQ thread and the ->probe hook. Add locking to serialize them with a bus reset. This obviates the need to halt interrupt polling. Do not add locking to the ->get_adapter_status sysfs callback to afford users unfettered access to that bit. Use an rw_semaphore in lieu of a regular mutex to allow parallel execution of the non-reset code paths accessing the critical bits, i.e. the IRQ thread and the ->probe hook. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rajat Jain <rajatja@google.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Sinan Kaya <okaya@kernel.org>	2018-07-31 10:50:31 -05:00
Heiner Kallweit	f7368a5502	PCI: Use IRQF_ONESHOT if pci_request_irq() called with no handler If we have a threaded interrupt with the handler being NULL, then request_threaded_irq() -> __setup_irq() will complain and bail out if the IRQF_ONESHOT flag isn't set. Therefore check for the handler being NULL and set IRQF_ONESHOT in this case. This change is needed to migrate the mei_me driver to pci_alloc_irq_vectors() and pci_request_irq(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2018-07-31 10:43:43 -05:00
Christoph Hellwig	a8651194f9	PCI: Call dma_debug_add_bus() for pci_bus_type from PCI core There is nothing arch-specific about PCI or dma-debug, so call dma_debug_add_bus() from the PCI core just after registering the bus type. Most of dma-debug is already generic; this just adds reporting of pending dma-allocations on driver unload for arches other than powerpc, sh, and x86. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)	2018-07-30 15:58:01 -05:00
Lorenzo Pieralisi	6f2c73c124	PCI: mobiveil: Add Kconfig/Makefile entries commit `9af6bcb11e` ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") did not add the configuration and build infrastructure to configure and build the mobiveil controller driver, so at present the driver code is in the kernel but cannot be compiled. Add the mobiveil controller driver Kconfig/Makefile infrastructure. Fixes: `9af6bcb11e` ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in>	2018-07-30 14:30:16 +01:00
Lorenzo Pieralisi	d374301223	PCI: mobiveil: Add missing ../pci.h include PCI mobiveil host controller driver currently fails to compile with the following error: drivers/pci/controller/pcie-mobiveil.c: In function 'mobiveil_pcie_probe': drivers/pci/controller/pcie-mobiveil.c:788:8: error: implicit declaration of function 'devm_of_pci_get_host_bridge_resources'; did you mean 'pci_get_host_bridge_device'? [-Werror=implicit-function-declaration] ret = devm_of_pci_get_host_bridge_resources(dev, 0, 0xff, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ pci_get_host_bridge_device Add the missing include file to pull in the required function declaration. Fixes: `9af6bcb11e` ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in>	2018-07-30 14:30:12 +01:00
Lorenzo Pieralisi	af3f606e0b	PCI: mobiveil: Fix struct mobiveil_pcie.pcie_reg_base address type The field pcie_reg_base in struct mobiveil_pcie represents a physical address so it should be of phys_addr_t type rather than void __iomem; this results in the following compilation warnings: drivers/pci/controller/pcie-mobiveil.c: In function 'mobiveil_pcie_parse_dt': drivers/pci/controller/pcie-mobiveil.c:326:22: warning: assignment makes pointer from integer without a cast [-Wint-conversion] pcie->pcie_reg_base = res->start; ^ drivers/pci/controller/pcie-mobiveil.c: In function 'mobiveil_pcie_enable_msi': drivers/pci/controller/pcie-mobiveil.c:485:25: warning: initialization makes integer from pointer without a cast [-Wint-conversion] phys_addr_t msg_addr = pcie->pcie_reg_base; ^~~~ drivers/pci/controller/pcie-mobiveil.c: In function 'mobiveil_compose_msi_msg': drivers/pci/controller/pcie-mobiveil.c:640:21: warning: initialization makes integer from pointer without a cast [-Wint-conversion] phys_addr_t addr = pcie->pcie_reg_base + (data->hwirq sizeof(int)); Fix the type and with it the compilation warnings. Fixes: `9af6bcb11e` ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Subrahmanya Lingappa <l.subrahmanya@mobiveil.co.in>	2018-07-30 14:30:08 +01:00
Dave Airlie	3fce461827	BackMerge v4.18-rc7 into drm-next rmk requested this for armada and I think we've had a few conflicts build up. Signed-off-by: Dave Airlie <airlied@redhat.com>	2018-07-30 10:39:22 +10:00
Dan Carpenter	a54e43f993	PCI: mobiveil: Avoid integer overflow in IB_WIN_SIZE IB_WIN_SIZE is larger than INT_MAX so we need to cast it to u64. Fixes: `9af6bcb11e` ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-27 17:10:39 -05:00
Linus Torvalds	1a3d8691fd	pci-v4.18-fixes-4 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltbU2MUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vy6QxAAlUYBbT1OD0X2XjDgwG9TaxxqBHMT w8VTxMUQmMuQRZP8F+xLUksC6lwRGNU8ZPbN4p7d5VMO1PoyKuubUbOgtGSjg7lz SQEbUntQ2s4JCxeu1sUyUwDwW0LSMcunluYnWKzut2B6R/yWopJc/1sOsORYwHBf SaAVttFayaDl+u2I9hWn5JpHAgTsu+6rpUY1Hs79GI3DAKbSMTy0J0w/5J+eqQUO hwuWJqG4hFF9U/zaTqP89EbB181b/OFNathk/Neh3Ge1KnGRHPmE86146ZJSuk4q QfYfhzLtzNWaJcjaDyG0hmqX4hNDHWREyPAXgBJ+oxegAQs67Zj1waaKGQ7bY58i j911Jd14j30ZnruB4nhsLE/DQJa5cU3WjyFbiL6+qicBWWrKGj3ahoIi5V/j882Q z7V+VeBTxpz0LhxTOSC83yZRbUwY6TlcVVW5yJYDI3NXI9ixfOzeh9QFFhSveSbM P/3EFv6vrg1V145RlaRPBzXfiDZwqPCL3udBOeK2Z+5jgfTr6yOg0c64c5D+tpR1 nGm+VKqNaYAeTayeGByWiOOyt4CbL3WXOi+L+tRSjfYjffbh1CyiqRuVrbP/PG64 o6vg3yfKStR6DXbNpxFvDfsrc69NAaJEH5TBQOeuT3abZF0UwnUvL9TF3Y5cSdjj nyaamyAUMSflqAk= =Lmx6 -----END PGP SIGNATURE----- Merge tag 'pci-v4.18-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "Fix a use-after-free error in fatal error recovery (Thomas Tai)" * tag 'pci-v4.18-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/AER: Work around use-after-free in pcie_do_fatal_recovery()	2018-07-27 10:28:51 -07:00
Thomas Tai	bd91b56cb3	PCI/AER: Work around use-after-free in pcie_do_fatal_recovery() When an fatal error is received by a non-bridge device, the device is removed, and pci_stop_and_remove_bus_device() deallocates the device structure. The freed device structure is used by subsequent code to send uevents and print messages. Hold a reference on the device until we're finished using it. This is not an ideal fix because pcie_do_fatal_recovery() should not use the device at all after removing it, but that's too big a project for right now. Fixes: `7e9084b367` ("PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices") Signed-off-by: Thomas Tai <thomas.tai@oracle.com> [bhelgaas: changelog, reduce get/put coverage] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-26 12:13:04 -05:00
Dan Carpenter	014562071c	PCI: mobiveil: Integer overflow in IB_WIN_SIZE IB_WIN_SIZE is larger than INT_MAX so we need to cast it to u64. Fixes: `9af6bcb11e` ("PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>	2018-07-26 11:46:46 +01:00
Lukas Wunner	cdf6b73621	PCI: pciehp: Always enable occupied slot on probe Per PCIe r4.0, sec 6.7.3.4, a "port may optionally send an MSI when there are hot-plug events that occur while interrupt generation is disabled, and interrupt generation is subsequently enabled." On probe, we currently clear all event bits in the Slot Status register with the notable exception of the Presence Detect Changed bit. Thereby we seek to receive an interrupt for an already occupied slot once event notification is enabled. But because the interrupt is optional, users may have to specify the pciehp_force parameter on the command line, which is inconvenient. Moreover, now that pciehp's event handling has become resilient to missed events, a Presence Detect Changed interrupt for a slot which is powered on is interpreted as removal of the card. If the slot has already been brought up by the BIOS, receiving such an interrupt on probe causes the slot to be powered off and immediately back on, which is likewise undesirable. Avoid both issues by making the behavior of pciehp_force the default and clearing the Presence Detect Changed bit on probe. Note that the stated purpose of pciehp_force per the MODULE_PARM_DESC ("Force pciehp, even if OSHP is missing") seems nonsensical because the OSHP control method is only relevant for SHCP slots according to the PCI Firmware specification r3.0, sec 4.8. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>	2018-07-23 17:04:16 -05:00
Lukas Wunner	d331710ea7	PCI: pciehp: Become resilient to missed events A hotplug port's Slot Status register does not count how often each type of event occurred, it only records the fact that an event has occurred. Previously pciehp queued a work item for each event. But if it missed an event, e.g. removal of a card in-between two back-to-back insertions, it queued up the wrong work item or no work item at all. Commit `fad214b0aa` ("PCI: pciehp: Process all hotplug events before looking for new ones") sought to improve the situation by shrinking the window during which events may be missed. But Stefan Roese reports unbalanced Card present and Link Up events, suggesting that we're still missing events if they occur very rapidly. Bjorn Helgaas responds that he considers pciehp's event handling "baroque" and calls for its simplification and rationalization: https://lkml.kernel.org/r/20180202192045.GA53759@bhelgaas-glaptop.roam.corp.google.com It gets worse once a hotplug port is runtime suspended: The port can signal an interrupt while it and its parents are in D3hot, i.e. while it is inaccessible. By the time we've runtime resumed all parents to D0 and read the port's Slot Status register, we may have missed an arbitrary number of events. Event handling therefore needs to be reworked to become resilient to missed events. Assume that a Presence Detect Changed event has occurred. Consider the following truth table: - Slot is in OFF_STATE and is currently empty. => Do nothing. (The event is trailing a Link Down or we've missed an insertion and subsequent removal.) - Slot is in OFF_STATE and is currently occupied. => Turn the slot on. - Slot is in ON_STATE and is currently empty. => Turn the slot off. - Slot is in ON_STATE and is currently occupied. => Turn the slot off, (Be cautious and assume the card in then back on. the slot isn't the same as before.) This leads to the following simple algorithm: 1 If the slot is in ON_STATE, turn it off unconditionally. 2 If the slot is currently occupied, turn it on. Because those actions are now carried out synchronously, rather than by scheduled work items, pciehp reacts to the current situation and missed events no longer matter. Data Link Layer State Changed events can be handled identically to Presence Detect Changed events. Note that in the above truth table, a Link Up trailing a Card present event didn't have to be accounted for: It is filtered out by pciehp_check_link_status(). As for Attention Button Pressed events, PCIe r4.0, sec 6.7.1.5 says: "Once the Power Indicator begins blinking, a 5-second abort interval exists during which a second depression of the Attention Button cancels the operation." In other words, the user can only expect the system to react to a button press after it starts blinking. Missed button presses that occur in-between are irrelevant. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefan Roese <sr@denx.de> Cc: Mayurkumar Patel <mayurkumar.patel@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>	2018-07-23 17:04:16 -05:00
Lukas Wunner	6c35a1ac3d	PCI: pciehp: Tolerate initially unstable link When a device is hotplugged, Presence Detect and Link Up events often do not occur simultaneously, but with a lag of a few milliseconds. Only the first event received is relevant, the other one can be disregarded. Moreover, Stefan Roese reports that on certain platforms, Link State and Presence Detect may flap for up to 100 ms before stabilizing, suggesting that such events should be disregarded for at least this long: https://lkml.kernel.org/r/20180130084121.18653-1-sr@denx.de On slot enablement, pciehp_check_link_status() waits for 100 ms per PCIe r4.0, sec 6.7.3.3, then probes the hotplugged device's vendor register for up to 1 second. If this succeeds, the link is definitely up, so ignore any Presence Detect or Link State events that occurred up to this point. pciehp_check_link_status() then checks the Link Training bit in the Link Status register. This is the final opportunity to detect inaccessibility of the device and abort slot enablement. Any link or presence change that occurs afterwards will cause the slot to be disabled again immediately after attempting to enable it. The astute reviewer may appreciate that achieving this behavior would be more complicated had pciehp not just been converted to enable/disable the slot exclusively from the IRQ thread: When the slot is enabled via sysfs, each link or presence flap would otherwise cause the IRQ thread to run and it would have to sense that those events are belonging to a concurrent slot enablement operation and disregard them. It would be much more difficult than this mere 3 line change. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefan Roese <sr@denx.de>	2018-07-23 17:04:16 -05:00
Lukas Wunner	25c83b84b1	PCI: pciehp: Declare pciehp_enable/disable_slot() static No callers of pciehp_enable/disable_slot() outside of pciehp_ctrl.c remain, so declare the functions static. For now this requires forward declarations. Those can be eliminated by reshuffling functions once the ongoing effort to refactor the driver has settled. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:15 -05:00
Lukas Wunner	1656716d45	PCI: pciehp: Drop enable/disable lock Previously slot enablement and disablement could happen concurrently. But now it's under the exclusive control of the IRQ thread, rendering the locking obsolete. Drop it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:15 -05:00
Lukas Wunner	32a8cef274	PCI: pciehp: Enable/disable exclusively from IRQ thread Besides the IRQ thread, there are several other places in the driver which enable or disable the slot: - pciehp_probe() enables the slot if it's occupied and the pciehp_force module parameter is used. - pciehp_resume() enables or disables the slot after system sleep. - pciehp_queue_pushbutton_work() enables or disables the slot after the 5 second delay following an Attention Button press. - pciehp_sysfs_enable_slot() and pciehp_sysfs_disable_slot() enable or disable the slot on sysfs write. This requires locking and complicates pciehp's state machine. A simplification can be achieved by enabling and disabling the slot exclusively from the IRQ thread. Amend the functions listed above to request slot enable/disablement from the IRQ thread by either synthesizing a Presence Detect Changed event or, in the case of a disable user request (via sysfs or an Attention Button press), submitting a newly introduced force disable request. The latter is needed because the slot shall be forced off despite being occupied. For this force disable request, avoid colliding with Slot Status register bits by using a bit number greater than 16. For synchronous execution of requests (on sysfs write), wait for the request to finish and retrieve the result. There can only ever be one sysfs write in flight due to the locking in kernfs_fop_write(), hence there is no risk of returning the result of a different sysfs request to user space. The POWERON_STATE and POWEROFF_STATE is now no longer entered by the above-listed functions, but solely by the IRQ thread when it begins a power transition. Afterwards, it moves to STATIC_STATE. The same applies to canceling the Attention Button work, it likewise becomes an IRQ thread only operation. An immediate consequence is that the POWERON_STATE and POWEROFF_STATE is never observed by the IRQ thread itself, only by functions called in a different context, such as pciehp_sysfs_enable_slot(). So remove handling of these states from pciehp_handle_button_press() and pciehp_handle_link_change() which are exclusively called from the IRQ thread. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:15 -05:00
Lukas Wunner	9590192f25	PCI: pciehp: Track enable/disable status handle_button_press_event() currently determines whether the slot has been turned on or off by looking at the Power Controller Control bit in the Slot Control register. This assumes that an attention button implies presence of a power controller even though that's not mandated by the spec. Moreover the Power Controller Control bit is unreliable when a power fault occurs (PCIe r4.0, sec 6.7.1.8). This issue has existed since the driver was introduced in 2004. Fix by replacing STATIC_STATE with ON_STATE and OFF_STATE and tracking whether the slot has been turned on or off. This is also a required ingredient to make pciehp resilient to missed events, which is the object of an upcoming commit. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:14 -05:00
Lukas Wunner	774d446b0f	PCI: pciehp: Publish to user space last on probe The PCI hotplug core has just been refactored to separate slot initialization for in-kernel use from publication to user space. Take advantage of it in pciehp by publishing to user space last on probe. This will allow enable/disablement of the slot exclusively from the IRQ thread because the IRQ is requested after initialization for in-kernel use (thereby getting its unique name needed by the IRQ thread) but before user space is able to submit enable/disable requests. On teardown, the order is the same in reverse: The user space interface is removed prior to freeing the IRQ and destroying the slot. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:14 -05:00
Lukas Wunner	51bbf9bee3	PCI: hotplug: Demidlayer registration with the core When a hotplug driver calls pci_hp_register(), all steps necessary for registration are carried out in one go, including creation of a kobject and addition to sysfs. That's a problem for pciehp once it's converted to enable/disable the slot exclusively from the IRQ thread: The thread needs to be spawned after creation of the kobject (because it uses the kobject's name), but before addition to sysfs (because it will handle enable/disable requests submitted via sysfs). pci_hp_deregister() does offer a ->release callback that's invoked after deletion from sysfs and before destruction of the kobject. But because pci_hp_register() doesn't offer a counterpart, hotplug drivers' ->probe and ->remove code becomes asymmetric, which is error prone as recently discovered use-after-free bugs in pciehp's ->remove hook have shown. In a sense, this appears to be a case of the midlayer antipattern: "The core thesis of the "midlayer mistake" is that midlayers are bad and should not exist. That common functionality which it is so tempting to put in a midlayer should instead be provided as library routines which can [be] used, augmented, or ignored by each bottom level driver independently. Thus every subsystem that supports multiple implementations (or drivers) should provide a very thin top layer which calls directly into the bottom layer drivers, and a rich library of support code that eases the implementation of those drivers. This library is available to, but not forced upon, those drivers." -- Neil Brown (2009), https://lwn.net/Articles/336262/ The presence of midlayer traits in the PCI hotplug core might be ascribed to its age: When it was introduced in February 2002, the blessings of a library approach might not have been well known: https://git.kernel.org/tglx/history/c/a8a2069f432c For comparison, the driver core does offer split functions for creating a kobject (device_initialize()) and addition to sysfs (device_add()) as an alternative to carrying out everything at once (device_register()). This was introduced in October 2002: https://git.kernel.org/tglx/history/c/8b290eb19962 The odd ->release callback in the PCI hotplug core was added in 2003: https://git.kernel.org/tglx/history/c/69f8d663b595 Clearly, a library approach would not force every hotplug driver to implement a ->release callback, but rather allow the driver to remove the sysfs files, release its data structures and finally destroy the kobject. Alternatively, a driver may choose to remove everything with pci_hp_deregister(), then release its data structures. To this end, offer drivers pci_hp_initialize() and pci_hp_add() as a split-up version of pci_hp_register(). Likewise, offer pci_hp_del() and pci_hp_destroy() as a split-up version of pci_hp_deregister(). Eliminate the ->release callback and move its code into each driver's teardown routine. Declare pci_hp_deregister() void, in keeping with the usual kernel pattern that enablement can fail, but disablement cannot. It only returned an error if the caller passed in a NULL pointer or a slot which has never or is no longer registered or is sharing its name with another slot. Those would be bugs, so WARN about them. Few hotplug drivers actually checked the return value and those that did only printed a useless error message to dmesg. Remove that. For most drivers the conversion was straightforward since it doesn't matter whether the code in the ->release callback is executed before or after destruction of the kobject. But in the case of ibmphp, it was unclear to me whether setting slot_cur->ctrl and slot_cur->bus_on to NULL needs to happen before the kobject is destroyed, so I erred on the side of caution and ensured that the order stays the same. Another nontrivial case is pnv_php, I've found the list and kref logic difficult to understand, however my impression was that it is safe to delete the list element and drop the references until after the kobject is destroyed. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> # drivers/platform/x86 Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Scott Murray <scott@spiteful.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Corentin Chary <corentin.chary@gmail.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org>	2018-07-23 17:04:13 -05:00
Lukas Wunner	55a6b7a657	PCI: pciehp: Drop slot workqueue Previously the slot workqueue was used to handle events and enable or disable the slot. That's no longer the case as those tasks are done synchronously in the IRQ thread. The slot workqueue is thus merely used to handle a button press after the 5 second delay and only one such work item may be in flight at any given time. A separate workqueue isn't necessary for this simple task, so use the system workqueue instead. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:13 -05:00
Lukas Wunner	0e94916e60	PCI: pciehp: Handle events synchronously Up until now, pciehp's IRQ handler schedules a work item for each event, which in turn schedules a work item to enable or disable the slot. This double indirection was necessary because sleeping wasn't allowed in the IRQ handler. However it is now that pciehp has been converted to threaded IRQ handling and polling, so handle events synchronously in pciehp_ist() and remove the work item infrastructure (with the exception of work items to handle a button press after the 5 second delay). For link or presence change events, move the register read to determine the current link or presence state behind acquisition of the slot lock to prevent it from becoming stale while the lock is contended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:12 -05:00
Lukas Wunner	b0ccd9dd5d	PCI: pciehp: Stop blinking on slot enable failure If the attention button is pressed to power on the slot AND the user powers on the slot via sysfs before 5 seconds have elapsed AND powering on the slot fails because either the slot is unoccupied OR the latch is open, we neglect turning off the green LED so it keeps on blinking. That's because the error path of pciehp_sysfs_enable_slot() doesn't call pciehp_green_led_off(), unlike pciehp_power_thread() which does. The bug has been present since 2004 when the driver was introduced. Fix by deduplicating common code in pciehp_sysfs_enable_slot() and pciehp_power_thread() into a wrapper function pciehp_enable_slot() and renaming the existing function to __pciehp_enable_slot(). Same for pciehp_disable_slot(). This will also simplify the upcoming rework of pciehp's event handling. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:12 -05:00
Lukas Wunner	ec07a44730	PCI: pciehp: Convert to threaded polling We've just converted pciehp to threaded IRQ handling, but still cannot sleep in pciehp_ist() because the function is also called in poll mode, which runs in softirq context (from a timer). Convert poll mode to a kthread so that pciehp_ist() always runs in task context. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Thomas Gleixner <tglx@linutronix.de>	2018-07-23 17:04:12 -05:00
Lukas Wunner	7b4ce26bcf	PCI: pciehp: Convert to threaded IRQ pciehp's IRQ handler queues up a work item for each event signaled by the hardware. A more modern alternative is to let a long running kthread service the events. The IRQ handler's sole job is then to check whether the IRQ originated from the device in question, acknowledge its receipt to the hardware to quiesce the interrupt and wake up the kthread. One benefit is reduced latency to handle the IRQ, which is a necessity for realtime environments. Another benefit is that we can make pciehp simpler and more robust by handling events synchronously in process context, rather than asynchronously by queueing up work items. pciehp's usage of work items is a historic artifact, it predates the introduction of threaded IRQ handlers by two years. (The former was introduced in 2007 with commit `5d386e1ac4` ("pciehp: Event handling rework"), the latter in 2009 with commit `3aa551c9b4` ("genirq: add threaded interrupt handler support").) Convert pciehp to threaded IRQ handling by retrieving the pending events in pciehp_isr(), saving them for later consumption by the thread handler pciehp_ist() and clearing them in the Slot Status register. By clearing the Slot Status (and thereby acknowledging the events) in pciehp_isr(), we can avoid requesting the IRQ with IRQF_ONESHOT, which would have the unpleasant side effect of starving devices sharing the IRQ until pciehp_ist() has finished. pciehp_isr() does not count how many times each event occurred, but merely records the fact that an event occurred. If the same event occurs a second time before pciehp_ist() is woken, that second event will not be recorded separately, which is problematic according to commit `fad214b0aa` ("PCI: pciehp: Process all hotplug events before looking for new ones") because we may miss removal of a card in-between two back-to-back insertions. We're about to make pciehp_ist() resilient to missed events. The present commit regresses the driver's behavior temporarily in order to separate the changes into reviewable chunks. This doesn't affect regular slow-motion hotplug, only plug-unplug-plug operations that happen in a timespan shorter than wakeup of the IRQ thread. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mayurkumar Patel <mayurkumar.patel@intel.com> Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>	2018-07-23 17:04:12 -05:00
Lukas Wunner	4aed1cd6fb	PCI: pciehp: Document struct slot and struct controller Document the driver's data structures to lower the barrier to entry for contributors. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:12 -05:00
Lukas Wunner	1d2e2673dc	PCI: pciehp: Declare pciehp_unconfigure_device() void Since commit `0f4bd8014d` ("PCI: hotplug: Drop checking of PCI_BRIDGE_ CONTROL in *_unconfigure_device()"), pciehp_unconfigure_device() can no longer fail, so declare it and its sole caller remove_board() void, in keeping with the usual kernel pattern that enablement can fail, but disablement cannot. No functional change intended. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>	2018-07-23 17:04:11 -05:00
Lukas Wunner	6641311df9	PCI: pciehp: Drop unnecessary NULL pointer check pciehp_disable_slot() checks if the ctrl attribute of the slot is NULL and bails out if so. However the function is not called prior to the attribute being set in pcie_init_slot(), and pcie_init_slot() is not called if ctrl is NULL. So the check is unnecessary. Drop it. It has been present ever since the driver was introduced in 2004, but it was already unnecessary back then: https://git.kernel.org/tglx/history/c/c16b4b14d980 Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-23 17:04:11 -05:00
Lukas Wunner	1204e35bed	PCI: pciehp: Fix unprotected list iteration in IRQ handler Commit `b440bde74f` ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device") iterates over the devices on a hotplug port's subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem. It is thus possible for a user to cause a crash by concurrently manipulating the device list, e.g. by disabling slot power via sysfs on a different CPU or by initiating a remove/rescan via sysfs. This can't be fixed by acquiring pci_bus_sem because it may sleep. The simplest fix is to avoid the list iteration altogether and just check the ignore_hotplug flag on the port itself. This works because pci_ignore_hotplug() sets the flag both on the device as well as on its parent bridge. We do lose the ability to print the name of the device blocking hotplug in the debug message, but that's probably bearable. Fixes: `b440bde74f` ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org	2018-07-23 17:04:10 -05:00
Lukas Wunner	281e878eab	PCI: pciehp: Fix use-after-free on unplug When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the hotplug_slot struct is deregistered and thus freed before freeing the IRQ. The IRQ handler and the work items it schedules print the slot name referenced from the freed structure in various informational and debug log messages, each time resulting in a quadruple dereference of freed pointers (hotplug_slot -> pci_slot -> kobject -> name). At best the slot name is logged as "(null)", at worst kernel memory is exposed in logs or the driver crashes: pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present An attacker may provoke the bug by unplugging multiple devices on a Thunderbolt daisy chain at once. Unplugging can also be simulated by powering down slots via sysfs. The bug is particularly easy to trigger in poll mode. It has been present since the driver's introduction in 2004: https://git.kernel.org/tglx/history/c/c16b4b14d980 Fix by rearranging teardown such that the IRQ is freed first. Run the work items queued by the IRQ handler to completion before freeing the hotplug_slot struct by draining the work queue from the ->release_slot callback which is invoked by pci_hp_deregister(). Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v2.6.4	2018-07-23 17:04:10 -05:00
Lukas Wunner	4ce6435820	PCI: hotplug: Don't leak pci_slot on registration failure If addition of sysfs files fails on registration of a hotplug slot, the struct pci_slot as well as the entry in the slot_list is leaked. The issue has been present since the hotplug core was introduced in 2002: https://git.kernel.org/tglx/history/c/a8a2069f432c Perhaps the idea was that even though sysfs addition fails, the slot should still be usable. But that's not how drivers use the interface, they abort probe if a non-zero value is returned. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v2.4.15+ Cc: Greg Kroah-Hartman <greg@kroah.com>	2018-07-23 17:04:10 -05:00
Lukas Wunner	b4efce5c47	PCI: hotplug: Delete skeleton driver Ten years ago, commit `58319b802a` ("PCI: Hotplug core: remove 'name'") dropped the name element from struct hotplug_slot but neglected to update the skeleton driver. That same year, commit `f46753c5e3` ("PCI: introduce pci_slot") raised the number of arguments to pci_hp_register() from one to four. Fourteen years ago, historic commit 7ab60fc1b8e7 ("PCI Hotplug skeleton: final cleanups") removed all usages of the retval variable from pcihp_skel_init() but not the variable itself, provoking a compiler warning: https://git.kernel.org/tglx/history/c/7ab60fc1b8e7 It seems fair to assume the driver hasn't been used as a template for a new driver in a while. Per Bjorn's and Christoph's preference, delete it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Christoph Hellwig <hch@lst.de>	2018-07-23 17:04:10 -05:00
Oza Pawandeep	89e1f5cb1e	PCI/portdrv: Remove pcie_portdrv_err_handler.slot_reset The pci_error_handlers.slot_reset() callback is only used for non-bridge devices (see broadcast_error_message()). Since portdrv only binds to bridges, we don't need pcie_portdrv_slot_reset(), so remove it. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog, remove pcie_portdrv_slot_reset() completely] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:13 -05:00
Oza Pawandeep	10d790d99d	PCI/AER: Clear device status bits during ERR_COR handling In case of correctable error, the Correctable Error Detected bit in the Device Status register is set. Clear it after handling the error. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:12 -05:00
Oza Pawandeep	ec752f5d54	PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL Clear the device status bits while handling both ERR_FATAL and ERR_NONFATAL cases. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: rename to pci_aer_clear_device_status(), declare internal to PCI core instead of exposing it everywhere] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:11 -05:00
Oza Pawandeep	43ec03a9e5	PCI/AER: Remove ERR_FATAL code from ERR_NONFATAL path broadcast_error_message() is only used for ERR_NONFATAL events, when the state is always pci_channel_io_normal, so remove the unused alternate path. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:10 -05:00
Oza Pawandeep	5b6c09660d	PCI/AER: Factor out ERR_NONFATAL status bit clearing aer_error_resume() clears all ERR_NONFATAL error status bits. This is exactly what pci_cleanup_aer_uncorrect_error_status(), so use that instead of duplicating the code. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:09 -05:00
Oza Pawandeep	e7b0b847de	PCI/AER: Clear only ERR_NONFATAL bits during non-fatal recovery pci_cleanup_aer_uncorrect_error_status() is called by driver .slot_reset() methods when handling ERR_NONFATAL errors. Previously this cleared all the bits, including ERR_FATAL bits. Since we're only handling ERR_NONFATAL errors, clear only the ERR_NONFATAL error status bits. Signed-off-by: Oza Pawandeep <poza@codeaurora.org> [bhelgaas: split to separate patch] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:08 -05:00
Bjorn Helgaas	7ab92e89bf	PCI/AER: Clear only ERR_FATAL status bits during fatal recovery During recovery from fatal errors, we previously called pci_cleanup_aer_uncorrect_error_status(), which cleared all uncorrectable error status bits (both ERR_FATAL and ERR_NONFATAL). Instead, call a new pci_aer_clear_fatal_status() that clears only the ERR_FATAL bits (as indicated by the PCI_ERR_UNCOR_SEVER register). Based-on-patch-by: Oza Pawandeep <poza@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-20 15:27:07 -05:00
Sinan Kaya	c6a44ba950	PCI: Rename pci_try_reset_bus() to pci_reset_bus() Now that the old implementation of pci_reset_bus() is gone, replace pci_try_reset_bus() with pci_reset_bus(). Compared to the old implementation, new code will fail immmediately with -EAGAIN if object lock cannot be obtained. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 18:04:23 -05:00
Sinan Kaya	fe32e2fa65	PCI: Deprecate pci_reset_bus() and pci_reset_slot() functions pci_reset_bus() and pci_reset_slot() functions are not being used by any code. Remove them from the kernel in favor of pci_try_reset_bus() and pci_try_reset_slot() functions. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 18:04:23 -05:00
Sinan Kaya	811c5cb37d	PCI: Unify try slot and bus reset API Drivers are expected to call pci_try_reset_slot() or pci_try_reset_bus() by querying if a system supports hotplug or not. A survey showed that most drivers don't do this and we are leaking hotplug capability to the user. Hide pci_try_slot_reset() from drivers and embed into pci_try_bus_reset(). Change pci_try_reset_bus() parameter from struct pci_bus to struct pci_dev. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 18:04:23 -05:00
Sinan Kaya	381634cad1	PCI: Hide pci_reset_bridge_secondary_bus() from drivers Rename pci_reset_bridge_secondary_bus() to pci_bridge_secondary_bus_reset() and move the declaration from linux/pci.h to drivers/pci.h to be used internally in PCI directory only. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 18:04:23 -05:00
Sinan Kaya	1842623850	PCI: Handle error return from pci_reset_bridge_secondary_bus() Commit `01fd61c0b9` ("PCI: Add a return type for pci_reset_bridge_secondary_bus()") added a return value to the function to return if a device is accessible following a reset. Callers are not checking the value. Pass error code up high in the stack if device is not accessible. Fixes: `01fd61c0b9` ("PCI: Add a return type for pci_reset_bridge_secondary_bus()") Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 18:04:23 -05:00
Bjorn Helgaas	51259d0022	PCI/IOV: Tidy pci_sriov_set_totalvfs() Fix minor style issues in pci_sriov_set_totalvfs(). No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:42:21 -05:00
Keith Busch	e77b8216a2	PCI/DPC: Remove indirection waiting for inactive link Simplify waiting for the contained link to become inactive, removing the indirection to a unnecessary DPC-specific handler. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	738c4e411d	PCI/DPC: Use threaded IRQ for bottom half handling Remove the work struct that was being used to handle a DPC event and use a threaded IRQ instead. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	8aefa9b0d9	PCI/DPC: Print AER status in DPC event handling A DPC enabled device suppresses ERR_(NON)FATAL messages, preventing the AER handler from reporting error details. If the DPC trigger reason says the downstream port detected the error, collect the AER uncorrectable status for logging, then clear the status. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	f1d16b1756	PCI/DPC: Remove rp_pio_status from dpc struct We don't need to save the rp pio status across multiple contexts as all DPC event handling occurs in a single work queue context. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	0c27e28f77	PCI/DPC: Defer event handling to work queue Move all event handling to the existing work queue, which will make it simpler to pass event information to the handler. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:21:01 -05:00
Keith Busch	f8d46c89c8	PCI/DPC: Leave interrupts enabled while handling event Now that the DPC driver clears the interrupt status before exiting the IRQ handler, we don't need to abuse the DPC control register to know if a shared interrupt is for a new DPC event: a DPC port can not trigger a second interrupt until the host clears the trigger status later in the work queue handler. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:20:59 -05:00
Alexandru Gagniuc	7af02fcd84	PCI/AER: Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST According to the documentation, "pcie_ports=native", linux should use native AER and DPC services. While that is true for the _OSC method parsing, this is not the only place that is checked. Should the HEST list PCIe ports as firmware-first, linux will not use native services. This happens because aer_acpi_firmware_first() doesn't take 'pcie_ports' into account. This is wrong. DPC uses the same logic when it decides whether to load or not, so fixing this also fixes DPC not loading. Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com> [bhelgaas: return "false" from bool function (from kbuild robot)] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:19:53 -05:00
Rajat Jain	12833017e5	PCI/AER: Add sysfs attributes for rootport cumulative stats Add sysfs attributes for rootport statistics (that are cumulative of all the ERR_* messages seen on this PCI hierarchy). Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:19:52 -05:00
Rajat Jain	81aa5206f9	PCI/AER: Add sysfs attributes to provide AER stats and breakdown Add sysfs attributes to provide total and breakdown of the AERs seen, into different type of correctable, fatal and nonfatal errors: /sys/bus/pci/devices/<dev>/aer_dev_correctable /sys/bus/pci/devices/<dev>/aer_dev_fatal /sys/bus/pci/devices/<dev>/aer_dev_nonfatal Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:19:51 -05:00
Rajat Jain	db89ccbe52	PCI/AER: Define aer_stats structure for AER capable devices Define a structure to hold the AER statistics. There are 2 groups of statistics: dev_* counters that are to be collected for all AER capable devices and rootport_* counters that are collected for all (AER capable) rootports only. Allocate and free this structure when device is added or released (thus counters survive the lifetime of the device). Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:17:03 -05:00
Rajat Jain	60ed982a4e	PCI/AER: Move internal declarations to drivers/pci/pci.h Since pci_aer_init() and pci_no_aer() are used only internally, move their declarations to the PCI internal header file. Also, no one cares about return value of pci_aer_init(), so make it void. Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2018-07-19 16:17:03 -05:00
Tyler Baicar	bd237801fe	PCI/AER: Adopt lspci names for AER error decoding lspci uses abbreviated naming for AER error strings. Adopt the same naming convention for the AER printing so they match. Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:17:01 -05:00
Keith Busch	1e4511604d	PCI/AER: Expose internal API for obtaining AER information Export some common AER functions and structures for other PCI core drivers to use. Since this is making the function externally visible inside the PCI core, prepend "aer_" to the function name. Signed-off-by: Keith Busch <keith.busch@intel.com> [bhelgaas: move AER declarations from linux/aer.h to drivers/pci/pci.h] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Reviewed-by: Oza Pawandeep <poza@codeaurora.org>	2018-07-19 16:16:55 -05:00
Linus Torvalds	fb7d1bcf16	pci-v4.18-fixes-3 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAltQ1y4UHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vyzKA//T8+ePVGcIBZhyEDy3gX0V/WXF5Sr feOWCy5YWsY3gWkQ1XIU40kPox+6/bsO8Cte74aO5m1cWShpqEJntuFkOInNz9ag 6gkN1j3G7B8VjpzWlH9rML2d2QVcnm8POBkmtwEgBw8rdAumD25MFsvjENhVkdeL LOLzi+8kPRQl8UK33HnawU6spLgHCosSVVInIjPyNpSzw+agbW+s0i4/kmWXnB8C 9P4VieQ3dzyfzX1W5ty82Ck6Gd6gOXI42hVls9EFrJnuxQIUPe5pX14dqFzVFP3G j9SrUTPbnoZ3yVxW4mhibCv3v8+vyTxPhkradj4zt1dk4Cq4+s1Z9+l8vPjChIkQ X679bNgI1x3xIdtUc5OcIdNI9LyzmCKZ309iNPO0bTD/gHvXromw4wqvOUCkBEoa HkTJeFXf+h7DsFNAcj6ntaQAbUcwjsOgHLhikhZJ8nUcxvjLvnIZ0imzytBuyboJ L7GV0VurwVgi2yHhdzsQ9qiSO4iTXnSe05urxzZyIn8mFETOxB6Kai2kmvAKTpqx DR0/gZMWHYd40zRtPmcpFbzPCJAaTtks2/C/IzOeUkSRDSLsc5IhmjduG5pNAfqC ikf/WrWNbZHXbLAGyD37YElhYOl7StcNz3yZtlvvbx1iNjWikU4NJR/GwgNeHZ3j sRI2Trz8bfJW4AE= =SV5J -----END PGP SIGNATURE----- Merge tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix crashes that happen when PHY drivers are left disabled in the V3 Semiconductor, MediaTek, Faraday, Aardvark, DesignWare, Versatile, and X-Gene host controller drivers (Sergei Shtylyov) - Fix a NULL pointer dereference in the endpoint library configfs support (Kishon Vijay Abraham I) - Fix a race condition in Hyper-V IRQ handling (Dexuan Cui) * tag 'pci-v4.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: v3-semi: Fix I/O space page leak PCI: mediatek: Fix I/O space page leak PCI: faraday: Fix I/O space page leak PCI: aardvark: Fix I/O space page leak PCI: designware: Fix I/O space page leak PCI: versatile: Fix I/O space page leak PCI: xgene: Fix I/O space page leak PCI: OF: Fix I/O space page leak PCI: endpoint: Fix NULL pointer dereference error when CONFIGFS is disabled PCI: hv: Disable/enable IRQs rather than BH in hv_compose_msi_msg()	2018-07-19 11:54:04 -07:00
Gustavo Pimentel	15c972dfb3	PCI: endpoint: Add MSI set maximum restriction Add pci_epc_set_msi() maximum 32 interrupts validation. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:47:25 +01:00
Gustavo Pimentel	c2e00e3108	pci-epf-test/pci_endpoint_test: Add MSI-X support Add MSI-X support and update driver documentation accordingly. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:46:45 +01:00
Gustavo Pimentel	e8817de7fb	pci-epf-test/pci_endpoint_test: Cleanup PCI_ENDPOINT_TEST memspace Cleanup PCI_ENDPOINT_TEST memspace (by moving the interrupt number away from command section). Add IRQ_TYPE register to identify the triggered ID interrupt required for the READ/WRITE/COPY tests and raise IRQ test commands. Update documentation accordingly. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:39:44 +01:00
Gustavo Pimentel	cb22d40b5f	PCI: dwc: Add legacy interrupt callback handler Currently DesignWare IP does not handle legacy interrupts. Add a legacy interrupt callback handler. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:38:38 +01:00
Gustavo Pimentel	3920a5d7b2	PCI: dwc: Rework MSI callbacks handler Remove duplicate defines located on pcie-designware.h file already available on /include/uapi/linux/pci-regs.h file. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:38:05 +01:00
Gustavo Pimentel	beb4641a78	PCI: dwc: Add MSI-X callbacks handler Add PCIe config space capability search function. Add sysfs set/get interface to allow the change of EP MSI-X maximum number. Add EP MSI-X callback for triggering interruptions. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:37:27 +01:00
Gustavo Pimentel	d3c70a98d7	PCI: Update xxx_pcie_ep_raise_irq() and pci_epc_raise_irq() signatures Change {cdns, dra7xx, artpec6, dw, rockchip}_pcie_ep_raise_irq() and pci_epc_raise_irq() signature, namely the interrupt_num variable type from u8 to u16 to accommodate 2048 maximum MSI-X interrupts. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Alan Douglas <adouglas@cadence.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:34:42 +01:00
Gustavo Pimentel	8963106eab	PCI: endpoint: Add MSI-X interfaces Add PCI_EPC_IRQ_MSIX type. Add MSI-X callbacks signatures to the ops structure. Add sysfs interface for set/get MSI-X capability maximum number. Update documentation accordingly. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:34:23 +01:00
Gustavo Pimentel	4e965ede18	PCI: dwc: Fix EP link notification implementation Move specific features settings from EP shared code (pcie-designware-ep.c) to the driver (pcie-designware-plat.c). Previous implementation disables the EP link notification by default for all SoCs that uses EP DesignWare IP, which affects directly the dra7xx and artpec6 SoCs. Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2018-07-19 11:33:58 +01:00

1 2 3 4 5 ...

6684 Commits