linux/drivers/pci/pci.h

748 lines
23 KiB
C
Raw Normal View History

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef DRIVERS_PCI_H
#define DRIVERS_PCI_H
#include <linux/pci.h>
/* Number of possible devfns: 0.0 to 1f.7 inclusive */
#define MAX_NR_DEVFNS 256
#define PCI_FIND_CAP_TTL 48
PCI: Recognize Thunderbolt devices Detect on probe whether a PCI device is part of a Thunderbolt controller. Intel uses a Vendor-Specific Extended Capability (VSEC) with ID 0x1234 on such devices. Detect presence of this VSEC and cache it in a newly added is_thunderbolt bit in struct pci_dev. Also, add a helper to check whether a given PCI device is situated on a Thunderbolt daisy chain (i.e., below a PCI device with is_thunderbolt set). The necessity arises from the following: * If an external Thunderbolt GPU is connected to a dual GPU laptop, that GPU is currently registered with vga_switcheroo even though it can neither drive the laptop's panel nor be powered off by the platform. To vga_switcheroo it will appear as if two discrete GPUs are present. As a result, when the external GPU is runtime suspended, vga_switcheroo will cut power to the internal discrete GPU which may not be runtime suspended at all at this moment. The solution is to not register external GPUs with vga_switcheroo, which necessitates a way to recognize if they're on a Thunderbolt daisy chain. * Dual GPU MacBook Pros introduced 2011+ can no longer switch external DisplayPort ports between GPUs. (They're no longer just used for DP but have become combined DP/Thunderbolt ports.) The driver to switch the ports, drivers/platform/x86/apple-gmux.c, needs to detect presence of a Thunderbolt controller and, if found, keep external ports permanently switched to the discrete GPU. v2: Make kerneldoc for pci_is_thunderbolt_attached() more precise, drop portion of commit message pertaining to separate series. (Bjorn Helgaas) Cc: Andreas Noever <andreas.noever@gmail.com> Cc: Michael Jamet <michael.jamet@intel.com> Cc: Tomas Winkler <tomas.winkler@intel.com> Cc: Amir Levy <amir.jer.levy@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: http://patchwork.freedesktop.org/patch/msgid/0ab165a4a35c0b60f29d4c306c653ead14fcd8f9.1489145162.git.lukas@wunner.de
2017-03-10 20:23:45 +00:00
#define PCI_VSEC_ID_INTEL_TBT 0x1234 /* Thunderbolt */
extern const unsigned char pcie_link_speed[];
extern bool pci_early_dump;
bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
bool pcie_cap_has_rtctl(const struct pci_dev *dev);
/* Functions internal to the PCI core code */
int pci_create_sysfs_dev_files(struct pci_dev *pdev);
void pci_remove_sysfs_dev_files(struct pci_dev *pdev);
void pci_cleanup_rom(struct pci_dev *dev);
#ifdef CONFIG_DMI
extern const struct attribute_group pci_dev_smbios_attr_group;
#endif
enum pci_mmap_api {
PCI_MMAP_SYSFS, /* mmap on /sys/bus/pci/devices/<BDF>/resource<N> */
PCI_MMAP_PROCFS /* mmap on /proc/bus/pci/<BDF> */
};
int pci_mmap_fits(struct pci_dev *pdev, int resno, struct vm_area_struct *vmai,
enum pci_mmap_api mmap_api);
bool pci_reset_supported(struct pci_dev *dev);
void pci_init_reset_methods(struct pci_dev *dev);
int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
int pci_bus_error_reset(struct pci_dev *dev);
struct pci_cap_saved_data {
u16 cap_nr;
bool cap_extended;
unsigned int size;
u32 data[];
};
struct pci_cap_saved_state {
struct hlist_node next;
struct pci_cap_saved_data cap;
};
void pci_allocate_cap_save_buffers(struct pci_dev *dev);
void pci_free_cap_save_buffers(struct pci_dev *dev);
int pci_add_cap_save_buffer(struct pci_dev *dev, char cap, unsigned int size);
int pci_add_ext_cap_save_buffer(struct pci_dev *dev,
u16 cap, unsigned int size);
struct pci_cap_saved_state *pci_find_saved_cap(struct pci_dev *dev, char cap);
struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
u16 cap);
#define PCI_PM_D2_DELAY 200 /* usec; see PCIe r4.0, sec 5.9.1 */
#define PCI_PM_D3HOT_WAIT 10 /* msec */
#define PCI_PM_D3COLD_WAIT 100 /* msec */
/**
* struct pci_platform_pm_ops - Firmware PM callbacks
*
* @bridge_d3: Does the bridge allow entering into D3
*
* @is_manageable: returns 'true' if given device is power manageable by the
* platform firmware
*
* @set_state: invokes the platform firmware to set the device's power state
*
* @get_state: queries the platform firmware for a device's current power state
*
* @refresh_state: asks the platform to refresh the device's power state data
*
* @choose_state: returns PCI power state of given device preferred by the
* platform; to be used during system-wide transitions from a
* sleeping state to the working state and vice versa
*
* @set_wakeup: enables/disables wakeup capability for the device
PCI / ACPI / PM: Platform support for PCI PME wake-up Although the majority of PCI devices can generate PMEs that in principle may be used to wake up devices suspended at run time, platform support is generally necessary to convert PMEs into wake-up events that can be delivered to the kernel. If ACPI is used for this purpose, PME signals generated by a PCI device will trigger the ACPI GPE associated with the device to generate an ACPI wake-up event that we can set up a handler for, provided that everything is configured correctly. Unfortunately, the subset of PCI devices that have GPEs associated with them is quite limited. The devices without dedicated GPEs have to rely on the GPEs associated with other devices (in the majority of cases their upstream bridges and, possibly, the root bridge) to generate ACPI wake-up events in response to PME signals from them. Add ACPI platform support for PCI PME wake-up: o Add a framework making is possible to use ACPI system notify handlers for run-time PM. o Add new PCI platform callback ->run_wake() to struct pci_platform_pm_ops allowing us to enable/disable the platform to generate wake-up events for given device. Implemet this callback for the ACPI platform. o Define ACPI wake-up handlers for PCI devices and PCI root buses and make the PCI-ACPI binding code register wake-up notifiers for all PCI devices present in the ACPI tables. o Add function pci_dev_run_wake() which can be used by PCI drivers to check if given device is capable of generating wake-up events at run time. Developed in cooperation with Matthew Garrett <mjg@redhat.com>. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-17 22:44:09 +00:00
*
* @need_resume: returns 'true' if the given device (which is currently
* suspended) needs to be resumed to be configured for system
* wakeup.
*
* If given platform is generally capable of power managing PCI devices, all of
* these callbacks are mandatory.
*/
struct pci_platform_pm_ops {
bool (*bridge_d3)(struct pci_dev *dev);
bool (*is_manageable)(struct pci_dev *dev);
int (*set_state)(struct pci_dev *dev, pci_power_t state);
pci_power_t (*get_state)(struct pci_dev *dev);
void (*refresh_state)(struct pci_dev *dev);
pci_power_t (*choose_state)(struct pci_dev *dev);
int (*set_wakeup)(struct pci_dev *dev, bool enable);
bool (*need_resume)(struct pci_dev *dev);
};
int pci_set_platform_pm(const struct pci_platform_pm_ops *ops);
void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
void pci_refresh_power_state(struct pci_dev *dev);
int pci_power_up(struct pci_dev *dev);
void pci_disable_enabled_device(struct pci_dev *dev);
int pci_finish_runtime_suspend(struct pci_dev *dev);
void pcie_clear_device_status(struct pci_dev *dev);
void pcie_clear_root_pme_status(struct pci_dev *dev);
bool pci_check_pme_status(struct pci_dev *dev);
void pci_pme_wakeup_bus(struct pci_bus *bus);
int __pci_pme_wakeup(struct pci_dev *dev, void *ign);
void pci_pme_restore(struct pci_dev *dev);
bool pci_dev_need_resume(struct pci_dev *dev);
void pci_dev_adjust_pme(struct pci_dev *dev);
void pci_dev_complete_resume(struct pci_dev *pci_dev);
void pci_config_pm_runtime_get(struct pci_dev *dev);
void pci_config_pm_runtime_put(struct pci_dev *dev);
void pci_pm_init(struct pci_dev *dev);
void pci_ea_init(struct pci_dev *dev);
void pci_msi_init(struct pci_dev *dev);
void pci_msix_init(struct pci_dev *dev);
bool pci_bridge_d3_possible(struct pci_dev *dev);
void pci_bridge_d3_update(struct pci_dev *dev);
PCI/PM: Add missing link delays required by the PCIe spec Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints: +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller +-01.0-[04-36]-- DS hotplug port +-02.0-[37]----00.0 xHCI controller \-04.0-[38-6b]-- DS hotplug port The root port (1b.0) and the PCIe switch downstream ports are all PCIe Gen3 so they support 8GT/s link speeds. We wait for the PCIe hierarchy to enter D3cold (runtime): pcieport 0000:00:1b.0: power state changed by ACPI to D3cold When it wakes up from D3cold, according to the PCIe 5.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 5.0 section 6.6.1. For the PCIe Gen3 ports we are dealing with here, the following applies: With a Downstream Port that supports Link speeds greater than 5.0 GT/s, software must wait a minimum of 100 ms after Link training completes before sending a Configuration Request to the device immediately below that Port. Software can determine when Link training completes by polling the Data Link Layer Link Active bit or by setting up an associated interrupt (see Section 6.7.3.3). Translating this into the above topology we would need to do this (DLLLA stands for Data Link Layer Link Active): 0000:00:1b.0: wait for 100 ms after DLLLA is set before access to 0000:01:00.0 0000:02:00.0: wait for 100 ms after DLLLA is set before access to 0000:03:00.0 0000:02:02.0: wait for 100 ms after DLLLA is set before access to 0000:37:00.0 I've instrumented the kernel with some additional logging so we can see the actual delays performed: pcieport 0000:00:1b.0: power state changed by ACPI to D0 pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms For the switch upstream port (01:00.0 reachable through 00:1b.0 root port) we wait for 100 ms but not taking into account the DLLLA requirement. We then wait 10 ms for D3hot -> D0 transition of the root port and the two downstream hotplug ports. This means that we deviate from what the spec requires. Performing the same check for system sleep (s2idle) transitions it turns out to be even worse. None of the mandatory delays are performed. If this would be S3 instead of s2idle then according to PCI FW spec 3.2 section 4.6.8. there is a specific _DSM that allows the OS to skip the delays but this platform does not provide the _DSM and does not go to S3 anyway so no firmware is involved that could already handle these delays. On this particular platform these delays are not actually needed because there is an additional delay as part of the ACPI power resource that is used to turn on power to the hierarchy but since that additional delay is not required by any of standards (PCIe, ACPI) it is not present in the Intel Ice Lake, for example where missing the mandatory delays causes pciehp to start tearing down the stack too early (links are not yet trained). Below is an example how it looks like when this happens: pcieport 0000:83:04.0: pciehp: Slot(4): Card not present pcieport 0000:87:04.0: PME# disabled pcieport 0000:83:04.0: pciehp: pciehp_unconfigure_device: domain:bus:dev = 0000:86:00 pcieport 0000:86:00.0: Refused to change power state, currently in D3 pcieport 0000:86:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x201ff) pcieport 0000:86:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0) ... There is also one reported case (see the bugzilla link below) where the missing delay causes xHCI on a Titan Ridge controller fail to runtime resume when USB-C dock is plugged. This does not involve pciehp but instead the PCI core fails to runtime resume the xHCI device: pcieport 0000:04:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020) pcieport 0000:04:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100406) xhci_hcd 0000:39:00.0: Refused to change power state, currently in D3 xhci_hcd 0000:39:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff) xhci_hcd 0000:39:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0) ... Add a new function pci_bridge_wait_for_secondary_bus() that is called on PCI core resume and runtime resume paths accordingly if the bridge entered D3cold (and thus went through reset). This is second attempt to add the missing delays. The previous solution in c2bf1fc212f7 ("PCI: Add missing link delays required by the PCIe spec") was reverted because of two issues it caused: 1. One system become unresponsive after S3 resume due to PME service spinning in pcie_pme_work_fn(). The root port in question reports that the xHCI sent PME but the xHCI device itself does not have PME status set. The PME status bit is never cleared in the root port resulting the indefinite loop in pcie_pme_work_fn(). 2. Slows down resume if the root/downstream port does not support Data Link Layer Active Reporting because pcie_wait_for_link_delay() waits 1100 ms in that case. This version should avoid the above issues because we restrict the delay to happen only if the port went into D3cold. Link: https://lore.kernel.org/linux-pci/SL2P216MB01878BBCD75F21D882AEEA2880C60@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=203885 Link: https://lore.kernel.org/r/20191112091617.70282-3-mika.westerberg@linux.intel.com Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-11-12 09:16:17 +00:00
void pci_bridge_wait_for_secondary_bus(struct pci_dev *dev);
static inline void pci_wakeup_event(struct pci_dev *dev)
{
/* Wait 100 ms before the system can be put into a sleep state. */
pm_wakeup_event(&dev->dev, 100);
}
static inline bool pci_has_subordinate(struct pci_dev *pci_dev)
{
return !!(pci_dev->subordinate);
}
PCI: Put PCIe ports into D3 during suspend Currently the Linux PCI core does not touch power state of PCI bridges and PCIe ports when system suspend is entered. Leaving them in D0 consumes power unnecessarily and may prevent the CPU from entering deeper C-states. With recent PCIe hardware we can power down the ports to save power given that we take into account few restrictions: - The PCIe port hardware is recent enough, starting from 2015. - Devices connected to PCIe ports are effectively in D3cold once the port is transitioned to D3 (the config space is not accessible anymore and the link may be powered down). - Devices behind the PCIe port need to be allowed to transition to D3cold and back. There is a way both drivers and userspace can forbid this. - If the device behind the PCIe port is capable of waking the system it needs to be able to do so from D3cold. This patch adds a new flag to struct pci_device called 'bridge_d3'. This flag is set and cleared by the PCI core whenever there is a change in power management state of any of the devices behind the PCIe port. When system later on is suspended we only need to check this flag and if it is true transition the port to D3 otherwise we leave it in D0. Also provide override mechanism via command line parameter "pcie_port_pm=[off|force]" that can be used to disable or enable the feature regardless of the BIOS manufacturing date. Tested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-02 08:17:12 +00:00
static inline bool pci_power_manageable(struct pci_dev *pci_dev)
{
/*
* Currently we allow normal PCI devices and PCI bridges transition
* into D3 if their bridge_d3 is set.
*/
return !pci_has_subordinate(pci_dev) || pci_dev->bridge_d3;
}
static inline bool pcie_downstream_port(const struct pci_dev *dev)
{
int type = pci_pcie_type(dev);
return type == PCI_EXP_TYPE_ROOT_PORT ||
type == PCI_EXP_TYPE_DOWNSTREAM ||
type == PCI_EXP_TYPE_PCIE_BRIDGE;
}
void pci_vpd_init(struct pci_dev *dev);
void pci_vpd_release(struct pci_dev *dev);
extern const struct attribute_group pci_dev_vpd_attr_group;
/* PCI Virtual Channel */
int pci_save_vc_state(struct pci_dev *dev);
void pci_restore_vc_state(struct pci_dev *dev);
void pci_allocate_vc_save_buffers(struct pci_dev *dev);
/* PCI /proc functions */
#ifdef CONFIG_PROC_FS
int pci_proc_attach_device(struct pci_dev *dev);
int pci_proc_detach_device(struct pci_dev *dev);
int pci_proc_detach_bus(struct pci_bus *bus);
#else
static inline int pci_proc_attach_device(struct pci_dev *dev) { return 0; }
static inline int pci_proc_detach_device(struct pci_dev *dev) { return 0; }
static inline int pci_proc_detach_bus(struct pci_bus *bus) { return 0; }
#endif
/* Functions for PCI Hotplug drivers to use */
int pci_hp_add_bridge(struct pci_dev *dev);
#ifdef HAVE_PCI_LEGACY
void pci_create_legacy_files(struct pci_bus *bus);
void pci_remove_legacy_files(struct pci_bus *bus);
#else
static inline void pci_create_legacy_files(struct pci_bus *bus) { return; }
static inline void pci_remove_legacy_files(struct pci_bus *bus) { return; }
#endif
/* Lock for read/write access to pci device and bus lists */
extern struct rw_semaphore pci_bus_sem;
extern struct mutex pci_slot_mutex;
extern raw_spinlock_t pci_lock;
extern unsigned int pci_pm_d3hot_delay;
#ifdef CONFIG_PCI_MSI
void pci_no_msi(void);
#else
static inline void pci_no_msi(void) { }
#endif
void pci_realloc_get_opt(char *);
static inline int pci_no_d1d2(struct pci_dev *dev)
{
unsigned int parent_dstates = 0;
if (dev->bus->self)
parent_dstates = dev->bus->self->no_d1d2;
return (dev->no_d1d2 || parent_dstates);
}
extern const struct attribute_group *pci_dev_groups[];
extern const struct attribute_group *pcibus_groups[];
extern const struct device_type pci_dev_type;
extern const struct attribute_group *pci_bus_groups[];
extern unsigned long pci_hotplug_io_size;
extern unsigned long pci_hotplug_mmio_size;
extern unsigned long pci_hotplug_mmio_pref_size;
extern unsigned long pci_hotplug_bus_size;
/**
* pci_match_one_device - Tell if a PCI device structure has a matching
* PCI device id structure
* @id: single PCI device id structure to match
* @dev: the PCI device structure to match against
*
* Returns the matching pci_device_id structure or %NULL if there is no match.
*/
static inline const struct pci_device_id *
pci_match_one_device(const struct pci_device_id *id, const struct pci_dev *dev)
{
if ((id->vendor == PCI_ANY_ID || id->vendor == dev->vendor) &&
(id->device == PCI_ANY_ID || id->device == dev->device) &&
(id->subvendor == PCI_ANY_ID || id->subvendor == dev->subsystem_vendor) &&
(id->subdevice == PCI_ANY_ID || id->subdevice == dev->subsystem_device) &&
!((id->class ^ dev->class) & id->class_mask))
return id;
return NULL;
}
PCI: introduce pci_slot Currently, /sys/bus/pci/slots/ only exposes hotplug attributes when a hotplug driver is loaded, but PCI slots have attributes such as address, speed, width, etc. that are not related to hotplug at all. Introduce pci_slot as the primary data structure and kobject model. Hotplug attributes described in hotplug_slot become a secondary structure associated with the pci_slot. This patch only creates the infrastructure that allows the separation of PCI slot attributes and hotplug attributes. In this patch, the PCI hotplug core remains the only user of this infrastructure, and thus, /sys/bus/pci/slots/ will still only become populated when a hotplug driver is loaded. A later patch in this series will add a second user of this new infrastructure and demonstrate splitting the task of exposing pci_slot attributes from hotplug_slot attributes. - Make pci_slot the primary sysfs entity. hotplug_slot becomes a subsidiary structure. o pci_create_slot() creates and registers a slot with the PCI core o pci_slot_add_hotplug() gives it hotplug capability - Change the prototype of pci_hp_register() to take the bus and slot number (on parent bus) as parameters. - Remove all the ->get_address methods since this functionality is now handled by pci_slot directly. [achiang@hp.com: rpaphp-correctly-pci_hp_register-for-empty-pci-slots] Tested-by: Badari Pulavarty <pbadari@us.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: make headers_check happy] [akpm@linux-foundation.org: nuther build fix] [akpm@linux-foundation.org: fix typo in #include] Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Matthew Wilcox <matthew@wil.cx> Cc: Greg KH <greg@kroah.com> Cc: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Cc: Len Brown <lenb@kernel.org> Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-06-10 21:28:50 +00:00
/* PCI slot sysfs helper code */
#define to_pci_slot(s) container_of(s, struct pci_slot, kobj)
extern struct kset *pci_slots_kset;
struct pci_slot_attribute {
struct attribute attr;
ssize_t (*show)(struct pci_slot *, char *);
ssize_t (*store)(struct pci_slot *, const char *, size_t);
};
#define to_pci_slot_attr(s) container_of(s, struct pci_slot_attribute, attr)
enum pci_bar_type {
pci_bar_unknown, /* Standard PCI BAR probe */
pci_bar_io, /* An I/O port BAR */
pci_bar_mem32, /* A 32-bit memory BAR */
pci_bar_mem64, /* A 64-bit memory BAR */
};
struct device *pci_get_host_bridge_device(struct pci_dev *dev);
void pci_put_host_bridge_device(struct device *dev);
int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int crs_timeout);
PCI: Workaround IDT switch ACS Source Validation erratum Some IDT switches incorrectly flag an ACS Source Validation error on completions for config read requests even though PCIe r4.0, sec 6.12.1.1, says that completions are never affected by ACS Source Validation. Here's the text of IDT 89H32H8G3-YC, erratum #36: Item #36 - Downstream port applies ACS Source Validation to Completions Section 6.12.1.1 of the PCI Express Base Specification 3.1 states that completions are never affected by ACS Source Validation. However, completions received by a downstream port of the PCIe switch from a device that has not yet captured a PCIe bus number are incorrectly dropped by ACS Source Validation by the switch downstream port. Workaround: Issue a CfgWr1 to the downstream device before issuing the first CfgRd1 to the device. This allows the downstream device to capture its bus number; ACS Source Validation no longer stops completions from being forwarded by the downstream port. It has been observed that Microsoft Windows implements this workaround already; however, some versions of Linux and other operating systems may not. When doing the first config read to probe for a device, if the device is behind an IDT switch with this erratum: 1. Disable ACS Source Validation if enabled 2. Wait for device to become ready to accept config accesses (by using the Config Request Retry Status mechanism) 3. Do a config write to the endpoint 4. Enable ACS Source Validation (if it was enabled to begin with) The workaround suggested by IDT is basically only step 3, but we don't know when the device is ready to accept config requests. That means we need to do config reads until we receive a non-Config Request Retry Status, which means we need to disable ACS SV temporarily. Signed-off-by: James Puthukattukaran <james.puthukattukaran@oracle.com> [bhelgaas: changelog, clean up whitespace, fold in unused variable fix from Anders Roxell <anders.roxell@linaro.org>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2018-07-09 15:31:25 +00:00
bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
int crs_timeout);
int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int crs_timeout);
int pci_setup_device(struct pci_dev *dev);
int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
struct resource *res, unsigned int reg);
void pci_configure_ari(struct pci_dev *dev);
void __pci_bus_size_bridges(struct pci_bus *bus,
PCI / ACPI: Use boot-time resource allocation rules during hotplug On x86 platforms, the kernel respects PCI resource assignments from the BIOS and only reassigns resources for unassigned BARs at boot time. However, with the ACPI-based hotplug (acpiphp), it ignores the BIOS' PCI resource assignments completely and reassigns all resources by itself. This causes differences in PCI resource allocation between boot time and runtime hotplug to occur, which is generally undesirable and sometimes actively breaks things. Namely, if there are enough resources, reassigning all PCI resources during runtime hotplug should work, but it may fail if the resources are constrained. This may happen, for instance, when some PCI devices with huge MMIO BARs are involved in the runtime hotplug operations, because the current PCI MMIO alignment algorithm may waste huge chunks of MMIO address space in those cases. On the Alexander's Sony VAIO VPCZ23A4R the BIOS allocates limited MMIO resources for the dock station which contains a device (graphics adapter) with a 256MB MMIO BAR. An attempt to reassign that during runtime hotplug causes the dock station MMIO window to be exhausted and acpiphp fails to allocate resources for the majority of devices on the dock station as a result. To prevent that from happening, modify acpiphp to follow the boot time resources allocation behavior so that the BIOS' resource assignments are respected during runtime hotplug too. [rjw: Changelog] References: https://bugzilla.kernel.org/show_bug.cgi?id=56531 Reported-and-tested-by: Alexander E. Patrakov <patrakov@gmail.com> Tested-by: Illya Klymov <xanf@xanf.me> Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: 3.9+ <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-06-22 23:01:35 +00:00
struct list_head *realloc_head);
void __pci_bus_assign_resources(const struct pci_bus *bus,
struct list_head *realloc_head,
struct list_head *fail_head);
bool pci_bus_clip_resource(struct pci_dev *dev, int idx);
void pci_reassigndev_resource_alignment(struct pci_dev *dev);
void pci_disable_bridge_window(struct pci_dev *dev);
struct pci_bus *pci_bus_get(struct pci_bus *bus);
void pci_bus_put(struct pci_bus *bus);
2009-03-16 08:13:39 +00:00
/* PCIe link information from Link Capabilities 2 */
#define PCIE_LNKCAP2_SLS2SPEED(lnkcap2) \
((lnkcap2) & PCI_EXP_LNKCAP2_SLS_64_0GB ? PCIE_SPEED_64_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_32_0GB ? PCIE_SPEED_32_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_16_0GB ? PCIE_SPEED_16_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_8_0GB ? PCIE_SPEED_8_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_5_0GB ? PCIE_SPEED_5_0GT : \
(lnkcap2) & PCI_EXP_LNKCAP2_SLS_2_5GB ? PCIE_SPEED_2_5GT : \
PCI_SPEED_UNKNOWN)
/* PCIe speed to Mb/s reduced by encoding overhead */
#define PCIE_SPEED2MBS_ENC(speed) \
((speed) == PCIE_SPEED_64_0GT ? 64000*128/130 : \
(speed) == PCIE_SPEED_32_0GT ? 32000*128/130 : \
(speed) == PCIE_SPEED_16_0GT ? 16000*128/130 : \
(speed) == PCIE_SPEED_8_0GT ? 8000*128/130 : \
(speed) == PCIE_SPEED_5_0GT ? 5000*8/10 : \
(speed) == PCIE_SPEED_2_5GT ? 2500*8/10 : \
0)
const char *pci_speed_string(enum pci_bus_speed speed);
enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
u32 pcie_bandwidth_capable(struct pci_dev *dev, enum pci_bus_speed *speed,
enum pcie_link_width *width);
void __pcie_print_link_status(struct pci_dev *dev, bool verbose);
void pcie_report_downtraining(struct pci_dev *dev);
void pcie_update_link_speed(struct pci_bus *bus, u16 link_status);
/* Single Root I/O Virtualization */
struct pci_sriov {
int pos; /* Capability position */
int nres; /* Number of resources */
u32 cap; /* SR-IOV Capabilities */
u16 ctrl; /* SR-IOV Control */
u16 total_VFs; /* Total VFs associated with the PF */
u16 initial_VFs; /* Initial VFs associated with the PF */
u16 num_VFs; /* Number of VFs available */
u16 offset; /* First VF Routing ID offset */
u16 stride; /* Following VF stride */
u16 vf_device; /* VF device ID */
u32 pgsz; /* Page size for BAR alignment */
u8 link; /* Function Dependency Link */
u8 max_VF_buses; /* Max buses consumed by VFs */
u16 driver_max_VFs; /* Max num VFs driver supports */
struct pci_dev *dev; /* Lowest numbered PF */
struct pci_dev *self; /* This PF */
u32 class; /* VF device */
u8 hdr_type; /* VF header type */
u16 subsystem_vendor; /* VF subsystem vendor */
u16 subsystem_device; /* VF subsystem device */
resource_size_t barsz[PCI_SRIOV_NUM_BARS]; /* VF BAR size */
bool drivers_autoprobe; /* Auto probing of VFs by driver */
};
/**
* pci_dev_set_io_state - Set the new error state if possible.
*
* @dev: PCI device to set new error_state
* @new: the state we want dev to be in
*
* Must be called with device_lock held.
*
* Returns true if state has been changed to the requested state.
*/
static inline bool pci_dev_set_io_state(struct pci_dev *dev,
pci_channel_state_t new)
{
bool changed = false;
device_lock_assert(&dev->dev);
switch (new) {
case pci_channel_io_perm_failure:
switch (dev->error_state) {
case pci_channel_io_frozen:
case pci_channel_io_normal:
case pci_channel_io_perm_failure:
changed = true;
break;
}
break;
case pci_channel_io_frozen:
switch (dev->error_state) {
case pci_channel_io_frozen:
case pci_channel_io_normal:
changed = true;
break;
}
break;
case pci_channel_io_normal:
switch (dev->error_state) {
case pci_channel_io_frozen:
case pci_channel_io_normal:
changed = true;
break;
}
break;
}
if (changed)
dev->error_state = new;
return changed;
}
static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
{
device_lock(&dev->dev);
pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
device_unlock(&dev->dev);
return 0;
}
static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
return dev->error_state == pci_channel_io_perm_failure;
}
/* pci_dev priv_flags */
#define PCI_DEV_ADDED 0
PCI: pciehp: Ignore Link Down/Up caused by DPC Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon an error and attempts to re-enable it when instructed by the DPC driver. A slot which is both DPC- and hotplug-capable is currently powered off by pciehp once DPC is triggered (due to the link change) and powered back up on successful recovery. That's undesirable, the slot should remain powered so the hotplugged device remains bound to its driver. DPC notifies the driver of the error and of successful recovery in pcie_do_recovery() and the driver may then restore the device to working state. Moreover, Sinan points out that turning off slot power by pciehp may foil recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm reset. Sathyanarayanan reports extended delays or failure in link retraining by DPC if pciehp brings down the slot. Fix by detecting whether a Link Down event is caused by DPC and awaiting recovery if so. On successful recovery, ignore both the Link Down and the subsequent Link Up event. Afterwards, check whether the link is down to detect surprise-removal or another DPC event immediately after DPC recovery. Ensure that the corresponding DLLSC event is not ignored by synthesizing it and invoking irq_wake_thread() to trigger a re-run of pciehp_ist(). The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and dpc_handler(), race against each other. If pciehp is faster than DPC, it will wait until DPC recovery completes. Recovery consists of two steps: The first step (waiting for link disablement) is recognizable by pciehp through a set DPC Trigger Status bit. The second step (waiting for link retraining) is recognizable through a newly introduced PCI_DPC_RECOVERING flag. If DPC is faster than pciehp, neither of the two flags will be set and pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all, hence DLLSC events are not ignored by default. pciehp waits up to 4 seconds before assuming that DPC recovery failed and bringing down the slot. This timeout is not taken from the spec (it doesn't mandate one) but based on a report from Yicong Yang that DPC may take a bit more than 3 seconds on HiSilicon's Kunpeng platform. The timeout is necessary because the DPC Trigger Status bit may never clear: On Root Ports which support RP Extensions for DPC, the DPC driver polls the DPC RP Busy bit for up to 1 second before giving up on DPC recovery. Without the timeout, pciehp would then wait indefinitely for DPC to complete. This commit draws inspiration from previous attempts to synchronize DPC with pciehp: By Sinan Kaya, August 2018: https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/ By Ethan Zhao, October 2020: https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/ By Kuppuswamy Sathyanarayanan, March 2021: https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/ Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de Reported-by: Sinan Kaya <okaya@kernel.org> Reported-by: Ethan Zhao <haifeng.zhao@intel.com> Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <kbusch@kernel.org>
2021-05-01 08:29:00 +00:00
#define PCI_DPC_RECOVERED 1
#define PCI_DPC_RECOVERING 2
static inline void pci_dev_assign_added(struct pci_dev *dev, bool added)
{
assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added);
}
static inline bool pci_dev_is_added(const struct pci_dev *dev)
{
return test_bit(PCI_DEV_ADDED, &dev->priv_flags);
}
#ifdef CONFIG_PCIEAER
#include <linux/aer.h>
#define AER_MAX_MULTI_ERR_DEVICES 5 /* Not likely to have more */
struct aer_err_info {
struct pci_dev *dev[AER_MAX_MULTI_ERR_DEVICES];
int error_dev_num;
unsigned int id:16;
unsigned int severity:2; /* 0:NONFATAL | 1:FATAL | 2:COR */
unsigned int __pad1:5;
unsigned int multi_error_valid:1;
unsigned int first_error:5;
unsigned int __pad2:2;
unsigned int tlp_header_valid:1;
unsigned int status; /* COR/UNCOR Error Status */
unsigned int mask; /* COR/UNCOR Error Mask */
struct aer_header_log_regs tlp; /* TLP Header */
};
int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info);
void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
#endif /* CONFIG_PCIEAER */
#ifdef CONFIG_PCIEPORTBUS
/* Cached RCEC Endpoint Association */
struct rcec_ea {
u8 nextbusn;
u8 lastbusn;
u32 bitmap;
};
#endif
#ifdef CONFIG_PCIE_DPC
void pci_save_dpc_state(struct pci_dev *dev);
void pci_restore_dpc_state(struct pci_dev *dev);
void pci_dpc_init(struct pci_dev *pdev);
void dpc_process_error(struct pci_dev *pdev);
pci_ers_result_t dpc_reset_link(struct pci_dev *pdev);
PCI: pciehp: Ignore Link Down/Up caused by DPC Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon an error and attempts to re-enable it when instructed by the DPC driver. A slot which is both DPC- and hotplug-capable is currently powered off by pciehp once DPC is triggered (due to the link change) and powered back up on successful recovery. That's undesirable, the slot should remain powered so the hotplugged device remains bound to its driver. DPC notifies the driver of the error and of successful recovery in pcie_do_recovery() and the driver may then restore the device to working state. Moreover, Sinan points out that turning off slot power by pciehp may foil recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm reset. Sathyanarayanan reports extended delays or failure in link retraining by DPC if pciehp brings down the slot. Fix by detecting whether a Link Down event is caused by DPC and awaiting recovery if so. On successful recovery, ignore both the Link Down and the subsequent Link Up event. Afterwards, check whether the link is down to detect surprise-removal or another DPC event immediately after DPC recovery. Ensure that the corresponding DLLSC event is not ignored by synthesizing it and invoking irq_wake_thread() to trigger a re-run of pciehp_ist(). The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and dpc_handler(), race against each other. If pciehp is faster than DPC, it will wait until DPC recovery completes. Recovery consists of two steps: The first step (waiting for link disablement) is recognizable by pciehp through a set DPC Trigger Status bit. The second step (waiting for link retraining) is recognizable through a newly introduced PCI_DPC_RECOVERING flag. If DPC is faster than pciehp, neither of the two flags will be set and pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all, hence DLLSC events are not ignored by default. pciehp waits up to 4 seconds before assuming that DPC recovery failed and bringing down the slot. This timeout is not taken from the spec (it doesn't mandate one) but based on a report from Yicong Yang that DPC may take a bit more than 3 seconds on HiSilicon's Kunpeng platform. The timeout is necessary because the DPC Trigger Status bit may never clear: On Root Ports which support RP Extensions for DPC, the DPC driver polls the DPC RP Busy bit for up to 1 second before giving up on DPC recovery. Without the timeout, pciehp would then wait indefinitely for DPC to complete. This commit draws inspiration from previous attempts to synchronize DPC with pciehp: By Sinan Kaya, August 2018: https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/ By Ethan Zhao, October 2020: https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/ By Kuppuswamy Sathyanarayanan, March 2021: https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/ Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de Reported-by: Sinan Kaya <okaya@kernel.org> Reported-by: Ethan Zhao <haifeng.zhao@intel.com> Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <kbusch@kernel.org>
2021-05-01 08:29:00 +00:00
bool pci_dpc_recovered(struct pci_dev *pdev);
#else
static inline void pci_save_dpc_state(struct pci_dev *dev) {}
static inline void pci_restore_dpc_state(struct pci_dev *dev) {}
static inline void pci_dpc_init(struct pci_dev *pdev) {}
PCI: pciehp: Ignore Link Down/Up caused by DPC Downstream Port Containment (PCIe r5.0, sec. 6.2.10) disables the link upon an error and attempts to re-enable it when instructed by the DPC driver. A slot which is both DPC- and hotplug-capable is currently powered off by pciehp once DPC is triggered (due to the link change) and powered back up on successful recovery. That's undesirable, the slot should remain powered so the hotplugged device remains bound to its driver. DPC notifies the driver of the error and of successful recovery in pcie_do_recovery() and the driver may then restore the device to working state. Moreover, Sinan points out that turning off slot power by pciehp may foil recovery by DPC: Power off/on is a cold reset concurrently to DPC's warm reset. Sathyanarayanan reports extended delays or failure in link retraining by DPC if pciehp brings down the slot. Fix by detecting whether a Link Down event is caused by DPC and awaiting recovery if so. On successful recovery, ignore both the Link Down and the subsequent Link Up event. Afterwards, check whether the link is down to detect surprise-removal or another DPC event immediately after DPC recovery. Ensure that the corresponding DLLSC event is not ignored by synthesizing it and invoking irq_wake_thread() to trigger a re-run of pciehp_ist(). The IRQ threads of the hotplug and DPC drivers, pciehp_ist() and dpc_handler(), race against each other. If pciehp is faster than DPC, it will wait until DPC recovery completes. Recovery consists of two steps: The first step (waiting for link disablement) is recognizable by pciehp through a set DPC Trigger Status bit. The second step (waiting for link retraining) is recognizable through a newly introduced PCI_DPC_RECOVERING flag. If DPC is faster than pciehp, neither of the two flags will be set and pciehp may glean the recovery status from the new PCI_DPC_RECOVERED flag. The flag is zero if DPC didn't occur at all, hence DLLSC events are not ignored by default. pciehp waits up to 4 seconds before assuming that DPC recovery failed and bringing down the slot. This timeout is not taken from the spec (it doesn't mandate one) but based on a report from Yicong Yang that DPC may take a bit more than 3 seconds on HiSilicon's Kunpeng platform. The timeout is necessary because the DPC Trigger Status bit may never clear: On Root Ports which support RP Extensions for DPC, the DPC driver polls the DPC RP Busy bit for up to 1 second before giving up on DPC recovery. Without the timeout, pciehp would then wait indefinitely for DPC to complete. This commit draws inspiration from previous attempts to synchronize DPC with pciehp: By Sinan Kaya, August 2018: https://lore.kernel.org/linux-pci/20180818065126.77912-1-okaya@kernel.org/ By Ethan Zhao, October 2020: https://lore.kernel.org/linux-pci/20201007113158.48933-1-haifeng.zhao@intel.com/ By Kuppuswamy Sathyanarayanan, March 2021: https://lore.kernel.org/linux-pci/59cb30f5e5ac6d65427ceaadf1012b2ba8dbf66c.1615606143.git.sathyanarayanan.kuppuswamy@linux.intel.com/ Link: https://lore.kernel.org/r/0be565d97438fe2a6d57354b3aa4e8626952a00b.1619857124.git.lukas@wunner.de Reported-by: Sinan Kaya <okaya@kernel.org> Reported-by: Ethan Zhao <haifeng.zhao@intel.com> Reported-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <kbusch@kernel.org>
2021-05-01 08:29:00 +00:00
static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
#endif
#ifdef CONFIG_PCIEPORTBUS
void pci_rcec_init(struct pci_dev *dev);
void pci_rcec_exit(struct pci_dev *dev);
void pcie_link_rcec(struct pci_dev *rcec);
void pcie_walk_rcec(struct pci_dev *rcec,
int (*cb)(struct pci_dev *, void *),
void *userdata);
#else
static inline void pci_rcec_init(struct pci_dev *dev) {}
static inline void pci_rcec_exit(struct pci_dev *dev) {}
static inline void pcie_link_rcec(struct pci_dev *rcec) {}
static inline void pcie_walk_rcec(struct pci_dev *rcec,
int (*cb)(struct pci_dev *, void *),
void *userdata) {}
#endif
#ifdef CONFIG_PCI_ATS
/* Address Translation Service */
void pci_ats_init(struct pci_dev *dev);
void pci_restore_ats_state(struct pci_dev *dev);
#else
static inline void pci_ats_init(struct pci_dev *d) { }
static inline void pci_restore_ats_state(struct pci_dev *dev) { }
#endif /* CONFIG_PCI_ATS */
#ifdef CONFIG_PCI_PRI
void pci_pri_init(struct pci_dev *dev);
void pci_restore_pri_state(struct pci_dev *pdev);
#else
static inline void pci_pri_init(struct pci_dev *dev) { }
static inline void pci_restore_pri_state(struct pci_dev *pdev) { }
#endif
#ifdef CONFIG_PCI_PASID
void pci_pasid_init(struct pci_dev *dev);
void pci_restore_pasid_state(struct pci_dev *pdev);
#else
static inline void pci_pasid_init(struct pci_dev *dev) { }
static inline void pci_restore_pasid_state(struct pci_dev *pdev) { }
#endif
#ifdef CONFIG_PCI_IOV
int pci_iov_init(struct pci_dev *dev);
void pci_iov_release(struct pci_dev *dev);
void pci_iov_remove(struct pci_dev *dev);
void pci_iov_update_resource(struct pci_dev *dev, int resno);
resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
void pci_restore_iov_state(struct pci_dev *dev);
int pci_iov_bus_range(struct pci_bus *bus);
PCI/IOV: Add sysfs MSI-X vector assignment interface A typical cloud provider SR-IOV use case is to create many VFs for use by guest VMs. The VFs may not be assigned to a VM until a customer requests a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors proportional to the number of CPUs in the VM, but there is no standard way to change the number of MSI-X vectors supported by a VF. Some Mellanox ConnectX devices support dynamic assignment of MSI-X vectors to SR-IOV VFs. This can be done by the PF driver after VFs are enabled, and it can be done without affecting VFs that are already in use. The hardware supports a limited pool of MSI-X vectors that can be assigned to the PF or to individual VFs. This is device-specific behavior that requires support in the PF driver. Add a read-only "sriov_vf_total_msix" sysfs file for the PF and a writable "sriov_vf_msix_count" file for each VF. Management software may use these to learn how many MSI-X vectors are available and to dynamically assign them to VFs before the VFs are passed through to a VM. If the PF driver implements the ->sriov_get_vf_total_msix() callback, "sriov_vf_total_msix" contains the total number of MSI-X vectors available for distribution among VFs. If no driver is bound to the VF, writing "N" to "sriov_vf_msix_count" uses the PF driver ->sriov_set_msix_vec_count() callback to assign "N" MSI-X vectors to the VF. When a VF driver subsequently reads the MSI-X Message Control register, it will see the new Table Size "N". Link: https://lore.kernel.org/linux-pci/20210314124256.70253-2-leon@kernel.org Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-04-04 07:22:18 +00:00
extern const struct attribute_group sriov_pf_dev_attr_group;
extern const struct attribute_group sriov_vf_dev_attr_group;
#else
static inline int pci_iov_init(struct pci_dev *dev)
{
return -ENODEV;
}
static inline void pci_iov_release(struct pci_dev *dev)
{
}
static inline void pci_iov_remove(struct pci_dev *dev)
{
}
static inline void pci_restore_iov_state(struct pci_dev *dev)
{
}
static inline int pci_iov_bus_range(struct pci_bus *bus)
{
return 0;
}
#endif /* CONFIG_PCI_IOV */
#ifdef CONFIG_PCIE_PTM
void pci_save_ptm_state(struct pci_dev *dev);
void pci_restore_ptm_state(struct pci_dev *dev);
void pci_disable_ptm(struct pci_dev *dev);
#else
static inline void pci_save_ptm_state(struct pci_dev *dev) { }
static inline void pci_restore_ptm_state(struct pci_dev *dev) { }
static inline void pci_disable_ptm(struct pci_dev *dev) { }
#endif
unsigned long pci_cardbus_resource_alignment(struct resource *);
static inline resource_size_t pci_resource_alignment(struct pci_dev *dev,
struct resource *res)
PCI SR-IOV: correct broken resource alignment calculations An SR-IOV capable device includes an SR-IOV PCIe capability which describes the Virtual Function (VF) BAR requirements. A typical SR-IOV device can support multiple VFs whose BARs must be in a contiguous region, effectively an array of VF BARs. The BAR reports the size requirement for a single VF. We calculate the full range needed by simply multiplying the VF BAR size with the number of possible VFs and create a resource spanning the full range. This all seems sane enough except it artificially inflates the alignment requirement for the VF BAR. The VF BAR need only be aligned to the size of a single BAR not the contiguous range of VF BARs. This can cause us to fail to allocate resources for the BAR despite the fact that we actually have enough space. This patch adds a thin PCI specific layer over the generic resource_alignment() function which is aware of the special nature of VF BARs and does sorting and allocation based on the smaller alignment requirement. I recognize that while resource_alignment is generic, it's basically a PCI helper. An alternative to this patch is to add PCI VF BAR specific information to struct resource. I opted for the extra layer rather than adding such PCI specific information to struct resource. This does have the slight downside that we don't cache the BAR size and re-read for each alignment query (happens a small handful of times during boot for each VF BAR). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Yu Zhao <yu.zhao@intel.com> Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-28 20:00:06 +00:00
{
#ifdef CONFIG_PCI_IOV
int resno = res - dev->resource;
if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
return pci_sriov_resource_alignment(dev, resno);
#endif
if (dev->class >> 8 == PCI_CLASS_BRIDGE_CARDBUS)
return pci_cardbus_resource_alignment(res);
PCI SR-IOV: correct broken resource alignment calculations An SR-IOV capable device includes an SR-IOV PCIe capability which describes the Virtual Function (VF) BAR requirements. A typical SR-IOV device can support multiple VFs whose BARs must be in a contiguous region, effectively an array of VF BARs. The BAR reports the size requirement for a single VF. We calculate the full range needed by simply multiplying the VF BAR size with the number of possible VFs and create a resource spanning the full range. This all seems sane enough except it artificially inflates the alignment requirement for the VF BAR. The VF BAR need only be aligned to the size of a single BAR not the contiguous range of VF BARs. This can cause us to fail to allocate resources for the BAR despite the fact that we actually have enough space. This patch adds a thin PCI specific layer over the generic resource_alignment() function which is aware of the special nature of VF BARs and does sorting and allocation based on the smaller alignment requirement. I recognize that while resource_alignment is generic, it's basically a PCI helper. An alternative to this patch is to add PCI VF BAR specific information to struct resource. I opted for the extra layer rather than adding such PCI specific information to struct resource. This does have the slight downside that we don't cache the BAR size and re-read for each alignment query (happens a small handful of times during boot for each VF BAR). Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Yu Zhao <yu.zhao@intel.com> Cc: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-08-28 20:00:06 +00:00
return resource_alignment(res);
}
void pci_acs_init(struct pci_dev *dev);
#ifdef CONFIG_PCI_QUIRKS
int pci_dev_specific_acs_enabled(struct pci_dev *dev, u16 acs_flags);
int pci_dev_specific_enable_acs(struct pci_dev *dev);
int pci_dev_specific_disable_acs_redir(struct pci_dev *dev);
#else
static inline int pci_dev_specific_acs_enabled(struct pci_dev *dev,
u16 acs_flags)
{
return -ENOTTY;
}
static inline int pci_dev_specific_enable_acs(struct pci_dev *dev)
{
return -ENOTTY;
}
static inline int pci_dev_specific_disable_acs_redir(struct pci_dev *dev)
{
return -ENOTTY;
}
#endif
/* PCI error reporting and recovery */
pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_channel_state_t state,
pci_ers_result_t (*reset_subordinates)(struct pci_dev *pdev));
bool pcie_wait_for_link(struct pci_dev *pdev, bool active);
#ifdef CONFIG_PCIEASPM
void pcie_aspm_init_link_state(struct pci_dev *pdev);
void pcie_aspm_exit_link_state(struct pci_dev *pdev);
void pcie_aspm_pm_state_change(struct pci_dev *pdev);
void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
#else
static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev) { }
static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
#endif
#ifdef CONFIG_PCIE_ECRC
void pcie_set_ecrc_checking(struct pci_dev *dev);
void pcie_ecrc_get_policy(char *str);
#else
static inline void pcie_set_ecrc_checking(struct pci_dev *dev) { }
static inline void pcie_ecrc_get_policy(char *str) { }
#endif
#ifdef CONFIG_PCIE_PTM
void pci_ptm_init(struct pci_dev *dev);
#else
static inline void pci_ptm_init(struct pci_dev *dev) { }
#endif
struct pci_dev_reset_methods {
u16 vendor;
u16 device;
int (*reset)(struct pci_dev *dev, bool probe);
};
struct pci_reset_fn_method {
int (*reset_fn)(struct pci_dev *pdev, bool probe);
char *name;
};
#ifdef CONFIG_PCI_QUIRKS
int pci_dev_specific_reset(struct pci_dev *dev, bool probe);
#else
static inline int pci_dev_specific_reset(struct pci_dev *dev, bool probe)
{
return -ENOTTY;
}
#endif
#if defined(CONFIG_PCI_QUIRKS) && defined(CONFIG_ARM64)
int acpi_get_rc_resources(struct device *dev, const char *hid, u16 segment,
struct resource *res);
#else
static inline int acpi_get_rc_resources(struct device *dev, const char *hid,
u16 segment, struct resource *res)
{
return -ENODEV;
}
#endif
int pci_rebar_get_current_size(struct pci_dev *pdev, int bar);
int pci_rebar_set_size(struct pci_dev *pdev, int bar, int size);
static inline u64 pci_rebar_size_to_bytes(int size)
{
return 1ULL << (size + 20);
}
struct device_node;
#ifdef CONFIG_OF
int of_pci_parse_bus_range(struct device_node *node, struct resource *res);
int of_get_pci_domain_nr(struct device_node *node);
int of_pci_get_max_link_speed(struct device_node *node);
void pci_set_of_node(struct pci_dev *dev);
void pci_release_of_node(struct pci_dev *dev);
void pci_set_bus_of_node(struct pci_bus *bus);
void pci_release_bus_of_node(struct pci_bus *bus);
int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge);
#else
static inline int
of_pci_parse_bus_range(struct device_node *node, struct resource *res)
{
return -EINVAL;
}
static inline int
of_get_pci_domain_nr(struct device_node *node)
{
return -1;
}
static inline int
of_pci_get_max_link_speed(struct device_node *node)
{
return -EINVAL;
}
static inline void pci_set_of_node(struct pci_dev *dev) { }
static inline void pci_release_of_node(struct pci_dev *dev) { }
static inline void pci_set_bus_of_node(struct pci_bus *bus) { }
static inline void pci_release_bus_of_node(struct pci_bus *bus) { }
static inline int devm_of_pci_bridge_init(struct device *dev, struct pci_host_bridge *bridge)
{
return 0;
}
#endif /* CONFIG_OF */
#ifdef CONFIG_PCIEAER
void pci_no_aer(void);
void pci_aer_init(struct pci_dev *dev);
void pci_aer_exit(struct pci_dev *dev);
extern const struct attribute_group aer_stats_attr_group;
void pci_aer_clear_fatal_status(struct pci_dev *dev);
int pci_aer_clear_status(struct pci_dev *dev);
int pci_aer_raw_clear_status(struct pci_dev *dev);
#else
static inline void pci_no_aer(void) { }
static inline void pci_aer_init(struct pci_dev *d) { }
static inline void pci_aer_exit(struct pci_dev *d) { }
static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
#endif
#ifdef CONFIG_ACPI
PCI/ACPI: Remove unnecessary struct hotplug_program_ops Move the ACPI-specific structs hpx_type0, hpx_type1, hpx_type2 and hpx_type3 to drivers/pci/pci-acpi.c as they are not used anywhere else. Then remove the struct hotplug_program_ops that has been shared between drivers/pci/probe.c and drivers/pci/pci-acpi.c from drivers/pci/pci.h as it is no longer needed. The struct hotplug_program_ops was added by 87fcf12e846a ("PCI/ACPI: Remove the need for 'struct hotplug_params'") and replaced previously used struct hotplug_params enabling the support for the _HPX Type 3 Setting Record that was added by f873c51a155a ("PCI/ACPI: Implement _HPX Type 3 Setting Record"). The new struct allowed for the static functions such program_hpx_type0(), program_hpx_type1(), etc., from the drivers/pci/probe.c to be called from the function pci_acpi_program_hp_params() in the drivers/pci/pci-acpi.c. Previously a programming of _HPX Type 0 was as follows: drivers/pci/probe.c: program_hpx_type0() ... pci_configure_device() hp_ops = { .program_type0 = program_hpx_type0, ... } pci_acpi_program_hp_params(&hp_ops) drivers/pci/pci-acpi.c: pci_acpi_program_hp_params(&hp_ops) acpi_run_hpx(hp_ops) decode_type0_hpx_record() hp_ops->program_type0 # program_hpx_type0() called via hp_ops After the ACPI-specific functions, structs, enums, etc., have been moved to drivers/pci/pci-acpi.c there is no need for the hotplug_program_ops as all of the _HPX Type 0, 1, 2 and 3 are directly accessible. Link: https://lore.kernel.org/r/20190827094951.10613-4-kw@linux.com Signed-off-by: Krzysztof Wilczynski <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-08-27 09:49:51 +00:00
int pci_acpi_program_hp_params(struct pci_dev *dev);
extern const struct attribute_group pci_dev_acpi_attr_group;
void pci_set_acpi_fwnode(struct pci_dev *dev);
int pci_dev_acpi_reset(struct pci_dev *dev, bool probe);
#else
static inline int pci_dev_acpi_reset(struct pci_dev *dev, bool probe)
{
return -ENOTTY;
}
static inline void pci_set_acpi_fwnode(struct pci_dev *dev) {}
PCI/ACPI: Remove unnecessary struct hotplug_program_ops Move the ACPI-specific structs hpx_type0, hpx_type1, hpx_type2 and hpx_type3 to drivers/pci/pci-acpi.c as they are not used anywhere else. Then remove the struct hotplug_program_ops that has been shared between drivers/pci/probe.c and drivers/pci/pci-acpi.c from drivers/pci/pci.h as it is no longer needed. The struct hotplug_program_ops was added by 87fcf12e846a ("PCI/ACPI: Remove the need for 'struct hotplug_params'") and replaced previously used struct hotplug_params enabling the support for the _HPX Type 3 Setting Record that was added by f873c51a155a ("PCI/ACPI: Implement _HPX Type 3 Setting Record"). The new struct allowed for the static functions such program_hpx_type0(), program_hpx_type1(), etc., from the drivers/pci/probe.c to be called from the function pci_acpi_program_hp_params() in the drivers/pci/pci-acpi.c. Previously a programming of _HPX Type 0 was as follows: drivers/pci/probe.c: program_hpx_type0() ... pci_configure_device() hp_ops = { .program_type0 = program_hpx_type0, ... } pci_acpi_program_hp_params(&hp_ops) drivers/pci/pci-acpi.c: pci_acpi_program_hp_params(&hp_ops) acpi_run_hpx(hp_ops) decode_type0_hpx_record() hp_ops->program_type0 # program_hpx_type0() called via hp_ops After the ACPI-specific functions, structs, enums, etc., have been moved to drivers/pci/pci-acpi.c there is no need for the hotplug_program_ops as all of the _HPX Type 0, 1, 2 and 3 are directly accessible. Link: https://lore.kernel.org/r/20190827094951.10613-4-kw@linux.com Signed-off-by: Krzysztof Wilczynski <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2019-08-27 09:49:51 +00:00
static inline int pci_acpi_program_hp_params(struct pci_dev *dev)
{
return -ENODEV;
}
#endif
#ifdef CONFIG_PCIEASPM
extern const struct attribute_group aspm_ctrl_attr_group;
#endif
extern const struct attribute_group pci_dev_reset_method_attr_group;
#endif /* DRIVERS_PCI_H */