Some PCI devices must be powered-on before they can be detected on the
bus. Introduce a simple framework reusing the existing PCI OF
infrastructure.
The way this works is: a DT node representing a PCI device connected to
the port can be matched against its power control platform driver. If
the match succeeds, the driver is responsible for powering-up the device
and calling pci_pwrctl_device_set_ready() which will trigger a PCI bus
rescan as well as subscribe to PCI bus notifications.
When the device is detected and created, we'll make it consume the same
DT node that the platform device did. When the device is bound, we'll
create a device link between it and the parent power control device.
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD, SM8650-QRD & SM8650-HDK
Tested-by: Caleb Connolly <caleb.connolly@linaro.org> # OnePlus 8T
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240612082019.19161-5-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
The entirety of pci_iomap.c is guarded by an #ifdef CONFIG_PCI. It,
consequently, does not belong to lib/ because it is not generic
infrastructure.
Move pci_iomap.c to drivers/pci/ and implement the necessary changes to
Makefiles and Kconfigs.
Update MAINTAINERS file.
Update Documentation.
Link: https://lore.kernel.org/r/20240131090023.12331-3-pstanner@redhat.com
[bhelgaas: squash in https://lore.kernel.org/r/20240212150934.24559-1-pstanner@redhat.com]
Suggested-by: Danilo Krummrich <dakr@redhat.com>
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
The CONFIG_PCI_P2PDMA Kconfig help text contains a Cyrillic small "Dze"
(ѕ). When menuconfig renders it, it looks like "Enable ~U drivers" instead
of "Enables drivers".
Replace it by a plain "s" so the help text is displayed correctly by
menuconfig.
Uwe Kleine-König <u.kleine-koenig@pengutronix.de> later posted the same
patch at
https://lore.kernel.org/r/20231006150209.87666-1-u.kleine-koenig@pengutronix.de
Link: https://lore.kernel.org/r/1658301723-111283-1-git-send-email-liusong@linux.alibaba.com
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Liu Song <liusong@linux.alibaba.com>
[bhelgaas: commit log, add Uwe's report]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
The PCI endpoint device such as Xilinx Alveo PCI card maps the register
spaces from multiple hardware peripherals to its PCI BAR. Normally,
the PCI core discovers devices and BARs using the PCI enumeration process.
There is no infrastructure to discover the hardware peripherals that are
present in a PCI device, and which can be accessed through the PCI BARs.
Apparently, the device tree framework requires a device tree node for the
PCI device. Thus, it can generate the device tree nodes for hardware
peripherals underneath. Because PCI is self discoverable bus, there might
not be a device tree node created for PCI devices. Furthermore, if the PCI
device is hot pluggable, when it is plugged in, the device tree nodes for
its parent bridges are required. Add support to generate device tree node
for PCI bridges.
Add an of_pci_make_dev_node() interface that can be used to create device
tree node for PCI devices.
Add a PCI_DYNAMIC_OF_NODES config option. When the option is turned on,
the kernel will generate device tree nodes for PCI bridges unconditionally.
Initially, add the basic properties for the dynamically generated device
tree nodes which include #address-cells, #size-cells, device_type,
compatible, ranges, reg.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/1692120000-46900-3-git-send-email-lizhi.hou@amd.com
Signed-off-by: Rob Herring <robh@kernel.org>
The DMA flags field will be useful for users beyond PCI P2P, so upgrade to
its own dedicated config option.
[catalin.marinas@arm.com: use #ifdef CONFIG_NEED_SG_DMA_FLAGS in scatterlist.h]
[catalin.marinas@arm.com: update PCI_P2PDMA dma_flags comment in scatterlist.h]
Link: https://lkml.kernel.org/r/20230612153201.554742-13-catalin.marinas@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Adjust to reality and remove another layer of pointless Kconfig
indirection. CONFIG_GENERIC_MSI_IRQ is good enough to serve
all purposes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122014.524842979@linutronix.de
What a zoo:
PCI_MSI
select GENERIC_MSI_IRQ
PCI_MSI_IRQ_DOMAIN
def_bool y
depends on PCI_MSI
select GENERIC_MSI_IRQ_DOMAIN
Ergo PCI_MSI enables PCI_MSI_IRQ_DOMAIN which in turn selects
GENERIC_MSI_IRQ_DOMAIN. So all the dependencies on PCI_MSI_IRQ_DOMAIN are
just an indirection to PCI_MSI.
Match the reality and just admit that PCI_MSI requires
GENERIC_MSI_IRQ_DOMAIN.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.467556921@linutronix.de
- Introduce a 'struct cxl_region' object with support for provisioning
and assembling persistent memory regions.
- Introduce alloc_free_mem_region() to accompany the existing
request_free_mem_region() as a method to allocate physical memory
capacity out of an existing resource.
- Export insert_resource_expand_to_fit() for the CXL subsystem to
late-publish CXL platform windows in iomem_resource.
- Add a polled mode PCI DOE (Data Object Exchange) driver service and
use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute
Table).
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYvLYmAAKCRDfioYZHlFs
Z0pbAQC/3j+WriWpU7CdhrnZI1Wqn+x5IIklF0Lc4/f6LwGZtAEAsSbLpItzvwqx
M/rcLaeLpwYlgvS1JjdsuQ2VQ7KOtAs=
=ehNT
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for 6.0:
- Introduce a 'struct cxl_region' object with support for
provisioning and assembling persistent memory regions.
- Introduce alloc_free_mem_region() to accompany the existing
request_free_mem_region() as a method to allocate physical memory
capacity out of an existing resource.
- Export insert_resource_expand_to_fit() for the CXL subsystem to
late-publish CXL platform windows in iomem_resource.
- Add a polled mode PCI DOE (Data Object Exchange) driver service and
use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute
Table)"
* tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (74 commits)
cxl/hdm: Fix skip allocations vs multiple pmem allocations
cxl/region: Disallow region granularity != window granularity
cxl/region: Fix x1 interleave to greater than x1 interleave routing
cxl/region: Move HPA setup to cxl_region_attach()
cxl/region: Fix decoder interleave programming
Documentation: cxl: remove dangling kernel-doc reference
cxl/region: describe targets and nr_targets members of cxl_region_params
cxl/regions: add padding for cxl_rr_ep_add nested lists
cxl/region: Fix IS_ERR() vs NULL check
cxl/region: Fix region reference target accounting
cxl/region: Fix region commit uninitialized variable warning
cxl/region: Fix port setup uninitialized variable warnings
cxl/region: Stop initializing interleave granularity
cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
cxl/acpi: Minimize granularity for x1 interleaves
cxl/region: Delete 'region' attribute from root decoders
cxl/acpi: Autoload driver for 'cxl_acpi' test devices
cxl/region: decrement ->nr_targets on error in cxl_region_attach()
cxl/region: prevent underflow in ways_to_cxl()
cxl/region: uninitialized variable in alloc_hpa()
...
Introduce a dma_flags field in struct scatterlist. These flags will be
used by dma_[un]map_sg_p2pdma() to determine when a given SGL segments
dma_address points to a PCI bus address. dma_unmap_sg_p2pdma() will need
to perform different cleanup when a segment is marked as a bus address.
The dma_flags field will fit in the existing padding on 64BIT systems
(assuming CONFIG_NEED_SG_DMA_LENGTH is also set).
The new bit will only be used when CONFIG_PCI_P2PDMA is set; this means
PCI P2PDMA will require CONFIG_64BIT. This should be acceptable as the
majority of P2PDMA use cases are restricted to newer root complexes and
roughly require the extra address space for memory BARs used in the
transactions.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Introduced in a PCIe r6.0, sec 6.30, DOE provides a config space based
mailbox with standard protocol discovery. Each mailbox is accessed
through a DOE Extended Capability.
Each DOE mailbox must support the DOE discovery protocol in addition to
any number of additional protocols.
Define core PCIe functionality to manage a single PCIe DOE mailbox at a
defined config space offset. Functionality includes iterating,
creating, query of supported protocol, and task submission. Destruction
of the mailboxes is device managed.
Cc: "Li, Ming" <ming4.li@intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220719205249.566684-4-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The VGA arbiter is really PCI-specific and doesn't depend on any GPU
things. Move it to the PCI subsystem.
Note that misc_init() must be called before vga_arb_device_init(). These
are both subsys_initcalls, so this ordering depends on the link order,
which is determined by drivers/Makefile:
obj-y += pci/
obj-y += char/ <-- misc_init()
obj-y += gpu/ <-- vga_arb_device_init() (before this commit)
The drivers/pci/ subsys_initcalls are called *before* misc_init(), so
convert vga_arb_device_init() to subsys_initcall_sync(), which is called
after *all* subsys_initcalls.
Link: https://lore.kernel.org/r/20220224224753.297579-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add arm64 Hyper-V vPCI support by implementing the arch specific
interfaces. Introduce an IRQ domain and chip specific to Hyper-v vPCI that
is based on SPIs. The IRQ domain parents itself to the arch GIC IRQ domain
for basic vector management.
[bhelgaas: squash in fix from Yang Li <yang.lee@linux.alibaba.com>:
https://lore.kernel.org/r/20220112003324.62755-1-yang.lee@linux.alibaba.com]
Link: https://lore.kernel.org/r/1641411156-31705-3-git-send-email-sunilmut@linux.microsoft.com
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
The driver's module init function, pcifront_init(), invokes
xen_pv_domain() first thing. That construct produces constant "false"
when !CONFIG_XEN_PV. Hence there's no point building the driver in
non-PV configurations.
Drop the (now implicit and generally wrong) X86 dependency: At present,
XEN_PV can only be set when X86 is also enabled. In general an
architecture supporting Xen PV (and PCI) would want to have this driver
built.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/3a7f6c9b-215d-b593-8056-b5fe605dafd7@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Add Kconfig options for changing the default pcie_bus_config, i.e., the
strategy for configuration MPS and MRRS, in the same manner as the
CONFIG_PCIEASPM_XXXX choice. The pci_bus_config setting may still be
overridden by kernel command-line parameters, e.g.,
"pci=pcie_bus_tune_off".
[bhelgaas: depend on EXPERT, tweak help texts]
Link: https://lore.kernel.org/r/20200928194651.5393-2-james.quinlan@broadcom.com
Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
requires them or not. Architectures which are fully utilizing hierarchical
irq domains should never call into that code.
It's not only architectures which depend on that by implementing one or
more of the weak functions, there is also a bunch of drivers which relies
on the weak functions which invoke msi_controller::setup_irq[s] and
msi_controller::teardown_irq.
Make the architectures and drivers which rely on them select them in Kconfig
and if not selected replace them by stub functions which emit a warning and
fail the PCI/MSI interrupt allocation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200826112333.992429909@linutronix.de
The only apparent reason for the PCI_MSI_IRQ_DOMAIN architecture
whitelist was that it requires msi.h. Now that msi.h is mandatory in
asm-generic/Kbuild, every arch should have at least the default version,
so remove the whitelist.
Built for all the architectures that play nice with make.cross, but not
boot tested anywhere.
Link: https://lore.kernel.org/r/514e7b040be8ccd69088193aba260da1b89e919c.1571983829.git.michal.simek@xilinx.com
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Waiman Long <longman@redhat.com>
Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
$ sed -e 's/^ /\t/' -i */Kconfig
[bhelgaas: do same in vmd.c]
Link: https://lore.kernel.org/r/20191120134036.14502-1-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
- Fix Hyper-V use-after-free in pci_dev removal (Dexuan Cui)
- Fix Hyper-V build error in non-sysfs config (Randy Dunlap)
- Reallocate to avoid Hyper-V domain number collisions (Haiyang Zhang)
- Use Hyper-V instance ID bytes 4-5 to reduce domain collisions (Haiyang
Zhang)
* remotes/lorenzo/pci/hv:
PCI: hv: Use bytes 4 and 5 from instance ID as the PCI domain numbers
PCI: hv: Detect and fix Hyper-V PCI domain number collision
PCI: pci-hyperv: Fix build errors on non-SYSFS config
PCI: hv: Avoid use of hv_pci_dev->pci_slot after freeing it
This interface driver is a helper driver allows other drivers to
have a common interface with the Hyper-V PCI frontend driver.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix build errors when building almost-allmodconfig but with SYSFS
not set (not enabled). Fixes these build errors:
ERROR: "pci_destroy_slot" [drivers/pci/controller/pci-hyperv.ko] undefined!
ERROR: "pci_create_slot" [drivers/pci/controller/pci-hyperv.ko] undefined!
drivers/pci/slot.o is only built when SYSFS is enabled, so
pci-hyperv.o has an implicit dependency on SYSFS.
Make that explicit.
Also, depending on X86 && X86_64 is not needed, so just change that
to depend on X86_64.
Fixes: a15f2c08c7 ("PCI: hv: support reporting serial number as slot information")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jake Oshins <jakeo@microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: Dexuan Cui <decui@microsoft.com>
Add RISC-V as an arch that supports PCI_MSI_IRQ_DOMAIN. The related change
to generate asm/msi.h is 251a448881 ("riscv: include generic support for
MSI irqdomains").
Link: https://lore.kernel.org/r/alpine.DEB.2.21.9999.1907251426450.32766@viisi.sifive.com
Signed-off-by: Wesley Terpstra <wesley@sifive.com>
[paul.walmsley@sifive.com: wrote patch description; split this
patch from the arch/riscv patch]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
After commit eb01d42a77 ("PCI: consolidate PCI config entry in
drivers/pci"), all the PCI kconfig options appear below "PCI support"
rather than within a sub-menu. This is because menuconfig expects all
kconfig entries to be enclosed in an if/endif section. Add the missing
if/endif.
With this, "depends on PCI" is redundant in the sub-menu entries and can
be removed.
Fixes: eb01d42a77 ("PCI: consolidate PCI config entry in drivers/pci")
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Let architectures select the syscall support instead of duplicating the
kconfig entry.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Move the definitions to drivers/pci and let the architectures select
them. Two small differences to before: PCI_DOMAINS_GENERIC now selects
PCI_DOMAINS, cutting down the churn for modern architectures. As the
only architectured arm did previously also offer PCI_DOMAINS as a user
visible choice in addition to selecting it from the relevant configs,
this is gone now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
There is no good reason to duplicate the PCI menu in every architecture.
Instead provide a selectable HAVE_PCI symbol that indicates availability
of PCI support, and a FORCE_PCI symbol to for PCI on and the handle the
rest in drivers/pci.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tell users what a PCI PF is in the PCI_PF_STUB config help text.
Fixes: a8ccf8a666 ("PCI/IOV: Add pci-pf-stub driver for PFs that only enable VFs")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Some PCI host controllers do not expose a configuration space for the
root port PCI bridge. Due to this, the Marvell Armada 370/38x/XP PCI
controller driver (pci-mvebu) emulates a root port PCI bridge
configuration space, and uses that to (among other things) dynamically
create the memory windows that correspond to the PCI MEM and I/O
regions.
Since we now need to add a very similar logic for the Marvell Armada
37xx PCI controller driver (pci-aardvark), instead of duplicating the
code, we create in this commit a common logic called pci-bridge-emul.
The idea of this logic is to emulate a root port PCI bridge
configuration space by providing configuration space read/write
operations, and faking behind the scenes the configuration space of a
PCI bridge. A PCI host controller driver simply has to call
pci_bridge_emul_conf_read() and pci_bridge_emul_conf_write() to
read/write the configuration space of the bridge.
By default, the PCI bridge configuration space is simply emulated by a
chunk of memory, but the PCI host controller can override the behavior
of the read and write operations on a per-register basis to do
additional actions if needed. We take care of complying with the
behavior of the PCI configuration space registers in terms of bits
that are read-write, read-only, reserved and write-1-to-clear.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Some PCI devices may have memory mapped in a BAR space that's intended for
use in peer-to-peer transactions. To enable such transactions the memory
must be registered with ZONE_DEVICE pages so it can be used by DMA
interfaces in existing drivers.
Add an interface for other subsystems to find and allocate chunks of P2P
memory as necessary to facilitate transfers between two PCI peers:
struct pci_dev *pci_p2pmem_find[_many]();
int pci_p2pdma_distance[_many]();
void *pci_alloc_p2pmem();
The new interface requires a driver to collect a list of client devices
involved in the transaction then call pci_p2pmem_find() to obtain any
suitable P2P memory. Alternatively, if the caller knows a device which
provides P2P memory, they can use pci_p2pdma_distance() to determine if it
is usable. With a suitable p2pmem device, memory can then be allocated
with pci_alloc_p2pmem() for use in DMA transactions.
Depending on hardware, using peer-to-peer memory may reduce the bandwidth
of the transfer but can significantly reduce pressure on system memory.
This may be desirable in many cases: for example a system could be designed
with a small CPU connected to a PCIe switch by a small number of lanes
which would maximize the number of lanes available to connect to NVMe
devices.
The code is designed to only utilize the p2pmem device if all the devices
involved in a transfer are behind the same PCI bridge. This is because we
have no way of knowing whether peer-to-peer routing between PCIe Root Ports
is supported (PCIe r4.0, sec 1.3.1). Additionally, the benefits of P2P
transfers that go through the RC is limited to only reducing DRAM usage
and, in some cases, coding convenience. The PCI-SIG may be exploring
adding a new capability bit to advertise whether this is possible for
future hardware.
This commit includes significant rework and feedback from Christoph
Hellwig.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
[bhelgaas: fold in fix from Keith Busch <keith.busch@intel.com>:
https://lore.kernel.org/linux-pci/20181012155920.15418-1-keith.busch@intel.com,
to address comment from Dan Carpenter <dan.carpenter@oracle.com>, fold in
https://lore.kernel.org/linux-pci/20181017160510.17926-1-logang@deltatee.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Native PCI drivers for root complex devices were originally all in
drivers/pci/host/. Some of these devices can also be operated in endpoint
mode. Drivers for endpoint mode didn't seem to fit in the "host"
directory, so we put both the root complex and endpoint drivers in
per-device directories, e.g., drivers/pci/dwc/, drivers/pci/cadence/, etc.
These per-device directories contain trivial Kconfig and Makefiles and
clutter drivers/pci/. Make a new drivers/pci/controllers/ directory and
collect all the device-specific drivers there.
No functional change intended.
Link: https://lkml.kernel.org/r/1520304202-232891-1-git-send-email-shawn.lin@rock-chips.com
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlsZdg0UHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vwJOBAAsuuWsOdiJRRhQLU5WfEMFgzcL02R
gsumqZkK7E8LOq0DPNMtcgv9O0KgYZyCiZyTMJ8N7sEYohg04lMz8mtYXOibjcwI
p+nVMko8jQXV9FXwSMGVqigEaLLcrbtkbf/mPriD63DDnRMa/+/Jh15SwfLTydIH
QRTJbIxkS3EiOauj5C8QY3UwzjlvV9mDilzM/x+MSK27k2HFU9Pw/3lIWHY716rr
grPZTwBTfIT+QFZjwOm6iKzHjxRM830sofXARkcH4CgSNaTeq5UbtvAs293MHvc+
v/v/1dfzUh00NxfZDWKHvTUMhjazeTeD9jEVS7T+HUcGzvwGxMSml6bBdznvwKCa
46ynePOd1VcEBlMYYS+P4njRYBLWeUwt6/TzqR4yVwb0keQ6Yj3Y9H2UpzscYiCl
O+0qz6RwyjKY0TpxfjoojgHn4U5ByI5fzVDJHbfr2MFTqqRNaabVrfl6xU4sVuhh
OluT5ym+/dOCTI/wjlolnKNb0XThVre8e2Busr3TRvuwTMKMIWqJ9sXLovntdbqE
furPD/UnuZHkjSFhQ1SQwYdWmsZI5qAq2C9haY8sEWsXEBEcBGLJ2BEleMxm8UsL
KXuy4ER+R4M+sFtCkoWf3D4NTOBUdPHi4jyk6Ooo1idOwXCsASVvUjUEG5YcQC6R
kpJ1VPTKK1XN64I=
=aFAi
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- unify AER decoding for native and ACPI CPER sources (Alexandru
Gagniuc)
- add TLP header info to AER tracepoint (Thomas Tai)
- add generic pcie_wait_for_link() interface (Oza Pawandeep)
- handle AER ERR_FATAL by removing and re-enumerating devices, as
Downstream Port Containment does (Oza Pawandeep)
- factor out common code between AER and DPC recovery (Oza Pawandeep)
- stop triggering DPC for ERR_NONFATAL errors (Oza Pawandeep)
- share ERR_FATAL recovery path between AER and DPC (Oza Pawandeep)
- disable ASPM L1.2 substate if we don't have LTR (Bjorn Helgaas)
- respect platform ownership of LTR (Bjorn Helgaas)
- clear interrupt status in top half to avoid interrupt storm (Oza
Pawandeep)
- neaten pci=earlydump output (Andy Shevchenko)
- avoid errors when extended config space inaccessible (Gilles Buloz)
- prevent sysfs disable of device while driver attached (Christoph
Hellwig)
- use core interface to report PCIe link properties in bnx2x, bnxt_en,
cxgb4, ixgbe (Bjorn Helgaas)
- remove unused pcie_get_minimum_link() (Bjorn Helgaas)
- fix use-before-set error in ibmphp (Dan Carpenter)
- fix pciehp timeouts caused by Command Completed errata (Bjorn
Helgaas)
- fix refcounting in pnv_php hotplug (Julia Lawall)
- clear pciehp Presence Detect and Data Link Layer Status Changed on
resume so we don't miss hotplug events (Mika Westerberg)
- only request pciehp control if we support it, so platform can use
ACPI hotplug otherwise (Mika Westerberg)
- convert SHPC to be builtin only (Mika Westerberg)
- request SHPC control via _OSC if we support it (Mika Westerberg)
- simplify SHPC handoff from firmware (Mika Westerberg)
- fix an SHPC quirk that mistakenly included *all* AMD bridges as well
as devices from any vendor with device ID 0x7458 (Bjorn Helgaas)
- assign a bus number even to non-native hotplug bridges to leave
space for acpiphp additions, to fix a common Thunderbolt xHCI
hot-add failure (Mika Westerberg)
- keep acpiphp from scanning native hotplug bridges, to fix common
Thunderbolt hot-add failures (Mika Westerberg)
- improve "partially hidden behind bridge" messages from core (Mika
Westerberg)
- add macros for PCIe Link Control 2 register (Frederick Lawler)
- replace IB/hfi1 custom macros with PCI core versions (Frederick
Lawler)
- remove dead microblaze and xtensa code (Bjorn Helgaas)
- use dev_printk() when possible in xtensa and mips (Bjorn Helgaas)
- remove unused pcie_port_acpi_setup() and portdrv_acpi.c (Bjorn
Helgaas)
- add managed interface to get PCI host bridge resources from OF (Jan
Kiszka)
- add support for unbinding generic PCI host controller (Jan Kiszka)
- fix memory leaks when unbinding generic PCI host controller (Jan
Kiszka)
- request legacy VGA framebuffer only for VGA devices to avoid false
device conflicts (Bjorn Helgaas)
- turn on PCI_COMMAND_IO & PCI_COMMAND_MEMORY in pci_enable_device()
like everybody else, not in pcibios_fixup_bus() (Bjorn Helgaas)
- add generic enable function for simple SR-IOV hardware (Alexander
Duyck)
- use generic SR-IOV enable for ena, nvme (Alexander Duyck)
- add ACS quirk for Intel 7th & 8th Gen mobile (Alex Williamson)
- add ACS quirk for Intel 300 series (Mika Westerberg)
- enable register clock for Armada 7K/8K (Gregory CLEMENT)
- reduce Keystone "link already up" log level (Fabio Estevam)
- move private DT functions to drivers/pci/ (Rob Herring)
- factor out dwc CONFIG_PCI Kconfig dependencies (Rob Herring)
- add DesignWare support to the endpoint test driver (Gustavo
Pimentel)
- add DesignWare support for endpoint mode (Gustavo Pimentel)
- use devm_ioremap_resource() instead of devm_ioremap() in dra7xx and
artpec6 (Gustavo Pimentel)
- fix Qualcomm bitwise NOT issue (Dan Carpenter)
- add Qualcomm runtime PM support (Srinivas Kandagatla)
- fix DesignWare enumeration below bridges (Koen Vandeputte)
- use usleep() instead of mdelay() in endpoint test (Jia-Ju Bai)
- add configfs entries for pci_epf_driver device IDs (Kishon Vijay
Abraham I)
- clean up pci_endpoint_test driver (Gustavo Pimentel)
- update Layerscape maintainer email addresses (Minghuan Lian)
- add COMPILE_TEST to improve build test coverage (Rob Herring)
- fix Hyper-V bus registration failure caused by domain/serial number
confusion (Sridhar Pitchai)
- improve Hyper-V refcounting and coding style (Stephen Hemminger)
- avoid potential Hyper-V hang waiting for a response that will never
come (Dexuan Cui)
- implement Mediatek chained IRQ handling (Honghui Zhang)
- fix vendor ID & class type for Mediatek MT7622 (Honghui Zhang)
- add Mobiveil PCIe host controller driver (Subrahmanya Lingappa)
- add Mobiveil MSI support (Subrahmanya Lingappa)
- clean up clocks, MSI, IRQ mappings in R-Car probe failure paths
(Marek Vasut)
- poll more frequently (5us vs 5ms) while waiting for R-Car data link
active (Marek Vasut)
- use generic OF parsing interface in R-Car (Vladimir Zapolskiy)
- add R-Car V3H (R8A77980) "compatible" string (Sergei Shtylyov)
- add R-Car gen3 PHY support (Sergei Shtylyov)
- improve R-Car PHYRDY polling (Sergei Shtylyov)
- clean up R-Car macros (Marek Vasut)
- use runtime PM for R-Car controller clock (Dien Pham)
- update arm64 defconfig for Rockchip (Shawn Lin)
- refactor Rockchip code to facilitate both root port and endpoint
mode (Shawn Lin)
- add Rockchip endpoint mode driver (Shawn Lin)
- support VMD "membar shadow" feature (Jon Derrick)
- support VMD bus number offsets (Jon Derrick)
- add VMD "no AER source ID" quirk for more device IDs (Jon Derrick)
- remove unnecessary host controller CONFIG_PCIEPORTBUS Kconfig
selections (Bjorn Helgaas)
- clean up quirks.c organization and whitespace (Bjorn Helgaas)
* tag 'pci-v4.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (144 commits)
PCI/AER: Replace struct pcie_device with pci_dev
PCI/AER: Remove unused parameters
PCI: qcom: Include gpio/consumer.h
PCI: Improve "partially hidden behind bridge" log message
PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc
PCI: Move resource distribution for single bridge outside loop
PCI: Account for all bridges on bus when distributing bus numbers
ACPI / hotplug / PCI: Drop unnecessary parentheses
ACPI / hotplug / PCI: Mark stale PCI devices disconnected
ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug
PCI: hotplug: Add hotplug_is_native()
PCI: shpchp: Add shpchp_is_native()
PCI: shpchp: Fix AMD POGO identification
PCI: mobiveil: Add MSI support
PCI: mobiveil: Add Mobiveil PCIe Host Bridge IP driver
PCI/AER: Decode Error Source Requester ID
PCI/AER: Remove aer_recover_work_func() forward declaration
PCI/DPC: Use the generic pcie_do_fatal_recovery() path
PCI/AER: Pass service type to pcie_do_fatal_recovery()
PCI/DPC: Disable ERR_NONFATAL handling by DPC
...
This symbol is now always identical to CONFIG_ARCH_DMA_ADDR_T_64BIT, so
remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Some SR-IOV PF devices provide no functionality other than acting as a
means of enabling VFs. For these devices, we want to enable the VFs and
assign them to guest virtual machines, but there's no need to have a driver
for the PF itself.
Add a new pci-pf-stub driver to claim those PF devices and provide the
generic VF enable functionality. An administrator can use the sysfs
"sriov_numvfs" file to enable VFs, then assign them to guests.
For now I only have one example ID provided by Amazon in terms of devices
that require this functionality. The general idea is that in the future we
will see other devices added as vendors come up with devices where the PF
is more or less just a lightweight shim used to allocate VFs.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
* pci/spdx:
PCI: Add SPDX GPL-2.0+ to replace implicit GPL v2 or later statement
PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate
PCI: Add SPDX GPL-2.0 to replace COPYING boilerplate
PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate
PCI: Add SPDX GPL-2.0 when no license was specified
This patch adds support to the Cadence PCIe controller in endpoint mode.
Since pieces of source code are shared with the host driver (Root
Complex mode), we create a new directory under drivers/pci dedicated to
the Cadence PCIe controller. The common code is placed into
drivers/pci/cadence/pcie-cadence.c and used by both the host and
endpoint controller drivers.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@free-electrons.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Clean up drivers/Makefile by moving the pci/endpoint and pci/dwc entries
from drivers/Makefile into drivers/pci/Makefile.
Since we don't want to introduce any dependency between CONFIG_PCI and
CONFIG_PCI_ENDPOINT, we now always execute drivers/pci/Makefile.
Hence all Makefiles in drivers/pci/ were updated accordingly so no file is
compiled when CONFIG_PCI is not defined.
Also, we add a comment to reinforce that EPC and EPF libraries must be
initialized before their users. Hence built-in EPC drivers, such as
those of Designware, are linked after the endpoint core libraries.
Finally, we add another comment to explain why obj-y has been chosen
instead of obj-$(CONFIG_PCIE_DW) to parse the dwc/ sub-folder.
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@free-electrons.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
b24413180f ("License cleanup: add SPDX GPL-2.0 license identifier to
files with no license") added SPDX GPL-2.0 to several PCI files that
previously contained no license information.
Add SPDX GPL-2.0 to all other PCI files that did not contain any license
information and hence were under the default GPL version 2 license of the
kernel.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull misc x86 fixes from Ingo Molnar:
- topology enumeration fixes
- KASAN fix
- two entry fixes (not yet the big series related to KASLR)
- remove obsolete code
- instruction decoder fix
- better /dev/mem sanity checks, hopefully working better this time
- pkeys fixes
- two ACPI fixes
- 5-level paging related fixes
- UMIP fixes that should make application visible faults more debuggable
- boot fix for weird virtualization environment
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/decoder: Add new TEST instruction pattern
x86/PCI: Remove unused HyperTransport interrupt support
x86/umip: Fix insn_get_code_seg_params()'s return value
x86/boot/KASLR: Remove unused variable
x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
x86/pkeys/selftests: Fix protection keys write() warning
x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey'
x86/mpx/selftests: Fix up weird arrays
x86/pkeys: Update documentation about availability
x86/umip: Print a warning into the syslog if UMIP-protected instructions are used
x86/smpboot: Fix __max_logical_packages estimate
x86/topology: Avoid wasting 128k for package id array
perf/x86/intel/uncore: Cache logical pkg id in uncore driver
x86/acpi: Reduce code duplication in mp_override_legacy_irq()
x86/acpi: Handle SCI interrupts above legacy space gracefully
x86/boot: Fix boot failure when SMP MP-table is based at 0
x86/mm: Limit mmap() of /dev/mem to valid physical addresses
x86/selftests: Add test for mapping placement for 5-level paging
...
There are no in-tree callers of ht_create_irq(), the driver interface for
HyperTransport interrupts, left. Remove the unused entry point and all the
supporting code.
See 8b955b0ddd ("[PATCH] Initial generic hypertransport interrupt
support").
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pci@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: https://lkml.kernel.org/r/20171122221337.3877.23362.stgit@bhelgaas-glaptop.roam.corp.google.com
* pci/virtualization:
PCI: Document reset method return values
PCI: Detach driver before procfs & sysfs teardown on device remove
PCI: Apply Cavium ThunderX ACS quirk to more Root Ports
PCI: Set Cavium ACS capability quirk flags to assert RR/CR/SV/UF
PCI: Restore ARI Capable Hierarchy before setting numVFs
PCI: Create SR-IOV virtfn/physfn links before attaching driver
PCI: Expose SR-IOV offset, stride, and VF device ID via sysfs
PCI: Cache the VF device ID in the SR-IOV structure
PCI: Add Kconfig PCI_IOV dependency for PCI_REALLOC_ENABLE_AUTO
PCI: Remove unused function __pci_reset_function()
PCI: Remove reset argument from pci_iov_{add,remove}_virtfn()
Localize PCI_QUIRKS in the PCI bus menu.
Move PCI_QUIRKS to the PCI bus menu instead of the (often broken) General
Setup EXPERT menu. The prompt still depends on EXPERT.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Ensure only valid Kconfig configurations for PCI_REALLOC_ENABLE_AUTO. This
is done by selecting PCI_IOV, which is required by PCI_REALLOC_ENABLE_AUTO
to work.
Signed-off-by: Sascha El-Sharkawy <elscha@sse.uni-hildesheim.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The generic PCI configuration space accessors are globally serialized via
pci_lock. On larger systems this causes massive lock contention when the
configuration space has to be accessed frequently. One such access pattern
is the Intel Uncore performance counter unit.
Provide a kernel config option which can be selected by an architecture
when the low level PCI configuration space accessors in the architecture
use their own serialization or can operate completely lockless.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-pci@vger.kernel.org
Link: http://lkml.kernel.org/r/20170316215057.205961140@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>