Instead of open-coding it everywhere introduce a tiny helper that can be
used to iterate over each resource of a PCI device, and convert the most
obvious users into it.
While at it drop doubled empty line before pdev_sort_resources().
No functional changes intended.
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230330162434.35055-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
After a pci_doe_task completes, its work_struct needs to be destroyed
to avoid a memory leak with CONFIG_DEBUG_OBJECTS=y.
Fixes: 9d24322e88 ("PCI/DOE: Add DOE mailbox support functions")
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: stable@vger.kernel.org # v6.0+
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/775768b4912531c3b887d405fc51a50e465e1bf9.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
probing because pci_doe_submit_task() invokes INIT_WORK() instead of
INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
All callers of pci_doe_submit_task() allocate the work_struct on the
stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
short-term fix.
The long-term fix implemented by a subsequent commit is to move to a
synchronous API which allocates the work_struct internally in the DOE
library.
Stacktrace for posterity:
WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
pci_doe_submit_task+0x5d/0xd0
pci_doe_discovery+0xb4/0x100
pcim_doe_create_mb+0x219/0x290
cxl_pci_probe+0x192/0x430
local_pci_probe+0x41/0x80
pci_device_probe+0xb3/0x220
really_probe+0xde/0x380
__driver_probe_device+0x78/0x170
driver_probe_device+0x1f/0x90
__driver_attach_async_helper+0x5c/0xe0
async_run_entry_fn+0x30/0x130
process_one_work+0x294/0x5b0
Fixes: 9d24322e88 ("PCI/DOE: Add DOE mailbox support functions")
Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
Reported-by: Gregory Price <gregory.price@memverge.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Gregory Price <gregory.price@memverge.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
Cc: stable@vger.kernel.org # v6.0+
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/67a9117f463ecdb38a2dbca6a20391ce2f1e7a06.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We need the fixes in here for testing, as well as the driver core
changes for documentation updates to build on.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When there is no card plugged on a PCIe port a log reporting that
the port will be disabled is flagged as an error (dev_err()).
Since this is not an error at all, change the log level by using
dev_info() instead.
Link: https://lore.kernel.org/r/20230324073733.1596231-1-sergio.paracuellos@gmail.com
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
commit bb38919ec5 ("PCI: imx6: Add support for i.MX6 PCIe controller")
added a fault hook to this driver in the probe function. So it was only
installed if needed.
commit bde4a5a00e ("PCI: imx6: Allow probe deferral by reset GPIO")
moved it from probe to driver init which installs the hook unconditionally
as soon as the driver is compiled into a kernel.
When this driver is compiled as a module, the hook is not registered
until after the driver has been matched with a .compatible and
loaded.
commit 415b6185c5 ("PCI: imx6: Fix config read timeout handling")
extended the fault handling code.
commit 2d8ed461db ("PCI: imx6: Add support for i.MX8MQ")
added some protection for non-ARM architectures, but this does not
protect non-i.MX ARM architectures.
Since fault handlers can be triggered on any architecture for different
reasons, there is no guarantee that they will be triggered only for the
assumed situation, leading to improper error handling (i.MX6-specific
imx6q_pcie_abort_handler) on foreign systems.
I had seen strange L3 imprecise external abort messages several times on
OMAP4 and OMAP5 devices and couldn't make sense of them until I realized
they were related to this unused imx6q driver because I had
CONFIG_PCI_IMX6=y.
Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed
to run on different ARM SoC and be differentiated only by device tree
binaries. So turning off CONFIG_PCI_IMX6 is not a solution.
Therefore we check the compatible in the init function before registering
the fault handler.
Link: https://lore.kernel.org/r/e1bcfc3078c82b53aa9b78077a89955abe4ea009.1678380991.git.hns@goldelico.com
Fixes: bde4a5a00e ("PCI: imx6: Allow probe deferral by reset GPIO")
Fixes: 415b6185c5 ("PCI: imx6: Fix config read timeout handling")
Fixes: 2d8ed461db ("PCI: imx6: Add support for i.MX8MQ")
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>
struct bus_type should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
bus_type to be moved to read-only memory.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hu Haowen <src.res@email.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd
Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl
Reviewed-by: Alex Shi <alexs@kernel.org>
Acked-by: Iwona Winiarska <iwona.winiarska@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi
Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The CDAT exposed in sysfs differs between little endian and big endian
arches: On big endian, every 4 bytes are byte-swapped.
PCI Configuration Space is little endian (PCI r3.0 sec 6.1). Accessors
such as pci_read_config_dword() implicitly swap bytes on big endian.
That way, the macros in include/uapi/linux/pci_regs.h work regardless of
the arch's endianness. For an example of implicit byte-swapping, see
ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx
(Load Word Byte-Reverse Indexed).
DOE Read/Write Data Mailbox Registers are unlike other registers in
Configuration Space in that they contain or receive a 4 byte portion of
an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f).
They need to be copied to or from the request/response buffer verbatim.
So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit
byte-swapping.
The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume
implicit byte-swapping. Byte-swap requests after constructing them with
those macros and byte-swap responses before parsing them.
Change the request and response type to __le32 to avoid sparse warnings.
Per a request from Jonathan, replace sizeof(u32) with sizeof(__le32) for
consistency.
Fixes: c97006046c ("cxl/port: Read CDAT table")
Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable@vger.kernel.org # v6.0+
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/3051114102f41d19df3debbee123129118fc5e6d.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If CDM_CHECK is enabled (by the DT "snps,enable-cdm-check" property), 'val'
is overwritten by PCIE_PL_CHK_REG_CONTROL_STATUS initialization. Commit
ec7b952f45 ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check"
exists") did not account for further usage of 'val', so we wrote improper
values to PCIE_PORT_LINK_CONTROL when the CDM check is enabled.
Move the PCIE_PORT_LINK_CONTROL update to be completely after the
PCIE_PL_CHK_REG_CONTROL_STATUS register initialization.
[bhelgaas: commit log adapted from Serge's version]
Fixes: ec7b952f45 ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists")
Link: https://lore.kernel.org/r/20230310123510.675685-2-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Add PCIe EP mode support for ls1028a.
Link: https://lore.kernel.org/r/20230209151050.233973-1-Frank.Li@nxp.com
Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com>
Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Acked-by: Roy Zang <Roy.Zang@nxp.com>
The module pointer in class_create() never actually did anything, and it
shouldn't have been requred to be set as a parameter even if it did
something. So just remove it and fix up all callers of the function in
the kernel tree at the same time.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On s390 PCI functions may be hotplugged individually even when they
belong to a multi-function device. In particular on an SR-IOV device VFs
may be removed and later re-added.
In commit a50297cf82 ("s390/pci: separate zbus creation from
scanning") it was missed however that struct pci_bus and struct
zpci_bus's resource list retained a reference to the PCI functions MMIO
resources even though those resources are released and freed on
hot-unplug. These stale resources may subsequently be claimed when the
PCI function re-appears resulting in use-after-free.
One idea of fixing this use-after-free in s390 specific code that was
investigated was to simply keep resources around from the moment a PCI
function first appeared until the whole virtual PCI bus created for
a multi-function device disappears. The problem with this however is
that due to the requirement of artificial MMIO addreesses (address
cookies) extra logic is then needed to keep the address cookies
compatible on re-plug. At the same time the MMIO resources semantically
belong to the PCI function so tying their lifecycle to the function
seems more logical.
Instead a simpler approach is to remove the resources of an individually
hot-unplugged PCI function from the PCI bus's resource list while
keeping the resources of other PCI functions on the PCI bus untouched.
This is done by introducing pci_bus_remove_resource() to remove an
individual resource. Similarly the resource also needs to be removed
from the struct zpci_bus's resource list. It turns out however, that
there is really no need to add the MMIO resources to the struct
zpci_bus's resource list at all and instead we can simply use the
zpci_bar_struct's resource pointer directly.
Fixes: a50297cf82 ("s390/pci: separate zbus creation from scanning")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230306151014.60913-2-schnelle@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Defines prefixed with "CONFIG" should be limited to proper Kconfig options,
that are introduced in a Kconfig file.
In the R-car driver the bitmask to configure the SEND_ENABLE mode is named
CONFIG_SEND_ENABLE.
Rename this local definition to a more suitable name, containing the
register bitfield name defined in the R-Car Gen3 rev. 2.30 user
manual.
No functional change.
Link: https://lore.kernel.org/r/20230113084516.31888-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
[lpieralisi@kernel.org: Changed define naming and commit log]
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
pcie-kirin uses regmaps, and needs to pull them in; otherwise, with
CONFIG_PCIE_KIRIN=y and without CONFIG_REGMAP_MMIO pcie-kirin produces
a linker failure looking for __devm_regmap_init_mmio_clk().
Fixes: d19afe7be1 ("PCI: kirin: Use regmap for APB registers")
Link: https://lore.kernel.org/r/04636141da1d6d592174eefb56760511468d035d.1668410580.git.josh@joshtriplett.org
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
[lpieralisi@kernel.org: commit log and removed REGMAP select]
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: stable@vger.kernel.org # 5.16+
- Prevent possible NULL pointer derefences in irq_data_get_affinity_mask()
and irq_domain_create_hierarchy().
- Take the per device MSI lock before invoking code which relies
on it being hold.
- Make sure that MSI descriptors are unreferenced before freeing
them. This was overlooked when the platform MSI code was converted to
use core infrastructure and results in a fals positive warning.
- Remove dead code in the MSI subsystem.
- Clarify the documentation for pci_msix_free_irq().
- More kobj_type constification.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmQEVToTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYodc9EACg5HBOGsh5OV8pwnuEDqfThK/dOZ5k
LJ8xQjGAx29JeNWu4gkkHaSDEGjZhwLiZlB6qUeH+LPQ1NgmAlLL3T2NxEOOWa6y
z/xQv+1Ceu6XxazpCSFRWR/6w4Nyup92jhlsUIkmmsWkVvKH/pV6Uo+3ta0WagWg
heb3vqts6J0AOJaMepF8azYGbwAPSIElNLI1UtiEuQYEKU55N8jLK20VJTL6lzJ2
FyRg/0ghNWDAaBdnv4cZCQ/MzoG5UkoU3f2cqhdSce5mqnq2fKRfgBjzllNgaRgA
zxOxIR88QaKTMHIr+WKD1dyWxDQlotFbBOkmVW39XAa13rn42s4GIeW88VCjJGww
RAm52SbC48cCIyNQlh4A6Vhb4vjPx2DndWbWnnVj5fWlUevdAPRxSlm4BjfxFxe+
LbuZCRRL1jjlC0fXmhVXTTxeE1/K7jarAZwRV7Nxhr3g0gT+Zv1jyaaW9rWuHq5U
3pS+xBl89LA/VYp9tv6jDfJlocmRwgrFbGX4UlfikqtObdTFqcH0FtmqisE61fZS
n0194BMWNDfPSibSpDohf/CDPoHZ6pNxeuqkVDiisUJHPpIYOt8+lH+8//DgBL7a
oi9zS0JazPIn2VM6NB4f/WXOYmS9GZq5+loiYEWb52AYtodKUKmOoWG0SUyy6XFr
E7yJzemsUwrJVg==
=jWot
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"A set of updates for the interrupt susbsystem:
- Prevent possible NULL pointer derefences in
irq_data_get_affinity_mask() and irq_domain_create_hierarchy()
- Take the per device MSI lock before invoking code which relies on
it being hold
- Make sure that MSI descriptors are unreferenced before freeing
them. This was overlooked when the platform MSI code was converted
to use core infrastructure and results in a fals positive warning
- Remove dead code in the MSI subsystem
- Clarify the documentation for pci_msix_free_irq()
- More kobj_type constification"
* tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/msi, platform-msi: Ensure that MSI descriptors are unreferenced
genirq/msi: Drop dead domain name assignment
irqdomain: Add missing NULL pointer check in irq_domain_create_hierarchy()
genirq/irqdesc: Make kobj_type structures constant
PCI/MSI: Clarify usage of pci_msix_free_irq()
genirq/msi: Take the per-device MSI lock before validating the control structure
genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()
device feature provisioning in ifcvf, mlx5
new SolidNET driver
support for zoned block device in virtio blk
numa support in virtio pmem
VIRTIO_F_RING_RESET support in vhost-net
more debugfs entries in mlx5
resume support in vdpa
completion batching in virtio blk
cleanup of dma api use in vdpa
now simulating more features in vdpa-sim
documentation, features, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmP0D98PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpV6IH/iecRgLMWWjp3n31IFdu31f/J4HpF7dczVjK
qtV98eJ1N2pkgeJkdCfmB5XszfvFBeAurrS7++FTHiJhrRfR3Z+2ml/Qtvh5DEyP
qxz6wOw6VVsi/txdUxM1wsxLeEmmzkmFdAmPM+FXeIjhWj76GOgy/4A3eaj6TgzV
W8ShsBve/UZ5qMOC3XbIscvdOrudHJ18tH90Tiz3NZfH1fAs5E4uWbU6Mrz9DJVr
canGvf4kAI9z8qram5HSgzPIXRJEYiF4q/eiStdtiiME8gL1mHLRZDNP1I1LeCAb
q6Q6RCRKi3Ek+LGdH6u+nR1Swu03N2b/g+vgKtv30kJo06oZVzw=
=EasV
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- device feature provisioning in ifcvf, mlx5
- new SolidNET driver
- support for zoned block device in virtio blk
- numa support in virtio pmem
- VIRTIO_F_RING_RESET support in vhost-net
- more debugfs entries in mlx5
- resume support in vdpa
- completion batching in virtio blk
- cleanup of dma api use in vdpa
- now simulating more features in vdpa-sim
- documentation, features, fixes all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits)
vdpa/mlx5: support device features provisioning
vdpa/mlx5: make MTU/STATUS presence conditional on feature bits
vdpa: validate device feature provisioning against supported class
vdpa: validate provisioned device features against specified attribute
vdpa: conditionally read STATUS in config space
vdpa: fix improper error message when adding vdpa dev
vdpa/mlx5: Initialize CVQ iotlb spinlock
vdpa/mlx5: Don't clear mr struct on destroy MR
vdpa/mlx5: Directly assign memory key
tools/virtio: enable to build with retpoline
vringh: fix a typo in comments for vringh_kiov
vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails
scsi: virtio_scsi: fix handling of kmalloc failure
vdpa: Fix a couple of spelling mistakes in some messages
vhost-net: support VIRTIO_F_RING_RESET
vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit
vdpa: mlx5: support per virtqueue dma device
vdpa: set dma mask for vDPA device
virtio-vdpa: support per vq dma device
vdpa: introduce get_vq_dma_device()
...
- CXL RAM region enumeration: instantiate 'struct cxl_region' objects
for platform firmware created memory regions
- CXL RAM region provisioning: complement the existing PMEM region
creation support with RAM region support
- "Soft Reservation" policy change: Online (memory hot-add)
soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for
setting aside such memory for dedicated access via device-dax.
- CXL Events and Interrupts: Takeover CXL event handling from
platform-firmware (ACPI calls this CXL Memory Error Reporting) and
export CXL Events via Linux Trace Events.
- Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
subsystem interrogate the result of CXL _OSC negotiation.
- Emulate CXL DVSEC Range Registers as "decoders": Allow for
first-generation devices that pre-date the definition of the CXL HDM
Decoder Capability to translate the CXL DVSEC Range Registers into
'struct cxl_decoder' objects.
- Set timestamp: Per spec, set the device timestamp in case of hotplug,
or if platform-firwmare failed to set it.
- General fixups: linux-next build issues, non-urgent fixes for
pre-production hardware, unit test fixes, spelling and debug message
improvements.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY/WYcgAKCRDfioYZHlFs
Z6m3APkBUtiEEm1o8ikdu5llUS1OTLBwqjJDwGMTyf8X/WDXhgD+J2mLsCgARS7X
5IS0RAtefutrW5sQpUucPM7QiLuraAY=
=kOXC
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) updates from Dan Williams:
"To date Linux has been dependent on platform-firmware to map CXL RAM
regions and handle events / errors from devices. With this update we
can now parse / update the CXL memory layout, and report events /
errors from devices. This is a precursor for the CXL subsystem to
handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that
for DDR-attached-DRAM is handled by the EDAC driver where it maps
system physical address events to a field-replaceable-unit (FRU /
endpoint device). In general, CXL has the potential to standardize
what has historically been a pile of memory-controller-specific error
handling logic.
Another change of note is the default policy for handling RAM-backed
device-dax instances. Previously the default access mode was "device",
mmap(2) a device special file to access memory. The new default is
"kmem" where the address range is assigned to the core-mm via
add_memory_driver_managed(). This saves typical users from wondering
why their platform memory is not visible via free(1) and stuck behind
a device-file. At the same time it allows expert users to deploy
policy to, for example, get dedicated access to high performance
memory, or hide low performance memory from general purpose kernel
allocations. This affects not only CXL, but also systems with
high-bandwidth-memory that platform-firmware tags with the
EFI_MEMORY_SP (special purpose) designation.
Summary:
- CXL RAM region enumeration: instantiate 'struct cxl_region' objects
for platform firmware created memory regions
- CXL RAM region provisioning: complement the existing PMEM region
creation support with RAM region support
- "Soft Reservation" policy change: Online (memory hot-add)
soft-reserved memory (EFI_MEMORY_SP) by default, but still allow
for setting aside such memory for dedicated access via device-dax.
- CXL Events and Interrupts: Takeover CXL event handling from
platform-firmware (ACPI calls this CXL Memory Error Reporting) and
export CXL Events via Linux Trace Events.
- Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
subsystem interrogate the result of CXL _OSC negotiation.
- Emulate CXL DVSEC Range Registers as "decoders": Allow for
first-generation devices that pre-date the definition of the CXL
HDM Decoder Capability to translate the CXL DVSEC Range Registers
into 'struct cxl_decoder' objects.
- Set timestamp: Per spec, set the device timestamp in case of
hotplug, or if platform-firwmare failed to set it.
- General fixups: linux-next build issues, non-urgent fixes for
pre-production hardware, unit test fixes, spelling and debug
message improvements"
* tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits)
dax/kmem: Fix leak of memory-hotplug resources
cxl/mem: Add kdoc param for event log driver state
cxl/trace: Add serial number to trace points
cxl/trace: Add host output to trace points
cxl/trace: Standardize device information output
cxl/pci: Remove locked check for dvsec_range_allowed()
cxl/hdm: Add emulation when HDM decoders are not committed
cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders
cxl/hdm: Emulate HDM decoder from DVSEC range registers
cxl/pci: Refactor cxl_hdm_decode_init()
cxl/port: Export cxl_dvsec_rr_decode() to cxl_port
cxl/pci: Break out range register decoding from cxl_hdm_decode_init()
cxl: add RAS status unmasking for CXL
cxl: remove unnecessary calling of pci_enable_pcie_error_reporting()
dax/hmem: build hmem device support as module if possible
dax: cxl: add CXL_REGION dependency
cxl: avoid returning uninitialized error code
cxl/pmem: Fix nvdimm registration races
cxl/mem: Fix UAPI command comment
cxl/uapi: Tag commands from cxl_query_cmd()
...
- Core support
- New devm_of_phy_optional_get() API with users and conversion
- New support:
- Mediatek MT7986 tphy support
- Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250
USB3 phy support, SM6350 combo phy support, SM6125 UFS PHY
support amd SM8350 & SM8450 combo phy support
- Qualcomm SNPS eUSB2 eUSB2 repeater drivers
- Allwinner F1C100s USB PHY support
- Tegra xusb support for Tegra234
- Updates:
- Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy
- G4 mode support in Qualcomm UFS phy and support for various SoCs
- Yaml conversion for Meson usb2 phy
- TI Type C support for usb phy for j721
- Yaml conversion for Tegra xusb binding
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmP4vg0ACgkQfBQHDyUj
g0dJYxAA0DL9X6IFeK8U5blp/ZWmXBqbhRmnsf0KGsSRJMSgHkPruhOWfTCAvXnc
2+xp8wiE0MdLI7xseJWZi7Q2r10f5LtU55rzbL3mI7MWd/g2WTlKiXCDrpa4fY/Z
pxo+892vUJh3+I2+Sjf0JnIY89MV/sqSLXsFeKDtvp7J9lMjA98TV6m+YDVTXn22
SW3hjaB8ochSQV1HEMdEJWsrZc3lmszLdQM+qz3PafyQRbhc1A98Vkf0X/sWR/Ot
p0FCXlNnY3O272dnrU0V5yv7wwWqjVDN5+Q3vk3AbSlo9ERLVwchayUzxi8EIS7t
cPmxhsyMoEmsSIPx4z47vLt1NQoqiaKNM7XCrn13Z0fE9fbTW8Trx8VBXcIUsE98
hT6IxrjRFGJOta8koOssBqSjuwP4QBIZiwXL2YEujj3hGqyRefOCN5XBek7dVyDe
ctwJsIKBCG8Wh87dFldYLrJgQKR9svZXDjxVADpYMUpPM2v02DCWhUyM50ODowZf
Yl7bP8dXtn2UBIybbhNTZg29PbrATk73tcr73GZeX8JTOK2vpsZ3+fUsdxPYzed3
lF2vw361E2ry1DtgmH7XMXevDFvKJ/aks5FIAKebc1tlAPPGYVIkBqyQprAQmlS3
tDQ+6+jQLAr14iSaVQd9MC3obNqbJYHf1WEU3rKtDy3MB0flbqo=
=2g27
-----END PGP SIGNATURE-----
Merge tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"This features a bunch of new device support, a couple of new drivers,
yaml conversion and updates of a few drivers.
Core support:
- New devm_of_phy_optional_get() API with users and conversion
New hardware support:
- Mediatek MT7986 phy support
- Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250 USB3
phy support, SM6350 combo phy support, SM6125 UFS PHY support amd
SM8350 & SM8450 combo phy support
- Qualcomm SNPS eUSB2 eUSB2 repeater drivers
- Allwinner F1C100s USB PHY support
- Tegra xusb support for Tegra234
Updates:
- Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy
- G4 mode support in Qualcomm UFS phy and support for various SoCs
- Yaml conversion for Meson usb2 phy
- TI Type C support for usb phy for j721
- Yaml conversion for Tegra xusb binding"
* tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (106 commits)
phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater
phy: qcom: Add QCOM SNPS eUSB2 repeater driver
dt-bindings: phy: qcom,snps-eusb2-phy: Add phys property for the repeater
dt-bindings: phy: Add qcom,snps-eusb2-repeater schema file
dt-bindings: phy: amlogic,g12a-usb3-pcie-phy: add missing optional phy-supply property
phy: rockchip-typec: Fix unsigned comparison with less than zero
phy: rockchip-typec: fix tcphy_get_mode error case
phy: qcom: snps-eusb2: Add missing headers
phy: qcom-qmp-combo: Add support for SM8550
phy: qcom-qmp: Add v6 DP register offsets
phy: qcom-qmp: pcs-usb: Add v6 register offsets
dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp: Document SM8550 compatible
phy: qcom: Add QCOM SNPS eUSB2 driver
dt-bindings: phy: Add qcom,snps-eusb2-phy schema file
phy: qcom-qmp-pcie: Add support for SM8550 g3x2 and g4x2 PCIEs
phy: qcom-qmp: qserdes-lane-shared: Add v6 register offsets
phy: qcom-qmp: qserdes-txrx: Add v6.20 register offsets
phy: qcom-qmp: pcs-pcie: Add v6.20 register offsets
phy: qcom-qmp: pcs-pcie: Add v6 register offsets
phy: qcom-qmp: pcs: Add v6.20 register offsets
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmP2dbsUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vzBDg//aW2IeJYku5ENXwwnCQjBlyjBGOOZ
456KGpFt/ky0N9Jp0ZS3nQSa5YN7q+L8XY48gu6I7s1hXly8iLZKLrJN++S//k55
BadXu7mDUyVoY74LYvBe0nlXuwJul2qnq9IJLufRucrn1yoyqApAh39IRdCzi4U8
mP+wad7sQA0Si4bpf80uwn6Yq8SrDoO0mtmO/dZSXJooM2t2SnDXEL/fxMwTNDA4
XsVSP9FrbPmcTLo8mkDa8Dy7JKbL6KQJF9yDlmYzuA2spQpTf+YLLfsNnmE+850h
WTtfCjVaYtlik7i9qTB+VcN1CsGVepYKK3H5the16Aeql2Fu+Ji5KSt74C220Yi9
ZSDA93d/EfGc5egKyBdUUMFgqhe46srRUAoWcMrx2T4ARGuOm5EYCa9C8C7dFmO0
j6f9MYL3j2Sw3FROEKViRVOFfbIfVW1TXIo3x0fE0ud3xkg73eKp/++X8QeTMjox
2ArY2AWPNQpUI1oMlKxlSEd5XjFf7n/hHDtFqj9bIuJzt0/8wXQf0jCYTjhpGkRB
pmO+lColK6lp+bg8aWRRkiwN73xGdQhKaeXLo0Iq4T6xr0Lb3XoskHZvt6NIGe/A
ds5/uwtErq6kCf2G9YG1xfh+G1bimbjWwsHCNfSNXzTsWGDFTCb8tvqF90m+7+yl
bllxTXA6PO312Tw=
=/y4d
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Rework portdrv shutdown so it disables interrupts but doesn't
disable bus mastering, which leads to hangs on Loongson LS7A
- Add mechanism to prevent Max_Read_Request_Size (MRRS) increases,
again to avoid hardware issues on Loongson LS7A (and likely other
devices based on DesignWare IP)
- Ignore devices with a firmware (DT or ACPI) node that says the
device is disabled
Resource management:
- Distribute spare resources to unconfigured hotplug bridges at
boot-time (not just when hot-adding such a bridge), which makes
hot-adding devices to docks work better. Tried this in v6.1 but had
to revert for regressions, so try again
- Fix root bus issue that dropped resources that happened to end
at 0, e.g., [bus 00]
PCI device hotplug:
- Remove device locking when marking device as disconnected so this
doesn't have to wait for concurrent driver bind/unbind to complete
- Quirk more Qualcomm bridges that don't fully implement the PCIe
Slot Status 'Command Completed' bit
Power management:
- Account for _S0W of the target bridge in acpi_pci_bridge_d3() so we
don't miss hot-add notifications for USB4 docks, Thunderbolt, etc
Reset:
- Observe delay after reset, e.g., resuming from system sleep,
regardless of whether a bridge can suspend to D3cold at runtime
- Wait for secondary bus to become ready after a bridge reset
Virtualization:
- Avoid FLR on some AMD FCH AHCI adapters where it doesn't work
- Allow independent IOMMU groups for some Wangxun NICs that prevent
peer-to-peer transactions but don't advertise an ACS Capability
Error handling:
- Configure End-to-End-CRC (ECRC) only if Linux owns the AER
Capability
- Remove redundant Device Control Error Reporting Enable in the AER
service driver since this is already done for all devices during
enumeration
ASPM:
- Add pci_enable_link_state() interface to allow drivers to enable
ASPM link state
Endpoint framework:
- Move dra7xx and tegra194 linkup processing from hard IRQ to
threaded IRQ handler
- Add a separate lock for endpoint controller list of endpoint
function drivers to prevent deadlock in callbacks
- Pass events from endpoint controller to endpoint function drivers
via callbacks instead of notifiers
Synopsys DesignWare eDMA controller driver (acked by Vinod):
- Fix CPU vs PCI address issues
- Fix source vs destination address issues
- Fix issues with interleaved transfer semantics
- Fix channel count initialization issue (issue still exists in
several other drivers)
- Clean up and improve debugfs usage so it will work on platforms
with several eDMA devices
Baikal T-1 PCIe controller driver:
- Set a 64-bit DMA mask
Freescale i.MX6 PCIe controller driver:
- Add i.MX8MM, i.MX8MQ, i.MX8MP endpoint mode DT binding and driver
support
Intel VMD host bridge driver:
- Add quirk to configure PCIe ASPM and LTR. This is normally done by
BIOS, and will be for future products
Marvell MVEBU PCIe controller driver:
- Mark this driver as broken in Kconfig since bugs prevent its daily
usage
MediaTek MT7621 PCIe controller driver:
- Delay PHY port initialization to improve boot reliability for ZBT
WE1326, ZBT WF3526-P, and some Netgear models
Qualcomm PCIe controller driver:
- Add MSM8998 DT compatible string
- Unify MSM8996 and MSM8998 clock orderings
- Add SM8350 DT binding and driver support
- Add IPQ8074 Gen3 DT binding and driver support
- Correct qcom,perst-regs in DT binding
- Add qcom_pcie_host_deinit() so the PHY is powered off and
regulators and clocks are disabled on late host-init errors
Socionext UniPhier Pro5 controller driver:
- Clean up uniphier-ep reg, clocks, resets, and their names in DT
binding
Synopsys DesignWare PCIe controller driver:
- Restrict coherent DMA mask to 32 bits for MSI, but allow controller
drivers to set 64-bit streaming DMA mask
- Add eDMA engine support in both Root Port and Endpoint controllers
Miscellaneous:
- Remove MODULE_LICENSE from boolean drivers so they don't look like
modules so modprobe can complain about them"
* tag 'pci-v6.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (86 commits)
PCI: dwc: Add Root Port and Endpoint controller eDMA engine support
PCI: bt1: Set 64-bit DMA mask
PCI: dwc: Restrict only coherent DMA mask for MSI address allocation
dmaengine: dw-edma: Prepare dw_edma_probe() for builtin callers
dmaengine: dw-edma: Depend on DW_EDMA instead of selecting it
dmaengine: dw-edma: Add mem-mapped LL-entries support
PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules
PCI: hv: Drop duplicate PCI_MSI dependency
PCI/P2PDMA: Annotate RCU dereference
PCI/sysfs: Constify struct kobj_type pci_slot_ktype
PCI: hotplug: Allow marking devices as disconnected during bind/unbind
PCI: pciehp: Add Qualcomm quirk for Command Completed erratum
PCI: qcom: Add IPQ8074 Gen3 port support
dt-bindings: PCI: qcom: Add IPQ8074 Gen3 port
dt-bindings: PCI: qcom: Sort compatibles alphabetically
PCI: qcom: Fix host-init error handling
PCI: qcom: Add SM8350 support
dt-bindings: PCI: qcom: Add SM8350
dt-bindings: PCI: qcom-ep: Correct qcom,perst-regs
dt-bindings: PCI: qcom: Unify MSM8996 and MSM8998 clock order
...
Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work falls
into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be moved
into read-only memory (i.e. const) The recent work with Rust has
pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only making
things safer overall. This is the contuation of that work (started
last release with kobject changes) in moving struct bus_type to be
constant. We didn't quite make it for this release, but the
remaining patches will be finished up for the release after this
one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6
6oeFOjD3JDju3cQsfGgd
=Su6W
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work
falls into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be
moved into read-only memory (i.e. const) The recent work with Rust
has pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only
making things safer overall. This is the contuation of that work
(started last release with kobject changes) in moving struct
bus_type to be constant. We didn't quite make it for this release,
but the remaining patches will be finished up for the release after
this one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems"
[ Geert Uytterhoeven points out that that last sentence isn't true, and
that there's a pending report that has a fix that is queued up - Linus ]
* tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits)
debugfs: drop inline constant formatting for ERR_PTR(-ERROR)
OPP: fix error checking in opp_migrate_dentry()
debugfs: update comment of debugfs_rename()
i3c: fix device.h kernel-doc warnings
dma-mapping: no need to pass a bus_type into get_arch_dma_ops()
driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place
Revert "driver core: add error handling for devtmpfs_create_node()"
Revert "devtmpfs: add debug info to handle()"
Revert "devtmpfs: remove return value of devtmpfs_delete_node()"
driver core: cpu: don't hand-override the uevent bus_type callback.
devtmpfs: remove return value of devtmpfs_delete_node()
devtmpfs: add debug info to handle()
driver core: add error handling for devtmpfs_create_node()
driver core: bus: update my copyright notice
driver core: bus: add bus_get_dev_root() function
driver core: bus: constify bus_unregister()
driver core: bus: constify some internal functions
driver core: bus: constify bus_get_kset()
driver core: bus: constify bus_register/unregister_notifier()
driver core: remove private pointer from struct bus_type
...
- Add pci_enable_link_state() to allow drivers to enable ASPM link state
(Michael Bottini)
- Add quirk to enable all ASPM link states and program LTR for devices
below VMD (David E. Box)
* pci/controller/vmd:
PCI: vmd: Add quirk to configure PCIe ASPM and LTR
PCI: vmd: Create feature grouping for client products
PCI: vmd: Use PCI_VDEVICE in device list
PCI/ASPM: Add pci_enable_link_state()
- Delay PHY initialization to make boots reliable for ZBT WE1326 and ZBT
WF3526-P and some Netgear models (Sergio Paracuellos)
* pci/controller/mt7621:
PCI: mt7621: Delay phy ports initialization
- Release previously-requested DW eDMA IRQs if request_irq() fails (Serge
Semin)
- Convert DW eDMA linked-list (ll) and data target (dt) from CPU-relative
addresses to PCI bus addresses (Serge Semin)
- Fix missing src/dst address for interleaved transfers (Serge Semin)
- Enforce the DW eDMA restriction that interleaved transfers must increment
src and dst addresses (Serge Semin)
- Fix some invalid interleaved transfer semantics (Serge Semin)
- Convert CPU-relative addresses to PCI bus addresses for eDMA engine
(Serge Semin)
- Drop chancnt initialization from dw-edma-core, since it is managed by the
dmaengine core, e.g., in dma_async_device_channel_register() (Serge Semin)
- Clean up bogus casting of debugfs_entries.reg addresses (Serge Semin)
- Ignore debugfs file/directory creation errors (Serge Semin)
- Allocate debugfs entries from the heap to prepare for multi-eDMA
platforms (Serge Semin)
- Simplify and rework register accessors to remove another obstacle to
multi-eDMA platforms (Serge Semin)
- Consolidate eDMA read/write channels in a single dma_device to simplify,
better reflect the hardware design, and avoid a debugfs complaint (Serge
Semin)
- Move eDMA-specific debugfs nodes into existing dmaengine subdirectory
(Serge Semin)
- Fix a readq_ch() truncation from 64 to 32 bits (Serge Semin)
- Use existing readq()/writeq rather than hand-coding new ones (Serge
Semin)
- Drop unnecessary data target region allocation in favor of existing
dw_edma_chip members (Serge Semin)
- Use parent device in eDMA controller name to prepare for multi-eDMA
platforms (Serge Semin)
- In addition to the existing MMIO accessors for linked list entries, add
support for ioremapped entries for use by eDMA in Root Ports or local
Endpoints (Serge Semin)
- Convert DW_EDMA_PCIE so it depends on DW_EDMA instead of selecting it
(Serge Semin)
- Allow DWC drivers to set streaming DMA masks larger than 32 bits;
previously both streaming and coherent DMA were limited to 32 bits
because some PCI devices only support coherent 32-bit DMA for MSI (Serge
Semin)
- Set 64-bit streaming and coherent DMA mask for the bt1 driver (Serge
Semin)
- Add DW Root Port and Endpoint controller support for eDMA (Serge Semin)
* pci/controller/dwc:
PCI: dwc: Add Root Port and Endpoint controller eDMA engine support
PCI: bt1: Set 64-bit DMA mask
PCI: dwc: Restrict only coherent DMA mask for MSI address allocation
dmaengine: dw-edma: Prepare dw_edma_probe() for builtin callers
dmaengine: dw-edma: Depend on DW_EDMA instead of selecting it
dmaengine: dw-edma: Add mem-mapped LL-entries support
dmaengine: dw-edma: Skip cleanup procedure if no private data found
dmaengine: dw-edma: Replace chip ID number with device name
dmaengine: dw-edma: Drop DT-region allocation
dmaengine: dw-edma: Use non-atomic io-64 methods
dmaengine: dw-edma: Fix readq_ch() return value truncation
dmaengine: dw-edma: Use DMA engine device debugfs subdirectory
dmaengine: dw-edma: Join read/write channels into a single device
dmaengine: dw-edma: Move eDMA data pointer to debugfs node descriptor
dmaengine: dw-edma: Simplify debugfs context CSRs init procedure
dmaengine: dw-edma: Rename debugfs dentry variables to 'dent'
dmaengine: dw-edma: Convert debugfs descs to being heap-allocated
dmaengine: dw-edma: Add dw_edma prefix to debugfs nodes descriptor
dmaengine: dw-edma: Stop checking debugfs_create_*() return value
dmaengine: dw-edma: Drop unnecessary debugfs reg casts
dmaengine: dw-edma: Drop chancnt initialization
dmaengine: dw-edma: Add PCI bus address getter to the remote EP glue driver
dmaengine: dw-edma: Add CPU to PCI bus address translation
dmaengine: dw-edma: Fix invalid interleaved xfers semantics
dmaengine: dw-edma: Don't permit non-inc interleaved xfers
dmaengine: dw-edma: Fix missing src/dst address of interleaved xfers
dmaengine: dw-edma: Convert ll/dt phys address to PCI bus/DMA address
dmaengine: dw-edma: Release requested IRQs on failure
dmaengine: Fix dma_slave_config.dst_addr description
- Convert dra7xx to threaded IRQ handler (Manivannan Sadhasivam)
- Move tegra194 dw_pcie_ep_linkup() to threaded IRQ handler (Manivannan
Sadhasivam)
- Add a separate lock for the endpoint pci_epf list to avoid deadlock
while running callbacks (Manivannan Sadhasivam)
- Use callbacks instead of notifier chains to signal events from EPC to EPF
drivers (Manivannan Sadhasivam)
- Use link_up() callback in place of LINK_UP notifier (Manivannan
Sadhasivam)
* pci/endpoint:
PCI: endpoint: Use link_up() callback in place of LINK_UP notifier
PCI: endpoint: Use callback mechanism for passing events from EPC to EPF
PCI: endpoint: Use a separate lock for protecting epc->pci_epf list
PCI: tegra194: Move dw_pcie_ep_linkup() to threaded IRQ handler
PCI: dra7xx: Use threaded IRQ handler for "dra7xx-pcie-main" IRQ
- Avoid FLR for AMD FCH AHCI adapters to avoid a hardware defect (Damien Le
Moal)
- Add ACS quirk for Wangxun NICs that don't allow peer-to-peer between
functions, but don't advertise an ACS Capability (Mengyuan Lou)
* pci/virtualization:
PCI: Add ACS quirk for Wangxun NICs
PCI: Avoid FLR for AMD FCH AHCI adapters
- Realign space as required by bridge windows after dividing it up (Mika
Westerberg)
- Account for space required by other devices on the bus before
distributing it all to bridges (Mika Westerberg)
- Distribute spare resources to root bus devices as well as to other
hotplug bridges (Mika Westerberg)
- Fix bug that dropped root bus resources that end at zero, e.g., a host
bridge that leads only to bus 00 (Geert Uytterhoeven)
* pci/resource:
PCI: Fix dropping valid root bus resources with .end = zero
PCI: Distribute available resources for root buses, too
PCI: Take other bus devices into account when distributing resources
PCI: Align extra resources for hotplug bridges properly
- Always observe reset delay when waking devices from D3cold, e.g., after
system sleep, regardless of whether we're allowed to runtime-suspend to
D3cold (Lukas Wunner)
- Unify reset and resume delays to wait for downstream devices after a
bridge reset (Lukas Wunner)
- Wait for downstream devices after a DPC-induced bridge reset (Lukas
Wunner)
* pci/reset:
PCI/DPC: Await readiness of secondary bus after reset
PCI: Unify delay handling for reset and resume
PCI/PM: Observe reset delay irrespective of bridge_d3
- Account for _S0W when deciding whether to put bridges in D3 to avoid
missing hotplug events (Rafael J. Wysocki)
* pci/pm:
PCI/ACPI: Account for _S0W of the target bridge in acpi_pci_bridge_d3()
- Remove MODULE_LICENSE from boolean drivers so they don't look like
modules so modprobe will complain about them (Nick Alcock)
* pci/kbuild:
PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules
- Add quirk to work around Qualcomm hardware defect in Command Completed
signaling (Manivannan Sadhasivam)
- Remove locking to allow devices to be marked as disconnected immediately
instead of waiting for concurrent bind/unbind to complete (Lukas Wunner)
* pci/hotplug:
PCI: hotplug: Allow marking devices as disconnected during bind/unbind
PCI: pciehp: Add Qualcomm quirk for Command Completed erratum
- Implement portdrv .shutdown() method that calls service driver .remove()
methods (which disables interrupt generation as required by .shutdown()),
but doesn't disable bus mastering (which hangs on Loongson LS7A because
of a hardware defect) (Huacai Chen)
- Prevent MRRS increases for devices below Loongson LS7A to avoid hardware
limitations (Huacai Chen)
- Ignore devices with a firmware (DT/ACPI) node that says the device is
disabled (Rob Herring)
* pci/enumeration:
PCI: Honor firmware's device disabled status
PCI: loongson: Add more devices that need MRRS quirk
PCI: loongson: Prevent LS7A MRRS increases
PCI/portdrv: Prevent LS7A Bus Master clearing on shutdown
Since the DW eDMA core now supports eDMA controllers embedded in locally
accessible DW PCIe Root Ports and Endpoints, register these controllers
when possible.
To do that the DW PCIe core driver needs to perform some preparations
first. First of all, it needs to find the eDMA controller CSRs base
address, whether they are accessible over the Port Logic or iATU unrolled
space. Afterwards it can try to auto-detect the eDMA controller
availability and number of read/write channels. If none are found the
procedure silently returns without error.
Secondly, the platform is supposed to provide either combined or
per-channel IRQ signals. If no valid IRQs set is found, the procedure
returns without error to be backward compatible with platforms where DW
PCIe controllers have eDMA but lack the IRQ description.
Finally, before actually probing the eDMA device we need to allocate LLP
items buffers. After that the DW eDMA can be registered. If registration is
successful, a message regarding the number of detected Read/Write eDMA
channels will be printed to the system as is done for the iATU settings.
Note: the DW PCI controller driver (either host or endpoint mode) is
currently always built-in, so if the DW eDMA core is built as a module
(CONFIG_DW_EDMA=m), eDMA controllers will not be registered even if the
dw-edma module is later loaded.
Link: https://lore.kernel.org/r/20230113171409.30470-28-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
The DW PCIe Root Port IP core is synthesized with the 64-bit AXI address
bus. Since the device is also equipped with the eDMA engine, explicitly
set the device DMA mask so DMA engine clients can allocate data buffers
anywhere in the 64-bit memory space.
Link: https://lore.kernel.org/r/20230113171409.30470-27-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The MSI target address must be in the lowest 4GB memory to support PCI
peripherals without 64-bit MSI support. Since the allocation is done from
DMA coherent memory, set only the coherent DMA mask, leaving the streaming
DMA mask alone.
Thus streaming DMA operations will work with no artificial limitations. It
will be specifically useful for the eDMA-capable controllers so the
corresponding DMA engine clients would map the DMA buffers with no need for
SWIOTLB for buffers allocated above 4GB.
Add a brief comment about the reason allocating the MSI target address
below 4GB.
Link: https://lore.kernel.org/r/20230113171409.30470-26-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmPzgDgTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXrc7CACfG4SSd8KkWU/y8Q66Irxdau0a3ETD
KL4UNRKGIyKujufgFsme79O6xVSSsCNSay449wk20hqn8lnwbSRi9pUwmLn29hfd
CMFleWIqgwGFfC1do5DRF1vrt1siuG/jVE07mWsEwuY2iHx/es+H7LiQKidhkndZ
DhXRqoi7VYiJv5fRSumpkUJrMZiI96o9Mk09HUksdMwCn3+7RQEqHnlTH5KOozKF
iMroDB72iNw5Na/USZwWL2EDRptENam3lFkPBeDPqNw0SbG4g65JGPR9DSa0Lkbq
AGCJQkdU33mcYQG5MY7R4K1evufpOl/apqLW7h92j45Znr9ok6Vr2c1R
=J1VT
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- allow Linux to run as the nested root partition for Microsoft
Hypervisor (Jinank Jain and Nuno Das Neves)
- clean up the return type of callback functions (Dawei Li)
* tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Fix hv_get/set_register for nested bringup
Drivers: hv: Make remove callback of hyperv driver void returned
Drivers: hv: Enable vmbus driver for nested root partition
x86/hyperv: Add an interface to do nested hypercalls
Drivers: hv: Setup synic registers in case of nested root partition
x86/hyperv: Add support for detecting nested hypervisor
This pull request contains the following branches:
doc.2023.01.05a: Documentation updates.
fixes.2023.01.23a: Miscellaneous fixes, perhaps most notably:
o Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number
of callbacks.
o Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized.
o Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have doen this for mnay years.)
o Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled,
so this should not (yet) affect production use cases.)
kvfree.2023.01.03a: Cause kfree_rcu() and friends to take advantage of
polled grace periods, thus reducing memory footprint by almost
two orders of magnitude, admittedly on a microbenchmark.
This series also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs
where kfree_rcu(p), which can block, was typed instead of the
intended kfree_rcu(p, rh).
srcu.2023.01.03a: SRCU updates, perhaps most notably fixing a bug that
causes SRCU to fail when booted on a system with a non-zero boot
CPU. This surprising situation actually happens for kdump kernels
on the powerpc architecture. It also adds an srcu_down_read()
and srcu_up_read(), which act like srcu_read_lock() and
srcu_read_unlock(), but allow an SRCU read-side critical section
to be handed off from one task to another.
srcu-always.2023.02.02a: Cleans up the now-useless SRCU Kconfig option.
There are a few more commits that are not yet acked or pulled
into maintainer trees, and these will be in a pull request for
a later merge window.
tasks.2023.01.03a: RCU-tasks updates, perhaps most notably these fixes:
o A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang.
o A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can result
in a too-short grace period.
o A race between shrinking RCU tasks down to a single callback list
and queuing a new callback to some other CPU, but where that
queuing is delayed for more than an RCU grace period. This can
result in that callback being stranded on the non-boot CPU.
torture.2023.01.05a: Torture-test updates and fixes.
torturescript.2023.01.03a: Torture-test scripting updates and fixes.
stall.2023.01.09a: Provide additional RCU CPU stall-warning information
in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and
restore the full five-minute timeout limit for expedited RCU
CPU stall warnings.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmPq29UTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jAhVEACEAKJY1VJ9IUqz7CwzAYkzgRJfiygh
oDUXmlqtm6ew9pr2GdLUVCVsUSldzBc0K7Djb/G1niv4JPs+v7YwupIV33+UbStU
Qxt6ztTdxc4lKospLm1+2vF9ZdzVEmiP4wVCc4iDarv5FM3FpWSTNc8+L7qmlC+X
myjv+GqMTxkXZBvYJOgJGFjDwN8noTd7Fr3mCCVLFm3PXMDa7tcwD6HRP5AqD2N8
qC5M6LEqepKVGmz0mYMLlSN1GPaqIsEcexIFEazRsPEivPh/iafyQCQ/cqxwhXmV
vEt7u+dXGZT/oiDq9cJ+/XRDS2RyKIS6dUE14TiiHolDCn1ONESahfA/gXWKykC2
BaGPfjWXrWv/hwbeZ+8xEdkAvTIV92tGpXir9Fby1Z5PjP3balvrnn6hs5AnQBJb
NdhRPLzy/dCnEF+CweAYYm1qvTo8cd5nyiNwBZHn7rEAIu3Axrecag1rhFl3AJ07
cpVMQXZtkQVa2X8aIRTUC+ijX6yIqNaHlu0HqNXgIUTDzL4nv5cMjOMzpNQP9/dZ
FwAMZYNiOk9IlMiKJ8ZiVcxeiA8ouIBlkYM3k6vGrmiONZ7a/EV/mSHoJqI8bvqr
AxUIJ2Ayhg3bxPboL5oKgCiLql0A7ZVvz6quX6McitWGMgaSvel1fDzT3TnZd41e
4AFBFd/+VedUGg==
=bBYK
-----END PGP SIGNATURE-----
Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Documentation updates
- Miscellaneous fixes, perhaps most notably:
- Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number of
callbacks
- Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized
- Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have done this for many years)
- Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled, so
this should not (yet) affect production use cases)
- Make kfree_rcu() and friends take advantage of polled grace periods,
thus reducing memory footprint by almost two orders of magnitude,
admittedly on a microbenchmark
This also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs where
kfree_rcu(p), which can block, was typed instead of the intended
kfree_rcu(p, rh)
- SRCU updates, perhaps most notably fixing a bug that causes SRCU to
fail when booted on a system with a non-zero boot CPU. This
surprising situation actually happens for kdump kernels on the
powerpc architecture
This also adds an srcu_down_read() and srcu_up_read(), which act like
srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side
critical section to be handed off from one task to another
- Clean up the now-useless SRCU Kconfig option
There are a few more commits that are not yet acked or pulled into
maintainer trees, and these will be in a pull request for a later
merge window
- RCU-tasks updates, perhaps most notably these fixes:
- A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang
- A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can
result in a too-short grace period
- A race between shrinking RCU tasks down to a single callback
list and queuing a new callback to some other CPU, but where
that queuing is delayed for more than an RCU grace period. This
can result in that callback being stranded on the non-boot CPU
- Torture-test updates and fixes
- Torture-test scripting updates and fixes
- Provide additional RCU CPU stall-warning information in kernels built
with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute
timeout limit for expedited RCU CPU stall warnings
* tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
kernel/notifier: Remove CONFIG_SRCU
init: Remove "select SRCU"
fs/quota: Remove "select SRCU"
fs/notify: Remove "select SRCU"
fs/btrfs: Remove "select SRCU"
fs: Remove CONFIG_SRCU
drivers/pci/controller: Remove "select SRCU"
drivers/net: Remove "select SRCU"
drivers/md: Remove "select SRCU"
drivers/hwtracing/stm: Remove "select SRCU"
drivers/dax: Remove "select SRCU"
drivers/base: Remove CONFIG_SRCU
rcu: Disable laziness if lazy-tracking says so
rcu: Track laziness during boot and suspend
rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
rcu: Align the output of RCU CPU stall warning messages
rcu: Add RCU stall diagnosis information
sched: Add helper nr_context_switches_cpu()
...
pci_msix_free_irq() is used to free an interrupt on a PCI/MSI-X interrupt
domain.
The API description specifies that the interrupt to be freed was allocated
via pci_msix_alloc_irq_at(). This description limits the usage of
pci_msix_free_irq() since pci_msix_free_irq() can also be used to free
MSI-X interrupts allocated with, for example, pci_alloc_irq_vectors().
Remove the text stating that the interrupt to be freed had to be allocated
with pci_msix_alloc_irq_at(). The needed struct msi_map need not be from
pci_msix_alloc_irq_at() but can be created from scratch using
pci_irq_vector() to obtain the Linux IRQ number. Highlight that
pci_msix_free_irq() cannot be used to disable MSI-X to guide users that,
for example, pci_free_irq_vectors() remains to be needed.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/lkml/87r0xsd8j4.ffs@tglx
Link: https://lore.kernel.org/r/4c3e7a50d6e70f408812cd7ab199c6b4b326f9de.1676408572.git.reinette.chatre@intel.com
This patch fixes a FLR bug on the SNET DPU rev 1 by setting the
PCI_DEV_FLAGS_NO_FLR_RESET flag.
As there is a quirk to avoid FLR (quirk_no_flr), I added a new quirk
to check the rev ID before calling to quirk_no_flr.
Without this patch, a SNET DPU rev 1 may hang when FLR is applied.
Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Message-Id: <20230110165638.123745-3-alvaro.karsz@solid-run.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since 8b41fc4454 ("kbuild: create modules.builtin without
Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are
used to identify modules. As a consequence, MODULE_LICENSE in non-modules
causes modprobe to misidentify the object file as a module when it is not,
and modprobe might succeed rather than failing with a suitable error
message.
For tristate modules that can be either built-in or loaded at runtime,
modprobe succeeds in both cases:
# modprobe ext4
[exit status zero if CONFIG_EXT4_FS=y or =m]
For boolean modules like the Standard Hot Plug Controller driver (shpchp)
that cannot be loaded at runtime, modprobe should always fail like this:
# modprobe shpchp
modprobe: FATAL: Module shpchp not found in directory /lib/modules/...
[exit status non-zero regardless of CONFIG_HOTPLUG_PCI_SHPC]
but prior to this commit, shpchp_core.c contained MODULE_LICENSE, so
"modprobe shpchp" silently succeeded when it should have failed.
Remove MODULE_LICENSE in files that cannot be built as modules.
[bhelgaas: commit log, squash]
Suggested-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230216152410.4312-1-nick.alcock@oracle.com/
Signed-off-by: Nick Alcock <nick.alcock@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Commit a474d3fbe2 ("PCI/MSI: Get rid of PCI_MSI_IRQ_DOMAIN") removed
PCI_MSI_IRQ_DOMAIN and made all previous references to it refer to PCI_MSI
instead.
PCI_HYPERV_INTERFACE already depended on PCI_MSI && PCI_MSI_IRQ_DOMAIN, so
we ended up with a redundant dependency on PCI_MSI && PCI_MSI. Drop the
duplicate.
No functional change. Just a stylistic clean-up.
Link: https://lore.kernel.org/r/20221215101310.9135-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A dereference of the __rcu pointer was noticed by sparse:
drivers/pci/p2pdma.c:199:44: sparse: sparse: dereference of noderef expression
Dereference the __rcu pointer using rcu_dereference_protected() instead of
accessing it directly. It's safe to use rcu_dereference_protected() because
a reference is held on the pgmap's percpu reference counter and thus it
cannot disappear.
Link: https://lore.kernel.org/r/20230209172953.4597-1-logang@deltatee.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.") the
driver core allows the usage of const struct kobj_type.
Take advantage of this to constify the structure definition to prevent
modification at runtime.
Link: https://lore.kernel.org/r/20230216-kobj_type-pci-v1-1-46a63c8612b5@weissschuh.net
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
On surprise removal, pciehp_unconfigure_device() and acpiphp's
trim_stale_devices() call pci_dev_set_disconnected() to mark removed
devices as permanently offline. Thereby, the PCI core and drivers know
to skip device accesses.
However pci_dev_set_disconnected() takes the device_lock and thus waits for
a concurrent driver bind or unbind to complete. As a result, the driver's
->probe and ->remove hooks have no chance to learn that the device is gone.
That doesn't make any sense, so drop the device_lock and instead use atomic
xchg() and cmpxchg() operations to update the device state.
As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs
on surprise removal with AER concurrently performing a bus reset.
AER bus reset:
INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds.
Tainted: G W 6.2.0-rc3-custom-norework-jan11+
schedule
rwsem_down_write_slowpath
down_write_nested
pciehp_reset_slot # acquires reset_lock
pci_reset_hotplug_slot
pci_slot_reset # acquires device_lock
pci_bus_error_reset
aer_root_reset
pcie_do_recovery
aer_process_err_devices
aer_isr
pciehp surprise removal:
INFO: task irq/26-pciehp:96 blocked for more than 120 seconds.
Tainted: G W 6.2.0-rc3-custom-norework-jan11+
schedule_preempt_disabled
__mutex_lock
mutex_lock_nested
pci_dev_set_disconnected # acquires device_lock
pci_walk_bus
pciehp_unconfigure_device
pciehp_disable_slot
pciehp_handle_presence_or_link_change
pciehp_ist # acquires reset_lock
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590
Fixes: a6bd101b8f ("PCI: Unify device inaccessible")
Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de
Reported-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.20+
Cc: Keith Busch <kbusch@kernel.org>
The Qualcomm PCI bridge device (Device ID 0x010e) found in chipsets such as
SC8280XP used in Lenovo Thinkpad X13s, does not set the Command Completed
bit unless writes to the Slot Command register change "Control" bits.
This results in timeouts like below during boot and resume from suspend:
pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x03c0 (issued 2020 msec ago)
...
pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x13f1 (issued 107724 msec ago)
Add the device to the Command Completed quirk to mark commands "completed"
immediately unless they change the "Control" bits.
Link: https://lore.kernel.org/r/20230213144922.89982-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
IPQ8074 has one Gen2 and one Gen3 port, with Gen2 port already supported.
Add compatible for Gen3 port which uses the same controller as IPQ6018.
Link: https://lore.kernel.org/r/20230113164449.906002-7-robimarko@gmail.com
Signed-off-by: Robert Marko <robimarko@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Implement the new host_deinit() callback so that the PHY is powered off
and regulators and clocks are disabled also on late host-init errors.
Link: https://lore.kernel.org/r/20221017114705.8277-2-johan+linaro@kernel.org
Fixes: 82a823833f ("PCI: qcom: Add Qualcomm PCIe controller driver")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Wangxun has verified there is no peer-to-peer between functions for the
below selection of SFxxx, RP1000 and RP2000 NICS. They may be
multi-function devices, but the hardware does not advertise ACS capability.
Add an ACS quirk for these devices so the functions can be in independent
IOMMU groups.
Link: https://lore.kernel.org/r/20230207102419.44326-1-mengyuanlou@net-swift.com
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
On r8a7791/koelsch:
kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
# cat /sys/kernel/debug/kmemleak
unreferenced object 0xc3a34e00 (size 64):
comm "swapper/0", pid 1, jiffies 4294937460 (age 199.080s)
hex dump (first 32 bytes):
b4 5d 81 f0 b4 5d 81 f0 c0 b0 a2 c3 00 00 00 00 .]...]..........
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<fe3aa979>] __kmalloc+0xf0/0x140
[<34bd6bc0>] resource_list_create_entry+0x18/0x38
[<767046bc>] pci_add_resource_offset+0x20/0x68
[<b3f3edf2>] devm_of_pci_get_host_bridge_resources.constprop.0+0xb0/0x390
When coalescing two resources for a contiguous aperture, the second
resource is enlarged to cover the full contiguous range, while the first
resource is marked invalid. This invalidation is done by clearing the
flags, start, and end members.
When adding the initial resources to the bus later, invalid resources are
skipped. Unfortunately, the check for an invalid resource considers only
the end member, causing false positives.
E.g. on r8a7791/koelsch, root bus resource 0 ("bus 00") is skipped, and no
longer registered with pci_bus_insert_busn_res() (causing the memory leak),
nor printed:
pci-rcar-gen2 ee090000.pci: host bridge /soc/pci@ee090000 ranges:
pci-rcar-gen2 ee090000.pci: MEM 0x00ee080000..0x00ee08ffff -> 0x00ee080000
pci-rcar-gen2 ee090000.pci: PCI: revision 11
pci-rcar-gen2 ee090000.pci: PCI host bridge to bus 0000:00
-pci_bus 0000:00: root bus resource [bus 00]
pci_bus 0000:00: root bus resource [mem 0xee080000-0xee08ffff]
Fix this by only skipping resources where all of the flags, start, and end
members are zero.
Fixes: 7c3855c423 ("PCI: Coalesce host bridge contiguous apertures")
Link: https://lore.kernel.org/r/da0fcd5e86c74239be79c7cb03651c0fce31b515.1676036673.git.geert+renesas@glider.be
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
As a part of the transition towards callback mechanism for signalling the
events from EPC to EPF, let's use the link_up() callback in the place of
the LINK_UP notifier. This also removes the notifier support completely
from the PCI endpoint framework.
Link: https://lore.kernel.org/linux-pci/20230124071158.5503-6-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Kishon Vijay Abraham I <kishon@kernel.org>
Instead of using the notifiers for passing the events from EPC to EPF,
let's introduce a callback based mechanism where the EPF drivers can
populate relevant callbacks for EPC events they want to subscribe.
The use of notifiers in kernel is not recommended if there is a real link
between the sender and receiver, like in this case. Also, the existing
atomic notifier forces the notification functions to be in atomic context
while the caller may be in non-atomic context. For instance, the two
in-kernel users of the notifiers, pcie-qcom and pcie-tegra194, both are
calling the notifier functions in non-atomic context (from threaded IRQ
handlers). This creates a sleeping in atomic context issue with the
existing EPF_TEST driver that calls the EPC APIs that may sleep.
For all these reasons, let's get rid of the notifier chains and use the
simple callback mechanism for signalling the events from EPC to EPF
drivers. This preserves the context of the caller and avoids the latency
of going through a separate interface for triggering the notifications.
As a first step of the transition, the core_init() callback is introduced
in this commit, that'll replace the existing CORE_INIT notifier used for
signalling the init complete event from EPC.
During the occurrence of the event, EPC will go over the list of EPF
drivers attached to it and will call the core_init() callback if available.
Link: https://lore.kernel.org/linux-pci/20230124071158.5503-5-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Kishon Vijay Abraham I <kishon@kernel.org>
The EPC controller maintains a list of EPF drivers added to it. For
protecting this list against the concurrent accesses, the epc->lock
(used for protecting epc_ops) has been used so far. Since there were
no users trying to use epc_ops and modify the pci_epf list simultaneously,
this was not an issue.
But with the addition of callback mechanism for passing the events, this
will be a problem. Because the pci_epf list needs to be iterated first
for getting hold of the EPF driver and then the relevant event specific
callback needs to be called for the driver.
If the same epc->lock is used, then it will result in a deadlock scenario.
For instance,
...
mutex_lock(&epc->lock);
list_for_each_entry(epf, &epc->pci_epf, list) {
epf->event_ops->core_init(epf);
|
|-> pci_epc_set_bar();
|
|-> mutex_lock(&epc->lock) # DEADLOCK
...
So to fix this issue, use a separate lock called "list_lock" for
protecting the pci_epf list against the concurrent accesses. This lock
will also be used by the callback mechanism.
Link: https://lore.kernel.org/linux-pci/20230124071158.5503-4-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
dw_pcie_ep_linkup() may take more time to execute depending on the EPF
driver implementation. Calling this API in the hard IRQ handler is not
encouraged since the hard IRQ handlers are supposed to complete quickly.
So move the dw_pcie_ep_linkup() call to threaded IRQ handler.
Link: https://lore.kernel.org/linux-pci/20230124071158.5503-3-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
The "dra7xx-pcie-main" hard IRQ handler is just printing the IRQ status
and calling the dw_pcie_ep_linkup() API if LINK_UP status is set. But the
execution of dw_pcie_ep_linkup() depends on the EPF driver and may take
more time depending on the EPF implementation.
In general, hard IRQ handlers are supposed to return quickly and not block
for so long. Moreover, there is no real need of the current IRQ handler to
be a hard IRQ handler. So switch to the threaded IRQ handler for the
"dra7xx-pcie-main" IRQ.
Link: https://lore.kernel.org/linux-pci/20230124071158.5503-2-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
If a device has a firmware node (DT/ACPI), and the device is marked
disabled, that is currently ignored. Add a check for this condition and
bail out creating the pci_dev.
This assumes the config space for the device can still be accessed because
they already have by this point in order to identify the device.
Link: https://lore.kernel.org/r/20230210164351.2687475-1-robh@kernel.org
Tested-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Liu Peibao <liupeibao@loongson.cn>
Cc: Huacai Chen <chenhuacai@loongson.cn>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmPmuwEUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vzEYg//XHHddqDRiZmx9McETDAi33rJ9DDo
CMCwiydUzGlDl/IDnBxwcmq0K5wiA5jFvXlRFmzHfnHGpWpRf6ntcT436QnhKe4G
/DAXxVdZGWr079m7s4NKjByDunhkkkT/elapFCtZTwXxMkUvbprM0ozMdtSMnC/M
RDCJKfaV2CKUkl/5Mk9Iw3vzrr62PP8fVHHMIr+6O39frZ2+MrzYCgpGkW0pubmT
He0gmeVnNFzR6qB1GraXVNwlapjPjzvHe1IggDDLJRxM4+sz8qKJz0vKew10JwSo
R5s8ACfTNtHwY45af1EWIeO9BoGD3soNLvWmK/5uNrCWJx9wnczQuz4b/Km2y02Y
KCJaudiC6EfAzu5gCSgao3VZ/EQ45sHrYZN9qiyDujOgAUUPl0oonwa1HW/1WUSH
Pd/ff9o78vASxdZP1o1hF0davNET1HOsvXGxQj71TJLXVsB2pifWvAoNocHHnpoe
cPCix8t3c4pgXzI0RG04tcfqGWAgsaVz73SdU0/g5qk+hPRvypjcY1lw6U66sk9f
/ZNII5fSX6hIWTetD27JiCZNOxJq1jikxOD4/LZizMTjdZYf6VxjDxkIaLS99pZw
RCOQ8chKVemr12lD//8eFUJJvblug2aTlHIwFnMuKiavy6pL5Sm1zGMBrqhYmUSO
pkNXzFaZe+GyF3k=
=NSFX
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI fixes from Bjorn Helgaas:
- Move to a shared PCI git tree (Bjorn Helgaas)
- Add Krzysztof Wilczyński as another PCI maintainer (Lorenzo
Pieralisi)
- Revert a couple ASPM patches to fix suspend/resume regressions (Bjorn
Helgaas)
* tag 'pci-v6.2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming"
Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"
MAINTAINERS: Promote Krzysztof to PCI controller maintainer
MAINTAINERS: Move to shared PCI tree
This reverts commit 5e85eba6f5.
Thomas Witt reported that 5e85eba6f5 ("PCI/ASPM: Refactor L1 PM Substates
Control Register programming") broke suspend/resume on a Tuxedo
Infinitybook S 14 v5, which seems to use a Clevo L140CU Mainboard.
The main symptom is:
iwlwifi 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
nvme 0000:03:00.0: Unable to change power state from D3hot to D0, device inaccessible
and the machine is only partially usable after resume. It can't run dmesg
and can't do a clean reboot. This happens on every suspend/resume cycle.
Revert 5e85eba6f5 until we can figure out the root cause.
Fixes: 5e85eba6f5 ("PCI/ASPM: Refactor L1 PM Substates Control Register programming")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
Reported-by: Thomas Witt <kernel@witt.link>
Tested-by: Thomas Witt <kernel@witt.link>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v6.1+
Cc: Vidya Sagar <vidyas@nvidia.com>
This reverts commit 4ff116d0d5.
Tasev Nikola and Mark Enriquez reported that resume from suspend was broken
in v6.1-rc1. Tasev bisected to a47126ec29 ("PCI/PTM: Cache PTM
Capability offset"), but we can't figure out how that could be related.
Mark saw the same symptoms and bisected to 4ff116d0d5 ("PCI/ASPM: Save L1
PM Substates Capability for suspend/resume"), which does have a connection:
it restores L1 Substates configuration while ASPM L1 may be enabled:
pci_restore_state
pci_restore_aspm_l1ss_state
aspm_program_l1ss
pci_write_config_dword(PCI_L1SS_CTL1, ctl1) # L1SS restore
pci_restore_pcie_state
pcie_capability_write_word(PCI_EXP_LNKCTL, cap[i++]) # L1 restore
which is a problem because PCIe r6.0, sec 5.5.4, requires that:
If setting either or both of the enable bits for ASPM L1 PM
Substates, both ports must be configured as described in this
section while ASPM L1 is disabled.
Separately, Thomas Witt reported that 5e85eba6f5 ("PCI/ASPM: Refactor L1
PM Substates Control Register programming") broke suspend/resume, and it
depends on 4ff116d0d5.
Revert 4ff116d0d5 ("PCI/ASPM: Save L1 PM Substates Capability for
suspend/resume") to fix the resume issue and enable revert of 5e85eba6f5
to fix the issue Thomas reported.
Note that reverting 4ff116d0d5 means L1 Substates config may be lost on
suspend/resume. As far as we know the system will use more power but will
still *work* correctly.
Fixes: 4ff116d0d5 ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
Reported-by: Tasev Nikola <tasev.stefanoska@skynet.be>
Reported-by: Mark Enriquez <enriquezmark36@gmail.com>
Reported-by: Thomas Witt <kernel@witt.link>
Tested-by: Mark Enriquez <enriquezmark36@gmail.com>
Tested-by: Thomas Witt <kernel@witt.link>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v6.1+
Cc: Vidya Sagar <vidyas@nvidia.com>
pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus
Reset, but not after a DPC-induced Hot Reset.
As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not
observed and devices on the secondary bus may be accessed before
they're ready.
One affected device is Intel's Ponte Vecchio HPC GPU. It comprises a
PCIe switch whose upstream port is not immediately ready after reset.
Because its config space is restored too early, it remains in
D0uninitialized, its subordinate devices remain inaccessible and DPC
recovery fails with messages such as:
i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible)
intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible)
pcieport 0000:89:02.0: AER: device recovery failed
Fix it.
Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.1673769517.git.lukas@wunner.de
Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
People are reporting that pci-mvebu.c driver does not work with recent
mainline kernel. There are more bugs which prevents its for daily usage.
So lets mark it as broken for now, until somebody would be able to fix it
in mainline kernel.
Link: https://lore.kernel.org/r/20230114164125.1298-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Merge the general CXL updates with fixes targeting v6.2-rc for v6.3.
Resolve a conflict with the fix and move of cxl_report_and_clear() from
pci.c to core/pci.c.
Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait
for devices on the secondary bus to become accessible after reset:
Although it does call pci_dev_wait(), it erroneously passes the bridge's
pci_dev rather than that of a child. The bridge of course is always
accessible while its secondary bus is reset, so pci_dev_wait() returns
immediately.
Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait()
function which is called from pci_bridge_secondary_bus_reset():
https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/
However we already have pci_bridge_wait_for_secondary_bus() which does
almost exactly what we need. So far it's only called on resume from
D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8).
Re-using it for Secondary Bus Resets is a leaner and more rational
approach than introducing a new function.
That only requires a few minor tweaks:
- Amend pci_bridge_wait_for_secondary_bus() to await accessibility of
the first device on the secondary bus by calling pci_dev_wait() after
performing the prescribed delays. pci_dev_wait() needs two parameters,
a reset reason and a timeout, which callers must now pass to
pci_bridge_wait_for_secondary_bus(). The timeout is 1 sec for resume
(PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c4 ("PCI:
Wait up to 60 seconds for device to become ready after FLR")).
Introduce a PCI_RESET_WAIT macro for the 1 sec timeout.
- Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or
-ENOTTY on error for consumption by pci_bridge_secondary_bus_reset().
- Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which
is now performed by pci_bridge_wait_for_secondary_bus(). A static
delay this long is only necessary for Conventional PCI, so modern
PCIe systems benefit from shorter reset times as a side effect.
Fixes: 6b2f1351af ("PCI: Wait for device to become ready after secondary bus reset")
Link: https://lore.kernel.org/r/da77c92796b99ec568bd070cbe4725074a117038.1673769517.git.lukas@wunner.de
Reported-by: Sheng Bi <windy.bi.enflame@gmail.com>
Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org # v4.17+
If a PCI bridge is suspended to D3cold upon entering system sleep,
resuming it entails a Fundamental Reset per PCIe r6.0 sec 5.8.
The delay prescribed after a Fundamental Reset in PCIe r6.0 sec 6.6.1
is sought to be observed by:
pci_pm_resume_noirq()
pci_pm_bridge_power_up_actions()
pci_bridge_wait_for_secondary_bus()
However, pci_bridge_wait_for_secondary_bus() bails out if the bridge_d3
flag is not set. That flag indicates whether a bridge is allowed to
suspend to D3cold at *runtime*.
Hence *no* delay is observed on resume from system sleep if runtime
D3cold is forbidden. That doesn't make any sense, so drop the bridge_d3
check from pci_bridge_wait_for_secondary_bus().
The purpose of the bridge_d3 check was probably to avoid delays if a
bridge remained in D0 during suspend. However the sole caller of
pci_bridge_wait_for_secondary_bus(), pci_pm_bridge_power_up_actions(),
is only invoked if the previous power state was D3cold. Hence the
additional bridge_d3 check seems superfluous.
Fixes: ad9001f2f4 ("PCI/PM: Add missing link delays required by the PCIe spec")
Link: https://lore.kernel.org/r/eb37fa345285ec8bacabbf06b020b803f77bdd3d.1673769517.git.lukas@wunner.de
Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org # v5.5+
Previously we distributed spare resources only upon hot-add, so if the
initial root bus scan found devices that had not been fully configured by
the BIOS, we allocated only enough resources to cover what was then
present. If some of those devices were hotplug bridges, we did not leave
any additional resource space for future expansion.
Distribute the available resources for root buses, too, to make this work
the same way as the normal hotplug case.
A previous commit to do this was reverted due to a regression reported by
Jonathan Cameron:
e96e27fc6f ("PCI: Distribute available resources for root buses, too")
5632e2beaf ("Revert "PCI: Distribute available resources for root buses, too"")
This commit changes pci_bridge_resources_not_assigned() to work with
bridges that do not have all the resource windows programmed by the boot
firmware (previously we expected all I/O, memory and prefetchable memory
were programmed).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216000
Link: https://lore.kernel.org/r/20220905080232.36087-5-mika.westerberg@linux.intel.com
Link: https://lore.kernel.org/r/20230131092405.29121-4-mika.westerberg@linux.intel.com
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
A PCI bridge may reside on a bus with other devices as well. The resource
distribution code does not take this into account and therefore it expands
the bridge resource windows too much, not leaving space for the other
devices (or functions of a multifunction device). This leads to an issue
that Jonathan reported when running QEMU with the following topology (QEMU
parameters):
-device pcie-root-port,port=0,id=root_port13,chassis=0,slot=2 \
-device x3130-upstream,id=sw1,bus=root_port13,multifunction=on \
-device e1000,bus=root_port13,addr=0.1 \
-device xio3130-downstream,id=fun1,bus=sw1,chassis=0,slot=3 \
-device e1000,bus=fun1
The first e1000 NIC here is another function in the switch upstream port.
This leads to following errors:
pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04]
pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04]
pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000]
e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0]
Fix this by taking into account bridge windows, device BARs and SR-IOV PF
BARs on the bus (PF BARs include space for VF BARS so only account PF
BARs), including the ones belonging to bridges themselves if it has any.
Link: https://lore.kernel.org/linux-pci/20221014124553.0000696f@huawei.com/
Link: https://lore.kernel.org/linux-pci/6053736d-1923-41e7-def9-7585ce1772d9@ixsystems.com/
Link: https://lore.kernel.org/r/20230131092405.29121-3-mika.westerberg@linux.intel.com
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reported-by: Alexander Motin <mav@ixsystems.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
After division the extra resource space per hotplug bridge may not be
aligned according to the window alignment, so align it before passing it
down for further distribution.
Link: https://lore.kernel.org/r/20230131092405.29121-2-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some devices like ZBT WE1326 and ZBT WF3526-P and some Netgear models need
to delay phy port initialization after calling the mt7621_pcie_init_port()
driver function to get into reliable boots for both warm and hard resets.
The delay required to detect the ports seems to be in the range [75-100]
milliseconds.
If the ports are not detected the controller is not functional.
There is no datasheet or something similar to really understand why this
extra delay is needed only for these devices and it is not for most of
the boards that are built on mt7621 SoC.
This issue has been reported by openWRT community and the complete
discussion is in [0]. The 100 milliseconds delay has been tested in all
devices to validate it.
Add the extra 100 milliseconds delay to fix the issue.
[0]: https://github.com/openwrt/openwrt/pull/11220
Link: https://lore.kernel.org/r/20221231074041.264738-1-sergio.paracuellos@gmail.com
Fixes: 2bdd5238e7 ("PCI: mt7621: Add MediaTek MT7621 PCIe host controller driver")
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: "Krzysztof Wilczyński" <kw@linux.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: <linux-pci@vger.kernel.org>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
PCIe ports reserved for VMD use are not visible to BIOS and therefore not
configured to enable PCIe ASPM or LTR values (which BIOS will configure if
they are not set). Lack of this programming results in high power
consumption on laptops as reported in bugzilla. For affected products use
pci_enable_link_state to set the allowed link states for devices on the
root ports. Also set the LTR value to the maximum value needed for the SoC.
This is a workaround for products from Rocket Lake through Alder Lake.
Raptor Lake, the latest product at this time, has already implemented LTR
configuring in BIOS. Future products will move ASPM configuration back to
BIOS as well. As this solution is intended for laptops, support is not
added for hotplug or for devices downstream of a switch on the root port.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=212355
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215063
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213717
Link: https://lore.kernel.org/r/20230120031522.2304439-5-david.e.box@linux.intel.com
Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Simplify the device ID list by creating a grouping of features shared by
client products.
Suggested-by: Jon Derrick <jonathan.derrick@linux.dev>
Link: https://lore.kernel.org/r/20230120031522.2304439-4-david.e.box@linux.intel.com
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Use PCI_VDEVICE to simplify the device table.
Link: https://lore.kernel.org/r/20230120031522.2304439-3-david.e.box@linux.intel.com
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Add pci_enable_link_state() to allow devices to change the default BIOS
configured states. Clears the BIOS default settings then sets the new
states and reconfigures the link under the semaphore. Also add
PCIE_LINK_STATE_ALL macro for convenience for callers that want to enable
all link states.
Link: https://lore.kernel.org/r/20230120031522.2304439-2-david.e.box@linux.intel.com
Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Except for isochronous-configured devices, software may set
Max_Read_Request_Size (MRRS) to any value up to 4096. If a device issues a
read request with size greater than the completer's Max_Payload_Size (MPS),
the completer is required to break the response into multiple completions.
Instead of correctly responding with multiple completions to a large read
request, some LS7A Root Ports respond with a Completer Abort. To prevent
this, the MRRS must be limited to an implementation-specific value.
The OS cannot detect that value, so rely on BIOS to configure MRRS before
booting, and quirk the Root Ports so we never set an MRRS larger than that
BIOS value for any downstream device.
N.B. Hot-added devices are not configured by BIOS, and they power up with
MRRS = 512 bytes, so these devices will be limited to 512 bytes. If the
LS7A limit is smaller, those hot-added devices may not work correctly, but
per [1], hotplug is not supported with this chipset revision.
[1] https://lore.kernel.org/r/073638a7-ae68-2847-ac3d-29e5e760d6af@loongson.cn
[bhelgaas: commit log]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216884
Link: https://lore.kernel.org/r/20230201043018.778499-3-chenhuacai@loongson.cn
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
After cc27b735ad ("PCI/portdrv: Turn off PCIe services during shutdown")
we observe hangs during poweroff/reboot on systems with LS7A chipset.
This happens because the portdrv .shutdown() method (pcie_portdrv_remove())
clears PCI_COMMAND_MASTER via pci_disable_device(), which prevents bridges
from forwarding memory or I/O Requests in the upstream direction (PCIe
r6.0, sec 7.5.1.1.3).
LS7A Root Ports have a hardware defect: clearing PCI_COMMAND_MASTER *also*
prevents the bridge from forwarding CPU MMIO requests in the downstream
direction, and these MMIO accesses to devices below the bridge happen even
after .shutdown(), e.g., to print console messages. LS7A neither forwards
the requests nor sends an unsuccessful completion to the CPU, so the CPU
waits forever, resulting in the hang.
The purpose of .shutdown() is to disable interrupts and DMA from the
device. PCIe ports may generate interrupts (either MSI/MSI-X or INTx) for
AER, DPC, PME, hotplug, etc., but they never perform DMA except MSI/MSI-X.
Clearing PCI_COMMAND_MASTER effectively disables MSI/MSI-X, but not INTx.
The port service driver .remove() methods clear the interrupt enables in
PCI_ERR_ROOT_COMMAND, PCI_EXP_DPC_CTL, PCI_EXP_SLTCTL, and PCI_EXP_RTCTL,
etc., which disables interrupts regardless of whether they are MSI/MSI-X or
INTx.
Add a pcie_portdrv_shutdown() method that calls all the port service driver
.remove() methods to clear the interrupt enables for each service but does
not clear Bus Mastering on the port itself.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20230201043018.778499-2-chenhuacai@loongson.cn
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCI passthrough to VMs does not work with AMD FCH AHCI adapters: the guest
OS fails to correctly probe devices attached to the controller due to FIS
communication failures:
ata4: softreset failed (1st FIS failed)
...
ata4.00: qc timeout after 5000 msecs (cmd 0xec)
ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Forcing the "bus" reset method before unbinding & binding the adapter to
the vfio-pci driver solves this issue, e.g.:
echo "bus" > /sys/bus/pci/devices/<ID>/reset_method
gives a working guest OS, indicating that the default FLR reset method
doesn't work correctly.
Apply quirk_no_flr() to AMD FCH AHCI devices to work around this issue.
Link: https://lore.kernel.org/r/20230128013951.523247-1-damien.lemoal@opensource.wdc.com
Reported-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
The uevent() callback in struct bus_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following bits in the PCIe Device Control register enable sending of
ERR_COR, ERR_NONFATAL, or ERR_FATAL Messages (or reporting internally in
the case of Root Ports):
Correctable Error Reporting Enable
Non-Fatal Error Reporting Enable
Fatal Error Reporting Enable
Unsupported Request Reporting Enable
These enable bits are set by pci_enable_pcie_error_reporting(), and since
f26e58bf6f ("PCI/AER: Enable error reporting when AER is native"), we
do that in this path during enumeration:
pci_init_capabilities
pci_aer_init
pci_enable_pcie_error_reporting
Previously, the AER service driver also traversed the hierarchy when
claiming a Root Port, enabling error reporting for downstream devices, but
this is redundant.
Remove the code that enables this error reporting in the AER .probe() path.
Also remove similar code that disables error reporting in the AER .remove()
path.
Note that these Device Control Reporting Enable bits do not control
interrupt generation. That's done by the similarly-named bits in the AER
Root Error Command register, which are still set by aer_probe() and cleared
by aer_remove(), since the AER service driver handles those interrupts.
See PCIe r6.0, sec 6.2.6.
Link: https://lore.kernel.org/r/20230118234612.272916-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Keith Busch <kbusch@kernel.org>
8e4bfbe644 ("PCI: endpoint: pci-epf-vntb: fix error handle in
epf_ntb_mw_bar_init()") added a "num_mws" parameter to
epf_ntb_mw_bar_clear() but failed to add kernel-doc for num_mws.
Add kernel-doc for num_mws on epf_ntb_mw_bar_clear().
Fixes: 8e4bfbe644 ("PCI: endpoint: pci-epf-vntb: fix error handle in epf_ntb_mw_bar_init()")
Link: https://lore.kernel.org/r/20230103024907.293853-1-yangyingliang@huawei.com
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The "ret" variable in switchtec_dma_mrpc_isr() is superfluous. Remove it
and just return the value. No functional change intended.
Link: https://lore.kernel.org/r/20221216162126.207863-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
The sysfs link name "virtfn%u" constructed by pci_iov_sysfs_link() requires
17 bytes to contain the longest possible string. Increase VIRTFN_ID_LEN to
accommodate that.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
[bhelgaas: commit log, comment at #define]
Fixes: dd7cc44d0b ("PCI: add SR-IOV API for Physical Function driver")
Link: https://lore.kernel.org/r/20221218033347.23743-1-gremlin@altlinux.org
Signed-off-by: Alexey V. Vissarionov <gremlin@altlinux.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Since commit fc7a6209d5 ("bus: Make remove callback return
void") forces bus_type::remove be void-returned, it doesn't
make much sense for any bus based driver implementing remove
callbalk to return non-void to its caller.
As such, change the remove function for Hyper-V VMBus based
drivers to return void.
Signed-off-by: Dawei Li <set_pte_at@outlook.com>
Link: https://lore.kernel.org/r/TYCP286MB2323A93C55526E4DF239D3ACCAFA9@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM
Signed-off-by: Wei Liu <wei.liu@kernel.org>
i.MX PCIe is one dual mode PCIe controller.
Add i.MX PCIe EP mode support here, and split the PCIe modes to the Root
Complex mode and Endpoint mode.
Link: https://lore.kernel.org/r/1673847684-31893-12-git-send-email-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmPBniAUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vwOjRAAhjyRAgyiZV2rWS4pyvpQpqcpZWD9
796ZSqnzLJjVYCymGvUTX23FEA48n59+bCM/WpfEGUPrBf8LZQxC9YOCm6ltuM8+
FoSBykW/tHPq5IWaLzgrWpHeDOgEnZu/WFGGvrV3tl1mLpM1SJT8bGDsjHXlo+FM
qkTEiA3nUEKQs5x9r2TTLCeUWGPNTIHNd2VfuxOqM3qC/nVCOfTTxU8nm6Lk7Eix
nboAugAIADJIjs/+ZGekLBuzZYPkLYuDTyMYJ5hdo1p7wWCLc9gArEqvXKwVgmD3
ptenZeOlQi9Ay45HmkfIgfgKeeQ7REJj3dx04vf67neAianyUrB0EZDqDjR7LmgM
ozlNt0XjyoeEhu6AQS0s1LZtbDiED1R/00P6Gb+YEjUCVipW2lEYYwP0v9dsnNoh
6wblgnkQoxLFM+5CAXRmCmpaoQn0Uam7okfVeohtsz8/kNQF2St0hjzr4Dmws+O3
k9PUqnnUl4ByElzpEDesVGZMJ3pxFVH15ufu8VnRqN60pLTvNrsPyU4cVnG176Rc
3RSDN3zMtPxnHJVy4r3bTNEZsX/7RUrOb4xScOXMmRDBMUc8QdscF8Oj1ucKlj5j
mp7vB/7+VjU96uRarRyqUxGeQc77DCTcvOa1IGh/cuYom8ZJ6vpSCpKy6f6SFGuf
i8iTTUcQKCdqVW4=
=Fv2v
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fixes from Bjorn Helgaas:
- Work around apparent firmware issue that made Linux reject MMCONFIG
space, which broke PCI extended config space (Bjorn Helgaas)
- Fix CONFIG_PCIE_BT1 dependency due to mid-air collision between a
PCI_MSI_IRQ_DOMAIN -> PCI_MSI change and addition of PCIE_BT1 (Lukas
Bulwahn)
* tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
x86/pci: Treat EfiMemoryMappedIO as reservation of ECAM space
x86/pci: Simplify is_mmconf_reserved() messages
PCI: dwc: Adjust to recent removal of PCI_MSI_IRQ_DOMAIN
It is questionable to allow a PCI bridge to go into D3 if it has _S0W
returning D2 or a shallower power state, so modify acpi_pci_bridge_d3(() to
always take the return value of _S0W for the target bridge into account.
That is, make it return 'false' if _S0W returns D2 or a shallower power
state for the target bridge regardless of its ancestor Root Port
properties. Of course, this also causes 'false' to be returned if the Root
Port itself is the target and its _S0W returns D2 or a shallower power
state.
However, still allow bridges without _S0W that are power-manageable via
ACPI to enter D3 to retain the current code behavior in that case.
This fixes problems where a hotplug notification is missed because a bridge
is in D3. That means hot-added devices such as USB4 docks (and the devices
they contain) and Thunderbolt 3 devices may not work.
Link: https://lore.kernel.org/linux-pci/20221031223356.32570-1-mario.limonciello@amd.com/
Link: https://lore.kernel.org/r/12155458.O9o76ZdvQC@kreacher
Reported-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY76ohgAKCRCAXGG7T9hj
vo8fAP0XJ94B7asqcN4W3EyeyfqxUf1eZvmWRhrbKqpLnmHLaQEA/uJBkXL49Zj7
TTcbxR1coJ/hPwhtmONU4TNtCZ+RXw0=
=2Ib5
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- two cleanup patches
- a fix of a memory leak in the Xen pvfront driver
- a fix of a locking issue in the Xen hypervisor console driver
* tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/pvcalls: free active map buffer on pvcalls_front_free_map
hvc/xen: lock console list traversal
x86/xen: Remove the unused function p2m_index()
xen: make remove callback of xen driver void returned
As the ECRC configuration bits are part of AER registers, configure ECRC
only if AER is natively owned by the kernel.
Link: https://lore.kernel.org/r/20230112072111.20063-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CXL _OSC Error Reporting Control is used by the OS to determine if
Firmware has control of various CXL error reporting capabilities
including the event logs.
Expose the result of negotiating CXL Error Reporting Control in struct
pci_host_bridge for consumption by the CXL drivers.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: linux-pci@vger.kernel.org
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
a474d3fbe2 ("PCI/MSI: Get rid of PCI_MSI_IRQ_DOMAIN") removed
PCI_MSI_IRQ_DOMAIN and changed all references to refer to PCI_MSI instead.
ba6ed462dc ("PCI: dwc: Add Baikal-T1 PCIe controller support")
independently added PCIE_BT1, depending on PCI_MSI_IRQ_DOMAIN.
Both commits appeared in v6.2-rc1, so the latter missed the conversion from
PCI_MSI_IRQ_DOMAIN to PCI_MSI. Update PCIE_BT1 to depend on PCI_MSI
instead.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20221215103452.23131-1-lukas.bulwahn@gmail.com
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
- New support:
- Allwinner H616 USB PHY and A100 DPHY support
- TI J721s2, J784s4 and J721e support
- Freescale i.MX8MP PCIe PHY support
- New driver for Renesas Ethernet SERDES supporting R-Car S4-8
- Qualcomm SM8450 PCIe1 PHY support in EP mode
- Updates:
- again a big pile of updates on qcom-qmp-* drivers following the
driver split and reorganization merged earlier
- Phy order of API calls documentation update
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmOfIbYACgkQfBQHDyUj
g0fSbw//Rgfk+owGLWyJ3PxRXiDhZaJdBUQNuZEe46TjGKKHvWLJ4+ig6vrXlPgr
8mVte7jEMZubO7YE/1Vifv9xiFmjo+5R4//WlfkIwy/0SFR8+N+DPQiGU7i7ecov
uzkFN26qsi4aQrKmxyadGJQzHipaLViBkr6fqfuFcmyDiFII0FoVa/mV7ZQlFtl3
cDv3leFnp3HQ9mr/mKhOSmbyWCEQHqQvjDwB50R915WfH9PLV2jYddfO4Cbwpr4r
7m7wX2EiFlQ1o2gwcFQdLiDkA8YL9Kw3wOChpbcCu4gOapJ+GWqCk0AqS9m8MMWF
HnyAyHw3NxDagwV6sN19Xxa7XgkPJZPn6/92BfGYeD6H5gxmYwdROeU2/x6Qt1+z
scTl1m6z8X9WWwjnWK1cqVqBPUXoJJ2smym6VBHh3f4AJAVmwZy+yyk1Oar5qa2M
yDWV7nIRJQmXnuQ+XsG5rmXmmMwOuBgng4NsNX9PjhdVy6/1FUOJuMCr8ldPLAkG
Lpg+GN8w6tn2G0bxrHzWeAOytxjK5XuXch99BHmXDl+NgIpp/6DuyddXmvG4nrvk
R6eDv86UOQgGP2h7SujUm9f6RIWb3nJrYN27r+IHK/z5LjSMfylSSu13GvMjZkt4
Et5Q4Wk9MomHFQkhiTGTd9WlSvb497RgzKhBhMg/lJoSyTi9Eew=
=4HRP
-----END PGP SIGNATURE-----
Merge tag 'phy-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"This tme we have again a big pile of qcom-qmp-* changes, one new
driver and bunch of new hardware support.
New hardware support:
- Allwinner H616 USB PHY and A100 DPHY support
- TI J721s2, J784s4 and J721e support
- Freescale i.MX8MP PCIe PHY support
- New driver for Renesas Ethernet SERDES supporting R-Car S4-8
- Qualcomm SM8450 PCIe1 PHY support in EP mode
- Qualcomm SC8280XP PCIe PHY support (including x4 mode)
- Fixed Qualcomm SC8280XP USB4-USB3-DP PHY DT bindings
Updates:
- A big pile of updates on qcom-qmp-* drivers following the driver
split and reorganization merged earlier
- Phy order of API calls documentation update"
* tag 'phy-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (174 commits)
phy: ti: phy-j721e-wiz: add j721s2-wiz-10g module support
dt-bindings: phy-j721e-wiz: add j721s2 compatible string
phy: use devm_platform_get_and_ioremap_resource()
phy: allwinner: phy-sun6i-mipi-dphy: Add the A100 DPHY variant
phy: allwinner: phy-sun6i-mipi-dphy: Add a variant power-on hook
phy: allwinner: phy-sun6i-mipi-dphy: Set the enable bit last
phy: allwinner: phy-sun6i-mipi-dphy: Make RX support optional
dt-bindings: sun6i-a31-mipi-dphy: Add the A100 DPHY variant
dt-bindings: sun6i-a31-mipi-dphy: Add the interrupts property
phy: qcom-qmp-pcie: drop redundant clock allocation
phy: qcom-qmp-usb: drop redundant clock allocation
phy: qcom-qmp: drop unused type header
phy: qcom-qmp-usb: drop sc8280xp reference-clock source
dt-bindings: phy: qcom,sc8280xp-qmp-usb3-uni: drop reference-clock source
phy: qcom-qmp-combo: add support for updated sc8280xp binding
phy: qcom-qmp-combo: rename DP_PHY register pointer
phy: qcom-qmp-combo: rename common-register pointers
phy: qcom-qmp-combo: clean up DP clock callbacks
phy: qcom-qmp-combo: separate clock and provider registration
phy: qcom-qmp-combo: add clock registration helper
...
Since commit fc7a6209d5 ("bus: Make remove callback return void")
forces bus_type::remove be void-returned, it doesn't make much sense for
any bus based driver implementing remove callbalk to return non-void to
its caller.
This change is for xen bus based drivers.
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Dawei Li <set_pte_at@outlook.com>
Link: https://lore.kernel.org/r/TYCP286MB23238119AB4DF190997075C9CAE39@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM
Signed-off-by: Juergen Gross <jgross@suse.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmOYpTIUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxuZhAAhGjE8voLZeOYwxbvfL69hGTAZ+Me
x2hqRWVhh/IGWXTTaoSLwSjMMokcmAKN5S/wv8qdCG5sB8EN8FyTBIZDy8PuRRdl
8UlqlBMSL+d4oSRDCnYLxFNcynLRNnmx2dfcdw9tJ4zjTLN8Y4o8PHFogR6pJ3MT
sDC8S0myTQKXr4wAGzTZycKsiGManviYtByp6dCcKD3Oy5Q2uZ9OKO2DP2yQpn+F
c3IJSV9oDz3KR8JVJ5Q1iz9cdMXbGwjkM3JLlHpxhedwjN4ErLumPutKcebtzO5C
aTqabN7Nnzc4yJusAIfojFCWH7fgaYUyJ3pxcFyJ4tu4m9Last+2I5UB/kV2sYAD
jWiCYx3sA/mRopNXOnrBGae+Lgy+sQnt8or0grySr0bK+b+ArAGis4uT4A0uASGO
RUQdIQwz7zhHeQrwAladHWxnx4BEDNCatgfn38p4fklIYKydCY5nfZURMDvHezSR
G6Nu08hoE9ZXlmkWTFw+5F23wPWKcCpzZj0hf7OroIouXUp8vqSFSqatH5vGkbCl
bDswck9GdRJ2hl5SvFOeelaXkM42du45TMLU2JmIn6dYYFNrO93JgdvKSU7E2CpG
AmDIpg1Idxo8fEPPGH1I7RVU5+ilzmmPQQY7poQW+va4/dEd/QVp1+ZZTDnMC1qk
qi3ck22VdvPU2VU=
=KULr
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Enumeration:
- Squash portdrv_{core,pci}.c into portdrv.c to ease maintenance and
make more things static.
- Make portdrv bind to Switch Ports that have AER. Previously, if
these Ports lacked MSI/MSI-X, portdrv failed to bind, which meant
the Ports couldn't be suspended to low-power states. AER on these
Ports doesn't use interrupts, and the AER driver doesn't need to
claim them.
- Assign PCI domain IDs using ida_alloc(), which makes host bridge
add/remove work better.
Resource management:
- To work better with recent BIOSes that use EfiMemoryMappedIO for
PCI host bridge apertures, remove those regions from the E820 map
(E820 entries normally prevent us from allocating BARs). In v5.19,
we added some quirks to disable E820 checking, but that's not very
maintainable. EfiMemoryMappedIO means the OS needs to map the
region for use by EFI runtime services; it shouldn't prevent OS
from using it.
PCIe native device hotplug:
- Build pciehp by default if USB4 is enabled, since Thunderbolt/USB4
PCIe tunneling depends on native PCIe hotplug.
- Enable Command Completed Interrupt only if supported to avoid user
confusion from lspci output that says this is enabled but not
supported.
- Prevent pciehp from binding to Switch Upstream Ports; this happened
because of interaction with acpiphp and caused devices below the
Upstream Port to disappear.
Power management:
- Convert AGP drivers to generic power management. We hope to remove
legacy power management from the PCI core eventually.
Virtualization:
- Fix pci_device_is_present(), which previously always returned
"false" for VFs, causing virtio hangs when unbinding the driver.
Miscellaneous:
- Convert drivers to gpiod API to prepare for dropping some legacy
code.
- Fix DOE fencepost error for the maximum data object length.
Baikal-T1 PCIe controller driver:
- Add driver and DT bindings.
Broadcom STB PCIe controller driver:
- Enable Multi-MSI.
- Delay 100ms after PERST# deassert to allow power and clocks to
stabilize.
- Configure Read Completion Boundary to 64 bytes.
Freescale i.MX6 PCIe controller driver:
- Initialize PHY before deasserting core reset to fix a regression in
v6.0 on boards where the PHY provides the reference.
- Fix imx6sx and imx8mq clock names in DT schema.
Intel VMD host bridge driver:
- Fix Secondary Bus Reset on VMD bridges, which allows reset of NVMe
SSDs in VT-d pass-through scenarios.
- Disable MSI remapping, which gets re-enabled by firmware during
suspend/resume.
MediaTek PCIe Gen3 controller driver:
- Add MT7986 and MT8195 support.
Qualcomm PCIe controller driver:
- Add SC8280XP/SA8540P basic interconnect support.
Rockchip DesignWare PCIe controller driver:
- Base DT schema on common Synopsys schema.
Synopsys DesignWare PCIe core:
- Collect DT items shared between Root Port and Endpoint (PERST GPIO,
PHY info, clocks, resets, link speed, number of lanes, number of
iATU windows, interrupt info, etc) to snps,dw-pcie-common.yaml.
- Add dma-ranges support for Root Ports and Endpoints.
- Consolidate DT resource retrieval for "dbi", "dbi2", "atu", etc. to
reduce code duplication.
- Add generic names for clocks and resets to encourage more
consistent naming across drivers using DesignWare IP.
- Stop advertising PTM Responder role for Endpoints, which aren't
allowed to be responders.
TI J721E PCIe driver:
- Add j721s2 host mode ID to DT schema.
- Add interrupt properties to DT schema.
Toshiba Visconti PCIe controller driver:
- Fix interrupts array max constraints in DT schema"
* tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (95 commits)
x86/PCI: Use pr_info() when possible
x86/PCI: Fix log message typo
x86/PCI: Tidy E820 removal messages
PCI: Skip allocate_resource() if too little space available
efi/x86: Remove EfiMemoryMappedIO from E820 map
PCI/portdrv: Allow AER service only for Root Ports & RCECs
PCI: xilinx-nwl: Fix coding style violations
PCI: mvebu: Switch to using gpiod API
PCI: pciehp: Enable Command Completed Interrupt only if supported
PCI: aardvark: Switch to using devm_gpiod_get_optional()
dt-bindings: PCI: mediatek-gen3: add support for mt7986
dt-bindings: PCI: mediatek-gen3: add SoC based clock config
dt-bindings: PCI: qcom: Allow 'dma-coherent' property
PCI: mt7621: Add sentinel to quirks table
PCI: vmd: Fix secondary bus reset for Intel bridges
PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning
PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db
PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32)
PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member
PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path
...
iommufd is the user API to control the IOMMU subsystem as it relates to
managing IO page tables that point at user space memory.
It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
container) which is the VFIO specific interface for a similar idea.
We see a broad need for extended features, some being highly IOMMU device
specific:
- Binding iommu_domain's to PASID/SSID
- Userspace IO page tables, for ARM, x86 and S390
- Kernel bypassed invalidation of user page tables
- Re-use of the KVM page table in the IOMMU
- Dirty page tracking in the IOMMU
- Runtime Increase/Decrease of IOPTE size
- PRI support with faults resolved in userspace
Many of these HW features exist to support VM use cases - for instance the
combination of PASID, PRI and Userspace IO Page Tables allows an
implementation of DMA Shared Virtual Addressing (vSVA) within a
guest. Dirty tracking enables VM live migration with SRIOV devices and
PASID support allow creating "scalable IOV" devices, among other things.
As these features are fundamental to a VM platform they need to be
uniformly exposed to all the driver families that do DMA into VMs, which
is currently VFIO and VDPA.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY5ct7wAKCRCFwuHvBreF
YZZ5AQDciXfcgXLt0UBEmWupNb0f/asT6tk717pdsKm8kAZMNAEAsIyLiKT5HqGl
s7fAu+CQ1pr9+9NKGevD+frw8Solsw4=
=jJkd
-----END PGP SIGNATURE-----
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd implementation from Jason Gunthorpe:
"iommufd is the user API to control the IOMMU subsystem as it relates
to managing IO page tables that point at user space memory.
It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO
container) which is the VFIO specific interface for a similar idea.
We see a broad need for extended features, some being highly IOMMU
device specific:
- Binding iommu_domain's to PASID/SSID
- Userspace IO page tables, for ARM, x86 and S390
- Kernel bypassed invalidation of user page tables
- Re-use of the KVM page table in the IOMMU
- Dirty page tracking in the IOMMU
- Runtime Increase/Decrease of IOPTE size
- PRI support with faults resolved in userspace
Many of these HW features exist to support VM use cases - for instance
the combination of PASID, PRI and Userspace IO Page Tables allows an
implementation of DMA Shared Virtual Addressing (vSVA) within a guest.
Dirty tracking enables VM live migration with SRIOV devices and PASID
support allow creating "scalable IOV" devices, among other things.
As these features are fundamental to a VM platform they need to be
uniformly exposed to all the driver families that do DMA into VMs,
which is currently VFIO and VDPA"
For more background, see the extended explanations in Jason's pull request:
https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits)
iommufd: Change the order of MSI setup
iommufd: Improve a few unclear bits of code
iommufd: Fix comment typos
vfio: Move vfio group specific code into group.c
vfio: Refactor dma APIs for emulated devices
vfio: Wrap vfio group module init/clean code into helpers
vfio: Refactor vfio_device open and close
vfio: Make vfio_device_open() truly device specific
vfio: Swap order of vfio_device_container_register() and open_device()
vfio: Set device->group in helper function
vfio: Create wrappers for group register/unregister
vfio: Move the sanity check of the group to vfio_create_group()
vfio: Simplify vfio_create_group()
iommufd: Allow iommufd to supply /dev/vfio/vfio
vfio: Make vfio_container optionally compiled
vfio: Move container related MODULE_ALIAS statements into container.c
vfio-iommufd: Support iommufd for emulated VFIO devices
vfio-iommufd: Support iommufd for physical VFIO devices
vfio-iommufd: Allow iommufd to be used in place of a container fd
vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent()
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOScsgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpi5ID/9pLXFYOq1+uDjU0KO/MdjMjK8Ukr34lCnk
WkajRLheE8JBKOFDE54XJk56sQSZHX9bTWqziar0h1fioh7FlQR/tVvzsERCm2M9
2y9THJNJygC68wgybStyiKlshFjl7TD7Kv5N9Y3xP3mkQygT+D6o8fXZk5xQbYyH
YdFSoq4rJVHxRL03yzQiReGGIYdOUEQQh8l1FiLwLlKa3lXAey1KuxWIzksVN0KK
aZB4QhiBpOiPgDHUVisq2XtyQjpZ2byoCImPzgrcqk9Jo4esvm/e6esrg4xlsvII
LKFFkTmbVqjUZtFjqakFHmfuzVor4nU5f+xb90ZHExuuODYckkxWp5rWhf9QwqqI
0ik6WYgI1/5vnHnX8f2DYzOFQf9qa/rLgg0CshyUODlD6RfHa9vntqYvlIFkmOBd
Q7KblIoK8YTzUS1M+v7X8JQ7gDR2KwygH37Da2KJS+vgvfIb8kJGr1ZORuhJuJJ7
Bl69gaNkHTHrqufp7UI64YXfueeuNu2J9z3zwzGoxeaFaofF/phDn0/2gCQE1fQI
XBhsMw+ETqI6B2SPHMnzYDu2DM1S8ZTOYQlaD4G3uqgWnAM1tG707395uAy5yu4n
D5azU1fVG4UocoNIyPujpaoSRs2zWZycEFEeUQkhyDDww/j4hlHi6H33eOnk0zsr
wxzFGfvHfw==
=k/vv
-----END PGP SIGNATURE-----
Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull requests via Christoph:
- Support some passthrough commands without CAP_SYS_ADMIN (Kanchan
Joshi)
- Refactor PCIe probing and reset (Christoph Hellwig)
- Various fabrics authentication fixes and improvements (Sagi
Grimberg)
- Avoid fallback to sequential scan due to transient issues (Uday
Shankar)
- Implement support for the DEAC bit in Write Zeroes (Christoph
Hellwig)
- Allow overriding the IEEE OUI and firmware revision in configfs
for nvmet (Aleksandr Miloserdov)
- Force reconnect when number of queue changes in nvmet (Daniel
Wagner)
- Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi
Grimberg, Christoph Hellwig, Christophe JAILLET)
- Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni)
- Use the common tagset helpers in nvme-pci driver (Christoph
Hellwig)
- Cleanup the nvme-pci removal path (Christoph Hellwig)
- Use kstrtobool() instead of strtobool (Christophe JAILLET)
- Allow unprivileged passthrough of Identify Controller (Joel
Granados)
- Support io stats on the mpath device (Sagi Grimberg)
- Minor nvmet cleanup (Sagi Grimberg)
- MD pull requests via Song:
- Code cleanups (Christoph)
- Various fixes
- Floppy pull request from Denis:
- Fix a memory leak in the init error path (Yuan)
- Series fixing some batch wakeup issues with sbitmap (Gabriel)
- Removal of the pktcdvd driver that was deprecated more than 5 years
ago, and subsequent removal of the devnode callback in struct
block_device_operations as no users are now left (Greg)
- Fix for partition read on an exclusively opened bdev (Jan)
- Series of elevator API cleanups (Jinlong, Christoph)
- Series of fixes and cleanups for blk-iocost (Kemeng)
- Series of fixes and cleanups for blk-throttle (Kemeng)
- Series adding concurrent support for sync queues in BFQ (Yu)
- Series bringing drbd a bit closer to the out-of-tree maintained
version (Christian, Joel, Lars, Philipp)
- Misc drbd fixes (Wang)
- blk-wbt fixes and tweaks for enable/disable (Yu)
- Fixes for mq-deadline for zoned devices (Damien)
- Add support for read-only and offline zones for null_blk
(Shin'ichiro)
- Series fixing the delayed holder tracking, as used by DM (Yu,
Christoph)
- Series enabling bio alloc caching for IRQ based IO (Pavel)
- Series enabling userspace peer-to-peer DMA (Logan)
- BFQ waker fixes (Khazhismel)
- Series fixing elevator refcount issues (Christoph, Jinlong)
- Series cleaning up references around queue destruction (Christoph)
- Series doing quiesce by tagset, enabling cleanups in drivers
(Christoph, Chao)
- Series untangling the queue kobject and queue references (Christoph)
- Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye,
Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph)
* tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits)
blktrace: Fix output non-blktrace event when blk_classic option enabled
block: sed-opal: Don't include <linux/kernel.h>
sed-opal: allow using IOC_OPAL_SAVE for locking too
blk-cgroup: Fix typo in comment
block: remove bio_set_op_attrs
nvmet: don't open-code NVME_NS_ATTR_RO enumeration
nvme-pci: use the tagset alloc/free helpers
nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
nvme: consolidate setting the tagset flags
nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
block: bio_copy_data_iter
nvme-pci: split out a nvme_pci_ctrl_is_dead helper
nvme-pci: return early on ctrl state mismatch in nvme_reset_work
nvme-pci: rename nvme_disable_io_queues
nvme-pci: cleanup nvme_suspend_queue
nvme-pci: remove nvme_pci_disable
nvme-pci: remove nvme_disable_admin_queue
nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl
nvme: use nvme_wait_ready in nvme_shutdown_ctrl
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe
A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw
bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I
RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX
WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS
waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT
ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6
/ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI
PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX
RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4
CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq
APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE=
=QRhK
-----END PGP SIGNATURE-----
Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
- Replace prandom_u32_max() and various open-coded variants of it,
there is now a new family of functions that uses fast rejection
sampling to choose properly uniformly random numbers within an
interval:
get_random_u32_below(ceil) - [0, ceil)
get_random_u32_above(floor) - (floor, U32_MAX]
get_random_u32_inclusive(floor, ceil) - [floor, ceil]
Coccinelle was used to convert all current users of
prandom_u32_max(), as well as many open-coded patterns, resulting in
improvements throughout the tree.
I'll have a "late" 6.1-rc1 pull for you that removes the now unused
prandom_u32_max() function, just in case any other trees add a new
use case of it that needs to converted. According to linux-next,
there may be two trivial cases of prandom_u32_max() reintroductions
that are fixable with a 's/.../.../'. So I'll have for you a final
conversion patch doing that alongside the removal patch during the
second week.
This is a treewide change that touches many files throughout.
- More consistent use of get_random_canary().
- Updates to comments, documentation, tests, headers, and
simplification in configuration.
- The arch_get_random*_early() abstraction was only used by arm64 and
wasn't entirely useful, so this has been replaced by code that works
in all relevant contexts.
- The kernel will use and manage random seeds in non-volatile EFI
variables, refreshing a variable with a fresh seed when the RNG is
initialized. The RNG GUID namespace is then hidden from efivarfs to
prevent accidental leakage.
These changes are split into random.c infrastructure code used in the
EFI subsystem, in this pull request, and related support inside of
EFISTUB, in Ard's EFI tree. These are co-dependent for full
functionality, but the order of merging doesn't matter.
- Part of the infrastructure added for the EFI support is also used for
an improvement to the way vsprintf initializes its siphash key,
replacing an sleep loop wart.
- The hardware RNG framework now always calls its correct random.c
input function, add_hwgenerator_randomness(), rather than sometimes
going through helpers better suited for other cases.
- The add_latent_entropy() function has long been called from the fork
handler, but is a no-op when the latent entropy gcc plugin isn't
used, which is fine for the purposes of latent entropy.
But it was missing out on the cycle counter that was also being mixed
in beside the latent entropy variable. So now, if the latent entropy
gcc plugin isn't enabled, add_latent_entropy() will expand to a call
to add_device_randomness(NULL, 0), which adds a cycle counter,
without the absent latent entropy variable.
- The RNG is now reseeded from a delayed worker, rather than on demand
when used. Always running from a worker allows it to make use of the
CPU RNG on platforms like S390x, whose instructions are too slow to
do so from interrupts. It also has the effect of adding in new inputs
more frequently with more regularity, amounting to a long term
transcript of random values. Plus, it helps a bit with the upcoming
vDSO implementation (which isn't yet ready for 6.2).
- The jitter entropy algorithm now tries to execute on many different
CPUs, round-robining, in hopes of hitting even more memory latencies
and other unpredictable effects. It also will mix in a cycle counter
when the entropy timer fires, in addition to being mixed in from the
main loop, to account more explicitly for fluctuations in that timer
firing. And the state it touches is now kept within the same cache
line, so that it's assured that the different execution contexts will
cause latencies.
* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
random: include <linux/once.h> in the right header
random: align entropy_timer_state to cache line
random: mix in cycle counter when jitter timer fires
random: spread out jitter callback to different CPUs
random: remove extraneous period and add a missing one in comments
efi: random: refresh non-volatile random seed when RNG is initialized
vsprintf: initialize siphash key using notifier
random: add back async readiness notifier
random: reseed in delayed work rather than on-demand
random: always mix cycle counter in add_latent_entropy()
hw_random: use add_hwgenerator_randomness() for early entropy
random: modernize documentation comment on get_random_bytes()
random: adjust comment to account for removed function
random: remove early archrandom abstraction
random: use random.trust_{bootloader,cpu} command line option only
stackprotector: actually use get_random_canary()
stackprotector: move get_random_canary() into stackprotector.h
treewide: use get_random_u32_inclusive() when possible
treewide: use get_random_u32_{above,below}() instead of manual loop
treewide: use get_random_u32_below() instead of deprecated function
...
- Add the cpu_cache_invalidate_memregion() API for cache flushing in
response to physical memory reconfiguration, or memory-side data
invalidation from operations like secure erase or memory-device unlock.
- Add a facility for the kernel to warn about collisions between kernel
and userspace access to PCI configuration registers
- Add support for Restricted CXL Host (RCH) topologies (formerly CXL 1.1)
- Add handling and reporting of CXL errors reported via the PCIe AER
mechanism
- Add support for CXL Persistent Memory Security commands
- Add support for the "XOR" algorithm for CXL host bridge interleave
- Rework / simplify CXL to NVDIMM interactions
- Miscellaneous cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY5UpyAAKCRDfioYZHlFs
Z0ttAP4uxCjIibKsFVyexpSgI4vaZqQ9yt9NesmPwonc0XookwD+PlwP6Xc0d0Ox
t0gJ6+pwdh11NRzhcNE1pAaPcJZU4gs=
=HAQk
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for 6.2.
While it may seem backwards, the CXL update this time around includes
some focus on CXL 1.x enabling where the work to date had been with
CXL 2.0 (VH topologies) in mind.
First generation CXL can mostly be supported via BIOS, similar to DDR,
however it became clear there are use cases for OS native CXL error
handling and some CXL 3.0 endpoint features can be deployed on CXL 1.x
hosts (Restricted CXL Host (RCH) topologies). So, this update brings
RCH topologies into the Linux CXL device model.
In support of the ongoing CXL 2.0+ enabling two new core kernel
facilities are added.
One is the ability for the kernel to flag collisions between userspace
access to PCI configuration registers and kernel accesses. This is
brought on by the PCIe Data-Object-Exchange (DOE) facility, a hardware
mailbox over config-cycles.
The other is a cpu_cache_invalidate_memregion() API that maps to
wbinvd_on_all_cpus() on x86. To prevent abuse it is disabled in guest
VMs and architectures that do not support it yet. The CXL paths that
need it, dynamic memory region creation and security commands (erase /
unlock), are disabled when it is not present.
As for the CXL 2.0+ this cycle the subsystem gains support Persistent
Memory Security commands, error handling in response to PCIe AER
notifications, and support for the "XOR" host bridge interleave
algorithm.
Summary:
- Add the cpu_cache_invalidate_memregion() API for cache flushing in
response to physical memory reconfiguration, or memory-side data
invalidation from operations like secure erase or memory-device
unlock.
- Add a facility for the kernel to warn about collisions between
kernel and userspace access to PCI configuration registers
- Add support for Restricted CXL Host (RCH) topologies (formerly CXL
1.1)
- Add handling and reporting of CXL errors reported via the PCIe AER
mechanism
- Add support for CXL Persistent Memory Security commands
- Add support for the "XOR" algorithm for CXL host bridge interleave
- Rework / simplify CXL to NVDIMM interactions
- Miscellaneous cleanups and fixes"
* tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (71 commits)
cxl/region: Fix memdev reuse check
cxl/pci: Remove endian confusion
cxl/pci: Add some type-safety to the AER trace points
cxl/security: Drop security command ioctl uapi
cxl/mbox: Add variable output size validation for internal commands
cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size
cxl/security: Fix Get Security State output payload endian handling
cxl: update names for interleave ways conversion macros
cxl: update names for interleave granularity conversion macros
cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry
tools/testing/cxl: Require cache invalidation bypass
cxl/acpi: Fail decoder add if CXIMS for HBIG is missing
cxl/region: Fix spelling mistake "memergion" -> "memregion"
cxl/regs: Fix sparse warning
cxl/acpi: Set ACPI's CXL _OSC to indicate RCD mode support
tools/testing/cxl: Add an RCH topology
cxl/port: Add RCD endpoint port enumeration
cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
tools/testing/cxl: Add XOR Math support to cxl_test
cxl/acpi: Support CXL XOR Interleave Math (CXIMS)
...
- Core:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for PCI/MSI[-X]
and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows device
manufactures to provide implementation defined storage for MSI messages
contrary to the uniform and specification defined storage mechanisms for
PCI/MSI and PCI/MSI-X. IMS not only allows to overcome the size limitations
of the MSI-X table, but also gives the device manufacturer the freedom to
store the message in arbitrary places, even in host memory which is shared
with the device.
There have been several attempts to glue this into the current MSI code,
but after lengthy discussions it turned out that there is a fundamental
design problem in the current PCI/MSI-X implementation. This needs some
historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management was
completely different from what we have today in the actively developed
architectures. Interrupt management was completely architecture specific
and while there were attempts to create common infrastructure the
commonalities were rudimentary and just providing shared data structures and
interfaces so that drivers could be written in an architecture agnostic
way.
The initial PCI/MSI[-X] support obviously plugged into this model which
resulted in some basic shared infrastructure in the PCI core code for
setting up MSI descriptors, which are a pure software construct for holding
data relevant for a particular MSI interrupt, but the actual association to
Linux interrupts was completely architecture specific. This model is still
supported today to keep museum architectures and notorious stranglers
alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel,
which was creating yet another architecture specific mechanism and resulted
in an unholy mess on top of the existing horrors of x86 interrupt handling.
The x86 interrupt management code was already an incomprehensible maze of
indirections between the CPU vector management, interrupt remapping and the
actual IO/APIC and PCI/MSI[-X] implementation.
At roughly the same time ARM struggled with the ever growing SoC specific
extensions which were glued on top of the architected GIC interrupt
controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86 vector
domain and interrupt remapping and also allowed ARM to handle the zoo of
SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that it is
not available the PCI/MSI[-X] domains have the vector domain as their
parent. This reduced the required interaction between the domains pretty
much to the initialization phase where it is obviously required to
establish the proper parent relation ship in the components of the
hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware
it's clear that the actual PCI/MSI[-X] interrupt controller is not a global
entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the easy
solution of providing "global" PCI/MSI domains which was possible because
the PCI/MSI[-X] handling is uniform across the devices. This also allowed
to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in
turn made it simple to keep the existing architecture specific management
alive.
A similar problem was created in the ARM world with support for IP block
specific message storage. Instead of going all the way to stack a IP block
specific domain on top of the generic MSI domain this ended in a construct
which provides a "global" platform MSI domain which allows overriding the
irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the MSI
infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into the
existing infrastructure. Some of this just works by chance on particular
platforms but will fail in hard to diagnose ways when the driver is used
on platforms where the underlying MSI interrupt management code does not
expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront to
avoid resource exhaustion. They are expanded at run-time when the guest
actually tries to use them. The way how this is implemented is that the
host disables MSI-X and then re-enables it with a larger number of
vectors again. That works by chance because most device drivers set up
all interrupts before the device actually will utilize them. But that's
not universally true because some drivers allocate a large enough number
of vectors but do not utilize them until it's actually required,
e.g. for acceleration support. But at that point other interrupts of the
device might be in active use and the MSI-X disable/enable dance can
just result in losing interrupts and therefore hard to diagnose subtle
problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS
is not longer providing a uniform storage and configuration model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for PCI/IMS.
PCI/IMS has been verified with the work in progress IDXD driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
- Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUsygTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYXiD/40tXKzCzf0qFIqUlZLia1N3RRrwrNC
DVTixuLtR9MrjwE+jWLQILa85SHInV8syXHSd35SzhsGDxkURFGi+HBgVWmysODf
br9VSh3Gi+kt7iXtIwAg8WNWviGNmS3kPksxCko54F0YnJhMY5r5bhQVUBQkwFG2
wES1C9Uzd4pdV2bl24Z+WKL85cSmZ+pHunyKw1n401lBABXnTF9c4f13zC14jd+y
wDxNrmOxeL3mEH4Pg6VyrDuTOURSf3TjJjeEq3EYqvUo0FyLt9I/cKX0AELcZQX7
fkRjrQQAvXNj39RJfeSkojDfllEPUHp7XSluhdBu5aIovSamdYGCDnuEoZ+l4MJ+
CojIErp3Dwj/uSaf5c7C3OaDAqH2CpOFWIcrUebShJE60hVKLEpUwd6W8juplaoT
gxyXRb1Y+BeJvO8VhMN4i7f3232+sj8wuj+HTRTTbqMhkElnin94tAx8rgwR1sgR
BiOGMJi4K2Y8s9Rqqp0Dvs01CW4guIYvSR4YY+WDbbi1xgiev89OYs6zZTJCJe4Y
NUwwpqYSyP1brmtdDdBOZLqegjQm+TwUb6oOaasFem4vT1swgawgLcDnPOx45bk5
/FWt3EmnZxMz99x9jdDn1+BCqAZsKyEbEY1avvhPVMTwoVIuSX2ceTBMLseGq+jM
03JfvdxnueM3gw==
=9erA
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt core and driver subsystem:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for
PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows
device manufactures to provide implementation defined storage for MSI
messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified
message store which is uniform accross all devices). The PCI/MSI[-X]
uniformity allowed us to get away with "global" PCI/MSI domains.
IMS not only allows to overcome the size limitations of the MSI-X
table, but also gives the device manufacturer the freedom to store the
message in arbitrary places, even in host memory which is shared with
the device.
There have been several attempts to glue this into the current MSI
code, but after lengthy discussions it turned out that there is a
fundamental design problem in the current PCI/MSI-X implementation.
This needs some historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management
was completely different from what we have today in the actively
developed architectures. Interrupt management was completely
architecture specific and while there were attempts to create common
infrastructure the commonalities were rudimentary and just providing
shared data structures and interfaces so that drivers could be written
in an architecture agnostic way.
The initial PCI/MSI[-X] support obviously plugged into this model
which resulted in some basic shared infrastructure in the PCI core
code for setting up MSI descriptors, which are a pure software
construct for holding data relevant for a particular MSI interrupt,
but the actual association to Linux interrupts was completely
architecture specific. This model is still supported today to keep
museum architectures and notorious stragglers alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the
kernel, which was creating yet another architecture specific mechanism
and resulted in an unholy mess on top of the existing horrors of x86
interrupt handling. The x86 interrupt management code was already an
incomprehensible maze of indirections between the CPU vector
management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X]
implementation.
At roughly the same time ARM struggled with the ever growing SoC
specific extensions which were glued on top of the architected GIC
interrupt controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86
vector domain and interrupt remapping and also allowed ARM to handle
the zoo of SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that
it is not available the PCI/MSI[-X] domains have the vector domain as
their parent. This reduced the required interaction between the
domains pretty much to the initialization phase where it is obviously
required to establish the proper parent relation ship in the
components of the hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the
hardware it's clear that the actual PCI/MSI[-X] interrupt controller
is not a global entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the
easy solution of providing "global" PCI/MSI domains which was possible
because the PCI/MSI[-X] handling is uniform across the devices. This
also allowed to keep the existing PCI/MSI[-X] infrastructure mostly
unchanged which in turn made it simple to keep the existing
architecture specific management alive.
A similar problem was created in the ARM world with support for IP
block specific message storage. Instead of going all the way to stack
a IP block specific domain on top of the generic MSI domain this ended
in a construct which provides a "global" platform MSI domain which
allows overriding the irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the
MSI infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into
the existing infrastructure. Some of this just works by chance on
particular platforms but will fail in hard to diagnose ways when the
driver is used on platforms where the underlying MSI interrupt
management code does not expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront
to avoid resource exhaustion. They are expanded at run-time when the
guest actually tries to use them. The way how this is implemented is
that the host disables MSI-X and then re-enables it with a larger
number of vectors again. That works by chance because most device
drivers set up all interrupts before the device actually will utilize
them. But that's not universally true because some drivers allocate a
large enough number of vectors but do not utilize them until it's
actually required, e.g. for acceleration support. But at that point
other interrupts of the device might be in active use and the MSI-X
disable/enable dance can just result in losing interrupts and
therefore hard to diagnose subtle problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact
that IMS is not longer providing a uniform storage and configuration
model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per
device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for
PCI/IMS. PCI/IMS has been verified with the work in progress IDXD
driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place"
* tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits)
irqchip/ti-sci-inta: Fix kernel doc
irqchip/gic-v2m: Mark a few functions __init
irqchip/gic-v2m: Include arm-gic-common.h
irqchip/irq-mvebu-icu: Fix works by chance pointer assignment
iommu/amd: Enable PCI/IMS
iommu/vt-d: Enable PCI/IMS
x86/apic/msi: Enable PCI/IMS
PCI/MSI: Provide pci_ims_alloc/free_irq()
PCI/MSI: Provide IMS (Interrupt Message Store) support
genirq/msi: Provide constants for PCI/IMS support
x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN
PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X
PCI/MSI: Provide prepare_desc() MSI domain op
PCI/MSI: Split MSI-X descriptor setup
genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN
genirq/msi: Provide msi_domain_alloc_irq_at()
genirq/msi: Provide msi_domain_ops:: Prepare_desc()
genirq/msi: Provide msi_desc:: Msi_data
genirq/msi: Provide struct msi_map
x86/apic/msi: Remove arch_create_remap_msi_irq_domain()
...
- Switch to using devm_gpiod_get_optional() so we can stop exporting
devm_gpiod_get_from_of_node() (Dmitry Torokhov)
* pci/ctrl/aardvark:
PCI: aardvark: Switch to using devm_gpiod_get_optional()
- Restore MSI remapping configuration during resume because the
configuration is cleared out by firmware when suspending (Nirmal Patel)
- Reset the hierarchy below VMD when probing the VMD; we attempted this
before, but with the wrong device, so it didn't work (Francisco Munoz)
* remotes/lorenzo/pci/vmd:
PCI: vmd: Fix secondary bus reset for Intel bridges
PCI: vmd: Disable MSI remapping after suspend
- Switch from devm_gpiod_get_from_of_node() to devm_fwnode_gpiod_get()
(Dmitry Torokhov)
* remotes/lorenzo/pci/tegra:
PCI: tegra: Switch to using devm_fwnode_gpiod_get
- Add sentinel to mt7621_pcie_quirks_match[] to prevent oops when parsing
the table (John Thomson)
* remotes/lorenzo/pci/mt7621:
PCI: mt7621: Add sentinel to quirks table
- Enable Multi-MSI (Jim Quinlan)
- Wait for 100ms after PERST# deassert for power and clocks to stabilize
(Jim Quinlan)
- Use readl_poll_timeout_atomic() instead of hand-rolled timeout loop (Jim
Quinlan)
- Drop needless "inline" annotations (Jim Quinlan)
- Set RCB_MPS mode bit so data for reads up to MPS are returned in a single
completion (Jim Quinlan)
* remotes/lorenzo/pci/brcmstb:
PCI: brcmstb: Set RCB_{MPS,64B}_MODE bits
PCI: brcmstb: Drop needless 'inline' annotations
PCI: brcmstb: Replace status loops with read_poll_timeout_atomic()
PCI: brcmstb: Wait for 100ms following PERST# deassert
PCI: brcmstb: Enable Multi-MSI
- Remove EfiMemoryMappedIO regions from the E820 map to allow PCI core to
allocate BARs from them. The only purpose of EfiMemoryMappedIO is to
tell the OS to map things needed by EFI runtime services, so it's often
used for PCI host bridge apertures. If we can't allocate from those
apertures, we can't hot-add devices (Bjorn Helgaas)
* pci/resource:
x86/PCI: Use pr_info() when possible
x86/PCI: Fix log message typo
x86/PCI: Tidy E820 removal messages
PCI: Skip allocate_resource() if too little space available
efi/x86: Remove EfiMemoryMappedIO from E820 map
- Squash portdrv_core.c and portdrv_pci.c into portdrv.c to make it easier
to find things (Bjorn Helgaas)
- Allow AER service only for Root Ports & RCECs so portdrv can successfully
bind to other devices that have AER but lack MSI (which they don't need
for AER), which allows power management for those devices (Bjorn Helgaas)
* pci/portdrv:
PCI/portdrv: Allow AER service only for Root Ports & RCECs
PCI/portdrv: Unexport pcie_port_service_register(), pcie_port_service_unregister()
PCI/portdrv: Move private things to portdrv.c
PCI/portdrv: Squash into portdrv.c
- Use METHOD_NAME__UID instead of plain string to make it easier to find
all uses (Yipeng Zou)
* pci/misc:
PCI/ACPI: Use METHOD_NAME__UID instead of plain string
- Enable pciehp by default if USB4 is enabled because USB4/Thunderbolt
tunneling depends on native PCIe hotplug (Albert Zhou)
- Make sure pciehp binds only to Downstream Ports, not Upstream Ports
(Rafael J. Wysocki)
- Remove unused get_mode1_ECC_cap callback in shpchp (Ian Cowan)
- Enable pciehp Command Completed Interrupt only if supported to reduce
confusion when looking at lspci output (Pali Rohár)
* pci/hotplug:
PCI: pciehp: Enable Command Completed Interrupt only if supported
PCI: shpchp: Remove unused get_mode1_ECC_cap callback
PCI: acpiphp: Avoid setting is_hotplug_bridge for PCIe Upstream Ports
PCI/portdrv: Set PCIE_PORT_SERVICE_HP for Root and Downstream Ports only
PCI: pciehp: Enable by default if USB4 enabled
- Only read/write PCIe Link 2 registers for devices with Links and PCIe
Capability version >= 2 (Maciej W. Rozycki)
- Revert a patch that cleared PCI_STATUS during enumeration because it
broke Linux guests on Apple's virtualization framework (Bjorn Helgaas)
- Assign PCI domain IDs using IDAs so IDs can be easily reused after
loading/unloading host bridge drivers (Pali Rohár)
- Fix pci_device_is_present(), which previously always returned "false" for
VFs because their vendor ID is always 0xfff (Michael S. Tsirkin)
- Check for alloc failure in pci_request_irq() (Zeng Heng)
* pci/enumeration:
PCI: Check for alloc failure in pci_request_irq()
PCI: Fix pci_device_is_present() for VFs by checking PF
PCI: Assign PCI domain IDs by ida_alloc()
Revert "PCI: Clear PCI_STATUS when setting up device"
PCI: Access Link 2 registers only for devices with Links
pci_bus_alloc_from_region() allocates MMIO space by iterating through all
the resources available on the bus. The available resource might be
reduced if the caller requires 32-bit space or we're avoiding BIOS or E820
areas.
Don't bother calling allocate_resource() if we need more space than is
available in this resource. This prevents some pointless and annoying
messages about avoided areas.
Link: https://lore.kernel.org/r/20221208190341.1560157-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Previously portdrv allowed the AER service for any device with an AER
capability (assuming Linux had control of AER) even though the AER service
driver only attaches to Root Port and RCECs.
Because get_port_device_capability() included AER for non-RP, non-RCEC
devices, we tried to initialize the AER IRQ even though these devices
don't generate AER interrupts.
Intel DG1 and DG2 discrete graphics cards contain a switch leading to a
GPU. The switch supports AER but not MSI, so initializing an AER IRQ
failed, and portdrv failed to claim the switch port at all. The GPU itself
could be suspended, but the switch could not be put in a low-power state
because it had no driver.
Don't allow the AER service on non-Root Port, non-Root Complex Event
Collector devices. This means we won't enable Bus Mastering if the device
doesn't require MSI, the AER service will not appear in sysfs, and the AER
service driver will not bind to the device.
Link: https://lore.kernel.org/r/20221207084105.84947-1-mika.westerberg@linux.intel.com
Link: https://lore.kernel.org/r/20221210002922.1749403-1-helgaas@kernel.org
Based-on-patch-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Switch the driver away from legacy gpio/of_gpio API to gpiod API, and
remove use of of_get_named_gpio_flags() which I want to make private to
gpiolib.
Link: https://lore.kernel.org/r/Y5EAft42YiT66mVj@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The No Command Completed Support bit in the Slot Capabilities register
indicates whether Command Completed Interrupt Enable is unsupported.
We already check whether No Command Completed Support bit is set in
pcie_wait_cmd(), and do not wait in this case.
Don't enable this Command Completed Interrupt at all if NCCS is set, so
that when users dump configuration space from userspace, the dump does not
confuse them by saying that Command Completed Interrupt is not supported,
but it is enabled.
Link: https://lore.kernel.org/r/20220927141926.8895-2-kabel@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Switch the driver to the generic version of gpiod API (and away from
OF-specific variant), so that we can stop exporting
devm_gpiod_get_from_of_node().
Link: https://lore.kernel.org/r/Y3KMEZFv6dpxA+Gv@google.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Pali Rohár <pali@kernel.org>
Current driver is missing a sentinel in the struct soc_device_attribute
array, which causes an oops when assessed by the
soc_device_match(mt7621_pcie_quirks_match) call.
This was only exposed once the CONFIG_SOC_MT7621 mt7621 soc_dev_attr
was fixed to register the SOC as a device, in:
commit 7c18b64bba ("mips: ralink: mt7621: do not use kzalloc too early")
Fix it by adding the required sentinel.
Link: https://lore.kernel.org/lkml/26ebbed1-0fe9-4af9-8466-65f841d0b382@app.fastmail.com
Link: https://lore.kernel.org/r/20221205204645.301301-1-git@johnthomson.fastmail.com.au
Fixes: b483b4e4d3 ("staging: mt7621-pci: add quirks for 'E2' revision using 'soc_device_attribute'")
Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
The reset was never applied in the current implementation because Intel
Bridges owned by VMD are parentless. Internally, pci_reset_bus() applies
a reset to the parent of the PCI device supplied as argument, but in this
case it failed because there wasn't a parent.
In more detail, this change allows the VMD driver to enumerate NVMe devices
in pass-through configurations when guest reboots are performed. There was
an attempted to fix this, but later we discovered that the code inside
pci_reset_bus() wasn’t triggering secondary bus resets. Therefore, we
updated the parameters passed to it, and now NVMe SSDs attached to VMD
bridges are properly enumerated in VT-d pass-through scenarios.
Link: https://lore.kernel.org/r/20221206001637.4744-1-francisco.munoz.ruiz@linux.intel.com
Fixes: 6aab562229 ("PCI: vmd: Clean up domain before enumeration")
Signed-off-by: Francisco Munoz <francisco.munoz.ruiz@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com>
Reviewed-by: Jonathan Derrick <jonathan.derrick@linux.dev>
Single vector allocation which allocates the next free index in the IMS
space. The free function releases.
All allocated vectors are released also via pci_free_vectors() which is
also releasing MSI/MSI-X vectors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.961711347@linutronix.de
IMS (Interrupt Message Store) is a new specification which allows
implementation specific storage of MSI messages contrary to the
strict standard specified MSI and MSI-X message stores.
This requires new device specific interrupt domains to handle the
implementation defined storage which can be an array in device memory or
host/guest memory which is shared with hardware queues.
Add a function to create IMS domains for PCI devices. IMS domains are using
the new per device domain mechanism and are configured by the device driver
via a template. IMS domains are created as secondary device domains so they
work side on side with MSI[-X] on the same device.
The IMS domains have a few constraints:
- The index space is managed by the core code.
Device memory based IMS provides a storage array with a fixed size
which obviously requires an index. But there is no association between
index and functionality so the core can randomly allocate an index in
the array.
System memory based IMS does not have the concept of an index as the
storage is somewhere in memory. In that case the index is purely
software based to keep track of the allocations.
- There is no requirement for consecutive index ranges
This is currently a limitation of the MSI core and can be implemented
if there is a justified use case by changing the internal storage from
xarray to maple_tree. For now it's single vector allocation.
- The interrupt chip must provide the following callbacks:
- irq_mask()
- irq_unmask()
- irq_write_msi_msg()
- The interrupt chip must provide the following optional callbacks
when the irq_mask(), irq_unmask() and irq_write_msi_msg() callbacks
cannot operate directly on hardware, e.g. in the case that the
interrupt message store is in queue memory:
- irq_bus_lock()
- irq_bus_unlock()
These callbacks are invoked from preemptible task context and are
allowed to sleep. In this case the mandatory callbacks above just
store the information. The irq_bus_unlock() callback is supposed to
make the change effective before returning.
- Interrupt affinity setting is handled by the underlying parent
interrupt domain and communicated to the IMS domain via
irq_write_msi_msg(). IMS domains cannot have a irq_set_affinity()
callback. That's a reasonable restriction similar to the PCI/MSI
device domain implementations.
The domain is automatically destroyed when the PCI device is removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.904316841@linutronix.de
MSI-X vectors can be allocated after the initial MSI-X enablement, but this
needs explicit support of the underlying interrupt domains.
Provide a function to query the ability and functions to allocate/free
individual vectors post-enable.
The allocation can either request a specific index in the MSI-X table or
with the index argument MSI_ANY_INDEX it allocates the next free vector.
The return value is a struct msi_map which on success contains both index
and the Linux interrupt number. In case of failure index is negative and
the Linux interrupt number is 0.
The allocation function is for a single MSI-X index at a time as that's
sufficient for the most urgent use case VFIO to get rid of the 'disable
MSI-X, reallocate, enable-MSI-X' cycle which is prone to lost interrupts
and redirections to the legacy and obviously unhandled INTx.
As single index allocation is also sufficient for the use cases Jason
Gunthorpe pointed out: Allocation of a MSI-X or IMS vector for a network
queue. See Link below.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/all/20211126232735.547996838@linutronix.de
Link: https://lore.kernel.org/r/20221124232326.731233614@linutronix.de
The setup of MSI descriptors for PCI/MSI-X interrupts depends partially on
the MSI index for which the descriptor is initialized.
Dynamic MSI-X vector allocation post MSI-X enablement allows to allocate
vectors at a given index or at any free index in the available table
range. The latter requires that the descriptor is initialized after the
MSI core has chosen an index.
Implement the prepare_desc() op in the PCI/MSI-X specific msi_domain_ops
which is invoked before the core interrupt descriptor and the associated
Linux interrupt number is allocated.
That callback is also provided for the upcoming PCI/IMS implementations so
the implementation specific interrupt domain can do their domain specific
initialization of the MSI descriptors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.673658806@linutronix.de
The upcoming mechanism to allocate MSI-X vectors after enabling MSI-X needs
to share some of the MSI-X descriptor setup.
The regular descriptor setup on enable has the following code flow:
1) Allocate descriptor
2) Setup descriptor with PCI specific data
3) Insert descriptor
4) Allocate interrupts which in turn scans the inserted
descriptors
This cannot be easily changed because the PCI/MSI code needs to handle the
legacy architecture specific allocation model and the irq domain model
where quite some domains have the assumption that the above flow is how it
works.
Ideally the code flow should look like this:
1) Invoke allocation at the MSI core
2) MSI core allocates descriptor
3) MSI core calls back into the irq domain which fills in
the domain specific parts
This could be done for underlying parent MSI domains which support
post-enable allocation/free but that would create significantly different
code pathes for MSI/MSI-X enable.
Though for dynamic allocation which wants to share the allocation code with
the upcoming PCI/IMS support it's the right thing to do.
Split the MSI-X descriptor setup into the preallocation part which just sets
the index and fills in the horrible hack of virtual IRQs and the real PCI
specific MSI-X setup part which solely depends on the index in the
descriptor. This allows to provide a common dynamic allocation interface at
the MSI core level for both PCI/MSI-X and PCI/IMS.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.616292598@linutronix.de
The check for special MSI domains like VMD which prevents the interrupt
remapping code to overwrite device::msi::domain is not longer required and
has been replaced by an x86 specific version which is aware of MSI parent
domains.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232326.093093200@linutronix.de
Provide a template and the necessary callbacks to create PCI/MSI and
PCI/MSI-X domains.
The domains are created when MSI or MSI-X is enabled. The domain's lifetime
is either the device lifetime or in case that e.g. MSI-X was tried first
and failed, then the MSI-X domain is removed and a MSI domain is created as
both are mutually exclusive and reside in the default domain ID slot of the
per device domain pointer array.
Also expand pci_msi_domain_supports() to handle feature checks correctly
even in the case that the per device domain was not yet created by checking
the features supported by the MSI parent.
Add the necessary setup calls into the MSI and MSI-X enable code path.
These setup calls are backwards compatible. They return success when there
is no parent domain found, which means the existing global domains or the
legacy allocation path keep just working.
Co-developed-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232325.975388241@linutronix.de
The upcoming per device MSI domains will create different domains for MSI
and MSI-X. Split the write message function into MSI and MSI-X helpers so
they can be used by those new domain functions seperately.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124232325.857982142@linutronix.de
Switch to the new domain id aware interfaces to phase out the previous
ones. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124230314.455168748@linutronix.de
This reflects the functionality better. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124230314.103554618@linutronix.de
Use bullet-list RST syntax for kernel-doc parameters' flags and interrupt
mode descriptions. Otherwise Sphinx produces "Unexpected identation" errors
and warnings.
Fixes: 5c0997dc33 ("PCI/MSI: Move pci_alloc_irq_vectors() to api.c")
Fixes: 017239c8db ("PCI/MSI: Move pci_irq_vector() to api.c")
Fixes: be37b8428b ("PCI/MSI: Move pci_irq_get_affinity() to api.c")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Suggested-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ahmed S. Darwish <darwi@linutronix.de>
Link: https://lore.kernel.org/r/20221203100511.222136-1-bagasdotme@gmail.com
Some new devices such as CXL devices may want to record additional error
information on a corrected error. Add a callback to allow the PCI device
driver to do additional logging such as providing additional stats for user
space RAS monitoring.
For CXL device, this is actually a need due to CXL needing to write to the
CXL RAS capability structure correctable error status register in order to
clear the unmasked correctable errors. See CXL spec rev3.0 8.2.4.16.
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166984619233.2804404.3966368388544312674.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Resolve conflicts in drivers/vfio/vfio_main.c by using the iommfd version.
The rc fix was done a different way when iommufd patches reworked this
code.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The function hv_set_affinity was removed in commit 831c1ae7 ("PCI: hv:
Make the code arch neutral by adding arch specific interfaces").
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Link: https://lore.kernel.org/r/20221107171831.25283-1-olaf@aepfle.de
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Use epf_db[i] dereference instead of readl() because epf_db is
in memory allocated by dma_alloc_coherent(), not I/O.
Remove useless/duplicated readl() in the process.
Link: https://lore.kernel.org/r/20221102141014.1025893-7-Frank.Li@nxp.com
Signed-off-by: Frank Li <frank.li@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
NTB spad entry item size is sizeof(u32), replace hardcoded 4 with it.
Link: https://lore.kernel.org/r/20221102141014.1025893-6-Frank.Li@nxp.com
Signed-off-by: Frank Li <frank.li@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
epf_db_phy member in struct epf_ntb is not used, remove it.
Link: https://lore.kernel.org/r/20221102141014.1025893-5-Frank.Li@nxp.com
Signed-off-by: Frank Li <frank.li@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Replace pci_epc_mem_free_addr() with pci_epf_free_space() in the
error handle path to match pci_epf_alloc_space().
Link: https://lore.kernel.org/r/20221102141014.1025893-4-Frank.Li@nxp.com
Fixes: e35f56bb03 ("PCI: endpoint: Support NTB transfer between RC and EP")
Signed-off-by: Frank Li <frank.li@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cleanup warning found by scripts/kernel-doc.
Consolidate terms:
- host, host1 to HOST
- vhost, vHost, Vhost, VHOST2 to VHOST
Link: https://lore.kernel.org/r/20221102141014.1025893-2-Frank.Li@nxp.com
Signed-off-by: Frank Li <frank.li@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Baikal-T1 SoC is equipped with DWC PCIe v4.60a host controller. It can be
trained to work up to Gen.3 speed over up to x4 lanes. The host controller
is attached to the DW PCIe 3.0 PCS via the PIPE-4 interface, which in its
turn is connected to the DWC 10G PHY. The whole system is supposed to be
fed up with four clock sources: DBI peripheral clock, AXI application
clocks and external PHY/core reference clock generating the 100MHz signal.
In addition to that the platform provide a way to reset each part of the
controller: sticky/non-sticky bits, host controller core, PIPE interface,
PCS/PHY and Hot/Power reset signal. The driver also provides a way to
handle the GPIO-based PERST# signal.
Note due to the Baikal-T1 MMIO peculiarity we have to implement the DBI
interface accessors which make sure the IO operations are dword-aligned.
Link: https://lore.kernel.org/r/20221113191301.5526-21-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Currently almost each platform driver uses its own resets and clocks
naming in order to get the corresponding descriptors. It makes the code
harder to maintain and comprehend especially seeing the DWC PCIe core main
resets and clocks signals set hasn't changed much for about at least one
major IP-core release. So in order to organize things around these signals
we suggest to create a generic interface for them in accordance with the
naming introduced in the DWC PCIe IP-core reference manual:
Application clocks:
- "dbi" - data bus interface clock (on some DWC PCIe platforms it's
referred as "pclk", "pcie", "sys", "ahb", "cfg", "iface",
"gio", "reg", "pcie_apb_sys");
- "mstr" - AXI-bus master interface clock (some DWC PCIe glue drivers
refer to this clock as "port", "bus", "pcie_bus",
"bus_master/master_bus/axi_m", "pcie_aclk");
- "slv" - AXI-bus slave interface clock (also called as "port", "bus",
"pcie_bus", "bus_slave/slave_bus/axi_s", "pcie_aclk",
"pcie_inbound_axi").
Core clocks:
- "pipe" - core-PCS PIPE interface clock coming from external PHY (it's
normally named by the platform drivers as just "pipe");
- "core" - primary clock of the controller (none of the platform drivers
declare such a clock but in accordance with the ref. manual
the devices may have it separately specified);
- "aux" - auxiliary PMC domain clock (it is named by some platforms as
"pcie_aux" and just "aux");
- "ref" - Generic reference clock (it is a generic clock source, which
can be used as a signal source for multiple interfaces, some
platforms call it as "ref", "general", "pcie_phy",
"pcie_phy_ref").
Application resets:
- "dbi" - Data-bus interface reset (it's CSR interface clock and is
normally called as "apb" though technically it's not APB but
DWC PCIe-specific interface);
- "mstr" - AXI-bus master reset (some platforms call it as "port", "apps",
"bus", "axi_m");
- "slv" - ABI-bus slave reset (some platforms call it as "port", "apps",
"bus", "axi_s").
Core resets:
- "non-sticky" - non-sticky CSR flags reset;
- "sticky" - sticky CSR flags reset;
- "pipe" - PIPE-interface (Core-PCS) logic reset (some platforms
call it just "pipe");
- "core" - controller primary reset (resets everything except PMC
module, some platforms refer to this signal as "soft",
"pci");
- "phy" - PCS/PHY block reset (strictly speaking it is normally
connected to the input of an external block, but the
reference manual says it must be available for the PMC
working correctly, some existing platforms call it
"pciephy", "phy", "link");
- "hot" - PMC hot reset signal (also called as "sleep");
- "pwr" - cold reset signal (can be referred as "pwr", "turnoff").
Bus reset:
- "perst" - PCIe standard signal used to reset the PCIe peripheral
devices.
As you can see each platform uses it's own naming for basically the same
set of the signals. In the framework of this commit we suggest to add a
set of the clocks and reset signals resources, corresponding names and
identifiers for each denoted entity. At current stage the platforms will
be able to use the provided infrastructure to automatically request all
these resources and manipulate with them in the Host/EP init callbacks.
Alas it isn't that easy to create a common cold/hot reset procedure due to
too many platform-specifics in the procedure, like the external flags
exposure and the delays requirement.
Link: https://lore.kernel.org/r/20221113191301.5526-20-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Since the iATU CSR region is now retrieved in the DW PCIe resources getter
there is no much benefits in the iATU detection procedures splitting up.
Therefore let's join the iATU unroll/viewport detection procedure with the
rest of the iATU parameters detection code. The resultant method will be
as coherent as before, while the redundant functions will be eliminated
thus producing more readable code.
Link: https://lore.kernel.org/r/20221113191301.5526-19-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Currently the DW PCIe Root Port and Endpoint CSR spaces are retrieved in
the separate parts of the DW PCIe core driver. It doesn't really make
sense since the both controller types have identical set of the core CSR
regions: DBI, DBI CS2 and iATU/eDMA. Thus we can simplify the DW PCIe Host
and EP initialization methods by moving the platform-specific registers
space getting and mapping into a common method. It gets to be even more
justified seeing the CSRs base address pointers are preserved in the
common DW PCIe descriptor. Note all the OF-based common DW PCIe settings
initialization will be moved to the new method too in order to have a
single function for all the generic platform properties handling in single
place.
A nice side-effect of this change is that the pcie-designware-host.c and
pcie-designware-ep.c drivers are cleaned up from all the direct dw_pcie
storage modification, which makes the DW PCIe core, Root Port and Endpoint
modules more coherent.
Link: https://lore.kernel.org/r/20221113191301.5526-18-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Since in addition to the already available iATU unrolled mapping we are
about to add a few more DW PCIe platform-specific capabilities (CDM-check
and generic clocks/resets resources) let's add a generic interface to set
and get the flags indicating their availability. The new interface shall
improve maintainability of the platform-specific code.
Link: https://lore.kernel.org/r/20221113191301.5526-17-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
In accordance with the generic PCIe Root Port DT-bindings the "dma-ranges"
property has the same format as the "ranges" property. The only difference
is in their semantics. The "dma-ranges" property describes the PCIe-to-CPU
memory mapping in opposite to the CPU-to-PCIe mapping of the "ranges"
property. Even though the DW PCIe controllers are normally equipped with
the internal Address Translation Unit which inbound and outbound tables
can be used to implement both properties semantics, it was surprising for
me to discover that the host-related part of the DW PCIe driver currently
supports the "ranges" property only while the "dma-ranges" windows are
just ignored. Having the "dma-ranges" supported in the driver would be
very handy for the platforms, that don't tolerate the 1:1 CPU-PCIe memory
mapping and require a customized PCIe memory layout. So let's fix that by
introducing the "dma-ranges" property support.
First of all we suggest to rename the dw_pcie_prog_inbound_atu() method to
dw_pcie_prog_ep_inbound_atu() and create a new version of the
dw_pcie_prog_inbound_atu() function. Thus we'll have two methods for the
RC and EP controllers respectively in the same way as it has been
developed for the outbound ATU setup methods.
Secondly aside with the memory window index and type the new
dw_pcie_prog_inbound_atu() function will accept CPU address, PCIe address
and size as its arguments. These parameters define the PCIe and CPU memory
ranges which will be used to setup the respective inbound ATU mapping. The
passed parameters need to be verified against the ATU ranges constraints
in the same way as it is done for the outbound ranges.
Finally the DMA-ranges detected for the PCIe controller need to be
converted to the inbound ATU entries during the host controller
initialization procedure. It will be done in the framework of the
dw_pcie_iatu_setup() method. Note before setting the inbound ranges up we
need to disable all the inbound ATU entries in order to prevent unexpected
PCIe TLPs translations defined by some third party software like
bootloaders.
Link: https://lore.kernel.org/r/20221113191301.5526-16-Sergey.Semin@baikalelectronics.ru
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
It is reported that on some systems pciehp binds to an Upstream Port and
attempts to operate it which causes devices below the Port to disappear
from the bus.
This happens because acpiphp sets dev->is_hotplug_bridge for that Port
(after receiving a Device Check notification on it from the platform
firmware via ACPI) during the enumeration of PCI devices.
get_port_device_capability() sees that dev->is_hotplug_bridge is set and
adds PCIE_PORT_SERVICE_HP to Port services, which allows pciehp to bind to
the Port in question.
Even though this particular problem can be addressed by making the
portdrv_core checks more robust, it also causes power management to work
differently on the affected systems which generally is not desirable (PCIe
Ports with dev->is_hotplug_bridge set have to pass additional tests to be
allowed to go into the D3hot/cold power states which affects runtime PM of
devices below these Ports).
For this reason, amend check_hotplug_bridge() with a PCIe type check to
prevent it from setting dev->is_hotplug_bridge for Upstream Ports.
Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/2262230.ElGaqSPkdT@kreacher
Reported-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
It is reported that on some systems pciehp binds to an Upstream Port and
attempts to operate it which causes devices below the Port to disappear
from the bus.
This happens because acpiphp sets dev->is_hotplug_bridge for that Port
(after receiving a Device Check notification on it from the platform
firmware via ACPI) during the enumeration of PCI devices.
get_port_device_capability() sees that dev->is_hotplug_bridge is set and
adds PCIE_PORT_SERVICE_HP to Port services (which allows pciehp to bind to
the Port in question) without consulting the PCIe type, which should be
either Root Port or Downstream Port for the hotplug capability to be
present.
Per PCIe r6.0, sec 7.5.3.2, the Slot Implemented bit is only valid for
Downstream Ports (including Root Ports), and PCIe hotplug depends on the
Slot Capabilities / Control / Status registers.
Make get_port_device_capability() more robust by adding a PCIe type check
to it before adding PCIE_PORT_SERVICE_HP to Port services which helps to
avoid the problem.
[bhelgaas: add spec citation]
Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/4786090.31r3eYUQgx@kreacher
Reported-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
When kvasprintf() fails to allocate memory, it returns a NULL pointer.
Return error from pci_request_irq() so we don't dereference it.
[bhelgaas: commit log]
Fixes: 704e8953d3 ("PCI/irq: Add pci_request_irq() and pci_free_irq() helpers")
Link: https://lore.kernel.org/r/20221121020029.3759444-1-zengheng4@huawei.com
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
This is a simple mechanical transformation done by:
@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
(E)
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
We have stubs for most OF interfaces even when CONFIG_OF is not set, so we
allow building of most controller drivers in that case for compile testing.
When CONFIG_OF is not set, "of_match_ptr(<match_table>)" compiles to NULL,
which leaves <match_table> unused, resulting in errors like this:
$ make W=1
drivers/pci/controller/pci-xgene.c:636:34: error: ‘xgene_pcie_match_table’ defined but not used [-Werror=unused-const-variable=]
Drop of_match_ptr() to avoid the unused variable warning.
See also 1dff012f63 ("PCI: Drop of_match_ptr() to avoid unused
variables").
Link: https://lore.kernel.org/r/20221025191339.667614-2-helgaas@kernel.org
Link: https://lore.kernel.org/r/20221116205100.1136224-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that the PCI/MSI core code does early checking for multi-MSI support
X86_IRQ_ALLOC_CONTIGUOUS_VECTORS is not required anymore.
Remove the flag and rely on MSI_FLAG_MULTI_PCI_MSI.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122015.865042356@linutronix.de
All these sanity checks are now done _before_ any allocation work
happens. No point in doing it twice.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.749446904@linutronix.de
With interrupt domains the sanity check for MSI-X vector validation can be
done _before_ any allocation happens. The sanity check only applies to the
allocation functions which have an 'entries' array argument. The entries
array is filled by the caller with the requested MSI-X indices. Some drivers
have gaps in the index space which is not supported on all architectures.
The PCI/MSI irq domain has a 'feature' bit to enforce this validation late
during the allocation phase.
Just do it right away before doing any other work along with the other
sanity checks on that array.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.691357406@linutronix.de
Similar to PCI multi-MSI reject MSI-X enablement when a irq domain is
attached to the device which does not support MSI-X.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.631728309@linutronix.de
When hierarchical MSI interrupt domains are enabled then there is no point
to do tons of work and detect the missing support for multi-MSI late in the
allocation path.
Just query the domain feature flags right away. The query function is going
to be used for other purposes later and has a mode argument which influences
the result:
ALLOW_LEGACY returns true when:
- there is no irq domain attached (legacy support)
- there is a irq domain attached which has the feature flag set
DENY_LEGACY returns only true when:
- there is a irq domain attached which has the feature flag set
This allows to use the function universally without ifdeffery in the
calling code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.574339988@linutronix.de
There is no point in doing the same sanity checks over and over in a loop
during MSI-X enablement. Put them in front of the loop and return early
when they fail.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.516946468@linutronix.de
There is no way to navigate msi.c without banging the head against the wall
every now and then because MSI and MSI-X specific functions are
intermingled and the code flow is completely non-obvious.
Reorder everthing so common helpers, MSI and MSI-X specific functions are
grouped together.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.459089736@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_msi_enabled() and add kernel-doc for the function.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.331584998@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_msi_enabled() and make its kernel-doc comprehensive.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.271447896@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_irq_get_affinity() and let its kernel-doc match rest of the
file.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.214792769@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_disable_msix() and make its kernel-doc comprehensive.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.156785224@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_msix_vec_count() and make its kernel-doc comprehensive.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.099461602@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_free_irq_vectors() and make its kernel-doc comprehensive.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122015.042870570@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_irq_vector() and let its kernel-doc match the rest of the file.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.984490384@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_alloc_irq_vectors_affinity() and let its kernel-doc reference
pci_alloc_irq_vectors() documentation added in parent commit.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.927531290@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Make pci_alloc_irq_vectors() a real function instead of wrapper and add
proper kernel doc to it.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.870888193@linutronix.de
To disentangle the maze in msi.c, all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_enable_msix_range() and make its kernel-doc comprehensive.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.813792885@linutronix.de
To disentangle the maze in msi.c all exported device-driver MSI APIs are
now to be grouped in one file, api.c.
Move pci_enable_msi() and make its kernel-doc comprehensive.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.755178149@linutronix.de
msi.c is a maze of randomly sorted functions which makes the code
unreadable. As a first step split the driver visible API and the internal
implementation which also allows proper API documentation via one file.
Create drivers/pci/msi/api.c to group all exported device-driver PCI/MSI
APIs in one C file.
Begin by moving pci_disable_msi() there and add kernel-doc for the function
as appropriate.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.696798036@linutronix.de
The upcoming support for per device MSI interrupt domains needs to share
some of the inline helpers with the MSI implementation.
Move them to the header file.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.640052354@linutronix.de
Follow the style of <linux/pci.h>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.582175082@linutronix.de
Adjust to reality and remove another layer of pointless Kconfig
indirection. CONFIG_GENERIC_MSI_IRQ is good enough to serve
all purposes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122014.524842979@linutronix.de
What a zoo:
PCI_MSI
select GENERIC_MSI_IRQ
PCI_MSI_IRQ_DOMAIN
def_bool y
depends on PCI_MSI
select GENERIC_MSI_IRQ_DOMAIN
Ergo PCI_MSI enables PCI_MSI_IRQ_DOMAIN which in turn selects
GENERIC_MSI_IRQ_DOMAIN. So all the dependencies on PCI_MSI_IRQ_DOMAIN are
just an indirection to PCI_MSI.
Match the reality and just admit that PCI_MSI requires
GENERIC_MSI_IRQ_DOMAIN.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.467556921@linutronix.de
Let the core do the freeing of descriptors and just keep it around for the
legacy case.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.409654736@linutronix.de
Set the bus token in the msi_domain_info structure and let the core code
handle the update.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.352437595@linutronix.de
PCI/MSI and PCI/MSI-X are mutually exclusive, but the MSI-X enable code
lacks a check for already enabled MSI.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122013.653556720@linutronix.de
Per PCIe r6.0, sec 6.30.1, a data object Length of 0x0 indicates 2^18
DWORDs (256K DW or 1MB) being transferred. Adjust the value of data object
length for this case on both sending side and receiving side.
Don't bother checking whether Length is greater than SZ_1M because all
values of the 18-bit Length field are valid, and it is impossible to
represent anything larger than SZ_1M:
0x00000 256K DW (1M bytes)
0x00001 1 DW (4 bytes)
...
0x3ffff 256K-1 DW (1M - 4 bytes)
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20221116015637.3299664-1-ming4.li@intel.com
Fixes: 9d24322e88 ("PCI/DOE: Add DOE mailbox support functions")
Signed-off-by: Li Ming <ming4.li@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v6.0+
PCI config space access from user space has traditionally been
unrestricted with writes being an understood risk for device operation.
Unfortunately, device breakage or odd behavior from config writes lacks
indicators that can leave driver writers confused when evaluating
failures. This is especially true with the new PCIe Data Object
Exchange (DOE) mailbox protocol where backdoor shenanigans from user
space through things such as vendor defined protocols may affect device
operation without complete breakage.
A prior proposal restricted read and writes completely.[1] Greg and
Bjorn pointed out that proposal is flawed for a couple of reasons.
First, lspci should always be allowed and should not interfere with any
device operation. Second, setpci is a valuable tool that is sometimes
necessary and it should not be completely restricted.[2] Finally
methods exist for full lock of device access if required.
Even though access should not be restricted it would be nice for driver
writers to be able to flag critical parts of the config space such that
interference from user space can be detected.
Introduce pci_request_config_region_exclusive() to mark exclusive config
regions. Such regions trigger a warning and kernel taint if accessed
via user space.
Create pci_warn_once() to restrict the user from spamming the log.
[1] https://lore.kernel.org/all/161663543465.1867664.5674061943008380442.stgit@dwillia2-desk3.amr.corp.intel.com/
[2] https://lore.kernel.org/all/YF8NGeGv9vYcMfTV@kroah.com/
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20220926215711.2893286-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch switches the driver away from legacy gpio/of_gpio API to
gpiod API, and removes use of of_get_named_gpio_flags() which I want to
make private to gpiolib.
Link: https://lore.kernel.org/r/20220906204301.3736813-1-dmitry.torokhov@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Jeffrey added Multi-MSI support to the pci-hyperv driver by the 4 patches:
08e61e861a ("PCI: hv: Fix multi-MSI to allow more than one MSI vector")
455880dfe2 ("PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI")
b4b77778ec ("PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()")
a2bad844a6 ("PCI: hv: Fix interrupt mapping for multi-MSI")
It turns out that the third patch (b4b77778ec) causes a performance
regression because all the interrupts now happen on 1 physical CPU (or two
pCPUs, if one pCPU doesn't have enough vectors). When a guest has many PCI
devices, it may suffer from soft lockups if the workload is heavy, e.g.,
see https://lwn.net/ml/linux-kernel/20220804025104.15673-1-decui@microsoft.com/
Commit b4b77778ec itself is good. The real issue is that the hypercall in
hv_irq_unmask() -> hv_arch_irq_unmask() ->
hv_do_hypercall(HVCALL_RETARGET_INTERRUPT...) only changes the target
virtual CPU rather than physical CPU; with b4b77778ec, the pCPU is
determined only once in hv_compose_msi_msg() where only vCPU0 is specified;
consequently the hypervisor only uses 1 target pCPU for all the interrupts.
Note: before b4b77778ec, the pCPU is determined twice, and when the pCPU
is determined the second time, the vCPU in the effective affinity mask is
used (i.e., it isn't always vCPU0), so the hypervisor chooses different
pCPU for each interrupt.
The hypercall will be fixed in future to update the pCPU as well, but
that will take quite a while, so let's restore the old behavior in
hv_compose_msi_msg(), i.e., don't reuse the existing IRTE allocation for
single-MSI and MSI-X; for multi-MSI, we choose the vCPU in a round-robin
manner for each PCI device, so the interrupts of different devices can
happen on different pCPUs, though the interrupts of each device happen on
some single pCPU.
The hypercall fix may not be backported to all old versions of Hyper-V, so
we want to have this guest side change forever (or at least till we're sure
the old affected versions of Hyper-V are no longer supported).
Fixes: b4b77778ec ("PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()")
Co-developed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Co-developed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20221104222953.11356-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
pci_device_is_present() previously didn't work for VFs because it reads the
Vendor and Device ID, which are 0xffff for VFs, which looks like they
aren't present. Check the PF instead.
Wei Gong reported that if virtio I/O is in progress when the driver is
unbound or "0" is written to /sys/.../sriov_numvfs, the virtio I/O
operation hangs, which may result in output like this:
task:bash state:D stack: 0 pid: 1773 ppid: 1241 flags:0x00004002
Call Trace:
schedule+0x4f/0xc0
blk_mq_freeze_queue_wait+0x69/0xa0
blk_mq_freeze_queue+0x1b/0x20
blk_cleanup_queue+0x3d/0xd0
virtblk_remove+0x3c/0xb0 [virtio_blk]
virtio_dev_remove+0x4b/0x80
...
device_unregister+0x1b/0x60
unregister_virtio_device+0x18/0x30
virtio_pci_remove+0x41/0x80
pci_device_remove+0x3e/0xb0
This happened because pci_device_is_present(VF) returned "false" in
virtio_pci_remove(), so it called virtio_break_device(). The broken vq
meant that vring_interrupt() skipped the vq.callback() that would have
completed the virtio I/O operation via virtblk_done().
[bhelgaas: commit log, simplify to always use pci_physfn(), add stable tag]
Link: https://lore.kernel.org/r/20221026060912.173250-1-mst@redhat.com
Reported-by: Wei Gong <gongwei833x@gmail.com>
Tested-by: Wei Gong <gongwei833x@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
When the PHY is the reference clock provider then it must be initialized
and powered on before the reset on the client is deasserted, otherwise
the link will never come up. The order was changed in cf236e0c0d.
Restore the correct order to make the driver work again on boards where
the PHY provides the reference clock. This also changes the order for
boards where the Soc is the PHY reference clock divider, but this
shouldn't do any harm.
Link: https://lore.kernel.org/r/20221101095714.440001-1-s.hauer@pengutronix.de
Fixes: cf236e0c0d ("PCI: imx6: Do not hide PHY driver callbacks and refine the error handling")
Tested-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Change to follow the Kconfig style guide. This patch fixes to use tab
rather than space to indent, while help text is indented an additional
two spaces.
Link: https://lore.kernel.org/r/20220815025006.48167-1-mie@igel.co.jp
Fixes: e35f56bb03 ("PCI: endpoint: Support NTB transfer between RC and EP")
Signed-off-by: Shunsuke Mie <mie@igel.co.jp>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
On Qualcomm platforms like SC8280XP and SA8540P, interconnect bandwidth
must be requested before enabling interconnect clocks.
Add basic support for managing an optional "pcie-mem" interconnect path
by setting a low constraint before enabling clocks and updating it after
the link is up.
Note that it is not possible for a controller driver to set anything but
a maximum peak bandwidth as expected average bandwidth will vary with
use case and actual use (and power policy?). This very much remains an
unresolved problem with the interconnect framework.
Also note that no constraint is set for the SC8280XP/SA8540P "cpu-pcie"
path for now as it is not clear what an appropriate constraint would be
(and the system does not crash when left unspecified).
Link: https://lore.kernel.org/r/20221102090705.23634-3-johan+linaro@kernel.org
Fixes: 70574511f3 ("PCI: qcom: Add support for SC8280XP")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Acked-by: Georgi Djakov <djakov@kernel.org>
MSI remapping is disabled by VMD driver for Intel's Icelake and
newer systems in order to improve performance by setting
VMCONFIG_MSI_REMAP. By design VMCONFIG_MSI_REMAP register is cleared
by firmware during boot. The same register gets cleared when system
is put in S3 power state. VMD driver needs to set this register again
in order to avoid interrupt issues with devices behind VMD if MSI
remapping was disabled before.
Link: https://lore.kernel.org/r/20221109142652.450998-1-nirmal.patel@linux.intel.com
Fixes: ee81ee84f8 ("PCI: vmd: Disable MSI-X remapping when possible")
Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Francisco Munoz <francisco.munoz.ruiz@linux.intel.com>
Set RCB_MPS mode bit so that data for PCIe read requests up to the size of
the Maximum Payload Size (MPS) are returned in one completion, and data for
PCIe read requests greater than the MPS are split at the specified Read
Completion Boundary setting.
Set RCB_64B so that the Read Compeletion Boundary is 64B.
Link: https://lore.kernel.org/r/20221011184211.18128-6-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
A number of inline functions are called rarely and/or are not
time-critical. Take out the "inline" and let the compiler do its work.
Link: https://lore.kernel.org/r/20221011184211.18128-5-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
It would be nice to replace the PCIe link-up loop as well but
there are too many uses of this that do not poll (and the
read_poll_timeout uses "timeout==0" to loop forever).
Link: https://lore.kernel.org/r/20221011184211.18128-4-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Be prudent and give some time for power and clocks to become stable. As
described in the PCIe CEM specification sections 2.2 and 2.2.1; as well as
PCIe r5.0, 6.6.1.
Link: https://lore.kernel.org/r/20221011184211.18128-3-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
We always wanted to enable Multi-MSI but didn't have a test device until
recently. In addition, there are some devices out there that will ask for
multiple MSI but refuse to work if they are only granted one.
Link: https://lore.kernel.org/r/20221011184211.18128-2-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Many host controller drivers #include <linux/of_irq.h> even though they
don't need it. Remove the unnecessary #includes.
Link: https://lore.kernel.org/r/20221031153954.1163623-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Roy Zang <roy.zang@nxp.com>
pci-xgene-msi.c uses irq_domain_add_linear() and related interfaces, so it
needs <linux/irqdomain.h> but doesn't include it directly; it relies on the
fact that <linux/of_irq.h> includes it.
But pci-xgene-msi.c *doesn't* need <linux/of_irq.h> itself. Include
<linux/irqdomain.h> directly to remove this implicit dependency so a future
patch can drop <linux/of_irq.h>.
Link: https://lore.kernel.org/r/20221031153954.1163623-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci-mvebu.c uses irq_domain_add_linear() and related interfaces but relies
on <linux/irqdomain.h> but doesn't include it directly; it relies on the
fact that <linux/of_irq.h> includes it.
Include <linux/irqdomain.h> directly to remove this implicit dependency.
Link: https://lore.kernel.org/r/20221031153954.1163623-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
pcie-microchip-host.c uses irq_domain_add_linear() and related interfaces,
so it needs <linux/irqdomain.h> but doesn't include it directly; it relies
on the fact that <linux/of_irq.h> includes it.
But pcie-microchip-host.c *doesn't* need <linux/of_irq.h> itself. Include
<linux/irqdomain.h> directly to remove this implicit dependency so a future
patch can drop <linux/of_irq.h>.
Link: https://lore.kernel.org/r/20221031153954.1163623-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
pcie-altera-msi.c uses irq_domain_add_linear() and related interfaces, so
it needs <linux/irqdomain.h> but doesn't include it directly; it relies on
the fact that <linux/of_irq.h> includes it.
But pcie-altera-msi.c *doesn't* need <linux/of_irq.h> itself. Include
<linux/irqdomain.h> directly to remove this implicit dependency so a future
patch can drop <linux/of_irq.h>.
Link: https://lore.kernel.org/r/20221031153954.1163623-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some of the platforms (like Tegra194 and Tegra234) have open slots and
not having an endpoint connected to the slot is not an error.
So, changing the macro from dev_err to dev_info to log the event.
Link: https://lore.kernel.org/r/20220913101237.4337-1-vidyas@nvidia.com
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
When pci_create_attr() fails, pci_remove_resource_files() is called which
will iterate over the res_attr[_wc] arrays and frees every non NULL entry.
To avoid a double free here set the array entry only after it's clear we
successfully initialized it.
Fixes: b562ec8f74 ("PCI: Don't leak memory if sysfs_create_bin_file() fails")
Link: https://lore.kernel.org/r/20221007070735.GX986@pengutronix.de/
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Create a sysfs bin attribute called "allocate" under the existing
"p2pmem" group. The only allowable operation on this file is the mmap()
call.
When mmap() is called on this attribute, the kernel allocates a chunk of
memory from the genalloc and inserts the pages into the VMA. The
dev_pagemap .page_free callback will indicate when these pages are no
longer used and they will be put back into the genalloc.
On device unbind, remove the sysfs file before the memremap_pages are
cleaned up. This ensures unmap_mapping_range() is called on the files
inode and no new mappings can be created.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20221021174116.7200-9-logang@deltatee.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Replace assignment of PCI domain IDs from atomic_inc_return() to
ida_alloc().
Use two IDAs, one for static domain allocations (those which are defined in
device tree) and second for dynamic allocations (all other).
During removal of root bus / host bridge, also release the domain ID. The
released ID can be reused again, for example when dynamically loading and
unloading native PCI host bridge drivers.
This change also allows to mix static device tree assignment and dynamic by
kernel as all static allocations are reserved in dynamic pool.
[bhelgaas: set "err" if "bus->domain_nr < 0"]
Link: https://lore.kernel.org/r/20220714184130.5436-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCIe r2.0, sec 7.8 added Link Capabilities/Status/Control 2 registers to
the PCIe Capability with Capability Version 2.
Previously we assumed these registers were implemented for all PCIe
Capabilities of version 2 or greater, but in fact they are only
implemented for devices with Links.
Update pcie_capability_reg_implemented() to check whether the device has
a Link.
[bhelgaas: commit log, squash export]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209100057070.2275@angie.orcam.me.uk
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209100057300.2275@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The local variable 'vector' must be u32 rather than u8: see the
struct hv_msi_desc3.
'vector_count' should be u16 rather than u8: see struct hv_msi_desc,
hv_msi_desc2 and hv_msi_desc3.
Fixes: a2bad844a6 ("PCI: hv: Fix interrupt mapping for multi-MSI")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Jeffrey Hugo <quic_jhugo@quicinc.com>
Cc: Carl Vanderlip <quic_carlv@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://lore.kernel.org/r/20221027205256.17678-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
The Requester ID/Process Address Space ID (PASID) combination
identifies an address space distinct from the PCI bus address space,
e.g., an address space defined by an IOMMU.
But the PCIe fabric routes Memory Requests based on the TLP address,
ignoring any PASID (PCIe r6.0, sec 2.2.10.4), so a TLP with PASID that
SHOULD go upstream to the IOMMU may instead be routed as a P2P
Request if its address falls in a bridge window.
To ensure that all Memory Requests with PASID are routed upstream,
only enable PASID if ACS P2P Request Redirect and Upstream Forwarding
are enabled for the path leading to the device.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Tony Zhu <tony.zhu@intel.com>
Link: https://lore.kernel.org/r/20221031005917.45690-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The pci_epf_test_notifier function should be installed also if only
core_init_notifier is enabled. Fix the current logic.
Link: https://lore.kernel.org/r/20220825090101.20474-1-hayashi.kunihiko@socionext.com
Fixes: 5e50ee27d4 ("PCI: pci-epf-test: Add support to defer core initialization")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Om Prakash Singh <omp@nvidia.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
[devm_]gpiod_get_from_of_node in drivers usage should be limited
so that gpiolib can be cleaned up; let's switch to the generic device
property API.
It may even help with handling secondary fwnodes when gpiolib is taught
to handle gpios described by swnodes.
Link: https://lore.kernel.org/r/20220903-gpiod_get_from_of_node-remove-v1-1-b29adfb27a6c@gmail.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
[lpieralisi@kernel.org: commit log]
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Since there is no release callback defined for the PCI EPC device,
the below warning is thrown by driver core when a PCI endpoint driver is
removed:
Device 'e65d0000.pcie-ep' does not have a release() function, it is broken and must be fixed. See Documentation/core-api/kobject.rst.
WARNING: CPU: 0 PID: 139 at drivers/base/core.c:2232 device_release+0x78/0x8c
Hence, add the release callback and also move the kfree(epc) from
pci_epc_destroy() so that the epc memory is freed when all references are
dropped.
Link: https://lore.kernel.org/r/20220623003817.298173-1-yoshihiro.shimoda.uh@renesas.com
Tested-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Dual mode DesignWare PCIe IP has PTM capability enabled (if supported) even
in the EP mode. The PCIe compliance for the EP mode expects PTM
capabilities (ROOT_CAPABLE, RES_CAPABLE, CLK_GRAN) be disabled.
Hence disable PTM for the EP mode.
Link: https://lore.kernel.org/r/20220919143340.4527-3-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
commit aeaa0bfe89 ("PCI: dwc: Move N_FTS setup to common setup")
incorrectly uses pci->link_gen in deriving the index to the
n_fts[] array also introducing the issue of accessing beyond the
boundaries of array for greater than Gen-2 speeds. This change fixes
that issue.
Link: https://lore.kernel.org/r/20220926111923.22487-1-vidyas@nvidia.com
Fixes: aeaa0bfe89 ("PCI: dwc: Move N_FTS setup to common setup")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
1a1daf097e ("PCI/PM: Remove unused pci_driver.suspend_late() hook")
removed the legacy .suspend_late() hook, which was the only user of the
"state" parameter to pci_legacy_suspend_late(), but it neglected to remove
the parameter.
Remove the unused "state" parameter to pci_legacy_suspend_late().
Link: https://lore.kernel.org/r/20221025193502.669091-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
pcie_port_service_register() and pcie_port_service_unregister() are used
only by the pciehp, aer, dpc, and pme PCIe port service drivers, none of
which can be modules. Unexport pcie_port_service_register() and
pcie_port_service_unregister(). No functional change intended.
Link: https://lore.kernel.org/r/20221019204127.44463-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Previously several things used by portdrv_core.c and portdrv_pci.c were
shared by defining them in portdrv.h. Now that portdrv_core.c and
portdrv_pci.c have been squashed, move things that can be private into
portdrv.c. No functional change intended.
Link: https://lore.kernel.org/r/20221019204127.44463-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Squash portdrv_core.c and portdrv_pci.c into portdrv.c to make it easier to
find things. The whole thing is less than 1000 lines, and it's a pain to
bounce back and forth between two files.
Several portdrv_core.c functions were non-static because they were
referenced from portdrv_pci.c. Make them static since they're now all in
portdrv.c.
No functional change intended.
Link: https://lore.kernel.org/r/20221019204127.44463-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
This reverts commit 8bb7ff12a9.
Commit 8bb7ff12a9 ("PCI: tegra: Use PCI_CONF1_EXT_ADDRESS() macro")
updated the Tegra PCI driver to use the macro PCI_CONF1_EXT_ADDRESS()
instead of a local function in the Tegra PCI driver. This broke PCI for
some Tegra platforms because, when calculating the offset value, the mask
applied to the lower 8-bits changed from 0xff to 0xfc.
For now, fix this by reverting this commit.
Fixes: 8bb7ff12a9 ("PCI: tegra: Use PCI_CONF1_EXT_ADDRESS() macro")
Link: https://lore.kernel.org/r/20221017084006.11770-1-jonathanh@nvidia.com
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Call phy_set_mode_ext() to notify the PHY driver that the PHY is being
used in the EP mode.
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Link: https://lore.kernel.org/r/20220927092207.161501-6-dmitry.baryshkov@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Call phy_set_mode_ext() to notify the PHY driver that the PHY is being
used in the RC mode.
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Link: https://lore.kernel.org/r/20220927092207.161501-5-dmitry.baryshkov@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmNKzlUUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vxBbA//QYuw8iZHzshJntCyWlTRg742x2jI
2NrLojVG/ccSlqnl/KRultaPFdSlZb221UASt7CAEIGoP6+/SNepazEEJ4QJhMX6
inY/hYrVd2OHhCnO+OoNd7DKfbeJRkFVMHOGJdpbQ7jMwa99YSaN6xEMfALcZIrg
8CqYKDpdJW4avYyKlfXayY42d8gljWJwfLuK1xEnnIrdvXQEB3vj3yKxziadhBtA
EJbLWORVF+yQwq63NCPkLvCe0ZuV09a4q4IhlzWo+30FLMOsk2z4HBi1B2fwlawV
tnYCt0i6FNC6e3DJyWX/hkAiIbJUadnych+LcGq+/FY30tCBtNkNaQN3XsrvdvI4
5piacj1av21R7JHSyk64M4jdNlE7MYonQNsc8vp4yDtlAjQYkaj7BJdsIq3MCTei
ce6mxguVb2hrMr4fZXc8ka7XZkofJNv/XhfuQAorzKfEqA8cI4enICsPdoTFyZDV
pYYuHIJXY4rTm0wUD+TlDg/Cn6gyLu3WYVyEn6q/E53Twj1rr5ske7igDMDVYYly
qMg8CTPHYzHJ0wbzkf0CfND7SVAAE1BnIzr4Drfbrm+fi11HVhd0u/0XiWf2NthW
/ZFQUDBh0SwSHgUnO+ONqcR3arhIT4PEsEJadD4kuW4nIyPS7t1Nfu0V3XUZDa14
U6UeL6URnXBDV7U=
=gs3v
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fix from Bjorn Helgaas:
"Revert the attempt to distribute spare resources to unconfigured
hotplug bridges at boot time.
This fixed some dock hot-add scenarios, but Jonathan Cameron reported
that it broke a topology with a multi-function device where one
function was a Switch Upstream Port and the other was an Endpoint"
* tag 'pci-v6.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI: Distribute available resources for root buses, too"
This reverts commit e96e27fc6f.
Jonathan reported that this commit broke this topology, where all the space
available on bus 02 was assigned to the 02:00.0 bridge window, leaving none
for the e1000 device at 02:00.1:
pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04]
pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04]
pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000]
e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0]
Link: https://lore.kernel.org/r/20221014124553.0000696f@huawei.com
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY0ZjFAAKCRCAXGG7T9hj
vjEsAP4rFMnqc6AXy4Mpvv8cxBtEuQZbwEqgBrMJUvK1jZQrBQD/dOJK2GBCVcfD
2yaVlefFiJGTw5WUlbPeohUlTZ8pJwg=
=xsHV
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- Some minor typo fixes
- A fix of the Xen pcifront driver for supporting the device model to
run in a Linux stub domain
- A cleanup of the pcifront driver
- A series to enable grant-based virtio with Xen on x86
- A cleanup of Xen PV guests to distinguish between safe and faulting
MSR accesses
- Two fixes of the Xen gntdev driver
- Two fixes of the new xen grant DMA driver
* tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum"
xen/pv: support selecting safe/unsafe msr accesses
xen/pv: refactor msr access functions to support safe and unsafe accesses
xen/pv: fix vendor checks for pmu emulation
xen/pv: add fault recovery control to pmu msr accesses
xen/virtio: enable grant based virtio on x86
xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT
xen/virtio: restructure xen grant dma setup
xen/pcifront: move xenstore config scanning into sub-function
xen/gntdev: Accommodate VMA splitting
xen/gntdev: Prevent leaking grants
xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices
xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page()
xen/xenbus: Fix spelling mistake "hardward" -> "hardware"
xen-pcifront: Handle missed Connected state
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmNECrkUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vy7HQ//egBx+1/Eu5N+Y2c4ebGLcc4zCTks
Cj3lDFB8v6KvNaDJZHToJi+vychv6BjsQPE1rpfU18+FCfB/fBfzkf/N22qr258l
tNDn+YxgXHKd6zumUW88bRmK6vnKz8ELKELC3LMZNmJMcWemFKFY2rdgCh5SEJ8d
S/VFSSPip+oyA02zOa1QoCu1nlbYGwZRegYpBTHMSHfgm0ddBSQWNVMKBJSnNP9A
73X+unY3Nh360oRgaQb8wnCJp8vpalavtGMTq1CJPQBAAlUTWpVUAabF7eSNSFZt
KO7aFdL1rWgWeZsLBLhVq6fIhK31U7ED9aXYYBMcla8zRUlP/IkMEUF/ztdSuQ9Y
t4cQVrVGAM/6WnQaqzfuktEWGG3OJGnOYncvWhUlP6SydUagvaKn49yvKBNw6Ehb
GFtrx7/2Ap8icGqkeLajxGVq+N8VH5T02uToOCxZn/U10m0Hhu6objt3erFmIVoN
+aWojZCc7YLVksNzyYcNSEvsDeGz/RA/l3dxgEfJOu44rQAKq/AzwT8scjsNqjAs
e0UjDVYJVdU7yCM2lDhqYEzfImkqienO+iEySIkUIpo5Z9YUf8//kvYPOFih7ZXH
KhGbezPl3O/Ta2Go5TQ5ovh09IGVYM/PzjFO0a/PeOjCe9heutrmuFtLi+y+ROL0
wd/mZL1kkT++P7c=
=pOqY
-----END PGP SIGNATURE-----
Merge tag 'pci-v6.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci updates from Bjorn Helgaas:
"Resource management:
- Distribute spare resources to unconfigured hotplug bridges at
boot-time (not just when hot-adding such a bridge), which makes
hot-adding devices to docks work better.
- Revert to a BAR assignment inherited from firmware only when the
address is actually reachable via any upstream bridges, which fixes
some cases where firmware doesn't configure all devices.
- Add a sysfs interface to resize BARs so this can be done before
assigning devices to a VM through VFIO.
Power management:
- Disable Precision Time Management for all devices on suspend to
enable lower-power PM state. We previously did this just for Root
Ports, which isn't enough because downstream devices can still
generate PTM messages, which cause errors if it's disabled in the
Root Port.
- Save and restore the ASPM L1 PM Substates configuration for
suspend/ resume. Previously this configuration was lost, so L1.x
states likely stopped working after resume.
- Check whether the L1 PM Substates Capability exists. If it didn't
exist, we previously read junk and tried to configure L1 Substates
based on that.
- Fix the LTR_L1.2_THRESHOLD computation, which previously set a
threshold for entering L1.2 that was too low in some cases.
- Reduce the delay after transitions to or from D3cold by using
usleep_range() rather than msleep(), which often slept for ~19ms
instead of the 10ms normally required. The spec says 10ms is
enough, but it's possible we could trip over devices that need a
little more.
Error handling:
- Work around a BIOS bug that caused Intel Root Ports to advertise a
Root Port Programmed I/O (RP PIO) log size of zero, which caused
annoying warnings and prevented the kernel from dumping log
registers for DPC errors.
Qualcomm PCIe controller driver:
- Add support for SC8280XP and SA8540P host controllers and SM8450
endpoint controller.
- Disable Master AXI clock on endpoint controllers to save power when
link is idle or in L1.x.
- Expose link state transition counts via debugfs to help debug
issues with low-power states.
- Add auto-loading module support.
Synopsys DesignWare PCIe controller driver:
- Remove a dependency on ZONE_DMA32 by allocating the MSI target page
differently. There's more work to do related to eDMA controllers,
so it's not completely settled"
* tag 'pci-v6.1-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (71 commits)
PCI: qcom-ep: Check platform_get_resource_byname() return value
PCI: qcom-ep: Add support for SM8450 SoC
dt-bindings: PCI: qcom-ep: Add support for SM8450 SoC
dt-bindings: PCI: qcom-ep: Define clocks per platform
PCI: qcom-ep: Make PERST separation optional
dt-bindings: PCI: qcom-ep: Make PERST separation optional
PCI: qcom-ep: Disable Master AXI Clock when there is no PCIe traffic
PCI: Expose PCIe Resizable BAR support via sysfs
PCI/ASPM: Correct LTR_L1.2_THRESHOLD computation
PCI/ASPM: Ignore L1 PM Substates if device lacks capability
PCI/ASPM: Factor out L1 PM Substates configuration
PCI: qcom-ep: Gate Master AXI clock to MHI bus during L1SS
PCI: qcom-ep: Expose link transition counts via debugfs
PCI: qcom-ep: Disable IRQs during driver remove
PCI/ASPM: Save L1 PM Substates Capability for suspend/resume
PCI/ASPM: Refactor L1 PM Substates Control Register programming
PCI: qcom-ep: Make use of the cached dev pointer
PCI: qcom-ep: Rely on the clocks supplied by devicetree
PCI: qcom-ep: Add kernel-doc for qcom_pcie_ep structure
phy: freescale: imx8m-pcie: Fix the wrong order of phy_init() and phy_power_on()
...
pcifront_try_connect() and pcifront_attach_devices() share a large
chunk of duplicated code for reading the config information from
Xenstore, which only differs regarding calling pcifront_rescan_root()
or pcifront_scan_root().
Put that code into a new sub-function. It is fine to always call
pcifront_rescan_root() from that common function, as it will fallback
to pcifront_scan_root() if the domain/bus combination isn't known
yet (and pcifront_scan_root() should never be called for an already
known domain/bus combination anyway). In order to avoid duplicate
messages for the fallback case move the check for domain/bus not known
to the beginning of pcifront_rescan_root().
While at it fix the error reporting in case the root-xx node had the
wrong format.
As the return value of pcifront_try_connect() and
pcifront_attach_devices() are not used anywhere make those functions
return void. As an additional bonus this removes the dubious return
of -EFAULT in case of an unexpected driver state.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
- Add macros for PCI Configuration Mechanism #1 and use them in the
ftpci100, mt7621, and tegra drivers (Pali Rohár)
* remotes/lorenzo/pci/misc:
PCI: tegra: Use PCI_CONF1_EXT_ADDRESS() macro
PCI: mt7621: Use PCI_CONF1_EXT_ADDRESS() macro
PCI: ftpci100: Use PCI_CONF1_ADDRESS() macro
PCI: Add standard PCI Config Address macros
- List platforms that use a single MSI host interrupt in qcom DT (Johan
Hovold)
- Add SC8280XP, SA8540P support to qcom DT binding and driver(Johan Hovold)
- Make all optional clocks truly optional in the driver (Johan Hovold)
- Rename per-IP structs to reflect the IP version (Johan Hovold)
- Sort device ID match table by compatible string (Johan Hovold)
- Add MODULE_DEVICE_TABLE to enable module autoloading (Dmitry Baryshkov)
- Drop the unused .post_deinit() callback (Johan Hovold)
- Rely on DT for clock information instead of hard-coding it in the driver
(Manivannan Sadhasivam)
- Disable IRQs when removing driver to avoid spurious IRQs later
(Manivannan Sadhasivam)
- Expose link transition counts via debugfs to help debug issues with
low-power states (Manivannan Sadhasivam)
- Gate Master AXI clock to the MHI bus while in L1 substates to save power
(Manivannan Sadhasivam)
- Disable Master AXI clock to save power when there is no traffic on PCIe
(Manivannan Sadhasivam)
- Make the "PERST separation" debug feature optional in the DT and the
driver (Manivannan Sadhasivam)
- Define clocks to be per-platform in DT to prepare for future SoCs
(Manivannan Sadhasivam)
- Add SM8450 SoC support (Manivannan Sadhasivam)
- Check for platform_get_resource_byname() to avoid a NULL pointer
dereference (Yang Yingliang)
* pci/qcom:
PCI: qcom-ep: Check platform_get_resource_byname() return value
PCI: qcom-ep: Add support for SM8450 SoC
dt-bindings: PCI: qcom-ep: Add support for SM8450 SoC
dt-bindings: PCI: qcom-ep: Define clocks per platform
PCI: qcom-ep: Make PERST separation optional
dt-bindings: PCI: qcom-ep: Make PERST separation optional
PCI: qcom-ep: Disable Master AXI Clock when there is no PCIe traffic
PCI: qcom-ep: Gate Master AXI clock to MHI bus during L1SS
PCI: qcom-ep: Expose link transition counts via debugfs
PCI: qcom-ep: Disable IRQs during driver remove
PCI: qcom-ep: Make use of the cached dev pointer
PCI: qcom-ep: Rely on the clocks supplied by devicetree
PCI: qcom-ep: Add kernel-doc for qcom_pcie_ep structure
PCI: qcom: Rename host-init error label
PCI: qcom: Drop unused post_deinit callback
PCI: qcom-ep: Add MODULE_DEVICE_TABLE
PCI: qcom: Sort device-id table
PCI: qcom: Clean up IP configurations
PCI: qcom: Make all optional clocks optional
PCI: qcom: Add support for SA8540P
PCI: qcom: Add support for SC8280XP
dt-bindings: PCI: qcom: Add SA8540P to binding
dt-bindings: PCI: qcom: Add SC8280XP to binding
dt-bindings: PCI: qcom: Enumerate platforms with single msi interrupt
- Rename the pcie-mediatek-gen3 driver from 'mtk-pcie' to 'mtk-pcie-gen3'
so it can coexist with the pcie-mediatek driver, which also uses
'mtk-pcie' (Felix Fietkau)
* remotes/lorenzo/pci/mediatek:
PCI: mediatek-gen3: Change driver name to mtk-pcie-gen3