There are two kexec load syscalls: kexec_load and kexec_file_load.
kexec_file_load has already been split out into kernel/kexec_file.c. In this
patch I split the kexec_load syscall code out into kernel/kexec.c.
Also add a new Kconfig option, KEXEC_CORE, so that we can disable kexec_load
and use only kexec_file_load, or vice versa.
The original requirement came from Ted Ts'o, who wants the kexec kernel
signature to be checked when CONFIG_KEXEC_VERIFY_SIG is enabled. But
kexec-tools, which uses the kexec_load syscall, can bypass the check.
Vivek Goyal proposed creating a common Kconfig option so that users can
compile in only one syscall for loading a kexec kernel. KEXEC/KEXEC_FILE
select KEXEC_CORE so that old config files still work.
Because general code needs CONFIG_KEXEC_CORE, I updated all the
architecture Kconfig files with the new option KEXEC_CORE and let KEXEC
select KEXEC_CORE in the arch Kconfig. I also updated the general kernel
code that deals with the kexec_load syscall.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Young <dyoung@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull parisc updates from Helge Deller:
"The most important changes in this patchset are:
- re-enable 64bit PCI bus addresses which were temporarily disabled
for PA-RISC in kernel 4.2
- fix the 64bit CAS operation in the LWS path which now enables us to
enable the 64bit gcc atomic builtins even on 32bit userspace with
64bit kernel
- fix a long-standing bug which sometimes crashed kernel at bootup
while serial interrupt wasn't registered yet"
* 'parisc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Use platform_device_register_simple("rtc-generic")
parisc: Drop CONFIG_SMP around update_cr16_clocksource()
parisc: Use double word condition in 64bit CAS operation
parisc: Filter out spurious interrupts in PA-RISC irq handler
parisc: Additionally check for in_atomic() in page fault handler
PCI,parisc: Enable 64-bit bus addresses on PA-RISC
parisc: Define ioremap_uc and ioremap_wc
Merge tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"This update has successfully completed a 0day-kbuild run and has
appeared in a linux-next release. The changes outside of the typical
drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and
the introduction of ZONE_DEVICE + devm_memremap_pages().
Summary:
- Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
mechanism for adding device-driver-discovered memory regions to the
kernel's direct map.
This facility is used by the pmem driver to enable pfn_to_page()
operations on the page frames returned by DAX ('direct_access' in
'struct block_device_operations').
For now, the 'memmap' allocation for these "device" pages comes
from "System RAM". Support for allocating the memmap from device
memory will arrive in a later kernel.
- Introduce memremap() to replace usages of ioremap_cache() and
ioremap_wt(). memremap() drops the __iomem annotation for these
mappings to memory that do not have i/o side effects. The
replacement of ioremap_cache() with memremap() is limited to the
pmem driver to ease merging the api change in v4.3.
Completion of the conversion is targeted for v4.4.
- Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
driver, update the VFS DAX implementation and PMEM api to provide
persistence guarantees for kernel operations on a DAX mapping.
- Convert the ACPI NFIT 'BLK' driver to map the block apertures as
cacheable to improve performance.
- Miscellaneous updates and fixes to libnvdimm including support for
issuing "address range scrub" commands, clarifying the optimal
'sector size' of pmem devices, a clarification of the usage of the
ACPI '_STA' (status) property for DIMM devices, and other minor
fixes"
* tag 'libnvdimm-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (34 commits)
libnvdimm, pmem: direct map legacy pmem by default
libnvdimm, pmem: 'struct page' for pmem
libnvdimm, pfn: 'struct page' provider infrastructure
x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
add devm_memremap_pages
mm: ZONE_DEVICE for "device memory"
mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
dax: drop size parameter to ->direct_access()
nd_blk: change aperture mapping from WC to WB
nvdimm: change to use generic kvfree()
pmem, dax: have direct_access use __pmem annotation
dax: update I/O path to do proper PMEM flushing
pmem: add copy_from_iter_pmem() and clear_pmem()
pmem, x86: clean up conditional pmem includes
pmem: remove layer when calling arch_has_wmb_pmem()
pmem, x86: move x86 PMEM API to new pmem.h header
libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
pmem: switch to devm_ allocations
devres: add devm_memremap
libnvdimm, btt: write and validate parent_uuid
...
Commit 3a9ad0b ("PCI: Add pci_bus_addr_t") unconditionally introduced usage of
64-bit PCI bus addresses on all 64-bit platforms which broke PA-RISC.
It turned out that, with 64-bit addresses enabled, the PCI logic decided
to use the GMMIO instead of the LMMIO region. This commit simply disables
registering the GMMIO, so we fall back to using the LMMIO region as before.
Reverts commit 45ea2a5fed
("PCI: Don't use 64-bit bus addresses on PA-RISC")
To: linux-parisc@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Meelis Roos <mroos@linux.ee>
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Helge Deller <deller@gmx.de>
Pull irq updates from Thomas Gleixner:
"This updated pull request does not contain the last few GIC related
patches which were reported to cause a regression. There is a fix
available, but I let it breed for a couple of days first.
The irq department provides:
- new infrastructure to support non PCI based MSI interrupts
- a couple of new irq chip drivers
- the usual pile of fixlets and updates to irq chip drivers
- preparatory changes for removal of the irq argument from interrupt
flow handlers
- preparatory changes to remove IRQF_VALID"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
irqchip/imx-gpcv2: IMX GPCv2 driver for wakeup sources
irqchip: Add bcm2836 interrupt controller for Raspberry Pi 2
irqchip: Add documentation for the bcm2836 interrupt controller
irqchip/bcm2835: Add support for being used as a second level controller
irqchip/bcm2835: Refactor handle_IRQ() calls out of MAKE_HWIRQ
PCI: xilinx: Fix typo in function name
irqchip/gic: Ensure gic_cpu_if_up/down() programs correct GIC instance
irqchip/gic: Only allow the primary GIC to set the CPU map
PCI/MSI: pci-xgene-msi: Consolidate chained IRQ handler install/remove
unicore32/irq: Prepare puv3_gpio_handler for irq argument removal
tile/pci_gx: Prepare trio_handle_level_irq for irq argument removal
m68k/irq: Prepare irq handlers for irq argument removal
C6X/megamode-pic: Prepare megamod_irq_cascade for irq argument removal
blackfin: Prepare irq handlers for irq argument removal
arc/irq: Prepare idu_cascade_isr for irq argument removal
sparc/irq: Use access helper irq_data_get_affinity_mask()
sparc/irq: Use helper irq_data_get_irq_handler_data()
parisc/irq: Use access helper irq_data_get_affinity_mask()
mn10300/irq: Use access helper irq_data_get_affinity_mask()
irqchip/i8259: Prepare i8259_irq_dispatch for irq argument removal
...
Merge tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC 64-bit changes from Olof Johansson:
"Here's our branch of ARM64 contents for this merge window.
Most of this is DT contents for new SoCs (or those who have seen new
device support added). Maybe we should stop separating out the arm64
contents here to avoid the kind of internal conflicts as we got this
time around, where 32- and 64-bit contents conflicted.
Anyhow, on the actual contents:
New SoCs:
- Broadcom North Star 2 (ns2)
- Marvell Berlin4CT
- Mediatek MT6795
- Rockchip RK3368
In addition, there are enhancements for the following platforms:
- Mediatek MT8173: cpuidle-dt updates, misc other additions
- ZynqMP: A bunch of devices added to the existing DTSI
- Qualcomm MSM8916 and APQ8016 updates for USB, etc.
+ a handful of other updates for various platforms"
* tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (47 commits)
ARM64: dts: vexpress: Use assigned-clock-parents for sp810
ARM64: dts: mt6795: enable basic SMP bringup for MT6795
arm64: Enable Marvell Berlin SoC family in defconfig
arm64: Enable Marvell Berlin SoC family in Kconfig
arm64: dts: Add dts files for Marvell Berlin4CT SoC
ARM64: zynqmp: Move SPI nodes to the right location
ARM64: zynqmp: Move uart and ttcs to the right location
ARM64: zynqmp: Enable spi flashes on ep108
ARM64: zynqmp: Add eeprom memories on i2c bus
ARM64: zynqmp: Enable sdhci on ep108
ARM64: zynqmp: Enable watchdog on ep108
ARM64: zynqmp: Add DWC3 usb support
ARM64: zynqmp: Add SMMU support
ARM64: zynqmp: Add CANs node for platform
ARM64: zynqmp: Use zynqmp specific compatible string for gpio
devicetree: xilinx: zynqmp: add sata node
PCI: iproc: Fix BCMA dependency in Kconfig
arm64: dts: Add Broadcom North Star 2 support
arm64: Add Broadcom iProc family support
PCI: iproc: Fix ARM64 dependency in Kconfig
...
Pull x86 mm updates from Ingo Molnar:
"The dominant change in this cycle was the continued work to isolate
kernel drivers from MTRR legacies: this tree gets rid of all kernel
internal driver interfaces to MTRRs (mostly by rewriting it to proper
PAT interfaces), the only access left is the /proc/mtrr ABI.
This work was done by Luis R Rodriguez.
There are also some related PCI interface additions for which I've
Cc:-ed Bjorn"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
x86/mm/mtrr: Remove kernel internal MTRR interfaces: unexport mtrr_add() and mtrr_del()
s390/io: Add pci_iomap_wc() and pci_iomap_wc_range()
drivers/dma/iop-adma: Use dma_alloc_writecombine() kernel-style
drivers/video/fbdev/vt8623fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/s3fb: Use arch_phys_wc_add() and pci_iomap_wc()
drivers/video/fbdev/arkfb.c: Use arch_phys_wc_add() and pci_iomap_wc()
PCI: Add pci_iomap_wc() variants
drivers/video/fbdev/gxt4500: Use pci_ioremap_wc_bar() to map framebuffer
drivers/video/fbdev/kyrofb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
drivers/video/fbdev/i740fb: Use arch_phys_wc_add() and pci_ioremap_wc_bar()
PCI: Add pci_ioremap_wc_bar()
x86/mm: Make kernel/check.c explicitly non-modular
x86/mm/pat: Make mm/pageattr[-test].c explicitly non-modular
x86/mm/pat: Add comments to cachemode translation tables
arch/*/io.h: Add ioremap_uc() to all architectures
drivers/video/fbdev/atyfb: Use arch_phys_wc_add() and ioremap_wc()
drivers/video/fbdev/atyfb: Replace MTRR UC hole with strong UC
drivers/video/fbdev/atyfb: Clarify ioremap() base and length used
drivers/video/fbdev/atyfb: Carve out framebuffer length fudging into a helper
x86/mm, asm-generic: Add IOMMU ioremap_uc() variant default
...
Commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel
doesn't support MSI") changed the location of the code that initialises
dev->msi_cap/msix_cap and then disables MSI/MSI-X interrupts at PCI
probe time in devices that have this flag set. It moved the code from
pci_msi_init_pci_dev() to a new function named pci_msi_setup_pci_dev(),
called by pci_setup_device().
The pseries PCI probing code does not call pci_setup_device(), so since
the aforementioned commit the function pci_msi_setup_pci_dev() is not
called and MSI/MSI-X interrupts are left enabled. Additionally, because
dev->msi_cap/msix_cap are not initialised, no driver can ever enable
MSI/MSI-X.
To fix this, the pseries PCI probe should manually call
pci_msi_setup_pci_dev(), so this patch makes it non-static.
Fixes: 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")
[mpe: Update change log to mention dev->msi_cap/msix_cap]
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This lets drivers take advantage of PAT when available. It
should help with the transition of converting video drivers over
to ioremap_wc() to help with the goal of eventually using
_PAGE_CACHE_UC over _PAGE_CACHE_UC_MINUS on x86 on
ioremap_nocache(), see:
de33c442ed ("x86 PAT: fix performance drop for glx, use UC minus for ioremap(), ioremap_nocache() and pci_mmap_page_range()")
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: <syrjala@sci.fi>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Antonino Daplas <adaplas@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Ville Syrjälä <syrjala@sci.fi>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: airlied@linux.ie
Cc: benh@kernel.crashing.org
Cc: dan.j.williams@intel.com
Cc: konrad.wilk@oracle.com
Cc: linux-fbdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: mst@redhat.com
Cc: vinod.koul@intel.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/1440443613-13696-2-git-send-email-mcgrof@do-not-panic.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On multi-function JMicron SATA/PATA/AHCI devices, the PATA controller at
function 1 doesn't work if it is powered on before the SATA controller at
function 0. The result is that PATA doesn't work after resume, and we
print messages like this:
pata_jmicron 0000:02:00.1: Refused to change power state, currently in D3
irq 17: nobody cared (try booting with the "irqpoll" option)
Async resume was introduced in v3.15 by 76569faa62 ("PM / sleep:
Asynchronous threads for resume_noirq"). Prior to that, we powered on
the functions in order, so this problem shouldn't happen.
e6b7e41cdd ("ata: Disabling the async PM for JMicron chip 363/361")
solved the problem for JMicron 361 and 363 devices. With async suspend
disabled, we always power on function 0 before function 1.
Barto then reported the same problem with a JMicron 368 (see comment #57 in
the bugzilla).
Rather than extending the blacklist piecemeal, disable async suspend for
all JMicron multi-function SATA/PATA/AHCI devices.
This quirk could stay in the ahci and pata_jmicron drivers, but it's likely
the problem will occur even if pata_jmicron isn't loaded until after the
suspend/resume. Making it a PCI quirk ensures that we'll preserve the
power-on order even if the drivers aren't loaded.
[bhelgaas: changelog, limit to multi-function, limit to IDE/ATA]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=81551
Reported-and-tested-by: Barto <mister.freeman@laposte.net>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.15+
* pci/host-dra7xx:
PCI: dra7xx: Remove unneeded use of IS_ERR_VALUE()
* pci/host-imx6:
PCI: imx6: Simplify a trivial if-return sequence
* pci/host-spear:
PCI: spear: Use BUG_ON() instead of condition followed by BUG()
Firmware typically configures the PCIe fabric with a consistent Max Payload
Size setting based on the devices present at boot. A hot-added device
typically has the power-on default MPS setting (128 bytes), which may not
match the fabric.
The previous Linux default, in the absence of any "pci=pcie_bus_*" options,
was PCIE_BUS_TUNE_OFF, in which we never touch MPS, even for hot-added
devices.
Add a new default setting, PCIE_BUS_DEFAULT, in which we make sure every
device's MPS setting matches the upstream bridge. This makes it more
likely that a hot-added device will work in a system with optimized MPS
configuration.
Note that if we hot-add a device that only supports 128-byte MPS, it still
likely won't work because we don't reconfigure the rest of the fabric.
Booting with "pci=pcie_bus_peer2peer" is a workaround for this because it
sets MPS to 128 for everything.
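The policy described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct fake_pcie_dev and match_bridge_mps() are hypothetical names, not the kernel's struct pci_dev or any PCI core function, and the real code reads and writes the Device Control register rather than a struct field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PCIe function (not struct pci_dev). */
struct fake_pcie_dev {
	const char *name;
	int mps;			/* currently programmed Max Payload Size, in bytes */
	int mpss;			/* largest Max Payload Size supported, in bytes */
	struct fake_pcie_dev *bridge;	/* upstream bridge, NULL on the root bus */
};

/*
 * Make the device's MPS match its upstream bridge. Returns false when the
 * device cannot support the bridge's setting; in that case nothing is
 * changed, which mirrors the note above that the rest of the fabric is
 * not reconfigured.
 */
static bool match_bridge_mps(struct fake_pcie_dev *dev)
{
	if (!dev->bridge)
		return true;		/* nothing upstream to match */
	if (dev->mps == dev->bridge->mps)
		return true;		/* already consistent */
	if (dev->bridge->mps > dev->mpss)
		return false;		/* device cannot go that high */

	dev->mps = dev->bridge->mps;
	return true;
}
```

A hot-added device that powers on at 128 bytes below a bridge set to 256 bytes would be raised to 256 if it supports that size; a device limited to 128 bytes would be left alone and reported as a mismatch.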
[bhelgaas: changelog, new default, rework for pci_configure_device() path]
Tested-by: Keith Busch <keith.busch@intel.com>
Tested-by: Jordan Hargrave <jharg93@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Meelis and Helge reported that 3a9ad0b4fd ("PCI: Add pci_bus_addr_t")
caused HPMCs on A500 and hangs on rp5470.
PA-RISC does not set ARCH_DMA_ADDR_T_64BIT, even for 64-bit kernels, so
prior to 3a9ad0b4fd, we always used 32-bit PCI addresses. After
3a9ad0b4fd, we do use 64-bit PCI addresses in 64-bit kernels, and
apparently there's some PA-RISC problem related to them.
Fixes: 3a9ad0b4fd ("PCI: Add pci_bus_addr_t")
Link: http://lkml.kernel.org/r/alpine.LRH.2.11.1507260929000.30065@math.ut.ee
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Helge Deller <deller@gmx.de>
Tested-by: Helge Deller <deller@gmx.de>
Based-on-idea-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
CC: stable@vger.kernel.org # v3.19+
Previously we checked for invalid MPS settings, i.e., a device with MPS
different than its upstream bridge, in pcie_bus_detect_mps(). We only did
this if the arch or hotplug driver called pcie_bus_configure_settings(),
and then only if PCIe bus tuning was disabled (PCIE_BUS_TUNE_OFF).
Move the MPS checking code to pci_configure_device(), so we do it in the
pci_device_add() path for every device.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
of_parse_phandle() returns a device_node pointer with the refcount
incremented. We should dispose of this reference when we're finished.
Drop the reference acquired by of_parse_phandle().
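The get/put pairing can be modeled in userspace C roughly as follows. All names here (fake_node, fake_parse_phandle(), fake_node_put(), use_msi_parent()) are hypothetical stand-ins; in the kernel the reference is normally released with of_node_put().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device_node, reduced to a reference count. */
struct fake_node {
	int refcount;
};

/* Models a lookup that, like of_parse_phandle(), returns the node with its refcount raised. */
static struct fake_node *fake_parse_phandle(struct fake_node *target)
{
	if (target)
		target->refcount++;
	return target;
}

/* Models dropping one reference; a NULL node is ignored. */
static void fake_node_put(struct fake_node *node)
{
	if (node)
		node->refcount--;
}

/* Look the node up, use it briefly, and release the reference before returning. */
static int use_msi_parent(struct fake_node *target)
{
	struct fake_node *np = fake_parse_phandle(target);

	if (!np)
		return -1;	/* no node found, so no reference was taken */

	/* ... consult np here ... */

	fake_node_put(np);	/* balance the reference taken by the lookup */
	return 0;
}
```

After use_msi_parent() returns, the node's reference count is the same as before the call, which is the property this patch restores.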
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The pcibios_msi_controller() hook was only implemented by ARM, and it sets
pci_bus->msi now, so it doesn't need this hook anymore.
Remove the unused pcibios_msi_controller() hook.
[bhelgaas: changelog, split into separate patch]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
ARM previously stored the msi_controller pointer in its sysdata, struct
pci_sys_data, and implemented pcibios_msi_controller() to retrieve it.
That made PCI host controller drivers specific to ARM because they had to
put the msi_controller pointer in the ARM-specific pci_sys_data.
There is now a generic mechanism, pci_scan_root_bus_msi(), for giving the
msi_controller pointer to the PCI core. Use this for all ARM systems and
for the DesignWare and Xilinx PCI host controller drivers.
This removes an ARM dependency from the DesignWare, DRA7xx, EXYNOS, i.MX6,
Keystone, Layerscape, SPEAr13xx, and Xilinx drivers.
[bhelgaas: changelog, split into separate patch]
Suggested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Simon Horman <horms@verge.net.au>
CC: Russell King <linux@arm.linux.org.uk>
CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
CC: Thierry Reding <thierry.reding@gmail.com>
CC: Michal Simek <michal.simek@xilinx.com>
CC: Marc Zyngier <marc.zyngier@arm.com>
Add a pci_scan_root_bus_msi() interface so an arch can specify the MSI
controller up front. This removes the need for a pcibios callback to set
the MSI controller later.
This is not exported because I'd like to replace the variety of "scan root
bus" interfaces with a single, more extensible interface that can handle
the MSI controller, domain, pci_ops, resources, etc. I hope this interface
is temporary.
[bhelgaas: changelog, split into separate patch]
Suggested-by: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Make pci-host-generic driver (kernel option PCI_HOST_GENERIC) available on
arm64.
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
ARM64 requires setup-irq.o to provide the pci_fixup_irqs() implementation. We
are adding this now to support the pci-host-generic host controller, but we
enable it for ARM64 PCI so that other host controllers can use this as
well.
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The generic OF-based host controller driver uses pci_common_init_dev(),
which is ARM-specific and requires the ARM struct hw_pci. The part of
pci_common_init_dev() that is needed is limited and can be done here
without using hw_pci.
Note that the ARM pcibios functions expect the PCI sysdata to be a pointer
to a struct pci_sys_data. Add a struct pci_sys_data as the first element
in struct gen_pci so that when we use a gen_pci pointer as sysdata, it is
also a pointer to a struct pci_sys_data.
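The first-element layout trick can be illustrated with a small userspace sketch. The struct and function names below are hypothetical placeholders for struct pci_sys_data and struct gen_pci, which have many more fields; the sketch relies only on the C rule that a pointer to a struct also points to its first member.

```c
#include <stddef.h>

/* Hypothetical stand-in for the arch-specific sysdata type. */
struct fake_sys_data {
	int domain;
};

/* Hypothetical stand-in for the driver's private struct. */
struct fake_gen_pci {
	struct fake_sys_data sys;	/* must stay the first member */
	int host_private;
};

/* Fail the build if someone moves the embedded struct away from offset 0. */
_Static_assert(offsetof(struct fake_gen_pci, sys) == 0,
	       "sys must be the first member");

/* Arch-style code that only knows about the embedded type. */
static int arch_get_domain(void *sysdata)
{
	struct fake_sys_data *sd = sysdata;

	return sd->domain;
}

/* Driver code recovering its own struct from the very same pointer. */
static int driver_get_private(void *sysdata)
{
	struct fake_gen_pci *pci = sysdata;

	return pci->host_private;
}
```

Both helpers can be handed the same sysdata pointer, so arch code and the driver each see the type they expect without any extra lookup.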
Create and scan the root bus directly without using the ARM
pci_common_init_dev() interface.
[bhelgaas: changelog, move pcie_bus_configure_settings() before
pci_bus_add_devices(), combine !PCI_PROBE_ONLY blocks]
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Jayachandran C <jchandra@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Simplify a trivial if-return sequence by combining it with a preceding
function call.
The semantic patch that makes this change is available in
scripts/coccinelle/misc/simple_return.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Lucas Stach <l.stach@pengutronix.de>
Use BUG_ON() instead of an if condition followed by BUG().
The semantic patch that makes this change is available in
scripts/coccinelle/misc/bugon.cocci.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
There is no need to use the IS_ERR_VALUE() macro for checking the return
value from pm_runtime_* functions.
Test for a negative pm_runtime_get_sync() return value instead of using
IS_ERR_VALUE().
The semantic patch that makes this change is available in
scripts/coccinelle/api/pm_runtime.cocci.
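As a rough userspace illustration of the resulting pattern, the check reduces to a plain sign test. fake_pm_runtime_get_sync() and fake_probe() are hypothetical stubs, not the driver's code; the stub follows the runtime PM convention of returning a negative errno on failure and zero or a positive value on success.

```c
/*
 * Hypothetical stub standing in for pm_runtime_get_sync(): it simply
 * returns the value it is given, so callers can simulate success
 * (0 or positive) and failure (negative errno).
 */
static int fake_pm_runtime_get_sync(int simulated_result)
{
	return simulated_result;
}

/* Probe-style caller that tests the sign instead of using IS_ERR_VALUE(). */
static int fake_probe(int simulated_result)
{
	int ret = fake_pm_runtime_get_sync(simulated_result);

	if (ret < 0)
		return ret;	/* propagate the negative errno */

	return 0;		/* 0 and positive values both mean success */
}
```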
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Kishon Vijay Abraham I <kishon@ti.com>
We should not assume any particular hardware topology. Commit d0751b98df
("PCI: Add dev->has_secondary_link to track downstream PCIe links") relied
on the assumption that every PCIe hierarchy is rooted at a Root Port. But
we can't rely on any assumption about what hardware we will find; we just
have to deal with the world as it is.
On some platforms, PCIe devices (endpoints, switch upstream ports, etc.)
appear directly on the root bus, and there is no Root Port in the PCI bus
hierarchy. For example, Meelis observed these top-level devices on a
Sparc V245:
0000:02:00.0 PCI bridge to [bus 03-0d] Switch Upstream Port
0001:02:00.0 PCI bridge to [bus 03] PCIe to PCI/PCI-X Bridge
These devices *look* like they have links going upstream, but there really
are no upstream devices.
In set_pcie_port_type(), we used the parent device to figure out which side
of a switch port has a link, so if the parent device did not exist, we
dereferenced a NULL parent pointer.
Check whether the parent device exists before dereferencing it.
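A simplified userspace sketch of the guarded lookup follows. struct fake_dev and fake_set_port_link() are hypothetical; the real set_pcie_port_type() also examines the PCIe port type, which is omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a switch port; not the kernel's struct pci_dev. */
struct fake_dev {
	bool has_secondary_link;
	struct fake_dev *parent;	/* upstream bridge, NULL on the root bus */
};

/*
 * Decide which side of a switch port has the link by looking at the
 * parent. The parent pointer is tested first because a port can sit
 * directly on the root bus with no upstream device at all.
 */
static void fake_set_port_link(struct fake_dev *dev)
{
	struct fake_dev *parent = dev->parent;

	if (parent && !parent->has_secondary_link)
		dev->has_secondary_link = true;
}
```

With the guard in place, a port whose parent is NULL is simply left unchanged instead of triggering a NULL dereference.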
Meelis observed this oops on Sparc V245 and T2000. Ben Herrenschmidt says
this is also possible on IBM PowerVM guests on PowerPC.
[bhelgaas: changelog, comment]
Link: http://lkml.kernel.org/r/alpine.LRH.2.20.1508122118210.18637@math.ut.ee
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
There's a typo in commit e39758e0ea in linux-next, which incorrectly
spells "msi_desc_to_pci_sysdata()" as "msi_desc_to_pci_sys_data()" and
causes build failure:
> ../drivers/pci/host/pcie-xilinx.c:235:3: error: implicit declaration
of function 'msi_desc_to_pci_sys_data' [-Werror=implicit-function-declaration]
Fixes: e39758e0ea "PCI: Use helper functions to access fields in struct msi_desc"
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Brown <broonie@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Sören Brinkmann <soren.brinkmann@xilinx.com>
Cc: Srikanth Thokala <sthokal@xilinx.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Yijing Wang <wangyijing@huawei.com>
Link: http://lkml.kernel.org/r/1439912763-10645-1-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* pci/host-dra7xx:
ARM: dts: am57xx-evm: Add 'gpios' property with gpio2_8
PCI: dra7xx: Add support to make GPIO drive PERST# line
PCI: dra7xx: Clear MSE bit during suspend so clocks will idle
PCI: dra7xx: Add PM support
PCI: dra7xx: Disable pm_runtime on get_sync failure
* pci/host-iproc:
PCI: iproc: Allow BCMA bus driver to be built as module
PCI: iproc: Add arm64 support
PCI: iproc: Delete unnecessary checks before phy calls
* pci/hotplug:
PCI: pciehp: Remove ignored MRL sensor interrupt events
PCI: pciehp: Remove unused interrupt events
PCI: pciehp: Handle invalid data when reading from non-existent devices
PCI: Hold pci_slot_mutex while searching bus->slots list
PCI: Protect pci_bus->slots with pci_slot_mutex, not pci_bus_sem
PCI: pciehp: Simplify pcie_poll_cmd()
PCI: Use "slot" and "pci_slot" for struct hotplug_slot and struct pci_slot
* pci/iommu:
PCI: Remove pci_ats_enabled()
PCI: Stop caching ATS Invalidate Queue Depth
PCI: Move ATS declarations to linux/pci.h so they're all together
PCI: Clean up ATS error handling
PCI: Use pci_physfn() rather than looking up physfn by hand
PCI: Inline the ATS setup code into pci_ats_init()
PCI: Rationalize pci_ats_queue_depth() error checking
PCI: Reduce size of ATS structure elements
PCI: Embed ATS info directly into struct pci_dev
PCI: Allocate ATS struct during enumeration
iommu/vt-d: Cache PCI ATS state and Invalidate Queue Depth
* pci/irq:
PCI: Kill off set_irq_flags() usage
* pci/virtualization:
PCI: Add ACS quirks for Intel I219-LM/V
Remove pci_ats_enabled(). There are no callers outside the ATS code
itself. We don't need to check ats_cap, because if we don't find an ATS
capability, we'll never set ats_enabled.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Stop caching the Invalidate Queue Depth in struct pci_dev.
pci_ats_queue_depth() is typically called only once per device, and it
returns a fixed value per-device, so callers who need the value frequently
can cache it themselves.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
There's no need to BUG() if we enable ATS when it's already enabled. We
don't need to BUG() when disabling ATS on a device that doesn't support ATS
or if it's already disabled. If ATS is enabled, certainly we found an ATS
capability in the past, so it should still be there now.
Clean up these error paths.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Use the pci_physfn() helper rather than looking up physfn by hand.
No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The ATS setup code in ats_alloc_one() is only used by pci_ats_init(), so
inline it there. No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
We previously returned -ENODEV for devices that don't support ATS (except
that we always returned 0 for VFs, whether or not they support ATS).
For consistency, always return -EINVAL (not -ENODEV) if the device doesn't
support ATS. Return zero for VFs that support ATS.
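The return convention described above can be sketched as a small user-space model. This is not the kernel's code: `struct ats_model_dev`, its fields, and `model_ats_queue_depth()` are hypothetical names used only for illustration, and the real function reads the depth from the ATS capability rather than from a struct field.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of struct pci_dev that matter here. */
struct ats_model_dev {
	bool has_ats_cap;	/* device exposes an ATS capability */
	bool is_virtfn;		/* device is an SR-IOV VF */
	int queue_depth;	/* depth the capability would report */
};

/*
 * Model of the convention in the changelog: -EINVAL without ATS,
 * 0 for a VF that supports ATS, the reported depth otherwise.
 */
int model_ats_queue_depth(const struct ats_model_dev *dev)
{
	if (!dev->has_ats_cap)
		return -EINVAL;
	if (dev->is_virtfn)
		return 0;
	return dev->queue_depth;
}
```

Checking for the capability first is what removes the old inconsistency: a VF without ATS now gets the same error as any other device without ATS.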
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The pci_ats struct is small and will get smaller, so I don't think it's
worth allocating it separately from the pci_dev struct.
Embed the ATS fields directly into struct pci_dev.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Previously, we allocated pci_ats structures when an IOMMU driver called
pci_enable_ats(). An SR-IOV VF shares the STU setting with its PF, so when
enabling ATS on the VF, we allocated a pci_ats struct for the PF if it
didn't already have one. We held the sriov->lock to serialize threads
concurrently enabling ATS on several VFs so only one would allocate the PF
pci_ats.
Gregor reported a deadlock here:
  pci_enable_sriov
    sriov_enable
      virtfn_add
        mutex_lock(dev->sriov->lock)                    # acquire sriov->lock
        pci_device_add
          device_add
            BUS_NOTIFY_ADD_DEVICE notifier chain
            iommu_bus_notifier
              amd_iommu_add_device                      # iommu_ops.add_device
                init_iommu_group
                  iommu_group_get_for_dev
                    iommu_group_add_device
                      __iommu_attach_device
                        amd_iommu_attach_device         # iommu_ops.attach_device
                          attach_device
                            pci_enable_ats
                              mutex_lock(dev->sriov->lock) # deadlock
There's no reason to delay allocating the pci_ats struct, and if we
allocate it for each device at enumeration-time, there's no need for
locking in pci_enable_ats().
Allocate pci_ats struct during enumeration, when we initialize other
capabilities.
Note that this implementation requires ATS to be enabled on the PF first,
before it is enabled on any of the VFs, because the PF controls the STU for
all the VFs.
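The locking problem in the call chain above can be illustrated with a simplified user-space model. Everything here is an assumption made for illustration: `struct model_lock` is a hypothetical non-recursive lock that reports -EDEADLK where a real mutex would block forever, and the `model_*` functions only mimic the shape of the old and new paths.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock: a second acquire reports the deadlock. */
struct model_lock {
	bool held;
};

int model_lock_acquire(struct model_lock *l)
{
	if (l->held)
		return -EDEADLK;	/* a real mutex would block forever here */
	l->held = true;
	return 0;
}

void model_lock_release(struct model_lock *l)
{
	l->held = false;
}

/* Old scheme: enabling ATS takes the lock to allocate the PF state lazily. */
int model_enable_ats_old(struct model_lock *sriov_lock)
{
	int ret = model_lock_acquire(sriov_lock);

	if (ret)
		return ret;
	model_lock_release(sriov_lock);
	return 0;
}

/* New scheme: state was set up at enumeration, so no lock is needed. */
int model_enable_ats_new(struct model_lock *sriov_lock)
{
	(void)sriov_lock;
	return 0;
}

/* virtfn_add-like path: holds the lock while the device-add notifiers run. */
int model_virtfn_add(struct model_lock *sriov_lock,
		     int (*enable_ats)(struct model_lock *))
{
	int ret = model_lock_acquire(sriov_lock);

	if (ret)
		return ret;
	ret = enable_ats(sriov_lock);	/* reached via the notifier chain */
	model_lock_release(sriov_lock);
	return ret;
}
```

Passing `model_enable_ats_old` to `model_virtfn_add()` hits the second acquire while the first is still held; passing `model_enable_ats_new` does not touch the lock at all.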
Link: http://permalink.gmane.org/gmane.linux.kernel.iommu/9433
Reported-by: Gregor Dick <gdick@solarflare.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
The PERST# line in am57x-evm is connected to a GPIO line and PERST# should
be driven high to indicate the clocks are stable (as per Figure 2-10: Power
Up of the PCIe CEM spec 3.0).
Add support to make GPIO drive PERST# line.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
DRA7xx requires the MSE bit to be cleared to set the master in standby
mode. (DRA7xx TRM_vE, section 24.9.4.5.2.2.1 "PCIe Controller Master
Standby Behavior", advises clearing the local MSE bit to set the master in
standby. Without this, some of the clocks do not idle.)
Clear the MSE bit on suspend and set it again on resume so the clocks can
idle after suspend.
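A minimal sketch of the suspend/resume handling described above, assuming the command register can be modeled as a plain variable. `struct dra7xx_model` and the two functions are hypothetical; the real driver reads and writes the register through its own accessors.

```c
#include <stdint.h>

#define PCI_COMMAND_MEMORY	0x2	/* Memory Space Enable (MSE) bit */

/* Hypothetical model: the command register is just a variable here. */
struct dra7xx_model {
	uint16_t command;
};

/* Suspend path: clear MSE so the master can enter standby. */
void model_dra7xx_suspend(struct dra7xx_model *p)
{
	p->command &= (uint16_t)~PCI_COMMAND_MEMORY;
}

/* Resume path: set MSE again. */
void model_dra7xx_resume(struct dra7xx_model *p)
{
	p->command |= PCI_COMMAND_MEMORY;
}
```

Only the MSE bit changes in either direction; the other command bits are left as they were.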
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Add PM support to pci-dra7xx so PCI clocks can be disabled during suspend
and enabled during resume without affecting PCI functionality.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Fix the error handling when pm_runtime_get_sync() fails.
If pm_runtime_get_sync() fails, call pm_runtime_disable() so there are no
unbalanced pm_runtime_enable() calls.
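The balanced error path can be sketched with stubbed runtime PM calls. This is a user-space model under stated assumptions: `struct rpm_model` and the `model_*` functions are hypothetical stand-ins, and the stubbed get_sync simply returns a preset value.

```c
#include <errno.h>

/* Hypothetical device model that tracks the enable/disable balance. */
struct rpm_model {
	int rpm_enabled;	/* enable calls minus disable calls */
	int get_sync_ret;	/* value the stubbed get_sync returns */
};

void model_pm_runtime_enable(struct rpm_model *d)
{
	d->rpm_enabled++;
}

void model_pm_runtime_disable(struct rpm_model *d)
{
	d->rpm_enabled--;
}

int model_pm_runtime_get_sync(struct rpm_model *d)
{
	return d->get_sync_ret;
}

/* Probe-like function: undo the enable if get_sync fails. */
int model_probe(struct rpm_model *d)
{
	int ret;

	model_pm_runtime_enable(d);
	ret = model_pm_runtime_get_sync(d);
	if (ret < 0)
		goto err_get_sync;

	return 0;

err_get_sync:
	model_pm_runtime_disable(d);
	return ret;
}
```

After a failed probe the enable count in the model is back to zero, which is the balance the changelog asks for.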
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
Change CONFIG_PCIE_IPROC_BCMA to tristate to make it possible to build this
driver as a module.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ray Jui <rjui@broadcom.com>
The Intel 100-series chipset now includes the integrated Ethernet as part
of a multifunction package. The Ethernet function does not include native
ACS support, but Intel confirms that the device is not capable of
peer-to-peer within the package. We can therefore quirk it to expose the
isolation.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: John Ronciak <john.ronciak@gmail.com>
set_irq_flags is ARM-specific with custom flags which have genirq
equivalents. Convert drivers to use the genirq interfaces directly, so we
can kill off set_irq_flags. The translation of flags is as follows:
IRQF_VALID -> !IRQ_NOREQUEST
IRQF_PROBE -> !IRQ_NOPROBE
IRQF_NOAUTOEN -> IRQ_NOAUTOEN
For IRQs managed by an irqdomain, the irqdomain core code handles clearing
and setting IRQ_NOREQUEST already, so there is no need to do this in .map()
functions, and we can simply remove the set_irq_flags calls. Some users
also modify IRQ_NOPROBE, and this has been maintained although it is not
clear that it is really needed. There appears to be a great deal of blind
copy and paste of this code.
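The flag translation listed above can be written out as a small function. This is a user-space sketch, not kernel code: `model_translate_irq_flags()` is a hypothetical helper, and the numeric values of the macros are illustrative assumptions rather than definitions to rely on.

```c
/* ARM-specific flags formerly passed to set_irq_flags(); values illustrative. */
#define IRQF_VALID	(1u << 0)
#define IRQF_PROBE	(1u << 1)
#define IRQF_NOAUTOEN	(1u << 2)

/* genirq status flags; values illustrative. */
#define IRQ_NOPROBE	(1u << 10)
#define IRQ_NOREQUEST	(1u << 11)
#define IRQ_NOAUTOEN	(1u << 12)

/* Return the genirq status bits that correspond to the old ARM flags. */
unsigned int model_translate_irq_flags(unsigned int iflags)
{
	unsigned int status = 0;

	if (!(iflags & IRQF_VALID))
		status |= IRQ_NOREQUEST;	/* IRQF_VALID -> !IRQ_NOREQUEST */
	if (!(iflags & IRQF_PROBE))
		status |= IRQ_NOPROBE;		/* IRQF_PROBE -> !IRQ_NOPROBE */
	if (iflags & IRQF_NOAUTOEN)
		status |= IRQ_NOAUTOEN;		/* IRQF_NOAUTOEN -> IRQ_NOAUTOEN */

	return status;
}
```

In this model, passing only IRQF_VALID leaves IRQ_NOPROBE set, which is the IRQ_NOPROBE handling the changelog says was preserved for some users.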
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
CC: Kishon Vijay Abraham I <kishon@ti.com>
CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Thierry Reding <thierry.reding@gmail.com>
CC: Stephen Warren <swarren@wwwdotorg.org>
CC: Alexandre Courbot <gnurou@gmail.com>
CC: Jingoo Han <jingoohan1@gmail.com>
CC: Pratyush Anand <pratyush.anand@gmail.com>
CC: Simon Horman <horms@verge.net.au>
CC: Michal Simek <michal.simek@xilinx.com>
CC: "Sören Brinkmann" <soren.brinkmann@xilinx.com>
Quoting Arnd:
I was thinking the opposite approach and basically removing all uses
of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
them, and we can probably replace them all with hardcoded
ioremap_cached() calls in the cases they are actually useful.
All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
ioremap_nocache() if the resource is cacheable; however, ioremap() is
uncached by default. Clearly none of the existing usages care about the
cacheability. Particularly devm_ioremap_resource() never worked as
advertised since it always fell back to plain ioremap().
Clean this up as the new direction we want is to convert
ioremap_<type>() usages to memremap(..., flags).
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
We queued interrupt events for the MRL being opened or closed, but the code
in interrupt_event_handler() that handles these events ignored them.
Stop enabling MRL interrupts and remove the ignored events.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The list of interrupt events (INT_BUTTON_IGNORE, INT_PRESENCE_ON, etc.) was
copied from other hotplug drivers, but pciehp doesn't use them all.
Remove the interrupt events that aren't used by pciehp.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
It's platform-dependent, but an MMIO read to a non-existent PCI device
generally returns data with all bits set. This happens when the host
bridge or Root Complex times out waiting for a response from the device and
fabricates return data to complete the CPU's read.
One example, reported in the bugzilla below, involved this hierarchy:
pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
pci 0000:06:00.0: PCI bridge to [bus 07] Thunderbolt Downstream Port
pci 0000:07:00.0: BCM57762 NIC
Unplugging the Thunderbolt switch and the NIC below it resulted in this:
pciehp 0000:03:03.0: Surprise Removal
tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
pciehp 0000:06:00.0: unloading service driver pciehp
pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
pciehp 0000:06:00.0: Switch interrupt received
pciehp 0000:06:00.0: Latch open on Slot
pciehp 0000:06:00.0: Attention button interrupt received
pciehp 0000:06:00.0: Button pressed on Slot
pciehp 0000:06:00.0: Presence/Notify input change
pciehp 0000:06:00.0: Card present on Slot
pciehp 0000:06:00.0: Power fault interrupt received
pciehp 0000:06:00.0: Data Link Layer State change
pciehp 0000:06:00.0: Link Up event
The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
remove methods.
Since the NIC was already gone, tg3 received 0xffffffff when it tried to
read from the device. The resulting timeout is a tg3 issue and not of
interest here.
Similarly, since the 06:00.0 Thunderbolt switch was already gone,
pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
thought that was valid status showing that many events had happened: the
latch had been opened, the attention button had been pressed, a card was
now present, and the link was now up. These are all wrong, of course, but
pciehp went on to try to power up and enumerate devices below the
non-existent bridge:
pciehp 0000:06:00.0: PCI slot - powering on due to button press
pciehp 0000:06:00.0: Surprise Insertion
pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
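A minimal sketch of the kind of check this calls for: treat an all-ones Slot Status read as "no response" instead of decoding it as events. The `model_*` names and `MODEL_EVENT_MASK` are hypothetical; the bit values are the standard Slot Status event bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot Status event bits (standard PCIe register layout). */
#define PCI_EXP_SLTSTA_ABP	0x0001	/* Attention Button Pressed */
#define PCI_EXP_SLTSTA_PFD	0x0002	/* Power Fault Detected */
#define PCI_EXP_SLTSTA_MRLSC	0x0004	/* MRL Sensor Changed */
#define PCI_EXP_SLTSTA_PDC	0x0008	/* Presence Detect Changed */
#define PCI_EXP_SLTSTA_CC	0x0010	/* Command Completed */
#define PCI_EXP_SLTSTA_DLLSC	0x0100	/* Data Link Layer State Changed */

/* Hypothetical mask of the events an interrupt handler would look at. */
#define MODEL_EVENT_MASK	(PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PFD | \
				 PCI_EXP_SLTSTA_MRLSC | PCI_EXP_SLTSTA_PDC | \
				 PCI_EXP_SLTSTA_CC | PCI_EXP_SLTSTA_DLLSC)

/* An all-ones read means the device did not respond. */
bool model_read_is_valid(uint16_t sltsta)
{
	return sltsta != (uint16_t)~0;
}

/* Return the pending event bits, or 0 if the read was fabricated. */
uint16_t model_slot_events(uint16_t sltsta)
{
	if (!model_read_is_valid(sltsta))
		return 0;
	return sltsta & MODEL_EVENT_MASK;
}
```

Without the validity check, masking 0xffff with these event bits gives 0x11f, which is consistent with the "intr_loc 11f" line in the log above.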
[bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>