c770cb4cb5 ("PCI: Mark invalid BARs as unassigned") sets IORESOURCE_UNSET
if we fail to claim a resource. If we tried to claim a bridge window,
failed, clipped the window, and tried to claim the clipped window, we
failed again because of IORESOURCE_UNSET:
pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff window]
pci 0000:00:01.0: can't claim BAR 15 [mem 0xbdf00000-0xddefffff 64bit pref]: no compatible bridge window
pci 0000:00:01.0: [mem size 0x20000000 64bit pref] clipped to [mem size 0x1df00000 64bit pref]
pci 0000:00:01.0: bridge window [mem size 0x1df00000 64bit pref]
pci 0000:00:01.0: can't claim BAR 15 [mem size 0x1df00000 64bit pref]: no address assigned
The 00:01.0 window started as [mem 0xbdf00000-0xddefffff 64bit pref]. That
starts before the host bridge window [mem 0xc0000000-0xffffffff window], so
we clipped the 00:01.0 window to [mem 0xc0000000-0xddefffff 64bit pref].
But we left it marked IORESOURCE_UNSET, so the second claim failed when it
should have succeeded.
This means downstream devices will also fail for lack of resources, e.g.,
in the bugzilla below,
radeon 0000:01:00.0: Fatal error during GPU init
Clear IORESOURCE_UNSET when we clip a bridge window. Also clear
IORESOURCE_UNSET in our copy of the unclipped window so we can see exactly
what the original window was and how it now fits inside the upstream
window.
Fixes: c770cb4cb5 ("PCI: Mark invalid BARs as unassigned")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491#c47
Based-on-patch-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Based-on-patch-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
CC: stable@vger.kernel.org # v4.1+
David Ahern reported that d63e2e1f3d ("sparc/PCI: Clip bridge windows
to fit in upstream windows") fails to boot on sparc/T5-8:
pci 0000:06:00.0: reg 0x184: can't handle BAR above 4GB (bus address 0x110204000)
The problem is that sparc64 assumed that dma_addr_t only needed to hold DMA
addresses, i.e., bus addresses returned via the DMA API (dma_map_single(),
etc.), while the PCI core assumed dma_addr_t could hold *any* bus address,
including raw BAR values. On sparc64, all DMA addresses fit in 32 bits, so
dma_addr_t is a 32-bit type. However, BAR values can be 64 bits wide, so
they don't fit in a dma_addr_t. d63e2e1f3d added new checking that
tripped over this mismatch.
Add pci_bus_addr_t, which is wide enough to hold any PCI bus address,
including both raw BAR values and DMA addresses. This will be 64 bits
on 64-bit platforms and on platforms with a 64-bit dma_addr_t. Then
dma_addr_t only needs to be wide enough to hold addresses from the DMA API.
[bhelgaas: changelog, bugzilla, Kconfig to ensure pci_bus_addr_t is at
least as wide as dma_addr_t, documentation]
Fixes: d63e2e1f3d ("sparc/PCI: Clip bridge windows to fit in upstream windows")
Fixes: 23b13bc76f ("PCI: Fail safely if we can't handle BARs larger than 4GB")
Link: http://lkml.kernel.org/r/CAE9FiQU1gJY1LYrxs+ma5LCTEEe4xmtjRG0aXJ9K_Tsu+m9Wuw@mail.gmail.com
Link: http://lkml.kernel.org/r/1427857069-6789-1-git-send-email-yinghai@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=96231
Reported-by: David Ahern <david.ahern@oracle.com>
Tested-by: David Ahern <david.ahern@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David S. Miller <davem@davemloft.net>
CC: stable@vger.kernel.org # v3.19+
Use common resource list management data structure and interfaces
instead of private implementation.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add pci_bus_clip_resource(). If a PCI-PCI bridge window overlaps an
upstream bridge window but is not completely contained by it, this clips
the downstream window so it fits inside the upstream one.
No functional change (this adds the function but no callers).
[bhelgaas: changelog, split into separate patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <kordikmarek@gmail.com>
Fixes: 5b28541552 ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.16+
Move EXPORT_SYMBOL so it immediately follows the function or variable.
No functional change.
[bhelgaas: squash similar changes, fix hotplug, probe, rom, search, too]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_bus_add_device() always returns 0, so there's no point in returning
anything at all. Make it a void function and remove the tests of the
return value from the callers.
[bhelgaas: changelog, remove unused "err" from i82875p_setup_overfl_dev()]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
None of these files are actually using any __init type directives and hence
don't need to include <linux/init.h>. Most are just a left over from
__devinit and __cpuinit removal, or simply due to code getting copied from
one driver to the next.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pci_bus_alloc_resource() "type_mask" parameter is used to compare with
the "flags" member of a struct resource, so it should be the same type,
namely "unsigned long".
No functional change because all current IORESOURCE_* flags fit in 32 bits.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When allocating space from a bus resource, i.e., from apertures leading to
this bus, make sure the entire resource type matches. The previous code
assumed the IORESOURCE_TYPE_BITS field was a bitmask with only a single bit
set, but this is not true. IORESOURCE_TYPE_BITS is really an enumeration,
and we have to check all the bits.
See 72dcb11972 ("resources: Add register address resource type").
No functional change. If we used this path for allocating IRQs, DMA
channels, or bus numbers, this would fix a bug because those types are
indistinguishable when masked by IORESOURCE_IO | IORESOURCE_MEM. But we
don't, so this shouldn't make any difference.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Paul reported that after f75b99d5a7 ("PCI: Enforce bus address limits in
resource allocation") on a 32-bit kernel (CONFIG_PHYS_ADDR_T_64BIT not
set), intel-gtt complained "can't ioremap flush page - no chipset
flushing". In addition, other PCI resource allocations, e.g., for bridge
windows, failed.
This happens because we incorrectly skip bus resources of
[mem 0x00000000-0xffffffff] because we think they are of size zero.
When resource_size_t is 32 bits wide, resource_size() on
[mem 0x00000000-0xffffffff] returns 0 because (r->end - r->start + 1)
overflows.
Therefore, we can't use "resource_size() == 0" to decide that allocation
from this resource will fail. allocate_resource() should fail anyway if it
can't satisfy the address constraints, so we should just depend on that.
A [mem 0x00000000-0xffffffff] bus resource is obviously not really valid,
but we do fall back to it as a default when we don't have information about
host bridge apertures.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=71611
Fixes: f75b99d5a7 PCI: Enforce bus address limits in resource allocation
Reported-and-tested-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/resource:
PCI: Allocate 64-bit BARs above 4G when possible
PCI: Enforce bus address limits in resource allocation
PCI: Split out bridge window override of minimum allocation address
agp/ati: Use PCI_COMMAND instead of hard-coded 4
agp/intel: Use CPU physical address, not bus address, for ioremap()
agp/intel: Use pci_bus_address() to get GTTADR bus address
agp/intel: Use pci_bus_address() to get MMADR bus address
agp/intel: Support 64-bit GMADR
agp/intel: Rename gtt_bus_addr to gtt_phys_addr
drm/i915: Rename gtt_bus_addr to gtt_phys_addr
agp: Use pci_resource_start() to get CPU physical address for BAR
agp: Support 64-bit APBASE
PCI: Add pci_bus_address() to get bus address of a BAR
PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev
PCI: Change pci_bus_region addresses to dma_addr_t
Try to allocate space for 64-bit BARs above 4G first, to preserve the space
below 4G for 32-bit BARs. If there's no space above 4G available, fall
back to allocating anywhere.
[bhelgaas: reworked starting from http://lkml.kernel.org/r/1387485843-17403-2-git-send-email-yinghai@kernel.org]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When allocating space for 32-bit BARs, we previously limited RESOURCE
addresses so they would fit in 32 bits. However, the BUS address need not
be the same as the resource address, and it's the bus address that must fit
in the 32-bit BAR.
This patch adds:
- pci_clip_resource_to_region(), which clips a resource so it contains
only the range that maps to the specified bus address region, e.g., to
clip a resource to 32-bit bus addresses, and
- pci_bus_alloc_from_region(), which allocates space for a resource from
the specified bus address region,
and changes pci_bus_alloc_resource() to allocate space for 64-bit BARs from
the entire bus address region, and space for 32-bit BARs from only the bus
address region below 4GB.
If we had this window:
pci_root HWP0002:0a: host bridge window [mem 0xf0180000000-0xf01fedfffff] (bus address [0x80000000-0xfedfffff])
we previously could not put a 32-bit BAR there, because the CPU addresses
don't fit in 32 bits. This patch fixes this, so we can use this space for
32-bit BARs.
It's also possible (though unlikely) to have resources with 32-bit CPU
addresses but bus addresses above 4GB. In this case the previous code
would allocate space that a 32-bit BAR could not map.
Remove PCIBIOS_MAX_MEM_32, which is no longer used.
[bhelgaas: reworked starting from http://lkml.kernel.org/r/1386658484-15774-3-git-send-email-yinghai@kernel.org]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_bus_alloc_resource() avoids allocating space below the "min" supplied
by the caller (usually PCIBIOS_MIN_IO or PCIBIOS_MIN_MEM). This is to
protect badly documented motherboard resources. But if we're allocating
space inside an already-configured PCI-PCI bridge window, we ignore "min".
See 688d191821 ("pci: make bus resource start address override minimum IO
address").
This patch moves the check to make it more visible and simplify future
patches. No functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
4f535093cf ("PCI: Put pci_dev in device tree as early as possible")
moved pci_proc_attach_device() from pci_bus_add_device() to
pci_device_add().
This moves it back to pci_bus_add_device(), essentially reverting that
part of 4f535093cf. This makes it symmetric with pci_stop_dev(),
where we call pci_proc_detach_device() and pci_remove_sysfs_dev_files()
and set dev->is_added = 0.
[bhelgaas: changelog, create sysfs then attach proc for symmetry]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We currently enable PCI bridges after scanning a bus and assigning
resources. This is often done in arch code.
This patch changes this so we don't enable a bridge until necessary, i.e.,
until we enable a PCI device behind the bridge. We do this in the generic
pci_enable_device() path, so this also removes the arch-specific code to
enable bridges.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Commit 4f535093cf "PCI: Put pci_dev in device tree as early as possible"
moved final fixups from pci_bus_add_device() to pci_device_add(). But
pci_device_add() happens before resource assignment, so BARs may not be
valid yet.
Typical flow for hot-add:
pciehp_configure_device
pci_scan_slot
pci_scan_single_device
pci_device_add
pci_fixup_device(pci_fixup_final, dev) # previous location
# resource assignment happens here
pci_bus_add_devices
pci_bus_add_device
pci_fixup_device(pci_fixup_final, dev) # new location
[bhelgaas: changelog, move fixups to pci_bus_add_device()]
Reference: https://lkml.kernel.org/r/20130415182614.GB9224@xanatos
Reported-by: David Bulkow <David.Bulkow@stratus.com>
Tested-by: David Bulkow <David.Bulkow@stratus.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v3.9+
* pci/cleanup:
PCI: Remove "extern" from function declarations
PCI: Warn about failures instead of "must_check" functions
PCI: Remove __must_check from definitions
PCI: Remove unused variables
PCI: Move cpci_hotplug_init() proto to header file
PCI: Make local functions/structs static
PCI: Fix missing prototype for pcie_port_acpi_setup()
Conflicts:
drivers/pci/hotplug/acpiphp.h
include/linux/pci.h
These places capture return values to avoid "must_check" warnings,
but we didn't *do* anything with the return values, which causes
"set but not used" warnings. We might as well do something instead
of just trying to evade the "must_check" warnings.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now pci_bus->is_added is only used to guard invoking of
pcibios_fixup_bus() in pci_scan_child_bus(), so just set
it directly after the fixups and remove the other test
and set in pci_bus_add_devices().
[bhelgaas: changelog]
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Yinghai Lu <yinghai@kernel.org>
We want to put pci_dev structs in the device tree as soon as possible so
for_each_pci_dev() iteration will not miss them, but driver attachment
needs to be delayed until after pci_assign_unassigned_resources() to make
sure all devices have resources assigned first.
This patch moves device registering from pci_bus_add_devices() to
pci_device_add(), which happens earlier, leaving driver attachment in
pci_bus_add_devices().
It also removes unattached child bus handling in pci_bus_add_devices().
That's not needed because child bus via pci_add_new_bus() is already
in parent bus children list.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We want to add PCI devices to the device tree as early as possible but
delay attaching drivers.
device_add() adds a device to the device hierarchy and (via
device_attach()) attaches a matching driver and calls its .probe() method.
We want to separate adding the device to the hierarchy from attaching the
driver.
This patch does that by adding "match_driver" in struct pci_dev. When
false, we return failure from pci_bus_match(), which makes device_attach()
believe there's no matching driver.
Later, we set "match_driver = true" and call device_attach() again, which
now attaches the driver and calls its .probe() method.
[bhelgaas: changelog, explicitly init dev->match_driver,
fold device_attach() call into pci_bus_add_device()]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Firmware may have assigned PCI BARs for hot-added devices, so reserve
those resources before trying to allocate more.
[bhelgaas: move empty weak definition here]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/mjg-pci-roms-from-efi:
x86: Use PCI setup data
PCI: Add support for non-BAR ROMs
PCI: Add pcibios_add_device
EFI: Stash ROMs if they're not in the PCI BAR
Platforms may want to provide architecture-specific functionality during
PCI enumeration. Add a pcibios_add_device() call that architectures can
override to do so.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
If a PCI device and its parents are put into D3cold, unbinding the
device will trigger deadlock as follow:
- driver_unbind
- device_release_driver
- device_lock(dev) <--- previous lock here
- __device_release_driver
- pm_runtime_get_sync
...
- rpm_resume(dev)
- rpm_resume(dev->parent)
...
- pci_pm_runtime_resume
...
- pci_set_power_state
- __pci_start_power_transition
- pci_wakeup_bus(dev->parent->subordinate)
- pci_walk_bus
- device_lock(dev) <--- deadlock here
If we do not do device_lock in pci_walk_bus, we can avoid deadlock.
Device_lock in pci_walk_bus is introduced in commit:
d71374dafb, corresponding email thread
is: https://lkml.org/lkml/2006/5/26/38. The patch author Zhang Yanmin
said device_lock is added to pci_walk_bus because:
Some error handling functions call pci_walk_bus. For example, PCIe
aer. Here we lock the device, so the driver wouldn't detach from the
device, as the cb might call driver's callback function.
So I fixed the deadlock as follows:
- remove device_lock from pci_walk_bus
- add device_lock into callback if callback will call driver's callback
I checked pci_walk_bus users one by one, and found only PCIe aer needs
device lock.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v3.6+
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Should use struct pci_bus_resource instead of struct pci_host_bridge_window
Commit 45ca9e9730 ("PCI: add helpers for building PCI bus resource lists")
added pci_free_resource_list() and used it in pci_bus_remove_resources().
Later it was also used for host bridge aperture lists, which was fine until
commit 0efd5aab41 ("PCI: add struct pci_host_bridge_window with CPU/bus
address offset"). That commit added offset information, so we needed a
struct pci_host_bridge_window that was separate from struct
pci_bus_resource.
Commit 0efd5aab41 should have split the host bridge aperture users of
pci_free_resource_list() from the pci_bus_resource user
(pci_bus_remove_resources()), but it did not.
[bhelgaas: changelog -- 0efd5aab41 was mine, so this is all my fault]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
My "PCI: Integrate 'pci_fixup_final' quirks into hot-plug paths" patch
introduced an undefined reference to 'pci_fixup_final_inited' when
CONFIG_PCI_QUIRKS is not enabled (on x86_64):
drivers/built-in.o: In function `pci_bus_add_device':
(.text+0x4f62): undefined reference to `pci_fixup_final_inited'
This patch removes the external reference ending up with a result closer
to what we ultimately want when the boot path issues described in the
original patch are resolved.
References:
https://lkml.org/lkml/2012/7/9/542 Original, offending, patch
https://lkml.org/lkml/2012/7/12/338 Randy's catch
Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Final fixups are currently applied only at boot-time by
pci_apply_final_quirks(), which is an fs_initcall(). Hot-added devices
don't get these fixups, so they may not be completely initialized.
This patch makes us run final fixups for hot-added devices in
pci_bus_add_device() just before the new device becomes eligible for driver
binding.
This patch keeps the fs_initcall() for devices present at boot because we
do resource assignment between pci_bus_add_device and the fs_initcall(),
and we don't want to break any fixups that depend on that assignment. This
is a design issue that may be addressed in the future -- any resource
assignment should be done *before* device_add().
[bhelgaas: changelog]
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some PCI host bridges apply an address offset, so bus addresses on PCI are
different from CPU addresses. This patch adds a way for architectures to
tell the PCI core about this offset. For example:
LIST_HEAD(resources);
pci_add_resource_offset(&resources, host->io_space, host->io_offset);
pci_add_resource_offset(&resources, host->mem_space, host->mem_offset);
pci_scan_root_bus(parent, bus, ops, sysdata, &resources);
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We'd like to supply a list of resources when we create a new PCI bus,
e.g., the root bus under a PCI host bridge. These are helpers for
constructing that list.
These are exported because the plan is to replace this exported interface:
pci_scan_bus_parented()
with this one:
pci_add_resource(resources, ...)
pci_scan_root_bus(..., resources)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Requested by Greg KH to fix a race condition in the creating of PCI bus
cpuaffinity files.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This reverts commit b126b4703a.
We're going back to the old behavior of allocating from bus resources
in _CRS order.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This reverts commit 82e3e767c2.
We're going back to considering bus resources in the order we found
them (in _CRS order, when we're using _CRS), so we don't need to
define any ordering.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When a PCI bus has two resources with the same start/end, e.g.,
pci_bus 0000:04: resource 2 [mem 0xd0000000-0xd7ffffff pref]
pci_bus 0000:04: resource 7 [mem 0xd0000000-0xd7ffffff]
the previous pci_bus_find_resource_prev() implementation would alternate
between them forever:
pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff pref])
returns [mem 0xd0000000-0xd7ffffff]
pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff])
returns [mem 0xd0000000-0xd7ffffff pref]
pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff pref])
returns [mem 0xd0000000-0xd7ffffff]
...
This happened because there was no ordering between two resources with the
same start and end. A resource that had the same start and end as the
cursor, but was not itself the cursor, was considered to be before the
cursor.
This patch fixes the hang by making a fixed ordering between any two
resources.
In addition, it tries to allocate from positively decoded regions before
using any subtractively decoded resources. This means we will use a
positive decode region before a subtractive decode one, even if it means
using a smaller address.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=22062
Reported-by: Borislav Petkov <bp@amd64.org>
Tested-by: Borislav Petkov <bp@amd64.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
xen: register xen pci notifier
xen: initialize cpu masks for pv guests in xen_smp_init
xen: add a missing #include to arch/x86/pci/xen.c
xen: mask the MTRR feature from the cpuid
xen: make hvc_xen console work for dom0.
xen: add the direct mapping area for ISA bus access
xen: Initialize xenbus for dom0.
xen: use vcpu_ops to setup cpu masks
xen: map a dummy page for local apic and ioapic in xen_set_fixmap
xen: remap MSIs into pirqs when running as initial domain
xen: remap GSIs as pirqs when running as initial domain
xen: introduce XEN_DOM0 as a silent option
xen: map MSIs into pirqs
xen: support GSI -> pirq remapping in PV on HVM guests
xen: add xen hvm acpi_register_gsi variant
acpi: use indirect call to register gsi in different modes
xen: implement xen_hvm_register_pirq
xen: get the maximum number of pirqs from xen
xen: support pirq != irq
* 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits)
X86/PCI: Remove the dependency on isapnp_disable.
xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c
MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright.
x86: xen: Sanitse irq handling (part two)
swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it.
MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer.
xen/pci: Request ACS when Xen-SWIOTLB is activated.
xen-pcifront: Xen PCI frontend driver.
xenbus: prevent warnings on unhandled enumeration values
xenbus: Xen paravirtualised PCI hotplug support.
xen/x86/PCI: Add support for the Xen PCI subsystem
x86: Introduce x86_msi_ops
msi: Introduce default_[teardown|setup]_msi_irqs with fallback.
x86/PCI: Export pci_walk_bus function.
x86/PCI: make sure _PAGE_IOMAP it set on pci mappings
x86/PCI: Clean up pci_cache_line_size
xen: fix shared irq device passthrough
xen: Provide a variant of xen_poll_irq with timeout.
xen: Find an unbound irq number in reverse order (high to low).
xen: statically initialize cpu_evtchn_mask_p
...
Fix up trivial conflicts in drivers/pci/Makefile
Allocate space from the highest-address PCI bus resource first, then work
downward.
Previously, we looked for space in PCI host bridge windows in the order
we discovered the windows. For example, given the following windows
(discovered via an ACPI _CRS method):
pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff]
pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000effff]
pci_root PNP0A03:00: host bridge window [mem 0x000f0000-0x000fffff]
pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xf7ffffff]
pci_root PNP0A03:00: host bridge window [mem 0xff980000-0xff980fff]
pci_root PNP0A03:00: host bridge window [mem 0xff97c000-0xff97ffff]
pci_root PNP0A03:00: host bridge window [mem 0xfed20000-0xfed9ffff]
we attempted to allocate from [mem 0x000a0000-0x000bffff] first, then
[mem 0x000c0000-0x000effff], and so on.
With this patch, we allocate from [mem 0xff980000-0xff980fff] first, then
[mem 0xff97c000-0xff97ffff], [mem 0xfed20000-0xfed9ffff], etc.
Allocating top-down follows Windows practice, so we're less likely to
trip over BIOS defects in the _CRS description.
On the machine above (a Dell T3500), the [mem 0xbff00000-0xbfffffff] region
doesn't actually work and is likely a BIOS defect. The symptom is that we
move the AHCI controller to 0xbff00000, which leads to "Boot has failed,
sleeping forever," a BUG in ahci_stop_engine(), or some other boot failure.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c43
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=620313
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=629933
Reported-by: Brian Bloniarz <phunge0@hotmail.com>
Reported-and-tested-by: Stefan Becker <chemobejk@gmail.com>
Reported-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In preperation of modularizing Xen-pcifront the pci_walk_bus
needs to be exported so that the xen-pcifront module can walk
call the pci subsystem to walk the PCI devices and claim them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> [http://marc.info/?l=linux-pci&m=126149958010298&w=2]
pci_enable_device can fail. In that case, a printed warning would be
more appropriate.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Junchang Wang <junchangwang@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Assigning zero where NULL should be used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
In the future, we are going to be changing the lock type for struct
device (once we get the lockdep infrastructure properly worked out) To
make that changeover easier, and to possibly burry the lock in a
different part of struct device, let's create some functions to lock and
unlock a device so that no out-of-core code needs to be changed in the
future.
This patch creates the device_lock/unlock/trylock() functions, and
converts all in-tree users to them.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Magnus Damm <damm@igel.co.jp>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alex Chiang <achiang@hp.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Yu Zhao <yu.zhao@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Cc: CHENG Renquan <rqcheng@smu.edu.sg>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Frans Pop <elendil@planet.nl>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Previously we used a table of size PCI_BUS_NUM_RESOURCES (16) for resources
forwarded to a bus by its upstream bridge. We've increased this size
several times when the table overflowed.
But there's no good limit on the number of resources because host bridges
and subtractive decode bridges can forward any number of ranges to their
secondary buses.
This patch reduces the table to only PCI_BRIDGE_RESOURCE_NUM (4) entries,
which corresponds to the number of windows a PCI-to-PCI (3) or CardBus (4)
bridge can positively decode. Any additional resources, e.g., PCI host
bridge windows or subtractively-decoded regions, are kept in a list.
I'd prefer a single list rather than this split table/list approach, but
that requires simultaneous changes to every architecture. This approach
only requires immediate changes where we set up (a) host bridges with more
than four windows and (b) subtractive-decode P2P bridges, and we can
incrementally change other architectures to use the list.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No functional change; this converts loops that iterate from 0 to
PCI_BUS_NUM_RESOURCES through pci_bus resource[] table to use the
pci_bus_for_each_resource() iterator instead.
This doesn't change the way resources are stored; it merely removes
dependencies on the fact that they're in a table.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Now that we return the new resource start position, there is no
need to update "struct resource" inside the align function.
Therefore, mark the struct resource as const.
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
As suggested by Linus, align functions should return the start
of a resource, not void. An update of "res->start" is no longer
necessary.
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Based on PCI Express AER specs, a root port might receive multiple
TLP errors while it could only save a correctable error source id
and an uncorrectable error source id at the same time. In addition,
some root port hardware might be unable to provide a correct source
id, i.e., the source id, or the bus id part of the source id provided
by root port might be equal to 0.
The patchset implements the support in kernel by searching the device
tree under the root port.
Patch 1 changes parameter cb of function pci_walk_bus to return a value.
When cb return non-zero, pci_walk_bus stops more searching on the
device tree.
Reviewed-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We should not assign 64bit ranges to PCI devices that only take 32bit
prefetchable addresses.
Try to set IORESOURCE_MEM_64 in 64bit resource of pci_device/pci_bridge
and make the bus resource only have that bit set when all devices under
it support 64bit prefetchable memory. Use that flag to allocate
resources from that range.
Reported-by: Yannick <yannick.roehlly@free.fr>
Reviewed-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch sets up disabled bridges even if buses have already been
added.
pci_assign_unassigned_resources is called after buses are added.
pci_assign_unassigned_resources calls pci_bus_assign_resources.
pci_bus_assign_resources calls pci_setup_bridge to configure BARs of
bridges.
Currently pci_setup_bridge returns immediately if the bus have already
been added. So pci_assign_unassigned_resources can't configure BARs of
bridges that were added in a disabled state; this patch fixes the issue.
On logical hot-add, we need to prevent the kernel from re-initializing
bridges that have already been initialized. To achieve this,
pci_setup_bridge returns immediately if the bridge have already been
enabled.
We don't need to check whether the specified bus is a root bus or not.
pci_setup_bridge is not called on a root bus, because a root bus does
not have a bridge.
The patch adds a new helper function, pci_is_enabled. I made the
function name similar to pci_is_managed. The codes which use
enable_cnt directly are changed to use pci_is_enabled.
Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>