linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-25 12:21:37 +00:00

Author	SHA1	Message	Date
Yinghai Lu	52d21b5ef4	PCI: Use class for quirk for host bridge mmio_always_on Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-24 14:34:48 -08:00
Yinghai Lu	f4ca5c6a56	PCI: Add class support in quirk handling Recently added support to allow quirks to report duration also make the boot log very crowded when initcall_debug is specified. One thing we can to do mitigate this is to not call quirks unnecessarily by adding a new quirk declaration macro that takes a class argument. The new macro takes a class value and a class shift value (since it can vary) so that quirks will be limited to certain device classes, greatly reducing the number we call on every PCI device addition. -v2: fix v1 that left over of sparated patch. -v3: according to Jesse, change cls to class, cls_shift, to class_shift. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-24 14:34:40 -08:00
Jesse Barnes	ecd58d667a	Merge branch 'pci-next+probe_only+bus2res-fb127cb' of git://github.com/bjorn-helgaas/linux into linux-next	2012-02-24 14:25:33 -08:00
Yinghai Lu	b07f2ebc10	PCI: add a PCI resource reallocation config option Add a new config option, PCI_REALLOC_ENABLE_AUTO, which will automatically try to re-allocate PCI resources if PCI_IOV support is enabled and the SR-IOV resources are unassigned. Behavior can still be controlled using the pci=realloc= parameter. -v2: According to Jesse, adding one CONFIG option for distribution to disable it or enable it. -v3: update Kconfig text (jbarnes) Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-24 09:38:59 -08:00
Yinghai Lu	eb572e7c76	PCI: print out suggestion about using pci=realloc let user know they could try if pci=realloc could help. -v2: update suggestion text. Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-24 08:47:53 -08:00
Yinghai Lu	b55438fdd5	PCI: prepare pci=realloc for multiple options Let the user could enable and disable with pci=realloc=on or pci=realloc=off Also 1. move variable and functions near the place they are used. 2. change macro to function 3. change related functions and variable to static and _init 4. update parameter description accordingly. This will let us add a config option to control default behavior, and still allow the user to turn off automatic reallocation if it fails on their platform until a permanent solution is found. -v2: still honor pci=realloc, and treat it as pci=realloc=on also use enum instead of ... -v3: update kernel-paramenters.txt according to Jesse. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-24 08:47:42 -08:00
Yinghai Lu	0c5be0cb0e	PCI: Retry on IORESOURCE_IO type allocations When enabling pci reallocation for a pci bridge, we clear the small size in in bridge and re-assign with requested + optional size for first several tries, but Ram mention could have problem with one case: https://bugzilla.kernel.org/show_bug.cgi?id=15960 After checking the booting log in https://lkml.org/lkml/2010/4/19/44 [regression, bisected] Xonar DX invalid PCI I/O range since `977d17bb17` We should not stop too early for io ports. Apr 19 10:19:38 [kernel] pci 0000:04:00.0: BAR 7: can't assign io (size 0x4000) Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 8: assigned [mem 0x80400000-0x805fffff] Apr 19 10:19:38 [kernel] pci 0000:05:01.0: BAR 7: can't assign io (size 0x2000) Apr 19 10:19:38 [kernel] pci 0000:05:02.0: BAR 7: can't assign io (size 0x1000) Apr 19 10:19:38 [kernel] pci 0000:05:03.0: BAR 7: can't assign io (size 0x1000) Apr 19 10:19:38 [kernel] pci 0000:08:00.0: BAR 7: can't assign io (size 0x1000) Apr 19 10:19:38 [kernel] pci 0000:09:04.0: BAR 0: can't assign io (size 0x100) and clear 00:1c.0 to retry again. This patch removes IORESOUCE_IO checking, and tries one more time. It gives us a chance to get an allocation for the 00:1c.0 io port range because the range from 0x4000 to 0x8000 will be freed and we can use it. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-24 08:44:38 -08:00
Bjorn Helgaas	fb127cb9de	PCI: collapse pcibios_resource_to_bus Everybody uses the generic pcibios_resource_to_bus() supplied by the core now, so remove the ARCH_HAS_GENERIC_PCI_OFFSETS used during conversion. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:19:04 -07:00
Bjorn Helgaas	36a66cd6fd	PCI: add generic pcibios_resource_to_bus() This replaces the generic versions of pcibios_resource_to_bus() and pcibios_bus_to_resource() in asm-generic/pci.h with versions that use pci_resource_to_bus() and pci_bus_to_resource(). The replacements are equivalent except that they can apply host bridge window offsets when the arch has supplied them by using pci_add_resource_offset(). Each arch can convert to using pci_add_resource_offset() individually by removing its device resource fixups from pcibios_fixup_bus() and supplying ARCH_HAS_GENERIC_PCI_OFFSETS. ARCH_HAS_GENERIC_PCI_OFFSETS can be removed after all have converted. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:19:00 -07:00
Bjorn Helgaas	5bfa14ed9f	PCI: convert bus addresses to resource when reading BARs Some PCI host bridges translate CPU addresses to PCI bus addresses. Previously, we initialized pci_dev resources with PCI bus addresses, then converted them to CPU addresses later in arch-specific code (pcibios_fixup_resources()), which leaves a window of time where the pci_dev resources are incorrect. This patch adds support in the core for this address translation. When the arch creates the root bus, it can supply the host bridge address translation information, and the core can use it to set the pci_dev resources correctly from the beginning. This gives us a way to fix the problem that quirks that run between device discovery and pcibios_fixup_resources() fail because they use pci_dev resources that haven't been converted. The reference below is to one such problem that affected ARM and ia64. Note that this patch has no effect until an arch starts using pci_add_resource_offset() with a non-zero offset: before that, all all host bridge windows have a zero offset and pci_bus_to_resource() copies the pci_bus_region directly to the struct resource. Reference: https://lkml.org/lkml/2009/10/12/405 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:19:00 -07:00
Bjorn Helgaas	0efd5aab41	PCI: add struct pci_host_bridge_window with CPU/bus address offset Some PCI host bridges apply an address offset, so bus addresses on PCI are different from CPU addresses. This patch adds a way for architectures to tell the PCI core about this offset. For example: LIST_HEAD(resources); pci_add_resource_offset(&resources, host->io_space, host->io_offset); pci_add_resource_offset(&resources, host->mem_space, host->mem_offset); pci_scan_root_bus(parent, bus, ops, sysdata, &resources); Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:19:00 -07:00
Bjorn Helgaas	5a21d70dbd	PCI: add struct pci_host_bridge and a list of all bridges found This adds a list of all PCI host bridges we find and a way to look up the host bridge from a pci_dev. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:18:59 -07:00
Bjorn Helgaas	a5390aa6dc	PCI: don't publish new root bus until it's fully initialized When pci_create_root_bus() adds the new struct pci_bus to the global pci_root_buses list, the bus becomes visible to other parts of the kernel, so it should be fully initialized. This patch delays adding the bus to the pci_root_buses list until after all the struct pci_bus initialization is finished. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:18:59 -07:00
Bjorn Helgaas	844393f4c5	PCI: make pci_flags non-weak No architecture defines its own pci_flags, so the core symbol does not need to be weak. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:18:59 -07:00
Bjorn Helgaas	47087700ce	PCI: make pci_flags always available If we move resource assignment functions into the core, we'll still need a way for architectures to prevent reassignment, e.g., the "pci_probe_only" functionality, and we'll need a generic, always available way the core can test for that. The "pci_flags" arrangement used by several architectures seems like a convenient way to do this. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-02-23 20:18:55 -07:00
MUNEDA Takahiro	7570a333d8	PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp driver Add a parameter to avoid using MSI/MSI-X for PCIe native hotplug; it's known to be buggy on some platforms. In my environment, while shutting down, following stack trace is shown sometimes. irq 16: nobody cared (try booting with the "irqpoll" option) Pid: 1081, comm: reboot Not tainted 3.2.0 #1 Call Trace: <IRQ> [<ffffffff810cec1d>] __report_bad_irq+0x3d/0xe0 [<ffffffff810cee1c>] note_interrupt+0x15c/0x210 [<ffffffff810cc485>] handle_irq_event_percpu+0xb5/0x210 [<ffffffff810cc621>] handle_irq_event+0x41/0x70 [<ffffffff810cf675>] handle_fasteoi_irq+0x55/0xc0 [<ffffffff81015356>] handle_irq+0x46/0xb0 [<ffffffff814fbe9d>] do_IRQ+0x5d/0xe0 [<ffffffff814f146e>] common_interrupt+0x6e/0x6e [<ffffffff8106b040>] ? __do_softirq+0x60/0x210 [<ffffffff8108aeb1>] ? hrtimer_interrupt+0x151/0x240 [<ffffffff814fb5ec>] call_softirq+0x1c/0x30 [<ffffffff810152d5>] do_softirq+0x65/0xa0 [<ffffffff8106ae9d>] irq_exit+0xbd/0xe0 [<ffffffff814fbf8e>] smp_apic_timer_interrupt+0x6e/0x99 [<ffffffff814f9e5e>] apic_timer_interrupt+0x6e/0x80 <EOI> [<ffffffff814f0fb1>] ? _raw_spin_unlock_irqrestore+0x11/0x20 [<ffffffff812629fc>] pci_bus_write_config_word+0x6c/0x80 [<ffffffff81266fc2>] pci_intx+0x52/0xa0 [<ffffffff8127de3d>] pci_intx_for_msi+0x1d/0x30 [<ffffffff8127e4fb>] pci_msi_shutdown+0x7b/0x110 [<ffffffff81269d34>] pci_device_shutdown+0x34/0x50 [<ffffffff81326c4f>] device_shutdown+0x2f/0x140 [<ffffffff8107b981>] kernel_restart_prepare+0x31/0x40 [<ffffffff8107b9e6>] kernel_restart+0x16/0x60 [<ffffffff8107bbfd>] sys_reboot+0x1ad/0x220 [<ffffffff814f4b90>] ? do_page_fault+0x1e0/0x460 [<ffffffff811942d0>] ? __sync_filesystem+0x90/0x90 [<ffffffff8105c9aa>] ? __cond_resched+0x2a/0x40 [<ffffffff814ef090>] ? _cond_resched+0x30/0x40 [<ffffffff81169e17>] ? iterate_supers+0xb7/0xd0 [<ffffffff814f9382>] system_call_fastpath+0x16/0x1b handlers: [<ffffffff8138a0f0>] usb_hcd_irq [<ffffffff8138a0f0>] usb_hcd_irq [<ffffffff8138a0f0>] usb_hcd_irq Disabling IRQ #16 An un-wanted interrupt is generated when PCI driver switches from MSI/MSI-X to INTx while shutting down the device. The interrupt does not happen if MSI/MSI-X is not used on the device. I confirmed that this problem does not happen if pcie_hp=nomsi was specified and hotplug operation worked fine as usual. v2: Automatically disable MSI/MSI-X against following device: PCI bridge: Integrated Device Technology, Inc. Device 807f (rev 02) v3: Based on the review comment, combile the if statements. v4: Removed module parameter. Move some code to build pciehp as a module. Move device specific code to driver/pci/quirks.c. v5: Drop a device specific code until getting a vendor statement. Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com> Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-23 12:29:35 -08:00
Yinghai Lu	34a4876e30	PCI: move pci_find_saved_cap out of linux/pci.h Only one user in driver/pci/pci.c, so we don't need to put it in global pci.h Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-23 12:27:11 -08:00
Yinghai Lu	f796841e49	PCI: fix memleak for pci dev removing during hotplug unreferenced object 0xffff880276d17700 (size 64): comm "swapper/0", pid 1, jiffies 4294897182 (age 3976.028s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 18 f9 de 76 02 88 ff ff ...........v.... 10 00 00 00 0e 00 00 00 0f 28 40 00 00 00 00 00 .........(@..... backtrace: [<ffffffff81c8aede>] kmemleak_alloc+0x26/0x43 [<ffffffff811385f0>] __kmalloc+0x121/0x183 [<ffffffff813cf821>] pci_add_cap_save_buffer+0x35/0x7c [<ffffffff813d12b7>] pci_allocate_cap_save_buffers+0x1d/0x65 [<ffffffff813cdb52>] pci_device_add+0x92/0xf1 [<ffffffff81c8afe6>] pci_scan_single_device+0x9f/0xa1 [<ffffffff813cdbd2>] pci_scan_slot.part.20+0x21/0x106 [<ffffffff813cdce2>] pci_scan_slot+0x2b/0x35 [<ffffffff81c8dae4>] __pci_scan_child_bus+0x51/0x107 [<ffffffff81c8d75b>] pci_scan_bridge+0x376/0x6ae [<ffffffff81c8db60>] __pci_scan_child_bus+0xcd/0x107 [<ffffffff81c8dbab>] pci_scan_child_bus+0x11/0x2a [<ffffffff81cca58c>] pci_acpi_scan_root+0x18b/0x21c [<ffffffff81c916be>] acpi_pci_root_add+0x1e1/0x42a [<ffffffff81406210>] acpi_device_probe+0x50/0x190 [<ffffffff814a0227>] really_probe+0x99/0x126 Need to free saved_buffer for capabilities. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-23 12:08:53 -08:00
Yinghai Lu	2dd8ba921d	PCI: Fix device class print out Found debug print of class is shifted. \| pci 0000:f8:15.2: [8086:2b56] type 0 class 0x000600 Code is trying to print class with 6 digits, but use shifted class with 4 digits valid value as variable. Change to original dev->class directly. Also remove not needed calculating of local variable class, because it will be updated after pci_fixup_device(pci_fixup_early...) Also unify type print out when class and header is not matched. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-23 12:05:59 -08:00
Yinghai Lu	3796f1e2ca	PCI: Skip cardbus assigned resource reset during pci bus rescan Otherwise when rescan is used for cardbus, assigned resources will get cleared. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Tested-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-23 12:00:04 -08:00
Yinghai Lu	1184893439	PCI: Fix "cardbus bridge resources as optional" size handling We should not set the requested size to -2; that will confuse the resource list sorting with align when SIZEALIGN is used. Change to STARTALIGN and pass align from start; we are safe to do that just as we do that regular pci bridge. In the long run, we should just treat cardbus like a regular pci bridge. Also fix the case when realloc_head is not passed: we should keep the requested size. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Tested-by: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-23 11:59:56 -08:00
Yinghai Lu	dcef0d06b3	PCI: Disable cardbus bridge MEM1 prefetchable bit Some BIOSes enable prefetch on both MEM0 and MEM1. But the cardbus code assumes MEM1 is non-pref... Discussion could be found at: https://lkml.org/lkml/2012/1/12/1 https://bugzilla.kernel.org/show_bug.cgi?id=41622#c23 Signed-off-by: Yinghai Lu <yinghai@kernel.org> Tested-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-23 11:59:45 -08:00
Rafael J. Wysocki	5b415f1e79	PCI / PM: Disable wakeup during shutdown for devices not enabled to wake up If a PCI device is enabled to generate wakeup signals (PME) when put into a low-power state by runtime PM, it will be still enabled to generate those signals after the system shutdown, unless its driver's .shutdown() callback takes care of the wakeup signals generation setting. Moreover, there are devices that are not enabled to wake up the system and that are configured by runtime PM to generate wakeup signals so that (runtime) remote wakeup works with them. Those devices should be reconfigured during system shutdown so that they don't generate wakeup signals, but at least some drivers don't do that. However, that very well may be done by the PCI core so that drivers don't have to worry about it. For this reason, modify pci_device_shutdown() to disable the generation of wakeup events for devices not supposed to wake up the system. References: https://bugzilla.kernel.org/show_bug.cgi?id=37952 Reported-and-tested-by: Kamil Iskra <kamil.54002@iskra.name> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-17 09:22:04 -08:00
Yinghai Lu	09cedbef44	PCI: Fix /sys warning when sriov enabled and card is hot removed sysfs is a bit stricter now and emits warnings in more cases. For SRIOV hotplug, we are calling pci_stop_dev() for each VF first (after we update pci_stop_bus_devices) which remove each VF subdir. So double check the VF dir in /sys before trying to remove the physfn link. Signed-of-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-17 09:22:03 -08:00
Matthew Garrett	ad71c96213	PCI: pcie: Add support for setting default ASPM policy Distributions may wish to provide different defaults for PCIE ASPM depending on their target audience. Provide a configuration option for choosing the default policy. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-17 09:22:03 -08:00
Thomas Jarosch	f67fd55fa9	PCI: Add quirk for still enabled interrupts on Intel Sandy Bridge GPUs Some BIOS implementations leave the Intel GPU interrupts enabled, even though no one is handling them (f.e. i915 driver is never loaded). Additionally the interrupt destination is not set up properly and the interrupt ends up -somewhere-. These spurious interrupts are "sticky" and the kernel disables the (shared) interrupt line after 100.000+ generated interrupts. Fix it by disabling the still enabled interrupts. This resolves crashes often seen on monitor unplug. Tested on the following boards: - Intel DH61CR: Affected - Intel DH67BL: Affected - Intel S1200KP server board: Affected - Asus P8H61-M LE: Affected, but system does not crash. Probably the IRQ ends up somewhere unnoticed. According to reports on the net, the Intel DH61WW board is also affected. Many thanks to Jesse Barnes from Intel for helping with the register configuration and to Intel in general for providing public hardware documentation. Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com> Tested-by: Charlie Suffin <charlie.suffin@stratus.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:26 -08:00
Arjan van de Ven	3209874a1d	PCI: Annotate PCI quirks in initcall_debug style While diagnosing some boot time issues on a platform, all that I could see in the bootgraph/dmesg was that the system was spending a lot of time in applying one or more PCI quirks... which was virtually undebuggable. This patch adds printk's in "initcall_debug" style to the dmesg, which are added when the user asks for the initcall_debug (the nr one tool to use when debugging boot hangs or boot time issues) kernel command line option. v2: add #includes so quirks can build on non-x86 Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:04 -08:00
Danny Kukawka	309c665110	PCI hotplug: cpcihp: fix debug module parameter to be bool Fix debug variable from module parameter to be really bool to fix 'warning: return from incompatible pointer type'. Acked-by: Scott Murray <scott@spiteful.org> Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:03 -08:00
Kay, Allen M	26f41062f2	PCI: check for pci bar restore completion and retry On some OEM systems, pci_restore_state() is called while FLR has not yet completed. As a result, PCI BAR register restore is not successful. This fix reads back the restored value and compares it with saved value and re-tries 10 times before giving up. Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com> Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com> Signed-off-by: Allen Kay <allen.m.kay@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:02 -08:00
Yinghai Lu	2debd92899	PCI: pciehp: Disable/enable link during slot power off/on On a system with a repeater on the system board to support gen2 hotplug, we found that when an ExpressModule is removed from some slots, /var/log/messages will be full of "card present/not present" warnings. It turns out the root complex is continually trying to train the link to the repeater because the repeater has not been reset. This patch will disable the link at removal time to allow the repeater to be reset properly. This also prevents a potential AER message at removal time. Also, when testing hotplug on a system under development, we found if we boot the system without an EM installed, and later hot-add an EM, it does not work with Linux, but another OS is ok. The root cause is that BIOS left link disabled when slot was empty at boot time, and other OS is modifying the link disable bit in link ctrl during power on/off. So we should do the same thing to disable/enable link during power off/on. -v2: check link DLLA bit instead of 100ms waiting. Separate link disable/enable functions to another patch. Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:02 -08:00
Yinghai Lu	7f822999e1	PCI: pciehp: Add Disable/enable link functions Will use it during power off/on of slots Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:01 -08:00
Yinghai Lu	bffe4f72fc	PCI: pciehp: Add pcie_wait_link_not_active() Will use it for link disable status checking. Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:01 -08:00
Yinghai Lu	4e2ce405b2	PCI: pciehp: make check_link_active more helpful A few changes: - remove the 'inline' and let the complier decide - return a bool to indicate whether the link was active - add a debug message to indicate link state when it beocmes active Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:00 -08:00
Yinghai Lu	2f5d8e4ff9	PCI: pciehp: replace unconditional sleep with config space access check During reviewing \| PCI: pciehp: wait 1000 ms before Link Training check Linus said: >... > That's a long time, and it's irritating to the user. It makes the > user think "the machine is slow". >... > And quite frankly, an unconditional one-second delay here seems bad. >Two seconds was unacceptable, one second is just bad. Try to access the pci conf of a pci device that is supposed to show up in 1s. If we can read back a valid vendor/device id, we can return early. Related discussion could be found: https://lkml.org/lkml/2011/12/6/339 -v2: seperate code to pci_bus_read_dev_vendor_id() from pci_scan_device() and reuse it from pciehp code. Suggested by Matthew Wilcox. -v3: According to Kenj, don't use array in stack, and don't wait too long for crs, also return fail status if not found. Also separate pci_bus_dev_read_vendor_id() change to another patch. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:45:00 -08:00
Yinghai Lu	efdc87dab1	PCI: Separate pci_bus_read_dev_vendor_id from pci_scan_device We can reuse it for pciehp probing. -v2: according to Kenji, fix crs timeout checking, and export the function for later use when pciehp is compiled as a module. Suggested-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:59 -08:00
Yinghai Lu	ac205b7bb7	PCI: make sriov work with hotplug remove When hot removing a pci express module that has a pcie switch and supports SRIOV, we got: [ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1 [ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received [ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3) [ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9 [ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press. [ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10 [ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200 [ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0 [ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain🚌device=0000:b0:00 [ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9 [ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain🚌dev = 0000:b0:00 [ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6 [ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14 [ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15 [ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16 [ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled [ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null) [ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83 [ 5926.133377] PGD 0 [ 5926.135402] Oops: 0000 [#1] SMP [ 5926.138659] CPU 2 [ 5926.140499] Modules linked in: ... [ 5926.143754] [ 5926.275823] Call Trace: [ 5926.278267] [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83 [ 5926.284180] [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba [ 5926.290264] [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b [ 5926.296866] [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188 [ 5926.303123] [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188 [ 5926.309206] [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0 ... +-[0000:80]-+-00.0-[81-8f]-- \| +-01.0-[90-9f]-- \| +-02.0-[a0-af]-- \| +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device \| \| \| +-00.1 Device \| \| \| +-00.2 Device \| \| \| \-00.3 Device \| \| \-03.0-[b3]--+-00.0 Device \| \| +-00.1 Device \| \| +-00.2 Device \| \| \-00.3 Device root complex: 80:02.2 pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0. end devices are b2:00.0 and b3.00.0. VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3 Root cause: when doing pci_stop_bus_device() with phys fn, it will stop virt fn and remove the fn, so list_for_each_safe(l, n, &bus->devices) will have problem to refer freed n that is pointed to vf entry. Solution is just replacing list_for_each_safe() with list_for_each_prev_safe(). This will make sure we can get valid n pointer to PF instead of the freed VF pointer (because newly added devices are inserted to the bus->devices list tail). During reviewing the patch, Bjorn said: \| The PCI hot-remove path calls pci_stop_bus_devices() via \| pci_remove_bus_device(). \| \| pci_stop_bus_devices() traverses the bus->devices list (point A below), \| stopping each device in turn, which calls the driver remove() method. When \| the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which \| also uses pci_remove_bus_device() to remove the VF devices from the \| bus->devices list (point B). \| \| pci_remove_bus_device \| pci_stop_bus_device \| pci_stop_bus_devices(subordinate) \| list_for_each(bus->devices) <-- A \| pci_stop_bus_device(PF) \| ... \| driver->remove \| pci_disable_sriov \| ... \| pci_remove_bus_device(VF) \| <remove from bus_list> <-- B \| \| At B, we're changing the same list we're iterating through at A, so when \| the driver remove() method returns, the pci_stop_bus_devices() iterator has \| a pointer to a list entry that has already been freed. Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141 https://lkml.org/lkml/2012/1/23/360 -v5: According to Linus to make remove more robust, Change to list_for_each_prev_safe instead. That is more reasonable, because those devices are added to tail of the list before. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:59 -08:00
Yinghai Lu	67cc7e26a5	PCI: remove add_to_failed_list() Only one user; just use add_to_list instead. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:58 -08:00
Yinghai Lu	b592443d90	PCI: add debug print out for add_size For use in debugging resource reallocation. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:58 -08:00
Yinghai Lu	bffc56d411	PCI: make free_list() into a function After merging struct pci_dev_resource_x and pci_dev_resource, We can use a function instead of macro now. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:57 -08:00
Yinghai Lu	b9b0bba96c	PCI: Rename dev_res_x to add_res or fail_res Linus says don't use dev_res_x because it doesn't communicate anything about usage. Rename them to add_res or fail_res etc according to context. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:57 -08:00
Yinghai Lu	764242a0ae	PCI: Merge pci_dev_resource_x and pci_dev_resource pci_dev_resource_x is a superset of pci_dev_resource and they're just temp structs used during resource reallocation. pci_dev_resource usage is quite limted. So just use pci_dev_resource_x, and rename it as new pci_dev_resource. -v2: According to Linus, Separate free_list change to another patch Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:56 -08:00
Yinghai Lu	bdc4abecae	PCI: Replace resource_list with generic list So we can use helper functions for generic list. This makes the resource re-allocation code much more readable. -v2: Use list_add_tail instead of adding list_insert_before, Pointed out by Linus. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:55 -08:00
Yinghai Lu	2934a0de09	PCI: Move struct resource_list to setup-bus.c No user outside of setup-bus.c now. Later patches will convert resource_list to a regular list. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:55 -08:00
Yinghai Lu	78c3b329b9	PCI: Move pdev_sort_resources() to setup-bus.c This allows us to move the definition of struct resource_list to setup_bus.c and later convert resource_list to a regular list. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:54 -08:00
Yinghai Lu	19aa7ee432	PCI: make re-allocation try harder by reassigning ranges higher in the heirarchy On a system with devices that support SRIOV connected to a pcie switch to pcie root port: +-[0000:80]-+-00.0-[81-8f]-- \| +-01.0-[90-9f]-- \| +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device 207a \| \| \-03.0-[a3]--+-00.0 Oracle Corporation Device 207a \| +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Oracle Corporation Device 207a \| \| \-03.0-[b3]--+-00.0 Oracle Corporation Device 207a When the BIOS does not assign resources for SRIOV BARs, kernel pci reallocation only goes up one bridge and then gives up, failing to to get resources for all sSRIOV BARs, even though the range is large enough in the peer root bus. Specifically, only the bridge at the a1:02.0 level has its resources cleared and reallocated. The kernel does not go up to clear the bridge at the 80:02.0 level. To make it go to upper levels, during retry, we need to treat "good to have" resources as "must have". Only on the last try will we treat good to have resources as optional. At that time, parent bridge resources will already have been released so we'll have a chance to get everything assigned with must_have plus good_to_have for all child devices. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:54 -08:00
Yinghai Lu	9b03088f95	PCI: Make pci_rescan_bus handle add_list This allows us to allocate resources to hotplug bridges during remove/rescan. We need to move the function to setup-bus.c so it can use __pci_bus_size_bridges and __pci_bus_assign_resources directly to take the add_list resource tracking list. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:53 -08:00
Yinghai Lu	2f320521a0	PCI: Make rescan bus increase bridge resource size if needed Current rescan will not touch bridge MMIO and IO. Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge resources, if child devices need more resources. Only do that for bridges whose children are all removed already; i.e. don't release resources that could already be in use by drivers on child devices. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:53 -08:00
Yinghai Lu	8424d7592e	PCI: Use add_list in pcie hotplug path. We need add size for hot plug path when pluging in hotplug chassis without cards. -v2: change descriptions. make it applicable after "pci: Check bridge resources after resource allocation." Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:52 -08:00
Yinghai Lu	3e6e0d8094	PCI: try to assign required+option size first We found reassignment can not find a range for one resource, even if the total available range is large enough. bridge b1:02.0 will need 2M+3M bridge b1:03.0 will need 2M+3M so bridge b0:00.0 will get assigned: 4M : [f8000000-f83fffff] later is reassigned to 10M : [f8000000-f9ffffff] b1:02.0 is assigned to 2M : [f8000000-f81fffff] b1:03.0 is assigned to 2M : [f8200000-f83fffff] After that b1:03.0 get chance to be reassigned to [f8200000-f86fffff], but b1:02.0 will not have chance to expand, because b1:03.0 is using in middle one. [ 187.911401] pci 0000:b1:02.0: bridge window [mem 0x00100000-0x002fffff] to [bus b2-b2] add_size 300000 [ 187.920764] pci 0000:b1:03.0: bridge window [mem 0x00100000-0x002fffff] to [bus b3-b3] add_size 300000 [ 187.930129] pci 0000:b1:02.0: [mem 0x00100000-0x002fffff] get_res_add_size add_size 300000 [ 187.938500] pci 0000:b1:03.0: [mem 0x00100000-0x002fffff] get_res_add_size add_size 300000 [ 187.946857] pci 0000:b0:00.0: bridge window [mem 0x00100000-0x004fffff] to [bus b1-b3] add_size 600000 [ 187.956206] pci 0000:b0:00.0: BAR 14: assigned [mem 0xf8000000-0xf83fffff] [ 187.963102] pci 0000:b0:00.0: BAR 15: assigned [mem 0xf5000000-0xf51fffff pref] [ 187.970434] pci 0000:b0:00.0: BAR 14: reassigned [mem 0xf8000000-0xf89fffff] [ 187.977497] pci 0000:b1:02.0: BAR 14: assigned [mem 0xf8000000-0xf81fffff] [ 187.984383] pci 0000:b1:02.0: BAR 15: assigned [mem 0xf5000000-0xf50fffff pref] [ 187.991695] pci 0000:b1:03.0: BAR 14: assigned [mem 0xf8200000-0xf83fffff] [ 187.998576] pci 0000:b1:03.0: BAR 15: assigned [mem 0xf5100000-0xf51fffff pref] [ 188.005888] pci 0000:b1:03.0: BAR 14: reassigned [mem 0xf8200000-0xf86fffff] [ 188.012939] pci 0000:b1:02.0: BAR 14: can't assign mem (size 0x200000) [ 188.019471] pci 0000:b1:02.0: failed to add 300000 to res=[mem 0xf8000000-0xf81fffff] [ 188.027326] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit] [ 188.034071] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit] [ 188.040795] pci 0000:b2:00.0: BAR 2: assigned [mem 0xf8000000-0xf80fffff 64bit] [ 188.048119] pci 0000:b2:00.0: BAR 2: set to [mem 0xf8000000-0xf80fffff 64bit] (PCI address [0xf8000000-0xf80fffff]) [ 188.058550] pci 0000:b2:00.0: BAR 6: assigned [mem 0xf5000000-0xf50fffff pref] [ 188.065802] pci 0000:b2:00.0: BAR 0: assigned [mem 0xf8100000-0xf8103fff 64bit] [ 188.073125] pci 0000:b2:00.0: BAR 0: set to [mem 0xf8100000-0xf8103fff 64bit] (PCI address [0xf8100000-0xf8103fff]) [ 188.083596] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit] [ 188.090310] pci 0000:b2:00.0: BAR 9: can't assign mem (size 0x300000) [ 188.096773] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit] [ 188.103479] pci 0000:b2:00.0: BAR 7: assigned [mem 0xf8104000-0xf810ffff 64bit] [ 188.110801] pci 0000:b2:00.0: BAR 7: set to [mem 0xf8104000-0xf810ffff 64bit] (PCI address [0xf8104000-0xf810ffff]) [ 188.121256] pci 0000:b1:02.0: PCI bridge to [bus b2-b2] [ 188.126512] pci 0000:b1:02.0: bridge window [mem 0xf8000000-0xf81fffff] [ 188.133328] pci 0000:b1:02.0: bridge window [mem 0xf5000000-0xf50fffff pref] [ 188.140608] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit] [ 188.147341] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit] [ 188.154076] pci 0000:b3:00.0: BAR 2: assigned [mem 0xf8200000-0xf82fffff 64bit] [ 188.161417] pci 0000:b3:00.0: BAR 2: set to [mem 0xf8200000-0xf82fffff 64bit] (PCI address [0xf8200000-0xf82fffff]) [ 188.171865] pci 0000:b3:00.0: BAR 6: assigned [mem 0xf5100000-0xf51fffff pref] [ 188.179090] pci 0000:b3:00.0: BAR 0: assigned [mem 0xf8300000-0xf8303fff 64bit] [ 188.186431] pci 0000:b3:00.0: BAR 0: set to [mem 0xf8300000-0xf8303fff 64bit] (PCI address [0xf8300000-0xf8303fff]) [ 188.196884] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit] [ 188.203591] pci 0000:b3:00.0: BAR 9: assigned [mem 0xf8400000-0xf86fffff 64bit] [ 188.210909] pci 0000:b3:00.0: BAR 9: set to [mem 0xf8400000-0xf86fffff 64bit] (PCI address [0xf8400000-0xf86fffff]) [ 188.221379] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit] [ 188.228089] pci 0000:b3:00.0: BAR 7: assigned [mem 0xf8304000-0xf830ffff 64bit] [ 188.235407] pci 0000:b3:00.0: BAR 7: set to [mem 0xf8304000-0xf830ffff 64bit] (PCI address [0xf8304000-0xf830ffff]) [ 188.245843] pci 0000:b1:03.0: PCI bridge to [bus b3-b3] [ 188.251107] pci 0000:b1:03.0: bridge window [mem 0xf8200000-0xf86fffff] [ 188.257922] pci 0000:b1:03.0: bridge window [mem 0xf5100000-0xf51fffff pref] [ 188.265180] pci 0000:b0:00.0: PCI bridge to [bus b1-b3] [ 188.270443] pci 0000:b0:00.0: bridge window [mem 0xf8000000-0xf89fffff] [ 188.277250] pci 0000:b0:00.0: bridge window [mem 0xf5000000-0xf51fffff pref] [ 188.284512] pcieport 0000:80:02.2: PCI bridge to [bus b0-bf] [ 188.290184] pcieport 0000:80:02.2: bridge window [io 0xa000-0xbfff] [ 188.296735] pcieport 0000:80:02.2: bridge window [mem 0xf8000000-0xf8ffffff] [ 188.303963] pcieport 0000:80:02.2: bridge window [mem 0xf5000000-0xf5ffffff 64bit pref] Thus b2:00.0 BAR 9 does not get assigned... root cause: b1:02.0 can not be added more range, because b1:03.0 is just after it; no space between the required ranges. Solution: Try to assign required + optional all together at first, and if that fails, try again with just the required resources. -v2: seperate add_to_list change() to another patch according to Jesse. seperate get_res_add_size() moving to another patch according to Jesse. add !realloc_head->next check if the list is empty to bail early according to Jesse. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:52 -08:00
Yinghai Lu	1c372353e9	PCI: Move get_res_add_size() function Need to call it from __assign_resources_sorted() later and we'd like to avoid a forward declaraion. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-14 08:44:51 -08:00

1 2 3 4 5 ...

2690 Commits