Commit Graph

2690 Commits

Author SHA1 Message Date
Yinghai Lu
ef62dfefa9 PCI: Make add_to_list() return status
Will be used for resource_list_x duplication when trying
requested+optional at first.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:51 -08:00
Yinghai Lu
a4ac9fea01 PCI : Calculate right add_size
During debug of one SRIOV enabled hotplug device, we found found that
add_size is not passed properly.

The device has devices under two level bridges:

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0  Oracle Corporation Device
 |           |                               \-03.0-[a3]--+-00.0  Oracle Corporation Device

Which means later the parent bridge will not try to add a big enough range:

[  557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
[  557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
[  557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
[  557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
[  557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
[  557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
[  557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
[  557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
[  557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
[  557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]

It turns out we did not calculate size1 properly.

static resource_size_t calculate_memsize(resource_size_t size,
                resource_size_t min_size,
                resource_size_t size1,
                resource_size_t old_size,
                resource_size_t align)
{
        if (size < min_size)
                size = min_size;
        if (old_size == 1 )
                old_size = 0;
        if (size < old_size)
                size = old_size;
        size = ALIGN(size + size1, align);
        return size;
}

We should not pass add_size with min_size in calculate_memsize since
that will make add_size not contribute final add_size.

So just pass add_size with size1 to calculate_memsize().

With this change, we should have chance to remove extra addon in
pci_reassign_resource.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:50 -08:00
Masanari Iida
0dea210b17 PCI: Fix typo in setup-res.c
Correct spelling "resouce" to "resource" in
dricers/pci/setup-res.c

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:50 -08:00
Konrad Rzeszutek Wilk
6fbf9e7a90 PCI: Introduce __pci_reset_function_locked to be used when holding device_lock.
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.

The call chain when a user does "bind" looks as so:

 echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind

and ends up calling:
  driver_bind:
    device_lock(dev);  <=== TAKES LOCK
    XXXX_probe:
         .. pci_enable_device()
         ...__pci_reset_function(), which calls
                 pci_dev_reset(dev, 0):
                        if (!0) {
                                device_lock(dev) <==== DEADLOCK

The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:48 -08:00
Julia Lawall
8f0cdddcd3 PCI: drivers/pci/hotplug/ibmphp_ebda.c: add missing iounmap
Add missing iounmap in error handling code, in a case where the function
already preforms iounmap on some other execution path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression e;
statement S,S1;
int ret;
@@
e = \(ioremap\|ioremap_nocache\)(...)
... when != iounmap(e)
if (<+...e...+>) S
... when any
    when != iounmap(e)
*if (...)
   { ... when != iounmap(e)
     return ...; }
... when any
iounmap(e);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:47 -08:00
Amos Kong
f382a086f3 PCI: Can continually add funcs after adding func0
Boot up a KVM guest, and hotplug multifunction
devices(func1,func2,func0,func3) to guest.

for i in 1 2 0 3;do
qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
(qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
(qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
done

In linux kernel, when func0 of the slot is hot-added, the whole
slot will be marked as 'enabled', then driver will ignore other new
hotadded funcs.
But in Win7 & WinXP, we can continaully add other funcs after adding
func0, all funcs will be added in guest.

drivers/pci/hotplug/acpiphp_glue.c:
static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
....
        for (slot = bridge->slots; slot; slot = slot->next) {
                if (slot->flags & SLOT_ENABLED) {
                        acpiphp_disable_slot()
                else
                        acpiphp_enable_slot()
....                              |
}                                 v
                            enable_device()
                                  |
                                  v
        //only don't enable slot if func0 is not added
	list_for_each_entry(func, &slot->funcs, sibling) {
               ...
        }
       slot->flags |= SLOT_ENABLED; //mark slot to 'enabled'

This patch just make pci driver can continaully add funcs after adding
func 0. Only mark slot to 'enabled' when all funcs are added.

For pci multifunction hotplug, we can add functions one by one(func 0 is
necessary), and all functions will be removed in one time.

Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:47 -08:00
Myron Stowe
6535943fbf x86/PCI: Convert maintaining FW-assigned BIOS BAR values to use a list
This patch converts the underlying maintenance aspects of FW-assigned
BIOS BAR values from a statically allocated array within struct pci_dev
to a list of temporary, stand alone, entries.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:46 -08:00
Myron Stowe
351fc6d1a5 PCI: Fix starting basis for resource requests
pci_revert_fw_address() is used to reinstate a PCI device's original
FW-assigned BIOS BAR value(s) if normal resource assignment fails.

When attempting to reinstate an address, the point within the resource
tree from which to attempt the new resource request should be the parent
resource corresponding to the device, not the base of the resource tree
(ioport_resource or iomem_resource).  For PCI devices this would
typically be the resource corresponding to the upstream PCI host bridge
or P2P bridge aperture.

This patch sets the point within the resource tree to attempt a new
resource assignment request to the PCI device's parent resource and only
if that fails does it fall back to the base ioport_resource or
iomem_resource.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:45 -08:00
Yinghai Lu
3682a3946d PCI: Fix pci cardbus removal
During test busn_res allocation with cardbus, found pci card removal is not
working anymore, and it turns out it is broken by:

|commit 79cc9601c3
|Date:   Tue Nov 22 21:06:53 2011 -0800
|
|    PCI: Only call pci_stop_bus_device() one time for child devices at remove

The above changed the behavior of pci_remove_behind_bridge that
yenta_cardbus depended on.  So restore the old behavoir of
pci_remove_behind_bridge (which requires stopping and removing of all
devices) by:

1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
   __pci_remove_bus_device() call it instead.
2. add pci_stop_behind_bridge that will stop devices behind a bridge
3. add back pci_remove_behind_bridge that will stop and remove devices
   under bridge.

-v2: update commit description a little bit.

Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 12:19:31 -08:00
Vaidyanathan Srinivasan
8161fe91d8 PCI: set pci sriov page size before reading SRIOV BAR
For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
the PCI_SRIOV_BAR are queried.  The sys pagesize defaults to 4k,
so this change is required on powerpc box with 64k base page size.
    
This is a regression caused due to moving SRIOV init to sriov_enable().
    
| commit afd24ece5c
| Author: Ram Pai <linuxram@us.ibm.com>
    
| PCI: delay configuration of SRIOV capability
| The SRIOV capability, namely page size and total_vfs of a device are
| configured during enumeration phase of the device.  This can potentially
| interfere with the PCI operations of the platform, if the IOV capability
| of the device is not enabled.
    
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 12:01:56 -08:00
Yinghai Lu
71f6bd4a23 PCI: workaround hard-wired bus number V2
Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
when using ACPI resources (_CRS).
This is default, a manual workaround (without this patch)
would be pci=nocrs boot param.

V2: Add dev_warn if the workaround is hit. This should reveal
how common such setups are (via google) and point to possible
problems if things are still not working as expected.
-> Suggested by Jan Beulich.

Cc: stable@vger.kernel.org
Tested-by: garyhade@us.ibm.com
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-10 11:34:42 -08:00
Randy Dunlap
6e9292c588 kernel-doc: fix new warnings in pci
Fix new kernel-doc warnings:

Warning(drivers/pci/pci.c:2811): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2811): Excess function parameter 'pdev' description in 'pci_intx_mask_supported'
Warning(drivers/pci/pci.c:2894): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2894): Excess function parameter 'pdev' description in 'pci_check_and_mask_intx'
Warning(drivers/pci/pci.c:2908): No description found for parameter 'dev'
Warning(drivers/pci/pci.c:2908): Excess function parameter 'pdev' description in 'pci_check_and_unmask_intx'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-23 08:44:53 -08:00
Linus Torvalds
c49c41a413 Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
  capabilities: remove __cap_full_set definition
  security: remove the security_netlink_recv hook as it is equivalent to capable()
  ptrace: do not audit capability check when outputing /proc/pid/stat
  capabilities: remove task_ns_* functions
  capabitlies: ns_capable can use the cap helpers rather than lsm call
  capabilities: style only - move capable below ns_capable
  capabilites: introduce new has_ns_capabilities_noaudit
  capabilities: call has_ns_capability from has_capability
  capabilities: remove all _real_ interfaces
  capabilities: introduce security_capable_noaudit
  capabilities: reverse arguments to security_capable
  capabilities: remove the task from capable LSM hook entirely
  selinux: sparse fix: fix several warnings in the security server cod
  selinux: sparse fix: fix warnings in netlink code
  selinux: sparse fix: eliminate warnings for selinuxfs
  selinux: sparse fix: declare selinux_disable() in security.h
  selinux: sparse fix: move selinux_complete_init
  selinux: sparse fix: make selinux_secmark_refcount static
  SELinux: Fix RCU deref check warning in sel_netport_insert()

Manually fix up a semantic mis-merge wrt security_netlink_recv():

 - the interface was removed in commit fd77846152 ("security: remove
   the security_netlink_recv hook as it is equivalent to capable()")

 - a new user of it appeared in commit a38f7907b9 ("crypto: Add
   userspace configuration API")

causing no automatic merge conflict, but Eric Paris pointed out the
issue.
2012-01-14 18:36:33 -08:00
Rusty Russell
90ab5ee941 module_param: make bool parameters really bool (drivers & misc)
module_param(bool) used to counter-intuitively take an int.  In
fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
trick.

It's time to remove the int/unsigned int option.  For this version
it'll simply give a warning, but it'll break next kernel version.

Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-13 09:32:20 +10:30
Linus Torvalds
7b67e75147 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits)
  x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
  PCI: Increase resource array mask bit size in pcim_iomap_regions()
  PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES
  PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)
  PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB
  x86/PCI: amd: factor out MMCONFIG discovery
  PCI: Enable ATS at the device state restore
  PCI: msi: fix imbalanced refcount of msi irq sysfs objects
  PCI: kconfig: English typo in pci/pcie/Kconfig
  PCI/PM/Runtime: make PCI traces quieter
  PCI: remove pci_create_bus()
  xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources
  x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()
  x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()
  x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan
  sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources
  sparc/PCI: convert to pci_create_root_bus()
  sh/PCI: convert to pci_scan_root_bus() for correct root bus resources
  powerpc/PCI: convert to pci_create_root_bus()
  powerpc/PCI: split PHB part out of pcibios_map_io_space()
  ...

Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due
to the same patches being applied in other branches.
2012-01-11 18:50:26 -08:00
Linus Torvalds
1c8106528a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (53 commits)
  iommu/amd: Set IOTLB invalidation timeout
  iommu/amd: Init stats for iommu=pt
  iommu/amd: Remove unnecessary cache flushes in amd_iommu_resume
  iommu/amd: Add invalidate-context call-back
  iommu/amd: Add amd_iommu_device_info() function
  iommu/amd: Adapt IOMMU driver to PCI register name changes
  iommu/amd: Add invalid_ppr callback
  iommu/amd: Implement notifiers for IOMMUv2
  iommu/amd: Implement IO page-fault handler
  iommu/amd: Add routines to bind/unbind a pasid
  iommu/amd: Implement device aquisition code for IOMMUv2
  iommu/amd: Add driver stub for AMD IOMMUv2 support
  iommu/amd: Add stat counter for IOMMUv2 events
  iommu/amd: Add device errata handling
  iommu/amd: Add function to get IOMMUv2 domain for pdev
  iommu/amd: Implement function to send PPR completions
  iommu/amd: Implement functions to manage GCR3 table
  iommu/amd: Implement IOMMUv2 TLB flushing routines
  iommu/amd: Add support for IOMMUv2 domain mode
  iommu/amd: Add amd_iommu_domain_direct_map function
  ...
2012-01-10 11:08:21 -08:00
Linus Torvalds
90160371b3 Merge branch 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/for-linus-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (37 commits)
  xen/pciback: Expand the warning message to include domain id.
  xen/pciback: Fix "device has been assigned to X domain!" warning
  xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the "[un|]bind"
  xen/xenbus: don't reimplement kvasprintf via a fixed size buffer
  xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX
  xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX.
  Xen: consolidate and simplify struct xenbus_driver instantiation
  xen-gntalloc: introduce missing kfree
  xen/xenbus: Fix compile error - missing header for xen_initial_domain()
  xen/netback: Enable netback on HVM guests
  xen/grant-table: Support mappings required by blkback
  xenbus: Use grant-table wrapper functions
  xenbus: Support HVM backends
  xen/xenbus-frontend: Fix compile error with randconfig
  xen/xenbus-frontend: Make error message more clear
  xen/privcmd: Remove unused support for arch specific privcmp mmap
  xen: Add xenbus_backend device
  xen: Add xenbus device driver
  xen: Add privcmd device driver
  xen/gntalloc: fix reference counts on multi-page mappings
  ...
2012-01-10 10:09:59 -08:00
Joerg Roedel
00fb5430f5 Merge branches 'iommu/fixes', 'arm/omap' and 'x86/amd' into next
Conflicts:
	drivers/pci/hotplug/acpiphp_glue.c
2012-01-09 13:04:05 +01:00
Linus Torvalds
972b2c7199 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
  reiserfs: Properly display mount options in /proc/mounts
  vfs: prevent remount read-only if pending removes
  vfs: count unlinked inodes
  vfs: protect remounting superblock read-only
  vfs: keep list of mounts for each superblock
  vfs: switch ->show_options() to struct dentry *
  vfs: switch ->show_path() to struct dentry *
  vfs: switch ->show_devname() to struct dentry *
  vfs: switch ->show_stats to struct dentry *
  switch security_path_chmod() to struct path *
  vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
  vfs: trim includes a bit
  switch mnt_namespace ->root to struct mount
  vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
  vfs: opencode mntget() mnt_set_mountpoint()
  vfs: spread struct mount - remaining argument of next_mnt()
  vfs: move fsnotify junk to struct mount
  vfs: move mnt_devname
  vfs: move mnt_list to struct mount
  vfs: switch pnode.h macros to struct mount *
  ...
2012-01-08 12:19:57 -08:00
Konrad Rzeszutek Wilk
76ccc29701 x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
The MSI restore function will become a function pointer in an
x86_msi_ops struct. It defaults to the implementation in the
io_apic.c and msi.c. We piggyback on the indirection mechanism
introduced by "x86: Introduce x86_msi_ops".

Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 14:02:26 -08:00
Linus Torvalds
67b0243131 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Skip cpus with apic-ids >= 255 in !x2apic_mode
  x86, x2apic: Allow "nox2apic" to disable x2apic mode setup by BIOS
  x86, x2apic: Fallback to xapic when BIOS doesn't setup interrupt-remapping
  x86, acpi: Skip acpi x2apic entries if the x2apic feature is not present
  x86, apic: Add probe() for apic_flat
  x86: Simplify code by removing a !SMP #ifdefs from 'struct cpuinfo_x86'
  x86: Convert per-cpu counter icr_read_retry_count into a member of irq_stat
  x86: Add per-cpu stat counter for APIC ICR read tries
  pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
  x86: Fix the !CONFIG_NUMA build of the new CPU ID fixup code support
  x86: Add NumaChip support
  x86: Add x86_init platform override to fix up NUMA core numbering
  x86: Make flat_init_apic_ldr() available
2012-01-06 13:58:21 -08:00
Hao, Xudong
1900ca132f PCI: Enable ATS at the device state restore
During S3 or S4 resume or PCI reset, ATS regs aren't restored correctly.
This patch enables ATS at the device state restore if PCI device has ATS
capability.

Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:18 -08:00
Neil Horman
424eb39159 PCI: msi: fix imbalanced refcount of msi irq sysfs objects
This warning was recently reported to me:

------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
Hardware name: VMware Virtual Platform
kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
being called.
Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
vmw_pvscsi
Pid: 630, comm: modprobe Tainted: G        W   3.1.6-1.fc16.x86_64 #1
Call Trace:
 [<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff810da293>] ? free_desc+0x63/0x70
 [<ffffffff812a9aa0>] kobject_put+0x50/0x60
 [<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120
 [<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0
 [<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
 [<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
 [<ffffffff812d141c>] local_pci_probe+0x5c/0xd0
 [<ffffffff812d2cb9>] pci_device_probe+0x109/0x130
 [<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0
 [<ffffffff8138bceb>] __driver_attach+0xab/0xb0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90
 [<ffffffff8138b63e>] driver_attach+0x1e/0x20
 [<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff8138c246>] driver_register+0x76/0x140
 [<ffffffff815ca414>] ? printk+0x51/0x53
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff812d2996>] __pci_register_driver+0x56/0xd0
 [<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
 [<ffffffff81002042>] do_one_initcall+0x42/0x180
 [<ffffffff810aad71>] sys_init_module+0x91/0x200
 [<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b
---[ end trace 44593438a59a9558 ]---
Using INTx interrupt, #Rx queues: 1.

It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
be called.  Because populate_msi_sysfs fails, we never registered any of the
msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
on each of them, which gets flagged in the above stack trace.

The fix is pretty straightforward.  We can key of the parent pointer in the
kobject.  It is only set if the kobject_init_and_add succededs in
populate_msi_sysfs.  If anything fails there, each kobject has its parent reset
to NULL

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: linux-pci@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:17 -08:00
P. Christeas
d56641c772 PCI: kconfig: English typo in pci/pcie/Kconfig
Just fix this help text.

Signed-off-by: P. Christeas <xrg@linux.gr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:17 -08:00
Vincent Palatin
85b8582d7c PCI/PM/Runtime: make PCI traces quieter
When the runtime PM is activated on PCI, if a device switches state
frequently (e.g. an EHCI controller with autosuspending USB devices
connected) the PCI configuration traces might be very verbose in the
kernel log.  Let's guard those traces with DEBUG condition.

Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:16 -08:00
Bjorn Helgaas
118faafaf9 PCI: remove pci_create_bus()
All users of pci_create_bus() have been converted to pci_create_root_bus(),
so remove it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:15 -08:00
Bjorn Helgaas
7e00fe2e53 PCI: deprecate pci_scan_bus_parented()
Users of pci_scan_bus_parented() should be converted to use either
    pci_scan_root_bus() (preferred, but also calls pci_bus_add_devices)
or
    pci_create_root_bus()
    pci_scan_child_bus()

Since pci_scan_bus_parented(), I'm marking it deprecated now and will
actually remove it later.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:55 -08:00
Bjorn Helgaas
1e39ae9f90 PCI: convert pci_scan_bus_parented() to use pci_create_root_bus()
This converts pci_scan_bus_parented() to use pci_create_root_bus()
instead of pci_create_bus().  The new bus still has the default (incorrect)
resources, so this patch doesn't help fix that problem, but it does remove
one more use of pci_create_bus().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:54 -08:00
Bjorn Helgaas
de4b2f76d6 PCI: convert pci_scan_bus() to use pci_create_root_bus()
I plan to deprecate pci_scan_bus_parented(), so use pci_create_root_bus()
directly instead.  pci_scan_bus() itself will be removed as soon as all
callers are gone, so this is just an interim step.

v2: export pci_scan_bus

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:53 -08:00
Bjorn Helgaas
a2ebb82795 PCI: add pci_scan_root_bus() that accepts resource list
"Early" and "header" quirks often use incorrect bus resources because they
see the default resources assigned by pci_create_bus(), before the
architecture fixes them up (typically in pcibios_fixup_bus()).  Regions
reserved by these quirks end up with the wrong parents.

Here's the standard path for scanning a PCI root bus:

  pci_scan_bus or pci_scan_bus_parented
    pci_create_bus                     <-- A create with default resources
    pci_scan_child_bus
      pci_scan_slot
        pci_scan_single_device
          pci_scan_device
            pci_setup_device
              pci_fixup_device(early)  <-- B
          pci_device_add
            pci_fixup_device(header)   <-- C
      pcibios_fixup_bus                <-- D fill in correct resources

Early and header quirks at B and C use the default (incorrect) root bus
resources rather than those filled in at D.

This patch adds a new pci_scan_root_bus() function that sets the bus
resources correctly from a supplied list of resources.

I intend to remove pci_scan_bus() and pci_scan_bus_parented() after
fixing all callers.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:52 -08:00
Bjorn Helgaas
166c637075 PCI: add pci_create_root_bus() that accepts resource list
pci_create_bus() assigns ioport_resource and iomem_resource as the default
bus resources, i.e., the entire address space.  Architectures fix these
later, typically in pcibios_fixup_bus() or after pci_scan_bus_parented()
returns, but code that runs in the interim sees incorrect resource
information.

This patch adds a new pci_create_root_bus() that sets the bus resources
correctly from a supplied list of resources.

I intend to remove pci_create_bus() after changing all callers.

Based on original patch by Deng-Cheng Zhu.

Reference: http://www.spinics.net/lists/mips/msg41654.html
Reference: https://lkml.org/lkml/2011/8/26/88
Signed-off-by: Deng-Cheng Zhu <dczhu@mips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:51 -08:00
Bjorn Helgaas
a9d9f5276c PCI: show host bridges and root bus resources
Show the bus number and resources for every root bus we create.  This
will become more interesting when we supply the correct resources
instead of using the defaults (ioport_resource and iomem_resource).

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:51 -08:00
Bjorn Helgaas
45ca9e9730 PCI: add helpers for building PCI bus resource lists
We'd like to supply a list of resources when we create a new PCI bus,
e.g., the root bus under a PCI host bridge.  These are helpers for
constructing that list.

These are exported because the plan is to replace this exported interface:
    pci_scan_bus_parented()
with this one:
    pci_add_resource(resources, ...)
    pci_scan_root_bus(..., resources)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:50 -08:00
Ram Pai
afd24ece5c PCI: delay configuration of SRIOV capability
The SRIOV capability, namely page size and total_vfs of a device are
configured during enumeration phase of the device.  This can potentially
interfere with the PCI operations of the platform, if the IOV capability
of the device is not enabled.

The following patch postpones the configuration of the IOV capability of
the device to a later point, when the IOV capability is explicitly
enabled by the device driver.

The patch is tested on x86 and power platform.

Tested-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:49 -08:00
Yinghai Lu
79cc9601c3 PCI: Only call pci_stop_bus_device() one time for child devices at remove
During debugging pcie hotplug with SRIOV with pcie switch, I found
pci_stop_bus_device() is called several times for some child devices.

So change original pci_remove_bus_device() to __pci_remove_bus_device(),
and make it only do remove work, and add a new pci_remove_bus_device
that calls pci_stop_bus_device() one time, and then call
__pci_remove_bus_device().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:48 -08:00
Myron Stowe
f676678f89 PCI: latency timer doesn't apply to PCIe
The latency timer is read-only and hardwired to zero for all PCIe
devices, both Type 0 and Type 1, so don't bother trying to update it
and cluttering the dmesg log with meaningless "setting latency timer
to 64" messages.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:47 -08:00
Myron Stowe
96c5590058 PCI: Pull PCI 'latency timer' setup up into the core
The 'latency timer' of PCI devices, both Type 0 and Type 1,
is setup in architecture-specific code [see: 'pcibios_set_master()'].
There are two approaches being taken by all the architectures - check
if the 'latency timer' is currently set between 16 and 255 and if not
bring it within bounds, or, do nothing (and then there is the
gratuitously different PA-RISC implementation).

There is nothing architecture-specific about PCI's 'latency timer' so
this patch pulls its setup functionality up into the PCI core by
creating a generic 'pcibios_set_master()' function using the '__weak'
attribute which can be used by all architectures as a default which,
if necessary, can then be over-ridden by architecture-specific code.

No functional change.

Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:42 -08:00
Jan Kiszka
a2e27787f8 PCI: Introduce INTx check & mask API
These new PCI services allow to probe for 2.3-compliant INTx masking
support and then use the feature from PCI interrupt handlers. The
services are properly synchronized with concurrent config space access
via sysfs or on device reset.

This enables generic PCI device drivers like uio_pci_generic or KVM's
device assignment to implement the necessary kernel-side IRQ handling
without any knowledge about device-specific interrupt status and control
registers.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:34 -08:00
Jan Kiszka
fb51ccbf21 PCI: Rework config space blocking services
pci_block_user_cfg_access was designed for the use case that a single
context, the IPR driver, temporarily delays user space accesses to the
config space via sysfs. This assumption became invalid by the time
pci_dev_reset was added as locking instance. Today, if you run two loops
in parallel that reset the same device via sysfs, you end up with a
kernel BUG as pci_block_user_cfg_access detect the broken assumption.

This reworks the pci_block_user_cfg_access to a sleeping service
pci_cfg_access_lock and an atomic-compatible variant called
pci_cfg_access_trylock. The former not only blocks user space access as
before but also waits if access was already locked. The latter service
just returns false in this case, allowing the caller to resolve the
conflict instead of raising a BUG.

Adaptions of the ipr driver were originally written by Brian King.

Acked-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:33 -08:00
Zac Storer
68e35c9b0b PCI: fix a brace coding style issue in probe.c
Fixed a brace coding style issue.

Signed-off-by: Zac Storer <zac.3.14159@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:31 -08:00
David Fries
82440a8253 PCI: pci_has_legacy_pm_support add driver and device to WARN
Include the driver name and device in warning when a pci driver
supports both legacy pm and new framework as just the stack trace
gives no way to identify the driver.

Signed-off-by: David Fries <David@Fries.net>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:30 -08:00
Eric W. Biederman
d5dea7d95c PCI: msi: Disable msi interrupts when we initialize a pci device
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup.  The booting kernel had not enabled those interupts so
was not prepared to handle them.

I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts.  The pci
spec specifies that msi interrupts should be off by default.  Drivers
are expected to enable the msi interrupts if they want to use them.  Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.

This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:29 -08:00
Rafael J. Wysocki
4716a450eb PCI/ACPI/PM: Avoid resuming devices that don't signal PME
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:29 -08:00
Kenji Kaneshige
ab4ca7821f PCI: pciehp: Handle push button event asynchronously
Use non-ordered workqueue for attention button events.

Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:28 -08:00
Kenji Kaneshige
863b7eb583 PCI: pciehp: Fix wrong workqueue cleanup
Fix improper workqueue cleanup.

In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:27 -08:00
Matthew Garrett
10f6dc7eed PCI: Rework ASPM disable code
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.

This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.

It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.

Measured to save around 5W on an idle Thinkpad X220.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:26 -08:00
Alex Williamson
cfa4d8cc56 PCI: Fix PRI and PASID consistency
These are extended capabilities, rename and move to proper
group for consistency.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:26 -08:00
Neil Horman
da8d1c8ba4 PCI/sysfs: add per pci device msi[x] irq listing (v5)
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs.  For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:25 -08:00
Eric Paris
b7e724d303 capabilities: reverse arguments to security_capable
security_capable takes ns, cred, cap.  But the LSM capable() hook takes
cred, ns, cap.  The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!

Signed-off-by: Eric Paris <eparis@redhat.com>
2012-01-05 18:52:53 -05:00
Jan Beulich
73db144b58 Xen: consolidate and simplify struct xenbus_driver instantiation
The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.

Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).

Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.

v2: Restore DRV_NAME for the driver name in xen-pciback.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-01-04 17:01:17 -05:00
Al Viro
587a1f1659 switch ->is_visible() to returning umode_t
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:55 -05:00
Yinghai Lu
497f16f21a pci: Fix hotplug of Express Module with pci bridges
I noticed that hotplug of one setup does not work with recent change in
pci tree.

After checking the bridge conf setup, I noticed that the bridges get
assigned but do not get enabled.

The reason is the following commit, while simply ignores bridge
resources when enabling a pci device:

| commit bbef98ab0f
| Author: Ram Pai <linuxram@us.ibm.com>
| Date:   Sun Nov 6 10:33:10 2011 +0800
|
|    PCI: defer enablement of SRIOV BARS
|...
|    NOTE: Note, there is subtle change in the pci_enable_device() API.  Any
|    driver that depends on SRIOV BARS to be enabled in pci_enable_device()
|    can fail.

Put back bridge resource and ROM resource checking to fix the problem.

That should fix regression like BIOS does not assign correct resource to
bridge.

Discussion can be found at:
	http://www.spinics.net/lists/linux-pci/msg12874.html

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-18 14:10:16 -08:00
Ajaykumar Hotchandani
b51306c634 PCI: Set device power state to PCI_D0 for device without native PM support
During test of one IB card with guest VM, found that, msi is not
initialized properly.

It turns out __write_msi_msg will do nothing if device current_state is
not PCI_D0.  And, that pci device does not have pm_cap in guest VM.

There is an error in setting of power state to PCI_D0 in
pci_enable_device(), but error is not returned for this.  Following is
code flow:

pci_enable_device() -->   __pci_enable_device_flags() -->
do_pci_enable_device() -->   pci_set_power_state() -->
__pci_start_power_transition()

We have following condition inside __pci_start_power_transition():
         if (platform_pci_power_manageable(dev)) {
                 error = platform_pci_set_power_state(dev, state);
                 if (!error)
                         pci_update_current_state(dev, state);
         } else {
                 error = -ENODEV;
                 /* Fall back to PCI_D0 if native PM is not supported */
                 if (!dev->pm_cap)
                         dev->current_state = PCI_D0;
         }

Here, from platform_pci_set_power_state(), acpi_pci_set_power_state() is
getting called and that is failing with ENODEV because of following
condition:

         if (!handle || ACPI_SUCCESS(acpi_get_handle(handle, "_EJ0",&tmp)))
                 return -ENODEV;

Because of that, pci_update_current_state() is not getting called.

With this patch, if device power state can not be set via
platform_pci_set_power_state and that device does not have native pm
support, then PCI device power state will be set to PCI_D0.

-v2: This also reverts 47e9037ac1, as it's
     not needed after this change.

Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ajaykumar Hotchandani<ajaykumar.hotchandani@oracle.com>
Signed-off-by: Yinghai Lu<yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-14 08:26:42 -08:00
Rafael J. Wysocki
619a5182d1 PCI hotplug: Always allow acpiphp to handle non-PCIe bridges
Commit 0d52f54e2e (PCI / ACPI: Make
acpiphp ignore root bridges using PCIe native hotplug) added code
that made the acpiphp driver completely ignore PCIe root complexes
for which the kernel had been granted control of the native PCIe
hotplug feature by the BIOS through _OSC.  Unfortunately, however,
this was a mistake, because on some systems there were PCI bridges
supporting PCI (non-PCIe) hotplug under such root complexes and
those bridges should have been handled by acpiphp.

For this reason, revert the changes made by the commit mentioned
above and make register_slot() in drivers/pci/hotplug/acpiphp_glue.c
avoid registering hotplug slots for PCIe ports that belong to
root complexes with native PCIe hotplug enabled (which means that
the BIOS has granted the kernel control of this feature for the
given root complex).  This is reported to address the original
issue fixed by commit 0d52f54e2e and
to work on the system where that commit broke things.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-13 10:41:23 -08:00
Jan Beulich
b95a7bd700 pci, x86/io-apic: Allow PCI_IOAPIC to be user configurable on x86
This adjusts PCI_IOAPIC to be user configurable (possibly as a
module) on x86, since the base architecture code for adding
IO-APICs dynamically isn't there yet (and hence having the code
present everywhere is pretty pointless).

To make this consistent, a MODULE_DEVICE_TABLE() declaration
gets added, the class specifications get corrected (by properly
using PCI_DEVICE_CLASS() intended for purposes like this), and
the probe and remove functions get their sections adjusted.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/4EDDD71A02000078000659F1@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06 09:21:05 +01:00
James Bottomley
8c45194567 PCI: fix ats compile failure
I get this compile failure on parisc:

drivers/pci/ats.c: In function 'ats_alloc_one':
drivers/pci/ats.c:29: error: implicit declaration of function 'kzalloc'
drivers/pci/ats.c:29: warning: assignment makes pointer from integer without a cast
drivers/pci/ats.c: In function 'ats_free_one':
drivers/pci/ats.c:45: error: implicit declaration of function 'kfree'

Because ats.c is missing linux/slab.h as an include.  This patch fixes it

Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:31:25 -08:00
Ram Pai
bbef98ab0f PCI: defer enablement of SRIOV BARS
All the PCI BARs of a device are enabled when the device is enabled
using pci_enable_device().  This unnecessarily enables SRIOV BARs of the
device.

On some platforms, which do not support SRIOV as yet, the
pci_enable_device() fails to enable the device if its SRIOV BARs are not
allocated resources correctly.

The following patch fixes the above problem. The SRIOV BARs are now
enabled when IOV capability of the device is enabled in sriov_enable().

NOTE: Note, there is subtle change in the pci_enable_device() API.  Any
driver that depends on SRIOV BARS to be enabled in pci_enable_device()
can fail.

The patch has been touch tested on power and x86 platform.

Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:30:22 -08:00
Alex Williamson
91f57d5e1b PCI: More PRI/PASID cleanup
More consistency cleanups.  Drop the _OFF, separate and indent
CTRL/CAP/STATUS bit definitions.  This helped find the previous
mis-use of bit 0 in the PASID capability register.

Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:22:15 -08:00
Alex Williamson
60fe823837 PCI: Enable is not exposed as a PASID capability
The PASID ECN indicates bit 0 is reserved in the capability register.
Switch pci_enable_pasid() to error if PASID is already enabled and
don't expose enable as a feature in pci_pasid_features().

Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:22:03 -08:00
Eric W. Biederman
a776c491ca PCI: msi: Disable msi interrupts when we initialize a pci device
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup.  The booting kernel had not enabled those interupts so
was not prepared to handle them.

I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts.  The pci
spec specifies that msi interrupts should be off by default.  Drivers
are expected to enable the msi interrupts if they want to use them.  Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.

This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:50 -08:00
Rafael J. Wysocki
a424948dde PCI/ACPI/PM: Avoid resuming devices that don't signal PME
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:49 -08:00
Rafael J. Wysocki
d90116ea38 PCI/ACPI: Make acpiphp ignore root bridges using SHPC native hotplug
If the kernel has requested control of the SHPC native hotplug
feature for a given root bridge, the acpiphp driver should not try
to handle that root bridge and it should leave it to shpchp.
Failing to do so causes problems to happen if shpchp is loaded
and unloaded before loading acpiphp (ACPI-based hotplug won't work
in that case anyway).

To address this issue make find_root_bridges() ignore PCI root
bridges with SHPC native hotplug enabled and make add_bridge()
return error code if SHPC native hotplug is enabled for the given
root bridge.  This causes acpiphp to refuse to load if SHPC native
hotplug is enabled for all root bridges and to refuse binding to
the root bridges with SHPC native hotplug enabled.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:48 -08:00
Kenji Kaneshige
486b10b9f4 PCI: pciehp: Handle push button event asynchronously
Use non-ordered workqueue for attention button events.

Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:47 -08:00
Kenji Kaneshige
027e8d52ab PCI: pciehp: Fix wrong workqueue cleanup
Fix improper workqueue cleanup.

In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:46 -08:00
Matthew Garrett
3c076351c4 PCI: Rework ASPM disable code
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.

This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.

It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.

Measured to save around 5W on an idle Thinkpad X220.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:45 -08:00
Alex Williamson
69166fbf02 PCI: Fix PRI and PASID consistency
These are extended capabilities, rename and move to proper
group for consistency.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:45 -08:00
Neil Horman
b50cac55bf PCI/sysfs: add per pci device msi[x] irq listing (v5)
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs.  For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-12-05 10:21:44 -08:00
Linus Torvalds
09521577ca Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
  PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs
  PCI: pciehp: wait 100 ms after Link Training check
  PCI: pciehp: wait 1000 ms before Link Training check
  PCI: pciehp: Retrieve link speed after link is trained
  PCI: Let PCI_PRI depend on PCI
  PCI: Fix compile errors with PCI_ATS and !PCI_IOV
  PCI / ACPI: Make acpiphp ignore root bridges using PCIe native hotplug
2011-11-23 14:58:46 -08:00
Bjorn Helgaas
4cac2eb158 PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs
Previously we claimed device ID 0x7450, regardless of the vendor, which is
clearly wrong.  Now we'll claim that device ID only for AMD.

I suspect this was just a typo in the original code, but it's possible this
change will break shpchp on non-7450 AMD bridges.  If so, we'll have to fix
them as we find them.

Reference: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638863
Reported-by: Ralf Jung <ralfjung-e@gmx.de>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-14 09:43:14 -08:00
Kenji Kaneshige
b3c0045422 PCI: pciehp: wait 100 ms after Link Training check
If the port supports Link speeds greater than 5.0 GT/s, we must wait
for 100 ms after Link training completes before sending configuration
request.

Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-11 09:31:43 -08:00
Kenji Kaneshige
0027cb3e19 PCI: pciehp: wait 1000 ms before Link Training check
We need to wait for 1000 ms after Data Link Layer Link Active (DLLLA)
bit reads 1b before sending configuration request. Currently pciehp
does this wait after checking Link Training (LT) bit. But we need it
before checking LT bit because LT is still set even after DLLLA bit is
set on some platforms.

Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-11 09:31:34 -08:00
Yinghai Lu
fdbd3ce9ef PCI: pciehp: Retrieve link speed after link is trained
During hot plug, board_added will call pciehp_power_on_slot().
But link speed is updated in pciehp_power_on_slot().

We should not update link speed there, because that is too early.

So move the link speed update to pciehp_check_link_status() after making
sure the link has been trained.

-v2: fix compile warning that Kenji found.

Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-11-07 08:07:34 -08:00
Linus Torvalds
32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Linus Torvalds
02ebbbd481 Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  scsi: drop unused Kconfig symbol
  pci: drop unused Kconfig symbol
  stmmac: drop unused Kconfig symbol
  x86: drop unused Kconfig symbol
  powerpc: drop unused Kconfig symbols
  powerpc: 40x: drop unused Kconfig symbol
  mips: drop unused Kconfig symbols
  openrisc: drop unused Kconfig symbols
  arm: at91: drop unused Kconfig symbol
  samples: drop unused Kconfig symbol
  m32r: drop unused Kconfig symbol
  score: drop unused Kconfig symbols
  sh: drop unused Kconfig symbol
  um: drop unused Kconfig symbol
  sparc: drop unused Kconfig symbol
  alpha: drop unused Kconfig symbol

Fix up trivial conflict in drivers/net/ethernet/stmicro/stmmac/Kconfig
as per Michal: the STMMAC_DUAL_MAC config variable is still unused and
should be deleted.
2011-11-06 18:54:53 -08:00
Paul Gortmaker
eefa9cfc89 pci: add module.h to files implicitly relying on its presence.
These were getting module.h implicitly from device.h but we want
to clean that up, so we fix it here to avoid things like:

pci/slot.c: In function ‘pci_hp_create_module_link’:
pci/slot.c:383: error: ‘module_kset’ undeclared (first use in this function)

Similarly, rpadlpar_core.c is modular, so add module.h to its includes.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:23 -04:00
Paul Gortmaker
363c75db1d pci: Fix files needing export.h for EXPORT_SYMBOL/THIS_MODULE
They were implicitly getting it from device.h --> module.h but
we want to clean that up.  So add the minimal header for these
macros.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:22 -04:00
Paul Bolle
a8d2de5e55 pci: drop unused Kconfig symbol
There's no other Kconfig symbol that depends on XEN_PCIDEV_FE_DEBUG.
Neither is there anything that uses CONFIG_XEN_PCIDEV_FE_DEBUG.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-10-31 23:40:16 +01:00
Joerg Roedel
c54420d330 PCI: Let PCI_PRI depend on PCI
This avoids the PCI_PRI question in 'make config' when PCI
is not selected.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-31 10:23:58 -07:00
Rafael J. Wysocki
0d52f54e2e PCI / ACPI: Make acpiphp ignore root bridges using PCIe native hotplug
If the kernel has requested control of the PCIe native hotplug
feature for a given root complex, the acpiphp driver should not try
to handle that root complex and it should leave it to pciehp.
Failing to do so causes problems to happen if acpiphp is loaded
before pciehp on such systems.

To address this issue make find_root_bridges() ignore PCIe root
complexes with PCIe native hotplug enabled and make add_bridge()
return error code if PCIe native hotplug is enabled for the given
root port.  This causes acpiphp to refuse to load if PCIe native
hotplug is enabled for all complexes and to refuse binding to
the root complexes with PCIe native hotplug is enabled.

Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-31 10:17:43 -07:00
Linus Torvalds
0e59e7e7fe Merge branch 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
  PCI: Clean-up MPS debug output
  pci: Clamp pcie_set_readrq() when using "performance" settings
  PCI: enable MPS "performance" setting to properly handle bridge MPS
  PCI: Workaround for Intel MPS errata
  PCI: Add support for PASID capability
  PCI: Add implementation for PRI capability
  PCI: Export ATS functions to modules
  PCI: Move ATS implementation into own file
  PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
  PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
  PCI / PM: Extend PME polling to all PCI devices
  PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
  PCI: Make pci_setup_bridge() non-static for use by arch code
  x86: constify PCI raw ops structures
  PCI: Add quirk for known incorrect MPSS
  PCI: Add Solarflare vendor ID and SFC4000 device IDs
2011-10-28 14:20:44 -07:00
Jon Mason
a513a99a7c PCI: Clean-up MPS debug output
Clean-up MPS debug output to make it a single line and aligned, thus
making it more readable for a large number of buses and devices in a
single system.

Suggested by Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:44 -07:00
Benjamin Herrenschmidt
a1c473aa11 pci: Clamp pcie_set_readrq() when using "performance" settings
When configuring the PCIe settings for "performance", we allow parents
to have a larger Max Payload Size than children and rely on children
Max Read Request Size to not be larger than their own MPS to avoid
having the host bridge generate responses they can't cope with.

However, various drivers in Linux call pci_set_readrq() with arbitrary
values, assuming this to be a simple performance tweak. This breaks
under our "performance" configuration.

Fix that by making sure the value programmed by pcie_set_readrq() is
never larger than the configured MPS for that device.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:44 -07:00
Jon Mason
62f392ea5b PCI: enable MPS "performance" setting to properly handle bridge MPS
Rework the "performance" MPS option to configure the device MPS with the
smaller of the device MPSS or the bridge MPS (which is assumed to be
properly configured at this point to the largest allowable MPS based on
its parent bus).

Also, rework the MRRS setting to report an inability to set the MRRS to
a valid setting.

Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:43 -07:00
Jon Mason
d387a8d666 PCI: Workaround for Intel MPS errata
Intel 5000 and 5100 series memory controllers have a known issue if read
completion coalescing is enabled and the PCI-E Maximum Payload Size is
set to 256B.  To work around this issue, disable read completion
coalescing in the memory controller and root complexes.  Unfortunately,
it must always be disabled, even if no 256B MPS devices are present, due
to the possibility of one being hotplugged.

Links to erratas:
http://www.intel.com/content/dam/doc/specification-update/5000-chipset-memory-controller-hub-specification-update.pdf
http://www.intel.com/content/dam/doc/specification-update/5100-memory-controller-hub-chipset-specification-update.pdf

Thanks to Jesse Brandeburg and Ben Hutchings for providing insight into
the problem.

Tested-and-Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:42 -07:00
Linus Torvalds
982653009b Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, ioapic: Consolidate the explicit EOI code
  x86, ioapic: Restore the mask bit correctly in eoi_ioapic_irq()
  x86, kdump, ioapic: Reset remote-IRR in clear_IO_APIC
  iommu: Rename the DMAR and INTR_REMAP config options
  x86, ioapic: Define irq_remap_modify_chip_defaults()
  x86, msi, intr-remap: Use the ioapic set affinity routine
  iommu: Cleanup ifdefs in detect_intel_iommu()
  iommu: No need to set dmar_disabled in check_zero_address()
  iommu: Move IOMMU specific code to intel-iommu.c
  intr_remap: Call dmar_dev_scope_init() explicitly
  x86, x2apic: Enable the bios request for x2apic optout
2011-10-26 16:11:53 +02:00
Linus Torvalds
04a8752485 Merge branches 'stable/drivers-3.2', 'stable/drivers.bugfixes-3.2' and 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/drivers-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus: don't rely on xen_initial_domain to detect local xenstore
  xenbus: Fix loopback event channel assuming domain 0
  xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain'
  xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel
  xen/pv-on-hvm kexec: update xs_wire.h:xsd_sockmsg_type from xen-unstable
  xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel
  xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports
  xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive

* 'stable/drivers.bugfixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pciback: Check if the device is found instead of blindly assuming so.
  xen/pciback: Do not dereference psdev during printk when it is NULL.
  xen: remove XEN_PLATFORM_PCI config option
  xen: XEN_PVHVM depends on PCI
  xen/pciback: double lock typo
  xen/pciback: use mutex rather than spinlock in vpci backend
  xen/pciback: Use mutexes when working with Xenbus state transitions.
  xen/pciback: miscellaneous adjustments
  xen/pciback: use mutex rather than spinlock in passthrough backend
  xen/pciback: use resource_size()

* 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pci: support multi-segment systems
  xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.
  xen/pci: make bus notifier handler return sane values
  xen-swiotlb: fix printk and panic args
  xen-swiotlb: Fix wrong panic.
  xen-swiotlb: Retry up three times to allocate Xen-SWIOTLB
  xen-pcifront: Update warning comment to use 'e820_host' option.
2011-10-25 09:19:36 +02:00
Joerg Roedel
086ac11f64 PCI: Add support for PASID capability
Devices supporting Process Address Space Identifiers
(PASIDs) can use an IOMMU to access multiple IO address
spaces at the same time. A PCIe device indicates support for
this feature by implementing the PASID capability. This
patch adds support for the capability to the Linux kernel.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:35 -07:00
Joerg Roedel
c320b976d7 PCI: Add implementation for PRI capability
Implement the necessary functions to handle PRI capabilities
on PCIe devices. With PRI devices behind an IOMMU can signal
page fault conditions to software and recover from such
faults.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:34 -07:00
Joerg Roedel
d4c0636c21 PCI: Export ATS functions to modules
This patch makes the ATS functions usable for modules.
They will be used by a module implementing some advanced
AMD IOMMU features.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:34 -07:00
Joerg Roedel
db3c33c6d3 PCI: Move ATS implementation into own file
ATS does not depend on IOV support, so move the code into
its own file. This file will also include support for the
PRI and PASID capabilities later.
Also give ATS its own Kconfig variable to allow selecting it
without IOV support.

Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:33 -07:00
Rafael J. Wysocki
78d090b0be PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
The result returned by acpi_dev_run_wake() is always either -EINVAL
or -ENODEV, while obviously it should return 0 on success.  The
problem is that the leftover error variable, that's not really used
in the function, is initialized with -ENODEV and then returned
without modification.

To fix this issue remove the error variable from acpi_dev_run_wake()
and make the function return 0 on success as appropriate.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:32 -07:00
Prarit Bhargava
6af8bef14d PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
I originally submitted a patch to workaround this by pushing all Ejection
Requests and Device Checks onto the kacpi_hotplug queue.

http://marc.info/?l=linux-acpi&m=131678270930105&w=2

The patch is still insufficient in that Bus Checks also need to be added.

Rather than add all events, including non-PCI-hotplug events, to the
hotplug queue, mjg suggested that a better approach would be to modify
the acpiphp driver so only acpiphp events would be added to the
kacpi_hotplug queue.

It's a longer patch, but at least we maintain the benefit of having separate
queues in ACPI.  This, of course, is still only a workaround the problem.
As Bjorn and mjg pointed out, we have to refactor a lot of this code to do
the right thing but at this point it is a better to have this code working.

The acpi core places all events on the kacpi_notify queue.  When the acpiphp
driver is loaded and a PCI card with a PCI-to-PCI bridge is removed the
following call sequence occurs:

cleanup_p2p_bridge()
	    -> cleanup_bridge()
		    -> acpi_remove_notify_handler()
			    -> acpi_os_wait_events_complete()
				    -> flush_workqueue(kacpi_notify_wq)

which is the queue we are currently executing on and the process will hang.

Move all hotplug acpiphp events onto the kacpi_hotplug workqueue.  In
handle_hotplug_event_bridge() and handle_hotplug_event_func() we can simply
push the rest of the work onto the kacpi_hotplug queue and then avoid the
deadlock.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: mjg@redhat.com
Cc: bhelgaas@google.com
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:31 -07:00
Rafael J. Wysocki
379021d5c0 PCI / PM: Extend PME polling to all PCI devices
The land of PCI power management is a land of sorrow and ugliness,
especially in the area of signaling events by devices.  There are
devices that set their PME Status bits, but don't really bother
to send a PME message or assert PME#.  There are hardware vendors
who don't connect PME# lines to the system core logic (they know
who they are).  There are PCI Express Root Ports that don't bother
to trigger interrupts when they receive PME messages from the devices
below.  There are ACPI BIOSes that forget to provide _PRW methods for
devices capable of signaling wakeup.  Finally, there are BIOSes that
do provide _PRW methods for such devices, but then don't bother to
call Notify() for those devices from the corresponding _Lxx/_Exx
GPE-handling methods.  In all of these cases the kernel doesn't have
a chance to receive a proper notification that it should wake up a
device, so devices stay in low-power states forever.  Worse yet, in
some cases they continuously send PME Messages that are silently
ignored, because the kernel simply doesn't know that it should clear
the device's PME Status bit.

This problem was first observed for "parallel" (non-Express) PCI
devices on add-on cards and Matthew Garrett addressed it by adding
code that polls PME Status bits of such devices, if they are enabled
to signal PME, to the kernel.  Recently, however, it has turned out
that PCI Express devices are also affected by this issue and that it
is not limited to add-on devices, so it seems necessary to extend
the PME polling to all PCI devices, including PCI Express and planar
ones.  Still, it would be wasteful to poll the PME Status bits of
devices that are known to receive proper PME notifications, so make
the kernel (1) poll the PME Status bits of all PCI and PCIe devices
enabled to signal PME and (2) disable the PME Status polling for
devices for which correct PME notifications are received.

Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:31 -07:00
Josh Boyer
3e309cdf07 PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
Commit 15bed0f2f added a quirk for the e823 Ricoh card reader to lower the
base frequency.  However, the quirk first checks to see if the proprietary
MMC controller is disabled, and returns if so.  On some devices, such as the
Lenovo X220, the MMC controller is already disabled by firmware it seems,
but the frequency change is still needed so sdhci-pci can talk to the cards.
Since the MMC controller is disabled, the frequency fixup was never being run
on these machines.

This moves the e823 check above the MMC controller check so that it always
gets run.

This fixes https://bugzilla.redhat.com/show_bug.cgi?id=722509

Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:30 -07:00
Benjamin Herrenschmidt
e24442733e PCI: Make pci_setup_bridge() non-static for use by arch code
The "powernv" platform of the powerpc architecture needs to assign PCI
resources using a specific algorithm to fit some HW constraints of
the IBM "IODA" architecture (related to the ability to create error
handling domains that encompass specific segments of MMIO space).

For doing so, it wants to call pci_setup_bridge() from architecture
specific resource management in order to configure bridges after all
resources have been assigned. So make it non-static.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:29 -07:00
Ben Hutchings
a94d072b20 PCI: Add quirk for known incorrect MPSS
Using legacy interrupts and TLPs > 256 bytes on the SFC4000 (all
revisions) may cause interrupt messages to be replayed.  In some
systems this results in a non-recoverable MCE.  Early boards using the
SFC4000 set the maximum payload size supported (MPSS) to 1024 bytes
and we should override that.

There are probably other devices with similar issues, so give this
quirk a generic name.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-14 09:05:27 -07:00
Jon Mason
5f39e6705f PCI: Disable MPS configuration by default
Add the ability to disable PCI-E MPS turning and using the BIOS
configured MPS defaults.  Due to the number of issues recently
discovered on some x86 chipsets, make this the default behavior.

Also, add the option for peer to peer DMA MPS configuration.  Peer to
peer DMA is outside the scope of this patch, but MPS configuration could
prevent it from working by having the MPS on one root port different
than the MPS on another.  To work around this, simply make the system
wide MPS the smallest possible value (128B).

Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-04 09:52:28 -07:00
Suresh Siddha
d3f138106b iommu: Rename the DMAR and INTR_REMAP config options
Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent
with the other IOMMU options.

Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the
irq subsystem name.

And define the CONFIG_DMAR_TABLE for the common ACPI DMAR
routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-09-21 10:22:03 +02:00
Benjamin Herrenschmidt
1a4b1a41b8 pci: Don't crash when reading mpss from root complex
In pcie_find_smpss(), we have the following statement:

 	if (dev->is_hotplug_bridge && (!list_is_singular(&dev->bus->devices) ||
	    dev->bus->self->pcie_type != PCI_EXP_TYPE_ROOT_PORT))

The problem is that at least on my machine, this gets called for the
root complex (virtual P2P bridge), and dev->bus->self is NULL since
the parent bus for this is not itself anchor to a PCI device.

This adds the necessary NULL check.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jon Mason <mason@myri.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-13 16:08:31 -07:00
Jon Mason
ed2888e906 PCI: Remove MRRS modification from MPS setting code
Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
massive negative ramifications on some devices.  Without knowing which
devices have this issue, do not modify from the default value when
walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
the default procedure.

Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Simon Kirby <sim@hostway.ca>
Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-09 19:49:58 -07:00