Commit Graph

271 Commits

Author SHA1 Message Date
Wong Vee Khee
56c1af4606 PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc
Expose PCIe bridges attributes such as secondary bus number, subordinate
bus number, max link speed and link width, current link speed and link
width via sysfs in /sys/bus/pci/devices/...

This information is available via lspci, but that requires root privilege.

Signed-off-by: Wong Vee Khee <vee.khee.wong@ni.com>
Signed-off-by: Hui Chun Ong <hui.chun.ong@ni.com>
[bhelgaas: changelog, return errors early to unindent usual case, return
errors with same style throughout]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-06-19 16:54:53 -05:00
Jakub Kicinski
17530e71e0 PCI: Protect pci_driver->sriov_configure() usage with device_lock()
Every method in struct device_driver or structures derived from it like
struct pci_driver MUST provide exclusion vs the driver's ->remove() method,
usually by using device_lock().

Protect use of pci_driver->sriov_configure() by holding the device lock
while calling it.

The PCI core sets the pci_dev->driver pointer in local_pci_probe() before
calling ->probe() and only clears it after ->remove().  This means driver's
->sriov_configure() callback will happily race with probe() and remove(),
most likely leading to BUGs, since drivers don't expect this.

Remove the iov lock completely, since we remove the last user.

[bhelgaas: changelog, thanks to Christoph for locking rule]
Link: http://lkml.kernel.org/r/20170522225023.14010-1-jakub.kicinski@netronome.com
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-06-14 17:41:19 -05:00
Bjorn Helgaas
ef1b5dad5a Merge branch 'pci/virtualization' into next
* pci/virtualization:
  ixgbe: Use pcie_flr() instead of duplicating it
  IB/hfi1: Use pcie_flr() instead of duplicating it
  PCI: Call pcie_flr() from reset_chelsio_generic_dev()
  PCI: Call pcie_flr() from reset_intel_82599_sfp_virtfn()
  PCI: Export pcie_flr()
  PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding
  PCI: Avoid FLR for Intel 82579 NICs

Conflicts:
	include/linux/pci.h
2017-04-28 10:36:12 -05:00
Bodong Wang
0e7df22401 PCI: Add sysfs sriov_drivers_autoprobe to control VF driver binding
Sometimes it is not desirable to bind SR-IOV VFs to drivers.  This can save
host side resource usage by VF instances that will be assigned to VMs.

Add a new PCI sysfs interface "sriov_drivers_autoprobe" to control that
from the PF.  To modify it, echo 0/n/N (disable probe) or 1/y/Y (enable
probe) to:

  /sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe

Note that this must be done before enabling VFs.  The change will not take
effect if VFs are already enabled.  Simply, one can disable VFs by setting
sriov_numvfs to 0, choose whether to probe or not, and then re-enable the
VFs by restoring sriov_numvfs.

[bhelgaas: changelog, ABI doc]
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2017-04-20 08:53:51 -05:00
David Woodhouse
f719582435 PCI: Add pci_mmap_resource_range() and use it for ARM64
Starting to leave behind the legacy of the pci_mmap_page_range() interface
which takes "user-visible" BAR addresses.  This takes just the resource and
offset.

For now, both APIs coexist and depending on the platform, one is
implemented as a wrapper around the other.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20 08:47:47 -05:00
David Woodhouse
f66e225828 PCI: Add BAR index argument to pci_mmap_page_range()
In all cases we know which BAR it is.  Passing it in means that arch code
(or generic code; watch this space) won't have to go looking for it again.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20 08:47:47 -05:00
David Woodhouse
dca40b186b PCI: Use BAR index in sysfs attr->private instead of resource pointer
We store the pointer, and then on *every* use of it we loop over the
device's resources to find out the index.  That's kind of silly.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-20 08:47:47 -05:00
David Woodhouse
e854d8b2a8 PCI: Add arch_can_pci_mmap_io() on architectures which can mmap() I/O space
This is relatively esoteric, and knowing that we don't have it makes life
easier in some cases rather than just an eventual -EINVAL from
pci_mmap_page_range().

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18 13:02:26 -05:00
David Woodhouse
ae749c7ab4 PCI: Add arch_can_pci_mmap_wc() macro
Most of the almost-identical versions of pci_mmap_page_range() silently
ignore the 'write_combine' argument and give uncached mappings.

Yet we allow the PCIIOC_WRITE_COMBINE ioctl in /proc/bus/pci, expose the
'resourceX_wc' file in sysfs, and allow an attempted mapping to apparently
succeed.

To fix this, introduce a macro arch_can_pci_mmap_wc() which indicates
whether the platform can do a write-combining mapping.  On x86 this ends up
being pat_enabled(), while the few other platforms that support it can just
set it to a literal '1'.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-04-18 13:01:42 -05:00
David Woodhouse
6bccc7f426 PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms
In the PCI_MMAP_PROCFS case when the address being passed by the user is a
'user visible' resource address based on the bus window, and not the actual
contents of the resource, that's what we need to be checking it against.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2017-04-12 12:36:52 -05:00
Emil Tantilov
5b0948dfe1 PCI: Lock each enable/disable num_vfs operation in sysfs
Enabling/disabling SRIOV via sysfs by echo-ing multiple values
simultaneously:

  # echo 63 > /sys/class/net/ethX/device/sriov_numvfs&
  # echo 63 > /sys/class/net/ethX/device/sriov_numvfs

  # sleep 5

  # echo 0 > /sys/class/net/ethX/device/sriov_numvfs&
  # echo 0 > /sys/class/net/ethX/device/sriov_numvfs

results in the following bug:

  kernel BUG at drivers/pci/iov.c:495!
  invalid opcode: 0000 [#1] SMP
  CPU: 1 PID: 8050 Comm: bash Tainted: G   W   4.9.0-rc7-net-next #2092
  RIP: 0010:[<ffffffff813b1647>]
	    [<ffffffff813b1647>] pci_iov_release+0x57/0x60

  Call Trace:
   [<ffffffff81391726>] pci_release_dev+0x26/0x70
   [<ffffffff8155be6e>] device_release+0x3e/0xb0
   [<ffffffff81365ee7>] kobject_cleanup+0x67/0x180
   [<ffffffff81365d9d>] kobject_put+0x2d/0x60
   [<ffffffff8155bc27>] put_device+0x17/0x20
   [<ffffffff8139c08a>] pci_dev_put+0x1a/0x20
   [<ffffffff8139cb6b>] pci_get_dev_by_id+0x5b/0x90
   [<ffffffff8139cca5>] pci_get_subsys+0x35/0x40
   [<ffffffff8139ccc8>] pci_get_device+0x18/0x20
   [<ffffffff8139ccfb>] pci_get_domain_bus_and_slot+0x2b/0x60
   [<ffffffff813b09e7>] pci_iov_remove_virtfn+0x57/0x180
   [<ffffffff813b0b95>] pci_disable_sriov+0x65/0x140
   [<ffffffffa00a1af7>] ixgbe_disable_sriov+0xc7/0x1d0 [ixgbe]
   [<ffffffffa00a1e9d>] ixgbe_pci_sriov_configure+0x3d/0x170 [ixgbe]
   [<ffffffff8139d28c>] sriov_numvfs_store+0xdc/0x130
  ...
  RIP  [<ffffffff813b1647>] pci_iov_release+0x57/0x60

Use the existing mutex lock to protect each enable/disable operation.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
2017-02-03 13:42:38 -06:00
Emil Velikov
702ed3be1b PCI: Create revision file in sysfs
Currently the revision isn't available via sysfs/libudev thus if one wants
to know the value one needs to read through the config file, which can be
quite time-consuming because it wakes/powers up the device.

There are at least two userspace components which could make use the new
file: libpciaccess and libdrm.  The former wakes up _every_ PCI device,
which can be observed via glxinfo when using Mesa 10.0+ drivers.  The
latter, in association with Mesa 13.0, can lead to 2-3 second delays while
starting firefox, thunderbird or chromium.

Link: https://bugs.freedesktop.org/show_bug.cgi?id=98502
Tested-by: Mauro Santos <registo.mailling@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch
CC: Greg KH <gregkh@linuxfoundation.org>
2016-11-21 16:25:32 -06:00
Mika Westerberg
9d26d3a8f1 PCI: Put PCIe ports into D3 during suspend
Currently the Linux PCI core does not touch power state of PCI bridges and
PCIe ports when system suspend is entered.  Leaving them in D0 consumes
power unnecessarily and may prevent the CPU from entering deeper C-states.

With recent PCIe hardware we can power down the ports to save power given
that we take into account few restrictions:

  - The PCIe port hardware is recent enough, starting from 2015.

  - Devices connected to PCIe ports are effectively in D3cold once the port
    is transitioned to D3 (the config space is not accessible anymore and
    the link may be powered down).

  - Devices behind the PCIe port need to be allowed to transition to D3cold
    and back.  There is a way both drivers and userspace can forbid this.

  - If the device behind the PCIe port is capable of waking the system it
    needs to be able to do so from D3cold.

This patch adds a new flag to struct pci_device called 'bridge_d3'.  This
flag is set and cleared by the PCI core whenever there is a change in power
management state of any of the devices behind the PCIe port.  When system
later on is suspended we only need to check this flag and if it is true
transition the port to D3 otherwise we leave it in D0.

Also provide override mechanism via command line parameter
"pcie_port_pm=[off|force]" that can be used to disable or enable the
feature regardless of the BIOS manufacturing date.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-13 14:57:36 -05:00
Linus Torvalds
7afd16f882 PCI changes for the v4.7 merge window:
Enumeration
     Refine PCI support check in pcibios_init() (Adrian-Ken Rueegsegger)
     Provide common functions for ECAM mapping (Jayachandran C)
     Allow all PCIe services on non-ACPI host bridges (Jon Derrick)
     Remove return values from pcie_port_platform_notify() and relatives (Jon Derrick)
     Widen portdrv service type from 4 bits to 8 bits (Keith Busch)
     Add Downstream Port Containment portdrv service type (Keith Busch)
     Add Downstream Port Containment driver (Keith Busch)
 
   Resource management
     Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs (Alex Williamson)
     Supply CPU physical address (not bus address) to iomem_is_exclusive() (Bjorn Helgaas)
     alpha: Call iomem_is_exclusive() for IORESOURCE_MEM, but not IORESOURCE_IO (Bjorn Helgaas)
     Mark Broadwell-EP Home Agent 1 as having non-compliant BARs (Prarit Bhargava)
     Disable all BAR sizing for devices with non-compliant BARs (Prarit Bhargava)
     Move PCI I/O space management from OF to PCI core code (Tomasz Nowicki)
 
   PCI device hotplug
     acpiphp_ibm: Avoid uninitialized variable reference (Dan Carpenter)
     Use cached copy of PCI_EXP_SLTCAP_HPC bit (Lukas Wunner)
 
   Virtualization
     Mark Intel i40e NIC INTx masking as broken (Alex Williamson)
     Reverse standard ACS vs device-specific ACS enabling (Alex Williamson)
     Work around Intel Sunrise Point PCH incorrect ACS capability (Alex Williamson)
 
   IOMMU
     Add pci_add_dma_alias() to abstract implementation (Bjorn Helgaas)
     Move informational printk to pci_add_dma_alias() (Bjorn Helgaas)
     Add support for multiple DMA aliases (Jacek Lawrynowicz)
     Add DMA alias quirk for mic_x200_dma (Jacek Lawrynowicz)
 
   Thunderbolt
     Fix double free of drom buffer (Andreas Noever)
     Add Intel Thunderbolt device IDs (Lukas Wunner)
     Fix typos and magic number (Lukas Wunner)
     Support 1st gen Light Ridge controller (Lukas Wunner)
 
   Generic host bridge driver
     Use generic ECAM API (Jayachandran C)
 
   Cavium ThunderX host bridge driver
     Don't clobber read-only bits in bridge config registers (David Daney)
     Use generic ECAM API (Jayachandran C)
 
   Freescale i.MX6 host bridge driver
     Use enum instead of bool for variant indicator (Andrey Smirnov)
     Implement reset sequence for i.MX6+ (Andrey Smirnov)
     Factor out ref clock enable (Bjorn Helgaas)
     Add initial imx6sx support (Christoph Fritz)
     Add reset-gpio-active-high boolean property to DT (Petr Štetiar)
     Add DT property for link gen, default to Gen1 (Tim Harvey)
     dts: Specify imx6qp version of PCIe core (Andrey Smirnov)
     dts: Fix PCIe reset GPIO polarity on Toradex Apalis Ixora (Petr Štetiar)
 
   Marvell Armada host bridge driver
     add DT binding for Marvell Armada 7K/8K PCIe controller (Thomas Petazzoni)
     Add driver for Marvell Armada 7K/8K PCIe controller (Thomas Petazzoni)
 
   Marvell MVEBU host bridge driver
     Constify mvebu_pcie_pm_ops structure (Jisheng Zhang)
     Use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS for mvebu_pcie_pm_ops (Jisheng Zhang)
 
   Microsoft Hyper-V host bridge driver
     Report resources release after stopping the bus (Vitaly Kuznetsov)
     Add explicit barriers to config space access (Vitaly Kuznetsov)
 
   Renesas R-Car host bridge driver
     Select PCI_MSI_IRQ_DOMAIN (Arnd Bergmann)
 
   Synopsys DesignWare host bridge driver
     Remove incorrect RC memory base/limit configuration (Gabriele Paoloni)
     Move Root Complex setup code to dw_pcie_setup_rc() (Jisheng Zhang)
 
   TI Keystone host bridge driver
     Add error IRQ handler (Murali Karicheri)
     Remove unnecessary goto statement (Murali Karicheri)
 
   Miscellaneous
     Fix spelling errors (Colin Ian King)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXPdMKAAoJEFmIoMA60/r8ofUP/j0zyzn24f0xY1wLeGJ8geB9
 6nHk1QdkPqwCiXZahEcnA5HMlFCl/ciWjjsoCqeMlvS6NXkX13KGcc1UGZszelTs
 68bFhyBKqcoMn0it53vBjBXnkfA64PmlxwY/T1ADulxL8amFOCpjjBruZ8pxJ/U7
 r6uHvhxUxHCRF7hMmpNN+V5XWXWCFFkPJZvxOTkglaxkbdnhZ0h0Xz9p9liUvjPH
 mBE72E3WUjiGogXGoLAPDclz1NI6rhRVUyTRcQ8EWaOwitV3OqMuDpAwoWH62ZZJ
 iorCkQk2/eKfN6OA6UgZh4loauAty0FeoZDX7ZVftQr52IpAzRUVx1oAq0J7u4ga
 KRX37mlK/53UcMZyv9Lz2kw4KjaLLELiInzcF+w3Bbov4UhY4/sL5uh9eNMFvSUU
 iZuY+GFlceL0P6wZuVKU5U8td/CyBr3f5vY/3htxuYHE1xJq4FkL92JpWRCvwpVr
 YdCzocscw73Yn8ZMplt8DX2fyabN7HyGezbQISrDDGY6T0ZDsRRKc6FFAt4xF+ta
 JJ+bcY8OcXtxGw6SXtrscL7vNXdR7Zg1HBSa8Sl/CopCdW9zs0VdwgFoxgORcWDT
 mphIgt57DMzaiUUaV8FRQz0mSLixnAcCEfGjVbAEEw3SP5ZChGfS3EknKb/CPRyk
 TD6I3pXTBhTWXd8aS113
 =68Iz
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:
   - Refine PCI support check in pcibios_init() (Adrian-Ken Rueegsegger)
   - Provide common functions for ECAM mapping (Jayachandran C)
   - Allow all PCIe services on non-ACPI host bridges (Jon Derrick)
   - Remove return values from pcie_port_platform_notify() and relatives (Jon Derrick)
   - Widen portdrv service type from 4 bits to 8 bits (Keith Busch)
   - Add Downstream Port Containment portdrv service type (Keith Busch)
   - Add Downstream Port Containment driver (Keith Busch)

  Resource management:
   - Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs (Alex Williamson)
   - Supply CPU physical address (not bus address) to iomem_is_exclusive() (Bjorn Helgaas)
   - alpha: Call iomem_is_exclusive() for IORESOURCE_MEM, but not IORESOURCE_IO (Bjorn Helgaas)
   - Mark Broadwell-EP Home Agent 1 as having non-compliant BARs (Prarit Bhargava)
   - Disable all BAR sizing for devices with non-compliant BARs (Prarit Bhargava)
   - Move PCI I/O space management from OF to PCI core code (Tomasz Nowicki)

  PCI device hotplug:
   - acpiphp_ibm: Avoid uninitialized variable reference (Dan Carpenter)
   - Use cached copy of PCI_EXP_SLTCAP_HPC bit (Lukas Wunner)

  Virtualization:
   - Mark Intel i40e NIC INTx masking as broken (Alex Williamson)
   - Reverse standard ACS vs device-specific ACS enabling (Alex Williamson)
   - Work around Intel Sunrise Point PCH incorrect ACS capability (Alex Williamson)

  IOMMU:
   - Add pci_add_dma_alias() to abstract implementation (Bjorn Helgaas)
   - Move informational printk to pci_add_dma_alias() (Bjorn Helgaas)
   - Add support for multiple DMA aliases (Jacek Lawrynowicz)
   - Add DMA alias quirk for mic_x200_dma (Jacek Lawrynowicz)

  Thunderbolt:
   - Fix double free of drom buffer (Andreas Noever)
   - Add Intel Thunderbolt device IDs (Lukas Wunner)
   - Fix typos and magic number (Lukas Wunner)
   - Support 1st gen Light Ridge controller (Lukas Wunner)

  Generic host bridge driver:
   - Use generic ECAM API (Jayachandran C)

  Cavium ThunderX host bridge driver:
   - Don't clobber read-only bits in bridge config registers (David Daney)
   - Use generic ECAM API (Jayachandran C)

  Freescale i.MX6 host bridge driver:
   - Use enum instead of bool for variant indicator (Andrey Smirnov)
   - Implement reset sequence for i.MX6+ (Andrey Smirnov)
   - Factor out ref clock enable (Bjorn Helgaas)
   - Add initial imx6sx support (Christoph Fritz)
   - Add reset-gpio-active-high boolean property to DT (Petr Štetiar)
   - Add DT property for link gen, default to Gen1 (Tim Harvey)
   - dts: Specify imx6qp version of PCIe core (Andrey Smirnov)
   - dts: Fix PCIe reset GPIO polarity on Toradex Apalis Ixora (Petr Štetiar)

  Marvell Armada host bridge driver:
   - add DT binding for Marvell Armada 7K/8K PCIe controller (Thomas Petazzoni)
   - Add driver for Marvell Armada 7K/8K PCIe controller (Thomas Petazzoni)

  Marvell MVEBU host bridge driver:
   - Constify mvebu_pcie_pm_ops structure (Jisheng Zhang)
   - Use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS for mvebu_pcie_pm_ops (Jisheng Zhang)

  Microsoft Hyper-V host bridge driver:
   - Report resources release after stopping the bus (Vitaly Kuznetsov)
   - Add explicit barriers to config space access (Vitaly Kuznetsov)

  Renesas R-Car host bridge driver:
   - Select PCI_MSI_IRQ_DOMAIN (Arnd Bergmann)

  Synopsys DesignWare host bridge driver:
   - Remove incorrect RC memory base/limit configuration (Gabriele Paoloni)
   - Move Root Complex setup code to dw_pcie_setup_rc() (Jisheng Zhang)

  TI Keystone host bridge driver:
   - Add error IRQ handler (Murali Karicheri)
   - Remove unnecessary goto statement (Murali Karicheri)

  Miscellaneous:
   - Fix spelling errors (Colin Ian King)"

* tag 'pci-v4.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
  PCI: Disable all BAR sizing for devices with non-compliant BARs
  x86/PCI: Mark Broadwell-EP Home Agent 1 as having non-compliant BARs
  PCI: Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs
  PCI, of: Move PCI I/O space management to PCI core code
  PCI: generic, thunder: Use generic ECAM API
  PCI: Provide common functions for ECAM mapping
  PCI: hv: Add explicit barriers to config space access
  PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
  PCI: Add Downstream Port Containment driver
  PCI: Add Downstream Port Containment portdrv service type
  PCI: Widen portdrv service type from 4 bits to 8 bits
  PCI: designware: Remove incorrect RC memory base/limit configuration
  PCI: hv: Report resources release after stopping the bus
  ARM: dts: imx6qp: Specify imx6qp version of PCIe core
  PCI: imx6: Implement reset sequence for i.MX6+
  PCI: imx6: Use enum instead of bool for variant indicator
  PCI: thunder: Don't clobber read-only bits in bridge config registers
  thunderbolt: Fix double free of drom buffer
  PCI: rcar: Select PCI_MSI_IRQ_DOMAIN
  PCI: armada: Add driver for Marvell Armada 7K/8K PCIe controller
  ...
2016-05-19 13:10:54 -07:00
Bjorn Helgaas
ca620723d4 PCI: Supply CPU physical address (not bus address) to iomem_is_exclusive()
iomem_is_exclusive() requires a CPU physical address, but on some arches we
supplied a PCI bus address instead.

On most arches, pci_resource_to_user(res) returns "res->start", which is a
CPU physical address.  But on microblaze, mips, powerpc, and sparc, it
returns the PCI bus address corresponding to "res->start".

The result is that pci_mmap_resource() may fail when it shouldn't (if the
bus address happens to match an existing resource), or it may succeed when
it should fail (if the resource is exclusive but the bus address doesn't
match it).

Call iomem_is_exclusive() with "res->start", which is always a CPU physical
address, not the result of pci_resource_to_user().

Fixes: e8de1481fd ("resource: allow MMIO exclusivity for device drivers")
Suggested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
2016-04-25 15:58:26 -05:00
Linus Torvalds
ab0fa82b2d pci-sysfs: use proper file capability helper function
The PCI config access checked the file capabilities correctly, but used
the itnernal security capability check rather than the helper function
that is actually meant for that.

The security_capable() has unusual return values and is not meant to be
used elsewhere (the only other use is in the capability checking
functions that we actually intend people to use, and this odd PCI usage
really stood out when looking around the capability code.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-14 12:56:09 -07:00
Bjorn Helgaas
6e6f498b03 Merge branch 'pci/resource' into next
* pci/resource:
  PCI: Simplify pci_create_attr() control flow
  PCI: Don't leak memory if sysfs_create_bin_file() fails
  PCI: Simplify sysfs ROM cleanup
  PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
  MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
  MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
  ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
  ia64/PCI: Use ioremap() instead of open-coded equivalent
  ia64/PCI: Use temporary struct resource * to avoid repetition
  PCI: Clean up pci_map_rom() whitespace
  PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
  PCI: Set ROM shadow location in arch code, not in PCI core
  PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
  PCI: Don't assign or reassign immutable resources
  PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
  x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
  PCI: Disable IO/MEM decoding for devices with non-compliant BARs
2016-03-15 08:56:28 -05:00
Bjorn Helgaas
bd5174dfb6 PCI: Simplify pci_create_attr() control flow
Return error immediately to simplify the control flow in pci_create_attr().
No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Bjorn Helgaas
b562ec8f74 PCI: Don't leak memory if sysfs_create_bin_file() fails
If sysfs_create_bin_file() fails, pci_create_attr() leaks the struct
bin_attribute it allocated previously.

Free the struct bin_attribute if pci_create_attr() fails.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Bjorn Helgaas
9d88b93bea PCI: Simplify sysfs ROM cleanup
The value of pdev->rom_attr is the definitive indicator of the fact that
we're created a sysfs attribute.  Check that rather than rom_size, which is
only used incidentally when deciding whether to create a sysfs attribute.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:29 -06:00
Bjorn Helgaas
ac0c302a91 PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
When pci_create_sysfs_dev_files() created the "rom" sysfs file, it set the
sysfs file size to the actual size of a ROM BAR, or if there was no ROM BAR
but the platform provided a shadow copy in RAM, to 0x20000.  0x20000 is an
arch-specific length that should not be baked into the PCI core.

Every place that sets IORESOURCE_ROM_SHADOW also sets the size of the
PCI_ROM_RESOURCE, so use the resource length always.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-03-12 06:00:25 -06:00
Hannes Reinecke
104daa71b3 PCI: Determine actual VPD size on first access
PCI-2.2 VPD entries have a maximum size of 32k, but might actually be
smaller than that.  To figure out the actual size one has to read the VPD
area until the 'end marker' is reached.

Per spec, reading outside of the VPD space is "not allowed."  In practice,
it may cause simple read errors or even crash the card.  To make matters
worse not every PCI card implements this properly, leaving us with no 'end'
marker or even completely invalid data.

Try to determine the size of the VPD data when it's first accessed.  If no
valid data can be read an I/O error will be returned when reading or
writing the sysfs attribute.

As the amount of VPD data is unknown initially the size of the sysfs
attribute will always be set to '0'.

[bhelgaas: changelog, use 0/1 (not false/true) for bitfield, tweak
pci_vpd_pci22_read() error checking]
Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
2016-02-29 17:47:04 -06:00
Hannes Reinecke
f52e5629f6 PCI: Allow access to VPD attributes with size 0
It is not always possible to determine the actual size of the VPD
data, so allow access to them if the size is set to '0'.

Tested-by: Shane Seymour <shane.seymour@hpe.com>
Tested-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
2016-02-29 17:46:50 -06:00
Linus Torvalds
d43421565b PCI changes for the v4.5 merge window:
Enumeration
     Simplify config space size computation (Bjorn Helgaas)
     Avoid iterating through ROM outside the resource window (Edward O'Callaghan)
     Support PCIe devices with short cfg_size (Jason S. McMullan)
     Add Netronome vendor and device IDs (Jason S. McMullan)
     Limit config space size for Netronome NFP6000 family (Jason S. McMullan)
     Add Netronome NFP4000 PF device ID (Simon Horman)
     Limit config space size for Netronome NFP4000 (Simon Horman)
     Print warnings for all invalid expansion ROM headers (Vladis Dronov)
 
   Resource management
     Fix minimum allocation address overwrite (Christoph Biedl)
 
   PCI device hotplug
     acpiphp_ibm: Fix null dereferences on null ibm_slot (Colin Ian King)
     pciehp: Always protect pciehp_disable_slot() with hotplug mutex (Guenter Roeck)
     shpchp: Constify hpc_ops structure (Julia Lawall)
     ibmphp: Remove unneeded NULL test (Julia Lawall)
 
   Power management
     Make ASPM sysfs link_state_store() consistent with link_state_show() (Andy Lutomirski)
 
   Virtualization
     Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 (Tim Sander)
 
   MSI
     Remove empty pci_msi_init_pci_dev() (Bjorn Helgaas)
     Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD (Grygorii Strashko)
     Initialize MSI capability for all architectures (Guilherme G. Piccoli)
     Relax msi_domain_alloc() to support parentless MSI irqdomains (Liu Jiang)
 
   ARM Versatile host bridge driver
     Remove unused pci_sys_data structures (Lorenzo Pieralisi)
 
   Broadcom iProc host bridge driver
     Hide CONFIG_PCIE_IPROC (Arnd Bergmann)
     Do not use 0x in front of %pap (Dmitry V. Krivenok)
     Update iProc PCIe device tree binding (Ray Jui)
     Add PAXC interface support (Ray Jui)
     Add iProc PCIe MSI device tree binding (Ray Jui)
     Add iProc PCIe MSI support (Ray Jui)
 
   Freescale i.MX6 host bridge driver
     Use gpio_set_value_cansleep() (Fabio Estevam)
     Add support for active-low reset GPIO (Petr Štetiar)
 
   HiSilicon host bridge driver
     Add support for HiSilicon Hip06 PCIe host controllers (Gabriele Paoloni)
 
   Intel VMD host bridge driver
     Export irq_domain_set_info() for module use (Keith Busch)
     x86/PCI: Allow DMA ops specific to a PCI domain (Keith Busch)
     Use 32 bit PCI domain numbers (Keith Busch)
     Add driver for Intel Volume Management Device (VMD) (Keith Busch)
 
   Qualcomm host bridge driver
     Document PCIe devicetree bindings (Stanimir Varbanov)
     Add Qualcomm PCIe controller driver (Stanimir Varbanov)
     dts: apq8064: add PCIe devicetree node (Stanimir Varbanov)
     dts: ifc6410: enable PCIe DT node for this board (Stanimir Varbanov)
 
   Renesas R-Car host bridge driver
     Add support for R-Car H3 to pcie-rcar (Harunobu Kurokawa)
     Allow DT to override default window settings (Phil Edworthy)
     Convert to DT resource parsing API (Phil Edworthy)
     Revert "PCI: rcar: Build pcie-rcar.c only on ARM" (Phil Edworthy)
     Remove unused pci_sys_data struct from pcie-rcar (Phil Edworthy)
     Add runtime PM support to pcie-rcar (Phil Edworthy)
     Add Gen2 PHY setup to pcie-rcar (Phil Edworthy)
     Add gen2 fallback compatibility string for pci-rcar-gen2 (Simon Horman)
     Add gen2 fallback compatibility string for pcie-rcar (Simon Horman)
 
   Synopsys DesignWare host bridge driver
     Simplify control flow (Bjorn Helgaas)
     Make config accessor override checking symmetric (Bjorn Helgaas)
     Ensure ATU is enabled before IO/conf space accesses (Stanimir Varbanov)
 
   Miscellaneous
     Add of_pci_get_host_bridge_resources() stub (Arnd Bergmann)
     Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask (Bjorn Helgaas)
     Fix all whitespace issues (Bogicevic Sasa)
     x86/PCI: Simplify pci_bios_{read,write} (Geliang Tang)
     Use to_pci_dev() instead of open-coding it (Geliang Tang)
     Use kobj_to_dev() instead of open-coding it (Geliang Tang)
     Use list_for_each_entry() to simplify code (Geliang Tang)
     Fix typos in <linux/msi.h> (Thomas Petazzoni)
     x86/PCI: Clarify AMD Fam10h config access restrictions comment (Tomasz Nowicki)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWoQIuAAoJEFmIoMA60/r8ckYP/0ZrkANeN1SB5cQVi2k7aceq
 kQb1Hk6ifxohJvgpJ/iwmVCHoApyeBfUBfrC+fUpIC2f7ncPsE5HNyjqpAWzFzj2
 sYWwY029yjBQ9g4mPhkvjBXfha+lNtLthWc+Xxcat5pdcyG63Dg4SfJKWm2ZYnbN
 0GJzyRZXIwAMnNf0KIr61Aqru0nXeHvi5wblyJ08UZ7AcNzCtB0wKLmE3S6SeZVF
 f2fry35zcGu+TFvQ1hAYemfl3XyDBJ87nPiKzJAwYSaKcWPFWt+72PBDPO6X9squ
 6prm4nmAgeG2Oo4Zu0fbkDlB2bEsWUc14/xT0i5Wfs35vcwzF+S1zirJAtVqoNir
 NgC7fSbEHbsS7FZOz0rBOBIvIkbb6NdfLFuZqUFv0X1M5bRFywjo8lZRfAYoGJzK
 Mmus0uKbklx5m6RT5adf9+Plev1YJT6XZW9XrDpGnxrwRyPjHmyvuTWsYkumxY7Q
 CE5Wr3p7q2I2+MtrQVv2D9Nzsb+4zQ6BgHrd2vwR/IxTsfdXLU7+B691wkUDX8No
 UKFxBd0FiVCn+srG96u7lWQvdoUqoNCogTZSVzGR5gFBv3zAN9gi8HS7NbV558Mg
 Io3Xw+6dcbG33uvWdU6jHEDLMQsohZcp05Q5esCgRQNV4cGJbPxBDtOZEO/ezvW4
 FAI7lfgYTFiQK3NzE3Ng
 =z9mQ
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for the v4.5 merge window:

  Enumeration:
   - Simplify config space size computation (Bjorn Helgaas)
   - Avoid iterating through ROM outside the resource window (Edward O'Callaghan)
   - Support PCIe devices with short cfg_size (Jason S. McMullan)
   - Add Netronome vendor and device IDs (Jason S. McMullan)
   - Limit config space size for Netronome NFP6000 family (Jason S. McMullan)
   - Add Netronome NFP4000 PF device ID (Simon Horman)
   - Limit config space size for Netronome NFP4000 (Simon Horman)
   - Print warnings for all invalid expansion ROM headers (Vladis Dronov)

  Resource management:
   - Fix minimum allocation address overwrite (Christoph Biedl)

  PCI device hotplug:
   - acpiphp_ibm: Fix null dereferences on null ibm_slot (Colin Ian King)
   - pciehp: Always protect pciehp_disable_slot() with hotplug mutex (Guenter Roeck)
   - shpchp: Constify hpc_ops structure (Julia Lawall)
   - ibmphp: Remove unneeded NULL test (Julia Lawall)

  Power management:
   - Make ASPM sysfs link_state_store() consistent with link_state_show() (Andy Lutomirski)

  Virtualization
   - Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183 (Tim Sander)

  MSI:
   - Remove empty pci_msi_init_pci_dev() (Bjorn Helgaas)
   - Mark PCIe/PCI (MSI) IRQ cascade handlers as IRQF_NO_THREAD (Grygorii Strashko)
   - Initialize MSI capability for all architectures (Guilherme G. Piccoli)
   - Relax msi_domain_alloc() to support parentless MSI irqdomains (Liu Jiang)

  ARM Versatile host bridge driver:
   - Remove unused pci_sys_data structures (Lorenzo Pieralisi)

  Broadcom iProc host bridge driver:
   - Hide CONFIG_PCIE_IPROC (Arnd Bergmann)
   - Do not use 0x in front of %pap (Dmitry V. Krivenok)
   - Update iProc PCIe device tree binding (Ray Jui)
   - Add PAXC interface support (Ray Jui)
   - Add iProc PCIe MSI device tree binding (Ray Jui)
   - Add iProc PCIe MSI support (Ray Jui)

  Freescale i.MX6 host bridge driver:
   - Use gpio_set_value_cansleep() (Fabio Estevam)
   - Add support for active-low reset GPIO (Petr Štetiar)

  HiSilicon host bridge driver:
   - Add support for HiSilicon Hip06 PCIe host controllers (Gabriele Paoloni)

  Intel VMD host bridge driver:
   - Export irq_domain_set_info() for module use (Keith Busch)
   - x86/PCI: Allow DMA ops specific to a PCI domain (Keith Busch)
   - Use 32 bit PCI domain numbers (Keith Busch)
   - Add driver for Intel Volume Management Device (VMD) (Keith Busch)

  Qualcomm host bridge driver:
   - Document PCIe devicetree bindings (Stanimir Varbanov)
   - Add Qualcomm PCIe controller driver (Stanimir Varbanov)
   - dts: apq8064: add PCIe devicetree node (Stanimir Varbanov)
   - dts: ifc6410: enable PCIe DT node for this board (Stanimir Varbanov)

  Renesas R-Car host bridge driver:
   - Add support for R-Car H3 to pcie-rcar (Harunobu Kurokawa)
   - Allow DT to override default window settings (Phil Edworthy)
   - Convert to DT resource parsing API (Phil Edworthy)
   - Revert "PCI: rcar: Build pcie-rcar.c only on ARM" (Phil Edworthy)
   - Remove unused pci_sys_data struct from pcie-rcar (Phil Edworthy)
   - Add runtime PM support to pcie-rcar (Phil Edworthy)
   - Add Gen2 PHY setup to pcie-rcar (Phil Edworthy)
   - Add gen2 fallback compatibility string for pci-rcar-gen2 (Simon Horman)
   - Add gen2 fallback compatibility string for pcie-rcar (Simon Horman)

  Synopsys DesignWare host bridge driver:
   - Simplify control flow (Bjorn Helgaas)
   - Make config accessor override checking symmetric (Bjorn Helgaas)
   - Ensure ATU is enabled before IO/conf space accesses (Stanimir Varbanov)

  Miscellaneous:
   - Add of_pci_get_host_bridge_resources() stub (Arnd Bergmann)
   - Check for PCI_HEADER_TYPE_BRIDGE equality, not bitmask (Bjorn Helgaas)
   - Fix all whitespace issues (Bogicevic Sasa)
   - x86/PCI: Simplify pci_bios_{read,write} (Geliang Tang)
   - Use to_pci_dev() instead of open-coding it (Geliang Tang)
   - Use kobj_to_dev() instead of open-coding it (Geliang Tang)
   - Use list_for_each_entry() to simplify code (Geliang Tang)
   - Fix typos in <linux/msi.h> (Thomas Petazzoni)
   - x86/PCI: Clarify AMD Fam10h config access restrictions comment (Tomasz Nowicki)"

* tag 'pci-v4.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (58 commits)
  PCI: Add function 1 DMA alias quirk for Lite-On/Plextor M6e/Marvell 88SS9183
  PCI: Limit config space size for Netronome NFP4000
  PCI: Add Netronome NFP4000 PF device ID
  x86/PCI: Add driver for Intel Volume Management Device (VMD)
  PCI/AER: Use 32 bit PCI domain numbers
  x86/PCI: Allow DMA ops specific to a PCI domain
  irqdomain: Export irq_domain_set_info() for module use
  PCI: host: Add of_pci_get_host_bridge_resources() stub
  genirq/MSI: Relax msi_domain_alloc() to support parentless MSI irqdomains
  PCI: rcar: Add Gen2 PHY setup to pcie-rcar
  PCI: rcar: Add runtime PM support to pcie-rcar
  PCI: designware: Make config accessor override checking symmetric
  PCI: ibmphp: Remove unneeded NULL test
  ARM: dts: ifc6410: enable PCIe DT node for this board
  ARM: dts: apq8064: add PCIe devicetree node
  PCI: hotplug: Use list_for_each_entry() to simplify code
  PCI: rcar: Remove unused pci_sys_data struct from pcie-rcar
  PCI: hisi: Add support for HiSilicon Hip06 PCIe host controllers
  PCI: Avoid iterating through memory outside the resource window
  PCI: acpiphp_ibm: Fix null dereferences on null ibm_slot
  ...
2016-01-21 11:52:16 -08:00
Bjorn Helgaas
9662e32c81 Merge branch 'pci/trivial' into next
* pci/trivial:
  PCI: shpchp: Constify hpc_ops structure
  PCI: Use kobj_to_dev() instead of open-coding it
  PCI: Use to_pci_dev() instead of open-coding it
  PCI: Fix all whitespace issues
  PCI/MSI: Fix typos in <linux/msi.h>
2016-01-20 11:48:25 -06:00
Geliang Tang
554a60379a PCI: Use kobj_to_dev() instead of open-coding it
Use kobj_to_dev() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-01-08 12:07:57 -06:00
Jason S. McMullan
c20aecf696 PCI: Support PCIe devices with short cfg_size
If a device quirk modifies the pci_dev->cfg_size to be less than
PCI_CFG_SPACE_EXP_SIZE (4096), but greater than PCI_CFG_SPACE_SIZE (256),
the PCI sysfs interface truncates the readable size to PCI_CFG_SPACE_SIZE.

Allow sysfs access to config space up to cfg_size, even if the device
doesn't support the entire 4096-byte PCIe config space.

Note that pci_read_config() and pci_write_config() limit access to
dev->cfg_size even though pcie_config_attr contains 4096 (the maximum
size).

Signed-off-by: Jason S. McMullan <jason.mcmullan@netronome.com>
[simon: edited changelog]
Signed-off-by: Simon Horman <simon.horman@netronome.com>
[bhelgaas: more changelog edits]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-12-10 19:38:07 -06:00
Mathias Krause
3dcc8d39cf PCI: Prevent out of bounds access in numa_node override
Commit 1266963170 ("PCI: Prevent out of bounds access in numa_node
override") missed that the user-provided node could also be negative.
Handle this case as well to avoid out-of-bounds accesses to the
node_states[] array.  However, allow the special value -1, i.e.
NUMA_NO_NODE, to be able to set the 'no specific node' configuration.

Fixes: 1266963170 ("PCI: Prevent out of bounds access in numa_node override")
Fixes: 63692df103 ("PCI: Allow numa_node override via sysfs")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Sasha Levin <sasha.levin@oracle.com>
CC: Prarit Bhargava <prarit@redhat.com>
CC: stable@vger.kernel.org	# v3.19+
2015-11-24 12:33:13 -06:00
Sasha Levin
1266963170 PCI: Prevent out of bounds access in numa_node override
63692df103 ("PCI: Allow numa_node override via sysfs") didn't check that
the numa node provided by userspace is valid.  Passing a node number too
high would attempt to access invalid memory and trigger a kernel panic.

Fixes: 63692df103 ("PCI: Allow numa_node override via sysfs")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v3.19+
2015-10-07 11:03:28 -05:00
Sasha Levin
4efe874aac PCI: Don't read past the end of sysfs "driver_override" buffer
When printing the driver_override parameter when it is 4095 and 4094 bytes
long, the printing code would access invalid memory because we need count+1
bytes for printing.

Fixes: 782a985d7a ("PCI: Introduce new device binding path using pci_dev.driver_override")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
CC: stable@vger.kernel.org	# v3.16+
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Alexander Graf <agraf@suse.de>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-02-24 17:35:37 -06:00
Linus Torvalds
e6b5be2be4 Driver core patches for 3.19-rc1
Here's the set of driver core patches for 3.19-rc1.
 
 They are dominated by the removal of the .owner field in platform
 drivers.  They touch a lot of files, but they are "simple" changes, just
 removing a line in a structure.
 
 Other than that, a few minor driver core and debugfs changes.  There are
 some ath9k patches coming in through this tree that have been acked by
 the wireless maintainers as they relied on the debugfs changes.
 
 Everything has been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlSOD20ACgkQMUfUDdst+ylLPACg2QrW1oHhdTMT9WI8jihlHVRM
 53kAoLeteByQ3iVwWurwwseRPiWa8+MI
 =OVRS
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core update from Greg KH:
 "Here's the set of driver core patches for 3.19-rc1.

  They are dominated by the removal of the .owner field in platform
  drivers.  They touch a lot of files, but they are "simple" changes,
  just removing a line in a structure.

  Other than that, a few minor driver core and debugfs changes.  There
  are some ath9k patches coming in through this tree that have been
  acked by the wireless maintainers as they relied on the debugfs
  changes.

  Everything has been in linux-next for a while"

* tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (324 commits)
  Revert "ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries"
  fs: debugfs: add forward declaration for struct device type
  firmware class: Deletion of an unnecessary check before the function call "vunmap"
  firmware loader: fix hung task warning dump
  devcoredump: provide a one-way disable function
  device: Add dev_<level>_once variants
  ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries
  ath: use seq_file api for ath9k debugfs files
  debugfs: add helper function to create device related seq_file
  drivers/base: cacheinfo: remove noisy error boot message
  Revert "core: platform: add warning if driver has no owner"
  drivers: base: support cpu cache information interface to userspace via sysfs
  drivers: base: add cpu_device_create to support per-cpu devices
  topology: replace custom attribute macros with standard DEVICE_ATTR*
  cpumask: factor out show_cpumap into separate helper function
  driver core: Fix unbalanced device reference in drivers_probe
  driver core: fix race with userland in device_add()
  sysfs/kernfs: make read requests on pre-alloc files use the buffer.
  sysfs/kernfs: allow attributes to request write buffer be pre-allocated.
  fs: sysfs: return EGBIG on write if offset is larger than file size
  ...
2014-12-14 16:10:09 -08:00
Linus Torvalds
92a578b064 ACPI and power management updates for 3.19-rc1
This time we have some more new material than we used to have during
 the last couple of development cycles.
 
 The most important part of it to me is the introduction of a unified
 interface for accessing device properties provided by platform
 firmware.  It works with Device Trees and ACPI in a uniform way and
 drivers using it need not worry about where the properties come
 from as long as the platform firmware (either DT or ACPI) makes
 them available.  It covers both devices and "bare" device node
 objects without struct device representation as that turns out to
 be necessary in some cases.  This has been in the works for quite
 a few months (and development cycles) and has been approved by
 all of the relevant maintainers.
 
 On top of that, some drivers are switched over to the new interface
 (at25, leds-gpio, gpio_keys_polled) and some additional changes are
 made to the core GPIO subsystem to allow device drivers to manipulate
 GPIOs in the "canonical" way on platforms that provide GPIO information
 in their ACPI tables, but don't assign names to GPIO lines (in which
 case the driver needs to do that on the basis of what it knows about
 the device in question).  That also has been approved by the GPIO
 core maintainers and the rfkill driver is now going to use it.
 
 Second is support for hardware P-states in the intel_pstate driver.
 It uses CPUID to detect whether or not the feature is supported by
 the processor in which case it will be enabled by default.  However,
 it can be disabled entirely from the kernel command line if necessary.
 
 Next is support for a platform firmware interface based on ACPI
 operation regions used by the PMIC (Power Management Integrated
 Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms.
 That interface is used for manipulating power resources and for
 thermal management: sensor temperature reporting, trip point setting
 and so on.
 
 Also the ACPI core is now going to support the _DEP configuration
 information in a limited way.  Basically, _DEP it supposed to reflect
 off-the-hierarchy dependencies between devices which may be very
 indirect, like when AML for one device accesses locations in an
 operation region handled by another device's driver (usually, the
 device depended on this way is a serial bus or GPIO controller).
 The support added this time is sufficient to make the ACPI battery
 driver work on Asus T100A, but it is general enough to be able to
 cover some other use cases in the future.
 
 Finally, we have a new cpufreq driver for the Loongson1B processor.
 
 In addition to the above, there are fixes and cleanups all over the
 place as usual and a traditional ACPICA update to a recent upstream
 release.
 
 As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver
 for Intel platforms should be able to handle power management of
 the DMA engine correctly, the cpufreq-dt driver should interact
 with the thermal subsystem in a better way and the ACPI backlight
 driver should handle some more corner cases, among other things.
 
 On top of the ACPICA update there are fixes for race conditions
 in the ACPICA's interrupt handling code which might lead to some
 random and strange looking failures on some systems.
 
 In the cleanups department the most visible part is the series
 of commits targeted at getting rid of the CONFIG_PM_RUNTIME
 configuration option.  That was triggered by a discussion
 regarding the generic power domains code during which we realized
 that trying to support certain combinations of PM config options
 was painful and not really worth it, because nobody would use them
 in production anyway.  For this reason, we decided to make
 CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and that lead to the
 conclusion that the latter became redundant and CONFIG_PM could
 be used instead of it.  The material here makes that replacement
 in a major part of the tree, but there will be at least one more
 batch of that in the second part of the merge window.
 
 Specifics:
 
  - Support for retrieving device properties information from ACPI
    _DSD device configuration objects and a unified device properties
    interface for device drivers (and subsystems) on top of that.
    As stated above, this works with Device Trees and ACPI and allows
    device drivers to be written in a platform firmware (DT or ACPI)
    agnostic way.  The at25, leds-gpio and gpio_keys_polled drivers
    are now going to use this new interface and the GPIO subsystem
    is additionally modified to allow device drivers to assign names
    to GPIO resources returned by ACPI _CRS objects (in case _DSD is
    not present or does not provide the expected data).  The changes
    in this set are mostly from Mika Westerberg, Rafael J Wysocki,
    Aaron Lu, and Darren Hart with some fixes from others (Fabio Estevam,
    Geert Uytterhoeven).
 
  - Support for Hardware Managed Performance States (HWP) as described
    in Volume 3, section 14.4, of the Intel SDM in the intel_pstate
    driver.  CPUID is used to detect whether or not the feature is
    supported by the processor.  If supported, it will be enabled
    automatically unless the intel_pstate=no_hwp switch is present in
    the kernel command line.  From Dirk Brandewie.
 
  - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie).
 
  - Support for firmware interface based on ACPI operation regions
    used by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR
    platforms for power resource control and thermal management
    (Aaron Lu).
 
  - Limited support for retrieving off-the-hierarchy dependencies
    between devices from ACPI _DEP device configuration objects
    and deferred probing support for the ACPI battery driver based
    on the _DEP information to make that driver work on Asus T100A
    (Lan Tianyu).
 
  - New cpufreq driver for the Loongson1B processor (Kelvin Cheung).
 
  - ACPICA update to upstream revision 20141107 which only affects
    tools (Bob Moore).
 
  - Fixes for race conditions in the ACPICA's interrupt handling
    code and in the ACPI code related to system suspend and resume
    (Lv Zheng and Rafael J Wysocki).
 
  - ACPI core fix for an RCU-related issue in the ioremap() regions
    management code that slowed down significantly after CPUs had
    been allowed to enter idle states even if they'd had RCU callbakcs
    queued and triggered some problems in certain proprietary graphics
    driver (and elsewhere).  The fix replaces synchronize_rcu() in
    that code with synchronize_rcu_expedited() which makes the issue
    go away.  From Konstantin Khlebnikov.
 
  - ACPI LPSS (Low-Power Subsystem) driver fix to handle power
    management of the DMA engine included into the LPSS correctly.
    The problem is that the DMA engine doesn't have ACPI PM support
    of its own and it simply is turned off when the last LPSS device
    having ACPI PM support goes into D3cold.  To work around that,
    the PM domain used by the ACPI LPSS driver is redesigned so at
    least one device with ACPI PM support will be on as long as the
    DMA engine is in use.  From Andy Shevchenko.
 
  - ACPI backlight driver fix to avoid using it on "Win8-compatible"
    systems where it doesn't work and where it was used by default by
    mistake (Aaron Lu).
 
  - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki,
    Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and
    Ashwin Chaugule (mostly related to the upcoming ARM64 support).
 
  - Intel RAPL (Running Average Power Limit) power capping driver
    fixes and improvements including new processor IDs (Jacob Pan).
 
  - Generic power domains modification to power up domains after
    attaching devices to them to meet the expectations of device
    drivers and bus types assuming devices to be accessible at
    probe time (Ulf Hansson).
 
  - Preliminary support for controlling device clocks from the
    generic power domains core code and modifications of the
    ARM/shmobile platform to use that feature (Ulf Hansson).
 
  - Assorted minor fixes and cleanups of the generic power
    domains core code (Ulf Hansson, Geert Uytterhoeven).
 
  - Assorted minor fixes and cleanups of the device clocks control
    code in the PM core (Geert Uytterhoeven, Grygorii Strashko).
 
  - Consolidation of device power management Kconfig options by making
    CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter
    which is now redundant (Rafael J Wysocki and Kevin Hilman).  That
    is the first batch of the changes needed for this purpose.
 
  - Core device runtime power management support code cleanup related
    to the execution of callbacks (Andrzej Hajda).
 
  - cpuidle ARM support improvements (Lorenzo Pieralisi).
 
  - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and
    a new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and
    Bartlomiej Zolnierkiewicz).
 
  - New cpufreq driver callback (->ready) to be executed when the
    cpufreq core is ready to use a given policy object and cpufreq-dt
    driver modification to use that callback for cooling device
    registration (Viresh Kumar).
 
  - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu,
    James Geboski, Tomeu Vizoso).
 
  - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate,
    cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao,
    Stefan Wahren, Petr Cvek).
 
  - OPP (Operating Performance Points) framework modification to
    allow OPPs to be removed too and update of a few cpufreq drivers
    (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added
    during initialization) on driver removal (Viresh Kumar).
 
  - Hibernation core fixes and cleanups (Tina Ruchandani and
    Markus Elfring).
 
  - PM Kconfig fix related to CPU power management (Pankaj Dubey).
 
  - cpupower tool fix (Prarit Bhargava).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJUhj6JAAoJEILEb/54YlRxTM4P/j5g5SfqvY0QKsn7sR7MGZ6v
 nsgCBhJAqTw3ocNC7EAs8z9h2GWy1KbKpakKYWAh9Fs1yZoey7tFSlcv/Rgjlp70
 uU5sDQHtpE9mHKiymdsowiQuWgpl962L4k+k8hUslhlvgk1PvVbpajR6OqG8G+pD
 asuIW9eh1APNkLyXmRJ3ZPomzs0VmRdZJ0NEs0lKX9mJskqEvxPIwdaxq3iaJq9B
 Fo0J345zUDcJnxWblDRdHlOigCimglElfN5qJwaC4KpwUKuBvLRKbp4f69+wfT0c
 kYFiR29X5KjJ2kLfP/wKsLyuDCYYXRq3tCia5M1tAqOjZ+UA89H/GDftx/5lntmv
 qUlBa35VfdS1SX4HyApZitOHiLgo+It/hl8Z9bJnhyVw66NxmMQ8JYN2imb8Lhqh
 XCLR7BxLTah82AapLJuQ0ZDHPzZqMPG2veC2vAzRMYzVijict/p4Y2+qBqONltER
 4rs9uRVn+hamX33lCLg8BEN8zqlnT3rJFIgGaKjq/wXHAU/zpE9CjOrKMQcAg9+s
 t51XMNPwypHMAYyGVhEL89ImjXnXxBkLRuquhlmEpvQchIhR+mR3dLsarGn7da44
 WPIQJXzcsojXczcwwfqsJCR4I1FTFyQIW+UNh02GkDRgRovQqo+Jk762U7vQwqH+
 LBdhvVaS1VW4v+FWXEoZ
 =5dox
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:
 "This time we have some more new material than we used to have during
  the last couple of development cycles.

  The most important part of it to me is the introduction of a unified
  interface for accessing device properties provided by platform
  firmware.  It works with Device Trees and ACPI in a uniform way and
  drivers using it need not worry about where the properties come from
  as long as the platform firmware (either DT or ACPI) makes them
  available.  It covers both devices and "bare" device node objects
  without struct device representation as that turns out to be necessary
  in some cases.  This has been in the works for quite a few months (and
  development cycles) and has been approved by all of the relevant
  maintainers.

  On top of that, some drivers are switched over to the new interface
  (at25, leds-gpio, gpio_keys_polled) and some additional changes are
  made to the core GPIO subsystem to allow device drivers to manipulate
  GPIOs in the "canonical" way on platforms that provide GPIO
  information in their ACPI tables, but don't assign names to GPIO lines
  (in which case the driver needs to do that on the basis of what it
  knows about the device in question).  That also has been approved by
  the GPIO core maintainers and the rfkill driver is now going to use
  it.

  Second is support for hardware P-states in the intel_pstate driver.
  It uses CPUID to detect whether or not the feature is supported by the
  processor in which case it will be enabled by default.  However, it
  can be disabled entirely from the kernel command line if necessary.

  Next is support for a platform firmware interface based on ACPI
  operation regions used by the PMIC (Power Management Integrated
  Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms.
  That interface is used for manipulating power resources and for
  thermal management: sensor temperature reporting, trip point setting
  and so on.

  Also the ACPI core is now going to support the _DEP configuration
  information in a limited way.  Basically, _DEP it supposed to reflect
  off-the-hierarchy dependencies between devices which may be very
  indirect, like when AML for one device accesses locations in an
  operation region handled by another device's driver (usually, the
  device depended on this way is a serial bus or GPIO controller).  The
  support added this time is sufficient to make the ACPI battery driver
  work on Asus T100A, but it is general enough to be able to cover some
  other use cases in the future.

  Finally, we have a new cpufreq driver for the Loongson1B processor.

  In addition to the above, there are fixes and cleanups all over the
  place as usual and a traditional ACPICA update to a recent upstream
  release.

  As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for
  Intel platforms should be able to handle power management of the DMA
  engine correctly, the cpufreq-dt driver should interact with the
  thermal subsystem in a better way and the ACPI backlight driver should
  handle some more corner cases, among other things.

  On top of the ACPICA update there are fixes for race conditions in the
  ACPICA's interrupt handling code which might lead to some random and
  strange looking failures on some systems.

  In the cleanups department the most visible part is the series of
  commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration
  option.  That was triggered by a discussion regarding the generic
  power domains code during which we realized that trying to support
  certain combinations of PM config options was painful and not really
  worth it, because nobody would use them in production anyway.  For
  this reason, we decided to make CONFIG_PM_SLEEP select
  CONFIG_PM_RUNTIME and that lead to the conclusion that the latter
  became redundant and CONFIG_PM could be used instead of it.  The
  material here makes that replacement in a major part of the tree, but
  there will be at least one more batch of that in the second part of
  the merge window.

  Specifics:

   - Support for retrieving device properties information from ACPI _DSD
     device configuration objects and a unified device properties
     interface for device drivers (and subsystems) on top of that.  As
     stated above, this works with Device Trees and ACPI and allows
     device drivers to be written in a platform firmware (DT or ACPI)
     agnostic way.  The at25, leds-gpio and gpio_keys_polled drivers are
     now going to use this new interface and the GPIO subsystem is
     additionally modified to allow device drivers to assign names to
     GPIO resources returned by ACPI _CRS objects (in case _DSD is not
     present or does not provide the expected data).  The changes in
     this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron
     Lu, and Darren Hart with some fixes from others (Fabio Estevam,
     Geert Uytterhoeven).

   - Support for Hardware Managed Performance States (HWP) as described
     in Volume 3, section 14.4, of the Intel SDM in the intel_pstate
     driver.  CPUID is used to detect whether or not the feature is
     supported by the processor.  If supported, it will be enabled
     automatically unless the intel_pstate=no_hwp switch is present in
     the kernel command line.  From Dirk Brandewie.

   - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie).

   - Support for firmware interface based on ACPI operation regions used
     by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR
     platforms for power resource control and thermal management (Aaron
     Lu).

   - Limited support for retrieving off-the-hierarchy dependencies
     between devices from ACPI _DEP device configuration objects and
     deferred probing support for the ACPI battery driver based on the
     _DEP information to make that driver work on Asus T100A (Lan
     Tianyu).

   - New cpufreq driver for the Loongson1B processor (Kelvin Cheung).

   - ACPICA update to upstream revision 20141107 which only affects
     tools (Bob Moore).

   - Fixes for race conditions in the ACPICA's interrupt handling code
     and in the ACPI code related to system suspend and resume (Lv Zheng
     and Rafael J Wysocki).

   - ACPI core fix for an RCU-related issue in the ioremap() regions
     management code that slowed down significantly after CPUs had been
     allowed to enter idle states even if they'd had RCU callbakcs
     queued and triggered some problems in certain proprietary graphics
     driver (and elsewhere).  The fix replaces synchronize_rcu() in that
     code with synchronize_rcu_expedited() which makes the issue go
     away.  From Konstantin Khlebnikov.

   - ACPI LPSS (Low-Power Subsystem) driver fix to handle power
     management of the DMA engine included into the LPSS correctly.  The
     problem is that the DMA engine doesn't have ACPI PM support of its
     own and it simply is turned off when the last LPSS device having
     ACPI PM support goes into D3cold.  To work around that, the PM
     domain used by the ACPI LPSS driver is redesigned so at least one
     device with ACPI PM support will be on as long as the DMA engine is
     in use.  From Andy Shevchenko.

   - ACPI backlight driver fix to avoid using it on "Win8-compatible"
     systems where it doesn't work and where it was used by default by
     mistake (Aaron Lu).

   - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki,
     Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin
     Chaugule (mostly related to the upcoming ARM64 support).

   - Intel RAPL (Running Average Power Limit) power capping driver fixes
     and improvements including new processor IDs (Jacob Pan).

   - Generic power domains modification to power up domains after
     attaching devices to them to meet the expectations of device
     drivers and bus types assuming devices to be accessible at probe
     time (Ulf Hansson).

   - Preliminary support for controlling device clocks from the generic
     power domains core code and modifications of the ARM/shmobile
     platform to use that feature (Ulf Hansson).

   - Assorted minor fixes and cleanups of the generic power domains core
     code (Ulf Hansson, Geert Uytterhoeven).

   - Assorted minor fixes and cleanups of the device clocks control code
     in the PM core (Geert Uytterhoeven, Grygorii Strashko).

   - Consolidation of device power management Kconfig options by making
     CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter
     which is now redundant (Rafael J Wysocki and Kevin Hilman).  That
     is the first batch of the changes needed for this purpose.

   - Core device runtime power management support code cleanup related
     to the execution of callbacks (Andrzej Hajda).

   - cpuidle ARM support improvements (Lorenzo Pieralisi).

   - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a
     new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and
     Bartlomiej Zolnierkiewicz).

   - New cpufreq driver callback (->ready) to be executed when the
     cpufreq core is ready to use a given policy object and cpufreq-dt
     driver modification to use that callback for cooling device
     registration (Viresh Kumar).

   - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James
     Geboski, Tomeu Vizoso).

   - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate,
     cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao,
     Stefan Wahren, Petr Cvek).

   - OPP (Operating Performance Points) framework modification to allow
     OPPs to be removed too and update of a few cpufreq drivers
     (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added
     during initialization) on driver removal (Viresh Kumar).

   - Hibernation core fixes and cleanups (Tina Ruchandani and Markus
     Elfring).

   - PM Kconfig fix related to CPU power management (Pankaj Dubey).

   - cpupower tool fix (Prarit Bhargava)"

* tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (120 commits)
  i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c
  dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  tools: cpupower: fix return checks for sysfs_get_idlestate_count()
  drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME
  MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  leds: leds-gpio: Fix multiple instances registration without 'label' property
  iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  hwrandom / exynos / PM: Use CONFIG_PM in #ifdef
  block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
  USB / PM: Drop CONFIG_PM_RUNTIME from the USB core
  PM: Merge the SET*_RUNTIME_PM_OPS() macros
  ...
2014-12-10 21:17:00 -08:00
Linus Torvalds
c75059c462 PCI changes for the v3.19 merge window:
NUMA
     - Allow numa_node override via sysfs (Prarit Bhargava)
 
   Resource management
     - Restore detection of read-only BARs (Myron Stowe)
     - Shrink decoding-disabled window while sizing BARs (Myron Stowe)
     - Add informational printk for invalid BARs (Myron Stowe)
     - Remove fixed parameter in pci_iov_resource_bar() (Myron Stowe)
 
   MSI
     - Add pci_msi_ignore_mask to prevent writes to MSI/MSI-X Mask Bits (Yijing Wang)
     - Revert "PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()" (Yijing Wang)
     - s390/MSI: Use __msi_mask_irq() instead of default_msi_mask_irq() (Yijing Wang)
 
   Virtualization
     - xen: Process failure for pcifront_(re)scan_root() (Chen Gang)
     - Make FLR and AF FLR reset warning messages different (Gavin Shan)
 
   Generic host bridge driver
     - Allocate config space windows after limiting bus number range (Lorenzo Pieralisi)
     - Convert to DT resource parsing API (Lorenzo Pieralisi)
 
   Freescale Layerscape
     - Add Freescale Layerscape PCIe driver (Minghuan Lian)
 
   NVIDIA Tegra
     - Do not build on 64-bit ARM (Thierry Reding)
     - Add Kconfig help text (Thierry Reding)
 
   Renesas R-Car
     - Make rcar_pci static (Jingoo Han)
 
   Samsung Exynos
     - Add exynos prefix to add_pcie_port(), pcie_init() (Jingoo Han)
 
   ST Microelectronics SPEAr13xx
     - Add spear prefix to add_pcie_port(), pcie_init() (Jingoo Han)
     - Make spear13xx_add_pcie_port() __init (Jingoo Han)
     - Remove unnecessary OOM message (Jingoo Han)
 
   TI DRA7xx
     - Add dra7xx prefix to add_pcie_port() (Jingoo Han)
     - Make dra7xx_add_pcie_port() __init (Jingoo Han)
 
   TI Keystone
     - Make ks_dw_pcie_msi_domain_ops static (Jingoo Han)
     - Remove unnecessary OOM message (Jingoo Han)
 
   Miscellaneous
     - Delete unnecessary NULL pointer checks (Markus Elfring)
     - Remove unused to_hotplug_slot() (Gavin Shan)
     - Whitespace cleanup (Jingoo Han)
     - Simplify if-return sequences (Quentin Lambert)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUhik9AAoJEFmIoMA60/r8tAQQAJ3Rv5MlHt63cXxgIMOcoLrR
 OsFvW+2oMTyUkGg69SgI3YfF9IBjdwkJ3U6OnpfPGcbKyQvmSTxwCEZPVYM9r3mC
 1UknItYLXSFsz682sXGrepHoL/N3Im0fhu56oEJwIL+htHNMgGKk+Sk6yW9rBVvz
 J7fw31mlrs5YnjkLvwbDjmS3fpCmjqb5fkNlZHxwKcPtM/ODfbRnYYvSucN9Relt
 xy2MyuXlZvp7aPwi03z7utZx1ezjzfVlGNlCWyVINERvqbKYeIrAGbfwmVdCVRRf
 2kqNS5N6B1IHq6iHg5xbjh9ZOdzYu2bPO4v7qgDEUDWzT0JTes4mOrv5NJWk4ZV/
 0erFLOkaCzHpriAXYN8qSfJilm40EYt+hKQI3f8jaTEOycOTWgOcVh9ci7uaNWgX
 6Ia9Ch+FXbMg3deL+MwfFQFNbkMzgeNihLZW7xf54psWJobQ3v4eG2KTRqCaOqI0
 87tMWPSzOqqnQEUWGw0rTSS7P5UxgKc27Qw83OaaIMz8G3ibSc4VhZT/PpBCQog9
 M6ezsxNhJ6rj/81mM5jElzGHQeHUnsAahcQscvva07q6UcRx7JhWVLW0E6l+gyD+
 u1XWZQi5b3PwVlJRyv3sKgFpFjsH8pu7wBL8F13NHd0eb4M5m3ZUZmBbXktF0dLc
 V0H7kqLWqkTCXo7omekm
 =kKg9
 -----END PGP SIGNATURE-----

Merge tag 'pci-v3.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI changes from Bjorn Helgaas:
 "Here are the PCI changes intended for v3.19.  I don't think there's
  anything very exciting here, but there was a lot of MSI-related stuff
  coming via Thomas.

  Details:

  NUMA
    - Allow numa_node override via sysfs (Prarit Bhargava)

  Resource management
    - Restore detection of read-only BARs (Myron Stowe)
    - Shrink decoding-disabled window while sizing BARs (Myron Stowe)
    - Add informational printk for invalid BARs (Myron Stowe)
    - Remove fixed parameter in pci_iov_resource_bar() (Myron Stowe)

  MSI
    - Add pci_msi_ignore_mask to prevent writes to MSI/MSI-X Mask Bits (Yijing Wang)
    - Revert "PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()" (Yijing Wang)
    - s390/MSI: Use __msi_mask_irq() instead of default_msi_mask_irq() (Yijing Wang)

  Virtualization
    - xen: Process failure for pcifront_(re)scan_root() (Chen Gang)
    - Make FLR and AF FLR reset warning messages different (Gavin Shan)

  Generic host bridge driver
    - Allocate config space windows after limiting bus number range (Lorenzo Pieralisi)
    - Convert to DT resource parsing API (Lorenzo Pieralisi)

  Freescale Layerscape
    - Add Freescale Layerscape PCIe driver (Minghuan Lian)

  NVIDIA Tegra
    - Do not build on 64-bit ARM (Thierry Reding)
    - Add Kconfig help text (Thierry Reding)

  Renesas R-Car
    - Make rcar_pci static (Jingoo Han)

  Samsung Exynos
    - Add exynos prefix to add_pcie_port(), pcie_init() (Jingoo Han)

  ST Microelectronics SPEAr13xx
    - Add spear prefix to add_pcie_port(), pcie_init() (Jingoo Han)
    - Make spear13xx_add_pcie_port() __init (Jingoo Han)
    - Remove unnecessary OOM message (Jingoo Han)

  TI DRA7xx
    - Add dra7xx prefix to add_pcie_port() (Jingoo Han)
    - Make dra7xx_add_pcie_port() __init (Jingoo Han)

  TI Keystone
    - Make ks_dw_pcie_msi_domain_ops static (Jingoo Han)
    - Remove unnecessary OOM message (Jingoo Han)

  Miscellaneous
    - Delete unnecessary NULL pointer checks (Markus Elfring)
    - Remove unused to_hotplug_slot() (Gavin Shan)
    - Whitespace cleanup (Jingoo Han)
    - Simplify if-return sequences (Quentin Lambert)"

* tag 'pci-v3.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (28 commits)
  PCI: Remove fixed parameter in pci_iov_resource_bar()
  PCI: Add informational printk for invalid BARs
  PCI: tegra: Add Kconfig help text
  PCI: tegra: Do not build on 64-bit ARM
  PCI: spear: Remove unnecessary OOM message
  PCI: mvebu: Add a blank line after declarations
  PCI: designware: Add a blank line after declarations
  PCI: exynos: Remove unnecessary return statement
  PCI: imx6: Use tabs for indentation
  PCI: keystone: Remove unnecessary OOM message
  PCI: Remove unused and broken to_hotplug_slot()
  PCI: Make FLR and AF FLR reset warning messages different
  PCI: dra7xx: Add __init annotation to dra7xx_add_pcie_port()
  PCI: spear: Add __init annotation to spear13xx_add_pcie_port()
  PCI: spear: Rename add_pcie_port(), pcie_init() to spear13xx_add_pcie_port(), etc.
  PCI: dra7xx: Rename add_pcie_port() to dra7xx_add_pcie_port()
  PCI: layerscape: Add Freescale Layerscape PCIe driver
  PCI: Simplify if-return sequences
  PCI: Delete unnecessary NULL pointer checks
  PCI: Shrink decoding-disabled window while sizing BARs
  ...
2014-12-10 20:58:52 -08:00
Rafael J. Wysocki
fbb988be7f PCI / PM: Drop CONFIG_PM_RUNTIME from the PCI core
After commit b2b49ccbdd (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so quite a few
depend on CONFIG_PM.

Replace CONFIG_PM_RUNTIME with CONFIG_PM in the PCI core code.

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-04 00:50:33 +01:00
Sudeep Holla
5aaba36318 cpumask: factor out show_cpumap into separate helper function
Many sysfs *_show function use cpu{list,mask}_scnprintf to copy cpumap
to the buffer aligned to PAGE_SIZE, append '\n' and '\0' to return null
terminated buffer with newline.

This patch creates a new helper function cpumap_print_to_pagebuf in
cpumask.h using newly added bitmap_print_to_pagebuf and consolidates
most of those sysfs functions using the new helper function.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Suggested-by: Stephen Boyd <sboyd@codeaurora.org>
Tested-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: x86@kernel.org
Cc: linux-acpi@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-11-07 11:45:00 -08:00
Prarit Bhargava
63692df103 PCI: Allow numa_node override via sysfs
NUMA systems with ACPI normally describe the physical topology via _PXM
methods.  But many BIOSes don't implement _PXM, which leaves the kernel
with no way to discover the device topology, which reduces performance
because we can't put memory and processes close to the device.

The NUMA node of a PCI device is already exported in the sysfs "numa_node"
file.  Make that file writable so users can workaround the lack of _PXM
methods in the BIOS.  For example:

  echo 3 > /sys/devices/pci0000:ff/0000:03:1f.3/numa_node

sets the node for PCI device 0000:03:1f.3.

Writing the file emits a FW_BUG warning to encourage users to request
firmware updates.  It also taints the kernel with TAINT_FIRMWARE_WORKAROUND
because overriding the node incorrectly can cause performance issues.

[bhelgaas: changelog, documentation text]
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Myron Stowe <mstowe@redhat.com>
CC: Alexander Ducyk <alexander.h.duyck@redhat.com>
CC: Jiang Liu <jiang.liu@linux.intel.com>
2014-11-06 15:10:06 -07:00
Greg Kroah-Hartman
d8e7d53a2f PCI: Rename sysfs 'enabled' file back to 'enable'
Back in commit 5136b2da77 ("PCI: convert bus code to use dev_groups"),
I misstyped the 'enable' sysfs filename as 'enabled', which broke the
userspace API.  This patch fixes that issue by renaming the file back.

Fixes: 5136b2da77 ("PCI: convert bus code to use dev_groups")
Reported-by: Jeff Epler <jepler@unpythonic.net>
Tested-by: Jeff Epler <jepler@unpythonic.net>	# on v3.14-rt
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# 3.13
2014-10-30 11:17:10 -06:00
Bjorn Helgaas
359c660e99 Merge branch 'pci/msi' into next
* pci/msi:
  PCI/MSI: Remove unnecessary temporary variable
  PCI/MSI: Use __write_msi_msg() instead of write_msi_msg()
  MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
  PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg()
  PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints
  PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib
  PCI/MSI: Remove unused kobject from struct msi_desc
  PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported()
  PCI/MSI: Move D0 check into pci_msi_check_device()
  PCI/MSI: Remove arch_msi_check_device()
  irqchip: armada-370-xp: Remove arch_msi_check_device()
  PCI/MSI/PPC: Remove arch_msi_check_device()

Conflicts:
	drivers/pci/host/pcie-designware.c
2014-10-01 12:31:46 -06:00
Yijing Wang
468ff15a3a PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints
The "msi_bus" sysfs file for bridges sets a bus flag to allow or disallow
future driver requests for MSI or MSI-X.  Previously, the sysfs file
existed for endpoints but did nothing.

Add "msi_bus" support for endpoints, so an administrator can prevent the
use of MSI and MSI-X for individual devices.

Note that as for bridges, these changes only affect future driver requests
for MSI or MSI-X, so drivers may need to be reloaded.

Add documentation for the "msi_bus" sysfs file.

[bhelgaas: changelog, comments, add "subordinate", add endpoint printk,
rework bus_flags setting, make bus_flags printk unconditional]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-10-01 12:21:23 -06:00
Ricardo Ribalda Delgado
89ec3dcf17 PCI: Generate uppercase hex for modalias interface class
Some implementations of modprobe fail to load the driver for a PCI device
automatically because the "interface" part of the modalias from the kernel
is lowercase, and the modalias from file2alias is uppercase.

The "interface" is the low-order byte of the Class Code, defined in PCI
r3.0, Appendix D.  Most interface types defined in the spec do not use
alpha characters, so they won't be affected.  For example, 00h, 01h, 10h,
20h, etc. are unaffected.

Print the "interface" byte of the Class Code in uppercase hex, as we
already do for the Vendor ID, Device ID, Class, etc.

[bhelgaas: changelog]
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: stable@vger.kernel.org
2014-09-22 12:55:58 -06:00
Ryan Desfosses
227f064705 PCI: Merge multi-line quoted strings
Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.

No functional change.

[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-10 20:20:42 -06:00
Ryan Desfosses
3c78bc61f5 PCI: Whitespace cleanup
Fix various whitespace errors.

No functional change.

[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-10 20:20:19 -06:00
Bjorn Helgaas
d1a2523d2a Merge branches 'pci/hotplug', 'pci/pci_is_bridge' and 'pci/virtualization' into next
* pci/hotplug:
  PCI: cpqphp: Fix possible null pointer dereference
  NVMe: Implement PCIe reset notification callback
  PCI: Notify driver before and after device reset

* pci/pci_is_bridge:
  pcmcia: Use pci_is_bridge() to simplify code
  PCI: pciehp: Use pci_is_bridge() to simplify code
  PCI: acpiphp: Use pci_is_bridge() to simplify code
  PCI: cpcihp: Use pci_is_bridge() to simplify code
  PCI: shpchp: Use pci_is_bridge() to simplify code
  PCI: rpaphp: Use pci_is_bridge() to simplify code
  sparc/PCI: Use pci_is_bridge() to simplify code
  powerpc/PCI: Use pci_is_bridge() to simplify code
  ia64/PCI: Use pci_is_bridge() to simplify code
  x86/PCI: Use pci_is_bridge() to simplify code
  PCI: Use pci_is_bridge() to simplify code
  PCI: Add new pci_is_bridge() interface
  PCI: Rename pci_is_bridge() to pci_has_subordinate()

* pci/virtualization:
  PCI: Introduce new device binding path using pci_dev.driver_override

Conflicts:
	drivers/pci/pci-sysfs.c
2014-05-28 16:21:07 -06:00
Alex Williamson
782a985d7a PCI: Introduce new device binding path using pci_dev.driver_override
The driver_override field allows us to specify the driver for a device
rather than relying on the driver to provide a positive match of the
device.  This shortcuts the existing process of looking up the vendor and
device ID, adding them to the driver new_id, binding the device, then
removing the ID, but it also provides a couple advantages.

First, the above existing process allows the driver to bind to any device
matching the new_id for the window where it's enabled.  This is often not
desired, such as the case of trying to bind a single device to a meta
driver like pci-stub or vfio-pci.  Using driver_override we can do this
deterministically using:

  echo pci-stub > /sys/bus/pci/devices/0000:03:00.0/driver_override
  echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
  echo 0000:03:00.0 > /sys/bus/pci/drivers_probe

Previously we could not invoke drivers_probe after adding a device to
new_id for a driver as we get non-deterministic behavior whether the driver
we intend or the standard driver will claim the device.  Now it becomes a
deterministic process, only the driver matching driver_override will probe
the device.

To return the device to the standard driver, we simply clear the
driver_override and reprobe the device:

  echo > /sys/bus/pci/devices/0000:03:00.0/driver_override
  echo 0000:03:00.0 > /sys/bus/pci/devices/0000:03:00.0/driver/unbind
  echo 0000:03:00.0 > /sys/bus/pci/drivers_probe

Another advantage to this approach is that we can specify a driver override
to force a specific binding or prevent any binding.  For instance when an
IOMMU group is exposed to userspace through VFIO we require that all
devices within that group are owned by VFIO.  However, devices can be
hot-added into an IOMMU group, in which case we want to prevent the device
from binding to any driver (override driver = "none") or perhaps have it
automatically bind to vfio-pci.  With driver_override it's a simple matter
for this field to be set internally when the device is first discovered to
prevent driver matches.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-28 16:04:53 -06:00
Sebastian Ott
9edbcd2252 PCI: Remove pcibios_add_platform_entries()
Remove pcibios_add_platform_entries().  Architecture-specific attributes
can be achieved by setting pdev->dev.groups.

Link: https://lkml.kernel.org/r/alpine.LFD.2.11.1404141101500.1529@denkbrett
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-05-22 10:54:07 -06:00
Sebastian Ott
dfc73e7acd PCI: Move Open Firmware devspec attribute to PCI common code
Move the devspec OF attribute to PCI common code's set of device attributes
since it's not architecture dependent.  As a side effect microblaze and
powerpc no longer need to use pcibios_add_platform_entries().

[bhelgaas: fold in #include for compile error]
Link: https://lkml.kernel.org/r/alpine.LFD.2.11.1404141101500.1529@denkbrett
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-30 14:48:41 -06:00
Tejun Heo
bc6caf02cc pci: use device_remove_file_self() instead of device_schedule_callback()
driver-core now supports synchrnous self-deletion of attributes and
the asynchrnous removal mechanism is scheduled for removal.  Use it
instead of device_schedule_callback().  This makes "remove" behave
synchronously.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-02-07 15:42:41 -08:00
Rafael J. Wysocki
9d16947b75 PCI: Add global pci_lock_rescan_remove()
There are multiple PCI device addition and removal code paths that may be
run concurrently with the generic PCI bus rescan and device removal that
can be triggered via sysfs.  If that happens, it may lead to multiple
different, potentially dangerous race conditions.

The most straightforward way to address those problems is to run
the code in question under the same lock that is used by the
generic rescan/remove code in pci-sysfs.c.  To prepare for those
changes, move the definition of the global PCI remove/rescan lock
to probe.c and provide global wrappers, pci_lock_rescan_remove()
and pci_unlock_rescan_remove(), allowing drivers to manipulate
that lock.  Also provide pci_stop_and_remove_bus_device_locked()
for the callers of pci_stop_and_remove_bus_device() who only need
to hold the rescan/remove lock around it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-13 17:49:49 -07:00
Bjorn Helgaas
f7625980f5 PCI: Fix whitespace, capitalization, and spelling errors
Fix whitespace, capitalization, and spelling errors.  No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.

Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-14 11:28:18 -07:00
Bjorn Helgaas
33de1b8bf6 Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Report pci_pme_active() kmalloc failure
  mn10300/PCI: Remove useless pcibios_last_bus
  frv/PCI: Remove pcibios_last_bus
  PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0
  x86/PCI: Coalesce multiple overlapping host bridge windows
  MAINTAINERS: Add arch/x86/pci to PCI file patterns
  PCI/PM: Remove pci_pm_complete()
  PCI: Add pci_dev_show_local_cpu() to simplify code
  mn10300/PCI: Remove unused pci_mem_start
  cris/PCI: Remove unused pci_mem_start
  PCI: Make pci_dev_pm_ops static

Conflicts:
	drivers/pci/pci-sysfs.c
2013-10-31 14:12:40 -06:00
Yijing Wang
c489f5fbb1 PCI: Add pci_dev_show_local_cpu() to simplify code
local_cpus_show() and local_cpulist_show() are almost the same.
This adds a new helper function, pci_dev_show_local_cpu(), to simplify
code.

The same strategy is already used by cpuaffinity_show() and
cpulistaffinity_show().

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-07 15:12:46 -06:00
Sachin Kamat
bf22c90fb1 PCI: Make pci_bus_attrs, pci_dev_attrs, dev_rescan_attr, dev_remove_attr, vga_attr static
Local variables used only in this file are made static.

[bhelgaas: also make pci_dev_attrs[] static (from Fengguang)]
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-07 15:10:46 -06:00
Greg Kroah-Hartman
5136b2da77 PCI: convert bus code to use dev_groups
The dev_attrs field of struct bus_type is going away soon, dev_groups
should be used instead.  This converts the PCI bus code to use the
correct field.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-07 14:58:42 -06:00
Greg Kroah-Hartman
0f49ba5599 PCI: convert bus code to use bus_groups
The bus_attrs field of struct bus_type is going away soon, dev_groups
should be used instead.  This converts the PCI bus code to use the
correct field.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-10-07 14:51:02 -06:00
Greg Kroah-Hartman
56039e658c PCI: Convert class code to use dev_groups
The dev_attrs field of struct class is going away soon, dev_groups
should be used instead.  This converts the PCI class code to use the
correct field.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-07-25 12:18:42 -06:00
Bjorn Helgaas
bb4bac9308 Merge branch 'pci/jiang-iov-fixes' into next
* pci/jiang-iov-fixes:
  PCI: Hide remove and rescan sysfs interfaces for SR-IOV virtual functions
  PCI: Finish SR-IOV VF setup before adding the device
2013-06-05 12:27:19 -06:00
Jiang Liu
dfab88beda PCI: Hide remove and rescan sysfs interfaces for SR-IOV virtual functions
PCI devices for SR-IOV virtual functions should only be created/
destroyed by pci_enable_sriov()/pci_disable_sriov() because special
data structures are associated with SR-IOV virtual functions.
So hide hotplug related sysfs interfaces "remove" and "rescan" for
SR-IOV virtual functions, otherwise it may cause memory leakage
and other issues.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Donald Dutile <ddutile@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ram Pai <linuxram@us.ibm.com>
2013-06-05 12:18:50 -06:00
Jingoo Han
9a994e8ec7 PCI: Replace strict_strtoul() with kstrtoul()
The usage of strict_strtoul() is not preferred, because
strict_strtoul() is obsolete.  Thus, kstrtoul() should be
used.

[bhelgaas: "#define strict_strtoul  kstrtoul", so no functional change]
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-06-01 17:41:25 -06:00
Libin
64b001758e PCI: Use vma_pages() to replace (vm_end - vm_start) >> PAGE_SHIFT
(*->vm_end - *->vm_start) >> PAGE_SHIFT operation is implemented
as an inline funcion vma_pages() in linux/mm.h, so use it.

Signed-off-by: Libin <huawei.libin@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-15 14:30:44 -06:00
Bjorn Helgaas
faa48a507f PCI: Remove spurious error for sriov_numvfs store and simplify flow
If we request "num_vfs" and the driver's sriov_configure() method enables
exactly that number ("num_vfs_enabled"), we complain "Invalid value for
number of VFs to enable" and return an error.  We should silently return
success instead.

Also, use kstrtou16() since numVFs is defined to be a 16-bit field and
rework to simplify control flow.

Reported-by: Greg Rose <gregory.v.rose@intel.com>
Reference: http://lkml.kernel.org/r/20121214101911.00002f59@unknown
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Donald Dutile <ddutile@redhat.com>
2012-12-26 10:39:22 -07:00
Linus Torvalds
193c0d6825 PCI changes for the v3.8 merge window:
Host bridge hotplug:
     - Untangle _PRT from struct pci_bus (Bjorn Helgaas)
     - Request _OSC control before scanning root bus (Taku Izumi)
     - Assign resources when adding host bridge (Yinghai Lu)
     - Remove root bus when removing host bridge (Yinghai Lu)
     - Remove _PRT during hot remove (Yinghai Lu)
 
   SRIOV
     - Add sysfs knobs to control numVFs (Don Dutile)
 
   Power management
     - Notify devices when power resource turned on (Huang Ying)
 
   Bug fixes
     - Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
     - Keep runtime PM enabled for unbound PCI devices (Huang Ying)
     - Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
     - Fix xen frontend shutdown issue (David Vrabel)
     - Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
 
   Miscellaneous
     - Add GPL license for drivers/pci/ioapic (Andrew Cooks)
     - Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
     - NumaChip remote PCI support (Daniel Blueman)
     - Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo Han)
     - Convert dev_printk() to dev_info(), etc (Joe Perches)
     - Add support for non PCI BAR ROM data (Matthew Garrett)
     - Add x86 support for host bridge translation offset (Mike Yoknis)
     - Report success only when every driver supports AER (Vijay Pandarathil)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQyKwSAAoJEPGMOI97Hn6zScgQAJZK2VDfCv74mKrgSDNokIzH
 5nVDrc9AHKJm7CUODs6keJK5d4TD/za3Zao68zrYHsJJKes2ni2Z3W34HP2RXKK2
 eOmePXOHYPPZMlimP9r9cVxNu1ZJCyp/yWSBcsPF4zUgWhBWLRaSj85I049gQ0sz
 +05nZYfLjVd3HNiaXsG4CQyMrNF46XEsLhF9vs+Nr2GHPwrpzhfScgYv63oDS86C
 3ICKsjmiRUZcNelxIFYmyxa5u89QdW5XHjzc9eHGQuus24Vxw+TZzsdfc17sUJEE
 HTyXY+RjDpOVhdtwwUjrCEOiyZYvy3g9+3sKxoxgt/76ghdUaR7fxITwB97qVMFD
 T0ESlKjSV/Qv5QYdyy5uP4zwNs/PXCWXkTg/L1m71F30BxKWDa7tgiA6uK7Z7fl5
 1aokKBdk3mtJJJIDJG1YkxPXx/JItTGCNYrx7CcFj49rSjrUWLQdmrYahersRIsB
 3wiD2xTi9e4dXeP/+VGzGOWB/sHk+73jvrvZe/REa1FCnMINDz4+9V9WaGROMqyq
 MQ8kX0KfYcNVNxy1GOXjU5wLpMN/t/QbvI7gwzRP1DAUCJPoOgFy7AjvSTVG3zuy
 8CtdOFttVkUn5dqsbQR0gVbyQVTS3PGSKz5XC/s8kVDWhja0xZTBYwrskM/4zdSD
 Xf48OyYV5EjpC3FYUSiU
 =OE3Q
 -----END PGP SIGNATURE-----

Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI update from Bjorn Helgaas:
 "Host bridge hotplug:
   - Untangle _PRT from struct pci_bus (Bjorn Helgaas)
   - Request _OSC control before scanning root bus (Taku Izumi)
   - Assign resources when adding host bridge (Yinghai Lu)
   - Remove root bus when removing host bridge (Yinghai Lu)
   - Remove _PRT during hot remove (Yinghai Lu)

  SRIOV
    - Add sysfs knobs to control numVFs (Don Dutile)

  Power management
   - Notify devices when power resource turned on (Huang Ying)

  Bug fixes
   - Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
   - Keep runtime PM enabled for unbound PCI devices (Huang Ying)
   - Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
   - Fix xen frontend shutdown issue (David Vrabel)
   - Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)

  Miscellaneous
   - Add GPL license for drivers/pci/ioapic (Andrew Cooks)
   - Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
   - NumaChip remote PCI support (Daniel Blueman)
   - Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo
     Han)
   - Convert dev_printk() to dev_info(), etc (Joe Perches)
   - Add support for non PCI BAR ROM data (Matthew Garrett)
   - Add x86 support for host bridge translation offset (Mike Yoknis)
   - Report success only when every driver supports AER (Vijay
     Pandarathil)"

Fix up trivial conflicts.

* tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
  PCI: Use phys_addr_t for physical ROM address
  x86/PCI: Add NumaChip remote PCI support
  ath9k: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: collapse wrapper for pcie_capability_read_word()
  iwlegacy: Use standard #defines for PCIe Capability ASPM fields
  iwlegacy: collapse wrapper for pcie_capability_read_word()
  cxgb3: Use standard #defines for PCIe Capability ASPM fields
  PCI: Add standard PCIe Capability Link ASPM field names
  PCI/portdrv: Use PCI Express Capability accessors
  PCI: Use standard PCIe Capability Link register field names
  x86: Use PCI setup data
  PCI: Add support for non-BAR ROMs
  PCI: Add pcibios_add_device
  EFI: Stash ROMs if they're not in the PCI BAR
  PCI: Add and use standard PCI-X Capability register names
  PCI/PM: Keep runtime PM enabled for unbound PCI devices
  xen-pcifront: Handle backend CLOSED without CLOSING
  PCI: SRIOV control and status via sysfs (documentation)
  PCI/AER: Report success only when every device has AER-aware driver
  ...
2012-12-13 12:14:47 -08:00
Bill Pemberton
b40b97ae73 PCI: Remove CONFIG_HOTPLUG ifdefs
Remove conditional code based on CONFIG_HOTPLUG being false.  It's
always on now in preparation of it going away as an option.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-11-28 12:53:46 -08:00
Bjorn Helgaas
d3fe3988fb Merge branch 'for-linus' into next
* for-linus:
  PCI/portdrv: Don't create hotplug slots unless port supports hotplug
  PCI/PM: Fix proc config reg access for D3cold and bridge suspending
  PCI/PM: Resume device before shutdown
  PCI/PM: Fix deadlock when unbinding device if parent in D3cold
2012-11-26 13:00:57 -07:00
Bjorn Helgaas
6b13672469 PCI: Use spec names for SR-IOV capability fields
Use the same names (almost) as the spec for TotalVFs, InitialVFs, NumVFs.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 21:38:53 -07:00
Donald Dutile
bff73156d3 PCI: Provide method to reduce the number of total VFs supported
Some implementations of SRIOV provide a capability structure
value of TotalVFs that is greater than what the software can support.
Provide a method to reduce the capability structure reported value
to the value the driver can support.
This ensures sysfs reports the current capability of the system,
hardware and software.
Example for its use: igb & ixgbe -- report 8 & 64 as TotalVFs,
but drivers only support 7 & 63 maximum.

Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 21:37:39 -07:00
Donald Dutile
1789382a72 PCI: SRIOV control and status via sysfs
Provide files under sysfs to determine the maximum number of VFs
an SR-IOV-capable PCIe device supports, and methods to enable and
disable the VFs on a per-device basis.

Currently, VF enablement by SR-IOV-capable PCIe devices is done
via driver-specific module parameters.  If not setup in modprobe files,
it requires admin to unload & reload PF drivers with number of desired
VFs to enable.  Additionally, the enablement is system wide: all
devices controlled by the same driver have the same number of VFs
enabled.  Although the latter is probably desired, there are PCI
configurations setup by system BIOS that may not enable that to occur.

Two files are created for the PF of PCIe devices with SR-IOV support:

    sriov_totalvfs	Contains the maximum number of VFs the device
			could support as reported by the TotalVFs register
			in the SR-IOV extended capability.

    sriov_numvfs	Contains the number of VFs currently enabled on
			this device as reported by the NumVFs register in
			the SR-IOV extended capability.

			Writing zero to this file disables all VFs.

			Writing a positive number to this file enables that
			number of VFs.

These files are readable for all SR-IOV PF devices.  Writes to the
sriov_numvfs file are effective only if a driver that supports the
sriov_configure() method is attached.

Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 20:10:39 -07:00
Yinghai Lu
625e1d59a8 PCI: Use is_visible() with boot_vga attribute for pci_dev
Should make pci_create_sysfs_dev_files() simpler.  Also fix possible
memleak in remove path.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 20:09:53 -07:00
Yinghai Lu
4e15c46bdc PCI: Add pci_device_type to pdev's device struct
Need type filled in device structure so it can be used for visible
attribute control in sysfs for pci_dev.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-11-09 20:07:31 -07:00
Huang Ying
b3c32c4f95 PCI/PM: Fix proc config reg access for D3cold and bridge suspending
In https://bugzilla.kernel.org/show_bug.cgi?id=48981
Peter reported that /proc/bus/pci/??/??.? does not work for 3.6.
This is because the device configuration space registers are
not accessible if the corresponding parent bridge is suspended or
the device is put into D3cold state.

This is the same as /sys/bus/pci/devices/0000:??:??.?/config access
issue.  So the function used to solve sysfs issue is used to solve
this issue.

This patch moves pci_config_pm_runtime_get()/_put() from pci/pci-sysfs.c
to pci/pci.c and makes them extern so they can be used by both the
sysfs and proc paths.

[bhelgaas: changelog, references, reporters]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=48981
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=49031
Reported-by: Forrest Loomis <cybercyst@gmail.com>
Reported-by: Peter <lekensteyn@gmail.com>
Reported-by: Micael Dias <kam1kaz3@gmail.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org		# v3.6+
2012-11-05 10:46:23 -07:00
Huang Ying
3d8387efe1 PCI/PM: Fix config reg access for D3cold and bridge suspending
This patch fixes the following bug:

http://marc.info/?l=linux-pci&m=134338059022620&w=2

Where lspci does not work properly if a device and the corresponding
parent bridge (such as PCIe port) is suspended.  This is because the
device configuration space registers will be not accessible if the
corresponding parent bridge is suspended or the device is put into
D3cold state.

To solve the issue, the bridge/PCIe port connected to the device is
put into active state before read/write configuration space registers.
If the device is in D3cold state, it will be put into active state
too.

To avoid resume/suspend PCIe port for each configuration register
read/write, a small delay is added before the PCIe port to go
suspended.

Reported-by: Bjorn Mork <bjorn@mork.no>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
2012-08-21 17:34:24 -06:00
Bjorn Helgaas
35e7f73c32 Merge branch 'topic/huang-d3cold-v7' into next
* topic/huang-d3cold-v7:
  PCI/PM: add PCIe runtime D3cold support
  PCI: do not call pci_set_power_state with PCI_D3cold
  PCI/PM: add runtime PM support to PCIe port
  ACPI/PM: specify lowest allowed state for device sleep state
2012-06-23 11:59:43 -06:00
Huang Ying
448bd857d4 PCI/PM: add PCIe runtime D3cold support
This patch adds runtime D3cold support and corresponding ACPI platform
support.  This patch only enables runtime D3cold support; it does not
enable D3cold support during system suspend/hibernate.

D3cold is the deepest power saving state for a PCIe device, where its main
power is removed.  While it is in D3cold, you can't access the device at
all, not even its configuration space (which is still accessible in D3hot).
Therefore the PCI PM registers can not be used to transition into/out of
the D3cold state; that must be done by platform logic such as ACPI _PR3.

To support wakeup from D3cold, a system may provide auxiliary power, which
allows a device to request wakeup using a Beacon or the sideband WAKE#
signal.  WAKE# is usually connected to platform logic such as ACPI GPE.
This is quite different from other power saving states, where devices
request wakeup via a PME message on the PCIe link.

Some devices, such as those in plug-in slots, have no direct platform
logic.  For example, there is usually no ACPI _PR3 for them.  D3cold
support for these devices can be done via the PCIe Downstream Port leading
to the device.  When the PCIe port is powered on/off, the device is powered
on/off too.  Wakeup events from the device will be notified to the
corresponding PCIe port.

For more information about PCIe D3cold and corresponding ACPI support,
please refer to:

- PCI Express Base Specification Revision 2.0
- Advanced Configuration and Power Interface Specification Revision 5.0

[bhelgaas: changelog]
Reviewed-by: Rafael J. Wysocki <rjw@sisk.pl>
Originally-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-23 10:50:59 -06:00
Bjorn Helgaas
d6d88c832e PCI: use __weak consistently
Use "__weak" instead of the gcc-specific "__attribute__ ((weak))"

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-06-20 10:44:35 -06:00
Matthew Garrett
1a39b310e9 vgaarb: Add support for setting the default video device (v2)
The default VGA device is a somewhat fluid concept on platforms with
multiple GPUs. Add support for setting it so switching code can update
things appropriately, and make sure that the sysfs code returns the right
device if it's changed.

v2: Updated to fix builds when __ARCH_HAS_VGA_DEFAULT_DEVICE is false.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: benh@kernel.crashing.org
Cc: airlied@redhat.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-04-24 09:50:15 +01:00
Yinghai Lu
210647af89 PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
The old pci_remove_bus_device actually did stop and remove.

Make the name reflect that to reduce confusion.

This patch is done by sed scripts and changes back some incorrect
__pci_remove_bus_device changes.

Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-27 12:12:18 -08:00
Yinghai Lu
2f320521a0 PCI: Make rescan bus increase bridge resource size if needed
Current rescan will not touch bridge MMIO and IO.

Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
resources, if child devices need more resources.

Only do that for bridges whose children are all removed already; i.e. don't
release resources that could already be in use by drivers on child devices.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-02-14 08:44:53 -08:00
Linus Torvalds
c49c41a413 Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
  capabilities: remove __cap_full_set definition
  security: remove the security_netlink_recv hook as it is equivalent to capable()
  ptrace: do not audit capability check when outputing /proc/pid/stat
  capabilities: remove task_ns_* functions
  capabitlies: ns_capable can use the cap helpers rather than lsm call
  capabilities: style only - move capable below ns_capable
  capabilites: introduce new has_ns_capabilities_noaudit
  capabilities: call has_ns_capability from has_capability
  capabilities: remove all _real_ interfaces
  capabilities: introduce security_capable_noaudit
  capabilities: reverse arguments to security_capable
  capabilities: remove the task from capable LSM hook entirely
  selinux: sparse fix: fix several warnings in the security server cod
  selinux: sparse fix: fix warnings in netlink code
  selinux: sparse fix: eliminate warnings for selinuxfs
  selinux: sparse fix: declare selinux_disable() in security.h
  selinux: sparse fix: move selinux_complete_init
  selinux: sparse fix: make selinux_secmark_refcount static
  SELinux: Fix RCU deref check warning in sel_netport_insert()

Manually fix up a semantic mis-merge wrt security_netlink_recv():

 - the interface was removed in commit fd77846152 ("security: remove
   the security_netlink_recv hook as it is equivalent to capable()")

 - a new user of it appeared in commit a38f7907b9 ("crypto: Add
   userspace configuration API")

causing no automatic merge conflict, but Eric Paris pointed out the
issue.
2012-01-14 18:36:33 -08:00
Eric Paris
b7e724d303 capabilities: reverse arguments to security_capable
security_capable takes ns, cred, cap.  But the LSM capable() hook takes
cred, ns, cap.  The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!

Signed-off-by: Eric Paris <eparis@redhat.com>
2012-01-05 18:52:53 -05:00
Paul Gortmaker
363c75db1d pci: Fix files needing export.h for EXPORT_SYMBOL/THIS_MODULE
They were implicitly getting it from device.h --> module.h but
we want to clean that up.  So add the minimal header for these
macros.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:22 -04:00
Yinghai Lu
dc2c2c9dd5 PCI/sysfs: move bus cpuaffinity to class dev_attrs
Requested by Greg KH to fix a race condition in the creating of PCI bus
cpuaffinity files.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:13 -07:00
Yinghai Lu
b9d320fcb6 PCI: add rescan to /sys/.../pci_bus/.../
After remove the device from /sys, we have to rescan all or
find out the bridge and access /sys../device/rescan there.

this patch add /sys/.../pci_bus/.../rescan. So user can rescan more easy.
that is more clean and easy to understand.

like after remove 0000:c4:00.0, you can rescan 0000:c4 directly.

-v2: According to Jesse, use function instead of exposing attr, so could hide
	#ifdef in header file.
     also add code to remove rescan file in remove path.
-v3: GregKH pointed out that we should use dev_attrs to avoid racing.
     So add pcibus_attrs and make it to be member of pcibus_attrs.
-v4: Change name to pcibus_dev_attrs according to GregKH

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-21 12:17:12 -07:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Serge E. Hallyn
3486740a4f userns: security: make capabilities relative to the user namespace
- Introduce ns_capable to test for a capability in a non-default
  user namespace.
- Teach cap_capable to handle capabilities in a non-default
  user namespace.

The motivation is to get to the unprivileged creation of new
namespaces.  It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.

I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.

Changelog:
	11/05/2010: [serge] add apparmor
	12/14/2010: [serge] fix capabilities to created user namespaces
	Without this, if user serge creates a user_ns, he won't have
	capabilities to the user_ns he created.  THis is because we
	were first checking whether his effective caps had the caps
	he needed and returning -EPERM if not, and THEN checking whether
	he was the creator.  Reverse those checks.
	12/16/2010: [serge] security_real_capable needs ns argument in !security case
	01/11/2011: [serge] add task_ns_capable helper
	01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
	02/16/2011: [serge] fix a logic bug: the root user is always creator of
		    init_user_ns, but should not always have capabilities to
		    it!  Fix the check in cap_capable().
	02/21/2011: Add the required user_ns parameter to security_capable,
		    fixing a compile failure.
	02/23/2011: Convert some macros to functions as per akpm comments.  Some
		    couldn't be converted because we can't easily forward-declare
		    them (they are inline if !SECURITY, extern if SECURITY).  Add
		    a current_user_ns function so we can use it in capability.h
		    without #including cred.h.  Move all forward declarations
		    together to the top of the #ifdef __KERNEL__ section, and use
		    kernel-doc format.
	02/23/2011: Per dhowells, clean up comment in cap_capable().
	02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.

(Original written and signed off by Eric;  latest, modified version
acked by him)

[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:02 -07:00
Linus Torvalds
99759619b2 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: label: remove #include of ACPI header to avoid warnings
  PCI: label: Fix compilation error when CONFIG_ACPI is unset
  PCI: pre-allocate additional resources to devices only after successful allocation of essential resources.
  PCI: introduce reset_resource()
  PCI: data structure agnostic free list function
  PCI: refactor io size calculation code
  PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH
  PCI hotplug: acpiphp: set current_state to D0 in register_slot
  PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs
  PCI: add more checking to ICH region quirks
  PCI: aer-inject: Override PCIe AER Mask Registers
  PCI: fix tlan build when CONFIG_PCI is not enabled
  PCI: remove quirk for pre-production systems
  PCI: Avoid potential NULL pointer dereference in pci_scan_bridge
  PCI/lpc: irq and pci_ids patch for Intel DH89xxCC DeviceIDs
  PCI: sysfs: Fix failure path for addition of "vpd" attribute
2011-03-18 10:56:44 -07:00
Chris Wright
a628e7b87e pci: use security_capable() when checking capablities during config space read
This reintroduces commit 47970b1b which was subsequently reverted
as f00eaeea.  The original change was broken and caused X startup
failures and generally made privileged processes incapable of reading
device dependent config space.  The normal capable() interface returns
true on success, but the LSM interface returns 0 on success.  This thinko
is now fixed in this patch, and has been confirmed to work properly.

So, once again...Eric Paris noted that commit de139a3 ("pci: check caps
from sysfs file open to read device dependent config space") caused the
capability check to bypass security modules and potentially auditing.
Rectify this by calling security_capable() when checking the open file's
capabilities for config space reads.

Reported-by: Eric Paris <eparis@redhat.com>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Alex Riesen <raa.lkml@gmail.com>
Cc: Sedat Dilek <sedat.dilek@googlemail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
2011-02-15 19:06:31 +11:00
Linus Torvalds
f00eaeea7a Revert "pci: use security_capable() when checking capablities during config space read"
This reverts commit 47970b1b2a.

It turns out it breaks several distributions.  Looks like the stricter
selinux checks fail due to selinux policies not being set to allow the
access - breaking X, but also lspci.

So while the change was clearly the RightThing(tm) to do in theory, in
practice we have backwards compatibility issues making it not work.

Reported-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: David Airlie <airlied@linux.ie>
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-02-13 07:50:50 -08:00
Chris Wright
47970b1b2a pci: use security_capable() when checking capablities during config space read
Eric Paris noted that commit de139a3 ("pci: check caps from sysfs file
open to read device dependent config space") caused the capability check
to bypass security modules and potentially auditing.  Rectify this by
calling security_capable() when checking the open file's capabilities
for config space reads.

Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
2011-02-11 17:58:11 +11:00
Ben Hutchings
0f12a4e293 PCI: sysfs: Fix failure path for addition of "vpd" attribute
Commit 280c73d ("PCI: centralize the capabilities code in
pci-sysfs.c") changed the initialisation of the "rom" and "vpd"
attributes, and made the failure path for the "vpd" attribute
incorrect.  We must free the new attribute structure (attr), but
instead we currently free dev->vpd->attr.  That will normally be NULL,
resulting in a memory leak, but it might be a stale pointer, resulting
in a double-free.

Found by inspection; compile-tested only.

Cc: stable@kernel.org
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-02-08 10:02:46 -08:00
Alex Williamson
ff29530e65 PCI: sysfs: Update ROM to include default owner write access
The PCI sysfs ROM interface requires an enabling write to access the ROM
image, but the default file mode is 0400.  The original proposed patch
adding sysfs ROM support was a true read-only interface, with the
enabling bit coming in as a feature request.  I suspect it was simply an
oversight that the file mode didn't get updated to match the API.

Acked-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-01-14 08:55:42 -08:00
Darrick J. Wong
8c05cd08a7 PCI: fix offset check for sysfs mmapped files
I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts.
Running an strace of the X server shows that it's doing this:

open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10
mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument)

This code seems to be asking for a shared read/write mapping of 16MB worth of
BAR0 starting at file offset 0, and letting the kernel assign a starting
address.  Unfortunately, this -EINVAL causes X not to start.  Looking into
dmesg, there's a complaint like so:

process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x        96000000, size 0x         1000000)

...with the following code in pci_mmap_fits:

	pci_start = (mmap_api == PCI_MMAP_SYSFS) ?
		pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
        if (start >= pci_start && start < pci_start + size &&
                        start + nr <= pci_start + size)

It looks like the logic here is set up such that when the mmap call comes via
sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the
resource's start and end address, and the end of the vma to be no farther than
the end.  However, the sysfs PCI resource files always start at offset zero,
which means that this test always fails for programs that mmap the sysfs files.
Given the comment in the original commit
3b519e4ea6, I _think_ the old procfs files
require that the file offset be equal to the resource's base address when
mmapping.

I think what we want here is for pci_start to be 0 when mmap_api ==
PCI_MMAP_PROCFS.  The following patch makes that change, after which the Matrox
and Mach64 X drivers work again.

Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-16 09:15:39 -08:00
Randy Dunlap
e25cd062b1 PCI: sysfs: fix printk warnings
Cast pci_resource_start() and pci_resource_len() to u64 for printk.

drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t'
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-15 09:34:44 -08:00
Martin Wilck
3b519e4ea6 PCI: fix size checks for mmap() on /proc/bus/pci files
The checks for valid mmaps of PCI resources made through /proc/bus/pci files
that were introduced in 9eff02e204 have several
problems:

1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
whereas under /sys/bus/pci/devices, the start of the resource corresponds
to offset 0. This may lead to false negatives in pci_mmap_fits(), which
implicitly assumes the /sys/bus/pci/devices layout.

2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads
to false positives, because pci_mmap_fits() doesn't treat empty resources
correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
in this case!).

3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
WARNINGS for the first resources that don't fit until the correct one is found.

On many controllers the first 2-4 BARs are used, and the others are empty.
In this case, an mmap attempt will first fail on the non-empty BARs
(including the "right" BAR because of 1.) and emit bogus WARNINGS because
of 3., and finally succeed on the first empty BAR because of 2.
This is certainly not the intended behaviour.

This patch addresses all 3 issues.
Updated with an enum type for the additional parameter for pci_mmap_fits().

Cc: stable@kernel.org
Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-11-11 09:34:32 -08:00
Narendra K
911e1c9b05 PCI: export SMBIOS provided firmware instance and label to sysfs
This patch exports SMBIOS provided firmware instance and label of
onboard PCI devices to sysfs.  New files are:
  /sys/bus/pci/devices/.../label which contains the firmware name for
the device in question, and
  /sys/bus/pci/devices/.../index which contains the firmware device type
instance for the given device.

Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:36:01 -07:00
Alex Williamson
8633328be2 PCI: Allow read/write access to sysfs I/O port resources
PCI sysfs resource files currently only allow mmap'ing.  On x86 this
works fine for memory backed BARs, but doesn't work at all for I/O
port backed BARs.  Add read/write to I/O port PCI sysfs resource
files to allow userspace access to these device regions.

Acked-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:32:08 -07:00
Kulikov Vasiliy
a3f5835a8e PCI: pci-sysfs: remove casts from void*
Remove unnesessary casts from void*.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:29:18 -07:00
Jesse Barnes
3be434f024 Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/"
This reverts commit 75568f8094.

Since they're just a convenience anyway, remove these symlinks since
they're causing duplicate filename errors in the wild.

Acked-by: Alex Chiang <achiang@canonical.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-06-11 13:08:37 -07:00
Linus Torvalds
6109e2ce26 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (36 commits)
  PCI: hotplug: pciehp: Removed check for hotplug of display devices
  PCI: read memory ranges out of Broadcom CNB20LE host bridge
  PCI: Allow manual resource allocation for PCI hotplug bridges
  x86/PCI: make ACPI MCFG reserved error messages ACPI specific
  PCI hotplug: Use kmemdup
  PM/PCI: Update PCI power management documentation
  PCI: output FW warning in pci_read/write_vpd
  PCI: fix typos pci_device_dis/enable to pci_dis/enable_device in comments
  PCI quirks: disable msi on AMD rs4xx internal gfx bridges
  PCI: Disable MSI for MCP55 on P5N32-E SLI
  x86/PCI: irq and pci_ids patch for additional Intel Cougar Point DeviceIDs
  PCI: aerdrv: trivial cleanup for aerdrv_core.c
  PCI: aerdrv: trivial cleanup for aerdrv.c
  PCI: aerdrv: introduce default_downstream_reset_link
  PCI: aerdrv: rework find_aer_service
  PCI: aerdrv: remove is_downstream
  PCI: aerdrv: remove magical ROOT_ERR_STATUS_MASKS
  PCI: aerdrv: redefine PCI_ERR_ROOT_*_SRC
  PCI: aerdrv: rework do_recovery
  PCI: aerdrv: rework get_e_source()
  ...
2010-05-21 18:58:52 -07:00
Chris Wright
de139a3393 pci: check caps from sysfs file open to read device dependent config space
The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN
check to verify privileges before allowing a user to read device
dependent config space.  This is meant to protect from an unprivileged
user potentially locking up the box.

When assigning a PCI device directly to a guest with libvirt and KVM,
the sysfs config space file is chown'd to the unprivileged user that
the KVM guest will run as.  The guest needs to have full access to the
device's config space since it's responsible for driving the device.
However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN
check will not allow read access beyond the config header.

With this patch we check privileges against the capabilities used when
openining the sysfs file.  The allows a privileged process to open the
file and hand it to an unprivileged process, and the unprivileged process
can still read all of the config space.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:32 -07:00
Chris Wright
2c3c8bea60 sysfs: add struct file* to bin_attr callbacks
This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-21 09:37:31 -07:00
Michal Schmidt
447c5dd733 PCI: return correct value when writing to the "reset" attribute
A successful write() to the "reset" sysfs attribute should return the
number of bytes written, not 0. Otherwise userspace (bash) retries the
write over and over again.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-11 12:01:11 -07:00
Alex Chiang
75568f8094 PCI: create function symlinks in /sys/bus/pci/slots/N/
Create convenience symlinks in sysfs, linking slots to device
functions, and vice versa. These links make it easier for users to
figure out which devices actually live in what slots.

For example:

sapphire:/sys/bus/pci/slots # ls
1  10  2  3  4  5  6  7  8  9

sapphire:/sys/bus/pci/slots # ls -l 3
total 0
-r--r--r-- 1 root root 65536 Aug 18 14:10 address
lrwxrwxrwx 1 root root     0 Aug 18 14:10 function0 ->
../../../../devices/pci0000:23/0000:23:01.0
lrwxrwxrwx 1 root root     0 Aug 18 14:10 function1 ->
../../../../devices/pci0000:23/0000:23:01.1

sapphire:/sys/bus/pci/slots # ls -l 3/function0/slot
lrwxrwxrwx 1 root root 0 Aug 18 14:13 3/function0/slot ->
../../../bus/pci/slots/3

The original form of this patch was written by Matthew Wilcox,
and was enhanced to include links from the sysfs slots/ directory
pointing back at the device functions.

Cc: willy@linux.intel.com
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-05-11 12:01:07 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Mel Gorman
6757eca348 sysfs: Initialised pci bus legacy_mem field before use
PPC64 is failing to boot the latest mmotm due to an uninitialised pointer in
pci_create_legacy_files(). The surprise is that machines boot at all and it
would appear to affect current mainline as well.  This patch fixes the problem.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-19 07:12:11 -07:00
Stephen Rothwell
62e877b893 sysfs: fix for thinko with sysfs_bin_attr_init()
After merging the final tree, today's linux-next build (powerpc
allyesconfig) failed like this:

drivers/pci/pci-sysfs.c: In function 'pci_create_legacy_files':
drivers/pci/pci-sysfs.c:645: error: lvalue required as unary '&' operand
drivers/pci/pci-sysfs.c:658: error: lvalue required as unary '&' operand

Caused by commit "sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on
dynamic attributes" interacting with commit "sysfs: Use one lockdep
class per sysfs attribute") both from the driver-core tree.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:52 -08:00
Eric W. Biederman
a07e4156a2 sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on dynamic attributes
These are the non-static sysfs attributes that exist on
my test machine.  Fix them to use sysfs_attr_init or
sysfs_bin_attr_init as appropriate.   It simply requires
making a sysfs attribute present to see this.  So this
is a little bit tedious but otherwise not too bad.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:51 -08:00
David John
6be954d1f9 PCI: Check the node argument passed to cpumask_of_node
Commit e0cd516 "PCI: derive nearby CPUs from device's instead of bus'
NUMA information" causes an null pointer dereference when reading from
the sysfs attributes local_cpu* on Intel machines with no ACPI NUMA
proximity info, since dev->numa_node gets set to -1 for all PCI devices,
which then gets passed to cpumask_of_node.

Add a check to prevent this.

Signed-off-by: David John <davidjon@xenontk.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-04 15:10:56 -08:00
Yinghai Lu
bb965401fd PCI: show dma_mask bits in /sys
So we can catch if the driver sets an incorrect dma_mask.

Reviewed-by: Grant Grundler <grundler@google.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-04 15:47:50 -08:00
Andreas Herrmann
e0cd516034 PCI: derive nearby CPUs from device's instead of bus' NUMA information
In case of AMD CPU northbridge functions this NUMA information might
differ.  Here is an example from a 4-socket system.

Currently Linux shows

  root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node
  0
  root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu*
  0-3
  00000000,0000000f

which is not correct for northbridge functions as the local CPUs
are those of the same socket.

With this patch and a quirk for AMD CPU NB functions Linux can
do better and correctly show

  root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat numa_node
  2
  root@hagen:/sys/devices/pci0000:00/0000:00:1a.4# cat local_cpu*
  8-11
  00000000,00000f00

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-11-06 14:09:15 -08:00
Michael S. Tsirkin
711d57796f PCI: expose function reset capability in sysfs
Some devices allow an individual function to be reset without affecting
other functions in the same device: that's what pci_reset_function does.
For devices that have this support, expose reset attribite in sysfs.

This is useful e.g. for virtualization, where a qemu userspace
process wants to reset the device when the guest is reset,
to emulate machine reboot as closely as possible.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:24 -07:00
Randy Dunlap
cffb2fafb7 docbooks: add/fix PCI kernel-doc
Add drivers/pci/*.c source files to DocBook/kernel-api.tmpl
and update those pci/*.c source files that need kernel-doc fixes.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-22 14:49:33 -07:00
Alex Chiang
c2ac7cdc67 PCI: allow PCI core hotplug to remove PCI root bus
There is no reason to prevent removal of root bus devices. A subsequent
rescan will find them just fine.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-06 11:30:02 -07:00
Yuji Shimada
296ccb086d PCI: Setup disabled bridges even if buses are added
This patch sets up disabled bridges even if buses have already been
added.

pci_assign_unassigned_resources is called after buses are added.
pci_assign_unassigned_resources calls pci_bus_assign_resources.
pci_bus_assign_resources calls pci_setup_bridge to configure BARs of
bridges.

Currently pci_setup_bridge returns immediately if the bus have already
been added. So pci_assign_unassigned_resources can't configure BARs of
bridges that were added in a disabled state; this patch fixes the issue.

On logical hot-add, we need to prevent the kernel from re-initializing
bridges that have already been initialized. To achieve this,
pci_setup_bridge returns immediately if the bridge have already been
enabled.

We don't need to check whether the specified bus is a root bus or not.
pci_setup_bridge is not called on a root bus, because a root bus does
not have a bridge.

The patch adds a new helper function, pci_is_enabled. I made the
function name similar to pci_is_managed. The codes which use
enable_cnt directly are changed to use pci_is_enabled.

Acked-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Yuji Shimada <shimada-yxb@necst.nec.co.jp>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-04-06 11:25:06 -07:00
Alex Chiang
738a6396c2 PCI: Introduce /sys/bus/pci/devices/.../rescan
This interface allows the user to force a rescan of the device's
parent bus and all subordinate buses, and rediscover devices removed
earlier from this part of the device tree.

Cc: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:59:07 -07:00
Alex Chiang
77c27c7b49 PCI: Introduce /sys/bus/pci/devices/.../remove
This patch adds an attribute named "remove" to a PCI device's sysfs
directory.  Writing a non-zero value to this attribute will remove the PCI
device and any children of it.

Trent Piepho wrote the original implementation and documentation.

Thanks to Vegard Nossum for testing under kmemcheck and finding locking
issues with the sysfs interface.

Cc: Trent Piepho <xyzzy@speakeasy.org>
Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:58:48 -07:00
Alex Chiang
705b1aaa82 PCI: Introduce /sys/bus/pci/rescan
This interface allows the user to force a rescan of all PCI buses
in system, and rediscover devices that have been removed earlier.

pci_bus_attrs implementation from Trent Piepho.

Thanks to Vegard Nossum for discovering locking issues with the
sysfs interface.

Cc: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 14:57:58 -07:00
Dave Airlie
217f45de3d PCI: expose boot VGA device via sysfs.
X really would like to know which VGA device was considered the boot
device by the system. The x86 PCI fixups have support for discovering
this but we provide no way to expose it to userspace.

This adds a sysfs file per VGA class device which has the value 0 for
non the boot device or unknown, and 1 if the VGA device is the boot
device.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:17 -07:00
Ivan Kokshaysky
10a0ef39fb PCI/alpha: pci sysfs resources
This closes http://bugzilla.kernel.org/show_bug.cgi?id=10893
which is a showstopper for X development on alpha.

The generic HAVE_PCI_MMAP code (drivers/pci-sysfs.c) is not
very useful since we have to deal with three different types
of MMIO address spaces: sparse and dense mappings for old
ev4/ev5 machines and "normal" 1:1 MMIO space (bwx) for ev56 and
later.
Also "write combine" mappings are meaningless on alpha - roughly
speaking, alpha does write combining, IO reordering and other
optimizations by default, unless user splits IO accesses
with memory barriers.

I think the cleanest way to deal with resource files on alpha
is to convert the default no-op pci_create_resource_files() and
pci_remove_resource_files() for !HAVE_PCI_MMAP case into __weak
functions and override them with alpha specific ones.

Another alpha hook is needed for "legacy_" resource files
to handle sparse addressing (pci_adjust_legacy_attr).

With the "standard" resourceN files on ev56/ev6 libpciaccess
works "out of the box". Handling of resourceN_sparse/resourceN_dense
files on older machines obviously requires some userland work.

Sparse/dense stuff has been tested on sx164 (pca56/pyxis, normally
uses bwx IO) with the kernel hacked into "cia compatible" mode.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:36 -07:00
Timothy S. Nelson
97c44836cd PCI: return error on failure to read PCI ROMs
This patch makes the ROM reading code return an error to user space if
the size of the ROM read is equal to 0.

The patch also emits a warnings if the contents of the ROM are invalid,
and documents the effects of the "enable" file on ROM reading.

Signed-off-by: Timothy S. Nelson <wayland@wayland.id.au>
Acked-by: Alex Villacis-Lasso <a_villacis@palosanto.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-04 16:58:41 -08:00
Linus Torvalds
4e9b1c184c Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  [IA64] fix typo in cpumask_of_pcibus()
  x86: fix x86_32 builds for summit and es7000 arch's
  cpumask: use work_on_cpu in acpi-cpufreq.c for read_measured_perf_ctrs
  cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
  cpumask: use cpumask_var_t in acpi-cpufreq.c
  cpumask: use work_on_cpu in acpi/cstate.c
  cpumask: convert struct cpufreq_policy to cpumask_var_t
  cpumask: replace CPUMASK_ALLOC etc with cpumask_var_t
  x86: cleanup remaining cpumask_t ops in smpboot code
  cpumask: update pci_bus_show_cpuaffinity to use new cpumask API
  cpumask: update local_cpus_show to use new cpumask API
  ia64: cpumask fix for is_affinity_mask_valid()
2009-01-10 06:12:18 -08:00
Stephen Hemminger
287d19ce2e PCI: revise VPD access interface
Change PCI VPD API which was only used by sysfs to something usable
in drivers.
   * move iteration over multiple words to the low level
   * use conventional types for arguments
   * add exportable wrapper

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:17 -08:00
Yu Zhao
fde09c6d8f PCI: define PCI resource names in an 'enum'
This patch moves all definitions of the PCI resource names to an 'enum',
and also replaces some hard-coded resource variables with symbol
names. This change eases introduction of device specific resources.

Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:13:01 -08:00
Trent Piepho
92425a405e PCI: Make settable sysfs attributes more consistent
PCI devices have three settable boolean attributes, enable,
broken_parity_status, and msi_bus.

The store functions for these would silently interpret "0x01" as false,
"1llogical" as true, and "true" would be (silently!) ignored and do
nothing.

This is inconsistent with typical sysfs handling of settable attributes,
and just plain doesn't make much sense.

So, use strict_strtoul(), which was created for this purpose.  The store
functions will treat a value of 0 as false, non-zero as true, and return
-EINVAL for a parse failure.

Additionally, is_enabled_store() and msi_bus_store() return -EPERM if
CAP_SYS_ADMIN is lacking, rather than silently doing nothing.  This is more
typical behavior for sysfs attributes that need a capability.

And msi_bus_store() will only print the "forced subordinate bus ..."
warning if the MSI flag was actually forced to a different value.

Signed-off-by: Trent Piepho <xyzzy@speakeasy.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:58 -08:00
Arjan van de Ven
e8de1481fd resource: allow MMIO exclusivity for device drivers
Device drivers that use pci_request_regions() (and similar APIs) have a
reasonable expectation that they are the only ones accessing their device.
As part of the e1000e hunt, we were afraid that some userland (X or some
bootsplash stuff) was mapping the MMIO region that the driver thought it
had exclusively via /dev/mem or via various sysfs resource mappings.

This patch adds the option for device drivers to cause their reserved
regions to the "banned from /dev/mem use" list, so now both kernel memory
and device-exclusive MMIO regions are banned.
NOTE: This is only active when CONFIG_STRICT_DEVMEM is set.

In addition to the config option, a kernel parameter iomem=relaxed is
provided for the cases where developers want to diagnose, in the field,
drivers issues from userspace.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:32 -08:00
Jesse Barnes
9eff02e204 PCI: check mmap range of /proc/bus/pci files too
/proc/bus/pci allows you to mmap resource ranges too, so we should probably be
checking to make sure the mapping is somewhat valid.  Uses the same code as the recent sysfs mmap range checking patch from Linus.

Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:20 -08:00
Mike Travis
3be83050d0 cpumask: update local_cpus_show to use new cpumask API
Impact: use new cpumask API to reduce stack usage

Replace the local cpumask_t variable with a pointer to the
const cpumask that needs to be printed.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-04 15:39:25 +01:00
Rusty Russell
29c0177e6a cpumask: change cpumask_scnprintf, cpumask_parse_user, cpulist_parse, and cpulist_scnprintf to take pointers.
Impact: change calling convention of existing cpumask APIs

Most cpumask functions started with cpus_: these have been replaced by
cpumask_ ones which take struct cpumask pointers as expected.

These four functions don't have good replacement names; fortunately
they're rarely used, so we just change them over.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
2008-12-13 21:20:25 +10:30
Ed Swierk
88e7df0b7e PCI: fix range check on mmapped sysfs resource files
pci_mmap_fits() returns the wrong answer if the sysfs resource file size
is not a multiple of the page size.  vm_end and vm_start are already
page-aligned, so size - start < nr, causing mmap() to return EINVAL.

Signed-off-by: Ed Swierk <eswierk@aristanetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-11-03 14:41:16 -08:00
Benjamin Herrenschmidt
f19aeb1f36 PCI: Add ability to mmap legacy_io on some platforms
This adds the ability to mmap legacy IO space to the legacy_io files
in sysfs on platforms that support it. This will allow to clean up
X to use this instead of /dev/mem for legacy IO accesses such as
those performed by Int10.

While at it I moved pci_create/remove_legacy_files() to pci-sysfs.c
where I think they belong, thus making more things statis in there
and cleaned up some spurrious prototypes in the ia64 pci.h file

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20 11:01:46 -07:00
Zhao, Yu
280c73d369 PCI: centralize the capabilities code in pci-sysfs.c
This patch centralizes functions used to add and remove sysfs entries
for various capabilities. With this cleanup, the code is more readable
and easier for adding new capability related functions.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20 10:54:31 -07:00
Zhao, Yu
557848c3c0 PCI: replace cfg space size (256/4096) by macros.
This is a cleanup that changes all PCI configuration space size
representations to the macros (PCI_CFG_SPACE_SIZE and
PCI_CFG_SPACE_EXP_SIZE). And the macros are also moved from
drivers/pci/probe.c to drivers/pci/pci.h.

Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20 10:54:29 -07:00
Linus Torvalds
b5ff7df3df Check mapped ranges on sysfs resource files
This is loosely based on a patch by Jesse Barnes to check the user-space
PCI mappings though the sysfs interfaces.  Quoting Jesse's original
explanation:

  It's fairly common for applications to map PCI resources through sysfs.
  However, with the current implementation, it's possible for an application
  to map far more than the range corresponding to the resourceN file it
  opened.  This patch plugs that hole by checking the range at mmap time,
  similar to what is done on platforms like sparc64 in their lower level
  PCI remapping routines.

  It was initially put together to help debug the e1000e NVRAM corruption
  problem, since we initially thought an X driver might be walking past the
  end of one of its mappings and clobbering the NVRAM.  It now looks like
  that's not the case, but doing the check is still important for obvious
  reasons.

and this version of the patch differs in that it uses a helper function
to clarify the code, and does all the checks in pages (instead of bytes)
in order to avoid overflows when doing "<< PAGE_SHIFT" etc.

Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-02 18:52:51 -07:00
Benjamin Li
99cb233d60 PCI: Limit VPD read/write lengths for Broadcom 5706, 5708, 5709 rev.
For Broadcom 5706, 5708, 5709 rev. A nics, any read beyond the
VPD end tag will hang the device.  This problem was initially
observed when a vpd entry was created in sysfs
('/sys/bus/pci/devices/<id>/vpd').   A read to this sysfs entry
will dump 32k of data.  Reading a full 32k will cause an access
beyond the VPD end tag causing the device to hang.  Once the device
is hung, the bnx2 driver will not be able to reset the device.
We believe that it is legal to read beyond the end tag and
therefore the solution is to limit the read/write length.

A majority of this patch is from Matthew Wilcox who gave code for
reworking the PCI vpd size information.  A PCI quirk added for the
Broadcom NIC's to limit the read/write's.

Signed-off-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-02 11:25:54 -07:00
Ben Hutchings
a94c248113 PCI: Restrict VPD read permission to root
Some PCI devices will lock up if we attempt to read from VPD addresses
beyond some device-dependent limit.  Until we can identify these
devices and adjust the file size accordingly, only let root read VPD
through sysfs to prevent a DoS by normal users.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-01 09:51:53 -07:00
Jesse Barnes
81d5575a48 PCI: fixup write combine comment in pci_mmap_resource
Now that we can actually do write combining properly, there's no need to have
the FIXME.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-06-12 13:51:46 -07:00
venkatesh.pallipadi@intel.com
45aec1ae72 x86: PAT export resource_wc in pci sysfs
For the ranges with IORESOURCE_PREFETCH, export a new resource_wc interface in
pci /sysfs along with resource (which is uncached).

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-12 10:12:42 +02:00
Linus Torvalds
bda0c0afa7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6: (42 commits)
  PCI: Change PCI subsystem MAINTAINER
  PCI: pci-iommu-iotlb-flushing-speedup
  PCI: pci_setup_bridge() mustn't be __devinit
  PCI: pci_bus_size_cardbus() mustn't be __devinit
  PCI: pci_scan_device() mustn't be __devinit
  PCI: pci_alloc_child_bus() mustn't be __devinit
  PCI: replace remaining __FUNCTION__ occurrences
  PCI: Hotplug: fakephp: Return success, not ENODEV, when bus rescan is triggered
  PCI: Hotplug: Fix leaks in IBM Hot Plug Controller Driver - ibmphp_init_devno()
  PCI: clean up resource alignment management
  PCI: aerdrv_acpi.c: remove unneeded NULL check
  PCI: Update VIA CX700 quirk
  PCI: Expose PCI VPD through sysfs
  PCI: iommu: iotlb flushing
  PCI: simplify quirk debug output
  PCI: iova RB tree setup tweak
  PCI: parisc: use generic pci_enable_resources()
  PCI: ppc: use generic pci_enable_resources()
  PCI: powerpc: use generic pci_enable_resources()
  PCI: ia64: use generic pci_enable_resources()
  ...
2008-04-21 15:58:35 -07:00
Ben Hutchings
94e6108803 PCI: Expose PCI VPD through sysfs
Vital Product Data (VPD) may be exposed by PCI devices in several
ways.  It is generally unsafe to read this information through the
existing interfaces to user-land because of stateful interfaces.

This adds:
- abstract operations for VPD access (struct pci_vpd_ops)
- VPD state information in struct pci_dev (struct pci_vpd)
- an implementation of the VPD access method specified in PCI 2.2
  (in access.c)
- a 'vpd' binary file in sysfs directories for PCI devices with VPD
  operations defined

It adds a probe for PCI 2.2 VPD in pci_scan_device() and release of
VPD state in pci_release_dev().

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:07 -07:00
Shaohua Li
7d715a6c1a PCI: add PCI Express ASPM support
PCI Express ASPM defines a protocol for PCI Express components in the D0
state to reduce Link power by placing their Links into a low power state
and instructing the other end of the Link to do likewise. This
capability allows hardware-autonomous, dynamic Link power reduction
beyond what is achievable by software-only controlled power management.
However, The device should be configured by software appropriately.
Enabling ASPM will save power, but will introduce device latency.

This patch adds ASPM support in Linux. It introduces a global policy for
ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
it. The interface can be used as a boot option too. Currently we have
below setting:
        -default, BIOS default setting
        -powersave, highest power saving mode, enable all available ASPM
state and clock power management
        -performance, highest performance, disable ASPM and clock power
management
By default, the 'default' policy is used currently.

In my test, power difference between powersave mode and performance mode
is about 1.3w in a system with 3 PCIE links.

Note: some devices might not work well with aspm, either because chipset
issue or device issue. The patch provide API (pci_disable_link_state),
driver can disable ASPM for specific device.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-04-20 21:47:03 -07:00
Mike Travis
39106dcf85 cpumask: use new cpus_scnprintf function
* Cleaned up references to cpumask_scnprintf() and added new
    cpulist_scnprintf() interfaces where appropriate.

  * Fix some small bugs (or code efficiency improvments) for various uses
    of cpumask_scnprintf.

  * Clean up some checkpatch errors.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-19 19:44:59 +02:00
Greg Kroah-Hartman
cc3a1378b4 Revert "PCI: PCIE ASPM support"
This reverts commit 6c723d5bd8.

It caused build errors on non-x86 platforms, config file confusion, and
even some boot errors on some x86-64 boxes.  All around, not quite ready
for prime-time :(

Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-02 11:32:01 -08:00
Greg Kroah-Hartman
fd7d1ced29 PCI: make pci_bus a struct device
This moves the pci_bus class device to be a real struct device and at
the same time, place it in the device tree in the correct location.

Note, the old "bridge" symlink is now gone, but this was a non-standard
link and no userspace program used it.  If you need to determine the
device that the bus is on, follow the standard device symlink, or walk
up the device tree.


Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-01 15:04:31 -08:00
Shaohua Li
6c723d5bd8 PCI: PCIE ASPM support
PCI Express ASPM defines a protocol for PCI Express components in the D0
state to reduce Link power by placing their Links into a low power state
and instructing the other end of the Link to do likewise. This
capability allows hardware-autonomous, dynamic Link power reduction
beyond what is achievable by software-only controlled power management.
However, The device should be configured by software appropriately.
Enabling ASPM will save power, but will introduce device latency.

This patch adds ASPM support in Linux. It introduces a global policy for
ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
it. The interface can be used as a boot option too. Currently we have
below setting:
        -default, BIOS default setting
        -powersave, highest power saving mode, enable all available ASPM
state
and clock power management
        -performance, highest performance, disable ASPM and clock power
management
By default, the 'default' policy is used currently.

In my test, power difference between powersave mode and performance mode
is about 1.3w in a system with 3 PCIE links.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-01 15:04:30 -08:00
Julia Lawall
151fc5dfc8 PCI: drivers/pci/pci-sysfs.c: Add missing pci_dev_put
There should be a pci_dev_put when breaking out of a loop that iterates
over calls to pci_get_device and similar functions.

This was fixed using the following semantic patch.

// <smpl>
@@
identifier d;
type T;
expression e;
iterator for_each_pci_dev;
@@

T *d;
...
for_each_pci_dev(d)
  {... when != pci_dev_put(d)
       when != e = d
(
   return d;
|
+  pci_dev_put(d);
?  return ...;
)
...}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-28 14:35:26 -08:00
Alexey Dobriyan
aa0ac36518 Remove capability.h from mm.h
I forgot to remove capability.h from mm.h while removing sched.h!  This
patch remedies that, because the only inline function which was using
CAP_something was made out of line.

Cross-compile tested without regressions on:

	all powerpc defconfigs
	all mips defconfigs
	all m68k defconfigs
	all arm defconfigs
	all ia64 defconfigs

	alpha alpha-allnoconfig alpha-defconfig alpha-up
	arm
	i386 i386-allnoconfig i386-defconfig i386-up
	ia64 ia64-allnoconfig ia64-defconfig ia64-up
	m68k
	mips
	parisc parisc-allnoconfig parisc-defconfig parisc-up
	powerpc powerpc-up
	s390 s390-allnoconfig s390-defconfig s390-up
	sparc sparc-allnoconfig sparc-defconfig sparc-up
	sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up
	um-x86_64
	x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:45 -07:00
Linus Torvalds
21ba0f88ae Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
  PCI: Only build PCI syscalls on architectures that want them
  PCI: limit pci_get_bus_and_slot to domain 0
  PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
  PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
  PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
  PCI: hotplug: pciehp: wait for 1 second after power off slot
  PCI: pci_set_power_state(): check for PM capabilities earlier
  PCI: cpci_hotplug: Convert to use the kthread API
  PCI: add pci_try_set_mwi
  PCI: pcie: remove SPIN_LOCK_UNLOCKED
  PCI: ROUND_UP macro cleanup in drivers/pci
  PCI: remove pci_dac_dma_... APIs
  PCI: pci-x-pci-express-read-control-interfaces cleanups
  PCI: Fix typo in include/linux/pci.h
  PCI: pci_ids, remove double or more empty lines
  PCI: pci_ids, add atheros and 3com_2 vendors
  PCI: pci_ids, reorder some entries
  PCI: i386: traps, change VENDOR to DEVICE
  PCI: ATM: lanai, change VENDOR to DEVICE
  PCI: Change all drivers to use pci_device->revision
  ...
2007-07-12 13:40:57 -07:00
Zhang Rui
91a6902958 sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
Well, first of all, I don't want to change so many files either.

What I do:
Adding a new parameter "struct bin_attribute *" in the
.read/.write methods for the sysfs binary attributes.

In fact, only the four lines change in fs/sysfs/bin.c and
include/linux/sysfs.h do the real work.
But I have to update all the files that use binary attributes
to make them compatible with the new .read and .write methods.
I'm not sure if I missed any. :(

Why I do this:
For a sysfs attribute, we can get a pointer pointing to the
struct attribute in the .show/.store method,
while we can't do this for the binary attributes.
I don't know why this is different, but this does make it not
so handy to use the binary attributes as the regular ones.
So I think this patch is reasonable. :)

Who benefits from it:
The patch that exposes ACPI tables in sysfs
requires such an improvement.
All the table binary attributes share the same .read method.
Parameter "struct bin_attribute *" is used to get
the table signature and instance number which are used to
distinguish different ACPI table binary attributes.

Without this parameter, we need to offer different .read methods
for different ACPI table binary attributes.
This is impossible as there are various ACPI tables on different
platforms, and we don't know what they are until they are loaded.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo
7b595756ec sysfs: kill unnecessary attribute->owner
sysfs is now completely out of driver/module lifetime game.  After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners.  Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner.  Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

  http://article.gmane.org/gmane.linux.kernel/510293

(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Michael Ellerman
a2cd52ca90 PCI: Make pcibios_add_platform_entries() return errors
Currently pcibios_add_platform_entries() returns void, but could fail,
so instead have it return an int and propagate errors up to
pci_create_sysfs_dev_files().

Fixes:
arch/powerpc/kernel/pci_64.c: In function 'pcibios_add_platform_entries':
arch/powerpc/kernel/pci_64.c:878: warning: ignoring return value of
	'device_create_file', declared with attribute warn_unused_result
arch/powerpc/kernel/pci_32.c: In function 'pcibios_add_platform_entries':
  arch/powerpc/kernel/pci_32.c:1043: warning: ignoring return value of
	'device_create_file', declared with attribute warn_unused_result

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:07 -07:00
Michael Ellerman
575e3348cb PCI: Use a weak symbol for the empty version of pcibios_add_platform_entries()
I'm not sure if this is going to fly, weak symbols work on the compilers I'm
using, but whether they work for all of the affected architectures I can't say.
I've cc'ed as many arch maintainers/lists as I could find.

But assuming they do, we can use a weak empty definition of
pcibios_add_platform_entries() to avoid having an empty definition on every
arch.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:07 -07:00
Michael Ellerman
9890b12a4a PCI: Free resource files in error path of pci_create_sysfs_dev_files()
pci_create_sysfs_dev_files() should call pci_remove_resource_files() in
its error path, to match the call it makes to pci_create_resource_files().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:43 -07:00