Commit Graph

289 Commits

Author SHA1 Message Date
Yijing Wang
a281b788d6 PCI/MSI: Retrieve first MSI IRQ from msi_desc rather than pci_dev
Retrieve the first MSI IRQ to compute the MSI index from struct msi_desc
rather than the struct pci_dev to avoid an additional memory access.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-16 14:44:49 -06:00
Yijing Wang
4cc901613b PCI/MSI: Remove unused function msi_remove_pci_irq_vectors()
msi_remove_pci_irq_vectors() is unused, so remove it.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-16 14:44:20 -06:00
Yijing Wang
d873b4d449 PCI/MSI: Add msi_setup_entry() to clean up MSI initialization
Move MSI entry setup into a new function, msi_setup_entry(), to simplify
msi_capability_init(), as the MSI-X path already does.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-16 14:42:07 -06:00
Yijing Wang
31ea5d4dfe PCI/MSI: Cache Multiple Message Capable in struct msi_desc
The Multiple Message Capable field in the MSI Message Control register
indicates how many vectors the device supports.  This field is read-only,
so cache it in msi_desc to avoid reading it repeatedly.

Since we cache the extracted field (not the entire Message Control
register), we can use msi_mask() instead of msi_capable_mask(), which is
then unused, so remove it.

[bhelgaas: fix whitespace, changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-03 16:55:07 -06:00
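
A minimal userspace sketch of the field extraction described in 31ea5d4dfe,
not the kernel code itself.  The mask value mirrors PCI_MSI_FLAGS_QMASK
(0x000e) from the PCI register definitions; the names sketch_multi_cap(),
sketch_max_vectors() and sketch_vector_mask() are hypothetical and exist only
for illustration.

```c
#include <stdint.h>

/* Multiple Message Capable occupies bits 3:1 of MSI Message Control. */
#define SKETCH_MSI_FLAGS_QMASK 0x000e

/* Extract the encoded field (log2 of the supported vector count). */
unsigned int sketch_multi_cap(uint16_t msgctl)
{
	return (msgctl & SKETCH_MSI_FLAGS_QMASK) >> 1;
}

/* Decode the field into a vector count: 1, 2, 4, 8, 16 or 32. */
unsigned int sketch_max_vectors(uint16_t msgctl)
{
	return 1u << sketch_multi_cap(msgctl);
}

/*
 * Assumed shape of a per-vector mask helper: one mask bit per vector,
 * given the cached log2 value.  The x >= 5 case is handled separately
 * because shifting a 32-bit value by 32 is undefined in C.
 */
uint32_t sketch_vector_mask(unsigned int x)
{
	if (x >= 5)
		return 0xffffffffu;
	return (1u << (1u << x)) - 1;
}
```

Because the field is read-only, the extracted value can be computed once and
stored; later mask computations then need only the cached log2 value rather
than the whole Message Control register.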
Yijing Wang
199596ef91 PCI/MSI: Remove unused msi_enabled_mask()
No one uses msi_enabled_mask(); remove the dead code.  No functional
change.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-03 16:54:10 -06:00
Yijing Wang
66f0d0c40c PCI/MSI: Add internal msix_clear_and_set_ctrl() function
Add msix_clear_and_set_ctrl() to simplify the code.  No functional change.

[bhelgaas: fix whitespace]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-03 16:48:59 -06:00
Bjorn Helgaas
38a6148248 Merge branches 'pci/msi', 'pci/iommu' and 'pci/cleanup' into next
* pci/msi:
  PCI/MSI: Fix memory leak in free_msi_irqs()

* pci/iommu:
  PCI: Add function 1 DMA alias quirk for HighPoint RocketRaid 642L
  PCI: Add bridge DMA alias quirk for ITE bridge

* pci/cleanup:
  PCI: Merge multi-line quoted strings
  PCI: Whitespace cleanup
  PCI: Move EXPORT_SYMBOL so it immediately follows function/variable
2014-06-11 14:38:25 -06:00
Alexei Starovoitov
b701c0b1fe PCI/MSI: Fix memory leak in free_msi_irqs()
free_msi_irqs() is leaking memory, since list_for_each_entry(entry,
&dev->msi_list, list) {...} is never executed, because dev->msi_list is
made empty by the loop just above this one.

Fix it by relying on zero termination of attribute array like
populate_msi_sysfs() does.

Fixes: 1c51b50c29 ("PCI/MSI: Export MSI mode using attributes, not kobjects")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: stable@vger.kernel.org	# v3.14+
2014-06-11 11:13:19 -06:00
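
A hedged userspace sketch of the cleanup idea in b701c0b1fe: walk a
zero-terminated pointer array instead of a list that may already be empty.
struct sketch_attr and sketch_free_attrs() are hypothetical stand-ins, not the
kernel's types or functions.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an attribute with a heap-allocated name. */
struct sketch_attr {
	char *name;
};

/*
 * Free every entry of a NULL-terminated attribute array, then the array
 * itself.  The terminator, not a separate list, bounds the loop, so the
 * cleanup works even when other bookkeeping has already been torn down.
 * Returns the number of entries freed.
 */
int sketch_free_attrs(struct sketch_attr **attrs)
{
	int count = 0;

	if (!attrs)
		return 0;

	for (int i = 0; attrs[i]; i++) {
		free(attrs[i]->name);
		free(attrs[i]);
		count++;
	}
	free(attrs);
	return count;
}
```

The array must be allocated with one extra zeroed slot (for example via
calloc()) so that the terminator is always present.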
Ryan Desfosses
227f064705 PCI: Merge multi-line quoted strings
Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.

No functional change.

[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-10 20:20:42 -06:00
Bjorn Helgaas
e5558d1a51 Merge branches 'dma-api', 'pci/virtualization', 'pci/msi', 'pci/misc' and 'pci/resource' into next
* dma-api:
  iommu/exynos: Remove unnecessary "&" from function pointers
  DMA-API: Update dma_pool_create ()and dma_pool_alloc() descriptions
  DMA-API: Fix duplicated word in DMA-API-HOWTO.txt
  DMA-API: Capitalize "CPU" consistently
  sh/PCI: Pass GAPSPCI_DMA_BASE CPU & bus address to dma_declare_coherent_memory()
  DMA-API: Change dma_declare_coherent_memory() CPU address to phys_addr_t
  DMA-API: Clarify physical/bus address distinction

* pci/virtualization:
  PCI: Mark RTL8110SC INTx masking as broken

* pci/msi:
  PCI/MSI: Remove pci_enable_msi_block()

* pci/misc:
  PCI: Remove pcibios_add_platform_entries()
  s390/pci: use pdev->dev.groups for attribute creation
  PCI: Move Open Firmware devspec attribute to PCI common code

* pci/resource:
  PCI: Add resource allocation comments
  PCI: Simplify __pci_assign_resource() coding style
  PCI: Change pbus_size_mem() return values to be more conventional
  PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources
  PCI: Support BAR sizes up to 8GB
  resources: Clarify sanity check message
  PCI: Don't add disabled subtractive decode bus resources
  PCI: Don't print anything while decoding is disabled
  PCI: Don't set BAR to zero if dma_addr_t is too small
  PCI: Don't convert BAR address to resource if dma_addr_t is too small
  PCI: Reject BAR above 4GB if dma_addr_t is too small
  PCI: Fail safely if we can't handle BARs larger than 4GB
  x86/gart: Tidy messages and add bridge device info
  x86/gart: Replace printk() with pr_info()
  x86/PCI: Move pcibios_assign_resources() annotation to definition
  x86/PCI: Mark ATI SBx00 HPET BAR as IORESOURCE_PCI_FIXED
  x86/PCI: Don't try to move IORESOURCE_PCI_FIXED resources
  x86/PCI: Fix Broadcom CNB20LE unintended sign extension
2014-05-26 17:29:17 -06:00
Alexander Gordeev
034cd97ebd PCI/MSI: Remove pci_enable_msi_block()
No users of the pci_enable_msi_block() function remain.  Remove it in favor
of the pci_enable_msi_range() and pci_enable_msi_exact() functions.

Previously, we called arch_setup_msi_irqs() once, requesting the same
vector count we passed to arch_msi_check_device().  Now we may call it
several times: if it returns failure, we may retry and request fewer
vectors.

We don't keep track of the vector count we initially passed to
arch_msi_check_device().  We only keep track of the number of vectors
successfully set up by arch_setup_msi_irqs(), and this is what we use to
clean things up when disabling MSI.  Therefore, we assume that
arch_msi_check_device() does nothing that will have to be cleaned up later.

[bhelgaas: changelog]
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-04-30 16:56:47 -06:00
Bjorn Helgaas
518a6a34f6 Merge branches 'pci/hotplug', 'pci/msi', 'pci/virtualization' and 'pci/misc' into next
* pci/hotplug:
  PCI: rphahp: Fix endianess issues
  PCI: Allow hotplug service drivers to operate in polling mode
  PCI: pciehp: Acknowledge spurious "cmd completed" event
  PCI: pciehp: Use PCI_EXP_SLTCAP_PSN define
  PCI: hotplug: Remove unnecessary "dev->bus" test

* pci/msi:
  GenWQE: Use pci_enable_msi_exact() instead of pci_enable_msi_block()
  PCI/MSI: Simplify populate_msi_sysfs()
  PCI/portdrv: Use pci_enable_msix_exact() instead of pci_enable_msix()

* pci/virtualization:
  PCI: Add Patsburg (X79) to Intel PCH root port ACS quirk

* pci/misc:
  PCI: Fix use of uninitialized MPS value
  PCI: Remove dead code
  MAINTAINERS: Add arch/x86/kernel/quirks.c to PCI file patterns
  PCI: Remove unnecessary __ref annotations
  PCI: Fail new_id for vendor/device values already built into driver
  PCI: Add new ID for Intel GPU "spurious interrupt" quirk
  PCI: Update my email address
  PCI: Fix incorrect vgaarb conditional in WARN_ON()
  PCI: Use designated initialization in PCI_VDEVICE
  PCI: Remove old serial device IDs
  PCI: Remove unnecessary includes of <linux/init.h>
  powerpc/PCI: Fix NULL dereference in sys_pciconfig_iobase() list traversal
2014-04-29 17:43:58 -06:00
Paul Gortmaker
56a3d18279 PCI: Remove unnecessary includes of <linux/init.h>
None of these files actually use any __init-type directives, and hence they
don't need to include <linux/init.h>.  Most are just leftovers from the
__devinit and __cpuinit removal, or simply due to code getting copied from
one driver to the next.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-04-14 16:12:37 -06:00
Jan Beulich
1406276c12 PCI/MSI: Simplify populate_msi_sysfs()
Simplify populate_msi_sysfs() by

  - Swapping the order of the two allocations and storing the
    msi_dev_attr-derived pointer right after allocation, allowing the
    cleanup code to pick things up without extra effort.

  - Using kasprintf() instead of the kmalloc()/sprintf() pair.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-04-14 15:15:34 -06:00
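
kasprintf() is a kernel-only helper, so the following is a userspace analogy
rather than the code from 1406276c12.  It shows why a single
allocate-and-format call is simpler than a separate kmalloc()/sprintf() pair:
the buffer size is derived from the format itself.  sketch_asprintf() is a
hypothetical name.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated buffer of exactly the right size.
 * Returns NULL on formatting or allocation failure; the caller frees
 * the result.
 */
char *sketch_asprintf(const char *fmt, ...)
{
	va_list ap, ap2;
	char *buf;
	int len;

	va_start(ap, fmt);
	va_copy(ap2, ap);

	/* First pass: measure the formatted length without writing. */
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0) {
		va_end(ap2);
		return NULL;
	}

	/* Second pass: allocate and format for real. */
	buf = malloc((size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return buf;
}
```

With this shape there is only one pointer to check and one to free on the
error path, which is the kind of simplification the commit describes.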
Masanari Iida
75ce2d53ce PCI/MSI: Fix pci_msix_vec_count() htmldocs failure
An empty line in msi.c caused "make htmldocs" failure:

  Warning(/home/iida/Repo/linux-next//drivers/pci/msi.c:962): bad line:

Fixes: ff1aa430a2 ("PCI/MSI: Add pci_msix_vec_count()")
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-02-13 10:47:44 -07:00
Greg Kroah-Hartman
2923775647 PCI/MSI: Fix leak of msi_attrs
Coverity reported that I forgot to clean up some allocated memory on the
error path in populate_msi_sysfs(), so this patch fixes that.

Thanks to Dave Jones for pointing out where the error was, I obviously
can't read code this morning...

Found by Coverity (CID 1163317).

Fixes: 1c51b50c29 ("PCI/MSI: Export MSI mode using attributes, not kobjects")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Dave Jones <davej@redhat.com>
2014-02-13 10:47:35 -07:00
Greg Kroah-Hartman
86bb4f697a PCI/MSI: Check kmalloc() return value, fix leak of name
Coverity reported that I forgot to check the return value of kmalloc() when
creating the MSI attribute name, so fix that up and properly free it if
there is an error when allocating the msi_dev_attr variable.

Found by Coverity (CID 1163315 and 1163316).

Fixes: 1c51b50c29 ("PCI/MSI: Export MSI mode using attributes, not kobjects")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-02-13 10:47:20 -07:00
Bjorn Helgaas
04f982beb9 Merge branch 'pci/msi' into next
* pci/msi:
  PCI/MSI: Add pci_enable_msi_range() and pci_enable_msix_range()
  PCI/MSI: Add pci_msix_vec_count()
  PCI/MSI: Remove pci_enable_msi_block_auto()
  PCI/MSI: Add pci_msi_vec_count()
2014-01-07 17:34:39 -07:00
Alexander Gordeev
302a2523c2 PCI/MSI: Add pci_enable_msi_range() and pci_enable_msix_range()
This adds pci_enable_msi_range(), which supersedes the pci_enable_msi()
and pci_enable_msi_block() MSI interfaces.

It also adds pci_enable_msix_range(), which supersedes the
pci_enable_msix() MSI-X interface.

The old interfaces have three categories of return values:

    negative: failure; caller should not retry
    positive: failure; value indicates number of interrupts that *could*
	have been allocated, and caller may retry with a smaller request
    zero: success; at least as many interrupts allocated as requested

It is error-prone to handle these three cases correctly in drivers.

The new functions return either a negative error code or a number of
successfully allocated MSI/MSI-X interrupts, which is expected to lead to
clearer device driver code.

pci_enable_msi(), pci_enable_msi_block() and pci_enable_msix() still exist
unchanged, but are deprecated and may be removed after callers are updated.

[bhelgaas: tweak changelog]
Suggested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2014-01-03 17:17:55 -07:00
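
The difference between the two return conventions in 302a2523c2 can be
sketched in userspace.  Both functions below are mocks: sketch_enable_block()
imitates the old three-way return value against an assumed number of available
vectors, and sketch_enable_range() wraps it in a retry loop so the caller sees
only "negative error" or "number allocated".  Neither is the kernel
implementation.

```c
#include <errno.h>

/*
 * Mock of the old convention:
 *   negative: failure, do not retry
 *   positive: failure, value is how many vectors could be allocated
 *   zero:     success, nvec vectors allocated
 * 'avail' is an assumed count of vectors the platform can provide.
 */
int sketch_enable_block(int avail, int nvec)
{
	if (avail < 1 || nvec < 1)
		return -ENOSPC;
	if (nvec > avail)
		return avail;
	return 0;
}

/*
 * Mock of the new convention: try maxvec first, fall back to whatever
 * the old-style call says is possible, and give up if that drops below
 * minvec.  Returns the number of vectors allocated or a negative errno.
 */
int sketch_enable_range(int avail, int minvec, int maxvec)
{
	int nvec = maxvec;
	int rc;

	if (maxvec < minvec)
		return -ERANGE;

	do {
		rc = sketch_enable_block(avail, nvec);
		if (rc < 0)
			return rc;
		if (rc > 0) {
			if (rc < minvec)
				return -ENOSPC;
			nvec = rc;	/* retry with the smaller count */
		}
	} while (rc);

	return nvec;
}
```

A caller of the range-style function only has to test for a negative result;
the retry logic that drivers previously had to open-code lives in one place.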
Alexander Gordeev
ff1aa430a2 PCI/MSI: Add pci_msix_vec_count()
This creates an MSI-X counterpart for pci_msi_vec_count().  Device drivers
can use this function to obtain the maximum number of MSI-X interrupts the
device supports and use that number in a subsequent call to
pci_enable_msix().

pci_msix_vec_count() supersedes pci_msix_table_size() and returns a
negative errno if the device does not support MSI-X interrupts.  After this
update, callers must always check the returned value.

The only user of pci_msix_table_size() was the PCI-Express port driver,
which is also updated by this change.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2014-01-03 17:17:55 -07:00
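
A small sketch of the calling convention described in ff1aa430a2, under the
assumption that the table size is encoded as N-1 in the low 11 bits of the
MSI-X Message Control register (mask 0x07ff, as in PCI_MSIX_FLAGS_QSIZE).
sketch_msix_vec_count() and its has_cap parameter are hypothetical, and the
choice of -EINVAL as the error value is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Table Size field: bits 10:0 of MSI-X Message Control, encoded as N-1. */
#define SKETCH_MSIX_FLAGS_QSIZE 0x07ff

/*
 * Return the number of MSI-X vectors the device advertises, or a
 * negative errno if the device has no MSI-X capability.  'has_cap'
 * stands in for a check of the cached capability offset.
 */
int sketch_msix_vec_count(int has_cap, uint16_t control)
{
	if (!has_cap)
		return -EINVAL;
	return (control & SKETCH_MSIX_FLAGS_QSIZE) + 1;
}
```

Since the function can now fail, a caller has to check for a negative result
before using the value as a vector count.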
Alexander Gordeev
7b92b4f61e PCI/MSI: Remove pci_enable_msi_block_auto()
The new pci_msi_vec_count() interface makes pci_enable_msi_block_auto()
superfluous.

Drivers can use pci_msi_vec_count() to learn the maximum number of MSIs
supported by the device, and then call pci_enable_msi_block().

pci_enable_msi_block_auto() was introduced recently, and its only user is
the AHCI driver, which is also updated by this change.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
2014-01-03 17:17:55 -07:00
Alexander Gordeev
d1ac1d2622 PCI/MSI: Add pci_msi_vec_count()
Device drivers can use this interface to obtain the maximum number of MSI
interrupts the device supports and use that number, e.g., in a subsequent
call to pci_enable_msi_block().

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2014-01-03 17:17:55 -07:00
Bjorn Helgaas
47e0ab3f39 Merge branch 'pci/msi' into next
* pci/msi:
  PCI/MSI: Make pci_enable_msi/msix() 'nvec' argument type as int
  PCI/MSI: Return -ENOSYS for unimplemented interfaces, not -1
  PCI/MSI: Return msix_capability_init() failure if populate_msi_sysfs() fails
  s390/PCI: Remove superfluous check of MSI type
  s390/PCI: Fix single MSI only check
  PCI/MSI: Export MSI mode using attributes, not kobjects
2013-12-20 12:41:40 -07:00
Alexander Gordeev
52179dc9ed PCI/MSI: Make pci_enable_msi/msix() 'nvec' argument type as int
Make pci_enable_msi_block(), pci_enable_msi_block_auto() and
pci_enable_msix() consistent with regard to the type of 'nvec' argument.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2013-12-20 09:45:05 -07:00
Alexander Gordeev
2adc7907ba PCI/MSI: Return msix_capability_init() failure if populate_msi_sysfs() fails
If populate_msi_sysfs() fails, msix_capability_init() must return the error
code, but it currently returns success instead.  This update fixes that
misbehaviour.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2013-12-20 09:45:05 -07:00
Greg Kroah-Hartman
1c51b50c29 PCI/MSI: Export MSI mode using attributes, not kobjects
The PCI MSI sysfs code is a mess with kobjects for things that don't really
need to be kobjects.  This patch creates attributes dynamically for the MSI
interrupts instead of using kobjects.

Note, this removes a directory from sysfs.  Old MSI kobjects:

  pci_device
     └── msi_irqs
         └── 40
             └── mode

New MSI attributes:

  pci_device
     └── msi_irqs
         └── 40

As there was only one file "mode" with the kobject model, the interrupt
number is now a file that returns the "mode" of the interrupt (msi vs.
msix).

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2013-12-19 15:14:52 -07:00
DuanZhenzhong
ac8344c4c0 PCI: Drop "irq" param from *_restore_msi_irqs()
Change x86_msi.restore_msi_irqs(struct pci_dev *dev, int irq) to
x86_msi.restore_msi_irqs(struct pci_dev *dev).

restore_msi_irqs() restores multiple MSI-X IRQs, so param 'int irq' is
unneeded.  This makes code more consistent between VM and bare metal.

Dom0 MSI-X restore code can also be optimized as Xen only has a hypercall
to restore all MSI-X vectors at one time.

Tested-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-12-13 08:44:30 -07:00
Bjorn Helgaas
f7625980f5 PCI: Fix whitespace, capitalization, and spelling errors
Fix whitespace, capitalization, and spelling errors.  No functional change.
I know "busses" is not an error, but "buses" was more common, so I used it
consistently.

Signed-off-by: Marta Rybczynska <rybczynska@gmail.com> (pci_reset_bridge_secondary_bus())
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-14 11:28:18 -07:00
Konrad Rzeszutek Wilk
0e4ccb1505 PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()
Certain platforms do not allow writes to the MSI-X BARs to set up or tear
down vector values.  Such writes from the generic code are either silently
ignored or crash because the page tables are marked R/O.  To avoid this,
this patch introduces a platform override.

Note that we keep two separate, non-weak, functions default_mask_msi_irqs()
and default_mask_msix_irqs() for the behavior of the arch_mask_msi_irqs()
and arch_mask_msix_irqs(), as the default behavior is needed by x86 PCI
code.

For Xen, which does not allow the guest to write to MSI-X tables - as the
hypervisor is solely responsible for setting the vector values - we
implement two nops.

This fixes a Xen guest crash when passing a PCI device with MSI-X to the
guest.  See the bugzilla for more details.

[bhelgaas: add bugzilla info]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=64581
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com>
CC: Zhenzhong Duan <zhenzhong.duan@oracle.com>
2013-11-06 16:32:19 -07:00
Yijing Wang
869a16157d PCI: Fail MSI/MSI-X initialization if device is not in PCI_D0
Currently, pci_enable_msi() and pci_enable_msix() return success even if
the device power state is not D0.  However, we don't write the MSI message
to the device registers, and the registers will never be updated later.

This patch makes pci_enable_msi() and pci_enable_msix() return an error
instead.

[bhelgaas: changelog]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-10-29 13:30:52 -06:00
Martin Schwidefsky
0244ad004a Remove GENERIC_HARDIRQ config option
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-09-13 15:09:52 +02:00
Thomas Petazzoni
6a4324ebf5 PCI: msi: add default MSI operations for !HAVE_GENERIC_HARDIRQS platforms
Some platforms (e.g., S390) don't use the generic hardirqs code and
therefore do not define HAVE_GENERIC_HARDIRQS. This prevents using
the irq_set_chip_data() and irq_get_chip_data() functions that are
used for the default implementations of the MSI operations.

So, when CONFIG_GENERIC_HARDIRQS is not enabled, provide another
default implementation of the MSI operations, that simply errors
out. The architecture is responsible for implementing those operations
(which is the case on S390), and cannot use the msi_chip infrastructure.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-13 15:16:30 +00:00
Thierry Reding
0cbdcfcf42 PCI: Introduce new MSI chip infrastructure
The new struct msi_chip is used to associate an MSI controller with a
PCI bus. It is automatically handed down from the root to its children
during bus enumeration.

This patch provides default (weak) implementations for the architecture-
specific MSI functions (arch_setup_msi_irq(), arch_teardown_msi_irq()
and arch_msi_check_device()) which check if a PCI device's bus has an
attached MSI chip and forward the call appropriately.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12 15:26:58 +00:00
Thomas Petazzoni
4287d824f2 PCI: use weak functions for MSI arch-specific functions
Until now, the MSI architecture-specific functions could be overloaded
using a fairly complex set of #define and compile-time
conditionals. In order to prepare for the introduction of the msi_chip
infrastructure, it is desirable to switch all those functions to use
the 'weak' mechanism. This commit converts all the architectures that
were overriding those MSI functions to use the new strategy.

Note that we keep two separate, non-weak, functions
default_teardown_msi_irqs() and default_restore_msi_irqs() for the
default behavior of the arch_teardown_msi_irqs() and
arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI
code.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Daniel Price <daniel.price@gmail.com>
Tested-by: Thierry Reding <thierry.reding@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: linux-s390@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: linux-ia64@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David S. Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
2013-08-12 15:26:39 +00:00
Alexander Gordeev
65f6ae66a6 PCI: Allocate only as many MSI vectors as requested by driver
Because of the encoding of the "Multiple Message Capable" and "Multiple
Message Enable" fields, a device can only advertise that it's capable of a
power-of-two number of vectors, and the OS can only enable a power-of-two
number.

For example, a device that's limited internally to using 18 vectors would
have to advertise that it's capable of 32.  The 14 extra vectors consume
vector numbers and IRQ descriptors even though the device can't actually
use them.

This fix introduces a 'msi_desc::nvec_used' field to address this issue.
When non-zero, it is the actual number of MSIs the device will send, as
requested by the device driver.  This value should be used by architectures
to set up and tear down only as many interrupt resources as the device will
actually use.

Note, although the existing 'msi_desc::multiple' field might seem
redundant, in fact it is not.  The number of MSIs advertised need not be
the smallest power-of-two larger than the number of MSIs the device will
send.  Thus, it is not always possible to derive the former from the
latter, so we need to keep them both to handle this case.

[bhelgaas: changelog, rename to "nvec_used"]
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-05-28 11:31:16 -06:00
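
The arithmetic in 65f6ae66a6 can be checked with a short sketch.  The helper
names are hypothetical; the only assumption taken from the commit message is
that the advertised count must be a power of two.

```c
/* Smallest power of two that is >= n (returns 1 for n == 0). */
unsigned int sketch_roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* The log2 value that would be written into the encoded field. */
unsigned int sketch_encode_log2(unsigned int n)
{
	unsigned int l = 0;

	while ((1u << l) < n)
		l++;
	return l;
}

/* Vectors advertised but never used when only nvec_used are needed. */
unsigned int sketch_wasted_vectors(unsigned int nvec_used)
{
	return sketch_roundup_pow2(nvec_used) - nvec_used;
}
```

For the example in the commit message, 18 vectors round up to 32 (encoded as
5), leaving 14 unused.  The rounding only gives the minimum a device must
advertise; a device may advertise more, which is why the advertised count
cannot be derived from nvec_used alone.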
Dan Carpenter
e5f66eafe5 PCI: Set ->mask_pos correctly
The "+" operation has higher precedence than "?:" and ->msi_cap is
always non-zero here so the original statement is equivalent to:

    entry->mask_pos = PCI_MSI_MASK_64;

Which wasn't the intent.

[bhelgaas: my fault from 78b5a310ce]
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-30 08:49:19 -07:00
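
The precedence problem fixed by e5f66eafe5 can be reproduced in isolation.
The two offsets below are assumed to match PCI_MSI_MASK_32 (12) and
PCI_MSI_MASK_64 (16); the function names are hypothetical.  The buggy variant
is written with explicit parentheses that show how the compiler groups the
unparenthesized expression "cap + is_64 ? A : B".

```c
#define SKETCH_MSI_MASK_32 12	/* mask register offset, 32-bit address */
#define SKETCH_MSI_MASK_64 16	/* mask register offset, 64-bit address */

/*
 * How "cap + is_64 ? MASK_64 : MASK_32" is actually parsed: the sum is
 * the condition, so a non-zero cap always selects MASK_64 and the
 * capability offset is lost from the result.
 */
int sketch_mask_pos_buggy(int cap, int is_64)
{
	return (cap + is_64) ? SKETCH_MSI_MASK_64 : SKETCH_MSI_MASK_32;
}

/* Intended meaning: pick the register offset, then add it to cap. */
int sketch_mask_pos_fixed(int cap, int is_64)
{
	return cap + (is_64 ? SKETCH_MSI_MASK_64 : SKETCH_MSI_MASK_32);
}
```

With a capability at offset 0x50, the buggy form yields 16 whether or not the
device uses 64-bit addressing, while the fixed form yields 0x50 + 12 or
0x50 + 16.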
Bjorn Helgaas
d4f09c5d7f Merge branch 'pci/gavin-msi-cleanup' into next
* pci/gavin-msi-cleanup:
  vfio-pci: Use cached MSI/MSI-X capabilities
  vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
  PCI: Remove "extern" from function declarations
  PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
  PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h
  PCI: Use msix_table_size() directly, drop multi_msix_capable()
  PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros
  PCI: Drop is_64bit_address() and is_mask_bit_support() macros
  PCI: Drop msi_data_reg() macro
  PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros
  PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly
  PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc
  PCI: Clean up MSI/MSI-X capability #defines
  PCI: Use cached MSI-X cap while enabling MSI-X
  PCI: Use cached MSI cap while enabling MSI interrupts
  PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()
  PCI: Cache MSI/MSI-X capability offsets in struct pci_dev
  PCI: Use u8, not int, for PM capability offset
  [SCSI] megaraid_sas: Use correct #define for MSI-X capability
2013-04-24 11:37:49 -06:00
Bjorn Helgaas
4d18760c67 PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
PCI_MSIX_FLAGS_BIRMASK is mis-named because the BIR mask is in the
Table Offset register, not the flags ("Message Control" per spec)
register.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
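
To illustrate why the BIR mask belongs to the Table Offset register, here is
a sketch of decoding that 32-bit register.  The mask values are assumed to
match PCI_MSIX_TABLE_BIR (0x00000007) and PCI_MSIX_TABLE_OFFSET (0xfffffff8);
the function names are hypothetical.

```c
#include <stdint.h>

/* Low three bits select the BAR; the rest is an 8-byte-aligned offset. */
#define SKETCH_MSIX_TABLE_BIR    0x00000007u
#define SKETCH_MSIX_TABLE_OFFSET 0xfffffff8u

/* Which BAR holds the MSI-X table. */
unsigned int sketch_msix_table_bir(uint32_t table_reg)
{
	return table_reg & SKETCH_MSIX_TABLE_BIR;
}

/* Byte offset of the table within that BAR. */
uint32_t sketch_msix_table_offset(uint32_t table_reg)
{
	return table_reg & SKETCH_MSIX_TABLE_OFFSET;
}
```

Both fields come from the same dword, which is separate from the 16-bit
Message Control ("flags") word, so a name tied to the flags register would
point readers at the wrong register.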
Bjorn Helgaas
78b5a310ce PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h
msi_mask_reg() doesn't provide any useful abstraction, so drop it.

Remove the now-empty drivers/pci/msi.h.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
527eee292d PCI: Use msix_table_size() directly, drop multi_msix_capable()
The users of multi_msix_capable() are really interested in the table
size, so just say what we mean.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
909094c62e PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros
msix_table_offset_reg() is used only once and adds a useless indirection,
so just use the table offset directly.

msix_pba_offset_reg() is unused, so just delete it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
4987ce8205 PCI: Drop is_64bit_address() and is_mask_bit_support() macros
is_64bit_address() and is_mask_bit_support() don't provide any useful
abstraction, so drop them.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
2f22134936 PCI: Drop msi_data_reg() macro
msi_data_reg() doesn't provide any useful abstraction, so drop it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
9925ad0cf1 PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros
msi_lower_address_reg() and msi_upper_address_reg() don't provide any
useful abstraction, so drop them.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
f84ecd285f PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly
Note the error in pci_msix_table_size() -- we used PCI_MSI_FLAGS to
locate the PCI_MSIX_FLAGS word.  No actual breakage because PCI_MSI_FLAGS
and PCI_MSIX_FLAGS happen to be the same.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
f5322169b4 PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc
We always know the type (MSI vs MSI-X), so we can use the correct
cached capability offset rather than relying on the copy in the
msi_attrib.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Gavin Shan
520fe9dc1b PCI: Use cached MSI-X cap while enabling MSI-X
The patch uses the cached MSI-X capability offset in
pci_dev instead of reading it from config space when enabling
MSI-X interrupts.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Gavin Shan
f465136d72 PCI: Use cached MSI cap while enabling MSI interrupts
The patch uses the cached MSI capability offset in pci_dev instead
of reading it from config space when enabling MSI interrupts.

[bhelgaas: removed unrelated msi_control_reg() changes]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Gavin Shan
cdf1fd4d90 PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()
The function pci_msi_check_device() is called while enabling MSI
or MSI-X interrupts to make sure the PCI device can support MSI
or MSI-X capability.  This patch removes the check on MSI or MSI-X
capability in the function and lets the caller do the check.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Gavin Shan
e375b56181 PCI: Cache MSI/MSI-X capability offsets in struct pci_dev
The patch caches the MSI and MSI-X capability offset in PCI device
(struct pci_dev) so that we needn't read it from the config space
upon enabling or disabling MSI or MSI-X interrupts.

[bhelgaas: moved pm_cap size change to separate patch]
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-23 09:50:30 -06:00
Bjorn Helgaas
9738abedd6 PCI: Make local functions/structs static
This fixes "no previous prototype" warnings found via "make W=1".

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-04-12 11:26:01 -06:00
Alexander Gordeev
08261d87f7 PCI/MSI: Enable multiple MSIs with pci_enable_msi_block_auto()
The new function pci_enable_msi_block_auto() tries to allocate
the maximum possible number of MSIs, up to the number the device
supports. It generalizes the pattern in which pci_enable_msi_block()
is called repeatedly until it succeeds or fails.

Unlike pci_enable_msi_block(), which takes the number of
MSIs to allocate as an input parameter,
pci_enable_msi_block_auto() can be used by device drivers to
obtain both the number of assigned MSIs and the number of MSIs the
device supports.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c3de2419df94a0f95ca1a6f755afc421486455e6.1353324359.git.agordeev@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-01-24 17:25:13 +01:00
Jan Glauber
9a4da8a5b1 s390/pci: PCI adapter interrupts for MSI/MSI-X
Support PCI adapter interrupts using the Single-IRQ-mode. Single-IRQ-mode
disables an adapter IRQ automatically after delivering it until the SIC
instruction enables it again. This is used to reduce the number of IRQs
for streaming workloads.

Up to 64 MSI handlers can be registered per PCI function.
A hash table is used to map interrupt numbers to MSI descriptors.
The interrupt vector is scanned using the flogr instruction.
Only MSI/MSI-X interrupts are supported, no legacy INTs.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-11-30 17:47:21 +01:00
Konrad Rzeszutek Wilk
76ccc29701 x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
The MSI restore function will become a function pointer in an
x86_msi_ops struct. It defaults to the implementation in the
io_apic.c and msi.c. We piggyback on the indirection mechanism
introduced by "x86: Introduce x86_msi_ops".

Cc: x86@kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 14:02:26 -08:00
Neil Horman
424eb39159 PCI: msi: fix imbalanced refcount of msi irq sysfs objects
This warning was recently reported to me:

------------[ cut here ]------------
WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
Hardware name: VMware Virtual Platform
kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
being called.
Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
vmw_pvscsi
Pid: 630, comm: modprobe Tainted: G        W   3.1.6-1.fc16.x86_64 #1
Call Trace:
 [<ffffffff8106b73f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff8106b836>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff810da293>] ? free_desc+0x63/0x70
 [<ffffffff812a9aa0>] kobject_put+0x50/0x60
 [<ffffffff812e4c25>] free_msi_irqs+0xd5/0x120
 [<ffffffff812e524c>] pci_enable_msi_block+0x24c/0x2c0
 [<ffffffffa017c273>] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
 [<ffffffffa0182e94>] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
 [<ffffffff812d141c>] local_pci_probe+0x5c/0xd0
 [<ffffffff812d2cb9>] pci_device_probe+0x109/0x130
 [<ffffffff8138ba2c>] driver_probe_device+0x9c/0x2b0
 [<ffffffff8138bceb>] __driver_attach+0xab/0xb0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138bc40>] ? driver_probe_device+0x2b0/0x2b0
 [<ffffffff8138a8ac>] bus_for_each_dev+0x5c/0x90
 [<ffffffff8138b63e>] driver_attach+0x1e/0x20
 [<ffffffff8138b240>] bus_add_driver+0x1b0/0x2a0
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff8138c246>] driver_register+0x76/0x140
 [<ffffffff815ca414>] ? printk+0x51/0x53
 [<ffffffffa0188000>] ? 0xffffffffa0187fff
 [<ffffffff812d2996>] __pci_register_driver+0x56/0xd0
 [<ffffffffa018803a>] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
 [<ffffffff81002042>] do_one_initcall+0x42/0x180
 [<ffffffff810aad71>] sys_init_module+0x91/0x200
 [<ffffffff815dccc2>] system_call_fastpath+0x16/0x1b
---[ end trace 44593438a59a9558 ]---
Using INTx interrupt, #Rx queues: 1.

It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
be called.  Because populate_msi_sysfs fails, we never registered any of the
msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
on each of them, which gets flagged in the above stack trace.

The fix is pretty straightforward.  We can key off the parent pointer in the
kobject.  It is only set if kobject_init_and_add succeeds in
populate_msi_sysfs.  If anything fails there, each kobject has its parent reset
to NULL.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: linux-pci@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:17 -08:00
Eric W. Biederman
d5dea7d95c PCI: msi: Disable msi interrupts when we initialize a pci device
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup.  The booting kernel had not enabled those interrupts so
was not prepared to handle them.

I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts.  The pci
spec specifies that msi interrupts should be off by default.  Drivers
are expected to enable the msi interrupts if they want to use them.  Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be able to do anything useful with an unexpected interrupt.

This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.

Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:29 -08:00
Neil Horman
da8d1c8ba4 PCI/sysfs: add per pci device msi[x] irq listing (v5)
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs.  For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:25 -08:00
Paul Gortmaker
363c75db1d pci: Fix files needing export.h for EXPORT_SYMBOL/THIS_MODULE
They were implicitly getting it from device.h --> module.h but
we want to clean that up.  So add the minimal header for these
macros.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:31:22 -04:00
Thomas Gleixner
dced35aeb0 drivers: Final irq namespace conversion
Scripted with coccinelle.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:19 +02:00
Sheng Yang
8d80528696 PCI: Add mask bit definition for MSI-X table
Then we can use it instead of magic number 1.

Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-12-23 12:53:08 -08:00
Thomas Gleixner
1525bf0d8f msi: Introduce default_[teardown|setup]_msi_irqs with fallback.
Introduce an override for arch_[teardown|setup]_msi_irqs
that can be used to fall back to the default arch_* code.

If a platform wants to use the code paths defined
in drivers/pci/msi.c it has to define HAVE_DEFAULT_MSI_TEARDOWN_IRQS
or HAVE_DEFAULT_MSI_SETUP_IRQS. Otherwise the old mechanism
of overriding the arch_* functions works fine.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: x86@kernel.org
2010-10-18 10:49:33 -04:00
Thomas Gleixner
39431acb1a pci: Cleanup the irq_desc mess in msi
Handing down irq_desc to msi just so that msi can access
irq_desc.irq_data.msi_desc is a pretty stupid idea. The calling code
can hand down a pointer to msi_desc so msi code does not need to know
about the irq descriptor at all.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-10-12 16:53:34 +02:00
Thomas Gleixner
1c9db52534 pci: Convert msi to new irq_chip functions
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
2010-10-12 16:53:34 +02:00
Ben Hutchings
30da552428 PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()
commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
return the last MSI message written instead of reading it from the
device, since it may be called while the device is in a reduced
power state.

However, the pSeries platform code really does need to read messages
from the device, since they are initially written by firmware.
Therefore:
- Restore the previous behaviour of read_msi_msg_desc()
- Add new functions get_cached_msi_msg{,_desc}() which return the
  last MSI message written
- Use the new functions where appropriate

Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:41:39 -07:00
Ben Hutchings
fcd097f31a PCI: MSI: Remove unsafe and unnecessary hardware access
During suspend on an SMP system, {read,write}_msi_msg_desc() may be
called to mask and unmask interrupts on a device that is already in a
reduced power state.  At this point memory-mapped registers including
MSI-X tables are not accessible, and config space may not be fully
functional either.

While a device is in a reduced power state its interrupts are
effectively masked and its MSI(-X) state will be restored when it is
brought back to D0.  Therefore these functions can simply read and
write msi_desc::msg for devices not in D0.

Further, read_msi_msg_desc() should only ever be used to update a
previously written message, so it can always read msi_desc::msg
and never needs to touch the hardware.
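
The caching behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the kernel code: the struct layouts, the in_d0 flag and the sketch_* names are assumptions made for the example, and the hw field merely stands in for device registers.

```c
#include <stdint.h>

struct sketch_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct sketch_desc {
	int in_d0;		/* non-zero when the device is fully powered */
	struct sketch_msg msg;	/* last message written (the cache) */
	struct sketch_msg hw;	/* stand-in for the device registers */
};

/* Always update the cache; only touch the "hardware" when in D0. */
static void sketch_write_msg(struct sketch_desc *desc,
			     const struct sketch_msg *msg)
{
	if (desc->in_d0)
		desc->hw = *msg;
	desc->msg = *msg;
}

/* Reading never touches the "hardware": it returns the cached copy. */
static void sketch_read_cached(const struct sketch_desc *desc,
			       struct sketch_msg *msg)
{
	*msg = desc->msg;
}
```

With this shape, a write issued while the device is in a reduced power state is remembered and can be replayed when the device returns to D0.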

Tested-by: "Michael Chan" <mchan@broadcom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:29:34 -07:00
Kenji Kaneshige
4302e0fb7f PCI: fix wrong memory address handling in MSI-X
Use resource_size_t for MMIO address instead of unsigned long. Otherwise,
the upper 32 bits of the MMIO address are cleared unexpectedly on x86-32 PAE.
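
A small sketch of the truncation, assuming a 32-bit unsigned long and a 64-bit resource_size_t (which is what one would expect on x86-32 with PAE). The typedefs and function names are stand-ins invented for the example.

```c
#include <stdint.h>

typedef uint32_t narrow_addr_t;	/* stand-in for a 32-bit unsigned long */
typedef uint64_t wide_addr_t;	/* stand-in for a 64-bit resource_size_t */

/* Buggy variant: the sum is squeezed into 32 bits, losing the top half. */
static wide_addr_t table_addr_narrow(wide_addr_t bar_start, uint32_t offset)
{
	narrow_addr_t addr = (narrow_addr_t)(bar_start + offset);

	return addr;
}

/* Fixed variant: the full 64-bit address is preserved. */
static wide_addr_t table_addr_wide(wide_addr_t bar_start, uint32_t offset)
{
	wide_addr_t addr = bar_start + offset;

	return addr;
}
```

For a BAR located above 4 GiB the two variants disagree, and the narrow one yields an address that belongs to something else entirely.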

Acked-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-07-30 09:29:14 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surroundings.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have a fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Hidetoshi Seto
500559a92d PCI MSI: Style cleanups
Cleanups (nearly based on checkpatch).

Before: total: 11 errors, 2 warnings, 0 checks, 842 lines checked
After:  total:  0 errors, 0 warnings, 0 checks, 842 lines checked

v2: fix it's/its mistakes in comment

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:35 -07:00
Hidetoshi Seto
d9d7070e61 PCI MSI: MSI-X cleanup, msix_setup_entries()
Cleanup based on the prototype from Matthew Wilcox.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:34 -07:00
Hidetoshi Seto
75cb342687 PCI MSI: MSI-X cleanup, msix_program_entries()
Cleanup based on the prototype from Matthew Wilcox.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:33 -07:00
Hidetoshi Seto
5a05a9d819 PCI MSI: MSI-X cleanup, msix_map_region()
Cleanup based on the prototype from Matthew Wilcox.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:33 -07:00
Hidetoshi Seto
583871d436 PCI MSI: Relocate error path in init_msix_capability()
Move it from the middle of the function to the end.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:32 -07:00
Hidetoshi Seto
f56e448132 PCI MSI: Unify msi_free_irqs() and msix_free_all_irqs()
Unify msi_free_irqs() and msix_free_all_irqs(), and rename it to a
common void function free_msi_irqs().

And relocate the common function to where the prototype is located now.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:31 -07:00
Hidetoshi Seto
9cc8d54815 PCI MSI: Use list_first_entry()
Use list_first_entry() instead of list_entry().

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:30 -07:00
Hidetoshi Seto
c901851fdd PCI MSI: Remove attribute check from pci_disable_msi()
The msi_list never has MSI-X's msi_desc while MSI is enabled,
and it never has MSI's msi_desc while MSI-X is enabled.

This patch removes the check for an MSI-X entry from pci_disable_msi(),
given that pci_disable_msix() does not have any check for an MSI
entry.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:29 -07:00
Hidetoshi Seto
12abb8ba84 PCI MSI: Fix restoration of MSI/MSI-X mask states in suspend/resume
There are 2 problems with mask states in suspend/resume.

[1]:
It is better to restore the mask states of MSI/MSI-X to initial states
(MSI is unmasked, MSI-X is masked) when we release the device.
The pci_msi_shutdown() does the restoration of mask states for MSI,
while the msi_free_irqs() does it for MSI-X.  In other words, in the
"disable" path both of MSI and MSI-X are handled, but in the "shutdown"
path only MSI is handled.

MSI:
   pci_disable_msi()
      => pci_msi_shutdown()
         [ mask states for MSI restored ]
         => msi_set_enable(dev, pos, 0);
      => msi_free_irqs()

MSI-X:
   pci_disable_msix()
      => pci_msix_shutdown()
         => msix_set_enable(dev, 0);
      => msix_free_all_irqs
         => msi_free_irqs()
            [ mask states for MSI-X restored ]

This patch moves the masking for MSI-X from msi_free_irqs() to
pci_msix_shutdown().

This change has some positive side effects:
 - It prevents the OS from touching mask states before reading preserved
   bits in the register, which can happen if msi_free_irqs() is
   called from the error path in msix_capability_init().
 - It also prevents touching the register after turning off MSI-X in
   "disable" path, which can be a problem on some devices.

[2]:
We have a cache of the mask state in msi_desc, which is automatically
updated when msi/msix_mask_irq() is called.  These cached states are
used for the resume.

But since what needs to be restored on resume is the state before
the shutdown on suspend, calling msi/msix_mask_irq() from
pci_msi/msix_shutdown() is not appropriate.

This patch introduces __msi/msix_mask_irq(), which mask in the same way
as msi/msix_mask_irq() but do not update the cached state, for use
in pci_msi/msix_shutdown().
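
The split between a masking helper that writes hardware only and a wrapper that also updates the cache might look roughly like the sketch below. The struct, its fields and the sketch_* names are assumptions for illustration; hw_mask merely stands in for the device's mask register.

```c
#include <stdint.h>

struct sketch_mask_desc {
	uint32_t hw_mask;	/* stand-in for the device mask register */
	uint32_t masked;	/* cached state, restored on resume */
};

/* Write the new mask to "hardware" only; the cache is left untouched. */
static uint32_t sketch_mask_hw_only(struct sketch_mask_desc *desc,
				    uint32_t mask, uint32_t flag)
{
	uint32_t mask_bits = desc->masked;

	mask_bits &= ~mask;
	mask_bits |= flag;
	desc->hw_mask = mask_bits;

	return mask_bits;
}

/* Normal path: write "hardware" and remember the result in the cache. */
static void sketch_mask_and_cache(struct sketch_mask_desc *desc,
				  uint32_t mask, uint32_t flag)
{
	desc->masked = sketch_mask_hw_only(desc, mask, flag);
}
```

A shutdown path that calls only the first helper can change the register while the cached value still reflects the state from before the shutdown.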

[updated: get rid of msi/msix_mask_irq_nocache() (proposed by Matthew Wilcox)]

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29 12:18:13 -07:00
Hidetoshi Seto
7ba1930db0 PCI MSI: Unmask MSI if setup failed
The initial state of mask register of MSI is unmasked.  We set it
masked before calling arch_setup_msi_irqs().  If arch_setup_msi_irq()
fails, it is better to restore the state of the mask register.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29 12:16:19 -07:00
Hidetoshi Seto
2c21fd4b33 PCI MSI: shorten PCI_MSIX_ENTRY_* symbol names
These names are too long!  Drop _OFFSET to save some bytes/lines.

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29 12:15:19 -07:00
Hidetoshi Seto
0d07348931 PCI MSI: Return if alloc_msi_entry for MSI-X failed
In the current code, setup continues even if alloc_msi_entry() for MSI-X
fails due to lack of memory.  This means arch_setup_msi_irqs() might
be called with fewer msi_desc entries than its argument nvec.

At least x86's arch_setup_msi_irqs() uses list_for_each_entry() over
dev->msi_list, which is expected to have as many entries as nvec, and
it doesn't check the number of allocated vectors against the passed nvec.
Therefore pci_enable_msix() will report success with fewer
vectors allocated than requested.

This patch fixes the error route to return -ENOMEM, instead of continuing
the setup (proposed by Matthew Wilcox).

Note that there is no iounmap in msi_free_irqs() if no msi_desc is
allocated.

Reviewed-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-29 12:10:10 -07:00
Hidetoshi Seto
2af5066f66 PCI: make msi_free_irqs() to use msix_mask_irq() instead of open coded write
Use msix_mask_irq() instead of direct use of writel, so as not to clear
preserved bits in the Vector Control register [31:1].
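
As a sketch of why the helper matters: only bit 0 of Vector Control is the mask bit, so the update should be a read-modify-write that leaves bits [31:1] alone. The macro and function names below are illustrative, not the kernel's.

```c
#include <stdint.h>

#define SKETCH_CTRL_MASKBIT 0x1u	/* bit 0: per-vector mask */

/* Return the new Vector Control value, preserving bits [31:1]. */
static uint32_t sketch_vector_ctrl_set_mask(uint32_t ctrl, int masked)
{
	ctrl &= ~SKETCH_CTRL_MASKBIT;
	if (masked)
		ctrl |= SKETCH_CTRL_MASKBIT;

	return ctrl;
}
```

An open-coded write of the constant 1 would instead zero every preserved bit in the register.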

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-19 15:11:45 -07:00
Matthew Wilcox
f598282f51 PCI: Fix the NIU MSI-X problem in a better way
The previous MSI-X fix (8d18101853) had
three bugs.  First, it didn't move the write that disabled the vector.
This led to writing garbage to the MSI-X vector (spotted by Michael
Ellerman).  It didn't fix the PCI resume case, and it had a race window
where the device could generate an interrupt before the MSI-X registers
were programmed (leading to a DMA to random addresses).

Fortunately, the MSI-X capability has a bit to mask all the vectors.
By setting this bit instead of clearing the enable bit, we can ensure
the device will not generate spurious interrupts.  Since the capability
is now enabled, the NIU device will not have a problem with the reads
and writes to the MSI-X registers being in the original order in the code.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-19 15:11:39 -07:00
Matthew Wilcox
110828c9cd PCI: remove redundant __msi_set_enable()
We have the 'pos' of the MSI capability at all locations which call
msi_set_enable(), so pass it to msi_set_enable() instead of making it
find the capability every time.

Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-18 13:57:24 -07:00
Kenji Kaneshige
ab7de999a2 PCI: remove invalid comment of msi_mask_irq()
Remove invalid comment of msi_mask_irq().

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-16 14:30:18 -07:00
Michael S. Tsirkin
57fbf52c86 PCI MSI: let drivers retry when not enough vectors
pci_enable_msix currently returns -EINVAL if you ask
for more vectors than supported by the device, which would
typically cause fallback to regular interrupts.

It's better to return the table size, making the driver retry
MSI-X with fewer vectors.
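
A driver-side retry loop built on this return convention could look something like the sketch below. sketch_enable_msix() is a hypothetical stand-in for pci_enable_msix() that models only the return values described here (0 on success, negative on error, positive table size when too many vectors are requested); it is not the kernel function.

```c
#define SKETCH_EINVAL 22	/* placeholder error number */

/* Hypothetical stand-in for the enable call. */
static int sketch_enable_msix(int nvec, int table_size)
{
	if (nvec <= 0 || table_size <= 0)
		return -SKETCH_EINVAL;
	if (nvec > table_size)
		return table_size;	/* tell the caller what would fit */

	return 0;
}

/* Returns the number of vectors obtained, or a negative error. */
static int sketch_request_vectors(int wanted, int table_size)
{
	int nvec = wanted;
	int ret;

	while ((ret = sketch_enable_msix(nvec, table_size)) > 0)
		nvec = ret;	/* retry with the reported count */

	return ret < 0 ? ret : nvec;
}
```

The loop ends as soon as the call either succeeds or fails with a negative error, so the driver falls back to regular interrupts only when MSI-X genuinely cannot be used.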

Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-11 12:04:18 -07:00
Hidetoshi Seto
67b5db6502 PCI MSI: Define PCI_MSI_MASK_32/64
Impact: cleanup, improve readability

Define PCI_MSI_MASK_32/64 for 32/64bit devices, instead of using
implicit offset (-4), "PCI_MSI_MASK_BIT - 4" and "PCI_MSI_MASK_BIT".
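
For illustration, selecting the mask register offset with named constants might look like this. The SKETCH_* macros mirror the offsets 12 and 16 from the start of the MSI capability; the helper name is an assumption for the example.

```c
#define SKETCH_MSI_MASK_32 12	/* mask register, 32-bit address format */
#define SKETCH_MSI_MASK_64 16	/* mask register, 64-bit address format */

/* Config-space offset of the mask register for a capability at cap_pos. */
static int sketch_msi_mask_reg(int cap_pos, int is_64)
{
	return cap_pos + (is_64 ? SKETCH_MSI_MASK_64 : SKETCH_MSI_MASK_32);
}
```

Naming both offsets removes the need to remember that the 32-bit layout sits four bytes below the 64-bit one.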

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-06-11 12:04:06 -07:00
Matthew Wilcox
8d18101853 PCI MSI: Fix MSI-X with NIU cards
The NIU device refuses to allow accesses to MSI-X registers before MSI-X
is enabled.  This patch fixes the problem by moving the read of the mask
register to after MSI-X is enabled.

Reported-by: David S. Miller <davem@davemloft.net>
Tested-by: David S. Miller <davem@davemloft.net>
Reviewed-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-05-11 17:02:27 -07:00
Matthew Wilcox
1c8d7b0a56 PCI MSI: Add support for multiple MSI
Add the new API pci_enable_msi_block() to allow drivers to
request multiple MSI and reimplement pci_enable_msi in terms of
pci_enable_msi_block.  Ensure that the architecture back ends don't
have to know about multiple MSI.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:14 -07:00
Matthew Wilcox
f2440d9acb PCI MSI: Refactor interrupt masking code
Since most of the callers already know whether they have an MSI or
an MSI-X capability, split msi_set_mask_bits() into msi_mask_irq()
and msix_mask_irq().  The only callers which don't (mask_msi_irq()
and unmask_msi_irq()) can share code in msi_set_mask_bit().  This then
becomes the only caller of msix_flush_writes(), so we can inline it.
The flushing read can be to any address that belongs to the device,
so we can eliminate the calculation too.

We can also get rid of maskbits_mask from struct msi_desc and simply
recalculate it on the rare occasion that we need it.  The single-bit
'masked' element is replaced by a copy of the 32-bit 'masked' register,
so this patch does not affect the size of msi_desc.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:13 -07:00
Matthew Wilcox
264d9caaa1 PCI MSI: Use mask_pos instead of mask_base when appropriate
MSI interrupts have a mask_pos where MSI-X have a mask_base.  Use a
transparent union to get rid of some ugly casts.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:13 -07:00
Matthew Wilcox
379f5327a8 PCI MSI: msi_desc->dev is always initialised
By passing the pci_dev into alloc_msi_entry() we can be sure that
the ->dev entry is always assigned and so we don't need to check it.
Also, we used kzalloc() so we don't need to initialise ->irq to 0.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:12 -07:00
Matthew Wilcox
24d2755339 PCI MSI: Replace 'type' with 'is_msix'
By changing from a 5-bit field to a 1-bit field, we free up some bits
that can be used by a later patch.  Also rearrange the fields for better
packing on 64-bit platforms (reducing the size of msi_desc from 72 bytes
to 64 bytes).

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:12 -07:00
Michael Ellerman
b5fbf53324 PCI/MSI: Allow arch code to return the number of MSI-X available
There is code in msix_capability_init() which, when the requested number
of MSI-X couldn't be allocated, calculates how many MSI-X /could/ be
allocated and returns that to the driver. That allows the driver to then
make a second request, with a number of MSIs that should succeed.

The current code requires the arch code to setup as many msi_descs as it
can, and then return to the generic code. On some platforms the arch
code may already know how many MSI-X it can allocate, before it sets up
any of the msi_descs.

So change the logic such that if the arch code returns a positive error
code, that is taken to be the number of MSI-X that could be allocated.
If the error code is negative we still calculate the number available
using the old method.

Because it's a little subtle, make sure the error return code from
arch_setup_msi_irq() is always negative. That way only implementations
of arch_setup_msi_irqs() need to be careful about returning a positive
error code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:34 -07:00
Michael Ellerman
11df1f0551 PCI/MSI: Use #ifdefs instead of weak functions
Weak functions aren't all they're cracked up to be. They lead to
incorrect binaries with some toolchains, they require us to have empty
functions we otherwise wouldn't, and the unused code is not elided
(as of gcc 4.3.2 anyway).

So replace the weak MSI arch hooks with the #define foo foo idiom. We no
longer need empty versions of arch_setup/teardown_msi_irq().
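
A minimal sketch of the "#define foo foo" idiom, with hypothetical function names. An architecture that provides its own hook also defines a macro of the same name, and the generic code supplies a default only when that macro is absent.

```c
/* --- what an architecture header would provide (hypothetical) --- */
static int sketch_arch_setup(int nvec)
{
	return nvec * 2;	/* arbitrary arch-specific behaviour */
}
#define sketch_arch_setup sketch_arch_setup

/* --- generic code: default used only if no arch version exists --- */
#ifndef sketch_arch_setup
static int sketch_arch_setup(int nvec)
{
	(void)nvec;
	return 0;
}
#endif

static int sketch_generic_setup(int nvec)
{
	return sketch_arch_setup(nvec);
}
```

Because the choice is made by the preprocessor, the unused default is never compiled, unlike a weak function that is left in the binary.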

This is less source (by 1 line!), and results in smaller binaries too:

   text	   data	    bss	    dec	    hex	filename
9354300	1693916	 678424	11726640 b2ef30	build/powerpc/vmlinux-before
9354052	1693852	 678424	11726328 b2edf8	build/powerpc/vmlinux-after

Also smaller on x86_64 and arm (iop13xx).

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:26 -07:00
Rafael J. Wysocki
a52e2e3513 PCI/MSI: Introduce pci_msix_table_size()
Introduce new function pci_msix_table_size() returning the size of
the MSI-X table of given PCI device or 0 if the device doesn't
support MSI-X.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-19 19:29:25 -07:00
Matthew Wilcox
0b49ec37a2 PCI/MSI: fix msi_mask() shift fix
Hidetoshi Seto points out that commit
bffac3c593 has wrong values in the array.
Rather than correct the array, we can just use a bounds check and
perform the calculation specified in the comment.  As a bonus, this will
not run off the end of the array if the device specifies an illegal
value in the MSI capability.
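
The bounds check plus calculation can be sketched as below, assuming the argument is the log2 of the number of vectors (as encoded in the MSI capability). The function and parameter names are illustrative.

```c
#include <stdint.h>

/* Bitmask covering 2^log2_nvec vectors, without an undefined shift. */
static uint32_t sketch_msi_mask(unsigned int log2_nvec)
{
	/* 2^5 = 32 vectors: shifting a 32-bit value by 32 is undefined */
	if (log2_nvec >= 5)
		return 0xffffffffu;

	return (UINT32_C(1) << (1u << log2_nvec)) - 1;
}
```

Any out-of-range encoding falls into the first branch, so there is no table to index past.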

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-02-13 11:59:03 -08:00
Matthew Wilcox
bffac3c593 PCI MSI: Fix undefined shift by 32
Add an msi_mask() function which returns the correct bitmask for the
number of MSI interrupts you have.  This fixes an undefined bug in
msi_capability_init().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-27 09:53:25 -08:00
Hidetoshi Seto
0db29af1e7 PCI/MSI: bugfix/utilize for msi_capability_init()
This patch fixes the following bug and does a cleanup.

bug:
	commit 5993760f7f
	had a wrong change (since is_64 is boolean[0|1]):

-               pci_write_config_dword(dev,
-                       msi_mask_bits_reg(pos, is_64bit_address(control)),
-                       maskbits);
+               pci_write_config_dword(dev, entry->msi_attrib.is_64, maskbits);

utilize:
	Unify separated if (entry->msi_attrib.maskbit) statements.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: "Jike Song" <albcamus@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-16 12:35:25 -08:00
Andrew Patterson
07ae95f988 ACPI/PCI: PCI MSI _OSC support capabilities called when root bridge added
The _OSC capability OSC_MSI_SUPPORT is set when the root bridge is added
with pci_acpi_osc_support(), so we no longer need to do it in the PCI
MSI driver.  Also adds the function pci_msi_enabled, which returns true
if pci=nomsi is not on the kernel command-line.

Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:31 -08:00
Yinghai Lu
3145e941fc x86, MSI: pass irq_cfg and irq_desc
Impact: simplify code

Pass irq_desc and cfg around, instead of raw IRQ numbers - this way
we don't have to look it up again and again.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-08 14:31:59 +01:00
Taku Izumi
d389fec6a2 ACPI/PCI: Set support bit for MSI in support field of _OSC
Currently Linux doesn't have any code to set the "MSI supported" bit in
the Support field of _OSC. This patch adds the code for that.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-22 16:42:35 -07:00
Jike Song
5993760f7f PCI: utilize calculated results when detecting MSI features
In msi_capability_init, we can make use of the calculated results
instead of calling is_mask_bit_support and is_64bit_address twice.

Signed-off-by: Jike Song <albcamus@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-10-20 10:53:50 -07:00
Jesse Barnes
abad2ec98f PCI: fully restore MSI state at resume time
With the recent change to avoid masking MSIs using the MSI enable bit, devices
without an MSI mask bit will have their MSI capability always enabled when MSI
is in use, so we need to restore it regardless of the mask bit state.

Fixes kernel bz 11178.

Acked-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-08-07 08:52:37 -07:00
Matthew Wilcox
ce6fce4295 PCI MSI: Don't disable MSIs if the mask bit isn't supported
David Vrabel has a device which generates an interrupt storm on the INTx
pin if we disable MSI interrupts altogether.  Masking interrupts is only
a performance optimisation, so we can ignore the request to mask the
interrupt.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-28 14:43:22 -07:00
Bjorn Helgaas
80ccba1186 PCI: use dev_printk when possible
Convert printks to use dev_printk().

I converted pr_debug() to dev_dbg().  Both use KERN_DEBUG and are enabled
only when DEBUG is defined.

I converted printk(KERN_DEBUG) to dev_printk(KERN_DEBUG), not to dev_dbg(),
because dev_dbg() is only enabled when DEBUG is defined.

I converted DBG(KERN_INFO) (only in setup-bus.c) to dev_info().  The DBG()
name makes it sound like debug, but it's been enabled forever, so dev_info()
preserves the previous behavior.

I tried to make the resource assignment formats more consistent, e.g.,
  "BAR %d: got res [%#llx-%#llx] bus [%#llx-%#llx] flags %#lx\n"
instead of sometimes using "start-end" and sometimes using "size@start".
I'm not attached to one or the other; I'd just like them consistent.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-06-25 16:05:13 -07:00
Hidetoshi Seto
5ca5c02f0e PCI/MSI: skip calling pci_find_capability from msi_set_mask_bits
The position of MSI capability is already cached in the msi_desc when
we enter the msi_set_mask_bits().  Use it instead.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-06-10 10:59:49 -07:00
Yinghai Lu
d52877c7b1 pci/irq: let pci_device_shutdown to call pci_msi_shutdown v2
This change

| commit 23a274c8a5
| Author: Prakash, Sathya <sathya.prakash@lsi.com>
| Date:   Fri Mar 7 15:53:21 2008 +0530
|
|     [SCSI] mpt fusion: Enable MSI by default for SAS controllers
|
|     This patch modifies the driver to enable MSI by default for all SAS chips.
|
|     Signed-off-by: Sathya Prakash <sathya.prakash@lsi.com>
|     Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
causes the kexec of a RHEL 5.1 kernel to fail.

Root cause: the RHEL 5.1 kernel still uses INTx emulation, and
mptscsih_shutdown doesn't call pci_disable_msi to re-enable INTx on the kexec
path.

So call pci_msi_shutdown in the shutdown path, and do the same thing for MSI-X.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
2008-04-29 09:12:51 -07:00
Yinghai Lu
8e149e09f9 pci/irq: restore mask_bits in msi shutdown -v3
Yinghai found that kexec'ing a RHEL 5.1 kernel with 2.6.25-rc3+ kernels
prevents his NIC from working.  He bisected to

| commit 89d694b9db
| Author: Thomas Gleixner <tglx@linutronix.de>
| Date:   Mon Feb 18 18:25:17 2008 +0100
|
|   genirq: do not leave interupts enabled on free_irq
|
|   The default_disable() function was changed in commit:
|
|    76d2160147
|    genirq: do not mask interrupts by default
|

For MSI, default_shutdown will call mask_bit for the MSI device.  All mask bits
will be left disabled (masked) after free_irq.  Then in the kexec case, the next
kernel can only use the msi_enable bit, so none of the device's MSIs can be used.

So restore the mask bits to their PCI reset defined value (enabled, i.e.
unmasked) when we disable the kernel's use of MSI, to be a little friendlier to
kexec'd kernels.

Extend msi_set_mask_bit to msi_set_mask_bits to take mask, so we can fully
restore that to 0x00 instead of 0xfe.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@hobbes.lan>
2008-04-29 09:11:12 -07:00
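The mask/flag update described in 8e149e09f9 above can be sketched as a pure function. This is a minimal model under the assumption that the update is "clear the selected bits, then set them from flag"; apply_mask_bits() is a hypothetical name, and the real code reads and writes the register through PCI config space.

```c
#include <stdint.h>

/* Hypothetical model of a mask-bits update: the bits selected by 'mask'
 * are replaced by the corresponding bits of 'flag'; other bits are kept. */
static uint32_t apply_mask_bits(uint32_t reg, uint32_t mask, uint32_t flag)
{
	reg &= ~mask;		/* clear the bits being updated */
	reg |= flag & mask;	/* copy in the new values for those bits */
	return reg;
}
```

With a single-bit mask, a register holding 0xff can only be brought down to 0xfe; with an all-ones mask and flag 0 the whole register returns to 0x00, which is the difference the commit message refers to.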
Adrian Bunk
6a9e7f2031 PCI: drivers/pci/msi.c: move arch hooks to the top
This patch fixes the following problem present with older gcc versions:

<--  snip  -->

...
  CC      drivers/pci/msi.o
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/pci/msi.c:692: warning: weak declaration of `arch_msi_check_device' after first use results in unspecified behavior
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/pci/msi.c:704: warning: weak declaration of `arch_setup_msi_irqs' after first use results in unspecified behavior
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/pci/msi.c:724: warning: weak declaration of `arch_teardown_msi_irqs' after first use results in unspecified behavior
...

<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-01 15:04:25 -08:00
Linas Vepstas
94688cf245 PCI: export pci_restore_msi_state()
PCI error recovery usually involves the PCI adapter being reset.
If the device is using MSI, the reset will cause the MSI state
to be lost; the device driver needs to restore the MSI state.

The pci_restore_msi_state() routine is currently protected
by CONFIG_PM; remove this, and also export the symbol, so
that it can be used in a module.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-02-01 15:04:22 -08:00
David Miller
ba698ad4b7 PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.
A reasonably common problem with some devices is that they will
disable MSI generation when the INTX_DISABLE bit is set in the
PCI_COMMAND register.

Quirk this explicitly, guarding the pci_intx() calls in msi.c with
this quirk indication.

The first entries for this quirk are for 5714 and 5780 Tigon3 chips,
and thus we can remove the workaround code from the tg3.c driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Michael Chan <mchan@broadcom.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-11-05 13:35:16 -08:00
Roland Dreier
cbf5d9e6b9 MSI: Use correct data offset for 32-bit MSI in read_msi_msg()
While reading the MSI code trying to find a reason why MSI wouldn't
work for devices that have a 32-bit MSI address capability, I noticed
that read_msi_msg() seems to read the message data from the wrong
offset in this case.

Signed-off-by: Roland Dreier <roland@digitalvampire.org>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:17 -07:00
Eric W. Biederman
78b7611c4a msi: mask the msix vector before we unmap it
With these two lines in the reverse order, drivers/block/cciss.c was
oopsing in msi_free_irqs.  Silly us calling writel on an area after
we unmap it.

BUG: unable to handle kernel paging request at virtual address f8b2200c
 printing eip:
c01e9cc7
*pdpt = 0000000000003001
*pde = 0000000037e48067
*pte = 0000000000000000
Oops: 0002 [#1]
SMP
Modules linked in: cciss ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core
sunrpc loop dm_multipath button battery asus_acpi ac tg3 floppy sg dm_snapshot
dm_zero dm_mirror ext3 jbd dm_mod ata_piix libata mptsas scsi_transport_sas
mptspi scsi_transport_spi mptscsih mptbase sd_mod scsi_mod
CPU:    1
EIP:    0060:[<c01e9cc7>]    Not tainted VLI
EFLAGS: 00010286   (2.6.22-rc2-gd2579053 #1)
EIP is at msi_free_irqs+0x81/0xbe
eax: f8b22000   ebx: f71f3180   ecx: f7fff280   edx: c1886eb8
esi: f7c4e800   edi: f7c4ec48   ebp: 00000002   esp: f5a0dec8
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068
Process rmmod (pid: 5286, ti=f5a0d000 task=c47d2550 task.ti=f5a0d000)
Stack: 00000002 f8b72294 00000400 f8b69ca7 f8b6bc6c 00000002 00000000 00000000
       00000000 00000000 00000000 f5a997f4 f8b69d61 f7c5a4b0 f7c4e848 f7c4e848
       f7c4e800 f7c4e800 f8b72294 f7c4e848 f8b72294 c01e3cdf f7c4e848 c024c469
Call Trace:
 [<f8b69ca7>] cciss_shutdown+0xae/0xc3 [cciss]
 [<f8b69d61>] cciss_remove_one+0xa5/0x178 [cciss]
 [<c01e3cdf>] pci_device_remove+0x16/0x35
 [<c024c469>] __device_release_driver+0x71/0x8e
 [<c024c56e>] driver_detach+0xa0/0xde
 [<c024bc5c>] bus_remove_driver+0x27/0x41
 [<c01e3ef3>] pci_unregister_driver+0xb/0x13
 [<f8b6a343>] cciss_cleanup+0xf/0x51 [cciss]
 [<c0139ced>] sys_delete_module+0x110/0x135
 [<c0104c7a>] sysenter_past_esp+0x5f/0x85

Here's a patch that just reverses the 2 lines of code as Eric suggests. Please
consider this for inclusion.

Signed-off-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Chase Maupin <chase.maupin@hp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-01 08:18:27 -07:00
Eric W. Biederman
0dd11f9be4 msi: fix the ordering of msix irqs
"Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net> writes:

Found what seems the problem with our vectors being listed backward.  In
drivers/pci/msi.c we should be using list_add_tail rather than list_add to
preserve the ordering across various kernels.  Please consider this for
inclusion.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Screwed-up-by: Michael Ellerman <michael@ellerman.id.au>
Cc: "Mike Miller (OS Dev)" <mikem@beardog.cca.cpqcorp.net>
Cc: Andi Kleen <ak@suse.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-01 08:18:27 -07:00
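The ordering issue fixed in 0dd11f9be4 above comes down to head insertion versus tail insertion on a circular doubly linked list. The following is a self-contained userspace sketch, not the kernel's list.h; the names struct node, add_head(), add_tail() and collect() are assumptions made for this illustration.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on the kernel's. */
struct node {
	int irq;
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void insert_between(struct node *n, struct node *prev, struct node *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Head insertion: the newest entry comes first, so iteration is reversed. */
static void add_head(struct node *n, struct node *head)
{
	insert_between(n, head, head->next);
}

/* Tail insertion: entries are iterated in the order they were added. */
static void add_tail(struct node *n, struct node *head)
{
	insert_between(n, head->prev, head);
}

/* Copy the IRQ numbers into out[] in list order; returns how many were copied. */
static size_t collect(const struct node *head, int *out, size_t max)
{
	size_t n = 0;

	for (const struct node *p = head->next; p != head && n < max; p = p->next)
		out[n++] = p->irq;
	return n;
}
```

Adding vectors 10, 11, 12 with add_tail() yields them in that order, while add_head() yields 12, 11, 10, which matches the "listed backward" symptom in the report.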
Dan Williams
4fdadebc31 msi: fix ARM compile
In file included from drivers/pci/msi.c:22:
include/asm/smp.h:17:26: asm/arch/smp.h: No such file or directory
include/asm/smp.h:20:3: #error "<asm-arm/smp.h> included in non-SMP build"
include/asm/smp.h:23:1: warning: "raw_smp_processor_id" redefined
In file included from include/linux/sched.h:65,
                 from include/linux/mm.h:4,
                 from drivers/pci/msi.c:10:
include/linux/smp.h:85:1: warning: this is the location of the previous
definition

Tested on powerpc, i386, and x86_64.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-31 16:56:36 -07:00
David Miller
b3b7cc7b41 Fix assertion failure with MSI on sparc64
Today's find is a triggered assertion in msi_free_irqs() when the system
doesn't support MSI, in which case arch_setup_msi_irqs() always returns
an error.

The problem is that when this happens we branch into msi_free_irqs(), to
which you added the following assertion loop:

	list_for_each_entry(entry, &dev->msi_list, list)
		BUG_ON(irq_has_action(entry->irq));

Well, if arch_setup_msi_irqs() fails, entry->irq will be zero and
although that's never assigned to any normal devices we use that IRQ
number for the timer interrupt on sparc64 so this assertion triggers.

Better to test for zero before doing the irq_has_action() assertion
thing.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 16:01:18 -07:00
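The check described in b3b7cc7b41 above can be modelled as follows. This is a userspace sketch under stated assumptions: struct desc and count_busy() are hypothetical, has_action stands in for what irq_has_action() would report for that IRQ number, and where this sketch counts, the kernel would hit BUG_ON().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MSI descriptor. */
struct desc {
	unsigned int irq;	/* 0 means "never allocated" */
	bool has_action;	/* would a handler be found for this irq number? */
};

/* Count allocated descriptors whose IRQ still has a handler attached.
 * Entries with irq == 0 are skipped, so an unrelated user of IRQ 0
 * (the timer on sparc64, per the commit message) is never examined. */
static size_t count_busy(const struct desc *d, size_t n)
{
	size_t busy = 0;

	for (size_t i = 0; i < n; i++) {
		if (d[i].irq == 0)
			continue;	/* setup failed or never ran */
		if (d[i].has_action)
			busy++;
	}
	return busy;
}
```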
Randy Dunlap
e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Michael Ellerman
032de8e2fe MSI: Give archs the option to free all MSI/Xs at once.
This patch introduces an optional function, arch_teardown_msi_irqs(),
which gives an arch the opportunity to do per-device teardown for
MSI/X. If that's not required, the default version simply calls
arch_teardown_msi_irq() for each msi irq required.

arch_teardown_msi_irqs() is simply passed a pdev. Attached to the pdev
is a list of msi_descs; it is up to the arch to free the irq associated
with each of these as appropriate.

For archs that _don't_ implement arch_teardown_msi_irqs(), all msi_descs
with irq == 0 are considered unallocated, and the arch teardown routine
is not called on them.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Michael Ellerman
9c8313343c MSI: Give archs the option to allocate all MSI/Xs at once.
This patch introduces an optional function, arch_setup_msi_irqs(),
(note the plural) which gives an arch the opportunity to do per-device
setup for MSI/X and then allocate all the requested MSI/Xs at once.

If that's not required by the arch, the default version simply calls
arch_setup_msi_irq() for each MSI irq required.

arch_setup_msi_irqs() is passed a pdev. Attached to the pdev is a list
of msi_descs with irq == 0; it is up to the arch to connect these up to
an irq (via set_irq_msi()) or return an error. For convenience the number
of vectors and the type are also passed.

All msi_descs with irq != 0 are considered allocated, and the arch
teardown routine will be called on them when necessary.

The existing semantics of pci_enable_msix() are that if the requested
number of irqs cannot be allocated, the maximum number that _could_ be
allocated is returned. To support that, we define that in case of an
error from arch_setup_msi_irqs(), the msi_descs with irq != 0 are
considered allocated, and are counted toward the "max that could be
allocated".


Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
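The default behaviour and the "count what was allocated" rule from 9c8313343c above can be sketched in userspace. All names here (struct desc, setup_all(), count_allocated(), budgeted_setup()) are assumptions made for illustration, and the array stands in for the list of msi_descs attached to the device.

```c
#include <stddef.h>

/* Hypothetical stand-in for an MSI descriptor; irq == 0 means unallocated. */
struct desc {
	unsigned int irq;
};

/* Per-vector hook: returns 0 on success, non-zero on failure. */
typedef int (*setup_one_fn)(struct desc *d);

/* Default "plural" behaviour: call the per-vector hook for each descriptor
 * and stop at the first failure. */
static int setup_all(struct desc *descs, size_t nvec, setup_one_fn setup_one)
{
	for (size_t i = 0; i < nvec; i++) {
		int ret = setup_one(&descs[i]);

		if (ret)
			return ret;
	}
	return 0;
}

/* After a failure, descriptors with irq != 0 count as allocated. */
static size_t count_allocated(const struct desc *descs, size_t nvec)
{
	size_t n = 0;

	for (size_t i = 0; i < nvec; i++)
		if (descs[i].irq != 0)
			n++;
	return n;
}

/* Example hook for this sketch only: hands out IRQs 32, 33, ... until a
 * fixed budget of two vectors is used up, then fails. */
static unsigned int budget = 2;

static int budgeted_setup(struct desc *d)
{
	static unsigned int next_irq = 32;

	if (budget == 0)
		return -1;
	budget--;
	d->irq = next_irq++;
	return 0;
}
```

Requesting four vectors with budgeted_setup() fails on the third, and count_allocated() then reports two, which is the value a caller could retry with.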
Michael Ellerman
7fe3730de7 MSI: arch must connect the irq and the msi_desc
set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
it at some point in their setup routine, and then the generic code sets up the
reverse mapping from the msi_desc back to the irq.

set_irq_msi() should do both connections, making it the one and only call
required to connect an irq with its MSI desc and vice versa.

The arch code MUST call set_irq_msi(), and it must do so only once it's sure
it's not going to fail the irq allocation.

Given that there's no need for the arch to return the irq anymore, the return
value from the arch setup routine just becomes 0 for success and anything else
for failure.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Michael Ellerman
314e77b3ee MSI: Remove dev->first_msi_irq
Now that we keep a list of msi descriptors, we don't need first_msi_irq
in the pci dev.

If we somehow have zero MSIs configured, list_entry() will give us weird
oopses or nice memory corruption bugs. So be paranoid. Add BUG_ONs and also
a check in pci_msi_check_device() to make sure nvec > 0.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Michael Ellerman
4aa9bc955d MSI: Use a list instead of the custom link structure
The msi descriptors are linked together with what looks a lot like
a linked list, but isn't a struct list_head list. Make it one.

The only complication is that previously we walked a list of irqs, and
got the descriptor for each with get_irq_msi(). Now we have a list of
descriptors and need to get the irq out of it, so it needs to be in the
actual struct msi_desc. We use 0 to indicate no irq is setup.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Michael Ellerman
c9953a73e9 MSI: Add an arch_msi_check_device()
Add an arch_msi_check_device(), which gives archs a chance to check the input
to pci_enable_msi/x. The arch might be interested in the value of nvec so
pass it in. Propagate the error value returned from the arch routine out
to the caller.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:37 -07:00
Michael Ellerman
17bbc12acd MSI: Rename pci_msi_supported() to pci_msi_check_device()
As pointed out by Eric, the name pci_msi_supported() suggests it should
return a boolean value, however it doesn't. So update the name to be
a bit less confusing and update the doco too.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
128bc5fced MSI: Consolidate precondition checks
Consolidate precondition checks into a single if statement.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
b1e2303dba MSI: Expand pci_msi_supported()
pci_enable_msi() and pci_enable_msix() both search for the MSI/MSI-X
capability, we can fold this into pci_msi_supported() by passing the
type in.

Update the code to match the comment for pci_msi_supported(). That is
it returns 0 on success, and anything else indicates an error.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
3e916c0503 MSI: Remove msi_cache
We don't need a special cache just for msi descriptors. They're not
particularly large, under 100 bytes for sure, and don't seem to require any
special alignment etc. On most systems there will be relatively few MSIs,
and hence we waste most of a page on the cache. Better to just kzalloc the
space for the few we do need.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
4cc086fa5b MSI: Move EXPORT_SYMBOL()s near their definition
Move EXPORT_SYMBOL()s near their definition.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
7ede9c1fa5 MSI: Consolidate BUG_ON()s.
When freeing MSIs and MSI-Xs, we BUG_ON() if the irq has not been
freed, ie. if it still has an action. We can consolidate all of these
BUG_ON()s into msi_free_irqs() as all the code paths lead there almost
immediately anyway.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
fc4afc7b2b MSI: Consolidate MSI-X irq freeing code
For the MSI-X case we do exactly the same logic in pci_disable_msix() and
msi_remove_pci_irq_vectors(), so consolidate them.

msi_remove_pci_irq_vectors() wasn't setting dev->first_msi_irq to 0, but
I think it should have been, so the consolidated version does.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
00ba16ab26 MSI: Simplify BUG() handling in msi_remove_pci_irq_vectors() part 2
Although it might be nice to do a printk before BUG'ing, it's really not
necessary, and it complicates the code.

The behaviour has changed slightly, in that before we set a flag if the irq
had an action, and continued freeing the other irqs. But as I see it that's
all irrelevant because we end up BUG'ing anyway.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
c31af39870 MSI: Simplify BUG() handling in msi_remove_pci_irq_vectors() part 1
Although it might be nice to do a printk before BUG'ing, it's really not
necessary, and it complicates the code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
54bc6c0b0e MSI: Simplify BUG() handling in pci_disable_msix()
Although it might be nice to do a printk before BUG'ing, it's really not
necessary, and it complicates the code.

The behaviour has changed slightly, in that before we set a flag if the irq
had an action, and continued freeing the other irqs. But as I see it that's
all irrelevant because we end up BUG'ing anyway.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:36 -07:00
Michael Ellerman
e387b9eefe MSI: Simplify BUG() handling in pci_disable_msi()
Although it might be nice to do a printk before BUG'ing, it's really not
necessary, and it complicates the code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:35 -07:00
Mitch Williams
988cbb15e0 PCI: Flush MSI-X table writes
This patch fixes a kernel bug which is triggered when using the
irqbalance daemon with MSI-X hardware.

Because both MSI-X interrupt messages and MSI-X table writes are posted,
it's possible for them to cross while in-flight.  This results in
interrupts being received long after the kernel thinks they're disabled,
and in interrupts being sent to stale vectors after rebalancing.

This patch performs a read flush after writes to the MSI-X table for
mask and unmask operations.  Since the SMP affinity is set while
the interrupt is masked, and since it's unmasked immediately after,
no additional flushes are required in the various affinity setting
routines.

This patch has been validated with (unreleased) network hardware which
uses MSI-X.

Revised with input from Eric Biederman.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:34 -07:00
Eric W. Biederman
348e3fd194 [PATCH] msi: synchronously mask and unmask msi-x irqs.
This is a simplified and actually more comprehensive form of a bug
fix from Mitch Williams <mitch.a.williams@intel.com>.

When we mask or unmask an msi-x irq, the writes may be posted because
we are writing to a memory mapped region.  This means the mask and
unmask don't happen immediately but at some unspecified time in the
future, which is out of sync with how the mask/unmask logic works
for ioapic irqs.

The practical result is that we get very subtle and hard to track down
irq migration bugs.

This patch performs a read flush after writes to the MSI-X table for mask
and unmask operations.  Since the SMP affinity is set while the interrupt
is masked, and since it's unmasked immediately after, no additional flushes
are required in the various affinity setting routines.

The testing by Mitch Williams on his especially problematic system should
still be valid as I have only simplified the code, not changed the
functionality.

We currently have 7 drivers: cciss, mthca, cxgb3, forcedeth, s2io,
pcie/portdrv_core, and qla2xxx in 2.6.21 that are affected by this
problem when the hardware they drive is plugged into the right slot.

Given the difficulty of reproducing this bug and tracing it down to
anything that even remotely resembles a cause, even if people are
being affected we aren't likely to see many meaningful bug reports, and
the people who see this bug aren't likely to be able to reproduce this
bug in a timely fashion.  So it is best to get this problem fixed
as soon as we can so people don't have problems.

Then if people do have a kernel message stating "No irq for vector" we
will know it is yet another novel cause that needs a complete new
investigation.

Cc: Greg KH <greg@kroah.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-03 14:02:49 -07:00
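The read flush described in 988cbb15e0 and 348e3fd194 above follows a common pattern: write the register, then read it back so the posted write must complete before execution continues. The sketch below models this in userspace with a volatile pointer; msix_set_mask() and VECTOR_CTRL_MASKBIT are names assumed for illustration, and in the kernel the accesses would go through writel() and readl() on the mapped MSI-X table.

```c
#include <stdint.h>

/* Bit 0 of an MSI-X table entry's Vector Control word masks the vector. */
#define VECTOR_CTRL_MASKBIT 0x1u

/* Hypothetical helper: set or clear the mask bit, then read the register
 * back. On real hardware the read forces the posted write to reach the
 * device before the function returns. */
static void msix_set_mask(volatile uint32_t *vector_ctrl, int masked)
{
	uint32_t v = *vector_ctrl;

	if (masked)
		v |= VECTOR_CTRL_MASKBIT;
	else
		v &= ~VECTOR_CTRL_MASKBIT;

	*vector_ctrl = v;	/* this write may be posted */
	(void)*vector_ctrl;	/* read back to flush it */
}
```

Ordinary memory, as used in this sketch, has no posting behaviour, so the read only documents the intent; the ordering guarantee comes from the bus rules for reads following writes to the same device.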
Eric W. Biederman
392ee1e6dd [PATCH] msi: Safer state caching.
There are two ways pci_save_state and pci_restore_state are used.  As
helper functions during suspend/resume, and as helper functions around
a hardware reset event.  When used as helper functions around a hardware
reset event there is no reason to believe the calls will be paired, nor
is there a good reason to believe that if we restore the msi state from
before the reset that it will match the current msi state.  Since arch
code may change the msi message without going through the driver, drivers
currently do not have enough information to even know when to call
pci_save_state to ensure they will have msi state in sync with the other
kernel irq reception data structures.

It turns out the solution is straightforward: cache the state in the
existing msi data structures (not the magic pci saved things) and
have the msi code update the cached state each time we write to the hardware.
This means we never need to read the hardware to figure out what the hardware
state should be.

By modifying the caching in this manner we get to remove our save_state
routines and only need to provide restore_state routines.

The only fields that were at all tricky to regenerate were the msi and msi-x
control registers, and the way we regenerate them currently is a bit dependent
upon assumptions about how we allow the msi registers to be configured and used,
making the code a little bit brittle.  If we ever change what cases we allow
or how we configure the msi bits we can address the fragility then.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-12 16:31:50 -07:00
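The caching scheme described in 392ee1e6dd above can be sketched as a write-through cache: every write to the hardware also updates a software copy, so a restore never needs to read the device. The types and function names below (struct msg, struct entry, write_msg(), device_reset(), restore_msg()) are assumptions for this illustration, and the hw field merely stands in for device registers.

```c
#include <stdint.h>

/* Hypothetical MSI message: address and data the device will signal with. */
struct msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct entry {
	struct msg hw;		/* stands in for the device's registers */
	struct msg cached;	/* last value software wrote */
};

/* Every write goes to the hardware and to the cache. */
static void write_msg(struct entry *e, const struct msg *m)
{
	e->hw = *m;
	e->cached = *m;
}

/* Model of a reset that wipes the device's MSI state. */
static void device_reset(struct entry *e)
{
	e->hw = (struct msg){ 0 };
}

/* Restore from the cache only; no prior save step and no hardware read. */
static void restore_msg(struct entry *e)
{
	e->hw = e->cached;
}
```

Because the cache is updated on each write, a restore after an unpaired reset still reflects the latest message, including changes made by arch code without the driver's involvement.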
Eric W. Biederman
58e0543e8f [PATCH] msi: support masking msi irqs without a mask bit
For devices that do not support msi-x we only support 1 interrupt.  Therefore
we can disable that one interrupt by disabling the msi capability itself.  If
we leave the intx interrupts disabled while we have the msi capability
disabled no interrupts should be delivered from that device.

Devices with just the minimal msi support (and thus hitting this code path)
include things like the intel e1000 nic, so it looks like this is going to be a
fairly common case and thus important to get right.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Eric W. Biederman
b1cbf4e4dd [PATCH] msi: fix up the msi enable/disable logic
enable/disable_msi_mode have several side effects which keep them from being
generally useful.  So this patch replaces them with two much more
targeted functions: msi_set_enable and msix_set_enable.

This patch makes pci_dev->msi_enabled and pci_dev->msix_enabled the definitive
way to test if linux has enabled the msi capability, and has the appropriate
msi data structures set up.

This patch ensures that while writing the msi messages in save/restore and
during device initialization we have the msi capability disabled so we don't
get into races.  The pci spec requires that we do not have the msi capability
enabled and the msi messages unmasked while we write the messages.  Completely
disabling the capability is overkill but it is easy :)

Care has been taken so we never have both a msi capability and intx enabled
simultaneously.  We haven't run into a problem yet but better safe than sorry.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Eric W. Biederman
f5f2b13129 [PATCH] msi: sanely support hardware level msi disabling
In some cases when we are not using msi we need a way to ensure that the
hardware does not have an msi capability enabled.  Currently the code has been
calling disable_msi_mode to try and achieve that.  However disable_msi_mode
has several other side effects and is only available when msi support is
compiled in so it isn't really appropriate.

Instead this patch implements pci_msi_off which disables all msi and msix
capabilities unconditionally with no additional side effects.

pci_disable_device was redundantly clearing the bus master enable flag and
clearing the msi enable bit.  A device that is not allowed to perform bus
mastering operations cannot generate intx or msi interrupt messages as those
are essentially a special case of dma, and require bus mastering.  So the call
in pci_disable_device to disable msi capabilities was redundant.

quirk_pcie_pxh also called disable_msi_mode and is updated to use pci_msi_off.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-03-05 07:57:50 -08:00
Eric W. Biederman
f7feaca77d msi: Make MSI useable more architectures
The arch hooks arch_setup_msi_irq and arch_teardown_msi_irq are now
responsible for allocating and freeing the linux irq in addition to
setting up the linux irq to work with the interrupt.

arch_setup_msi_irq now takes a pci_device and a msi_desc and returns
an irq.

With this change in place this code should be useable by all platforms
except those that won't let the OS touch the hardware like ppc RTAS.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:08 -08:00
Eric W. Biederman
5b912c108c msi: Kill the msi_desc array.
We need to be able to get from an irq number to a struct msi_desc.
The msi_desc array in msi.c had several shortcomings; the big one was
that it could not be used outside of msi.c.  Using irq_data in struct
irq_desc almost worked, except on some architectures irq_data needs to
be used for something else.

So this patch adds a msi_desc pointer to irq_desc, adds the appropriate
wrappers and changes all of the msi code to use them.

The dynamic_irq_init/cleanup code was tweaked to ensure the new
field is left in a well defined state.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:08 -08:00
Eric W. Biederman
1c659d61cf msi: Remove attach_msi_entry.
The attach_msi_entry function has been reduced to a single simple
assignment, so for simplicity remove the abstraction and directly
perform the assignment.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:08 -08:00
Eric W. Biederman
866a8c87c4 msi: Fix msi_remove_pci_irq_vectors.
Since msi_remove_pci_irq_vectors is designed to be called during
hotplug remove it is actively wrong to query the hardware and expect
meaningful results back.

To that end remove the pci_find_capability calls.  Testing
dev->msi_enabled and dev->msix_enabled gives us all of the information
we need.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:07 -08:00
Eric W. Biederman
d40f540ce6 msi: Remove msi_lock.
With the removal of msi_lookup_irq all of the functions using msi_lock
operated on a single device and none of them could reasonably be
called on that device at the same time.

Since what little synchronization is needed has to happen outside of
the msi functions, msi_lock could never be contended and as such is
useless and just complicates the code.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:07 -08:00
Eric W. Biederman
ded86d8d37 msi: Kill msi_lookup_irq
The function msi_lookup_irq was horrible.  As a side effect of running,
it changed dev->irq, and then the callers would need to change it
back.  In addition it does a global scan through all of the irqs,
which seems to be the sole justification of the msi_lock.

To remove the need for msi_lookup_irq I added first_msi_irq to struct
pci_dev.  Then depending on the context I replaced msi_lookup_irq with
dev->first_msi_irq, dev->msi_enabled, or dev->msix_enabled.

msi_enabled and msix_enabled were already present in pci_dev for other
reasons.
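
A rough sketch of the replacement pattern, assuming a cut-down device
structure: the answer comes from per-device fields, so nothing is scanned
globally and dev->irq is never touched.  The sketch_ names and the -1
return convention are hypothetical, not taken from the patch.

```c
/* Simplified stand-in for the relevant fields of struct pci_dev. */
struct sketch_pci_dev {
	unsigned int irq;		/* legacy irq, left untouched here */
	unsigned int first_msi_irq;	/* cached when MSI/MSI-X is set up */
	unsigned int msi_enabled:1;
	unsigned int msix_enabled:1;
};

/*
 * Return the first MSI irq of the device, or -1 when neither MSI nor
 * MSI-X is enabled.  The device is const: no side effects on dev->irq.
 */
int sketch_first_msi_irq(const struct sketch_pci_dev *dev)
{
	if (!dev->msi_enabled && !dev->msix_enabled)
		return -1;
	return (int)dev->first_msi_irq;
}
```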

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:07 -08:00
Michael Ellerman
8fed4b6523 MSI: Combine pci_(save|restore)_msi/msix_state
The PCI save/restore code doesn't need to care about MSI vs MSI-X, all
it really wants is to say "save/restore all MSI(-X) info for this device".

This is borne out in the code: we call the MSI and MSI-X save routines
side by side, and similarly with the restore routines.

So combine the MSI/MSI-X routines into pci_save_msi_state() and
pci_restore_msi_state(). It is up to those routines to decide what state
needs to be saved.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:07 -08:00
Michael Ellerman
0fcfdabbdb MSI: Remove pci_scan_msi_device()
pci_scan_msi_device() doesn't do anything anymore, so remove it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:07 -08:00
Michael Ellerman
88187dfa4d MSI: Replace pci_msi_quirk with calls to pci_no_msi()
I don't see any reason why we need pci_msi_quirk; quirk code can just
call pci_no_msi() instead.

Remove the check of pci_msi_quirk in msi_init(). This is safe as all
calls to msi_init() are protected by calls to pci_msi_supported(),
which checks pci_msi_enable, which is disabled by pci_no_msi().

The pci_disable_msi routines didn't check pci_msi_quirk, only
pci_msi_enable, but as far as I can see that was a bug not a feature.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:06 -08:00
Satoru Takeuchi
c54c187907 PCI: cleanup MSI code
Cleanup MSI code as follows:

 - fix some types
 - fix strange local variable definition
 - delete unnecessary blank line
 - add comment to #endif which is far from corresponding #ifdef

Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-02-07 15:50:06 -08:00
Linus Torvalds
7f3af60e5a Merge branch 'intx' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6
* 'intx' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
  PCI MSI: always toggle legacy-INTx-enable bit upon MSI entry/exit
2006-12-07 15:04:20 -08:00