187 Commits

Author SHA1 Message Date
Lukas Wunner
55a6b7a657 PCI: pciehp: Drop slot workqueue
Previously the slot workqueue was used to handle events and enable or
disable the slot.  That's no longer the case as those tasks are done
synchronously in the IRQ thread.  The slot workqueue is thus merely used
to handle a button press after the 5 second delay and only one such work
item may be in flight at any given time.  A separate workqueue isn't
necessary for this simple task, so use the system workqueue instead.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-07-23 17:04:13 -05:00
Lukas Wunner
0e94916e60 PCI: pciehp: Handle events synchronously
Up until now, pciehp's IRQ handler schedules a work item for each event,
which in turn schedules a work item to enable or disable the slot.  This
double indirection was necessary because sleeping wasn't allowed in the
IRQ handler.

However, sleeping is now allowed because pciehp has been converted to
threaded IRQ handling and polling, so handle events synchronously in
pciehp_ist() and remove the work item infrastructure (with the exception
of work items to handle a button press after the 5 second delay).

For link or presence change events, move the register read to determine
the current link or presence state behind acquisition of the slot lock
to prevent it from becoming stale while the lock is contended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-07-23 17:04:12 -05:00
Lukas Wunner
ec07a44730 PCI: pciehp: Convert to threaded polling
We've just converted pciehp to threaded IRQ handling, but still cannot
sleep in pciehp_ist() because the function is also called in poll mode,
which runs in softirq context (from a timer).

Convert poll mode to a kthread so that pciehp_ist() always runs in task
context.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
2018-07-23 17:04:12 -05:00
Lukas Wunner
7b4ce26bcf PCI: pciehp: Convert to threaded IRQ
pciehp's IRQ handler queues up a work item for each event signaled by
the hardware.  A more modern alternative is to let a long running
kthread service the events.  The IRQ handler's sole job is then to check
whether the IRQ originated from the device in question, acknowledge its
receipt to the hardware to quiesce the interrupt and wake up the kthread.

One benefit is reduced latency to handle the IRQ, which is a necessity
for realtime environments.  Another benefit is that we can make pciehp
simpler and more robust by handling events synchronously in process
context, rather than asynchronously by queueing up work items.  pciehp's
usage of work items is a historic artifact: it predates the introduction
of threaded IRQ handlers by two years.  (The former was introduced in
2007 with commit 5d386e1ac4 ("pciehp: Event handling rework"), the
latter in 2009 with commit 3aa551c9b4 ("genirq: add threaded interrupt
handler support").)

Convert pciehp to threaded IRQ handling by retrieving the pending events
in pciehp_isr(), saving them for later consumption by the thread handler
pciehp_ist() and clearing them in the Slot Status register.

By clearing the Slot Status (and thereby acknowledging the events) in
pciehp_isr(), we can avoid requesting the IRQ with IRQF_ONESHOT, which
would have the unpleasant side effect of starving devices sharing the
IRQ until pciehp_ist() has finished.

pciehp_isr() does not count how many times each event occurred, but
merely records the fact *that* an event occurred.  If the same event
occurs a second time before pciehp_ist() is woken, that second event
will not be recorded separately, which is problematic according to
commit fad214b0aa ("PCI: pciehp: Process all hotplug events before
looking for new ones") because we may miss removal of a card in-between
two back-to-back insertions.  We're about to make pciehp_ist() resilient
to missed events.  The present commit regresses the driver's behavior
temporarily in order to separate the changes into reviewable chunks.
This doesn't affect regular slow-motion hotplug, only plug-unplug-plug
operations that happen in a timespan shorter than wakeup of the IRQ
thread.
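
The split between pciehp_isr() and pciehp_ist() described above can be
modeled in user space.  The following is a minimal Python sketch, not the
driver's code: the class and method names are hypothetical, a lock stands
in for whatever atomic primitive the driver uses, and only the Slot Status
bit values are taken from the kernel's pci_regs.h.  It illustrates why a
repeated event is not recorded separately.

```python
import threading

# Slot Status event bits (values as in include/uapi/linux/pci_regs.h).
PCI_EXP_SLTSTA_ABP = 0x0001    # Attention Button Pressed
PCI_EXP_SLTSTA_PDC = 0x0008    # Presence Detect Changed
PCI_EXP_SLTSTA_DLLSC = 0x0100  # Data Link Layer State Changed

EVENT_MASK = PCI_EXP_SLTSTA_ABP | PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC


class PendingEvents:
    """Hypothetical stand-in for a controller's pending-event word."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = 0

    def isr(self, slot_status):
        """Hard IRQ half: record which events fired.

        Returns True if the thread half should be woken, False if the
        interrupt did not come from this device.
        """
        events = slot_status & EVENT_MASK
        if not events:
            return False
        with self._lock:
            # A repeated event just sets the same bit again; it is not counted.
            self._pending |= events
        return True

    def ist(self):
        """Thread half: fetch and clear everything recorded so far."""
        with self._lock:
            events, self._pending = self._pending, 0
        return events
```

Calling isr() twice with PCI_EXP_SLTSTA_PDC before ist() runs leaves a
single PDC bit pending, which is the plug-unplug-plug case the message
above describes.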

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
2018-07-23 17:04:12 -05:00
Lukas Wunner
1204e35bed PCI: pciehp: Fix unprotected list iteration in IRQ handler
Commit b440bde74f ("PCI: Add pci_ignore_hotplug() to ignore hotplug
events for a device") iterates over the devices on a hotplug port's
subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem.
It is thus possible for a user to cause a crash by concurrently
manipulating the device list, e.g. by disabling slot power via sysfs
on a different CPU or by initiating a remove/rescan via sysfs.

This can't be fixed by acquiring pci_bus_sem because it may sleep.
The simplest fix is to avoid the list iteration altogether and just
check the ignore_hotplug flag on the port itself.  This works because
pci_ignore_hotplug() sets the flag on both the device and its
parent bridge.

We do lose the ability to print the name of the device blocking hotplug
in the debug message, but that's probably bearable.

Fixes: b440bde74f ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2018-07-23 17:04:10 -05:00
Lukas Wunner
281e878eab PCI: pciehp: Fix use-after-free on unplug
When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the
hotplug_slot struct is deregistered and thus freed before freeing the
IRQ.  The IRQ handler and the work items it schedules print the slot
name referenced from the freed structure in various informational and
debug log messages, each time resulting in a quadruple dereference of
freed pointers (hotplug_slot -> pci_slot -> kobject -> name).

At best the slot name is logged as "(null)", at worst kernel memory is
exposed in logs or the driver crashes:

  pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present

An attacker may provoke the bug by unplugging multiple devices on a
Thunderbolt daisy chain at once.  Unplugging can also be simulated by
powering down slots via sysfs.  The bug is particularly easy to trigger
in poll mode.

It has been present since the driver's introduction in 2004:
https://git.kernel.org/tglx/history/c/c16b4b14d980

Fix by rearranging teardown such that the IRQ is freed first.  Run the
work items queued by the IRQ handler to completion before freeing the
hotplug_slot struct by draining the work queue from the ->release_slot
callback which is invoked by pci_hp_deregister().

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.6.4
2018-07-23 17:04:10 -05:00
Bjorn Helgaas
f64c146410 Merge branch 'pci/hotplug'
  - fix use-before-set error in ibmphp (Dan Carpenter)

  - fix pciehp timeouts caused by Command Completed errata (Bjorn Helgaas)

  - fix refcounting in pnv_php hotplug (Julia Lawall)

  - clear pciehp Presence Detect and Data Link Layer Status Changed on
    resume so we don't miss hotplug events (Mika Westerberg)

  - only request pciehp control if we support it, so platform can use ACPI
    hotplug otherwise (Mika Westerberg)

  - convert SHPC to be builtin only (Mika Westerberg)

  - request SHPC control via _OSC if we support it (Mika Westerberg)

  - simplify SHPC handoff from firmware (Mika Westerberg)

* pci/hotplug:
  PCI: Improve "partially hidden behind bridge" log message
  PCI: Improve pci_scan_bridge() and pci_scan_bridge_extend() doc
  PCI: Move resource distribution for single bridge outside loop
  PCI: Account for all bridges on bus when distributing bus numbers
  ACPI / hotplug / PCI: Drop unnecessary parentheses
  ACPI / hotplug / PCI: Mark stale PCI devices disconnected
  ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug
  PCI: hotplug: Add hotplug_is_native()
  PCI: shpchp: Add shpchp_is_native()
  PCI: shpchp: Fix AMD POGO identification
  PCI: shpchp: Use dev_printk() for OSHP-related messages
  PCI: shpchp: Remove get_hp_hw_control_from_firmware() wrapper
  PCI: shpchp: Remove acpi_get_hp_hw_control_from_firmware() flags
  PCI: shpchp: Rely on previous _OSC results
  PCI: shpchp: Request SHPC control via _OSC when adding host bridge
  PCI: shpchp: Convert SHPC to be builtin only
  PCI: pciehp: Make pciehp_is_native() stricter
  PCI: pciehp: Rename host->native_hotplug to host->native_pcie_hotplug
  PCI: pciehp: Request control of native hotplug only if supported
  PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
  PCI: pnv_php: Add missing of_node_put()
  PCI: pciehp: Add quirk for Command Completed errata
  PCI: Add Qualcomm vendor ID
  PCI: ibmphp: Fix use-before-set in get_max_bus_speed()

# Conflicts:
#	drivers/acpi/pci_root.c
2018-06-06 16:10:10 -05:00
Mika Westerberg
13c65840fe PCI: pciehp: Clear Presence Detect and Data Link Layer Status Changed on resume
After a suspend/resume cycle the Presence Detect or Data Link Layer Status
Changed bits might be set.  If we don't clear them, those events will not
fire anymore and nothing happens, for instance, when a device is now
hot-unplugged.

Fix this by clearing those bits in a newly introduced function
pcie_reenable_notification().  This should be fine because immediately
after, we check if the adapter is still present by reading directly from
the status register.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@vger.kernel.org
2018-05-23 17:42:53 -05:00
Oza Pawandeep
9f5a70f18c PCI: Add generic pcie_wait_for_link() interface
Clients such as hotplug and Downstream Port Containment (DPC) both need to
wait until a link becomes active or inactive.

Add a generic pcie_wait_for_link() interface and use it instead of
duplicating the code.
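
The kind of wait loop that such a shared helper factors out can be
sketched as follows.  This is a Python model under stated assumptions, not
the kernel function: wait_for_link() and its parameters are hypothetical
names, and the 1000 ms timeout and 10 ms poll interval are illustrative
defaults rather than values taken from this commit.

```python
import time


def wait_for_link(read_link_active, want_active,
                  timeout_ms=1000, poll_ms=10, sleep=time.sleep):
    """Poll until the link reaches the wanted state.

    read_link_active: callable returning True while the link is active.
    want_active:      True to wait for link up, False to wait for link down.
    Returns True if the wanted state was seen before the timeout.
    """
    waited = 0
    while True:
        if read_link_active() == want_active:
            return True
        if waited >= timeout_ms:
            return False
        sleep(poll_ms / 1000.0)
        waited += poll_ms
```

Both a hotplug caller (waiting for link up after power-on) and a DPC
caller (waiting for link down) could share this one loop by passing a
different want_active value.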

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
2018-05-17 16:44:11 -05:00
Bjorn Helgaas
d22b362184 PCI: pciehp: Add quirk for Command Completed errata
Several PCIe hotplug controllers have errata that mean they do not set the
Command Completed bit unless writes to the Slot Control register change
"Control" bits.  Command Completed is never set for writes that only change
software notification "Enable" bits.  This results in timeouts like this:

  pciehp 0000:00:1c.0:pcie004: Timeout on hotplug command 0x1038 (issued 65284 msec ago)

When this erratum is present, avoid these timeouts by marking commands
"completed" immediately unless they change the "Control" bits.

Here's the text of the Intel erratum CF118.  We assume this applies to all
Intel parts:

  CF118        PCIe Slot Status Register Command Completed bit not always
               updated on any configuration write to the Slot Control
               Register

  Problem:     For PCIe root ports (devices 0 - 10) supporting hot-plug,
               the Slot Status Register (offset AAh) Command Completed
               (bit[4]) status is updated under the following condition:
               IOH will set Command Completed bit after delivering the new
               commands written in the Slot Controller register (offset
               A8h) to VPP. The IOH detects new commands written in Slot
               Control register by checking the change of value for Power
               Controller Control (bit[10]), Power Indicator Control
               (bits[9:8]), Attention Indicator Control (bits[7:6]), or
               Electromechanical Interlock Control (bit[11]) fields. Any
               other configuration writes to the Slot Control register
               without changing the values of these fields will not cause
               Command Completed bit to be set.

               The PCIe Base Specification Revision 2.0 or later describes
               the “Slot Control Register” in section 7.8.10, as follows
               (Reference section 7.8.10, Slot Control Register, Offset
               18h). In hot-plug capable Downstream Ports, a write to the
               Slot Control register must cause a hot-plug command to be
               generated (see Section 6.7.3.2 for details on hot-plug
               commands). A write to the Slot Control register in a
               Downstream Port that is not hotplug capable must not cause a
               hot-plug command to be executed.

               The PCIe Spec intended that every write to the Slot Control
               Register is a command and expected a command complete status
               to abstract the VPP implementation specific nuances from the
               OS software. IOH PCIe Slot Control Register implementation
               is not fully conforming to the PCIe Specification in this
               respect.

  Implication: Software checking on the Command Completed status after
               writing to the Slot Control register may time out.

  Workaround:  Software can read the Slot Control register and compare the
               existing and new values to determine if it should check the
               Command Completed status after writing to the Slot Control
               register.
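
The workaround in the erratum amounts to comparing the old and new Slot
Control values against the four fields it lists.  A minimal Python sketch
follows; the function name is hypothetical and the masks are derived only
from the bit positions quoted in the erratum text above.

```python
# Slot Control fields named in the erratum, with the quoted bit positions.
ATTENTION_INDICATOR_CONTROL = 0x00C0          # bits[7:6]
POWER_INDICATOR_CONTROL = 0x0300              # bits[9:8]
POWER_CONTROLLER_CONTROL = 0x0400             # bit[10]
ELECTROMECHANICAL_INTERLOCK_CONTROL = 0x0800  # bit[11]

CONTROL_BITS = (ATTENTION_INDICATOR_CONTROL
                | POWER_INDICATOR_CONTROL
                | POWER_CONTROLLER_CONTROL
                | ELECTROMECHANICAL_INTERLOCK_CONTROL)


def must_wait_for_command_completed(old_slot_control, new_slot_control):
    """Return True if the write changes any "Control" field.

    On affected hardware only such writes set Command Completed, so only
    they are worth waiting for; other writes can be treated as completed
    immediately.
    """
    changed = old_slot_control ^ new_slot_control
    return bool(changed & CONTROL_BITS)
```

For example, the command value 0x1038 from the timeout message above has
no bits inside CONTROL_BITS, so a write of 0x1038 over 0x0000 would not be
waited for under this scheme.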

Per Sinan, the Qualcomm QDF2400 controller also does not set the Command
Completed bit unless writes to the Slot Control register change "Control"
bits.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Link: https://lkml.kernel.org/r/8770820b-85a0-172b-7230-3a44524e6c9f@molgen.mpg.de
Reported-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de>	# Lenovo X60
Tested-by: Paul Menzel <pmenzel+linux-pci@molgen.mpg.de>	# Lenovo X60
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>		# Qcom quirk
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2018-05-07 16:25:08 -05:00
Bjorn Helgaas
ab8c609356 Merge branch 'pci/spdx' into next
* pci/spdx:
  PCI: Add SPDX GPL-2.0+ to replace implicit GPL v2 or later statement
  PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate
  PCI: Add SPDX GPL-2.0 to replace COPYING boilerplate
  PCI: Add SPDX GPL-2.0 to replace GPL v2 boilerplate
  PCI: Add SPDX GPL-2.0 when no license was specified
2018-02-01 11:40:07 -06:00
Bjorn Helgaas
412ee7cd3d Merge branch 'pci/misc' into next
* pci/misc:
  PCI: Add dummy pci_irqd_intx_xlate() for CONFIG_PCI=n build
  PCI: Add wrappers for dev_printk()
  PCI: Remove unnecessary messages for memory allocation failures
  PCI: Add #defines for Completion Timeout Disable feature
  hinic: Replace PCI pool old API
  net: e100: Replace PCI pool old API
  block: DAC960: Replace PCI pool old API
  MAINTAINERS: Include more PCI files
  PCI: Remove unneeded kallsyms include
  powerpc/pci: Unroll two pass loop when scanning bridges
  powerpc/pci: Use for_each_pci_bridge() helper
2018-01-31 10:10:32 -06:00
Bjorn Helgaas
736759ef59 PCI: Add SPDX GPL-2.0+ to replace GPL v2 or later boilerplate
Add SPDX GPL-2.0+ to all PCI files that specified the GPL and allowed
either GPL version 2 or any later version.

Remove the boilerplate GPL version 2 or later language, relying on the
assertion in b24413180f ("License cleanup: add SPDX GPL-2.0 license
identifier to files with no license") that the SPDX identifier may be used
instead of the full boilerplate text.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-28 15:49:06 -06:00
Lukas Wunner
493fb50e95 PCI: pciehp: Assume NoCompl+ for Thunderbolt ports
Certain Thunderbolt 1 controllers claim to support Command Completed events
(value of 0b in the No Command Completed Support field of the Slot
Capabilities register) but in reality they neither set the Command
Completed bit in the Slot Status register nor signal a Command Completed
interrupt:

  8086:1513  CV82524  [Light Ridge 4C  2010]
  8086:151a  DSL2310  [Eagle Ridge 2C  2011]
  8086:151b  CVL2510  [Light Peak 2C   2010]
  8086:1547  DSL3510  [Cactus Ridge 4C 2012]
  8086:1548  DSL3310  [Cactus Ridge 2C 2012]
  8086:1549  DSL2210  [Port Ridge 1C   2011]

All known newer chips (Redwood Ridge and onwards) set No Command Completed
Support, indicating that they do not support Command Completed events.

The user-visible impact is that after unplugging such a device, 2 seconds
elapse until pciehp is unbound.  That's because on ->remove,
pcie_write_cmd() is called via pcie_disable_notification() and every call
to pcie_write_cmd() takes 2 seconds (1 second for each invocation of
pcie_wait_cmd()):

  [  337.942727] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x1038 (issued 21176 msec ago)
  [  340.014735] pciehp 0000:0a:00.0:pcie204: Timeout on hotplug command 0x0000 (issued 2072 msec ago)

That by itself has always been unpleasant, but the situation has become
worse with commit cc27b735ad ("PCI/portdrv: Turn off PCIe services during
shutdown"):  Now pciehp is unbound on ->shutdown.  Because Thunderbolt
controllers typically have 4 hotplug ports, every reboot and shutdown is
now delayed by 8 seconds, plus another 2 seconds for every attached
Thunderbolt 1 device.

Thunderbolt hotplug slots are not physical slots that one inserts cards
into, but rather logical hotplug slots implemented in silicon.  Devices
appear beyond those logical slots once a PCI tunnel is established on top
of the Thunderbolt Converged I/O switch.  One would expect commands written
to the Slot Control register to be executed immediately by the silicon, so
for simplicity we always assume NoCompl+ for Thunderbolt ports.

Fixes: cc27b735ad ("PCI/portdrv: Turn off PCIe services during shutdown")
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org	# v4.12+
Cc: Sinan Kaya <okaya@codeaurora.org>
Cc: Yehezkel Bernat <yehezkel.bernat@intel.com>
Cc: Michael Jamet <michael.jamet@intel.com>
Cc: Andreas Noever <andreas.noever@gmail.com>
2018-01-23 14:28:41 -06:00
Markus Elfring
c7abb2352c PCI: Remove unnecessary messages for memory allocation failures
Per ebfdc40969 ("checkpatch: attempt to find unnecessary 'out of memory'
messages"), when a memory allocation fails, the memory subsystem emits
generic "out of memory" messages (see slab_out_of_memory() for some of this
logging).  Therefore, additional error messages in the caller don't add
much value.

Remove messages that merely report "out of memory".

This preserves some messages that report additional information, e.g.,
allocation failures that mean we drop hotplug events.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[bhelgaas: changelog, squash patches, make similar changes to acpiphp,
cpqphp, ibmphp, keep warning when dropping hotplug event]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-01-17 08:41:41 -06:00
Mika Westerberg
db63d40017 PCI: pciehp: Do not clear Presence Detect Changed during initialization
It is possible that the hotplug event has already happened before the
driver is attached to a PCIe hotplug downstream port. If we just clear the
status, we never get the hotplug interrupt and thus the event will be
missed.

To make sure that does not happen, we leave the Presence Detect Changed
bit untouched during initialization. Then once the event is unmasked we
get an interrupt and handle the hotplug event properly.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-11-06 18:49:00 -06:00
Mika Westerberg
499022396a PCI: pciehp: Fix race condition handling surprise link down
A surprise link down may retrain very quickly, causing the same slot to
generate a link up event before handling the link down event completes.

Since the link is active, the power off work queued from the first link
down will cause a second down event when power is disabled. However, the
link up event sets the slot state to POWERON_STATE before the event to
handle this is enqueued, making the second down event believe it needs to
do something.

This creates a constant link up and down event cycle.

To prevent this it is better to handle events one at a time in the order
they occurred, so change the driver to use an ordered workqueue instead.

A normal device hotplug triggers two events (presence detect and link up)
that are already handled properly in the driver but we currently log an
error if we find an existing device in the slot. Since this is not an
error, change the log level to debug instead to avoid scaring users.

This is based on the original work by Ashok Raj.

Link: https://patchwork.kernel.org/patch/9469023
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2017-11-06 18:49:00 -06:00
Kees Cook
c4459a0867 PCI: pciehp: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly. This fixes what appears to be a bug
in passing the wrong pointer to the timer handler (address of ctrl pointer
instead of ctrl pointer).

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
2017-11-06 18:48:57 -06:00
Keith Busch
7612b3b28c PCI: pciehp: Report power fault only once until we clear it
When a power fault occurs, the power controller sets Power Fault Detected
in the Slot Status register, and pciehp_isr() queues an INT_POWER_FAULT
event to handle it.

It also clears Power Fault Detected, but since nothing has yet changed to
correct the power fault, the power controller will likely set it again
immediately, which may cause an infinite loop when pcie_isr() rechecks
Slot Status.

Fix that by masking off Power Fault Detected from new events if the driver
hasn't seen the power fault clear from the previous handling attempt.
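
The masking described above can be sketched in a few lines.  This is a
Python model, not the driver's code: the Slot class, new_events() and
fault_cleared() are hypothetical names, and the point at which the real
driver considers the fault cleared is not shown here.

```python
PCI_EXP_SLTSTA_PFD = 0x0002  # Power Fault Detected
PCI_EXP_SLTSTA_PDC = 0x0008  # Presence Detect Changed


class Slot:
    """Hypothetical per-slot state: has a power fault been reported already?"""

    def __init__(self):
        self.power_fault_detected = False


def new_events(slot, status):
    """Return the events to handle, reporting a power fault only once."""
    events = status & (PCI_EXP_SLTSTA_PFD | PCI_EXP_SLTSTA_PDC)
    if slot.power_fault_detected:
        # Already reported and not yet cleared: ignore the re-asserted bit.
        events &= ~PCI_EXP_SLTSTA_PFD
    elif events & PCI_EXP_SLTSTA_PFD:
        slot.power_fault_detected = True
    return events


def fault_cleared(slot):
    """Hypothetical hook for when the driver sees the fault go away."""
    slot.power_fault_detected = False
```

Because the second and later reads return no new events while the fault
persists, a loop that rechecks Slot Status until it is empty terminates.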

Fixes: fad214b0aa ("PCI: pciehp: Process all hotplug events before looking for new ones")
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog, pull test out and add comment]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Mayurkumar Patel <mayurkumar.patel@intel.com>
Cc: stable@vger.kernel.org	# 4.9+
2017-08-15 13:46:31 -05:00
Ashok Raj
385895fef6 PCI: pciehp: Prioritize data-link event over presence detect
If Slot Status indicates changes in both Data Link Layer Status and
Presence Detect, prioritize the Link status change.

When both events are observed, pciehp currently relies on the Slot Status
Presence Detect State (PDS) to agree with the Link Status Data Link Layer
Active status.  The Presence Detect State, however, may be set to 1 through
out-of-band presence detect even if the link is down, which creates
conflicting events.

Since the Link Status accurately reflects the reachability of the
downstream bus, the Link Status event should take precedence over a
Presence Detect event.  Skip checking the PDC status if we handled a link
event in the same handler.
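
The precedence rule can be expressed as a simple if/elif.  The sketch
below is a Python model with hypothetical names (handle_slot_events() and
its string results); only the two Slot Status bit values come from the
kernel's pci_regs.h.

```python
PCI_EXP_SLTSTA_PDC = 0x0008    # Presence Detect Changed
PCI_EXP_SLTSTA_DLLSC = 0x0100  # Data Link Layer State Changed


def handle_slot_events(events, link_active, card_present):
    """Return the action taken for one batch of Slot Status events.

    A link event takes precedence: if one was handled, the Presence
    Detect Changed bit in the same batch is skipped.
    """
    if events & PCI_EXP_SLTSTA_DLLSC:
        return "link up" if link_active else "link down"
    if events & PCI_EXP_SLTSTA_PDC:
        return "card present" if card_present else "card not present"
    return None
```

With both bits set, a down link and out-of-band presence still reporting
a card, this yields "link down" rather than a conflicting "card present".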

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
2016-12-07 17:00:44 -06:00
Keith Busch
576243b3f9 PCI: pciehp: Allow exclusive userspace control of indicators
PCIe hotplug supports optional Attention and Power Indicators, which are
used internally by pciehp.  Users can't control the Power Indicator, but
they can control the Attention Indicator by writing to a sysfs "attention"
file.

The Slot Control register has two bits for each indicator, and the PCIe
spec defines the encodings for each as (Reserved/On/Blinking/Off).  For
sysfs "attention" writes, pciehp_set_attention_status() maps into these
encodings, so the only useful write values are 0 (Off), 1 (On), and 2
(Blinking).

However, some platforms use all four bits for platform-specific indicators,
and they need to allow direct user control of them while preventing pciehp
from using them at all.

Add a "hotplug_user_indicators" flag to the pci_dev structure.  When set,
pciehp does not use either the Attention Indicator or the Power Indicator,
and the low four bits (values 0x0 - 0xf) of sysfs "attention" write values
are written directly to the Attention Indicator Control and Power Indicator
Control fields.
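
The two ways of interpreting a sysfs "attention" write can be contrasted
in a short Python sketch.  The function names are hypothetical; the field
positions (Attention Indicator Control in bits 7:6, Power Indicator
Control in bits 9:8) and the Reserved/On/Blinking/Off encodings follow the
PCIe Slot Control register layout described above.

```python
PCI_EXP_SLTCTL_AIC = 0x00C0  # Attention Indicator Control, bits 7:6
PCI_EXP_SLTCTL_PIC = 0x0300  # Power Indicator Control, bits 9:8

# Two-bit indicator encodings: 0 is reserved.
INDICATOR_ON = 1
INDICATOR_BLINK = 2
INDICATOR_OFF = 3


def attention_field(sysfs_value):
    """Normal mode: map 0/1/2 to Off/On/Blinking in the AIC field.

    Returns None for any other value, which has no defined meaning.
    """
    encoding = {0: INDICATOR_OFF, 1: INDICATOR_ON, 2: INDICATOR_BLINK}
    if sysfs_value not in encoding:
        return None
    return encoding[sysfs_value] << 6


def raw_indicator_fields(sysfs_value):
    """User-indicator mode: low four bits go straight into AIC and PIC."""
    return (sysfs_value & 0xF) << 6
```

In the raw mode a write of 0xf sets every bit of both fields, so all 16
combinations are available to a platform-specific scheme.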

[bhelgaas: changelog, rename flag and accessors to s/attention/indicator/]
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-09-22 18:20:11 -05:00
Bjorn Helgaas
6e49b304e3 PCI: pciehp: Clean up dmesg "Slot(%s)" messages
Print slot name consistently as "Slot(%s)".  I don't know whether that's
ideal, but we can at least do it the same way all the time.  No functional
change intended.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2016-09-14 14:25:00 -05:00
Bjorn Helgaas
4947793916 PCI: pciehp: Remove unnecessary guard
In pcie_isr(), we return early if no status bits other than
PCI_EXP_SLTSTA_CC are set.  This was introduced by dbd79aed1a ("pciehp:
fix NULL dereference in interrupt handler"), but it is no longer necessary
because all the subsequent pcie_isr() code is already predicated on a
status bit being set.

Remove the unnecessary test for ~PCI_EXP_SLTSTA_CC.  No functional change
intended.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2016-09-14 14:24:54 -05:00
Mayurkumar Patel
0c923d1da3 PCI: pciehp: Don't re-read Slot Status when queuing hotplug event
Previously we read Slot Status to learn about hotplug events, then cleared
the events, then re-read Slot Status to find out what happened.  But Slot
Status might have changed before the second read.

Capture the Slot Status once before clearing the events.  Also capture the
Link Status if we had a link status change.

[bhelgaas: changelog, split to separate patch]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2016-09-14 14:24:40 -05:00
Mayurkumar Patel
fad214b0aa PCI: pciehp: Process all hotplug events before looking for new ones
Previously we accumulated hotplug events, then processed them, essentially
like this:

  events = 0
  do {
    status = read(Slot Status)
    status &= EVENT_MASK              # only look at events
    events |= status                  # accumulate events
    write(Slot Status, events)        # clear events
  } while (status)
  process events

The problem is that as soon as we clear events in Slot Status, the hardware
may send notifications for new events, and we lose information about the
first events.  For example, we might see two Presence Detect Changed
events, but lose the fact that the slot was temporarily empty:

  read  PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS clear  # slot empty
  write PCI_EXP_SLTSTA_PDC                                # clear PDC event
  read  PCI_EXP_SLTSTA_PDC set, PCI_EXP_SLTSTA_PDS set    # slot occupied

The current code does not process a removal; it only processes the
insertion, which fails because we didn't remove the original device.

To avoid this problem, read Slot Status once and process all the events
before reading it again, like this:

  do {
    read events
    clear events
    process events
  } while (events)
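
The difference between the two loops can be reproduced with a small
simulation.  The Python sketch below is a simplified model, not the driver
code: old_loop() and new_loop() are hypothetical names, only Presence
Detect Changed is treated as an event, and each element of reads stands
for one read of Slot Status.

```python
PCI_EXP_SLTSTA_PDC = 0x0008  # Presence Detect Changed
PCI_EXP_SLTSTA_PDS = 0x0040  # Presence Detect State (set = slot occupied)


def old_loop(reads):
    """Accumulate events over all reads, then process once at the end."""
    events = 0
    last_status = 0
    for status in reads:
        if not status & PCI_EXP_SLTSTA_PDC:
            break
        events |= status & PCI_EXP_SLTSTA_PDC
        last_status = status
    actions = []
    if events & PCI_EXP_SLTSTA_PDC:
        # Only the final presence state is visible by now.
        occupied = last_status & PCI_EXP_SLTSTA_PDS
        actions.append("insert" if occupied else "remove")
    return actions


def new_loop(reads):
    """Process the events from each read before reading again."""
    actions = []
    for status in reads:
        if not status & PCI_EXP_SLTSTA_PDC:
            break
        occupied = status & PCI_EXP_SLTSTA_PDS
        actions.append("insert" if occupied else "remove")
    return actions
```

For the sequence from the example above (PDC with the slot empty, then
PDC with the slot occupied, then nothing pending), old_loop() sees only an
insertion, while new_loop() sees the removal followed by the insertion.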

[bhelgaas: changelog, add external loop around pciehp_isr()]
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mayurkumar Patel <mayurkumar.patel@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2016-09-14 14:24:31 -05:00
Bjorn Helgaas
70e8b40176 PCI: pciehp: Return IRQ_NONE when we can't read interrupt status
After 1469d17dd3 ("PCI: pciehp: Handle invalid data when reading from
non-existent devices"), we returned IRQ_HANDLED when we failed to read
interrupt status from the bridge.  I think it's better to return IRQ_NONE,
as we do in other cases where there's no interrupt pending.  This will
facilitate refactoring the loop in pcie_isr(): we'll be able to call the
ISR in a loop as long as it returns IRQ_HANDLED.

Return IRQ_NONE if we couldn't read interrupt status.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2016-09-14 14:24:25 -05:00
Bjorn Helgaas
a8499f20d3 PCI: pciehp: Rename pcie_isr() locals for clarity
Rename "detected" and "intr_loc" to "status" and "events" for clarity.
"status" is the value we read from the Slot Status register; "events" is
the set of hot-plug events we need to process.  No functional change
intended.

Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2016-09-12 12:26:07 -05:00
Lukas Wunner
ed91de7e14 PCI: pciehp: Ignore interrupts during D3cold
If a hotplug port is suspended to D3cold, its slot status register cannot
be read.  If that hotplug port happens to share its IRQ with other devices,
whenever an interrupt occurs for one of these devices, pciehp logs a
"no response from device" message and tries to read the PCI_EXP_SLTSTA
register, even though we know that will fail.

Ignore interrupts while we're in D3cold.

[bhelgaas: changelog]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-06-20 13:58:36 -05:00
Bjorn Helgaas
2db0f71f56 PCI: pciehp: Remove ignored MRL sensor interrupt events
We queued interrupt events for the MRL being opened or closed, but the code
in interrupt_event_handler() that handles these events ignored them.

Stop enabling MRL interrupts and remove the ignored events.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-08-10 14:24:09 -05:00
Jarod Wilson
1469d17dd3 PCI: pciehp: Handle invalid data when reading from non-existent devices
It's platform-dependent, but an MMIO read to a non-existent PCI device
generally returns data with all bits set.  This happens when the host
bridge or Root Complex times out waiting for a response from the device and
fabricates return data to complete the CPU's read.

One example, reported in the bugzilla below, involved this hierarchy:

  pci 0000:00:1c.0: PCI bridge to [bus 02-3a] Root Port
  pci 0000:02:00.0: PCI bridge to [bus 03-0a] Upstream Port
  pci 0000:03:03.0: PCI bridge to [bus 05-07] Downstream Port
  pci 0000:05:00.0: PCI bridge to [bus 06-07] Thunderbolt Upstream Port
  pci 0000:06:00.0: PCI bridge to [bus 07]    Thunderbolt Downstream Port
  pci 0000:07:00.0: BCM57762 NIC

Unplugging the Thunderbolt switch and the NIC below it resulted in this:

  pciehp 0000:03:03.0: Surprise Removal
  tg3 0000:07:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not clear MAC_TX_MODE=ffffffff
  pciehp 0000:06:00.0: unloading service driver pciehp
  pciehp 0000:06:00.0: pcie_isr: intr_loc 11f
  pciehp 0000:06:00.0: Switch interrupt received
  pciehp 0000:06:00.0: Latch open on Slot
  pciehp 0000:06:00.0: Attention button interrupt received
  pciehp 0000:06:00.0: Button pressed on Slot
  pciehp 0000:06:00.0: Presence/Notify input change
  pciehp 0000:06:00.0: Card present on Slot
  pciehp 0000:06:00.0: Power fault interrupt received
  pciehp 0000:06:00.0: Data Link Layer State change
  pciehp 0000:06:00.0: Link Up event

The pciehp driver correctly noticed that the Thunderbolt switch (05:00.0
and 06:00.0) and NIC (07:00.0) had been removed, and it called their driver
remove methods.

Since the NIC was already gone, tg3 received 0xffffffff when it tried to
read from the device.  The resulting timeout is a tg3 issue and not of
interest here.

Similarly, since the 06:00.0 Thunderbolt switch was already gone,
pcie_isr() received 0xffff when it tried to read PCI_EXP_SLTSTA, and pciehp
thought that was valid status showing that many events had happened: the
latch had been opened, the attention button had been pressed, a card was
now present, and the link was now up.  These are all wrong, of course, but
pciehp went on to try to power up and enumerate devices below the
non-existent bridge:

  pciehp 0000:06:00.0: PCI slot - powering on due to button press
  pciehp 0000:06:00.0: Surprise Insertion
  pci 0000:07:00.0 id reading try 50 times with interval 20 ms to get ffffffff
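
The failure above suggests a guard: treat an all-ones Slot Status value as "no response from device" rather than as a set of events. The following is a minimal sketch under stated assumptions, not the code from this patch. The helper names slot_status_is_valid() and pending_events() are hypothetical; the bit values follow the PCIe Slot Status register layout. Masking 0xffff with the six event bits gives 0x11f, which matches the intr_loc value in the log above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Slot Status event bits (PCIe Slot Status register layout). */
#define SLTSTA_ABP   0x0001u /* Attention Button Pressed */
#define SLTSTA_PFD   0x0002u /* Power Fault Detected */
#define SLTSTA_MRLSC 0x0004u /* MRL Sensor Changed */
#define SLTSTA_PDC   0x0008u /* Presence Detect Changed */
#define SLTSTA_CC    0x0010u /* Command Completed */
#define SLTSTA_DLLSC 0x0100u /* Data Link Layer State Changed */

#define SLTSTA_EVENTS (SLTSTA_ABP | SLTSTA_PFD | SLTSTA_MRLSC | \
		       SLTSTA_PDC | SLTSTA_CC | SLTSTA_DLLSC)

/* Hypothetical helper: an all-ones read means the device did not respond. */
static bool slot_status_is_valid(uint16_t status)
{
	return status != (uint16_t)~0u;
}

/* Hypothetical helper: events to handle, or 0 if the read was invalid. */
static uint16_t pending_events(uint16_t status)
{
	if (!slot_status_is_valid(status))
		return 0;
	return status & SLTSTA_EVENTS;
}
```

With this guard, a read of 0xffff from the removed 06:00.0 bridge would yield no events instead of a button press, card present, and Link Up.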

[bhelgaas: changelog, also check in pcie_poll_cmd() & pcie_do_write_cmd()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99841
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-08-10 14:24:09 -05:00
Yijing Wang
ac10836b68 PCI: pciehp: Simplify pcie_poll_cmd()
Move the first Slot Status read into the while loop to simplify the code.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2015-07-15 22:03:33 -05:00
Bjorn Helgaas
4f092fec67 PCI: pciehp: Inline the "handle event" functions into the ISR
The pciehp_handle_*() functions (pciehp_handle_attention_button(), etc.)
only contain a line or two of useful code, so it's clumsy to put
them in separate functions.  All they do is add an event to a work queue,
and it's clearer to see that directly in the ISR.

Inline them directly into pcie_isr().  No functional change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rajat Jain <rajatja@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2015-06-18 16:14:49 -05:00
Bjorn Helgaas
3784e0c6b0 PCI: pciehp: Clean up debug logging
The pciehp debug logging is overly verbose and often redundant.  Almost all
of the information printed by dbg_ctrl() is also printed by the normal PCI
core enumeration code and by pcie_init().

Remove the redundant debug info.

When claiming a pciehp bridge, we print the slot characteristics, e.g.,

  Slot #6 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl+ LLActRep+

Add the Hot-Plug Capable and Hot-Plug Surprise bits to this information,
and print it all in the same order as lspci does.

No functional change except the message text changes.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rajat Jain <rajatja@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2015-06-17 17:35:28 -05:00
Alex Williamson
a5dd4b4b05 PCI: pciehp: Wait for hotplug command completion where necessary
The commit referenced below deferred waiting for command completion until
the start of the next command, allowing hardware to do the latching
asynchronously.  Unfortunately, being ready to accept a new command is the
only indication we have that the previous command is completed.  In cases
where we need that state change to be enabled, we must still wait for
completion.  For instance, pciehp_reset_slot() attempts to disable anything
that might generate a surprise hotplug on slots that support presence
detection.  If we don't wait for those settings to latch before the
secondary bus reset, we negate any value in attempting to prevent the
spurious hotplug.

Create a base function with optional wait and helper functions so that
pcie_write_cmd() turns back into the "safe" interface which waits before
and after issuing a command and add pcie_write_cmd_nowait(), which
eliminates the trailing wait for asynchronous completion.  The following
functions are returned to their previous behavior:

  pciehp_power_on_slot
  pciehp_power_off_slot
  pcie_disable_notification
  pciehp_reset_slot

The rationale is that pciehp_power_on_slot() enables the link and therefore
relies on completion of power-on.  pciehp_power_off_slot() and
pcie_disable_notification() need a wait because data structures may be
freed after these calls and continued signaling from the device would be
unexpected.  And, of course, pciehp_reset_slot() needs to wait for the
scenario outlined above.
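
The base-function-plus-helpers structure described above can be sketched as follows. This is a simulation under stated assumptions, not the driver code: struct mock_ctrl and the mock_* functions are hypothetical stand-ins, and the "wait" is modelled as a counter instead of a real wait for Command Completed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller state. */
struct mock_ctrl {
	uint16_t slot_ctrl;  /* simulated Slot Control register */
	bool cmd_busy;       /* a command was issued and not yet waited for */
	unsigned int waits;  /* how many times we actually waited */
};

/* Simulated wait: returns immediately if no command is outstanding. */
static void mock_wait_cmd(struct mock_ctrl *ctrl)
{
	if (!ctrl->cmd_busy)
		return;
	ctrl->waits++;
	ctrl->cmd_busy = false;
}

/* Base function: always wait for the previous command first, then
 * optionally wait for the command just issued. */
static void mock_do_write_cmd(struct mock_ctrl *ctrl, uint16_t cmd,
			      uint16_t mask, bool wait)
{
	mock_wait_cmd(ctrl);
	ctrl->slot_ctrl = (uint16_t)((ctrl->slot_ctrl & ~mask) | (cmd & mask));
	ctrl->cmd_busy = true;
	if (wait)
		mock_wait_cmd(ctrl);
}

/* "Safe" interface: waits before and after issuing the command. */
static void mock_write_cmd(struct mock_ctrl *ctrl, uint16_t cmd, uint16_t mask)
{
	mock_do_write_cmd(ctrl, cmd, mask, true);
}

/* Lazy interface: leaves the trailing wait to the next command. */
static void mock_write_cmd_nowait(struct mock_ctrl *ctrl, uint16_t cmd,
				  uint16_t mask)
{
	mock_do_write_cmd(ctrl, cmd, mask, false);
}
```

A caller that depends on the new state having latched, such as a reset path, would use the waiting variant; a caller that only updates indicators could use the nowait variant.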

Fixes: 3461a06866 ("PCI: pciehp: Wait for hotplug command completion lazily")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v3.17+
2015-06-09 10:46:29 -05:00
Linus Torvalds
80213c03c4 PCI changes for the v3.18 merge window:

Merge tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "The interesting things here are:

   - Turn on Config Request Retry Status Software Visibility.  This
     caused hangs last time, but we included a fix this time.
   - Rework PCI device configuration to use _HPP/_HPX more aggressively
   - Allow PCI devices to be put into D3cold during system suspend
   - Add arm64 PCI support
   - Add APM X-Gene host bridge driver
   - Add TI Keystone host bridge driver
   - Add Xilinx AXI host bridge driver

  More detailed summary:

  Enumeration
    - Check Vendor ID only for Config Request Retry Status (Rajat Jain)
    - Enable Config Request Retry Status when supported (Rajat Jain)
    - Add generic domain handling (Catalin Marinas)
    - Generate uppercase hex for modalias interface class (Ricardo Ribalda Delgado)

  Resource management
    - Add missing MEM_64 mask in pci_assign_unassigned_bridge_resources() (Yinghai Lu)
    - Increase IBM ipr SAS Crocodile BARs to at least system page size (Douglas Lehr)

  PCI device hotplug
    - Prevent NULL dereference during pciehp probe (Andreas Noever)
    - Move _HPP & _HPX handling into core (Bjorn Helgaas)
    - Apply _HPP to PCIe devices as well as PCI (Bjorn Helgaas)
    - Apply _HPP/_HPX to display devices (Bjorn Helgaas)
    - Preserve SERR & PARITY settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Preserve MPS and MRRS settings when applying _HPP/_HPX (Bjorn Helgaas)
    - Apply _HPP/_HPX to all devices, not just hot-added ones (Bjorn Helgaas)
    - Fix wait time in pciehp timeout message (Yinghai Lu)
    - Add more pciehp Slot Control debug output (Yinghai Lu)
    - Stop disabling pciehp notifications during init (Yinghai Lu)

  MSI
    - Remove arch_msi_check_device() (Alexander Gordeev)
    - Rename pci_msi_check_device() to pci_msi_supported() (Alexander Gordeev)
    - Move D0 check into pci_msi_check_device() (Alexander Gordeev)
    - Remove unused kobject from struct msi_desc (Yijing Wang)
    - Remove "pos" from the struct msi_desc msi_attrib (Yijing Wang)
    - Add "msi_bus" sysfs MSI/MSI-X control for endpoints (Yijing Wang)
    - Use __get_cached_msi_msg() instead of get_cached_msi_msg() (Yijing Wang)
    - Use __read_msi_msg() instead of read_msi_msg() (Yijing Wang)
    - Use __write_msi_msg() instead of write_msi_msg() (Yijing Wang)

  Power management
    - Drop unused runtime PM support code for PCIe ports (Rafael J. Wysocki)
    - Allow PCI devices to be put into D3cold during system suspend (Rafael J. Wysocki)

  AER
    - Add additional AER error strings (Gong Chen)
    - Make <linux/aer.h> standalone includable (Thierry Reding)

  Virtualization
    - Add ACS quirk for Solarflare SFC9120 & SFC9140 (Alex Williamson)
    - Add ACS quirk for Intel 10G NICs (Alex Williamson)
    - Add ACS quirk for AMD A88X southbridge (Marti Raudsepp)
    - Remove unused pci_find_upstream_pcie_bridge(), pci_get_dma_source() (Alex Williamson)
    - Add device flag helpers (Ethan Zhao)
    - Assume all Mellanox devices have broken INTx masking (Gavin Shan)

  Generic host bridge driver
    - Fix ioport_map() for !CONFIG_GENERIC_IOMAP (Liviu Dudau)
    - Add pci_register_io_range() and pci_pio_to_address() (Liviu Dudau)
    - Define PCI_IOBASE as the base of virtual PCI IO space (Liviu Dudau)
    - Fix the conversion of IO ranges into IO resources (Liviu Dudau)
    - Add pci_get_new_domain_nr() and of_get_pci_domain_nr() (Liviu Dudau)
    - Add support for parsing PCI host bridge resources from DT (Liviu Dudau)
    - Add pci_remap_iospace() to map bus I/O resources (Liviu Dudau)
    - Add arm64 architectural support for PCI (Liviu Dudau)

  APM X-Gene
    - Add APM X-Gene PCIe driver (Tanmay Inamdar)
    - Add arm64 DT APM X-Gene PCIe device tree nodes (Tanmay Inamdar)

  Freescale i.MX6
    - Probe in module_init(), not fs_initcall() (Lucas Stach)
    - Delay enabling reference clock for SS until it stabilizes (Tim Harvey)

  Marvell MVEBU
    - Fix uninitialized variable in mvebu_get_tgt_attr() (Thomas Petazzoni)

  NVIDIA Tegra
    - Make sure the PCIe PLL is really reset (Eric Yuen)
    - Add error path tegra_msi_teardown_irq() cleanup (Jisheng Zhang)
    - Fix extended configuration space mapping (Peter Daifuku)
    - Implement resource hierarchy (Thierry Reding)
    - Clear CLKREQ# enable on port disable (Thierry Reding)
    - Add Tegra124 support (Thierry Reding)

  ST Microelectronics SPEAr13xx
    - Pass config resource through reg property (Pratyush Anand)

  Synopsys DesignWare
    - Use NULL instead of false (Fabio Estevam)
    - Parse bus-range property from devicetree (Lucas Stach)
    - Use pci_create_root_bus() instead of pci_scan_root_bus() (Lucas Stach)
    - Remove pci_assign_unassigned_resources() (Lucas Stach)
    - Check private_data validity in single place (Lucas Stach)
    - Setup and clear exactly one MSI at a time (Lucas Stach)
    - Remove open-coded bitmap operations (Lucas Stach)
    - Fix configuration base address when using 'reg' (Minghuan Lian)
    - Fix IO resource end address calculation (Minghuan Lian)
    - Rename get_msi_data() to get_msi_addr() (Minghuan Lian)
    - Add get_msi_data() to pcie_host_ops (Minghuan Lian)
    - Add support for v3.65 hardware (Murali Karicheri)
    - Fold struct pcie_port_info into struct pcie_port (Pratyush Anand)

  TI Keystone
    - Add TI Keystone PCIe driver (Murali Karicheri)
    - Limit MRSS for all downstream devices (Murali Karicheri)
    - Assume controller is already in RC mode (Murali Karicheri)
    - Set device ID based on SoC to support multiple ports (Murali Karicheri)

  Xilinx AXI
    - Add Xilinx AXI PCIe driver (Srikanth Thokala)
    - Fix xilinx_pcie_assign_msi() return value test (Dan Carpenter)

  Miscellaneous
    - Clean up whitespace (Quentin Lambert)
    - Remove assignments from "if" conditions (Quentin Lambert)
    - Move PCI_VENDOR_ID_VMWARE to pci_ids.h (Francesco Ruggeri)
    - x86: Mark DMI tables as initialization data (Mathias Krause)
    - x86: Move __init annotation to the correct place (Mathias Krause)
    - x86: Mark constants of pci_mmcfg_nvidia_mcp55() as __initconst (Mathias Krause)
    - x86: Constify pci_mmcfg_probes[] array (Mathias Krause)
    - x86: Mark PCI BIOS initialization code as such (Mathias Krause)
    - Parenthesize PCI_DEVID and PCI_VPD_LRDT_ID parameters (Megan Kamiya)
    - Remove unnecessary variable in pci_add_dynid() (Tobias Klauser)"

* tag 'pci-v3.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (109 commits)
  arm64: dts: Add APM X-Gene PCIe device tree nodes
  PCI: Add ACS quirk for AMD A88X southbridge devices
  PCI: xgene: Add APM X-Gene PCIe driver
  PCI: designware: Remove open-coded bitmap operations
  PCI/MSI: Remove unnecessary temporary variable
  PCI/MSI: Use __write_msi_msg() instead of write_msi_msg()
  MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg()
  PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg()
  PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints
  PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib
  PCI/MSI: Remove unused kobject from struct msi_desc
  PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported()
  PCI/MSI: Move D0 check into pci_msi_check_device()
  PCI/MSI: Remove arch_msi_check_device()
  irqchip: armada-370-xp: Remove arch_msi_check_device()
  PCI/MSI/PPC: Remove arch_msi_check_device()
  arm64: Add architectural support for PCI
  PCI: Add pci_remap_iospace() to map bus I/O resources
  of/pci: Add support for parsing PCI host bridge resources from DT
  of/pci: Add pci_get_new_domain_nr() and of_get_pci_domain_nr()
  ...

Conflicts:
	arch/arm64/boot/dts/apm-storm.dtsi
2014-10-09 15:03:49 -04:00
Yinghai Lu
31ff2a5e42 PCI: pciehp: Stop disabling notifications during init
During pciehp initialization, we previously wrote two hotplug commands:

  pciehp_probe
    pcie_init
      pcie_disable_notification
        pcie_write_cmd           # command 1
    pcie_init_notification
      pcie_enable_notification
        pcie_write_cmd           # command 2

For controllers with errata like Intel CF118, we previously waited for a
timeout before issuing the second hotplug command because the first command
only updates interrupt enable bits and is not a "real" hotplug command, so
the controller doesn't report Command Completed for it.

But there's no need to disable notifications in the first place.  If BIOS
left them enabled, we could easily take an interrupt before disabling them,
so there's no benefit in disabling them for the tiny window before we
enable them.

Drop the unnecessary pcie_disable_notification() call.

[bhelgaas: changelog]
Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-23 10:03:59 -06:00
Yinghai Lu
cf8d7b589c PCI: pciehp: Add more Slot Control debug output
Add more Slot Control debug output and move one print after
pcie_write_cmd() to be consistent with other debug output.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-23 10:03:57 -06:00
Yinghai Lu
d433889cd5 PCI: pciehp: Fix wait time in timeout message
When we warned about a timeout on a hotplug command, we previously printed
the time between calls to pcie_write_cmd(), without accounting for any time
spent actually waiting.  Consider this sequence:

  pcie_write_cmd
    write SLTCTL
    cmd_started = jiffies          # T1

  pcie_write_cmd
    pcie_wait_cmd
      now = jiffies                # T2
      wait_event_timeout           # we may wait here
      if (timeout)
        ctrl_info("Timeout on command issued %u msec ago",
                  jiffies_to_msecs(now - cmd_started))

We previously printed (T2 - T1), but that doesn't include the time spent in
wait_event_timeout().

Fix this by using the current jiffies value, not the one cached before
calling wait_event_timeout().
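
The difference between the two readings can be illustrated with plain arithmetic. This is a sketch under stated assumptions: MOCK_HZ (one tick per millisecond) and the helper names are hypothetical, and the tick values stand in for jiffies sampled at T1, at T2, and after the wait returns.

```c
#define MOCK_HZ 1000u /* assumed tick rate: one tick per millisecond */

/* Simplified stand-in for a ticks-to-milliseconds conversion. */
static unsigned int mock_ticks_to_msecs(unsigned long ticks)
{
	return (unsigned int)(ticks * 1000u / MOCK_HZ);
}

/* Elapsed time since the command was issued, as seen at time "now". */
static unsigned int elapsed_msecs(unsigned long cmd_started, unsigned long now)
{
	return mock_ticks_to_msecs(now - cmd_started);
}
```

For example, with the command issued at tick 100, the wait entered at tick 300, and the wait returning at tick 1300, elapsed_msecs(100, 300) reports 200 msec (the old message), while elapsed_msecs(100, 1300) reports 1200 msec, which includes the time spent waiting.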

[bhelgaas: changelog, use current jiffies instead of adding timeout]
Fixes: 40b960831c ("PCI: pciehp: Compute timeout from hotplug command start time")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-23 10:03:54 -06:00
Yinghai Lu
7cbeb9f90d PCI: pciehp: Fix pcie_wait_cmd() timeout
pcie_poll_cmd() takes msecs instead of jiffies, so convert timeout to msecs.

Fixes: 40b960831c ("PCI: pciehp: Compute timeout from hotplug command start time")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-22 20:05:45 -06:00
Bjorn Helgaas
d537a3abb4 PCI: pciehp: Reduce PCIe slot_ctrl to 16 bits
4283c70e91 ("PCI: pciehp: Make pcie_wait_cmd() self-contained") added
a cache of the most recent command written to the Slot Control register.
This register is only 16 bits wide, but the cache ("slot_ctrl") is 32 bits.

Reduce slot_ctrl to a u16 so it matches the register size.  No functional
change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-09-12 20:12:29 -06:00
Bjorn Helgaas
b440bde74f PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device
Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
normally generates a hot-remove event that unbinds the driver.

Some drivers expect to remain bound to a device even while they power it
off and back on again.  This can be dangerous, because if the device is
removed or replaced while it is powered off, the driver doesn't know that
anything changed.  But some drivers accept that risk.

Add pci_ignore_hotplug() for use by drivers that know their device cannot
be removed.  Using pci_ignore_hotplug() tells the PCI core that hot-plug
events for the device should be ignored.
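
The idea can be sketched as a per-device flag that the hot-remove path checks before unbinding the driver. This is an illustration under stated assumptions: struct mock_pci_dev and both functions are hypothetical stand-ins, not the PCI core's data structures or the hotplug drivers' event handling.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PCI device. */
struct mock_pci_dev {
	bool ignore_hotplug; /* set by a driver that accepts the risk */
	bool driver_bound;   /* whether a driver is currently bound */
};

/* Driver side: declare that hotplug events for this device are ignored. */
static void mock_ignore_hotplug(struct mock_pci_dev *dev)
{
	dev->ignore_hotplug = true;
}

/* Hotplug side: a hot-remove event normally unbinds the driver. */
static void mock_handle_hot_remove(struct mock_pci_dev *dev)
{
	if (dev->ignore_hotplug)
		return; /* driver stays bound across the power cycle */
	dev->driver_bound = false;
}
```

A driver that powers its device off and on again would set the flag before the power-off, so the resulting remove event leaves it bound.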

The radeon and nouveau drivers use this to switch between a low-power,
integrated GPU and a higher-power, higher-performance discrete GPU.  They
power off the unused GPU, but they want to remain bound to it.

This is a reimplementation of f244d8b623 ("ACPIPHP / radeon / nouveau:
Fix VGA switcheroo problem related to hotplug") but extends it to work with
both acpiphp and pciehp.

This fixes a problem where systems with dual GPUs using the radeon driver
become unusable, freezing every few seconds (see bugzillas below).  The
radeon driver can fail while suspending the device, as in bug 79701:

    [drm] radeon: finishing device.
    radeon 0000:01:00.0: Userspace still has active objects !
    radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
    ...
    WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
    trying to unbind memory from uninitialized GART !

or while resuming it, as in bug 77261:

    radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
    radeon 0000:01:00.0: GPU lockup ...
    radeon 0000:01:00.0: GPU pci config reset
    pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
    radeon 0000:01:00.0: GPU reset succeeded, trying to resume
    *ERROR* radeon: dpm resume failed
    radeon 0000:01:00.0: Wait for MC idle timedout !

Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
CC: stable@vger.kernel.org	# v3.15+
2014-09-10 13:45:01 -06:00
Myron Stowe
0d25d35c98 PCI: pciehp: Clear Data Link Layer State Changed during init
During PCIe hot-plug initialization - pciehp_probe() - data structures
related to slot capabilities are set up.  As part of this set up, ISRs are
put in place to handle slot events and all event bits are cleared out.

This patch adds the Data Link Layer State Changed (PCI_EXP_SLTSTA_DLLSC)
Slot Status bit to the event bits that are cleared out during
initialization.
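
Slot Status event bits are cleared by writing 1 to them, so a bit left out of the init-time mask stays set. A minimal sketch of that effect, with assumptions: mock_sltsta_write() and the two mask helpers are hypothetical, and the mock treats every bit as write-1-to-clear, which is a simplification (state bits such as Presence Detect State are read-only in hardware).

```c
#include <stdint.h>

/* Slot Status event bits (PCIe Slot Status register layout). */
#define SLTSTA_ABP   0x0001u /* Attention Button Pressed */
#define SLTSTA_PFD   0x0002u /* Power Fault Detected */
#define SLTSTA_MRLSC 0x0004u /* MRL Sensor Changed */
#define SLTSTA_PDC   0x0008u /* Presence Detect Changed */
#define SLTSTA_CC    0x0010u /* Command Completed */
#define SLTSTA_DLLSC 0x0100u /* Data Link Layer State Changed */

/* Simulated write-1-to-clear register: each 1 bit in val clears that bit. */
static void mock_sltsta_write(uint16_t *reg, uint16_t val)
{
	*reg = (uint16_t)(*reg & ~val);
}

/* Hypothetical init-time mask without DLLSC. */
static uint16_t clear_mask_without_dllsc(void)
{
	return SLTSTA_ABP | SLTSTA_PFD | SLTSTA_MRLSC | SLTSTA_PDC | SLTSTA_CC;
}

/* Hypothetical init-time mask with DLLSC included. */
static uint16_t clear_mask_with_dllsc(void)
{
	return (uint16_t)(clear_mask_without_dllsc() | SLTSTA_DLLSC);
}
```

Starting from a register with PDC and DLLSC set, writing the mask without DLLSC leaves DLLSC pending, while writing the mask with DLLSC clears both.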

If the BIOS doesn't clear DLLSC before handoff to the OS, pciehp notices
that it's set and interprets it as a new Link Up event, which results in
spurious messages:

  pciehp 0000:82:04.0:pcie24: slot(4): Link Up event
  pciehp 0000:82:04.0:pcie24: Device 0000:83:00.0 already exists at 0000:83:00, cannot hot-add
  pciehp 0000:82:04.0:pcie24: Cannot add device at 0000:83:00

Prior to e48f1b67f6 ("PCI: pciehp: Use link change notifications for
hot-plug and removal"), pciehp ignored DLLSC.

Reference:
  PCI-SIG.  PCI Express Base Specification Revision 4.0 Version 0.3
  (PCI-SIG, 2014): 7.8.11. Slot Status Register (Offset 1Ah).

[bhelgaas: add e48f1b67f6 ref and stable tag]
Fixes: e48f1b67f6 ("PCI: pciehp: Use link change notifications for hot-plug and removal")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79611
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org	# v3.15+
2014-07-07 14:53:43 -06:00
Rajat Jain
6c1a32e067 PCI: pciehp: Remove struct controller.no_cmd_complete
"no_cmd_complete" is only used once, and it duplicates read-only
information we already have in the cached Slot Capabilities value.

Remove the field and use the existing macro NO_CMD_CMPL() instead.

[bhelgaas: changelog]
Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-07-05 11:38:26 -06:00
Bjorn Helgaas
2cc56f3028 PCI: pciehp: Remove assumptions about which commands cause completion events
We use incorrect logic to decide whether a PCIe hotplug controller
generates command completion events.

5808639bfa ("pciehp: fix slow probing") assumed that the Slot Status
"Command Completed" bit was set only for commands affecting slot power,
indicators, or electromechanical interlock.  That assumption is false: per
sec. 6.7.3.2 of PCIe spec r3.0, a write targeting any portion of the Slot
Control register is a command, and (if command completed events are
supported) software must wait for a command to complete before issuing the
next command.

5808639bfa was to fix boot-time timeouts (see bugzilla below) on a Lenovo
Thinkpad R61 with an Intel hotplug controller.  The controller probably has
the Intel CF118 erratum, which means it doesn't report Command Completed
unless the Slot Control power, indicator, or interlock bits are changed.
This causes a timeout because pciehp always waits for Command Completed (if
supported), regardless of which bits are changed.

Remove the incorrect logic because the timeouts have been addressed
differently by these changes:

  PCI: pciehp: Wait for hotplug command completion lazily
  PCI: pciehp: Compute timeout from hotplug command start time

Link: https://bugzilla.kernel.org/show_bug.cgi?id=10751
Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2014-06-17 15:26:20 -06:00
Bjorn Helgaas
40b960831c PCI: pciehp: Compute timeout from hotplug command start time
If we issue a hotplug command, go do something else, then come back and
wait for the command to complete, we don't have to wait the whole timeout
period, because some of it elapsed while we were doing something else.

Keep track of the time we issued the command, and wait only until the
timeout period from that point has elapsed.

For controllers with errata like Intel CF118, we previously timed out
before issuing the second hotplug command:

  At time T1 (during boot):
    - Write DLLSCE, ABPE, PDCE, etc. to Slot Control
  At time T2 (hotplug event):
    - Wait for command completion (CC) in Slot Status
    - Timeout at T2 + 1 second because CC is never set in Slot Status
    - Write PCC, PIC, etc. to Slot Control

With this change, we wait until T1 + 1 second instead of T2 + 1 second.
If the hotplug event is more than 1 second after the boot-time
initialization, we won't wait for the timeout at all.

We still emit a "Timeout on hotplug command" message if it timed out; we
should see this on the first hotplug event on every controller with this
erratum, as well as on real errors on controllers without the erratum.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2014-06-17 15:26:11 -06:00
Bjorn Helgaas
3461a06866 PCI: pciehp: Wait for hotplug command completion lazily
Previously we issued a hotplug command and waited for it to complete.  But
there's no need to wait until we're ready to issue the *next* command.  The
next command will probably be much later, so the first one may have already
completed and we may not have to actually wait at all.

Because of hardware errata, some controllers generate command completion
events for some commands but not others.  In the case of Intel CF118 (see
spec update reference), the controller indicates command completion only
for Slot Control writes that change the value of the following bits:

  Power Controller Control
  Power Indicator Control
  Attention Indicator Control
  Electromechanical Interlock Control

Changes to other bits, e.g., the interrupt enable bits, do not cause the
Command Completed bit to be set.  Controllers from AMD and Nvidia are
reported to have similar errata.

These errata cause timeouts when pcie_enable_notification() enables
interrupts.  Previously that timeout occurred at boot-time.  With this
change, the timeout occurs later, when we change the state of the slot
power, indicators, or interlock.  This speeds up boot but causes a timeout
at the first hotplug event on the slot.  Subsequent events don't time out
because only the first (boot-time) hotplug command updates Slot Control
without touching the power/indicator/interlock controls.

Link: http://www.intel.com/content/www/us/en/processors/xeon/xeon-e7-v2-spec-update.html
Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2014-06-17 15:26:02 -06:00
Bjorn Helgaas
4283c70e91 PCI: pciehp: Make pcie_wait_cmd() self-contained
pcie_wait_cmd() waits for the controller to finish a hotplug command.  Move
the associated logic (to determine whether waiting is required and whether
we're using interrupts or polling) from pcie_write_cmd() to
pcie_wait_cmd().

No functional change.

Tested-by: Rajat Jain <rajatxjain@gmail.com>	(IDT 807a controller)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
2014-06-16 11:47:59 -06:00
Ryan Desfosses
227f064705 PCI: Merge multi-line quoted strings
Merge quoted strings that are broken across lines into a single entity.
The compiler merges them anyway, but checkpatch complains about it, and
merging them makes it easier to grep for strings.

No functional change.

[bhelgaas: changelog, do the same for everything under drivers/pci]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-10 20:20:42 -06:00
Ryan Desfosses
3c78bc61f5 PCI: Whitespace cleanup
Fix various whitespace errors.

No functional change.

[bhelgaas: fix other similar problems]
Signed-off-by: Ryan Desfosses <ryan@desfo.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-06-10 20:20:19 -06:00
Rajat Jain
476a357fd9 PCI: pciehp: Acknowledge spurious "cmd completed" event
In case of a spurious "cmd completed" event, pcie_write_cmd() does not
clear it, yet still expects further "cmd completed" events to be generated.
That does not happen because the previous (spurious) event has not been
acknowledged.  Fix that.

Signed-off-by: Rajat Jain <rajatxjain@gmail.com>
Signed-off-by: Rajat Jain <rajatjain@juniper.net>
Signed-off-by: Guenter Roeck <groeck@juniper.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-04-24 16:47:09 -06:00