Commit Graph

71 Commits

Author SHA1 Message Date
Gavin Shan
8a6b3710cc powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
The flag EEH_PE_RESET indicates blocking config space of the PE
during reset time. We potentially need block PE's config space
other than reset time. So it's reasonable to replace it with
EEH_PE_CFG_BLOCKED to indicate its usage.

There are no substantial code or logic changes in this patch.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-15 11:27:18 +11:00
Gavin Shan
5cfb20b96f powerpc/eeh: Emulate EEH recovery for VFIO devices
When enabling EEH functionality on passed through devices (PE)
with VFIO, the devices in the PE would be removed permanently
from guest side. In that case, the PE remains frozen state.
When returning PE to host, or restarting the guest again, we
had mechanism unfreezing the PE by clearing PESTA/B frozen
bits. However, that's not enough for some adapters, which are
indicated as following "lspci" shows. Those adapters require
hot reset on the parent bus to bring their firmware back to
workable state. Otherwise, those adaptrs won't be operative
and the host (for returning case) or the guest will fail to
load the drivers for those adapters without exception.

0000:01:00.0 Ethernet controller: Emulex Corporation OneConnect \
             10Gb NIC (be3) (rev 02)
0000:01:00.0 0200: 19a2:0710 (rev 02)
0001:03:00.0 Ethernet controller: Emulex Corporation OneConnect \
             NIC (Lancer) (rev 10)
0001:03:00.0 0200: 10df:e220 (rev 10)

The patch adds mechanism to emulate EEH recovery (for hot reset
on parent PCI bus) on 3 gates to fix the issue: open/release one
adapter of the PE, enable EEH functionality on one adapter of the
PE.

Reported-by:  Murilo Fossa Vicentini <muvic@br.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:18 +10:00
Gavin Shan
4eeeff0ebc powerpc/eeh: Unfreeze PE on enabling EEH functionality
When passing through PE to guest, that's possibly in frozen
state. The driver for the pass-through devices on guest side
can't be loaded successfully as reported. We already had one
gate in eeh_dev_open() to clear PE frozen state accordingly,
but that's not enough because the function is only called at
QEMU startup for once.

The patch adds another gate in eeh_pe_set_option() so that the
PE frozen state can be cleared at QEMU restart time.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:14 +10:00
Gavin Shan
131c123abe powerpc/eeh: Introduce eeh_ops::err_inject
The patch introduces eeh_ops::err_inject(), which allows to inject
specified errors to indicated PE for testing purpose. The functionality
isn't support on pSeries platform. On PowerNV, the functionality
relies on OPAL API opal_pci_err_inject().

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:10 +10:00
Gavin Shan
0d5ee5205e powerpc/eeh: Freeze PE before PE reset
The patch adds one more option (EEH_OPT_FREEZE_PE) to set_option()
method to proactively freeze PE, which will be issued before resetting
pass-throughed PE to drop MMIO access during reset because it's
always contributing to recursive EEH error.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:07 +10:00
Gavin Shan
3e938052fb powerpc/eeh: Drop unused argument in eeh_check_failure()
eeh_check_failure() is used to check frozen state of the PE which
owns the indicated I/O address. The argument "val" of the function
isn't used. The patch drops it and return the frozen state of the
PE as expected.

Cc: Vishal Mansur <vmansur@linux.vnet.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-30 17:15:05 +10:00
Wei Yang
2a58222f82 powerpc/eeh: Fix kernel crash when passing through VF
When doing vfio passthrough a VF, the kernel will crash with following
message:

[  442.656459] Unable to handle kernel paging request for data at address 0x00000060
[  442.656593] Faulting instruction address: 0xc000000000038b88
[  442.656706] Oops: Kernel access of bad area, sig: 11 [#1]
[  442.656798] SMP NR_CPUS=1024 NUMA PowerNV
[  442.656890] Modules linked in: vfio_pci mlx4_core nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw tg3 nfsd be2net nfs_acl ses lockd ptp enclosure pps_core kvm_hv kvm_pr shpchp binfmt_misc kvm sunrpc uinput lpfc scsi_transport_fc ipr scsi_tgt [last unloaded: mlx4_core]
[  442.658152] CPU: 40 PID: 14948 Comm: qemu-system-ppc Not tainted 3.10.42yw-pkvm+ #37
[  442.658219] task: c000000f7e2a9a00 ti: c000000f6dc3c000 task.ti: c000000f6dc3c000
[  442.658287] NIP: c000000000038b88 LR: c0000000004435a8 CTR: c000000000455bc0
[  442.658352] REGS: c000000f6dc3f580 TRAP: 0300   Not tainted  (3.10.42yw-pkvm+)
[  442.658419] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 28004882  XER: 20000000
[  442.658577] CFAR: c00000000000908c DAR: 0000000000000060 DSISR: 40000000 SOFTE: 1
GPR00: c0000000004435a8 c000000f6dc3f800 c0000000012b1c10 c00000000da24000
GPR04: 0000000000000003 0000000000001004 00000000000015b3 000000000000ffff
GPR08: c00000000127f5d8 0000000000000000 000000000000ffff 0000000000000000
GPR12: c000000000068078 c00000000fdd6800 000001003c320c80 000001003c3607f0
GPR16: 0000000000000001 00000000105480c8 000000001055aaa8 000001003c31ab18
GPR20: 000001003c10fb40 000001003c360ae8 000000001063bcf0 000000001063bdb0
GPR24: 000001003c15ed70 0000000010548f40 c000001fe5514c88 c000001fe5514cb0
GPR28: c00000000da24000 0000000000000000 c00000000da24000 0000000000000003
[  442.659471] NIP [c000000000038b88] .pcibios_set_pcie_reset_state+0x28/0x130
[  442.659530] LR [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
[  442.659585] Call Trace:
[  442.659610] [c000000f6dc3f800] [00000000000719e0] 0x719e0 (unreliable)
[  442.659677] [c000000f6dc3f880] [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40
[  442.659757] [c000000f6dc3f900] [c000000000455bf8] .reset_fundamental+0x38/0x80
[  442.659835] [c000000f6dc3f980] [c0000000004562a8] .pci_dev_specific_reset+0xa8/0xf0
[  442.659913] [c000000f6dc3fa00] [c0000000004448c4] .__pci_dev_reset+0x44/0x430
[  442.659980] [c000000f6dc3fab0] [c000000000444d5c] .pci_reset_function+0x7c/0xc0
[  442.660059] [c000000f6dc3fb30] [d00000001c141ab8] .vfio_pci_open+0xe8/0x2b0 [vfio_pci]
[  442.660139] [c000000f6dc3fbd0] [c000000000586c30] .vfio_group_fops_unl_ioctl+0x3a0/0x630
[  442.660219] [c000000f6dc3fc90] [c000000000255fbc] .do_vfs_ioctl+0x4ec/0x7c0
[  442.660286] [c000000f6dc3fd80] [c000000000256364] .SyS_ioctl+0xd4/0xf0
[  442.660354] [c000000f6dc3fe30] [c000000000009e54] syscall_exit+0x0/0x98
[  442.660420] Instruction dump:
[  442.660454] 4bfffce9 4bfffee4 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7e1b78
[  442.660566] 7c9f2378 60000000 60000000 e93e02c8 <e8690060> 2fa30000 41de00c4 2b9f0002
[  442.660679] ---[ end trace a64ac9546bcf0328 ]---
[  442.660724]

The reason is current VF is not EEH enabled.

This patch introduces a macro to convert eeh_dev to eeh_pe. By doing so, it
will prevent converting with NULL pointer.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
CC: Michael Ellerman <mpe@ellerman.id.au>

V3 -> V4:
   1. move the macro definition from include/linux/pci.h to
      arch/powerpc/include/asm/eeh.h

V2 -> V3:
   1. rebased on 3.17-rc4
   2. introduce a macro
   3. use this macro in several other places

V1 -> V2:
   1. code style and patch subject adjustment

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-09-25 23:14:56 +10:00
Gavin Shan
bb593c0049 powerpc/eeh: Aux PE data for error log
The patch allows PE (struct eeh_pe) instance to have auxillary data,
whose size is configurable on basis of platform. For PowerNV, the
auxillary data will be used to cache PHB diag-data for that PE
(frozen PE or fenced PHB). In turn, we can retrieve the diag-data
at any later points.

It's useful for the case of VFIO PCI devices where the error log
should be cached, and then be retrieved by the guest at later point.
Also, it can avoid PHB diag-data overwritting if another frozen PE
reported and the previous diag-data isn't fetched by guest.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:43 +10:00
Gavin Shan
dc561fb9e7 powerpc/eeh: Selectively enable IO for error log
According to the experiment I did, PCI config access is blocked
on P7IOC frozen PE by hardware, but PHB3 doesn't do that. That
means we always get 0xFF's while dumping PCI config space of the
frozen PE on P7IOC. We don't have the problem on PHB3. So we have
to enable I/O prioir to collecting error log. Otherwise, meaningless
0xFF's are always returned.

The patch fixes it by EEH flag (EEH_ENABLE_IO_FOR_LOG), which is
selectively set to indicate the case for: P7IOC on PowerNV platform,
pSeries platform.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:25 +10:00
Gavin Shan
05b1721d9f powerpc/eeh: Refactor EEH flag accessors
There are multiple global EEH flags. Almost each flag has its own
accessor, which doesn't make sense. The patch refactors EEH flag
accessors so that they look unified:

  eeh_add_flag():   Add EEH flag
  eeh_clear_flag(): Clear EEH flag
  eeh_has_flag():   Check if one specific flag has been set
  eeh_enabled():    Check if EEH functionality has been enabled

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:41:21 +10:00
Gavin Shan
212d16cdca powerpc/eeh: EEH support for VFIO PCI device
The patch exports functions to be used by new VFIO ioctl command,
which will be introduced in subsequent patch, to support EEH
functinality for VFIO PCI devices.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:28:48 +10:00
Gavin Shan
05ec424e38 powerpc/eeh: Avoid event on passed PE
We must not handle EEH error on devices which are passed to somebody
else. Instead, we expect that the frozen device owner detects an EEH
error and recovers from it.

This avoids EEH error handling on passed through devices so the device
owner gets a chance to handle them.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-08-05 15:28:47 +10:00
Gavin Shan
357b2f3dd9 powerpc/eeh: Dump PE location code
As Ben suggested, it's meaningful to dump PE's location code
for site engineers when hitting EEH errors. The patch introduces
function eeh_pe_loc_get() to retireve the location code from
dev-tree so that we can output it when hitting EEH errors.

If primary PE bus is root bus, the PHB's dev-node would be tried
prior to root port's dev-node. Otherwise, the upstream bridge's
dev-node of the primary PE bus will be check for the location code
directly.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-06-11 19:12:23 +10:00
Benjamin Herrenschmidt
965b5608f7 Revert "powerpc/powernv: Fundamental reset on PLX ports"
This reverts commit b2b5efcf20.

This code was way too board specific, there are quirks as to how
the PERST line is wired on different boards, we'll have to revisit
this using/creating appropriate firmware interfaces.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-05-20 10:20:49 +10:00
Gavin Shan
b2b5efcf20 powerpc/powernv: Fundamental reset on PLX ports
The patch intends to support fundamental reset on PLX downstream
ports. If the PCI device matches any one of the internal table,
which includes PLX vendor ID, bridge device ID, register offset
for fundamental reset and bit, fundamental reset will be done
accordingly. Otherwise, it will fail back to hot reset.

Additional flag (EEH_DEV_FRESET) is introduced to record the last
reset type on the PCI bridge.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:35:05 +10:00
Gavin Shan
26833a5029 powerpc/eeh: Make the delay for PE reset unified
Basically, we have 3 types of resets to fulfil PE reset: fundamental,
hot and PHB reset. For the later 2 cases, we need PCI bus reset hold
and settlement delay as specified by PCI spec. PowerNV and pSeries
platforms are running on top of different firmware and some of the
delays have been covered by underly firmware (PowerNV).

The patch makes the delays unified to be done in backend, instead of
EEH core.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:34:48 +10:00
Gavin Shan
d2b0f6f77e powerpc/eeh: No hotplug on permanently removed dev
The issue was detected in a bit complicated test case where
we have multiple hierarchical PEs shown as following figure:

                +-----------------+
                | PE#3     p2p#0  |
                |          p2p#1  |
                +-----------------+
                        |
                +-----------------+
                | PE#4     pdev#0 |
                |          pdev#1 |
                +-----------------+

PE#4 (have 2 PCI devices) is the child of PE#3, which has 2 p2p
bridges. We accidentally had less-known scenario: PE#4 was removed
permanently from the system because of permanent failure (e.g.
exceeding the max allowd failure times in last hour), then we detects
EEH errors on PE#3 and tried to recover it. However, eeh_dev instances
for pdev#0/1 were not detached from PE#4, which was still connected to
PE#3. All of that was because of the fact that we rely on count-based
pcibios_release_device(), which isn't reliable enough. When doing
recovery for PE#3, we still apply hotplug on PE#4 and pdev#0/1, which
are not valid any more. Eventually, we run into kernel crash.

The patch fixes above issue from two aspects. For unplug, we simply
skip those permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED
&& !EEH_PE_STATE_RECOVERING) and its frozen count should be greater
than EEH_MAX_ALLOWED_FREEZES. For plug, we marked all permanently
removed EEH devices with EEH_DEV_REMOVED and return 0xFF's on read
its PCI config so that PCI core will omit them.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:34:32 +10:00
Gavin Shan
8a5ad35686 powerpc/eeh: Cleanup EEH subsystem variables
There're 2 EEH subsystem variables: eeh_subsystem_enabled and
eeh_probe_mode. We needn't maintain 2 variables and we can just
have one variable and introduce different flags. The patch also
introduces additional flag EEH_FORCE_DISABLE, which will be used
to disable EEH subsystem via boot parameter ("eeh=off") in future.
Besides, the patch also introduces flag EEH_ENABLED, which is
changed to disable or enable EEH functionality on the fly through
debugfs entry in future.

With the patch applied, the creteria to check the enabled EEH
functionality is changed to:

!EEH_FORCE_DISABLED && EEH_ENABLED : Enabled
                       Other cases : Disabled

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:34:23 +10:00
Gavin Shan
2a18dfc6ee powerpc/eeh: Use cached capability for log dump
When calling into eeh_gather_pci_data() on pSeries platform, we
possiblly don't have pci_dev instance yet, but eeh_dev is always
ready. So we use cached capability from eeh_dev instead of pci_dev
for log dump there. In order to keep things unified, we also cache
PCI capability positions to eeh_dev for PowerNV as well.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:34:19 +10:00
Gavin Shan
d0914f503f powerpc/eeh: Block PCI-CFG access during PE reset
We've observed multiple PE reset failures because of PCI-CFG
access during that period. Potentially, some device drivers
can't support EEH very well and they can't put the device to
motionless state before PE reset. So those device drivers might
produce PCI-CFG accesses during PE reset. Also, we could have
PCI-CFG access from user space (e.g. "lspci"). Since access to
frozen PE should return 0xFF's, we can block PCI-CFG access
during the period of PE reset so that we won't get recrusive EEH
errors.

The patch adds flag EEH_PE_RESET, which is kept during PE reset.
The PowerNV/pSeries PCI-CFG accessors reuse the flag to block
PCI-CFG accordingly.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:34:02 +10:00
Gavin Shan
9e04937560 powerpc/eeh: Remove EEH_PE_PHB_DEAD
The PE state (for eeh_pe instance) EEH_PE_PHB_DEAD is duplicate to
EEH_PE_ISOLATED. Originally, those PHBs (PHB PE) with EEH_PE_PHB_DEAD
would be removed from the system. However, it's safe to replace
that with EEH_PE_ISOLATED.

The patch also clear EEH_PE_RECOVERING after fenced PHB has been handled,
either failure or success. It makes the PHB PE state consistent with:

	PHB functions normally		  NONE
	PHB has been removed		  EEH_PE_ISOLATED
	PHB fenced, recovery in progress  EEH_PE_ISOLATED | RECOVERING

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 17:33:39 +10:00
Gavin Shan
2ec5a0adf6 powerpc/eeh: Cleanup on eeh_subsystem_enabled
The patch cleans up variable eeh_subsystem_enabled so that we needn't
refer the variable directly from external. Instead, we will use
function eeh_enabled() and eeh_set_enable() to operate the variable.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-17 11:19:39 +11:00
Gavin Shan
7e4e7867b1 powerpc/eeh: Handle multiple EEH errors
For one PCI error relevant OPAL event, we possibly have multiple
EEH errors for that. For example, multiple frozen PEs detected on
different PHBs. Unfortunately, we didn't cover the case. The patch
enumarates the return value from eeh_ops::next_error() and change
eeh_handle_special_event() and eeh_ops::next_error() to handle all
existing EEH errors.

As Ben pointed out, we needn't list_for_each_entry_safe() since we
are not deleting any PHB from the hose_list and the EEH serialized
lock should be held while purging EEH events. The patch covers those
suggestions as well.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15 17:18:58 +11:00
Gavin Shan
f26c7a035b powerpc/eeh: Hotplug improvement
When EEH error comes to one specific PCI device before its driver
is loaded, we will apply hotplug to recover the error. During the
plug time, the PCI device will be probed and its driver is loaded.
Then we wrongly calls to the error handlers if the driver supports
EEH explicitly.

The patch intends to fix by introducing flag EEH_DEV_NO_HANDLER and
set it before we remove the PCI device. In turn, we can avoid wrongly
calls the error handlers of the PCI device after its driver loaded.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15 13:58:29 +11:00
Gavin Shan
1d350544d5 powerpc/eeh: Add restore_config operation
After reset on the specific PE or PHB, we never configure AER
correctly on PowerNV platform. We needn't care it on pSeries
platform. The patch introduces additional EEH operation eeh_ops::
restore_config() so that we have chance to configure AER correctly
for PowerNV platform.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-15 13:46:46 +11:00
Gavin Shan
ab55d2187d powerpc/eeh: Introdce flag to protect sysfs
The patch introduces flag EEH_DEV_SYSFS to keep track that the sysfs
entries for the corresponding EEH device (then PCI device) has been
added or removed, in order to avoid race condition.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:49 +10:00
Gavin Shan
4b83bd452f powerpc/eeh: Don't use pci_dev during BAR restore
While restoring BARs for one specific PCI device, the pci_dev
instance should have been released. So it's not reliable to use
the pci_dev instance on restoring BARs. However, we still need
some information (e.g. PCIe capability position, header type) from
the pci_dev instance. So we have to store those information to
EEH device in advance.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:49 +10:00
Gavin Shan
f5c57710dd powerpc/eeh: Use partial hotplug for EEH unaware drivers
When EEH error happens to one specific PE, some devices with drivers
supporting EEH won't except hotplug on the device. However, there
might have other deivces without driver, or with driver without EEH
support. For the case, we need do partial hotplug in order to make
sure that the PE becomes absolutely quite during reset. Otherise,
the PE reset might fail and leads to failure of error recovery.

The current code doesn't handle that 'mixed' case properly, it either
uses the error callbacks to the drivers, or tries hotplug, but doesn't
handle a PE (EEH domain) composed of a combination of the two.

The patch intends to support so-called "partial" hotplug for EEH:
Before we do reset, we stop and remove those PCI devices without
EEH sensitive driver. The corresponding EEH devices are not detached
from its PE, but with special flag. After the reset is done, those
EEH devices with the special flag will be scanned one by one.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:48 +10:00
Gavin Shan
9feed42e93 powerpc/eeh: Use safe list traversal when walking EEH devices
Currently, we're trasversing the EEH devices list using list_for_each_entry().
That's not safe enough because the EEH devices might be removed from
its parent PE while doing iteration. The patch replaces that with
list_for_each_entry_safe().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:47 +10:00
Gavin Shan
807a827d4e powerpc/eeh: Keep PE during hotplug
When we do normal hotplug, the PE (shadow EEH structure) shouldn't be
kept around.

However, we need to keep it if the hotplug an artifial one caused by
EEH errors recovery.

Since we remove EEH device through the PCI hook pcibios_release_device(),
the flag "purge_pe" passed to various functions is meaningless. So the patch
removes the meaningless flag and introduce new flag "EEH_PE_KEEP"
to save the PE while doing hotplug during EEH error recovery.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:47 +10:00
Gavin Shan
f2856491d2 powerpc/eeh: Export functions for hotplug
Make some functions public in order to support hotplug on either specific
PCI bus or PCI device in future.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-24 14:18:46 +10:00
Gavin Shan
eeb6361fdd powerpc/eeh: Avoid build warnings
The patch is for avoiding following build warnings:

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_init().
   This is often because .pnv_pci_ioda_fixup lacks a __init

   The function .pnv_pci_ioda_fixup() references
   the function __init .eeh_addr_cache_build().
   This is often because .pnv_pci_ioda_fixup lacks a __init

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-07-01 11:10:33 +10:00
Gavin Shan
ef6a285773 powerpc/eeh: Remove eeh_mutex
Originally, eeh_mutex was introduced to protect the PE hierarchy
tree and the attached EEH devices because EEH core was possiblly
running with multiple threads to access the PE hierarchy tree.
However, we now have only one kthread in EEH core. So we needn't
the eeh_mutex and just remove it.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-25 17:24:41 +10:00
Gavin Shan
8a6b1bc70d powerpc/eeh: EEH core to handle special event
On PowerNV platform, the EEH event caused by interrupt won't have
binding PE. The patch enables EEH core to handle the special event.
To avoid the current logic we have, The eeh_handle_event() is renamed
to eeh_handle_normal_event(), and the eeh_handle_special_event() is
introduced. The function eeh_handle_event() dispatches to above two
functions according to the input parameter. Besides, new backend
"next_error" added to eeh_ops and it's expected to have following
return values:

        4 - Dead IOC           3 - Dead PHB
        2 - Fenced PHB         1 - Frozen PE
        0 - No error found

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:14 +10:00
Gavin Shan
4907581dc2 powerpc/eeh: Export confirm_error_lock
An EEH event is created and queued to the event queue for each
ingress EEH error. When there're mutiple EEH errors, we need serialize
the process to keep consistent PE state (flags). The spinlock
"confirm_error_lock" was introduced for the purpose. We'll inject
EEH event upon error reporting interrupts on PowerNV platform. So
we export the spinlock for that to use for consistent PE state.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:11 +10:00
Gavin Shan
5a71978e4b powerpc/eeh: Trace time on first error for PE
We're not expecting that one specific PE got frozen for over 5
times in last hour. Otherwise, the PE will be removed from the
system upon newly coming EEH errors. The patch introduces time
stamp to trace the first error on specific PE in last hour and
function to update that accordingly. Besides, the time stamp
is recovered during PE hotplug path as we did for frozen count.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:06:04 +10:00
Gavin Shan
21fd21f590 powerpc/eeh: EEH post initialization operation
The patch adds new EEH operation post_init. It's used to notify
the platform that EEH core has completed the EEH probe. By that,
PowerNV platform starts to use the services supplied by EEH
functionality.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:51 +10:00
Gavin Shan
51fb5f5632 powerpc/eeh: Make eeh_init() public
For EEH on PowerNV platform, we will do EEH probe based on the
real PCI devices. The PCI devices are available after PCI probe.
So we have to call eeh_init() explicitly on PowerNV platform
after PCI probe. The patch also does EEH probe for PowerNV platform
in eeh_init().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:48 +10:00
Gavin Shan
8cdb283371 powerpc/eeh: Trace PCI bus from PE
There're several types of PEs can be supported for now: PHB, Bus
and Device dependent PE. For PCI bus dependent PE, tracing the
corresponding PCI bus from PE (struct eeh_pe) would make the code
more efficient. The patch also enables the retrieval of PCI bus based
on the PCI bus dependent PE.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:45 +10:00
Gavin Shan
0156680854 powerpc/eeh: Make eeh_pe_get() public
While processing EEH event interrupt from P7IOC, we need function
to retrieve the PE according to the indicated EEH device. The patch
makes function eeh_pe_get() public so that other source files can call
it for that purpose. Also, the patch fixes referring to wrong BDF
(Bus/Device/Function) address while searching PE in function
__eeh_pe_get().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:41 +10:00
Gavin Shan
9ff67433ce powerpc/eeh: Make eeh_phb_pe_get() public
One of the possible cases indicated by P7IOC interrupt is fenced
PHB. For that case, we need fetch the PE corresponding to the PHB
and disable the PHB and all subordinate PCI buses/devices, recover
from the fenced state and eventually enable the whole PHB. We need
one function to fetch the PHB PE outside eeh_pe.c and the patch is
going to make eeh_phb_pe_get() public for that purpose.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:05:38 +10:00
Gavin Shan
2d5c121678 powerpc/eeh: Enhance converting EEH dev
Under some special circumstances, the EEH device doesn't have the
associated device tree node or PCI device. The patch enhances those
functions converting EEH device to device tree node or PCI device
accordingly to avoid unnecessary system crash.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-06-20 17:04:36 +10:00
Thadeu Lima de Souza Cascardo
6a040ce725 powerpc/eeh: Fix crash when adding a device in a slot with DDW
The DDW code uses a eeh_dev struct from the pci_dev. However, this is
not set until eeh_add_device_late is called.

Since pci_bus_add_devices is called before eeh_add_device_late, the PCI
devices are added to the bus, making drivers' probe hooks to be called.
These will call set_dma_mask, which will call the DDW code, which will
require the eeh_dev struct from pci_dev. This would result in a crash,
due to a NULL dereference.

Calling eeh_add_device_late after pci_bus_add_devices would make the
system BUG, because device files shouldn't be added to devices there
were not added to the system. So, a new function is needed to add such
files only after pci_bus_add_devices have been called.

Cc: stable@vger.kernel.org
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-01-10 17:01:58 +11:00
Greg Kroah-Hartman
cad5cef62a POWERPC: drivers: remove __dev* attributes.
CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
markings need to be removed.

This change removes the use of __devinit, __devexit_p, __devinitdata,
__devinitconst, and __devexit from these drivers.

Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.

Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-01-03 15:57:04 -08:00
Gavin Shan
20ee6a9708 powerpc/eeh: Remove EEH PE for normal PCI hotplug
Function eeh_rmv_from_parent_pe() could be called by the path of
either normal PCI hotplug, or EEH recovery. For the former case,
we need purge the corresponding PE on removal of the associated
PE bus.

The patch tries to cover that by passing more information to function
pcibios_remove_pci_devices() so that we know if the corresponding PE
needs to be purged or be marked as "invalid".

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-18 15:32:23 +10:00
Gavin Shan
5efc3ad732 powerpc/eeh: Introduce EEH_PE_INVALID type PE
When EEH error happens on the PE whose PCI devices don't have
attached drivers. In function eeh_handle_event(), the default
value PCI_ERS_RESULT_NONE will be returned after iterating all
drivers of those PCI devices belonging to the PE. Actually, we
don't have installed drivers for the PCI devices. Under the
circumstance, we will remove the corresponding PCI bus of the PE,
including the associated EEH devices and PE instance. However,
we still need the information stored in the PE instance to do PE
reset after that. So it's unsafe to free the PE instance.

The patch introduces EEH_PE_INVALID type PE to address the issue.
When the PCI bus and the corresponding attached EEH devices are
removed, we will mark the PE as EEH_PE_INVALID. At later point,
the PE will be changed to EEH_PE_DEVICE or EEH_PE_BUS when the
corresponding EEH devices are attached again.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-18 15:02:49 +10:00
Gavin Shan
3ab96a02e8 powerpc/eeh: Cleanup on EEH PCI address cache
The patch does cleanup on EEH PCI address cache based on the fact
EEH core is the only user of the component.

        * Cleanup on function names so that they all have prefix
          "eeh" and looks more short.
        * Function printk() has been replaced with pr_debug() or
          pr_warning() accordingly.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:59:00 +10:00
Gavin Shan
f8f7d63fd9 powerpc/eeh: Trace eeh device from I/O cache
The idea comes from Benjamin Herrenschmidt. The eeh cache helps
fetching the pci device according to the given I/O address. Since
the eeh cache is serving for eeh, it's reasonable for eeh cache
to trace eeh device except pci device.

The patch make eeh cache to trace eeh device. Also, the major
eeh entry function eeh_dn_check_failure has been renamed to
eeh_dev_check_failure since it will take eeh device as input
parameter.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:35:44 +10:00
Gavin Shan
d7bb88629d powerpc/eeh: Probe mode support
While EEH module is installed, PCI devices is checked one by one
to see if it supports eeh. On different platforms, the PCI devices
are referred through different ways when the EEH module is loaded.
For example, on pSeries platform, that is done by OF node. However,
we would do that by real PCI devices (struct pci_dev) on PowerNV
platform in future. So we needs some mechanism to differentiate
those cases by classifying them to probe modes, either from OF
nodes or real PCI devices.

The patch implements the support to eeh probe mode. Also, the
EEH on pSeries has set it into EEH_PROBE_MODE_DEVTREE. That means
the probe will be done based on OF nodes on pSeries platform.

In addition, On pSeries platform, it's done by OF nodes. The patch
moves the the probe function from EEH core to platform dependent
backend and some cleanup applied.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:35:44 +10:00
Gavin Shan
dbbceee12f powerpc/eeh: Move stats to PE
The patch removes the eeh related statistics for eeh device since
they have been maintained by the corresponding eeh PE. Also, the
flags used to trace the state of eeh device and PE have been reworked
for a little bit.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-09-10 09:35:43 +10:00