linux/drivers/pci/pcie
Naga Chumbalkar bbfa306a1e PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs
v3 -> v2: Modified the text that describes the problem
v2 -> v1: Returned -EPERM
v1      : http://marc.info/?l=linux-pci&m=130013194803727&w=2

For servers whose hardware cannot handle ASPM the BIOS ought to set the
FADT bit shown below:
In Sec 5.2.9.3 (IA-PC Boot Arch. Flags) of ACPI4.0a Specification, please
see Table 5-11:
PCIe ASPM Controls: If set, indicates to OSPM that it must not enable
OSPM ASPM control on this platform.
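
As a rough illustration of the flag test described above, the following user-space sketch checks the "PCIe ASPM Controls" bit in a copy of the IA-PC Boot Architecture Flags word. The macro and function names are made up for this example, and the bit position (bit 4) is taken from the ACPI table layout rather than from this commit, so treat it as an assumption to verify against the spec.

```c
#include <stdint.h>

/*
 * Assumed position of "PCIe ASPM Controls" in the IA-PC Boot
 * Architecture Flags field of the FADT (bit 4). Hypothetical name.
 */
#define FADT_NO_ASPM_SKETCH (1u << 4)

/*
 * Returns 1 if the firmware asks the OS not to enable ASPM control,
 * 0 otherwise. boot_flags is a copy of the 16-bit flags field.
 */
static int firmware_forbids_aspm(uint16_t boot_flags)
{
    return (boot_flags & FADT_NO_ASPM_SKETCH) != 0;
}
```

A BIOS that leaves this bit clear, as on the servers discussed next, makes the function return 0 even though the hardware cannot handle ASPM.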

However, there are shipping servers whose BIOS does not set this bit. (An
example is the HP ProLiant DL385 G6; a maintenance BIOS will fix that.)
For such servers, even if pcie_no_aspm() is called, based on _OSC
support in the BIOS, it may be too late because the ASPM code may have
already allocated and filled its "link_list".

So if a user sets the ASPM "policy" to "powersave" via /sys then
pcie_aspm_set_policy() will run through the "link_list" and re-configure
ASPM policy on devices that advertise ASPM L0s/L1 capability:
# echo powersave > /sys/module/pcie_aspm/parameters/policy
# cat /sys/module/pcie_aspm/parameters/policy
default performance [powersave]

That can cause NMIs since the hardware doesn't play well with ASPM:
[ 1651.906015] NMI: PCI system error (SERR) for reason b1 on CPU 0.
[ 1651.906015] Dazed and confused, but trying to continue

Ideally, the BIOS should have set that FADT bit in the first place but we
could be more robust - especially given the fact that Windows doesn't
cause NMIs in the above scenario.

There should be a sanity check to not allow a user to modify ASPM policy
when aspm_disabled is set.
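
The check can be pictured with the simplified user-space model below. It is only a sketch, not the kernel code: set_policy_sketch(), the policy_str table and the plain int variables are stand-ins invented for this example, and the real setter also walks the "link_list" to reconfigure each link, which is omitted here. The only behaviour taken from the commit is that -EPERM is returned when aspm_disabled is set.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Policy names in the order the sysfs file prints them. */
static const char *policy_str[] = { "default", "performance", "powersave" };
#define NUM_POLICIES (sizeof(policy_str) / sizeof(policy_str[0]))

static int aspm_disabled;  /* stand-in for the kernel flag of the same name */
static int aspm_policy;    /* index into policy_str; 0 is "default" */

/* Hypothetical model of a policy setter with the sanity check in place. */
static int set_policy_sketch(const char *val)
{
    size_t i;

    /* Refuse any change once ASPM has been disabled. */
    if (aspm_disabled)
        return -EPERM;

    /* Prefix match, so a trailing newline from "echo" is tolerated. */
    for (i = 0; i < NUM_POLICIES; i++)
        if (!strncmp(val, policy_str[i], strlen(policy_str[i])))
            break;
    if (i >= NUM_POLICIES)
        return -EINVAL;

    aspm_policy = (int)i;
    return 0;
}
```

With aspm_disabled set, a write of "powersave" is rejected before any policy lookup happens and the stored policy is left unchanged.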

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-03-21 09:40:57 -07:00
aer PCI: aer-inject: Override PCIe AER Mask Registers 2011-03-04 10:41:02 -08:00
aspm.c PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs 2011-03-21 09:40:57 -07:00
Kconfig kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT 2011-01-20 17:02:05 -08:00
Makefile PCI: PCIe: Move PCIe PME code to the pcie directory 2010-08-24 13:47:48 -07:00
pme.c PCI/PM: Report wakeup events before resuming devices 2011-01-14 08:55:43 -08:00
portdrv_acpi.c PCI/ACPI: Request _OSC control once for each root bridge (v3) 2011-01-14 08:55:41 -08:00
portdrv_bus.c PCI: portdrv: remove unnecessary struct pcie_port_data 2009-12-04 15:56:19 -08:00
portdrv_core.c PCI/PCIe: Clear Root PME Status bits early during system resume 2010-12-23 12:54:03 -08:00
portdrv_pci.c PCI/PCIe: Clear Root PME Status bits early during system resume 2010-12-23 12:54:03 -08:00
portdrv.h PCI/ACPI: Request _OSC control once for each root bridge (v3) 2011-01-14 08:55:41 -08:00