linux/arch
Frederic Barrat a8a30219ba powerpc/powernv/eeh: Fix oops when probing cxl devices
Recent cleanup in the way EEH support is added to a device causes a
kernel oops when the cxl driver probes a device and creates virtual
devices discovered on the FPGA:

  BUG: Kernel NULL pointer dereference at 0x000000a0
  Faulting instruction address: 0xc000000000048070
  Oops: Kernel access of bad area, sig: 7 [#1]
  ...
  NIP eeh_add_device_late.part.9+0x50/0x1e0
  LR  eeh_add_device_late.part.9+0x3c/0x1e0
  Call Trace:
    _dev_info+0x5c/0x6c (unreliable)
    pnv_pcibios_bus_add_device+0x60/0xb0
    pcibios_bus_add_device+0x40/0x60
    pci_bus_add_device+0x30/0x100
    pci_bus_add_devices+0x64/0xd0
    cxl_pci_vphb_add+0xe0/0x130 [cxl]
    cxl_probe+0x504/0x5b0 [cxl]
    local_pci_probe+0x6c/0x110
    work_for_cpu_fn+0x38/0x60

The root cause is that those cxl virtual devices don't have a
representation in the device tree and therefore no associated pci_dn
structure. In eeh_add_device_late(), pdn is NULL, so edev is NULL and
we oops.

We never had explicit support for EEH for those virtual devices.
Instead, EEH events are reported to the (real) pci device and handled
by the cxl driver. Which can then forward to the virtual devices and
handle dependencies. The fact that we try adding EEH support for the
virtual devices is new and a side-effect of the recent cleanup.

This patch fixes it by skipping adding EEH support on powernv for
devices which don't have a pci_dn structure.

The cxl driver doesn't create virtual devices on pseries so this patch
doesn't fix it there intentionally.

Fixes: b905f8cdca ("powerpc/eeh: EEH for pSeries hot plug")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191016162833.22509-1-fbarrat@linux.ibm.com
2019-10-25 22:08:50 +11:00
..
alpha mm: introduce MADV_PAGEOUT 2019-09-25 17:51:41 -07:00
arc mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
arm ARM: SoC fixes 2019-10-05 17:18:43 -07:00
arm64 ARM: SoC fixes 2019-10-05 17:18:43 -07:00
c6x mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
csky csky-for-linus-5.4-rc1: arch/csky patches for 5.4-rc1 2019-09-30 10:16:17 -07:00
h8300 mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
hexagon mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
ia64 Merge branch 'akpm' (patches from Andrew) 2019-09-24 16:10:23 -07:00
m68k mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
microblaze Merge branch 'akpm' (patches from Andrew) 2019-09-24 16:10:23 -07:00
mips MIPS: fw/arc: Remove unused addr variable 2019-10-04 11:46:22 -07:00
nds32 mm: consolidate pgtable_cache_init() and pgd_cache_init() 2019-09-24 15:54:09 -07:00
nios2 nios2 update for v5.4-rc1 2019-09-27 13:02:19 -07:00
openrisc mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
parisc mm: introduce MADV_PAGEOUT 2019-09-25 17:51:41 -07:00
powerpc powerpc/powernv/eeh: Fix oops when probing cxl devices 2019-10-25 22:08:50 +11:00
riscv riscv: Fix memblock reservation for device tree blob 2019-10-01 13:22:39 -07:00
s390 KVM: s390: mark __insn32_query() as __always_inline 2019-10-05 13:51:22 +02:00
sh mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
sparc arch/sparc/include/asm/pgtable_64.h: fix build 2019-09-26 10:27:06 -07:00
um mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
unicore32 mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
x86 ARM and x86 bugfixes of all kinds. The most visible one is that migrating 2019-10-04 11:17:51 -07:00
xtensa mm: treewide: clarify pgtable_page_{ctor,dtor}() naming 2019-09-26 10:10:44 -07:00
.gitignore
Kconfig arm64, mm: make randomization selected by generic topdown mmap layout 2019-09-24 15:54:11 -07:00