Commit Graph

7 Commits

Author SHA1 Message Date
Olaf Hering
273d280381 [PATCH] powerpc: fix NULL pointer in handle_eeh_events
This patch fixes a crash in handle_eeh_events,
but ethtool -t still doesnt work right.

...
pepino:~ # cpu 0x3: Vector: 300 (Data Access) at [c00000005192bbe0]
    pc: c00000000004a380: .handle_eeh_events+0xe0/0x23c
    lr: c00000000004a374: .handle_eeh_events+0xd4/0x23c
    sp: c00000005192be60
   msr: 9000000000009032
   dar: 268
 dsisr: 40000000
  current = 0xc0000001fe7bf1a0
  paca    = 0xc00000000048b280
    pid   = 16322, comm = eehd
enter ? for help
[c00000005192bf00] c00000000004a808 .eeh_event_handler+0xcc/0x130
[c00000005192bf90] c000000000025e00 .kernel_thread+0x4c/0x68

...

(none):/# /usr/sbin/ethtool -i eth0
driver: e100
version: 3.5.10-k2-NAPI
firmware-version: N/A
bus-info: 0000:21:01.0
(none):/# /usr/sbin/ethtool -t eth0
Call Trace:
[C00000000F8DEFF0] [C00000000000F270] .show_stack+0x74/0x1b4 (unreliable)
[C00000000F8DF0A0] [C000000000049D04] .eeh_dn_check_failure+0x290/0x2d8
[C00000000F8DF150] [C000000000049E58] .eeh_check_failure+0x10c/0x138
[C00000000F8DF1E0] [C0000000002DFDB0] .e100_hw_reset+0x70/0xf4
[C00000000F8DF270] [C0000000002E1BBC] .e100_hw_init+0x2c/0x260
[C00000000F8DF310] [C0000000002E2464] .e100_loopback_test+0x8c/0x220
[C00000000F8DF3C0] [C0000000002E28DC] .e100_diag_test+0xdc/0x16c
[C00000000F8DF490] [C000000000420BE0] .dev_ethtool+0xf24/0x14f8
[C00000000F8DF8F0] [C00000000041F4A8] .dev_ioctl+0x5cc/0x740
[C00000000F8DFA20] [C00000000040FEFC] .sock_ioctl+0x3d0/0x404
[C00000000F8DFAC0] [C0000000000D513C] .do_ioctl+0x68/0x108
[C00000000F8DFB50] [C0000000000D56B0] .vfs_ioctl+0x4d4/0x510
[C00000000F8DFC10] [C0000000000D5740] .sys_ioctl+0x54/0x94
[C00000000F8DFCC0] [C0000000000FB6EC] .ethtool_ioctl+0x11c/0x150
[C00000000F8DFD60] [C0000000000F7E40] .compat_sys_ioctl+0x338/0x3bc
[C00000000F8DFE30] [C00000000000871C] syscall_exit+0x0/0x40
EEH: Detected PCI bus error on device 0000:21:01.0
EEH: This PCI device has failed 1 times since last reboot: <NULL> -

modprobe: FATAL: Could not load /lib/modules/2.6.16-rc4-git7/modules.dep: No such file or directory

Cannot get strings: No such device
(none):/#
(none):/# EEH: Unable to configure device bridge (-3) for /pci@400000000110/pci@2,2

(none):/# Call Trace:
[C00000000FA17940] [C00000000000F270] .show_stack+0x74/0x1b4 (unreliable)
[C00000000FA179F0] [C000000000049D04] .eeh_dn_check_failure+0x290/0x2d8
[C00000000FA17AA0] [C00000000001E114] .rtas_read_config+0x120/0x154
[C00000000FA17B40] [C000000000049664] .early_enable_eeh+0x274/0x2bc
[C00000000FA17C00] [C000000000049708] .eeh_add_device_early+0x5c/0x6c
[C00000000FA17C90] [C000000000049748] .eeh_add_device_tree_early+0x30/0x5c
[C00000000FA17D20] [C000000000046568] .pcibios_add_pci_devices+0x8c/0x1f8
[C00000000FA17DD0] [C00000000004A528] .eeh_reset_device+0xe0/0x110
[C00000000FA17E60] [C00000000004A698] .handle_eeh_events+0x140/0x250
[C00000000FA17F00] [C00000000004AC7C] .eeh_event_handler+0xe8/0x140
[C00000000FA17F90] [C000000000025784] .kernel_thread+0x4c/0x68
EEH: Detected PCI bus error on device <NULL>
EEH: This PCI device has failed 1 times since last reboot: <NULL> -
EEH: Unable to configure device bridge (-3) for /pci@400000000110/pci@2,2
Call Trace:
[C00000000FA17940] [C00000000000F270] .show_stack+0x74/0x1b4 (unreliable)
[C00000000FA179F0] [C000000000049D04] .eeh_dn_check_failure+0x290/0x2d8
[C00000000FA17AA0] [C00000000001E114] .rtas_read_config+0x120/0x154
[C00000000FA17B40] [C000000000049664] .early_enable_eeh+0x274/0x2bc
[C00000000FA17C00] [C000000000049708] .eeh_add_device_early+0x5c/0x6c
[C00000000FA17C90] [C000000000049748] .eeh_add_device_tree_early+0x30/0x5c
[C00000000FA17D20] [C000000000046568] .pcibios_add_pci_devices+0x8c/0x1f8
[C00000000FA17DD0] [C00000000004A528] .eeh_reset_device+0xe0/0x110
[C00000000FA17E60] [C00000000004A698] .handle_eeh_events+0x140/0x250
[C00000000FA17F00] [C00000000004AC7C] .eeh_event_handler+0xe8/0x140
[C00000000FA17F90] [C000000000025784] .kernel_thread+0x4c/0x68
EEH: Detected PCI bus error on device <NULL>
EEH: This PCI device has failed 1 times since last reboot: <NULL> -
EEH: Unable to configure device bridge (-3) for /pci@400000000110/pci@2,2
Call Trace:
[C00000000FA17940] [C00000000000F270] .show_stack+0x74/0x1b4 (unreliable)
[C00000000FA179F0] [C000000000049D04] .eeh_dn_check_failure+0x290/0x2d8
[C00000000FA17AA0] [C00000000001E114] .rtas_read_config+0x120/0x154
[C00000000FA17B40] [C000000000049664] .early_enable_eeh+0x274/0x2bc
[C00000000FA17C00] [C000000000049708] .eeh_add_device_early+0x5c/0x6c
[C00000000FA17C90] [C000000000049748] .eeh_add_device_tree_early+0x30/0x5c
[C00000000FA17D20] [C000000000046568] .pcibios_add_pci_devices+0x8c/0x1f8
[C00000000FA17DD0] [C00000000004A528] .eeh_reset_device+0xe0/0x110
[C00000000FA17E60] [C00000000004A698] .handle_eeh_events+0x140/0x250
[C00000000FA17F00] [C00000000004AC7C] .eeh_event_handler+0xe8/0x140
[C00000000FA17F90] [C000000000025784] .kernel_thread+0x4c/0x68
EEH: Detected PCI bus error on device <NULL>
and so on

Signed-off-by: Olaf Hering <olh@suse.de>
Acked-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2006-02-28 16:25:54 +11:00
Al Viro
d04e4e115b [PATCH] eeh_driver NULL noise removal
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-02-07 20:58:33 -05:00
Paul Mackerras
18eb3b398d powerpc: Fix up some compile errors in the PCI error recovery code
<asm/systemcfg.h> is gone now, and the PCI error recovery constants
in include/linux/pci.h changed their names in the process of getting
accepted.

Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from 5a2516156c591fc3d2059fbd93f97e15eb6010d6 commit)
2006-01-10 15:32:31 +11:00
Linas Vepstas
3914ac7b0e [PATCH] powerpc: handle multifunction PCI devices properly
239-eeh-multifunction-consolidate.patch

New-style firmware will often place multiple different functions
under a non-EEH-aware parent.  However, these devices might share
a common PE "partition endpoint" and config address, ad thus any
EEH events will affect all of the devices in common.  This patch
makes the effort to find all of these common devices and handle
them together.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from 216810296bb97d39da8e176822e9de78d2f00187 commit)
2006-01-10 15:30:23 +11:00
Linas Vepstas
b6495c0c8f [PATCH] powerpc: Don't continue with PCI Error recovery if slot reset failed.
238-eeh-stop-if-reset_failed.patch

If the firmware is unable to reset the PCI slot for some reason, then
don't attempt any further recovery steps after that point.  Instead,
mark the device as permanently failed.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from e06b942521eb2cdaf232726f45a820d5837acb12 commit)
2006-01-10 15:30:14 +11:00
Linas Vepstas
9fb40eb883 [PATCH] powerpc: Remove duplicate code
234-eeh-find-pe.patch

The find_device_pe() routine is duplicated in two files. Remove one of
the two copies, declare the other extern.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from 48408e708282d4d0269136ff27ea5acbd9410b5a commit)
2006-01-10 15:29:33 +11:00
Linas Vepstas
77bd741561 [PATCH] powerpc: PCI Error Recovery: PPC64 core recovery routines
Various PCI bus errors can be signaled by newer PCI controllers.  The
core error recovery routines are architecture dependent.  This patch adds
a recovery infrastructure for the  PPC64 pSeries systems.

Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from e8ca11b460c4c9c7fa6b529be221529ebd770e38 commit)
2006-01-10 15:28:32 +11:00