linux/drivers/infiniband/hw
Steve Wise 2f25e9a540 RDMA/cxgb4: EEH errors can hang the driver
A few more EEH fixes:

c4iw_wait_for_reply(): detect fatal EEH condition on timeout and
return an error.
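
The timeout handling described above can be sketched roughly as follows. This is a minimal user-space model, not the driver's code: the `sketch_` types, the poll-count stand-in for a timed wait, and the choice of `-EIO` versus `-ETIMEDOUT` are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pending firmware reply. */
struct sketch_wait {
    bool done;  /* set once the reply has arrived */
    int ret;    /* status carried by the reply */
};

/* Hypothetical stand-in for per-adapter state. */
struct sketch_rdev {
    bool fatal_error;  /* set when a fatal EEH condition was seen */
};

/*
 * Poll for a reply up to `polls` times (a stand-in for a timed wait).
 * On timeout, report a fatal adapter error distinctly so the caller
 * fails fast instead of waiting again.
 */
static int sketch_wait_for_reply(const struct sketch_rdev *rdev,
                                 const struct sketch_wait *w, int polls)
{
    assert(rdev != NULL && w != NULL);

    for (int i = 0; i < polls; i++) {
        if (w->done)
            return w->ret;
    }
    if (rdev->fatal_error)
        return -EIO;        /* assumed error code for the fatal case */
    return -ETIMEDOUT;      /* plain timeout, adapter still healthy */
}
```

A caller that sees the fatal-case error can skip any retry and proceed straight to teardown.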

The iw_cxgb4 driver was only calling ib_deregister_device() on an EEH
event, followed by an ib_register_device() when the device was
reinitialized.  However, the RDMA core doesn't allow multiple
iterations of register/deregister by the provider.  See
drivers/infiniband/core/sysfs.c: ib_device_unregister_sysfs(), where
the kobject ref is held until the device is deallocated in
ib_dealloc_device().  Calling deregister adds this kobj reference,
and then a subsequent register call will generate a WARN_ON() from the
kobject subsystem because the kobject is being initialized but is
already initialized with the ref held.
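
The reference-count behaviour described above can be modelled in a few lines. This is a toy sketch, not the kobject implementation: every `sketch_` name is hypothetical, and clearing the `initialized` flag on the last put stands in for the memory being freed and allocated afresh.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an embedded kobject: an init flag plus a reference count. */
struct sketch_kobj {
    bool initialized;
    int refs;
};

/* Drop one reference; the last put "releases" the object. */
static void sketch_put(struct sketch_kobj *k)
{
    assert(k->refs > 0);
    if (--k->refs == 0)
        k->initialized = false;  /* models free + fresh allocation */
}

/* Register: initializes the object. Returns true if it would warn. */
static bool sketch_register(struct sketch_kobj *k)
{
    bool warn = k->initialized;  /* initializing an initialized object */
    k->initialized = true;
    k->refs = 1;
    return warn;
}

/* Deregister: takes an extra ref that is held until dealloc. */
static void sketch_deregister(struct sketch_kobj *k)
{
    k->refs++;       /* the reference kept across deregistration */
    sketch_put(k);   /* the registration's own reference goes away */
}

/* Dealloc: drops the reference that deregister left behind. */
static void sketch_dealloc(struct sketch_kobj *k)
{
    sketch_put(k);
}
```

In this model, register, deregister, register reports a warning on the second register, while register, deregister, dealloc, register does not.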

So the provider must deregister and dealloc when resetting for an EEH
event, then alloc/register to re-initialize.  To do this, we cannot
use the device ptr as our ULD handle since it will change with each
reallocation.  This commit adds a ULD context struct which is used as
the ULD handle and which contains the device pointer and other state
needed.
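
The idea of a stable handle that outlives each device allocation might look something like the sketch below. It is a user-space illustration under stated assumptions: the `sketch_` structs and functions are hypothetical and do not reflect the driver's actual struct layout or the RDMA core API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-adapter RDMA device. */
struct sketch_dev {
    int registered;
};

/* Stable handle for the lower-level driver; it outlives any one device. */
struct sketch_uld_ctx {
    struct sketch_dev *dev;  /* NULL while the device is torn down */
    int generation;          /* counts allocations, for illustration only */
};

/* Allocate and "register" a fresh device (first start or after recovery). */
static int sketch_ctx_up(struct sketch_uld_ctx *ctx)
{
    assert(ctx->dev == NULL);  /* previous device must already be gone */
    ctx->dev = calloc(1, sizeof(*ctx->dev));
    if (!ctx->dev)
        return -1;
    ctx->dev->registered = 1;
    ctx->generation++;
    return 0;
}

/* "Deregister" and free the device on an error event; the ctx stays valid. */
static void sketch_ctx_down(struct sketch_uld_ctx *ctx)
{
    if (!ctx->dev)
        return;
    ctx->dev->registered = 0;
    free(ctx->dev);
    ctx->dev = NULL;
}
```

Because callers hold the context rather than the device pointer, the handle stays the same across any number of down/up cycles even though the device memory changes each time.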

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-05-09 22:06:23 -07:00
..
amso1100  Fix common misspellings                               2011-03-31 11:26:23 -03:00
cxgb3     ipv4: Create and use route lookup helpers.            2011-03-12 15:08:42 -08:00
cxgb4     RDMA/cxgb4: EEH errors can hang the driver            2011-05-09 22:06:23 -07:00
ehca      RDMA: Use vzalloc() to replace vmalloc()+memset(0)    2011-01-12 11:11:58 -08:00
ipath     Fix common misspellings                               2011-03-31 11:26:23 -03:00
mlx4      mlx4: generalization of multicast steering.           2011-03-23 12:24:21 -07:00
mthca     IB: Increase DMA max_segment_size on Mellanox hardware  2011-03-22 09:39:18 -07:00
nes       Fix common misspellings                               2011-03-31 11:26:23 -03:00
qib       Revert wrong fixes for common misspellings            2011-04-26 23:31:11 -07:00