linux/drivers/infiniband/ulp/ipoib
David J. Wilder 0cd4d0fd9b IPoIB: Clear ipoib_neigh.dgid in ipoib_neigh_alloc()
IPoIB can miss a change in destination GID under some conditions.  The
problem is caused when ipoib_neigh->dgid contains a stale address.
The fix is to set ipoib_neigh->dgid to zero in ipoib_neigh_alloc().

This can happen when a system using bonding on its IPoIB interfaces
has switched its active interface from interface A to B and back to A.
The system that fails over does not correctly process the second
address change, as described below.

Each neighbor has an associated ipoib_neigh, and ipoib_neigh->dgid
holds a copy of the remote node's hardware address.  When an address
changes, neighbor->ha is updated by the network layer (ARP code) with
the new address.  IPoIB detects this change in ipoib_start_xmit() by
comparing neighbor->ha with ipoib_neigh->dgid.  The bug is that
ipoib_neigh->dgid may already contain the new address (A), so the
change from B to A is missed by IPoIB.  Here is the sequence of events:

    ipoib_neigh->dgid = A  and  neighbor->ha = A

The address is switched to B (the first switch)

    neighbor->ha = B

The change is seen in ipoib_start_xmit() -- neighbor->ha !=
ipoib_neigh->dgid so ipoib_neigh is released, and a new one is
allocated.
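
The comparison described above can be sketched in plain user-space C.
This is a simplified model, not the driver code: the struct and
function names (model_neighbour, model_ipoib_neigh, gid_changed) are
hypothetical, and it assumes the 20-byte IPoIB hardware address layout
in which the 16-byte GID follows 4 bytes of flags and QPN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16  /* an InfiniBand GID is 128 bits */
#define HA_LEN  20  /* assumed layout: 4 bytes flags/QPN + 16-byte GID */

/* Hypothetical stand-ins for struct neighbour and struct ipoib_neigh. */
struct model_neighbour   { uint8_t ha[HA_LEN]; };
struct model_ipoib_neigh { uint8_t dgid[GID_LEN]; };

/*
 * Returns true when the GID cached in the ipoib_neigh no longer matches
 * the GID portion of the neighbour's hardware address.  This models the
 * check the commit message attributes to ipoib_start_xmit().
 */
bool gid_changed(const struct model_ipoib_neigh *n,
                 const struct model_neighbour *nb)
{
    assert(n != NULL && nb != NULL);
    return memcmp(n->dgid, nb->ha + 4, GID_LEN) != 0;
}
```

A true result corresponds to the "release and reallocate" branch; a
false result means the transmit path keeps using the cached entry.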

The allocator may return the same chunk of memory that was just
released, therefore ipoib_neigh->dgid still contains A at this point.

ipoib_neigh->dgid should be updated in neigh_add_path(), but if the
following conditions are true dgid is not updated:

        1) __path_find() returns a path
        2) path->ah is NULL
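
The two conditions above can be illustrated with a small model.  This
is only a sketch of the behaviour the commit message describes, not
the real neigh_add_path(): the types and the function
model_neigh_add_path are hypothetical, and the path lookup is assumed
to have already succeeded (condition 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16

/* Hypothetical stand-ins for the address handle, path and neigh. */
struct model_ah          { int placeholder; };
struct model_path        { uint8_t dgid[GID_LEN]; struct model_ah *ah; };
struct model_ipoib_neigh { uint8_t dgid[GID_LEN]; struct model_ah *ah; };

/*
 * Models attaching a neigh to a path that was found by lookup.
 * Returns true if the neigh's dgid was refreshed from the path.
 */
bool model_neigh_add_path(struct model_ipoib_neigh *n,
                          const struct model_path *path)
{
    assert(n != NULL && path != NULL);

    if (path->ah != NULL) {
        /* Path is resolved: take its address handle and its GID. */
        n->ah = path->ah;
        memcpy(n->dgid, path->dgid, GID_LEN);
        return true;
    }

    /* Condition 2: no address handle yet, so dgid keeps its old bytes. */
    n->ah = NULL;
    return false;
}
```

In the unresolved case the function leaves n->dgid untouched, so
whatever the allocator handed back (possibly the old address A) stays
in place.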

The remote system now switches from address B back to A, and
neighbor->ha is updated to A.

Now we are back to the initial state:

    ipoib_neigh->dgid = A  and  neighbor->ha = A

Since the addresses are the same, IPoIB won't process the change in
address.  Fix this by zeroing out the dgid field when allocating a new
struct ipoib_neigh.
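
The whole A -> B -> A sequence, and the effect of zeroing dgid at
allocation time, can be replayed in a small user-space simulation.
This is a sketch under stated assumptions rather than the patch
itself: the one-slot pool stands in for an allocator that returns the
chunk it just freed, the names (model_neigh_alloc,
second_switch_detected) are hypothetical, and the reallocation is
assumed to hit the case where dgid is not refreshed afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GID_LEN 16

struct model_ipoib_neigh { uint8_t dgid[GID_LEN]; };

/* One-slot pool: every allocation reuses the same memory. */
static struct model_ipoib_neigh pool_slot;

/* With clear_dgid set, this models the fix: zero dgid on allocation. */
struct model_ipoib_neigh *model_neigh_alloc(bool clear_dgid)
{
    struct model_ipoib_neigh *n = &pool_slot;

    if (clear_dgid)
        memset(n->dgid, 0, sizeof(n->dgid));
    return n;
}

bool gid_differs(const struct model_ipoib_neigh *n, const uint8_t *gid)
{
    assert(n != NULL && gid != NULL);
    return memcmp(n->dgid, gid, GID_LEN) != 0;
}

/* Replays A -> B -> A; returns true if the second switch is noticed. */
bool second_switch_detected(bool clear_dgid)
{
    uint8_t gid_a[GID_LEN], gid_b[GID_LEN];
    const uint8_t *ha;
    struct model_ipoib_neigh *n;

    memset(gid_a, 0xAA, GID_LEN);
    memset(gid_b, 0xBB, GID_LEN);

    /* Initial state: dgid = A and ha = A. */
    n = model_neigh_alloc(true);
    memcpy(n->dgid, gid_a, GID_LEN);

    /* First switch: ha = B.  The mismatch triggers release + realloc. */
    ha = gid_b;
    if (gid_differs(n, ha)) {
        /* The same chunk comes back; dgid is assumed not refreshed. */
        n = model_neigh_alloc(clear_dgid);
    }

    /* Second switch: ha = A again. */
    ha = gid_a;
    return gid_differs(n, ha);
}
```

Calling second_switch_detected(false) models the behaviour before the
fix, where the stale A in the reused chunk hides the second switch;
second_switch_detected(true) models the zeroed field, which can never
equal a real address and so forces the mismatch to be seen.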

Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-12-09 10:03:00 -08:00
ipoib_cm.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2009-09-14 10:37:28 -07:00
ipoib_ethtool.c IPoIB: Clean up ethtool support 2008-10-22 15:49:29 -07:00
ipoib_fs.c RDMA: Remove subversion $Id tags 2008-07-14 23:48:44 -07:00
ipoib_ib.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2009-09-14 10:37:28 -07:00
ipoib_main.c IPoIB: Clear ipoib_neigh.dgid in ipoib_neigh_alloc() 2009-12-09 10:03:00 -08:00
ipoib_multicast.c IPoIB: Don't turn on carrier for a non-active port 2009-09-24 12:01:05 -07:00
ipoib_verbs.c IPoIB: Get rid of ipoib_mcast_detach() wrapper 2008-07-14 23:48:50 -07:00
ipoib_vlan.c net: Fix ipoib rtnl_lock sysfs deadlock. 2009-05-18 22:15:59 -07:00
ipoib.h infiniband: remove IPOIB_GID_RAW_ARG, IPOIB_GID_ARG, IPOIB_GID_FMT 2008-10-28 23:02:37 -07:00
Kconfig IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG 2008-07-24 20:37:25 -07:00
Makefile IPoIB: Add basic ethtool support 2008-04-16 21:09:32 -07:00