IPoIB: Fix loss of connectivity after bonding failover on both sides

Fix bonding failover in the case where both peers fail over and the
gratuitous ARP is lost.  In that case, the sender side will create an
ipoib_neigh and issue a path request with the old GID first.  When
skb->dst->neighbour->ha changes due to ARP refresh, this ipoib_neigh
will not be added to the path->list of the path of the new GID,
because the ipoib_neigh already exists.  It will not have an AH
either, because of sender-side failover.  Therefore, it will not get
an AH when the path is resolved.

The solution here is to compare GIDs in ipoib_start_xmit() even if
neigh->ah is invalid.  Comparing with an uninitialized value of
neigh->dgid should be fine, since a spurious match is harmless (and
astronomically unlikely too).

Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: Yossi Etigin <yosefe@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Yossi Etigin 2009-01-09 14:05:11 -08:00 committed by Roland Dreier
parent 6a94cb7306
commit a50df398cd

@@ -711,7 +711,6 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		neigh = *to_ipoib_neigh(skb->dst->neighbour);
-		if (neigh->ah)
 		if (unlikely((memcmp(&neigh->dgid.raw,
 				     skb->dst->neighbour->ha + 4,
 				     sizeof(union ib_gid))) ||
@@ -724,6 +723,7 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			 * so ipoib_put_ah() will never do more than
 			 * decrement the ref count.
 			 */
+			if (neigh->ah)
 				ipoib_put_ah(neigh->ah);
 			list_del(&neigh->list);
 			ipoib_neigh_free(dev, neigh);
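
The change in the stale-neighbour check can be illustrated with a small
user-space sketch.  This is not driver code: struct mock_neigh and the
two helper functions are hypothetical stand-ins, and the only layout
detail taken from the patch is that the destination GID starts 4 bytes
into the hardware address (ha + 4).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GID_LEN 16  /* size of an InfiniBand GID in bytes */
#define HA_LEN  20  /* IPoIB hardware address: 4 leading bytes, then the GID */

/* Hypothetical, simplified stand-in for struct ipoib_neigh. */
struct mock_neigh {
	const void    *ah;             /* NULL: no address handle yet */
	unsigned char  dgid[GID_LEN];  /* GID the path request was issued for */
	const void    *dev;            /* owning device */
};

/* Before the patch: the GID comparison was skipped when ah was NULL,
 * so a neighbour still waiting on the old GID was never seen as stale. */
bool stale_before_fix(const struct mock_neigh *n,
		      const unsigned char *ha, const void *dev)
{
	if (!n->ah)
		return false;
	return memcmp(n->dgid, ha + 4, GID_LEN) != 0 || n->dev != dev;
}

/* After the patch: the comparison runs regardless of ah. */
bool stale_after_fix(const struct mock_neigh *n,
		     const unsigned char *ha, const void *dev)
{
	return memcmp(n->dgid, ha + 4, GID_LEN) != 0 || n->dev != dev;
}
```

With a neighbour that has no AH and a hardware address carrying a new
GID, stale_before_fix() reports false while stale_after_fix() reports
true.  In the driver, a stale entry is then torn down, which is why the
second hunk guards ipoib_put_ah() with a check on neigh->ah.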