linux/net/core
Ihar Hrachyshka 77d7123342 neighbour: update neigh timestamps iff update is effective
It's a common practice to send gratuitous ARPs after moving an
IP address to another device to speed up healing of a service. To
fulfill service availability constraints, the timing of network peers
updating their caches to point to a new location of an IP address can be
particularly important.

Sometimes neigh_update calls won't touch neither lladdr nor state, for
example if an update arrives in locktime interval. The neigh->updated
value is tested by the protocol specific neigh code, which in turn
will influence whether NEIGH_UPDATE_F_OVERRIDE gets set in the
call to neigh_update() or not. As a result, we may effectively ignore
the update request, bailing out of touching the neigh entry, except that
we still bump its timestamps inside neigh_update.

This may be a problem for updates arriving in quick succession. For
example, consider the following scenario:

A service is moved to another device with its IP address. The new device
sends three gratuitous ARP requests into the network with ~1 seconds
interval between them. Just before the first request arrives to one of
network peer nodes, its neigh entry for the IP address transitions from
STALE to DELAY.  This transition, among other things, updates
neigh->updated. Once the kernel receives the first gratuitous ARP, it
ignores it because its arrival time is inside the locktime interval. The
kernel still bumps neigh->updated. Then the second gratuitous ARP
request arrives, and it's also ignored because it's still in the (new)
locktime interval. Same happens for the third request. The node
eventually heals itself (after delay_first_probe_time seconds since the
initial transition to DELAY state), but it just wasted some time and
require a new ARP request/reply round trip. This unfortunate behaviour
both puts more load on the network, as well as reduces service
availability.

This patch changes neigh_update so that it bumps neigh->updated (as well
as neigh->confirmed) only once we are sure that either lladdr or entry
state will change). In the scenario described above, it means that the
second gratuitous ARP request will actually update the entry lladdr.

Ideally, we would update the neigh entry on the very first gratuitous
ARP request. The locktime mechanism is designed to ignore ARP updates in
a short timeframe after a previous ARP update was honoured by the kernel
layer. This would require tracking timestamps for state transitions
separately from timestamps when actual updates are received. This would
probably involve changes in neighbour struct. Therefore, the patch
doesn't tackle the issue of the first gratuitous APR ignored, leaving
it for a follow-up.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-17 11:41:38 -04:00
..
datagram.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c dev_ioctl: use sizeof(x) instead of sizeof x 2014-11-18 15:27:32 -05:00
dev.c xdp: refine xdp api with regards to generic xdp 2017-05-11 21:30:57 -04:00
devlink.c net/devlink: Add E-Switch encapsulation control 2017-04-22 20:26:37 +03:00
drop_monitor.c drop_monitor: use setup_timer 2017-03-12 23:47:16 -07:00
dst_cache.c net: dst_cache_per_cpu_dst_set() can be static 2016-03-18 17:45:08 -04:00
dst.c net: pending_confirm is not used anymore 2017-02-07 13:07:47 -05:00
ethtool.c net: Add ESP offload features 2017-04-14 10:05:36 +02:00
fib_rules.c fib_rules: fix error return code 2017-04-27 16:35:57 -04:00
filter.c bpf: restore skb->sk before pskb_trim() call 2017-04-30 22:23:16 -04:00
flow_dissector.c flow_dissector: add mpls support (v2) 2017-04-24 14:30:46 -04:00
flow.c flowcache: more "unsigned int" 2017-04-03 19:04:48 -07:00
gen_estimator.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
gen_stats.c net_sched: gen_estimator: complete rewrite of rate estimators 2016-12-05 15:21:59 -05:00
gro_cells.c net: Generic XDP 2017-04-25 13:33:49 -04:00
hwbm.c net: hwbm: Fix unbalanced spinlock in error case 2016-05-25 12:35:09 -07:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwt_bpf.c netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
lwtunnel.c lwtunnel: fix error path in lwtunnel_fill_encap() 2017-04-30 22:41:29 -04:00
Makefile gro_cells: move to net/core/gro_cells.c 2017-02-08 14:38:18 -05:00
neighbour.c neighbour: update neigh timestamps iff update is effective 2017-05-17 11:41:38 -04:00
net_namespace.c net: Initialise init_net.count to 1 2017-04-30 22:32:16 -04:00
net-procfs.c net: remove NETDEV_TX_LOCKED support 2016-04-26 15:53:05 -04:00
net-sysfs.c net: use net->count to check whether a netns is alive or not 2017-03-13 16:02:27 -07:00
net-sysfs.h
net-traces.c net: IPv6 fib lookup tracepoint 2015-11-22 11:54:10 -05:00
netclassid_cgroup.c cgroup, net_cls: iterate the fds of only the tasks which are being migrated 2017-03-22 10:32:46 -07:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c netpoll: Check for skb->queue_mapping 2017-04-21 15:45:19 -04:00
netprio_cgroup.c net: break include loop netdevice.h, dsa.h, devlink.h 2017-03-28 22:46:04 -07:00
pktgen.c net-tc: convert tc_verd to integer bitfields 2017-01-08 20:58:52 -05:00
ptp_classifier.c ptp: Change ptp_class to a proper bitmask 2015-11-03 11:08:22 -05:00
request_sock.c ipv4: Namespaceify tcp_max_syn_backlog knob 2016-12-29 11:38:31 -05:00
rtnetlink.c net: Improve handling of failures on link and route dumps 2017-05-16 14:54:11 -04:00
scm.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> 2017-03-02 08:42:29 +01:00
secure_seq.c tcp: randomize timestamps on syncookies 2017-05-05 12:00:11 -04:00
skbuff.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
sock_diag.c netlink: extended ACK reporting 2017-04-13 13:58:20 -04:00
sock_reuseport.c soreuseport: use "unsigned int" in __reuseport_alloc() 2017-04-03 19:06:38 -07:00
sock.c netem: fix skb_orphan_partial() 2017-05-11 21:32:48 -04:00
stream.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
sysctl_net_core.c Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning 2017-04-21 13:22:34 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
utils.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00