linux/net/core
Neil Horman 2d8bff1269 netpoll: Close race condition between poll_one_napi and napi_disable
Drivers might call napi_disable while not holding the napi instance poll_lock.
In those instances, its possible for a race condition to exist between
poll_one_napi and napi_disable.  That is to say, poll_one_napi only tests the
NAPI_STATE_SCHED bit to see if there is work to do during a poll, and as such
the following may happen:

CPU0				CPU1
ndo_tx_timeout			napi_poll_dev
 napi_disable			 poll_one_napi
  test_and_set_bit (ret 0)
				  test_bit (ret 1)
   reset adapter		   napi_poll_routine

If the adapter gets a tx timeout without a napi instance scheduled, its possible
for the adapter to think it has exclusive access to the hardware  (as the napi
instance is now scheduled via the napi_disable call), while the netpoll code
thinks there is simply work to do.  The result is parallel hardware access
leading to corrupt data structures in the driver, and a crash.

Additionaly, there is another, more critical race between netpoll and
napi_disable.  The disabled napi state is actually identical to the scheduled
state for a given napi instance.  The implication being that, if a napi instance
is disabled, a netconsole instance would see the napi state of the device as
having been scheduled, and poll it, likely while the driver was dong something
requiring exclusive access.  In the case above, its fairly clear that not having
the rings in a state ready to be polled will cause any number of crashes.

The fix should be pretty easy.  netpoll uses its own bit to indicate that that
the napi instance is in a state of being serviced by netpoll (NAPI_STATE_NPSVC).
We can just gate disabling on that bit as well as the sched bit.  That should
prevent netpoll from conducting a napi poll if we convert its set bit to a
test_and_set_bit operation to provide mutual exclusion

Change notes:
V2)
	Remove a trailing whtiespace
	Resubmit with proper subject prefix

V3)
	Clean up spacing nits

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: jmaxwell@redhat.com
Tested-by: jmaxwell@redhat.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-23 14:32:50 -07:00
..
datagram.c net: Fix skb_set_peeked use-after-free bug 2015-08-06 21:55:47 -07:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c dev_ioctl: use sizeof(x) instead of sizeof x 2014-11-18 15:27:32 -05:00
dev.c netpoll: Close race condition between poll_one_napi and napi_disable 2015-09-23 14:32:50 -07:00
drop_monitor.c net: Replace get_cpu_var through this_cpu_ptr 2014-08-26 13:45:47 -04:00
dst.c tun_dst: Remove opts_size 2015-08-31 21:23:42 -07:00
ethtool.c net/ethtool: Add current supported tunable options 2015-06-11 00:36:37 -07:00
fib_rules.c net: ipv6: use common fib_default_rule_pref 2015-09-09 14:19:50 -07:00
filter.c ebpf: emit correct src_reg for conditional jumps 2015-09-11 14:52:41 -07:00
flow_dissector.c flow_dissector: Use 'const' where possible. 2015-09-01 21:19:17 -07:00
flow.c flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c 2015-09-01 17:00:24 -07:00
gen_estimator.c net_sched: gen_estimator: extend pps limit 2015-07-08 13:59:20 -07:00
gen_stats.c gen_stats.c: Duplicate xstats buffer for later use 2015-02-19 15:45:53 -05:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwtunnel.c lwt: Add cfg argument to build_state 2015-08-24 10:34:40 -07:00
Makefile lwtunnel: infrastructure for handling light weight tunnels like mpls 2015-07-21 10:39:03 -07:00
neighbour.c net: add explicit logging and stat for neighbour table overflow 2015-08-10 13:46:21 -07:00
net_namespace.c netns: make nsid_lock per net 2015-05-17 23:41:11 -04:00
net-procfs.c
net-sysfs.c net: allow sleeping when modifying store_rps_map 2015-08-13 21:40:57 -07:00
net-sysfs.h net: netdev_kobject_init: annotate with __init 2014-01-05 20:27:54 -05:00
net-traces.c net: FIB tracepoints 2015-08-29 13:05:16 -07:00
netclassid_cgroup.c cgroup: net_cls: fix false-positive "suspicious RCU usage" 2015-07-25 00:13:18 -07:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c netpoll: Close race condition between poll_one_napi and napi_disable 2015-09-23 14:32:50 -07:00
netprio_cgroup.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
pktgen.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-13 16:23:11 -07:00
ptp_classifier.c net: filter: split 'struct sk_filter' into socket and bpf parts 2014-08-02 15:03:58 -07:00
request_sock.c inet: fix races with reqsk timers 2015-08-10 21:17:29 -07:00
rtnetlink.c rtnetlink: catch -EOPNOTSUPP errors from ndo_bridge_getlink 2015-09-15 15:03:57 -07:00
scm.c net: introduce helper macro for_each_cmsghdr 2014-12-10 22:41:55 -05:00
secure_seq.c net: remove a sparse error in secure_dccpv6_sequence_number() 2015-05-25 22:55:37 -04:00
skbuff.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-08-27 21:45:31 -07:00
sock_diag.c sock, diag: fix panic in sock_diag_put_filterinfo 2015-09-02 11:29:29 -07:00
sock.c net: core: drop null test before destroy functions 2015-09-15 16:49:43 -07:00
stream.c tcp: set SOCK_NOSPACE under memory pressure 2015-05-09 17:38:36 -04:00
sysctl_net_core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-03-20 18:51:09 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: fix unaligned access to crafted TCP header in helper API 2014-10-22 12:52:55 -04:00
utils.c net: Add inet_proto_csum_replace_by_diff utility function 2015-08-17 21:33:06 -07:00