linux/net
Neil Horman 89ed5b5190 af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
When an application is run that:
a) Sets its scheduler to be SCHED_FIFO
and
b) Opens a memory mapped AF_PACKET socket, and sends frames with the
MSG_DONTWAIT flag cleared, its possible for the application to hang
forever in the kernel.  This occurs because when waiting, the code in
tpacket_snd calls schedule, which under normal circumstances allows
other tasks to run, including ksoftirqd, which in some cases is
responsible for freeing the transmitted skb (which in AF_PACKET calls a
destructor that flips the status bit of the transmitted frame back to
available, allowing the transmitting task to complete).

However, when the calling application is SCHED_FIFO, its priority is
such that the schedule call immediately places the task back on the cpu,
preventing ksoftirqd from freeing the skb, which in turn prevents the
transmitting task from detecting that the transmission is complete.

We can fix this by converting the schedule call to a completion
mechanism.  By using a completion queue, we force the calling task, when
it detects there are no more frames to send, to schedule itself off the
cpu until such time as the last transmitted skb is freed, allowing
forward progress to be made.

Tested by myself and the reporter, with good results

Change Notes:

V1->V2:
	Enhance the sleep logic to support being interruptible and
allowing for honoring to SK_SNDTIMEO (Willem de Bruijn)

V2->V3:
	Rearrage the point at which we wait for the completion queue, to
avoid needing to check for ph/skb being null at the end of the loop.
Also move the complete call to the skb destructor to avoid needing to
modify __packet_set_status.  Also gate calling complete on
packet_read_pending returning zero to avoid multiple calls to complete.
(Willem de Bruijn)

	Move timeo computation within loop, to re-fetch the socket
timeout since we also use the timeo variable to record the return code
from the wait_for_complete call (Neil Horman)

V3->V4:
	Willem has requested that the control flow be restored to the
previous state.  Doing so lets us eliminate the need for the
po->wait_on_complete flag variable, and lets us get rid of the
packet_next_frame function, but introduces another complexity.
Specifically, but using the packet pending count, we can, if an
applications calls sendmsg multiple times with MSG_DONTWAIT set, each
set of transmitted frames, when complete, will cause
tpacket_destruct_skb to issue a complete call, for which there will
never be a wait_on_completion call.  This imbalance will lead to any
future call to wait_for_completion here to return early, when the frames
they sent may not have completed.  To correct this, we need to re-init
the completion queue on every call to tpacket_snd before we enter the
loop so as to ensure we wait properly for the frames we send in this
iteration.

	Change the timeout and interrupted gotos to out_put rather than
out_status so that we don't try to free a non-existant skb
	Clean up some extra newlines (Willem de Bruijn)

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-26 19:38:29 -07:00
..
6lowpan treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
9p treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 188 2019-05-30 11:29:21 -07:00
802 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
8021q treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
appletalk treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 372 2019-06-05 17:37:10 +02:00
atm treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
ax25 ax25: fix inconsistent lock state in ax25_destroy_timer 2019-06-16 14:22:37 -07:00
batman-adv This feature/cleanup patchset includes the following patches: 2019-05-09 09:44:17 -07:00
bluetooth ipv6: constify rt6_nexthop() 2019-06-26 13:26:08 -07:00
bpf treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 2019-05-30 11:29:53 -07:00
bpfilter treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
bridge treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
caif treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 194 2019-05-30 11:29:22 -07:00
can can: purge socket error queue on sock destruct 2019-06-07 23:03:54 +02:00
ceph treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 268 2019-06-05 17:30:29 +02:00
core net: remove duplicate fetch in sock_getsockopt 2019-06-18 10:04:16 -07:00
dcb treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201 2019-05-30 11:29:52 -07:00
dccp treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
decnet treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 53 2019-05-24 17:36:42 +02:00
dns_resolver treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
dsa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-07 09:29:14 -07:00
ethernet treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
hsr hsr: fix don't prune the master node from the node_db 2019-05-23 09:29:44 -07:00
ieee802154 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
ife treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
ipv4 ipv4: reset rt_iif for recirculated mcast/bcast out pkts 2019-06-26 12:40:10 -07:00
ipv6 ipv6: fix neighbour resolution with raw socket 2019-06-26 13:26:08 -07:00
iucv net/af_iucv: always register net_device notifier 2019-06-19 16:26:33 -04:00
kcm treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
key treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
l2tp treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
l3mdev treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
lapb lapb: fixed leak of control-blocks. 2019-06-16 20:44:20 -07:00
llc treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 281 2019-06-05 17:36:36 +02:00
mac80211 SPDX update for 5.2-rc6 2019-06-21 09:58:42 -07:00
mac802154 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 174 2019-05-30 11:26:41 -07:00
mpls mpls: fix af_mpls dependencies for real 2019-06-12 09:42:34 -07:00
ncsi treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netfilter ipv6: constify rt6_nexthop() 2019-06-26 13:26:08 -07:00
netlabel treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13 2019-05-21 11:28:45 +02:00
netlink treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
netrom treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nfc SPDX update for 5.2-rc6 2019-06-21 09:58:42 -07:00
nsh treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
openvswitch Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-17 15:55:34 -07:00
packet af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET 2019-06-26 19:38:29 -07:00
phonet treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 336 2019-06-05 17:37:07 +02:00
psample treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
qrtr treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 284 2019-06-05 17:36:37 +02:00
rds net: rds: fix memory leak in rds_ib_flush_mr_pool 2019-06-06 10:32:16 -07:00
rfkill treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
rose treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
rxrpc treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sched net/sched: cbs: Fix error path of cbs_module_init 2019-06-23 11:32:48 -07:00
sctp sctp: change to hold sk after auth shkey is created successfully 2019-06-26 19:29:23 -07:00
smc net/smc: Fix error path in smc_init 2019-06-26 10:10:16 -07:00
strparser treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
sunrpc SUNRPC: Fix a credential refcount leak 2019-06-21 14:45:09 -04:00
switchdev treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
tipc tipc: check msg->req data len in tipc_nl_compat_bearer_disable 2019-06-24 10:03:59 -07:00
tls net/tls: fix page double free on TX cleanup 2019-06-24 07:20:45 -07:00
unix treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
vmw_vsock Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-06-21 22:23:35 -07:00
wimax treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 268 2019-06-05 17:30:29 +02:00
wireless SPDX update for 5.2-rc6 2019-06-21 09:58:42 -07:00
x25 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 41 2019-05-24 17:27:12 +02:00
xdp xdp: check device pointer before clearing 2019-06-12 16:41:47 +02:00
xfrm treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 335 2019-06-05 17:37:06 +02:00
compat.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
Makefile net: split out functions related to registering inflight socket files 2019-02-28 08:24:23 -07:00
socket.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sysctl_net.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00