linux/net
Rainer Weikusat ec0d215f94 af_unix: fix 'poll for write'/connected DGRAM sockets
For n:1 'datagram connections' (eg /dev/log), the unix_dgram_sendmsg
routine implements a form of receiver-imposed flow control by
comparing the length of the receive queue of the 'peer socket' with
the max_ack_backlog value stored in the corresponding sock structure,
either blocking the thread which caused the send-routine to be called
or returning EAGAIN. This routine is used by both SOCK_DGRAM and
SOCK_SEQPACKET sockets. The poll-implementation for these socket types
is datagram_poll from core/datagram.c. A socket is deemed to be
writeable by this routine when the memory presently consumed by
datagrams owned by it is less than the configured socket send buffer
size. This is always wrong for PF_UNIX non-stream sockets connected to
server sockets dealing with (potentially) multiple clients if the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write result in EAGAIN, effectively causing an (usual)
application to 'poll for writeability by repeated send request with
O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both type of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_recvmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.

Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:34:18 -07:00
..
9p 9p: fix error path during early mount 2008-05-14 19:23:27 -05:00
802
8021q vlan: Use bitmask of feature flags instead of seperate feature bits 2008-05-23 00:27:50 -07:00
appletalk
atm atm: [br2864] fix routed vcmux support 2008-06-16 17:18:18 -07:00
ax25 ax25: Fix NULL pointer dereference and lockup. 2008-06-03 14:53:46 -07:00
bluetooth bluetooth: rfcomm_dev_state_change deadlock fix 2008-06-03 14:27:17 -07:00
bridge bridge: Consolidate error paths in br_add_bridge(). 2008-05-04 17:58:07 -07:00
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-05-08 19:03:26 -07:00
core tcp: fix for splice receive when used with software LRO 2008-06-27 17:27:21 -07:00
dccp dccp: Bug in initial acknowledgment number assignment 2008-06-11 11:19:10 +01:00
decnet ip: Use inline function dst_metric() instead of direct access to dst->metric[] 2008-05-04 22:14:42 -07:00
econet net: Allow netdevices to specify needed head/tailroom 2008-05-12 20:48:31 -07:00
ethernet [NET]: Return more appropriate error from eth_validate_addr(). 2008-04-13 22:45:40 -07:00
ieee80211 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2008-04-14 02:30:23 -07:00
ipv4 tcp: calculate tcp_mem based on low memory instead of all memory 2008-06-27 17:23:57 -07:00
ipv6 netfilter: ip6table_mangle: don't reroute in LOCAL_IN 2008-06-24 13:30:45 -07:00
ipx
irda irda: Sock leak on error path in irda_create. 2008-06-03 15:18:36 -07:00
iucv iucv: Delay bus registration until core is ready. 2008-04-10 02:12:45 -07:00
key ipsec: pfkey should ignore events when no listeners 2008-06-10 14:25:34 -07:00
lapb
llc llc: Fix double accounting of received packets 2008-05-30 02:57:29 -07:00
mac80211 mac80211: fix an oops in several failure paths in key allocation 2008-06-27 14:49:52 -04:00
netfilter netfilter: nf_conntrack_h323: fix module unload crash 2008-06-17 15:52:32 -07:00
netlabel Audit: collect sessionid in netlink messages 2008-04-28 06:18:03 -04:00
netlink netlink: genl: fix circular locking 2008-06-18 02:07:07 -07:00
netrom
packet net: Allow netdevices to specify needed head/tailroom 2008-05-12 20:48:31 -07:00
rfkill rfkill: Fix device type check when toggling states 2008-04-15 15:04:35 -04:00
rose rose: Wrong list_lock argument in rose_node seqops 2008-05-02 17:03:22 -07:00
rxrpc net: Add missing braces to multi-statement if()s 2008-05-02 16:20:10 -07:00
sched pkt_sched: Change HTB_HYSTERESIS to a runtime parameter htb_hysteresis. 2008-06-16 16:39:32 -07:00
sctp sctp: Make sure N * sizeof(union sctp_addr) does not overflow. 2008-06-20 22:04:34 -07:00
sunrpc Merge branch 'for-2.6.26' of git://linux-nfs.org/~bfields/linux 2008-05-20 19:30:54 -07:00
tipc tipc: Increase buffer header to support worst-case device 2008-05-08 21:38:24 -07:00
unix af_unix: fix 'poll for write'/connected DGRAM sockets 2008-06-27 19:34:18 -07:00
wanrouter
wireless mac80211: implement EU regulatory domain 2008-06-25 10:31:29 -04:00
x25
xfrm xfrm: xfrm_algo: correct usage of RIPEMD-160 2008-06-04 12:04:55 -07:00
compat.c net: Add compat support for getsockopt (MCAST_MSFILTER) 2008-04-29 03:23:22 -07:00
Kconfig
Makefile
nonet.c
socket.c net: Unexport move_addr_to_{kernel,user} 2008-04-23 03:37:49 -07:00
sysctl_net.c net: fix returning void-valued expression warnings 2008-05-01 02:47:38 -07:00
TUNABLE