linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-24 05:02:12 +00:00

Author	SHA1	Message	Date
Eric Dumazet	c2bb06db59	net: fix build errors if ipv6 is disabled CONFIG_IPV6=n is still a valid choice ;) It appears we can remove dead code. Reported-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 13:04:03 -04:00
Eric Dumazet	efe4208f47	ipv6: make lookups simpler and faster TCP listener refactoring, part 4 : To speed up inet lookups, we moved IPv4 addresses from inet to struct sock_common Now is time to do the same for IPv6, because it permits us to have fast lookups for all kind of sockets, including upcoming SYN_RECV. Getting IPv6 addresses in TCP lookups currently requires two extra cache lines, plus a dereference (and memory stall). inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6 This patch is way bigger than its IPv4 counter part, because for IPv4, we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6, it's not doable easily. inet6_sk(sk)->daddr becomes sk->sk_v6_daddr inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr And timewait socket also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr at the same offset. We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic macro. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-10-09 00:01:25 -04:00
David S. Miller	0e76a3a587	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge net into net-next to setup some infrastructure Eric Dumazet needs for usbnet changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-08-03 21:36:46 -07:00
NeilBrown	447383d2ba	NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure. Since we enabled auto-tuning for sunrpc TCP connections we do not guarantee that there is enough write-space on each connection to queue a reply. If memory pressure causes the window to shrink too small, the request throttling in sunrpc/svc will not accept any requests so no more requests will be handled. Even when pressure decreases the window will not grow again until data is sent on the connection. This means we get a deadlock: no requests will be handled until there is more space, and no space will be allocated until a request is handled. This can be simulated by modifying svc_tcp_has_wspace to inflate the number of byte required and removing the 'svc_sock_setbufsize' calls in svc_setup_socket. I found that multiplying by 16 was enough to make the requirement exceed the default allocation. With this modification in place: mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt would block and eventually time out because the nfs server could not accept any requests. This patch relaxes the request throttling to always allow at least one request through per connection. It does this by checking both sk_stream_min_wspace() and xprt->xpt_reserved are zero. The first is zero when the TCP transmit queue is empty. The second is zero when there are no RPC requests being processed. When both of these are zero the socket is idle and so one more request can safely be allowed through. Applying this patch allows the above mount command to succeed cleanly. Tracing shows that the allocated write buffer space quickly grows and after a few requests are handled, the extra tests are no longer needed to permit further requests to be processed. The main purpose of request throttling is to handle the case when one client is slow at collecting replies and the send queue gets full of replies that the client hasn't acknowledged (at the TCP level) yet. As we only change behaviour when the send queue is empty this main purpose is still preserved. Reported-by: Ben Myers <bpm@sgi.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-08-01 08:39:16 -04:00
Eric Dumazet	64dc61306c	net: add sk_stream_is_writeable() helper Several call sites use the hardcoded following condition : sk_stream_wspace(sk) >= sk_stream_min_wspace(sk) Lets use a helper because TCP_NOTSENT_LOWAT support will change this condition for TCP sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-07-24 17:54:48 -07:00
J. Bruce Fields	1f691b07c5	svcrpc: don't error out on small tcp fragment Though clients we care about mostly don't do this, it is possible for rpc requests to be sent in multiple fragments. Here we have a sanity check to ensure that the final received rpc isn't too small--except that the number we're actually checking is the length of just the final fragment, not of the whole rpc. So a perfectly legal rpc that's unluckily fragmented could cause the server to close the connection here. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:04 -04:00
J. Bruce Fields	cf3aa02cb4	svcrpc: fix handling of too-short rpc's If we detect that an rpc is too short, we abort and close the connection. Except, there's a bug here: we're leaving sk_datalen nonzero without leaving any pages in the sk_pages array. The most likely result of the inconsistency is a subsequent crash in svc_tcp_clear_pages. Also demote the BUG_ON in svc_tcp_clear_pages to a WARN. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-07-01 17:32:04 -04:00
Tom Parkin	73df66f8b1	ipv6: rename datagram_send_ctl and datagram_recv_ctl The datagram_*_ctl functions in net/ipv6/datagram.c are IPv6-specific. Since datagram_send_ctl is publicly exported it should be appropriately named to reflect the fact that it's for IPv6 only. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 13:53:08 -05:00
Linus Torvalds	982197277c	Merge branch 'for-3.8' of git://linux-nfs.org/~bfields/linux Pull nfsd update from Bruce Fields: "Included this time: - more nfsd containerization work from Stanislav Kinsbursky: we're not quite there yet, but should be by 3.9. - NFSv4.1 progress: implementation of basic backchannel security negotiation and the mandatory BACKCHANNEL_CTL operation. See http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues for remaining TODO's - Fixes for some bugs that could be triggered by unusual compounds. Our xdr code wasn't designed with v4 compounds in mind, and it shows. A more thorough rewrite is still a todo. - If you've ever seen "RPC: multiple fragments per record not supported" logged while using some sort of odd userland NFS client, that should now be fixed. - Further work from Jeff Layton on our mechanism for storing information about NFSv4 clients across reboots. - Further work from Bryan Schumaker on his fault-injection mechanism (which allows us to discard selective NFSv4 state, to excercise rarely-taken recovery code paths in the client.) - The usual mix of miscellaneous bugs and cleanup. Thanks to everyone who tested or contributed this cycle." * 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits) nfsd4: don't leave freed stateid hashed nfsd4: free_stateid can use the current stateid nfsd4: cleanup: replace rq_resused count by rq_next_page pointer nfsd: warn on odd reply state in nfsd_vfs_read nfsd4: fix oops on unusual readlike compound nfsd4: disable zero-copy on non-final read ops svcrpc: fix some printks NFSD: Correct the size calculation in fault_inject_write NFSD: Pass correct buffer size to rpc_ntop nfsd: pass proper net to nfsd_destroy() from NFSd kthreads nfsd: simplify service shutdown nfsd: replace boolean nfsd_up flag by users counter nfsd: simplify NFSv4 state init and shutdown nfsd: introduce helpers for generic resources init and shutdown nfsd: make NFSd service structure allocated per net nfsd: make NFSd service boot time per-net nfsd: per-net NFSd up flag introduced nfsd: move per-net startup code to separated function nfsd: pass net to __write_ports() and down nfsd: pass net to nfsd_set_nrthreads() ...	2012-12-20 14:04:11 -08:00
J. Bruce Fields	afc59400d6	nfsd4: cleanup: replace rq_resused count by rq_next_page pointer It may be a matter of personal taste, but I find this makes the code clearer. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 22:00:16 -05:00
J. Bruce Fields	3a28e33111	svcrpc: fix some printks Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-17 16:02:40 -05:00
J. Bruce Fields	836fbadb96	svcrpc: support multiple-fragment rpc's Over TCP, RPC's are preceded by a single 4-byte field telling you how long the rpc is (in bytes). The spec also allows you to send an RPC in multiple such records (the high bit of the length field is used to tell you whether this is the final record). We've survived for years without supporting this because in practice the clients we care about don't use it. But the userland rpc libraries do, and every now and then an experimental client will run into this. (Most recently I noticed it while trying to write a pynfs check.) And we're really on the wrong side of the spec here--let's fix this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:49:24 -05:00
J. Bruce Fields	8af345f58a	svcrpc: track rpc data length separately from sk_tcplen Keep a separate field, sk_datalen, that tracks only the data contained in a fragment, not including the fragment header. For now, this is always just max(0, sk_tcplen - 4), but after we allow multiple fragments sk_datalen will accumulate the total rpc data size while sk_tcplen only tracks progress receiving the current fragment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:49:14 -05:00
J. Bruce Fields	6a72ae2e23	svcrpc: fix off-by-4 error in "incomplete TCP record" dprintk The full reclen doesn't include the fragment header, but sk_tcplen does. Fix this to make it an apples-to-apples comparison. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:49:06 -05:00
J. Bruce Fields	ad46ccf094	svcrpc: delay minimum-rpc-size check till later Soon we want to support multiple fragments, in which case it may be legal for a single fragment to be smaller than 8 bytes, so we'll want to delay this check till we've reached the last fragment. Also fix an outdated comment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:47:58 -05:00
J. Bruce Fields	cc248d4b1d	svcrpc: don't byte-swap sk_reclen in place Byte-swapping in place is always a little dubious. Let's instead define this field to always be big-endian, and do the swapping on demand where we need it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-12-04 07:47:23 -05:00
Weston Andros Adamson	1b7a181907	SUNRPC: remove BUG_ONs from _reclassify_socket Replace multiple BUG_ON() calls with WARN_ON_ONCE() and early return when sanity checking socket ownership (lock). The bind call will fail if the socket was unsuccessfully reclassified. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-11-04 14:43:41 -05:00
J. Bruce Fields	eccf50c129	nfsd: remove unused listener-removal interfaces You can use nfsd/portlist to give nfsd additional sockets to listen on. In theory you can also remove listening sockets this way. But nobody's ever done that as far as I can tell. Also this was partially broken in 2.6.25, by `a217813f90` "knfsd: Support adding transports by writing portlist file". (Note that we decide whether to take the "delfd" case by checking for a digit--but what's actually expected in that case is something made by svc_one_sock_name(), which won't begin with a digit.) So, let's just rip out this stuff. Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-09-10 10:55:19 -04:00
J. Bruce Fields	9f9d2ebe69	svcrpc: make xpo_recvfrom return only >=0 The only errors returned from xpo_recvfrom have been -EAGAIN and -EAFNOSUPPORT. The latter was removed by a previous patch. That leaves only -EAGAIN, which is treated just like 0 by the caller (svc_recv). So, just ditch -EAGAIN and return 0 instead. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:41:07 -04:00
J. Bruce Fields	af6d572134	svcrpc: don't bother checking bad svc_addr_len result None of the callers should see an unsupported address family (only one of them even bothers to check for that case), so just check for the buggy case in svc_addr_len and don't bother elsewhere. Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:40:10 -04:00
J. Bruce Fields	f23abfdb94	svcrpc: minor udp code cleanup Order the code in a more boring way. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:07:50 -04:00
J. Bruce Fields	39b5530137	svcrpc: share some setup of listening sockets There's some duplicate code here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 17:07:48 -04:00
J. Bruce Fields	a8e10078a8	svcrpc: clean up control flow Mainly, use the kernel standard err = -ERROR; if (something_bad) goto out; normal case; rather than if (something_bad) err = -ERROR else { normal case; } Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 14:08:41 -04:00
J. Bruce Fields	72c3537607	svcrpc: standardize svc_setup_socket return convention Use the kernel-standard ptr-or-error return convention instead of passing a pointer to the error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-21 14:08:40 -04:00
J. Bruce Fields	be1e44441a	svcrpc: fix BUG() in svc_tcp_clear_pages Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if sk_tcplen would lead us to expect n pages of data). svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen. This is OK, since both functions are serialized by XPT_BUSY. However, that means the inconsistency must be repaired before dropping XPT_BUSY. Therefore we should be ensuring that svc_tcp_save_pages repairs the problem before exiting svc_tcp_recv_record on error. Symptoms were a BUG() in svc_tcp_clear_pages. Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-08-20 18:38:44 -04:00
Eric Dumazet	22911fc581	net: skb_free_datagram_locked() doesnt drop all packets dropwatch wrongly diagnose all received UDP packets as drops. This patch removes trace_kfree_skb() done in skb_free_datagram_locked(). Locations calling skb_free_datagram_locked() should do it on their own. As a result, drops are accounted on the right function. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 15:40:57 -07:00
Joe Perches	e87cc4728f	net: Convert net_ratelimit uses to net_<level>_ratelimited Standardize the net core ratelimited logging functions. Coalesce formats, align arguments. Change a printk then vprintk sequence to use printf extension %pV. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-15 13:45:03 -04:00
Pavel Emelyanov	4a17fd5229	sock: Introduce named constants for sk_reuse Name them in a "backward compatible" manner, i.e. reuse or not are still 1 and 0 respectively. The reuse value of 2 means that the socket with it will forcibly reuse everyone else's port. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 15:52:25 -04:00
J. Bruce Fields	1df00640c9	Merge nfs containerization work from Trond's tree The nfs containerization work is a prerequisite for Jeff Layton's reboot recovery rework.	2012-03-26 11:48:54 -04:00
Trond Myklebust	09acfea5d8	SUNRPC: Fix a few sparse warnings net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment (different address spaces) - svc_partial_recvfrom now takes a struct kvec, so the variable save_iovbase needs to be an ordinary (void *) Make a bunch of variables in net/sunrpc/xprtsock.c static Fix a couple of "warning: symbol 'foo' was not declared. Should it be static?" reports. Fix a couple of conflicting function declarations. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-03-11 19:30:02 -04:00
Stanislav Kinsbursky	9f912ceb7e	SUNPRC: remove marking service temporary sockets with XPT_CHNGBUF This is a cleanup patch. Service temporary sockets can be TCP or RDMA only. But XPT_CHNGBUF service socket flag is checked only for UDP sockets on receive. Thus (if I don't miss something non-obvious) this bit raising for temporary sockets can be removed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2012-02-03 14:26:43 -05:00
Trond Myklebust	d3b773e4fd	SUNRPC: fixup for namespace changes Fixes this build error when CONFIG_NET_NS is not set: net/sunrpc/svcsock.c: In function 'svc_setup_socket': net/sunrpc/svcsock.c:1412:40: error: 'struct sock_common' has no member named 'skc_net' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:22 -05:00
Stanislav Kinsbursky	5247fab5c8	SUNRPC: pass network namespace to service registering routines Lockd and NFSd services will handle requests from and to many network nsamespaces. And thus have to be registered and unregistered per network namespace. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-31 19:28:13 -05:00
Linus Torvalds	0b48d42235	Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux * 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits) nfsd4: nfsd4_create_clid_dir return value is unused NFSD: Change name of extended attribute containing junction svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown svcrpc: fix double-free on shutdown of nfsd after changing pool mode nfsd4: be forgiving in the absence of the recovery directory nfsd4: fix spurious 4.1 post-reboot failures NFSD: forget_delegations should use list_for_each_entry_safe NFSD: Only reinitilize the recall_lru list under the recall lock nfsd4: initialize special stateid's at compile time NFSd: use network-namespace-aware cache registering routines SUNRPC: create svc_xprt in proper network namespace svcrpc: update outdated BKL comment nfsd41: allow non-reclaim open-by-fh's in 4.1 svcrpc: avoid memory-corruption on pool shutdown svcrpc: destroy server sockets all at once svcrpc: make svc_delete_xprt static nfsd: Fix oops when parsing a 0 length export nfsd4: Use kmemdup rather than duplicating its implementation nfsd4: add a separate (lockowner, inode) lookup nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error ...	2012-01-14 12:26:41 -08:00
Stanislav Kinsbursky	bd4620ddf6	SUNRPC: create svc_xprt in proper network namespace This patch makes svc_xprt inherit network namespace link from its socket. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-12-06 16:20:42 -05:00
Alexey Dobriyan	4e3fd7a06d	net: remove ipv6_addr_copy() C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-22 16:43:32 -05:00
Paul Gortmaker	3a9a231d97	net: Fix files explicitly needing to include module.h With calls to modular infrastructure, these files really needs the full module.h header. Call it out so some of the cleanups of implicit and unrequired includes elsewhere can be cleaned up. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:30:28 -04:00
Mi Jinlong	849a1cf13d	SUNRPC: Replace svc_addr_u by sockaddr_storage For IPv6 local address, lockd can not callback to client for missing scope id when binding address at inet6_bind: 324 if (addr_type & IPV6_ADDR_LINKLOCAL) { 325 if (addr_len >= sizeof(struct sockaddr_in6) && 326 addr->sin6_scope_id) { 327 /* Override any existing binding, if another one 328 * is supplied by user. 329 / 330 sk->sk_bound_dev_if = addr->sin6_scope_id; 331 } 332 333 / Binding to link-local address requires an interface */ 334 if (!sk->sk_bound_dev_if) { 335 err = -EINVAL; 336 goto out_unlock; 337 } Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info besides address. Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-14 08:21:48 -04:00
Linus Torvalds	28890d3598	Merge branch 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs * 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (44 commits) NFSv4: Don't use the delegation->inode in nfs_mark_return_delegation() nfs: don't use d_move in nfs_async_rename_done RDMA: Increasing RPCRDMA_MAX_DATA_SEGS SUNRPC: Replace xprt->resend and xprt->sending with a priority queue SUNRPC: Allow caller of rpc_sleep_on() to select priority levels SUNRPC: Support dynamic slot allocation for TCP connections SUNRPC: Clean up the slot table allocation SUNRPC: Initalise the struct xprt upon allocation SUNRPC: Ensure that we grab the XPRT_LOCK before calling xprt_alloc_slot pnfs: simplify pnfs files module autoloading nfs: document nfsv4 sillyrename issues NFS: Convert nfs4_set_ds_client to EXPORT_SYMBOL_GPL SUNRPC: Convert the backchannel exports to EXPORT_SYMBOL_GPL SUNRPC: sunrpc should not explicitly depend on NFS config options NFS: Clean up - simplify the switch to read/write-through-MDS NFS: Move the pnfs write code into pnfs.c NFS: Move the pnfs read code into pnfs.c NFS: Allow the nfs_pageio_descriptor to signal that a re-coalesce is needed NFS: Use the nfs_pageio_descriptor->pg_bsize in the read/write request NFS: Cache rpc_ops in struct nfs_pageio_descriptor ...	2011-07-27 13:23:02 -07:00
H Hartley Sweeten	177e4f998b	svcsock.c: include sunrpc.h to quiet sparse noise Include the private header sunrpc.h to pickup the declaration of the function svc_send_common to quiet the following sparse noise: warning: symbol 'svc_send_common' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-07-15 18:58:43 -04:00
Trond Myklebust	9e00abc3c2	SUNRPC: sunrpc should not explicitly depend on NFS config options Change explicit references to CONFIG_NFS_V4_1 to implicit ones Get rid of the unnecessary defines in backchannel_rqst.c and bc_svc.c: the Makefile takes care of those dependency. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-07-15 09:12:23 -04:00
J. Bruce Fields	8985ef0b8a	svcrpc: complete svsk processing on cb receive failure Currently when there's some failure to receive a callback (because we couldn't find a matching xid, for example), we exit svc_recv with sk_tcplen still set but without any pages saved with the socket. This will cause a crash later in svc_tcp_restore_pages. Instead, make sure we reset that tcp information whether the callback received failed or succeeded. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-10 10:47:46 -04:00
Olga Kornievskaia	9660439861	svcrpc: take advantage of tcp autotuning Allow the NFSv4 server to make use of TCP autotuning behaviour, which was previously disabled by setting the sk_userlocks variable. Set the receive buffers to be big enough to receive the whole RPC request, and set this for the listening socket, not the accept socket. Remove the code that readjusts the receive/send buffer sizes for the accepted socket. Previously this code was used to influence the TCP window management behaviour, which is no longer needed when autotuning is enabled. This can improve IO bandwidth on networks with high bandwidth-delay products, where a large tcp window is required. It also simplifies performance tuning, since getting adequate tcp buffers previously required increasing the number of nfsd threads. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-04-07 14:36:40 -04:00
J. Bruce Fields	31d68ef65c	SUNRPC: Don't wait for full record to receive tcp data Ensure that we immediately read and buffer data from the incoming TCP stream so that we grow the receive window quickly, and don't deadlock on large READ or WRITE requests. Also do some minor exit cleanup. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00
Trond Myklebust	586c52cc61	svcrpc: copy cb reply instead of pages It's much simpler just to copy the cb reply data than to play tricks with pages. Callback replies will typically be very small (at least until we implement cb_getattr, in which case files with very long ACLs could pose a problem), so there's no loss in efficiency. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00
J. Bruce Fields	cc6c2127f2	svcrpc: close connection if client sends short packet If the client sents a record too short to contain even the beginning of the rpc header, then just close the connection. The current code drops the record data and continues. I don't see the point. It's a hopeless situation and simpler just to cut off the connection completely. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00
J. Bruce Fields	48e6555c7b	svcrpc: note network-order types in svc_process_calldir Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:40 -04:00
Trond Myklebust	5ee78d483c	SUNRPC: svc_tcp_recvfrom cleanup Minor cleanup in preparation for later patches. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:39 -04:00
Trond Myklebust	0601f79392	SUNRPC: requeue tcp socket less frequently Don't requeue the socket in some cases where we know it's unnecessary. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-04-07 14:36:39 -04:00
Eric Dumazet	eaefd1105b	net: add __rcu annotations to sk_wq and wq Add proper RCU annotations/verbs to sk_wq and wq members Fix __sctp_write_space() sk_sleep() abuse (and sock->wq access) Fix sunrpc sk_sleep() abuse too Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-22 10:19:31 -08:00

1 2 3 4 5

211 Commits