linux

Author	SHA1	Message	Date
Linus Torvalds	4ef58d4e2a	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits) tree-wide: fix misspelling of "definition" in comments reiserfs: fix misspelling of "journaled" doc: Fix a typo in slub.txt. inotify: remove superfluous return code check hdlc: spelling fix in find_pvc() comment doc: fix regulator docs cut-and-pasteism mtd: Fix comment in Kconfig doc: Fix IRQ chip docs tree-wide: fix assorted typos all over the place drivers/ata/libata-sff.c: comment spelling fixes fix typos/grammos in Documentation/edac.txt sysctl: add missing comments fs/debugfs/inode.c: fix comment typos sgivwfb: Make use of ARRAY_SIZE. sky2: fix sky2_link_down copy/paste comment error tree-wide: fix typos "couter" -> "counter" tree-wide: fix typos "offest" -> "offset" fix kerneldoc for set_irq_msi() spidev: fix double "of of" in comment comment typo fix: sybsystem -> subsystem ...	2009-12-09 19:43:33 -08:00
Suresh Jayaraman	053e324f67	rpc: remove unneeded function parameter in gss_add_msg() The pointer to struct gss_auth parameter in gss_add_msg is not really needed after commit `5b7ddd4a`. Zap it. Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-09 16:23:18 -05:00
Trond Myklebust	e4ee5d4dfd	Merge branch 'bugfixes' into nfs-for-next	2009-12-08 14:36:53 -05:00
Roel Kluin	480e3243df	SUNRPC: IS_ERR/PTR_ERR confusion IS_ERR returns 1 or 0, PTR_ERR returns the error value. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-08 13:13:03 -05:00
Linus Torvalds	d7fc02c7ba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits) mac80211: fix reorder buffer release iwmc3200wifi: Enable wimax core through module parameter iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter iwmc3200wifi: Coex table command does not expect a response iwmc3200wifi: Update wiwi priority table iwlwifi: driver version track kernel version iwlwifi: indicate uCode type when fail dump error/event log iwl3945: remove duplicated event logging code b43: fix two warnings ipw2100: fix rebooting hang with driver loaded cfg80211: indent regulatory messages with spaces iwmc3200wifi: fix NULL pointer dereference in pmkid update mac80211: Fix TX status reporting for injected data frames ath9k: enable 2GHz band only if the device supports it airo: Fix integer overflow warning rt2x00: Fix padding bug on L2PAD devices. WE: Fix set events not propagated b43legacy: avoid PPC fault during resume b43: avoid PPC fault during resume tcp: fix a timewait refcnt race ... Fix up conflicts due to sysctl cleanups (dead sysctl_check code and CTL_UNNUMBERED removed) in kernel/sysctl_check.c net/ipv4/sysctl_net_ipv4.c net/ipv6/addrconf.c net/sctp/sysctl.c	2009-12-08 07:55:01 -08:00
Linus Torvalds	1557d33007	Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits) security/tomoyo: Remove now unnecessary handling of security_sysctl. security/tomoyo: Add a special case to handle accesses through the internal proc mount. sysctl: Drop & in front of every proc_handler. sysctl: Remove CTL_NONE and CTL_UNNUMBERED sysctl: kill dead ctl_handler definitions. sysctl: Remove the last of the generic binary sysctl support sysctl net: Remove unused binary sysctl code sysctl security/tomoyo: Don't look at ctl_name sysctl arm: Remove binary sysctl support sysctl x86: Remove dead binary sysctl support sysctl sh: Remove dead binary sysctl support sysctl powerpc: Remove dead binary sysctl support sysctl ia64: Remove dead binary sysctl support sysctl s390: Remove dead sysctl binary support sysctl frv: Remove dead binary sysctl support sysctl mips/lasat: Remove dead binary sysctl support sysctl drivers: Remove dead binary sysctl support sysctl crypto: Remove dead binary sysctl support sysctl security/keys: Remove dead binary sysctl support sysctl kernel: Remove binary sysctl logic ...	2009-12-08 07:38:50 -08:00
Jiri Kosina	d014d04386	Merge branch 'for-next' into for-linus Conflicts: kernel/irq/chip.c	2009-12-07 18:36:35 +01:00
André Goddard Rosa	af901ca181	tree-wide: fix assorted typos all over the place That is "success", "unknown", "through", "performance", "[re\|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over\|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-12-04 15:39:55 +01:00
Trond Myklebust	7285f2d2ff	Merge branch 'devel' into linux-next	2009-12-03 21:27:36 -05:00
Chuck Lever	3a28becc35	SUNRPC: soft connect semantics for UDP Introduce soft connect behavior for UDP transports. In this case, a major timeout returns ETIMEDOUT instead of EIO. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	caabea8a56	SUNRPC: Use soft connect semantics when performing RPC ping Currently, if a remote RPC service is unreachable, an RPC ping will hang until the underlying transport connect attempt times out. A more desirable behavior might be to have the ping fail immediately so upper layers can recover appropriately. In the case of an NFS mount, for instance, this would mean the mount(2) system call could fail immediately if the server isn't listening, rather than hanging uninterruptibly for more than 3 minutes. Change rpc_ping() so that it fails immediately for connection-oriented transports. rpc_create() will then fail immediately for such transports if an RPC ping was requested. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	012da158f6	SUNRPC: Use soft connects for autobinding over TCP Autobinding is handled by the rpciod process, not in user processes that are generating regular RPC requests. Thus autobinding is usually not affected by signals targetting user processes, such as KILL or timer expiration events. In addition, an RPC request generated by a user process that has RPC_TASK_SOFTCONN set and needs to perform an autobind will hang if the remote rpcbind service is not available. For rpcbind queries on connection-oriented transports, let's use the new soft connect semantic to return control to the user's process quickly, if the kernel's rpcbind client can't connect to the remote rpcbind service. Logic is introduced in call_bind_status() to handle connection errors that occurred during an asynchronous rpcbind query. The logic abandons the rpcbind query if the RPC request has SOFTCONN set, and retries after a few seconds in the normal case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	2a76b3bfa2	SUNRPC: Use TCP for local rpcbind upcalls Use TCP with the soft connect semantic for local rpcbind upcalls so the kernel can detect immediately if the local rpcbind daemon is not running. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	c526611dd6	SUNRPC: Use a cached RPC client and transport for rpcbind upcalls The kernel's rpcbind client creates and deletes an rpc_clnt and its underlying transport socket for every upcall to the local rpcbind daemon. When starting a typical NFS server on IPv4 and IPv6, the NFS service itself does three upcalls (one per version) times two upcalls (one per transport) times two upcalls (one per address family), making 12, plus another one for the initial call to unregister previous NFS services. Starting the NLM service adds an additional 13 upcalls, for similar reasons. (Currently the NFS service doesn't start IPv6 listeners, but it will soon enough). Instead, let's create an rpc_clnt for rpcbind upcalls during the first local rpcbind query, and cache it. This saves the overhead of creating and destroying an rpc_clnt and a socket for every upcall. The new logic also prevents the kernel from attempting an RPCB_SET or RPCB_UNSET if it knows from the start that the local portmapper does not support rpcbind protocol version 4. This will cut down on the number of rpcbind upcalls in legacy environments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	5a46211540	SUNRPC: Simplify synopsis of rpcb_local_clnt() Clean up: At one point, rpcb_local_clnt() handled IPv6 loopback addresses too, but it doesn't any more; only IPv4 loopback is used now. Get rid of the @addr and @addrlen arguments to rpcb_local_clnt(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	09a21c4102	SUNRPC: Allow RPCs to fail quickly if the server is unreachable The kernel sometimes makes RPC calls to services that aren't running. Because the kernel's RPC client always assumes the hard retry semantic when reconnecting a connection-oriented RPC transport, the underlying reconnect logic takes a long while to time out, even though the remote may have responded immediately with ECONNREFUSED. In certain cases, like upcalls to our local rpcbind daemon, or for NFS mount requests, we'd like the kernel to fail immediately if the remote service isn't reachable. This allows another transport to be tried immediately, or the pending request can be abandoned quickly. Introduce a per-request flag which controls how call_transmit_status() behaves when request transmission fails because the server cannot be reached. We don't want soft connection semantics to apply to other errors. The default case of the switch statement in call_transmit_status() no longer falls through; the fall through code is copied to the default case, and a "break;" is added. The transport's connection re-establishment timeout is also ignored for such requests. We want the request to fail immediately, so the reconnect delay is skipped. Additionally, we don't want a connect failure here to further increase the reconnect timeout value, since this request will not be retried. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	206a134b4d	SUNRPC: Check explicitly for tk_status == 0 in call_transmit_status() The success case, where task->tk_status == 0, is by far the most frequent case in call_transmit_status(). The default: arm of the switch statement in call_transmit_status() handles the 0 case. default: was moved close to the top of the switch statement in call_transmit_status() under the theory that the compiler places object code for the earliest arms of a switch statement first, making the CPU do less work. The default: arm of a switch statement, however, is executed only after all the other cases have been checked. Even if the compiler rearranges the object code, the default: arm is the "last resort", meaning all of the other cases have been explicitly exhausted. That makes the current arrangement about as inefficient as it gets for the common case. To fix this, add an explicit check for zero before the switch statement. That forces the compiler to do the zero check first, no matter what optimizations it might try to do to the switch statement. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Chuck Lever	dd1fd90fe6	SUNRPC: Display compressed (shorthand) IPv6 presentation addresses Recent changes to snprintf() introduced the %pI6c formatter, which can display an IPv6 address with standard shorthanding. Using a shorthanded address can save us a few bytes of memory for each stored presentation address, or a few bytes on the wire when sending these in a universal address. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 15:58:56 -05:00
Trond Myklebust	f0380f3d16	RPC: Fix two potential races in put_rpccred It is possible for rpcauth_destroy_credcache() to cause the rpc credentials to be unhashed while put_rpccred is waiting for the rpc_credcache_lock on another cpu. Should this happen, then we can end up calling hlist_del_rcu(&cred->cr_hash) a second time in put_rpccred, thus causing list corruption. Should the credential actually be hashed, it is also possible for rpcauth_lookup_credcache to find and reference it before we get round to unhashing it. In this case, the call to rpcauth_unhash_cred will fail, and so we should just exit without destroying the cred. Reported-by: Neil Brown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 08:10:17 -05:00
Trond Myklebust	feb8ca37cc	SUNRPC: Ensure that we honour autoclose before attempting to reconnect If the XPRT_CLOSE_WAIT flag is set, we need to ensure that we call xprt->ops->close() while holding xprt_lock_write() before we can start reconnecting. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-12-03 08:10:17 -05:00
David S. Miller	ff9c38bba3	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/ht.c	2009-12-01 22:13:38 -08:00
Joe Perches	f64f9e7192	net: Move && and \|\| to end of previous line Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-29 16:55:45 -08:00
J. Bruce Fields	9b8b317d58	Merge commit 'v2.6.32-rc8' into HEAD	2009-11-23 12:34:58 -05:00
J. Bruce Fields	78c210efde	Revert "knfsd: avoid overloading the CPU scheduler with enormous load averages" This reverts commit `59a252ff8c`. This helps in an entirely cached workload but not necessarily in workloads that require waiting on disk. Conflicts: include/linux/sunrpc/svc.h net/sunrpc/svc_xprt.c Reported-by: Simon Kirby <sim@hostway.ca> Tested-by: Jesper Krogh <jesper@krogh.cc> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-11-23 12:34:05 -05:00
David S. Miller	3505d1a9fd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/sfc/sfe4001.c drivers/net/wireless/libertas/cmd.c drivers/staging/Kconfig drivers/staging/Makefile drivers/staging/rtl8187se/Kconfig drivers/staging/rtl8192e/Kconfig	2009-11-18 22:19:03 -08:00
Eric W. Biederman	6d4561110a	sysctl: Drop & in front of every proc_handler. For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-18 08:37:40 -08:00
Chuck Lever	1e360a60b2	SUNRPC: Address buffer overrun in rpc_uaddr2sockaddr() The size of buf[] must account for the string termination needed for the first strict_strtoul() call. Introduced in commit `a02d6926`. Fábio Olivé Leite points out that strict_strtoul() requires _either_ '\n\0' _or_ '\0' termination, so use the simpler '\0' here instead. See http://bugzilla.kernel.org/show_bug.cgi?id=14546 . Reported-by: argp@census-labs.com Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Fábio Olivé Leite <fleite@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-11-14 08:17:04 +09:00
Eric W. Biederman	f8572d8f2a	sysctl net: Remove unused binary sysctl code Now that sys_sysctl is a compatiblity wrapper around /proc/sys all sysctl strategy routines, and all ctl_name and strategy entries in the sysctl tables are unused, and can be revmoed. In addition neigh_sysctl_register has been modified to no longer take a strategy argument and it's callers have been modified not to pass one. Cc: "David Miller" <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: netdev@vger.kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:05:06 -08:00
David S. Miller	230f9bb701	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/usb/cdc_ether.c All CDC ethernet devices of type USB_CLASS_COMM need to use '&mbm_info'. Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-06 00:55:55 -08:00
Linus Torvalds	a84216e671	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits) mac80211: check interface is down before type change cfg80211: fix NULL ptr deref libertas if_usb: Fix crash on 64-bit machines mac80211: fix reason code output endianness mac80211: fix addba timer ath9k: fix misplaced semicolon on rate control b43: Fix DMA TX bounce buffer copying mac80211: fix BSS leak rt73usb.c : more ids ipw2200: fix oops on missing firmware gre: Fix dev_addr clobbering for gretap sky2: set carrier off in probe net: fix sk_forward_alloc corruption pcnet_cs: add cis of PreMax PE-200 ethernet pcmcia card r8169: Fix card drop incoming VLAN tagged MTU byte large jumbo frames ibmtr: possible Read buffer overflow? net: Fix RPF to work with policy routing net: fix kmemcheck annotations e1000e: rework disable K1 at 1000Mbps for 82577/82578 e1000e: config PHY via software after resets ...	2009-11-03 07:44:01 -08:00
Eric Dumazet	9d410c7960	net: fix sk_forward_alloc corruption On UDP sockets, we must call skb_free_datagram() with socket locked, or risk sk_forward_alloc corruption. This requirement is not respected in SUNRPC. Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC Reported-by: Francis Moreau <francis.moro@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-30 12:25:12 -07:00
J. Bruce Fields	dc83d6e27f	nfsd4: don't try to map gid's in generic rpc code For nfsd we provide users the option of mapping uid's to server-side supplementary group lists. That makes sense for nfsd, but not necessarily for other rpc users (such as the callback client). So move that lookup to svcauth_unix_set_client, which is a program-specific method. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-10-27 19:34:43 -04:00
Eric Dumazet	c720c7e838	inet: rename some inet_sock fields In order to have better cache layouts of struct sock (separate zones for rx/tx paths), we need this preliminary patch. Goal is to transfert fields used at lookup time in the first read-mostly cache line (inside struct sock_common) and move sk_refcnt to a separate cache line (only written by rx path) This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr, sport and id fields. This allows a future patch to define these fields as macros, like sk_refcnt, without name clashes. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-18 18:52:53 -07:00
Alexey Dobriyan	d43c36dc6b	headers: remove sched.h from interrupt.h After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-10-11 11:20:58 -07:00
Brian Haley	b301e82cf8	IPv6: use ipv6_addr_set_v4mapped() Might as well use the ipv6_addr_set_v4mapped() inline we created last year. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-07 13:58:25 -07:00
Jaswinder Singh Rajput	0c01695dab	net: fix htmldocs sunrpc, clnt.c DOCPROC Documentation/DocBook/networking.xml Warning(net/sunrpc/clnt.c:647): No description found for parameter 'req' Warning(net/sunrpc/clnt.c:647): No description found for parameter 'tk_ops' Warning(net/sunrpc/clnt.c:647): Excess function parameter 'ops' description in 'rpc_run_bc_task' Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Andy Adamson <andros@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-09-24 15:39:14 -07:00
Jaswinder Singh Rajput	7a73fdde39	net: fix htmldocs sunrpc, clnt.c DOCPROC Documentation/DocBook/networking.xml Warning(net/sunrpc/clnt.c:647): No description found for parameter 'req' Warning(net/sunrpc/clnt.c:647): No description found for parameter 'tk_ops' Warning(net/sunrpc/clnt.c:647): Excess function parameter 'ops' description in 'rpc_run_bc_task' Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Cc: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Cc: Benny Halevy <bhalevy@panasas.com> Cc: Andy Adamson <andros@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Miller <davem@davemloft.net> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-24 14:58:42 -04:00
Alexey Dobriyan	8d65af789f	sysctl: remove "struct file *" argument of ->proc_handler It's unused. It isn't needed -- read or write flag is already passed and sysctl shouldn't care about the rest. It _was_ used in two places at arch/frv for some reason. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-24 07:21:04 -07:00
Alexey Dobriyan	2bcd57ab61	headers: utsname.h redux * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 18:13:10 -07:00
Randy Dunlap	4111d4fde6	sunrpc/rpc_pipe: fix kernel-doc notation Fix kernel-doc notation (& warnings) in sunrpc/rpc_pipe.c. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-23 14:36:38 -04:00
Neil Brown	61d0a8e6a8	NFS/RPC: fix problems with reestablish_timeout and related code. [[resending with correct cc: - "vfs.kernel.org" just isn't right!]] xprt->reestablish_timeout is used to cause TCP connection attempts to back off if the connection fails so as not to hammer the network, but to still allow immediate connections when there is no reason to believe there is a problem. It is not used for the first connection (when transport->sock is NULL) but only on reconnects. It is currently set: a/ to 0 when xs_tcp_state_change finds a state of TCP_FIN_WAIT1 on the assumption that the client has closed the connection so the reconnect should be immediate when needed. b/ to at least XS_TCP_INIT_REEST_TO when xs_tcp_state_change detects TCP_CLOSING or TCP_CLOSE_WAIT on the assumption that the server closed the connection so a small delay at least is required. c/ as above when xs_tcp_state_change detects TCP_SYN_SENT, so that it is never 0 while a connection has been attempted, else the doubling will produce 0 and there will be no backoff. d/ to double is value (up to a limit) when delaying a connection, thus providing exponential backoff and e/ to XS_TCP_INIT_REEST_TO in xs_setup_tcp as simple initialisation. So you can see it is highly dependant on xs_tcp_state_change being called as expected. However experimental evidence shows that xs_tcp_state_change does not see all state changes. ("rpcdebug -m rpc trans" can help show what actually happens). Results show: TCP_ESTABLISHED is reported when a connection is made. TCP_SYN_SENT is never reported, so rule 'c' above is never effective. When the server closes the connection, TCP_CLOSE_WAIT and TCP_LAST_ACK might be reported, and TCP_CLOSE is always reported. This rule 'b' above will sometimes be effective, but not reliably. When the client closes the connection, it used to result in TCP_FIN_WAIT1, TCP_FIN_WAIT2, TCP_CLOSE. However since commit `f75e674` (SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on reconnect) we don't see any events on client-close. I think this is because xs_restore_old_callbacks is called to disconnect xs_tcp_state_change before the socket is closed. In any case, rule 'a' no longer applies. So all that is left are rule d, which successfully doubles the timeout which is never rest, and rule e which initialises the timeout. Even if the rules worked as expected, there would be a problem because a successful connection does not reset the timeout, so a sequence of events where the server closes the connection (e.g. during failover testing) will cause longer and longer timeouts with no good reason. This patch: - sets reestablish_timeout to 0 in xs_close thus effecting rule 'a' - sets it to 0 in xs_tcp_data_ready to ensure that a successful connection resets the timeout - sets it to at least XS_TCP_INIT_REEST_TO after it is doubled, thus effecting rule c I have not reimplemented rule b and the new version of rule c seems sufficient. I suspect other code in xs_tcp_data_ready needs to be revised as well. For example I don't think connect_cookie is being incremented as often as it should be. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-09-23 14:36:37 -04:00
Linus Torvalds	a87e84b5cd	Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux * 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits) nfsd4: nfsv4 clients should cross mountpoints nfsd: revise 4.1 status documentation sunrpc/cache: avoid variable over-loading in cache_defer_req sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req nfsd: return success for non-NFS4 nfs4_state_start nfsd41: Refactor create_client() nfsd41: modify nfsd4.1 backchannel to use new xprt class nfsd41: Backchannel: Implement cb_recall over NFSv4.1 nfsd41: Backchannel: cb_sequence callback nfsd41: Backchannel: Setup sequence information nfsd41: Backchannel: Server backchannel RPC wait queue nfsd41: Backchannel: Add sequence arguments to callback RPC arguments nfsd41: Backchannel: callback infrastructure nfsd4: use common rpc_cred for all callbacks nfsd4: allow nfs4 state startup to fail SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous nfsd4: fix null dereference creating nfsv4 callback client nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked. ...	2009-09-22 07:54:33 -07:00
Alexey Dobriyan	b87221de6a	const: mark remaining super_operations const Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:24 -07:00
NeilBrown	cd68c374ea	sunrpc/cache: avoid variable over-loading in cache_defer_req In cache_defer_req, 'dreq' is used for two significantly different values that happen to be of the same type. This is both confusing, and makes it hard to extend the range of one of the values as we will in the next patch. So introduce 'discard' to take one of the values. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-18 17:01:12 -04:00
NeilBrown	67e7328f15	sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req Using list_del_init is generally safer than list_del, and it will allow us, in a subsequent patch, to see if an entry has already been processed or not. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-18 11:47:49 -04:00
Trond Myklebust	5d351754fc	SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous Otherwise, the upcall is going to be synchronous, which may not be what the caller wants... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-15 20:49:33 -04:00
Alexandros Batsakis	f300baba5a	nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel [sunrpc: change idle timeout value for the backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-13 15:46:15 -04:00
NeilBrown	908329f2c0	sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked. The extra call to cache_revisit_request in cache_fresh_unlocked is not needed, as should have been fairly clear at the time of commit `4013edea9a` If there are requests to be revisited, then we can be sure that CACHE_PENDING is set, so the second call is sufficient. So remove the first call. Then remove the 'new' parameter, then remove the return value for cache_fresh_locked which is only used to provide the value for 'new'. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-11 17:08:54 -04:00
NeilBrown	9e4c6379a6	sunrpc/cache: change cache_defer_req to return -ve error, not boolean. As "cache_defer_req" does not sound like a predicate, having it return a boolean value can be confusing. It is more consistent to return 0 for success and negative for error. Exactly what error code to return is not important as we don't differentiate between reasons why the request wasn't deferred, we only care about whether it was deferred or not. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-11 17:03:27 -04:00
Rahul Iyer	4cfc7e6019	nfsd41: sunrpc: Added rpc server-side backchannel handling When the call direction is a reply, copy the xid and call direction into the req->rq_private_buf.head[0].iov_base otherwise rpc_verify_header returns rpc_garbage. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Mike Sager <sager@netapp.com> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [get rid of CONFIG_NFSD_V4_1] [sunrpc: refactoring of svc_tcp_recvfrom] [nfsd41: sunrpc: create common send routine for the fore and the back channels] [nfsd41: sunrpc: Use free_page() to free server backchannel pages] [nfsd41: sunrpc: Document server backchannel locking] [nfsd41: sunrpc: remove bc_connect_worker()] [nfsd41: sunrpc: Define xprt_server_backchannel()[ [nfsd41: sunrpc: remove bc_close and bc_init_auto_disconnect dummy functions] [nfsd41: sunrpc: eliminate unneeded switch statement in xs_setup_tcp()] [nfsd41: sunrpc: Don't auto close the server backchannel connection] [nfsd41: sunrpc: Remove unused functions] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: change bc_sock to bc_xprt] [nfsd41: sunrpc: move struct rpc_buffer def into a common header file] [nfsd41: sunrpc: use rpc_sleep in bc_send_request so not to block on mutex] [removed cosmetic changes] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [sunrpc: add new xprt class for nfsv4.1 backchannel] [sunrpc: v2.1 change handling of auto_close and init_auto_disconnect operations for the nfsv4.1 backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> [reverted more cosmetic leftovers] [got rid of xprt_server_backchannel] [separated "nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Cc: Trond Myklebust <trond.myklebust@netapp.com> [sunrpc: change idle timeout value for the backchannel] Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Acked-by: Trond Myklebust <trond.myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-11 15:04:16 -04:00
Trond Myklebust	ab3bbaa8b2	Merge branch 'nfs-for-2.6.32'	2009-09-11 14:59:37 -04:00
Benny Halevy	6951867b99	nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h Move struct rpc_buffer's definition into a sunrpc.h, a common, internal header file, in preparation for supporting the nfsv4.1 backchannel. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: #include <linux/net.h> from sunrpc.h] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-09-10 12:09:06 -04:00
Trond Myklebust	2574cc9f4f	SUNRPC: Fix rpc_task_force_reencode This patch fixes the bug that was reported in http://bugzilla.kernel.org/show_bug.cgi?id=14053 If we're in the case where we need to force a reencode and then resend of the RPC request, due to xprt_transmit failing with a networking error, then we _must_ retransmit the entire request. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-28 19:35:56 -10:00
Wei Yongjun	b0401d7253	sunrpc: move the close processing after do recvfrom method sunrpc: "Move close processing to a single place" (`d7979ae4a0`) moved the close processing before the recvfrom method. This may cause the close processing never to execute. So this patch moves it to the right place. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-27 17:18:38 -04:00
Wei Yongjun	eac81736e6	sunrpc: reply AUTH_BADCRED to RPCSEC_GSS with unknown service When an RPC message is received with RPCSEC_GSS with an unknown service (not RPC_GSS_SVC_NONE, RPC_GSS_SVC_INTEGRITY, or RPC_GSS_SVC_PRIVACY), svcauth_gss_accept() returns AUTH_BADCRED, but svcauth_gss_release() subsequently drops the response entirely, discarding the error. Fix that so the AUTH_BADCRED error is returned to the client. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-25 17:39:43 -04:00
Ryusei Yamaguchi	ed2d8aed52	knfsd: Replace lock_kernel with a mutex in nfsd pool stats. lock_kernel() in knfsd was replaced with a mutex. The later commit `03cf6c9f49` ("knfsd: add file to export stats about nfsd pools") did not follow that change. This patch fixes the issue. Also move the get and put of nfsd_serv to the open and close methods (instead of start and stop methods) to allow atomic check and increment of reference count in the open method (where we can still return an error). Signed-off-by: Ryusei Yamaguchi <mandel59@gmail.com> Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: Greg Banks <gnb@fmeh.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-25 12:39:37 -04:00
Alexandros Batsakis	8f55f3c0a0	nfsd41: sunrpc: svc_tcp_recv_record() Factor functionality out of svc_tcp_recvfrom() to simplify routine Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-24 18:13:10 -04:00
J. Bruce Fields	e9dc122166	Merge branch 'nfs-for-2.6.32' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 into for-2.6.32-incoming Conflicts: net/sunrpc/cache.c	2009-08-21 11:27:29 -04:00
Trond Myklebust	405d8f8b1d	SUNRPC: Ensure that sunrpc gets initialised before nfs, lockd, etc... We can oops if rpc_pipefs isn't properly initialised before we start to set up objects that depend upon it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-21 08:17:56 -04:00
Trond Myklebust	f7e86ab92f	SUNRPC: cache must take a reference to the cache detail's module on open() Otherwise we Oops if the module containing the cache detail is removed before all cache readers have closed the file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 18:22:16 -04:00
Trond Myklebust	e571cbf1a4	NFS: Add a dns resolver for use with NFSv4 referrals and migration The NFSv4 and NFSv4.1 protocols both allow for the redirection of a client from one server to another in order to support filesystem migration and replication. For full protocol support, we need to add the ability to convert a DNS host name into an IP address that we can feed to the RPC client. We'll reuse the sunrpc cache, now that it has been converted to work with rpc_pipefs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 18:22:15 -04:00
Trond Myklebust	96c61cbd0f	SUNRPC: Fix a typo in cache_pipefs_files We want the channel to be a regular file, so that we don't need to supply rpc_pipe_ops. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-19 18:22:15 -04:00
Trond Myklebust	6a396f67d2	Merge branch 'nfsv4_xdr_cleanups-for-2.6.32' into nfs-for-2.6.32 Conflicts: fs/nfs/nfs4xdr.c	2009-08-19 18:21:52 -04:00
Benny Halevy	98866b5abe	sunrpc: ntoh -> be*_to_cpu ntohl is already defined as be32_to_cpu. be64_to_cpu has architecture specific optimized implementations. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:12:52 -04:00
Benny Halevy	9f162d2a81	sunrpc: hton -> cpu_to_be* htonl is already defined as cpu_to_be32. cpu_to_be64 has architecture specific optimized implementations. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-14 13:12:38 -04:00
Trond Myklebust	f884dcaead	Merge branch 'sunrpc_cache-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:58 -04:00
Trond Myklebust	976a6f921c	Merge branch 'patches_cel-for-2.6.32' into nfs-for-2.6.32	2009-08-10 17:45:50 -04:00
Trond Myklebust	8854e82d9a	SUNRPC: Add an rpc_pipefs front end for the sunrpc cache code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:30 -04:00
Trond Myklebust	173912a6ad	SUNRPC: Move procfs-specific stuff out of the generic sunrpc cache code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:29 -04:00
Trond Myklebust	bc74b4f5e6	SUNRPC: Allow the cache_detail to specify alternative upcall mechanisms For events that are rare, such as referral DNS lookups, it makes limited sense to have a daemon constantly listening for upcalls on a channel. An alternative in those cases might simply be to run the app that fills the cache using call_usermodehelper_exec() and friends. The following patch allows the cache_detail to specify alternative upcall mechanisms for these particular cases. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:29 -04:00
Trond Myklebust	da77005f0d	SUNRPC: Remove the global temporary write buffer in net/sunrpc/cache.c While we do want to protect against multiple concurrent readers and writers on each upcall/downcall pipe, we don't want to limit concurrent reading and writing to separate caches. This patch therefore replaces the static buffer 'write_buf', which can only be used by one writer at a time, with use of the page cache as the temporary buffer for downcalls. We still fall back to using the the old global buffer if the downcall is larger than PAGE_CACHE_SIZE, since this is apparently needed by the SPKM security context initialisation. It then replaces the use of the global 'queue_io_mutex' with the inode->i_mutex in cache_read() and cache_write(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:28 -04:00
Trond Myklebust	5b7a1b9f92	SUNRPC: Ensure we initialise the cache_detail before creating procfs files Also ensure that we destroy those files before we destroy the cache_detail. Otherwise, user processes might attempt to write into uninitialised caches. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:27 -04:00
Trond Myklebust	2da8ca26c6	NFSD: Clean up the idmapper warning... What part of 'internal use' is so hard to understand? Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:26 -04:00
Trond Myklebust	e57aed77ad	SUNRPC: One more clean up for rpc_create_client_dir() In order to allow rpc_pipefs to create directories with different types of subtrees, it is useful to allow the caller to customise the subtree filling process. In order to do so, we separate out the parts which are specific to making an RPC client directory, and put them in a separate helper, then we convert the process of filling the directory contents into a callback. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:26 -04:00
Trond Myklebust	23ac658170	SUNRPC: clean up rpc_setup_pipedir() There is still a little wart or two there: Since we've already got a vfsmount, we might as well pass that in to rpc_create_client_dir. Another point is that if we open code __rpc_lookup_path() here, then we can avoid looking up the entire parent directory path over and over again: it doesn't change. Also get rid of rpc_clnt->cl_pathname, since it has no users... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:25 -04:00
Trond Myklebust	7d217caca5	SUNRPC: Replace rpc_client->cl_dentry and cl_mnt, with a cl_path Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:24 -04:00
Trond Myklebust	7d59d1e865	SUNRPC: Clean up rpc_create_client_dir() Factor out the code that does lookups from the code that actually creates the directory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:23 -04:00
Trond Myklebust	458adb8ba9	SUNRPC: Rename rpc_mkdir to rpc_create_client_dir() This reflects the fact that rpc_mkdir() as it stands today, can only create a RPC client type directory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:22 -04:00
Trond Myklebust	bb1567491e	SUNRPC: rpc_pipefs cleanup Move the files[] array closer to rpc_fill_super() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:21 -04:00
Trond Myklebust	ac6fecee31	SUNRPC: Clean up rpc_populate/depopulate Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:20 -04:00
Trond Myklebust	cfeaa4a3ca	SUNRPC: Clean up rpc_lookup_create Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:20 -04:00
Trond Myklebust	810d90bc2a	SUNRPC: Clean up rpc_unlink() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:18 -04:00
Trond Myklebust	7589806e96	SUNRPC: Clean up file creation code in rpc_pipefs Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:17 -04:00
Trond Myklebust	b5bb61da2e	SUNRPC: Clean up rpc_pipefs lookup code... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:17 -04:00
Trond Myklebust	7364af6a2d	SUNRPC: Allow rpc_pipefs_ops to have null values for upcall and downcall Also ensure that we use the umode_t type when appropriate... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:16 -04:00
Trond Myklebust	b693ba4a33	SUNRPC: Constify rpc_pipe_ops... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:14:15 -04:00
Chuck Lever	c05988cdb0	SUNRPC: Add documenting comments in net/sunrpc/timer.c Clean up: provide documenting comments for the functions in net/sunrpc/timer.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:47 -04:00
Chuck Lever	9dc3b095b7	SUNRPC: Update xprt address strings after an rpcbind completes After a bind completes, update the transport instance's address strings so debugging messages display the current port the transport is connected to. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:46 -04:00
Chuck Lever	c740eff84b	SUNRPC: Kill RPC_DISPLAY_ALL At some point, I recall that rpc_pipe_fs used RPC_DISPLAY_ALL. Currently there are no uses of RPC_DISPLAY_ALL outside the transport modules themselves, so we can safely get rid of it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:46 -04:00
Chuck Lever	fbfffbd5e7	SUNRPC: Rename sock_xprt.addr as sock_xprt.srcaddr Clean up: Give the "addr" and "port" field less ambiguous names. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:46 -04:00
Chuck Lever	f8b761eff1	SUNRPC: Eliminate PROC macro from rpcb_clnt Clean up: Replace PROC macro with open coded C99 structure initializers to improve readability. The rpcbind v4 GETVERSADDR procedure is never sent by the current implementation, so it is not copied to the new structures. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:44 -04:00
Chuck Lever	0e47f0d665	SUNRPC: Clean up: Remove unused XDR decoder functions from rpcb_clnt.c Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:43 -04:00
Chuck Lever	c0c077df00	SUNRPC: Introduce new xdr_stream-based decoders to rpcb_clnt.c Replace the open-coded decode logic for PMAP_GETPORT/RPCB_GETADDR with an xdr_stream-based implementation, similar to what NFSv4 uses, to protect against buffer overflows. The new implementation also checks that the incoming port number is reasonable. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:43 -04:00
Chuck Lever	7ed0ff983c	SUNRPC: Introduce xdr_stream-based decoders for RPCB_UNSET Replace the open-coded decode logic for rpcbind UNSET results with an xdr_stream-based implementation, similar to what NFSv4 uses, to protect against buffer overflows. The new function is unused for the moment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:42 -04:00
Chuck Lever	0d36c4f757	SUNRPC: Clean up: Remove unused XDR encoder functions from rpcb_clnt.c Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:41 -04:00
Chuck Lever	6f2c2db7a4	SUNRPC: Introduce new xdr_stream-based encoders to rpcb_clnt.c Replace the open-coded encode logic for rpcbind arguments with an xdr_stream-based implementation, similar to what NFSv4 uses, to better protect against buffer overflows. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:40 -04:00
Chuck Lever	c877b849d3	SUNRPC: Use rpc_ntop() for constructing transport address strings Clean up: In addition to using the new generic rpc_ntop() and rpc_get_port() functions, have the RPC client compute the presentation address buffer sizes dynamically using kstrdup(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:36 -04:00
Chuck Lever	ba809130bc	SUNRPC: Remove duplicate universal address generation RPC universal address generation is currently done in several places: rpcb_clnt.c, nfs4proc.c xprtsock.c, and xprtrdma.c. Remove the redundant cases that convert a socket address to a universal address. The nfs4proc.c case takes a pre-formatted presentation address string, not a socket address, so we'll leave that one. Because the new uaddr constructor uses the recently introduced rpc_ntop(), it now supports proper "::" shorthanding for IPv6 addresses. This allows the kernel to register properly formed universal addresses with the local rpcbind service, in _all_ cases. The kernel can now also send properly formed universal addresses in RPCB_GETADDR requests, and support link-local properly when encoding and decoding IPv6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:35 -04:00
Chuck Lever	a02d692611	SUNRPC: Provide functions for managing universal addresses Introduce a set of functions in the kernel's RPC implementation for converting between a socket address and either a standard presentation address string or an RPC universal address. The universal address functions will be used to encode and decode RPCB_FOO and NFSv4 SETCLIENTID arguments. The other functions are part of a previous promise to deliver shared functions that can be used by upper-layer protocols to display and manipulate IP addresses. The kernel's current address printf formatters were designed specifically for kernel to user-space APIs that require a particular string format for socket addresses, thus are somewhat limited for the purposes of sunrpc.ko. The formatter for IPv6 addresses, %pI6, does not support short-handing or scope IDs. Also, these printf formatters are unique per address family, so a separate formatter string is required for printing AF_INET and AF_INET6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:34 -04:00
Chuck Lever	0b10bf5e14	SUNRPC: Move XDR data type size macros Clean up: To make subsequent patches cleaner, move the XDR data type size macros to the top of the file (similar to nfs4xdr.c) first. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:09:33 -04:00
Trond Myklebust	cbf1107126	SUNRPC: convert some sysctls into module parameters Parameters like the minimum reserved port, and the number of slot entries should really be module parameters rather than sysctls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-08-09 15:06:19 -04:00
NeilBrown	560ab42ef9	sunrpc: fix memory leak in unix_gid cache. When we look up an entry in the uid->gidlist cache, we take a reference to the content but don't drop the reference to the cache entry. So it never gets freed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-04 16:52:44 -04:00
NeilBrown	989a19b9b1	sunrpc/cache: recheck cache validity after cache_defer_req If cache_defer_req did not leave the request on a queue, then it could possibly have waited long enough that the cache became valid. So check the status after the call. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-04 16:21:44 -04:00
NeilBrown	5c4d263903	sunrpc/cache: make sure deferred requests eventually get revisited. While deferred requests normally get revisited quite quickly, it is possible for a request to remain in the deferral queue when the cache item is discarded. We can easily make sure that doesn't happen by calling cache_revisit_request just before the final 'put'. Also there is a small chance that a race would cause one thread to defer a request against a cache item while another thread is failing to queue an upcall for that item. So when the upcall fails, make sure to revisit all deferred requests. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-04 11:03:01 -04:00
NeilBrown	f866a8194f	sunrpc/cache: rename queue_loose to cache_dequeue 'loose' was a mis-spelling of 'lose', and even that wasn't a good word choice. So give this function a more useful name. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-08-04 09:14:30 -04:00
Chuck Lever	7702ce40bc	SUNRPC: handle IPv6 PKTINFO when extracting destination address PKTINFO is needed to scrape the caller's IP address off the socket so RPC datagram replies are routed correctly. Fill in missing pieces in the kernel RPC server's UDP receive path to request IPv6 PKTINFO and correctly parse the IPv6 cmsg header. Without this patch, kernel RPC services drop all incoming requests on UDP on IPv6. Related commit: `7a37f5787e` Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-14 17:39:46 -04:00
Alexey Dobriyan	405f55712d	headers: smp_lock.h redux * Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-12 12:22:34 -07:00
Wei Yongjun	846d8e7cc8	svcrdma: fix error handling of rdma_alloc_frmr() ib_alloc_fast_reg_mr() and ib_alloc_fast_reg_page_list() returns ERR_PTR() and not NULL. Compile tested only. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-07-03 10:14:59 -04:00
Jesper Dangaard Brouer	75de874f5c	sunrpc: Use rcu_barrier() on unload. The sunrpc module uses rcu_call() thus it should use rcu_barrier() on module unload. Have not verified that the possibility for new call_rcu() callbacks has been disabled. As a hint for checking, the functions calling call_rcu() (unx_destroy_cred and generic_destroy_cred) are registered as crdestroy function pointer in struct rpc_credops. Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-26 13:51:34 -07:00
Linus Torvalds	7e0338c0de	Merge branch 'for-2.6.31' of git://fieldses.org/git/linux-nfsd * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits) SUNRPC: Fix the TCP server's send buffer accounting nfsd41: Backchannel: minorversion support for the back channel nfsd41: Backchannel: cleanup nfs4.0 callback encode routines nfsd41: Remove ip address collision detection case nfsd: optimise the starting of zero threads when none are running. nfsd: don't take nfsd_mutex twice when setting number of threads. nfsd41: sanity check client drc maxreqs nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct NFS: kill off complicated macro 'PROC' sunrpc: potential memory leak in function rdma_read_xdr nfsd: minor nfsd_vfs_write cleanup nfsd: Pull write-gathering code out of nfsd_vfs_write nfsd: track last inode only in use_wgather case sunrpc: align cache_clean work's timer nfsd: Use write gathering only with NFSv2 NFSv4: kill off complicated macro 'PROC' NFSv4: do exact check about attribute specified knfsd: remove unreported filehandle stats counters knfsd: fix reply cache memory corruption knfsd: reply cache cleanups ...	2009-06-22 12:55:50 -07:00
Ricardo Labiaga	e9f0298558	nfs41: sunrpc: xprt_alloc_bc_request() should not use spin_lock_bh() xprt_alloc_bc_request() is always called in soft interrupt context. Grab the spin_lock instead of the bottom half spin_lock. Softirqs do not preempt other softirqs running on the same processor, so there is no need to disable bottom halves. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-20 14:55:39 -04:00
Trond Myklebust	47fcb03fef	SUNRPC: Fix the TCP server's send buffer accounting Currently, the sunrpc server is refusing to allow us to process new RPC calls if the TCP send buffer is 2/3 full, even if we do actually have enough free space to guarantee that we can send another request. The following patch fixes svc_tcp_has_wspace() so that we only stop processing requests if we know that the socket buffer cannot possibly fit another reply. It also fixes the tcp write_space() callback so that we only clear the SOCK_NOSPACE flag when the TCP send buffer is less than 2/3 full. This should ensure that the send window will grow as per the standard TCP socket code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-18 19:58:51 -07:00
Trond Myklebust	1f84603c09	Merge branch 'devel-for-2.6.31' into for-2.6.31 Conflicts: fs/nfs/client.c fs/nfs/super.c	2009-06-18 18:13:44 -07:00
Trond Myklebust	301933a0ac	Merge commit 'linux-pnfs/nfs41-for-2.6.31' into nfsv41-for-2.6.31	2009-06-17 17:59:58 -07:00
Ricardo Labiaga	dd2b63d049	nfs41: Rename rq_received to rq_reply_bytes_recvd The 'rq_received' member of 'struct rpc_rqst' is used to track when we have received a reply to our request. With v4.1, the backchannel can now accept callback requests over the existing connection. Rename this field to make it clear that it is only used for tracking reply bytes and not all bytes received on the connection. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:40 -07:00
Rahul Iyer	343952fa5a	nfs41: Get the rpc_xprt * from the rpc_rqst instead of the rpc_clnt. Obtain the rpc_xprt from the rpc_rqst so that calls and callback replies can both use the same code path. A client needs the rpc_xprt in order to reply to a callback. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:34 -07:00
Benny Halevy	8f97524235	nfs41: create a svc_xprt for nfs41 callback thread and use for incoming callbacks Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:31 -07:00
Andy Adamson	9c9f3f5fa6	nfs41: sunrpc: add a struct svc_xprt pointer to struct svc_serv for backchannel use This svc_xprt is passed on to the callback service thread to be later used to processes incoming svc_rqst's Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:31 -07:00
Benny Halevy	7652e5a09b	nfs41: sunrpc: provide functions to create and destroy a svc_xprt for backchannel use For nfs41 callbacks we need an svc_xprt to process requests coming up the backchannel socket as rpc_rqst's that are transformed into svc_rqst's that need a rq_xprt to be processed. The svc_{udp,tcp}_create methods are too heavy for this job as svc_create_socket creates an actual socket to listen on while for nfs41 we're "reusing" the fore channel's socket. Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:30 -07:00
Ricardo Labiaga	4d6bbb6233	nfs41: Backchannel bc_svc_process() Implement the NFSv4.1 backchannel service. Invokes the common callback processing logic svc_process_common() to authenticate the call and dispatch the appropriate NFSv4.1 XDR decoder and operation procedure. It then invokes bc_send() to send the reply over the same connection. bc_send() is implemented in a separate patch. At this time there is no slot validation or reply cache handling. [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Move bc_svc_process() declaration to correct patch] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:29 -07:00
Ricardo Labiaga	1cad7ea6fe	nfs41: Refactor svc_process() net/sunrpc/svc.c:svc_process() is used by the NFSv4 callback service to process RPC requests arriving over connections initiated by the server. NFSv4.1 supports callbacks over the backchannel on connections initiated by the client. This patch refactors svc_process() so that common code can also be used by the backchannel. Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:28 -07:00
Ricardo Labiaga	0d90ba1cd4	nfs41: Backchannel callback service helper routines Executes the backchannel task on the RPC state machine using the existing open connection previously established by the client. Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> nfs41: Add bc_svc.o to sunrpc Makefile. [nfs41: bc_send() does not need to be exported outside RPC module] [nfs41: xprt_free_bc_request() need not be exported outside RPC module] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Update copyright] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:28 -07:00
Ricardo Labiaga	55ae1aabfb	nfs41: Add backchannel processing support to RPC state machine Adds rpc_run_bc_task() which is called by the NFS callback service to process backchannel requests. It performs similar work to rpc_run_task() though "schedules" the backchannel task to be executed starting at the call_trasmit state in the RPC state machine. It also introduces some miscellaneous updates to the argument validation, call_transmit, and transport cleanup functions to take into account that there are now forechannel and backchannel tasks. Backchannel requests do not carry an RPC message structure, since the payload has already been XDR encoded using the existing NFSv4 callback mechanism. Introduce a new transmit state for the client to reply on to backchannel requests. This new state simply reserves the transport and issues the reply. In case of a connection related error, disconnects the transport and drops the reply. It requires the forechannel to re-establish the connection and the server to retransmit the request, as stated in NFSv4.1 section 2.9.2 "Client and Server Transport Behavior". Note: There is no need to loop attempting to reserve the transport. If EAGAIN is returned by xprt_prepare_transmit(), return with tk_status == 0, setting tk_action to call_bc_transmit. rpc_execute() will invoke it again after the task is taken off the sleep queue. [nfs41: rpc_run_bc_task() need not be exported outside RPC module] [nfs41: New call_bc_transmit RPC state] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Backchannel: No need to loop in call_bc_transmit()] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [rpc_count_iostats incorrectly exits early] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Convert rpc_reply_expected() to inline function] [Remove unnecessary BUG_ON()] [Rename variable] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 14:11:24 -07:00
Trond Myklebust	88b5ed73bc	SUNRPC: Fix a missing "break" option in xs_tcp_setup_socket() In the case of -EADDRNOTAVAIL and/or unhandled connection errors, we want to get rid of the existing socket and retry immediately, just as the comment says. Currently we end up sleeping for a minute, due to the missing "break" statement. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 13:22:57 -07:00
Ricardo Labiaga	44b98efdd0	nfs41: New xs_tcp_read_data() Handles RPC replies and backchannel callbacks. Traditionally the NFS client has expected only RPC replies on its open connections. With NFSv4.1, callbacks can arrive over an existing open connection. This patch refactors the old xs_tcp_read_request() into an RPC reply handler: xs_tcp_read_reply(), a new backchannel callback handler: xs_tcp_read_callback(), and a common routine to read the data off the transport: xs_tcp_read_common(). The new xs_tcp_read_callback() queues callback requests onto a queue where the callback service (a separate thread) is listening for the processing. This patch incorporates work and suggestions from Rahul Iyer (iyer@netapp.com) and Benny Halevy (bhalevy@panasas.com). xs_tcp_read_callback() drops the connection when the number of expected callbacks is exceeded. Use xprt_force_disconnect(), ensuring tasks on the pending queue are awaken on disconnect. [nfs41: Keep track of RPC call/reply direction with a flag] [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks] Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: xs_tcp_read_callback() should use xprt_force_disconnect()] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Moves embedded #ifdefs into #ifdef function blocks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 13:06:16 -07:00
Ricardo Labiaga	fb7a0b9add	nfs41: New backchannel helper routines This patch introduces support to setup the callback xprt on the client side. It allocates/ destroys the preallocated memory structures used to process backchannel requests. At setup time, xprt_setup_backchannel() is invoked to allocate one or more rpc_rqst structures and substructures. This ensures that they are available when an RPC callback arrives. The rpc_rqst structures are maintained in a linked list attached to the rpc_xprt structure. We keep track of the number of allocations so that they can be correctly removed when the channel is destroyed. When an RPC callback arrives, xprt_alloc_bc_request() is invoked to obtain a preallocated rpc_rqst structure. An rpc_xprt structure is returned, and its RPC_BC_PREALLOC_IN_USE bit is set in rpc_xprt->bc_flags. The structure is removed from the the list since it is now in use, and it will be later added back when its user is done with it. After the RPC callback replies, the rpc_rqst structure is returned by invoking xprt_free_bc_request(). This clears the RPC_BC_PREALLOC_IN_USE bit and adds it back to the list, allowing it to be reused by a subsequent RPC callback request. To be consistent with the reception of RPC messages, the backchannel requests should be placed into the 'struct rpc_rqst' rq_rcv_buf, which is then in turn copied to the 'struct rpc_rqst' rq_private_buf. [nfs41: Preallocate rpc_rqst receive buffer for handling callbacks] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Update copyright notice and explain page allocation] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 13:06:14 -07:00
Ricardo Labiaga	f9acac1a47	nfs41: Initialize new rpc_xprt callback related fields Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 13:06:14 -07:00
Ricardo Labiaga	f4a2e418bf	nfs41: Process the RPC call direction Reading and storing the RPC direction is a three step process. 1. xs_tcp_read_calldir() reads the RPC direction, but it will not store it in the XDR buffer since the 'struct rpc_rqst' is not yet available. 2. The 'struct rpc_rqst' is obtained during the TCP_RCV_COPY_DATA state. This state need not necessarily be preceeded by the TCP_RCV_READ_CALLDIR. For example, we may be reading a continuation packet to a large reply. Therefore, we can't simply obtain the 'struct rpc_rqst' during the TCP_RCV_READ_CALLDIR state and assume it's available during TCP_RCV_COPY_DATA. This patch adds a new TCP_RCV_READ_CALLDIR flag to indicate the need to read the RPC direction. It then uses TCP_RCV_COPY_CALLDIR to indicate the RPC direction needs to be saved after the 'struct rpc_rqst' has been allocated. 3. The 'struct rpc_rqst' is obtained by the xs_tcp_read_data() helper functions. xs_tcp_read_common() then saves the RPC direction in the XDR buffer if TCP_RCV_COPY_CALLDIR is set. This will happen when we're reading the data immediately after the direction was read. xs_tcp_read_common() then clears this flag. [was nfs41: Skip past the RPC call direction] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: Add RPC direction back into the XDR buffer] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: sunrpc: Don't skip past the RPC call direction] Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 12:43:46 -07:00
Ricardo Labiaga	18dca02aeb	nfs41: Add ability to read RPC call direction on TCP stream. NFSv4.1 callbacks can arrive over an existing connection. This patch adds the logic to read the RPC call direction (call or reply). It does this by updating the state machine to look for the call direction invoking xs_tcp_read_calldir(...) after reading the XID. [nfs41: Keep track of RPC call/reply direction with a flag] As per 11/14/08 review of RFC 53/85. Add a new flag to track whether the incoming message is an RPC call or an RPC reply. TCP_RPC_REPLY is set in the 'struct sock_xprt' tcp_flags in xs_tcp_read_calldir() if the message is an RPC reply sent on the forechannel. It is cleared if the message is an RPC request sent on the back channel. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>	2009-06-17 12:43:45 -07:00
Andy Adamson	aae2006e9b	nfs41: sunrpc: Export the call prepare state for session reset Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-06-17 12:25:07 -07:00
Christoph Lameter	62bc62a873	page allocator: use a pre-calculated value instead of num_online_nodes() in fast paths num_online_nodes() is called in a number of places but most often by the page allocator when deciding whether the zonelist needs to be filtered based on cpusets or the zonelist cache. This is actually a heavy function and touches a number of cache lines. This patch stores the number of online nodes at boot time and updates the value when nodes get onlined and offlined. The value is then used in a number of important paths in place of num_online_nodes(). [rientjes@google.com: do not override definition of node_set_online() with macro] Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-16 19:47:35 -07:00
Christian Engelmayer	59fb30660b	sunrpc: potential memory leak in function rdma_read_xdr In case the check on ch_count fails the cleanup path is skipped and the previously allocated memory 'rpl_map', 'chl_map' is not freed. Reported by Coverity. Signed-off-by: Christian Engelmayer <christian.engelmayer@frequentis.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 19:34:32 -07:00
Anton Blanchard	6aad89c837	sunrpc: align cache_clean work's timer Align cache_clean work. Signed-off-by: Anton Blanchard <anton@samba.org> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-06-15 18:14:58 -07:00
J. Bruce Fields	7eef4091a6	Merge commit 'v2.6.30' into for-2.6.31	2009-06-15 18:08:07 -07:00
David S. Miller	9cbc1cb8cd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/scsi/fcoe/fcoe.c net/core/drop_monitor.c net/core/net-traces.c	2009-06-15 03:02:23 -07:00
Jesper Dangaard Brouer	bf12691d84	sunrpc/auth_gss: Call rcu_barrier() on module unload. As the module uses rcu_call() we should make sure that all rcu callback has been completed before removing the code. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-10 01:11:27 -07:00
Eric Dumazet	adf30907d6	net: skb->dst accessors Define three accessors to get/set dst attached to a skb struct dst_entry skb_dst(const struct sk_buff skb) void skb_dst_set(struct sk_buff skb, struct dst_entry dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:04 -07:00
Linus Torvalds	c8bce3d3bd	Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: svcrdma: dma unmap the correct length for the RPCRDMA header page. nfsd: Revert "svcrpc: take advantage of tcp autotuning" nfsd: fix hung up of nfs client while sync write data to nfs server	2009-05-29 08:49:09 -07:00
Steve Wise	98779be861	svcrdma: dma unmap the correct length for the RPCRDMA header page. The svcrdma module was incorrectly unmapping the RPCRDMA header page. On IBM pserver systems this causes a resource leak that results in running out of bus address space (10 cthon iterations will reproduce it). The code was mapping the full page but only unmapping the actual header length. The fix is to only map the header length. I also cleaned up the use of ib_dma_map_page() calls since the unmap logic always uses ib_dma_unmap_single(). I made these symmetrical. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 18:57:24 -04:00
J. Bruce Fields	7f4218354f	nfsd: Revert "svcrpc: take advantage of tcp autotuning" This reverts commit `47a14ef1af` "svcrpc: take advantage of tcp autotuning", which uncovered some further problems in the server rpc code, causing significant performance regressions in common cases. We will likely reinstate this patch after releasing 2.6.30 and applying some work on the underlying fixes to the problem (developed by Trond). Reported-by: Jeff Moyer <jmoyer@redhat.com> Cc: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Jim Rees <rees@umich.edu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-27 18:51:06 -04:00
Vu Pham	68743082b5	XPRTRDMA: fix client rpcrdma FRMR registration on mlx4 devices mlx4/connectX FRMR requires local write enable together with remote rdma write enable. This fixes NFS/RDMA operation over the ConnectX Infiniband HCA in the default memreg mode. Signed-off-by: Vu Pham <vu@mellanox.com> Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-05-26 14:51:00 -04:00
Linus Torvalds	2ea3f86848	Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: nfsd: silence lockdep warning lockd: fix list corruption on lockd restart nfsd4: check for negative dentry before use in nfsv4 readdir nfsd41: slots are freed with session svcrdma: clean up error paths. svcrdma: Fix dma map direction for rdma read targets	2009-05-12 17:11:56 -07:00
Steve Wise	21515e46bc	svcrdma: clean up error paths. These fixes resolved crashes due to resource leak BUG_ON checks. The resource leaks were detected by introducing asynchronous transport errors. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 14:19:10 -04:00
Trond Myklebust	f75e6745aa	SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on reconnect See http://bugzilla.kernel.org/show_bug.cgi?id=13034 If the port gets into a TIME_WAIT state, then we cannot reconnect without binding to a new port. Tested-by: Petr Vandrovec <petr@vandrovec.name> Tested-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 16:35:08 -07:00
Chuck Lever	017cb47f46	SUNRPC: Clean up one_sock_name() Clean up svc_one_sock_name() by setting up automatic variables for frequently used expressions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:29 -04:00
Chuck Lever	58de2f8658	SUNRPC: Support PF_INET6 in one_sock_name() Add an arm to the switch statement in svc_one_sock_name() so it can construct the name of PF_INET6 sockets properly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aime Le Rouzic <aime.le-rouzic@bull.net> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:29 -04:00
Chuck Lever	e7942b9f25	SUNRPC: Switch one_sock_name() to use snprintf() Use snprintf() in one_sock_name() to prevent overflowing the output buffer. If the name doesn't fit in the buffer, the buffer is filled in with an empty string, and -ENAMETOOLONG is returned. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:29 -04:00
Chuck Lever	8435d34dbb	SUNRPC: pass buffer size to svc_sock_names() Adjust the synopsis of svc_sock_names() to pass in the size of the output buffer. Add a documenting comment. This is a cosmetic change for now. A subsequent patch will make sure the buffer length is passed to one_sock_name(), where the length will actually be useful. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:28 -04:00
Chuck Lever	bfba9ab4c6	SUNRPC: pass buffer size to svc_addsock() Adjust the synopsis of svc_addsock() to pass in the size of the output buffer. Add a documenting comment. This is a cosmetic change for now. A subsequent patch will make sure the buffer length is passed to one_sock_name(), where the length will actually be useful. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:28 -04:00
Chuck Lever	335c54bdc4	NFSD: Prevent a buffer overflow in svc_xprt_names() The svc_xprt_names() function can overflow its buffer if it's so near the end of the passed in buffer that the "name too long" string still doesn't fit. Of course, it could never tell if it was near the end of the passed in buffer, since its only caller passes in zero as the buffer length. Let's make this API a little safer. Change svc_xprt_names() so it always checks for a buffer overflow, and change its only caller to pass in the correct buffer length. If svc_xprt_names() does overflow its buffer, it now fails with an ENAMETOOLONG errno, instead of trying to write a message at the end of the buffer. I don't like this much, but I can't figure out a clean way that's always safe to return some of the names, and an indication that the buffer was not long enough. The displayed error when doing a 'cat /proc/fs/nfsd/portlist' is "File name too long". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:28 -04:00
Chuck Lever	abc5c44d62	SUNRPC: Fix error return value of svc_addr_len() The svc_addr_len() helper function returns -EAFNOSUPPORT if it doesn't recognize the address family of the passed-in socket address. However, the return type of this function is size_t, which means -EAFNOSUPPORT is turned into a very large positive value in this case. The check in svc_udp_recvfrom() to see if the return value is less than zero therefore won't work at all. Additionally, handle_connect_req() passes this value directly to memset(). This could cause memset() to clobber a large chunk of memory if svc_addr_len() has returned an error. Currently the address family of these addresses, however, is known to be supported long before handle_connect_req() is called, so this isn't a real risk. Change the error return value of svc_addr_len() to zero, which fits in the range of size_t, and is safer to pass to memset() directly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:54:25 -04:00
H Hartley Sweeten	dcf1a3573e	net/sunrpc/svc_xprt.c: fix sparse warnings Fix the following sparse warnings in net/sunrpc/svc_xprt.c. warning: symbol 'svc_recv' was not declared. Should it be static? warning: symbol 'svc_drop' was not declared. Should it be static? warning: symbol 'svc_send' was not declared. Should it be static? warning: symbol 'svc_close_all' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-28 13:00:02 -04:00
Steve Wise	d0687be7c7	svcrdma: Fix dma map direction for rdma read targets The nfs server rdma transport was mapping rdma read target pages for TO_DEVICE instead of FROM_DEVICE. This causes data corruption on non cache-coherent systems if frmrs are used. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-25 18:11:14 -04:00
Linus Torvalds	a63856252d	Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux: (81 commits) nfsd41: define nfsd4_set_statp as noop for !CONFIG_NFSD_V4 nfsd41: define NFSD_DRC_SIZE_SHIFT in set_max_drc nfsd41: Documentation/filesystems/nfs41-server.txt nfsd41: CREATE_EXCLUSIVE4_1 nfsd41: SUPPATTR_EXCLCREAT attribute nfsd41: support for 3-word long attribute bitmask nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify nfsd41: pass writable attrs mask to nfsd4_decode_fattr nfsd41: provide support for minor version 1 at rpc level nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions nfsd41: add OPEN4_SHARE_ACCESS_WANT nfs4_stateid bmap nfsd41: access_valid nfsd41: clientid handling nfsd41: check encode size for sessions maxresponse cached nfsd41: stateid handling nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op nfsd41: destroy_session operation nfsd41: non-page DRC for solo sequence responses nfsd41: Add a create session replay cache nfsd41: create_session operation ...	2009-04-06 13:25:56 -07:00
Linus Torvalds	90975ef712	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits) cpumask: remove cpumask allocation from idle_balance, fix numa, cpumask: move numa_node_id default implementation to topology.h, fix cpumask: remove cpumask allocation from idle_balance x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus x86: cpumask: update 32-bit APM not to mug current->cpus_allowed x86: microcode: cleanup x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash numa, cpumask: move numa_node_id default implementation to topology.h cpumask: convert node_to_cpumask_map[] to cpumask_var_t cpumask: remove x86 cpumask_t uses. cpumask: use cpumask_var_t in uv_flush_tlb_others. cpumask: remove cpumask_t assignment from vector_allocation_domain() cpumask: make Xen use the new operators. cpumask: clean up summit's send_IPI functions cpumask: use new cpumask functions throughout x86 x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t cpumask: convert node_to_cpumask_map[] to cpumask_var_t x86: unify 32 and 64-bit node_to_cpumask_map ...	2009-04-05 10:33:07 -07:00
Andy Adamson	2f425878b6	nfsd: don't use the deferral service, return NFS4ERR_DELAY On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be returned. It is up to the NFSv4.1 client to resend only the operations that have not been processed. Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in nfsd4_proc_compound() only when NFSv4.1 Sessions are used. Note: this isn't an adequate solution on its own. It's acceptable as a way to get some minimal 4.1 up and working, but we're going to have to find a way to avoid returning DELAY in all common cases before 4.1 can really be considered ready. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd41: reverse rq_nodeferral negative logic] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [sunrpc: initialize rq_usedeferral] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-04-03 17:41:12 -07:00
Linus Torvalds	811158b147	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits) trivial: Update my email address trivial: NULL noise: drivers/mtd/tests/mtd_*test.c trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h trivial: Fix misspelling of "Celsius". trivial: remove unused variable 'path' in alloc_file() trivial: fix a pdlfush -> pdflush typo in comment trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL trivial: wusb: Storage class should be before const qualifier trivial: drivers/char/bsr.c: Storage class should be before const qualifier trivial: h8300: Storage class should be before const qualifier trivial: fix where cgroup documentation is not correctly referred to trivial: Give the right path in Documentation example trivial: MTD: remove EOL from MODULE_DESCRIPTION trivial: Fix typo in bio_split()'s documentation trivial: PWM: fix of #endif comment trivial: fix typos/grammar errors in Kconfig texts trivial: Fix misspelling of firmware trivial: cgroups: documentation typo and spelling corrections trivial: Update contact info for Jochen Hein trivial: fix typo "resgister" -> "register" ...	2009-04-03 15:24:35 -07:00
Trond Myklebust	cc85906110	Merge branch 'devel' into for-linus	2009-04-01 13:28:15 -04:00
Trond Myklebust	c69da774b2	SUNRPC: Ensure IPV6_V6ONLY is set on the socket before binding to a port Also ensure that we use the protocol family instead of the address family when calling sock_create_kern(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-04-01 13:24:29 -04:00
Rusty Russell	558f6ab910	Merge branch 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip Conflicts: arch/x86/include/asm/topology.h drivers/oprofile/buffer_sync.c (Both cases: changed in Linus' tree, removed in Ingo's).	2009-03-31 13:33:50 +10:30
Linus Torvalds	d17abcd541	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: oprofile: Thou shalt not call __exit functions from __init functions cpumask: remove the now-obsoleted pcibus_to_cpumask(): generic cpumask: remove cpumask_t from core cpumask: convert rcutorture.c cpumask: use new cpumask_ functions in core code. cpumask: remove references to struct irqaction's mask field. cpumask: use mm_cpumask() wrapper: kernel/fork.c cpumask: use set_cpu_active in init/main.c cpumask: remove node_to_first_cpu cpumask: fix seq_bitmap_*() functions. cpumask: remove dangerous CPU_MASK_ALL_PTR, &CPU_MASK_ALL	2009-03-30 18:00:26 -07:00
Ingo Molnar	65fb0d23fc	Merge branch 'linus' into cpumask-for-linus Conflicts: arch/x86/kernel/cpu/common.c	2009-03-30 23:53:32 +02:00
Alexey Dobriyan	99b7623380	proc 2/2: remove struct proc_dir_entry::owner Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy as correctly noted at bug #12454. Someone can lookup entry with NULL ->owner, thus not pinning enything, and release it later resulting in module refcount underflow. We can keep ->owner and supply it at registration time like ->proc_fops and ->data. But this leaves ->owner as easy-manipulative field (just one C assignment) and somebody will forget to unpin previous/pin current module when switching ->owner. ->proc_fops is declared as "const" which should give some thoughts. ->read_proc/->write_proc were just fixed to not require ->owner for protection. rmmod'ed directories will be empty and return "." and ".." -- no harm. And directories with tricky enough readdir and lookup shouldn't be modular. We definitely don't want such modular code. Removing ->owner will also make PDE smaller. So, let's nuke it. Kudos to Jeff Layton for reminding about this, let's say, oversight. http://bugzilla.kernel.org/show_bug.cgi?id=12454 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-03-31 01:14:44 +04:00
Matt LaPlante	692105b8ac	trivial: fix typos/grammar errors in Kconfig texts Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-03-30 15:22:01 +02:00
Rusty Russell	aa85ea5b89	cpumask: use new cpumask_ functions in core code. Impact: cleanup Time to clean up remaining laggards using the old cpu_ functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Trond.Myklebust@netapp.com	2009-03-30 22:05:16 +10:30
Chuck Lever	9355982830	SUNRPC: Remove CONFIG_SUNRPC_REGISTER_V4 We just augmented the kernel's RPC service registration code so that it automatically adjusts to what is supported in user space. Thus we no longer need the kernel configuration option to enable registering RPC services with v4 -- it's all done automatically. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 16:00:17 -04:00
Chuck Lever	363f724cdd	SUNRPC: rpcb_register() should handle errors silently Move error reporting for RPC registration to rpcb_register's caller. This way the caller can choose to recover silently from certain errors, but report errors it does not recognize. Error reporting for kernel RPC service registration is now handled in one place. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:59:48 -04:00
Chuck Lever	cadc0fa534	SUNRPC: Simplify kernel RPC service registration The kernel registers RPC services with the local portmapper with an rpcbind SET upcall to the local portmapper. Traditionally, this used rpcbind v2 (PMAP), but registering RPC services that support IPv6 requires rpcbind v3 or v4. Since we now want separate PF_INET and PF_INET6 listeners for each kernel RPC service, svc_register() will do only one of those registrations at a time. For PF_INET, it tries an rpcb v4 SET upcall first; if that fails, it does a legacy portmap SET. This makes it entirely backwards compatible with legacy user space, but allows a proper v4 SET to be used if rpcbind is available. For PF_INET6, it does an rpcb v4 SET upcall. If that fails, it fails the registration, and thus the transport creation. This let's the kernel detect if user space is able to support IPv6 RPC services, and thus whether it should maintain a PF_INET6 listener for each service at all. This provides complete backwards compatibilty with legacy user space that only supports rpcbind v2. The only down-side is that registering a new kernel RPC service may take an extra exchange with the local portmapper on legacy systems, but this is an infrequent operation and is done over UDP (no lingering sockets in TIMEWAIT), so it shouldn't be consequential. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:58:37 -04:00
Chuck Lever	d5a8620f7c	SUNRPC: Simplify svc_unregister() Our initial implementation of svc_unregister() assumed that PMAP_UNSET cleared all rpcbind registrations for a [program, version] tuple. However, we now have evidence that PMAP_UNSET clears only "inet" entries, and not "inet6" entries, in the rpcbind database. For backwards compatibility with the legacy portmapper, the svc_unregister() function also must work if user space doesn't support rpcbind version 4 at all. Thus we'll send an rpcbind v4 UNSET, and if that fails, we'll send a PMAP_UNSET. This simplifies the code in svc_unregister() and provides better backwards compatibility with legacy user space that does not support rpcbind version 4. We can get rid of the conditional compilation in here as well. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:58:07 -04:00
Chuck Lever	1673d0de40	SUNRPC: Allow callers to pass rpcb_v4_register a NULL address The user space TI-RPC library uses an empty string for the universal address when unregistering all target addresses for [program, version]. The kernel's rpcb client should behave the same way. Here, we are switching between several registration methods based on the protocol family of the incoming address. Rename the other rpcbind v4 registration functions to make it clear that they, as well, are switched on protocol family. In /etc/netconfig, this is either "inet" or "inet6". NB: The loopback protocol families are not supported in the kernel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:57:00 -04:00
Chuck Lever	126e4bc3b3	SUNRPC: rpcbind actually interprets r_owner string RFC 1833 has little to say about the contents of r_owner; it only specifies that it is a string, and states that it is used to control who can UNSET an entry. Our port of rpcbind (from Sun) assumes this string contains a numeric UID value, not alphabetical or symbolic characters, but checks this value only for AF_LOCAL RPCB_SET or RPCB_UNSET requests. In all other cases, rpcbind ignores the contents of the r_owner string. The reference user space implementation of rpcb_set(3) uses a numeric UID for all SET/UNSET requests (even via the network) and an empty string for all other requests. We emulate that behavior here to maintain bug-for-bug compatibility. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:56:04 -04:00
Chuck Lever	3aba45536f	SUNRPC: Clean up address type casts in rpcb_v4_register() Clean up: Simplify rpcb_v4_register() and its helpers by moving the details of sockaddr type casting to rpcb_v4_register()'s helper functions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:55:52 -04:00
Chuck Lever	ba5c35e0c7	SUNRPC: Don't return EPROTONOSUPPORT in svc_register()'s helpers The RPC client returns -EPROTONOSUPPORT if there is a protocol version mismatch (ie the remote RPC server doesn't support the RPC protocol version sent by the client). Helpers for the svc_register() function return -EPROTONOSUPPORT if they don't recognize the passed-in IPPROTO_ value. These are two entirely different failure modes. Have the helpers return -ENOPROTOOPT instead of -EPROTONOSUPPORT. This will allow callers to determine more precisely what the underlying problem is, and decide to report or recover appropriately. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:55:40 -04:00
Chuck Lever	fc28decdc9	SUNRPC: Use IPv4 loopback for registering AF_INET6 kernel RPC services The kernel uses an IPv6 loopback address when registering its AF_INET6 RPC services so that it can tell whether the local portmapper is actually IPv6-enabled. Since the legacy portmapper doesn't listen on IPv6, however, this causes a long timeout on older systems if the kernel happens to try creating and registering an AF_INET6 RPC service. Originally I wanted to use a connected transport (either TCP or connected UDP) so that the upcall would fail immediately if the portmapper wasn't listening on IPv6, but we never agreed on what transport to use. In the end, it's of little consequence to the kernel whether the local portmapper is listening on IPv6. It's only important whether the portmapper supports rpcbind v4. And the kernel can't tell that at all if it is sending requests via IPv6 -- the portmapper will just ignore them. So, send both rpcbind v2 and v4 SET/UNSET requests via IPv4 loopback to maintain better backwards compatibility between new kernels and legacy user space, and prevent multi-second hangs in some cases when the kernel attempts to register RPC services. This patch is part of a series that addresses http://bugzilla.kernel.org/show_bug.cgi?id=12256 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:55:28 -04:00
Chuck Lever	7d21c0f984	SUNRPC: Set IPV6ONLY flag on PF_INET6 RPC listener sockets We are about to convert to using separate RPC listener sockets for PF_INET and PF_INET6. This echoes the way IPv6 is handled in user space by TI-RPC, and eliminates the need for ULPs to worry about mapped IPv4 AF_INET6 addresses when doing address comparisons. Start by setting the IPV6ONLY flag on PF_INET6 RPC listener sockets. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:55:18 -04:00
Chuck Lever	49a9072f29	SUNRPC: Remove @family argument from svc_create() and svc_create_pooled() Since an RPC service listener's protocol family is specified now via svc_create_xprt(), it no longer needs to be passed to svc_create() or svc_create_pooled(). Remove that argument from the synopsis of those functions, and remove the sv_family field from the svc_serv struct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:48 -04:00
Chuck Lever	9652ada3fb	SUNRPC: Change svc_create_xprt() to take a @family argument The sv_family field is going away. Pass a protocol family argument to svc_create_xprt() instead of extracting the family from the passed-in svc_serv struct. Again, as this is a listener socket and not an address, we make this new argument an "int" protocol family, instead of an "sa_family_t." Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:36 -04:00
Chuck Lever	baf01caf09	SUNRPC: svc_setup_socket() gets protocol family from socket Since the sv_family field is going away, modify svc_setup_socket() to extract the protocol family from the passed-in socket instead of from the passed-in svc_serv struct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:23 -04:00
Chuck Lever	4b62e58ccc	SUNRPC: Pass a family argument to svc_register() The sv_family field is going away. Instead of using sv_family, have the svc_register() function take a protocol family argument. Since this argument represents a protocol family, and not an address family, this argument takes an int, as this is what is passed to sock_create_kern(). Also make sure svc_register's helpers are checking for PF_FOO instead of AF_FOO. The value of [AP]F_FOO are equivalent; this is simply a symbolic change to reflect the semantics of the value stored in that variable. sock_create_kern() should return EPFNOSUPPORT if the passed-in protocol family isn't supported, but it uses EAFNOSUPPORT for this case. We will stick with that tradition here, as svc_register() is called by the RPC server in the same path as sock_create_kern(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:54:12 -04:00
Chuck Lever	156e62094a	SUNRPC: Clean up svc_find_xprt() calling sequence Clean up: add documentating comment and use appropriate data types for svc_find_xprt()'s arguments. This also eliminates a mixed sign comparison: @port was an int, while the return value of svc_xprt_local_port() is an unsigned short. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:53:57 -04:00
Chuck Lever	776bd5c7a2	SUNRPC: Don't flag empty RPCB_GETADDR reply as bogus In 2007, commit `e65fe3976f` added additional sanity checking to rpcb_decode_getaddr() to make sure we were getting a reply that was long enough to be an actual universal address. If the uaddr string isn't long enough, the XDR decoder returns EIO. However, an empty string is a valid RPCB_GETADDR response if the requested service isn't registered. Moreover, "::.n.m" is also a valid RPCB_GETADDR response for IPv6 addresses that is shorter than rpcb_decode_getaddr()'s lower limit of 11. So this sanity check introduced a regression for rpcbind requests against IPv6 remotes. So revert the lower bound check added by commit `e65fe3976f`, and add an explicit check for an empty uaddr string, similar to libtirpc's rpcb_getaddr(3). Pointed-out-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-28 15:52:08 -04:00
Linus Torvalds	3ae5080f4c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (37 commits) fs: avoid I_NEW inodes Merge code for single and multiple-instance mounts Remove get_init_pts_sb() Move common mknod_ptmx() calls into caller Parse mount options just once and copy them to super block Unroll essentials of do_remount_sb() into devpts vfs: simple_set_mnt() should return void fs: move bdev code out of buffer.c constify dentry_operations: rest constify dentry_operations: configfs constify dentry_operations: sysfs constify dentry_operations: JFS constify dentry_operations: OCFS2 constify dentry_operations: GFS2 constify dentry_operations: FAT constify dentry_operations: FUSE constify dentry_operations: procfs constify dentry_operations: ecryptfs constify dentry_operations: CIFS constify dentry_operations: AFS ...	2009-03-27 16:23:12 -07:00
ideawu	abd91ee979	sunrpc/svc.c: Remove unused line 'rqstp->rq_server = serv;' in svc_process There is no need to set rqstp->rq_server to serv, while serv is initialized as rqstp->rq_server at previous line. And between these two lines, there is no change to rqstp->rq_server. Signed-off-by: ideawu <ideawu@163.com> Reviewed-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-27 19:15:21 -04:00
Al Viro	3ba13d179e	constify dentry_operations: rest Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-27 14:44:03 -04:00
David S. Miller	08abe18af1	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ Conflicts: drivers/net/wimax/i2400m/usb-notif.c	2009-03-26 15:23:24 -07:00
Tom Talpey	2e3c230bc7	SVCRDMA: fix recent printk format warnings. printk formats in prior commit were reversed/incorrect. Compiled without warning on x86 and x86_64, but detected on ppc. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:37 -04:00
Trond Myklebust	55420c24a0	SUNRPC: Ensure we close the socket on EPIPE errors too... As long as one task is holding the socket lock, then calls to xprt_force_disconnect(xprt) will not succeed in shutting down the socket. In particular, this would mean that a server initiated shutdown will not succeed until the lock is relinquished. In order to avoid the deadlock, we should ensure that xs_tcp_send_request() closes the socket on EPIPE errors too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:36 -04:00
Trond Myklebust	b61d59fffd	SUNRPC: xs_tcp_connect_worker{4,6}: merge common code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:35 -04:00
Trond Myklebust	25fe6142a5	SUNRPC: Add a sysctl to control the duration of the socket linger timeout Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:34 -04:00
Trond Myklebust	7d1e8255cf	SUNRPC: Add the equivalent of the linger and linger2 timeouts to RPC sockets This fixes a regression against FreeBSD servers as reported by Tomas Kasparek. Apparently when using RPC over a TCP socket, the FreeBSD servers don't ever react to the client closing the socket, and so commit `e06799f958` (SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket) causes the setup to hang forever whenever the client attempts to close and then reconnect. We break the deadlock by adding a 'linger2' style timeout to the socket, after which, the client will abort the connection using a TCP 'RST'. The default timeout is set to 15 seconds. A subsequent patch will put it under user control by means of a systctl. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-19 15:17:34 -04:00
Olga Kornievskaia	47a14ef1af	svcrpc: take advantage of tcp autotuning Allow the NFSv4 server to make use of TCP autotuning behaviour, which was previously disabled by setting the sk_userlocks variable. Set the receive buffers to be big enough to receive the whole RPC request, and set this for the listening socket, not the accept socket. Remove the code that readjusts the receive/send buffer sizes for the accepted socket. Previously this code was used to influence the TCP window management behaviour, which is no longer needed when autotuning is enabled. This can improve IO bandwidth on networks with high bandwidth-delay products, where a large tcp window is required. It also simplifies performance tuning, since getting adequate tcp buffers previously required increasing the number of nfsd threads. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Jim Rees <rees@umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:46:59 -04:00
Greg Banks	03cf6c9f49	knfsd: add file to export stats about nfsd pools Add /proc/fs/nfsd/pool_stats to export to userspace various statistics about the operation of rpc server thread pools. This patch is based on a forward-ported version of knfsd-add-pool-thread-stats which has been shipping in the SGI "Enhanced NFS" product since 2006 and which was previously posted: http://article.gmane.org/gmane.linux.nfs/10375 It has also been updated thus: * moved EXPORT_SYMBOL() to near the function it exports * made the new struct struct seq_operations const * used SEQ_START_TOKEN instead of ((void )1) merged fix from SGI PV 990526 "sunrpc: use dprintk instead of printk in svc_pool_stats_()" by Harshula Jayasuriya. merged fix from SGI PV 964001 "Crash reading pool_stats before nfsds are started". Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:42 -04:00
Greg Banks	59a252ff8c	knfsd: avoid overloading the CPU scheduler with enormous load averages Avoid overloading the CPU scheduler with enormous load averages when handling high call-rate NFS loads. When the knfsd bottom half is made aware of an incoming call by the socket layer, it tries to choose an nfsd thread and wake it up. As long as there are idle threads, one will be woken up. If there are lot of nfsd threads (a sensible configuration when the server is disk-bound or is running an HSM), there will be many more nfsd threads than CPUs to run them. Under a high call-rate low service-time workload, the result is that almost every nfsd is runnable, but only a handful are actually able to run. This situation causes two significant problems: 1. The CPU scheduler takes over 10% of each CPU, which is robbing the nfsd threads of valuable CPU time. 2. At a high enough load, the nfsd threads starve userspace threads of CPU time, to the point where daemons like portmap and rpc.mountd do not schedule for tens of seconds at a time. Clients attempting to mount an NFS filesystem timeout at the very first step (opening a TCP connection to portmap) because portmap cannot wake up from select() and call accept() in time. Disclaimer: these effects were observed on a SLES9 kernel, modern kernels' schedulers may behave more gracefully. The solution is simple: keep in each svc_pool a counter of the number of threads which have been woken but have not yet run, and do not wake any more if that count reaches an arbitrary small threshold. Testing was on a 4 CPU 4 NIC Altix using 4 IRIX clients, each with 16 synthetic client threads simulating an rsync (i.e. recursive directory listing) workload reading from an i386 RH9 install image (161480 regular files in 10841 directories) on the server. That tree is small enough to fill in the server's RAM so no disk traffic was involved. This setup gives a sustained call rate in excess of 60000 calls/sec before being CPU-bound on the server. The server was running 128 nfsds. Profiling showed schedule() taking 6.7% of every CPU, and __wake_up() taking 5.2%. This patch drops those contributions to 3.0% and 2.2%. Load average was over 120 before the patch, and 20.9 after. This patch is a forward-ported version of knfsd-avoid-nfsd-overload which has been shipping in the SGI "Enhanced NFS" product since 2006. It has been posted before: http://article.gmane.org/gmane.linux.nfs/10374 Signed-off-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-03-18 17:38:41 -04:00
Rusty Russell	a70f730282	cpumask: replace node_to_cpumask with cpumask_of_node. Impact: cleanup node_to_cpumask (and the blecherous node_to_cpumask_ptr which contained a declaration) are replaced now everyone implements cpumask_of_node. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-13 14:49:46 +10:30
Trond Myklebust	5e3771ce2d	SUNRPC: Ensure that xs_nospace return values are propagated If xs_nospace() finds that the socket has disconnected, it attempts to return ENOTCONN, however that value is then squashed by the callers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:01 -04:00
Trond Myklebust	8a2cec295f	SUNRPC: Delay, then retry on connection errors. Enforce the comment in xs_tcp_connect_worker4/xs_tcp_connect_worker6 that we should delay, then retry on certain connection errors. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:01 -04:00
Trond Myklebust	2a4919919a	SUNRPC: Return EAGAIN instead of ENOTCONN when waking up xprt->pending While we should definitely return socket errors to the task that is currently trying to send data, there is no need to propagate the same error to all the other tasks on xprt->pending. Doing so actually slows down recovery, since it causes more than one tasks to attempt socket recovery. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:00 -04:00
Trond Myklebust	482f32e65d	SUNRPC: Handle socket errors correctly Ensure that we pick up and handle socket errors as they occur. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:38:00 -04:00
Trond Myklebust	c8485e4d63	SUNRPC: Handle ECONNREFUSED correctly in xprt_transmit() If we get an ECONNREFUSED error, we currently go to sleep on the 'xprt->sending' wait queue. The problem is that no timeout is set there, and there is nothing else that will wake the task up later. We should deal with ECONNREFUSED in call_status, given that is where we also deal with -EHOSTDOWN, and friends. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:59 -04:00
Trond Myklebust	40d2549db5	SUNRPC: Don't disconnect if a connection is still in progress. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:58 -04:00
Trond Myklebust	670f945731	SUNRPC: Ensure we set XPRT_CLOSING only after we've sent a tcp FIN... ...so that we can distinguish between when we need to shutdown and when we don't. Also remove the call to xs_tcp_shutdown() from xs_tcp_connect(), since xprt_connect() makes the same test. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:58 -04:00
Trond Myklebust	15f081ca8d	SUNRPC: Avoid an unnecessary task reschedule on ENOTCONN If the socket is unconnected, and xprt_transmit() returns ENOTCONN, we currently give up the lock on the transport channel. Doing so means that the lock automatically gets assigned to the next task in the xprt->sending queue, and so that task needs to be woken up to do the actual connect. The following patch aims to avoid that unnecessary task switch. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:57 -04:00
Tom Talpey	441e3e2429	SUNRPC: dynamically load RPC transport modules on-demand Provide an api to attempt to load any necessary kernel RPC client transport module automatically. By convention, the desired module name is "xprt"+"transport name". For example, when NFS mounting with "-o proto=rdma", attempt to load the "xprtrdma" module. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:56 -04:00
Tom Talpey	b38ab40ad5	XPRTRDMA: correct an rpc/rdma inline send marshaling error Certain client rpc's which contain both lengthy page-contained metadata and a non-empty xdr_tail buffer require careful handling to avoid overlapped memory copying. Rearranging of existing rpcrdma marshaling code avoids it; this fixes an NFSv4 symlink creation error detected with connectathon basic/test8 to multiple servers. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:55 -04:00
Tom Talpey	b1e1e15877	SVCRDMA: remove faulty assertions in rpc/rdma chunk validation. Certain client-provided RPCRDMA chunk alignments result in an additional scatter/gather entry, which triggered nfs/rdma server assertions incorrectly. OpenSolaris nfs/rdma client connectathon testing was blocked by these in the special/locking section. Signed-off-by: Tom Talpey <tmtalpey@gmail.com> Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:37:55 -04:00
Chuck Lever	fe315e76fc	SUNRPC: Avoid spurious wake-up during UDP connect processing To clear out old state, the UDP connect workers unconditionally invoke xs_close() before proceeding with a new connect. Nowadays this causes a spurious wake-up of the task waiting for the connect to complete. This is a little racey, but usually harmless. The waiting task immediately retries the connect via a call_bind/call_connect sequence, which usually finds the transport already in the connected state because the connect worker has finished in the background. To avoid a spurious wake-up, factor the xs_close() logic that resets the underlying socket into a helper, and have the UDP connect workers call that helper instead of xs_close(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:10:21 -04:00
Trond Myklebust	01d37c428a	SUNRPC: xprt_connect() don't abort the task if the transport isn't bound If the transport isn't bound, then we should just return ENOTCONN, letting call_connect_status() and/or call_status() deal with retrying. Currently, we appear to abort all pending tasks with an EIO error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:09:39 -04:00
Trond Myklebust	fba91afbec	SUNRPC: Fix an Oops due to socket not set up yet... We can Oops in both xs_udp_send_request() and xs_tcp_send_request() if the call to xs_sendpages() returns an error due to the socket not yet being set up. Deal with that situation by returning a new error: ENOTSOCK, so that we know to avoid dereferencing transport->sock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-11 14:06:41 -04:00
Trond Myklebust	eb9b55ab4d	SUNRPC: Tighten up the task locking rules in __rpc_execute() We should probably not be testing any flags after we've cleared the RPC_TASK_RUNNING flag, since rpc_make_runnable() is then free to assign the rpc_task to another workqueue, which may then destroy it. We can fix any races with rpc_make_runnable() by ensuring that we only clear the RPC_TASK_RUNNING flag while holding the rpc_wait_queue->lock that the task is supposed to be sleeping on (and then checking whether or not the task really is sleeping). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2009-03-10 20:33:16 -04:00
Ilpo Järvinen	1f0fa15432	net/sunrpc/xprtsock.c: some common code found $ diff-funcs xs_udp_write_space net/sunrpc/xprtsock.c net/sunrpc/xprtsock.c xs_tcp_write_space --- net/sunrpc/xprtsock.c:xs_udp_write_space() +++ net/sunrpc/xprtsock.c:xs_tcp_write_space() @@ -1,4 +1,4 @@ - * xs_udp_write_space - callback invoked when socket buffer space + * xs_tcp_write_space - callback invoked when socket buffer space * becomes available * @sk: socket whose state has changed * @@ -7,12 +7,12 @@ * progress, otherwise we'll waste resources thrashing kernel_sendmsg * with a bunch of small requests. / -static void xs_udp_write_space(struct sock sk) +static void xs_tcp_write_space(struct sock sk) { read_lock(&sk->sk_callback_lock); - / from net/core/sock.c:sock_def_write_space / - if (sock_writeable(sk)) { + / from net/core/stream.c:sk_stream_write_space / + if (sk_stream_wspace(sk) >= sk_stream_min_wspace(sk)) { struct socket sock; struct rpc_xprt *xprt; $ codiff net/sunrpc/xprtsock.o net/sunrpc/xprtsock.o.new net/sunrpc/xprtsock.c: xs_tcp_write_space \| -163 xs_udp_write_space \| -163 2 functions changed, 326 bytes removed net/sunrpc/xprtsock.c: xs_write_space \| +179 1 function changed, 179 bytes added net/sunrpc/xprtsock.o.new: 3 functions changed, 179 bytes added, 326 bytes removed, diff: -147 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-06 23:48:33 -08:00
Randy Dunlap	55128bc23e	sunrpc: fix rdma dependencies Fix sunrpc/rdma build dependencies. Survives 12 build combinations of INET, IPV6, SUNRPC, INFINIBAND, and INFINIBAND_ADDR_TRANS. ERROR: "rdma_destroy_id" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_connect" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_destroy_qp" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_create_id" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_create_qp" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_resolve_route" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_disconnect" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_resolve_addr" [net/sunrpc/xprtrdma/xprtrdma.ko] undefined! ERROR: "rdma_accept" [net/sunrpc/xprtrdma/svcrdma.ko] undefined! ERROR: "rdma_destroy_id" [net/sunrpc/xprtrdma/svcrdma.ko] undefined! ERROR: "rdma_listen" [net/sunrpc/xprtrdma/svcrdma.ko] undefined! ERROR: "rdma_create_id" [net/sunrpc/xprtrdma/svcrdma.ko] undefined! ERROR: "rdma_create_qp" [net/sunrpc/xprtrdma/svcrdma.ko] undefined! ERROR: "rdma_bind_addr" [net/sunrpc/xprtrdma/svcrdma.ko] undefined! ERROR: "rdma_disconnect" [net/sunrpc/xprtrdma/svcrdma.ko] undefined! Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-03 15:20:13 -08:00
J. Bruce Fields	ce0cf6622c	nfs: note that CONFIG_SUNRPC_XPRT_RDMA turns on server side support too We forgot to update this when adding server-side support. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-27 17:26:58 -05:00
Alexey Dobriyan	9098c24f35	fs/Kconfig: move sunrpc out Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-01-22 13:16:00 +03:00
Trond Myklebust	24c3767e41	SUNRPC: The sunrpc server code should not be used by out-of-tree modules Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-07 17:18:42 -05:00
Tom Tucker	22945e4a1c	svc: Clean up deferred requests on transport destruction A race between svc_revisit and svc_delete_xprt can result in deferred requests holding references on a transport that can never be recovered because dead transports are not enqueued for subsequent processing. Check for XPT_DEAD in revisit to clean up completing deferrals on a dead transport and sweep a transport's deferred queue to do the same for queued but unprocessed deferrals. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-07 17:08:46 -05:00
Tom Tucker	2779e3ae39	svc: Move kfree of deferral record to common code The rqstp structure has a pointer to a svc_deferred_req record that is allocated when requests are deferred. This record is common to all transports and can be freed in common code. Move the kfree of the rq_deferred to the common svc_xprt_release function. This also fixes a memory leak in the RDMA transport which does not kfree the dr structure in it's version of the xpo_release_rqst callback. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-07 15:40:45 -05:00
Trond Myklebust	69b6ba3712	SUNRPC: Ensure the server closes sockets in a timely fashion We want to ensure that connected sockets close down the connection when we set XPT_CLOSE, so that we don't keep it hanging while cleaning up all the stuff that is keeping a reference to the socket. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:57 -05:00
Jeff Layton	c9233eb7b0	sunrpc: add sv_maxconn field to svc_serv (try #3 ) svc_check_conn_limits() attempts to prevent denial of service attacks by having the service close old connections once it reaches a threshold. This threshold is based on the number of threads in the service: (serv->sv_nrthreads + 3) * 20 Once we reach this, we drop the oldest connections and a printk pops to warn the admin that they should increase the number of threads. Increasing the number of threads isn't an option however for services like lockd. We don't want to eliminate this check entirely for such services but we need some way to increase this limit. This patch adds a sv_maxconn field to the svc_serv struct. When it's set to 0, we use the current method to calculate the max number of connections. RPC services can then set this on an as-needed basis. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-01-06 11:53:47 -05:00
Al Viro	56ff5efad9	zero i_uid/i_gid on inode allocation ... and don't bother in callers. Don't bother with zeroing i_blocks, while we are at it - it's already been zeroed. i_mode is not worth the effort; it has no common default value. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-01-05 11:54:28 -05:00
Trond Myklebust	08cc36cbd1	Merge branch 'devel' into next	2008-12-30 16:51:43 -05:00
Linus Torvalds	0191b625ca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).	2008-12-28 12:49:40 -08:00
Olga Kornievskaia	2efef7080f	rpc: add service field to new upcall This patch extends the new upcall with a "service" field that currently can have 2 values: "" or "nfs". These values specify matching rules for principals in the keytab file. The "" means that gssd is allowed to use "root", "nfs", or "host" keytab entries while the other option requires "nfs". Restricting gssd to use the "nfs" principal is needed for when the server performs a callback to the client. The server in this case has to authenticate itself as an "nfs" principal. We also need "service" field to distiguish between two client-side cases both currently using a uid of 0: the case of regular file access by the root user, and the case of state-management calls (such as setclientid) which should use a keytab for authentication. (And the upcall should fail if an appropriate principal can't be found.) Signed-off: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:56 -05:00
Olga Kornievskaia	8b1c7bf5b6	rpc: add target field to new upcall This patch extends the new upcall by adding a "target" field communicating who we want to authenticate to (equivalently, the service principal that we want to acquire a ticket for). Signed-off: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:26 -05:00
Olga Kornievskaia	61054b14d5	nfsd: support callbacks with gss flavors This patch adds server-side support for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:00 -05:00
Olga Kornievskaia	945b34a772	rpc: allow gss callbacks to client This patch adds client-side support to allow for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:18:34 -05:00
Olga Kornievskaia	608207e888	rpc: pass target name down to rpc level on callbacks The rpc client needs to know the principal that the setclientid was done as, so it can tell gssd who to authenticate to. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:40 -05:00
Olga Kornievskaia	68e76ad0ba	nfsd: pass client principal name in rsc downcall Two principals are involved in krb5 authentication: the target, who we authenticate to (normally the name of the server, like nfs/server.citi.umich.edu@CITI.UMICH.EDU), and the source, we we authenticate as (normally a user, like bfields@UMICH.EDU) In the case of NFSv4 callbacks, the target of the callback should be the source of the client's setclientid call, and the source should be the nfs server's own principal. Therefore we allow svcgssd to pass down the name of the principal that just authenticated, so that on setclientid we can store that principal name with the new client, to be used later on callbacks. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:15 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	34769fc488	rpc: implement new upcall Implement the new upcall. We decide which version of the upcall gssd will use (new or old), by creating both pipes (the new one named "gssd", the old one named after the mechanism (e.g., "krb5")), and then waiting to see which version gssd actually opens. We don't permit pipes of the two different types to be opened at once. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:16:37 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	5b7ddd4a7b	rpc: store pointer to pipe inode in gss upcall message Keep a pointer to the inode that the message is queued on in the struct gss_upcall_msg. This will be convenient, especially after we have a choice of two pipes that an upcall could be queued on. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:15:44 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	79a3f20b64	rpc: use count of pipe openers to wait for first open Introduce a global variable pipe_version which will eventually be used to keep track of which version of the upcall gssd is using. For now, though, it only keeps track of whether any pipe is open or not; it is negative if not, zero if one is opened. We use this to wait for the first gssd to open a pipe. (Minor digression: note this waits only for the very first open of any pipe, not for the first open of a pipe for a given auth; thus we still need the RPC_PIPE_WAIT_FOR_OPEN behavior to wait for gssd to open new pipes that pop up on subsequent mounts.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:10:52 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	cf81939d6f	rpc: track number of users of the gss upcall pipe Keep a count of the number of pipes open plus the number of messages on a pipe. This count isn't used yet. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:10:19 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	e712804ae4	rpc: call release_pipe only on last close I can't see any reason we need to call this until either the kernel or the last gssd closes the pipe. Also, this allows to guarantee that open_pipe and release_pipe are called strictly in pairs; open_pipe on gssd's first open, release_pipe on gssd's last close (or on the close of the kernel side of the pipe, if that comes first). That will make it very easy for the gss code to keep track of which pipes gssd is using. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:09:47 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	c381060869	rpc: add an rpc_pipe_open method We want to transition to a new gssd upcall which is text-based and more easily extensible. To simplify upgrades, as well as testing and debugging, it will help if we can upgrade gssd (to a version which understands the new upcall) without having to choose at boot (or module-load) time whether we want the new or the old upcall. We will do this by providing two different pipes: one named, as currently, after the mechanism (normally "krb5"), and supporting the old upcall. One named "gssd" and supporting the new upcall version. We allow gssd to indicate which version it supports by its choice of which pipe to open. As we have no interest in supporting simultaneous use of both versions, we'll forbid opening both pipes at the same time. So, add a new pipe_open callback to the rpc_pipefs api, which the gss code can use to track which pipes have been open, and to refuse opens of incompatible pipes. We only need this to be called on the first open of a given pipe. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:08:32 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	db75b3d6b5	rpc: minor gss_alloc_msg cleanup I want to add a little more code here, so it'll be convenient to have this flatter. Also, I'll want to add another error condition, so it'll be more convenient to return -ENOMEM than NULL in the error case. The only caller is already converting NULL to -ENOMEM anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:07:13 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	b03568c322	rpc: factor out warning code from gss_pipe_destroy_msg We'll want to call this from elsewhere soon. And this is a bit nicer anyway. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:55 -05:00
$\"J. Bruce Fields\$ \"J. Bruce Fields\	99db356368	rpc: remove unnecessary assignment We're just about to kfree() gss_auth, so there's no point to setting any of its fields. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:33 -05:00
Jeff Layton	6dcd3926b2	sunrpc: fix code that makes auth_gss send destroy_cred message (try #2 ) There's a bit of a chicken and egg problem when it comes to destroying auth_gss credentials. When we destroy the last instance of a GSSAPI RPC credential, we should send a NULL RPC call with a GSS procedure of RPCSEC_GSS_DESTROY to hint to the server that it can destroy those creds. This isn't happening because we're setting clearing the uptodate bit on the credentials and then setting the operations to the gss_nullops. When we go to do the RPC call, we try to refresh the creds. That fails with -EACCES and the call fails. Fix this by not clearing the UPTODATE bit for the credentials and adding a new crdestroy op for gss_nullops that just tears down the cred without trying to destroy the context. The only difference between this patch and the first one is the removal of some minor formatting deltas. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:57 -05:00
Peter Staubach	64672d55d9	optimize attribute timeouts for "noac" and "actimeo=0" Hi. I've been looking at a bugzilla which describes a problem where a customer was advised to use either the "noac" or "actimeo=0" mount options to solve a consistency problem that they were seeing in the file attributes. It turned out that this solution did not work reliably for them because sometimes, the local attribute cache was believed to be valid and not timed out. (With an attribute cache timeout of 0, the cache should always appear to be timed out.) In looking at this situation, it appears to me that the problem is that the attribute cache timeout code has an off-by-one error in it. It is assuming that the cache is valid in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The cache should be considered valid only in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo). With this change, the options, "noac" and "actimeo=0", work as originally expected. This problem was previously addressed by special casing the attrtimeo == 0 case. However, since the problem is only an off- by-one error, the cleaner solution is address the off-by-one error and thus, not require the special case. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	7bd8826915	SUNRPC: rpcsec_gss modules should not be used by out-of-tree code Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:32 -05:00
Trond Myklebust	468039ee46	SUNRPC: Convert the xdr helpers and rpc_pipefs to EXPORT_SYMBOL_GPL We've never considered the sunrpc code as part of any ABI to be used by out-of-tree modules. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:31 -05:00
Trond Myklebust	88a9fe8cae	SUNRPC: Remove the last remnant of the BKL... Somehow, this escaped the previous purge. There should be no need to keep any extra locks in the XDR callbacks. The NFS client XDR code only writes into private objects, whereas all reads of shared objects are confined to fields that do not change, such as filehandles... Ditto for lockd, the NFSv2/v3 client mount code, and rpcbind. The nfsd XDR code may require the BKL, but since it does a synchronous RPC call from a thread that already holds the lock, that issue is moot. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:31 -05:00
David S. Miller	eb14f01959	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c	2008-12-15 20:03:50 -08:00
Ilpo Järvinen	b1721d2bb9	rpc/rdma: goto instead of copypaste Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-14 23:19:48 -08:00
Roel Kluin	5eaa65b240	net: Make static Sparse asked whether these could be static. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-12-10 15:18:31 -08:00
James Morris	ec98ce480a	Merge branch 'master' into next Conflicts: fs/nfsd/nfs4recover.c Manually fixed above to use new creds API functions, e.g. nfs4_save_creds(). Signed-off-by: James Morris <jmorris@namei.org>	2008-12-04 17:16:36 +11:00
Linus Torvalds	2433c41789	Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: NLM: client-side nlm_lookup_host() should avoid matching on srcaddr nfsd: use of unitialized list head on error exit in nfs4recover.c Add a reference to sunrpc in svc_addsock nfsd: clean up grace period on early exit	2008-12-03 16:40:37 -08:00
Ingo Molnar	ff0db0490a	sunrpc: fix warning in net/sunrpc/xprtrdma/verbs.c fix this warning: net/sunrpc/xprtrdma/verbs.c: In function ‘rpcrdma_conn_upcall’: net/sunrpc/xprtrdma/verbs.c:279: warning: unused variable ‘addr’ Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-25 16:58:42 -08:00
Ingo Molnar	ed72b9c6e0	sunrpc: fix warning in net/sunrpc/xprtrdma/svc_rdma_transport.c this warning: net/sunrpc/xprtrdma/svc_rdma_transport.c: In function ‘svc_rdma_accept’: net/sunrpc/xprtrdma/svc_rdma_transport.c:830: warning: ‘dma_mr_acc’ may be used uninitialized in this function triggers because GCC does not recognize the (correct) flow connection between need_dma_mr and dma_mr_acc. Annotate it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-25 16:49:37 -08:00
Tom Tucker	2da2c21d75	Add a reference to sunrpc in svc_addsock The svc_addsock function adds transport instances without taking a reference on the sunrpc.ko module, however, the generic transport destruction code drops a reference when a transport instance is destroyed. Add a try_module_get call to the svc_addsock function for transport instances added by this function. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Tested-by: Jeff Moyer <jmoyer@redhat.com>	2008-11-24 10:15:01 -06:00
David S. Miller	6ab33d5171	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c include/net/mac80211.h net/phonet/af_phonet.c	2008-11-20 16:44:00 -08:00
Trond Myklebust	23918b0306	SUNRPC: Fix a performance regression in the RPC authentication code Fix a regression reported by Max Kellermann whereby kernel profiling showed that his clients were spending 45% of their time in rpcauth_lookup_credcache. It turns out that although his processes had identical uid/gid/groups, generic_match() was failing to detect this, because the task->group_info pointers were not shared. This again lead to the creation of a huge number of identical credentials at the RPC layer. The regression is fixed by comparing the contents of task->group_info if the actual pointers are not identical. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-11-20 13:17:40 -08:00
David Howells	86a264abe5	CRED: Wrap current->cred and a few other accessors Wrap current->cred and a few other accessors to hide their actual implementation. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:18 +11:00
David Howells	b6dff3ec5e	CRED: Separate task security context from task_struct Separate the task security context from task_struct. At this point, the security data is temporarily embedded in the task_struct with two pointers pointing to it. Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in entry.S via asm-offsets. With comment fixes Signed-off-by: Marc Dionne <marc.c.dionne@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:16 +11:00
David Howells	8f4194026b	CRED: Wrap task credential accesses in the SunRPC protocol Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(\|e\|s\|fs)[ug]id to current_(\|e\|s\|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <jmorris@namei.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: linux-nfs@vger.kernel.org Signed-off-by: James Morris <jmorris@namei.org>	2008-11-14 10:39:09 +11:00
David S. Miller	e0db4a786b	sunrpc: Fix build warning due to typo in %pI4 format changes. Noticed by Stephen Hemminger. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-02 23:57:06 -08:00
Harvey Harrison	21454aaad3	net: replace NIPQUAD() in net/*/ Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-31 00:54:56 -07:00
David S. Miller	a1744d3bee	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/p54/p54common.c	2008-10-31 00:17:34 -07:00
Harvey Harrison	5b095d9892	net: replace %p6 with %pI6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:52:50 -07:00
Harvey Harrison	4b7a4274ca	net: replace %#p6 format specifier with %pi6 gcc warns when using the # modifier with the %p format specifier, so we can't use this to omit the colons when needed, introduces %pi6 instead. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-29 12:50:24 -07:00
Harvey Harrison	b189db5d29	net: remove NIP6(), NIP6_FMT, NIP6_SEQFMT and final users Open code NIP6_FMT in the one call inside sscanf and one user of NIP6() that could use %p6 in the netfilter code. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:38 -07:00
Harvey Harrison	fdb46ee752	net, misc: replace uses of NIP6_FMT with %p6 Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 23:02:32 -07:00
Harvey Harrison	b071195deb	net: replace all current users of NIP6_SEQFMT with %#p6 The define in kernel.h can be done away with at a later time. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-28 16:05:40 -07:00
Trond Myklebust	5f707eb429	SUNRPC: Fix potential race in put_rpccred() We have to be careful when we try to unhash the credential in put_rpccred(), because we're not holding the credcache lock, so the call to rpcauth_unhash_cred() may fail if someone else has looked the cred up, and obtained a reference to it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-28 15:21:42 -04:00
Trond Myklebust	eac0d18d44	SUNRPC: Fix rpcauth_prune_expired We need to make sure that we don't remove creds from the cred_unused list if they are still under the moratorium, or else they will never get garbage collected. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-28 15:21:41 -04:00
Trond Myklebust	2a9e1cfa23	SUNRPC: Respond promptly to server TCP resets If the server sends us an RST error while we're in the TCP_ESTABLISHED state, then that will not result in a state change, and so the RPC client ends up hanging forever (see http://bugzilla.kernel.org/show_bug.cgi?id=11154) We can intercept the reset by setting up an sk->sk_error_report callback, which will then allow us to initiate a proper shutdown and retry... We also make sure that if the send request receives an ECONNRESET, then we shutdown too... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-28 15:21:39 -04:00
Linus Torvalds	b225ee5bed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) ipv4: Add a missing rcu_assign_pointer() in routing cache. [netdrvr] ibmtr: PCMCIA IBMTR is ok on 64bit xen-netfront: Avoid unaligned accesses to IP header lmc: copy_*_user under spinlock [netdrvr] myri10ge, ixgbe: remove broken select INTEL_IOATDMA	2008-10-17 08:58:52 -07:00
Johannes Berg	95a5afca4a	net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) Some code here depends on CONFIG_KMOD to not try to load protocol modules or similar, replace by CONFIG_MODULES where more than just request_module depends on CONFIG_KMOD and and also use try_then_request_module in ebtables. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-16 15:24:51 -07:00
Trond Myklebust	6925bac120	Merge branch 'next'	2008-10-15 15:54:56 -04:00
Linus Torvalds	8acd3a60bc	Merge branch 'for-2.6.28' of git://linux-nfs.org/~bfields/linux * 'for-2.6.28' of git://linux-nfs.org/~bfields/linux: (59 commits) svcrdma: Fix IRD/ORD polarity svcrdma: Update svc_rdma_send_error to use DMA LKEY svcrdma: Modify the RPC reply path to use FRMR when available svcrdma: Modify the RPC recv path to use FRMR when available svcrdma: Add support to svc_rdma_send to handle chained WR svcrdma: Modify post recv path to use local dma key svcrdma: Add a service to register a Fast Reg MR with the device svcrdma: Query device for Fast Reg support during connection setup svcrdma: Add FRMR get/put services NLM: Remove unused argument from svc_addsock() function NLM: Remove "proto" argument from lockd_up() NLM: Always start both UDP and TCP listeners lockd: Remove unused fields in the nlm_reboot structure lockd: Add helper to sanity check incoming NOTIFY requests lockd: change nlmclnt_grant() to take a "struct sockaddr *" lockd: Adjust nlmsvc_lookup_host() to accomodate AF_INET6 addresses lockd: Adjust nlmclnt_lookup_host() signature to accomodate non-AF_INET lockd: Support non-AF_INET addresses in nlm_lookup_host() NLM: Convert nlm_lookup_host() to use a single argument svcrdma: Add Fast Reg MR Data Types ...	2008-10-14 12:31:14 -07:00
Alan Cox	113aa838ec	net: Rationalise email address: Network Specific Parts Clean up the various different email addresses of mine listed in the code to a single current and valid address. As Dave says his network merges for 2.6.28 are now done this seems a good point to send them in where they won't risk disrupting real changes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-10-13 19:01:08 -07:00
Tom Talpey	c055551e97	RPC/RDMA: ensure connection attempt is complete before signalling. The RPC/RDMA connection logic could return early from reconnection attempts, leading to additional spurious retries. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:15:06 -04:00
Tom Talpey	08ca0dce1e	RPC/RDMA: correct the reconnect timer backoff The RPC/RDMA code had a constant 5-second reconnect backoff, and always performed it, even when re-establishing a connection to a server after the RPC layer closed it due to being idle. Make it an geometric backoff (up to 30 seconds), and don't delay idle reconnect. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:15:02 -04:00
Tom Talpey	b3cd8d45a7	RPC/RDMA: optionally emit useful transport info upon connect/disconnect. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:59 -04:00
Tom Talpey	5f37d561e0	RPC/RDMA: reformat a debug printk to keep lines together. The send marshaling code split a particular dprintk across two lines, which makes it hard to extract from logfiles. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:42 -04:00
Tom Talpey	5675add36e	RPC/RDMA: harden connection logic against missing/late rdma_cm upcalls. Add defensive timeouts to wait_for_completion() calls in RDMA address resolution, and make them interruptible. Fix the timeout units to milliseconds (formerly jiffies) and move to private header. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:31 -04:00
Tom Talpey	1a954051b0	RPC/RDMA: fix connect/reconnect resource leak. The RPC/RDMA code can leak RDMA connection manager endpoints in certain error cases on connect. Don't signal unwanted events, and be certain to destroy any allocated qp. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:13:02 -04:00
Tom Talpey	926449ba66	RPC/RDMA: return a consistent error, when connect fails. The xprt_connect call path does not expect such errors as ECONNREFUSED to be returned from failed transport connection attempts, otherwise it translates them to EIO and signals fatal errors. For example, mount.nfs prints simply "internal error". Translate all such errors to ENOTCONN from RPC/RDMA to match sockets behavior. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:12:44 -04:00
Tom Talpey	9191ca3b38	RPC/RDMA: adhere to protocol for unpadded client trailing write chunks. The RPC/RDMA protocol allows clients and servers to avoid RDMA operations for data which is purely the result of XDR padding. On the client, automatically insert the necessary padding for such server replies, and optionally don't marshal such chunks. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:12:33 -04:00
Tom Talpey	fee08caf94	RPC/RDMA: avoid an oops due to disconnect racing with async upcalls. RDMA disconnects yield an upcall from the RDMA connection manager, which can race with rpc transport close, e.g. on ^C of a mount. Ensure any rdma cm_id and qp are fully destroyed before continuing. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:11:40 -04:00
Tom Talpey	ad0e9e01da	RPC/RDMA: maintain the RPC task bytes-sent statistic. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:10:40 -04:00
Tom Talpey	575448bd36	RPC/RDMA: suppress retransmit on RPC/RDMA clients. An RPC/RDMA client cannot retransmit on an unbroken connection, doing so violates its flow control with the server. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:10:36 -04:00
Tom Tucker	b334eaabf4	RPC/RDMA: fix connection IRD/ORD setting This logic sets the connection parameter that configures the local device and informs the remote peer how many concurrent incoming RDMA_READ requests are supported. The original logic didn't really do what was intended for two reasons: - The max number supported by the device is typically smaller than any one factor in the calculation used, and - The field in the connection parameter structure where the value is stored is a u8 and always overflows for the default settings. So what really happens is the value requested for responder resources is the left over 8 bits from the "desired value". If the desired value happened to be a multiple of 256, the result was zero and it wouldn't connect at all. Given the above and the fact that max_requests is almost always larger than the max responder resources supported by the adapter, this patch simplifies this logic and simply requests the max supported by the device, subject to a reasonable limit. This bug was found by Jim Schutt at Sandia. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:56 -04:00
Tom Talpey	3197d309f5	RPC/RDMA: support FRMR client memory registration. Configure, detect and use "fastreg" support from IB/iWARP verbs layer to perform RPC/RDMA memory registration. Make FRMR the default memreg mode (will fall back if not supported by the selected RDMA adapter). This allows full and optimal operation over the cxgb3 adapter, and others. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:34 -04:00
Tom Talpey	bd7ed1d133	RPC/RDMA: check selected memory registration mode at runtime. At transport creation, check for, and use, any local dma lkey. Then, check that the selected memory registration mode is in fact supported by the RDMA adapter selected for the mount. Fall back to best alternative if not. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:26 -04:00
Tom Talpey	fe9053b30b	RPC/RDMA: add data types and new FRMR memory registration enum. Internal RPC/RDMA structure updates in preparation for FRMR support. Signed-off-by: Tom Talpey <talpey@netapp.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:24 -04:00
Tom Talpey	8d4ba0347c	RPC/RDMA: refactor the inline memory registration code. Refactor the memory registration and deregistration routines. This saves stack space, makes the code more readable and prepares to add the new FRMR registration methods. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 15:09:16 -04:00
J. Bruce Fields	107e0008df	Merge branch 'from-tomtucker' into for-2.6.28	2008-10-08 18:22:18 -04:00
Cedric Le Goater	63ffc23d30	sunrpc: fix oops in rpc_create when the mount namespace is unshared On a system with nfs mounts, if a task unshares its mount namespace, a oops can occur when the system is rebooted if the task is the last to unreference the nfs mount. It will try to create a rpc request using utsname() which has been invalidated by free_nsproxy(). The patch fixes the issue by using the global init_utsname() which is always valid. the capability of identifying rpc clients per uts namespace stills needs some extra work so this should not be a problem. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c024c9ab>] rpc_create+0x332/0x42f Oops: 0000 [#1] DEBUG_PAGEALLOC Pid: 1857, comm: uts-oops Not tainted (2.6.27-rc5-00319-g7686ad5 #4) EIP: 0060:[<c024c9ab>] EFLAGS: 00210287 CPU: 0 EIP is at rpc_create+0x332/0x42f EAX: 00000000 EBX: df26adf0 ECX: c0251887 EDX: 00000001 ESI: df26ae58 EDI: c02f293c EBP: dda0fc9c ESP: dda0fc2c DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process uts-oops (pid: 1857, ti=dda0e000 task=dd9a0778 task.ti=dda0e000) Stack: c0104532 dda0fffc dda0fcac dda0e000 dda0e000 dd93b7f0 00000009 c02f2880 df26aefc dda0fc68 c01096b7 00000000 c0266ee0 c039a070 c039a070 dda0fc74 c012ca67 c039a064 dda0fc8c c012cb20 c03daf74 00000011 00000000 c0275c90 Call Trace: [<c0104532>] ? dump_trace+0xc2/0xe2 [<c01096b7>] ? save_stack_trace+0x1c/0x3a [<c012ca67>] ? save_trace+0x37/0x8c [<c012cb20>] ? add_lock_to_list+0x64/0x96 [<c0256fc4>] ? rpcb_register_call+0x62/0xbb [<c02570c8>] ? rpcb_register+0xab/0xb3 [<c0252f4d>] ? svc_register+0xb4/0x128 [<c0253114>] ? svc_destroy+0xec/0x103 [<c02531b2>] ? svc_exit_thread+0x87/0x8d [<c01a75cd>] ? lockd_down+0x61/0x81 [<c01a577b>] ? nlmclnt_done+0xd/0xf [<c01941fe>] ? nfs_destroy_server+0x14/0x16 [<c0194328>] ? nfs_free_server+0x4c/0xaa [<c019a066>] ? nfs_kill_super+0x23/0x27 [<c0158585>] ? deactivate_super+0x3f/0x51 [<c01695d1>] ? mntput_no_expire+0x95/0xb4 [<c016965b>] ? release_mounts+0x6b/0x7a [<c01696cc>] ? __put_mnt_ns+0x62/0x70 [<c0127501>] ? free_nsproxy+0x25/0x80 [<c012759a>] ? switch_task_namespaces+0x3e/0x43 [<c01275a9>] ? exit_task_namespaces+0xa/0xc [<c0117fed>] ? do_exit+0x4fd/0x666 [<c01181b3>] ? do_group_exit+0x5d/0x83 [<c011fa8c>] ? get_signal_to_deliver+0x2c8/0x2e0 [<c0102630>] ? do_notify_resume+0x69/0x700 [<c011d85a>] ? do_sigaction+0x134/0x145 [<c0127205>] ? hrtimer_nanosleep+0x8f/0xce [<c0126d1a>] ? hrtimer_wakeup+0x0/0x1c [<c0103488>] ? work_notifysig+0x13/0x1b ======================= Code: 70 20 68 cb c1 2c c0 e8 75 4e 01 00 8b 83 ac 00 00 00 59 3d 00 f0 ff ff 5f 77 63 eb 57 a1 00 80 2d c0 8b 80 a8 02 00 00 8d 73 68 <8b> 40 04 83 c0 45 e8 41 46 f7 ff ba 20 00 00 00 83 f8 21 0f 4c EIP: [<c024c9ab>] rpc_create+0x332/0x42f SS:ESP 0068:dda0fc2c Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:19:10 -04:00
Trond Myklebust	96165e2b7c	SUNRPC: Fix a memory leak in rpcb_getport_async Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:18:57 -04:00
Trond Myklebust	9a4bd29fe8	SUNRPC: Fix autobind on cloned rpc clients Despite the fact that cloned rpc clients won't have the cl_autobind flag set, they may still find themselves calling rpcb_getport_async(). For this to happen, it suffices for a _parent_ rpc_clnt to use autobinding, in which case any clone may find itself triggering the !xprt_bound() case in call_bind(). The correct fix for this is to walk back up the tree of cloned rpc clients, in order to find the parent that 'owns' the transport, either because it has clnt->cl_autobind set, or because it originally created the transport... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:18:53 -04:00
Denis V. Lunev	c9f6cde6e2	sunrpc: do not pin sunrpc module in the memory Basically, try_module_get here are pretty useless. Any other module using this API will pin sunrpc in memory due using exported symbols. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:14:54 -04:00
Tom Tucker	67080c8236	svcrdma: Fix IRD/ORD polarity The inititator/responder resources in the event have been swapped. They no represent what the local peer would set their values to in order to match the peer. Note that iWARP does not exchange these on the wire and the provider is simply putting in the local device max. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:13 -05:00
Tom Tucker	04911b539c	svcrdma: Update svc_rdma_send_error to use DMA LKEY Update the svc_rdma_send_error code to use the DMA LKEY which is valid regardless of the memory registration strategy in use. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:08 -05:00
Tom Tucker	afd566ea08	svcrdma: Modify the RPC reply path to use FRMR when available Use FRMR to map local RPC reply data. This allows RDMA_WRITE to send reply data using a single WR. The FRMR is invalidated by linking the LOCAL_INV WR to the RDMA_SEND message used to complete the reply. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:05 -05:00
Tom Tucker	146b6df6a5	svcrdma: Modify the RPC recv path to use FRMR when available RPCRDMA requests that specify a read-list are fetched with RDMA_READ. Using an FRMR to map the data sink improves NFSRDMA security on transports that place the RDMA_READ data sink LKEY on the wire because the valid lifetime of the MR is only the duration of the RDMA_READ. The LKEY is invalidated when the last RDMA_READ WR completes. Mapping the data sink also allows for very large amounts to data to be fetched with a single WR, so if the client is also using FRMR, the entire RPC read-list can be fetched with a single WR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:46:01 -05:00
Tom Tucker	5b180a9a64	svcrdma: Add support to svc_rdma_send to handle chained WR WR can be submitted as linked lists of WR. Update the svc_rdma_send routine to handle WR chains. This will be used to submit a WR that uses an FRMR with another WR that invalidates the FRMR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:56 -05:00
Tom Tucker	a5abf4e815	svcrdma: Modify post recv path to use local dma key Update the svc_rdma_post_recv routine to use the adapter's global LKEY instead of sc_phys_mr which is only valid when using a DMA MR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:52 -05:00
Tom Tucker	e118321062	svcrdma: Add a service to register a Fast Reg MR with the device Fast Reg MR introduces a new WR type. Add a service to register the region with the adapter and update the completion handling to support completions with a NULL WR context. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:49 -05:00
Tom Tucker	3a5c63803d	svcrdma: Query device for Fast Reg support during connection setup Query the device capabilities in the svc_rdma_accept function to determine what advanced memory management capabilities are supported by the device. Based on the query, select the most secure model available given the requirements of the transport and capabilities of the adapter. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:45 -05:00
Tom Tucker	64be8608c1	svcrdma: Add FRMR get/put services Add services for the allocating, freeing, and unmapping Fast Reg MR. These services will be used by the transport connection setup, send and receive routines. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-10-06 14:45:18 -05:00
Chuck Lever	2937391385	NLM: Remove unused argument from svc_addsock() function Clean up: The svc_addsock() function no longer uses its "proto" argument, so remove it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-10-04 17:12:27 -04:00
Benny Halevy	d5b337b487	nfsd: use nfs client rpc callback program since commit `ff7d9756b5` "nfsd: use static memory for callback program and stats" do_probe_callback uses a static callback program (NFS4_CALLBACK) rather than the one set in clp->cl_callback.cb_prog as passed in by the client in setclientid (4.0) or create_session (4.1). This patches introduces rpc_create_args.prognumber that allows overriding program->number when creating rpc_clnt. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	db820d6376	SUNRPC: Clean up debug messages in rpcb_clnt.c The RPCB XDR functions are used for multiple procedures. For instance, rpcb_encode_getaddr() is used for RPCB_GETADDR, RPCB_SET, and RPCB_UNSET. Make the XDR debug messages more generic so they are less confusing. And, unlike in other RPC consumers in the kernel, a single debug flag enables all levels of debug messages in the RPC bind client, including XDR debug messages. Since the XDR decoders already report success or failure in this case, remove redundant debug messages in the mid-level rpcb_register_call() function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	f6fb3f6f59	SUNRPC: Fix up svc_unregister() With the new rpcbind code, a PMAP_UNSET will not have any effect on services registered via rpcbind v3 or v4. Implement a version of svc_unregister() that uses an RPCB_UNSET with an empty netid string to make sure we have cleared all entries for a kernel RPC service when shutting down, or before starting a fresh instance of the service. Use the new version only when CONFIG_SUNRPC_REGISTER_V4 is enabled; otherwise, the legacy PMAP version is used to ensure complete backwards-compatibility with the Linux portmapper daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	9d548b9c95	SUNRPC: Use short-hand IPv6 ANYADDR for RPCB_SET Clean up: When doing an RPCB_SET, make the kernel's rpcb client use the shorthand "::" for the universal form of the IPv6 ANY address. Without this patch, rpcbind will advertise: 0000:0000:0000:0000:0000:0000:0000:0000.x.y This is cosmetic only. It cleans up the display of information from /sbin/rpcinfo. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:40 -04:00
Chuck Lever	2c7eb0b206	SUNRPC: Register both netids for AF_INET6 servers TI-RPC is a user-space library of RPC functions that replaces ONC RPC and allows RPC to operate in the new world of IPv6. TI-RPC combines the concept of a transport protocol (UDP and TCP) and a protocol family (PF_INET and PF_INET6) into a single identifier called a "netid." For example, "udp" means UDP over IPv4, and "udp6" means UDP over IPv6. For rpcbind, then, the RPC service tuple that is registered and advertised is: [RPC program, RPC version, service address and port, netid] instead of [RPC program, RPC version, port, protocol] Service address is typically ANYADDR, but can be a specific address of one of the interfaces on a multi-homed host. The third item in the new tuple is expressed as a universal address. The current Linux rpcbind implementation registers a netid for both protocol families when RPCB_SET is done for just the PF_INET6 version of the netid (ie udp6 or tcp6). So registering "udp6" causes a registration for "udp" to appear automatically as well. We've recently determined that this is incorrect behavior. In the TI-RPC world, "udp6" is not meant to imply that the registered RPC service handles requests from AF_INET as well, even if the listener socket does address mapping. "udp" and "udp6" are entirely separate capabilities, and must be registered separately. The Linux kernel, unlike TI-RPC, leverages address mapping to allow a single listener socket to handle requests for both AF_INET and AF_INET6. This is still OK, but the kernel currently assumes registering "udp6" will cover "udp" as well. It registers only "udp6" for it's AF_INET6 services, even though they handle both AF_INET and AF_INET6 on the same port. So svc_register() actually needs to register both "udp" and "udp6" explicitly (and likewise for TCP). Until rpcbind is fixed, the kernel can ignore the return code for the second RPCB_SET call. Please merge this with commit 15231312: SUNRPC: Support IPv6 when registering kernel RPC services Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Olaf Kirch <okir@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:39 -04:00
Chuck Lever	a26cfad6e0	SUNRPC: Support IPv6 when registering kernel RPC services In order to advertise NFS-related services on IPv6 interfaces via rpcbind, the kernel RPC server implementation must use rpcb_v4_register() instead of rpcb_register(). A new kernel build option allows distributions to use the legacy v2 call until they integrate an appropriate user-space rpcbind daemon that can support IPv6 RPC services. I tried adding some automatic logic to fall back if registering with a v4 protocol request failed, but there are too many corner cases. So I just made it a compile-time switch that distributions can throw when they've replaced portmapper with rpcbind. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:38 -04:00
Chuck Lever	7252d575ab	SUNRPC: Split portmap unregister API into separate function Create a separate server-level interface for unregistering RPC services. The mechanics of, and the API for, registering and unregistering RPC services will diverge further as support for IPv6 is added. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:38 -04:00
Chuck Lever	14aeb2118d	SUNRPC: Simplify rpcb_register() API Bruce suggested there's no need to expose the difference between an error sending the PMAP_SET request and an error reply from the portmapper to rpcb_register's callers. The user space equivalent of rpcb_register() is pmap_set(3), which returns a bool_t : either the PMAP set worked, or it didn't. Simple. So let's remove the "*okay" argument from rpcb_register() and rpcb_v4_register(), and simply return an error if any part of the call didn't work. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:37 -04:00
Chuck Lever	b6632339e3	SUNRPC: Set V6ONLY socket option for RPC listener sockets My plan is to use an AF_INET listener on systems that support only IPv4, and an AF_INET6 listener on systems that can support IPv6. Incoming IPv4 packets will be posted to an AF_INET6 listener with a mapped IPv4 address. Max Matveev <makc@sgi.com> says: Creating a single listener can be dangerous - if net.ipv6.bindv6only is enabled then it's possible to create another listener in v4 namespace on the same port and steal the traffic from the "unifed" listener. You need to disable V6ONLY explicitly via a sockopt to stop that. Set appropriate socket option on RPC server listener sockets to prevent this. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 18:13:37 -04:00
Chuck Lever	5dd248f6f1	SUNRPC: Use proper INADDR_ANY when setting up RPC services on IPv6 Teach svc_create_xprt() to use the correct ANY address for AF_INET6 based RPC services. No caller uses AF_INET6 yet. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 17:56:56 -04:00
Chuck Lever	e851db5b05	SUNRPC: Add address family field to svc_serv data structure Introduce and initialize an address family field in the svc_serv structure. This field will determine what family to use for the service's listener sockets and what families are advertised via the local rpcbind daemon. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-29 17:56:56 -04:00
Arnaldo Carvalho de Melo	6067804047	net: Use hton[sl]() instead of __constant_hton[sl]() where applicable Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-09-20 22:20:49 -07:00
Cyrill Gorcunov	27df6f25ff	sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports Vegard Nossum reported ---------------------- > I noticed that something weird is going on with /proc/sys/sunrpc/transports. > This file is generated in net/sunrpc/sysctl.c, function proc_do_xprt(). When > I "cat" this file, I get the expected output: > $ cat /proc/sys/sunrpc/transports > tcp 1048576 > udp 32768 > But I think that it does not check the length of the buffer supplied by > userspace to read(). With my original program, I found that the stack was > being overwritten by the characters above, even when the length given to > read() was just 1. David Wagner added (among other things) that copy_to_user could be probably used here. Ingo Oeser suggested to use simple_read_from_buffer() here. The conclusion is that proc_do_xprt doesn't check for userside buffer size indeed so fix this by using Ingo's suggestion. Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> CC: Ingo Oeser <ioe-lkml@rameria.de> Cc: Neil Brown <neilb@suse.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Greg Banks <gnb@sgi.com> Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-01 14:24:24 -04:00
Tom Tucker	24b8b44780	svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet RDMA_READ completions are kept on a separate queue from the general I/O request queue. Since a separate lock is used to protect the RDMA_READ completion queue, a race exists between the dto_tasklet and the svc_rdma_recvfrom thread where the dto_tasklet sets the XPT_DATA bit and adds I/O to the read-completion queue. Concurrently, the recvfrom thread checks the generic queue, finds it empty and resets the XPT_DATA bit. A subsequent svc_xprt_enqueue will fail to enqueue the transport for I/O and cause the transport to "stall". The fix is to protect both lists with the same lock and set the XPT_DATA bit with this lock held. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-08-13 16:57:31 -04:00
Ingo Molnar	414f746d23	Merge branch 'linus' into cpus4096	2008-07-28 21:14:43 +02:00
Alexey Dobriyan	51cc50685a	SL*B: drop kmem cache argument from constructor Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:07 -07:00
FUJITA Tomonori	8d8bb39b9e	dma-mapping: add the device argument to dma_mapping_error() Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER architecture does: This enables us to cleanly fix the Calgary IOMMU issue that some devices are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423). I think that per-device dma_mapping_ops support would be also helpful for KVM people to support PCI passthrough but Andi thinks that this makes it difficult to support the PCI passthrough (see the above thread). So I CC'ed this to KVM camp. Comments are appreciated. A pointer to dma_mapping_ops to struct dev_archdata is added. If the pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's NULL, the system-wide dma_ops pointer is used as before. If it's useful for KVM people, I plan to implement a mechanism to register a hook called when a new pci (or dma capable) device is created (it works with hot plugging). It enables IOMMUs to set up an appropriate dma_mapping_ops per device. The major obstacle is that dma_mapping_error doesn't take a pointer to the device unlike other DMA operations. So x86 can't have dma_mapping_ops per device. Note all the POWER IOMMUs use the same dma_mapping_error function so this is not a problem for POWER but x86 IOMMUs use different dma_mapping_error functions. The first patch adds the device argument to dma_mapping_error. The patch is trivial but large since it touches lots of drivers and dma-mapping.h in all the architecture. This patch: dma_mapping_error() doesn't take a pointer to the device unlike other DMA operations. So we can't have dma_mapping_ops per device. Note that POWER already has dma_mapping_ops per device but all the POWER IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device argument. [akpm@linux-foundation.org: fix sge] [akpm@linux-foundation.org: fix svc_rdma] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix bnx2x] [akpm@linux-foundation.org: fix s2io] [akpm@linux-foundation.org: fix pasemi_mac] [akpm@linux-foundation.org: fix sdhci] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: fix sparc] [akpm@linux-foundation.org: fix ibmvscsi] Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Muli Ben-Yehuda <muli@il.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-26 12:00:03 -07:00
Mike Travis	0bc3cc03fa	cpumask: change cpumask_of_cpu_ptr to use new cpumask_of_cpu * Replace previous instances of the cpumask_of_cpu_ptr* macros with a the new (lvalue capable) generic cpumask_of_cpu(). Signed-off-by: Mike Travis <travis@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jack Steiner <steiner@sgi.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-26 16:40:33 +02:00
Ingo Molnar	eb6a12c242	Merge branch 'linus' into cpus4096-for-linus Conflicts: net/sunrpc/svc.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-21 17:19:50 +02:00
Linus Torvalds	14b395e35d	Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits) nfsd: nfs4xdr.c do-while is not a compound statement nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c lockd: Pass "struct sockaddr *" to new failover-by-IP function lockd: get host reference in nlmsvc_create_block() instead of callers lockd: minor svclock.c style fixes lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock lockd: nlm_release_host() checks for NULL, caller needn't file lock: reorder struct file_lock to save space on 64 bit builds nfsd: take file and mnt write in nfs4_upgrade_open nfsd: document open share bit tracking nfsd: tabulate nfs4 xdr encoding functions nfsd: dprint operation names svcrdma: Change WR context get/put to use the kmem cache svcrdma: Create a kmem cache for the WR contexts svcrdma: Add flush_scheduled_work to module exit function svcrdma: Limit ORD based on client's advertised IRD svcrdma: Remove unused wait q from svcrdma_xprt structure svcrdma: Remove unneeded spin locks from __svc_rdma_free svcrdma: Add dma map count and WARN_ON ...	2008-07-20 21:21:46 -07:00
Mike Travis	65c0118453	cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr * This patch replaces the dangerous lvalue version of cpumask_of_cpu with new cpumask_of_cpu_ptr macros. These are patterned after the node_to_cpumask_ptr macros. In general terms, if there is a cpumask_of_cpu_map[] then a pointer to the cpumask_of_cpu_map[cpu] entry is used. The cpumask_of_cpu_map is provided when there is a large NR_CPUS count, reducing greatly the amount of code generated and stack space used for cpumask_of_cpu(). The pointer to the cpumask_t value is needed for calling set_cpus_allowed_ptr() to reduce the amount of stack space needed to pass the cpumask_t value. If there isn't a cpumask_of_cpu_map[], then a temporary variable is declared and filled in with value from cpumask_of_cpu(cpu) as well as a pointer variable pointing to this temporary variable. Afterwards, the pointer is used to reference the cpumask value. The compiler will optimize out the extra dereference through the pointer as well as the stack space used for the pointer, resulting in identical code. A good example of the orthogonal usages is in net/sunrpc/svc.c: case SVC_POOL_PERCPU: { unsigned int cpu = m->pool_to[pidx]; cpumask_of_cpu_ptr(cpumask, cpu); oldmask = current->cpus_allowed; set_cpus_allowed_ptr(current, cpumask); return 1; } case SVC_POOL_PERNODE: { unsigned int node = m->pool_to[pidx]; node_to_cpumask_ptr(nodecpumask, node); oldmask = current->cpus_allowed; set_cpus_allowed_ptr(current, nodecpumask); return 1; } Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:02:57 +02:00
Ingo Molnar	bb2c018b09	Merge branch 'linus' into cpus4096 Conflicts: drivers/acpi/processor_throttling.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-18 22:00:54 +02:00
David S. Miller	49997d7515	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/powerpc/booting-without-of.txt drivers/atm/Makefile drivers/net/fs_enet/fs_enet-main.c drivers/pci/pci-acpi.c net/8021q/vlan.c net/iucv/iucv.c	2008-07-18 02:39:39 -07:00
Ingo Molnar	82638844d9	Merge branch 'linus' into cpus4096 Conflicts: arch/x86/xen/smp.c kernel/sched_rt.c net/iucv/iucv.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-07-16 00:29:07 +02:00
Trond Myklebust	a86dc496b7	SUNRPC: Remove the BKL from the callback functions Push it into those callback functions that actually need it. Note that all the NFS operations use their own locking, so don't need the BKL. Ditto for the rpcbind client. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:10:57 -04:00
Chuck Lever	c2e1b09ff2	SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon Introduce a new API to register RPC services on IPv6 interfaces to allow the NFS server and lockd to advertise on IPv6 networks. Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind protocol version 4 to contact the local rpcbind daemon. The version 4 SET/UNSET procedures allow services to register address families besides AF_INET, register at specific network interfaces, and register transport protocols besides UDP and TCP. All of this functionality is exposed via the new rpcb_v4_register() kernel API. A user-space rpcbind daemon implementation that supports version 4 of the rpcbind protocol is required in order to make use of this new API. Note that rpcbind version 3 is sufficient to support the new rpcbind facilities listed above, but most extant implementations use version 4. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:08:55 -04:00
Chuck Lever	babe80eb49	SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier rpcbind version 4 registration will reuse part of rpcb_register, so just split it out into a separate function now. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:08:51 -04:00
Chuck Lever	423d8b0647	SUNRPC: None of rpcb_create's callers wants a privileged source port Clean up: Callers that required a privileged source port now use rpcb_create_local(), so we can remove the @privileged argument from rpcb_create(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:08:47 -04:00
Chuck Lever	cc5598b78f	SUNRPC: Introduce a specific rpcb_create for contacting localhost Add rpcb_create_local() for use by rpcb_register() and upcoming IPv6 registration functions. Ensure any errors encountered by rpcb_create_local() are properly reported. We can also use a statically allocated constant loopback socket address instead of one allocated on the stack and initialized every time the function is called. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:08:44 -04:00
Chuck Lever	166b88d755	SUNRPC: Use correct XDR encoding procedure for rpcbind SET/UNSET The rpcbind versions 3 and 4 SET and UNSET procedures use the same arguments as the GETADDR procedure. While definitely a bug, this hasn't been a problem so far since the kernel hasn't used version 3 or 4 SET and UNSET. But this will change in just a moment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-15 18:08:40 -04:00
Trond Myklebust	381ba74af5	SUNRPC: Ensure our task is notified when an rpcbind call is done If another task is busy in rpcb_getport_async number, it is more efficient to have it wake us up when it has finished instead of arbitrarily sleeping for 5 seconds. Also ensure that rpcb_wake_rpcbind_waiters() is called regardless of whether or not rpcb_getport_done() gets called. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:45 -04:00
Chuck Lever	40fef8a649	SUNRPC: Use only rpcbind v2 for AF_INET requests Some server vendors support the higher versions of rpcbind only for AF_INET6. The kernel doesn't need to use v3 or v4 for AF_INET anyway, so change the kernel's rpcbind client to query AF_INET servers over rpcbind v2 only. This has a few interesting benefits: 1. If the rpcbind request is going over TCP, and the server doesn't support rpcbind versions 3 or 4, the client reduces by two the number of ephemeral ports left in TIME_WAIT for each rpcbind request. This will help during NFS mount storms. 2. The rpcbind interaction with servers that don't support rpcbind versions 3 or 4 will use less network traffic. Also helpful during mount storms. 3. We can eliminate the kernel build option that controls whether the kernel's rpcbind client uses rpcbind version 3 and 4 for AF_INET servers. Less complicated kernel configuration... Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:37 -04:00
Chuck Lever	8842413aa4	SUNRPC: Use GETADDR for rpcbind version 4 queries Some rpcbind servers that do support rpcbind version 4 do not support the GETVERSADDR procedure. Use GETADDR for querying rpcbind servers via rpcbind version 4 instead of GETVERSADDR. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:36 -04:00
Chuck Lever	6a77405157	SUNRPC: Use rpcbind version 2 GETPORT Clean up: Change the version 2 procedure name to GETPORT. It's the same procedure number as GETADDR, but version 2 implementations usually refer to it as GETPORT. This also now matches the procedure name used in the version 2 procedure entry in the rpcb_next_version[] array, making it slightly less confusing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:35 -04:00
Chuck Lever	fc200e794d	SUNRPC: Document some naked integers in rpcbind client Clean up: Replace naked integers that represent rpcbind protocol versions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:34 -04:00
Chuck Lever	877fcf1039	SUNRPC: More useful debugging output for rpcb client Clean up dprintk's in rpcb client's XDR decoder functions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:33 -04:00
Chuck Lever	b22602a673	SUNRPC: Ensure all transports set rq_xtime consistently The RPC client uses the rq_xtime field in each RPC request to determine the round-trip time of the request. Currently, the rq_xtime field is initialized by each transport just before it starts enqueing a request to be sent. However, transports do not handle initializing this value consistently; sometimes they don't initialize it at all. To make the measurement of request round-trip time consistent for all RPC client transport capabilities, pull rq_xtime initialization into the RPC client's generic transport logic. Now all transports will get a standardized RTT measure automatically, from: xprt_transmit() to xprt_complete_rqst() This makes round-trip time calculation more accurate for the TCP transport. The socket ->sendmsg() method can return "-EAGAIN" if the socket's output buffer is full, so the TCP transport's ->send_request() method may call the ->sendmsg() method repeatedly until it gets all of the request's bytes queued in the socket's buffer. Currently, the TCP transport sets the rq_xtime field every time through that loop so the final value is the timestamp just before the last call to the underlying socket's ->sendmsg() method. After this patch, the rq_xtime field contains a timestamp that reflects the time just before the first call to ->sendmsg(). This is consequential under heavy workloads because large requests often take multiple ->sendmsg() calls to get all the bytes of a request queued. The TCP transport causes the request to sleep until the remote end of the socket has received enough bytes to clear space in the socket's local output buffer. This delay can be quite significant. The method introduced by this patch is a more accurate measure of RTT for stream transports, since the server can cause enough back pressure to delay (ie increase the latency of) requests from the client. Additionally, this patch corrects the behavior of the RDMA transport, which entirely neglected to initialize the rq_xtime field. RPC performance metrics for RDMA transports now display correct RPC request round trip times. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Tom Talpey <thomas.talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:15 -04:00
$\\\"J. Bruce Fields\\\$ \\\"J. Bruce Fields\\\	a486aeda9b	rpc: minor cleanup of scheduler callback code Try to make the comment here a little more clear and concise. Also, this macro definition seems unnecessary. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:14 -04:00
$\\\"J. Bruce Fields\\\$ \\\"J. Bruce Fields\\\	d25a03cf96	rpc: remove some unused macros There used to be a print_hexl() function that used isprint(), now gone. I don't know why NFS_NGROUPS and CA_RUN_AS_MACHINE were here. I also don't know why another #define that's actually used was marked "unused". Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:12 -04:00
$\\\"J. Bruce Fields\\\$ \\\"J. Bruce Fields\\\	720b8f2d6f	rpc: eliminate unused variable in auth_gss upcall code Also, a minor comment grammar fix in the same file. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:11 -04:00
Olga Kornievskaia	b6b6152c46	rpc: bring back cl_chatty The cl_chatty flag alows us to control whether a given rpc client leaves "server X not responding, timed out" messages in the syslog. Such messages make sense for ordinary nfs clients (where an unresponsive server means applications on the mountpoint are probably hanging), but not for the callback client (which can fail more commonly, with the only result just of disabling some optimizations). Previously cl_chatty was removed, do to lack of users; reinstate it, and use it for the nfsd's callback client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:10 -04:00
Chuck Lever	cd983ef81b	SUNRPC: Remove obsolete messages during transport connect Recent changes to the RPC client's transport connect logic make connect status values ECONNREFUSED and ECONNRESET impossible. Clean up xprt_connect_status() to account for these changes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:09:06 -04:00
Chuck Lever	cb3997b5a0	SUNRPC: Display some debugging information as text rather than numbers In rpc_show_tasks(), display the program name, version number, procedure name and tk_action as human-readable variable-length text fields rather than columnar numbers. Doing the symbol lookup here helps in cases where we have actual debugging output from a kernel log, but don't have access to the kernel image or RPC module that generated the output. Sample output: -pid- flgs status -client- --rqstp- -timeout ---ops-- 5608 0001 -11 eeb42690 f6d93710 0 f8fa1764 nfsv3 WRITE a:call_transmit_status q:none 5609 0001 -11 eeb42690 f6d937e0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5610 0001 -11 eeb42690 f6d93230 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5611 0001 -11 eeb42690 f6d93300 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5612 0001 -11 eeb42690 f6d93090 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5613 0001 -11 eeb42690 f6d933d0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5614 0001 -11 eeb42690 f6d93cc0 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5615 0001 -11 eeb42690 f6d93a50 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5616 0001 -11 eeb42690 f6d93640 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5617 0001 -11 eeb42690 f6d93b20 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending 5618 0001 -11 eeb42690 f6d93160 0 f8fa1764 nfsv3 WRITE a:call_status q:xprt_sending Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:58 -04:00
Chuck Lever	38e886e0c1	SUNRPC: Refactor rpc_show_tasks Clean up: move the logic that displays each task to its own function. This removes indentation and makes future changes easier. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:56 -04:00
Chuck Lever	68a23ee94e	SUNRPC: Don't display the rpc_show_tasks header if there are no tasks Clean up: don't display the rpc_show_tasks column header unless there is at least one task to display. As far as I can tell, it is safe to let the list_for_each_entry macro decide that each list is empty. scripts/checkpatch.pl also wants a KERN_FOO at the start of any newly added printk() calls, so this and subsequent patches will also add KERN_INFO. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:55 -04:00
Chuck Lever	b0e1c57ea0	SUNRPC: Rename "call_" functions that are no longer FSM states The RPC client uses a finite state machine to move RPC tasks through each step of an RPC request. Each state is contained in a function in net/sunrpc/clnt.c, and named call_foo. Some of the functions named call_foo have changed over the past few years and are no longer states in the FSM. These include: call_encode, call_header, and call_verify. As a clean up, rename the functions that have changed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:53 -04:00
Chuck Lever	3748f1e447	SUNRPC: Add a function to display the name of an RPC procedure Improve debugging messages in call_start() and call_verify() by having them show the RPC procedure name instead of the procedure number. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:53 -04:00
Trond Myklebust	0f38b873ae	SUNRPC: Use GFP_NOFS when allocating credentials Since the credentials may be allocated during the call to rpc_new_task(), which again may be called by a memory allocator... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:48 -04:00
Trond Myklebust	b390c2b55c	SUNRPC: An ENOMEM error from call_encode is always fatal The special 'ENOMEM' case that was previously flagged as non-fatal is bogus: auth_gss always returns EAGAIN for non-fatal errors, and may in fact return ENOMEM in the special case where xdr_buf_read_netobj runs out of preallocated buffer space (invariably a _fatal_ error, since there is no provision for preallocating larger buffers). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:43 -04:00
Trond Myklebust	8b39f2b410	SUNRPC: Ensure we exit early in case of an encode error All errors from call_encode(), with exception of EAGAIN are fatal, so we should immediately return instead of proceeding to xprt_transmit(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-09 12:08:41 -04:00
Trond Myklebust	803a9067e1	SUNRPC: Fix an rpcbind breakage for the case of IPv6 lookups Now that rpcb_next_version has been split into an IPv4 version and an IPv6 version, we Oops when rpcb_call_async attempts to look up the IPv6-specific RPC procedure in rpcb_next_version. Fix the Oops simply by having rpcb_getport_async pass the correct RPC procedure as an argument. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-08 15:23:10 -04:00
Trond Myklebust	0d3a34b48c	SUNRPC: Fix a double-free in rpcbind It is wrong to be freeing up the rpcbind arguments if the call to rpcb_call_async() fails, since they should already have been freed up by rpcb_map_release(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-07-08 15:23:00 -04:00
Ingo Molnar	68083e05d7	Merge commit 'v2.6.26-rc9' into cpus4096	2008-07-06 14:23:39 +02:00
J. Bruce Fields	e86322f611	Merge branch 'for-bfields' of git://linux-nfs.org/~tomtucker/xprt-switch-2.6 into for-2.6.27	2008-07-03 16:24:06 -04:00
J. Bruce Fields	b620754bfe	svcrpc: fix handling of garbage args To return garbage_args, the accept_stat must be 0, and we must have a verifier. So we shouldn't be resetting the write pointer as we reject the call. Also, we must add the two placeholder words here regardless of success of the unwrap, to ensure the output buffer is left in a consistent state for svcauth_gss_release(). This fixes a BUG() in svcauth_gss.c:svcauth_gss_release(). Thanks to Aime Le Rouzic for bug report, debugging help, and testing. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Tested-by: Aime Le Rouzic <aime.le-rouzic@bull.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-03 12:46:56 -07:00
Tom Tucker	8948896c9e	svcrdma: Change WR context get/put to use the kmem cache Change the WR context pool to be shared across mount points. This reduces the RDMA transport memory footprint significantly since idle mounts don't consume WR context memory. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:02:02 -05:00
Tom Tucker	bf5927d84e	svcrdma: Create a kmem cache for the WR contexts Create a kmem cache to hold WR contexts. Next we will convert the WR context get and put services to use this kmem cache. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:02:01 -05:00
Tom Tucker	902a94e088	svcrdma: Add flush_scheduled_work to module exit function Make certain all transports pending free are flushed from the wq before unloading the module. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:02:00 -05:00
Tom Tucker	36ef25e464	svcrdma: Limit ORD based on client's advertised IRD When adapters have differing IRD limits, the RDMA transport will fail to connect properly. The RDMA transport should use the client's advertised inbound read limit when computing its outbound read limit. For iWARP transports, there is currently no standard for exchanging IRD/ORD during connection establishment so the 'responder_resources' field in the connect event is the local device's limit. The RDMA transport can be configured to use a smaller ORD by writing the desired number to the /proc/sys/sunrpc/svc_rdma/max_outbound_read_requests file. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:59 -05:00
Tom Tucker	94dba4918d	svcrdma: Remove unneeded spin locks from __svc_rdma_free At the time __svc_rdma_free is called, we are guaranteed that all references to this transport are gone. There is, therefore, no need to protect the resource lists with a spin lock. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:57 -05:00
Tom Tucker	87295b6c5c	svcrdma: Add dma map count and WARN_ON Add a dma map count in order to verify that all DMA mapping resources have been freed when the transport is closed. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:56 -05:00
Tom Tucker	e6ab914371	svcrdma: Move the DMA unmap logic to the CQ handler Separate DMA unmap from context destruction and perform DMA unmapping in the SQ/RQ CQ reap functions. This is necessary to support software based RDMA implementations that actually copy the data in their ib_dma_unmap callback functions and architectures that don't have cache coherent I/O busses. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:55 -05:00
Tom Tucker	f820c57ebf	svcrdma: Use reply and chunk map for RDMA_READ processing Modify the RDMA_READ processing to use the reply and chunk list mapping data types. Also add a special purpose 'hdr_count' field in in the context to hold the header page count instead of overloading the SGE length field and corrupting the DMA map length. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:55 -05:00
Tom Tucker	34d16e42a6	svcrdma: Use RPC reply map for RDMA_WRITE processing Use the new svc_rdma_req_map data type for mapping the client side memory to the server side memory. Move the DMA mapping to the context pointed to by each WR individually so that it is unmapped after the WR completes. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:54 -05:00
Tom Tucker	ab96dddbed	svcrdma: Add a type for keeping NFS RPC mapping Create a new data structure to hold the remote client address space to local server address space mapping. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-07-02 15:01:53 -05:00
Kevin Coffman	863a24882e	gss_krb5: Use random value to initialize confounder Initialize the value used for the confounder to a random value rather than starting from zero. Allow for confounders of length 8 or 16 (which will be needed for AES). Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:47:38 -04:00
Kevin Coffman	db8add5789	gss_krb5: move gss_krb5_crypto into the krb5 module The gss_krb5_crypto.o object belongs in the rpcsec_gss_krb5 module. Also, there is no need to export symbols from gss_krb5_crypto.c Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:47:32 -04:00
Kevin Coffman	d00953a53e	gss_krb5: create a define for token header size and clean up ptr location cleanup: Document token header size with a #define instead of open-coding it. Don't needlessly increment "ptr" past the beginning of the header which makes the values passed to functions more understandable and eliminates the need for extra "krb5_hdr" pointer. Clean up some intersecting white-space issues flagged by checkpatch.pl. This leaves the checksum length hard-coded at 8 for DES. A later patch cleans that up. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:47:25 -04:00
Jeff Layton	a75c5d01e4	sunrpc: remove sv_kill_signal field from svc_serv struct Since we no longer make any distinction between shutdown signals with nfsd, then it becomes easier to just standardize on a particular signal to use to bring it down (SIGINT, in this case). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:49 -04:00
Jeff Layton	9867d76ca1	knfsd: convert knfsd to kthread API This patch is rather large, but I couldn't figure out a way to break it up that would remain bisectable. It does several things: - change svc_thread_fn typedef to better match what kthread_create expects - change svc_pool_map_set_cpumask to be more kthread friendly. Make it take a task arg and and get rid of the "oldmask" - have svc_set_num_threads call kthread_create directly - eliminate __svc_create_thread Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:49 -04:00
Neil Brown	bedbdd8bad	knfsd: Replace lock_kernel with a mutex for nfsd thread startup/shutdown locking. This removes the BKL from the RPC service creation codepath. The BKL really isn't adequate for this job since some of this info needs protection across sleeps. Also, add some comments to try and clarify how the locking should work and to make it clear that the BKL isn't necessary as long as there is adequate locking between tasks when touching the svc_serv fields. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-06-23 13:02:49 -04:00
Adrian Bunk	0b04082995	net: remove CVS keywords This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-06-11 21:00:38 -07:00
Mike Travis	3f9b48a758	net: Pass reference to cpumask variable in net/sunrpc/svc.c * Pass reference to cpumask variable instead of using stack. For inclusion into sched-devel/latest tree. Based on: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + sched-devel/latest .../mingo/linux-2.6-sched-devel.git Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-23 18:44:36 +02:00
Linus Torvalds	d40ace0c7b	Merge branch 'for-2.6.26' of git://linux-nfs.org/~bfields/linux * 'for-2.6.26' of git://linux-nfs.org/~bfields/linux: (25 commits) svcrdma: Verify read-list fits within RPCSVC_MAXPAGES svcrdma: Change svc_rdma_send_error return type to void svcrdma: Copy transport address and arm CQ before calling rdma_accept svcrdma: Set rqstp transport address in rdma_read_complete function svcrdma: Use ib verbs version of dma_unmap svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_free svcrdma: Move the QP and cm_id destruction to svc_rdma_free svcrdma: Add reference for each SQ/RQ WR svcrdma: Move destroy to kernel thread svcrdma: Shrink scope of spinlock on RQ CQ svcrdma: Use standard Linux lists for context cache svcrdma: Simplify RDMA_READ deferral buffer management svcrdma: Remove unused READ_DONE context flags bit svcrdma: Return error from rdma_read_xdr so caller knows to free context svcrdma: Fix error handling during listening endpoint creation svcrdma: Free context on post_recv error in send_reply svcrdma: Free context on ib_post_recv error svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler svcrdma: Fix return value in svc_rdma_send svcrdma: Fix race with dto_tasklet in svc_rdma_send ...	2008-05-20 19:30:54 -07:00
J. Bruce Fields	68432a03f8	Merge branch 'from-tomtucker' into for-2.6.26	2008-05-20 19:57:38 -04:00
Tom Tucker	a6f911c04e	svcrdma: Verify read-list fits within RPCSVC_MAXPAGES A RDMA read-list cannot contain more elements than RPCSVC_MAXPAGES or it will overflow the DTO context. Verify this when processing the protocol header. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:34:02 -05:00
Tom Tucker	008fdbc571	svcrdma: Change svc_rdma_send_error return type to void The svc_rdma_send_error function is called when an RPCRDMA protocol error is detected. This function attempts to post an error reply message. Since an error posting to a transport in error is ignored, change the return type to void. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:34:01 -05:00
Tom Tucker	af261af4db	svcrdma: Copy transport address and arm CQ before calling rdma_accept This race was found by inspection. Messages can be received from the peer immediately following the rdma_accept call, however, the CQ have not yet been armed and the transport address has not yet been set. Set the transport address in the connect request handler and arm the CQ prior to calling rdma_accept. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:34:00 -05:00
Tom Tucker	69500c43b4	svcrdma: Set rqstp transport address in rdma_read_complete function The rdma_read_complete function needs to copy the rqstp transport address from the transport. Failure to do so can result in using the wrong authentication method for the RPC or bug checking if the rqstp address is not valid. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:59 -05:00
Tom Tucker	97a3df382e	svcrdma: Use ib verbs version of dma_unmap Use the ib_verbs version of the dma_unmap service in the svc_rdma_put_context function. This should support providers using software rdma. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:58 -05:00
Tom Tucker	356d0a1519	svcrdma: Cleanup queued, but unprocessed I/O in svc_rdma_free When the transport is closing, the DTO tasklet may queue data that never gets processed. Clean up resources associated with this I/O. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:57 -05:00
Tom Tucker	1711386c62	svcrdma: Move the QP and cm_id destruction to svc_rdma_free Move the destruction of the QP and CM_ID to the free path so that the QP cleanup code doesn't race with the dto_tasklet handling flushed WR. The QP reference is not needed because we now have a reference for every WR. Also add a guard in the SQ and RQ completion handlers to ignore calls generated by some providers when the QP is destroyed. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:56 -05:00
Tom Tucker	0905c0f0a2	svcrdma: Add reference for each SQ/RQ WR Add a reference on the transport for every outstanding WR. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:55 -05:00
Tom Tucker	8da91ea8de	svcrdma: Move destroy to kernel thread Some providers may wait while destroying adapter resources. Since it is possible that the last reference is put on the dto_tasklet, the actual destroy must be scheduled as a work item. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:54 -05:00
Tom Tucker	47698e083e	svcrdma: Shrink scope of spinlock on RQ CQ The rq_cq_reap function is only called from the dto_tasklet. The only resource shared with other threads is the sc_rq_dto_q. Move the spin lock to protect only this list. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:53 -05:00
Tom Tucker	8740767376	svcrdma: Use standard Linux lists for context cache Replace the one-off linked list implementation used to implement the context cache with the standard Linux list_head lists. Add a context counter to catch resource leaks. A WARN_ON will be added later to ensure that we've freed all contexts. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:52 -05:00
Tom Tucker	02e7452de7	svcrdma: Simplify RDMA_READ deferral buffer management An NFS_WRITE requires a set of RDMA_READ requests to fetch the write data from the client. There are two principal pieces of data that need to be tracked: the list of pages that comprise the completed RPC and the SGE of dma mapped pages to refer to this list of pages. Previously this whole bit was managed as a linked list of contexts with the context containing the page list buried in this list. This patch simplifies this processing by not keeping a linked list, but rather only a pionter from the last submitted RDMA_READ's context to the context that maps the set of pages that describe the RPC. This significantly simplifies this code path. SGE contexts are cleaned up inline in the DTO path instead of at read completion time. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:51 -05:00
Tom Tucker	10a38c33f4	svcrdma: Remove unused READ_DONE context flags bit The RDMACTXT_F_READ_DONE bit is not longer used. Remove it. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:50 -05:00
Tom Tucker	d16d40093a	svcrdma: Return error from rdma_read_xdr so caller knows to free context The rdma_read_xdr function did not discriminate between no read-list and an error posting the read-list. This results in a leak of a page if there is an error posting the read-list. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:49 -05:00
Tom Tucker	58e8f62137	svcrdma: Fix error handling during listening endpoint creation A listening endpoint isn't known to the generic transport switch until the svc_create_xprt function returns without error. Calling svc_xprt_put within the xpo_create function causes the module reference count to be erroneously decremented. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:48 -05:00
Tom Tucker	5ac461a6f0	svcrdma: Free context on post_recv error in send_reply If an error is encountered trying to post a recv buffer in send_reply, free the passed in context. Return an error to the caller so it is aware that the request was not posted. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:47 -05:00
Tom Tucker	05a0826a6e	svcrdma: Free context on ib_post_recv error If there is an error posting the recv WR to the RQ, free the context associated with the WR. This would leak a context when asynchronous errors occurred on the transport while conccurent threads were processing their RPC. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:47 -05:00
Tom Tucker	120693d12c	svcrdma: Add put of connection ESTABLISHED reference in rdma_cma_handler The svcrdma transport takes a reference when it gets the ESTABLISHED event from the provider. This reference is supposed to be removed when the DISCONNECT event is received, however, the call to svc_xprt_put was missing in the switch statement. This results in the memory associated with the transport never being freed. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:46 -05:00
Tom Tucker	9d6347acd2	svcrdma: Fix return value in svc_rdma_send Fix the return value on close to -ENOTCONN so caller knows to free context. Also if a thread is waiting for free SQ space, check for close when waking to avoid posting WR to a closing transport. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:45 -05:00
Tom Tucker	dbcd00eba9	svcrdma: Fix race with dto_tasklet in svc_rdma_send The svc_rdma_send function will attempt to reap SQ WR to make room for a new request if it finds the SQ full. This function races with the dto_tasklet that also reaps SQ WR. To avoid polling and arming the CQ unnecessarily move the test_and_clear_bit of the RDMAXPRT_SQ_PENDING flag and arming of the CQ to the sq_cq_reap function. Refactor the rq_cq_reap function to match sq_cq_reap so that the code is easier to follow. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:44 -05:00
Tom Tucker	0e7f011a19	svcrdma: Simplify receive buffer posting The svcrdma transport provider currently allocates receive buffers to the RQ through the xpo_release_rqst method. This approach is overly complicated since it means that the rqstp rq_xprt_ctxt has to be selectively set based on whether the RPC is going to be processed immediately or deferred. Instead, just post the receive buffer when we are certain that we are replying in the send_reply function. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:43 -05:00
Tom Tucker	aa3314c8d6	svc: Remove unused header files from svc_xprt.c This cosmetic patch removes unused header files that svc_xprt.c inherited from svcsock.c Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:42 -05:00
Tom Tucker	fc63a05086	svc: Remove extra check for XPT_DEAD bit in svc_xprt_enqueue Remove a redundant check for the XPT_DEAD bit in the svc_xprt_enqueue function. This same bit is checked below while holding the pool lock and prints a debug message if found to be dead. Signed-off-by: Tom Tucker <tom@opengridcomputing.com>	2008-05-19 07:33:41 -05:00
J. Bruce Fields	d71a4dd72e	svcrpc: fix proc/net/rpc/auth.unix.ip/content display Commit `f15364bd4c` ("IPv6 support for NFS server export caches") dropped a couple spaces, rendering the output here difficult to read. (However note that we expect the output to be parsed only by humans, not machines, so this shouldn't have broken any userland software.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-05-18 19:13:07 -04:00
Trond Myklebust	b4528762ca	SUNRPC: AUTH_SYS "machine creds" shouldn't use negative valued uid/gid Apparently this causes Solaris 10 servers to refuse our NFSv4 SETCLIENTID calls. Fall back to root creds for now, since most servers that care are very likely to have root squashing enabled. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-05-18 14:18:27 -04:00
Huang Weiyi	625fc3a375	Remove duplicated include in net/sunrpc/svc.c <linux/sched.h> we included twice. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-05-08 10:58:25 -07:00
Denis V. Lunev	e7fe23363b	sunrpc: assign PDE->data before gluing PDE into /proc tree Simply replace proc_create and further data assigned with proc_create_data. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-05-02 02:44:36 -07:00
Linus Torvalds	77a50df2b1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: iwlwifi: Allow building iwl3945 without iwl4965. wireless: Fix compile error with wifi & leds tcp: Fix slab corruption with ipv6 and tcp6fuzz ipv4/ipv6 compat: Fix SSM applications on 64bit kernels. [IPSEC]: Use digest_null directly for auth sunrpc: fix missing kernel-doc can: Fix copy_from_user() results interpretation Revert "ipv6: Fix typo in net/ipv6/Kconfig" tipc: endianness annotations ipv6: result of csum_fold() is already 16bit, no need to cast [XFRM] AUDIT: Fix flowlabel text format ambibuity.	2008-04-28 09:44:11 -07:00
Randy Dunlap	0b80ae4201	sunrpc: fix missing kernel-doc Fix missing sunrpc kernel-doc: Warning(linux-2.6.25-git7//net/sunrpc/xprt.c:451): No description found for parameter 'action' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-04-27 14:26:52 -07:00
Linus Torvalds	563307b2fa	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (80 commits) SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request make nfs_automount_list static NFS: remove duplicate flags assignment from nfs_validate_mount_data NFS - fix potential NULL pointer dereference v2 SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use SUNRPC: Fix a race in gss_refresh_upcall() SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests SUNRPC: Remove the unused export of xprt_force_disconnect SUNRPC: remove XS_SENDMSG_RETRY SUNRPC: Protect creds against early garbage collection NFSv4: Attempt to use machine credentials in SETCLIENTID calls NFSv4: Reintroduce machine creds NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid() nfs: fix printout of multiword bitfields nfs: return negative error value from nfs{,4}_stat_to_errno NLM/lockd: Ensure client locking calls use correct credentials NFS: Remove the buggy lock-if-signalled case from do_setlk() NLM/lockd: Fix a race when cancelling a blocking lock NLM/lockd: Ensure that nlmclnt_cancel() returns results of the CANCEL call NLM: Remove the signal masking in nlmclnt_proc/nlmclnt_cancel ...	2008-04-24 11:46:16 -07:00
Trond Myklebust	233607dbbc	Merge branch 'devel'	2008-04-24 14:01:02 -04:00
Trond Myklebust	b48633bd08	SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request RFC 2203 requires the server to drop the request if it believes the RPCSEC_GSS context is out of sequence. The problem is that we have no way on the client to know why the server dropped the request. In order to avoid spinning forever trying to resend the request, the safe approach is therefore to always invalidate the RPCSEC_GSS context on every major timeout. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-24 13:53:46 -04:00
Chuck Lever	0dc220f081	SUNRPC: Use unsigned loop and array index in svc_init_buffer() Clean up: Suppress a harmless compiler warning. Index rq_pages[] with an unsigned type. Make "pages" unsigned as well, as it never represents a value less than zero. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:43 -04:00
Chuck Lever	50c8bb13ea	SUNRPC: Use unsigned index when looping over arrays Clean up: Suppress a harmless compiler warning in the RPC server related to array indices. ARRAY_SIZE() returns a size_t, so use unsigned type for a loop index when looping over arrays. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:43 -04:00
Chuck Lever	c0401ea008	SUNRPC: Update RPC server's TCP record marker decoder Clean up: Update the RPC server's TCP record marker decoder to match the constructs used by the RPC client's TCP socket transport. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:43 -04:00
Chuck Lever	b7872fe86d	SUNRPC: RPC server still uses 2.4 method for disabling TCP Nagle Use the 2.6 method for disabling TCP Nagle in the kernel's RPC server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:43 -04:00
Jeff Layton	8774282c4c	SUNRPC: remove svc_create_thread() Now that the nfs4 callback thread uses the kthread API, there are no more users of svc_create_thread(). Remove it. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:42 -04:00
Kevin Coffman	4ab4b0bedd	sunrpc: make token header values less confusing g_make_token_header() and g_token_size() add two too many, and therefore their callers pass in "(logical_value - 2)" rather than "logical_value" as hard-coded values which causes confusion. This dates back to the original g_make_token_header which took an optional token type (token_id) value and added it to the token. This was removed, but the routine always adds room for the token_id rather than not. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:41 -04:00
Kevin Coffman	5743d65c2f	gss_krb5: consistently use unsigned for seqnum Consistently use unsigned (u32 vs. s32) for seqnum. In get_mic function, send the local copy of seq_send, rather than the context version. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:41 -04:00
Andrew Morton	c3bb257d2d	net/sunrpc/svc.c: suppress unintialized var warning net/sunrpc/svc.c: In function '__svc_create_thread': net/sunrpc/svc.c:587: warning: 'oldmask.bits[0u]' may be used uninitialized in this function Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: David S. Miller <davem@davemloft.net> Cc: Tom Tucker <tom@opengridcomputing.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:40 -04:00
Kevin Coffman	30aef3166a	Remove define for KRB5_CKSUM_LENGTH, which will become enctype-dependent cleanup: When adding new encryption types, the checksum length can be different for each enctype. Face the fact that the current code only supports DES which has a checksum length of 8. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:40 -04:00
Kevin Coffman	3d4a688678	Correct grammer/typos in dprintks cleanup: Fix grammer/typos to use "too" instead of "to" Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:40 -04:00
Tom Tucker	830bb59b6e	SVCRDMA: Add check for XPT_CLOSE in svc_rdma_send SVCRDMA: Add check for XPT_CLOSE in svc_rdma_send The svcrdma transport can crash if a send is waiting for an empty SQ slot and the connection is closed due to an asynchronous error. The crash is caused when svc_rdma_send attempts to send on a deleted QP. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:40 -04:00
Harshula Jayasuriya	dd35210e1e	sunrpc: GSS integrity and decryption failures should return GARBAGE_ARGS In function svcauth_gss_accept() (net/sunrpc/auth_gss/svcauth_gss.c) the code that handles GSS integrity and decryption failures should be returning GARBAGE_ARGS as specified in RFC 2203, sections 5.3.3.4.2 and 5.3.3.4.3. Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: Harshula Jayasuriya <harshula@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:39 -04:00
Jeff Layton	7b54fe61ff	SUNRPC: allow svc_recv to break out of 500ms sleep when alloc_page fails svc_recv() calls alloc_page(), and if it fails it does a 500ms uninterruptible sleep and then reattempts. There doesn't seem to be any real reason for this to be uninterruptible, so change it to an interruptible sleep. Also check for kthread_stop() and signalled() after setting the task state to avoid races that might lead to sleeping after kthread_stop() wakes up the task. I've done some very basic smoke testing with this, but obviously it's hard to test the actual changes since this all depends on an alloc_page() call failing. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:38 -04:00
J. Bruce Fields	67eb6ff610	svcrpc: move unused field from cache_deferred_req This field is set once and never used; probably some artifact of an earlier implementation idea. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:37 -04:00
Aurélien Charbon	f15364bd4c	IPv6 support for NFS server export caches This adds IPv6 support to the interfaces that are used to express nfsd exports. All addressed are stored internally as IPv6; backwards compatibility is maintained using mapped addresses. Thanks to Bruce Fields, Brian Haley, Neil Brown and Hideaki Joshifuji for comments Signed-off-by: Aurelien Charbon <aurelien.charbon@bull.net> Cc: Neil Brown <neilb@suse.de> Cc: Brian Haley <brian.haley@hp.com> Cc: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:36 -04:00
Jeff Layton	7086721f9c	SUNRPC: have svc_recv() check kthread_should_stop() When using kthreads that call into svc_recv, we want to make sure that they do not block there for a long time when we're trying to take down the kthread. This patch changes svc_recv() to check kthread_should_stop() at the same places that it checks to see if it's signalled(). Also check just before svc_recv() tries to schedule(). By making sure that we check it just after setting the task state we can avoid having to use any locking or signalling to ensure it doesn't block for a long time. There's still a chance of a 500ms sleep if alloc_page() fails, but that should be a rare occurrence and isn't a terribly long time in the context of a kthread being taken down. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:36 -04:00
Jeff Layton	23d42ee278	SUNRPC: export svc_sock_update_bufs Needed since the plan is to not have a svc_create_thread helper and to have current users of that function just call kthread_run directly. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-04-23 16:13:35 -04:00
Trond Myklebust	cd019f7517	SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use When a server rejects our credential with an AUTH_REJECTEDCRED or similar, we need to refresh the credential and then retry the request. However, we do want to allow any requests that are in flight to finish executing, so that we can at least attempt to process the replies that depend on this instance of the credential. The solution is to ensure that gss_refresh() looks up an entirely new RPCSEC_GSS credential instead of attempting to create a context for the existing invalid credential. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:55:19 -04:00
Trond Myklebust	7b6962b0a6	SUNRPC: Fix a race in gss_refresh_upcall() If the downcall completes before we get the spin_lock then we currently fail to refresh the credential. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:55:15 -04:00
Trond Myklebust	7c1d71cf56	SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests NFSv4 requires us to ensure that we break the TCP connection before we're allowed to retransmit a request. However in the case where we're retransmitting several requests that have been sent on the same connection, we need to ensure that we don't interfere with the attempt to reconnect and/or break the connection again once it has been established. We therefore introduce a 'connection' cookie that is bumped every time a connection is broken. This allows requests to track if they need to force a disconnection. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:55:12 -04:00
Trond Myklebust	636ac43318	SUNRPC: Remove the unused export of xprt_force_disconnect Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:55:08 -04:00
Trond Myklebust	06b4b681ab	SUNRPC: remove XS_SENDMSG_RETRY The condition for exiting from the loop in xs_tcp_send_request() should be that we find we're not making progress (i.e. number of bytes sent is 0). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:55:05 -04:00
Trond Myklebust	d2b8314163	SUNRPC: Protect creds against early garbage collection Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:55:02 -04:00
Trond Myklebust	7c67db3a8a	NFSv4: Reintroduce machine creds We need to try to ensure that we always use the same credentials whenever we re-establish the clientid on the server. If not, the server won't recognise that we're the same client, and so may not allow us to recover state. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:54:56 -04:00
Trond Myklebust	78ea323be6	NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid() With the recent change to generic creds, we can no longer use cred->cr_ops->cr_name to distinguish between RPCSEC_GSS principals and AUTH_SYS/AUTH_NULL identities. Replace it with the rpc_authops->au_name instead... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:54:53 -04:00
Trond Myklebust	1e799b673c	SUNRPC: Fix read ordering problems with req->rq_private_buf.len We want to ensure that req->rq_private_buf.len is updated before req->rq_received, so that call_decode() doesn't use an old value for req->rq_rcv_buf.len. In 'call_decode()' itself, instead of using task->tk_status (which is set using req->rq_received) must use the actual value of req->rq_private_buf.len when deciding whether or not the received RPC reply is too short. Finally ensure that we set req->rq_rcv_buf.len to zero when retrying a request. A typo meant that we were resetting req->rq_private_buf.len in call_decode(), and then clobbering that value with the old rq_rcv_buf.len again in xprt_transmit(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:53:20 -04:00
Trond Myklebust	080a1f148d	SUNRPC: Don't attempt to destroy expired RPCSEC_GSS credentials.. ..and always destroy using a 'soft' RPC call. Destroying GSS credentials isn't mandatory; the server can always cope with a few credentials not getting destroyed in a timely fashion. This actually fixes a hang situation. Basically, some servers will decide that the client is crazy if it tries to destroy an RPC context for which they have sent an RPCSEC_GSS_CREDPROBLEM, and so will refuse to talk to it for a while. The regression therefor probably was introduced by commit `0df7fb74fb`. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:52:54 -04:00
Trond Myklebust	b6ddf64ffe	SUNRPC: Fix up xprt_write_space() The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether or not we have someone waiting for buffer memory. Convert the SUNRPC layer to use the same idiom. Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In fact, the most common case will be that there is nobody waiting for buffer space. SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was limited by the application window. Ensure that we follow the same idiom as the rest of the networking layer here too. Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that write_space() doesn't keep waking things up on xprt->pending. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:52:44 -04:00
Trond Myklebust	24b74bf0c9	SUNRPC: Fix a bug in call_decode() call_verify() can, under certain circumstances, free the RPC slot. In that case, our cached pointer 'req = task->tk_rqstp' is invalid. Bug was introduced in commit `220bcc2afd` (SUNRPC: Don't call xprt_release in call refresh). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-19 16:52:33 -04:00
Mike Travis	c5f59f0833	nodemask: use new node_to_cpumask_ptr function * Use new node_to_cpumask_ptr. This creates a pointer to the cpumask for a given node. This definition is in mm patch: asm-generic-add-node_to_cpumask_ptr-macro.patch * Use new set_cpus_allowed_ptr function. Depends on: [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch [sched-devel]: sched: add new set_cpus_allowed_ptr function [x86/latest]: x86: add cpus_scnprintf function Cc: Greg Kroah-Hartman <gregkh@suse.de> Cc: Greg Banks <gnb@melbourne.sgi.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-04-19 19:44:59 +02:00
Linus Torvalds	334d094504	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.26 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.26: (1090 commits) [NET]: Fix and allocate less memory for ->priv'less netdevices [IPV6]: Fix dangling references on error in fib6_add(). [NETLABEL]: Fix NULL deref in netlbl_unlabel_staticlist_gen() if ifindex not found [PKT_SCHED]: Fix datalen check in tcf_simp_init(). [INET]: Uninline the __inet_inherit_port call. [INET]: Drop the inet_inherit_port() call. SCTP: Initialize partial_bytes_acked to 0, when all of the data is acked. [netdrvr] forcedeth: internal simplifications; changelog removal phylib: factor out get_phy_id from within get_phy_device PHY: add BCM5464 support to broadcom PHY driver cxgb3: Fix __must_check warning with dev_dbg. tc35815: Statistics cleanup natsemi: fix MMIO for PPC 44x platforms [TIPC]: Cleanup of TIPC reference table code [TIPC]: Optimized initialization of TIPC reference table [TIPC]: Remove inlining of reference table locking routines e1000: convert uint16_t style integers to u16 ixgb: convert uint16_t style integers to u16 sb1000.c: make const arrays static sb1000.c: stop inlining largish static functions ...	2008-04-18 18:02:35 -07:00
David S. Miller	1e42198609	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-04-17 23:56:30 -07:00
Roland Dreier	0f39cf3d54	IB/core: Add support for "send with invalidate" work requests Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The values chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-04-16 21:09:32 -07:00
Chuck Lever	ed13c27e54	SUNRPC: Fix a memory leak in rpc_create() Commit `510deb0d` was supposed to move the xprt_create_transport() call in rpc_create(), but neglected to remove the old call site. This resulted in a transport leak after every rpc_create() call. This leak is present in 2.6.24 and 2.6.25. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-08 21:07:00 -04:00
Trond Myklebust	daeba89d43	SUNRPC: don't call flush_dcache_page() with an invalid pointer Fix a problem in _copy_to_pages(), whereby it may call flush_dcache_page() with an invalid pointer due to the fact that 'pgto' gets incremented beyond the end of the page array. Fix is to exit the loop without this unnecessary increment of pgto. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-04-08 21:06:50 -04:00
David S. Miller	3bb5da3837	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	2008-04-03 14:33:42 -07:00
Tom Tucker	c8237a5fce	SVCRDMA: Check num_sge when setting LAST_CTXT bit The RDMACTXT_F_LAST_CTXT bit was getting set incorrectly when the last chunk in the read-list spanned multiple pages. This resulted in a kernel panic when the wrong context was used to build the RPC iovec page list. RDMA_READ is used to fetch RPC data from the client for NFS_WRITE requests. A scatter-gather is used to map the advertised client side buffer to the server-side iovec and associated page list. WR contexts are used to convey which scatter-gather entries are handled by each WR. When the write data is large, a single RPC may require multiple RDMA_READ requests so the contexts for a single RPC are chained together in a linked list. The last context in this list is marked with a bit RDMACTXT_F_LAST_CTXT so that when this WR completes, the CQ handler code can enqueue the RPC for processing. The code in rdma_read_xdr was setting this bit on the last two contexts on this list when the last read-list chunk spanned multiple pages. This caused the svc_rdma_recvfrom logic to incorrectly build the RPC and caused the kernel to crash because the second-to-last context doesn't contain the iovec page list. Modified the condition that sets this bit so that it correctly detects the last context for the RPC. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Tested-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-26 11:24:19 -07:00
Roland Dreier	d3073779f8	SVCRDMA: Use only 1 RDMA read scatter entry for iWARP adapters The iWARP protocol limits RDMA read requests to a single scatter entry. NFS/RDMA has code in rdma_read_max_sge() that is supposed to limit the sge_count for RDMA read requests to 1, but the code to do that is inside an #ifdef RDMA_TRANSPORT_IWARP block. In the mainline kernel at least, RDMA_TRANSPORT_IWARP is an enum and not a preprocessor #define, so the #ifdef'ed code is never compiled. In my test of a kernel build with -j8 on an NFS/RDMA mount, this problem eventually leads to trouble starting with: svcrdma: Error posting send = -22 svcrdma : RDMA_READ error = -22 and things go downhill from there. The trivial fix is to delete the #ifdef guard. The check seems to be a remnant of when the NFS/RDMA code was not merged and needed to compile against multiple kernel versions, although I don't think it ever worked as intended. In any case now that the code is upstream there's no need to test whether the RDMA_TRANSPORT_IWARP constant is defined or not. Without this patch, my kernel build on an NFS/RDMA mount using NetEffect adapters quickly and 100% reproducibly failed with an error like: ld: final link failed: Software caused connection abort With the patch applied I was able to complete a kernel build on the same setup. (Tom Tucker says this is "actually an _ancient_ remnant when it had to compile against iWARP vs. non-iWARP enabled OFA trees.") Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-24 11:25:25 -07:00
Trond Myklebust	c7c350e92a	Merge branch 'hotfixes' into devel	2008-03-19 17:59:44 -04:00
David S. Miller	577f99c1d0	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/rt2x00/rt2x00dev.c net/8021q/vlan_dev.c	2008-03-18 00:37:55 -07:00
David S. Miller	2f633928cb	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-03-17 23:44:31 -07:00
Al Viro	27724426a9	[SUNRPC]: net/* NULL noise Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-17 22:48:03 -07:00
Al Viro	e6f1cebf71	[NET] endianness noise: INADDR_ANY Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-17 22:44:53 -07:00
Trond Myklebust	98a8e32394	SUNRPC: Add a helper rpcauth_lookup_generic_cred() The NFSv4 protocol allows clients to negotiate security protocols on the fly in the case where an administrator on the server changes the export settings and/or in the case where we may have a filesystem migration event. Instead of having the NFS client code cache credentials that are tied to a particular AUTH method it is therefore preferable to have a generic credential that can be converted into whatever AUTH is in use by the RPC client when the read/write/sillyrename/... is put on the wire. We do this by means of the new "generic" credential, which basically just caches the minimal information that is needed to look up an RPCSEC_GSS, AUTH_SYS, or AUTH_NULL credential. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-03-14 13:42:49 -04:00
Trond Myklebust	5c691044ec	SUNRPC: Add an rpc_credop callback for binding a credential to an rpc_task We need the ability to treat 'generic' creds specially, since they want to bind instances of the auth cred instead of binding themselves. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-03-14 13:42:41 -04:00
Trond Myklebust	9a559efd41	SUNRPC: Add a generic RPC credential Add an rpc credential that is not tied to any particular auth mechanism, but that can be cached by NFS, and later used to look up a cred for whichever auth mechanism that turns out to be valid when the RPC call is being made. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-03-14 13:42:38 -04:00
Trond Myklebust	4ccda2cdd8	SUNRPC: Clean up rpcauth_bindcred() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-03-14 13:42:35 -04:00
Trond Myklebust	af09383577	SUNRPC: Fix RPCAUTH_LOOKUP_ROOTCREDS The current RPCAUTH_LOOKUP_ROOTCREDS flag only works for AUTH_SYS authentication, and then only as a special case in the code. This patch removes the auth_sys special casing, and replaces it with generic code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-03-14 13:42:32 -04:00
Trond Myklebust	25337fdc85	SUNRPC: Fix a bug in rpcauth_lookup_credcache() The hash bucket is for some reason always being set to zero. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-03-14 13:42:29 -04:00
Tom Tucker	3fedb3c5a8	SVCRDMA: Fix erroneous BUG_ON in send_write The assertion that checks for sge context overflow is incorrectly hard-coded to 32. This causes a kernel bug check when using big-data mounts. Changed the BUG_ON to use the computed value RPCSVC_MAXPAGES. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-12 12:37:34 -07:00
Tom Tucker	c48cbb405c	SVCRDMA: Add xprt refs to fix close/unmount crash RDMA connection shutdown on an SMP machine can cause a kernel crash due to the transport close path racing with the I/O tasklet. Additional transport references were added as follows: - A reference when on the DTO Q to avoid having the transport deleted while queued for I/O. - A reference while there is a QP able to generate events. - A reference until the DISCONNECTED event is received on the CM ID Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-03-12 12:37:34 -07:00
Trond Myklebust	9446389ef6	Merge commit 'origin' into devel	2008-03-08 11:49:24 -05:00
Linus Torvalds	4c1aa6f8b9	Merge branch 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'hotfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix dentry revalidation for NFSv4 referrals and mountpoint crossings NFS: Fix the fsid revalidation in nfs_update_inode() SUNRPC: Fix a nfs4 over rdma transport oops NFS: Fix an f_mode/f_flags confusion in fs/nfs/write.c	2008-03-07 12:08:07 -08:00
Tom Talpey	ee1a2c564f	SUNRPC: Fix a nfs4 over rdma transport oops Prevent an RPC oops when freeing a dynamically allocated RDMA buffer, used in certain special-case large metadata operations. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-03-07 14:35:32 -05:00
Harvey Harrison	0dc47877a3	net: replace remaining __FUNCTION__ occurrences __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-03-05 20:47:47 -08:00
Trond Myklebust	cdd0972945	Merge branch 'cleanups' into next	2008-02-28 23:48:05 -08:00
Trond Myklebust	5e4424af9a	SUNRPC: Remove now-redundant RCU-safe rpc_task free path Now that we've tightened up the locking rules for RPC queue wakeups, we can remove the RCU-safe kfree calls... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-28 23:26:28 -08:00
Trond Myklebust	ff2d7db848	SUNRPC: Ensure that we read all available tcp data Don't stop until we run out of data, or we hit an error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-28 23:26:27 -08:00
Trond Myklebust	f5fb7b06e4	SUNRPC: Eliminate the now-redundant rpc_start_wakeup() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-28 23:26:25 -08:00
Trond Myklebust	eb276c0e10	SUNRPC: Switch tasks to using the rpc_waitqueue's timer function Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-28 23:26:21 -08:00
Trond Myklebust	36df9aae31	SUNRPC: Add a timer function to wait queues. This is designed to replace the timeout timer in the individual rpc_tasks. By putting the timer function in the wait queue, we will eventually be able to reduce the total number of timers in use by the RPC subsystem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-28 23:21:59 -08:00
Trond Myklebust	f6a1cc8930	SUNRPC: Add a (empty for the moment) destructor for rpc_wait_queues Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-28 23:17:27 -08:00
Wang Chen	2ce8f047d5	[SUNRPC]: Use proc_create() to setup ->proc_fops first Use proc_create() to make sure that ->proc_fops be setup before gluing PDE to main tree. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-02-28 14:00:59 -08:00
Trond Myklebust	5d00837b90	SUNRPC: Run rpc timeout functions as callbacks instead of in softirqs An audit of the current RPC timeout functions shows that they don't really ever need to run in the softirq context. As long as the softirq is able to signal that the wakeup is due to a timeout (which it can do by setting task->tk_status to -ETIMEDOUT) then the callback functions can just run as standard task->tk_callback functions (in the rpciod/process context). The only possible border-line case would be xprt_timer() for the case of UDP, when the callback is used to reduce the size of the transport congestion window. In testing, however, the effect of moving that update to a callback would appear to be minor. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-25 21:40:44 -08:00
Trond Myklebust	fda1393938	SUNRPC: Convert users of rpc_wake_up_task to use rpc_wake_up_queued_task Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-25 21:40:42 -08:00
Trond Myklebust	96ef13b283	SUNRPC: Add a new helper rpc_wake_up_queued_task() In all cases where we currently use rpc_wake_up_task(), we almost always know on which waitqueue the rpc_task is actually sleeping. This will allows us to simplify the queue locking in a future patch. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-25 21:40:40 -08:00
Trond Myklebust	fde95c7554	SUNRPC: Clean up rpc_run_timer() All RPC timeout callback functions are expected to wake the task up. We can enforce this by moving the wakeup back into rpc_run_timer. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-25 21:40:39 -08:00
Trond Myklebust	32bfb5c0f4	SUNRPC: Allow the rpc_release() callback to be run on another workqueue A lot of the work done by the rpc_release() callback is inappropriate for rpciod as it will often involve things like starting a new rpc call in order to clean up state after an interrupted NFSv4 open() call, or calls to mntput(), etc. This patch allows the caller of rpc_run_task() to specify that the rpc_release callback should run on a different workqueue than the default rpciod_workqueue. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-25 21:40:34 -08:00
Pavel Emelyanov	5216a8e70e	Wrap buffers used for rpc debug printks into RPC_IFDEBUG Sorry for the noise, but here's the v3 of this compilation fix :) There are some places, which declare the char buf[...] on the stack to push it later into dprintk(). Since the dprintk sometimes (if the CONFIG_SYSCTL=n) becomes an empty do { } while (0) stub, these buffers cause gcc to produce appropriate warnings. Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to compile them out when not needed. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-21 18:42:29 -05:00
Jan Blunck	1d957f9bf8	Introduce path_put() * Add path_put() functions for releasing a reference to the dentry and vfsmount of a struct path in the right order * Switch from path_release(nd) to path_put(&nd->path) * Rename dput_path() to path_put_conditional() [akpm@linux-foundation.org: fix cifs] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: <linux-fsdevel@vger.kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-14 21:13:33 -08:00
Jan Blunck	4ac9137858	Embed a struct path into struct nameidata instead of nd->{dentry,mnt} This is the central patch of a cleanup series. In most cases there is no good reason why someone would want to use a dentry for itself. This series reflects that fact and embeds a struct path into nameidata. Together with the other patches of this series - it enforced the correct order of getting/releasing the reference count on <dentry,vfsmount> pairs - it prepares the VFS for stacking support since it is essential to have a struct path in every place where the stack can be traversed - it reduces the overall code size: without patch series: text data bss dec hex filename 5321639 858418 715768 6895825 6938d1 vmlinux with patch series: text data bss dec hex filename 5320026 858418 715768 `6894212` 693284 vmlinux This patch: Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix cifs] [akpm@linux-foundation.org: fix smack] Signed-off-by: Jan Blunck <jblunck@suse.de> Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-14 21:13:33 -08:00
Trond Myklebust	cbc2005925	SUNRPC: Declare as const the rpc_message arguments to rpc_call_sync/async Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-02-14 11:17:24 -05:00
Randy Dunlap	65b6e42cdc	docbook: sunrpc filenames and notation fixes Use updated file list for docbook files and fix kernel-doc warnings in sunrpc: Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:689): No description found for parameter 'rpc_client' Warning(linux-2.6.24-git12//net/sunrpc/rpc_pipe.c:765): No description found for parameter 'flags' Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:584): No description found for parameter 'tk_ops' Warning(linux-2.6.24-git12//net/sunrpc/clnt.c:618): No description found for parameter 'bufsize' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-13 16:21:19 -08:00
Roland Dreier	bb50c8012c	SUNPRC: Fix printk format warning net/sunrpc/xprtrdma/svc_rdma_sendto.c:160: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'u64' Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-10 18:11:22 -05:00
Chuck Lever	ea339d46b9	SUNRPC: RPC program information is stored in unsigned integers Clean up: When looping over RPC version and procedure numbers, use unsigned index variables. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 17:01:31 -05:00
Trond Myklebust	d2f7e79e3b	SUNRPC: Move exported symbol definitions after function declaration part 2 Do it for the server code... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 17:01:24 -05:00
Jeff Layton	0113ab3464	SUNRPC: spin svc_rqst initialization to its own function Move the initialzation in __svc_create_thread that happens prior to thread creation to a new function. Export the function to allow services to have better control over the svc_rqst structs. Also rearrange the rqstp initialization to prevent NULL pointer dereferences in svc_exit_thread in case allocations fail. Signed-off-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:15 -05:00
Tom Tucker	4b8449af75	rdma: makefile Add the svcrdma module to the xprtrdma makefile. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:14 -05:00
Tom Tucker	ef1eac0a3f	rdma: ONCRPC RDMA protocol marshalling This logic parses the ONCRDMA protocol headers that precede the actual RPC header. It is placed in a separate file to keep all protocol aware code in a single place. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:14 -05:00
Tom Tucker	c06b540a54	rdma: SVCRDMA sendto This file implements the RDMA transport sendto function. A RPC reply on an RDMA transport consists of some number of RDMA_WRITE requests followed by an RDMA_SEND request. The sendto function parses the ONCRPC RDMA reply header to determine how to send the reply back to the client. The send queue is sized so as to be able to send complete replies for requests in most cases. In the event that there are not enough SQ WR slots to reply, e.g. big data, the send will block the NFSD thread. The I/O callback functions in svc_rdma_transport.c that reap WR completions wake any waiters blocked on the SQ. In general, the goal is not to block NFSD threads and the has_wspace method stall requests when the SQ is nearly full. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:14 -05:00
Tom Tucker	d5b31be682	rdma: SVCRDMA recvfrom This file implements the RDMA transport recvfrom function. The function dequeues work reqeust completion contexts from an I/O list that it shares with the I/O tasklet in svc_rdma_transport.c. For ONCRPC RDMA, an RPC may not be complete when it is received. Instead, the RDMA header that precedes the RPC message informs the transport where to get the RPC data from on the client and where to place it in the RPC message before it is delivered to the server. The svc_rdma_recvfrom function therefore, parses this RDMA header and issues any necessary RDMA operations to fetch the remainder of the RPC from the client. Special handling is required when the request involves an RDMA_READ. In this case, recvfrom submits the RDMA_READ requests to the underlying transport driver and then returns 0. When the transport completes the last RDMA_READ for the request, it enqueues it on a read completion queue and enqueues the transport. The recvfrom code favors this queue over the regular DTO queue when satisfying reads. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:14 -05:00
Tom Tucker	377f9b2f45	rdma: SVCRDMA Core Transport Services This file implements the core transport data management and I/O path. The I/O path for RDMA involves receiving callbacks on interrupt context. Since all the svc transport locks are _bh locks we enqueue the transport on a list, schedule a tasklet to dequeue data indications from the RDMA completion queue. The tasklet in turn takes _bh locks to enqueue receive data indications on a list for the transport. The svc_rdma_recvfrom transport function dequeues data from this list in an NFSD thread context. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:14 -05:00
Tom Tucker	ef7fbf59e6	rdma: SVCRDMA Transport Module This file implements the RDMA transport module initialization and termination logic and registers the transport sysctl variables. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:14 -05:00
Tom Tucker	9571af18fa	svc: Add svc_xprt_names service to replace svc_sock_names Create a transport independent version of the svc_sock_names function. The toclose capability of the svc_sock_names service can be implemented using the svc_xprt_find and svc_xprt_close services. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:14 -05:00
Tom Tucker	a217813f90	knfsd: Support adding transports by writing portlist file Update the write handler for the portlist file to allow creating new listening endpoints on a transport. The general form of the string is: <transport_name><space><port number> For example: echo "tcp 2049" > /proc/fs/nfsd/portlist This is intended to support the creation of a listening endpoint for RDMA transports without adding #ifdef code to the nfssvc.c file. Transports can also be removed as follows: '-'<transport_name><space><port number> For example: echo "-tcp 2049" > /proc/fs/nfsd/portlist Attempting to add a listener with an invalid transport string results in EPROTONOSUPPORT and a perror string of "Protocol not supported". Attempting to remove an non-existent listener (.e.g. bad proto or port) results in ENOTCONN and a perror string of "Transport endpoint is not connected" Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	7fcb98d58c	svc: Add svc API that queries for a transport instance Add a new svc function that allows a service to query whether a transport instance has already been created. This is used in lockd to determine whether or not a transport needs to be created when a lockd instance is brought up. Specifying 0 for the address family or port is effectively a wild-card, and will result in matching the first transport in the service's list that has a matching class name. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	dc9a16e49d	svc: Add /proc/sys/sunrpc/transport files Add a file that when read lists the set of registered svc transports. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	260c1d1298	svc: Add transport hdr size for defer/revisit Some transports have a header in front of the RPC header. The current defer/revisit processing considers only the iov_len and arg_len to determine how much to back up when saving the original request to revisit. Add a field to the rqstp structure to save the size of the transport header so svc_defer can correctly compute the start of a request. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	0f0257eaa5	svc: Move the xprt independent code to the svc_xprt.c file This functionally trivial patch moves all of the transport independent functions from the svcsock.c file to the transport independent svc_xprt.c file. In addition the following formatting changes were made: - White space cleanup - Function signatures on single line - The inline directive was removed - Lines over 80 columns were reformatted - The term 'socket' was changed to 'transport' in comments - The SMP comment was moved and updated. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	18d19f949d	svc: Make svc_check_conn_limits xprt independent The svc_check_conn_limits function only manipulates xprt fields. Change references to svc_sock->sk_xprt to svc_xprt directly. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	57b1d3baba	svc: Removing remaining references to rq_sock in rqstp This functionally empty patch removes rq_sock and unamed union from rqstp structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	4e5caaa5f2	svc: Move create logic to common code Move the svc transport list logic into common transport creation code. Refactor this code path to make the flow of control easier to read. Move the setting and clearing of the BUSY_BIT during transport creation to common code. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	9f8bfae693	svc: Make svc_age_temp_sockets svc_age_temp_transports This function is transport independent. Change it to use svc_xprt directly and change it's name to reflect this. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:13 -05:00
Tom Tucker	c36adb2a7f	svc: Make svc_recv transport neutral All of the transport field and functions used by svc_recv are now transport independent. Change the svc_recv function to use the svc_xprt structure directly instead of the transport specific svc_sock structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	eab996d4ac	svc: Make svc_sock_release svc_xprt_release The svc_sock_release function only touches transport independent fields. Change the function to manipulate svc_xprt directly instead of the transport dependent svc_sock structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	9dbc240f19	svc: Move the sockaddr information to svc_xprt This patch moves the transport sockaddr to the svc_xprt structure. Convenience functions are added to set and get the local and remote addresses of a transport from the transport provider as well as determine the length of a sockaddr. A transport is responsible for setting the xpt_local and xpt_remote addresses in the svc_xprt structure as part of transport creation and xpo_accept processing. This cannot be done in a generic way and in fact varies between TCP, UDP and RDMA. A set of xpo_ functions (e.g. getlocalname, getremotename) could have been added but this would have resulted in additional caching and copying of the addresses around. Note that the xpt_local address should also be set on listening endpoints; for TCP/RDMA this is done as part of endpoint creation. For connected transports like TCP and RDMA, the addresses never change and can be set once and copied into the rqstp structure for each request. For UDP, however, the local and remote addresses may change for each request. In this case, the address information is obtained from the UDP recvmsg info and copied into the rqstp structure from there. A svc_xprt_local_port function was also added that returns the local port given a transport. This is used by svc_create_xprt when returning the port associated with a newly created transport, and later when creating a generic find transport service to check if a service is already listening on a given port. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	8c7b0172a1	svc: Make deferral processing xprt independent This patch moves the transport independent sk_deferred list to the svc_xprt structure and updates the svc_deferred_req structure to keep pointers to svc_xprt's directly. The deferral processing code is also moved out of the transport dependent recvfrom functions and into the generic svc_recv path. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	def13d7401	svc: Move the authinfo cache to svc_xprt. Move the authinfo cache to svc_xprt. This allows both the TCP and RDMA transports to share this logic. A flag bit is used to determine if auth information is to be cached or not. Previously, this code looked at the transport protocol. I've also changed the spin_lock/unlock logic so that a lock is not taken for transports that are not caching auth info. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	4bc6c497b2	svc: Remove sk_lastrecv With the implementation of the new mark and sweep algorithm for shutting down old connections, the sk_lastrecv field is no longer needed. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	6bc5ab1367	svc: Move accept call to svc_xprt_received to common code Now that the svc_xprt_received function handles transports, the call to svc_xprt_received in the xpo_tcp_accept function can be moved to common code. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	a6046f71f2	svc: Change svc_sock_received to svc_xprt_received and export it All fields touched by svc_sock_received are now transport independent. Change it to use svc_xprt directly. This function is called from transport dependent code, so export it. Update the comment to clearly state the rules for calling this function. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:12 -05:00
Tom Tucker	a50fea26b9	svc: Make svc_send transport neutral Move the sk_mutex field to the transport independent svc_xprt structure. Now all the fields that svc_send touches are transport neutral. Change the svc_send function to use the transport independent svc_xprt directly instead of the transport dependent svc_sock structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:11 -05:00
Tom Tucker	f6150c3cab	svc: Make the enqueue service transport neutral and export it. The svc_sock_enqueue function is now transport independent since all of the fields it touches have been moved to the transport independent svc_xprt structure. Change the function to use the svc_xprt structure directly instead of the transport specific svc_sock structure. Transport specific data-ready handlers need to call this function, so export it. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:11 -05:00
Tom Tucker	7a90e8cc21	svc: Move sk_reserved to svc_xprt This functionally trivial patch moves the sk_reserved field to the transport independent svc_xprt structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:11 -05:00
Tom Tucker	7a18208383	svc: Make close transport independent Move sk_list and sk_ready to svc_xprt. This involves close because these lists are walked by svcs when closing all their transports. So I combined the moving of these lists to svc_xprt with making close transport independent. The svc_force_sock_close has been changed to svc_close_all and takes a list as an argument. This removes some svc internals knowledge from the svcs. This code races with module removal and transport addition. Thanks to Simon Holm Thøgersen for a compile fix. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Simon Holm Thøgersen <odie@cs.aau.dk>	2008-02-01 16:42:11 -05:00
Tom Tucker	bb5cf160b2	svc: Move sk_server and sk_pool to svc_xprt This is another incremental change that moves transport independent fields from svc_sock to the svc_xprt structure. The changes should be functionally null. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:11 -05:00
Tom Tucker	02fc6c3618	svc: Move sk_flags to the svc_xprt structure This functionally trivial change moves the transport independent sk_flags field to the transport independent svc_xprt structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:11 -05:00
Tom Tucker	e1b3157f97	svc: Change sk_inuse to a kref Change the atomic_t reference count to a kref and move it to the transport indepenent svc_xprt structure. Change the reference count wrapper names to be generic. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:11 -05:00
Tom Tucker	d7c9f1ed97	svc: Change services to use new svc_create_xprt service Modify the various kernel RPC svcs to use the svc_create_xprt service. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:09 -05:00
Tom Tucker	b700cbb11f	svc: Add a generic transport svc_create_xprt function The svc_create_xprt function is a transport independent version of the svc_makesock function. Since transport instance creation contains transport dependent and independent components, add an xpo_create transport function. The transport implementation of this function allocates the memory for the endpoint, implements the transport dependent initialization logic, and calls svc_xprt_init to initialize the transport independent field (svc_xprt) in it's data structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:09 -05:00
Tom Tucker	f9f3cc4fae	svc: Move connection limit checking to its own function Move the code that poaches connections when the connection limit is hit to a subroutine to make the accept logic path easier to follow. Since this is in the new connection path, it should not be a performance issue. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:09 -05:00
Tom Tucker	44a6995b32	svc: Remove unnecessary call to svc_sock_enqueue The svc_tcp_accept function calls svc_sock_enqueue after setting the SK_CONN bit. This doesn't actually do anything because the SK_BUSY bit is still set. The call is unnecessary anyway because the generic code in svc_recv calls svc_sock_received after calling the accept function. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:09 -05:00
Tom Tucker	38a417cc99	svc: Add xpo_accept transport function Previously, the accept logic looked into the socket state to determine whether to call accept or recv when data-ready was indicated on an endpoint. Since some transports don't use sockets, this logic now uses a flag bit (SK_LISTENER) to identify listening endpoints. A transport function (xpo_accept) allows each transport to define its own accept processing. A transport's initialization logic is reponsible for setting the SK_LISTENER bit. I didn't see any way to do this in transport independent logic since the passive side of a UDP connection doesn't listen and always recv's. In the svc_recv function, if the SK_LISTENER bit is set, the transport xpo_accept function is called to handle accept processing. Note that all functions are defined even if they don't make sense for a given transport. For example, accept doesn't mean anything for UDP. The function is defined anyway and bug checks if called. The UDP transport should never set the SK_LISTENER bit. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	d7979ae4a0	svc: Move close processing to a single place Close handling was duplicated in the UDP and TCP recvfrom methods. This code has been moved to the transport independent svc_recv function. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	323bee32e9	svc: Add a transport function that checks for write space In order to avoid blocking a service thread, the receive side checks to see if there is sufficient write space to reply to the request. Each transport has a different mechanism for determining if there is enough write space to reply. The code that checked for write space was coupled with code that checked for CLOSE and CONN. These checks have been broken out into separate statements to make the code easier to read. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	e831fe65b1	svc: Add xpo_prep_reply_hdr Some transports add fields to the RPC header for replies, e.g. the TCP record length. This function is called when preparing the reply header to allow each transport to add whatever fields it requires. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	755cceaba7	svc: Add per-transport delete functions Add transport specific xpo_detach and xpo_free functions. The xpo_detach function causes the transport to stop delivering data-ready events and enqueing the transport for I/O. The xpo_free function frees all resources associated with the particular transport instance. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	5148bf4ebc	svc: Add transport specific xpo_release function The svc_sock_release function releases pages allocated to a thread. For UDP this frees the receive skb. For RDMA it will post a receive WR and bump the client credit count. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	5d137990f5	svc: Move sk_sendto and sk_recvfrom to svc_xprt_class The sk_sendto and sk_recvfrom are function pointers that allow svc_sock to be used for both UDP and TCP. Move these function pointers to the svc_xprt_ops structure. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	490231558e	svc: Add a max payload value to the transport The svc_max_payload function currently looks at the socket type to determine the max payload. Add a max payload value to svc_xprt_class so it can be returned directly. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:08 -05:00
Tom Tucker	360d873864	svc: Make svc_sock the tcp/udp transport Make TCP and UDP svc_sock transports, and register them with the svc transport core. A transport type (svc_sock) has an svc_xprt as its first member, and calls svc_xprt_init to initialize this field. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:07 -05:00
Tom Tucker	1d8206b97a	svc: Add an svc transport class The transport class (svc_xprt_class) represents a type of transport, e.g. udp, tcp, rdma. A transport class has a unique name and a set of transport operations kept in the svc_xprt_ops structure. A transport class can be dynamically registered and unregisterd. The svc_xprt_class represents the module that implements the transport type and keeps reference counts on the module to avoid unloading while there are active users. The endpoint (svc_xprt) is a generic, transport independent endpoint that can be used to send and receive data for an RPC service. It inherits it's operations from the transport class. A transport driver module registers and unregisters itself with svc sunrpc by calling svc_reg_xprt_class, and svc_unreg_xprt_class respectively. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Neil Brown <neilb@suse.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Greg Banks <gnb@sgi.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:07 -05:00
J. Bruce Fields	cb5c7d668e	svcrpc: ensure gss DESTROY tokens free contexts from cache If we don't do this then we'll end up with a pointless unusable context sitting in the cache until the time the original context would have expired. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:07 -05:00
J. Bruce Fields	b39c18fce0	sunrpc: gss: simplify rsi_parse logic Make an obvious simplification that removes a few lines and some unnecessary indentation; no change in behavior. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:07 -05:00
J. Bruce Fields	980e5a40a4	nfsd: fix rsi_cache reference count leak For some reason we haven't been put()'ing the reference count here. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:07 -05:00
J. Bruce Fields	dbf847ecb6	knfsd: allow cache_register to return error on failure Newer server features such as nfsv4 and gss depend on proc to work, so a failure to initialize the proc files they need should be treated as fatal. Thanks to Andrew Morton for style fix and compile fix in case where CONFIG_NFSD_V4 is undefined. Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:05 -05:00
J. Bruce Fields	ffe9386b6e	nfsd: move cache proc (un)registration to separate function Just some minor cleanup. Also I don't see much point in trying to register further proc entries if initial entries fail; so just stop trying in that case. Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:04 -05:00
J. Bruce Fields	df95a9d4fb	knfsd: cache unregistration needn't return error There's really nothing much the caller can do if cache unregistration fails. And indeed, all any caller does in this case is print an error and continue. So just return void and move the printk's inside cache_unregister. Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:04 -05:00
J. Bruce Fields	a490c681cb	knfsd: fix cache.c comment The path here must be left over from some earlier draft; fix it. And do some more minor cleanup while we're there. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:03 -05:00
Chuck Lever	e5cff482c7	SUNRPC: Use unsigned string lengths in xdr_decode_string_inplace XDR strings, opaques, and net objects should all use unsigned lengths. To wit, RFC 4506 says: 4.2. Unsigned Integer An XDR unsigned integer is a 32-bit datum that encodes a non-negative integer in the range [0,4294967295]. ... 4.11. String The standard defines a string of n (numbered 0 through n-1) ASCII bytes to be the number n encoded as an unsigned integer (as described above), and followed by the n bytes of the string. After this patch, xdr_decode_string_inplace now matches the other XDR string and array helpers that take a string length argument. See: xdr_encode_opaque_fixed, xdr_encode_opaque, xdr_encode_array Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-By: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:02 -05:00
Chuck Lever	01b2969a85	SUNRPC: Prevent length underflow in read_flush() Make sure we compare an unsigned length to an unsigned count in read_flush(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-02-01 16:42:02 -05:00
Linus Torvalds	75659ca0c1	Merge branch 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc * 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits) Remove commented-out code copied from NFS NFS: Switch from intr mount option to TASK_KILLABLE Add wait_for_completion_killable Add wait_event_killable Add schedule_timeout_killable Use mutex_lock_killable in vfs_readdir Add mutex_lock_killable Use lock_page_killable Add lock_page_killable Add fatal_signal_pending Add TASK_WAKEKILL exit: Use task_is_* signal: Use task_is_* sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL ptrace: Use task_is_* power: Use task_is_* wait: Use TASK_NORMAL proc/base.c: Use task_is_* proc/array.c: Use TASK_REPORT perfmon: Use task_is_* ... Fixed up conflicts in NFS/sunrpc manually..	2008-02-01 11:45:47 +11:00
travis@sgi.com	df3825c56d	x86: change NR_CPUS arrays in numa_64 Change the following static arrays sized by NR_CPUS to per_cpu data variables: char cpu_to_node_map[NR_CPUS]; Signed-off-by: Mike Travis <travis@sgi.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-01-30 13:33:11 +01:00
Trond Myklebust	34f5b4662b	SUNRPC: Don't bother changing the sigmask for asynchronous RPC calls The caller will never sleep in rpc_execute, so don't bother setting the sigmask. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:10 -05:00
Chuck Lever	afc881124b	SUNRPC: rpcb_getport_sync() passes incorrect address size to rpc_create() The variable "sin" is a pointer, so sizeof(sin) is the size of a pointer, not the size of thing that sin points to. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:09 -05:00
Chuck Lever	67d6021362	SUNRPC: Clean up block comment preceding rpcb_getport_sync() Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:09 -05:00
Chuck Lever	f1ec08cb94	SUNRPC: Use appropriate argument types in rpcb client Clean up: Follow recommendations of Chapter 5 of Documentation/CodingStyle and use "u32" instead of "__u32" for types in definitions that are not shared with user space. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:09 -05:00
Chuck Lever	b91e101fca	SUNRPC: rpcb_getport_sync() should use built-in hostname generator rpc_create() can already fill in the hostname with a string representation of the server's IP address, so remove redundant logic in in rpcb_getport_sync() that does that. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:09 -05:00
Chuck Lever	33e01dc7f5	SUNRPC: Clean up functions that free address_strings array Clean up: document the rule (kfree) and the exceptions (RPC_DISPLAY_PROTO and RPC_DISPLAY_NETID) when freeing the objects in a transport's address_strings array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:08 -05:00
Trond Myklebust	86d61d8638	SUNRPC: Fix up constant string declarations in struct rpcbind_args ...and eliminate an unnecessary cast. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:06 -05:00
Chuck Lever	b454ae9060	SUNRPC: fewer conditionals in the format_ip_address routines Clean up: have the set up routines explicitly pass the strings to be used for the transport name and NETID. This removes a number of conditionals and dependencies on rpc_xprt.prot, which is overloaded. Tighten up type checking on the address_strings array while we're at it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:04 -05:00
Chuck Lever	7df089952f	SUNRPC: Fix use of copy_to_user() in gss_pipe_upcall() The gss_pipe_upcall() function expects the copy_to_user() function to return a negative error value if the call fails, but copy_to_user() returns an unsigned long number of bytes that couldn't be copied. Can rpc_pipefs actually retry a partially completed upcall read? If not, then gss_pipe_upcall() should punt any partial read, just like the upcall logic in net/sunrpc/cache.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:06:00 -05:00
Trond Myklebust	ba7392bb37	SUNRPC: Add support for per-client timeout values In order to be able to support setting the timeo and retrans parameters on a per-mountpoint basis, we move the rpc_timeout structure into the rpc_clnt. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:59 -05:00
Trond Myklebust	2881ae74e6	SUNRPC: Clean up the transport timeout initialisation Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:58 -05:00
Trond Myklebust	698b6d088e	SUNRPC: cleanup for rpc_new_client() There is no reason why we shouldn't just pass the rpc_create_args. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:58 -05:00
Chuck Lever	0fb2b7e945	SUNRPC: Move universal address definitions to global header Universal addresses are defined in RFC 1833 and clarified in RFC 3530. We need to use them in several places in the NFS and RPC clients, so move the relevant definition and block comment to an appropriate global include file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:50 -05:00
Chuck Lever	0a48f5d70f	SUNRPC: RPC version numbers are u32 Clean up: use correct type for RPC version numbers in rpcbind client. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:50 -05:00
Chuck Lever	9f6ad26d2a	SUNRPC: Fix socket address handling in rpcb_clnt Make sure rpcb_clnt passes the correct address length to rpc_create(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:50 -05:00
Chuck Lever	510deb0d70	SUNRPC: rpc_create() default hostname should support AF_INET6 addresses If the ULP doesn't pass a hostname string to rpc_create(), it manufactures one based on the passed-in address. Be smart enough to handle an AF_INET6 address properly in this case. Move the default servername logic before the xprt_create_transport() call to simplify error handling in rpc_create(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:49 -05:00
Chuck Lever	9b45b74ce2	SUNRPC: Remove an unneeded implicit type cast when calling rpc_depopulate() The two arguments of rpc_depopulate() that pass in inode numbers should use the same type as inode->i_ino: unsigned long. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:43 -05:00
Chuck Lever	322e2efe62	SUNRPC: temp var should match return type of xdr_skb_read_actor The return type of xdr_skb_read_actor functions is size_t. This fixes a nit I unwittingly overlooked in commit `dd456471`. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:43 -05:00
Chuck Lever	5d40a8a525	SUNRPC: Check a return result Minor: Replace an empty if statement with a debugging dprintk. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Thomas Talpey <Thomas.Talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:42 -05:00
Chuck Lever	d4b37ff735	SUNRPC: Fix an unnecessary implicit type cast in rpcrdma_count_chunks() Nit: rl_nchunks is an unsigned integer, so pass it into rpcrdma_count_chunks() via an unsigned integer argument. This eliminates a harmless mixed sign comparison in rpcrdma_count_chunks() Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Thomas Talpey <Thomas.Talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:42 -05:00
Chuck Lever	2a428b2b8f	SUNRPC: Prevent mixed sign comparisons in rpcrdma_convert_iovs() Keep the type of the buffer position the same during iovec conversion to reduce the likelihood of unexpected results from comparisons and length computations. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Thomas Talpey <Thomas.Talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:41 -05:00
Trond Myklebust	a4a874990c	SUNRPC: Cleanup to remove the last users of the RPC_WAITQ declaration Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:41 -05:00
Trond Myklebust	47fe064831	SUNRPC: Unexport rpc_init_task() and rpc_execute() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:40 -05:00
Trond Myklebust	e8f5d77c80	SUNRPC: allow the caller of rpc_run_task to preallocate the struct rpc_task Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:38 -05:00
Trond Myklebust	b5627943ab	SUNRPC: Remove the now unused function rpc_call_setup() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:35 -05:00
Trond Myklebust	5138fde011	NFS/SUNRPC: Convert all users of rpc_call_setup() Replace use of rpc_call_setup() with rpc_init_task(), and in cases where we need to initialise task->tk_action, with rpc_call_start(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:32 -05:00
Trond Myklebust	b3ef8b3bb9	SUNRPC: Allow rpc_init_task() to initialise the rpc_task->tk_msg In preparation for the removal of rpc_call_setup(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:31 -05:00
Trond Myklebust	77de2c590e	SUNRPC: Add a helper rpc_call_start() that initialises task->tk_action Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:31 -05:00
Trond Myklebust	5085925902	SUNRPC: Mask signals across the call to rpc_call_setup() in rpc_run_task To ensure that the RPCSEC_GSS upcall is performed with the correct sigmask. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:31 -05:00
Trond Myklebust	3ff7576dda	SUNRPC: Clean up the initialisation of priority queue scheduling info. We want the default scheduling priority (priority == 0) to remain RPC_PRIORITY_NORMAL. Also ensure that the priority wait queue scheduling is per process id instead of sometimes being per thread, and sometimes being per inode. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:30 -05:00
Trond Myklebust	c970aa85e7	SUNRPC: Clean up rpc_run_task Make it use the new task initialiser structure instead of acting as a wrapper. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:30 -05:00
Trond Myklebust	84115e1cd4	SUNRPC: Cleanup of rpc_task initialisation Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:30 -05:00
Trond Myklebust	e8914c65f7	SUNRPC: Restrict sunrpc client exports The sunrpc client exports are not meant to be part of any official kernel API: they can change at the drop of a hat. Mark them as internal functions using EXPORT_SYMBOL_GPL. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:28 -05:00
Trond Myklebust	a6eaf8bdf9	SUNRPC: Move exported declarations to the function declarations Do this for all RPC client related functions and XDR functions. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:28 -05:00
J. Bruce Fields	93a44a75b9	sunrpc: document the rpc_pipefs kernel api Add kerneldoc comments for the rpc_pipefs.c functions that are exported. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:27 -05:00
Trond Myklebust	663b8858dd	SUNRPC: Reconnect immediately whenever the server isn't refusing it. If we've disconnected from the server, rather than the other way round, then it makes little sense to wait 3 seconds before reconnecting. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:27 -05:00
Trond Myklebust	62da3b2488	SUNRPC: Rename xprt_disconnect() xprt_disconnect() should really only be called when the transport shutdown is completed, and it is time to wake up any pending tasks. Rename it to xprt_disconnect_done() in order to reflect the semantical change. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:27 -05:00
Trond Myklebust	3ebb067d92	SUNRPC: Make call_status()/call_decode() call xprt_force_disconnect() Move the calls to xprt_disconnect() over to xprt_force_disconnect() in order to enable the transport layer to manage the state of the XPRT_CONNECTED flag. Ditto in xs_tcp_read_fraghdr(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:26 -05:00
Trond Myklebust	7272dcd31d	SUNRPC: xprt_autoclose() should not call xprt_disconnect() The transport layer should do that itself whenever appropriate. Note that the RDMA transport already assumes that it needs to call xprt_disconnect in xprt_rdma_close(). For TCP sockets, we want to call xprt_disconnect() only after the connection has been closed by both ends. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:26 -05:00
Trond Myklebust	e06799f958	SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket By using shutdown() rather than close() we allow the RPC client to wait for the TCP close handshake to complete before we start trying to reconnect using the same port. We use shutdown(SHUT_WR) only instead of shutting down both directions, however we wait until the server has closed the connection on its side. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:26 -05:00
Trond Myklebust	ef80367071	SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:25 -05:00
Trond Myklebust	3b948ae5be	SUNRPC: Allow the client to detect if the TCP connection is closed Add an xprt->state bit to enable the TCP ->state_change() method to signal whether or not the TCP connection is in the process of closing down. This will to be used by the reconnection logic in a separate patch. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:25 -05:00
Trond Myklebust	67a391d72c	SUNRPC: Fix TCP rebinding logic Currently the TCP rebinding logic assumes that if we're not using a reserved port, then we don't need to reconnect on the same port if a disconnection event occurs. This breaks most RPC duplicate reply cache implementations. Also take into account the fact that xprt_min_resvport and xprt_max_resvport may change while we're reconnecting, since the user may change them at any time via the sysctls. Ensure that we check the port boundaries every time we loop in xs_bind4/xs_bind6. Also ensure that if the boundaries change, we only scan the ports a maximum of 2 times. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:25 -05:00
Trond Myklebust	66af1e5585	SUNRPC: Fix a race in xs_tcp_state_change() When scheduling the autoclose RPC call, we want to ensure that we don't race against the test_bit() call in xprt_clear_locked(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-30 02:05:24 -05:00
Eric Dumazet	9a429c4983	[NET]: Add some acquires/releases sparse annotations. Add __acquires() and __releases() annotations to suppress some sparse warnings. example of warnings : net/ipv4/udp.c:1555:14: warning: context imbalance in 'udp_seq_start' - wrong count at exit net/ipv4/udp.c:1571:13: warning: context imbalance in 'udp_seq_stop' - unexpected unlock Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 15:00:31 -08:00
YOSHIFUJI Hideaki	8d614434ab	[SUNRPC]: Use htonl() where appropriate. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:58:05 -08:00
Herbert Xu	1781f7f580	[UDP]: Restore missing inDatagrams increments The previous move of the the UDP inDatagrams counter caused the counting of encapsulated packets, SUNRPC data (as opposed to call) packets and RXRPC packets to go missing. This patch restores all of these. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:56:33 -08:00
Pavel Emelyanov	b24b8a247f	[NET]: Convert init_timer into setup_timer Many-many code in the kernel initialized the timer->function and timer->data together with calling init_timer(timer). There is already a helper for this. Use it for networking code. The patch is HUGE, but makes the code 130 lines shorter (98 insertions(+), 228 deletions(-)). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-28 14:53:35 -08:00
James Morris	3392c34922	NFS: add newline to kernel warning message in auth_gss code Add newline to kernel warning message in gss_create(). Signed-off-by: James Morris <jmorris@namei.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-01-03 09:37:16 -05:00
James Lentini	50e1092b3a	SUNRPC xprtrdma: fix XDR tail buf marshalling for all ops rpcrdma_convert_iovs is passed an xdr_buf representing either an RPC request or an RPC reply. In the case of a request, several calculations and tests involving pos are unnecessary. In the case of a reply, several calculations and tests involving pos are incorrect (the code tests pos against the reply xdr buf's len field, which is always 0 at the time rpcrdma_convert_iovs is executed). This change removes the incorrect/unnecessary calculations and tests involving pos. This fixes an observed problem when reading certain file sizes over NFS/RDMA. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-12-11 22:01:59 -05:00
Matthew Wilcox	150030b78a	NFS: Switch from intr mount option to TASK_KILLABLE By using the TASK_KILLABLE infrastructure, we can get rid of the 'intr' mount option. We have to use _killable everywhere instead of _interruptible as we get rid of rpc_clnt_sigmask/sigunmask. Signed-off-by: Liam R. Howlett <howlett@gmail.com> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>	2007-12-06 17:40:25 -05:00
Linus Torvalds	8c27eba549	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (41 commits) [XFRM]: Fix leak of expired xfrm_states [ATM]: [he] initialize lock and tasklet earlier [IPV4]: Remove bogus ifdef mess in arp_process [SKBUFF]: Free old skb properly in skb_morph [IPV4]: Fix memory leak in inet_hashtables.h when NUMA is on [IPSEC]: Temporarily remove locks around copying of non-atomic fields [TCP] MTUprobe: Cleanup send queue check (no need to loop) [TCP]: MTUprobe: receiver window & data available checks fixed [MAINTAINERS]: tlan list is subscribers-only [SUNRPC]: Remove SPIN_LOCK_UNLOCKED [SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static [PFKEY]: Sending an SADB_GET responds with an SADB_GET [IRDA]: Compilation for CONFIG_INET=n case [IPVS]: Fix compiler warning about unused register_ip_vs_protocol [ARP]: Fix arp reply when sender ip 0 [IPV6] TCPMD5: Fix deleting key operation. [IPV6] TCPMD5: Check return value of tcp_alloc_md5sig_pool(). [IPV4] TCPMD5: Use memmove() instead of memcpy() because we have overlaps. [IPV4] TCPMD5: Omit redundant NULL check for kfree() argument. ieee80211: Stop net_ratelimit/IEEE80211_DEBUG_DROP log pollution ...	2007-11-26 20:09:07 -08:00
Joe Perches	014313a9d6	SUNRPC: Add missing "space" to net/sunrpc/auth_gss.c Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:59 -05:00
Adrian Bunk	483066d62e	SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static xs_setup_{udp,tcp}() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:50 -05:00
James Lentini	cfcb43ff7c	SUNRPC: remove NFS/RDMA client's binary sysctls Support for binary sysctls is being deprecated in 2.6.24. Since there are no applications using the NFS/RDMA client's binary sysctls, it makes sense to remove them. The patch below does this while leaving the /proc/sys interface unchanged. Please consider this for 2.6.24. Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:21:19 -05:00
Jiri Slaby	5ba03e82b3	[SUNRPC]: Remove SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED is deprecated, use DEFINE_SPINLOCK instead Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-11-22 19:40:22 +08:00
Adrian Bunk	5fe4a33430	[SUNRPC]: Make xprtsock.c:xs_setup_{udp,tcp}() static xs_setup_{udp,tcp}() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-11-22 19:38:25 +08:00
Joe Perches	c3c9d45828	[SUNRPC]: Add missing "space" Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-19 23:48:08 -08:00
J. Bruce Fields	eda4f9b799	sunrpc: rpc_pipe_poll may miss available data in some cases Pipe messages start out life on a queue on the inode, but when first read they're moved to the filp's private pointer. So it's possible for a poll here to return null even though there's a partially read message available. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-17 13:08:47 -05:00
Kevin Coffman	ef338bee3f	sunrpc: return error if unsupported enctype or cksumtype is encountered Return an error from gss_import_sec_context_kerberos if the negotiated context contains encryption or checksum types not supported by the kernel code. This fixes an Oops because success was assumed and later code found no internal_ctx_id. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-17 13:08:46 -05:00
Kevin Coffman	ffc40f5692	sunrpc: gss_pipe_downcall(), don't assume all errors are transient Instead of mapping all errors except EACCES to EAGAIN, map all errors except EAGAIN to EACCES. An example is user-land negotiating a Kerberos context with an encryption type that is not supported by the kernel code. (This can happen due to mis-configuration or a bug in the Kerberos code that does not honor our request to limit the encryption types negotiated.) This failure is not transient, and returning EAGAIN causes mount to continuously retry rather than giving up. Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-17 13:08:45 -05:00
Adrian Bunk	d5cd97872d	sunrpc/xprtrdma/transport.c: fix use-after-free Fix an obvious use-after-free spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:41 -08:00
Jens Axboe	c46f2334c8	[SG] Get rid of __sg_mark_end() sg_mark_end() overwrites the page_link information, but all users want __sg_mark_end() behaviour where we just set the end bit. That is the most natural way to use the sg list, since you'll fill it in and then mark the end point. So change sg_mark_end() to only set the termination bit. Add a sg_magic debug check as well, and clear a chain pointer if it is set. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:06 +01:00
Adrian Bunk	87ae9afdca	cleanup asm/scatterlist.h includes Not architecture specific code should not #include <asm/scatterlist.h>. This patch therefore either replaces them with #include <linux/scatterlist.h> or simply removes them if they were unused. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:06 +01:00
David S. Miller	51c739d1f4	[NET]: Fix incorrect sg_mark_end() calls. This fixes scatterlist corruptions added by commit `68e3f5dd4d` [CRYPTO] users: Fix up scatterlist conversion errors The issue is that the code calls sg_mark_end() which clobbers the sg_page() pointer of the final scatterlist entry. The first part fo the fix makes skb_to_sgvec() do __sg_mark_end(). After considering all skb_to_sgvec() call sites the most correct solution is to call __sg_mark_end() in skb_to_sgvec() since that is what all of the callers would end up doing anyways. I suspect this might have fixed some problems in virtio_net which is the sole non-crypto user of skb_to_sgvec(). Other similar sg_mark_end() cases were converted over to __sg_mark_end() as well. Arguably sg_mark_end() is a poorly named function because it doesn't just "mark", it clears out the page pointer as a side effect, which is what led to these bugs in the first place. The one remaining plain sg_mark_end() call is in scsi_alloc_sgtable() and arguably it could be converted to __sg_mark_end() if only so that we can delete this confusing interface from linux/scatterlist.h Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:29:29 -07:00
J. Bruce Fields	521c2a43b2	[SUNRPC]: fix rpc debugging Commit `baa3a2a0d2`, by removing initialization of the ctl_name field, broke this conditional, preventing the display of rpc_tasks that you previously got when turning on rpc debugging. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 01:07:15 -07:00
Stephen Rothwell	e08a132b0e	[SUNRPC] rpc_rdma: we need to cast u64 to unsigned long long for printing as some architectures have unsigned long for u64. net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_create_chunks': net/sunrpc/xprtrdma/rpc_rdma.c:222: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/sunrpc/xprtrdma/rpc_rdma.c:234: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_count_chunks': net/sunrpc/xprtrdma/rpc_rdma.c:577: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64 Noticed on PowerPC pseries_defconfig build. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 00:44:32 -07:00
Al Viro	2d8a972661	SUNRPC endianness annotations rpcrdma stuff lacks endianness annotations for on-the-wire data. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-29 07:41:32 -07:00
Herbert Xu	68e3f5dd4d	[CRYPTO] users: Fix up scatterlist conversion errors This patch fixes the errors made in the users of the crypto layer during the sg_init_table conversion. It also adds a few conversions that were missing altogether. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-27 00:52:07 -07:00
Jens Axboe	642f149031	SG: Change sg_set_page() to take length and offset argument Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-24 11:20:47 +02:00
Jens Axboe	fa05f1286b	Update net/ to use sg helpers Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:19:56 +02:00
Pavel Emelyanov	ba25f9dcc4	Use helpers to obtain task pid in printks The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Eric W. Biederman	baa3a2a0d2	sysctl: remove broken sunrpc debug binary sysctls This is debug code so no need to support binary sysctl, and the binary sysctls as they were written were not consistent with what showed up in /proc so remove the binary sysctl support. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Alexey Dobriyan <adobriyan@sw.ru> Cc: "David S. Miller" <davem@davemloft.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:22 -07:00
Christoph Lameter	4ba9b9d0ba	Slab API: remove useless ctor parameter and reorder parameters Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void object, struct kmem_cache s, unsigned long flags) to ctor(struct kmem_cache s, void object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:42:45 -07:00
Andrew Morton	a56daeb7d5	net/sunrpc/xprtrdma/verbs.c printk warning fix sparc64: net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 3) net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 4) Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "David S. Miller" <davem@davemloft.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:43:23 -07:00
Linus Torvalds	f4921aff5b	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: (131 commits) NFSv4: Fix a typo in nfs_inode_reclaim_delegation NFS: Add a boot parameter to disable 64 bit inode numbers NFS: nfs_refresh_inode should clear cache_validity flags on success NFS: Fix a connectathon regression in NFSv3 and NFSv4 NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode SUNRPC: Don't call xprt_release in call refresh SUNRPC: Don't call xprt_release() if call_allocate fails SUNRPC: Fix buggy UDP transmission [23/37] Clean up duplicate includes in [2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static SUNRPC: Use correct type in buffer length calculations SUNRPC: Fix default hostname created in rpc_create() nfs: add server port to rpc_pipe info file NFS: Get rid of some obsolete macros NFS: Simplify filehandle revalidation NFS: Ensure that nfs_link() returns a hashed dentry NFS: Be strict about dentry revalidation when doing exclusive create NFS: Don't zap the readdir caches upon error NFS: Remove the redundant nfs_reval_fsid() NFSv3: Always use directory post-op attributes in nfs3_proc_lookup ... Fix up trivial conflict due to sock_owned_by_user() cleanup manually in net/sunrpc/xprtsock.c	2007-10-15 10:47:35 -07:00
Linus Torvalds	37ca506adc	Merge branch 'nfs-server-stable' of git://linux-nfs.org/~bfields/linux * 'nfs-server-stable' of git://linux-nfs.org/~bfields/linux: knfsd: query filesystem for NFSv4 getattr of FATTR4_MAXNAME knfsd: nfsv4 delegation recall should take reference on client knfsd: don't shutdown callbacks until nfsv4 client is freed knfsd: let nfsd manage timing out its own leases knfsd: Add source address to sunrpc svc errors knfsd: 64 bit ino support for NFS server svcgss: move init code into separate function knfsd: remove code duplication in nfsd4_setclientid() nfsd warning fix knfsd: fix callback rpc cred knfsd: move nfsv4 slab creation/destruction to module init/exit knfsd: spawn kernel thread to probe callback channel knfsd: nfs4 name->id mapping not correctly parsing negative downcall knfsd: demote some printk()s to dprintk()s knfsd: cleanup of nfsd4 cmp_* functions knfsd: delete code made redundant by map_new_errors nfsd: fix horrible indentation in nfsd_setattr nfsd: remove unused cache_for_each macro nfsd: tone down inaccurate dprintk	2007-10-15 08:16:53 -07:00
Pavel Emelyanov	ec93103519	[SUNRPC]: Make the sunrpc use the seq_open_private() Just switch to the consolidated code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:55:36 -07:00
Eric W. Biederman	457c4cbc5a	[NET]: Make /proc/net per network namespace This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:06 -07:00
John Heffner	02b3d34631	[NET] Cleanup: Use sock_owned_by_user() macro Changes asserts in sunrpc to use sock_owned_by_user() macro instead of referencing sock_lock.owner directly. Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:49:00 -07:00
Ilpo Järvinen	172589ccdd	[NET]: DIV_ROUND_UP cleanup (part two) Hopefully captured all single statement cases under net/. I'm not too sure if there is some policy about #includes that are "guaranteed" (ie., in the current tree) to be available through some other #included header, so I just added linux/kernel.h to each changed file that didn't #include it previously. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-10 16:48:37 -07:00
Dr. David Alan Gilbert	354ecbb9dd	knfsd: Add source address to sunrpc svc errors This patch adds the address of the client that caused an error in sunrpc/svc.c so that you get errors that look like: svc: 192.168.66.28, port=709: unknown version (3 for prog 100003, nfsd) I've seen machines which get bunches of unknown version or similar errors from time to time, and while the recent patch to add the service helps to find which service has the wrong version it doesn't help find the potentially bad client. The patch is against a checkout of Linus's git tree made on 2007-08-24. One observation is that the svc_print_addr function prints to a buffer which in this case makes life a little more complex; it just feels as if there must be lots of places that print a connection address - is there a better function to use anywhere? I think actually there are a few places with semi duplicated code; e.g. one_sock_name switches on the address family but only currently has IPV4; I wonder how many other places are similar. Signed-off-by: Dave Gilbert <linux@treblig.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Neil Brown <neilb@suse.de>	2007-10-09 18:31:57 -04:00
J. Bruce Fields	21fcd02be3	svcgss: move init code into separate function We've let svcauth_gss_accept() get much too long and hairy. The RPC_GSS_PROC_INIT and RPC_GSS_PROC_CONTINUE_INIT cases share very little with the other cases, so it's very natural to split them off into a separate function. This will also nicely isolate the piece of code we need to parametrize to authenticating gss-protected NFSv4 callbacks on behalf of the NFS client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Neil Brown <neilb@suse.de>	2007-10-09 18:31:57 -04:00
Trond Myklebust	220bcc2afd	SUNRPC: Don't call xprt_release in call refresh Call it from call_verify() instead... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:42 -04:00
Trond Myklebust	b6e9c713f5	SUNRPC: Don't call xprt_release() if call_allocate fails It completely fouls up the RPC call statistics, and serves no useful purpose. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:40 -04:00
Trond Myklebust	2199700f1d	SUNRPC: Fix buggy UDP transmission xs_sendpages() may return a negative result. We sure as hell don't want to add that to the 'tk_bytes_sent' tally... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:37 -04:00
Adrian Bunk	7f4adeff6f	[2.6 patch] net/sunrpc/rpcb_clnt.c: make struct rpcb_program static This patch makes the needlessly global struct rpcb_program static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:32 -04:00
Chuck Lever	67f97d83bf	SUNRPC: Use correct type in buffer length calculations Use correct type signage in gss_krb5_remove_padding() when doing length calculations. Both xdr_buf.len and iov.iov_len are size_t, which is unsigned; so use an unsigned type for our temporary length variable to ensure we don't overflow it.. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:30 -04:00
J. Bruce Fields	afde94f398	SUNRPC: Fix default hostname created in rpc_create() Since 43780b87fa7..., rpc_create() fills in a default hostname based on the ip address if the servername passed in is null. A small typo made that default incorrect. (But this information appears to be used only for debugging right now, so I don't believe the typo causes any bugs in the current kernel.) Thanks to Olga Kornievskaia for bug report and testing. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Olga Kornievskaia <aglo@citi.umich.edu> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:28 -04:00
J. Bruce Fields	bf19aacecb	nfs: add server port to rpc_pipe info file On the client, when an alternate server port is specified on the mount commandline, we need to make sure gssd knows about it. Also, on the server side, when we're sending krb5 callbacks to the client, we'll use the same mechanism to let gssd know about the callback port. Thanks to Olga Kornievskaia for testing and for an earlier implementation. Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Cc: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:20:25 -04:00
Chuck Lever	1321d8d971	SUNRPC: Fix bytes-per-op accounting for RPC over UDP NFS performance metrics reported zero bytes sent per op when mounting with UDP. The UDP socket transport wasn't properly counting the number of bytes sent. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:19 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	c56c65fb67	RPCRDMA: rpc rdma verbs interface implementation This implements the interface from rpcrdma to the RDMA verbs interface supported by Infniband and iWARP. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:08 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	e96018280c	RPCRDMA: rpc rdma protocol implementation This implements the marshaling and unmarshaling of the rpcrdma transport headers. Connection management is also addressed. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:06 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	f58851e6b0	RPCRDMA: rpc rdma transport switch This implements the configuration and building of the core transport switch implementation of the rpcrdma transport. Stubs are provided for the rpcrdma protocol handling, and the infiniband/iwarp verbs interface. These are provided in following patches. Signed-off-by: Tom Talpey <talpey@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:18:03 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	0896a725a1	NFS/SUNRPC: use transport protocol naming Instead of an { address family, raw IP protocol number }-tuple, use the newly-defined RPC identifier when creating clients in the upper layers. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:53 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	4fa016eb24	NFS/SUNRPC: support transport protocol naming To prepare for including non-sockets-based RPC transports, select RPC transports by an identifier (to be used in following patches). Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:50 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	49c36fcc44	SUNRPC: rearrange RPC sockets definitions To prepare for including non-sockets-based RPC transports, move the sockets-dependent definitions into their own file. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:48 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	3c341b0b92	SUNRPC: rename the rpc_xprtsock_create structure To prepare for including non-sockets-based RPC transports, change the overly suggestive name of the transport creation arguments struct. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:45 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	bc25571e21	SUNRPC: Finish API to load RPC transport implementations dynamically Allow RPC client transport implementations to be loaded as needed, or as they become available from distributors or third-party vendors. Note that we leave the IP sockets implementation in sunrpc.o permanently, as IP functionality is always available in any kernel that runs NFS. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:42 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	81c098af3d	SUNRPC: Provide a new API for registering transport implementations To allow transport capabilities to be loaded dynamically, provide an API for registering and unregistering the transports with the RPC client. Eventually xprt_create_transport() will be changed to search the list of registered transports when initializing a fresh transport. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:40 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	1244480976	SUNRPC: add EXPORT_SYMBOL_GPL for generic transport functions SUNRPC: add EXPORT_SYMBOL_GPL for generic transport functions As a preface to allowing arbitrary transport modules to be loaded dynamically, add EXPORT_SYMBOL_GPL for all generic transport functions that a transport implementation might want to use. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:36 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	4f22ccc346	SUNRPC: mark bulk read/write data in xdrbuf Adds a flag word to the xdrbuf struct which indicates any bulk disposition of the data. This enables RPC transport providers to marshal it efficiently/appropriately, and may enable other optimizations. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:34 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	4417c8c41a	SUNRPC: export per-transport rpcbind netid's The rpcbind (v3+) netid is provided by each RPC client transport. This fixes an omission in IPv6 rpcbind client support, and enables future extension. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:20 -04:00
$\"Talpey, Thomas\$ \"Talpey, Thomas\	4f40ee4a02	SUNRPC: move per-transport rpcbind netid's Move the TCP/UDP rpcbind netid's from the rpcbind client to a global header. Signed-off-by: Tom Talpey <tmt@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:17:18 -04:00
Chuck Lever	b79dc8ced1	SUNRPC: RPC bind failures should be permanent for NULL requests The purpose of an RPC ping (a NULL request) is to determine whether the remote end is operating and supports the RPC program and version of the request. If we do an RPC bind and the remote's rpcbind service says "this program or service isn't supported" then we have our answer already, and we should give up immediately. This is good for the kernel mount client, as it will cause the request to fail, and then allow an immediate retry with different options. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:16:58 -04:00
Chuck Lever	906462af4c	SUNRPC: Split another new rpcbind retry error code from EACCES Add more new error code processing to the kernel's rpcbind client and to call_bind_status() to distinguish two cases: Case 1: the remote has replied that the program/version tuple is not registered (returns EACCES) Case 2: retry with a lesser rpcbind version (rpcb now returns EPFNOSUPPORT) This change allows more specific error processing for each of these two cases. We now fail case 2 instead of retrying... it's a server configuration error not to support even rpcbind version 2. And don't expose this new error code to user land -- convert it to EIO before failing the RPC. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:16:56 -04:00
Chuck Lever	2429cbf6a1	SUNRPC: Add a new error code for retry waiting for another binder Add new error code processing to the kernel's rpcbind client and to call_bind_status() to distinguish two cases: Case 1: the remote has replied that the program/version tuple is not registered (returns -EACCES) Case 2: another process is already in the middle of binding on this transport (now returns -EAGAIN) This change allows more specific retry processing for each of these two cases. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:16:53 -04:00
Chuck Lever	4784cb51a3	SUNRPC: Retry bad rpcbind replies When a server returns a bad rpcbind reply, make rpcbind client recovery logic retry with an older protocol version. Older versions are more likely to work correctly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:16:51 -04:00
Chuck Lever	e65fe3976f	SUNRPC: Make rpcb_decode_getaddr more picky about universal addresses Add better sanity checking of server replies to the GETVERSADDR reply decoder. Change the error return code: EIO is what other XDR decoding routines return if there is a failure while decoding. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-09 17:16:48 -04:00

... 11 12 13 14 15 ...

1767 Commits