linux

Author	SHA1	Message	Date
Konstantin Khorenko	3aa6e0aa8a	NFSD: memory corruption due to writing beyond the stat array If nfsd fails to find an exported via NFS file in the readahead cache, it should increment corresponding nfsdstats counter (ra_depth[10]), but due to a bug it may instead write to ra_depth[11], corrupting the following field. In a kernel with NFSDv4 compiled in the corruption takes the form of an increment of a counter of the number of NFSv4 operation 0's received; since there is no operation 0, this is harmless. In a kernel with NFSDv4 disabled it corrupts whatever happens to be in the memory beyond nfsdstats. Signed-off-by: Konstantin Khorenko <khorenko@openvz.org> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
Benny Halevy	0af3f814cc	NFSD: use nfserr for status after decode_cb_op_status Bugs introduced in `85a5648019` "NFSD: Update XDR decoders in NFSv4 callback client" Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:35:18 -05:00
J. Bruce Fields	541ce98c10	nfsd: don't leak dentry count on mnt_want_write failure The exit cleanup isn't quite right here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-02-14 10:31:08 -05:00
Linus Torvalds	f8206b925f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (23 commits) sanitize vfsmount refcounting changes fix old umount_tree() breakage autofs4: Merge the remaining dentry ops tables Unexport do_add_mount() and add in follow_automount(), not ->d_automount() Allow d_manage() to be used in RCU-walk mode Remove a further kludge from __do_follow_link() autofs4: Bump version autofs4: Add v4 pseudo direct mount support autofs4: Fix wait validation autofs4: Clean up autofs4_free_ino() autofs4: Clean up dentry operations autofs4: Clean up inode operations autofs4: Remove unused code autofs4: Add d_manage() dentry operation autofs4: Add d_automount() dentry operation Remove the automount through follow_link() kludge code from pathwalk CIFS: Use d_automount() rather than abusing follow_link() NFS: Use d_automount() rather than abusing follow_link() AFS: Use d_automount() rather than abusing follow_link() Add an AT_NO_AUTOMOUNT flag to suppress terminal automount ...	2011-01-16 11:31:50 -08:00
David Howells	cc53ce53c8	Add a dentry op to allow processes to be held during pathwalk transit Add a dentry op (d_manage) to permit a filesystem to hold a process and make it sleep when it tries to transit away from one of that filesystem's directories during a pathwalk. The operation is keyed off a new dentry flag (DCACHE_MANAGE_TRANSIT). The filesystem is allowed to be selective about which processes it holds and which it permits to continue on or prohibits from transiting from each flagged directory. This will allow autofs to hold up client processes whilst letting its userspace daemon through to maintain the directory or the stuff behind it or mounted upon it. The ->d_manage() dentry operation: int (d_manage)(struct path path, bool mounting_here); takes a pointer to the directory about to be transited away from and a flag indicating whether the transit is undertaken by do_add_mount() or do_move_mount() skipping through a pile of filesystems mounted on a mountpoint. It should return 0 if successful and to let the process continue on its way; -EISDIR to prohibit the caller from skipping to overmounted filesystems or automounting, and to use this directory; or some other error code to return to the user. ->d_manage() is called with namespace_sem writelocked if mounting_here is true and no other locks held, so it may sleep. However, if mounting_here is true, it may not initiate or wait for a mount or unmount upon the parameter directory, even if the act is actually performed by userspace. Within fs/namei.c, follow_managed() is extended to check with d_manage() first on each managed directory, before transiting away from it or attempting to automount upon it. follow_down() is renamed follow_down_one() and should only be used where the filesystem deliberately intends to avoid management steps (e.g. autofs). A new follow_down() is added that incorporates the loop done by all other callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS and CIFS do use it, their use is removed by converting them to use d_automount()). The new follow_down() calls d_manage() as appropriate. It also takes an extra parameter to indicate if it is being called from mount code (with namespace_sem writelocked) which it passes to d_manage(). follow_down() ignores automount points so that it can be used to mount on them. __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to sleep. It would be possible to enter d_manage() in rcu-walk mode too, and have that determine whether to abort or not itself. That would allow the autofs daemon to continue on in rcu-walk mode. Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't required as every tranist from that directory will cause d_manage() to be invoked. It can always be set again when necessary. ========================== WHAT THIS MEANS FOR AUTOFS ========================== Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to trigger the automounting of indirect mounts, and both of these can be called with i_mutex held. autofs knows that the i_mutex will be held by the caller in lookup(), and so can drop it before invoking the daemon - but this isn't so for d_revalidate(), since the lock is only held on _some_ of the code paths that call it. This means that autofs can't risk dropping i_mutex from its d_revalidate() function before it calls the daemon. The bug could manifest itself as, for example, a process that's trying to validate an automount dentry that gets made to wait because that dentry is expired and needs cleaning up: mkdir S ffffffff8014e05a 0 32580 24956 Call Trace: [<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897 [<ffffffff80127f7d>] avc_has_perm+0x46/0x58 [<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e [<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b [<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149 [<ffffffff80036d96>] __lookup_hash+0xa0/0x12f [<ffffffff80057a2f>] lookup_create+0x46/0x80 [<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4 versus the automount daemon which wants to remove that dentry, but can't because the normal process is holding the i_mutex lock: automount D ffffffff8014e05a 0 32581 1 32561 Call Trace: [<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b [<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1 [<ffffffff80063c89>] .text.lock.mutex+0xf/0x14 [<ffffffff800e6d55>] do_rmdir+0x77/0xde [<ffffffff8005d229>] tracesys+0x71/0xe0 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 which means that the system is deadlocked. This patch allows autofs to hold up normal processes whilst the daemon goes ahead and does things to the dentry tree behind the automouter point without risking a deadlock as almost no locks are held in d_manage() and none in d_automount(). Signed-off-by: David Howells <dhowells@redhat.com> Was-Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-01-15 20:07:31 -05:00
Linus Torvalds	18bce371ae	Merge branch 'for-2.6.38' of git://linux-nfs.org/~bfields/linux * 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits) nfsd4: fix callback restarting nfsd: break lease on unlink, link, and rename nfsd4: break lease on nfsd setattr nfsd: don't support msnfs export option nfsd4: initialize cb_per_client nfsd4: allow restarting callbacks nfsd4: simplify nfsd4_cb_prepare nfsd4: give out delegations more quickly in 4.1 case nfsd4: add helper function to run callbacks nfsd4: make sure sequence flags are set after destroy_session nfsd4: re-probe callback on connection loss nfsd4: set sequence flag when backchannel is down nfsd4: keep finer-grained callback status rpc: allow xprt_class->setup to return a preexisting xprt rpc: keep backchannel xprt as long as server connection rpc: move sk_bc_xprt to svc_xprt nfsd4: allow backchannel recovery nfsd4: support BIND_CONN_TO_SESSION nfsd4: modify session list under cl_lock Documentation: fl_mylease no longer exists ... Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work. The vfs-scale work touched some msnfs cases, and this merge removes support for that entirely, so the conflict was trivial to resolve.	2011-01-14 13:17:26 -08:00
J. Bruce Fields	a8f2800b4f	nfsd4: fix callback restarting Ensure a new callback is added to the client's list of callbacks at most once. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-14 14:51:31 -05:00
J. Bruce Fields	4795bb37ef	nfsd: break lease on unlink, link, and rename Any change to any of the links pointing to an entry should also break delegations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:09 -05:00
J. Bruce Fields	6a76bebefe	nfsd4: break lease on nfsd setattr Leases (delegations) should really be broken on any metadata change, not just on size change. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:08 -05:00
J. Bruce Fields	9ce137eee4	nfsd: don't support msnfs export option We've long had these pointless #ifdef MSNFS's sprinkled throughout the code--pointless because MSNFS is always defined (and we give no config option to make that easy to change). So we could just remove the ifdef's and compile the resulting code unconditionally. But as long as we're there: why not just rip out this code entirely? The only purpose is to implement the "msnfs" export option which turns on Windows-like behavior in some cases, and: - the export option isn't documented anywhere; - the userland utilities (which would need to be able to parse "msnfs" in an export file) don't support it; - I don't know how to maintain this, as I don't know what the proper behavior is; and - google shows no evidence that anyone has ever used this. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:07 -05:00
J. Bruce Fields	9ee1ba5402	nfsd4: initialize cb_per_client Otherwise a callback that is aborted before it runs will result in a list_del on an uninitialized list head. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-13 21:04:06 -05:00
Linus Torvalds	275220f0fc	Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits) block: ensure that completion error gets properly traced blktrace: add missing probe argument to block_bio_complete block cfq: don't use atomic_t for cfq_group block cfq: don't use atomic_t for cfq_queue block: trace event block fix unassigned field block: add internal hd part table references block: fix accounting bug on cross partition merges kref: add kref_test_and_get bio-integrity: mark kintegrityd_wq highpri and CPU intensive block: make kblockd_workqueue smarter Revert "sd: implement sd_check_events()" block: Clean up exit_io_context() source code. Fix compile warnings due to missing removal of a 'ret' variable fs/block: type signature of major_to_index(int) to major_to_index(unsigned) block: convert !IS_ERR(p) && p to !IS_ERR_NOR_NULL(p) cfq-iosched: don't check cfqg in choose_service_tree() fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors cdrom: export cdrom_check_events() sd: implement sd_check_events() sr: implement sr_check_events() ...	2011-01-13 10:45:01 -08:00
Linus Torvalds	b9d919a4ac	Merge branch 'nfs-for-2.6.38' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.38' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (89 commits) NFS fix the setting of exchange id flag NFS: Don't use vm_map_ram() in readdir NFSv4: Ensure continued open and lockowner name uniqueness NFS: Move cl_delegations to the nfs_server struct NFS: Introduce nfs_detach_delegations() NFS: Move cl_state_owners and related fields to the nfs_server struct NFS: Allow walking nfs_client.cl_superblocks list outside client.c pnfs: layout roc code pnfs: update nfs4_callback_recallany to handle layouts pnfs: add CB_LAYOUTRECALL handling pnfs: CB_LAYOUTRECALL xdr code pnfs: change lo refcounting to atomic_t pnfs: check that partial LAYOUTGET return is ignored pnfs: add layout to client list before sending rpc pnfs: serialize LAYOUTGET(openstateid) pnfs: layoutget rpc code cleanup pnfs: change how lsegs are removed from layout list pnfs: change layout state seqlock to a spinlock pnfs: add prefix to struct pnfs_layout_hdr fields pnfs: add prefix to struct pnfs_layout_segment fields ...	2011-01-11 15:11:56 -08:00
J. Bruce Fields	5ce8ba25d6	nfsd4: allow restarting callbacks If we lose the backchannel and then the client repairs the problem, resend any callbacks. We use a new cb_done flag to track whether there is still work to be done for the callback or whether it can be destroyed with the rpc. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	3ff3600e7e	nfsd4: simplify nfsd4_cb_prepare Remove handling for a nonexistant case (status && !-EAGAIN). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	14a24e99f4	nfsd4: give out delegations more quickly in 4.1 case Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	229b2a0839	nfsd4: add helper function to run callbacks Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	84f5f7ccc5	nfsd4: make sure sequence flags are set after destroy_session If this loses any backchannel, make sure we have a chance to notice that and set the sequence flags. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:11 -05:00
J. Bruce Fields	eea4980660	nfsd4: re-probe callback on connection loss This makes sure we set the sequence flag when necessary. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	0d7bb71907	nfsd4: set sequence flag when backchannel is down Implement the SEQ4_STATUS_CB_PATH_DOWN flag. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	77a3569d6c	nfsd4: keep finer-grained callback status Distinguish between when the callback channel is known to be down, and when it is not yet confirmed. This will be useful in the 4.1 case. Also, we don't seem to be using the fact that this field is atomic. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	dcbeaa68db	nfsd4: allow backchannel recovery Now that we have a list of connections to choose from, we can teach the callback code to just pick a suitable connection and use that, instead of insisting on forever using the connection that the first create_session was sent with. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-01-11 15:04:10 -05:00
J. Bruce Fields	1d1bc8f207	nfsd4: support BIND_CONN_TO_SESSION Basic xdr and processing for BIND_CONN_TO_SESSION. This adds a connection to the list of connections associated with a session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-11 15:04:09 -05:00
J. Bruce Fields	4c6493785a	nfsd4: modify session list under cl_lock We want to traverse this from the callback code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2011-01-11 15:04:09 -05:00
Linus Torvalds	23d69b09b7	Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.	2011-01-07 16:58:04 -08:00
Nick Piggin	b7ab39f631	fs: dcache scale dentry refcount Make d_count non-atomic and protect it with d_lock. This allows us to ensure a 0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when we start protecting many other dentry members with d_lock. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:21 +11:00
Takuma Umeya	6f3d772fb8	nfs4: set source address when callback is generated when callback is generated in NFSv4 server, it doesn't set the source address. When an alias IP is utilized on NFSv4 server and suppose the client is accessing via that alias IP (e.g. eth0:0), the client invokes the callback to the IP address that is set on the original device (e.g. eth0). This behavior results in timeout of xprt. The patch sets the IP address that the client should invoke callback to. Signed-off-by: Takuma Umeya <tumeya@redhat.com> [bfields@redhat.com: Simplify gen_callback arguments, use helper function] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 19:43:01 -05:00
J. Bruce Fields	3c72602340	nfsd4: return nfs errno from name_to_id functions This avoids the need for the confusing ESRCH mapping. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:22:11 -05:00
J. Bruce Fields	775a1905e1	nfsd4: remove outdated pathname-comments Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:22:10 -05:00
J. Bruce Fields	2ca72e17e5	nfsd4: move idmap and acl header files into fs/nfsd These are internal nfsd interfaces. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:22:09 -05:00
J. Bruce Fields	f6af99ec1b	nfsd4: name->id mapping should fail with BADOWNER not BADNAME According to rfc 3530 BADNAME is for strings that represent paths; BADOWNER is for user/group names that don't map. And the too-long name should probably be BADOWNER as well; it's effectively the same as if we couldn't map it. Cc: stable@kernel.org Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reported-by: Simon Kirby <sim@hostway.ca> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 18:21:36 -05:00
J. Bruce Fields	c45821d263	locks: eliminate fl_mylease callback The nfs server only supports read delegations for now, so we don't care how conflicts are determined. All we care is that unlocks are recognized as matching the leases they are meant to remove. After the last patch, a comparison of struct files will work for that purpose. So we no longer need this callback. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:28 -05:00
J. Bruce Fields	c84d500bc4	nfsd4: use a single struct file for delegations When we converted to sharing struct filess between nfs4 opens I went too far and also used the same mechanism for delegations. But keeping a reference to the struct file ensures it will outlast the lease, and allows us to remove the lease with the same file as we added it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:27 -05:00
J. Bruce Fields	e63eb93750	nfsd4: eliminate lease delete callback nfsd controls the lifetime of the lease, not the lock code, so there's no need for this callback on lease destruction. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:26 -05:00
J. Bruce Fields	da165dd60e	nfsd: remove some unnecessary dropit handling We no longer need a few of these special cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:23 -05:00
J. Bruce Fields	062304a815	nfsd: stop translating EAGAIN to nfserr_dropit We no longer need this. Also, EWOULDBLOCK is generally a synonym for EAGAIN, but that may not be true on all architectures, so map it as well. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:23 -05:00
J. Bruce Fields	9e701c6109	svcrpc: simpler request dropping Currently we use -EAGAIN returns to determine when to drop a deferred request. On its own, that is error-prone, as it makes us treat -EAGAIN returns from other functions specially to prevent inadvertent dropping. So, use a flag on the request instead. Returning an error on request deferral is still required, to prevent further processing, but we no longer need worry that an error return on its own could result in a drop. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:22 -05:00
J. Bruce Fields	3beb6cd1d4	nfsd: don't drop requests on -ENOMEM We never want to drop a request if we could return a JUKEBOX/DELAY error instead; so, convert to nfserr_jukebox and let nfsd_dispatch() convert that to a dropit error as a last resort if JUKEBOX/DELAY is unavailable (as in the NFSv2 case). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:20 -05:00
Kirill A. Shutemov	65e4c89455	nfsd: declare several functions of nfs4callback as static setup_callback_client(), nfsd4_release_cb() and nfsd4_process_cb_update() do not have users outside the translation unit. Let's declare it as static. Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-01-04 16:49:19 -05:00
Mi Jinlong	22b6dee842	nfsd4: fix oops on secinfo_no_name result encoding The secinfo_no_name code oopses on encoding with BUG: unable to handle kernel NULL pointer dereference at 00000044 IP: [<e2bd239a>] nfsd4_encode_secinfo+0x1c/0x1c1 [nfsd] We should implement a nfsd4_encode_secinfo_no_name() instead using nfsd4_encode_secinfo(). Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-29 11:54:06 -07:00
Jens Axboe	3603b8eacc	Fix compile warnings due to missing removal of a 'ret' variable Commit `a8adbe3` forgot to remove the return variable, kill it. drivers/block/loop.c: In function 'lo_splice_actor': drivers/block/loop.c:398: warning: unused variable 'ret' [...] fs/nfsd/vfs.c: In function 'nfsd_splice_actor': fs/nfsd/vfs.c:848: warning: unused variable 'ret' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-12-20 09:15:19 +01:00
J. Bruce Fields	04f4ad16b2	nfsd4: implement secinfo_no_name Implementation of this operation is mandatory for NFSv4.1. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:25 -05:00
J. Bruce Fields	0ff7ab4671	nfsd4: move guts of nfsd4_lookupp into helper We'll reuse this code in secinfo_no_name. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:24 -05:00
J. Bruce Fields	56560b9ae0	nfsd4: 4.1 SECINFO should consume filehandle See the referenced spec language; an attempt by a 4.1 client to use the current filehandle after a secinfo call should result in a NOFILEHANDLE error. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:23 -05:00
bookjovi@gmail.com	5b6a599f0d	nfs: add missed CONFIG_NFSD_DEPRECATED these pieces of code only make sense when CONFIG_NFSD_DEPRECATED enabled Signed-off-by: Jovi Zhang <bookjovi@gmail.com> fs/nfsd/nfsctl.c \| 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:20 -05:00
J. Bruce Fields	18b631f838	nfsd: fix offset printk's in nfsd3 read/write Thanks to dysbr01@ca.com for noticing that the debugging printk in the v3 write procedure can print >2GB offsets as negative numbers: https://bugzilla.kernel.org/show_bug.cgi?id=23342 Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:18 -05:00
J. Bruce Fields	e203d506bd	nfsd4: fix mixed 4.0/4.1 handling, 4.1 reboot Instead of failing to find client entries which don't match the minorversion, we should be finding them, then either erroring out or expiring them as appropriate. This also fixes a problem which would cause the 4.1 server to fail to recognize clients after a second reboot. Reported-by: Casey Bodley <cbodley@citi.umich.edu> Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:48:01 -05:00
J. Bruce Fields	6e5f15c93d	nfsd4: replace unintuitive match_clientid_establishment Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-17 15:47:41 -05:00
J. Bruce Fields	ec66ee3797	Merge commit 'v2.6.37-rc6' into for-2.6.38	2010-12-17 13:29:07 -05:00
Michał Mirosław	a8adbe378b	fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors This patch pulls calls to buf->ops->confirm() from all actors passed (also indirectly) to splice_from_pipe_feed(). Is avoiding the call to buf->ops->confirm() while splice()ing to /dev/null is an intentional optimization? No other user does that and this will remove this special case. Against current linux.git `6313e3c217`. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-12-17 08:56:44 +01:00
Chuck Lever	bf2695516d	SUNRPC: New xdr_streams XDR decoder API Now that all client-side XDR decoder routines use xdr_streams, there should be no need to support the legacy calling sequence [rpc_rqst , __be32 , RPC res *] anywhere. We can construct an xdr_stream in the generic RPC code, instead of in each decoder function. This is a refactoring change. It should not cause different behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:25 -05:00
Chuck Lever	9f06c719f4	SUNRPC: New xdr_streams XDR encoder API Now that all client-side XDR encoder routines use xdr_streams, there should be no need to support the legacy calling sequence [rpc_rqst , __be32 , RPC arg *] anywhere. We can construct an xdr_stream in the generic RPC code, instead of in each encoder function. Also, all the client-side encoder functions return 0 now, making a return value superfluous. Take this opportunity to convert them to return void instead. This is a refactoring change. It should not cause different behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:25 -05:00
Chuck Lever	7d93bd71cb	NFS: Repair whitespace damage in NFS PROC macro Clean up. When I was making other changes in this area, checkscript.pl complained about the use of leading blanks in the PROC macros in the xdr files. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:24 -05:00
Chuck Lever	85a5648019	NFSD: Update XDR decoders in NFSv4 callback client Clean up. Remove old-style NFSv4 XDR macros in favor of the style now used in fs/nfs/nfs4xdr.c. These were forgotten during the recent nfs4xdr.c rewrite. Additional whitespace cleanup adds to the size of this patch. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:24 -05:00
Chuck Lever	a033db487e	NFSD: Update XDR encoders in NFSv4 callback client Clean up. Remove old-style NFSv4 XDR macros in favor of the style now used in fs/nfs/nfs4xdr.c. These were forgotten during the recent nfs4xdr.c rewrite. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-12-16 12:37:23 -05:00
Tejun Heo	afe2c511fb	workqueue: convert cancel_rearming_delayed_work[queue]() users to cancel_delayed_work_sync() cancel_rearming_delayed_work[queue]() has been superceded by cancel_delayed_work_sync() quite some time ago. Convert all the in-kernel users. The conversions are completely equivalent and trivial. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: "David S. Miller" <davem@davemloft.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Cc: netdev@vger.kernel.org Cc: Anton Vorontsov <cbou@mail.ru> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Alex Elder <aelder@sgi.com> Cc: xfs-masters@oss.sgi.com Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: netfilter-devel@vger.kernel.org Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: linux-nfs@vger.kernel.org	2010-12-15 10:56:11 +01:00
Neil Brown	c1ac3ffcd0	nfsd: Fix possible BUG_ON firing in set_change_info If vfs_getattr in fill_post_wcc returns an error, we don't set fh_post_change. For NFSv4, this can result in set_change_info triggering a BUG_ON. i.e. fh_post_saved being zero isn't really a bug. So: - instead of BUGging when fh_post_saved is zero, just clear ->atomic. - if vfs_getattr fails in fill_post_wcc, take a copy of i_ctime anyway. This will be used i seg_change_info, but not overly trusted. - While we are there, remove the pointless 'if' statements in set_change_info. There is no harm setting all the values. Signed-off-by: NeilBrown <neilb@suse.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-12-08 11:44:04 -05:00
Mi Jinlong	1205065764	NFS4.1: Fix bug server don't reply the right fore_channel to client at create_session At the latest kernel(2.6.37-rc1), server just initialize the forechannel at init_forechannel_attrs, but don't reflect it to reply. After initialize the session success, we should copy the forechannel info to nfsd4_create_session struct. Reviewed-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
Mi Jinlong	ced6dfe9fc	NFS4.1: server gets drc mem fail should reply error at create_session When server gets drc mem fail, it should reply error to client. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
J. Bruce Fields	044bc1d432	nfsd4: return serverfault on request for ssv We're refusing to support a mandatory features of 4.1, so serverfault seems the better error; see e.g.: http://www.ietf.org/mail-archive/web/nfsv4/current/msg07638.html Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
Mi Jinlong	5afa040b30	NFSv4.1: Make sure nfsd can decode SP4_SSV correctly at exchange_id According to RFC, the argument of ssv_sp_parms4 is: struct ssv_sp_parms4 { state_protect_ops4 ssp_ops; sec_oid4 ssp_hash_algs<>; sec_oid4 ssp_encr_algs<>; uint32_t ssp_window; uint32_t ssp_num_gss_handles; }; If client send a exchange_id with SP4_SSV, server cann't decode the SP4_SSV's ssp_hash_algs and ssp_encr_algs arguments correctly. Because the kernel treat the two arguments as a signal sec_oid4 struct, but should be a set of sec_oid4 struct. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:12 -05:00
Dan Carpenter	43b0178eda	nfsd: fix NULL dereference in setattr() The original code would oops if this were called from nfsd4_setattr() because "filpp" is NULL. (Note this case is currently impossible, as long as we only give out read delegations.) Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-19 18:35:11 -05:00
Arnd Bergmann	460781b542	BKL: remove references to lock_kernel from comments Lock_kernel is gone from the code, so the comments should be updated, too. nfsd now uses lock_flocks instead of lock_kernel to protect against posix file locks. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com> Cc: linux-nfs@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
J. Bruce Fields	21b75b0199	nfsd4: fix 4.1 connection registration race If a connection is closed just after a sequence or create_session is sent over it, we could end up trying to register a callback that will never get called since the xprt is already marked dead. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-11-02 17:13:52 -04:00
Christoph Hellwig	51ee4b84f5	locks: let the caller free file_lock on ->setlease failure The caller allocated it, the caller should free it. The only issue so far is that we could change the flp pointer even on an error return if the fl_change callback failed. But we can simply move the flp assignment after the fl_change invocation, as the callers don't care about the flp return value if the setlease call failed. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-31 06:35:15 -07:00
J. Bruce Fields	fcf744a96c	nfsd4: initialize delegation pointer to lease The NFSv4 server was initializing the dp->dl_flock pointer by the somewhat ridiculous method of a locks_copy_lock callback. Now that setlease uses the passed-in lock instead of doing a copy, dl_flock no longer gets set, resulting in the lock leaking on delegation release, and later possible hangs (among other problems). So, initialize dl_flock and get rid of the callback. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-30 18:08:15 -07:00
Al Viro	fc14f2fef6	convert get_sb_single() users Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:28 -04:00
Linus Torvalds	7420a8c0de	Merge branch 'flock' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'flock' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: locks: turn lock_flocks into a spinlock fasync: re-organize fasync entry insertion to allow it under a spinlock locks/nfsd: allocate file lock outside of spinlock lockd: fix nlmsvc_notify_blocked locking lockd: push lock_flocks down	2010-10-27 18:13:34 -07:00
Arnd Bergmann	c5b1f0d92c	locks/nfsd: allocate file lock outside of spinlock As suggested by Christoph Hellwig, this moves allocation of new file locks out of generic_setlease into the callers, nfs4_open_delegation and fcntl_setlease in order to allow GFP_KERNEL allocations when lock_flocks has become a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com>	2010-10-27 21:41:50 +02:00
Arnd Bergmann	763641d812	lockd: push lock_flocks down lockd should use lock_flocks() instead of lock_kernel() to lock against posix locks accessing the i_flock list. This is a prerequisite to turning lock_flocks into a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com>	2010-10-27 21:39:39 +02:00
Linus Torvalds	426e1f5cec	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) split invalidate_inodes() fs: skip I_FREEING inodes in writeback_sb_inodes fs: fold invalidate_list into invalidate_inodes fs: do not drop inode_lock in dispose_list fs: inode split IO and LRU lists fs: switch bdev inode bdi's correctly fs: fix buffer invalidation in invalidate_list fsnotify: use dget_parent smbfs: use dget_parent exportfs: use dget_parent fs: use RCU read side protection in d_validate fs: clean up dentry lru modification fs: split __shrink_dcache_sb fs: improve DCACHE_REFERENCED usage fs: use percpu counter for nr_dentry and nr_dentry_unused fs: simplify __d_free fs: take dcache_lock inside __d_path fs: do not assign default i_ino in new_inode fs: introduce a per-cpu last_ino allocator new helper: ihold() ...	2010-10-26 17:58:44 -07:00
Linus Torvalds	4390110fef	Merge branch 'for-2.6.37' of git://linux-nfs.org/~bfields/linux * 'for-2.6.37' of git://linux-nfs.org/~bfields/linux: (99 commits) svcrpc: svc_tcp_sendto XPT_DEAD check is redundant svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue svcrpc: assume svc_delete_xprt() called only once svcrpc: never clear XPT_BUSY on dead xprt nfsd4: fix connection allocation in sequence() nfsd4: only require krb5 principal for NFSv4.0 callbacks nfsd4: move minorversion to client nfsd4: delay session removal till free_client nfsd4: separate callback change and callback probe nfsd4: callback program number is per-session nfsd4: track backchannel connections nfsd4: confirm only on succesful create_session nfsd4: make backchannel sequence number per-session nfsd4: use client pointer to backchannel session nfsd4: move callback setup into session init code nfsd4: don't cache seq_misordered replies SUNRPC: Properly initialize sock_xprt.srcaddr in all cases SUNRPC: Use conventional switch statement when reclassifying sockets sunrpc/xprtrdma: clean up workqueue usage sunrpc: Turn list_for_each-s into the ..._entry-s ... Fix up trivial conflicts (two different deprecation notices added in separate branches) in Documentation/feature-removal-schedule.txt	2010-10-26 09:55:25 -07:00
Linus Torvalds	a4dd8dce14	Merge branch 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: net/sunrpc: Use static const char arrays nfs4: fix channel attribute sanity-checks NFSv4.1: Use more sensible names for 'initialize_mountpoint' NFSv4.1: pnfs: filelayout: add driver's LAYOUTGET and GETDEVICEINFO infrastructure NFSv4.1: pnfs: add LAYOUTGET and GETDEVICEINFO infrastructure NFS: client needs to maintain list of inodes with active layouts NFS: create and destroy inode's layout cache NFSv4.1: pnfs: filelayout: introduce minimal file layout driver NFSv4.1: pnfs: full mount/umount infrastructure NFS: set layout driver NFS: ask for layouttypes during v4 fsinfo call NFS: change stateid to be a union NFSv4.1: pnfsd, pnfs: protocol level pnfs constants SUNRPC: define xdr_decode_opaque_fixed NFSD: remove duplicate NFS4_STATEID_SIZE	2010-10-26 09:52:09 -07:00
Christoph Hellwig	c37650161a	fs: add sync_inode_metadata Add a new helper to write out the inode using the writeback code, that is including the correct dirty bit and list manipulation. A few of filesystems already opencode this, and a lot of others should be using it instead of using write_inode_now which also writes out the data. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-25 21:18:19 -04:00
J. Bruce Fields	a663bdd8c5	nfsd4: fix connection allocation in sequence() We're doing an allocation under a spinlock, and ignoring the possibility of allocation failure. A better fix wouldn't require an unnecessary allocation in the common case, but we'll leave that for later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-24 21:07:07 -04:00
Andy Adamson	3c9101a057	NFSD: remove duplicate NFS4_STATEID_SIZE Already accepted by Bruce Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-10-24 18:02:53 -04:00
Linus Torvalds	092e0e7e52	Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek	2010-10-22 10:52:56 -07:00
Linus Torvalds	79f14b7c56	Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: (30 commits) BKL: remove BKL from freevxfs BKL: remove BKL from qnx4 autofs4: Only declare function when CONFIG_COMPAT is defined autofs: Only declare function when CONFIG_COMPAT is defined ncpfs: Lock socket in ncpfs while setting its callbacks fs/locks.c: prepare for BKL removal BKL: Remove BKL from ncpfs BKL: Remove BKL from OCFS2 BKL: Remove BKL from squashfs BKL: Remove BKL from jffs2 BKL: Remove BKL from ecryptfs BKL: Remove BKL from afs BKL: Remove BKL from USB gadgetfs BKL: Remove BKL from autofs4 BKL: Remove BKL from isofs BKL: Remove BKL from fat BKL: Remove BKL from ext2 filesystem BKL: Remove BKL from do_new_mount() BKL: Remove BKL from cgroup BKL: Remove BKL from NTFS ...	2010-10-22 10:52:01 -07:00
Linus Torvalds	5704e44d28	Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: BKL: introduce CONFIG_BKL. dabusb: remove the BKL sunrpc: remove the big kernel lock init/main.c: remove BKL notations blktrace: remove the big kernel lock rtmutex-tester: make it build without BKL dvb-core: kill the big kernel lock dvb/bt8xx: kill the big kernel lock tlclk: remove big kernel lock fix rawctl compat ioctls breakage on amd64 and itanic uml: kill big kernel lock parisc: remove big kernel lock cris: autoconvert trivial BKL users alpha: kill big kernel lock isapnp: BKL removal s390/block: kill the big kernel lock hpet: kill BKL, add compat_ioctl	2010-10-22 10:43:11 -07:00
J. Bruce Fields	5d18c1c2a9	nfsd4: only require krb5 principal for NFSv4.0 callbacks In the sessions backchannel case, we don't need a krb5 principal name for the client; we use the already-created forechannel credentials instead. Some cleanup, while we're there: make it clearer which code here is 4.0- or sessions- specific. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:12:14 -04:00
J. Bruce Fields	8323c3b2a6	nfsd4: move minorversion to client The minorversion seems more a property of the client than the callback channel. Some time we should probably also enforce consistent minorversion usage from the client; for now, this is just a cosmetic change. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:12:02 -04:00
J. Bruce Fields	792c95dd51	nfsd4: delay session removal till free_client Have unhash_client_locked() remove client and associated sessions from global hashes, but delay further dismantling till free_client(). (After unhash_client_locked(), the only remaining references outside the destroying thread are from any connections which have xpt_user callbacks registered.) This will simplify locking on session destruction. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:56 -04:00
J. Bruce Fields	5a3c9d7134	nfsd4: separate callback change and callback probe Only one of the nfsd4_callback_probe callers actually cares about changing the callback information. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:55 -04:00
J. Bruce Fields	8b5ce5cd44	nfsd4: callback program number is per-session The callback program is allowed to depend on the session which the callback is going over. No change in behavior yet, while we still only do callbacks over a single session for the lifetime of the client. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:54 -04:00
J. Bruce Fields	d29c374cd2	nfsd4: track backchannel connections We need to keep track of which connections are available for use with the backchannel, which for the forechannel, and which for both. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:53 -04:00
J. Bruce Fields	86c3e16cc7	nfsd4: confirm only on succesful create_session Following rfc 5661, section 18.36.4: "If the session is not successfully created, then no changes are made to any client records on the server." We shouldn't be confirming or incrementing the sequence id in this case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:52 -04:00
J. Bruce Fields	ac7c46f29a	nfsd4: make backchannel sequence number per-session Currently we don't deal well with a client that has multiple sessions associated with it (even simultaneously, or serially over the lifetime of the client). In particular, we don't attempt to keep the backchannel running after the original session diseappears. We will fix that soon. Once we do that, we need the slot sequence number to be per-session; otherwise, for example, we cannot correctly handle a case like this: - All session 1 connections are lost. - The client creates session 2. We use it for the backchannel (since it's the only working choice). - The client gives us a new connection to use with session 1. - The client destroys session 2. At this point our only choice is to go back to using session 1. When we do so we must use the sequence number that is next for session 1. We therefore need to maintain multiple sequence number streams. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:51 -04:00
J. Bruce Fields	90c8145bb6	nfsd4: use client pointer to backchannel session Instead of copying the sessionid, use the new cl_cb_session pointer, which indicates which session we're using for the backchannel. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:50 -04:00
J. Bruce Fields	edd7678663	nfsd4: move callback setup into session init code The backchannel should be associated with a session, it isn't really global to the client. We do, however, want a pointer global to the client which tracks which session we're currently using for client-based callbacks. This is a first step in that direction; for now, just reshuffling of code with no significant change in behavior. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-21 10:11:49 -04:00
J. Bruce Fields	cd5b814458	nfsd4: don't cache seq_misordered replies Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-21 10:11:48 -04:00
Arnd Bergmann	6de5bd128d	BKL: introduce CONFIG_BKL. With all the patches we have queued in the BKL removal tree, only a few dozen modules are left that actually rely on the BKL, and even there are lots of low-hanging fruit. We need to decide what to do about them, this patch illustrates one of the options: Every user of the BKL is marked as 'depends on BKL' in Kconfig, and the CONFIG_BKL becomes a user-visible option. If it gets disabled, no BKL using module can be built any more and the BKL code itself is compiled out. The one exception is file locking, which is practically always enabled and does a 'select BKL' instead. This effectively forces CONFIG_BKL to be enabled until we have solved the fs/lockd mess and can apply the patch that removes the BKL from fs/locks.c. Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-21 15:44:13 +02:00
Arnd Bergmann	6038f373a3	llseek: automatically add .llseek fop All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode i, struct file f) { <+... ( nonseekable_open(...) \| nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file f, char p, size_t s, loff_t off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { <+... ( off = E \| off += E \| func(..., off, ...) \| E = off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file f, const char p, size_t s, loff_t off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable / }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, / open uses nonseekable / }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, / we have seq_read / }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, / readdir is present / }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, / read accesses f_pos / }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, / write accesses f_pos / }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, / read and write both use no f_pos / }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, / write uses no f_pos / }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, / read uses no f_pos / }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, / no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-15 15:53:27 +02:00
J. Bruce Fields	b1e86db1de	nfsd: fix BUG at fs/nfsd/nfsfh.h:199 on unlink As of commit `43a9aa64a2` "NFSD: Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR", we sometimes call fh_unlock on a filehandle that isn't fully initialized. We should fix up the callers, but as a quick fix it is also sufficient just to remove this assertion. Reported-by: Marius Tolzmann <tolzmann@molgen.mpg.de> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-13 15:48:55 -04:00
J. Bruce Fields	ecec6e34e1	nfsd4: expire clients more promptly Expire clients more promptly, at the expense of possibly running the laundromat thread more frequently. Though it's not the default, I'd like it to be feasible to run with a lease time of just a few seconds, at which point a minimum 10 second wait between laundromat runs seems a little much. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-11 20:00:18 -04:00
Arnd Bergmann	b89f432133	fs/locks.c: prepare for BKL removal This prepares the removal of the big kernel lock from the file locking code. We still use the BKL as long as fs/lockd uses it and ceph might sleep, but we can flip the definition to a private spinlock as soon as that's done. All users outside of fs/lockd get converted to use lock_flocks() instead of lock_kernel() where appropriate. Based on an earlier patch to use a spinlock from Matthew Wilcox, who has attempted this a few times before, the earliest patch from over 10 years ago turned it into a semaphore, which ended up being slower than the BKL and was subsequently reverted. Someone should do some serious performance testing when this becomes a spinlock, since this has caused problems before. Using a spinlock should be at least as good as the BKL in theory, but who knows... Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Matthew Wilcox <willy@linux.intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Miklos Szeredi <mszeredi@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Sage Weil <sage@newdream.net> Cc: linux-kernel@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org	2010-10-05 11:02:04 +02:00
J. Bruce Fields	3351514215	nfsd4: return expired on unfound stateid's Commit `78155ed75f` "nfsd4: distinguish expired from stale stateids" attempted to distinguish expired and stale stateid's using time information that may not have been completely reliable, so I reverted it. That was throwing out the baby with the bathwater; we still do want to return expired, but let's do that using the simpler approach of just assuming any stateid is expired if it looks like it was given out by the current server instance, but we can't find it any more. This may help clients that are recovering from network partitions. Reported-by: Bian Naimeng <biannm@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-02 18:49:33 -04:00
J. Bruce Fields	328ead2872	nfsd4: add new connections to session As long as we're not implementing any session security, we should just automatically add any new connections that come along to the list of sessions associated with the session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:45 -04:00
J. Bruce Fields	db90681d6e	nfsd4: refactor connection allocation Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:45 -04:00
J. Bruce Fields	19cf5c026f	nfsd4: use callbacks on svc_xprt_deletion Remove connections from the list when they go down. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	c7662518c7	nfsd4: keep per-session list of connections The spec requires us in various places to keep track of the connections associated with each session. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	5b6feee960	nfsd4: clean up session allocation Changes: - make sure session memory reservation is released on failure path. - use min_t()/min() for more compact code in several places. - break alloc_init_session into smaller pieces. - miscellaneous other cleanup. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	dd93842457	nfsd4: fix alloc_init_session return type This returns an nfs error, not -ERRNO. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	c23753dac1	nfsd4: fix alloc_init_session BUILD_BUG_ON() Note we're allocating an array of nfsd4_slot *'s, not nfsd4_slot's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	6ff8da0887	nfsd4: Move callback setup to callback queue Instead of creating the new rpc client from a regular server thread, set a flag, kick off a null call, and allow the null call to do the work of setting up the client on the callback workqueue. Use a spinlock to ensure the callback work gets a consistent view of the callback parameters. This allows, for example, changing the callback from contexts where sleeping is not allowed. I hope it will also keep the locking simple as we add more session and trunking features, by serializing most of the callback-specific work. This also closes a small race where the the new cb_ident could be used with an old connection (or vice-versa). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:44 -04:00
J. Bruce Fields	fb00392326	nfsd4: remove separate cb_args struct I don't see the point of the separate struct. It seems to just be getting in the way. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	cee277d924	nfsd4: use generic callback code in null case This will eventually allow us, for example, to kick off null callback from contexts where we can't sleep. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	5878453dbd	nfsd4: generic callback code Make the recall callback code more generic, so that other callbacks will be able to use it too. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	1c8556026e	nfsd4: rename nfs4_rpc_args->nfsd4_cb_args With apologies for the gratuitous rename, the new name seems more helpful to me. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	586f36735e	nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence These two structs don't really need to be distinct as far as I can tell. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
J. Bruce Fields	07263f1efe	nfsd4: minor variable renaming (cb -> conn) Now that we have both nfsd4_callback and nfsd4_cb_conn structures, I get confused if variables of both types are always named cb.... Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-10-01 19:29:43 -04:00
Pavel Emelyanov	c653ce3f0a	sunrpc: Add net to rpc_create_args Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 17:18:56 -04:00
Pavel Emelyanov	fc5d00b04a	sunrpc: Add net argument to svc_create_xprt Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 17:18:54 -04:00
Benny Halevy	2b44f1ba40	nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length The existing code adjusted it based on the worst case scenario for the returned bitmap and the best case scenario for the supported attrs attribute. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [bfields@redhat.com: removed likely/unlikely's] Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-10-01 16:52:24 -04:00
Pavel Emelyanov	352114f395	sunrpc: Add net to pure API calls There are two calls that operate on ip_map_cache and are directly called from the nfsd code. Other places will be handled in a different way. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-27 10:16:11 -04:00
J. Bruce Fields	74ec1e1269	nfsd: fix /proc/net/rpc/nfsd.export/content display Note with "first" always 0, and "lastflags" initially 0, we always dump a spurious set of 0 flags at the start, among other problems. Fix. And attempt to make the code a little more obvious. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-26 14:48:25 -04:00
Pavel Emelyanov	049ef27b22	nfsd: Export get_task_comm for nfsd The git://linux-nfs.org/~bfields/linux.git nfsd-next branch doesn't compile when nfsd is a module with the following error: ERROR: "get_task_comm" [fs/nfsd/nfsd.ko] undefined! Replace the get_task_comm call with direct comm access, which is safe for current. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-23 10:34:21 -04:00
NeilBrown	1e1405673e	nfsd: allow deprecated interface to be compiled out. Add CONFIG_NFSD_DEPRECATED, default to y. Only include deprecated interface if this is defined. This allows distros to remove this interface before the official removal, and allows developers to test without it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-22 15:33:14 -04:00
NeilBrown	c67874f942	nfsd: formally deprecate legacy nfsd syscall interface The syscall interface is has been replaced by a more flexible interface since 2.6.0. It is time to work towards discarding the old interface. So add a entry in feature-removal-schedule.txt and print a warning when the interface is used. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-22 15:33:13 -04:00
NeilBrown	839049a873	nfsd/idmap: drop special request deferal in favour of improved default. The idmap code manages request deferal by waiting for a reply from userspace rather than putting the NFS request on a queue to be retried from the start. Now that the common deferal code does this there is no need for the special code in idmap. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-21 17:08:31 -04:00
NeilBrown	8ff30fa4ef	nfsd: disable deferral for NFSv4 Now that a slight delay in getting a reply to an upcall doesn't require deferring of requests, request deferral for all NFSv4 requests - the concept doesn't really fit with the v4 model. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-21 17:02:27 -04:00
J. Bruce Fields	c88739b373	Merge remote branch 'trond/bugfixes' into for-2.6.37 Without some client-side fixes, server testing is currently difficult.	2010-09-19 23:48:32 -04:00
Trond Myklebust	827e345702	SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies The NFSv4 client's callback server calls svc_gss_principal(), which is defined in the auth_rpcgss.ko The NFSv4 server has the same dependency, and in addition calls svcauth_gss_flavor(), gss_mech_get_by_pseudoflavor(), gss_pseudoflavor_to_service() and gss_mech_put() from the same module. The module auth_rpcgss itself has no dependencies aside from sunrpc, so we only need to select RPCSEC_GSS. Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-09-12 19:57:50 -04:00
Linus Torvalds	4f63e3c5be	Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: nfsd4: mask out non-access bits in nfs4_access_to_omode	2010-09-07 19:21:02 -07:00
NeilBrown	c5b29f885a	sunrpc: use seconds since boot in expiry cache This protects us from confusion when the wallclock time changes. We convert to and from wallclock when setting or reading expiry times. Also use seconds since boot for last_clost time. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-07 19:21:20 -04:00
NeilBrown	17cebf658e	sunrpc: extract some common sunrpc_cache code from nfsd Rather can duplicating this idiom twice, put it in an inline function. This reduces the usage of 'expiry_time' out side the sunrpc/cache.c code and thus the impact of a change that is about to be made to that field. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-07 19:21:19 -04:00
Andy Adamson	1132b26029	nfsd: remove duplicate NFS4_STATEID_SIZE declaration Use NFS4_STATEID_SIZE from include/linux/nfs4 Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-07 19:21:18 -04:00
J. Bruce Fields	8f34a430ac	nfsd4: mask out non-access bits in nfs4_access_to_omode This fixes an unnecessary BUG(). Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-09-02 15:25:09 -04:00
Linus Torvalds	2547d1d20f	Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: nfsd: fix NULL dereference in nfsd_statfs() nfsd4: fix downgrade/lock logic nfsd4: typo fix in find_any_file nfsd4: bad BUG() in preprocess_stateid_op	2010-08-28 14:05:55 -07:00
Takashi Iwai	f6360efb83	nfsd: fix NULL dereference in nfsd_statfs() The commit `ebabe9a900` pass a struct path to vfs_statfs introduced the struct path initialization, and this seems to trigger an Oops on my machine. fh_dentry field may be NULL and set later in fh_verify(), thus the initialization of path must be after fh_verify(). Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:23:16 -04:00
J. Bruce Fields	f632265d0f	Merge commit 'v2.6.36-rc1' into HEAD	2010-08-26 13:22:27 -04:00
J. Bruce Fields	7d94784293	nfsd4: fix downgrade/lock logic If we already had a RW open for a file, and get a readonly open, we were piggybacking on the existing RW open. That's inconsistent with the downgrade logic which blows away the RW open assuming you'll still have a readonly open. Also, make sure there is a readonly or writeonly open available for locking, again to prevent bad behavior in downgrade cases when any RW open may be lost. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:22:02 -04:00
J. Bruce Fields	18608ad49c	nfsd4: typo fix in find_any_file Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:21:09 -04:00
J. Bruce Fields	30c0e1ef0a	nfsd4: bad BUG() in preprocess_stateid_op It's OK for this function to return without setting filp--we do it in the special-stateid case. And there's a legitimate case where we can hit this, since we do permit reads on write-only stateid's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-26 13:20:51 -04:00
Linus Torvalds	763008c435	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFS: Fix an Oops in the NFSv4 atomic open code NFS: Fix the selection of security flavours in Kconfig NFS: fix the return value of nfs_file_fsync() rpcrdma: Fix SQ size calculation when memreg is FRMR xprtrdma: Do not truncate iova_start values in frmr registrations. nfs: Remove redundant NULL check upon kfree() nfs: Add "lookupcache" to displayed mount options NFS: allow close-to-open cache semantics to apply to root of NFS filesystem SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)	2010-08-18 15:45:23 -07:00
Trond Myklebust	df486a2590	NFS: Fix the selection of security flavours in Kconfig Randy Dunlap reports: ERROR: "svc_gss_principal" [fs/nfs/nfs.ko] undefined! because in fs/nfs/Kconfig, NFS_V4 selects RPCSEC_GSS_KRB5 and/or in fs/nfsd/Kconfig, NFSD_V4 selects RPCSEC_GSS_KRB5. RPCSEC_GSS_KRB5 does 5 selects, but none of these is enforced/followed by the fs/nfs[d]/Kconfig configs: select SUNRPC_GSS select CRYPTO select CRYPTO_MD5 select CRYPTO_DES select CRYPTO_CBC Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: J. Bruce Fields <bfields@fieldses.org> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-08-17 17:42:45 -04:00
Linus Torvalds	8c8946f509	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits) fanotify: use both marks when possible fsnotify: pass both the vfsmount mark and inode mark fsnotify: walk the inode and vfsmount lists simultaneously fsnotify: rework ignored mark flushing fsnotify: remove global fsnotify groups lists fsnotify: remove group->mask fsnotify: remove the global masks fsnotify: cleanup should_send_event fanotify: use the mark in handler functions audit: use the mark in handler functions dnotify: use the mark in handler functions inotify: use the mark in handler functions fsnotify: send fsnotify_mark to groups in event handling functions fsnotify: Exchange list heads instead of moving elements fsnotify: srcu to protect read side of inode and vfsmount locks fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called fsnotify: use _rcu functions for mark list traversal fsnotify: place marks on object in order of group memory address vfs/fsnotify: fsnotify_close can delay the final work in fput fsnotify: store struct file not struct path ... Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.	2010-08-10 11:39:13 -07:00
Linus Torvalds	5f248c9c25	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits) no need for list_for_each_entry_safe()/resetting with superblock list Fix sget() race with failing mount vfs: don't hold s_umount over close_bdev_exclusive() call sysv: do not mark superblock dirty on remount sysv: do not mark superblock dirty on mount btrfs: remove junk sb_dirt change BFS: clean up the superblock usage AFFS: wait for sb synchronization when needed AFFS: clean up dirty flag usage cifs: truncate fallout mbcache: fix shrinker function return value mbcache: Remove unused features add f_flags to struct statfs(64) pass a struct path to vfs_statfs update VFS documentation for method changes. All filesystems that need invalidate_inode_buffers() are doing that explicitly convert remaining ->clear_inode() to ->evict_inode() Make ->drop_inode() just return whether inode needs to be dropped fs/inode.c:clear_inode() is gone fs/inode.c:evict() doesn't care about delete vs. non-delete paths now ... Fix up trivial conflicts in fs/nilfs2/super.c	2010-08-10 11:26:52 -07:00
Christoph Hellwig	ebabe9a900	pass a struct path to vfs_statfs We'll need the path to implement the flags field for statvfs support. We do have it available in all callers except: - ecryptfs_statfs. This one doesn't actually need vfs_statfs but just needs to do a caller to the lower filesystem statfs method. - sys_ustat. Add a non-exported statfs_by_dentry helper for it which doesn't won't be able to fill out the flags field later on. In addition rename the helpers for statfs vs fstatfs to do_*statfs instead of the misleading vfs prefix. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:48:42 -04:00
Linus Torvalds	0d9f9e122c	Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux: (34 commits) nfsd4: fix file open accounting for RDWR opens nfsd: don't allow setting maxblksize after svc created nfsd: initialize nfsd versions before creating svc net: sunrpc: removed duplicated #include nfsd41: Fix a crash when a callback is retried nfsd: fix startup/shutdown order bug nfsd: minor nfsd read api cleanup gcc-4.6: nfsd: fix initialized but not read warnings nfsd4: share file descriptors between stateid's nfsd4: fix openmode checking on IO using lock stateid nfsd4: miscellaneous process_open2 cleanup nfsd4: don't pretend to support write delegations nfsd: bypass readahead cache when have struct file nfsd: minor nfsd_svc() cleanup nfsd: move more into nfsd_startup() nfsd: just keep single lockd reference for nfsd nfsd: clean up nfsd_create_serv error handling nfsd: fix error handling in __write_ports_addxprt nfsd: fix error handling when starting nfsd with rpcbind down nfsd4: fix v4 state shutdown error paths ...	2010-08-07 14:24:41 -07:00
J. Bruce Fields	998db52c03	nfsd4: fix file open accounting for RDWR opens Commit `f9d7562fdb` "nfsd4: share file descriptors between stateid's" didn't correctly account for O_RDWR opens. Symptoms include leaked files, resulting in failures to unmount and/or warnings about orphaned inodes on reboot. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-07 09:50:05 -04:00
J. Bruce Fields	7fa53cc872	nfsd: don't allow setting maxblksize after svc created It's harmless to set this after the server is created, but also ineffective, since the value is only used at the time of svc_create_pooled(). So fail the attempt, in keeping with the pattern set by write_versions, write_{lease,grace}time and write_recoverydir. (This could break userspace that tried to write to nfsd/max_block_size between setting up sockets and starting the server. However, such code wouldn't have worked anyway, and I don't know of any examples--rpc.nfsd in nfs-utils, probably the only user of the interface, doesn't do that.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 18:00:33 -04:00
J. Bruce Fields	e844a7b980	nfsd: initialize nfsd versions before creating svc Commit `59db4a0c10` "nfsd: move more into nfsd_startup()" inadvertently moved nfsd_versions after nfsd_create_svc(). On older distributions using an rpc.nfsd that does not explicitly set the list of nfsd versions, this results in svc-create_pooled() being called with an empty versions array. The resulting incomplete initialization leads to a NULL dereference in svc_process_common() the first time a client accesses the server. Move nfsd_reset_versions() back before the svc_create_pooled(); this time, put it closer to the svc_create_pooled() call, to make this mistake more difficult in the future. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:40 -04:00
Boaz Harrosh	c18c821fd4	nfsd41: Fix a crash when a callback is retried If a callback is retried at nfsd4_cb_recall_done() due to some error, the returned rpc reply crashes here: @@ -514,6 +514,7 @@ decode_cb_sequence(struct xdr_stream xdr, struct nfsd4_cb_sequence res, u32 dummy; __be32 *p; + BUG_ON(!res); if (res->cbs_minorversion == 0) return 0; [BUG_ON added for demonstration] This is because the nfsd4_cb_done_sequence() has NULLed out the task->tk_msg.rpc_resp pointer. Also eventually the rpc would use the new slot without making sure it is free by calling nfsd41_cb_setup_sequence(). This problem was introduced by a 4.1 protocol addition patch: [`0421b5c5`] nfsd41: Backchannel: Implement cb_recall over NFSv4.1 Which was overlooking the possibility of an RPC callback retries. For not-4.1 case redoing the _prepare is harmless. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:39 -04:00
J. Bruce Fields	774f8bbd9e	nfsd: fix startup/shutdown order bug We must create the server before we can call init_socks or check the number of threads. Symptoms were a NULL pointer dereference in nfsd_svc(). Problem identified by Jeff Layton. Also fix a minor cleanup-on-error case in nfsd_startup(). Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-08-06 17:05:30 -04:00
J. Bruce Fields	039a87ca53	nfsd: minor nfsd read api cleanup Christoph points that the NFSv2/v3 callers know which case they want here, so we may as well just call the file=NULL case directly instead of making this conditional. Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-30 12:54:54 -04:00
Andi Kleen	6904996101	gcc-4.6: nfsd: fix initialized but not read warnings Fixes at least one real minor bug: the nfs4 recovery dir sysctl would not return its status properly. Also I finished Al's `1e41568d73` ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") commit, it moved the IMA code, but left the old path initializer in there. The rest is just dead code removed I think, although I was not fully sure about the "is_borc" stuff. Some more review would be still good. Found by gcc 4.6's new warnings. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 19:32:17 -04:00
J. Bruce Fields	f9d7562fdb	nfsd4: share file descriptors between stateid's The vfs doesn't really allow us to "upgrade" a file descriptor from read-only to read-write, and our attempt to do so in nfs4_upgrade_open is ugly and incomplete. Move to a different scheme where we keep multiple opens, shared between open stateid's, in the nfs4_file struct. Each file will be opened at most 3 times (for read, write, and read-write), and those opens will be shared between all clients and openers. On upgrade we will do another open if necessary instead of attempting to upgrade an existing open. We keep count of the number of readers and writers so we know when to close the shared files. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-29 18:19:23 -04:00
J. Bruce Fields	0292191417	nfsd4: fix openmode checking on IO using lock stateid It is legal to perform a write using the lock stateid that was originally associated with a read lock, or with a file that was originally opened for read, but has since been upgraded. So, when checking the openmode, check the mode associated with the open stateid from which the lock was derived. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:37:12 -04:00
J. Bruce Fields	21fb4016bd	nfsd4: miscellaneous process_open2 cleanup Move more work into helper functions. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:34:29 -04:00
J. Bruce Fields	c3e4808086	nfsd4: don't pretend to support write delegations The delegation code mostly pretends to support either read or write delegations. However, correct support for write delegations would require, for example, breaking of delegations (and/or implementation of cb_getattr) on stat. Currently all that stops us from handing out delegations is a subtle reference-counting issue. Avoid confusion by adding an earlier check that explicitly refuses write delegations. For now, though, I'm not going so far as to rip out existing half-support for write delegations, in case we get around to using that soon. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-29 16:05:51 -04:00
Eric Paris	2a12a9d781	fsnotify: pass a file instead of an inode to open, read, and write fanotify, the upcoming notification system actually needs a struct path so it can do opens in the context of listeners, and it needs a file so it can get f_flags from the original process. Close was the only operation that already was passing a struct file to the notification hook. This patch passes a file for access, modify, and open as well as they are easily available to these hooks. Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:32 -04:00
J. Bruce Fields	fa0a21269f	nfsd: bypass readahead cache when have struct file The readahead cache compensates for the fact that the NFS server currently does an open and close on every IO operation in the NFSv2 and NFSv3 case. In the NFSv4 case we have long-lived struct files associated with client opens, so there's no need for this. In fact, concurrent IO's using trying to modify the same file->f_ra may cause problems. So, don't bother with the readahead cache in that case. Note eventually we'll likely do this in the v2/v3 case as well by keeping a cache of struct files instead of struct file_ra_state's. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-27 18:15:54 -04:00
J. Bruce Fields	af4718f3f9	nfsd: minor nfsd_svc() cleanup More idiomatic to put the error case in the if clause. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:27 -04:00
J. Bruce Fields	59db4a0c10	nfsd: move more into nfsd_startup() This is just cleanup--it's harmless to call nfsd_rachache_init, nfsd_init_socks, and nfsd_reset_versions more than once. But there's no point to it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:26 -04:00
Jeff Layton	ac77efbe2b	nfsd: just keep single lockd reference for nfsd Right now, nfsd keeps a lockd reference for each socket that it has open. This is unnecessary and complicates the error handling on startup and shutdown. Change it to just do a lockd_up when starting the first nfsd thread just do a single lockd_down when taking down the last nfsd thread. Because of the strange way the sv_count is handled this requires an extra flag to tell whether the nfsd_serv holds a reference for lockd or not. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:26 -04:00
Jeff Layton	628b368728	nfsd: clean up nfsd_create_serv error handling There doesn't seem to be any need to reset the nfssvc_boot time if the nfsd startup failed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:25 -04:00
Jeff Layton	0cd14a061e	nfsd: fix error handling in __write_ports_addxprt __write_ports_addxprt calls nfsd_create_serv. That increases the refcount of nfsd_serv (which is tracked in sv_nrthreads). The service only decrements the thread count on error, not on success like __write_ports_addfd does, so using this interface leaves the nfsd thread count high. Fix this by having this function call svc_destroy() on error to release the reference (and possibly to tear down the service) and simply decrement the refcount without tearing down the service on success. This makes the sv_threads handling work basically the same in both __write_ports_addxprt and __write_ports_addfd. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:24 -04:00
Jeff Layton	78a8d7c8ca	nfsd: fix error handling when starting nfsd with rpcbind down The refcounting for nfsd is a little goofy. What happens is that we create the nfsd RPC service, attach sockets to it but don't actually start the threads until someone writes to the "threads" procfile. To do this, __write_ports_addfd will create the nfsd service and then will decrement the refcount when exiting but won't actually destroy the service. This is fine when there aren't errors, but when there are this can cause later attempts to start nfsd to fail. nfsd_serv will be set, and that causes __write_versions to return EBUSY. Fix this by calling svc_destroy on nfsd_serv when this function is going to return error. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:23 -04:00
Jeff Layton	4ad9a344be	nfsd4: fix v4 state shutdown error paths If someone tries to shut down the laundry_wq while it isn't up it'll cause an oops. This can happen because write_ports can create a nfsd_svc before we really start the nfs server, and we may fail before the server is ever started. Also make sure state is shutdown on error paths in nfsd_svc(). Use a common global nfsd_up flag instead of nfs4_init, and create common helper functions for nfsd start/shutdown, as there will be other work that we want done only when we the number of nfsd threads transitions between zero and nonzero. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:51:22 -04:00
J. Bruce Fields	55b13354d7	nfsd: remove unused assignment from nfsd_link Trivial cleanup, since "dest" is never used. Reported-by: Anshul Madan <Anshul.Madan@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2010-07-23 08:50:39 -04:00
Chuck Lever	43a9aa64a2	NFSD: Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR Some well-known NFSv3 clients drop their directory entry caches when they receive replies with no WCC data. Without this data, they employ extra READ, LOOKUP, and GETATTR requests to ensure their directory entry caches are up to date, causing performance to suffer needlessly. In order to return WCC data, our server has to have both the pre-op and the post-op attribute data on hand when a reply is XDR encoded. The pre-op data is filled in when the incoming fh is locked, and the post-op data is filled in when the fh is unlocked. Unfortunately, for REMOVE, RMDIR, MKNOD, and MKDIR, the directory fh is not unlocked until well after the reply has been XDR encoded. This means that encode_wcc_data() does not have wcc_data for the parent directory, so none is returned to the client after these operations complete. By unlocking the parent directory fh immediately after the internal operations for each NFS procedure is complete, the post-op data is filled in before XDR encoding starts, so it can be returned to the client properly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-07 17:12:32 -04:00
J. Bruce Fields	6a85d6c769	nfsd4: comment nitpick Reported-by: "Madan, Anshul" <Anshul.Madan@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-07-06 12:40:22 -04:00
J. Bruce Fields	cba9ba4b90	nfsd4: fix delegation recall race use-after-free When the rarely-used callback-connection-changing setclientid occurs simultaneously with a delegation recall, we rerun the recall by requeueing it on a workqueue. But we also need to take a reference on the delegation in that case, since the delegation held by the rpc itself will be released by the rpc_release callback. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-24 12:24:55 -04:00
J. Bruce Fields	ac94bf5825	nfsd4: fix deleg leak on callback error Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-24 12:24:53 -04:00
J. Bruce Fields	ec8acac84a	nfsd4: remove some debugging code This is overkill. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 22:29:03 -04:00
Benny Halevy	9303bbd3de	nfsd: nfs4callback encode_stateid helper function To be used also for the pnfs cb_layoutrecall callback Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfsd4: fix cb_recall encoding] "nfsd: nfs4callback encode_stateid helper function" forgot to reserve more space after return from the new helper. Reported-by: Michael Groshans <groshans@citi.umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:51 -04:00
J. Bruce Fields	4731030d58	nfsd4: translate memory errors to delay, not serverfault If the server is out of memory is better for clients to back off and retry than to just error out. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:36 -04:00
J. Bruce Fields	76407f76e0	nfsd4; fix session reference count leak Note the session has to be put() here regardless of what happens to the client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-22 17:19:28 -04:00
Linus Torvalds	b95a568093	Merge branch 'for-2.6.35' of git://linux-nfs.org/~bfields/linux * 'for-2.6.35' of git://linux-nfs.org/~bfields/linux: nfsd4: shut down callback queue outside state lock nfsd: nfsd_setattr needs to call commit_metadata	2010-06-09 12:43:04 -07:00
J. Bruce Fields	44b56603c4	Merge branch 'for-2.6.34-incoming' into for-2.6.35-incoming	2010-06-08 20:05:18 -04:00
J. Bruce Fields	c3935e3049	nfsd4: shut down callback queue outside state lock This reportedly causes a lockdep warning on nfsd shutdown. That looks like a false positive to me, but there's no reason why this needs the state lock anyway. Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-08 19:33:52 -04:00
Christoph Hellwig	b160fdabe9	nfsd: nfsd_setattr needs to call commit_metadata The conversion of write_inode_now calls to commit_metadata in commit `f501912a35` missed out the call in nfsd_setattr. But without this conversion we can't guarantee that a SETATTR request has actually been commited to disk with XFS, which causes a regression from 2.6.32 (only for NFSv2, but anyway). Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-06-01 19:17:50 -04:00
J. Bruce Fields	68a4b48ce6	nfsd4: don't bother storing callback reply tag We don't use this, and probably never will. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:59 -04:00
J. Bruce Fields	24a0111e40	nfsd4: fix use of op_share_access NFSv4.1 adds additional flags to the share_access argument of the open call. These flags need to be masked out in some of the existing code, but current code does that inconsistently. Tested-by: Michael Groshans <groshans@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:55 -04:00
J. Bruce Fields	172c85dd57	nfsd4: treat more recall errors as failures If a recall fails for some unexpected reason, instead of ignoring it and treating it like a success, it's safer to treat it as a failure, preventing further delgation grants and returning CB_PATH_DOWN. Also put put switches in a (two me) more logical order, with normal case first. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:53 -04:00
J. Bruce Fields	378b7d37f9	nfsd4: remove extra put() on callback errors Since rpc_call_async() guarantees that the release method will be called even on failure, this put is wrong. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-31 12:43:51 -04:00
Alexey Dobriyan	4be929be34	kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN - C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not USHORT_MAX/SHORT_MAX/SHORT_MIN. - Make SHRT_MIN of type s16, not int, for consistency. [akpm@linux-foundation.org: fix drivers/dma/timb_dma.c] [akpm@linux-foundation.org: fix security/keys/keyring.c] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-25 08:07:02 -07:00
Christoph Hellwig	8018ab0574	sanitize vfs_fsync calling conventions Now that the last user passing a NULL file pointer is gone we can remove the redundant dentry argument and associated hacks inside vfs_fsynmc_range. The next step will be removig the dentry argument from ->fsync, but given the luck with the last round of method prototype changes I'd rather defer this until after the main merge window. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:21 -04:00
Christoph Hellwig	e970a573ce	nfsd: open a file descriptor for fsync in nfs4 recovery Instead of just looking up a path use do_filp_open to get us a file structure for the nfs4 recovery directory. This allows us to get rid of the last non-standard vfs_fsync caller with a NULL file pointer. [AV: should be using fput(), not filp_close()] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:21 -04:00
J. Bruce Fields	e4e83ea47b	Revert "nfsd4: distinguish expired from stale stateids" This reverts commit `78155ed75f`. We're depending here on the boot time that we use to generate the stateid being monotonic, but get_seconds() is not necessarily. We still depend at least on boot_time being different every time, but that is a safer bet. We have a few reports of errors that might be explained by this problem, though we haven't been able to confirm any of them. But the minor gain of distinguishing expired from stale errors seems not worth the risk. Conflicts: fs/nfsd/nfs4state.c Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 19:03:50 -04:00
Pavel Emelyanov	47cee541a4	nfsd: safer initialization order in find_file() The alloc_init_file() first adds a file to the hash and then initializes its fi_inode, fi_id and fi_had_conflict. The uninitialized fi_inode could thus be erroneously checked by the find_file(), so move the hash insertion lower. The client_mutex should prevent this race in practice; however, we eventually hope to make less use of the client_mutex, so the ordering here is an accident waiting to happen. I didn't find whether the same can be true for two other fields, but the common sense tells me it's better to initialize an object before putting it into a global hash table :) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 12:05:20 -04:00
J. Bruce Fields	b7299f4439	nfs4: minor callback code simplification, comment Note the position in the version array doesn't have to match the actual rpc version number--to me it seems clearer to maintain the distinction. Also document choice of rpc callback version number, as discussed in e.g. http://www.ietf.org/mail-archive/web/nfsv4/current/msg07985.html and followups. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-18 11:51:38 -04:00
Pavel Emelyanov	15ddb4aec5	NFSD: don't report compiled-out versions as present The /proc/fs/nfsd/versions file calls nfsd_vers() to check whether the particular nfsd version is present/available. The problem is that once I turn off e.g. NFSD-V4 this call returns -1 which is true from the callers POV which is wrong. The proposal is to report false in that case. The bug has existed since `6658d3a7bb` "[PATCH] knfsd: remove nfsd_versbits as intermediate storage for desired versions". Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: stable@kernel.org Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-14 18:46:14 -04:00
J. Bruce Fields	4dc6ec00f6	nfsd4: implement reclaim_complete This is a mandatory operation. Also, here (not in open) is where we should be committing the reboot recovery information. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 12:03:11 -04:00
Benny Halevy	ab707e1565	nfsd4: nfsd4_destroy_session must set callback client under the state lock nfsd4_set_callback_client must be called under the state lock to atomically set or unset the callback client and shutting down the previous one. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:59:11 -04:00
Benny Halevy	d76829889a	nfsd4: keep a reference count on client while in use Get a refcount on the client on SEQUENCE, Release the refcount and renew the client when all respective compounds completed. Do not expire the client by the laundromat while in use. If the client was expired via another path, free it when the compounds complete and the refcount reaches 0. Note that unhash_client_locked must call list_del_init on cl_lru as it may be called twice for the same client (once from nfs4_laundromat and then from expire_client) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:58:54 -04:00
Benny Halevy	07cd4909a6	nfsd4: mark_client_expired Mark the client as expired under the client_lock so it won't be renewed when an nfsv4.1 session is done, after it was explicitly expired during processing of the compound. Do not renew a client mark as expired (in particular, it is not on the lru list anymore) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:47:22 -04:00
Benny Halevy	46583e2597	nfsd4: introduce nfs4_client.cl_refcount Currently just initialize the cl_refcount to 1 and decrement in expire_client(), conditionally freeing the client when the refcount reaches 0. To be used later by nfsv4.1 compounds to keep the client from timing out while in use. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-13 11:47:03 -04:00
Benny Halevy	84d38ac9ab	nfsd4: refactor expire_client Separate out unhashing of the client and session. To be used later by the laundromat. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:02 -04:00
Benny Halevy	36acb66bda	nfsd4: extend the client_lock to cover cl_lru To be used later on to hold a reference count on the client while in use by a nfsv4.1 compound. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:02 -04:00
Benny Halevy	328efbab0f	nfsd4: use list_move in move_to_confirmed rather than list_del_init, list_add Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
Benny Halevy	be1fdf6c43	nfsd4: fold release_session into expire_client and grab the client lock once for all the client's sessions. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
Benny Halevy	9089f1b478	nfsd4: rename sessionid_lock to client_lock In preparation to share the lock's scope to both client and session hash tables. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-11 21:02:01 -04:00
J. Bruce Fields	5d4cec2f2f	nfsd4: fix bare destroy_session null dereference It's legal to send a DESTROY_SESSION outside any session (as the only operation in a compound), in which case cstate->session will be NULL; check for that case. While we're at it, move these checks into a separate helper function. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-07 19:08:47 -04:00
J. Bruce Fields	5306293c9c	Merge commit 'v2.6.34-rc6' Conflicts: fs/nfsd/nfs4callback.c	2010-05-04 11:29:05 -04:00
Benny Halevy	dbd65a7e44	nfsd4: use local variable in nfs4svc_encode_compoundres 'cs' is already computed, re-use it. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-05-04 10:10:36 -04:00
J. Bruce Fields	26c0c75e69	nfsd4: fix unlikely race in session replay case In the replay case, the renew_client(session->se_client); happens after we've droppped the sessionid_lock, and without holding a reference on the session; so there's nothing preventing the session being freed before we get here. Thanks to Benny Halevy for catching a bug in an earlier version of this patch. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Benny Halevy <bhalevy@panasas.com>	2010-05-03 08:32:31 -04:00
Neil Brown	2bc3c1179c	nfsd4: bug in read_buf When read_buf is called to move over to the next page in the pagelist of an NFSv4 request, it sets argp->end to essentially a random number, certainly not an address within the page which argp->p now points to. So subsequent calls to READ_BUF will think there is much more than a page of spare space (the cast to u32 ensures an unsigned comparison) so we can expect to fall off the end of the second page. We never encountered thsi in testing because typically the only operations which use more than two pages are write-like operations, which have their own decoding logic. Something like a getattr after a write may cross a page boundary, but it would be very unusual for it to cross another boundary after that. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-26 15:39:08 -04:00
Dan Carpenter	d03859a4ac	nfsd: potential ERR_PTR dereference on exp_export() error paths. We "goto finish" from several places where "exp" is an ERR_PTR. Also I changed the check for "fsid_key" so that it was consistent with the check I added. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 12:03:02 -04:00
J. Bruce Fields	5771635592	nfsd4: complete enforcement of 4.1 op ordering Enforce the rules about compound op ordering. Motivated by implementing RECLAIM_COMPLETE, for which the client is implicit in the current session, so it is important to ensure a succesful SEQUENCE proceeds the RECLAIM_COMPLETE. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:35:14 -04:00
J. Bruce Fields	4b21d0defc	nfsd4: allow 4.0 clients to change callback path The rfc allows a client to change the callback parameters, but we didn't previously implement it. Teach the callbacks to rerun themselves (by placing themselves on a workqueue) when they recognize that their rpc task has been killed and that the callback connection has changed. Then we can change the callback connection by setting up a new rpc client, modifying the nfs4 client to point at it, waiting for any work in progress to complete, and then shutting down the old client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	2bf23875f5	nfsd4: rearrange cb data structures Mainly I just want to separate the arguments used for setting up the tcp client from the rest. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	b12a05cbdf	nfsd4: cl_count is unused Now that the shutdown sequence guarantees callbacks are shut down before the client is destroyed, we no longer have a use for cl_count. We'll probably reinstate a reference count on the client some day, but it will be held by users other than callbacks. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:02 -04:00
J. Bruce Fields	b5a1a81e5c	nfsd4: don't sleep in lease-break callback The NFSv4 server's fl_break callback can sleep (dropping the BKL), in order to allocate a new rpc task to send a recall to the client. As far as I can tell this doesn't cause any races in the current code, but the analysis is difficult. Also, the sleep here may complicate the move away from the BKL. So, just schedule some work to do the job for us instead. The work will later also prove useful for restarting a call after the callback information is changed. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-22 11:34:01 -04:00
J. Bruce Fields	3c4ab2aaa9	nfsd4: indentation cleanup Looks like a put-and-paste mistake. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-19 15:12:51 -04:00
J. Bruce Fields	408b79bcc3	nfsd4: consistent session flag setting We should clear these flags on any new create_session, not just on the first one. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-16 21:47:37 -04:00
J. Bruce Fields	9045b4b9f7	nfsd4: remove probe task's reference on client Any null probe rpc will be synchronously destroyed by the rpc_shutdown_client() in expire_client(), so the rpc task cannot outlast the nfs4 client. Therefore there's no need for that task to hold a reference on the client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 17:04:32 -04:00
J. Bruce Fields	3df796dbe9	nfsd4: remove dprintk I haven't found this useful. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 17:04:31 -04:00
J. Bruce Fields	147efd0dd7	nfsd4: shutdown callbacks on expiry Once we've expired the client, there's no further purpose to the callbacks; go ahead and shut down the callback client rather than waiting for the last reference to go. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 16:36:30 -04:00
J. Bruce Fields	227f98d98d	nfsd4: preallocate nfs4_rpc_args Instead of allocating this small structure, just include it in the delegation. The nfsd4_callback structure isn't really necessary yet, but we plan to add to it all the information necessary to perform a callback. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-04-02 16:28:11 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Jeff Layton	91885258e8	nfsd: don't break lease while servicing a COMMIT This is the second attempt to fix the problem whereby a COMMIT call causes a lease break and triggers a possible deadlock. The problem is that nfsd attempts to break a lease on a COMMIT call. This triggers a delegation recall if the lease is held for a delegation. If the client is the one holding the delegation and it's the same one on which it's issuing the COMMIT, then it can't return that delegation until the COMMIT is complete. But, nfsd won't complete the COMMIT until the delegation is returned. The client and server are essentially deadlocked until the state is marked bad (due to the client not responding on the callback channel). The first patch attempted to deal with this by eliminating the open of the file altogether and simply had nfsd_commit pass a NULL file pointer to the vfs_fsync_range. That would conflict with some work in progress by Christoph Hellwig to clean up the fsync interface, so this patch takes a different approach. This declares a new NFSD_MAY_NOT_BREAK_LEASE access flag that indicates to nfsd_open that it should not break any leases when opening the file, and has nfsd_commit set that flag on the nfsd_open call. For now, this patch leaves nfsd_commit opening the file with write access since I'm not clear on what sort of access would be more appropriate. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-22 15:37:53 -04:00
NeilBrown	61f8603d93	nfsd: factor out hash functions for export caches. Both the _lookup and the _update functions for these two caches independently calculate the hash of the key. So factor out that code for improved reuse. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-16 18:05:11 -04:00
J. Bruce Fields	e739cf1da4	Merge commit 'v2.6.34-rc1' into for-2.6.35-incoming	2010-03-09 17:22:08 -05:00
Jiri Kosina	318ae2edc3	Merge branch 'for-next' into for-linus Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c	2010-03-08 16:55:37 +01:00
J. Bruce Fields	e7b184f199	nfsd4: document lease/grace-period limits The current documentation here is out of date, and not quite right. (Future work: some user documentation would be useful.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:10 -05:00
J. Bruce Fields	efc4bb4fdd	nfsd4: allow setting grace period time Allow explicit configuration of the grace period time as well as the lease period time. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:08 -05:00
J. Bruce Fields	f013574014	nfsd4: reshuffle lease-setting code to allow reuse We'll soon allow setting the grace period, so we'll want to share this code. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:03 -05:00
J. Bruce Fields	f958a1320f	nfsd4: remove unnecessary lease-setting function This is another layer of indirection that doesn't really buy us anything. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:03 -05:00
J. Bruce Fields	e46b498c84	nfsd4: simplify lease/grace interaction The original code here assumed we'd allow the user to change the lease any time, but only allow the change to take effect on restart. Since then we modified the code to allow setting the lease on when the server is down. Update the rest of the code to reflect that fact, clarify variable names, and add document. Also, the code insisted that the grace period always be the longer of the old and new lease periods, but that's overly conservative--as long as it lasts at least the old lease period, old clients should still know to recover in time. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:02 -05:00
J. Bruce Fields	cf07d2ea43	nfsd4: simplify references to nfsd4 lease time Instead of accessing the lease time directly, some users call nfs4_lease_time(), and some a macro, NFSD_LEASE_TIME, defined as nfs4_lease_time(). Neither layer of indirection serves any purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-06 15:02:01 -05:00
Linus Torvalds	05c5cb31ec	Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux * 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: (22 commits) nfsd4: fix minor memory leak svcrpc: treat uid's as unsigned nfsd: ensure sockets are closed on error Revert "sunrpc: move the close processing after do recvfrom method" Revert "sunrpc: fix peername failed on closed listener" sunrpc: remove unnecessary svc_xprt_put NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN xfs_export_operations.commit_metadata commit_metadata export operation replacing nfsd_sync_dir lockd: don't clear sm_monitored on nsm_reboot_lookup lockd: release reference to nsm_handle in nlm_host_rebooted nfsd: Use vfs_fsync_range() in nfsd_commit NFSD: Create PF_INET6 listener in write_ports SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt() NFSD: Support AF_INET6 in svc_addsock() function SUNRPC: Use rpc_pton() in ip_map_parse() nfsd: 4.1 has an rfc number nfsd41: Create the recovery entry for the NFSv4.1 client nfsd: use vfs_fsync for non-directories ...	2010-03-06 11:31:38 -08:00
Wu Fengguang	42e4960868	vfs: take f_lock on modifying f_mode after open time We'll introduce FMODE_RANDOM which will be runtime modified. So protect all runtime modification to f_mode with f_lock to avoid races. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@kernel.org> [2.6.33.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:25 -08:00
Linus Torvalds	e213e26ab3	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits) quota: stop using QUOTA_OK / NO_QUOTA dquot: cleanup dquot initialize routine dquot: move dquot initialization responsibility into the filesystem dquot: cleanup dquot drop routine dquot: move dquot drop responsibility into the filesystem dquot: cleanup dquot transfer routine dquot: move dquot transfer responsibility into the filesystem dquot: cleanup inode allocation / freeing routines dquot: cleanup space allocation / freeing routines ext3: add writepage sanity checks ext3: Truncate allocated blocks if direct IO write fails to update i_size quota: Properly invalidate caches even for filesystems with blocksize < pagesize quota: generalize quota transfer interface quota: sb_quota state flags cleanup jbd: Delay discarding buffers in journal_unmap_buffer ext3: quota_write cross block boundary behaviour quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota quota: split out compat_sys_quotactl support from quota.c quota: split out netlink notification support from quota.c quota: remove invalid optimization from quota_sync_all ... Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c	2010-03-05 13:20:53 -08:00
Christoph Hellwig	907f4554e2	dquot: move dquot initialization responsibility into the filesystem Currently various places in the VFS call vfs_dq_init directly. This means we tie the quota code into the VFS. Get rid of that and make the filesystem responsible for the initialization. For most metadata operations this is a straight forward move into the methods, but for truncate and open it's a bit more complicated. For truncate we currently only call vfs_dq_init for the sys_truncate case because open already takes care of it for ftruncate and open(O_TRUNC) - the new code causes an additional vfs_dq_init for those which is harmless. For open the initialization is moved from do_filp_open into the open method, which means it happens slightly earlier now, and only for regular files. The latter is fine because we don't need to initialize it for operations on special files, and we already do it as part of the namespace operations for directories. Add a dquot_file_open helper that filesystems that support generic quotas can use to fill in ->open. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2010-03-05 00:20:30 +01:00
J. Bruce Fields	4ea41e2de5	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs into for-2.6.34-incoming Resolve merge conflict in fs/xfs/linux-2.6/xfs_export.c.	2010-03-04 12:04:51 -05:00
J. Bruce Fields	8d75da8afd	nfsd4: fix minor memory leak There's no need to allocate this cred more than once. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-03-03 16:13:29 -05:00
Al Viro	462d60577a	fix NFS4 handling of mountpoint stat RFC says we need to follow the chain of mounts if there's more than one stacked on that point. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:07:57 -05:00
Al Viro	8737c9305b	Switch may_open() and break_lease() to passing O_... ... instead of mixing FMODE_ and O_ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 13:00:21 -05:00
Chuck Lever	58255a4e3c	NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN The server's callback client should stop trying to connect to the client's callback server as soon as it gets ECONNREFUSED. The NFS server's callback client does not call rpc_ping(), but appears to have it's own "ping" procedure, so it wasn't covered by commit `caabea8a`. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-24 17:50:28 -08:00
Ben Myers	f501912a35	commit_metadata export operation replacing nfsd_sync_dir - Add commit_metadata export_operation to allow the underlying filesystem to decide how to commit an inode most efficiently. - Usage of nfsd_sync_dir and write_inode_now has been replaced with the commit_metadata function that takes a svc_fh. - The commit_metadata function calls the commit_metadata export_op if it's there, or else falls back to sync_inode instead of fsync and write_inode_now because only metadata need be synced here. - nfsd4_sync_rec_dir now uses vfs_fsync so that commit_metadata can be static Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-20 13:13:44 -08:00
Chuck Ebbert	aeaa5ccd64	vfs: don't call ima_file_check() unconditionally in nfsd_open() commit `1e41568d73` ("Take ima_path_check() in nfsd past dentry_open() in nfsd_open()") moved this code back to its original location but missed the "else". Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-20 00:47:31 -05:00
Daniel Mack	3ad2f3fbb9	tree-wide: Assorted spelling fixes In particular, several occurances of funny versions of 'success', 'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address', 'beginning', 'desirable', 'separate' and 'necessary' are fixed. Signed-off-by: Daniel Mack <daniel@caiaq.de> Cc: Joe Perches <joe@perches.com> Cc: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-09 11:13:56 +01:00
Linus Torvalds	deb0c98c7f	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: Revert "nfsd4: fix error return when pseudoroot missing"	2010-02-08 17:08:01 -08:00
J. Bruce Fields	260c64d235	Revert "nfsd4: fix error return when pseudoroot missing" Commit `f39bde24b2` fixed the error return from PUTROOTFH in the case where there is no pseudofilesystem. This is really a case we shouldn't hit on a correctly configured server: in the absence of a root filehandle, there's no point accepting version 4 NFS rpc calls at all. But the shared responsibility between kernel and userspace here means the kernel on its own can't eliminate the possiblity of this happening. And we have indeed gotten this wrong in distro's, so new client-side mount code that attempts to negotiate v4 by default first has to work around this case. Therefore when commit `f39bde24b2` arrived at roughly the same time as the new v4-default mount code, which explicitly checked only for the previous error, the result was previously fine mounts suddenly failing. We'll fix both sides for now: revert the error change, and make the client-side mount workaround more robust. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-02-08 15:25:23 -05:00
Mimi Zohar	9bbb6cad01	ima: rename ima_path_check to ima_file_check ima_path_check actually deals with files! call it ima_file_check instead. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Mimi Zohar	8eb988c70e	fix ima breakage The "Untangling ima mess, part 2 with counters" patch messed up the counters. Based on conversations with Al Viro, this patch streamlines ima_path_check() by removing the counter maintaince. The counters are now updated independently, from measuring the file, in __dentry_open() and alloc_file() by calling ima_counts_get(). ima_path_check() is called from nfsd and do_filp_open(). It also did not measure all files that should have been measured. Reason: ima_path_check() got bogus value passed as mask. [AV: mea culpa] [AV: add missing nfsd bits] Signed-off-by: Mimi Zohar <zohar@us.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Al Viro	1e41568d73	Take ima_path_check() in nfsd past dentry_open() in nfsd_open() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Trond Myklebust	aa696a6f34	nfsd: Use vfs_fsync_range() in nfsd_commit The NFS COMMIT operation allows the client to specify the exact byte range that it wishes to sync to disk in order to optimise server performance. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-29 18:53:11 -05:00
Chuck Lever	37498292aa	NFSD: Create PF_INET6 listener in write_ports Try to create a PF_INET6 listener for NFSD, if IPv6 is enabled in the kernel. Make sure nfsd_serv's reference count is decreased if __write_ports_addxprt() failed to create a listener. See __write_ports_addfd(). Our current plan is to rely on rpc.nfsd to create appropriate IPv6 listeners when server-side NFS/IPv6 support is desired. Legacy behavior, via the write_threads or write_svc kernel APIs, will remain the same -- only IPv4 listeners are created. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [bfields@citi.umich.edu: Move error-handling code to end] Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-27 17:01:08 -05:00
Chuck Lever	6871790815	SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found" write_ports() converts svc_create_xprt()'s ENOENT error return to EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error message that makes sense. It turns out that several of the other kernel APIs rpc.nfsd use can also return ENOENT from svc_create_xprt(), by way of lockd_up(). On the client side, an NFSv2 or NFSv3 mount request can also return the result of lockd_up(). This error may also be returned during an NFSv4 mount request, since the NFSv4 callback service uses svc_create_xprt() to create the callback listener. An ENOENT error return results in a confusing error message from the mount command. Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-26 17:59:21 -05:00
Ricardo Labiaga	8b8aae4009	nfsd41: Create the recovery entry for the NFSv4.1 client Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-14 12:24:46 -05:00
Christoph Hellwig	6a68f89ee1	nfsd: use vfs_fsync for non-directories Instead of opencoding the fsync calling sequence use vfs_fsync. This also gets rid of the useless i_mutex over the data writeout. Consolidate the remaining special code for syncing directories and document it's quirks. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Ricardo Labiaga	de3cab793c	nfsd4: Use FIRST_NFS4_OP in nfsd4_decode_compound() Since we're checking for LAST_NFS4_OP, use FIRST_NFS4_OP to be consistent. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Ricardo Labiaga	c551866e64	nfsd41: nfsd4_decode_compound() does not recognize all ops The server incorrectly assumes that the operations in the array start with value 0. The first operation (OP_ACCESS) has a value of 3, causing the check in nfsd4_decode_compound to be off. Instead of comparing that the operation number is less than the number of elements in the array, the server should verify that it is less than the maximum valid operation number defined by LAST_NFS4_OP. Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-13 09:42:26 -05:00
Linus Torvalds	93939f4e5d	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: sunrpc: fix peername failed on closed listener nfsd: make sure data is on disk before calling ->fsync nfsd: fix "insecure" export option	2010-01-06 18:10:15 -08:00
Christoph Hellwig	7211a4e859	nfsd: make sure data is on disk before calling ->fsync nfsd is not using vfs_fsync, so I missed it when changing the calling convention during the 2.6.32 window. This patch fixes it to not only start the data writeout, but also wait for it to complete before calling into ->fsync. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-06 17:37:26 -05:00
J. Bruce Fields	3d354cbc43	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-20 20:19:51 -08:00
J. Bruce Fields	f69ac2f5a3	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-20 10:22:58 -05:00
Linus Torvalds	bac5e54c29	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits) direct I/O fallback sync simplification ocfs: stop using do_sync_mapping_range cleanup blockdev_direct_IO locking make generic_acl slightly more generic sanitize xattr handler prototypes libfs: move EXPORT_SYMBOL for d_alloc_name vfs: force reval of target when following LAST_BIND symlinks (try #7) ima: limit imbalance msg Untangling ima mess, part 3: kill dead code in ima Untangling ima mess, part 2: deal with counters Untangling ima mess, part 1: alloc_file() O_TRUNC open shouldn't fail after file truncation ima: call ima_inode_free ima_inode_free IMA: clean up the IMA counts updating code ima: only insert at inode creation time ima: valid return code from ima_inode_alloc fs: move get_empty_filp() deffinition to internal.h Sanitize exec_permission_lite() Kill cached_lookup() and real_lookup() Kill path_lookup_open() ... Trivial conflicts in fs/direct-io.c	2009-12-16 12:04:02 -08:00

... 3 4 5 6 7 ...

1254 Commits