Commit Graph

894 Commits

Author SHA1 Message Date
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Jeff Layton
91885258e8 nfsd: don't break lease while servicing a COMMIT
This is the second attempt to fix the problem whereby a COMMIT call
causes a lease break and triggers a possible deadlock.

The problem is that nfsd attempts to break a lease on a COMMIT call.
This triggers a delegation recall if the lease is held for a delegation.
If the client is the one holding the delegation and it's the same one on
which it's issuing the COMMIT, then it can't return that delegation
until the COMMIT is complete. But, nfsd won't complete the COMMIT until
the delegation is returned. The client and server are essentially
deadlocked until the state is marked bad (due to the client not
responding on the callback channel).

The first patch attempted to deal with this by eliminating the open of
the file altogether and simply had nfsd_commit pass a NULL file pointer
to the vfs_fsync_range. That would conflict with some work in progress
by Christoph Hellwig to clean up the fsync interface, so this patch
takes a different approach.

This declares a new NFSD_MAY_NOT_BREAK_LEASE access flag that indicates
to nfsd_open that it should not break any leases when opening the file,
and has nfsd_commit set that flag on the nfsd_open call.

For now, this patch leaves nfsd_commit opening the file with write
access since I'm not clear on what sort of access would be more
appropriate.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-22 15:37:53 -04:00
NeilBrown
61f8603d93 nfsd: factor out hash functions for export caches.
Both the _lookup and the _update functions for these two caches
independently calculate the hash of the key.
So factor out that code for improved reuse.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-16 18:05:11 -04:00
J. Bruce Fields
e739cf1da4 Merge commit 'v2.6.34-rc1' into for-2.6.35-incoming 2010-03-09 17:22:08 -05:00
Jiri Kosina
318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
J. Bruce Fields
e7b184f199 nfsd4: document lease/grace-period limits
The current documentation here is out of date, and not quite right.

(Future work: some user documentation would be useful.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:10 -05:00
J. Bruce Fields
efc4bb4fdd nfsd4: allow setting grace period time
Allow explicit configuration of the grace period time as well as the
lease period time.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:08 -05:00
J. Bruce Fields
f013574014 nfsd4: reshuffle lease-setting code to allow reuse
We'll soon allow setting the grace period, so we'll want to share this
code.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:03 -05:00
J. Bruce Fields
f958a1320f nfsd4: remove unnecessary lease-setting function
This is another layer of indirection that doesn't really buy us
anything.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:03 -05:00
J. Bruce Fields
e46b498c84 nfsd4: simplify lease/grace interaction
The original code here assumed we'd allow the user to change the lease
any time, but only allow the change to take effect on restart.  Since
then we modified the code to allow setting the lease on when the server
is down.  Update the rest of the code to reflect that fact, clarify
variable names, and add document.

Also, the code insisted that the grace period always be the longer of
the old and new lease periods, but that's overly conservative--as long
as it lasts at least the old lease period, old clients should still know
to recover in time.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:02 -05:00
J. Bruce Fields
cf07d2ea43 nfsd4: simplify references to nfsd4 lease time
Instead of accessing the lease time directly, some users call
nfs4_lease_time(), and some a macro, NFSD_LEASE_TIME, defined as
nfs4_lease_time().  Neither layer of indirection serves any purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-06 15:02:01 -05:00
Linus Torvalds
05c5cb31ec Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: (22 commits)
  nfsd4: fix minor memory leak
  svcrpc: treat uid's as unsigned
  nfsd: ensure sockets are closed on error
  Revert "sunrpc: move the close processing after do recvfrom method"
  Revert "sunrpc: fix peername failed on closed listener"
  sunrpc: remove unnecessary svc_xprt_put
  NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN
  xfs_export_operations.commit_metadata
  commit_metadata export operation replacing nfsd_sync_dir
  lockd: don't clear sm_monitored on nsm_reboot_lookup
  lockd: release reference to nsm_handle in nlm_host_rebooted
  nfsd: Use vfs_fsync_range() in nfsd_commit
  NFSD: Create PF_INET6 listener in write_ports
  SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found"
  SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt()
  NFSD: Support AF_INET6 in svc_addsock() function
  SUNRPC: Use rpc_pton() in ip_map_parse()
  nfsd: 4.1 has an rfc number
  nfsd41: Create the recovery entry for the NFSv4.1 client
  nfsd: use vfs_fsync for non-directories
  ...
2010-03-06 11:31:38 -08:00
Wu Fengguang
42e4960868 vfs: take f_lock on modifying f_mode after open time
We'll introduce FMODE_RANDOM which will be runtime modified.  So protect
all runtime modification to f_mode with f_lock to avoid races.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@kernel.org>			[2.6.33.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-06 11:26:25 -08:00
Linus Torvalds
e213e26ab3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
  quota: stop using QUOTA_OK / NO_QUOTA
  dquot: cleanup dquot initialize routine
  dquot: move dquot initialization responsibility into the filesystem
  dquot: cleanup dquot drop routine
  dquot: move dquot drop responsibility into the filesystem
  dquot: cleanup dquot transfer routine
  dquot: move dquot transfer responsibility into the filesystem
  dquot: cleanup inode allocation / freeing routines
  dquot: cleanup space allocation / freeing routines
  ext3: add writepage sanity checks
  ext3: Truncate allocated blocks if direct IO write fails to update i_size
  quota: Properly invalidate caches even for filesystems with blocksize < pagesize
  quota: generalize quota transfer interface
  quota: sb_quota state flags cleanup
  jbd: Delay discarding buffers in journal_unmap_buffer
  ext3: quota_write cross block boundary behaviour
  quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
  quota: split out compat_sys_quotactl support from quota.c
  quota: split out netlink notification support from quota.c
  quota: remove invalid optimization from quota_sync_all
  ...

Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
2010-03-05 13:20:53 -08:00
Christoph Hellwig
907f4554e2 dquot: move dquot initialization responsibility into the filesystem
Currently various places in the VFS call vfs_dq_init directly.  This means
we tie the quota code into the VFS.  Get rid of that and make the
filesystem responsible for the initialization.   For most metadata operations
this is a straight forward move into the methods, but for truncate and
open it's a bit more complicated.

For truncate we currently only call vfs_dq_init for the sys_truncate case
because open already takes care of it for ftruncate and open(O_TRUNC) - the
new code causes an additional vfs_dq_init for those which is harmless.

For open the initialization is moved from do_filp_open into the open method,
which means it happens slightly earlier now, and only for regular files.
The latter is fine because we don't need to initialize it for operations
on special files, and we already do it as part of the namespace operations
for directories.

Add a dquot_file_open helper that filesystems that support generic quotas
can use to fill in ->open.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2010-03-05 00:20:30 +01:00
J. Bruce Fields
4ea41e2de5 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs into for-2.6.34-incoming
Resolve merge conflict in fs/xfs/linux-2.6/xfs_export.c.
2010-03-04 12:04:51 -05:00
J. Bruce Fields
8d75da8afd nfsd4: fix minor memory leak
There's no need to allocate this cred more than once.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-03-03 16:13:29 -05:00
Al Viro
462d60577a fix NFS4 handling of mountpoint stat
RFC says we need to follow the chain of mounts if there's more
than one stacked on that point.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:57 -05:00
Al Viro
8737c9305b Switch may_open() and break_lease() to passing O_...
... instead of mixing FMODE_ and O_

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 13:00:21 -05:00
Chuck Lever
58255a4e3c NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN
The server's callback client should stop trying to connect to the
client's callback server as soon as it gets ECONNREFUSED.

The NFS server's callback client does not call rpc_ping(), but appears
to have it's own "ping" procedure, so it wasn't covered by commit
caabea8a.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-02-24 17:50:28 -08:00
Ben Myers
f501912a35 commit_metadata export operation replacing nfsd_sync_dir
- Add commit_metadata export_operation to allow the underlying filesystem to
decide how to commit an inode most efficiently.

- Usage of nfsd_sync_dir and write_inode_now has been replaced with the
commit_metadata function that takes a svc_fh.

- The commit_metadata function calls the commit_metadata export_op if it's
there, or else falls back to sync_inode instead of fsync and write_inode_now
because only metadata need be synced here.

- nfsd4_sync_rec_dir now uses vfs_fsync so that commit_metadata can be static

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-02-20 13:13:44 -08:00
Chuck Ebbert
aeaa5ccd64 vfs: don't call ima_file_check() unconditionally in nfsd_open()
commit 1e41568d73 ("Take ima_path_check()
in nfsd past dentry_open() in nfsd_open()") moved this code back to its
original location but missed the "else".

Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-02-20 00:47:31 -05:00
Daniel Mack
3ad2f3fbb9 tree-wide: Assorted spelling fixes
In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack <daniel@caiaq.de>
Cc: Joe Perches <joe@perches.com>
Cc: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-09 11:13:56 +01:00
Linus Torvalds
deb0c98c7f Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.33' of git://linux-nfs.org/~bfields/linux:
  Revert "nfsd4: fix error return when pseudoroot missing"
2010-02-08 17:08:01 -08:00
J. Bruce Fields
260c64d235 Revert "nfsd4: fix error return when pseudoroot missing"
Commit f39bde24b2 fixed the error return from PUTROOTFH in the
case where there is no pseudofilesystem.

This is really a case we shouldn't hit on a correctly configured server:
in the absence of a root filehandle, there's no point accepting version
4 NFS rpc calls at all.

But the shared responsibility between kernel and userspace here means
the kernel on its own can't eliminate the possiblity of this happening.
And we have indeed gotten this wrong in distro's, so new client-side
mount code that attempts to negotiate v4 by default first has to work
around this case.

Therefore when commit f39bde24b2 arrived at roughly the same
time as the new v4-default mount code, which explicitly checked only for
the previous error, the result was previously fine mounts suddenly
failing.

We'll fix both sides for now: revert the error change, and make the
client-side mount workaround more robust.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-02-08 15:25:23 -05:00
Mimi Zohar
9bbb6cad01 ima: rename ima_path_check to ima_file_check
ima_path_check actually deals with files!  call it ima_file_check instead.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-02-07 03:06:22 -05:00
Mimi Zohar
8eb988c70e fix ima breakage
The "Untangling ima mess, part 2 with counters" patch messed
up the counters.  Based on conversations with Al Viro, this patch
streamlines ima_path_check() by removing the counter maintaince.
The counters are now updated independently, from measuring the file,
in __dentry_open() and alloc_file() by calling ima_counts_get().
ima_path_check() is called from nfsd and do_filp_open().
It also did not measure all files that should have been measured.
Reason: ima_path_check() got bogus value passed as mask.
[AV: mea culpa]
[AV: add missing nfsd bits]

Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-02-07 03:06:22 -05:00
Al Viro
1e41568d73 Take ima_path_check() in nfsd past dentry_open() in nfsd_open()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-02-07 03:06:22 -05:00
Trond Myklebust
aa696a6f34 nfsd: Use vfs_fsync_range() in nfsd_commit
The NFS COMMIT operation allows the client to specify the exact byte range
that it wishes to sync to disk in order to optimise server performance.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-29 18:53:11 -05:00
Chuck Lever
37498292aa NFSD: Create PF_INET6 listener in write_ports
Try to create a PF_INET6 listener for NFSD, if IPv6 is enabled in the
kernel.

Make sure nfsd_serv's reference count is decreased if
__write_ports_addxprt() failed to create a listener.  See
__write_ports_addfd().

Our current plan is to rely on rpc.nfsd to create appropriate IPv6
listeners when server-side NFS/IPv6 support is desired.  Legacy
behavior, via the write_threads or write_svc kernel APIs, will remain
the same -- only IPv4 listeners are created.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[bfields@citi.umich.edu: Move error-handling code to end]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-27 17:01:08 -05:00
Chuck Lever
6871790815 SUNRPC: NFS kernel APIs shouldn't return ENOENT for "transport not found"
write_ports() converts svc_create_xprt()'s ENOENT error return to
EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error
message that makes sense.

It turns out that several of the other kernel APIs rpc.nfsd use can
also return ENOENT from svc_create_xprt(), by way of lockd_up().

On the client side, an NFSv2 or NFSv3 mount request can also return
the result of lockd_up().  This error may also be returned during an
NFSv4 mount request, since the NFSv4 callback service uses
svc_create_xprt() to create the callback listener.  An ENOENT error
return results in a confusing error message from the mount command.

Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-26 17:59:21 -05:00
Ricardo Labiaga
8b8aae4009 nfsd41: Create the recovery entry for the NFSv4.1 client
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-14 12:24:46 -05:00
Christoph Hellwig
6a68f89ee1 nfsd: use vfs_fsync for non-directories
Instead of opencoding the fsync calling sequence use vfs_fsync.  This also
gets rid of the useless i_mutex over the data writeout.

Consolidate the remaining special code for syncing directories and document
it's quirks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-13 09:42:26 -05:00
Ricardo Labiaga
de3cab793c nfsd4: Use FIRST_NFS4_OP in nfsd4_decode_compound()
Since we're checking for LAST_NFS4_OP, use FIRST_NFS4_OP to be consistent.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-13 09:42:26 -05:00
Ricardo Labiaga
c551866e64 nfsd41: nfsd4_decode_compound() does not recognize all ops
The server incorrectly assumes that the operations in the
array start with value 0.  The first operation (OP_ACCESS)
has a value of 3, causing the check in nfsd4_decode_compound
to be off.

Instead of comparing that the operation number is less than
the number of elements in the array, the server should verify
that it is less than the maximum valid operation number
defined by LAST_NFS4_OP.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-13 09:42:26 -05:00
Linus Torvalds
93939f4e5d Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.33' of git://linux-nfs.org/~bfields/linux:
  sunrpc: fix peername failed on closed listener
  nfsd: make sure data is on disk before calling ->fsync
  nfsd: fix "insecure" export option
2010-01-06 18:10:15 -08:00
Christoph Hellwig
7211a4e859 nfsd: make sure data is on disk before calling ->fsync
nfsd is not using vfs_fsync, so I missed it when changing the calling
convention during the 2.6.32 window.  This patch fixes it to not only
start the data writeout, but also wait for it to complete before calling
into ->fsync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-01-06 17:37:26 -05:00
J. Bruce Fields
3d354cbc43 nfsd: fix "insecure" export option
A typo in 12045a6ee9 "nfsd: let "insecure" flag vary by
pseudoflavor" reversed the sense of the "insecure" flag.

Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-20 20:19:51 -08:00
J. Bruce Fields
f69ac2f5a3 nfsd: fix "insecure" export option
A typo in 12045a6ee9 "nfsd: let "insecure" flag vary by
pseudoflavor" reversed the sense of the "insecure" flag.

Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-20 10:22:58 -05:00
Linus Torvalds
bac5e54c29 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits)
  direct I/O fallback sync simplification
  ocfs: stop using do_sync_mapping_range
  cleanup blockdev_direct_IO locking
  make generic_acl slightly more generic
  sanitize xattr handler prototypes
  libfs: move EXPORT_SYMBOL for d_alloc_name
  vfs: force reval of target when following LAST_BIND symlinks (try #7)
  ima: limit imbalance msg
  Untangling ima mess, part 3: kill dead code in ima
  Untangling ima mess, part 2: deal with counters
  Untangling ima mess, part 1: alloc_file()
  O_TRUNC open shouldn't fail after file truncation
  ima: call ima_inode_free ima_inode_free
  IMA: clean up the IMA counts updating code
  ima: only insert at inode creation time
  ima: valid return code from ima_inode_alloc
  fs: move get_empty_filp() deffinition to internal.h
  Sanitize exec_permission_lite()
  Kill cached_lookup() and real_lookup()
  Kill path_lookup_open()
  ...

Trivial conflicts in fs/direct-io.c
2009-12-16 12:04:02 -08:00
Al Viro
1429b3eca2 Untangling ima mess, part 3: kill dead code in ima
Kill the 'update' argument of ima_path_check(), kill
dead code in ima.

Current rules: ima counters are bumped at the same time
when the file switches from put_filp() fodder to fput()
one.  Which happens exactly in two places - alloc_file()
and __dentry_open().  Nothing else needs to do that at
all.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:47 -05:00
Al Viro
b65a9cfc2c Untangling ima mess, part 2: deal with counters
* do ima_get_count() in __dentry_open()
* stop doing that in followups
* move ima_path_check() to right after nameidata_to_filp()
* don't bump counters on it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-16 12:16:47 -05:00
J. Bruce Fields
7663dacd92 nfsd: remove pointless paths in file headers
The new .h files have paths at the top that are now out of date.  While
we're here, just remove all of those from fs/nfsd; they never served any
purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:47 -05:00
J. Bruce Fields
1557aca790 nfsd: move most of nfsfh.h to fs/nfsd
Most of this can be trivially moved to a private header as well.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:46 -05:00
J. Bruce Fields
774b147828 nfsd: make V4ROOT exports read-only
I can't see any use for writeable V4ROOT exports.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:44 -05:00
Steve Dickson
03a816b46d nfsd: restrict filehandles accepted in V4ROOT case
On V4ROOT exports, only accept filehandles that are the *root* of some
export.  This allows mountd to allow or deny access to individual
directories and symlinks on the pseudofilesystem.

Note that the checks in readdir and lookup are not enough, since a
malicious host with access to the network could guess filehandles that
they weren't able to obtain through lookup or readdir.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 14:07:24 -05:00
J. Bruce Fields
f2ca7153ca nfsd: allow exports of symlinks
We want to allow exports of symlinks, to allow mountd to communicate to
the kernel which symlinks lead to exports, and hence which symlinks need
to be visible on the pseudofilesystem.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 14:07:24 -05:00
J. Bruce Fields
3227fa41ab nfsd: filter readdir results in V4ROOT case
As with lookup, we treat every boject as a mountpoint and pretend it
doesn't exist if it isn't exported.

The preexisting code here is confusing, but I haven't yet figured out
how to make it clearer.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 14:07:24 -05:00
J. Bruce Fields
82ead7fe41 nfsd: filter lookup results in V4ROOT case
We treat every object as a mountpoint and pretend it doesn't exist if
it isn't exported.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 14:07:23 -05:00
J. Bruce Fields
3b6cee7bc4 nfsd4: don't continue "under" mounts in V4ROOT case
If /A/mount/point/ has filesystem "B" mounted on top of it, and if "A"
is exported, but not "B", then the nfs server has always returned to the
client a filehandle for the mountpoint, instead of for the root of "B",
allowing the client to see the subtree of "A" that would otherwise be
hidden by B.

Disable this behavior in the case of V4ROOT exports; we implement the
path restrictions of V4ROOT exports by treating *every* directory as if
it were a mountpoint, and allowing traversal *only* if the new directory
is exported.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 14:07:23 -05:00
Steve Dickson
eb4c86c6a5 nfsd: introduce export flag for v4 pseudoroot
NFSv4 differs from v2 and v3 in that it presents a single unified
filesystem tree, whereas v2 and v3 exported multiple filesystem (whose
roots could be found using a separate mount protocol).

Our original NFSv4 server implementation asked the administrator to
designate a single filesystem as the NFSv4 root, then to mount
filesystems they wished to export underneath.  (Often using bind mounts
of already-existing filesystems.)

This was conceptually simple, and allowed easy implementation, but
created a serious obstacle to upgrading between v2/v3: since the paths
to v4 filesystems were different, administrators would have to adjust
all the paths in client-side mount commands when switching to v4.

Various workarounds are possible.  For example, the administrator could
export "/" and designate it as the v4 root.  However, the security risks
of that approach are obvious, and in any case we shouldn't be requiring
the administrator to take extra steps to fix this problem; instead, the
server should present consistent paths across different versions by
default.

These patches take a modified version of that approach: we provide a new
export option which exports only a subset of a filesystem.  With this
flag, it becomes safe for mountd to export "/" by default, with no need
for additional configuration.

We begin just by defining the new flag.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 14:00:40 -05:00
J. Bruce Fields
12045a6ee9 nfsd: let "insecure" flag vary by pseudoflavor
This was an oversight; it should be among the export flags that can be
allowed to vary by pseudoflavor.  This allows an administrator to (for
example) allow auth_sys mounts only from low ports, but allow auth_krb5
mounts to use any port.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 19:08:58 -05:00
J. Bruce Fields
e8e8753f7a nfsd: new interface to advertise export features
Soon we will add the new V4ROOT flag, and allow the INSECURE flag to
vary by pseudoflavor.  It would be useful for nfs-utils (for example,
for improved exportfs error reporting) to be able to know when this
happens.  Use this new interface for that purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:51:29 -05:00
Boaz Harrosh
9a74af2133 nfsd: Move private headers to source directory
Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:12 -05:00
Boaz Harrosh
341eb18446 nfsd: Source files #include cleanups
Now that the headers are fixed and carry their own wait, all fs/nfsd/
source files can include a minimal set of headers. and still compile just
fine.

This patch should improve the compilation speed of the nfsd module.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:09 -05:00
J. Bruce Fields
57ecb34feb nfsd4: fix share mode permissions
NFSv4 opens may function as locks denying other NFSv4 users the rights
to open a file.

We're requiring a user to have write permissions before they can deny
write.  We're *not* requiring a user to have write permissions to deny
read, which is if anything a more drastic denial.

What was intended was to require write permissions for DENY_READ.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:06:54 -05:00
J. Bruce Fields
864f0f61f8 nfsd: simplify fh_verify access checks
All nfsd security depends on the security checks in fh_verify, and
especially on nfsd_setuser().

It therefore bothers me that the nfsd_setuser call may be made from
three different places, depending on whether the filehandle has already
been mapped to a dentry, and on whether subtreechecking is in force.

Instead, make an unconditional call in fh_verify(), so it's trivial to
verify that the call always occurs.

That leaves us with a redundant nfsd_setuser() call in the subtreecheck
case--it needs the correct user set earlier in order to check execute
permissions on the path to this filehandle--but I'm willing to accept
that minor inefficiency in the subtreecheck case in return for more
straightforward permission checking.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-25 17:55:46 -05:00
J. Bruce Fields
9b8b317d58 Merge commit 'v2.6.32-rc8' into HEAD 2009-11-23 12:34:58 -05:00
Petr Vandrovec
479c2553af Fix memory corruption caused by nfsd readdir+
Commit 8177e6d6df ("nfsd: clean up
readdirplus encoding") introduced single character typo in nfs3 readdir+
implementation.  Unfortunately that typo has quite bad side effects:
random memory corruption, followed (on my box) with immediate
spontaneous box reboot.

Using 'p1' instead of 'p' fixes my Linux box rebooting whenever VMware
ESXi box tries to list contents of my home directory.

Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-11-14 12:55:55 -08:00
J. Bruce Fields
0a3adadee4 nfsd: make fs/nfsd/vfs.h for common includes
None of this stuff is used outside nfsd, so move it out of the common
linux include directory.

Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-13 13:23:02 -05:00
Benny Halevy
8c10cbdb4a nfsd: use STATEID_FMT and STATEID_VAL for printing stateids
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-05 12:06:29 -05:00
Peter Staubach
1b7e0403c6 nfsd: register NFS_ACL with rpcbind
Modify the NFS server to register the NFS_ACL services with the rpcbind
daemon.  This allows the client to ping for the existence of the NFS_ACL
support via commands such as "rpcinfo -t <server> nfs_acl".

This patch also modifies the NFS_ACL support so that responses to
version 2 NULLPROC requests can be made.

The changelog for the patch which turned off this functionality
mentioned something about not registering the NFS_ACL as being part of
some tradition.  I can't find this tradition and the only other
implementation which supports NFS_ACL does register them with the
rpcbind daemon.

Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-04 13:46:37 -05:00
Frank Filz
aba24d7158 nfsd: Fix sort_pacl in fs/nfsd/nf4acl.c to actually sort groups
We have been doing some extensive testing of Linux support for ACLs on
NFDS v4. We have noticed that the server rejects ACLs where the groups
are out of order, for example, the following ACL is rejected:

A::OWNER@:rwaxtTcCy
A::user101@domain:rwaxtcy
A::GROUP@:rwaxtcy
A:g:group102@domain:rwaxtcy
A:g:group101@domain:rwaxtcy
A::EVERYONE@:rwaxtcy

Examining the server code, I found that after converting an NFS v4 ACL
to POSIX, sort_pacl is called to sort the user ACEs and group ACEs.
Unfortunately, a minor bug causes the group sort to be skipped.

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-10-27 19:34:44 -04:00
J. Bruce Fields
efe0cb6d5a nfsd4.1: common slot allocation size calculation
We do the same calculation in a couple places; use a helper function,
and add a little documentation, in the hopes of preventing bugs like
that fixed in the last patch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-10-27 19:34:43 -04:00
J. Bruce Fields
dd829c4564 nfsd4.1: fix session memory use calculation
Unbalanced calculations on creation and destruction of sessions could
cause our estimate of cache memory used to become negative, sometimes
resulting in spurious SERVERFAULT returns to client CREATE_SESSION
requests.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-10-27 19:34:43 -04:00
J. Bruce Fields
e343eb0d60 Merge commit 'v2.6.32-rc5' into for-2.6.33 2009-10-27 18:45:17 -04:00
Alexey Dobriyan
828c09509b const: constify remaining file_operations
[akpm@linux-foundation.org: fix KVM]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-01 16:11:11 -07:00
Andy Adamson
ddc04fd4d5 nfsd41: use sv_max_mesg for forechannel max sizes
ca_maxresponsesize and ca_maxrequest size include the RPC header.

sv_max_mesg is sv_max_payolad plus a page for overhead and is used in
svc_init_buffer to allocate server buffer space for both the request and reply.
Note that this means we can service an RPC compound that requires
ca_maxrequestsize (MAXWRITE) or ca_max_responsesize (MAXREAD) but that we do
not support an RPC compound that requires both ca_maxrequestsize and
ca_maxresponsesize.

Signed-off-by: Andy Adamson <andros@netapp.com>
[bfields@citi.umich.edu: more documentation updates]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-28 12:40:15 -04:00
J. Bruce Fields
f39bde24b2 nfsd4: fix error return when pseudoroot missing
We really shouldn't hit this case at all, and forthcoming kernel and
nfs-utils changes should eliminate this case; if it does happen,
consider it a bug rather than reporting an error that doesn't really
make sense for the operation (since there's no reason for a server to be
accepting v4 traffic yet have no root filehandle).

Also move some exp_pseudoroot code into a helper function while we're
here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-28 12:21:26 -04:00
J. Bruce Fields
289ede453e nfsd: minor nfsd_lookup cleanup
Break out some of nfsd_lookup_dentry into helper functions.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-28 12:07:53 -04:00
J. Bruce Fields
fed8381126 nfsd4: cross mountpoints when looking up parents
3c394ddaa7 "nfsd4: nfsv4 clients should
cross mountpoints" forgot to handle lookups of parents directories.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-28 12:07:52 -04:00
Alexey Dobriyan
2bcd57ab61 headers: utsname.h redux
* remove asm/atomic.h inclusion from linux/utsname.h --
   not needed after kref conversion
 * remove linux/utsname.h inclusion from files which do not need it

NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 18:13:10 -07:00
James Morris
88e9d34c72 seq_file: constify seq_operations
Make all seq_operations structs const, to help mitigate against
revectoring user-triggerable function pointers.

This is derived from the grsecurity patch, although generated from scratch
because it's simpler than extracting the changes from there.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 07:39:29 -07:00
Linus Torvalds
a87e84b5cd Merge branch 'for-2.6.32' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits)
  nfsd4: nfsv4 clients should cross mountpoints
  nfsd: revise 4.1 status documentation
  sunrpc/cache: avoid variable over-loading in cache_defer_req
  sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
  nfsd: return success for non-NFS4 nfs4_state_start
  nfsd41: Refactor create_client()
  nfsd41: modify nfsd4.1 backchannel to use new xprt class
  nfsd41: Backchannel: Implement cb_recall over NFSv4.1
  nfsd41: Backchannel: cb_sequence callback
  nfsd41: Backchannel: Setup sequence information
  nfsd41: Backchannel: Server backchannel RPC wait queue
  nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
  nfsd41: Backchannel: callback infrastructure
  nfsd4: use common rpc_cred for all callbacks
  nfsd4: allow nfs4 state startup to fail
  SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous
  nfsd4: fix null dereference creating nfsv4 callback client
  nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
  nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel
  sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
  ...
2009-09-22 07:54:33 -07:00
Alexey Dobriyan
7b021967c5 const: make lock_manager_operations const
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-22 07:17:25 -07:00
Steve Dickson
3c394ddaa7 nfsd4: nfsv4 clients should cross mountpoints
Allow NFS v4 clients to seamlessly cross mount point without
have to set either the 'crossmnt' or the 'nohide' export
options.

Signed-Off-By: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-21 16:02:25 -04:00
Ricardo Labiaga
b09333c464 nfsd41: Refactor create_client()
Move common initialization of 'struct nfs4_client' inside create_client().

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>

[nfsd41: Remember the auth flavor to use for callbacks]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:52:13 -04:00
Alexandros Batsakis
3ddc8bf5f3 nfsd41: modify nfsd4.1 backchannel to use new xprt class
This patch enables the use of the nfsv4.1 backchannel.

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
[initialize rpc_create_args.bc_xprt too]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:52:13 -04:00
Ricardo Labiaga
0421b5c55a nfsd41: Backchannel: Implement cb_recall over NFSv4.1
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[nfsd41: cb_recall callback]
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Share v4.0 and v4.1 back channel xdr]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
[nfsd41: conditionally decode_sequence in nfs4_xdr_dec_cb_recall]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: Backchannel: Add sequence arguments to callback RPC arguments]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[pulled-in definition of nfsd4_cb_done]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:52:12 -04:00
Benny Halevy
2af73580b7 nfsd41: Backchannel: cb_sequence callback
Implement the cb_sequence callback conforming to draft-ietf-nfsv4-minorversion1

Note: highest slot id and target highest slot id do not have to be 0
as was previously implemented.  They can be greater than what the
nfs server sent if the client supports a larger slot table on the
backchannel.  At this point we just ignore that.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[Rework the back channel xdr using the shared v4.0 and v4.1 framework.]
Signed-off-by: Andy Adamson <andros@netapp.com>
[fixed indentation]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use nfsd4_cb_sequence for callback minorversion]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: fix verification of CB_SEQUENCE highest slot id[
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: Backchannel: Remove old backchannel serialization]
[nfsd41: Backchannel: First callback sequence ID should be 1]
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: decode_cb_sequence does not need to actually decode ignored fields]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:56 -04:00
Ricardo Labiaga
2a1d1b5938 nfsd41: Backchannel: Setup sequence information
Follows the model used by the NFS client.  Setup the RPC prepare and done
function pointers so that we can populate the sequence information if
minorversion == 1.  rpc_run_task() is then invoked directly just like
existing NFS client operations do.

nfsd4_cb_prepare() determines if the sequence information needs to be setup.
If the slot is in use, it adds itself to the wait queue.

nfsd4_cb_done() wakes anyone sleeping on the callback channel wait queue
after our RPC reply has been received.  It also sets the task message
result pointer to NULL to clearly indicate we're done using it.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[define and initialize cl_cb_seq_nr here]
[pulled out unused defintion of nfsd4_cb_done]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:56 -04:00
Ricardo Labiaga
199ff35e1c nfsd41: Backchannel: Server backchannel RPC wait queue
RPC callback requests will wait on this wait queue if the backchannel
is out of slots.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:55 -04:00
Ricardo Labiaga
132f97715c nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
Follow the model we use in the client. Make the sequence arguments
part of the regular RPC arguments.  None of the callbacks that are
soon to be implemented expect results that need to be passed back
to the caller, so we don't define a separate RPC results structure.
For session validation, the cb_sequence decoding will use a pointer
to the sequence arguments that are part of the RPC argument.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[define struct nfsd4_cb_sequence here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:55 -04:00
Andy Adamson
38524ab38f nfsd41: Backchannel: callback infrastructure
Keep the xprt used for create_session in cl_cb_xprt.
Mark cl_callback.cb_minorversion = 1 and remember
the client provided cl_callback.cb_prog rpc program number.
Use it to probe the callback path.

Use the client's network address to initialize as the
callback's address as expected by the xprt creation
routines.

Define xdr sizes and code nfs4_cb_compound header to be able
to send a null callback rpc.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[get callback minorversion from fore channel's]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: change bc_sock to bc_xprt]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pulled definition for cl_cb_xprt]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: set up backchannel's cb_addr]
[moved rpc_create_args init to "nfsd: modify nfsd4.1 backchannel to use new xprt class"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:55 -04:00
J. Bruce Fields
80fc015bdf nfsd4: use common rpc_cred for all callbacks
Callbacks are always made using the machine's identity, so we can use a
single auth_generic credential shared among callbacks to all clients and
let the rpc code take care of the rest.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:34 -04:00
J. Bruce Fields
29ab23cc5d nfsd4: allow nfs4 state startup to fail
The failure here is pretty unlikely, but we should handle it anyway.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:33 -04:00
J. Bruce Fields
886e3b7fe6 nfsd4: fix null dereference creating nfsv4 callback client
On setting up the callback to the client, we attempt to use the same
authentication flavor the client did.  We find an rpc cred to use by
calling rpcauth_lookup_credcache(), which assumes that the given
authentication flavor has a credentials cache.  However, this is not
required to be true--in particular, auth_null does not use one.
Instead, we should call the auth's lookup_cred() method.

Without this, a client attempting to mount using nfsv4 and auth_null
triggers a null dereference.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-15 20:49:33 -04:00
Benny Halevy
4be36ca0ce nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-13 15:57:39 -04:00
Trond Myklebust
ab3bbaa8b2 Merge branch 'nfs-for-2.6.32' 2009-09-11 14:59:37 -04:00
J. Bruce Fields
aed100fafb nfsd: fix leak on error in nfsv3 readdir
Note the !dchild->d_inode case can leak the filehandle.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-04 15:48:00 -04:00
J. Bruce Fields
8177e6d6df nfsd: clean up readdirplus encoding
Make the return from compose_entry_fh() zero or an error, even though
the returned error isn't used, just to make the meaning of the return
immediately obvious.

Move some repeated code out of main function into helper.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-04 15:47:40 -04:00
J. Bruce Fields
1be10a88ca nfsd4: filehandle leak or error exit from fh_compose()
A number of callers (nfsd4_encode_fattr(), at least) don't bother to
release the filehandle returned to fh_compose() if fh_compose() returns
an error.  So, modify fh_compose() to release the filehandle before
returning an error.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-04 11:59:32 -04:00
Trond Myklebust
2671a4bf35 NFSd: Fix filehandle leak in exp_pseudoroot() and nfsd4_path()
nfsd4_path() allocates a temporary filehandle and then fails to free it
before the function exits, leaking reference counts to the dentry and
export that it refers to.

Also, nfsd4_lookupp() puts the result of exp_pseudoroot() in a temporary
filehandle which it releases on success of exp_pseudoroot() but not on
failure; fix exp_pseudoroot to ensure that on failure it releases the
filehandle before returning.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-03 16:57:57 -04:00
J. Bruce Fields
bc6c53d5a1 nfsd: move fsid_type choice out of fh_compose
More trivial cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-02 23:54:48 -04:00
J. Bruce Fields
8e498751f2 nfsd: move some of fh_compose into helper functions
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-02 23:53:51 -04:00
David Howells
e0e817392b CRED: Add some configurable debugging [try #6]
Add a config option (CONFIG_DEBUG_CREDENTIALS) to turn on some debug checking
for credential management.  The additional code keeps track of the number of
pointers from task_structs to any given cred struct, and checks to see that
this number never exceeds the usage count of the cred struct (which includes
all references, not just those from task_structs).

Furthermore, if SELinux is enabled, the code also checks that the security
pointer in the cred struct is never seen to be invalid.

This attempts to catch the bug whereby inode_has_perm() faults in an nfsd
kernel thread on seeing cred->security be a NULL pointer (it appears that the
credential struct has been previously released):

	http://www.kerneloops.org/oops.php?number=252883

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-02 21:29:01 +10:00
Andy Adamson
557ce2646e nfsd41: replace page based DRC with buffer based DRC
Use NFSD_SLOT_CACHE_SIZE size buffers for sessions DRC instead of holding nfsd
pages in cache.

Connectathon testing has shown that 1024 bytes for encoded compound operation
responses past the sequence operation is sufficient, 512 bytes is a little too
small. Set NFSD_SLOT_CACHE_SIZE to 1024.

Allocate memory for the session DRC in the CREATE_SESSION operation
to guarantee that the memory resource is available for caching responses.
Allocate each slot individually in preparation for slot table size negotiation.

Remove struct nfsd4_cache_entry and helper functions for the old page-based
DRC.

The iov_len calculation in nfs4svc_encode_compoundres is now always
correct.  Replay is now done in nfsd4_sequence under the state lock, so
the session ref count is only bumped on non-replay. Clean up the
nfs4svc_encode_compoundres session logic.

The nfsd4_compound_state statp pointer is also not used.
Remove nfsd4_set_statp().

Move useful nfsd4_cache_entry fields into nfsd4_slot.

Signed-off-by: Andy Adamson <andros@netapp.com
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:06 -04:00
Andy Adamson
bdac86e215 nfsd41: replace nfserr_resource in pure nfs41 responses
nfserr_resource is not a legal error for NFSv4.1. Replace it with
nfserr_serverfault for EXCHANGE_ID and CREATE_SESSION processing.

We will also need to map nfserr_resource to other errors in routines shared
by NFSv4.0 and NFSv4.1

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:05 -04:00
Andy Adamson
a8dfdaeb7a nfsd41: use session maxreqs for sequence target and highest slotid
This fixes a bug in the sequence operation reply.

The sequence operation returns the highest slotid it will accept in the future
in sr_highest_slotid, and the highest slotid it prefers the client to use.
Since we do not re-negotiate the session slot table yet, these should both
always be set to the session ca_maxrequests.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:05 -04:00
Andy Adamson
a649637c73 nfsd41: bound forechannel drc size by memory usage
By using the requested ca_maxresponsesize_cached * ca_maxresponses to bound
a forechannel drc request size, clients can tailor a session to usage.

For example, an I/O session (READ/WRITE only) can have a much smaller
ca_maxresponsesize_cached (for only WRITE compound responses) and a lot larger
ca_maxresponses to service a large in-flight data window.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 22:24:05 -04:00