The server's callback client should stop trying to connect to the
client's callback server as soon as it gets ECONNREFUSED.
The NFS server's callback client does not call rpc_ping(), but appears
to have it's own "ping" procedure, so it wasn't covered by commit
caabea8a.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
- Add commit_metadata export_operation to allow the underlying filesystem to
decide how to commit an inode most efficiently.
- Usage of nfsd_sync_dir and write_inode_now has been replaced with the
commit_metadata function that takes a svc_fh.
- The commit_metadata function calls the commit_metadata export_op if it's
there, or else falls back to sync_inode instead of fsync and write_inode_now
because only metadata need be synced here.
- nfsd4_sync_rec_dir now uses vfs_fsync so that commit_metadata can be static
Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The NFS COMMIT operation allows the client to specify the exact byte range
that it wishes to sync to disk in order to optimise server performance.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Try to create a PF_INET6 listener for NFSD, if IPv6 is enabled in the
kernel.
Make sure nfsd_serv's reference count is decreased if
__write_ports_addxprt() failed to create a listener. See
__write_ports_addfd().
Our current plan is to rely on rpc.nfsd to create appropriate IPv6
listeners when server-side NFS/IPv6 support is desired. Legacy
behavior, via the write_threads or write_svc kernel APIs, will remain
the same -- only IPv4 listeners are created.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[bfields@citi.umich.edu: Move error-handling code to end]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
write_ports() converts svc_create_xprt()'s ENOENT error return to
EPROTONOSUPPORT so that rpc.nfsd (in user space) can report an error
message that makes sense.
It turns out that several of the other kernel APIs rpc.nfsd use can
also return ENOENT from svc_create_xprt(), by way of lockd_up().
On the client side, an NFSv2 or NFSv3 mount request can also return
the result of lockd_up(). This error may also be returned during an
NFSv4 mount request, since the NFSv4 callback service uses
svc_create_xprt() to create the callback listener. An ENOENT error
return results in a confusing error message from the mount command.
Let's have svc_create_xprt() return EPROTONOSUPPORT instead of ENOENT.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Instead of opencoding the fsync calling sequence use vfs_fsync. This also
gets rid of the useless i_mutex over the data writeout.
Consolidate the remaining special code for syncing directories and document
it's quirks.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Since we're checking for LAST_NFS4_OP, use FIRST_NFS4_OP to be consistent.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The server incorrectly assumes that the operations in the
array start with value 0. The first operation (OP_ACCESS)
has a value of 3, causing the check in nfsd4_decode_compound
to be off.
Instead of comparing that the operation number is less than
the number of elements in the array, the server should verify
that it is less than the maximum valid operation number
defined by LAST_NFS4_OP.
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'for-2.6.33' of git://linux-nfs.org/~bfields/linux:
sunrpc: fix peername failed on closed listener
nfsd: make sure data is on disk before calling ->fsync
nfsd: fix "insecure" export option
nfsd is not using vfs_fsync, so I missed it when changing the calling
convention during the 2.6.32 window. This patch fixes it to not only
start the data writeout, but also wait for it to complete before calling
into ->fsync.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
A typo in 12045a6ee9 "nfsd: let "insecure" flag vary by
pseudoflavor" reversed the sense of the "insecure" flag.
Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A typo in 12045a6ee9 "nfsd: let "insecure" flag vary by
pseudoflavor" reversed the sense of the "insecure" flag.
Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits)
direct I/O fallback sync simplification
ocfs: stop using do_sync_mapping_range
cleanup blockdev_direct_IO locking
make generic_acl slightly more generic
sanitize xattr handler prototypes
libfs: move EXPORT_SYMBOL for d_alloc_name
vfs: force reval of target when following LAST_BIND symlinks (try #7)
ima: limit imbalance msg
Untangling ima mess, part 3: kill dead code in ima
Untangling ima mess, part 2: deal with counters
Untangling ima mess, part 1: alloc_file()
O_TRUNC open shouldn't fail after file truncation
ima: call ima_inode_free ima_inode_free
IMA: clean up the IMA counts updating code
ima: only insert at inode creation time
ima: valid return code from ima_inode_alloc
fs: move get_empty_filp() deffinition to internal.h
Sanitize exec_permission_lite()
Kill cached_lookup() and real_lookup()
Kill path_lookup_open()
...
Trivial conflicts in fs/direct-io.c
Kill the 'update' argument of ima_path_check(), kill
dead code in ima.
Current rules: ima counters are bumped at the same time
when the file switches from put_filp() fodder to fput()
one. Which happens exactly in two places - alloc_file()
and __dentry_open(). Nothing else needs to do that at
all.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* do ima_get_count() in __dentry_open()
* stop doing that in followups
* move ima_path_check() to right after nameidata_to_filp()
* don't bump counters on it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The new .h files have paths at the top that are now out of date. While
we're here, just remove all of those from fs/nfsd; they never served any
purpose.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
On V4ROOT exports, only accept filehandles that are the *root* of some
export. This allows mountd to allow or deny access to individual
directories and symlinks on the pseudofilesystem.
Note that the checks in readdir and lookup are not enough, since a
malicious host with access to the network could guess filehandles that
they weren't able to obtain through lookup or readdir.
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We want to allow exports of symlinks, to allow mountd to communicate to
the kernel which symlinks lead to exports, and hence which symlinks need
to be visible on the pseudofilesystem.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
As with lookup, we treat every boject as a mountpoint and pretend it
doesn't exist if it isn't exported.
The preexisting code here is confusing, but I haven't yet figured out
how to make it clearer.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
If /A/mount/point/ has filesystem "B" mounted on top of it, and if "A"
is exported, but not "B", then the nfs server has always returned to the
client a filehandle for the mountpoint, instead of for the root of "B",
allowing the client to see the subtree of "A" that would otherwise be
hidden by B.
Disable this behavior in the case of V4ROOT exports; we implement the
path restrictions of V4ROOT exports by treating *every* directory as if
it were a mountpoint, and allowing traversal *only* if the new directory
is exported.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
NFSv4 differs from v2 and v3 in that it presents a single unified
filesystem tree, whereas v2 and v3 exported multiple filesystem (whose
roots could be found using a separate mount protocol).
Our original NFSv4 server implementation asked the administrator to
designate a single filesystem as the NFSv4 root, then to mount
filesystems they wished to export underneath. (Often using bind mounts
of already-existing filesystems.)
This was conceptually simple, and allowed easy implementation, but
created a serious obstacle to upgrading between v2/v3: since the paths
to v4 filesystems were different, administrators would have to adjust
all the paths in client-side mount commands when switching to v4.
Various workarounds are possible. For example, the administrator could
export "/" and designate it as the v4 root. However, the security risks
of that approach are obvious, and in any case we shouldn't be requiring
the administrator to take extra steps to fix this problem; instead, the
server should present consistent paths across different versions by
default.
These patches take a modified version of that approach: we provide a new
export option which exports only a subset of a filesystem. With this
flag, it becomes safe for mountd to export "/" by default, with no need
for additional configuration.
We begin just by defining the new flag.
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This was an oversight; it should be among the export flags that can be
allowed to vary by pseudoflavor. This allows an administrator to (for
example) allow auth_sys mounts only from low ports, but allow auth_krb5
mounts to use any port.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Soon we will add the new V4ROOT flag, and allow the INSECURE flag to
vary by pseudoflavor. It would be useful for nfs-utils (for example,
for improved exportfs error reporting) to be able to know when this
happens. Use this new interface for that purpose.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Now that the headers are fixed and carry their own wait, all fs/nfsd/
source files can include a minimal set of headers. and still compile just
fine.
This patch should improve the compilation speed of the nfsd module.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
NFSv4 opens may function as locks denying other NFSv4 users the rights
to open a file.
We're requiring a user to have write permissions before they can deny
write. We're *not* requiring a user to have write permissions to deny
read, which is if anything a more drastic denial.
What was intended was to require write permissions for DENY_READ.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
All nfsd security depends on the security checks in fh_verify, and
especially on nfsd_setuser().
It therefore bothers me that the nfsd_setuser call may be made from
three different places, depending on whether the filehandle has already
been mapped to a dentry, and on whether subtreechecking is in force.
Instead, make an unconditional call in fh_verify(), so it's trivial to
verify that the call always occurs.
That leaves us with a redundant nfsd_setuser() call in the subtreecheck
case--it needs the correct user set earlier in order to check execute
permissions on the path to this filehandle--but I'm willing to accept
that minor inefficiency in the subtreecheck case in return for more
straightforward permission checking.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Commit 8177e6d6df ("nfsd: clean up
readdirplus encoding") introduced single character typo in nfs3 readdir+
implementation. Unfortunately that typo has quite bad side effects:
random memory corruption, followed (on my box) with immediate
spontaneous box reboot.
Using 'p1' instead of 'p' fixes my Linux box rebooting whenever VMware
ESXi box tries to list contents of my home directory.
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
None of this stuff is used outside nfsd, so move it out of the common
linux include directory.
Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Modify the NFS server to register the NFS_ACL services with the rpcbind
daemon. This allows the client to ping for the existence of the NFS_ACL
support via commands such as "rpcinfo -t <server> nfs_acl".
This patch also modifies the NFS_ACL support so that responses to
version 2 NULLPROC requests can be made.
The changelog for the patch which turned off this functionality
mentioned something about not registering the NFS_ACL as being part of
some tradition. I can't find this tradition and the only other
implementation which supports NFS_ACL does register them with the
rpcbind daemon.
Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We have been doing some extensive testing of Linux support for ACLs on
NFDS v4. We have noticed that the server rejects ACLs where the groups
are out of order, for example, the following ACL is rejected:
A::OWNER@:rwaxtTcCy
A::user101@domain:rwaxtcy
A::GROUP@:rwaxtcy
A:g:group102@domain:rwaxtcy
A:g:group101@domain:rwaxtcy
A::EVERYONE@:rwaxtcy
Examining the server code, I found that after converting an NFS v4 ACL
to POSIX, sort_pacl is called to sort the user ACEs and group ACEs.
Unfortunately, a minor bug causes the group sort to be skipped.
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We do the same calculation in a couple places; use a helper function,
and add a little documentation, in the hopes of preventing bugs like
that fixed in the last patch.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Unbalanced calculations on creation and destruction of sessions could
cause our estimate of cache memory used to become negative, sometimes
resulting in spurious SERVERFAULT returns to client CREATE_SESSION
requests.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
ca_maxresponsesize and ca_maxrequest size include the RPC header.
sv_max_mesg is sv_max_payolad plus a page for overhead and is used in
svc_init_buffer to allocate server buffer space for both the request and reply.
Note that this means we can service an RPC compound that requires
ca_maxrequestsize (MAXWRITE) or ca_max_responsesize (MAXREAD) but that we do
not support an RPC compound that requires both ca_maxrequestsize and
ca_maxresponsesize.
Signed-off-by: Andy Adamson <andros@netapp.com>
[bfields@citi.umich.edu: more documentation updates]
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We really shouldn't hit this case at all, and forthcoming kernel and
nfs-utils changes should eliminate this case; if it does happen,
consider it a bug rather than reporting an error that doesn't really
make sense for the operation (since there's no reason for a server to be
accepting v4 traffic yet have no root filehandle).
Also move some exp_pseudoroot code into a helper function while we're
here.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
3c394ddaa7 "nfsd4: nfsv4 clients should
cross mountpoints" forgot to handle lookups of parents directories.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* remove asm/atomic.h inclusion from linux/utsname.h --
not needed after kref conversion
* remove linux/utsname.h inclusion from files which do not need it
NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make all seq_operations structs const, to help mitigate against
revectoring user-triggerable function pointers.
This is derived from the grsecurity patch, although generated from scratch
because it's simpler than extracting the changes from there.
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-2.6.32' of git://linux-nfs.org/~bfields/linux: (68 commits)
nfsd4: nfsv4 clients should cross mountpoints
nfsd: revise 4.1 status documentation
sunrpc/cache: avoid variable over-loading in cache_defer_req
sunrpc/cache: use list_del_init for the list_head entries in cache_deferred_req
nfsd: return success for non-NFS4 nfs4_state_start
nfsd41: Refactor create_client()
nfsd41: modify nfsd4.1 backchannel to use new xprt class
nfsd41: Backchannel: Implement cb_recall over NFSv4.1
nfsd41: Backchannel: cb_sequence callback
nfsd41: Backchannel: Setup sequence information
nfsd41: Backchannel: Server backchannel RPC wait queue
nfsd41: Backchannel: Add sequence arguments to callback RPC arguments
nfsd41: Backchannel: callback infrastructure
nfsd4: use common rpc_cred for all callbacks
nfsd4: allow nfs4 state startup to fail
SUNRPC: Defer the auth_gss upcall when the RPC call is asynchronous
nfsd4: fix null dereference creating nfsv4 callback client
nfsd4: fix whitespace in NFSPROC4_CLNT_CB_NULL definition
nfsd41: sunrpc: add new xprt class for nfsv4.1 backchannel
sunrpc/cache: simplify cache_fresh_locked and cache_fresh_unlocked.
...
Allow NFS v4 clients to seamlessly cross mount point without
have to set either the 'crossmnt' or the 'nohide' export
options.
Signed-Off-By: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>