Commit Graph

438 Commits

Author SHA1 Message Date
Jeff Layton
cc8a55320b nfsd: serialize layout stateid morphing operations
In order to allow the client to make a sane determination of what
happened with racing LAYOUTGET/LAYOUTRETURN/CB_LAYOUTRECALL calls, we
must ensure that the seqids return accurately represent the order of
operations. The simplest way to do that is to ensure that operations on
a single stateid are serialized.

This patch adds a mutex to the layout stateid, and locks it when
checking the layout stateid's seqid. The mutex is held over the entire
operation and released after the seqid is bumped.

Note that in the case of CB_LAYOUTRECALL we must move the increment of
the seqid and setting into a new cb "prepare" operation. The lease
infrastructure will call the lm_break callback with a spinlock held, so
and we can't take the mutex in that codepath.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-10-23 15:57:32 -04:00
Kinglong Mee
ead8fb8c24 NFSD: Set the attributes used to store the verifier for EXCLUSIVE4_1
According to rfc5661 18.16.4,
"If EXCLUSIVE4_1 was used, the client determines the attributes
 used for the verifier by comparing attrset with cva_attrs.attrmask;"

So, EXCLUSIVE4_1 also needs those bitmask used to store the verifier.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-31 16:16:39 -04:00
J. Bruce Fields
c87fb4a378 lockd: NLM grace period shouldn't block NFSv4 opens
NLM locks don't conflict with NFSv4 share reservations, so we're not
going to learn anything new by watiting for them.

They do conflict with NFSv4 locks and with delegations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-13 10:22:06 -04:00
Kinglong Mee
6cd22668e8 nfsd: Remove unneeded values in nfsd4_open()
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-08-10 16:05:49 -04:00
Kinglong Mee
d8398fc117 nfsd: Set lc_size_chg before ops->proc_layoutcommit
After proc_layoutcommit success, i_size_read(inode) always >= new_size.
Just set lc_size_chg before proc_layoutcommit, if proc_layoutcommit
failed, nfsd will skip the lc_size_chg, so it's no harm.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-07-20 14:58:46 -04:00
Christoph Hellwig
96bcad5064 nfsd: fput rd_file from XDR encode context
Remove the hack where we fput the read-specific file in generic code.
Instead we can do it in nfsd4_encode_read as that gets called for all
error cases as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-22 14:15:04 -04:00
Christoph Hellwig
af90f707fa nfsd: take struct file setup fully into nfs4_preprocess_stateid_op
This patch changes nfs4_preprocess_stateid_op so it always returns
a valid struct file if it has been asked for that.  For that we
now allocate a temporary struct file for special stateids, and check
permissions if we got the file structure from the stateid.  This
ensures that all callers will get their handling of special stateids
right, and avoids code duplication.

There is a little wart in here because the read code needs to know
if we allocated a file structure so that it can copy around the
read-ahead parameters.  In the long run we should probably aim to
cache full file structures used with special stateids instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-06-22 14:15:03 -04:00
Linus Torvalds
9ec3a646fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode->i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some ->d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26 17:22:07 -07:00
J. Bruce Fields
980608fb50 nfsd4: disallow SEEK with special stateids
If the client uses a special stateid then we'll pass a NULL file to
vfs_llseek.

Fixes: 24bab49122 " NFSD: Implement SEEK"
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: stable@vger.kernel.org
Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 16:16:01 -04:00
J. Bruce Fields
5ba4a25ab7 nfsd4: disallow ALLOCATE with special stateids
vfs_fallocate will hit a NULL dereference if the client tries an
ALLOCATE or DEALLOCATE with a special stateid.  Fix that.  (We also
depend on the open to have broken any conflicting leases or delegations
for us.)

(If it turns out we need to allow special stateid's then we could do a
temporary open here in the special-stateid case, as we do for read and
write.  For now I'm assuming it's not necessary.)

Fixes: 95d871f03c "nfsd: Add ALLOCATE support"
Cc: stable@vger.kernel.org
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-04-21 15:44:06 -04:00
David Howells
2b0143b5c9 VFS: normal filesystems (and lustre): d_inode() annotations
that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-04-15 15:06:57 -04:00
Kinglong Mee
1ec8c0c47f nfsd: Remove duplicate macro define for max sec label length
NFS4_MAXLABELLEN has defined for sec label max length, use it directly.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:39 -04:00
Jeff Layton
4229789993 nfsd: remove unused status arg to nfsd4_cleanup_open_state
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:39 -04:00
Kinglong Mee
beaca2347f NFSD: Use correct reply size calculating function
ALLOCATE/DEALLOCATE only reply one status value to client,
so, using nfsd4_only_status_rsize for reply size calculating.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-31 16:46:38 -04:00
Kinglong Mee
a1420384e3 NFSD: Put exports after nfsd4_layout_verify fail
Fix commit 9cf514ccfa (nfsd: implement pNFS operations).

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2015-03-20 16:15:42 -04:00
Christoph Hellwig
31ef83dc05 nfsd: add trace events
For now just a few simple events to trace the layout stateid lifetime, but
these already were enough to find several bugs in the Linux client layout
stateid handling.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-02-02 18:09:44 +01:00
Christoph Hellwig
c5c707f96f nfsd: implement pNFS layout recalls
Add support to issue layout recalls to clients.  For now we only support
full-file recalls to get a simple and stable implementation.  This allows
to embedd a nfsd4_callback structure in the layout_state and thus avoid
any memory allocations under spinlocks during a recall.  For normal
use cases that do not intent to share a single file between multiple
clients this implementation is fully sufficient.

To ensure layouts are recalled on local filesystem access each layout
state registers a new FL_LAYOUT lease with the kernel file locking code,
which filesystems that support pNFS exports that require recalls need
to break on conflicting access patterns.

The XDR code is based on the old pNFS server implementation by
Andy Adamson, Benny Halevy, Boaz Harrosh, Dean Hildebrand, Fred Isaman,
Marc Eshel, Mike Sager and Ricardo Labiaga.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-02-02 18:09:43 +01:00
Christoph Hellwig
9cf514ccfa nfsd: implement pNFS operations
Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and
LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage
outstanding layouts and devices.

Layout management is very straight forward, with a nfs4_layout_stateid
structure that extends nfs4_stid to manage layout stateids as the
top-level structure.  It is linked into the nfs4_file and nfs4_client
structures like the other stateids, and contains a linked list of
layouts that hang of the stateid.  The actual layout operations are
implemented in layout drivers that are not part of this commit, but
will be added later.

The worst part of this commit is the management of the pNFS device IDs,
which suffers from a specification that is not sanely implementable due
to the fact that the device-IDs are global and not bound to an export,
and have a small enough size so that we can't store the fsid portion of
a file handle, and must never be reused.  As we still do need perform all
export authentication and validation checks on a device ID passed to
GETDEVICEINFO we are caught between a rock and a hard place.  To work
around this issue we add a new hash that maps from a 64-bit integer to a
fsid so that we can look up the export to authenticate against it,
a 32-bit integer as a generation that we can bump when changing the device,
and a currently unused 32-bit integer that could be used in the future
to handle more than a single device per export.  Entries in this hash
table are never deleted as we can't reuse the ids anyway, and would have
a severe lifetime problem anyway as Linux export structures are temporary
structures that can go away under load.

Parts of the XDR data, structures and marshaling/unmarshaling code, as
well as many concepts are derived from the old pNFS server implementation
from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman,
Mike Sager, Ricardo Labiaga and many others.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2015-02-02 18:09:42 +01:00
Jeff Layton
779fb0f3af sunrpc: move rq_splice_ok flag into rq_flags
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:21 -05:00
Jeff Layton
30660e04b0 sunrpc: move rq_usedeferral flag to rq_flags
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-12-09 11:22:20 -05:00
Anna Schumaker
b0cb908523 nfsd: Add DEALLOCATE support
DEALLOCATE only returns a status value, meaning we can use the noop()
xdr encoder to reply to the client.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-11-07 16:20:15 -05:00
Anna Schumaker
95d871f03c nfsd: Add ALLOCATE support
The ALLOCATE operation is used to preallocate space in a file.  I can do
this by using vfs_fallocate() to do the actual preallocation.

ALLOCATE only returns a status indicator, so we don't need to write a
special encode() function.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-11-07 16:19:49 -05:00
J. Bruce Fields
51904b0807 nfsd4: fix crash on unknown operation number
Unknown operation numbers are caught in nfsd4_decode_compound() which
sets op->opnum to OP_ILLEGAL and op->status to nfserr_op_illegal.  The
error causes the main loop in nfsd4_proc_compound() to skip most
processing.  But nfsd4_proc_compound also peeks ahead at the next
operation in one case and doesn't take similar precautions there.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-10-23 13:39:51 -04:00
J. Bruce Fields
d1d84c9626 nfsd4: fix response size estimation for OP_SEQUENCE
We added this new estimator function but forgot to hook it up.  The
effect is that NFSv4.1 (and greater) won't do zero-copy reads.

The estimate was also wrong by 8 bytes.

Fixes: ccae70a9ee "nfsd4: estimate sequence response size"
Cc: stable@vger.kernel.org
Reported-by: Chuck Lever <chucklever@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-10-21 09:10:50 -04:00
Anna Schumaker
24bab49122 NFSD: Implement SEEK
This patch adds server support for the NFS v4.2 operation SEEK, which
returns the position of the next hole or data segment in a file.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-09-29 14:35:20 -04:00
Trond Myklebust
3234975f47 nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-08-05 10:55:16 -04:00
Jeff Layton
58fb12e6a4 nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache
We don't want to rely on the client_mutex for protection in the case of
NFSv4 open owners. Instead, we add a mutex that will only be taken for
NFSv4.0 state mutating operations, and that will be released once the
entire compound is done.

Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay
take a reference to the stateowner when they are using it for NFSv4.0
open and lock replay caching.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-31 14:20:19 -04:00
Jeff Layton
b3fbfe0e7a nfsd: print status when nfsd4_open fails to open file it just created
It's possible for nfsd to fail opening a file that it has just created.
When that happens, we throw a WARN but it doesn't include any info about
the error code. Print the status code to give us a bit more info.

Our QA group hit some of these warnings under some very heavy stress
testing. My suspicion is that they hit the file-max limit, but it's hard
to know for sure. Go ahead and add a -ENFILE mapping to
nfserr_serverfault to make the error more distinct (and correct).

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-29 23:08:38 -04:00
Trond Myklebust
0fe492db60 nfsd: Convert nfs4_check_open_reclaim() to work with lookup_clientid()
lookup_clientid is preferable to find_confirmed_client since it's able
to use the cached client in the compound state.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-09 20:55:07 -04:00
Kinglong Mee
1e444f5bc0 NFSD: Remove iattr parameter from nfsd_symlink()
Commit db2e747b14 (vfs: remove mode parameter from vfs_symlink())
have remove mode parameter from vfs_symlink.
So that, iattr isn't needed by nfsd_symlink now, just remove it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08 17:14:31 -04:00
J. Bruce Fields
7fb84306f5 nfsd4: rename cr_linkname->cr_data
The name of a link is currently stored in cr_name and cr_namelen, and
the content in cr_linkname and cr_linklen.  That's confusing.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08 17:14:24 -04:00
J. Bruce Fields
52ee04330f nfsd: let nfsd_symlink assume null-terminated data
Currently nfsd_symlink has a weird hack to serve callers who don't
null-terminate symlink data: it looks ahead at the next byte to see if
it's zero, and copies it to a new buffer to null-terminate if not.

That means callers don't have to null-terminate, but they *do* have to
ensure that the byte following the end of the data is theirs to read.

That's a bit subtle, and the NFSv4 code actually got this wrong.

So let's just throw out that code and let callers pass null-terminated
strings; we've already fixed them to do that.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08 17:14:23 -04:00
J. Bruce Fields
b829e9197a nfsd: fix rare symlink decoding bug
An NFS operation that creates a new symlink includes the symlink data,
which is xdr-encoded as a length followed by the data plus 0 to 3 bytes
of zero-padding as required to reach a 4-byte boundary.

The vfs, on the other hand, wants null-terminated data.

The simple way to handle this would be by copying the data into a newly
allocated buffer with space for the final null.

The current nfsd_symlink code tries to be more clever by skipping that
step in the (likely) case where the byte following the string is already
0.

But that assumes that the byte following the string is ours to look at.
In fact, it might be the first byte of a page that we can't read, or of
some object that another task might modify.

Worse, the NFSv4 code tries to fix the problem by actually writing to
that byte.

In the NFSv2/v3 cases this actually appears to be safe:

	- nfs3svc_decode_symlinkargs explicitly null-terminates the data
	  (after first checking its length and copying it to a new
	  page).
	- NFSv2 limits symlinks to 1k.  The buffer holding the rpc
	  request is always at least a page, and the link data (and
	  previous fields) have maximum lengths that prevent the request
	  from reaching the end of a page.

In the NFSv4 case the CREATE op is potentially just one part of a long
compound so can end up on the end of a page if you're unlucky.

The minimal fix here is to copy and null-terminate in the NFSv4 case.
The nfsd_symlink() interface here seems too fragile, though.  It should
really either do the copy itself every time or just require a
null-terminated string.

Reported-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-07-08 17:14:22 -04:00
Jeff Layton
f419992c1f nfsd: add __force to opaque verifier field casts
sparse complains that we're stuffing non-byte-swapped values into
__be32's here. Since they're supposed to be opaque, it doesn't matter
much. Just add __force to make sparse happy.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-23 11:31:37 -04:00
Kinglong Mee
bf18f163e8 NFSD: Using exp_get for export getting
Don't using cache_get besides export.h, using exp_get for export.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-23 11:31:36 -04:00
Kinglong Mee
f15a5cf912 SUNRPC/NFSD: Change to type of bool for rq_usedeferral and rq_splice_ok
rq_usedeferral and rq_splice_ok are used as 0 and 1, just defined to bool.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-23 11:31:36 -04:00
Kinglong Mee
3c7aa15d20 NFSD: Using min/max/min_t/max_t for calculate
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-23 11:31:36 -04:00
J. Bruce Fields
05638dc73a nfsd4: simplify server xdr->next_page use
The rpc code makes available to the NFS server an array of pages to
encod into.  The server represents its reply as an xdr buf, with the
head pointing into the first page in that array, the pages ** array
starting just after that, and the tail (if any) sharing any leftover
space in the page used by the head.

While encoding, we use xdr_stream->page_ptr to keep track of which page
we're currently using.

Currently we set xdr_stream->page_ptr to buf->pages, which makes the
head a weird exception to the rule that page_ptr always points to the
page we're currently encoding into.  So, instead set it to buf->pages -
1 (the page actually containing the head), and remove the need for a
little unintuitive logic in xdr_get_next_encode_buffer() and
xdr_truncate_encode.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-06 19:22:46 -04:00
Jeff Layton
7025005d5e nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound
The memset of resp in svc_process_common should ensure that these are
already zeroed by the time they get here.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-04 15:42:03 -04:00
Jeff Layton
ba5378b66f nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open
In the NFS4_OPEN_CLAIM_PREVIOUS case, we should only mark it confirmed
if the nfs4_check_open_reclaim check succeeds.

In the NFS4_OPEN_CLAIM_DELEG_PREV_FH and NFS4_OPEN_CLAIM_DELEGATE_PREV
cases, I see no point in declaring the openowner confirmed when the
operation is going to fail anyway, and doing so might allow the client
to game things such that it wouldn't need to confirm a subsequent open
with the same owner.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-06-04 15:42:02 -04:00
J. Bruce Fields
a5cddc885b nfsd4: better reservation of head space for krb5
RPC_MAX_AUTH_SIZE is scattered around several places.  Better to set it
once in the auth code, where this kind of estimate should be made.  And
while we're at it we can leave it zero when we're not using krb5i or
krb5p.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:32:17 -04:00
J. Bruce Fields
ccae70a9ee nfsd4: estimate sequence response size
Otherwise a following patch would turn off all 4.1 zero-copy reads.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:32:07 -04:00
J. Bruce Fields
b86cef60da nfsd4: better estimate of getattr response size
We plan to use this estimate to decide whether or not to allow zero-copy
reads.  Currently we're assuming all getattr's are a page, which can be
both too small (ACLs e.g. may be arbitrarily long) and too large (after
an upcoming read patch this will unnecessarily prevent zero copy reads
in any read compound also containing a getattr).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:32:06 -04:00
J. Bruce Fields
561f0ed498 nfsd4: allow large readdirs
Currently we limit readdir results to a single page.  This can result in
a performance regression compared to NFSv3 when reading large
directories.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:32:03 -04:00
J. Bruce Fields
4f0cefbf38 nfsd4: more precise nfsd4_max_reply
It will turn out to be useful to have a more accurate estimate of reply
size; so, piggyback on the existing op reply-size estimators.

Also move nfsd4_max_reply to nfs4proc.c to get easier access to struct
nfsd4_operation and friends.  (Thanks to Christoph Hellwig for pointing
out that simplification.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:31:57 -04:00
J. Bruce Fields
8c7424cff6 nfsd4: don't try to encode conflicting owner if low on space
I ran into this corner case in testing: in theory clients can provide
state owners up to 1024 bytes long.  In the sessions case there might be
a risk of this pushing us over the DRC slot size.

The conflicting owner isn't really that important, so let's humor a
client that provides a small maxresponsize_cached by allowing ourselves
to return without the conflicting owner instead of outright failing the
operation.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:31:55 -04:00
J. Bruce Fields
2825a7f907 nfsd4: allow encoding across page boundaries
After this we can handle for example getattr of very large ACLs.

Read, readdir, readlink are still special cases with their own limits.

Also we can't handle a new operation starting close to the end of a
page.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:31:54 -04:00
J. Bruce Fields
a8095f7e80 nfsd4: size-checking cleanup
Better variable name, some comments, etc.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:31:53 -04:00
J. Bruce Fields
ea8d7720b2 nfsd4: remove redundant encode buffer size checking
Now that all op encoders can handle running out of space, we no longer
need to check the remaining size for every operation; only nonidempotent
operations need that check, and that can be done by
nfsd4_check_resp_size.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:31:52 -04:00
J. Bruce Fields
d0a381dd0e nfsd4: teach encoders to handle reserve_space failures
We've tried to prevent running out of space with COMPOUND_SLACK_SPACE
and special checking in those operations (getattr) whose result can vary
enormously.

However:
	- COMPOUND_SLACK_SPACE may be difficult to maintain as we add
	  more protocol.
	- BUG_ON or page faulting on failure seems overly fragile.
	- Especially in the 4.1 case, we prefer not to fail compounds
	  just because the returned result came *close* to session
	  limits.  (Though perfect enforcement here may be difficult.)
	- I'd prefer encoding to be uniform for all encoders instead of
	  having special exceptions for encoders containing, for
	  example, attributes.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-30 17:31:49 -04:00
J. Bruce Fields
6ac90391c6 nfsd4: keep xdr buf length updated
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-28 14:52:38 -04:00
J. Bruce Fields
d3f627c815 nfsd4: use xdr_stream throughout compound encoding
Note this makes ADJUST_ARGS useless; we'll remove it in the following
patch.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-28 14:52:35 -04:00
J. Bruce Fields
ddd1ea5636 nfsd4: use xdr_reserve_space in attribute encoding
This is a cosmetic change for now; no change in behavior.

Note we're just depending on xdr_reserve_space to do the bounds checking
for us, we're not really depending on its adjustment of iovec or xdr_buf
lengths yet, as those are fixed up by as necessary after the fact by
read-link operations and by nfs4svc_encode_compoundres.  However we do
have to update xdr->iov on read-like operations to prevent
xdr_reserve_space from messing with the already-fixed-up length of the
the head.

When the attribute encoding fails partway through we have to undo the
length adjustments made so far.  We do it manually for now, but later
patches will add an xdr_truncate_encode() helper to handle cases like
this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-28 14:52:34 -04:00
J. Bruce Fields
07d1f80207 nfsd4: fix encoding of out-of-space replies
If nfsd4_check_resp_size() returns an error then we should really be
truncating the reply here, otherwise we may leave extra garbage at the
end of the rpc reply.

Also add a warning to catch any cases where our reply-size estimates may
be wrong in the case of a non-idempotent operation.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-27 11:09:08 -04:00
J. Bruce Fields
1802a67894 nfsd4: reserve head space for krb5 integ/priv info
Currently if the nfs-level part of a reply would be too large, we'll
return an error to the client.  But if the nfs-level part fits and
leaves no room for krb5p or krb5i stuff, then we just drop the request
entirely.

That's no good.  Instead, reserve some slack space at the end of the
buffer and make sure we fail outright if we'd come close.

The slack space here is a massive overstimate of what's required, we
should probably try for a tighter limit at some point.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:47 -04:00
J. Bruce Fields
2d124dfaad nfsd4: move proc_compound xdr encode init to helper
Mechanical transformation with no change of behavior.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:46 -04:00
J. Bruce Fields
d518465866 nfsd4: tweak nfsd4_encode_getattr to take xdr_stream
Just change the nfsd4_encode_getattr api.  Not changing any code or
adding any new functionality yet.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:46 -04:00
J. Bruce Fields
4aea24b2ff nfsd4: embed xdr_stream in nfsd4_compoundres
This is a mechanical transformation with no change in behavior.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:45 -04:00
J. Bruce Fields
e372ba60de nfsd4: decoding errors can still be cached and require space
Currently a non-idempotent op reply may be cached if it fails in the
proc code but not if it fails at xdr decoding.  I doubt there are any
xdr-decoding-time errors that would make this a problem in practice, so
this probably isn't a serious bug.

The space estimates should also take into account space required for
encoding of error returns.  Again, not a practical problem, though it
would become one after future patches which will tighten the space
estimates.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:44 -04:00
J. Bruce Fields
f34e432b67 nfsd4: fix write reply size estimate
The write reply also includes count and stable_how.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:43 -04:00
J. Bruce Fields
622f560e6a nfsd4: read size estimate should include padding
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:42 -04:00
J. Bruce Fields
5b648699af nfsd4: READ, READDIR, etc., are idempotent
OP_MODIFIES_SOMETHING flags operations that we should be careful not to
initiate without being sure we have the buffer space to encode a reply.

None of these ops fall into that category.

We could probably remove a few more, but this isn't a very important
problem at least for ops whose reply size is easy to estimate.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-23 09:03:41 -04:00
Trond Myklebust
14bcab1a39 NFSd: Clean up nfs4_preprocess_stateid_op
Move the state locking and file descriptor reference out from the
callers and into nfs4_preprocess_stateid_op() itself.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-05-07 11:05:48 -04:00
Kinglong Mee
2336745e87 NFSD: Clear wcc data between compound ops
Testing NFS4.0 by pynfs, I got some messeages as,
"nfsd: inode locked twice during operation."

When one compound RPC contains two or more ops that locks
the filehandle,the second op will cause the message.

As two SETATTR ops, after the first SETATTR, nfsd will not call
fh_put() to release current filehandle, it means filehandle have
unlocked with fh_post_saved = 1.
The second SETATTR find fh_post_saved = 1, and printk the message.

v2: introduce helper fh_clear_wcc().

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-30 10:47:34 -04:00
J. Bruce Fields
480efaee08 nfsd4: fix setclientid encode size
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28 21:24:55 -04:00
Kinglong Mee
4daeed25ad NFSD: simplify saved/current fh uses in nfsd4_proc_compound
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-28 18:01:40 -04:00
J. Bruce Fields
4c69d5855a nfsd4: session needs room for following op to error out
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-03-27 16:30:59 -04:00
Linus Torvalds
d9894c228b Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 - Handle some loose ends from the vfs read delegation support.
   (For example nfsd can stop breaking leases on its own in a
    fewer places where it can now depend on the vfs to.)
 - Make life a little easier for NFSv4-only configurations
   (thanks to Kinglong Mee).
 - Fix some gss-proxy problems (thanks Jeff Layton).
 - miscellaneous bug fixes and cleanup

* 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits)
  nfsd: consider CLAIM_FH when handing out delegation
  nfsd4: fix delegation-unlink/rename race
  nfsd4: delay setting current_fh in open
  nfsd4: minor nfs4_setlease cleanup
  gss_krb5: use lcm from kernel lib
  nfsd4: decrease nfsd4_encode_fattr stack usage
  nfsd: fix encode_entryplus_baggage stack usage
  nfsd4: simplify xdr encoding of nfsv4 names
  nfsd4: encode_rdattr_error cleanup
  nfsd4: nfsd4_encode_fattr cleanup
  minor svcauth_gss.c cleanup
  nfsd4: better VERIFY comment
  nfsd4: break only delegations when appropriate
  NFSD: Fix a memory leak in nfsd4_create_session
  sunrpc: get rid of use_gssp_lock
  sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt
  sunrpc: don't wait for write before allowing reads from use-gss-proxy file
  nfsd: get rid of unused function definition
  Define op_iattr for nfsd4_open instead using macro
  NFSD: fix compile warning without CONFIG_NFSD_V3
  ...
2014-01-30 10:18:43 -08:00
J. Bruce Fields
4335723e8e nfsd4: fix delegation-unlink/rename race
If a file is unlinked or renamed between the time when we do the local
open and the time when we get the delegation, then we will return to the
client indicating that it holds a delegation even though the file no
longer exists under the name it was open under.

But a client performing an open-by-name, when it is returned a
delegation, must be able to assume that the file is still linked at the
name it was opened under.

So, hold the parent i_mutex for longer to prevent concurrent renames or
unlinks.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-27 13:59:16 -05:00
J. Bruce Fields
c0e6bee480 nfsd4: delay setting current_fh in open
This is basically a no-op, to simplify a following patch.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-27 13:59:16 -05:00
Christoph Hellwig
4ac7249ea5 nfsd: use get_acl and ->set_acl
Remove the boilerplate code to marshall and unmarhall ACL objects into
xattrs and operate on the posix_acl objects directly.  Also move all
the ACL handling code into nfs?acl.c where it belongs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-01-26 08:26:41 -05:00
J. Bruce Fields
41ae6e714a nfsd4: better VERIFY comment
This confuses me every time.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-07 16:01:15 -05:00
Kinglong Mee
7e55b59b2f SUNRPC/NFSD: Support a new option for ignoring the result of svc_register
NFSv4 clients can contact port 2049 directly instead of needing the
portmapper.

Therefore a failure to register to the portmapper when starting an
NFSv4-only server isn't really a problem.

But Gareth Williams reports that an attempt to start an NFSv4-only
server without starting portmap fails:

  #rpc.nfsd -N 2 -N 3
  rpc.nfsd: writing fd to kernel failed: errno 111 (Connection refused)
  rpc.nfsd: unable to set any sockets for nfsd

Add a flag to svc_version to tell the rpc layer it can safely ignore an
rpcbind failure in the NFSv4-only case.

Reported-by: Gareth Williams <gareth@garethwilliams.me.uk>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-03 18:18:49 -05:00
Kinglong Mee
a8bb84bc9e nfsd: calculate the missing length of bitmap in EXCHANGE_ID
commit 58cd57bfd9
"nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID"
miss calculating the length of bitmap for spo_must_enforce and spo_must_allow.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-01-02 17:44:52 -05:00
Weston Andros Adamson
58cd57bfd9 nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID
- don't BUG_ON() when not SP4_NONE
 - calculate recv and send reserve sizes correctly

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-08-07 12:06:07 -04:00
J. Bruce Fields
35f7a14fc1 nfsd4: fix minorversion support interface
You can turn on or off support for minorversions using e.g.

	echo "-4.2" >/proc/fs/nfsd/versions

However, the current implementation is a little wonky.  For example, the
above will turn off 4.2 support, but it will also turn *on* 4.1 support.

This didn't matter as long as we only had 2 minorversions, which was
true till very recently.

And do a little cleanup here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-12 16:48:52 -04:00
J. Bruce Fields
89f6c3362c nfsd4: delegation-based open reclaims should bypass permissions
We saw a v4.0 client's create fail as follows:

	- open create succeeds and gets a read delegation
	- client attempts to set mode on new file, gets DELAY while
	  server recalls delegation.
	- client attempts a CLAIM_DELEGATE_CUR open using the
	  delegation, gets error because of new file mode.

This probably can't happen on a recent kernel since we're no longer
giving out delegations on create opens.  Nevertheless, it's a
bug--reclaim opens should bypass permission checks.

Reported-by: Steve Dickson <steved@redhat.com>
Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-07-01 17:32:05 -04:00
David Quigley
18032ca062 NFSD: Server implementation of MAC Labeling
Implement labeled NFS on the server: encoding and decoding, and writing
and reading, of file labels.

Enabled with CONFIG_NFSD_V4_SECURITY_LABEL.

Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-15 09:27:02 -04:00
J. Bruce Fields
9f415eb255 nfsd4: don't allow owner override on 4.1 CLAIM_FH opens
The Linux client is using CLAIM_FH to implement regular opens, not just
recovery cases, so it depends on the server to check permissions
correctly.

Therefore the owner override, which may make sense in the delegation
recovery case, isn't right in the CLAIM_FH case.

Symptoms: on a client with 49f9a0fafd
"NFSv4.1: Enable open-by-filehandle", Bryan noticed this:

	touch test.txt
	chmod 000 test.txt
	echo test > test.txt

succeeding.

Cc: stable@kernel.org
Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-05-03 16:36:14 -04:00
J. Bruce Fields
2a6cf944c2 nfsd4: don't remap EISDIR errors in rename
We're going out of our way here to remap an error to make rfc 3530
happy--but the rfc itself (nor rfc 1813, which has similar language)
gives no justification.  And disagrees with local filesystem behavior,
with Linux and posix man pages, and knfsd's implemented behavior for v2
and v3.

And the documented behavior seems better, in that it gives a little more
information--you could implement the 3530 behavior using the posix
behavior, but not the other way around.

Also, the Linux client makes no attempt to remap this error in the v4
case, so it can end up just returning EEXIST to the application in a
case where it should return EISDIR.

So honestly I think the rfc's are just buggy here--or in any case it
doesn't see worth the trouble to remap this error.

Reported-by: Frank S Filz <ffilz@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-30 15:44:20 -04:00
J. Bruce Fields
bbc9c36c31 nfsd4: more sessions/open-owner-replay cleanup
More logic that's unnecessary in the 4.1 case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:56 -04:00
J. Bruce Fields
3d74e6a5b6 nfsd4: no need for replay_owner in sessions case
The replay_owner will never be used in the sessions case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-09 09:08:55 -04:00
J. Bruce Fields
9411b1d4c7 nfsd4: cleanup handling of nfsv4.0 closed stateid's
Closed stateid's are kept around a little while to handle close replays
in the 4.0 case.  So we stash them in the last-used stateid in the
oo_last_closed_stateid field of the open owner.  We can free that in
encode_seqid_op_tail once the seqid on the open owner is next
incremented.  But we don't want to do that on the close itself; so we
set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the
first time through encode_seqid_op_tail, then when we see that flag set
next time we free it.

This is unnecessarily baroque.

Instead, just move the logic that increments the seqid out of the xdr
code and into the operation code itself.

The justification given for the current placement is that we need to
wait till the last minute to be sure we know whether the status is a
sequence-id-mutating error or not, but examination of the code shows
that can't actually happen.

Reported-by: Yanchuan Nian <ycnian@gmail.com>
Tested-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-08 09:55:32 -04:00
fanchaoting
b022032e19 nfsd: don't run get_file if nfs4_preprocess_stateid_op return error
we should return error status directly when nfs4_preprocess_stateid_op
return error.

Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 15:19:06 -04:00
J. Bruce Fields
9d313b17db nfsd4: handle seqid-mutating open errors from xdr decoding
If a client sets an owner (or group_owner or acl) attribute on open for
create, and the mapping of that owner to an id fails, then we return
BAD_OWNER.  But BAD_OWNER is a seqid-mutating error, so we can't
shortcut the open processing that case: we have to at least look up the
owner so we can find the seqid to bump.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:47:53 -04:00
J. Bruce Fields
b600de7ab9 nfsd4: remove BUG_ON
This BUG_ON just crashes the thread a little earlier than it would
otherwise--it doesn't seem useful.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-04-03 11:47:45 -04:00
J. Bruce Fields
84822d0b3b nfsd4: simplify nfsd4_encode_fattr interface slightly
It seems slightly simpler to make nfsd4_encode_fattr rather than its
callers responsible for advancing the write pointer on success.

(Also: the count == 0 check in the verify case looks superfluous.
Running out of buffer space is really the only reason fattr encoding
should fail with eresource.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2013-01-23 18:17:35 -05:00
J. Bruce Fields
a1dc695582 nfsd4: free_stateid can use the current stateid
Cc: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17 22:00:27 -05:00
J. Bruce Fields
9b3234b922 nfsd4: disable zero-copy on non-final read ops
To ensure ordering of read data with any following operations, turn off
zero copy if the read is not the final operation in the compound.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-17 16:02:41 -05:00
Stanislav Kinsbursky
b9c0ef8571 nfsd: make NFSd service boot time per-net
This is simple: an NFSd service can be started at different times in
different network environments. So, its "boot time" has to be assigned
per net.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:38 -05:00
Neil Brown
7007c90fb9 nfsd: avoid permission checks on EXCLUSIVE_CREATE replay
With NFSv4, if we create a file then open it we explicit avoid checking
the permissions on the file during the open because the fact that we
created it ensures we should be allow to open it (the create and the
open should appear to be a single operation).

However if the reply to an EXCLUSIVE create gets lots and the client
resends the create, the current code will perform the permission check -
because it doesn't realise that it did the open already..

This patch should fix this.

Note that I haven't actually seen this cause a problem.  I was just
looking at the code trying to figure out a different EXCLUSIVE open
related issue, and this looked wrong.

(Fix confirmed with pynfs 4.0 test OPEN4--bfields)

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
[bfields: use OWNER_OVERRIDE and update for 4.1]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-12-10 16:25:31 -05:00
J. Bruce Fields
ffe1137ba7 nfsd4: delay filling in write iovec array till after xdr decoding
Our server rejects compounds containing more than one write operation.
It's unclear whether this is really permitted by the spec; with 4.0,
it's possibly OK, with 4.1 (which has clearer limits on compound
parameters), it's probably not OK.  No client that we're aware of has
ever done this, but in theory it could be useful.

The source of the limitation: we need an array of iovecs to pass to the
write operation.  In the worst case that array of iovecs could have
hundreds of elements (the maximum rwsize divided by the page size), so
it's too big to put on the stack, or in each compound op.  So we instead
keep a single such array in the compound argument.

We fill in that array at the time we decode the xdr operation.

But we decode every op in the compound before executing any of them.  So
once we've used that array we can't decode another write.

If we instead delay filling in that array till the time we actually
perform the write, we can reuse it.

Another option might be to switch to decoding compound ops one at a
time.  I considered doing that, but it has a number of other side
effects, and I'd rather fix just this one problem for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-26 09:08:15 -05:00
Stanislav Kinsbursky
3320fef19b nfsd: use service net instead of hard-coded init_net
This patch replaces init_net by SVC_NET(), where possible and also passes
proper context to nested functions where required.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-15 07:40:50 -05:00
J. Bruce Fields
cb73a9f464 nfsd4: implement backchannel_ctl operation
This operation is mandatory for servers to implement.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-11-07 19:39:58 -05:00
J. Bruce Fields
d15c077e44 nfsd4: enforce per-client sessions/no-sessions distinction
Something like creating a client with setclientid and then trying to
confirm it with create_session may not crash the server, but I'm not
completely positive of that, and in any case it's obviously bad client
behavior.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-10-01 17:39:58 -04:00
Bryan Schumaker
24ff99c6fe NFSD: Swap the struct nfs4_operation getter and setter
stateid_setter should be matched to op_set_currentstateid, rather than
op_get_currentstateid.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-08-20 18:53:25 -04:00
Stanislav Kinsbursky
5ccb0066f2 LockD: pass actual network namespace to grace period management functions
Passed network namespace replaced hard-coded init_net

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-07-27 16:49:22 -04:00
Linus Torvalds
c6f5c93098 Merge branch 'for-3.4' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from J. Bruce Fields:
 "One bugfix, and one minor header fix from Jeff Layton while we're
  here"

* 'for-3.4' of git://linux-nfs.org/~bfields/linux:
  nfsd: include cld.h in the headers_install target
  nfsd: don't fail unchecked creates of non-special files
2012-04-19 14:54:52 -07:00
Al Viro
96f6f98501 nfsd: fix b0rken error value for setattr on read-only mount
..._want_write() returns -EROFS on failure, _not_ an NFS error value.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-04-13 10:12:00 -04:00
J. Bruce Fields
9dc4e6c4d1 nfsd: don't fail unchecked creates of non-special files
Allow a v3 unchecked open of a non-regular file succeed as if it were a
lookup; typically a client in such a case will want to fall back on a
local open, so succeeding and giving it the filehandle is more useful
than failing with nfserr_exist, which makes it appear that nothing at
all exists by that name.

Similarly for v4, on an open-create, return the same errors we would on
an attempt to open a non-regular file, instead of returning
nfserr_exist.

This fixes a problem found doing a v4 open of a symlink with
O_RDONLY|O_CREAT, which resulted in the current client returning EEXIST.

Thanks also to Trond for analysis.

Cc: stable@kernel.org
Reported-by: Orion Poplawski <orion@cora.nwra.com>
Tested-by: Orion Poplawski <orion@cora.nwra.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-04-11 17:49:52 -04:00
Jeff Layton
a52d726bbd nfsd: convert nfs4_client->cl_cb_flags to a generic flags field
We'll need a way to flag the nfs4_client as already being recorded on
stable storage so that we don't continually upcall. Currently, that's
recorded in the cl_firststate field of the client struct. Using an
entire u32 to store a flag is rather wasteful though.

The cl_cb_flags field is only using 2 bits right now, so repurpose that
to a generic flags field. Rename NFSD4_CLIENT_KILL to
NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback
flags. Add a mask that we can use for existing checks that look to see
whether any flags are set, so that the new flags don't interfere.

Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag,
and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-26 11:49:47 -04:00
Chuck Lever
ab4684d156 NFSD: Fix nfs4_verifier memory alignment
Clean up due to code review.

The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.

We can fix most of this by using a __be32 array to generate the
verifier's contents and then byte-copying it into the verifier field.

However, there is one spot where there is a backwards compatibility
constraint: the do_nfsd_create() call expects a verifier which is
32-bit aligned.  Fix this spot by forcing the alignment of the create
verifier in the nfsd4_open args structure.

Also, sizeof(nfs4_verifer) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier.  The two are not interchangeable, even if they happen to
have the same value.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-20 15:36:15 -04:00
Trond Myklebust
8f199b8262 NFSD: Fix warnings when NFSD_DEBUG is not defined
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-20 15:34:19 -04:00
J. Bruce Fields
59deeb9e5a nfsd4: reduce do_open_lookup() stack usage
I get 320 bytes for struct svc_fh on x86_64, really a little large to be
putting on the stack; kmalloc() instead.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:37 -05:00
J. Bruce Fields
41fd1e42f8 nfsd4: delay setting current filehandle till success
Compound processing stops on error, so the current filehandle won't be
used on error.  Thus the order here doesn't really matter.  It'll be
more convenient to do it later, though.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-03-06 18:13:36 -05:00
Benny Halevy
2c8bd7e0d1 nfsd41: split out share_access want and signal flags while decoding
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-17 18:38:42 -05:00
Tigran Mkrtchyan
37c593c573 nfsd41: use current stateid by value
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:45 -05:00
Tigran Mkrtchyan
9428fe1abb nfsd41: consume current stateid on DELEGRETURN and OPENDOWNGRADE
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:44 -05:00
Tigran Mkrtchyan
1e97b5190d nfsd41: handle current stateid in SETATTR and FREE_STATEID
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:43 -05:00
Tigran Mkrtchyan
d14710532f nfsd41: mark LOOKUP, LOOKUPP and CREATE to invalidate current stateid
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:42 -05:00
Tigran Mkrtchyan
8307111476 nfsd41: save and restore current stateid with current fh
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:41 -05:00
Tigran Mkrtchyan
80e01cc1e2 nfsd41: mark PUTFH, PUTPUBFH and PUTROOTFH to clear current stateid
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:41 -05:00
Tigran Mkrtchyan
30813e2773 nfsd41: consume current stateid on read and write
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:40 -05:00
Tigran Mkrtchyan
62cd4a591c nfsd41: handle current stateid on lock and locku
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:39 -05:00
Tigran Mkrtchyan
8b70484c67 nfsd41: handle current stateid in open and close
Signed-off-by: Tigran Mkrtchyan <kofemann@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-02-15 11:20:38 -05:00
Linus Torvalds
0b48d42235 Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux
* 'for-3.3' of git://linux-nfs.org/~bfields/linux: (31 commits)
  nfsd4: nfsd4_create_clid_dir return value is unused
  NFSD: Change name of extended attribute containing junction
  svcrpc: don't revert to SVC_POOL_DEFAULT on nfsd shutdown
  svcrpc: fix double-free on shutdown of nfsd after changing pool mode
  nfsd4: be forgiving in the absence of the recovery directory
  nfsd4: fix spurious 4.1 post-reboot failures
  NFSD: forget_delegations should use list_for_each_entry_safe
  NFSD: Only reinitilize the recall_lru list under the recall lock
  nfsd4: initialize special stateid's at compile time
  NFSd: use network-namespace-aware cache registering routines
  SUNRPC: create svc_xprt in proper network namespace
  svcrpc: update outdated BKL comment
  nfsd41: allow non-reclaim open-by-fh's in 4.1
  svcrpc: avoid memory-corruption on pool shutdown
  svcrpc: destroy server sockets all at once
  svcrpc: make svc_delete_xprt static
  nfsd: Fix oops when parsing a 0 length export
  nfsd4: Use kmemdup rather than duplicating its implementation
  nfsd4: add a separate (lockowner, inode) lookup
  nfsd4: fix CONFIG_NFSD_FAULT_INJECTION compile error
  ...
2012-01-14 12:26:41 -08:00
Al Viro
bad0dcffc2 new helpers: fh_{want,drop}_write()
A bunch of places in nfsd does mnt_{want,drop}_write on vfsmount of
export of given fhandle.  Switched to obvious inlined helpers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:52:35 -05:00
Mi Jinlong
0cf99b91c6 nfsd41: allow non-reclaim open-by-fh's in 4.1
With NFSv4.0 it was safe to assume that open-by-filehandles were always
reclaims.

With NFSv4.1 there are non-reclaim open-by-filehandle operations, so we
should ensure we're only insisting on reclaims in the
OPEN_CLAIM_PREVIOUS case.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-12-06 16:19:04 -05:00
Mi Jinlong
345c284290 nfs41: implement DESTROY_CLIENTID operation
According to rfc5661 18.50, implement DESTROY_CLIENTID operation.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-24 04:24:30 -04:00
J. Bruce Fields
8b289b2c23 nfsd4: implement new 4.1 open reclaim types
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-19 11:52:12 -04:00
J. Bruce Fields
856121b2e8 nfsd4: warn on open failure after create
If we create the object and then return failure to the client, we're
left with an unexpected file in the filesystem.

I'm trying to eliminate such cases but not 100% sure I have so an
assertion might be helpful for now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:50:08 -04:00
J. Bruce Fields
d29b20cd58 nfsd4: clean up open owners on OPEN failure
If process_open1() creates a new open owner, but the open later fails,
the current code will leave the open owner around.  It won't be on the
close_lru list, and the client isn't expected to send a CLOSE, so it
will hang around as long as the client does.

Similarly, if process_open1() removes an existing open owner from the
close lru, anticipating that an open owner that previously had no
associated stateid's now will, but the open subsequently fails, then
we'll again be left with the same leak.

Fix both problems.

Reported-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-17 17:33:57 -04:00
J. Bruce Fields
b6d2f1ca3c nfsd4: more robust ignoring of WANT bits in OPEN
Mask out the WANT bits right at the start instead of on each use.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-10-11 12:15:15 -04:00
J. Bruce Fields
c856694e3d nfsd4: make op_cacheresult another flag
I'm not sure why I used a new field for this originally.

Also, the differences between some of these flags are a little subtle;
add some comments to explain.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-20 14:45:51 -04:00
J. Bruce Fields
dad1c067eb nfsd4: replace oo_confirmed by flag bit
I want at least one more bit here.  So, let's haul out the caps lock key
and add a flags field.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-16 17:44:16 -04:00
Mi Jinlong
58e7b33a58 nfsd41: try to check reply size before operation
For checking the size of reply before calling a operation,
we need try to get maxsize of the operation's reply.

v3: using new method as Bruce said,

 "we could handle operations in two different ways:

	- For operations that actually change something (write, rename,
	  open, close, ...), do it the way we're doing it now: be
	  very careful to estimate the size of the response before even
	  processing the operation.
	- For operations that don't change anything (read, getattr, ...)
	  just go ahead and do the operation.  If you realize after the
	  fact that the response is too large, then return the error at
	  that point.

  So we'd add another flag to op_flags: say, OP_MODIFIES_SOMETHING.  And for
  operations with OP_MODIFIES_SOMETHING set, we'd do the first thing.  For
  operations without it set, we'd do the second."

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
[bfields@redhat.com: crash, don't attempt to handle, undefined op_rsize_bop]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-16 10:31:01 -04:00
J. Bruce Fields
fe0750e5c4 nfsd4: split stateowners into open and lockowners
The stateowner has some fields that only make sense for openowners, and
some that only make sense for lockowners, and I find it a lot clearer if
those are separated out.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-07 09:45:49 -04:00
J. Bruce Fields
7c13f344cf nfsd4: drop most stateowner refcounting
Maybe we'll bring it back some day, but we don't have much real use for
it now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-01 11:12:47 -04:00
J. Bruce Fields
5ec094c109 nfsd4: extend state lock over seqid replay logic
There are currently a couple races in the seqid replay code: a
retransmission could come while we're still encoding the original reply,
or a new seqid-mutating call could come as we're encoding a replay.

So, extend the state lock over the encoding (both encoding of a replayed
reply and caching of the original encoded reply).

I really hate doing this, and previously added the stateowner
reference-counting code to avoid it (which was insufficient)--but I
don't see a less complicated alternative at the moment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-09-01 07:07:59 -04:00
J. Bruce Fields
3e77246393 nfsd4: stop using nfserr_resource for transitory errors
The server is returning nfserr_resource for both permanent errors and
for errors (like allocation failures) that might be resolved by retrying
later.  Save nfserr_resource for the former and use delay/jukebox for
the latter.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:21:21 -04:00
J. Bruce Fields
a043226bc1 nfsd4: permit read opens of executable-only files
A client that wants to execute a file must be able to read it.  Read
opens over nfs are therefore implicitly allowed for executable files
even when those files are not readable.

NFSv2/v3 get this right by using a passed-in NFSD_MAY_OWNER_OVERRIDE on
read requests, but NFSv4 has gotten this wrong ever since
dc730e1737 "nfsd4: fix owner-override on
open", when we realized that the file owner shouldn't override
permissions on non-reclaim NFSv4 opens.

So we can't use NFSD_MAY_OWNER_OVERRIDE to tell nfsd_permission to allow
reads of executable files.

So, do the same thing we do whenever we encounter another weird NFS
permission nit: define yet another NFSD_MAY_* flag.

The industry's future standardization on 128-bit processors will be
motivated primarily by the need for integers with enough bits for all
the NFSD_MAY_* flags.

Reported-by: Leonardo Borda <leonardoborda@gmail.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-27 14:20:20 -04:00
J. Bruce Fields
75c096f753 nfsd4: it's OK to return nfserr_symlink
The nfsd4 code has a bunch of special exceptions for error returns which
map nfserr_symlink to other errors.

In fact, the spec makes it clear that nfserr_symlink is to be preferred
over less specific errors where possible.

The patch that introduced it back in 2.6.4 is "kNFSd: correct symlink
related error returns.", which claims that these special exceptions are
represent an NFSv4 break from v2/v3 tradition--when in fact the symlink
error was introduced with v4.

I suspect what happened was pynfs tests were written that were overly
faithful to the (known-incomplete) rfc3530 error return lists, and then
code was fixed up mindlessly to make the tests pass.

Delete these unnecessary exceptions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-26 18:22:50 -04:00
J. Bruce Fields
aadab6c6f4 nfsd4: return nfserr_symlink on v4 OPEN of non-regular file
Without this, an attempt to open a device special file without first
stat'ing it will fail.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-19 13:25:32 -04:00
Bernd Schubert
832023bffb nfsd4: Remove check for a 32-bit cookie in nfsd4_readdir()
Fan Yong <yong.fan@whamcloud.com> noticed setting
FMODE_32bithash wouldn't work with nfsd v4, as
nfsd4_readdir() checks for 32 bit cookies. However, according to RFC 3530
cookies have a 64 bit type and cookies are also defined as u64 in
'struct nfsd4_readdir'. So remove the test for >32-bit values.

Cc: stable@kernel.org
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-08-16 15:19:28 -04:00
J. Bruce Fields
1091006c5e nfsd: turn on reply cache for NFSv4
It's sort of ridiculous that we've never had a working reply cache for
NFSv4.

On the other hand, we may still not: our current reply cache is likely
not very good, especially in the TCP case (which is the only case that
matters for v4).  What we really need here is some serious testing.

Anyway, here's a start.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-18 09:39:01 -04:00
J. Bruce Fields
3e98abffd1 nfsd4: call nfsd4_release_compoundargs from pc_release
This simplifies cleanup a bit.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-18 09:38:02 -04:00
Mi Jinlong
ab1350b2b3 nfsd41: Deny new lock before RECLAIM_COMPLETE done
Before nfs41 client's RECLAIM_COMPLETE done, nfs server should deny any
new locks or opens.

rfc5661:

   " Whenever a client establishes a new client ID and before it does
   the first non-reclaim operation that obtains a lock, it MUST send a
   RECLAIM_COMPLETE with rca_one_fs set to FALSE, even if there are no
   locks to reclaim.  If non-reclaim locking operations are done before
   the RECLAIM_COMPLETE, an NFS4ERR_GRACE error will be returned. "

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 19:00:40 -04:00
Bryan Schumaker
1745680454 NFSD: Added TEST_STATEID operation
This operation is used by the client to check the validity of a list of
stateids.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:48 -04:00
Bryan Schumaker
e1ca12dfb1 NFSD: added FREE_STATEID operation
This operation is used by the client to tell the server to free a
stateid.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:47 -04:00
Benny Halevy
094b5d74f4 NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND
DESTROY_CLIENTID MAY be preceded with a SEQUENCE operation as long as
   the client ID derived from the session ID of SEQUENCE is not the same
   as the client ID to be destroyed.  If the client IDs are the same,
   then the server MUST return NFS4ERR_CLIENTID_BUSY.

(that's not implemented yet)

   If DESTROY_CLIENTID is not prefixed by SEQUENCE, it MUST be the only
   operation in the COMPOUND request (otherwise, the server MUST return
   NFS4ERR_NOT_ONLY_OP).

This fixes the error return; before, we returned
NFS4ERR_OP_NOT_IN_SESSION; after this patch, we return NFS4ERR_NOTSUPP.

Signed-off-by: Benny Halevy <benny@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:41 -04:00
Mi Jinlong
ac6721a13e nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly
The NFS server uses nfsd_create_v3 to handle EXCLUSIVE4_1 opens, but
that function is not prepared to handle them.

Rename nfsd_create_v3() to do_nfsd_create(), and add handling of
EXCLUSIVE4_1.

Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-29 20:47:52 -04:00
J. Bruce Fields
68d9318435 nfsd4: fix wrongsec handling for PUTFH + op cases
When PUTFH is followed by an operation that uses the filehandle, and
when the current client is using a security flavor that is inconsistent
with the given filehandle, we have a choice: we can return WRONGSEC
either when the current filehandle is set using the PUTFH, or when the
filehandle is first used by the following operation.

Follow the recommendations of RFC 5661 in making this choice.

(Our current behavior prevented the client from doing security
negotiation by returning WRONGSEC on PUTFH+SECINFO_NO_NAME.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-29 20:47:51 -04:00
J. Bruce Fields
29a78a3ed7 nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller
The secinfo caller actually won't want this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-11 08:42:22 -04:00
J. Bruce Fields
22b0321496 nfsd4: introduce OPDESC helper
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-04-11 08:42:21 -04:00
Mi Jinlong
5ece3cafbd nfsd41: modify the members value of nfsd4_op_flags
The members of nfsd4_op_flags, (ALLOWED_WITHOUT_FH | ALLOWED_ON_ABSENT_FS)
equals to  ALLOWED_AS_FIRST_OP, maybe that's not what we want.

OP_PUTROOTFH with op_flags = ALLOWED_WITHOUT_FH | ALLOWED_ON_ABSENT_FS,
can't appears as the first operation with out SEQUENCE ops.

This patch modify the wrong value of ALLOWED_WITHOUT_FH etc which
was introduced by f9bb94c4.

Cc: stable@kernel.org
Reviewed-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-03-07 12:10:33 -05:00
J. Bruce Fields
1d1bc8f207 nfsd4: support BIND_CONN_TO_SESSION
Basic xdr and processing for BIND_CONN_TO_SESSION.  This adds a
connection to the list of connections associated with a session.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-11 15:04:09 -05:00
J. Bruce Fields
da165dd60e nfsd: remove some unnecessary dropit handling
We no longer need a few of these special cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-01-04 16:49:23 -05:00
J. Bruce Fields
04f4ad16b2 nfsd4: implement secinfo_no_name
Implementation of this operation is mandatory for NFSv4.1.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-12-17 15:48:25 -05:00
J. Bruce Fields
0ff7ab4671 nfsd4: move guts of nfsd4_lookupp into helper
We'll reuse this code in secinfo_no_name.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-12-17 15:48:24 -05:00
J. Bruce Fields
56560b9ae0 nfsd4: 4.1 SECINFO should consume filehandle
See the referenced spec language; an attempt by a 4.1 client to use the
current filehandle after a secinfo call should result in a NOFILEHANDLE
error.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-12-17 15:48:23 -05:00
NeilBrown
8ff30fa4ef nfsd: disable deferral for NFSv4
Now that a slight delay in getting a reply to an upcall doesn't
require deferring of requests, request deferral for all NFSv4
requests - the concept doesn't really fit with the v4 model.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2010-09-21 17:02:27 -04:00
J. Bruce Fields
4dc6ec00f6 nfsd4: implement reclaim_complete
This is a mandatory operation.  Also, here (not in open) is where we
should be committing the reboot recovery information.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-05-13 12:03:11 -04:00
J. Bruce Fields
5306293c9c Merge commit 'v2.6.34-rc6'
Conflicts:
	fs/nfsd/nfs4callback.c
2010-05-04 11:29:05 -04:00
J. Bruce Fields
26c0c75e69 nfsd4: fix unlikely race in session replay case
In the replay case, the

	renew_client(session->se_client);

happens after we've droppped the sessionid_lock, and without holding a
reference on the session; so there's nothing preventing the session
being freed before we get here.

Thanks to Benny Halevy for catching a bug in an earlier version of this
patch.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Benny Halevy <bhalevy@panasas.com>
2010-05-03 08:32:31 -04:00
J. Bruce Fields
5771635592 nfsd4: complete enforcement of 4.1 op ordering
Enforce the rules about compound op ordering.

Motivated by implementing RECLAIM_COMPLETE, for which the client is
implicit in the current session, so it is important to ensure a
succesful SEQUENCE proceeds the RECLAIM_COMPLETE.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2010-04-22 11:35:14 -04:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
J. Bruce Fields
7663dacd92 nfsd: remove pointless paths in file headers
The new .h files have paths at the top that are now out of date.  While
we're here, just remove all of those from fs/nfsd; they never served any
purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-15 15:01:47 -05:00
Boaz Harrosh
9a74af2133 nfsd: Move private headers to source directory
Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:12 -05:00
Boaz Harrosh
341eb18446 nfsd: Source files #include cleanups
Now that the headers are fixed and carry their own wait, all fs/nfsd/
source files can include a minimal set of headers. and still compile just
fine.

This patch should improve the compilation speed of the nfsd module.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:12:09 -05:00
J. Bruce Fields
57ecb34feb nfsd4: fix share mode permissions
NFSv4 opens may function as locks denying other NFSv4 users the rights
to open a file.

We're requiring a user to have write permissions before they can deny
write.  We're *not* requiring a user to have write permissions to deny
read, which is if anything a more drastic denial.

What was intended was to require write permissions for DENY_READ.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-12-14 18:06:54 -05:00
J. Bruce Fields
0a3adadee4 nfsd: make fs/nfsd/vfs.h for common includes
None of this stuff is used outside nfsd, so move it out of the common
linux include directory.

Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-11-13 13:23:02 -05:00
Trond Myklebust
a06b1261bd NFSD: Fix a bug in the NFSv4 'supported attrs' mandatory attribute
The fact that the filesystem doesn't currently list any alternate
locations does _not_ imply that the fs_locations attribute should be
marked as "unsupported".

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-01 20:00:17 -04:00
Andy Adamson
abfabf8caf nfsd41: encode replay sequence from the slot values
The sequence operation is not cached; always encode the sequence operation on
a replay from the slot table and session values. This simplifies the sessions
replay logic in nfsd4_proc_compound.

If this is a replay of a compound that was specified not to be cached, return
NFS4ERR_RETRY_UNCACHED_REP.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 16:12:34 -04:00
Andy Adamson
c8647947f8 nfsd41: rename nfsd4_enc_uncached_replay
This function is only used for SEQUENCE replay.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:30:36 -04:00
Andy Adamson
49557cc74c nfsd41: Use separate DRC for setclientid
Instead of trying to share the generic 4.1 reply cache code for the
CREATE_SESSION reply cache, it's simpler to handle CREATE_SESSION
separately.

The nfs41 single slot clientid DRC holds the results of create session
processing.  CREATE_SESSION can be preceeded by a SEQUENCE operation
(an embedded CREATE_SESSION) and the create session single slot cache must be
maintained.  nfsd4_replay_cache_entry() and nfsd4_store_cache_entry() do not
implement the replay of an embedded CREATE_SESSION.

The clientid DRC slot does not need the inuse, cachethis or other fields that
the multiple slot session cache uses.  Replace the clientid DRC cache struct
nfs4_slot cache with a new nfsd4_clid_slot cache.  Save the xdr struct
nfsd4_create_session into the cache at the end of processing, and on a replay,
replace the struct for the replay request with the cached version all while
under the state lock.

nfsd4_proc_compound will handle both the solo and embedded CREATE_SESSION case
via the normal use of encode_operation.

Errors that do not change the create session cache:
A create session NFS4ERR_STALE_CLIENTID error means that a client record
(and associated create session slot) could not be found and therefore can't
be changed.  NFSERR_SEQ_MISORDERED errors do not change the slot cache.

All other errors get cached.

Remove the clientid DRC specific check in nfs4svc_encode_compoundres to
put the session only if cstate.session is set which will now always be true.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-28 14:30:29 -04:00
Yu Zhiguo
9208faf297 NFSv4: ACL in operations 'open' and 'create' should be used
ACL in operations 'open' and 'create' is decoded but never be used.
It should be set as the initial ACL for the object according to RFC3530.
If error occurs when setting the ACL, just clear the ACL bit in the
returned attr bitmap.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-07-14 12:16:47 -04:00
Yu Zhiguo
0a93a47f04 NFSv4: kill off complicated macro 'PROC'
J. Bruce Fields wrote:
...
> (This is extremely confusing code to track down: note that
> proc->pc_decode is set to nfs4svc_decode_compoundargs() by the PROC()
> macro at the end of fs/nfsd/nfs4proc.c.  Which means, for example, that
> grepping for nfs4svc_decode_compoundargs() gets you nowhere.  Patches to
> kill off that macro would be welcomed....)

the macro 'PROC' is complicated and obscure, it had better
be killed off in order to make the code more clear.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-01 18:09:20 -04:00
Yu Zhiguo
3c8e03166a NFSv4: do exact check about attribute specified
Server should return NFS4ERR_ATTRNOTSUPP if an attribute specified is
not supported in current environment.
Operations CREATE, NVERIFY, OPEN, SETATTR and VERIFY should do this check.

This bug is found when do newpynfs tests. The names of the tests that failed
are following:
  CR12 NVF7a NVF7b NVF7c NVF7d NVF7f NVF7r NVF7s
  OPEN15 VF7a VF7b VF7c VF7d VF7f VF7r VF7s

Add function do_check_fattr() to do exact check:
1, Check attribute specified is supported by the NFSv4 server or not.
2, Check FATTR4_WORD0_ACL & FATTR4_WORD0_FS_LOCATIONS are supported
   in current environment or not.
3, Check attribute specified is writable or not.

step 1 and 3 are done in function nfsd4_decode_fattr() but removed
to this function now.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-06-01 18:01:54 -04:00
Benny Halevy
79fb54abd2 nfsd41: CREATE_EXCLUSIVE4_1
Implement the CREATE_EXCLUSIVE4_1 open mode conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

This mode allows the client to atomically create a file
if it doesn't exist while setting some of its attributes.

It must be implemented if the server supports persistent
reply cache and/or pnfs.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:23 -07:00
Andy Adamson
7e70570647 nfsd41: support for 3-word long attribute bitmask
Also, use client minorversion to generate supported attrs

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:23 -07:00
Benny Halevy
95ec28cda3 nfsd: dynamically skip encoded fattr bitmap in _nfsd4_verify
_nfsd4_verify currently skips 3 words from the encoded buffer begining.
With support for 3-word attr bitmaps in nfsd41, nfsd4_encode_fattr
may encode 1, 2, or 3 words, and not always 2 as it used to be, hence
we need to find out where to skip using the encoded bitmap length.

Note: This patch may be applied over pre-nfsd41 nfsd.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:22 -07:00
Benny Halevy
8daf220a6a nfsd41: control nfsv4.1 svc via /proc/fs/nfsd/versions
Support enabling and disabling nfsv4.1 via /proc/fs/nfsd/versions
by writing the strings "+4.1" or "-4.1" correspondingly.

Use user mode nfs-utils (rpc.nfsd option) to enable.
This will allow us to get rid of CONFIG_NFSD_V4_1

[nfsd41: disable support for minorversion by default]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:21 -07:00
Andy Adamson
d87a8ade95 nfsd41: access_valid
For nfs41, the open share flags are used also for
delegation "wants" and "signals".  Check that they are valid.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:21 -07:00
Andy Adamson
60adfc50de nfsd41: clientid handling
Extract the clientid from sessionid to set the op_clientid on open.
Verify that the clid for other stateful ops is zero for minorversion != 0
Do all other checks for stateful ops without sessions.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[fixed whitespace indent]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 remove sl_session from nfsd4_open]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:20 -07:00
Andy Adamson
6668958fac nfsd41: stateid handling
When sessions are used, stateful operation sequenceid and stateid handling
are not used. When sessions are used,  on the first open set the seqid to 1,
mark state confirmed and skip seqid processing.

When sessionas are used the stateid generation number is ignored when it is zero
whereas without sessions bad_stateid or stale stateid is returned.

Add flags to propagate session use to all stateful ops and down to
check_stateid_generation.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfsd4_has_session should return a boolean, not u32]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: pass nfsd4_compoundres * to nfsd4_process_open1]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_stateid_op]
[nfsd41: calculate HAS_SESSION in nfs4_preprocess_seqid_op]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:19 -07:00
Benny Halevy
dd453dfd70 nfsd: pass nfsd4_compound_state* to nfs4_preprocess_{state,seq}id_op
Currently we only use cstate->current_fh,
will also be used by nfsd41 code.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:19 -07:00
Andy Adamson
bf864a31d5 nfsd41: non-page DRC for solo sequence responses
A session inactivity time compound (lease renewal) or a compound where the
sequence operation has sa_cachethis set to FALSE do not require any pages
to be held in the v4.1 DRC. This is because struct nfsd4_slot is already
caching the session information.

Add logic to the nfs41 server to not cache response pages for solo sequence
responses.

Return nfserr_replay_uncached_rep on the operation following the sequence
operation when sa_cachethis is FALSE.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfsd4_replay_cache_entry]
[nfsd41: rename nfsd4_no_page_in_cache]
[nfsd41 rename nfsd4_enc_no_page_replay]
[nfsd41 nfsd4_is_solo_sequence]
[nfsd41 change nfsd4_not_cached return]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed return type to bool]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41 drop parens in nfsd4_is_solo_sequence call]
Signed-off-by: Andy Adamson <andros@netapp.com>
[changed "== 0" to "!"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:19 -07:00
Andy Adamson
da3846a286 nfsd41: nfsd DRC logic
Replay a request in nfsd4_sequence.
Add a minorversion to struct nfsd4_compound_state.

Pass the current slot to nfs4svc_encode_compound res via struct
nfsd4_compoundres to set an NFSv4.1 DRC entry.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use bool inuse for slot state]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: use cstate session in nfs4svc_encode_compoundres]
[nfsd41 replace nfsd4_set_cache_entry]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:17 -07:00
Andy Adamson
f9bb94c4c6 nfsd41: enforce NFS4ERR_SEQUENCE_POS operation order rules for minorversion != 0 only.
Signed-off-by: Andy Adamson<andros@netapp.com>
[nfsd41: do not verify nfserr_sequence_pos for minorversion 0]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:16 -07:00
Andy Adamson
069b6ad4bb nfsd41: proc stubs
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:14 -07:00
Andy Adamson
2f425878b6 nfsd: don't use the deferral service, return NFS4ERR_DELAY
On an NFSv4.1 server cache miss that causes an upcall, NFS4ERR_DELAY will be
returned. It is up to the NFSv4.1 client to resend only the operations that
have not been processed.

Initialize rq_usedeferral to 1 in svc_process(). It sill be turned off in
nfsd4_proc_compound() only when NFSv4.1 Sessions are used.

Note: this isn't an adequate solution on its own. It's acceptable as a way
to get some minimal 4.1 up and working, but we're going to have to find a
way to avoid returning DELAY in all common cases before 4.1 can really be
considered ready.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: reverse rq_nodeferral negative logic]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[sunrpc: initialize rq_usedeferral]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-04-03 17:41:12 -07:00
Benny Halevy
2076601632 nfsd: remove nfsd4_ops array size
There's no need for it.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-30 17:03:11 -04:00
Andy Adamson
e354d571bb nfsd: embed nfsd4_current_state in nfsd4_compoundres
Remove the allocation of struct nfsd4_compound_state.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-29 16:20:12 -04:00
J. Bruce Fields
5cb031b0af nfsd4: remove redundant check from nfsd4_open
Note that we already checked for this invalid case at the top of this
function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:41 -04:00
J. Bruce Fields
a1c8c4d1ff nfsd4: support putpubfh operation
Currently putpubfh returns NFSERR_OPNOTSUPP, which isn't actually
allowed for v4.  The right error is probably NFSERR_NOTSUPP.

But let's just implement it; though rarely seen, it can be used by
Solaris (with a special mount option), is mandated by the rfc, and is
trivial for us to support.

Thanks to Yang Hongyang for pointing out the original problem, and to
Mike Eisler, Tom Talpey, Trond Myklebust, and Dave Noveck for further
argument....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:40 -04:00
David Shaw
31dec2538e Short write in nfsd becomes a full write to the client
If a filesystem being written to via NFS returns a short write count
(as opposed to an error) to nfsd, nfsd treats that as a success for
the entire write, rather than the short count that actually succeeded.

For example, given a 8192 byte write, if the underlying filesystem
only writes 4096 bytes, nfsd will ack back to the nfs client that all
8192 bytes were written.  The nfs client does have retry logic for
short writes, but this is never called as the client is told the
complete write succeeded.

There are probably other ways it could happen, but in my case it
happened with a fuse (filesystem in userspace) filesystem which can
rather easily have a partial write.

Here is a patch to properly return the short write count to the
client.

Signed-off-by: David Shaw <dshaw@jabberwocky.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:40 -04:00
J. Bruce Fields
6150ef0dc7 nfsd4: remove unused CHECK_FH flag
All users now pass this, so it's meaningless.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:38:37 -04:00
J. Bruce Fields
a4773c08f2 nfsd4: use helper for copying filehandles for replay
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:30:49 -04:00
J. Bruce Fields
13024b7b40 nfsd4: fix misplaced comment
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:30:49 -04:00
J. Bruce Fields
99f8872638 nfsd: clarify exclusive create bitmask result.
The use of |= is confusing--the bitmask is always initialized to zero in
this case, so we're effectively just doing an assignment here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-03-18 17:30:48 -04:00
Benny Halevy
0407717d85 nfsd: dprint each op status in nfsd4_proc_compound
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-01-07 17:32:45 -05:00
J. Bruce Fields
af558e33be nfsd: common grace period control
Rewrite grace period code to unify management of grace period across
lockd and nfsd.  The current code has lockd and nfsd cooperate to
compute a grace period which is satisfactory to them both, and then
individually enforce it.  This creates a slight race condition, since
the enforcement is not coordinated.  It's also more complicated than
necessary.

Here instead we have lockd and nfsd each inform common code when they
enter the grace period, and when they're ready to leave the grace
period, and allow normal locking only after both of them are ready to
leave.

We also expect the locks_start_grace()/locks_end_grace() interface here
to be simpler to build on for future cluster/high-availability work,
which may require (for example) putting individual filesystems into
grace, or enforcing grace periods across multiple cluster nodes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-10-03 16:19:02 -04:00
Andy Adamson
c228c24bf1 nfsd: fix compound state allocation error handling
Move the cstate_alloc call so that if it fails, the response is setup to
encode the NFS error. The out label now means that the
nfsd4_compound_state has not been allocated.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-09-01 14:17:48 -04:00
Adrian Bunk
f1c7f79b6a [NFSD] uninline nfsd4_op_name()
There doesn't seem to be a compelling reason why nfsd4_op_name() is
marked as "inline":

It's only used in a dprintk(), and as long as it has only one caller
non-ancient gcc versions anyway inline it automatically.

This patch fixes the following compile error with gcc 3.4:

  ...
    CC      fs/nfsd/nfs4proc.o
  nfs4proc.c: In function `nfsd4_proc_compound':
  nfs4proc.c:854: sorry, unimplemented: inlining failed in call to
  nfs4proc.c:897: sorry, unimplemented: called from here
  make[3]: *** [fs/nfsd/nfs4proc.o] Error 1

Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
[ Also made it "const char *"  - Linus]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-08-08 11:22:19 -07:00
Benny Halevy
b001a1b6aa nfsd: dprint operation names
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-07-02 19:03:19 -04:00
Miklos Szeredi
8837abcab3 nfsd: rename MAY_ flags
Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it
clear, that these are not used outside nfsd, and to avoid name and
number space conflicts with the VFS.

[comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well]

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:02:50 -04:00
J. Bruce Fields
3b12cd9862 nfsd: add dprintk of compound return
We already print each operation of the compound when debugging is turned
on; printing the result could also help with remote debugging.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-06-23 13:02:48 -04:00
Dave Hansen
18f335aff8 [PATCH] r/o bind mounts: elevate write count for xattr_permission() callers
This basically audits the callers of xattr_permission(), which calls
permission() and can perform writes to the filesystem.

[AV: add missing parts - removexattr() and nfsd posix acls, plug for a leak
spotted by Miklos]

Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19 00:29:15 -04:00
Frank Filz
406a7ea97d nfsd: Allow AIX client to read dir containing mountpoints
This patch addresses a compatibility issue with a Linux NFS server and
AIX NFS client.

I have exported /export as fsid=0 with sec=krb5:krb5i
I have mount --bind /home onto /export/home
I have exported /export/home with sec=krb5i

The AIX client mounts / -o sec=krb5:krb5i onto /mnt

If I do an ls /mnt, the AIX client gets a permission error. Looking at
the network traceIwe see a READDIR looking for attributes
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID. The response gives a
NFS4ERR_WRONGSEC which the AIX client is not expecting.

Since the AIX client is only asking for an attribute that is an
attribute of the parent file system (pseudo root in my example), it
seems reasonable that there should not be an error.

In discussing this issue with Bruce Fields, I initially proposed
ignoring the error in nfsd4_encode_dirent_fattr() if all that was being
asked for was FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID, however,
Bruce suggested that we avoid calling cross_mnt() if only these
attributes are requested.

The following patch implements bypassing cross_mnt() if only
FATTR4_RDATTR_ERROR and FATTR4_MOUNTED_ON_FILEID are called. Since there
is some complexity in the code in nfsd4_encode_fattr(), I didn't want to
duplicate code (and introduce a maintenance nightmare), so I added a
parameter to nfsd4_encode_fattr() that indicates whether it should
ignore cross mounts and simply fill in the attribute using the passed in
dentry as opposed to it's parent.

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-02-01 16:42:06 -05:00
J. Bruce Fields
2fdada03b3 knfsd: demote some printk()s to dprintk()s
To quote a recent mail from Andrew Morton:

	Look: if there's a way in which an unprivileged user can trigger
	a printk we fix it, end of story.

OK.  I assume that goes double for printk()s that might be triggered by
random hosts on the internet.  So, disable some printk()s that look like
they could be triggered by malfunctioning or malicious clients.  For
now, just downgrade them to dprintk()s.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Acked-by:  Neil Brown <neilb@suse.de>
2007-10-09 18:31:56 -04:00
Jeff Layton
749997e512 knfsd: set the response bitmask for NFS4_CREATE_EXCLUSIVE
RFC 3530 says:

 If the server uses an attribute to store the exclusive create verifier, it
 will signify which attribute by setting the appropriate bit in the attribute
 mask that is returned in the results.

Linux uses the atime and mtime to store the verifier, but sends a zeroed out
bitmask back to the client.  This patch makes sure that we set the correct
bits in the bitmask in this situation.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-31 15:39:38 -07:00
Andy Adamson
dcb488a3b7 knfsd: nfsd4: implement secinfo
Implement the secinfo operation.

(Thanks to Usha Ketineni wrote an earlier version of this support.)

Cc: Usha Ketineni <uketinen@us.ibm.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:08 -07:00
J. Bruce Fields
df547efb03 knfsd: nfsd4: simplify exp_pseudoroot arguments
We're passing three arguments to exp_pseudoroot, two of which are just fields
of the svc_rqst.  Soon we'll want to pass in a third field as well.  So let's
just give up and pass in the whole struct svc_rqst.

Also sneak in some minor style cleanups while we're at it.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:07 -07:00
J.Bruce Fields
27d630ece0 [PATCH] knfsd: nfsd4: simplify filehandle check
Kill another big "if" clause.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
eeac294ebd [PATCH] knfsd: nfsd4: simplify migration op check
I'm not too fond of these big if conditions.  Replace them by checks of a flag
in the operation descriptor.  To my eye this makes the code a bit more
self-documenting, and makes the complicated part of the code (proc_compound) a
little more compact.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
b591480bbe [PATCH] knfsd: nfsd4: reorganize compound ops
Define an op descriptor struct, use it to simplify nfsd4_proc_compound().

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
c954e2a5d1 [PATCH] knfsd: nfsd4: make verify and nverify wrappers
Make wrappers for verify and nverify, for consistency with other ops.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
7191155bd3 [PATCH] knfsd: nfsd4: don't inline nfsd4 compound op functions
The inlining contributes to bloating the stack of nfsd4_compound, and I want
to change the compound op functions to function pointers anyway.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
a4f1706a9b [PATCH] knfsd: nfsd4: move replay_owner to cstate
Tuck away the replay_owner in the cstate while we're at it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
d9e626f1e2 [PATCH] knfsd: nfsd4: remove spurious replay_owner check
OK, this is embarassing--I've even looked back at the history, and cannot for
the life of me figure out why I added this check.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
ca3643171b [PATCH] knfsd: nfsd4: pass saved and current fh together into nfsd4 operations
Pass the saved and current filehandles together into all the nfsd4 compound
operations.

I want a unified interface to these operations so we can just call them by
pointer and throw out the huge switch statement.

Also I'll eventually want a structure like this--that holds the state used
during compound processing--for deferral.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:54 -08:00
J.Bruce Fields
e571019911 [PATCH] knfsd: nfsd4: clarify units of COMPOUND_SLACK_SPACE
A comment here incorrectly states that "slack_space" is measured in words, not
bytes.  Remove the comment, and adjust a variable name and a few comments to
clarify the situation.

This is pure cleanup; there should be no change in functionality.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:53 -08:00
J. Bruce Fields
81ac95c556 [PATCH] nfsd4: fix open-create permissions
In the case where an open creates the file, we shouldn't be rechecking
permissions to open the file; the open succeeds regardless of what the new
file's mode bits say.

This patch fixes the problem, but only by introducing yet another parameter
to nfsd_create_v3.  This is ugly.  This will be fixed by later patches.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-08 18:29:23 -08:00
J. Bruce Fields
af85852de0 [PATCH] nfsd4: reindent do_open_lookup()
Minor rearrangement, cleanup of do_open_lookup().  No change in behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-11-08 18:29:22 -08:00
Al Viro
a90b061c0b [PATCH] nfsd: nfs_replay_me
We are using NFS_REPLAY_ME as a special error value that is never leaked to
clients.  That works fine; the only problem is mixing host- and network-
endian values in the same objects.  Network-endian equivalent would work just
as fine; switch to it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:43 -07:00
Al Viro
b37ad28bca [PATCH] nfsd: nfs4 code returns error values in net-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:42 -07:00
Al Viro
2ebbc012a9 [PATCH] xdr annotations: NFSv4 server
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:41 -07:00
Al Viro
7111c66e4e [PATCH] fix svc_procfunc declaration
svc_procfunc instances return __be32, not int

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
J. Bruce Fields
9801d8a39c [PATCH] knfsd: nfsd4: fix open permission checking
We weren't actually checking for SHARE_ACCESS_WRITE, with the result that the
owner could open a non-writeable file for write!

Continue to allow DENY_WRITE only with write access.

Thanks to Jim Rees for reporting the bug.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-17 08:18:46 -07:00
J. Bruce Fields
dc730e1737 [PATCH] knfsd: nfsd4: fix owner-override on open
If a client creates a file using an open which sets the mode to 000, or if a
chmod changes permissions after a file is opened, then situations may arise
where an NFS client knows that some IO is permitted (because a process holds
the file open), but the NFS server does not (because it doesn't know about the
open, and only sees that the IO conflicts with the current mode of the file).

As a hack to solve this problem, NFS servers normally allow the owner to
override permissions on IO.  The client can still enforce correct
permissions-checking on open by performing an explicit access check.

In NFSv4 the client can rely on the explicit on-the-wire open instead of an
access check.

Therefore we should not be allowing the owner to override permissions on an
over-the-wire open!

However, we should still allow the owner to override permissions in the case
where the client is claiming an open that it already made either before a
reboot, or while it was holding a delegation.

Thanks to Jim Rees for reporting the bug.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-17 08:18:45 -07:00
J.Bruce Fields
42ca099381 [PATCH] knfsd: nfsd4: actually use all the pieces to implement referrals
Use all the pieces set up so far to implement referral support, allowing
return of NFS4ERR_MOVED and fs_locations attribute.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:23 -07:00
NeilBrown
3cc03b164c [PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom
..  by allocating the array of 'kvec' in 'struct svc_rqst'.

As we plan to increase RPCSVC_MAXPAGES from 8 upto 256, we can no longer
allocate an array of this size on the stack.  So we allocate it in 'struct
svc_rqst'.

However svc_rqst contains (indirectly) an array of the same type and size
(actually several, but they are in a union).  So rather than waste space, we
move those arrays out of the separately allocated union and into svc_rqst to
share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at
different times, so there is no conflict).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-04 07:55:15 -07:00
Greg Banks
3e3b480096 [PATCH] knfsd: add some missing newlines in printks
Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-02 07:57:17 -07:00
Shankar Anand
e2b209509c [PATCH] knfsd: nfsd4: add per-operation server stats
Add an nfs4 operations count array to nfsd_stats structure.  The count is
incremented in nfsd4_proc_compound() where all the operations are handled
by the nfsv4 server.  This count of individual nfsv4 operations is also
entered into /proc filesystem.

Signed-off-by: Shankar Anand<shanand@novell.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10 13:24:27 -07:00
NeilBrown
f0e2993e9e [PATCH] knfsd: nfsd4: remove nfsd_setuser from putrootfh
Since nfsd_setuser() is already called from any operation that uses the
current filehandle (because it's called from fh_verify), there's no reason to
call it from putrootfh.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:52 -07:00
NeilBrown
7775f4c85d [PATCH] knfsd: Correct reserved reply space for read requests.
NFSd makes sure there is enough space to hold the maximum possible reply
before accepting a request.  The units for this maximum is (4byte) words.
However in three places, particularly for read request, the number given is
a number of bytes.

This means too much space is reserved which is slightly wasteful.

This is the sort of patch that could uncover a deeper bug, and it is not
critical, so it would be best for it to spend a while in -mm before going
in to mainline.

(akpm: target 2.6.17-rc2, 2.6.16.3 (approx))

Discovered-by: "Eivind  Sarto" <ivan@kasenna.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:51 -07:00
J. Bruce Fields
cbd0d51a33 [PATCH] knfsd: fix nfs4_open lock leak
I just noticed that my patch "don't create on open that fails due to
ERR_GRACE" (recently commited as fb553c0f17)
had an obvious problem that causes a deadlock on reboot recovery.  Sending
in this now since it seems like a clear 2.6.16 candidate.--b.

We're returning with a lock held in some error cases.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-02-07 16:12:31 -08:00
Fred Isaman
5274881992 [PATCH] nfsd4: clean up settattr code
Clean up some unnecessary special-casing in the setattr code..

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:27 -08:00
J. Bruce Fields
fb553c0f17 [PATCH] nfsd4: don't create on open that fails due to ERR_GRACE
In an earlier patch (commit b648330a1d) I noted
that a too-early grace-period check was preventing us from bumping the
sequence id on open.  Unfortunately in that patch I stupidly moved the
grace-period check back too far, so now an open for create can succesfully
create the file while still returning ERR_GRACE.

The correct place for that check is after we've set the open_owner and handled
any replays, but before we actually start mucking with the filesystem.

Thanks to Avishay Traeger for reporting the bug.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:26 -08:00
J. Bruce Fields
375c5547cb [PATCH] nfsd4: nfs4state.c miscellaneous goto removals
Remove some goto's that made the logic here a little more tortuous than
necessary.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:26 -08:00
J. Bruce Fields
a525825df1 [PATCH] nfsd4: handle replays of failed open reclaims
We need to make sure open reclaims are marked confirmed immediately so that we
can handle replays even if they fail (e.g.  with a seqid-incrementing error).
(See 8.1.8.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:26 -08:00
J. Bruce Fields
fd44527707 [PATCH] nfsd4: operation debugging
Simple, useful debugging printk: print the number of each op as we process it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:25 -08:00
Neil Brown
f2327d9adb [PATCH] nfsd4: move replay_owner
It seems more natural to move the setting of the replay_owner into the
relevant procedure instead of doing it in nfsv4_proc_compound.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-13 08:22:31 -07:00
NeilBrown
b648330a1d [PATCH] nfsd4: ERR_GRACE should bump seqid on open
The GRACE and NOGRACE errors should bump the sequence id on open.  So we delay
the handling of these errors until nfsd4_process_open2, at which point we've
set the open owner, so the encode routine will be able to bump the sequence
id.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-07 18:24:08 -07:00
NeilBrown
7e06b7f9e9 [PATCH] knfsd: nfs4: hold filp while reading or writing
We're trying to read and write from a struct file that we may not hold a
reference to any more (since a close could be processed as soon as we drop the
state lock).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:30 -07:00
NeilBrown
c815afc73e [PATCH] nfsd4: block metadata ops during grace period
We currently return err_grace if a user attempts a non-reclaim open during the
grace period.  But we also need to prevent renames and removes, at least, to
ensure clients have the chance to recover state on files before they are moved
or deleted.

Of course, local users could also do renames and removes during the lease
period, and there's not much we can do about that.  This at least will help
with remote users.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
NeilBrown
0dd3c19212 [PATCH] nfsd4: support CLAIM_DELEGATE_CUR
Add OPEN claim type NFS4_OPEN_CLAIM_DELEGATE_CUR to nfsd4_open().

A delegation stateid and a name are provided.  OPEN with O_CREAT is not legal
with this claim type; otherwise, use the NFS4_OPEN_CLAIM_NULL code path to
lookup the filename to be opened.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-24 00:06:28 -07:00
Linus Torvalds
1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00