Commit Graph

1181 Commits

Author SHA1 Message Date
Jeff Layton
b95239ca49 nfsd: make nfsd4_run_cb a bool return function
queue_work can return false and not queue anything, if the work is
already queued. If that happens in the case of a CB_RECALL, we'll have
taken an extra reference to the stid that will never be put. Ensure we
throw a warning in that case.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:50:57 -04:00
Jeff Layton
25fbe1fca1 nfsd: fix comments about spinlock handling with delegations
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:23:55 -04:00
Jeff Layton
4d01416ab4 nfsd: only fill out return pointer on success in nfsd4_lookup_stateid
In the case of a revoked delegation, we still fill out the pointer even
when returning an error, which is bad form. Only overwrite the pointer
on success.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:23:44 -04:00
Dai Ngo
019805fea9 NFSD: fix use-after-free on source server when doing inter-server copy
Use-after-free occurred when the laundromat tried to free expired
cpntf_state entry on the s2s_cp_stateids list after inter-server
copy completed. The sc_cp_list that the expired copy state was
inserted on was already freed.

When COPY completes, the Linux client normally sends LOCKU(lock_state x),
FREE_STATEID(lock_state x) and CLOSE(open_state y) to the source server.
The nfs4_put_stid call from nfsd4_free_stateid cleans up the copy state
from the s2s_cp_stateids list before freeing the lock state's stid.

However, sometimes the CLOSE was sent before the FREE_STATEID request.
When this happens, the nfsd4_close_open_stateid call from nfsd4_close
frees all lock states on its st_locks list without cleaning up the copy
state on the sc_cp_list list. When the time the FREE_STATEID arrives the
server returns BAD_STATEID since the lock state was freed. This causes
the use-after-free error to occur when the laundromat tries to free
the expired cpntf_state.

This patch adds a call to nfs4_free_cpntf_statelist in
nfsd4_close_open_stateid to clean up the copy state before calling
free_ol_stateid_reaplist to free the lock state's stid on the reaplist.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:21:46 -04:00
Chuck Lever
781fde1a2b NFSD: Rename the fields in copy_stateid_t
Code maintenance: The name of the copy_stateid_t::sc_count field
collides with the sc_count field in struct nfs4_stid, making the
latter difficult to grep for when auditing stateid reference
counting.

No behavior change expected.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:50 -04:00
ChenXiaoSong
1d7f6b302b nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops
Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.

inode is converted from seq_file->file instead of seq_file->private in
client_info_show().

Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:49 -04:00
Dai Ngo
7746b32f46 NFSD: add shrinker to reap courtesy clients on low memory condition
Add courtesy_client_reaper to react to low memory condition triggered
by the system memory shrinker.

The delayed_work for the courtesy_client_reaper is scheduled on
the shrinker's count callback using the laundry_wq.

The shrinker's scan callback is not used for expiring the courtesy
clients due to potential deadlocks.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:41 -04:00
Dai Ngo
3a4ea23d86 NFSD: keep track of the number of courtesy clients in the system
Add counter nfs4_courtesy_client_count to nfsd_net to keep track
of the number of courtesy clients in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:41 -04:00
Chuck Lever
c035362eb9 NFSD: Add a mechanism to wait for a DELEGRETURN
Subsequent patches will use this mechanism to wake up an operation
that is waiting for a client to return a delegation.

The new tracepoint records whether the wait timed out or was
properly awoken by the expected DELEGRETURN:

            nfsd-1155  [002] 83799.493199: nfsd_delegret_wakeup: xid=0x14b7d6ef fh_hash=0xf6826792 (timed out)

Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-09-26 14:02:32 -04:00
Chuck Lever
1035d65446 NFSD: Add tracepoints to report NFSv4 callback completions
Wireshark has always been lousy about dissecting NFSv4 callbacks,
especially NFSv4.0 backchannel requests. Add tracepoints so we
can surgically capture these events in the trace log.

Tracepoints are time-stamped and ordered so that we can now observe
the timing relationship between a CB_RECALL Reply and the client's
DELEGRETURN Call. Example:

            nfsd-1153  [002]   211.986391: nfsd_cb_recall:       addr=192.168.1.67:45767 client 62ea82e4:fee7492a stateid 00000003:00000001

            nfsd-1153  [002]   212.095634: nfsd_compound:        xid=0x0000002c opcnt=2
            nfsd-1153  [002]   212.095647: nfsd_compound_status: op=1/2 OP_PUTFH status=0
            nfsd-1153  [002]   212.095658: nfsd_file_put:        hash=0xf72 inode=0xffff9291148c7410 ref=3 flags=HASHED|REFERENCED may=READ file=0xffff929103b3ea00
            nfsd-1153  [002]   212.095661: nfsd_compound_status: op=2/2 OP_DELEGRETURN status=0
   kworker/u25:8-148   [002]   212.096713: nfsd_cb_recall_done:  client 62ea82e4:fee7492a stateid 00000003:00000001 status=0

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-09-26 14:02:32 -04:00
NeilBrown
bb4d53d66e NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
When locking a file to access ACLs and xattrs etc, use explicit locking
with inode_lock() instead of fh_lock().  This means that the calls to
fh_fill_pre/post_attr() are also explicit which improves readability and
allows us to place them only where they are needed.  Only the xattr
calls need pre/post information.

When locking a file we don't need I_MUTEX_PARENT as the file is not a
parent of anything, so we can use inode_lock() directly rather than the
inode_lock_nested() call that fh_lock() uses.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-08-04 10:28:41 -04:00
NeilBrown
19d008b469 NFSD: reduce locking in nfsd_lookup()
nfsd_lookup() takes an exclusive lock on the parent inode, but no
callers want the lock and it may not be needed at all if the
result is in the dcache.

Change nfsd_lookup_dentry() to not take the lock, and call
lookup_one_len_locked() which takes lock only if needed.

nfsd4_open() currently expects the lock to still be held, but that isn't
necessary as nfsd_validate_delegated_dentry() provides required
guarantees without the lock.

NOTE: NFSv4 requires directory changeinfo for OPEN even when a create
  wasn't requested and no change happened.  Now that nfsd_lookup()
  doesn't use fh_lock(), we need to explicitly fill the attributes
  when no create happens.  A new fh_fill_both_attrs() is provided
  for that task.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-08-04 10:28:20 -04:00
NeilBrown
7fe2a71dda NFSD: introduce struct nfsd_attrs
The attributes that nfsd might want to set on a file include 'struct
iattr' as well as an ACL and security label.
The latter two are passed around quite separately from the first, in
part because they are only needed for NFSv4.  This leads to some
clumsiness in the code, such as the attributes NOT being set in
nfsd_create_setattr().

We need to keep the directory locked until all attributes are set to
ensure the file is never visibile without all its attributes.  This need
combined with the inconsistent handling of attributes leads to more
clumsiness.

As a first step towards tidying this up, introduce 'struct nfsd_attrs'.
This is passed (by reference) to vfs.c functions that work with
attributes, and is assembled by the various nfs*proc functions which
call them.  As yet only iattr is included, but future patches will
expand this.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:17:00 -04:00
Jeff Layton
876c553cb4 NFSD: verify the opened dentry after setting a delegation
Between opening a file and setting a delegation on it, someone could
rename or unlink the dentry. If this happens, we do not want to grant a
delegation on the open.

On a CLAIM_NULL open, we're opening by filename, and we may (in the
non-create case) or may not (in the create case) be holding i_rwsem
when attempting to set a delegation.  The latter case allows a
race.

After getting a lease, redo the lookup of the file being opened and
validate that the resulting dentry matches the one in the open file
description.

To properly redo the lookup we need an rqst pointer to pass to
nfsd_lookup_dentry(), so make sure that is available.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:17:00 -04:00
Jeff Layton
bbf936edd5 NFSD: drop fh argument from alloc_init_deleg
Currently, we pass the fh of the opened file down through several
functions so that alloc_init_deleg can pass it to delegation_blocked.
The filehandle of the open file is available in the nfs4_file however,
so there's no need to pass it in a separate argument.

Drop the argument from alloc_init_deleg, nfs4_open_delegation and
nfs4_set_delegation.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:17:00 -04:00
Dai Ngo
4271c2c088 NFSD: limit the number of v4 clients to 1024 per 1GB of system memory
Currently there is no limit on how many v4 clients are supported
by the system. This can be a problem in systems with small memory
configuration to function properly when a very large number of
clients exist that creates memory shortage conditions.

This patch enforces a limit of 1024 NFSv4 clients, including courtesy
clients, per 1GB of system memory.  When the number of the clients
reaches the limit, requests that create new clients are returned
with NFS4ERR_DELAY and the laundromat is kicked start to trim old
clients. Due to the overhead of the upcall to remove the client
record, the maximun number of clients the laundromat removes on
each run is limited to 128. This is done to ensure the laundromat
can still process the other tasks in a timely manner.

Since there is now a limit of the number of clients, the 24-hr
idle time limit of courtesy client is no longer needed and was
removed.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Dai Ngo
0926c39515 NFSD: keep track of the number of v4 clients in the system
Add counter nfs4_client_count to keep track of the total number
of v4 clients, including courtesy clients, in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Dai Ngo
6867137ebc NFSD: refactoring v4 specific code to a helper in nfs4state.c
This patch moves the v4 specific code from nfsd_init_net() to
nfsd4_init_leases_net() helper in nfs4state.c

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Chuck Lever
427f5f83a3 NFSD: Ensure nf_inode is never dereferenced
The documenting comment for struct nf_file states:

/*
 * A representation of a file that has been opened by knfsd. These are hashed
 * in the hashtable by inode pointer value. Note that this object doesn't
 * hold a reference to the inode by itself, so the nf_inode pointer should
 * never be dereferenced, only used for comparison.
 */

Replace the two existing dereferences to make the comment always
true.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Chuck Lever
5e138c4a75 NFSD: NFSv4 CLOSE should release an nfsd_file immediately
The last close of a file should enable other accessors to open and
use that file immediately. Leaving the file open in the filecache
prevents other users from accessing that file until the filecache
garbage-collects the file -- sometimes that takes several seconds.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:42 -04:00
Chuck Lever
be0230069f NFSD: Separate tracepoints for acquire and create
These tracepoints collect different information: the create case does
not open a file, so there's no nf_file available.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:15:54 -04:00
Chuck Lever
043862b09c NFSD: Add documenting comment for nfsd4_release_lockowner()
And return explicit nfserr values that match what is documented in the
new comment / API contract.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-26 10:50:50 -04:00
Chuck Lever
bd8fdb6e54 NFSD: Modernize nfsd4_release_lockowner()
Refactor: Use existing helpers that other lock operations use. This
change removes several automatic variables, so re-organize the
variable declarations for readability.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-26 10:50:50 -04:00
Chuck Lever
ce3c4ad7f4 NFSD: Fix possible sleep during nfsd4_release_lockowner()
nfsd4_release_lockowner() holds clp->cl_lock when it calls
check_for_locks(). However, check_for_locks() calls nfsd_file_get()
/ nfsd_file_put() to access the backing inode's flc_posix list, and
nfsd_file_put() can sleep if the inode was recently removed.

Let's instead rely on the stateowner's reference count to gate
whether the release is permitted. This should be a reliable
indication of locks-in-use since file lock operations and
->lm_get_owner take appropriate references, which are released
appropriately when file locks are removed.

Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org
2022-05-26 10:50:49 -04:00
Chuck Lever
7e2ce0cc15 NFSD: Move documenting comment for nfsd4_process_open2()
Clean up nfsd4_open() by converting a large comment at the only
call site for nfsd4_process_open2() to a kerneldoc comment in
front of that function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-23 11:06:29 -04:00
Chuck Lever
fb70bf124b NFSD: Instantiate a struct file when creating a regular NFSv4 file
There have been reports of races that cause NFSv4 OPEN(CREATE) to
return an error even though the requested file was created. NFSv4
does not provide a status code for this case.

To mitigate some of these problems, reorganize the NFSv4
OPEN(CREATE) logic to allocate resources before the file is actually
created, and open the new file while the parent directory is still
locked.

Two new APIs are added:

+ Add an API that works like nfsd_file_acquire() but does not open
the underlying file. The OPEN(CREATE) path can use this API when it
already has an open file.

+ Add an API that is kin to dentry_open(). NFSD needs to create a
file and grab an open "struct file *" atomically. The
alloc_empty_file() has to be done before the inode create. If it
fails (for example, because the NFS server has exceeded its
max_files limit), we avoid creating the file and can still return
an error to the NFS client.

BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: JianHong Yin <jiyin@redhat.com>
2022-05-23 11:06:29 -04:00
Dai Ngo
e9488d5ae1 NFSD: Show state of courtesy client in client info
Update client_info_show to show state of courtesy client
and seconds since last renew.

Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 12:25:40 -04:00
Dai Ngo
27431affb0 NFSD: add support for lock conflict to courteous server
This patch allows expired client with lock state to be in COURTESY
state. Lock conflict with COURTESY client is resolved by the fs/lock
code using the lm_lock_expirable and lm_expire_lock callback in the
struct lock_manager_operations.

If conflict client is in COURTESY state, set it to EXPIRABLE and
schedule the laundromat to run immediately to expire the client. The
callback lm_expire_lock waits for the laundromat to flush its work
queue before returning to caller.

Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 12:25:39 -04:00
Dai Ngo
d76cc46b37 NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd
This patch moves create/destroy of laundry_wq from nfs4_state_start
and nfs4_state_shutdown_net to init_nfsd and exit_nfsd to prevent
the laundromat from being freed while a thread is processing a
conflicting lock.

Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 12:25:39 -04:00
Dai Ngo
3d69427151 NFSD: add support for share reservation conflict to courteous server
This patch allows expired client with open state to be in COURTESY
state. Share/access conflict with COURTESY client is resolved by
setting COURTESY client to EXPIRABLE state, schedule laundromat
to run and returning nfserr_jukebox to the request client.

Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 12:25:39 -04:00
Dai Ngo
66af257999 NFSD: add courteous server support for thread with only delegation
This patch provides courteous server support for delegation only.
Only expired client with delegation but no conflict and no open
or lock state is allowed to be in COURTESY state.

Delegation conflict with COURTESY/EXPIRABLE client is resolved by
setting it to EXPIRABLE, queue work for the laundromat and return
delay to the caller. Conflict is resolved when the laudromat runs
and expires the EXIRABLE client while the NFS client retries the
OPEN request. Local thread request that gets conflict is doing the
retry in _break_lease.

Client in COURTESY or EXPIRABLE state is allowed to reconnect and
continues to have access to its state. Access to the nfs4_client by
the reconnecting thread and the laundromat is serialized via the
client_lock.

Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-19 12:25:39 -04:00
Chuck Lever
50719bf344 NFSD: Fix nfsd_breaker_owns_lease() return values
These have been incorrect since the function was introduced.

A proper kerneldoc comment is added since this function, though
static, is part of an external interface.

Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-03-11 10:25:16 -05:00
Chuck Lever
35aff0678f NFSD: Clean up _lm_ operation names
The common practice is to name function instances the same as the
method names, but with a uniquifying prefix. Commit aef9583b23
("NFSD: Get reference of lockowner when coping file_lock") missed
this -- the new function names should both have been of the form
"nfsd4_lm_*".

Before more lock manager operations are added in NFSD, rename these
two functions for consistency.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-03-11 10:25:16 -05:00
Dai Ngo
ab451ea952 nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client.
From RFC 7530 Section 16.34.5:

o  The server has not recorded an unconfirmed { v, x, c, *, * } and
   has recorded a confirmed { v, x, c, *, s }.  If the principals of
   the record and of SETCLIENTID_CONFIRM do not match, the server
   returns NFS4ERR_CLID_INUSE without removing any relevant leased
   client state, and without changing recorded callback and
   callback_ident values for client { x }.

The current code intends to do what the spec describes above but
it forgot to set 'old' to NULL resulting to the confirmed client
to be expired.

Fixes: 2b63482185 ("nfsd: fix clid_inuse on mount with security change")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Bruce Fields <bfields@fieldses.org>
2022-01-28 09:04:00 -05:00
J. Bruce Fields
074b07d94e nfsd: fix crash on COPY_NOTIFY with special stateid
RTM says "If the special ONE stateid is passed to
nfs4_preprocess_stateid_op(), it returns status=0 but does not set
*cstid. nfsd4_copy_notify() depends on stid being set if status=0, and
thus can crash if the client sends the right COPY_NOTIFY RPC."

RFC 7862 says "The cna_src_stateid MUST refer to either open or locking
states provided earlier by the server.  If it is invalid, then the
operation MUST fail."

The RFC doesn't specify an error, and the choice doesn't matter much as
this is clearly illegal client behavior, but bad_stateid seems
reasonable.

Simplest is just to guarantee that nfs4_preprocess_stateid_op, called
with non-NULL cstid, errors out if it can't return a stateid.

Reported-by: rtm@csail.mit.edu
Fixes: 624322f1ad ("NFSD add COPY_NOTIFY operation")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
2022-01-08 14:42:03 -05:00
Vasily Averin
47446d74f1 nfsd4: add refcount for nfsd4_blocked_lock
nbl allocated in nfsd4_lock can be released by a several ways:
directly in nfsd4_lock(), via nfs4_laundromat(), via another nfs
command RELEASE_LOCKOWNER or via nfsd4_callback.
This structure should be refcounted to be used and released correctly
in all these cases.

Refcount is initialized to 1 during allocation and is incremented
when nbl is added into nbl_list/nbl_lru lists.

Usually nbl is linked into both lists together, so only one refcount
is used for both lists.

However nfsd4_lock() should keep in mind that nbl can be present
in one of lists only. This can happen if nbl was handled already
by nfs4_laundromat/nfsd4_callback/etc.

Refcount is decremented if vfs_lock_file() returns FILE_LOCK_DEFERRED,
because nbl can be handled already by nfs4_laundromat/nfsd4_callback/etc.

Refcount is not changed in find_blocked_lock() because of it reuses counter
released after removing nbl from lists.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-08 14:42:01 -05:00
J. Bruce Fields
40595cdc93 nfs: block notification on fs with its own ->lock
NFSv4.1 supports an optional lock notification feature which notifies
the client when a lock comes available.  (Normally NFSv4 clients just
poll for locks if necessary.)  To make that work, we need to request a
blocking lock from the filesystem.

We turned that off for NFS in commit f657f8eef3 ("nfs: don't atempt
blocking locks on nfs reexports") [sic] because it actually blocks the
nfsd thread while waiting for the lock.

Thanks to Vasily Averin for pointing out that NFS isn't the only
filesystem with that problem.

Any filesystem that leaves ->lock NULL will use posix_lock_file(), which
does the right thing.  Simplest is just to assume that any filesystem
that defines its own ->lock is not safe to request a blocking lock from.

So, this patch mostly reverts commit f657f8eef3 ("nfs: don't atempt
blocking locks on nfs reexports") [sic] and commit b840be2f00 ("lockd:
don't attempt blocking locks on nfs reexports"), and instead uses a
check of ->lock (Vasily's suggestion) to decide whether to support
blocking lock notifications on a given filesystem.  Also add a little
documentation.

Perhaps someday we could add back an export flag later to allow
filesystems with "good" ->lock methods to support blocking lock
notifications.

Reported-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
[ cel: Description rewritten to address checkpatch nits ]
[ cel: Fixed warning when SUNRPC debugging is disabled ]
[ cel: Fixed NULL check ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Vasily Averin <vvs@virtuozzo.com>
2022-01-08 14:42:01 -05:00
J. Bruce Fields
3dcd1d8aab nfsd: improve stateid access bitmask documentation
The use of the bitmaps is confusing.  Add a cross-reference to make it
easier to find the existing comment.  Add an updated reference with URL
to make it quicker to look up.  And a bit more editorializing about the
value of this.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-01-08 14:42:01 -05:00
J. Bruce Fields
548ec0805c nfsd: fix use-after-free due to delegation race
A delegation break could arrive as soon as we've called vfs_setlease.  A
delegation break runs a callback which immediately (in
nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru.  If we
then exit nfs4_set_delegation without hashing the delegation, it will be
freed as soon as the callback is done with it, without ever being
removed from del_recall_lru.

Symptoms show up later as use-after-free or list corruption warnings,
usually in the laundromat thread.

I suspect aba2072f45 "nfsd: grant read delegations to clients holding
writes" made this bug easier to hit, but I looked as far back as v3.0
and it looks to me it already had the same problem.  So I'm not sure
where the bug was introduced; it may have been there from the beginning.

Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-12-10 11:55:15 -05:00
Linus Torvalds
38764c7340 A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping
support for a filehandle format deprecated 20 years ago, and further
 xdr-related cleanup from Chuck.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAmGMPYkVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+JVwQAKbrpgbzl91u+T6W9MUGgQVzDpeP
 XIy3NxCu/4pZ8SToWF3trz71sskokmkPPaZyuISD2C8e4DxO5LQ3fJLhtS9CjRFB
 x4iZUxH7V2BoWrb5SY6TDWBEqaq4MY9f7tIbvUu5xpa0FIupLqJjYh2CP8vqtsbm
 lblQKXz4ao0jwDzSVimNnPcTccpB25VIzwHsSOszRhN4rTjMgyHoETx2cqJne5IU
 Tx/hH0UlpnwuQ7aVpcjMoKqIyUWDTMejx51pyZhHB47DVKL7HsnZvg59mTpXFcBx
 29edvWT9yy1+w3nGkTYSkOgO9DyHvCbmQzIsvoYlmbZ2sdmTKK8Wuv2Ehcw3OfvL
 MXGmy2EXIhzvTZXyN6pL1bBwwNSxdqJhVSxvrPLz1EymIkxf/IDI8eyUicVXd3Vq
 K2xOn+CXyIbXWCU85ru8UA77r1+x//gSwqcJvtKUavbNJUwNt935CE2n3+o/0OL/
 pToZ89nhcaRyDP1jJKA37K48VLNtBXzZZQlRovyLelNojam/kzZkXX8dI6oV9VD1
 Ymjm0mbdZzwhE3C1HxKlxwZqhN+7YoyxMQuWjFMp28wxH+dkz/USCulKZ3/H+neD
 0YBSgvwe92JqkZTW2AOjipL+beAuKJ4zsfCCl2XZig/rHGutiwOf2GfgdRmJM6AD
 6aiufVWKNNRQef9y
 =yKBl
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.16' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping
  support for a filehandle format deprecated 20 years ago, and further
  xdr-related cleanup from Chuck"

* tag 'nfsd-5.16' of git://linux-nfs.org/~bfields/linux: (26 commits)
  nfsd4: remove obselete comment
  nfsd: document server-to-server-copy parameters
  NFSD:fix boolreturn.cocci warning
  nfsd: update create verifier comment
  SUNRPC: Change return value type of .pc_encode
  SUNRPC: Replace the "__be32 *p" parameter to .pc_encode
  NFSD: Save location of NFSv4 COMPOUND status
  SUNRPC: Change return value type of .pc_decode
  SUNRPC: Replace the "__be32 *p" parameter to .pc_decode
  SUNRPC: De-duplicate .pc_release() call sites
  SUNRPC: Simplify the SVC dispatch code path
  SUNRPC: Capture value of xdr_buf::page_base
  SUNRPC: Add trace event when alloc_pages_bulk() makes no progress
  svcrdma: Split svcrmda_wc_{read,write} tracepoints
  svcrdma: Split the svcrdma_wc_send() tracepoint
  svcrdma: Split the svcrdma_wc_receive() tracepoint
  NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()
  SUNRPC: xdr_stream_subsegment() must handle non-zero page_bases
  NFSD: Initialize pointer ni with NULL and not plain integer 0
  NFSD: simplify struct nfsfh
  ...
2021-11-10 16:45:54 -08:00
Colin Ian King
8e70bf27fd NFSD: Initialize pointer ni with NULL and not plain integer 0
Pointer ni is being initialized with plain integer zero. Fix
this by initializing with NULL.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-02 15:51:10 -04:00
NeilBrown
d8b26071e6 NFSD: simplify struct nfsfh
Most of the fields in 'struct knfsd_fh' are 2 levels deep (a union and a
struct) and are accessed using macros like:

 #define fh_FOO fh_base.fh_new.fb_FOO

This patch makes the union and struct anonymous, so that "fh_FOO" can be
a name directly within 'struct knfsd_fh' and the #defines aren't needed.

The file handle as a whole is sometimes accessed as "fh_base" or
"fh_base.fh_pad", neither of which are particularly helpful names.
As the struct holding the filehandle is now anonymous, we
cannot use the name of that, so we union it with 'fh_raw' and use that
where the raw filehandle is needed.  fh_raw also ensure the structure is
large enough for the largest possible filehandle.

fh_raw is a 'char' array, removing any need to cast it for memcpy etc.

SVCFH_fmt() is simplified using the "%ph" printk format.  This
changes the appearance of filehandles in dprintk() debugging, making
them a little more precise.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2021-10-02 15:51:10 -04:00
Linus Torvalds
cf1d2c3e7e Critical bug fixes:
- Fix crash in NLM TEST procedure
 - NFSv4.1+ backchannel not restored after PATH_DOWN
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmFLUKcACgkQM2qzM29m
 f5fCAhAAp66o6n49/fxOLWo+MftFlT1EY8NtFjyTh1x/o4R9S74qxTy3RC3GzRvk
 oGOnkFvuiToyjcoeyb9yumYxO00Qf75hrTJvsXqnsbrLZOAKVuITn9MkQXBOXjCi
 GDxQSRFg8ihz0vG4YbE/brnZR1fIMr7KSzXLwdXOs8mKvro7JmiiB87JOGhw9yon
 W9+bFcnN2TynYsqmtHu987LvaIUE79dFfhrfj6bIobNQ25oqJoG5e1/M48/1MJol
 DFPiWoErJ/S1c0lA8rbjIvtzgbXs84U88EXmFUVsxSXhepGui3Uh/cA49vu46icH
 vze8fwHs6q3qzF7gE6jbslrrdQ/H6AZ6arhe27h4cVxdh0AouDuBat2xLY2I4TP3
 DckfLbEsOqTJhfzqYnk+8ckOaBMpkfyDqG6SodIKglPoknNCtCp0/7NuYF0yMLe5
 I6pO7JDgz7ySrbpm27ZMOpdwkLqqA1i8V9MPvimUsKTYJqlVBsc2RsdldQhunNbd
 50InJarWQ+japkEl3WK3aJ5rTluiIWjcePT7wA76wP3PnZmcjweOiQMc8uuLlzPw
 tOLRlHdpdZzeM3hGuI6KKsg8ZRbDB7L8YiaLkSwxl2qwJwDSB0xo7/WWwVzkyfdf
 zdQ2cR9z70I2Bgxq/1lAPB8tXq+SEvu1qCYDFSTo3I9c0Y2frs4=
 =L0c1
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:
 "Critical bug fixes:

   - Fix crash in NLM TEST procedure

   - NFSv4.1+ backchannel not restored after PATH_DOWN"

* tag 'nfsd-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN
  NLM: Fix svcxdr_encode_owner()
2021-09-22 09:21:02 -07:00
Dai Ngo
02579b2ff8 nfsd: back channel stuck in SEQ4_STATUS_CB_PATH_DOWN
When the back channel enters SEQ4_STATUS_CB_PATH_DOWN state, the client
recovers by sending BIND_CONN_TO_SESSION but the server fails to recover
the back channel and leaves it as NFSD4_CB_DOWN.

Fix by enhancing nfsd4_bind_conn_to_session to probe the back channel
by calling nfsd4_probe_callback.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-09-17 10:35:12 -04:00
Linus Torvalds
8bda955776 New features:
- Support for server-side disconnect injection via debugfs
 - Protocol definitions for new RPC_AUTH_TLS authentication flavor
 
 Performance improvements:
 - Reduce page allocator traffic in the NFSD splice read actor
 - Reduce CPU utilization in svcrdma's Send completion handler
 
 Notable bug fixes:
 - Stabilize lockd operation when re-exporting NFS mounts
 - Fix the use of %.*s in NFSD tracepoints
 - Fix /proc/sys/fs/nfs/nsm_use_hostnames
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmEqq0AACgkQM2qzM29m
 f5dYig/5AaPN2BWYf4D1VkrAS3+zGS+3IN23WVgpbA54jgfjPEH+Aa00YhEQQa0j
 Y5u/jE5g/tWvenDefq5BmvdRfZMWCVc2JkngctOSflhaREUWK+HgCkH+5DQs6zUM
 rbX7qy0v6wJnEMSlwCKJ2AuZbYw7Bsg2nvOgEbb718/ent3umeoXEK09x3HTWLEp
 eVcMU5uicB5wRRPpROYG792oWzUScQ8kyiRCKJfQDoR7bINhBeVHObAIFMBo1UaH
 x9CMX4RlPYGmoMYUc+AqcOM7hizucHpXqM1r3oVjQ7FyI+pmDLuLL/3OTjtRUX7+
 nYLqNW/PijH9PjFe4BPjGHAUQfKiTIXANAe8VdjQj70D40jYkP+jQ9SPdV+pEgi4
 U4azfK3S+85/bRYYq/1alcLiP1+6dgcL++rVvnKESTH9NRgNoEw2WZHeKxXiYaxU
 p7oOC4XdnYDwcz/3QVWa0sK2kA5IJHzOsCQR7OilD09NAJ+AbJTAp0H3xFXTllzb
 AV2CAEBVZlP+pZYOehuVnKpZPa7YAWx92wRK2anbRUMZN3lF1wWBEOTd6KweIpTx
 l2GJSf3GWBqL1x9PjSet/cBusxYjTA+S1hE7KMrsNPhzbvpIgAZEtSqOfn9apDCV
 uAFIN2DSiHm3Tv0aFSJWo+CMyKkyktuiS8JFKaFdzCp9NtsBM2M=
 =TGkK
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "New features:

   - Support for server-side disconnect injection via debugfs

   - Protocol definitions for new RPC_AUTH_TLS authentication flavor

  Performance improvements:

   - Reduce page allocator traffic in the NFSD splice read actor

   - Reduce CPU utilization in svcrdma's Send completion handler

  Notable bug fixes:

   - Stabilize lockd operation when re-exporting NFS mounts

   - Fix the use of %.*s in NFSD tracepoints

   - Fix /proc/sys/fs/nfs/nsm_use_hostnames"

* tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (31 commits)
  nfsd: fix crash on LOCKT on reexported NFSv3
  nfs: don't allow reexport reclaims
  lockd: don't attempt blocking locks on nfs reexports
  nfs: don't atempt blocking locks on nfs reexports
  Keep read and write fds with each nlm_file
  lockd: update nlm_lookup_file reexport comment
  nlm: minor refactoring
  nlm: minor nlm_lookup_file argument change
  lockd: lockd server-side shouldn't set fl_ops
  SUNRPC: Add documentation for the fail_sunrpc/ directory
  SUNRPC: Server-side disconnect injection
  SUNRPC: Move client-side disconnect injection
  SUNRPC: Add a /sys/kernel/debug/fail_sunrpc/ directory
  svcrdma: xpt_bc_xprt is already clear in __svc_rdma_free()
  nfsd4: Fix forced-expiry locking
  rpc: fix gss_svc_init cleanup on failure
  SUNRPC: Add RPC_AUTH_TLS protocol numbers
  lockd: change the proc_handler for nsm_use_hostnames
  sysctl: introduce new proc handler proc_dobool
  SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency()
  ...
2021-08-31 10:57:06 -07:00
J. Bruce Fields
0bcc7ca40b nfsd: fix crash on LOCKT on reexported NFSv3
Unlike other filesystems, NFSv3 tries to use fl_file in the GETLK case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26 15:32:29 -04:00
J. Bruce Fields
bb0a55bb71 nfs: don't allow reexport reclaims
In the reexport case, nfsd is currently passing along locks with the
reclaim bit set.  The client sends a new lock request, which is granted
if there's currently no conflict--even if it's possible a conflicting
lock could have been briefly held in the interim.

We don't currently have any way to safely grant reclaim, so for now
let's just deny them all.

I'm doing this by passing the reclaim bit to nfs and letting it fail the
call, with the idea that eventually the client might be able to do
something more forgiving here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26 15:32:28 -04:00
J. Bruce Fields
f657f8eef3 nfs: don't atempt blocking locks on nfs reexports
NFS implements blocking locks by blocking inside its lock method.  In
the reexport case, this blocks the nfs server thread, which could lead
to deadlocks since an nfs server thread might be required to unlock the
conflicting lock.  It also causes a crash, since the nfs server thread
assumes it can free the lock when its lm_notify lock callback is called.

Ideal would be to make the nfs lock method return without blocking in
this case, but for now it works just not to attempt blocking locks.  The
difference is just that the original client will have to poll (as it
does in the v4.0 case) instead of getting a callback when the lock's
available.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-26 15:32:10 -04:00
Jeff Layton
f7e33bdbd6 fs: remove mandatory file locking support
We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it
off in Fedora and RHEL8. Several other distros have followed suit.

I've heard of one problem in all that time: Someone migrated from an
older distro that supported "-o mand" to one that didn't, and the host
had a fstab entry with "mand" in it which broke on reboot. They didn't
actually _use_ mandatory locking so they just removed the mount option
and moved on.

This patch rips out mandatory locking support wholesale from the kernel,
along with the Kconfig option and the Documentation file. It also
changes the mount code to ignore the "mand" mount option instead of
erroring out, and to throw a big, ugly warning.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-08-23 06:15:36 -04:00
J. Bruce Fields
f7104cc1a9 nfsd4: Fix forced-expiry locking
This should use the network-namespace-wide client_lock, not the
per-client cl_lock.

You shouldn't see any bugs unless you're actually using the
forced-expiry interface introduced by 89c905becc.

Fixes: 89c905becc "nfsd: allow forced expiration of NFSv4 clients"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-08-17 11:47:54 -04:00