Commit Graph

1206 Commits

Author SHA1 Message Date
Linus Torvalds
c1649ec557 nfsd-6.2 fixes:
- Fix recently introduced use-after-free bugs

Merge tag 'nfsd-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix recently introduced use-after-free bugs

* tag 'nfsd-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: replace delayed_work with work_struct for nfsd_client_shrinker
  NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time
  NFSD: fix use-after-free in nfsd4_ssc_setup_dul()
2023-01-17 09:29:17 -08:00
Dai Ngo
7c24fa2250 NFSD: replace delayed_work with work_struct for nfsd_client_shrinker
Since nfsd4_state_shrinker_count always calls mod_delayed_work with
0 delay, we can replace delayed_work with work_struct to save some
space and overhead.

Also add a call to cancel_work after unregistering the shrinker
in nfs4_state_shutdown_net.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-01-12 09:38:30 -05:00
Dai Ngo
f385f7d244 NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time
Currently the nfsd-client shrinker is registered and unregistered at
the time the nfsd module is loaded and unloaded. The problem with this
is the shrinker is being registered before all of the relevant fields
in nfsd_net are initialized when nfsd is started. This can lead to an
oops when memory is low and the shrinker is called while nfsd is not
running.

This patch moves the register/unregister of the nfsd-client shrinker from
module load/unload time to nfsd startup/shutdown time.

Fixes: 44df6f439a ("NFSD: add delegation reaper to react to low memory condition")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-01-11 21:32:05 -05:00
Linus Torvalds
7dd4b804e0 nfsd-6.2 fixes:
- Fix a race when creating NFSv4 files
 - Revert the use of relaxed bitops

Merge tag 'nfsd-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix a race when creating NFSv4 files

 - Revert the use of relaxed bitops

* tag 'nfsd-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: Use set_bit(RQ_DROPME)
  Revert "SUNRPC: Use RMW bitops in single-threaded hot paths"
  nfsd: fix handling of cached open files in nfsd4_open codepath
2023-01-10 15:03:06 -06:00
Jeff Layton
0b3a551fa5 nfsd: fix handling of cached open files in nfsd4_open codepath
Commit fb70bf124b ("NFSD: Instantiate a struct file when creating a
regular NFSv4 file") added the ability to cache an open fd over a
compound. There are a couple of problems with the way this currently
works:

It's racy, as a newly-created nfsd_file can end up with its PENDING bit
cleared while the nf is hashed, and the nf_file pointer is still zeroed
out. Other tasks can find it in this state and they expect to see a
valid nf_file, and can oops if nf_file is NULL.

Also, there is no guarantee that we'll end up creating a new nfsd_file
if one is already in the hash. If an extant entry is in the hash with a
valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with
the value of op_file and the old nf_file will leak.

Fix both issues by making a new nfsd_file_acquire_opened variant that
takes an optional file pointer. If one is present when this is called,
we'll take a new reference to it instead of trying to open the file. If
the nfsd_file already has a valid nf_file, we'll just ignore the
optional file and pass the nfsd_file back as-is.

Also rework the tracepoints a bit to allow for an "opened" variant and
don't try to avoid counting acquisitions in the case where we already
have a cached open file.

Fixes: fb70bf124b ("NFSD: Instantiate a struct file when creating a regular NFSv4 file")
Cc: Trond Myklebust <trondmy@hammerspace.com>
Reported-by: Stanislav Saner <ssaner@redhat.com>
Reported-and-Tested-by: Ruben Vestergaard <rubenv@drcmr.dk>
Reported-and-Tested-by: Torkil Svensgaard <torkil@drcmr.dk>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-01-06 13:17:06 -05:00
Linus Torvalds
764822972d NFSD 6.2 Release Notes
This release introduces support for the CB_RECALL_ANY operation.
 NFSD can send this operation to request that clients return any
 delegations they choose. The server uses this operation to handle
 low memory scenarios or indicate to a client when that client has
 reached the maximum number of delegations the server supports.
 
 The NFSv4.2 READ_PLUS operation has been simplified temporarily
 whilst support for sparse files in local filesystems and the VFS is
 improved.
 
 Two major data structure fixes appear in this release:
 
 * The nfs4_file hash table is replaced with a resizable hash table
   to reduce the latency of NFSv4 OPEN operations.
 
 * Reference counting in the NFSD filecache has been hardened against
   races.
 
 In furtherance of removing support for NFSv2 in a subsequent kernel
 release, a new Kconfig option enables server-side support for NFSv2
 to be left out of a kernel build.
 
 MAINTAINERS has been updated to indicate that changes to fs/exportfs
 should go through the NFSD tree.

Merge tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "This release introduces support for the CB_RECALL_ANY operation. NFSD
  can send this operation to request that clients return any delegations
  they choose. The server uses this operation to handle low memory
  scenarios or indicate to a client when that client has reached the
  maximum number of delegations the server supports.

  The NFSv4.2 READ_PLUS operation has been simplified temporarily whilst
  support for sparse files in local filesystems and the VFS is improved.

  Two major data structure fixes appear in this release:

   - The nfs4_file hash table is replaced with a resizable hash table to
     reduce the latency of NFSv4 OPEN operations.

   - Reference counting in the NFSD filecache has been hardened against
     races.

  In furtherance of removing support for NFSv2 in a subsequent kernel
  release, a new Kconfig option enables server-side support for NFSv2 to
  be left out of a kernel build.

  MAINTAINERS has been updated to indicate that changes to fs/exportfs
  should go through the NFSD tree"

* tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (49 commits)
  NFSD: Avoid clashing function prototypes
  SUNRPC: Fix crasher in unwrap_integ_data()
  SUNRPC: Make the svc_authenticate tracepoint conditional
  NFSD: Use only RQ_DROPME to signal the need to drop a reply
  SUNRPC: Clean up xdr_write_pages()
  SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
  NFSD: add CB_RECALL_ANY tracepoints
  NFSD: add delegation reaper to react to low memory condition
  NFSD: add support for sending CB_RECALL_ANY
  NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
  trace: Relocate event helper files
  NFSD: pass range end to vfs_fsync_range() instead of count
  lockd: fix file selection in nlmsvc_cancel_blocked
  lockd: ensure we use the correct file descriptor when unlocking
  lockd: set missing fl_flags field when retrieving args
  NFSD: Use struct_size() helper in alloc_session()
  nfsd: return error if nfs4_setacl fails
  lockd: set other missing fields when unlocking files
  NFSD: Add an nfsd_file_fsync tracepoint
  sunrpc: svc: Remove an unused static function svc_ungetu32()
  ...
2022-12-12 20:54:39 -08:00
Dai Ngo
638593be55 NFSD: add CB_RECALL_ANY tracepoints
Add tracepoints to trace the start and end of a CB_RECALL_ANY operation.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
[ cel: added show_rca_mask() macro ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-10 11:01:12 -05:00
Dai Ngo
44df6f439a NFSD: add delegation reaper to react to low memory condition
The delegation reaper is called from the nfsd memory shrinker's
'count' callback. It scans the client list and sends the
courtesy CB_RECALL_ANY to the clients that hold delegations.

To avoid flooding the clients with CB_RECALL_ANY requests, the
delegation reaper sends at most one CB_RECALL_ANY request to each
client every 5 seconds.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
[ cel: moved definition of RCA4_TYPE_MASK_RDATA_DLG ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-10 11:01:12 -05:00
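The once-per-5-seconds throttle described in 44df6f439a above can be illustrated with a small userspace sketch. This is not the kernel code: `struct client_state`, its `last_recall_any` field, and `recall_any_allowed()` are hypothetical names chosen for illustration, and only the 5-second interval comes from the commit text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-client state; the field name is illustrative only. */
struct client_state {
    time_t last_recall_any;   /* time of the last CB_RECALL_ANY, 0 if never sent */
};

/* Minimum spacing between requests to one client (from the commit text). */
#define RECALL_ANY_INTERVAL_SECS 5

/*
 * Return true and record the send time if a CB_RECALL_ANY may be sent to
 * this client now; return false if one was sent less than 5 seconds ago.
 */
static bool recall_any_allowed(struct client_state *clp, time_t now)
{
    assert(clp != NULL);
    if (clp->last_recall_any != 0 &&
        now - clp->last_recall_any < RECALL_ANY_INTERVAL_SECS)
        return false;
    clp->last_recall_any = now;
    return true;
}
```

A reaper loop would call `recall_any_allowed()` once per client that holds delegations and skip any client for which it returns false.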
Dai Ngo
a1049eb47f NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
Refactor courtesy_client_reaper into a generic low-memory
shrinker so it can be used for other purposes.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-10 11:01:12 -05:00
Xiu Jianfeng
85a0d0c9a5 NFSD: Use struct_size() helper in alloc_session()
Use struct_size() helper to simplify the code, no functional changes.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-10 11:01:11 -05:00
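For readers unfamiliar with the idea behind struct_size() in 85a0d0c9a5 above, here is a rough userspace analogue. It is a sketch under assumptions: `struct slot_table` and both helper names are hypothetical, and the code only mirrors the general pattern of sizing a structure with a trailing flexible array while guarding against overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure with a trailing flexible array. */
struct slot_table {
    size_t nslots;
    uint32_t slots[];
};

/*
 * Bytes needed for the header plus n trailing elements, or SIZE_MAX if the
 * computation would overflow (saturating rather than wrapping).
 */
static size_t slot_table_size(size_t n)
{
    if (n > (SIZE_MAX - sizeof(struct slot_table)) / sizeof(uint32_t))
        return SIZE_MAX;
    return sizeof(struct slot_table) + n * sizeof(uint32_t);
}

/* Allocate a zeroed table with n slots; returns NULL on overflow or OOM. */
static struct slot_table *slot_table_alloc(size_t n)
{
    size_t bytes = slot_table_size(n);
    struct slot_table *t;

    if (bytes == SIZE_MAX)
        return NULL;
    assert(bytes >= sizeof(*t));   /* never allocate less than the header */
    t = calloc(1, bytes);
    if (t)
        t->nslots = n;
    return t;
}
```

Keeping the size computation in one helper means every caller gets the same overflow handling instead of repeating open-coded arithmetic.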
Jeff Layton
77c67530e1 nfsd: use locks_inode_context helper
nfsd currently doesn't access i_flctx safely everywhere. Safe access
requires an smp_load_acquire, as the pointer is set via cmpxchg (a
release operation).

Acked-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2022-11-30 05:08:10 -05:00
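The acquire/release pairing that 77c67530e1 above relies on can be sketched in userspace with C11 atomics. This is an analogy, not the kernel helper: `struct inode_like`, `struct lock_ctx`, `ctx_get()` and `ctx_get_or_create()` are hypothetical names, and C11 `memory_order_*` arguments stand in for the kernel's smp_load_acquire and cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lazily allocated context, standing in for a lock context. */
struct lock_ctx {
    int nlocks;
};

struct inode_like {
    _Atomic(struct lock_ctx *) ctx;   /* NULL until first use */
};

/* Reader: the acquire load pairs with the release in ctx_get_or_create(). */
static struct lock_ctx *ctx_get(struct inode_like *ino)
{
    assert(ino != NULL);
    return atomic_load_explicit(&ino->ctx, memory_order_acquire);
}

/* Writer: initialise a context fully, then publish it with a CAS. */
static struct lock_ctx *ctx_get_or_create(struct inode_like *ino)
{
    struct lock_ctx *expected = NULL;
    struct lock_ctx *cur = ctx_get(ino);
    struct lock_ctx *fresh;

    if (cur)
        return cur;
    fresh = calloc(1, sizeof(*fresh));
    if (!fresh)
        return NULL;
    /* acq_rel on success includes the release that publishes *fresh. */
    if (atomic_compare_exchange_strong_explicit(&ino->ctx, &expected, fresh,
                                                memory_order_acq_rel,
                                                memory_order_acquire))
        return fresh;
    /* Lost the race: another thread published first; use its context. */
    free(fresh);
    return expected;
}
```

A plain (non-acquire) load in `ctx_get()` could let a reader observe the pointer before the fields it points to, which is the hazard the commit addresses by routing every access through one helper.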
Chuck Lever
d47b295e8d NFSD: Use rhashtable for managing nfs4_file objects
fh_match() is costly, especially when filehandles are large (as is
the case for NFSv4). It needs to be used sparingly when searching
data structures. Unfortunately, with common workloads, I see
multiple thousands of objects stored in file_hashtbl[], which has
just 256 buckets, making its bucket hash chains quite lengthy.

Walking long hash chains with the state_lock held blocks other
activity that needs that lock. Sizable hash chains are a common
occurrence once the server has handed out some delegations, for
example -- IIUC, each delegated file is held open on the server by
an nfs4_file object.

To help mitigate the cost of searching with fh_match(), replace the
nfs4_file hash table with an rhashtable, which can dynamically
resize its bucket array to minimize hash chain length.

The result of this modification is an improvement in the latency of
NFSv4 operations, and the reduction of nfsd CPU utilization due to
eliminating the cost of multiple calls to fh_match() and reducing
the CPU cache misses incurred while walking long hash chains in the
nfs4_file hash table.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:47 -05:00
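To make the chain-length argument in d47b295e8d above concrete, the following sketch compares a fixed 256-bucket table with a table that doubles as it fills. Only the 256-bucket figure comes from the commit text; the uniform-hash model, the 3/4 growth threshold, and both function names are assumptions made for illustration and do not describe the rhashtable implementation.

```c
#include <assert.h>
#include <stddef.h>

#define FIXED_BUCKETS 256   /* size of the old file_hashtbl[], per the commit text */

/* Average entries per bucket, rounded up, assuming a uniform hash. */
static size_t avg_chain_len(size_t nobjects, size_t nbuckets)
{
    assert(nbuckets > 0);
    return (nobjects + nbuckets - 1) / nbuckets;
}

/*
 * Bucket count a doubling table would reach for nobjects entries if it
 * grows whenever the entry count exceeds 3/4 of the bucket count.
 * The threshold and the starting size are assumptions of this sketch.
 */
static size_t resized_buckets(size_t nobjects, size_t start_buckets)
{
    size_t n = start_buckets;

    assert(start_buckets > 0);
    while (nobjects > n / 4 * 3)
        n *= 2;
    return n;
}
```

Under this model, 8000 objects in 256 buckets average 32 entries per chain, while the doubling table grows to 16384 buckets and averages one entry per chain, so far fewer fh_match() comparisons are needed per lookup.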
Chuck Lever
1542474800 NFSD: Refactor find_file()
find_file() is now the only caller of find_file_locked(), so just
fold these two together.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:47 -05:00
Chuck Lever
9270fc514b NFSD: Clean up find_or_add_file()
Remove the call to find_file_locked() in insert_nfs4_file(). Tracing
shows that over 99% of these calls return NULL. Thus it is not worth
the expense of the extra bucket list traversal. insert_file() already
deals correctly with the case where the item is already in the hash
bucket.

Since nfsd4_file_hash_insert() is now just a wrapper around
insert_file(), move the meat of insert_file() into
nfsd4_file_hash_insert() and get rid of it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
2022-11-28 12:54:47 -05:00
Chuck Lever
3341678f2f NFSD: Add a nfsd4_file_hash_remove() helper
Refactor to relocate hash deletion operation to a helper function
that is close to most other nfs4_file data structure operations.

The "noinline" annotation will become useful in a moment when the
hlist_del_rcu() is replaced with a more complex rhash remove
operation. It also guarantees that hash remove operations can be
traced with "-p function -l remove_nfs4_file_locked".

This also simplifies the organization of forward declarations: the
to-be-added rhashtable and its param structure will be defined
/after/ put_nfs4_file().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:47 -05:00
Chuck Lever
81a21fa3e7 NFSD: Clean up nfsd4_init_file()
Name this function more consistently. I'm going to use nfsd4_file_
and nfsd4_file_hash_ for these helpers.

Change the @fh parameter to be const pointer for better type safety.

Finally, move the hash insertion operation to the caller. This is
typical for most other "init_object" type helpers, and it is where
most of the other nfs4_file hash table operations are located.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:46 -05:00
Chuck Lever
3fe828cadd NFSD: Update file_hashtbl() helpers
Enable callers to use const pointers for type safety.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:46 -05:00
Chuck Lever
a1c74569bb NFSD: Trace delegation revocations
Delegation revocation is an exceptional event that is not otherwise
visible externally (eg, no network traffic is emitted). Generate a
trace record when it occurs so that revocation can be observed or
other activity can be triggered. Example:

nfsd-1104  [005]  1912.002544: nfsd_stid_revoke:        client 633c9343:4e82788d stateid 00000003:00000001 ref=2 type=DELEG

Trace infrastructure is provided for subsequent additional tracing
related to nfs4_stid activity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:46 -05:00
Chuck Lever
20eee313ff NFSD: Trace stateids returned via DELEGRETURN
Handing out a delegation stateid is recorded with the
nfsd_deleg_read tracepoint, but there isn't a matching tracepoint
for recording when the stateid is returned.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:46 -05:00
Chuck Lever
dcf3f80965 NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately"
This reverts commit 5e138c4a75.

That commit attempted to make files available to other users as soon
as all NFSv4 clients were done with them, rather than waiting until
the filecache LRU had garbage collected them.

It gets the reference counting wrong, for one thing.

But it also misses that DELEGRETURN should release a file in the
same fashion. In fact, any nfsd_file_put() on a file held open
by an NFSv4 client potentially needs to release the file
immediately...

Clear the way for implementing that idea.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
2022-11-28 12:54:45 -05:00
Jeff Layton
e0aa651068 nfsd: don't call nfsd_file_put from client states seqfile display
We had a report of this:

    BUG: sleeping function called from invalid context at fs/nfsd/filecache.c:440

...with a stack trace showing nfsd_file_put being called from
nfs4_show_open. This code has always tried to call fput while holding a
spinlock, but we recently changed this to use the filecache, and that
started triggering the might_sleep() in nfsd_file_put.

states_start takes and holds the cl_lock while iterating over the
client's states, and we can't sleep with that held.

Have the various nfs4_show_* functions hold the fi_lock instead
of taking an nfsd_file reference.

Fixes: 78599c42ae ("nfsd4: add file to display list of client's opens")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138357
Reported-by: Zhi Li <yieli@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:45 -05:00
Linus Torvalds
f9bbe0c99e Fixes:
- Fix an export leak
 - Fix a potential tracepoint crash

Merge tag 'nfsd-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix an export leak

 - Fix a potential tracepoint crash

* tag 'nfsd-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: put the export reference in nfsd4_verify_deleg_dentry
  nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint
2022-11-11 11:28:26 -08:00
Jeff Layton
50256e4793 nfsd: put the export reference in nfsd4_verify_deleg_dentry
nfsd_lookup_dentry returns an export reference in addition to the dentry
ref. Ensure that we put it too.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138866
Fixes: 876c553cb4 ("NFSD: verify the opened dentry after setting a delegation")
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-08 11:32:53 -05:00
Jason A. Donenfeld
a251c17aa5 treewide: use get_random_u32() when possible
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:58 -06:00
Jeff Layton
895ddf5ed4 nfsd: extra checks when freeing delegation stateids
We've had some reports of problems in the refcounting for delegation
stateids that we've yet to track down. Add some extra checks to ensure
that we've removed the object from various lists before freeing it.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2127067
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:50:58 -04:00
Jeff Layton
b95239ca49 nfsd: make nfsd4_run_cb a bool return function
queue_work can return false and not queue anything, if the work is
already queued. If that happens in the case of a CB_RECALL, we'll have
taken an extra reference to the stid that will never be put. Ensure we
throw a warning in that case.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:50:57 -04:00
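The reference-count hazard described in b95239ca49 above can be modelled in a few lines of userspace C. Everything here is a stand-in: `queue_work_sim()` only imitates the "returns false if already queued" contract stated in the commit, and `struct stid_like` and `run_recall()` are hypothetical. Dropping the extra reference on failure is a choice made for this sketch; the commit itself only says a warning is emitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a work item and a refcounted state object. */
struct work_item {
    bool pending;
};

struct stid_like {
    int refcount;
    struct work_item recall_work;
};

/* Imitates the described queue_work() contract: false if already queued. */
static bool queue_work_sim(struct work_item *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    return true;
}

/*
 * Take a reference for the callback, then try to queue it. If the work was
 * already queued, nothing will consume the new reference, so release it
 * here and report the failure to the caller.
 */
static bool run_recall(struct stid_like *s)
{
    assert(s->refcount > 0);
    s->refcount++;
    if (queue_work_sim(&s->recall_work))
        return true;
    s->refcount--;
    return false;
}
```

Returning a bool rather than void is what lets the caller notice the second case at all; with a void function the extra reference would be leaked silently.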
Jeff Layton
25fbe1fca1 nfsd: fix comments about spinlock handling with delegations
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:23:55 -04:00
Jeff Layton
4d01416ab4 nfsd: only fill out return pointer on success in nfsd4_lookup_stateid
In the case of a revoked delegation, we still fill out the pointer even
when returning an error, which is bad form. Only overwrite the pointer
on success.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:23:44 -04:00
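The convention adopted in 4d01416ab4 above, writing the output pointer only on success, looks roughly like this in a standalone sketch. The names `struct state_obj` and `lookup_state()` and the errno values are hypothetical and are not taken from nfsd4_lookup_stateid.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state object; "revoked" models a revoked delegation. */
struct state_obj {
    int id;
    bool revoked;
};

/*
 * Look up id in table[0..n). On success store the match in *out and
 * return 0. On any error return a negative errno and leave *out untouched,
 * so callers never receive a pointer alongside a failure.
 */
static int lookup_state(struct state_obj *table, size_t n, int id,
                        struct state_obj **out)
{
    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].id != id)
            continue;
        if (table[i].revoked)
            return -ESTALE;    /* error path: *out is not written */
        *out = &table[i];
        return 0;
    }
    return -ENOENT;
}
```

With this shape, a caller that initialises its pointer to NULL can rely on it staying NULL after a failed lookup, which avoids accidentally using or releasing an object the function never handed over.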
Dai Ngo
019805fea9 NFSD: fix use-after-free on source server when doing inter-server copy
A use-after-free occurred when the laundromat tried to free an expired
cpntf_state entry on the s2s_cp_stateids list after an inter-server
copy completed. The sc_cp_list that the expired copy state was
inserted on was already freed.

When COPY completes, the Linux client normally sends LOCKU(lock_state x),
FREE_STATEID(lock_state x) and CLOSE(open_state y) to the source server.
The nfs4_put_stid call from nfsd4_free_stateid cleans up the copy state
from the s2s_cp_stateids list before freeing the lock state's stid.

However, sometimes the CLOSE was sent before the FREE_STATEID request.
When this happens, the nfsd4_close_open_stateid call from nfsd4_close
frees all lock states on its st_locks list without cleaning up the copy
state on the sc_cp_list list. By the time the FREE_STATEID arrives, the
server returns BAD_STATEID since the lock state was freed. This causes
the use-after-free error to occur when the laundromat tries to free
the expired cpntf_state.

This patch adds a call to nfs4_free_cpntf_statelist in
nfsd4_close_open_stateid to clean up the copy state before calling
free_ol_stateid_reaplist to free the lock state's stid on the reaplist.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:21:46 -04:00
Chuck Lever
781fde1a2b NFSD: Rename the fields in copy_stateid_t
Code maintenance: The name of the copy_stateid_t::sc_count field
collides with the sc_count field in struct nfs4_stid, making the
latter difficult to grep for when auditing stateid reference
counting.

No behavior change expected.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:50 -04:00
ChenXiaoSong
1d7f6b302b nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops
Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code.

The inode is now obtained from seq_file->file instead of
seq_file->private in client_info_show().

Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:49 -04:00
Dai Ngo
7746b32f46 NFSD: add shrinker to reap courtesy clients on low memory condition
Add courtesy_client_reaper to react to low memory condition triggered
by the system memory shrinker.

The delayed_work for the courtesy_client_reaper is scheduled on
the shrinker's count callback using the laundry_wq.

The shrinker's scan callback is not used for expiring the courtesy
clients due to potential deadlocks.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:41 -04:00
Dai Ngo
3a4ea23d86 NFSD: keep track of the number of courtesy clients in the system
Add counter nfs4_courtesy_client_count to nfsd_net to keep track
of the number of courtesy clients in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-09-26 14:02:41 -04:00
Chuck Lever
c035362eb9 NFSD: Add a mechanism to wait for a DELEGRETURN
Subsequent patches will use this mechanism to wake up an operation
that is waiting for a client to return a delegation.

The new tracepoint records whether the wait timed out or was
properly awoken by the expected DELEGRETURN:

            nfsd-1155  [002] 83799.493199: nfsd_delegret_wakeup: xid=0x14b7d6ef fh_hash=0xf6826792 (timed out)

Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-09-26 14:02:32 -04:00
Chuck Lever
1035d65446 NFSD: Add tracepoints to report NFSv4 callback completions
Wireshark has always been lousy about dissecting NFSv4 callbacks,
especially NFSv4.0 backchannel requests. Add tracepoints so we
can surgically capture these events in the trace log.

Tracepoints are time-stamped and ordered so that we can now observe
the timing relationship between a CB_RECALL Reply and the client's
DELEGRETURN Call. Example:

            nfsd-1153  [002]   211.986391: nfsd_cb_recall:       addr=192.168.1.67:45767 client 62ea82e4:fee7492a stateid 00000003:00000001

            nfsd-1153  [002]   212.095634: nfsd_compound:        xid=0x0000002c opcnt=2
            nfsd-1153  [002]   212.095647: nfsd_compound_status: op=1/2 OP_PUTFH status=0
            nfsd-1153  [002]   212.095658: nfsd_file_put:        hash=0xf72 inode=0xffff9291148c7410 ref=3 flags=HASHED|REFERENCED may=READ file=0xffff929103b3ea00
            nfsd-1153  [002]   212.095661: nfsd_compound_status: op=2/2 OP_DELEGRETURN status=0
   kworker/u25:8-148   [002]   212.096713: nfsd_cb_recall_done:  client 62ea82e4:fee7492a stateid 00000003:00000001 status=0

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-09-26 14:02:32 -04:00
NeilBrown
bb4d53d66e NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
When locking a file to access ACLs and xattrs etc, use explicit locking
with inode_lock() instead of fh_lock().  This means that the calls to
fh_fill_pre/post_attr() are also explicit which improves readability and
allows us to place them only where they are needed.  Only the xattr
calls need pre/post information.

When locking a file we don't need I_MUTEX_PARENT as the file is not a
parent of anything, so we can use inode_lock() directly rather than the
inode_lock_nested() call that fh_lock() uses.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-08-04 10:28:41 -04:00
NeilBrown
19d008b469 NFSD: reduce locking in nfsd_lookup()
nfsd_lookup() takes an exclusive lock on the parent inode, but no
callers want the lock and it may not be needed at all if the
result is in the dcache.

Change nfsd_lookup_dentry() to not take the lock, and call
lookup_one_len_locked() which takes the lock only if needed.

nfsd4_open() currently expects the lock to still be held, but that isn't
necessary as nfsd_validate_delegated_dentry() provides required
guarantees without the lock.

NOTE: NFSv4 requires directory changeinfo for OPEN even when a create
  wasn't requested and no change happened.  Now that nfsd_lookup()
  doesn't use fh_lock(), we need to explicitly fill the attributes
  when no create happens.  A new fh_fill_both_attrs() is provided
  for that task.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-08-04 10:28:20 -04:00
NeilBrown
7fe2a71dda NFSD: introduce struct nfsd_attrs
The attributes that nfsd might want to set on a file include 'struct
iattr' as well as an ACL and security label.
The latter two are passed around quite separately from the first, in
part because they are only needed for NFSv4.  This leads to some
clumsiness in the code, such as the attributes NOT being set in
nfsd_create_setattr().

We need to keep the directory locked until all attributes are set to
ensure the file is never visible without all its attributes.  This need
combined with the inconsistent handling of attributes leads to more
clumsiness.

As a first step towards tidying this up, introduce 'struct nfsd_attrs'.
This is passed (by reference) to vfs.c functions that work with
attributes, and is assembled by the various nfs*proc functions which
call them.  As yet only iattr is included, but future patches will
expand this.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:17:00 -04:00
Jeff Layton
876c553cb4 NFSD: verify the opened dentry after setting a delegation
Between opening a file and setting a delegation on it, someone could
rename or unlink the dentry. If this happens, we do not want to grant a
delegation on the open.

On a CLAIM_NULL open, we're opening by filename, and we may (in the
non-create case) or may not (in the create case) be holding i_rwsem
when attempting to set a delegation.  The latter case allows a
race.

After getting a lease, redo the lookup of the file being opened and
validate that the resulting dentry matches the one in the open file
description.

To properly redo the lookup we need an rqst pointer to pass to
nfsd_lookup_dentry(), so make sure that is available.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:17:00 -04:00
Jeff Layton
bbf936edd5 NFSD: drop fh argument from alloc_init_deleg
Currently, we pass the fh of the opened file down through several
functions so that alloc_init_deleg can pass it to delegation_blocked.
The filehandle of the open file is available in the nfs4_file, however,
so there's no need to pass it in a separate argument.

Drop the argument from alloc_init_deleg, nfs4_open_delegation and
nfs4_set_delegation.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:17:00 -04:00
Dai Ngo
4271c2c088 NFSD: limit the number of v4 clients to 1024 per 1GB of system memory
Currently there is no limit on how many v4 clients are supported
by the system. On systems with a small memory configuration, a very
large number of clients can create memory shortage conditions that
keep the system from functioning properly.

This patch enforces a limit of 1024 NFSv4 clients, including courtesy
clients, per 1GB of system memory.  When the number of clients
reaches the limit, requests that create new clients are returned
with NFS4ERR_DELAY and the laundromat is kick-started to trim old
clients. Due to the overhead of the upcall to remove the client
record, the maximum number of clients the laundromat removes on
each run is limited to 128. This is done to ensure the laundromat
can still process its other tasks in a timely manner.

Since there is now a limit on the number of clients, the 24-hr
idle time limit for courtesy clients is no longer needed and has been
removed.
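
The arithmetic above can be sketched as follows. The constants 1024
and 128 come from this message; the function names, the rounding
(whole GiB, rounded down, with a floor of one GiB) and the use of
GiB for "1GB" are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CLIENTS_PER_GB	1024u	/* from the commit message */
#define DEMO_MAX_REAP_PER_RUN	128u	/* from the commit message */

/* Client limit for a machine with @total_ram_bytes of memory. */
static inline uint64_t demo_max_v4_clients(uint64_t total_ram_bytes)
{
	const uint64_t gib = 1ull << 30;
	uint64_t gbs = total_ram_bytes / gib;	/* assumption: round down */

	if (gbs < 1)
		gbs = 1;			/* assumption: floor of 1 GiB */
	return gbs * DEMO_CLIENTS_PER_GB;
}

/* True when a request that creates a new client should be delayed. */
static inline bool demo_must_delay_new_client(uint64_t nr_clients,
					      uint64_t limit)
{
	return nr_clients >= limit;
}

/* Number of expirable clients one laundromat run may remove. */
static inline uint64_t demo_reap_budget(uint64_t nr_expirable)
{
	return nr_expirable < DEMO_MAX_REAP_PER_RUN ?
		nr_expirable : DEMO_MAX_REAP_PER_RUN;
}
```

Under these assumptions a 4 GiB machine would allow 4096 clients, and
a backlog of 500 expirable clients would take four laundromat runs to
clear.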

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Dai Ngo
0926c39515 NFSD: keep track of the number of v4 clients in the system
Add counter nfs4_client_count to keep track of the total number
of v4 clients, including courtesy clients, in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Dai Ngo
6867137ebc NFSD: refactoring v4 specific code to a helper in nfs4state.c
This patch moves the v4 specific code from nfsd_init_net() to the
nfsd4_init_leases_net() helper in nfs4state.c.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Chuck Lever
427f5f83a3 NFSD: Ensure nf_inode is never dereferenced
The documenting comment for struct nf_file states:

/*
 * A representation of a file that has been opened by knfsd. These are hashed
 * in the hashtable by inode pointer value. Note that this object doesn't
 * hold a reference to the inode by itself, so the nf_inode pointer should
 * never be dereferenced, only used for comparison.
 */

Replace the two existing dereferences to make the comment always
true.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:56 -04:00
Chuck Lever
5e138c4a75 NFSD: NFSv4 CLOSE should release an nfsd_file immediately
The last close of a file should enable other accessors to open and
use that file immediately. Leaving the file open in the filecache
prevents other users from accessing that file until the filecache
garbage-collects the file -- sometimes that takes several seconds.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:16:42 -04:00
Chuck Lever
be0230069f NFSD: Separate tracepoints for acquire and create
These tracepoints collect different information: the create case does
not open a file, so there's no nf_file available.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-07-29 20:15:54 -04:00
Chuck Lever
043862b09c NFSD: Add documenting comment for nfsd4_release_lockowner()
And return explicit nfserr values that match what is documented in the
new comment / API contract.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-26 10:50:50 -04:00
Chuck Lever
bd8fdb6e54 NFSD: Modernize nfsd4_release_lockowner()
Refactor: Use existing helpers that other lock operations use. This
change removes several automatic variables, so re-organize the
variable declarations for readability.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-26 10:50:50 -04:00
Chuck Lever
ce3c4ad7f4 NFSD: Fix possible sleep during nfsd4_release_lockowner()
nfsd4_release_lockowner() holds clp->cl_lock when it calls
check_for_locks(). However, check_for_locks() calls nfsd_file_get()
/ nfsd_file_put() to access the backing inode's flc_posix list, and
nfsd_file_put() can sleep if the inode was recently removed.

Let's instead rely on the stateowner's reference count to gate
whether the release is permitted. This should be a reliable
indication of locks-in-use since file lock operations and
->lm_get_owner take appropriate references, which are released
appropriately when file locks are removed.
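
A simplified userspace sketch of the reference-count gate, under
stated assumptions: demo_owner and its helpers are hypothetical, a
plain int stands in for the atomic counter used by real code, and the
baseline count of references that do not represent locks in use is
passed in by the caller rather than taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock stateowner. */
struct demo_owner {
	int refcount;
};

/* Take a reference, e.g. when a file lock starts using the owner. */
static inline void demo_owner_get(struct demo_owner *o)
{
	o->refcount++;
}

/* Drop a reference, e.g. when that file lock is removed. */
static inline void demo_owner_put(struct demo_owner *o)
{
	assert(o->refcount > 0);
	o->refcount--;
}

/*
 * The release is permitted only when no references beyond @baseline
 * remain, which means no file lock still refers to the owner.  No
 * sleeping call is needed to make this decision.
 */
static inline bool demo_owner_can_release(const struct demo_owner *o,
					  int baseline)
{
	return o->refcount == baseline;
}
```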

Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org
2022-05-26 10:50:49 -04:00
Chuck Lever
7e2ce0cc15 NFSD: Move documenting comment for nfsd4_process_open2()
Clean up nfsd4_open() by converting a large comment at the only
call site for nfsd4_process_open2() to a kerneldoc comment in
front of that function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-05-23 11:06:29 -04:00