Commit Graph

3191 Commits

Author SHA1 Message Date
Chuck Lever
d0abdae519 NFSD: Replace READ* macros in nfsd4_decode_secinfo()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:40 -05:00
Chuck Lever
d12f90458d NFSD: Replace READ* macros in nfsd4_decode_renew()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:40 -05:00
Chuck Lever
ba881a0a53 NFSD: Replace READ* macros in nfsd4_decode_rename()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
b7f5fbf219 NFSD: Replace READ* macros in nfsd4_decode_remove()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
0dfaf2a371 NFSD: Replace READ* macros in nfsd4_decode_readdir()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
3909c3bc60 NFSD: Replace READ* macros in nfsd4_decode_read()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
a73bed9841 NFSD: Replace READ* macros in nfsd4_decode_putfh()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
dca71651f0 NFSD: Replace READ* macros in nfsd4_decode_open_downgrade()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
06bee693a1 NFSD: Replace READ* macros in nfsd4_decode_open_confirm()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
61e5e0b3ec NFSD: Replace READ* macros in nfsd4_decode_open()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
1708e50b01 NFSD: Add helper to decode OPEN's open_claim4 argument
Refactor for clarity.

Note that op_fname is the only instance of an NFSv4 filename stored
in a struct xdr_netobj. Convert it to a u32/char * pair so that the
new nfsd4_decode_filename() helper can be used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
b07bebd9eb NFSD: Replace READ* macros in nfsd4_decode_share_deny()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:39 -05:00
Chuck Lever
9aa62f5199 NFSD: Replace READ* macros in nfsd4_decode_share_access()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
e6ec04b27b NFSD: Add helper to decode OPEN's openflag4 argument
Refactor for clarity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
bf33bab3c4 NFSD: Add helper to decode OPEN's createhow4 argument
Refactor for clarity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
796dd1c6b6 NFSD: Add helper to decode NFSv4 verifiers
This helper will be used to simplify decoders in subsequent
patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
3d5877e8e0 NFSD: Replace READ* macros in nfsd4_decode_lookup()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
ca9cf9fc27 NFSD: Replace READ* macros in nfsd4_decode_locku()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
0a146f04aa NFSD: Replace READ* macros in nfsd4_decode_lockt()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
7c59deed5c NFSD: Replace READ* macros in nfsd4_decode_lock()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
8918cc0d2b NFSD: Add helper for decoding locker4
Refactor for clarity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
144e826940 NFSD: Add helpers to decode a clientid4 and an NFSv4 state owner
These helpers will also be used to simplify decoders in subsequent
patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:38 -05:00
Chuck Lever
5dcbfabb67 NFSD: Relocate nfsd4_decode_opaque()
Enable nfsd4_decode_opaque() to be used in more decoders, and
replace the READ* macros in nfsd4_decode_opaque().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
5c505d1286 NFSD: Replace READ* macros in nfsd4_decode_link()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
f759eff260 NFSD: Replace READ* macros in nfsd4_decode_getattr()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
95e6482ced NFSD: Replace READ* macros in nfsd4_decode_delegreturn()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
000dfa18b3 NFSD: Replace READ* macros in nfsd4_decode_create()
A dedicated decoder for component4 is introduced here, which will be
used by other operation decoders in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
d1c263a031 NFSD: Replace READ* macros in nfsd4_decode_fattr()
Let's be more careful to avoid overrunning the memory that backs
the bitmap array. This requires updating the synopsis of
nfsd4_decode_fattr().

Bruce points out that a server needs to be careful to return nfs_ok
when a client presents bitmap bits the server doesn't support. This
includes bits in bitmap words the server might not yet support.

The current READ* based implementation is good about that, but that
requirement hasn't been documented.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
66f0476c70 NFSD: Replace READ* macros that decode the fattr4 umask attribute
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
dabe91828f NFSD: Replace READ* macros that decode the fattr4 security label attribute
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
1c3eff7ea4 NFSD: Replace READ* macros that decode the fattr4 time_set attributes
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:37 -05:00
Chuck Lever
393c31dd27 NFSD: Replace READ* macros that decode the fattr4 owner_group attribute
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
9853a5ac9b NFSD: Replace READ* macros that decode the fattr4 owner attribute
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
1c8f0ad7dd NFSD: Replace READ* macros that decode the fattr4 mode attribute
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
c941a96823 NFSD: Replace READ* macros that decode the fattr4 acl attribute
Refactor for clarity and to move infrequently-used code out of line.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
2ac1b9b2af NFSD: Replace READ* macros that decode the fattr4 size attribute
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
081d53fe0b NFSD: Change the way the expected length of a fattr4 is checked
Because the fattr4 is now managed in an xdr_stream, all that is
needed is to store the initial position of the stream before
decoding the attribute list. Then the actual length of the list
is computed using the final stream position, after decoding is
complete.

No behavior change is expected.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
cbd9abb370 NFSD: Replace READ* macros in nfsd4_decode_commit()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
d3d2f38154 NFSD: Replace READ* macros in nfsd4_decode_close()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
d169a6a9e5 NFSD: Replace READ* macros in nfsd4_decode_access()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
c1346a1216 NFSD: Replace the internals of the READ_BUF() macro
Convert the READ_BUF macro in nfs4xdr.c from open code to instead
use the new xdr_stream-style decoders already in use by the encode
side (and by the in-kernel NFS client implementation). Once this
conversion is done, each individual NFSv4 argument decoder can be
independently cleaned up to replace these macros with C code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:36 -05:00
Chuck Lever
08281341be NFSD: Add tracepoints in nfsd4_decode/encode_compound()
For troubleshooting purposes, record failures to decode NFSv4
operation arguments and encode operation results.

trace_nfsd_compound_decode_err() replaces the dprintk() call sites
that are embedded in READ_* macros that are about to be removed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:35 -05:00
Chuck Lever
0dfdad1c1d NFSD: Add tracepoints in nfsd_dispatch()
For troubleshooting purposes, record GARBAGE_ARGS and CANT_ENCODE
failures.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:35 -05:00
Chuck Lever
788f7183fb NFSD: Add common helpers to decode void args and encode void results
Start off the conversion to xdr_stream by de-duplicating the functions
that decode void arguments and encode void results.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:35 -05:00
Chuck Lever
5191955d6f SUNRPC: Prepare for xdr_stream-style decoding on the server-side
A "permanent" struct xdr_stream is allocated in struct svc_rqst so
that it is usable by all server-side decoders. A per-rqst scratch
buffer is also allocated to handle decoding XDR data items that
cross page boundaries.

To demonstrate how it will be used, add the first call site for the
new svcxdr_init_decode() API.

As an additional part of the overall conversion, add symbolic
constants for successful and failed XDR operations. Returning "0" is
overloaded. Sometimes it means something failed, but sometimes it
means success. To make it more clear when XDR decoding functions
succeed or fail, introduce symbolic constants.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:35 -05:00
Chuck Lever
0ae4c3e8a6 SUNRPC: Add xdr_set_scratch_page() and xdr_reset_scratch_buffer()
Clean up: De-duplicate some frequently-used code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:46:35 -05:00
Huang Guobin
231307df24 nfsd: Fix error return code in nfsd_file_cache_init()
Fix to return PTR_ERR() error code from the error handling case instead of
0 in function nfsd_file_cache_init(), as done elsewhere in this function.

Fixes: 65294c1f2c5e7("nfsd: add a new struct file caching facility to nfsd")
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 14:45:56 -05:00
Chuck Lever
f45a444cfe NFSD: Add SPDX header for fs/nfsd/trace.c
Clean up.

The file was contributed in 2014 by Christoph Hellwig in commit
31ef83dc05 ("nfsd: add trace events").

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 13:00:24 -05:00
Chuck Lever
3a90e1dff1 NFSD: Remove extra "0x" in tracepoint format specifier
Clean up: %p adds its own 0x already.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 13:00:24 -05:00
Chuck Lever
b76278ae68 NFSD: Clean up the show_nf_may macro
Display all currently possible NFSD_MAY permission flags.

Move and rename show_nf_may with a more generic name because the
NFSD_MAY permission flags are used in other places besides the file
cache.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 13:00:24 -05:00
Alex Shi
71fd721839 nfsd/nfs3: remove unused macro nfsd3_fhandleres
The macro is unused, remove it to tame gcc warning:
fs/nfsd/nfs3proc.c:702:0: warning: macro "nfsd3_fhandleres" is not used
[-Wunused-macros]

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 13:00:23 -05:00
Tom Rix
25fef48bdb NFSD: A semicolon is not needed after a switch statement.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 13:00:23 -05:00
Chuck Lever
76e5492b16 NFSD: Invoke svc_encode_result_payload() in "read" NFSD encoders
Have the NFSD encoders annotate the boundaries of every
direct-data-placement eligible result data payload. Then change
svcrdma to use that annotation instead of the xdr->page_len
when handling Write chunks.

For NFSv4 on RDMA, that enables the ability to recognize multiple
result payloads per compound. This is a pre-requisite for supporting
multiple Write chunks per RPC transaction.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 13:00:22 -05:00
Chuck Lever
03493bca08 SUNRPC: Rename svc_encode_read_payload()
Clean up: "result payload" is a less confusing name for these
payloads. "READ payload" reflects only the NFS usage.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-11-30 13:00:21 -05:00
Dai Ngo
49a3613273 NFSD: fix missing refcount in nfsd4_copy by nfsd4_do_async_copy
Need to initialize nfsd4_copy's refcount to 1 to avoid use-after-free
warning when nfs4_put_copy is called from nfsd4_cb_offload_release.

Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-11-05 17:25:14 -05:00
Dai Ngo
36e1e5ba90 NFSD: Fix use-after-free warning when doing inter-server copy
The source file nfsd_file is not constructed the same as other
nfsd_file's via nfsd_file_alloc. nfsd_file_put should not be
called to free the object; nfsd_file_put is not the inverse of
kzalloc, instead kfree is called by nfsd4_do_async_copy when done.

Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-11-05 17:25:14 -05:00
Chuck Lever
66d60e3ad1 NFSD: MKNOD should return NFSERR_BADTYPE instead of NFSERR_INVAL
A late paragraph of RFC 1813 Section 3.3.11 states:

| ... if the server does not support the target type or the
| target type is illegal, the error, NFS3ERR_BADTYPE, should
| be returned. Note that NF3REG, NF3DIR, and NF3LNK are
| illegal types for MKNOD.

The Linux NFS server incorrectly returns NFSERR_INVAL in these
cases.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-11-05 17:20:12 -05:00
Chuck Lever
1905cac9d6 NFSD: NFSv3 PATHCONF Reply is improperly formed
Commit cc028a10a4 ("NFSD: Hoist status code encoding into XDR
encoder functions") missed a spot.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-11-05 17:20:12 -05:00
Linus Torvalds
24717cfbbb The one new feature this time, from Anna Schumaker, is READ_PLUS, which
has the same arguments as READ but allows the server to return an array
 of data and hole extents.
 
 Otherwise it's a lot of cleanup and bugfixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl+Q5vsVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+DUAP/RlALnXbaoWi8YCcEcc9U1LoQKbD
 CJpDR+FqCOyGwRuzWung/5pvkOO50fGEeAroos+2rF/NgRkQq8EFr9AuBhNOYUFE
 IZhWEOfu/r2ukXyBmcu21HGcWLwPnyJehvjuzTQW2wOHlBi/sdoL5Ap1sVlwVLj5
 EZ5kqJLD+ioG2sufW99Spi55l1Cy+3Y0IhLSWl4ZAE6s8hmFSYAJZFsOeI0Afx57
 USPTDRaeqjyEULkb+f8IhD0eRApOUo4evDn9dwQx+of7HPa1CiygctTKYwA3hnlc
 gXp2KpVA1REaiYVgOPwYlnqBmJ2K9X0wCRzcWy2razqEcVAX/2j7QCe9M2mn4DC8
 xZ2q4SxgXu9yf0qfUSVnDxWmP6ipqq7OmsG0JXTFseGKBdpjJY1qHhyqanVAGvEg
 I+xHnnWfGwNCftwyA3mt3RfSFPsbLlSBIMZxvN4kn8aVlqszGITOQvTdQcLYA6kT
 xWllBf4XKVXMqF0PzerxPDmfzBfhx6b1VPWOIVcu7VLBg3IXoEB2G5xG8MUJiSch
 OUTCt41LUQkerQlnzaZYqwmFdSBfXJefmcE/x/vps4VtQ/fPHX1jQyD7iTu3HfSP
 bRlkKHvNVeTodlBDe/HTPiTA99MShhBJyvtV5wfzIqwjc1cNreed+ePppxn8mxJi
 SmQ2uZk/MpUl7/V0
 =rcOj
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "The one new feature this time, from Anna Schumaker, is READ_PLUS,
  which has the same arguments as READ but allows the server to return
  an array of data and hole extents.

  Otherwise it's a lot of cleanup and bugfixes"

* tag 'nfsd-5.10' of git://linux-nfs.org/~bfields/linux: (43 commits)
  NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy
  SUNRPC: fix copying of multiple pages in gss_read_proxy_verf()
  sunrpc: raise kernel RPC channel buffer size
  svcrdma: fix bounce buffers for unaligned offsets and multiple pages
  nfsd: remove unneeded break
  net/sunrpc: Fix return value for sysctl sunrpc.transports
  NFSD: Encode a full READ_PLUS reply
  NFSD: Return both a hole and a data segment
  NFSD: Add READ_PLUS hole segment encoding
  NFSD: Add READ_PLUS data support
  NFSD: Hoist status code encoding into XDR encoder functions
  NFSD: Map nfserr_wrongsec outside of nfsd_dispatch
  NFSD: Remove the RETURN_STATUS() macro
  NFSD: Call NFSv2 encoders on error returns
  NFSD: Fix .pc_release method for NFSv2
  NFSD: Remove vestigial typedefs
  NFSD: Refactor nfsd_dispatch() error paths
  NFSD: Clean up nfsd_dispatch() variables
  NFSD: Clean up stale comments in nfsd_dispatch()
  NFSD: Clean up switch statement in nfsd_dispatch()
  ...
2020-10-22 09:44:27 -07:00
Dai Ngo
0cfcd405e7 NFSv4.2: Fix NFS4ERR_STALE error when doing inter server copy
NFS_FS=y as dependency of CONFIG_NFSD_V4_2_INTER_SSC still have
build errors and some configs with NFSD=m to get NFS4ERR_STALE
error when doing inter server copy.

Added ops table in nfs_common for knfsd to access NFS client modules.

Fixes: 3ac3711adb ("NFSD: Fix NFS server build errors")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-21 10:31:20 -04:00
Tom Rix
c1488428a8 nfsd: remove unneeded break
Because every path through nfs4_find_file()'s
switch does an explicit return, the break is not needed.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-16 15:15:04 -04:00
Anna Schumaker
9f0b5792f0 NFSD: Encode a full READ_PLUS reply
Reply to the client with multiple hole and data segments. I use the
result of the first vfs_llseek() call for encoding as an optimization so
we don't have to immediately repeat the call. This also lets us encode
any remaining reply as data if we get an unexpected result while trying
to calculate a hole.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-12 10:29:45 -04:00
Anna Schumaker
278765ea07 NFSD: Return both a hole and a data segment
But only one of each right now. We'll expand on this in the next patch.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-12 10:29:45 -04:00
Anna Schumaker
2db27992dd NFSD: Add READ_PLUS hole segment encoding
However, we still only reply to the READ_PLUS call with a single segment
at this time.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-12 10:29:45 -04:00
Anna Schumaker
528b84934e NFSD: Add READ_PLUS data support
This patch adds READ_PLUS support for returning a single
NFS4_CONTENT_DATA segment to the client. This is basically the same as
the READ operation, only with the extra information about data segments.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-12 10:29:45 -04:00
Chuck Lever
cc028a10a4 NFSD: Hoist status code encoding into XDR encoder functions
The original intent was presumably to reduce code duplication. The
trade-off was:

- No support for an NFSD proc function returning a non-success
  RPC accept_stat value.
- No support for void NFS replies to non-NULL procedures.
- Everyone pays for the deduplication with a few extra conditional
  branches in a hot path.

In addition, nfsd_dispatch() leaves *statp uninitialized in the
success path, unlike svc_generic_dispatch().

Address all of these problems by moving the logic for encoding
the NFS status code into the NFS XDR encoders themselves. Then
update the NFS .pc_func methods to return an RPC accept_stat
value.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-12 10:29:44 -04:00
Chuck Lever
4b74fd793a NFSD: Map nfserr_wrongsec outside of nfsd_dispatch
Refactor: Handle this NFS version-specific mapping in the only
place where nfserr_wrongsec is generated.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:42 -04:00
Chuck Lever
14168d678a NFSD: Remove the RETURN_STATUS() macro
Refactor: I'm about to change the return value from .pc_func. Clear
the way by replacing the RETURN_STATUS() macro with logic that
plants the status code directly into the response structure.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:42 -04:00
Chuck Lever
f0af22101d NFSD: Call NFSv2 encoders on error returns
Remove special dispatcher logic for NFSv2 error responses. These are
rare to the point of becoming extinct, but all NFS responses have to
pay the cost of the extra conditional branches.

With this change, the NFSv2 error cases now get proper
xdr_ressize_check() calls.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:42 -04:00
Chuck Lever
1841b9b614 NFSD: Fix .pc_release method for NFSv2
nfsd_release_fhandle() assumes that rqstp->rq_resp always points to
an nfsd_fhandle struct. In fact, no NFSv2 procedure uses struct
nfsd_fhandle as its response structure.

So far that has been "safe" to do because the res structs put the
resp->fh field at that same offset as struct nfsd_fhandle. I don't
think that's a guarantee, though, and there is certainly nothing
preventing a developer from altering the fields in those structures.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:42 -04:00
Chuck Lever
7cf8357043 NFSD: Remove vestigial typedefs
Clean up: These are not used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:42 -04:00
Chuck Lever
85085aacef NFSD: Refactor nfsd_dispatch() error paths
nfsd_dispatch() is a hot path. Ensure the compiler takes the
processing of rare error cases out of line.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:41 -04:00
Chuck Lever
4c96cb56ee NFSD: Clean up nfsd_dispatch() variables
For consistency and code legibility, use a similar organization of
variables as svc_generic_dispatch().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:41 -04:00
Chuck Lever
383c440d4f NFSD: Clean up stale comments in nfsd_dispatch()
Add a documenting comment for the function. Remove comments that
simply describe obvious aspects of the code, but leave comments
that explain the differences in processing of each NFS version.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:41 -04:00
Chuck Lever
84c138e78d NFSD: Clean up switch statement in nfsd_dispatch()
Reorder the arms so the compiler places checks for the most frequent
case first.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:41 -04:00
Chuck Lever
dcc46991d3 NFSD: Encoder and decoder functions are always present
nfsd_dispatch() is a hot path. Let's optimize the XDR method calls
for the by-far common case, which is that the XDR methods are indeed
present.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:41 -04:00
Chuck Lever
ba1df797e5 NFSACL: Replace PROC() macro with open code
Clean up: Follow-up on ten-year-old commit b9081d90f5 ("NFS: kill
off complicated macro 'PROC'") by performing the same conversion in
the NFSACL code. To reduce the chance of error, I copied the original
C preprocessor output and then made some minor edits.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:41 -04:00
Chuck Lever
6b3dccd48d NFSD: Add missing NFSv2 .pc_func methods
There's no protection in nfsd_dispatch() against a NULL .pc_func
helpers. A malicious NFS client can trigger a crash by invoking the
unused/unsupported NFSv2 ROOT or WRITECACHE procedures.

The current NFSD dispatcher does not support returning a void reply
to a non-NULL procedure, so the reply to both of these is wrong, for
the moment.

Cc: <stable@vger.kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-10-02 09:37:41 -04:00
J. Bruce Fields
13956160fc nfsd: rq_lease_breaker cleanup
Since only the v4 code cares about it, maybe it's better to leave
rq_lease_breaker out of the common dispatch code?

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:02:03 -04:00
J. Bruce Fields
50747dd5e4 nfsd4: remove check_conflicting_opens warning
There are actually rare races where this is possible (e.g. if a new open
intervenes between the read of i_writecount and the fi_fds).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:02:02 -04:00
J. Bruce Fields
ae3c57b5ca nfsd: Cache R, RW, and W opens separately
The nfsd open code has always kept separate read-only, read-write, and
write-only opens as necessary to ensure that when a client closes or
downgrades, we don't retain more access than necessary.

Also, I didn't realize the cache behaved this way when I wrote
94415b06eb "nfsd4: a client's own opens needn't prevent delegations".
There I assumed fi_fds[O_WRONLY] and fi_fds[O_RDWR] would always be
distinct.  The violation of that assumption is triggering a
WARN_ON_ONCE() and could also cause the server to give out a delegation
when it shouldn't.

Fixes: 94415b06eb ("nfsd4: a client's own opens needn't prevent delegations")
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:02:02 -04:00
Rik van Riel
8c38b705b4 silence nfscache allocation warnings with kvzalloc
silence nfscache allocation warnings with kvzalloc

Currently nfsd_reply_cache_init attempts hash table allocation through
kmalloc, and manually falls back to vzalloc if that fails. This makes
the code a little larger than needed, and creates a significant amount
of serial console spam if you have enough systems.

Switching to kvzalloc gets rid of the allocation warnings, and makes
the code a little cleaner too as a side effect.

Freeing of nn->drc_hashtbl is already done using kvfree currently.

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:28 -04:00
Zheng Bin
44b49aa65f nfsd: fix comparison to bool warning
Fixes coccicheck warning:

fs/nfsd/nfs4proc.c:3234:5-29: WARNING: Comparison to bool

Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:28 -04:00
Chuck Lever
5aff7d0820 NFSD: Correct type annotations in COPY XDR functions
Squelch some sparse warnings:

/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16:    expected int status
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1860:16:    got restricted __be32
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24:    expected restricted __be32
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:1862:24:    got int status

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:28 -04:00
Chuck Lever
b9a492376d NFSD: Correct type annotations in user xattr XDR functions
Squelch some sparse warnings:

/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24:    expected int
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4692:24:    got restricted __be32 [usertype]
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32:    expected int
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4702:32:    got restricted __be32 [usertype]
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13:    expected restricted __be32 [usertype] err
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4739:13:    got int
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15:    expected unsigned int [assigned] [usertype] count
/home/cel/src/linux/linux/fs/nfsd/nfs4xdr.c:4891:15:    got restricted __be32 [usertype]

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:27 -04:00
Chuck Lever
8237284a00 NFSD: Correct type annotations in user xattr helpers
Squelch some sparse warnings:

/home/cel/src/linux/linux/fs/nfsd/vfs.c:2264:13: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2264:13:    expected int err
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2264:13:    got restricted __be32
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2266:24: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2266:24:    expected restricted __be32
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2266:24:    got int err
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2288:13: warning: incorrect type in assignment (different base types)
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2288:13:    expected int err
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2288:13:    got restricted __be32
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2290:24: warning: incorrect type in return expression (different base types)
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2290:24:    expected restricted __be32
/home/cel/src/linux/linux/fs/nfsd/vfs.c:2290:24:    got int err

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:27 -04:00
Anna Schumaker
403217f304 SUNRPC/NFSD: Implement xdr_reserve_space_vec()
Reserving space for a large READ payload requires special handling when
reserving space in the xdr buffer pages. One problem we can have is use
of the scratch buffer, which is used to get a pointer to a contiguous
region of data up to PAGE_SIZE. When using the scratch buffer, calls to
xdr_commit_encode() shift the data to it's proper alignment in the xdr
buffer. If we've reserved several pages in a vector, then this could
potentially invalidate earlier pointers and result in incorrect READ
data being sent to the client.

I get around this by looking at the amount of space left in the current
page, and never reserve more than that for each entry in the read
vector. This lets us place data directly where it needs to go in the
buffer pages.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:27 -04:00
Hou Tao
3caf91757c nfsd: rename delegation related tracepoints to make them less confusing
Now when a read delegation is given, two delegation related traces
will be printed:

    nfsd_deleg_open: client 5f45b854:e6058001 stateid 00000030:00000001
    nfsd_deleg_none: client 5f45b854:e6058001 stateid 0000002f:00000001

Although the intention is to let developers know two stateid are
returned, the traces are confusing about whether or not a read delegation
is handled out. So renaming trace_nfsd_deleg_none() to trace_nfsd_open()
and trace_nfsd_deleg_open() to trace_nfsd_deleg_read() to make
the intension clearer.

The patched traces will be:

    nfsd_deleg_read: client 5f48a967:b55b21cd stateid 00000003:00000001
    nfsd_open: client 5f48a967:b55b21cd stateid 00000002:00000001

Suggested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:27 -04:00
Alex Dewar
e2a1840e56 nfsd: Remove unnecessary assignment in nfs4xdr.c
In nfsd4_encode_listxattrs(), the variable p is assigned to at one point
but this value is never used before p is reassigned. Fix this.

Addresses-Coverity: ("Unused value")
Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:27 -04:00
Alex Dewar
4cce11fa48 nfsd: Fix typo in comment
Missing "is".

Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:26 -04:00
J. Bruce Fields
12ed22f3c3 nfsd: give up callbacks on revoked delegations
The delegation is no longer returnable, so I don't think there's much
point retrying the recall.

(I think it's worth asking why we even need separate CLOSED_DELEG and
REVOKED_DELEG states.  But treating them the same would currently cause
nfsd4_free_stateid to call list_del_init(&dp->dl_recall_lru) on a
delegation that the laundromat had unhashed but not revoked, incorrectly
removing it from the laundromat's reaplist or a client's dl_recall_lru.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:26 -04:00
J. Bruce Fields
e56dc9e294 nfsd: remove fault injection code
It was an interesting idea but nobody seems to be using it, it's buggy
at this point, and nfs4state.c is already complicated enough without it.
The new nfsd/clients/ code provides some of the same functionality, and
could probably do more if desired.

This feature has been deprecated since 9d60d93198 ("Deprecate nfsd
fault injection").

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-09-25 18:01:26 -04:00
Christoph Hellwig
fa01b1e973 block: add a bdev_is_partition helper
Add a littler helper to make the somewhat arcane bd_contains checks a
little more obvious.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 08:18:57 -06:00
Linus Torvalds
2ac69819ba Fixes:
- Eliminate an oops introduced in v5.8
 - Remove a duplicate #include added by nfsd-5.9
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAl8/4gUACgkQM2qzM29m
 f5cxKBAAp7UjD3YNlLhSviowuOfYpWNjyk1cEQ6hWFA9oVeSfZfU3/axW8uYTHPm
 QZ6ams6gjorP4CXwVkFGFHpTRg4CfVN9g5lKxrcjvELqNllWBhE9UupRgbX3+XBE
 qselRI22M64o2tfDE+tPrDB8w8PwHmqrHwRydXfgiFlHk7nt6xD7NitaJBnPlYPM
 21OBl6mrjLwtRwvX9n5wpy/+bfOTHbGV5VNez0fAfKXggNmRdt/UNROC4doLg4M0
 2khAV3vgx49FRpCPL6SZPcBYd6zfrYOcj3iSf6wpxS5nTb2MifXFqz1MvKRTj863
 gzvSmh7vuf0+EaOAXuLjCD9dURZpuG/k0vJGijOgaSt0+vNQHjIgZ1XRFHQtQCp4
 zPJ/Qyk5k7uajHzcBFuNPUFAkOovH6LRoOzpqGvXhwaxrMPWti0LyyVKidVJrt/d
 EtOKQR+HCN0zAwjadXSPK8Nw1PjMzplkF7TaxXvF2LdO/4vpEZZNoz+if59gRcFY
 65h2++7y+0MCX8l83uUZfs+jQU2aR1w5a0DjVzi86xzJtyhr6gEyTj3Z6L9HIHwW
 dnSpUmoiaCoN0eqxvEBjw0VEPqB806CuiUER0Jdd8k7mPk04fsQ/9+UsYyliSLEG
 N56LFSWLXLHsySa2WkuB/ghzT2/Q0vFoZKXW0KNSD7W4C5XMxi4=
 =czB3
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.9-1' of git://git.linux-nfs.org/projects/cel/cel-2.6

Pull nfs server fixes from Chuck Lever:

 - Eliminate an oops introduced in v5.8

 - Remove a duplicate #include added by nfsd-5.9

* tag 'nfsd-5.9-1' of git://git.linux-nfs.org/projects/cel/cel-2.6:
  SUNRPC: remove duplicate include
  nfsd: fix oops on mixed NFSv4/NFSv3 client access
2020-08-25 18:01:36 -07:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
J. Bruce Fields
34b09af4f5 nfsd: fix oops on mixed NFSv4/NFSv3 client access
If an NFSv2/v3 client breaks an NFSv4 client's delegation, it will hit a
NULL dereference in nfsd_breaker_owns_lease().

Easily reproduceable with for example

	mount -overs=4.2 server:/export /mnt/
	sleep 1h </mnt/file &
	mount -overs=3 server:/export /mnt2/
	touch /mnt2/file

Reported-by: Robert Dinse <nanook@eskimo.com>
Fixes: 28df3d1539 ("nfsd: clients don't need to break their own delegations")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208807
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-08-16 16:51:18 -04:00
Linus Torvalds
7a6b60441f Highlights:
- Support for user extended attributes on NFS (RFC 8276)
 - Further reduce unnecessary NFSv4 delegation recalls
 
 Notable fixes:
 
 - Fix recent krb5p regression
 - Address a few resource leaks and a rare NULL dereference
 
 Other:
 
 - De-duplicate RPC/RDMA error handling and other utility functions
 - Replace storage and display of kernel memory addresses by tracepoints
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAl8oBt0ACgkQM2qzM29m
 f5dTFQ/9H72E6gr1onsia0/Py0CO8F9qzLgmUBl1vVYAh2/vPqUL1ypxrC5OYrAy
 TOqESTsJvmGluCFc/77XUTD7NvJY3znIWim49okwDiyee4Y14ZfRhhCxyyA6Z94E
 FjJQb5TbF1Mti4X3dN8Gn7O1Y/BfTjDAAXnXGlTA1xoLcxM5idWIj+G8x0bPmeDb
 2fTbgsoETu6MpS2/L6mraXVh3d5ESOJH+73YvpBl0AhYPzlNASJZMLtHtd+A/JbO
 IPkMP/7UA5DuJtWGeuQ4I4D5bQNpNWMfN6zhwtih4IV5bkRC7vyAOLG1R7w9+Ufq
 58cxPiorMcsg1cHnXG0Z6WVtbMEdWTP/FzmJdE5RC7DEJhmmSUG/R0OmgDcsDZET
 GovPARho01yp80GwTjCIctDHRRFRL4pdPfr8PjVHetSnx9+zoRUT+D70Zeg/KSy2
 99gmCxqSY9BZeHoiVPEX/HbhXrkuDjUSshwl98OAzOFmv6kbwtLntgFbWlBdE6dB
 mqOxBb73zEoZ5P9GA2l2ShU3GbzMzDebHBb9EyomXHZrLejoXeUNA28VJ+8vPP5S
 IVHnEwOkdJrNe/7cH4jd/B0NR6f8Da/F9kmkLiG2GNPMqQ8bnVhxTUtZkcAE+fd4
 f34qLxsoht70wSSfISjBs7hP5KxEM1lOAf0w0RpycPUKJNV1FB0=
 =OEpF
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6

Pull NFS server updates from Chuck Lever:
 "Highlights:
   - Support for user extended attributes on NFS (RFC 8276)
   - Further reduce unnecessary NFSv4 delegation recalls

  Notable fixes:
   - Fix recent krb5p regression
   - Address a few resource leaks and a rare NULL dereference

  Other:
   - De-duplicate RPC/RDMA error handling and other utility functions
   - Replace storage and display of kernel memory addresses by tracepoints"

* tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits)
  svcrdma: CM event handler clean up
  svcrdma: Remove transport reference counting
  svcrdma: Fix another Receive buffer leak
  SUNRPC: Refresh the show_rqstp_flags() macro
  nfsd: netns.h: delete a duplicated word
  SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
  nfsd: avoid a NULL dereference in __cld_pipe_upcall()
  nfsd4: a client's own opens needn't prevent delegations
  nfsd: Use seq_putc() in two functions
  svcrdma: Display chunk completion ID when posting a rw_ctxt
  svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()
  svcrdma: Introduce Send completion IDs
  svcrdma: Record Receive completion ID in svc_rdma_decode_rqst
  svcrdma: Introduce Receive completion IDs
  svcrdma: Introduce infrastructure to support completion IDs
  svcrdma: Add common XDR encoders for RDMA and Read segments
  svcrdma: Add common XDR decoders for RDMA and Read segments
  SUNRPC: Add helpers for decoding list discriminators symbolically
  svcrdma: Remove declarations for functions long removed
  svcrdma: Clean up trace_svcrdma_send_failed() tracepoint
  ...
2020-08-09 13:58:04 -07:00
Linus Torvalds
eb65405eb6 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAl8qeCkACgkQnJ2qBz9k
 QNlAGQf/YVruyVLZ7kCv6EMCHauXm3K1lEGpbXsTW04HpStxGx7mtLGN/Au+EYJR
 VnRkCMt6TSMQGMBkNF83dUCwXHkeL1rd6frJBLVOErkg50nUuD4kjTVw9Lzw9itx
 CPhKnPPlsRkDkZPxkg3WEdqPgzJREWBZUaB38QUPjYN46q7HfPYDANTh5wI1GiGs
 27+PvzlttjhkQpQ14pYU/nu4xf/nmgmmHhgfsJArQP2EzYOrKxsWKhXS5uPdtNlf
 mXiZMaqW2AlyDGlw3myOEySrrSuaR77M2bzDo7mjqffI9wSVTytKEhtg0i8OMWmv
 pZ38OQobznnFoqzc1GL70IE0DEU48g==
 =d81d
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:

 - fanotify fix for softlockups when there are many queued events

 - performance improvement to reduce fsnotify overhead when not used

 - Amir's implementation of fanotify events with names. With these you
   can now efficiently monitor whole filesystem, eg to mirror changes to
   another machine.

* tag 'fsnotify_for_v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (37 commits)
  fanotify: compare fsid when merging name event
  fsnotify: create method handle_inode_event() in fsnotify_operations
  fanotify: report parent fid + child fid
  fanotify: report parent fid + name + child fid
  fanotify: add support for FAN_REPORT_NAME
  fanotify: report events with parent dir fid to sb/mount/non-dir marks
  fanotify: add basic support for FAN_REPORT_DIR_FID
  fsnotify: remove check that source dentry is positive
  fsnotify: send event with parent/name info to sb/mount/non-dir marks
  audit: do not set FS_EVENT_ON_CHILD in audit marks mask
  inotify: do not set FS_EVENT_ON_CHILD in non-dir mark mask
  fsnotify: pass dir and inode arguments to fsnotify()
  fsnotify: create helper fsnotify_inode()
  fsnotify: send event to parent and child with single callback
  inotify: report both events on parent and child with single callback
  dnotify: report both events on parent and child with single callback
  fanotify: no external fh buffer in fanotify_name_event
  fanotify: use struct fanotify_info to parcel the variable size buffer
  fsnotify: add object type "child" to object type iterator
  fanotify: use FAN_EVENT_ON_CHILD as implicit flag on sb/mount/non-dir marks
  ...
2020-08-06 19:29:51 -07:00
Linus Torvalds
99ea1521a0 Remove uninitialized_var() macro for v5.9-rc1
- Clean up non-trivial uses of uninitialized_var()
 - Update documentation and checkpatch for uninitialized_var() removal
 - Treewide removal of uninitialized_var()
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oYLQWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsfjEACvf0D3WL3H7sLHtZ2HeMwOgAzq
 il08t6vUscINQwiIIK3Be43ok3uQ1Q+bj8sr2gSYTwunV2IYHFferzgzhyMMno3o
 XBIGd1E+v1E4DGBOiRXJvacBivKrfvrdZ7AWiGlVBKfg2E0fL1aQbe9AYJ6eJSbp
 UGqkBkE207dugS5SQcwrlk1tWKUL089lhDAPd7iy/5RK76OsLRCJFzIerLHF2ZK2
 BwvA+NWXVQI6pNZ0aRtEtbbxwEU4X+2J/uaXH5kJDszMwRrgBT2qoedVu5LXFPi8
 +B84IzM2lii1HAFbrFlRyL/EMueVFzieN40EOB6O8wt60Y4iCy5wOUzAdZwFuSTI
 h0xT3JI8BWtpB3W+ryas9cl9GoOHHtPA8dShuV+Y+Q2bWe1Fs6kTl2Z4m4zKq56z
 63wQCdveFOkqiCLZb8s6FhnS11wKtAX4czvXRXaUPgdVQS1Ibyba851CRHIEY+9I
 AbtogoPN8FXzLsJn7pIxHR4ADz+eZ0dQ18f2hhQpP6/co65bYizNP5H3h+t9hGHG
 k3r2k8T+jpFPaddpZMvRvIVD8O2HvJZQTyY6Vvneuv6pnQWtr2DqPFn2YooRnzoa
 dbBMtpon+vYz6OWokC5QNWLqHWqvY9TmMfcVFUXE4AFse8vh4wJ8jJCNOFVp8On+
 drhmmImUr1YylrtVOw==
 =xHmk
 -----END PGP SIGNATURE-----

Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull uninitialized_var() macro removal from Kees Cook:
 "This is long overdue, and has hidden too many bugs over the years. The
  series has several "by hand" fixes, and then a trivial treewide
  replacement.

   - Clean up non-trivial uses of uninitialized_var()

   - Update documentation and checkpatch for uninitialized_var() removal

   - Treewide removal of uninitialized_var()"

* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  compiler: Remove uninitialized_var() macro
  treewide: Remove uninitialized_var() usage
  checkpatch: Remove awareness of uninitialized_var() macro
  mm/debug_vm_pgtable: Remove uninitialized_var() usage
  f2fs: Eliminate usage of uninitialized_var() macro
  media: sur40: Remove uninitialized_var() usage
  KVM: PPC: Book3S PR: Remove uninitialized_var() usage
  clk: spear: Remove uninitialized_var() usage
  clk: st: Remove uninitialized_var() usage
  spi: davinci: Remove uninitialized_var() usage
  ide: Remove uninitialized_var() usage
  rtlwifi: rtl8192cu: Remove uninitialized_var() usage
  b43: Remove uninitialized_var() usage
  drbd: Remove uninitialized_var() usage
  x86/mm/numa: Remove uninitialized_var() usage
  docs: deprecated.rst: Add uninitialized_var()
2020-08-04 13:49:43 -07:00
Amir Goldstein
b9a1b97725 fsnotify: create method handle_inode_event() in fsnotify_operations
The method handle_event() grew a lot of complexity due to the design of
fanotify and merging of ignore masks.

Most backends do not care about this complex functionality, so we can hide
this complexity from them.

Introduce a method handle_inode_event() that serves those backends and
passes a single inode mark and less arguments.

This change converts all backends except fanotify and inotify to use the
simplified handle_inode_event() method.  In pricipal, inotify could have
also used the new method, but that would require passing more arguments
on the simple helper (data, data_type, cookie), so we leave it with the
handle_event() method.

Link: https://lore.kernel.org/r/20200722125849.17418-9-amir73il@gmail.com
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27 23:25:50 +02:00
Amir Goldstein
b54cecf5e2 fsnotify: pass dir argument to handle_event() callback
The 'inode' argument to handle_event(), sometimes referred to as
'to_tell' is somewhat obsolete.
It is a remnant from the times when a group could only have an inode mark
associated with an event.

We now pass an iter_info array to the callback, with all marks associated
with an event.

Most backends ignore this argument, with two exceptions:
1. dnotify uses it for sanity check that event is on directory
2. fanotify uses it to report fid of directory on directory entry
   modification events

Remove the 'inode' argument and add a 'dir' argument.
The callback function signature is deliberately changed, because
the meaning of the argument has changed and the arguments have
been documented.

The 'dir' argument is set to when 'file_name' is specified and it is
referring to the directory that the 'file_name' entry belongs to.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-27 18:32:47 +02:00
Randy Dunlap
94a4beaa6b nfsd: netns.h: delete a duplicated word
Drop the repeated word "the" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-24 17:25:13 -04:00
J. Bruce Fields
9affa43581 nfsd4: fix NULL dereference in nfsd/clients display code
We hold the cl_lock here, and that's enough to keep stateid's from going
away, but it's not enough to prevent the files they point to from going
away.  Take fi_lock and a reference and check for NULL, as we do in
other code.

Reported-by: NeilBrown <neilb@suse.de>
Fixes: 78599c42ae ("nfsd4: add file to display list of client's opens")
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-07-22 16:47:14 -04:00
Kees Cook
3f649ab728 treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:35:15 -07:00
Amir Goldstein
9a02aa40dd nfsd: use fsnotify_data_inode() to get the unlinked inode
The inode argument to handle_event() is about to become obsolete.

Link: https://lore.kernel.org/r/20200708111156.24659-4-amir73il@gmail.com
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2020-07-15 17:36:47 +02:00
Scott Mayhew
df60446cd1 nfsd: avoid a NULL dereference in __cld_pipe_upcall()
If the rpc_pipefs is unmounted, then the rpc_pipe->dentry becomes NULL
and dereferencing the dentry->d_sb will trigger an oops.  The only
reason we're doing that is to determine the nfsd_net, which could
instead be passed in by the caller.  So do that instead.

Fixes: 11a60d1592 ("nfsd: add a "GetVersion" upcall for nfsdcld")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:28:46 -04:00
J. Bruce Fields
94415b06eb nfsd4: a client's own opens needn't prevent delegations
We recently fixed lease breaking so that a client's actions won't break
its own delegations.

But we still have an unnecessary self-conflict when granting
delegations: a client's own write opens will prevent us from handing out
a read delegation even when no other client has the file open for write.

Fix that by turning off the checks for conflicting opens under
vfs_setlease, and instead performing those checks in the nfsd code.

We don't depend much on locks here: instead we acquire the delegation,
then check for conflicts, and drop the delegation again if we find any.

The check beforehand is an optimization of sorts, just to avoid
acquiring the delegation unnecessarily.  There's a race where the first
check could cause us to deny the delegation when we could have granted
it.  But, that's OK, delegation grants are optional (and probably not
even a good idea in that case).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:28:46 -04:00
Xu Wang
0b7cd9d9ca nfsd: Use seq_putc() in two functions
A single character (line break) should be put into a sequence.
Thus use the corresponding function "seq_putc()".

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:28:46 -04:00
Frank van der Linden
0e885e846d nfsd: add fattr support for user extended attributes
Check if user extended attributes are supported for an inode,
and return the answer when being queried for file attributes.

An exported filesystem can now signal its RFC8276 user extended
attributes capability.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:27:03 -04:00
Frank van der Linden
23e50fe3a5 nfsd: implement the xattr functions and en/decode logic
Implement the main entry points for the *XATTR operations.

Add functions to calculate the reply size for the user extended attribute
operations, and implement the XDR encode / decode logic for these
operations.

Add the user extended attributes operations to nfsd4_ops.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:27:03 -04:00
Frank van der Linden
6178713bd4 nfsd: add structure definitions for xattr requests / responses
Add the structures used in extended attribute request / response handling.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:27:03 -04:00
Frank van der Linden
c11d7fd1b3 nfsd: take xattr bits into account for permission checks
Since the NFSv4.2 extended attributes extension defines 3 new access
bits for xattr operations, take them in to account when validating
what the client is asking for, and when checking permissions.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:27:03 -04:00
Frank van der Linden
32119446bb nfsd: define xattr functions to call into their vfs counterparts
This adds the filehandle based functions for the xattr operations
that call in to the vfs layer to do the actual work.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
[ cel: address checkpatch.pl complaint ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:27:03 -04:00
Frank van der Linden
4dd05fceb7 nfsd: add defines for NFSv4.2 extended attribute support
Add defines for server-side extended attribute support. Most have
already been added as part of client support, but these are
the network order error codes for the noxattr and xattr2big errors,
and the addition of the xattr support to the supported file
attributes (if configured).

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:27:03 -04:00
Frank van der Linden
874c7b8ea5 nfsd: split off the write decode code into a separate function
nfs4_decode_write has code to parse incoming XDR write data in to
a kvec head, and a list of pages.

Put this code in to a separate function, so that it can be used
later by the xattr code, for setxattr. No functional change.

Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-13 17:27:03 -04:00
J. Bruce Fields
bf2654017e nfsd: fix nfsdfs inode reference count leak
I don't understand this code well, but  I'm seeing a warning about a
still-referenced inode on unmount, and every other similar filesystem
does a dput() here.

Fixes: e8a79fb14f ("nfsd: add nfsd/clients directory")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-06-29 14:48:28 -04:00
J. Bruce Fields
681370f4b0 nfsd4: fix nfsdfs reference count loop
We don't drop the reference on the nfsdfs filesystem with
mntput(nn->nfsd_mnt) until nfsd_exit_net(), but that won't be called
until the nfsd module's unloaded, and we can't unload the module as long
as there's a reference on nfsdfs.  So this prevents module unloading.

Fixes: 2c830dd720 ("nfsd: persist nfsd filesystem across mounts")
Reported-and-Tested-by:  Luo Xiaogang <lxgrxd@163.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-06-29 14:48:02 -04:00
J. Bruce Fields
22cf8419f1 nfsd: apply umask on fs without ACL support
The server is failing to apply the umask when creating new objects on
filesystems without ACL support.

To reproduce this, you need to use NFSv4.2 and a client and server
recent enough to support umask, and you need to export a filesystem that
lacks ACL support (for example, ext4 with the "noacl" mount option).

Filesystems with ACL support are expected to take care of the umask
themselves (usually by calling posix_acl_create).

For filesystems without ACL support, this is up to the caller of
vfs_create(), vfs_mknod(), or vfs_mkdir().

Reported-by: Elliott Mitchell <ehem+debian@m5p.com>
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Fixes: 47057abde5 ("nfsd: add support for the umask attribute")
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-06-17 10:48:58 -04:00
Linus Torvalds
c742b63473 Highlights:
- Keep nfsd clients from unnecessarily breaking their own delegations:
   Note this requires a small kthreadd addition, discussed at:
   https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com
   The result is Tejun Heo's suggestion, and he was OK with this going
   through my tree.
 - Patch nfsd/clients/ to display filenames, and to fix byte-order when
   displaying stateid's.
 - fix a module loading/unloading bug, from Neil Brown.
 - A big series from Chuck Lever with RPC/RDMA and tracing improvements,
   and lay some groundwork for RPC-over-TLS.
 
 Note Stephen Rothwell spotted two conflicts in linux-next.  Both should
 be straightforward:
 	include/trace/events/sunrpc.h
 		https://lore.kernel.org/r/20200529105917.50dfc40f@canb.auug.org.au
 	net/sunrpc/svcsock.c
 		https://lore.kernel.org/r/20200529131955.26c421db@canb.auug.org.au
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl7iRYwVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+yx8QALIfyz/ziPgjGBnNJGCW8BjWHz7+
 rGI+1SP2EUpgJ0fGJc9MpGyYTa5T3pTgsENnIRtegyZDISg2OQ5GfifpkTz4U7vg
 QbWRihs/W9EhltVYhKvtLASAuSAJ8ETbDfLXVb2ncY7iO6JNvb22xwsgKZILmzm1
 uG4qSszmBZzpMUUy51kKJYJZ3ysP+v14qOnyOXEoeEMuJYNK9FkQ9bSPZ6wTJNOn
 hvZBMbU7LzRyVIvp358mFHY+vwq5qBNkJfVrZBkURGn4OxWPbWDXzqOi0Zs1oBjA
 L+QODIbTLGkopu/rD0r1b872PDtket7p5zsD8MreeI1vJOlt3xwqdCGlicIeNATI
 b0RG7sqh+pNv0mvwLxSNTf3rO0EKW6tUySqCnQZUAXFGRH0nYM2TWze4HUr2zfWT
 EgRMwxHY/AZUStZBuCIHPJ6inWnKuxSUELMf2a9JHO1BJc/yClRgmwJGdthVwb9u
 GP6F3/maFu+9YOO6iROMsqtxDA+q5vch5IBzevNOOBDEQDKqENmogR/knl9DmAhF
 sr+FOa3O0u6S4tgXw/TU97JS/h1L2Hu6QVEwU2iVzWtlUUOFVMZQODJTB6Lts4Ka
 gKzYXWvCHN+LyETsN6q7uHFg9mtO7xO5vrrIgo72SuVCscDw/8iHkoOOFLief+GE
 O0fR0IYjW8U1Rkn2
 =YEf0
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Highlights:

   - Keep nfsd clients from unnecessarily breaking their own
     delegations.

     Note this requires a small kthreadd addition. The result is Tejun
     Heo's suggestion (see link), and he was OK with this going through
     my tree.

   - Patch nfsd/clients/ to display filenames, and to fix byte-order
     when displaying stateid's.

   - fix a module loading/unloading bug, from Neil Brown.

   - A big series from Chuck Lever with RPC/RDMA and tracing
     improvements, and lay some groundwork for RPC-over-TLS"

Link: https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com

* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
  sunrpc: use kmemdup_nul() in gssp_stringify()
  nfsd: safer handling of corrupted c_type
  nfsd4: make drc_slab global, not per-net
  SUNRPC: Remove unreachable error condition in rpcb_getport_async()
  nfsd: Fix svc_xprt refcnt leak when setup callback client failed
  sunrpc: clean up properly in gss_mech_unregister()
  sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
  sunrpc: check that domain table is empty at module unload.
  NFSD: Fix improperly-formatted Doxygen comments
  NFSD: Squash an annoying compiler warning
  SUNRPC: Clean up request deferral tracepoints
  NFSD: Add tracepoints for monitoring NFSD callbacks
  NFSD: Add tracepoints to the NFSD state management code
  NFSD: Add tracepoints to NFSD's duplicate reply cache
  SUNRPC: svc_show_status() macro should have enum definitions
  SUNRPC: Restructure svc_udp_recvfrom()
  SUNRPC: Refactor svc_recvfrom()
  SUNRPC: Clean up svc_release_skb() functions
  SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
  SUNRPC: Replace dprintk() call sites in TCP receive path
  ...
2020-06-11 10:33:13 -07:00
J. Bruce Fields
c25bf185e5 nfsd: safer handling of corrupted c_type
This can only happen if there's a bug somewhere, so let's make it a WARN
not a printk.  Also, I think it's safest to ignore the corruption rather
than trying to fix it by removing a cache entry.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-06-03 11:12:32 -04:00
NeilBrown
a37b0715dd mm/writeback: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE
PF_LESS_THROTTLE exists for loop-back nfsd (and a similar need in the
loop block driver and callers of prctl(PR_SET_IO_FLUSHER)), where a
daemon needs to write to one bdi (the final bdi) in order to free up
writes queued to another bdi (the client bdi).

The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty
pages, so that it can still dirty pages after other processses have been
throttled.  The purpose of this is to avoid deadlock that happen when
the PF_LESS_THROTTLE process must write for any dirty pages to be freed,
but it is being thottled and cannot write.

This approach was designed when all threads were blocked equally,
independently on which device they were writing to, or how fast it was.
Since that time the writeback algorithm has changed substantially with
different threads getting different allowances based on non-trivial
heuristics.  This means the simple "add 25%" heuristic is no longer
reliable.

The important issue is not that the daemon needs a *larger* dirty page
allowance, but that it needs a *private* dirty page allowance, so that
dirty pages for the "client" bdi that it is helping to clear (the bdi
for an NFS filesystem or loop block device etc) do not affect the
throttling of the daemon writing to the "final" bdi.

This patch changes the heuristic so that the task is not throttled when
the bdi it is writing to has a dirty page count below below (or equal
to) the free-run threshold for that bdi.  This ensures it will always be
able to have some pages in flight, and so will not deadlock.

In a steady-state, it is expected that PF_LOCAL_THROTTLE tasks might
still be throttled by global threshold, but that is acceptable as it is
only the deadlock state that is interesting for this flag.

This approach of "only throttle when target bdi is busy" is consistent
with the other use of PF_LESS_THROTTLE in current_may_throttle(), were
it causes attention to be focussed only on the target bdi.

So this patch
 - renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE,
 - removes the 25% bonus that that flag gives, and
 - If PF_LOCAL_THROTTLE is set, don't delay at all unless the
   global and the local free-run thresholds are exceeded.

Note that previously realtime threads were treated the same as
PF_LESS_THROTTLE threads.  This patch does *not* change the behvaiour
for real-time threads, so it is now different from the behaviour of nfsd
and loop tasks.  I don't know what is wanted for realtime.

[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Chuck Lever <chuck.lever@oracle.com>	[nfsd]
Cc: Christoph Hellwig <hch@lst.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Link: http://lkml.kernel.org/r/87ftbf7gs3.fsf@notabene.neil.brown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:08 -07:00
J. Bruce Fields
027690c75e nfsd4: make drc_slab global, not per-net
I made every global per-network-namespace instead.  But perhaps doing
that to this slab was a step too far.

The kmem_cache_create call in our net init method also seems to be
responsible for this lockdep warning:

[   45.163710] Unable to find swap-space signature
[   45.375718] trinity-c1 (855): attempted to duplicate a private mapping with mremap.  This is not supported.
[   46.055744] futex_wake_op: trinity-c1 tries to shift op by -209; fix this program
[   51.011723]
[   51.013378] ======================================================
[   51.013875] WARNING: possible circular locking dependency detected
[   51.014378] 5.2.0-rc2 #1 Not tainted
[   51.014672] ------------------------------------------------------
[   51.015182] trinity-c2/886 is trying to acquire lock:
[   51.015593] 000000005405f099 (slab_mutex){+.+.}, at: slab_attr_store+0xa2/0x130
[   51.016190]
[   51.016190] but task is already holding lock:
[   51.016652] 00000000ac662005 (kn->count#43){++++}, at: kernfs_fop_write+0x286/0x500
[   51.017266]
[   51.017266] which lock already depends on the new lock.
[   51.017266]
[   51.017909]
[   51.017909] the existing dependency chain (in reverse order) is:
[   51.018497]
[   51.018497] -> #1 (kn->count#43){++++}:
[   51.018956]        __lock_acquire+0x7cf/0x1a20
[   51.019317]        lock_acquire+0x17d/0x390
[   51.019658]        __kernfs_remove+0x892/0xae0
[   51.020020]        kernfs_remove_by_name_ns+0x78/0x110
[   51.020435]        sysfs_remove_link+0x55/0xb0
[   51.020832]        sysfs_slab_add+0xc1/0x3e0
[   51.021332]        __kmem_cache_create+0x155/0x200
[   51.021720]        create_cache+0xf5/0x320
[   51.022054]        kmem_cache_create_usercopy+0x179/0x320
[   51.022486]        kmem_cache_create+0x1a/0x30
[   51.022867]        nfsd_reply_cache_init+0x278/0x560
[   51.023266]        nfsd_init_net+0x20f/0x5e0
[   51.023623]        ops_init+0xcb/0x4b0
[   51.023928]        setup_net+0x2fe/0x670
[   51.024315]        copy_net_ns+0x30a/0x3f0
[   51.024653]        create_new_namespaces+0x3c5/0x820
[   51.025257]        unshare_nsproxy_namespaces+0xd1/0x240
[   51.025881]        ksys_unshare+0x506/0x9c0
[   51.026381]        __x64_sys_unshare+0x3a/0x50
[   51.026937]        do_syscall_64+0x110/0x10b0
[   51.027509]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   51.028175]
[   51.028175] -> #0 (slab_mutex){+.+.}:
[   51.028817]        validate_chain+0x1c51/0x2cc0
[   51.029422]        __lock_acquire+0x7cf/0x1a20
[   51.029947]        lock_acquire+0x17d/0x390
[   51.030438]        __mutex_lock+0x100/0xfa0
[   51.030995]        mutex_lock_nested+0x27/0x30
[   51.031516]        slab_attr_store+0xa2/0x130
[   51.032020]        sysfs_kf_write+0x11d/0x180
[   51.032529]        kernfs_fop_write+0x32a/0x500
[   51.033056]        do_loop_readv_writev+0x21d/0x310
[   51.033627]        do_iter_write+0x2e5/0x380
[   51.034148]        vfs_writev+0x170/0x310
[   51.034616]        do_pwritev+0x13e/0x160
[   51.035100]        __x64_sys_pwritev+0xa3/0x110
[   51.035633]        do_syscall_64+0x110/0x10b0
[   51.036200]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   51.036924]
[   51.036924] other info that might help us debug this:
[   51.036924]
[   51.037876]  Possible unsafe locking scenario:
[   51.037876]
[   51.038556]        CPU0                    CPU1
[   51.039130]        ----                    ----
[   51.039676]   lock(kn->count#43);
[   51.040084]                                lock(slab_mutex);
[   51.040597]                                lock(kn->count#43);
[   51.041062]   lock(slab_mutex);
[   51.041320]
[   51.041320]  *** DEADLOCK ***
[   51.041320]
[   51.041793] 3 locks held by trinity-c2/886:
[   51.042128]  #0: 000000001f55e152 (sb_writers#5){.+.+}, at: vfs_writev+0x2b9/0x310
[   51.042739]  #1: 00000000c7d6c034 (&of->mutex){+.+.}, at: kernfs_fop_write+0x25b/0x500
[   51.043400]  #2: 00000000ac662005 (kn->count#43){++++}, at: kernfs_fop_write+0x286/0x500

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 3ba75830ce "drc containerization"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-06-01 17:44:45 -04:00
Linus Torvalds
81e8c10dac Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Introduce crypto_shash_tfm_digest() and use it wherever possible.
   - Fix use-after-free and race in crypto_spawn_alg.
   - Add support for parallel and batch requests to crypto_engine.

  Algorithms:
   - Update jitter RNG for SP800-90B compliance.
   - Always use jitter RNG as seed in drbg.

  Drivers:
   - Add Arm CryptoCell driver cctrng.
   - Add support for SEV-ES to the PSP driver in ccp"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (114 commits)
  crypto: hisilicon - fix driver compatibility issue with different versions of devices
  crypto: engine - do not requeue in case of fatal error
  crypto: cavium/nitrox - Fix a typo in a comment
  crypto: hisilicon/qm - change debugfs file name from qm_regs to regs
  crypto: hisilicon/qm - add DebugFS for xQC and xQE dump
  crypto: hisilicon/zip - add debugfs for Hisilicon ZIP
  crypto: hisilicon/hpre - add debugfs for Hisilicon HPRE
  crypto: hisilicon/sec2 - add debugfs for Hisilicon SEC
  crypto: hisilicon/qm - add debugfs to the QM state machine
  crypto: hisilicon/qm - add debugfs for QM
  crypto: stm32/crc32 - protect from concurrent accesses
  crypto: stm32/crc32 - don't sleep in runtime pm
  crypto: stm32/crc32 - fix multi-instance
  crypto: stm32/crc32 - fix run-time self test issue.
  crypto: stm32/crc32 - fix ext4 chksum BUG_ON()
  crypto: hisilicon/zip - Use temporary sqe when doing work
  crypto: hisilicon - add device error report through abnormal irq
  crypto: hisilicon - remove codes of directly report device errors through MSI
  crypto: hisilicon - QM memory management optimization
  crypto: hisilicon - unify initial value assignment into QM
  ...
2020-06-01 12:00:10 -07:00
Xiyu Yang
a4abc6b12e nfsd: Fix svc_xprt refcnt leak when setup callback client failed
nfsd4_process_cb_update() invokes svc_xprt_get(), which increases the
refcount of the "c->cn_xprt".

The reference counting issue happens in one exception handling path of
nfsd4_process_cb_update(). When setup callback client failed, the
function forgets to decrease the refcnt increased by svc_xprt_get(),
causing a refcnt leak.

Fix this issue by calling svc_xprt_put() when setup callback client
failed.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-28 18:15:00 -04:00
J. Bruce Fields
6670ee2ef2 Merge branch 'nfsd-5.8' of git://linux-nfs.org/~cel/cel-2.6 into for-5.8-incoming
Highlights of this series:
* Remove serialization of sending RPC/RDMA Replies
* Convert the TCP socket send path to use xdr_buf::bvecs (pre-requisite for
RPC-on-TLS)
* Fix svcrdma backchannel sendto return code
* Convert a number of dprintk call sites to use tracepoints
* Fix the "suggest braces around empty body in an 'else' statement" warning
2020-05-21 10:58:15 -04:00
Chuck Lever
f2453978a4 NFSD: Fix improperly-formatted Doxygen comments
fs/nfsd/nfsctl.c:256: warning: Function parameter or member 'file' not described in 'write_unlock_ip'
fs/nfsd/nfsctl.c:256: warning: Function parameter or member 'buf' not described in 'write_unlock_ip'
fs/nfsd/nfsctl.c:256: warning: Function parameter or member 'size' not described in 'write_unlock_ip'
fs/nfsd/nfsctl.c:295: warning: Function parameter or member 'file' not described in 'write_unlock_fs'
fs/nfsd/nfsctl.c:295: warning: Function parameter or member 'buf' not described in 'write_unlock_fs'
fs/nfsd/nfsctl.c:295: warning: Function parameter or member 'size' not described in 'write_unlock_fs'
fs/nfsd/nfsctl.c:352: warning: Function parameter or member 'file' not described in 'write_filehandle'
fs/nfsd/nfsctl.c:352: warning: Function parameter or member 'buf' not described in 'write_filehandle'
fs/nfsd/nfsctl.c:352: warning: Function parameter or member 'size' not described in 'write_filehandle'
fs/nfsd/nfsctl.c:434: warning: Function parameter or member 'file' not described in 'write_threads'
fs/nfsd/nfsctl.c:434: warning: Function parameter or member 'buf' not described in 'write_threads'
fs/nfsd/nfsctl.c:434: warning: Function parameter or member 'size' not described in 'write_threads'
fs/nfsd/nfsctl.c:478: warning: Function parameter or member 'file' not described in 'write_pool_threads'
fs/nfsd/nfsctl.c:478: warning: Function parameter or member 'buf' not described in 'write_pool_threads'
fs/nfsd/nfsctl.c:478: warning: Function parameter or member 'size' not described in 'write_pool_threads'
fs/nfsd/nfsctl.c:697: warning: Function parameter or member 'file' not described in 'write_versions'
fs/nfsd/nfsctl.c:697: warning: Function parameter or member 'buf' not described in 'write_versions'
fs/nfsd/nfsctl.c:697: warning: Function parameter or member 'size' not described in 'write_versions'
fs/nfsd/nfsctl.c:858: warning: Function parameter or member 'file' not described in 'write_ports'
fs/nfsd/nfsctl.c:858: warning: Function parameter or member 'buf' not described in 'write_ports'
fs/nfsd/nfsctl.c:858: warning: Function parameter or member 'size' not described in 'write_ports'
fs/nfsd/nfsctl.c:892: warning: Function parameter or member 'file' not described in 'write_maxblksize'
fs/nfsd/nfsctl.c:892: warning: Function parameter or member 'buf' not described in 'write_maxblksize'
fs/nfsd/nfsctl.c:892: warning: Function parameter or member 'size' not described in 'write_maxblksize'
fs/nfsd/nfsctl.c:941: warning: Function parameter or member 'file' not described in 'write_maxconn'
fs/nfsd/nfsctl.c:941: warning: Function parameter or member 'buf' not described in 'write_maxconn'
fs/nfsd/nfsctl.c:941: warning: Function parameter or member 'size' not described in 'write_maxconn'
fs/nfsd/nfsctl.c:1023: warning: Function parameter or member 'file' not described in 'write_leasetime'
fs/nfsd/nfsctl.c:1023: warning: Function parameter or member 'buf' not described in 'write_leasetime'
fs/nfsd/nfsctl.c:1023: warning: Function parameter or member 'size' not described in 'write_leasetime'
fs/nfsd/nfsctl.c:1039: warning: Function parameter or member 'file' not described in 'write_gracetime'
fs/nfsd/nfsctl.c:1039: warning: Function parameter or member 'buf' not described in 'write_gracetime'
fs/nfsd/nfsctl.c:1039: warning: Function parameter or member 'size' not described in 'write_gracetime'
fs/nfsd/nfsctl.c:1094: warning: Function parameter or member 'file' not described in 'write_recoverydir'
fs/nfsd/nfsctl.c:1094: warning: Function parameter or member 'buf' not described in 'write_recoverydir'
fs/nfsd/nfsctl.c:1094: warning: Function parameter or member 'size' not described in 'write_recoverydir'
fs/nfsd/nfsctl.c:1125: warning: Function parameter or member 'file' not described in 'write_v4_end_grace'
fs/nfsd/nfsctl.c:1125: warning: Function parameter or member 'buf' not described in 'write_v4_end_grace'
fs/nfsd/nfsctl.c:1125: warning: Function parameter or member 'size' not described in 'write_v4_end_grace'

fs/nfsd/nfs4proc.c:1164: warning: Function parameter or member 'nss' not described in 'nfsd4_interssc_connect'
fs/nfsd/nfs4proc.c:1164: warning: Function parameter or member 'rqstp' not described in 'nfsd4_interssc_connect'
fs/nfsd/nfs4proc.c:1164: warning: Function parameter or member 'mount' not described in 'nfsd4_interssc_connect'
fs/nfsd/nfs4proc.c:1262: warning: Function parameter or member 'rqstp' not described in 'nfsd4_setup_inter_ssc'
fs/nfsd/nfs4proc.c:1262: warning: Function parameter or member 'cstate' not described in 'nfsd4_setup_inter_ssc'
fs/nfsd/nfs4proc.c:1262: warning: Function parameter or member 'copy' not described in 'nfsd4_setup_inter_ssc'
fs/nfsd/nfs4proc.c:1262: warning: Function parameter or member 'mount' not described in 'nfsd4_setup_inter_ssc'

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-20 17:30:44 -04:00
Chuck Lever
45f56da822 NFSD: Squash an annoying compiler warning
Clean up: Fix gcc empty-body warning when -Wextra is used.

../fs/nfsd/nfs4state.c:3898:3: warning: suggest braces around empty body in an ‘else’ statement [-Wempty-body]

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-20 17:30:44 -04:00
Chuck Lever
1eace0d1e9 NFSD: Add tracepoints for monitoring NFSD callbacks
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-20 17:30:44 -04:00
Chuck Lever
dd5e3fbc1f NFSD: Add tracepoints to the NFSD state management code
Capture obvious events and replace dprintk() call sites. Introduce
infrastructure so that adding more tracepoints in this code later
is simplified.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-20 17:30:44 -04:00
Chuck Lever
0b175b1864 NFSD: Add tracepoints to NFSD's duplicate reply cache
Try to capture DRC failures.

Two additional clean-ups:
- Introduce Doxygen-style comments for the main entry points
- Remove a dprintk that fires for an allocation failure. This was
  the only dprintk in the REPCACHE class.

Reported-by: kbuild test robot <lkp@intel.com>
[ cel: force typecast for display of checksum values ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-05-20 17:30:34 -04:00
Ma Feng
44fb26c6b4 nfsd: Fix old-style function definition
Fix warning:

fs/nfsd/nfssvc.c:604:6: warning: old-style function definition [-Wold-style-definition]
 bool i_am_nfsd()
      ^~~~~~~~~

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ma Feng <mafeng.ma@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-11 09:16:14 -04:00
J. Bruce Fields
28df3d1539 nfsd: clients don't need to break their own delegations
We currently revoke read delegations on any write open or any operation
that modifies file data or metadata (including rename, link, and
unlink).  But if the delegation in question is the only read delegation
and is held by the client performing the operation, that's not really
necessary.

It's not always possible to prevent this in the NFSv4.0 case, because
there's not always a way to determine which client an NFSv4.0 delegation
came from.  (In theory we could try to guess this from the transport
layer, e.g., by assuming all traffic on a given TCP connection comes
from the same client.  But that's not really correct.)

In the NFSv4.1 case the session layer always tells us the client.

This patch should remove such self-conflicts in all cases where we can
reliably determine the client from the compound.

To do that we need to track "who" is performing a given (possibly
lease-breaking) file operation.  We're doing that by storing the
information in the svc_rqst and using kthread_data() to map the current
task back to a svc_rqst.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-08 21:23:10 -04:00
Eric Biggers
ea794db264 nfsd: use crypto_shash_tfm_digest()
Instead of manually allocating a 'struct shash_desc' on the stack and
calling crypto_shash_digest(), switch to using the new helper function
crypto_shash_tfm_digest() which does this for us.

Cc: linux-nfs@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-05-08 15:32:15 +10:00
J. Bruce Fields
c2d715a1af nfsd: handle repeated BIND_CONN_TO_SESSION
If the client attempts BIND_CONN_TO_SESSION on an already bound
connection, it should be either a no-op or an error.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06 15:59:23 -04:00
Achilles Gaikwad
580da465a0 nfsd4: add filename to states output
Add filename to states output for ease of debugging.

Signed-off-by: Achilles Gaikwad <agaikwad@redhat.com>
Signed-off-by: Kenneth Dsouza <kdsouza@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06 15:59:23 -04:00
J. Bruce Fields
ee590d2597 nfsd4: stid display should preserve on-the-wire byte order
When we decode the stateid we byte-swap si_generation.

But for simplicity's sake and ease of comparison with network traces,
it's better to display the whole thing in network order.

Reported-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06 15:59:23 -04:00
J. Bruce Fields
ace7ade4f5 nfsd4: common stateid-printing code
There's a problem with how I'm formatting stateids.  Before I fix it,
I'd like to move the stateid formatting into a common helper.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-05-06 15:59:23 -04:00
Chuck Lever
6221f1d9b6 SUNRPC: Fix backchannel RPC soft lockups
Currently, after the forward channel connection goes away,
backchannel operations are causing soft lockups on the server
because call_transmit_status's SOFTCONN logic ignores ENOTCONN.
Such backchannel Calls are aggressively retried until the client
reconnects.

Backchannel Calls should use RPC_TASK_NOCONNECT rather than
RPC_TASK_SOFTCONN. If there is no forward connection, the server is
not capable of establishing a connection back to the client, thus
that backchannel request should fail before the server attempts to
send it. Commit 58255a4e3c ("NFSD: NFSv4 callback client should
use RPC_TASK_SOFTCONN") was merged several years before
RPC_TASK_NOCONNECT was available.

Because setup_callback_client() explicitly sets NOPING, the NFSv4.0
callback connection depends on the first callback RPC to initiate
a connection to the client. Thus NFSv4.0 needs to continue to use
RPC_TASK_SOFTCONN.

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org> # v4.20+
2020-04-17 12:40:31 -04:00
Vasily Averin
e1e8399eee nfsd: memory corruption in nfsd4_lock()
New struct nfsd4_blocked_lock allocated in find_or_allocate_block()
does not initialized nbl_list and nbl_lru.
If conflock allocation fails rollback can call list_del_init()
access uninitialized fields and corrupt memory.

v2: just initialize nbl_list and nbl_lru right after nbl allocation.

Fixes: 76d348fadf ("nfsd: have nfsd4_lock use blocking locks for v4.1+ lock")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-04-13 10:28:21 -04:00
J. Bruce Fields
69afd26798 nfsd: fsnotify on rmdir under nfsd/clients/
Userspace should be able to monitor nfsd/clients/ to see when clients
come and go, but we're failing to send fsnotify events.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-19 11:33:42 -04:00
J. Bruce Fields
663e36f076 nfsd4: kill warnings on testing stateids with mismatched clientids
It's normal for a client to test a stateid from a previous instance,
e.g. after a network partition.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-19 10:51:42 -04:00
Petr Vorel
6cbfad5f20 nfsd: remove read permission bit for ctl sysctl
It's meant to be write-only.

Fixes: 89c905becc ("nfsd: allow forced expiration of NFSv4 clients")
Signed-off-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:34 -04:00
Chuck Lever
3ac3711adb NFSD: Fix NFS server build errors
yuehaibing@huawei.com reports the following build errors arise when
CONFIG_NFSD_V4_2_INTER_SSC is set and the NFS client is not built
into the kernel:

fs/nfsd/nfs4proc.o: In function `nfsd4_do_copy':
nfs4proc.c:(.text+0x23b7): undefined reference to `nfs42_ssc_close'
fs/nfsd/nfs4proc.o: In function `nfsd4_copy':
nfs4proc.c:(.text+0x5d2a): undefined reference to `nfs_sb_deactive'
fs/nfsd/nfs4proc.o: In function `nfsd4_do_async_copy':
nfs4proc.c:(.text+0x61d5): undefined reference to `nfs42_ssc_open'
nfs4proc.c:(.text+0x6389): undefined reference to `nfs_sb_deactive'

The new inter-server copy code invokes client functions. Until the
NFS server has infrastructure to load the appropriate NFS client
modules to handle inter-server copy requests, let's constrain the
way this feature is built.

Reported-by: YueHaibing <yuehaibing@huawei.com>
Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: YueHaibing <yuehaibing@huawei.com> # build-tested
2020-03-16 12:04:34 -04:00
Trond Myklebust
65286b883c nfsd: export upcalls must not return ESTALE when mountd is down
If the rpc.mountd daemon goes down, then that should not cause all
exports to start failing with ESTALE errors. Let's explicitly
distinguish between the cache upcall cases that need to time out,
and those that do not.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:33 -04:00
Trond Myklebust
6a30e47fa0 nfsd: Add tracepoints for update of the expkey and export cache entries
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:33 -04:00
Trond Myklebust
cf749f3cc7 nfsd: Add tracepoints for exp_find_key() and exp_get_by_name()
Add tracepoints for upcalls.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:33 -04:00
Trond Myklebust
f01274a933 nfsd: Add tracing to nfsd_set_fh_dentry()
Add tracing to allow us to figure out where any stale filehandle issues
may be originating from.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:33 -04:00
Trond Myklebust
a451b12311 nfsd: Don't add locks to closed or closing open stateids
In NFSv4, the lock stateids are tied to the lockowner, and the open stateid,
so that the action of closing the file also results in either an automatic
loss of the locks, or an error of the form NFS4ERR_LOCKS_HELD.

In practice this means we must not add new locks to the open stateid
after the close process has been invoked. In fact doing so, can result
in the following panic:

 kernel BUG at lib/list_debug.c:51!
 invalid opcode: 0000 [#1] SMP NOPTI
 CPU: 2 PID: 1085 Comm: nfsd Not tainted 5.6.0-rc3+ #2
 Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.14410784.B64.1908150010 08/15/2019
 RIP: 0010:__list_del_entry_valid.cold+0x31/0x55
 Code: 1a 3d 9b e8 74 10 c2 ff 0f 0b 48 c7 c7 f0 1a 3d 9b e8 66 10 c2 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 b0 1a 3d 9b e8 52 10 c2 ff <0f> 0b 48 89 fe 4c 89 c2 48 c7 c7 78 1a 3d 9b e8 3e 10 c2 ff 0f 0b
 RSP: 0018:ffffb296c1d47d90 EFLAGS: 00010246
 RAX: 0000000000000054 RBX: ffff8ba032456ec8 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffff8ba039e99cc8 RDI: ffff8ba039e99cc8
 RBP: ffff8ba032456e60 R08: 0000000000000781 R09: 0000000000000003
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff8ba009a4abe0
 R13: ffff8ba032456e8c R14: 0000000000000000 R15: ffff8ba00adb01d8
 FS:  0000000000000000(0000) GS:ffff8ba039e80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fb213f0b008 CR3: 00000001347de006 CR4: 00000000003606e0
 Call Trace:
  release_lock_stateid+0x2b/0x80 [nfsd]
  nfsd4_free_stateid+0x1e9/0x210 [nfsd]
  nfsd4_proc_compound+0x414/0x700 [nfsd]
  ? nfs4svc_decode_compoundargs+0x407/0x4c0 [nfsd]
  nfsd_dispatch+0xc1/0x200 [nfsd]
  svc_process_common+0x476/0x6f0 [sunrpc]
  ? svc_sock_secure_port+0x12/0x30 [sunrpc]
  ? svc_recv+0x313/0x9c0 [sunrpc]
  ? nfsd_svc+0x2d0/0x2d0 [nfsd]
  svc_process+0xd4/0x110 [sunrpc]
  nfsd+0xe3/0x140 [nfsd]
  kthread+0xf9/0x130
  ? nfsd_destroy+0x50/0x50 [nfsd]
  ? kthread_park+0x90/0x90
  ret_from_fork+0x1f/0x40

The fix is to ensure that lock creation tests for whether or not the
open stateid is unhashed, and to fail if that is the case.

Fixes: 659aefb68e ("nfsd: Ensure we don't recognise lock stateids after freeing them")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:33 -04:00
Chuck Lever
7dcf4ab952 NFSD: Clean up nfsd4_encode_readv
Address some minor nits I noticed while working on this function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:31 -04:00
Chuck Lever
412055398b nfsd: Fix NFSv4 READ on RDMA when using readv
svcrdma expects that the payload falls precisely into the xdr_buf
page vector. This does not seem to be the case for
nfsd4_encode_readv().

This code is called only when fops->splice_read is missing or when
RQ_SPLICE_OK is clear, so it's not a noticeable problem in many
common cases.

Add new transport method: ->xpo_read_payload so that when a READ
payload does not fit exactly in rq_res's page vector, the XDR
encoder can inform the RPC transport exactly where that payload is,
without the payload's XDR pad.

That way, when a Write chunk is present, the transport knows what
byte range in the Reply message is supposed to be matched with the
chunk.

Note that the Linux NFS server implementation of NFS/RDMA can
currently handle only one Write chunk per RPC-over-RDMA message.
This simplifies the implementation of this fix.

Fixes: b042098063 ("nfsd4: allow exotic read compounds")
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=198053
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:31 -04:00
Madhuparna Bhowmik
057a227435 fs: nfsd: fileache.c: Use built-in RCU list checking
list_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when  CONFIG_PROVE_RCU_LIST is enabled
by default.

Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:31 -04:00
Madhuparna Bhowmik
36a8049181 fs: nfsd: nfs4state.c: Use built-in RCU list checking
list_for_each_entry_rcu() has built-in RCU and lock checking.

Pass cond argument to list_for_each_entry_rcu() to silence
false lockdep warning when  CONFIG_PROVE_RCU_LIST is enabled
by default.

Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:31 -04:00
Scott Mayhew
7627d7dc79 nfsd: set the server_scope during service startup
Currently, nfsd4_encode_exchange_id() encodes the utsname nodename
string in the server_scope field.  In a multi-host container
environemnt, if an nfsd container is restarted on a different host than
it was originally running on, clients will see a server_scope mismatch
and will not attempt to reclaim opens.

Instead, set the server_scope while we're in a process context during
service startup, so we get the utsname nodename of the current process
and store that in nfsd_net.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
[bfields: fix up major_id too]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16 12:04:30 -04:00
Linus Torvalds
08dffcc7d9 Highlights:
- Server-to-server copy code from Olga.  To use it, client and
 	  both servers must have support, the target server must be able
 	  to access the source server over NFSv4.2, and the target
 	  server must have the inter_copy_offload_enable module
 	  parameter set.
 	- Improvements and bugfixes for the new filehandle cache,
 	  especially in the container case, from Trond
 	- Also from Trond, better reporting of write errors.
 	- Y2038 work from Arnd.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl490mAVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+HkkP/33CsYXp0wvfNrfxCY3zHRxHpfw+
 T9Ownxxw0RAJc/dRluC/2PIKJ20uVqtLrplU63bMBqJn84WF7OALq9twZ79a3fVF
 mvdmnZbNq9B3ncKJlT7akkEelyJCRap7NgG/oTyubE8MlPl6gKpD8c+G7XdW/uN+
 r0fprQz4rW4CYCBGSHq7HusEKqY4Gw+gbyAfJ6A79TMjF1ei51PG+9c8rkIsI5CO
 1TQ3gY1gSJmGf2DoF86Q9WTVb+DvRTEs+t7QkxY/Vlo+QXY8CZyu+qSxN7i/F20m
 gv2GrSpQMS9DEK/ZaG6cxaH+sM18Db4KLvcl3koL6lONHDR2OafSdKLyy0I60jhO
 WfDSHhfDCrAdASTjNlTPrjBrdK3gafiaJVL9vy901ZJjPaNb3EH0nMQ5bEvOBECq
 TCqPcQUcbku+qUVIcFwzSK1hXQFQHNh8WIuqXvNviZIzFDoipwsHVnQK02Owj89L
 R2tbZue1O8voacg/9xw3tWAT7pI+SaBb0EvJuqRxBshiZEU8kKKtMchOwSECRDcu
 k4lcqC5EFW7e4EzGlr6Wx8sI5lwCapva8ccjmPXX+R/vyM81oxWGB84GqWjjwubH
 3Fcok23F9rW2IQJkqgPlNj/9hAjTn2+vM13UbfMlnchGNsQ2gbkc5CDGC/J6Wwpo
 tHVristV9Gu5bJym
 =FxLY
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.6' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Highlights:

   - Server-to-server copy code from Olga.

     To use it, client and both servers must have support, the target
     server must be able to access the source server over NFSv4.2, and
     the target server must have the inter_copy_offload_enable module
     parameter set.

   - Improvements and bugfixes for the new filehandle cache, especially
     in the container case, from Trond

   - Also from Trond, better reporting of write errors.

   - Y2038 work from Arnd"

* tag 'nfsd-5.6' of git://linux-nfs.org/~bfields/linux: (55 commits)
  sunrpc: expiry_time should be seconds not timeval
  nfsd: make nfsd_filecache_wq variable static
  nfsd4: fix double free in nfsd4_do_async_copy()
  nfsd: convert file cache to use over/underflow safe refcount
  nfsd: Define the file access mode enum for tracing
  nfsd: Fix a perf warning
  nfsd: Ensure sampling of the write verifier is atomic with the write
  nfsd: Ensure sampling of the commit verifier is atomic with the commit
  sunrpc: clean up cache entry add/remove from hashtable
  sunrpc: Fix potential leaks in sunrpc_cache_unhash()
  nfsd: Ensure exclusion between CLONE and WRITE errors
  nfsd: Pass the nfsd_file as arguments to nfsd4_clone_file_range()
  nfsd: Update the boot verifier on stable writes too.
  nfsd: Fix stable writes
  nfsd: Allow nfsd_vfs_write() to take the nfsd_file as an argument
  nfsd: Fix a soft lockup race in nfsd_file_mark_find_or_create()
  nfsd: Reduce the number of calls to nfsd_file_gc()
  nfsd: Schedule the laundrette regularly irrespective of file errors
  nfsd: Remove unused constant NFSD_FILE_LRU_RESCAN
  nfsd: Containerise filecache laundrette
  ...
2020-02-07 17:50:21 -08:00
Chen Zhou
50d0def966 nfsd: make nfsd_filecache_wq variable static
Fix sparse warning:

fs/nfsd/filecache.c:55:25: warning:
	symbol 'nfsd_filecache_wq' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-02-07 13:30:41 -05:00
Dan Carpenter
91fd3c3edc nfsd4: fix double free in nfsd4_do_async_copy()
This frees "copy->nf_src" before and again after the goto.

Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-02-06 11:22:55 -05:00
Trond Myklebust
689827cd5b nfsd: convert file cache to use over/underflow safe refcount
Use the 'refcount_t' type instead of 'atomic_t' for improved
refcounting safety.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-02-06 11:22:55 -05:00
Trond Myklebust
c19285596d nfsd: Define the file access mode enum for tracing
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-02-06 11:22:55 -05:00
Trond Myklebust
a9ceb060b3 nfsd: Fix a perf warning
perf does not know how to deal with a __builtin_bswap32() call, and
complains. All other functions just store the xid etc in host endian
form, so let's do that in the tracepoint for nfsd_file_acquire too.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-02-06 11:22:54 -05:00
Alexey Dobriyan
97a32539b9 proc: convert everything to "struct proc_ops"
The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
seq_file.h.

Conversion rule is:

	llseek		=> proc_lseek
	unlocked_ioctl	=> proc_ioctl

	xxx		=> proc_xxx

	delete ".owner = THIS_MODULE" line

[akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
[sfr@canb.auug.org.au: fix kernel/sched/psi.c]
  Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-04 03:05:26 +00:00
Trond Myklebust
19e0663ff9 nfsd: Ensure sampling of the write verifier is atomic with the write
When doing an unstable write, we need to ensure that we sample the
write verifier before releasing the lock, and allowing a commit to
the same file to proceed.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:41 -05:00
Trond Myklebust
524ff1af22 nfsd: Ensure sampling of the commit verifier is atomic with the commit
When we have a successful commit, ensure we sample the commit verifier
before releasing the lock.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:41 -05:00
Trond Myklebust
1b28d756b2 nfsd: Ensure exclusion between CLONE and WRITE errors
Ensure that we can distinguish between synchronous CLONE and
WRITE errors, and that we can assign them correctly.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:41 -05:00
Trond Myklebust
b66ae6dd0c nfsd: Pass the nfsd_file as arguments to nfsd4_clone_file_range()
Needed in order to fix exclusion w.r.t. writes.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
7bf94c6ba9 nfsd: Update the boot verifier on stable writes too.
We don't know if the error returned by the fsync() call is
exclusive to the data written by the stable write, so play it
safe.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
5011af4c69 nfsd: Fix stable writes
Strictly speaking, a stable write error needs to reflect the
write + the commit of that write (and only that write). To
ensure that we don't pick up the write errors from other
writebacks, add a rw_semaphore to provide exclusion.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
16f8f89410 nfsd: Allow nfsd_vfs_write() to take the nfsd_file as an argument
Needed in order to fix stable writes.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
90d2f1da83 nfsd: Fix a soft lockup race in nfsd_file_mark_find_or_create()
If nfsd_file_mark_find_or_create() keeps winning the race for the
nfsd_file_fsnotify_group->mark_mutex against nfsd_file_mark_put()
then it can soft lock up, since fsnotify_add_inode_mark() ends
up always finding an existing entry.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
b6669305d3 nfsd: Reduce the number of calls to nfsd_file_gc()
Don't call nfsd_file_gc() on every put of the reference in nfsd_file_put().
Instead, do it only when we're expecting the refcount to go to 1.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
55f84cc47f nfsd: Schedule the laundrette regularly irrespective of file errors
Emsure we schedule the laundrette even if the struct file is carrying
file errors.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
bd6e1cece8 nfsd: Remove unused constant NFSD_FILE_LRU_RESCAN
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
9542e6a643 nfsd: Containerise filecache laundrette
Ensure that if the filecache laundrette gets stuck, it only affects
the knfsd instances of one container.

The notifier callbacks can be called from various contexts so avoid
using synchonous filesystem operations that might deadlock.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
36ebbdb96b nfsd: cleanup nfsd_file_lru_dispose()
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:40 -05:00
Trond Myklebust
28c7d86bb6 nfsd: fix filecache lookup
If the lookup keeps finding a nfsd_file with an unhashed open file,
then retry once only.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org
Fixes: 65294c1f2c "nfsd: add a new struct file caching facility to nfsd"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-22 16:25:01 -05:00
zhengbin
e44b4bf264 nfsd: use true,false for bool variable in nfssvc.c
Fixes coccicheck warning:

fs/nfsd/nfssvc.c:394:2-14: WARNING: Assignment of 0/1 to bool variable
fs/nfsd/nfssvc.c:407:2-14: WARNING: Assignment of 0/1 to bool variable
fs/nfsd/nfssvc.c:422:2-14: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-03 11:21:12 -05:00
zhengbin
500c248171 nfsd: use true,false for bool variable in nfs4proc.c
Fixes coccicheck warning:

fs/nfsd/nfs4proc.c:235:1-18: WARNING: Assignment of 0/1 to bool variable
fs/nfsd/nfs4proc.c:368:1-17: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-03 11:21:12 -05:00
zhengbin
384a7ccaa3 nfsd: use true,false for bool variable in vfs.c
Fixes coccicheck warning:

fs/nfsd/vfs.c:1389:5-13: WARNING: Assignment of 0/1 to bool variable
fs/nfsd/vfs.c:1398:5-13: WARNING: Assignment of 0/1 to bool variable
fs/nfsd/vfs.c:1415:2-10: WARNING: Assignment of 0/1 to bool variable

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-01-03 11:21:12 -05:00
Arnd Bergmann
364d5814b9 nfsd: remove nfs4_reset_lease() declarations
The function was removed a long time ago, but the declaration
and a dummy implementation are still there, referencing the
deprecated time_t type.

Remove both.

Fixes: f958a1320f ("nfsd4: remove unnecessary lease-setting function")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 22:07:17 -05:00
Arnd Bergmann
9104ae494e nfsd: use ktime_get_real_seconds() in nfs4_verifier
gen_confirm() generates a unique identifier based on the current
time. This overflows in year 2038, but that is harmless since it
generally does not lead to duplicates, as long as the time has
been initialized by a real-time clock or NTP.

Using ktime_get_boottime_seconds() or ktime_get_seconds() would
avoid the overflow, but it would be more likely to result in
non-unique numbers.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 22:07:17 -05:00
Arnd Bergmann
20b7d86f29 nfsd: use boottime for lease expiry calculation
A couple of time_t variables are only used to track the state of the
lease time and its expiration. The code correctly uses the 'time_after()'
macro to make this work on 32-bit architectures even beyond year 2038,
but the get_seconds() function and the time_t type itself are deprecated
as they behave inconsistently between 32-bit and 64-bit architectures
and often lead to code that is not y2038 safe.

As a minor issue, using get_seconds() leads to problems with concurrent
settimeofday() or clock_settime() calls, in the worst case timeout never
triggering after the time has been set backwards.

Change nfsd to use time64_t and ktime_get_boottime_seconds() here. This
is clearly excessive, as boottime by itself means we never go beyond 32
bits, but it does mean we handle this correctly and consistently without
having to worry about corner cases and should be no more expensive than
the previous implementation on 64-bit architectures.

The max_cb_time() function gets changed in order to avoid an expensive
64-bit division operation, but as the lease time is at most one hour,
there is no change in behavior.

Also do the same for server-to-server copy expiration time.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[bfields@redhat.com: fix up copy expiration]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 22:07:17 -05:00
Arnd Bergmann
9594497f2c nfsd: fix jiffies/time_t mixup in LRU list
The nfsd4_blocked_lock->nbl_time timestamp is recorded in jiffies,
but then compared to a CLOCK_REALTIME timestamp later on, which makes
no sense.

For consistency with the other timestamps, change this to use a time_t.

This is a change in behavior, which may cause regressions, but the
current code is not sensible. On a system with CONFIG_HZ=1000,
the 'time_after((unsigned long)nbl->nbl_time, (unsigned long)cutoff))'
check is false for roughly the first 18 days of uptime and then true
for the next 49 days.

Fixes: 7919d0a27f ("nfsd: add a LRU list for blocked locks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
2561c92b12 nfsd: fix delay timer on 32-bit architectures
The nfsd4_cb_layout_done() function takes a 'time_t' value,
multiplied by NSEC_PER_SEC*2 to get a nanosecond value.

This works fine on 64-bit architectures, but on 32-bit, any
value over 1 second results in a signed integer overflow
with unexpected results.

Cast one input to a 64-bit type in order to produce the
same result that we have on 64-bit architectures, regarless
of the type of nfsd4_lease.

Fixes: 6b9b21073d ("nfsd: give up on CB_LAYOUTRECALLs after two lease periods")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
b6356d4202 nfsd: use time64_t in nfsd_proc_setattr() check
Change to time64_t and ktime_get_real_seconds() to make the
logic work correctly on 32-bit architectures beyond 2038.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
2a1aa48929 nfsd: pass a 64-bit guardtime to nfsd_setattr()
Guardtime handling in nfs3 differs between 32-bit and 64-bit
architectures, and uses the deprecated time_t type.

Change it to using time64_t, which behaves the same way on
64-bit and 32-bit architectures, treating the number as an
unsigned 32-bit entity with a range of year 1970 to 2106
consistently, and avoiding the y2038 overflow.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
9cc7680149 nfsd: make 'boot_time' 64-bit wide
The local boot time variable gets truncated to time_t at the moment,
which can lead to slightly odd behavior on 32-bit architectures.

Use ktime_get_real_seconds() instead of get_seconds() to always
get a 64-bit result, and keep it that way wherever possible.

It still gets truncated in a few places:

- When assigning to cl_clientid.cl_boot, this is already documented
  and is only used as a unique identifier.

- In clients_still_reclaiming(), the truncation is to 'unsigned long'
  in order to use the 'time_before() helper.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
e4598e38ee nfsd: use timespec64 in encode_time_delta
The values in encode_time_delta are always small and don't
overflow the range of 'struct timespec', so changing it has
no effect.

Change it to timespec64 as a prerequisite for removing the
timespec definition later.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
92c5e46911 nfsd: handle nfs3 timestamps as unsigned
The decode_time3 function behaves differently on 32-bit
and 64-bit architectures: on the former, a 32-bit timestamp
gets converted into an signed number and then into a timestamp
between 1902 and 2038, while on the latter it is interpreted
as unsigned in the range 1970-2106.

Change all the remaining 'timespec' in nfsd to 'timespec64'
to make the behavior the same, and use the current interpretation
of the dominant 64-bit architectures.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
e29f470396 nfsd: print 64-bit timestamps in client_info_show
The nii_time field gets truncated to 'time_t' on 32-bit architectures
before printing.

Remove the use of 'struct timespec' to product the correct output
beyond 2038.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Arnd Bergmann
b3f255ef6b nfsd: use ktime_get_seconds() for timestamps
The delegation logic in nfsd uses the somewhat inefficient
seconds_since_boot() function to record time intervals.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Aditya Pakki
fc1b206595 nfsd: remove unnecessary assertion in nfsd4_encode_replay
The replay variable is set in the only caller of nfsd4_encode_replay.
The assertion is unnecessary and the patch removes this check.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
Trond Myklebust
57f6403496 nfsd: Clone should commit src file metadata too
vfs_clone_file_range() can modify the metadata on the source file too,
so we need to commit that to stable storage as well.

Reported-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Acked-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:46:08 -05:00
zhengbin
fc5fc5d7cc nfsd4: Remove unneeded semicolon
Fixes coccicheck warning:

fs/nfsd/nfs4state.c:3376:2-3: Unneeded semicolon

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-19 17:42:44 -05:00
Trond Myklebust
09a80f2aef nfsd: Return the correct number of bytes written to the file
We must allow for the fact that iov_iter_write() could have returned
a short write (e.g. if there was an ENOSPC issue).

Fixes: d890be159a "nfsd: Add I/O trace points in the NFSv4 write path"
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-17 14:32:20 -05:00
J. Bruce Fields
d781e3df71 nfsd4: avoid NULL deference on strange COPY compounds
With cross-server COPY we've introduced the possibility that the current
or saved filehandle might not have fh_dentry/fh_export filled in, but we
missed a place that assumed it was.  I think this could be triggered by
a compound like:

	PUTFH(foreign filehandle)
	GETATTR
	SAVEFH
	COPY

First, check_if_stalefh_allowed sets no_verify on the first (PUTFH) op.
Then op_func = nfsd4_putfh runs and leaves current_fh->fh_export NULL.
need_wrongsec_check returns true, since this PUTFH has OP_IS_PUTFH_LIKE
set and GETATTR does not have OP_HANDLES_WRONGSEC set.

We should probably also consider tightening the checks in
check_if_stalefh_allowed and double-checking that we don't assume the
filehandle is verified elsewhere in the compound.  But I think this
fixes the immediate issue.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 4e48f1cccab3 "NFSD: allow inter server COPY to have... "
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-09 11:44:07 -05:00
Olga Kornievskaia
2e577f0fac NFSD fixing possible null pointer derefering in copy offload
Static checker revealed possible error path leading to possible
NULL pointer dereferencing.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e0639dc580: ("NFSD introduce async copy feature")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-09 11:44:07 -05:00
Olga Kornievskaia
b8290ca250 NFSD fix nfserro errno mismatch
There is mismatch between __be32 and u32 in nfserr and errno.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: d5e54eeb0e3d ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-09 11:44:07 -05:00
Olga Kornievskaia
3f9544ca62 NFSD: fix seqid in copy stateid
s_stid->si_generation is a u32, copy->stateid.seqid is a __be32, so we
should be byte-swapping here if necessary.

This effectively undoes the byte-swap performed when reading
s_stid->s_generation in nfsd4_decode_copy().  Without this second swap,
the stateid we sent to the source in READ could be different from the
one the client provided us in the COPY.  We didn't spot this in testing
since our implementation always uses a 0 in the seqid field.  But other
implementations might not do that.

You'd think we should just skip the byte-swapping entirely, but the
s_stid field can be used for either our own stateids (in the
intra-server case) or foreign stateids (in the inter-server case), and
the former are interpreted by us and need byte-swapping.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: d5e54eeb0e3d ("NFSD add nfs4 inter ssc to nfsd4_copy")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-09 11:44:07 -05:00
Olga Kornievskaia
10db651210 NFSD fix mismatching type in nfsd4_set_netaddr
Fix __be32 and u32 mismatch in return and assignment.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: dbd4c2dd8f13 ("NFSD add COPY_NOTIFY operation")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-09 11:44:07 -05:00
Dan Carpenter
5277a79e2d nfsd: unlock on error in manage_cpntf_state()
We are holding the "nn->s2s_cp_lock" so we can't return directly
without unlocking first.

Fixes: f3dee17721a0 ("NFSD check stateids against copy stateids")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-09 11:44:07 -05:00
Olga Kornievskaia
ce0887ac96 NFSD add nfs4 inter ssc to nfsd4_copy
Given a universal address, mount the source server from the destination
server.  Use an internal mount. Call the NFS client nfs42_ssc_open to
obtain the NFS struct file suitable for nfsd_copy_range.

Ability to do "inter" server-to-server depends on the an nfsd kernel
parameter "inter_copy_offload_enable".

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-12-09 11:44:07 -05:00
Olga Kornievskaia
b9e8638e3d NFSD: allow inter server COPY to have a STALE source server fh
The inter server to server COPY source server filehandle
is a foreign filehandle as the COPY is sent to the destination
server.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-12-09 11:42:14 -05:00
Olga Kornievskaia
51100d2b87 NFSD generalize nfsd4_compound_state flag names
Allow for sid_flag field non-stateid use.

Signed-off-by: Andy Adamson <andros@netapp.com>
2019-12-09 11:42:14 -05:00
Olga Kornievskaia
b734220425 NFSD check stateids against copy stateids
Incoming stateid (used by a READ) could be a saved copy stateid.
Using the provided stateid, look it up in the list of copy_notify
stateids. If found, use the parent's stateid and parent's clid
to look up the parent's stid to do the appropriate checks.

Update the copy notify timestamp (cpntf_time) with current time
this making it 'active' so that laundromat thread will not delete
copy notify state.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-12-09 11:42:14 -05:00
Olga Kornievskaia
624322f1ad NFSD add COPY_NOTIFY operation
Introducing the COPY_NOTIFY operation.

Create a new unique stateid that will keep track of the copy
state and the upcoming READs that will use that stateid.
Each associated parent stateid has a list of copy
notify stateids. A copy notify structure makes a copy of
the parent stateid and a clientid and will use it to look
up the parent stateid during the READ request (suggested
by Trond Myklebust <trond.myklebust@hammerspace.com>).

At nfs4_put_stid() time, we walk the list of the associated
copy notify stateids and delete them.

Laundromat thread will traverse globally stored copy notify
stateid in idr and notice if any haven't been referenced in the
lease period, if so, it'll remove them.

Return single netaddr to advertise to the copy.

Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
2019-12-09 11:42:14 -05:00
Olga Kornievskaia
51911868fc NFSD COPY_NOTIFY xdr
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-12-09 11:42:14 -05:00
Olga Kornievskaia
84e1b21d5e NFSD add ca_source_server<> to COPY
Decode the ca_source_server list that's sent but only use the
first one. Presence of non-zero list indicates an "inter" copy.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-12-09 11:42:14 -05:00
Olga Kornievskaia
af76fc6c15 NFSD fill-in netloc4 structure
nfs.4 defines nfs42_netaddr structure that represents netloc4.

Populate needed fields from the sockaddr structure.

This will be used by flexfiles and 4.2 inter copy

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
2019-12-09 11:42:14 -05:00
Linus Torvalds
911d137ab0 This is a relatively quiet cycle for nfsd, mainly various bugfixes.
Possibly most interesting is Trond's fixes for some callback races that
 were due to my incomplete understanding of rpc client shutdown.
 Unfortunately at the last minute I've started noticing a new
 intermittent failure to send callbacks.  As the logic seems basically
 correct, I'm leaving Trond's patches in for now, and hope to find a fix
 in the next week so I don't have to revert those patches.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl3r3AAVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+rjkP/3L6DZs0Uv0BYbGq5Gmit0uoPSQk
 8BT7oQhbagCh+ULRYWCnK6cz82wejR4Gzq4PLyl5x5Vcc5x+bLoPI9YgiRZlIbZu
 ZvSg93E6SITLfq5xRlDC0MlIVZkI+HoIfyYgv1aYiWvQ3834bcx4DxVm9h7cNpT3
 x37anEFi1lv3n9fct3obOrs3AvCS76XyA6VVhcSLJ77amKQ+O7LI0crqUc6cuX2i
 CkTwTSDwyCrzkx3dZ2xDPDTbLecxw+Ce4adaby5v3GEQo3TOCmEWX92D3dvzfMmv
 ICU07FsVOILnIT/fmC91b1+JWVRLjUUBw5EPmDduwSP/yw4YnIEODFEP/wAUAmMJ
 vJ9hi9c1rThQ9n8h08RIwA2snhnpXRxKCWhpIRY6WM8DhHL9Y9AuVPYTKxhQOjPK
 l3wbOGcMW63NrTOPHHN7hTB0vDLgPKIXYVIrMvZTd/P7CghDDEbhT1gDvx/IL3Uq
 WrHKbJtK7rbx9i2bh5f6fH0DRrv7lxbD0ffunRRa3twPAe6zsG9WPjsbZZraZzEg
 O7/o3wZu2N7MpL5bXPfzB+5ylOTxvNWew07NJjA4BIOfwin3bw/71YfB0Vnoairv
 PhmbN2Dj4/t82ld0JU5GJWojpUfH4ARXM2Li9WO99wzx+KrxScsqGPnRMFe9dC7b
 Q7ltP1p0gUbkJ88Z
 =b2zA
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "This is a relatively quiet cycle for nfsd, mainly various bugfixes.

  Possibly most interesting is Trond's fixes for some callback races
  that were due to my incomplete understanding of rpc client shutdown.
  Unfortunately at the last minute I've started noticing a new
  intermittent failure to send callbacks. As the logic seems basically
  correct, I'm leaving Trond's patches in for now, and hope to find a
  fix in the next week so I don't have to revert those patches"

* tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux: (24 commits)
  nfsd: depend on CRYPTO_MD5 for legacy client tracking
  NFSD fixing possible null pointer derefering in copy offload
  nfsd: check for EBUSY from vfs_rmdir/vfs_unink.
  nfsd: Ensure CLONE persists data and metadata changes to the target file
  SUNRPC: Fix backchannel latency metrics
  nfsd: restore NFSv3 ACL support
  nfsd: v4 support requires CRYPTO_SHA256
  nfsd: Fix cld_net->cn_tfm initialization
  lockd: remove __KERNEL__ ifdefs
  sunrpc: remove __KERNEL__ ifdefs
  race in exportfs_decode_fh()
  nfsd: Drop LIST_HEAD where the variable it declares is never used.
  nfsd: document callback_wq serialization of callback code
  nfsd: mark cb path down on unknown errors
  nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()
  nfsd: minor 4.1 callback cleanup
  SUNRPC: Fix svcauth_gss_proxy_init()
  SUNRPC: Trace gssproxy upcall results
  sunrpc: fix crash when cache_head become valid before update
  nfsd: remove private bin2hex implementation
  ...
2019-12-07 16:56:00 -08:00
Patrick Steinhardt
38a2204f52 nfsd: depend on CRYPTO_MD5 for legacy client tracking
The legacy client tracking infrastructure of nfsd makes use of MD5 to
derive a client's recovery directory name. As the nfsd module doesn't
declare any dependency on CRYPTO_MD5, though, it may fail to allocate
the hash if the kernel was compiled without it. As a result, generation
of client recovery directories will fail with the following error:

    NFSD: unable to generate recoverydir name

The explicit dependency on CRYPTO_MD5 was removed as redundant back in
6aaa67b5f3 (NFSD: Remove redundant "select" clauses in fs/Kconfig
2008-02-11) as it was already implicitly selected via RPCSEC_GSS_KRB5.
This broke when RPCSEC_GSS_KRB5 was made optional for NFSv4 in commit
df486a2590 (NFS: Fix the selection of security flavours in Kconfig) at
a later point.

Fix the issue by adding back an explicit dependency on CRYPTO_MD5.

Fixes: df486a2590 (NFS: Fix the selection of security flavours in Kconfig)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-07 11:28:52 -05:00
Olga Kornievskaia
18f428d4e2 NFSD fixing possible null pointer derefering in copy offload
Static checker revealed possible error path leading to possible
NULL pointer dereferencing.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e0639dc580: ("NFSD introduce async copy feature")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-12-07 11:28:46 -05:00
NeilBrown
466e16f092 nfsd: check for EBUSY from vfs_rmdir/vfs_unink.
vfs_rmdir and vfs_unlink can return -EBUSY if the
target is a mountpoint.  This currently gets passed to
nfserrno() by nfsd_unlink(), and that results in a WARNing,
which is not user-friendly.

Possibly the best NFSv4 error is NFS4ERR_FILE_OPEN, because
there is a sense in which the object is currently in use
by some other task.  The Linux NFSv4 client will map this
back to EBUSY, which is an added benefit.

For NFSv3, the best we can do is probably NFS3ERR_ACCES, which isn't
true, but is not less true than the other options.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-30 14:59:52 -05:00
Trond Myklebust
a25e3726b3 nfsd: Ensure CLONE persists data and metadata changes to the target file
The NFSv4.2 CLONE operation has implicit persistence requirements on the
target file, since there is no protocol requirement that the client issue
a separate operation to persist data.
For that reason, we should call vfs_fsync_range() on the destination file
after a successful call to vfs_clone_file_range().

Fixes: ffa0160a10 ("nfsd: implement the NFSv4.2 CLONE operation")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.5+
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-30 14:55:38 -05:00
J. Bruce Fields
7c149057d0 nfsd: restore NFSv3 ACL support
An error in e333f3bbef left the nfsd_acl_program->pg_vers array empty,
which effectively turned off the server's support for NFSv3 ACLs.

Fixes: e333f3bbef "nfsd: Allow containers to set supported nfs versions"
Cc: stable@vger.kernel.org
Cc: Trond Myklebust <trondmy@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-19 17:43:05 -05:00
Al Viro
6c2d4798a8 new helper: lookup_positive_unlocked()
Most of the callers of lookup_one_len_unlocked() treat negatives are
ERR_PTR(-ENOENT).  Provide a helper that would do just that.  Note
that a pinned positive dentry remains positive - it's ->d_inode is
stable, etc.; a pinned _negative_ dentry can become positive at any
point as long as you are not holding its parent at least shared.
So using lookup_one_len_unlocked() needs to be careful;
lookup_positive_unlocked() is safer and that's what the callers
end up open-coding anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-11-15 13:49:04 -05:00
Scott Mayhew
a2e2f2dc77 nfsd: v4 support requires CRYPTO_SHA256
The new nfsdcld client tracking operations use sha256 to compute hashes
of the kerberos principals, so make sure CRYPTO_SHA256 is enabled.

Fixes: 6ee95d1c89 ("nfsd: add support for upcall version 2")
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-12 14:53:26 -05:00
Scott Mayhew
18b9a895e6 nfsd: Fix cld_net->cn_tfm initialization
Don't assign an error pointer to cld_net->cn_tfm, otherwise an oops will
occur in nfsd4_remove_cld_pipe().

Also, move the initialization of cld_net->cn_tfm so that it occurs after
the check to see if nfsdcld is running.  This is necessary because
nfsd4_client_tracking_init() looks for -ETIMEDOUT to determine whether
to use the "old" nfsdcld tracking ops.

Fixes: 6ee95d1c89 ("nfsd: add support for upcall version 2")
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-12 14:53:26 -05:00
Mao Wenan
2a67803e13 nfsd: Drop LIST_HEAD where the variable it declares is never used.
The declarations were introduced with the file, but the declared
variables were not used.

Fixes: 65294c1f2c ("nfsd: add a new struct file caching facility to nfsd")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:59 -05:00
J. Bruce Fields
cc1ce2f13e nfsd: document callback_wq serialization of callback code
The callback code relies on the fact that much of it is only ever called
from the ordered workqueue callback_wq, and this is worth documenting.

Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:59 -05:00
J. Bruce Fields
20428a8047 nfsd: mark cb path down on unknown errors
An unexpected error is probably a sign that something is wrong with the
callback path.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:59 -05:00
Trond Myklebust
2bbfed98a4 nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()
When we're destroying the client lease, and we call
nfsd4_shutdown_callback(), we must ensure that we do not return
before all outstanding callbacks have terminated and have
released their payloads.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:59 -05:00
Trond Myklebust
12357f1b2c nfsd: minor 4.1 callback cleanup
Move all the cb_holds_slot management into helper functions.  No change
in behavior.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-08 12:32:44 -05:00
Andy Shevchenko
12b4157b7d nfsd: remove private bin2hex implementation
Calling sprintf in a loop is not very efficient, and in any case,
we already have an implementation of bin-to-hex conversion in lib/
which we might as well use.

Note that original code used to nul-terminate the destination while
bin2hex doesn't. That's why replace kmalloc() with kzalloc().

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-10-11 12:49:14 -04:00
Scott Mayhew
6e73e92b15 nfsd4: fix up replay_matches_cache()
When running an nfs stress test, I see quite a few cached replies that
don't match up with the actual request.  The first comment in
replay_matches_cache() makes sense, but the code doesn't seem to
match... fix it.

This isn't exactly a bugfix, as the server isn't required to catch every
case of a false retry.  So, we may as well do this, but if this is
fixing a problem then that suggests there's a client bug.

Fixes: 53da6a53e1 ("nfsd4: catch some false session retries")
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-10-09 16:39:31 -04:00
J. Bruce Fields
c4b77edb3f nfsd: "\%s" should be "%s"
Randy says:
> sparse complains about these, as does gcc when used with --pedantic.
> sparse says:
>
> ../fs/nfsd/nfs4state.c:2385:23: warning: unknown escape sequence: '\%'
> ../fs/nfsd/nfs4state.c:2385:23: warning: unknown escape sequence: '\%'
> ../fs/nfsd/nfs4state.c:2388:23: warning: unknown escape sequence: '\%'
> ../fs/nfsd/nfs4state.c:2388:23: warning: unknown escape sequence: '\%'

I'm not sure how this crept in.  Fix it.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-10-08 16:01:33 -04:00
YueHaibing
19a1aad888 nfsd: remove set but not used variable 'len'
Fixes gcc '-Wunused-but-set-variable' warning:

fs/nfsd/nfs4xdr.c: In function nfsd4_encode_splice_read:
fs/nfsd/nfs4xdr.c:3464:7: warning: variable len set but not used [-Wunused-but-set-variable]

It is not used since commit 83a63072c8 ("nfsd: fix nfs read eof detection")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-10-08 16:01:33 -04:00
Linus Torvalds
298fb76a55 Highlights:
- add a new knfsd file cache, so that we don't have to open and
 	  close on each (NFSv2/v3) READ or WRITE.  This can speed up
 	  read and write in some cases.  It also replaces our readahead
 	  cache.
 	- Prevent silent data loss on write errors, by treating write
 	  errors like server reboots for the purposes of write caching,
 	  thus forcing clients to resend their writes.
 	- Tweak the code that allocates sessions to be more forgiving,
 	  so that NFSv4.1 mounts are less likely to hang when a server
 	  already has a lot of clients.
 	- Eliminate an arbitrary limit on NFSv4 ACL sizes; they should
 	  now be limited only by the backend filesystem and the
 	  maximum RPC size.
 	- Allow the server to enforce use of the correct kerberos
 	  credentials when a client reclaims state after a reboot.
 
 And some miscellaneous smaller bugfixes and cleanup.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl2OoFcVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+dRoP/3OW1NxPjpjbCQWZL0M+O3AYJJla
 W8E+uoZKMosFEe/ymokMD0Vn5s47jPaMCifMjHZa2GygW8zHN9X2v0HURx/lob+o
 /rJXwMn78N/8kdbfDz2FvaCPeT0IuNzRIFBV8/sSXofqwCBwvPo+cl0QGrd4/xLp
 X35qlupx62TRk+kbdRjvv8kpS5SJ7BvR+FSA1WubNYWw2hpdEsr2OCFdGq2Wvthy
 DK6AfGBXfJGsOE+HGCSj6ejRV6i0UOJ17P8gRSsx+YT0DOe5E7ROjt+qvvRwk489
 wmR8Vjuqr1e40eGAUq3xuLfk5F5NgycY4ekVxk/cTVFNwWcz2DfdjXQUlyPAbrSD
 SqIyxN1qdKT24gtr7AHOXUWJzBYPWDgObCVBXUGzyL81RiDdhf38HRNjL2TcSDld
 tzCjQ0wbPw+iT74v6qQRY05oS+h3JOtDjU4pxsBnxVtNn4WhGJtaLfWW8o1C1QwU
 bc4aX3TlYhDmzU7n7Zjt4rFXGJfyokM+o6tPao1Z60Pmsv1gOk4KQlzLtW/jPHx4
 ZwYTwVQUKRDBfC62nmgqDyGI3/Qu11FuIxL2KXUCgkwDxNWN4YkwYjOGw9Lb5qKM
 wFpxq6CDNZB/IWLEu8Yg85kDPPUJMoI657lZb7Osr/MfBpU0YljcMOIzLBy8uV1u
 9COUbPaQipiWGu/0
 =diBo
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
 "Highlights:

   - Add a new knfsd file cache, so that we don't have to open and close
     on each (NFSv2/v3) READ or WRITE. This can speed up read and write
     in some cases. It also replaces our readahead cache.

   - Prevent silent data loss on write errors, by treating write errors
     like server reboots for the purposes of write caching, thus forcing
     clients to resend their writes.

   - Tweak the code that allocates sessions to be more forgiving, so
     that NFSv4.1 mounts are less likely to hang when a server already
     has a lot of clients.

   - Eliminate an arbitrary limit on NFSv4 ACL sizes; they should now be
     limited only by the backend filesystem and the maximum RPC size.

   - Allow the server to enforce use of the correct kerberos credentials
     when a client reclaims state after a reboot.

  And some miscellaneous smaller bugfixes and cleanup"

* tag 'nfsd-5.4' of git://linux-nfs.org/~bfields/linux: (34 commits)
  sunrpc: clean up indentation issue
  nfsd: fix nfs read eof detection
  nfsd: Make nfsd_reset_boot_verifier_locked static
  nfsd: degraded slot-count more gracefully as allocation nears exhaustion.
  nfsd: handle drc over-allocation gracefully.
  nfsd: add support for upcall version 2
  nfsd: add a "GetVersion" upcall for nfsdcld
  nfsd: Reset the boot verifier on all write I/O errors
  nfsd: Don't garbage collect files that might contain write errors
  nfsd: Support the server resetting the boot verifier
  nfsd: nfsd_file cache entries should be per net namespace
  nfsd: eliminate an unnecessary acl size limit
  Deprecate nfsd fault injection
  nfsd: remove duplicated include from filecache.c
  nfsd: Fix the documentation for svcxdr_tmpalloc()
  nfsd: Fix up some unused variable warnings
  nfsd: close cached files prior to a REMOVE or RENAME that would replace target
  nfsd: rip out the raparms cache
  nfsd: have nfsd_test_lock use the nfsd_file cache
  nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
  ...
2019-09-27 17:00:27 -07:00
Trond Myklebust
83a63072c8 nfsd: fix nfs read eof detection
Currently, the knfsd server assumes that a short read indicates an
end of file. That assumption is incorrect. The short read means that
either we've hit the end of file, or we've hit a read error.

In the case of a read error, the client may want to retry (as per the
implementation recommendations in RFC1813 and RFC7530), but currently it
is being told that it hit an eof.

Move the code to detect eof from version specific code into the generic
nfsd read.

Report eof only in the two following cases:
1) read() returns a zero length short read with no error.
2) the offset+length of the read is >= the file size.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-23 16:24:08 -04:00
YueHaibing
65643f4c82 nfsd: Make nfsd_reset_boot_verifier_locked static
Fix sparse warning:

fs/nfsd/nfssvc.c:364:6: warning:
 symbol 'nfsd_reset_boot_verifier_locked' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-23 12:20:10 -04:00
NeilBrown
2030ca560c nfsd: degraded slot-count more gracefully as allocation nears exhaustion.
This original code in nfsd4_get_drc_mem() would hand out 30
slots (approximately NFSD_MAX_MEM_PER_SESSION bytes at slightly
over 2K per slot) to each requesting client until it ran out
of space, then it would possibly give one last client a reduced
allocation, then fail the allocation.

Since commit de766e5704 ("nfsd: give out fewer session slots as
limit approaches") the last 90 slots to be given to about 12
clients with quickly reducing slot counts (better than just 3
clients).  This still seems unnecessarily hasty.
A subsequent patch allows over-allocation so every client gets
at least one slot, but that might be a bit restrictive.

The requested number of nfsd threads is the best guide we have to the
expected number of clients, so use that - if it is at least 8.

256 threads on a 256Meg machine - which is a lot for a tiny machine -
would result in nfsd_drc_max_mem being 2Meg, so 8K (3 slots) would be
available for the first client, and over 200 clients would get more
than 1 slot.  So I don't think this change will be too debilitating on
poorly configured machines, though it does mean that a sensible
configuration is a little more important.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-20 12:31:51 -04:00
NeilBrown
7f49fd5d7a nfsd: handle drc over-allocation gracefully.
Currently, if there are more clients than allowed for by the
space allocation in set_max_drc(), we fail a SESSION_CREATE
request with NFS4ERR_DELAY.
This means that the client retries indefinitely, which isn't
a user-friendly response.

The RFC requires NFS4ERR_NOSPC, but that would at best result in a
clean failure on the client, which is not much more friendly.

The current space allocation is a best-guess and doesn't provide any
guarantees, we could still run out of space when trying to allocate
drc space.

So fail more gracefully - always give out at least one slot.
If all clients used all the space in all slots, we might start getting
memory pressure, but that is possible anyway.

So ensure 'num' is always at least 1, and remove the test for it
being zero.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-20 12:30:02 -04:00
Linus Torvalds
e170eb2771 Merge branch 'work.mount-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs mount API infrastructure updates from Al Viro:
 "Infrastructure bits of mount API conversions.

  The rest is more of per-filesystem updates and that will happen
  in separate pull requests"

* 'work.mount-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  mtd: Provide fs_context-aware mount_mtd() replacement
  vfs: Create fs_context-aware mount_bdev() replacement
  new helper: get_tree_keyed()
  vfs: set fs_context::user_ns for reconfigure
2019-09-18 13:15:58 -07:00
Scott Mayhew
6ee95d1c89 nfsd: add support for upcall version 2
Version 2 upcalls will allow the nfsd to include a hash of the kerberos
principal string in the Cld_Create upcall.  If a principal is present in
the svc_cred, then the hash will be included in the Cld_Create upcall.
We attempt to use the svc_cred.cr_raw_principal (which is returned by
gssproxy) first, and then fall back to using the svc_cred.cr_principal
(which is returned by both gssproxy and rpc.svcgssd).  Upon a subsequent
restart, the hash will be returned in the Cld_Gracestart downcall and
stored in the reclaim_str_hashtbl so it can be used when handling
reclaim opens.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-10 09:26:33 -04:00
Scott Mayhew
11a60d1592 nfsd: add a "GetVersion" upcall for nfsdcld
Add a "GetVersion" upcall to allow nfsd to determine the maximum upcall
version that the nfsdcld userspace daemon supports.  If the daemon
responds with -EOPNOTSUPP, then we know it only supports v1.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-10 09:26:33 -04:00
Trond Myklebust
bbf2f09883 nfsd: Reset the boot verifier on all write I/O errors
If multiple clients are writing to the same file, then due to the fact
we share a single file descriptor between all NFSv3 clients writing
to the file, we have a situation where clients can miss the fact that
their file data was not persisted. While this should be rare, it
could cause silent data loss in situations where multiple clients
are using NLM locking or O_DIRECT to write to the same file.
Unfortunately, the stateless nature of NFSv3 and the fact that we
can only identify clients by their IP address means that we cannot
trivially cache errors; we would not know when it is safe to
release them from the cache.

So the solution is to declare a reboot. We understand that this
should be a rare occurrence, since disks are usually stable. The
most frequent occurrence is likely to be ENOSPC, at which point
all writes to the given filesystem are likely to fail anyway.

So the expectation is that clients will be forced to retry their
writes until they hit the fatal error.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-10 09:23:41 -04:00
Trond Myklebust
055b24a8f2 nfsd: Don't garbage collect files that might contain write errors
If a file may contain unstable writes that can error out, then we want
to avoid garbage collecting the struct nfsd_file that may be
tracking those errors.
So in the garbage collector, we try to avoid collecting files that aren't
clean. Furthermore, we avoid immediately kicking off the garbage collector
in the case where the reference drops to zero for the case where there
is a write error that is being tracked.

If the file is unhashed while an error is pending, then declare a
reboot, to ensure the client resends any unstable writes.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-10 09:23:41 -04:00
Trond Myklebust
27c438f53e nfsd: Support the server resetting the boot verifier
Add support to allow the server to reset the boot verifier in order to
force clients to resend I/O after a timeout failure.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-10 09:23:41 -04:00
Trond Myklebust
5e113224c1 nfsd: nfsd_file cache entries should be per net namespace
Ensure that we can safely clear out the file cache entries when the
nfs server is shut down on a container. Otherwise, the file cache
may end up pinning the mounts.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-09-10 09:23:41 -04:00
Al Viro
533770cc0a new helper: get_tree_keyed()
For vfs_get_keyed_super users.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-09-05 14:34:22 -04:00
J. Bruce Fields
2b86e3aaf9 nfsd: eliminate an unnecessary acl size limit
We're unnecessarily limiting the size of an ACL to less than what most
filesystems will support.  Some users do hit the limit and it's
confusing and unnecessary.

It still seems prudent to impose some limit on the number of ACEs the
client gives us before passing it straight to kmalloc().  So, let's just
limit it to the maximum number that would be possible given the amount
of data left in the argument buffer.

That will still leave one limit beyond whatever the filesystem imposes:
the client and server negotiate a limit on the size of a request, which
we have to respect.

But we're no longer imposing any additional arbitrary limit.

struct nfs4_ace is 20 bytes on my system and the maximum call size we'll
negotiate is about a megabyte, so in practice this is limiting the
allocation here to about a megabyte.

Reported-by: "de Vandiere, Louis" <louis.devandiere@atos.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-28 21:13:45 -04:00
J. Bruce Fields
9d60d93198 Deprecate nfsd fault injection
This is only useful for client testing.  I haven't really maintained it,
and reference counting and locking are wrong at this point.  You can get
some of the same functionality now from nfsd/clients/.

It was a good idea but I think its time has passed.

In the unlikely event of users, hopefully the BROKEN dependency will
prompt them to speak up.  Otherwise I expect to remove it soon.

Reported-by: Alex Lyakas <alex@zadara.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-26 10:36:56 -04:00
YueHaibing
bb13f35b96 nfsd: remove duplicated include from filecache.c
Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-20 11:15:34 -04:00
Trond Myklebust
ed9927533a nfsd: Fix the documentation for svcxdr_tmpalloc()
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:10 -04:00
Trond Myklebust
b96811cd02 nfsd: Fix up some unused variable warnings
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:10 -04:00
Jeff Layton
7775ec57f4 nfsd: close cached files prior to a REMOVE or RENAME that would replace target
It's not uncommon for some workloads to do a bunch of I/O to a file and
delete it just afterward. If knfsd has a cached open file however, then
the file may still be open when the dentry is unlinked. If the
underlying filesystem is nfs, then that could trigger it to do a
sillyrename.

On a REMOVE or RENAME scan the nfsd_file cache for open files that
correspond to the inode, and proactively unhash and put their
references. This should prevent any delete-on-last-close activity from
occurring, solely due to knfsd's open file cache.

This must be done synchronously though so we use the variants that call
flush_delayed_fput. There are deadlock possibilities if you call
flush_delayed_fput while holding locks, however. In the case of
nfsd_rename, we don't even do the lookups of the dentries to be renamed
until we've locked for rename.

Once we've figured out what the target dentry is for a rename, check to
see whether there are cached open files associated with it. If there
are, then unwind all of the locking, close them all, and then reattempt
the rename.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:10 -04:00
Jeff Layton
501cb1849f nfsd: rip out the raparms cache
The raparms cache was set up in order to ensure that we carry readahead
information forward from one RPC call to the next. In other words, it
was set up because each RPC call was forced to open a struct file, then
close it, causing the loss of readahead information that is normally
cached in that struct file, and used to keep the page cache filled when
a user calls read() multiple times on the same file descriptor.

Now that we cache the struct file, and reuse it for all the I/O calls
to a given file by a given user, we no longer have to keep a separate
readahead cache.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:09 -04:00
Jeff Layton
6b556ca287 nfsd: have nfsd_test_lock use the nfsd_file cache
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:09 -04:00
Jeff Layton
5c4583b2b7 nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
Have nfs4_preprocess_stateid_op pass back a nfsd_file instead of a filp.
Since we now presume that the struct file will be persistent in most
cases, we can stop fiddling with the raparms in the read code. This
also means that we don't really care about the rd_tmp_file field
anymore.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:09 -04:00
Jeff Layton
eb82dd3937 nfsd: convert fi_deleg_file and ls_file fields to nfsd_file
Have them keep an nfsd_file reference instead of a struct file.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:09 -04:00
Jeff Layton
fd4f83fd7d nfsd: convert nfs4_file->fi_fds array to use nfsd_files
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:09:09 -04:00
Jeff Layton
5920afa3c8 nfsd: hook nfsd_commit up to the nfsd_file cache
Use cached filps if possible instead of opening a new one every time.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-08-19 11:00:40 -04:00