linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-29 15:41:36 +00:00

Author	SHA1	Message	Date
NeilBrown	b0d87dbd8b	NFSD: Refactor nfsd_setuser_and_check_port() There are several places where __fh_verify unconditionally dereferences rqstp to check that the connection is suitably secure. They look at rqstp->rq_xprt which is not meaningful in the target use case of "localio" NFS in which the client talks directly to the local server. Prepare these to always succeed when rqstp is NULL. Signed-off-by: NeilBrown <neilb@suse.de> Co-developed-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:30 -04:00
NeilBrown	0a183f24a7	NFSD: Handle @rqstp == NULL in check_nfsd_access() LOCALIO-initiated open operations are not running in an nfsd thread and thus do not have an associated svc_rqst context. Signed-off-by: NeilBrown <neilb@suse.de> Co-developed-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:29 -04:00
Mike Snitzer	1545e488b1	nfs: factor out {encode,decode}_opaque_fixed to nfs_xdr.h Eliminates duplicate functions in various files to allow for additional callers. Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:29 -04:00
Mike Snitzer	1fcb16674e	nfs_common: factor out nfs4_errtbl and nfs4_stat_to_errno Common nfs4_stat_to_errno() is used by fs/nfs/nfs4xdr.c and will be used by fs/nfs/localio.c Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:29 -04:00
Mike Snitzer	4806ded4c1	nfs_common: factor out nfs_errtbl and nfs_stat_to_errno Common nfs_stat_to_errno() is used by both fs/nfs/nfs2xdr.c and fs/nfs/nfs3xdr.c Will also be used by fs/nfsd/localio.c Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:29 -04:00
Dan Aloni	dfb07e990a	nfs: add 'noalignwrite' option for lock-less 'lost writes' prevention There are some applications that write to predefined non-overlapping file offsets from multiple clients and therefore don't need to rely on file locking. However, if these applications want non-aligned offsets and sizes they need to either use locks or risk data corruption, as the NFS client defaults to extending writes to whole pages. This commit adds a new mount option `noalignwrite`, which allows to turn that off and avoid the need of locking, as long as these applications don't overlap on offsets. Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:13 -04:00
Li Lingfeng	6d26c5e4d8	nfs: fix the comment of nfs_get_root The comment for nfs_get_root() needs to be updated as it would also be used by NFS4 as follows: @x[ nfs_get_root+1 nfs_get_tree_common+1819 nfs_get_tree+2594 vfs_get_tree+73 fc_mount+23 do_nfs4_mount+498 nfs4_try_get_tree+134 nfs_get_tree+2562 vfs_get_tree+73 path_mount+2776 do_mount+226 __se_sys_mount+343 __x64_sys_mount+106 do_syscall_64+69 entry_SYSCALL_64_after_hwframe+97 , mount.nfs4]: 1 Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:13 -04:00
Roi Azarzar	615e693b14	NFSv4.2: Fix detection of "Proxying of Times" server support According to draft-ietf-nfsv4-delstid-07: If a server informs the client via the fattr4_open_arguments attribute that it supports OPEN_ARGS_SHARE_ACCESS_WANT_DELEG_TIMESTAMPS and it returns a valid delegation stateid for an OPEN operation which sets the OPEN4_SHARE_ACCESS_WANT_DELEG_TIMESTAMPS flag, then it MUST query the client via a CB_GETATTR for the fattr4_time_deleg_access (see Section 5.2) attribute and fattr4_time_deleg_modify attribute (see Section 5.2). Thus, we should look that the server supports proxying of times via OPEN4_SHARE_ACCESS_WANT_DELEG_TIMESTAMPS. We want to be extra pedantic and continue to check that FATTR4_TIME_DELEG_ACCESS and FATTR4_TIME_DELEG_MODIFY are set. The server needs to expose both for the client to correctly detect "Proxying of Times" support. Signed-off-by: Roi Azarzar <roi.azarzar@vastdata.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: `dcb3c20f74` ("NFSv4: Add a capability for delegated attributes") Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:13 -04:00
Trond Myklebust	af94dca79b	NFSv4: Fail mounts if the lease setup times out If the server is down when the client is trying to mount, so that the calls to exchange_id or create_session fail, then we should allow the mount system call to fail rather than hang and block other mount/umount calls. Reported-by: Oleksandr Tymoshenko <ovt@google.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:13 -04:00
Zhaoyang Huang	03e02b9417	fs: nfs: fix missing refcnt by replacing folio_set_private by folio_attach_private This patch is inspired by a code review of fs codes which aims at folio's extra refcnt that could introduce unwanted behavious when judging refcnt, such as[1].That is, the folio passed to mapping_evict_folio carries the refcnts from find_lock_entries, page_cache, corresponding to PTEs and folio's private if has. However, current code doesn't take the refcnt for folio's private which could have mapping_evict_folio miss the one to only PTE and lead to call filemap_release_folio wrongly. [1] long mapping_evict_folio(struct address_space mapping, struct folio folio) { ... //current code will misjudge here if there is one pte on the folio which is be deemed as the one as folio's private if (folio_ref_count(folio) > folio_nr_pages(folio) + folio_has_private(folio) + 1) return 0; if (!filemap_release_folio(folio, 0)) return 0; return remove_mapping(mapping, folio); } Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:13 -04:00
Gaosheng Cui	40c80881eb	nfs: Remove obsoleted declaration for nfs_read_prepare The nfs_read_prepare() have been removed since commit `a4cdda5911` ("NFS: Create a common pgio_rpc_prepare function"), and now it is useless, so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:13 -04:00
Thorsten Blum	e343678ee9	nfs: Remove unnecessary NULL check before kfree() Since kfree() already checks if its argument is NULL, an additional check before calling kfree() is unnecessary and can be removed. Remove it and thus also the following Coccinelle/coccicheck warning reported by ifnullfree.cocci: WARNING: NULL check before some freeing functions is not needed Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:12 -04:00
Thorsten Blum	bb8e4ce500	nfs: Annotate struct nfs_cache_array with __counted_by() Add the __counted_by compiler attribute to the flexible array member array to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Increment size before adding a new struct to the array. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:12 -04:00
NeilBrown	d98f722725	nfs: simplify and guarantee owner uniqueness. I have evidence of an Linux NFS client getting NFS4ERR_BAD_SEQID to a v4.0 LOCK request to a Linux server (which had fixed the problem with RELEASE_LOCKOWNER bug fixed). The LOCK request presented a "new" lock owner so there are two seq ids in the request: that for the open file, and that for the new lock. Given the context I am confident that the new lock owner was reported to have the wrong seqid. As lock owner identifiers are reused, the server must still have a lock owner active which the client thinks is no longer active. I wasn't able to determine a root-cause but the simplest fix seems to be to ensure lock owners are always unique much as open owners are (thanks to a time stamp). The easiest way to ensure uniqueness is with a 64bit counter for each server. That will never cycle (if updated once a nanosecond the last 584 years. A single NFS server would not handle open/lock requests nearly that fast, and a Linux node is unlikely to have an uptime approaching that). This patch removes the 2 ida and instead uses a per-server atomic64_t to provide uniqueness. Note that the lock owner already encodes the id as 64 bits even though it is a 32bit value. So changing to a 64bit value does not change the encoding of the lock owner. The open owner encoding is now 4 bytes larger. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:12 -04:00
Li Lingfeng	8f6a7c9467	nfs: fix memory leak in error path of nfs4_do_reclaim Commit `c77e22834a` ("NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()") separate out the freeing of the state owners from nfs4_purge_state_owners() and finish it outside the rcu lock. However, the error path is omitted. As a result, the state owners in "freeme" will not be released. Fix it by adding freeing in the error path. Fixes: `c77e22834a` ("NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()") Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>	2024-09-23 15:03:12 -04:00
Anna Schumaker	8c04a6d6e0	NFSD 6.12 Release Notes Notable features of this release include: - Pre-requisites for automatically determining the RPC server thread count - Clean-up and preparation for supporting LOCALIO, which will be merged via the NFS client tree - Enhancements and fixes to NFSv4.2 COPY offload - A new Python-based tool for generating kernel SunRPC XDR encoding and decoding functions, added as an aid for prototyping features in protocols based on the Linux kernel's SunRPC implementation. As always I am grateful to the NFSD contributors, reviewers, testers, and bug reporters who participated during this cycle. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmbxg60ACgkQM2qzM29m f5d+9A/+LiXAjR3x1vlbGFiMAW3Alixg5wE6AM7M1I/OH/dBCkWU1gzneWYaUXAk cIGp5sH2Uco2mFVswOZyQ3tX8T/2PeY+Kx5qrlK5h0bTUoz95AIyLe3LA/4o4CIL qMGlLQyVq9UolggPoRdigsDhKVwLcu3hWaG7ykkTquyrOPLBKgzRNSwVKLpFc/0/ mQToOf6HLjgFkEUR3pmXAMsVq88/BpjHIXeNhx2Z1ekWslSKjrAu2gC0rc6/s9Wi JsTtzSdnqefc2jsNVZ8FT+V7mDF1sxrN4SnHruSLhJsd5tL/3HDkiZEvdG2Sh0nH zQlDpMpNbZyCvaWs6jgaZeMRiNSSl7q31zXUgX2bkWpL/EnagujZHtLZroUgLQfA BO8HhRqdt1wJohiv2aMlFvnlp+GhSH5FdcXv1cT/CmyTNGqbXENqoCUA1OT9kE55 RvXVCLD4YbmCb5EpjLavhu/NuFOc9l9GitKlhiJlcX86QAu/C1Bu1DOyqgq5G0VW Xl/q7xIvNZz0mh7x8kKVV4bQHsm9pnoNz57CZFPahoHg/+BR4u6p8LepowpaHjHj Ef62BzYwQtuw0jCyufDea+uCt5CGwUM3Y5iBiQogtnvFK6ie8WwD0QTI2SYpcWZ/ T6RwDOX5jlMWWmuibSK2STgwkblG3vAmMot0RtEbZILvB/ld9qw= =Ybsc -----END PGP SIGNATURE----- Merge tag 'nfsd-6.12' into linux-next-with-localio NFSD 6.12 Release Notes Notable features of this release include: - Pre-requisites for automatically determining the RPC server thread count - Clean-up and preparation for supporting LOCALIO, which will be merged via the NFS client tree - Enhancements and fixes to NFSv4.2 COPY offload - A new Python-based tool for generating kernel SunRPC XDR encoding and decoding functions, added as an aid for prototyping features in protocols based on the Linux kernel's SunRPC implementation. As always I am grateful to the NFSD contributors, reviewers, testers, and bug reporters who participated during this cycle.	2024-09-23 15:00:07 -04:00
NeilBrown	45bb63ed20	nfsd: fix delegation_blocked() to block correctly for at least 30 seconds The pair of bloom filtered used by delegation_blocked() was intended to block delegations on given filehandles for between 30 and 60 seconds. A new filehandle would be recorded in the "new" bit set. That would then be switch to the "old" bit set between 0 and 30 seconds later, and it would remain as the "old" bit set for 30 seconds. Unfortunately the code intended to clear the old bit set once it reached 30 seconds old, preparing it to be the next new bit set, instead cleared the new bit set before switching it to be the old bit set. This means that the "old" bit set is always empty and delegations are blocked between 0 and 30 seconds. This patch updates bd->new before clearing the set with that index, instead of afterwards. Reported-by: Olga Kornievskaia <okorniev@redhat.com> Cc: stable@vger.kernel.org Fixes: `6282cd5655` ("NFSD: Don't hand out delegations for 30 seconds after recalling them.") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:38 -04:00
Jeff Layton	bf92e5008b	nfsd: fix initial getattr on write delegation At this point in compound processing, currentfh refers to the parent of the file, not the file itself. Get the correct dentry from the delegation stateid instead. Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:37 -04:00
NeilBrown	a078a7dc0e	nfsd: untangle code in nfsd4_deleg_getattr_conflict() The code in nfsd4_deleg_getattr_conflict() is convoluted and buggy. With this patch we: - properly handle non-nfsd leases. We must not assume flc_owner is a delegation unless fl_lmops == &nfsd_lease_mng_ops - move the main code out of the for loop - have a single exit which calls nfs4_put_stid() (and other exits which don't need to call that) [ jlayton: refactored on top of Neil's other patch: nfsd: fix nfsd4_deleg_getattr_conflict in presence of third party lease ] Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:36 -04:00
Scott Mayhew	5559c157b7	nfsd: enforce upper limit for namelen in __cld_pipe_inprogress_downcall() This patch is intended to go on top of "nfsd: return -EINVAL when namelen is 0" from Li Lingfeng. Li's patch checks for 0, but we should be enforcing an upper bound as well. Note that if nfsdcld somehow gets an id > NFS4_OPAQUE_LIMIT in its database, it'll truncate it to NFS4_OPAQUE_LIMIT when it does the downcall anyway. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:35 -04:00
Li Lingfeng	22451a16b7	nfsd: return -EINVAL when namelen is 0 When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may result in namelen being 0, which will cause memdup_user() to return ZERO_SIZE_PTR. When we access the name.data that has been assigned the value of ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is triggered. [ T1205] ================================================================== [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260 [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205 [ T1205] [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406 [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ T1205] Call Trace: [ T1205] dump_stack+0x9a/0xd0 [ T1205] ? nfs4_client_to_reclaim+0xe9/0x260 [ T1205] __kasan_report.cold+0x34/0x84 [ T1205] ? nfs4_client_to_reclaim+0xe9/0x260 [ T1205] kasan_report+0x3a/0x50 [ T1205] nfs4_client_to_reclaim+0xe9/0x260 [ T1205] ? nfsd4_release_lockowner+0x410/0x410 [ T1205] cld_pipe_downcall+0x5ca/0x760 [ T1205] ? nfsd4_cld_tracking_exit+0x1d0/0x1d0 [ T1205] ? down_write_killable_nested+0x170/0x170 [ T1205] ? avc_policy_seqno+0x28/0x40 [ T1205] ? selinux_file_permission+0x1b4/0x1e0 [ T1205] rpc_pipe_write+0x84/0xb0 [ T1205] vfs_write+0x143/0x520 [ T1205] ksys_write+0xc9/0x170 [ T1205] ? __ia32_sys_read+0x50/0x50 [ T1205] ? ktime_get_coarse_real_ts64+0xfe/0x110 [ T1205] ? ktime_get_coarse_real_ts64+0xa2/0x110 [ T1205] do_syscall_64+0x33/0x40 [ T1205] entry_SYSCALL_64_after_hwframe+0x67/0xd1 [ T1205] RIP: 0033:0x7fdbdb761bc7 [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514 [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7 [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008 [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001 [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000 [ T1205] ================================================================== Fix it by checking namelen. Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Fixes: `74725959c3` ("nfsd: un-deprecate nfsdcld") Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Scott Mayhew <smayhew@redhat.com> Tested-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:35 -04:00
Chuck Lever	0505de9615	NFSD: Wrap async copy operations with trace points Add an nfsd_copy_async_done to record the timestamp, the final status code, and the callback stateid of an async copy. Rename the nfsd_copy_do_async tracepoint to match that naming convention to make it easier to enable both of these with a single glob. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Chuck Lever	d3c430aa97	NFSD: Clean up extra whitespace in trace_nfsd_copy_done Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Chuck Lever	e1d2697c53	NFSD: Record the callback stateid in copy tracepoints Match COPY operations up with CB_OFFLOAD operations. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Chuck Lever	11848e985c	NFSD: Display copy stateids with conventional print formatting Make it easier to grep for s2s COPY stateids in trace logs: Use the same display format in nfsd_copy_class as is used to display other stateids. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Chuck Lever	aadc3bbea1	NFSD: Limit the number of concurrent async COPY operations Nothing appears to limit the number of concurrent async COPY operations that clients can start. In addition, AFAICT each async COPY can copy an unlimited number of 4MB chunks, so can run for a long time. Thus IMO async COPY can become a DoS vector. Add a restriction mechanism that bounds the number of concurrent background COPY operations. Start simple and try to be fair -- this patch implements a per-namespace limit. An async COPY request that occurs while this limit is exceeded gets NFS4ERR_DELAY. The requesting client can choose to send the request again after a delay or fall back to a traditional read/write style copy. If there is need to make the mechanism more sophisticated, we can visit that in future patches. Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Chuck Lever	9ed666eba4	NFSD: Async COPY result needs to return a write verifier Currently, when NFSD handles an asynchronous COPY, it returns a zero write verifier, relying on the subsequent CB_OFFLOAD callback to pass the write verifier and a stable_how4 value to the client. However, if the CB_OFFLOAD never arrives at the client (for example, if a network partition occurs just as the server sends the CB_OFFLOAD operation), the client will never receive this verifier. Thus, if the client sends a follow-up COMMIT, there is no way for the client to assess the COMMIT result. The usual recovery for a missing CB_OFFLOAD is for the client to send an OFFLOAD_STATUS operation, but that operation does not carry a write verifier in its result. Neither does it carry a stable_how4 value, so the client /must/ send a COMMIT in this case -- which will always fail because currently there's still no write verifier in the COPY result. Thus the server needs to return a normal write verifier in its COPY result even if the COPY operation is to be performed asynchronously. If the server recognizes the callback stateid in subsequent OFFLOAD_STATUS operations, then obviously it has not restarted, and the write verifier the client received in the COPY result is still valid and can be used to assess a COMMIT of the copied data, if one is needed. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	15392c8cd1	nfsd: avoid races with wake_up_var() wake_up_var() needs a barrier after the important change is made in the var and before wake_up_var() is called, else it is possible that a wake up won't be sent when it should. In each case here the var is changed in an "atomic" manner, so smb_mb__after_atomic() is sufficient. In one case the important change (removing the lease) is performed after the wake_up, which is backwards. The code survives in part because the wait_var_event is given a timeout. This patch adds the required barriers and calls destroy_delegation() before waking any threads waiting for the delegation to be destroyed. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	985eeae9c8	nfsd: use clear_and_wake_up_bit() nfsd has two places that open-code clear_and_wake_up_bit(). One has the required memory barriers. The other does not. Change both to use clear_and_wake_up_bit() so we have the barriers without the noise. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Thorsten Blum	2869b3a00e	NFSD: Annotate struct pnfs_block_deviceaddr with __counted_by() Add the __counted_by compiler attribute to the flexible array member volumes to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Use struct_size() instead of manually calculating the number of bytes to allocate for a pnfs_block_deviceaddr with a single volume. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Guoqing Jiang	d078cbf5c3	nfsd: call cache_put if xdr_reserve_space returns NULL If not enough buffer space available, but idmap_lookup has triggered lookup_fn which calls cache_get and returns successfully. Then we missed to call cache_put here which pairs with cache_get. Fixes: `ddd1ea5636` ("nfsd4: use xdr_reserve_space in attribute encoding") Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Reviwed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Jeff Layton	ba017fd391	nfsd: add more nfsd_cb tracepoints Add some tracepoints in the callback client RPC operations. Also add a tracepoint to nfsd4_cb_getattr_done. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Jeff Layton	c1c9f3ea74	nfsd: track the main opcode for callbacks Keep track of the "main" opcode for the callback, and display it in the tracepoint. This makes it simpler to discern what's happening when there is more than one callback in flight. The one special case is the CB_NULL RPC. That's not a CB_COMPOUND opcode, so designate the value 0 for that. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Jeff Layton	e8581a9124	nfsd: add more info to WARN_ON_ONCE on failed callbacks Currently, you get the warning and stack trace, but nothing is printed about the relevant error codes. Add that in. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Li Lingfeng	76a3f3f164	nfsd: fix some spelling errors in comments Fix spelling errors in comments of nfsd4_release_lockowner and nfs4_set_delegation. Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Li Lingfeng	eb059a413c	nfsd: remove unused parameter of nfsd_file_mark_find_or_create Commit `427f5f83a3` ("NFSD: Ensure nf_inode is never dereferenced") passes inode directly to nfsd_file_mark_find_or_create instead of getting it from nf, so there is no need to pass nf. Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Hongbo Li	c2feb7ee39	nfsd: use LIST_HEAD() to simplify code list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Li Lingfeng	340e61e44c	nfsd: map the EBADMSG to nfserr_io to avoid warning Ext4 will throw -EBADMSG through ext4_readdir when a checksum error occurs, resulting in the following WARNING. Fix it by mapping EBADMSG to nfserr_io. nfsd_buffered_readdir iterate_dir // -EBADMSG -74 ext4_readdir // .iterate_shared ext4_dx_readdir ext4_htree_fill_tree htree_dirblock_to_tree ext4_read_dirblock __ext4_read_dirblock ext4_dirblock_csum_verify warn_no_space_for_csum __warn_no_space_for_csum return ERR_PTR(-EFSBADCRC) // -EBADMSG -74 nfserrno // WARNING [ 161.115610] ------------[ cut here ]------------ [ 161.116465] nfsd: non-standard errno: -74 [ 161.117315] WARNING: CPU: 1 PID: 780 at fs/nfsd/nfsproc.c:878 nfserrno+0x9d/0xd0 [ 161.118596] Modules linked in: [ 161.119243] CPU: 1 PID: 780 Comm: nfsd Not tainted 5.10.0-00014-g79679361fd5d #138 [ 161.120684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qe mu.org 04/01/2014 [ 161.123601] RIP: 0010:nfserrno+0x9d/0xd0 [ 161.124676] Code: 0f 87 da 30 dd 00 83 e3 01 b8 00 00 00 05 75 d7 44 89 ee 48 c7 c7 c0 57 24 98 89 44 24 04 c6 05 ce 2b 61 03 01 e8 99 20 d8 00 <0f> 0b 8b 44 24 04 eb b5 4c 89 e6 48 c7 c7 a0 6d a4 99 e8 cc 15 33 [ 161.127797] RSP: 0018:ffffc90000e2f9c0 EFLAGS: 00010286 [ 161.128794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 161.130089] RDX: 1ffff1103ee16f6d RSI: 0000000000000008 RDI: fffff520001c5f2a [ 161.131379] RBP: 0000000000000022 R08: 0000000000000001 R09: ffff8881f70c1827 [ 161.132664] R10: ffffed103ee18304 R11: 0000000000000001 R12: 0000000000000021 [ 161.133949] R13: 00000000ffffffb6 R14: ffff8881317c0000 R15: ffffc90000e2fbd8 [ 161.135244] FS: 0000000000000000(0000) GS:ffff8881f7080000(0000) knlGS:0000000000000000 [ 161.136695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 161.137761] CR2: 00007fcaad70b348 CR3: 0000000144256006 CR4: 0000000000770ee0 [ 161.139041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 161.140291] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 161.141519] PKRU: 55555554 [ 161.142076] Call Trace: [ 161.142575] ? __warn+0x9b/0x140 [ 161.143229] ? nfserrno+0x9d/0xd0 [ 161.143872] ? report_bug+0x125/0x150 [ 161.144595] ? handle_bug+0x41/0x90 [ 161.145284] ? exc_invalid_op+0x14/0x70 [ 161.146009] ? asm_exc_invalid_op+0x12/0x20 [ 161.146816] ? nfserrno+0x9d/0xd0 [ 161.147487] nfsd_buffered_readdir+0x28b/0x2b0 [ 161.148333] ? nfsd4_encode_dirent_fattr+0x380/0x380 [ 161.149258] ? nfsd_buffered_filldir+0xf0/0xf0 [ 161.150093] ? wait_for_concurrent_writes+0x170/0x170 [ 161.151004] ? generic_file_llseek_size+0x48/0x160 [ 161.151895] nfsd_readdir+0x132/0x190 [ 161.152606] ? nfsd4_encode_dirent_fattr+0x380/0x380 [ 161.153516] ? nfsd_unlink+0x380/0x380 [ 161.154256] ? override_creds+0x45/0x60 [ 161.155006] nfsd4_encode_readdir+0x21a/0x3d0 [ 161.155850] ? nfsd4_encode_readlink+0x210/0x210 [ 161.156731] ? write_bytes_to_xdr_buf+0x97/0xe0 [ 161.157598] ? __write_bytes_to_xdr_buf+0xd0/0xd0 [ 161.158494] ? lock_downgrade+0x90/0x90 [ 161.159232] ? nfs4svc_decode_voidarg+0x10/0x10 [ 161.160092] nfsd4_encode_operation+0x15a/0x440 [ 161.160959] nfsd4_proc_compound+0x718/0xe90 [ 161.161818] nfsd_dispatch+0x18e/0x2c0 [ 161.162586] svc_process_common+0x786/0xc50 [ 161.163403] ? nfsd_svc+0x380/0x380 [ 161.164137] ? svc_printk+0x160/0x160 [ 161.164846] ? svc_xprt_do_enqueue.part.0+0x365/0x380 [ 161.165808] ? nfsd_svc+0x380/0x380 [ 161.166523] ? rcu_is_watching+0x23/0x40 [ 161.167309] svc_process+0x1a5/0x200 [ 161.168019] nfsd+0x1f5/0x380 [ 161.168663] ? nfsd_shutdown_threads+0x260/0x260 [ 161.169554] kthread+0x1c4/0x210 [ 161.170224] ? kthread_insert_work_sanity_check+0x80/0x80 [ 161.171246] ret_from_fork+0x1f/0x30 Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Li Lingfeng	2039c5da5d	NFSD: remove redundant assignment operation Commit `5826e09bf3` ("NFSD: OP_CB_RECALL_ANY should recall both read and write delegations") added a new assignment statement to add RCA4_TYPE_MASK_WDATA_DLG to ra_bmval bitmask of OP_CB_RECALL_ANY. So the old one should be removed. Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Chuck Lever	202f39039a	NFSD: Fix NFSv4's PUTPUBFH operation According to RFC 8881, all minor versions of NFSv4 support PUTPUBFH. Replace the XDR decoder for PUTPUBFH with a "noop" since we no longer want the minorversion check, and PUTPUBFH has no arguments to decode. (Ideally nfsd4_decode_noop should really be called nfsd4_decode_void). PUTPUBFH should now behave just like PUTROOTFH. Reported-by: Cedric Blancher <cedric.blancher@gmail.com> Fixes: `e1a90ebd8b` ("NFSD: Combine decode operations for v4 and v4.1") Cc: Dan Shelton <dan.f.shelton@gmail.com> Cc: Roland Mainz <roland.mainz@nrubsig.org> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Mark Grimes	32b34fa485	nfsd: Add quotes to client info 'callback address' The 'callback address' in client_info_show is output without quotes causing yaml parsers to fail on processing IPv6 addresses. Adding quotes to 'callback address' also matches that used by the 'address' field. Signed-off-by: Mark Grimes <mark.grimes@ixsystems.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	438f81e0e9	nfsd: move error choice for incorrect object types to version-specific code. If an NFS operation expects a particular sort of object (file, dir, link, etc) but gets a file handle for a different sort of object, it must return an error. The actual error varies among NFS versions in non-trivial ways. For v2 and v3 there are ISDIR and NOTDIR errors and, for NFSv4 only, INVAL is suitable. For v4.0 there is also NFS4ERR_SYMLINK which should be used if a SYMLINK was found when not expected. This take precedence over NOTDIR. For v4.1+ there is also NFS4ERR_WRONG_TYPE which should be used in preference to EINVAL when none of the specific error codes apply. When nfsd_mode_check() finds a symlink where it expected a directory it needs to return an error code that can be converted to NOTDIR for v2 or v3 but will be SYMLINK for v4. It must be different from the error code returns when it finds a symlink but expects a regular file - that must be converted to EINVAL or SYMLINK. So we introduce an internal error code nfserr_symlink_not_dir which each version converts as appropriate. nfsd_check_obj_isreg() is similar to nfsd_mode_check() except that it is only used by NFSv4 and only for OPEN. NFSERR_INVAL is never a suitable error if the object is the wrong time. For v4.0 we use nfserr_symlink for non-dirs even if not a symlink. For v4.1 we have nfserr_wrong_type. We handle this difference in-place in nfsd_check_obj_isreg() as there is nothing to be gained by delaying the choice to nfsd4_map_status(). As a result of these changes, nfsd_mode_check() doesn't need an rqstp arg any more. Note that NFSv4 operations are actually performed in the xdr code(!!!) so to the only place that we can map the status code successfully is in nfsd4_encode_operation(). Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	36ffa3d0de	nfsd: be more systematic about selecting error codes for internal use. Rather than using ad hoc values for internal errors (30000, 11000, ...) use 'enum' to sequentially allocate numbers starting from the first known available number - now visible as NFS4ERR_FIRST_FREE. The goal is values that are distinct from all be32 error codes. To get those we must first select integers that are not already used, then convert them with cpu_to_be32(). Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	1459ad5767	nfsd: Move error code mapping to per-version proc code. There is code scattered around nfsd which chooses an error status based on the particular version of nfs being used. It is cleaner to have the version specific choices in version specific code. With this patch common code returns the most specific error code possible and the version specific code maps that if necessary. Both v2 (nfsproc.c) and v3 (nfs3proc.c) now have a "map_status()" function which is called to map the resp->status before each non-trivial nfsd_proc_* or nfsd3_proc_* function returns. NFS4ERR_SYMLINK and NFS4ERR_WRONG_TYPE introduce extra complications and are left for a later patch. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	ef7f6c4904	nfsd: move V4ROOT version check to nfsd_set_fh_dentry() This further centralizes version number checks. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	c689bdd3bf	nfsd: further centralize protocol version checks. With this patch the only places that test ->rq_vers against a specific version are nfsd_v4client() and nfsd_set_fh_dentry(). The latter sets some flags in the svc_fh, which now includes: fh_64bit_cookies fh_use_wgather Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	4f67d24f72	nfsd: use nfsd_v4client() in nfsd_breaker_owns_lease() nfsd_breaker_owns_lease() currently open-codes the same test that nfsd_v4client() performs. With this patch we use nfsd_v4client() instead. Also as i_am_nfsd() is only used in combination with kthread_data(), replace it with nfsd_current_rqst() which combines the two and returns a valid svc_rqst, or NULL. The test for NULL is moved into nfsd_v4client() for code clarity. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	9fd45c16f3	nfsd: Pass 'cred' instead of 'rqstp' to some functions. nfsd_permission(), exp_rdonly(), nfsd_setuser(), and nfsexp_flags() only ever need the cred out of rqstp, so pass it explicitly instead of the whole rqstp. This makes the interfaces cleaner. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	c55aeef776	nfsd: Don't pass all of rqst into rqst_exp_find() Rather than passing the whole rqst, pass the pieces that are actually needed. This makes the inputs to rqst_exp_find() more obvious. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Sagi Grimberg	11673b2a91	nfsd: don't assume copy notify when preprocessing the stateid Move the stateid handling to nfsd4_copy_notify. If nfs4_preprocess_stateid_op did not produce an output stateid, error out. Copy notify specifically does not permit the use of special stateids, so enforce that outside generic stateid pre-processing. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00

1 2 3 4 5 ...

93258 Commits