In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding a couple of break statements instead of
just letting the code fall through to the next case.
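For example, a case arm that previously fell through now ends with an
explicit break (the command and handler names below are placeholders,
not a specific hunk from this patch):

    switch (cmd) {
    case FIRST_OP:
            handle_first_op();
            break;          /* was an implicit fall through */
    case SECOND_OP:
            handle_second_op();
            break;
    }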
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
It's OK to grant a read delegation to a client that holds a write,
as long as it's the only client holding the write.
We originally tried to do this in commit 94415b06eb ("nfsd4: a
client's own opens needn't prevent delegations"), which had to be
reverted in commit 6ee65a7730 ("Revert "nfsd4: a client's own
opens needn't prevent delegations"").
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
No change in behavior, I'm just moving some code around to avoid forward
references in a following patch.
(To do someday: figure out how to split up nfs4state.c. It's big and
disorganized.)
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
It's unusual but possible for multiple filehandles to point to the same
file. In that case, we may end up with multiple nfs4_files referencing
the same inode.
For delegation purposes it will turn out to be useful to flag those
cases.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The nfs4_file structure is per-filehandle, not per-inode, because the
spec requires open and other state to be per filehandle.
But it will turn out to be convenient for nfs4_files associated with the
same inode to be hashed to the same bucket, so let's hash on the inode
instead of the filehandle.
Filehandle aliasing is rare, so that shouldn't have much performance
impact.
(If you have a ton of exported filesystems, though, and all of them have
a root with inode number 2, could that get you an overlong hash chain?
Perhaps this (and the v4 open file cache) should be hashed on the inode
pointer instead.)
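For illustration, the two alternatives look roughly like this, using the
generic helpers from <linux/hash.h> and the FILE_HASH_BITS sizing of the
existing table (a sketch only, not the actual patch):

    #include <linux/fs.h>
    #include <linux/hash.h>

    /* Hash on the inode number, as this patch does conceptually: */
    static unsigned int file_hashval(struct inode *inode)
    {
            return hash_long(inode->i_ino, FILE_HASH_BITS);
    }

    /* Hash on the inode pointer instead, which would keep the roots of
     * many exported filesystems (all inode number 2) in separate
     * buckets: */
    static unsigned int file_hashval_by_ptr(struct inode *inode)
    {
            return hash_ptr(inode, FILE_HASH_BITS);
    }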
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
If nfsd already has an open file that it plans to use for IO from
another client, it may not need to do another vfs open, but it still may
need to break any delegations in case the existing opens are for a
different client.
Symptoms are that we may incorrectly fail to break a delegation on a
write open from a different client, when the delegation-holding client
already has a write open.
Fixes: 28df3d1539 ("nfsd: clients don't need to break their own delegations")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Since commit 501cb1849f ("nfsd: rip out the raparms cache"), nrservs is
no longer used in nfsd_startup_generic().
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Capture error codes in @ret, which is passed to the send_err
tracepoint, so that they can be logged when something goes awry.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than explicitly calling spin_lock_init().
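For example (the lock names here are placeholders):

    #include <linux/spinlock.h>

    /* Before: declare the lock, then initialize it at runtime. */
    static spinlock_t old_style_lock;

    static void old_style_setup(void)
    {
            spin_lock_init(&old_style_lock);
    }

    /* After: one definition, fully initialized at compile time. */
    static DEFINE_SPINLOCK(new_style_lock);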
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Guobin Huang <huangguobin4@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
This patch fixes Dan Carpenter's report that the static checker found a
memcpy() copying into a buffer that is too small.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e0639dc580 ("NFSD introduce async copy feature")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Dai Ngo <dai.ngo@oracle.com>
This, to me, seems less cluttered and less redundant. I was hoping
it could help reduce lock contention on the dto_q lock by reducing
the size of the critical section, but alas, the only improvement is
readability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
These fields are no longer used.
The size of struct svc_rdma_recv_ctxt is now less than 300 bytes on
x86_64, down from 2440 bytes.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Currently the generic RPC server layer calls svc_rdma_recvfrom()
twice to retrieve an RPC message that uses Read chunks. I'm not
exactly sure why this design was chosen originally.
Instead, let's wait for the Read chunk completion inline in the
first call to svc_rdma_recvfrom().
The goal is to eliminate some page allocator churn.
rdma_read_complete() replaces pages in the second svc_rqst by
calling put_page() repeatedly while the upper layer waits for the
request to be constructed, which adds unnecessary NFS WRITE round-
trip latency.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Currently, XPT_BUSY is not cleared until xpo_recvfrom returns.
That effectively blocks the receipt and handling of the next RPC
message until the current one has been taken off the transport.
This strict ordering is a requirement for socket transports.
For our kernel RPC/RDMA transport implementation, however, dequeuing
an ingress message is nothing more than a list_del(). The transport
can safely be marked un-busy as soon as that is done.
To keep the changes simpler, this patch just moves the
svc_xprt_received() call site from svc_handle_xprt() into the
transports, so that the actual optimization can be done in a
subsequent patch.
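For the RDMA receive path, the eventual shape of that optimization is
roughly the following (a sketch based on the existing svcrdma dequeue
code; rdma_xprt and xprt are the transport pointers already available in
svc_rdma_recvfrom(), and the exact placement is left to the later patch):

    struct svc_rdma_recv_ctxt *ctxt;

    spin_lock(&rdma_xprt->sc_rq_dto_lock);
    ctxt = svc_rdma_next_recv_ctxt(&rdma_xprt->sc_rq_dto_q);
    if (ctxt)
            list_del(&ctxt->rc_list);
    spin_unlock(&rdma_xprt->sc_rq_dto_lock);

    /* The message is off the transport's queue; the transport can be
     * marked un-busy so another nfsd thread may receive the next one. */
    svc_xprt_received(xprt);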
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Prepare svc_xprt_received() to be called from transport code instead
of from generic RPC server code.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
svc_rdma_sendto() now waits for the NIC hardware to finish with
the pages backing rq_res. We still have to release the page array
in some cases, but now it's always safe to immediately re-use the
page backing rq_res's head buffer.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Currently svc_rdma_sendto() migrates xdr_buf pages into a separate
page list and NULLs out a bunch of entries in rq_pages while the
pages are under I/O. The Send completion handler then frees those
pages later.
Instead, let's wait for the Send completion, then handle page
releasing in the nfsd thread. I'd like to avoid the cost of 250+
put_page() calls in the Send completion handler, which is single-
threaded.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Refactor a bit of commonly used logic so that every site that wants
a close deferred to an nfsd thread does all the right things
(set_bit(XPT_CLOSE) then enqueue).
Also, once XPT_CLOSE is set on a transport, it is never cleared. If
XPT_CLOSE is already set, then the close is already being handled
and the enqueue can be skipped.
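A helper capturing that pattern might look like this (the helper name is
illustrative; XPT_CLOSE, xpt_flags, and svc_xprt_enqueue() are the
existing sunrpc symbols):

    #include <linux/sunrpc/svc_xprt.h>

    static void example_deferred_close(struct svc_xprt *xprt)
    {
            /* XPT_CLOSE is never cleared once set; if it was already
             * set, the close is in progress and the enqueue can be
             * skipped. */
            if (!test_and_set_bit(XPT_CLOSE, &xprt->xpt_flags))
                    svc_xprt_enqueue(xprt);
    }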
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Post more Receives when the number of pending Receives drops below
a water mark. The batch mechanism is disabled if the underlying
device cannot support a reasonably-sized Receive Queue.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Replace svc_rdma_post_recv() with the new batch receive mechanism.
For the moment it is posting just a single Receive WR at a time,
so no change in behavior is expected.
Since svc_rdma_wc_receive() was the last call site for
svc_rdma_post_recv(), it is removed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Introduce a server-side mechanism similar to commit e340c2d6ef
("xprtrdma: Reduce the doorbell rate (Receive)") to post Receive
WRs in batch. Its first consumer is svc_rdma_post_recvs(), which
posts the initial set of Receive WRs.
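Conceptually, a batch is built by chaining Receive WRs through their
next pointers and posting the whole chain with a single ib_post_recv()
call, so the doorbell is rung only once. A rough sketch (the loop and
the rdma, wanted, and ret variables are illustrative context, not the
actual svcrdma code):

    const struct ib_recv_wr *bad_wr;
    struct ib_recv_wr *wr_chain = NULL;
    struct svc_rdma_recv_ctxt *ctxt;
    unsigned int i;

    /* Chain 'wanted' Receive WRs together, newest at the head. */
    for (i = 0; i < wanted; i++) {
            ctxt = svc_rdma_recv_ctxt_get(rdma);
            if (!ctxt)
                    break;
            ctxt->rc_recv_wr.next = wr_chain;
            wr_chain = &ctxt->rc_recv_wr;
    }

    /* One doorbell for the entire batch. */
    if (wr_chain)
            ret = ib_post_recv(rdma->sc_qp, wr_chain, &bad_wr);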
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
xprt pinning was removed in commit 365e9992b9 ("svcrdma: Remove
transport reference counting"), but this comment was not updated
to reflect that change.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Clean up: explain why svc_xprt_enqueue() is invoked in the event
handler even though no xpt_flags bits are toggled here.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
mountd can now monitor clients appearing and disappearing in
/proc/fs/nfsd/clients, and will log these events, in lieu of the logging
of mount/unmount events for NFSv3.
Currently it cannot distinguish between unconfirmed clients (which might
be transient and totally uninteresting) and confirmed clients.
So add a "status: " line which reports either "confirmed" or
"unconfirmed", and use fsnotify to report that the info file
has been modified.
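For example, the info file for a confirmed client now carries a line
like the following alongside its existing fields (only the new line is
shown here):

    status: confirmed

while a client that has not yet confirmed its client ID reports
"status: unconfirmed".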
This requires a bit of infrastructure to keep the dentry for the "info"
file. There is no need to take a counted reference as the dentry must
remain around until the client is removed.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Note that size_t is 32 bits on a 32-bit architecture, but cp_count is
defined by the protocol to be 64 bits, so we could be turning a large
copy into a 0-length copy here.
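To make the failure mode concrete (on a 32-bit build; the value is only
an example):

    #include <linux/types.h>

    u64 cp_count = 0x100000000ULL;  /* client asks to copy 4GiB */
    size_t len = cp_count;          /* size_t is 32 bits here: len == 0 */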
Reported-by: <radchenkoy@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
From https://tools.ietf.org/html/rfc7862#page-65:
A count of 0 (zero) requests that all bytes from ca_src_offset
through EOF be copied to the destination.
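The fix therefore needs to translate a zero count into "source offset
through EOF", along these lines (a sketch of the semantics only; the
variable names are illustrative and not taken from the patch):

    /* ca_count == 0: copy everything from ca_src_offset to EOF. */
    if (count == 0)
            count = i_size_read(file_inode(src_file)) - src_offset;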
Reported-by: <radchenkoy@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
In order to ensure that knfsd threads don't linger once the nfsd
pseudofs is unmounted (e.g. when the container is killed) we let
nfsd_umount() shut down those threads and wait for them to exit.
This also should ensure that we don't need to do a kernel mount of
the pseudofs, since the thread lifetime is now limited by the
lifetime of the filesystem.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
`printk()`, by default, uses the warning log level, which leaves the
user reading
NFSD: Using UMH upcall client tracking operations.
wondering what to do about it (`dmesg --level=warn`).
Several client tracking methods are tried, and expected to fail. That’s
why a message is printed only on success. It might be interesting for
users to know the chosen method, so use info-level instead of
debug-level.
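For reference, emitting the message explicitly at the info level could
look like this (a sketch; not necessarily the exact call used in the
patch):

    pr_info("NFSD: Using UMH upcall client tracking operations.\n");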
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
We do this same logic repeatedly, and it's easy to get the sense of the
comparison wrong.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Enable watching the progress of directory encoding to capture the
timing of any issues with reading or encoding a directory. The
new tracepoint captures dirent encoding for all NFS versions.
For example, here's what a few NFSv4 directory entries might look
like:
nfsd-989 [002] 468.596265: nfsd_dirent: fh_hash=0x5d162594 ino=2 name=.
nfsd-989 [002] 468.596267: nfsd_dirent: fh_hash=0x5d162594 ino=1 name=..
nfsd-989 [002] 468.596299: nfsd_dirent: fh_hash=0x5d162594 ino=3827 name=zlib.c
nfsd-989 [002] 468.596325: nfsd_dirent: fh_hash=0x5d162594 ino=3811 name=xdiff
nfsd-989 [002] 468.596351: nfsd_dirent: fh_hash=0x5d162594 ino=3810 name=xdiff-interface.h
nfsd-989 [002] 468.596377: nfsd_dirent: fh_hash=0x5d162594 ino=3809 name=xdiff-interface.c
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>