linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-09 20:51:43 +00:00

Author	SHA1	Message	Date
Trond Myklebust	943cff67b8	NFSv4.1: Fix the r/wsize checking The intention of nfs4_session_set_rwsize() was to cap the r/wsize to the buffer sizes negotiated by the CREATE_SESSION. The initial code had a bug whereby we would not check the values negotiated by nfs_probe_fsinfo() (the assumption being that CREATE_SESSION will always negotiate buffer values that are sane w.r.t. the server's preferred r/wsizes) but would only check values set by the user in the 'mount' command. The code was changed in 4.11 to _always_ set the r/wsize, meaning that we now never use the server preferred r/wsizes. This is the regression that this patch fixes. Also rename the function to nfs4_session_limit_rwsize() in order to avoid future confusion. Fixes: `033853325f` ("NFSv4.1 respect server's max size in CREATE_SESSION") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:17 -04:00
Trond Myklebust	ace9fad43a	NFSv4: Convert struct nfs4_state to use refcount_t Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:17 -04:00
Trond Myklebust	9ae075fdd1	NFSv4: Convert open state lookup to use RCU Further reduce contention on the inode->i_lock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:17 -04:00
Trond Myklebust	0de43976fb	NFS: Convert lookups of the open context to RCU Reduce contention on the inode->i_lock by ensuring that we use RCU when looking up the NFS open context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:17 -04:00
Trond Myklebust	6ba0c4e5bb	NFS: Simplify internal check for whether file is open for write Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:17 -04:00
Trond Myklebust	1db97eaa0b	NFS: Convert lookups of the lock context to RCU Speed up lookups of an existing lock context by avoiding the inode->i_lock, and using RCU instead. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	28ced9a84c	pNFS: Don't allocate more pages than we need to fit a layoutget response For the 'files' and 'flexfiles' layout types, we do not expect the reply to be any larger than 4k. The block and scsi layout types are a little more greedy, so we keep allocating the maximum response size for now. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	a2791d3a2c	pNFS: Don't zero out the array in nfs4_alloc_pages() We don't need a zeroed out array, since it is immediately being filled. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	ec846469ba	SUNRPC: Unexport xdr_partial_copy_from_skb() It is no longer used outside of net/sunrpc/socklib.c Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	4f54614975	SUNRPC: Clean up xs_udp_data_receive() Simplify the retry logic. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	550aebfe1c	SUNRPC: Allow AF_LOCAL sockets to use the generic stream receive Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	c50b8ee02f	SUNRPC: Clean up - rename xs_tcp_data_receive() to xs_stream_data_receive() In preparation for sharing with AF_LOCAL. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	277e4ab7d5	SUNRPC: Simplify TCP receive code by switching to using iterators Most of this code should also be reusable with other socket types. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	9d96acbc7f	SUNRPC: Add a bvec array to struct xdr_buf for use with iovec_iter() Add a bvec array to struct xdr_buf, and have the client allocate it when we need to receive data into pages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	431f6eb357	SUNRPC: Add a label for RPC calls that require allocation on receive If the RPC call relies on the receive call allocating pages as buffers, then let's label it so that we a) Don't leak memory by allocating pages for requests that do not expect this behaviour b) Can optimise for the common case where calls do not require allocation. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	79c99152a3	SUNRPC: Convert the xprt->sending queue back to an ordinary wait queue We no longer need priority semantics on the xprt->sending queue, because the order in which tasks are sent is now dictated by their position in the send queue. Note that the backlog queue remains a priority queue, meaning that slot resources are still managed in order of task priority. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	f42f7c2830	SUNRPC: Fix priority queue fairness Fix up the priority queue to not batch by owner, but by queue, so that we allow '1 << priority' elements to be dequeued before switching to the next priority queue. The owner field is still used to wake up requests in round robin order by owner to avoid single processes hogging the RPC layer by loading the queues. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	95f7691daa	SUNRPC: Convert xprt receive queue to use an rbtree If the server is slow, we can find ourselves with quite a lot of entries on the receive queue. Converting the search from an O(n) to O(log(n)) can make a significant difference, particularly since we have to hold a number of locks while searching. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	bd79bc579c	SUNRPC: Don't take transport->lock unnecessarily when taking XPRT_LOCK Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	adfa71446d	SUNRPC: Cleanup: remove the unused 'task' argument from the request_send() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:16 -04:00
Trond Myklebust	c544577dad	SUNRPC: Clean up transport write space handling Treat socket write space handling in the same way we now treat transport congestion: by denying the XPRT_LOCK until the transport signals that it has free buffer space. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	36bd7de949	SUNRPC: Turn off throttling of RPC slots for TCP sockets The theory was that we would need to grab the socket lock anyway, so we might as well use it to gate the allocation of RPC slots for a TCP socket. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	f05d54ecf6	SUNRPC: Allow soft RPC calls to time out when waiting for the XPRT_LOCK This no longer causes them to lose their place in the transmission queue. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	89f90fe1ad	SUNRPC: Allow calls to xprt_transmit() to drain the entire transmit queue Rather than forcing each and every RPC task to grab the socket write lock in order to send itself, we allow whichever task is holding the write lock to attempt to drain the entire transmit queue. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	86aeee0eb6	SUNRPC: Enqueue swapper tagged RPCs at the head of the transmit queue Avoid memory starvation by giving RPCs that are tagged with the RPC_TASK_SWAPPER flag the highest priority. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	75891f502f	SUNRPC: Support for congestion control when queuing is enabled Both RDMA and UDP transports require the request to get a "congestion control" credit before they can be transmitted. Right now, this is done when the request locks the socket. We'd like it to happen when a request attempts to be transmitted for the first time. In order to support retransmission of requests that already hold such credits, we also want to ensure that they get queued first, so that we don't deadlock with requests that have yet to obtain a credit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	918f3c1fe8	SUNRPC: Improve latency for interactive tasks One of the intentions with the priority queues was to ensure that no single process can hog the transport. The field task->tk_owner therefore identifies the RPC call's origin, and is intended to allow the RPC layer to organise queues for fairness. This commit therefore modifies the transmit queue to group requests by task->tk_owner, and ensures that we round robin among those groups. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	dcbbeda836	SUNRPC: Move RPC retransmission stat counter to xprt_transmit() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	5f2f6bd987	SUNRPC: Simplify xprt_prepare_transmit() Remove the checks for whether or not we need to transmit, and whether or not a reply has been received. Those are already handled in call_transmit() itself. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	04b3b88fbf	SUNRPC: Don't reset the request 'bytes_sent' counter when releasing XPRT_LOCK If the request is still on the queue, this will be incorrect behaviour. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	50f484e298	SUNRPC: Treat the task and request as separate in the xprt_ops->send_request() When we shift to using the transmit queue, then the task that holds the write lock will not necessarily be the same as the one being transmitted. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	902c58872e	SUNRPC: Fix up the back channel transmit Fix up the back channel code to recognise that it has already been transmitted, so does not need to be called again. Also ensure that we set req->rq_task. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	762e4e67b3	SUNRPC: Refactor RPC call encoding Move the call encoding so that it occurs before the transport connection etc. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	944b042921	SUNRPC: Add a transmission queue for RPC requests Add the queue that will enforce the ordering of RPC task transmission. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:15 -04:00
Trond Myklebust	ef3f54347f	SUNRPC: Distinguish between the slot allocation list and receive queue When storing a struct rpc_rqst on the slot allocation list, we currently use the same field 'rq_list' as we use to store the request on the receive queue. Since the structure is never on both lists at the same time, this is OK. However, for clarity, let's make that a union with different names for the different lists so that we can more easily distinguish between the two states. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	78b576ced2	SUNRPC: Minor cleanup for call_transmit() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	7f3a1d1e18	SUNRPC: Refactor xprt_transmit() to remove wait for reply code Allow the caller in clnt.c to call into the code to wait for a reply after calling xprt_transmit(). Again, the reason is that the backchannel code does not need this functionality. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	edc81dcd5b	SUNRPC: Refactor xprt_transmit() to remove the reply queue code Separate out the action of adding a request to the reply queue so that the backchannel code can simply skip calling it altogether. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	75c84151a9	SUNRPC: Rename xprt->recv_lock to xprt->queue_lock We will use the same lock to protect both the transmit and receive queues. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	ec37a58fba	SUNRPC: Don't wake queued RPC calls multiple times in xprt_transmit Rather than waking up the entire queue of RPC messages a second time, just wake up the task that was put to sleep. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	5ce970393b	SUNRPC: Test whether the task is queued before grabbing the queue spinlocks When asked to wake up an RPC task, it makes sense to test whether or not the task is still queued. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	359c48c04a	SUNRPC: Add a helper to wake up a sleeping rpc_task and set its status Add a helper that will wake up a task that is sleeping on a specific queue, and will set the value of task->tk_status. This is mainly intended for use by the transport layer to notify the task of an error condition. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	cf9946cd61	SUNRPC: Refactor the transport request pinning We are going to need to pin for both send and receive. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	4cd34e7c2e	SUNRPC: Simplify dealing with aborted partially transmitted messages If the previous message was only partially transmitted, we need to close the socket in order to avoid corruption of the message stream. To do so, we currently hijack the unlocking of the socket in order to schedule the close. Now that we track the message offset in the socket state, we can move that kind of checking out of the socket lock code, which is needed to allow messages to remain queued after dropping the socket lock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	6c7a64e5a4	SUNRPC: Add socket transmit queue offset tracking Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	e1806c7bfb	SUNRPC: Move reset of TCP state variables into the reconnect code Rather than resetting state variables in socket state_change() callback, do it in the sunrpc TCP connect function itself. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	d1109aa56c	SUNRPC: Rename TCP receive-specific state variables Since we will want to introduce similar TCP state variables for the transmission of requests, let's rename the existing ones to label that they are for the receive side. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	3a03818fbe	SUNRPC: Avoid holding locks across the XDR encoding of the RPC message Currently, we grab the socket bit lock before we allow the message to be XDR encoded. That significantly slows down the transmission rate, since we serialise on a potentially blocking operation. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	7ebbbc6e7b	SUNRPC: Simplify identification of when the message send/receive is complete Add states to indicate that the message send and receive are not yet complete. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:14 -04:00
Trond Myklebust	3021a5bbbf	SUNRPC: The transmitted message must lie in the RPCSEC window of validity If a message has been encoded using RPCSEC_GSS, the server is maintaining a window of sequence numbers that it considers valid. The client should normally be tracking that window, and needs to verify that the sequence number used by the message being transmitted still lies inside the window of validity. So far, we've been able to assume this condition would be realised automatically, since the client has been encoding the message only after taking the socket lock. Once we change that condition, we will need the explicit check. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2018-09-30 15:35:13 -04:00
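
The window-of-validity check described in the last entry above (3021a5bbbf) can be illustrated with a small user-space sketch. This is not the kernel's code: the function names `seq_is_newer()` and `seq_in_window()` are hypothetical, and the sketch assumes that a message is acceptable when its sequence number is strictly newer than the latest sequence number the client has issued minus the server's window size, using a wrap-tolerant 32-bit comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true if sequence number 'a' is newer
 * than 'b'. The subtraction is done in unsigned arithmetic and the
 * result is read as signed, so the comparison tolerates wrap-around.
 */
static bool seq_is_newer(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/*
 * Hypothetical check: a message that was encoded with 'msg_seq' may
 * still be transmitted if it lies inside the window, i.e. it is newer
 * than (latest_seq - window). A window of 0 is treated here as
 * "window unknown" and the check is skipped (an assumption of this
 * sketch, not something stated in the commit message).
 */
static bool seq_in_window(uint32_t msg_seq, uint32_t latest_seq,
			  uint32_t window)
{
	if (window == 0)
		return true;
	return seq_is_newer(msg_seq, latest_seq - window);
}
```

With a window of 128 and a latest sequence number of 1000, a message encoded with sequence number 900 would pass the check, while one encoded with 872 or lower would fall outside the window and would have to be re-encoded with a fresh sequence number before being sent.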

783076 Commits