linux

mirror of https://github.com/torvalds/linux.git synced 2024-10-31 17:21:49 +00:00

Author	SHA1	Message	Date
Zach Brown	0f4b1c7e89	rds: fix rds_send_xmit() serialization rds_send_xmit() was changed to hold an interrupt masking spinlock instead of a mutex so that it could be called from the IB receive tasklet path. This broke the TCP transport because its xmit method can block and masks and unmasks interrupts. This patch serializes callers to rds_send_xmit() with a simple bit instead of the current spinlock or previous mutex. This enables rds_send_xmit() to be called from any context and to call functions which block. Getting rid of the c_send_lock exposes the bare c_lock acquisitions which are changed to block interrupts. A waitqueue is added so that rds_conn_shutdown() can wait for callers to leave rds_send_xmit() before tearing down partial send state. This lets us get rid of c_senders. rds_send_xmit() is changed to check the conn state after acquiring the RDS_IN_XMIT bit to resolve races with the shutdown path. Previously both worked with the conn state and then the lock in the same order, allowing them to race and execute the paths concurrently. rds_send_reset() isn't racing with rds_send_xmit() now that rds_conn_shutdown() properly ensures that rds_send_xmit() can't start once the conn state has been changed. We can remove its previous use of the spinlock. Finally, c_send_generation is redundant. Callers can race to test the c_flags bit by simply retrying instead of racing to test the c_send_generation atomic. Signed-off-by: Zach Brown <zach.brown@oracle.com>	2010-09-08 18:15:27 -07:00
Zach Brown	89bf9d4158	RDS/IB: get the xmit max_sge from the RDS IB device on the connection rds_ib_xmit_rdma() was calling ib_get_client_data() to get at the rds_ibdevice just to get the max_sge for the transmit. This patch instead has it get it directly off the rds_ibdev which is stored on the connection. The current code won't free the rds_ibdev until all the IB connections that use it are freed. So it's safe to reference the rds_ibdev this way. In the future it also makes it easier to support proper reference counting of the rds_ibdev struct. As an additional bonus, this gets rid of the performance hit of calling in to the IB stack to look up the rds_ibdev. The current implementation in the IB stack acquires an interrupt blocking spinlock to protect the registration of client callback data. Signed-off-by: Zach Brown <zach.brown@oracle.com>	2010-09-08 18:15:16 -07:00
Chris Mason	1cc2228c59	rds: Fix reference counting on the for xmit_atomic and xmit_rdma This makes sure we have the proper number of references in rds_ib_xmit_atomic and rds_ib_xmit_rdma. We also consistently drop references the same way for all message types as the IOs end. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-09-08 18:15:13 -07:00
Chris Mason	c9e65383a2	rds: Fix RDMA message reference counting The RDS send_xmit code was trying to get fancy with message counting and was dropping the final reference on the RDMA messages too early. This resulted in memory corruption and oopsen. The fix here is to always add a ref as the parts of the message passes through rds_send_xmit, and always drop a ref as the parts of the message go through completion handling. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2010-09-08 18:15:10 -07:00
Andy Grover	51e2cba8b5	RDS: Move atomic stats from general to ib-specific area Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:12:20 -07:00
Andy Grover	ff3d7d3613	RDS: Perform unmapping ops in stages Previously, RDS would wait until the final send WR had completed and then handle cleanup. With silent ops, we do not know if an atomic, rdma, or data op will be last. This patch handles any of these cases by keeping a pointer to the last op in the message in m_last_op. When the TX completion event fires, rds dispatches to per-op-type cleanup functions, and then does whole-message cleanup, if the last op equalled m_last_op. This patch also moves towards having op-specific functions take the op struct, instead of the overall rm struct. rds_ib_connection has a pointer to keep track of a a partially- completed data send operation. This patch changes it from an rds_message pointer to the narrower rm_data_op pointer, and modifies places that use this pointer as needed. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:12:08 -07:00
Andy Grover	6c7cc6e469	RDS: Rename data op members prefix from m_ to op_ For consistency. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:59 -07:00
Andy Grover	f8b3aaf2ba	RDS: Remove struct rds_rdma_op A big changeset, but it's all pretty dumb. struct rds_rdma_op was already embedded in struct rm_rdma_op. Remove rds_rdma_op and put its members in rm_rdma_op. Rename members with "op_" prefix instead of "r_", for consistency. Of course this breaks a lot, so fixup the code accordingly. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:58 -07:00
Andy Grover	241eef3e2f	RDS: Implement silent atomics Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:55 -07:00
Andy Grover	c8de3f1005	RDS/IB: Make all flow control code conditional on i_flowctl Maybe things worked fine with the flow control code running even in the non-flow-control case, but making it explicitly conditional helps the non-fc case be easier to read. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:53 -07:00
Andy Grover	1d34f17571	RDS: Remove unsignaled_bytes sysctl Removed unsignaled_bytes sysctl and code to signal based on it. I believe unsignaled_wrs is more than sufficient for our purposes. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:52 -07:00
Andy Grover	da5a06cef5	RDS: rewrite rds_ib_xmit Now that the header always goes first, it is possible to simplify rds_ib_xmit. Instead of having a path to handle 0-byte dgrams and another path to handle >0, these can both be handled in one path. This lets us eliminate xmit_populate_wr(). Rename sent to bytes_sent, to differentiate better from other variable named "send". Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:51 -07:00
Andy Grover	919ced4ce7	RDS/IB: Remove ib_[header/data]_sge() functions These functions were to cope with differently ordered sg entries depending on RDS 3.0 or 3.1+. Now that we've dropped 3.0 compatibility we no longer need them. Also, modify usage sites for these to refer to sge[0] or [1] directly. Reorder code to initialize header sgs first. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:50 -07:00
Andy Grover	6f3d05db0d	RDS/IB: Remove dead code Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:49 -07:00
Andy Grover	9c030391e8	RDS/IB: eliminate duplicate code both atomics and rdmas need to convert ib-specific completion codes into RDS status codes. Rename rds_ib_rdma_send_complete to rds_ib_send_complete, and have it take a pointer to the function to call with the new error code. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:47 -07:00
Andy Grover	15133f6e67	RDS: Implement atomic operations Implement a CMSG-based interface to do FADD and CSWP ops. Alter send routines to handle atomic ops. Add atomic counters to stats. Add xmit_atomic() to struct rds_transport Inline rds_ib_send_unmap_rdma into unmap_rm Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:41 -07:00
Andy Grover	ff87e97a9d	RDS: make m_rdma_op a member of rds_message This eliminates a separate memory alloc, although it is now necessary to add an "r_active" flag, since it is no longer to use the m_rdma_op pointer as an indicator of if an rdma op is present. rdma SGs allocated from rm sg pool. rds_rm_size also gets bigger. It's a little inefficient to run through CMSGs twice, but it makes later steps a lot smoother. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:38 -07:00
Andy Grover	21f79afa5f	RDS: fold rdma.h into rds.h RDMA is now an intrinsic part of RDS, so it's easier to just have a single header. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:37 -07:00
Andy Grover	e779137aa7	RDS: break out rdma and data ops into nested structs in rds_message Clearly separate rdma-related variables in rm from data-related ones. This is in anticipation of adding atomic support. Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:33 -07:00
Andy Grover	8690bfa17a	RDS: cleanup: remove "== NULL"s and "!= NULL"s in ptr comparisons Favor "if (foo)" style over "if (foo != NULL)". Signed-off-by: Andy Grover <andy.grover@oracle.com>	2010-09-08 18:11:32 -07:00
Sherman Pun	450d06c020	RDS: Properly unmap when getting a remote access error If the RDMA op has aborted with a remote access error, in addition to what we already do (tell userspace it has completed with an error) also unmap it and put() the rm. Otherwise, hangs may occur on arches that track maps and will not exit without proper cleanup. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:17:00 -07:00
Andy Grover	2e7b3b9945	RDS: Fix congestion issues for loopback We have two kinds of loopback: software (via loop transport) and hardware (via IB). sw is used for 127.0.0.1, and doesn't support rdma ops. hw is used for sends to local device IPs, and supports rdma. Both are used in different cases. For both of these, when there is a congestion map update, we want to call rds_cong_map_updated() but not actually send anything -- since loopback local and foreign congestion maps point to the same spot, they're already in sync. The old code never called sw loop's xmit_cong_map(),so rds_cong_map_updated() wasn't being called for it. sw loop ports would not work right with the congestion monitor. Fixing that meant that hw loopback now would send congestion maps to itself. This is also undesirable (racy), so we check for this case in the ib-specific xmit code. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:55 -07:00
Andy Grover	735f61e626	RDS: Do not BUG() on error returned from ib_post_send BUGging on a runtime error code should be avoided. This patch also eliminates all other BUG()s that have no real reason to exist. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-03-16 21:16:53 -07:00
Joe Perches	f64f9e7192	net: Move && and \|\| to end of previous line Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-11-29 16:55:45 -08:00
Steve Wise	7b70d0336d	RDS/IW+IB: Allow max credit advertise window. Fix hack that restricts the credit advertisement to 127. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-09 17:21:17 -07:00
Steve Wise	d39e0602bb	RDS/IW+IB: Set the RDS_LL_SEND_FULL bit when we're throttled. The RDS_LL_SEND_FULL bit should be set when we stop transmitted due to flow control. Otherwise the send worker will keep trying as opposed to sleeping until we unthrottle. Saves CPU. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-09 17:21:16 -07:00
Andy Grover	6a0979df32	RDS/IB: Implement IB-specific datagram send. Specific to IB is a credits-based flow control mechanism, in addition to the expected usage of the IB API to package outgoing data into work requests. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-02-26 23:39:31 -08:00

27 Commits