linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-28 22:02:28 +00:00

Author	SHA1	Message	Date
Alex Elder	6e2a4505db	rbd: don't zero-fill non-image object requests A result of ENOENT from a read request for an object that's part of an rbd image indicates that there is a hole in that portion of the image. Similarly, a short read for such an object indicates that the remainder of the read should be interpreted a full read with zeros filling out the end of the request. This behavior is not correct for objects that are not backing rbd image data. Currently rbd_img_obj_request_callback() assumes it should be done for all objects. Change rbd_img_obj_request_callback() so it only does this zeroing for image objects. Encapsulate that special handling in its own function. Add an assertion that the image object request is a bio request, since we assume that (and we currently don't support any other types). This resolves a problem identified here: http://tracker.ceph.com/issues/4559 The regression was introduced by `bf0d5f503d`. Reported-by: Dan van der Ster <dan@vanderster.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-off-by: Sage Weil <sage@inktank.com>	2013-03-29 11:32:07 -07:00
Sage Weil	1b83bef24c	libceph: update osd request/reply encoding Use the new version of the encoding for osd requests and replies. In the process, update the way we are tracking request ops and reply lengths and results in the struct ceph_osd_request. Update the rbd and fs/ceph users appropriately. The main changes are: - we keep pointers into the request memory for fields we need to update each time the request is sent out over the wire - we keep information about the result in an array in the request struct where the users can easily get at it. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com>	2013-02-26 15:02:50 -08:00
Alex Elder	c47f937154	rbd: pass length, not op for osd completions The only thing type-specific osd completion functions do with their osd op parameter is (in some cases) extract the number of bytes transferred from it. In the other cases, the xferred bytes field is not used, and total message data transfer byte count (which may well be zero) is used. Just set the object request transfer count in the main osd request callback function and provide that to the other routines. There is then no longer any need to pass the op pointer to the type-specific completion routines, so drop those parameters. Stop doing anything with the total message data length. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-02-26 15:00:06 -08:00
Alex Elder	39bf2c5d09	rbd: move rbd_osd_trivial_callback() This function is slightly out of place, probably the result of an errant automatic merge or something. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>	2013-02-26 14:59:49 -08:00
Alex Elder	cc344fa1b5	rbd: eliminate sparse warnings Fengguang Wu reminded me that there were outstanding sparse reports in the ceph and rbd code. This patch fixes these problems in rbd that lead to those reports: - Convert functions that are never referenced externally to have static scope. - Add a lockdep annotation to rbd_request_fn(), because it releases a lock before acquiring it again. This partially resolves: http://tracker.ceph.com/issues/4184 Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-25 15:37:08 -06:00
Alex Elder	37206ee5be	rbd: normalize dout() calls Add dout() calls to facilitate tracing of image and object requests. Change a few existing calls so they use __func__ rather than the hard-coded function name. Have calls always add ":" after the name of the function, and prefix pointer values with a consistent tag indicating what it represents. (Note that there remain some older dout() calls that are left untouched by this patch.) Issue a warning if rbd_osd_write_callback() ever gets a short write. This resolves: http://tracker.ceph.com/issues/4235 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-25 15:36:56 -06:00
Alex Elder	632b88cade	rbd: barriers are hard Let's go shopping! I'm afraid this may not have gotten it right: `07741308` rbd: add barriers near done flag operations The smp_wmb() should have been done before setting the done flag, to ensure all other data was valid before marking the object request done. Switch to use atomic_inc_return() here to set the done flag, which allows us to verify we don't mark something done more than once. Doing this also implies general barriers before and after the call. And although a read memory barrier might have been sufficient before reading the done flag, convert this to a full memory barrier just to put this issue to bed. This resolves: http://tracker.ceph.com/issues/4238 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-25 15:36:50 -06:00
Alex Elder	4dda41d3d7	rbd: ignore zero-length requests The old request code simply ignored zero-length requests. We should still operate that same way to avoid any changes in behavior. We can implement handling for special zero-length requests separately (see http://tracker.ceph.com/issues/4236). Add some assertions based on this new constraint. This resolves: http://tracker.ceph.com/issues/4237 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-25 15:36:36 -06:00
Alex Elder	903bb32e89	libceph: drop return value from page vector copy routines The return values provided for ceph_copy_to_page_vector() and ceph_copy_from_page_vector() serve no purpose, so get rid of them. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-19 19:14:05 -06:00
Alex Elder	23ed6e13b3	rbd: ignore result of ceph_copy_from_page_vector() The result of ceph_copy_from_page_vector() is simply the length argument it is provided. This is called by rbd_obj_method_sync(), which returns the result if it's non-negative. But we always either ignore or overwrite that return value. So explicitly ignore what's returned by the copy function, and have rbd_obj_method_sync() always return either a negative errno or 0. We also return the result of ceph_copy_from_page_vector() in rbd_obj_read_sync(). There we still want to return the number of bytes transferred, but we can use the value we already have in hand rather than what ceph_copy_from_page_vector() provides. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-19 19:14:05 -06:00
Alex Elder	1ceae7ef0f	rbd: prevent bytes transferred overflow In rbd_obj_read_sync(), verify the number of bytes transferred won't exceed what can be represented by a size_t before using it to indicate the number of bytes to copy to the result buffer. (The real motivation for this is to prepare for the next patch.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-19 19:14:04 -06:00
Alex Elder	fbfab53966	libceph: allow STAT osd operations Add support for CEPH_OSD_OP_STAT operations in the osd client and in rbd. This operation sends no data to the osd; everything required is encoded in identity of the target object. The result will be ENOENT if the object doesn't exist. If it does exist and no other error occurs the server returns the size and last modification time of the target object as output data (in little endian format). The size is a 64 bit unsigned and the time is ceph_timespec structure (two unsigned 32-bit integers, representing a seconds and nanoseconds value). This resolves: http://tracker.ceph.com/issues/4007 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-19 19:14:03 -06:00
Alex Elder	ef06f4d32a	rbd: add parentheses to object request iterator macros The for_each_obj_request*() macros should parenthesize their uses of the ireq parameter. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-19 19:14:02 -06:00
Alex Elder	3c663bbdcd	libceph: kill ceph_osdc_create_event() "one_shot" parameter There is only one caller of ceph_osdc_create_event(), and it provides 0 as its "one_shot" argument. Get rid of that argument and just use 0 in its place. Replace the code in handle_watch_notify() that executes if one_shot is nonzero in the event with a BUG_ON() call. While modifying "osd_client.c", give handle_watch_notify() static scope. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-18 12:20:00 -06:00
Alex Elder	077413082f	rbd: add barriers near done flag operations Somehow, I missed this little item in Documentation/atomic_ops.txt: * WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! * Create and use some helper functions that include the proper memory barriers for manipulating the done field. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:11 -08:00
Alex Elder	a14ea269dd	rbd: turn off interrupts for open/remove locking This commit: bc7a62ee5 rbd: prevent open for image being removed added checking for removing rbd before allowing an open, and used the same request spinlock for protecting that and updating the open count as is used for the request queue. However it used the non-irq protected version of the spinlocks. Fix that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:11 -08:00
Alex Elder	9cbb1d7268	libceph: don't require r_num_pages for bio requests There is a check in the completion path for osd requests that ensures the number of pages allocated is enough to hold the amount of incoming data expected. For bio requests coming from rbd the "number of pages" is not really meaningful (although total length would be). So stop requiring that nr_pages be supplied for bio requests. This is done by checking whether the pages pointer is null before checking the value of nr_pages. Note that this value is passed on to the messenger, but there it's only used for debugging--it's never used for validation. While here, change another spot that used r_pages in a debug message inappropriately, and also invalidate the r_con_filling_msg pointer after dropping a reference to it. This resolves: http://tracker.ceph.com/issues/3875 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:11 -08:00
Alex Elder	1e32d34cfa	rbd: don't take extra bio reference for osd client Currently, if the OSD client finds an osd request has had a bio list attached to it, it drops a reference to it (or rather, to the first entry on that list) when the request is released. The code that added that reference (i.e., the rbd client) is therefore required to take an extra reference to that first bio structure. The osd client doesn't really do anything with the bio pointer other than transfer it from the osd request structure to outgoing (for writes) and ingoing (for reads) messages. So it really isn't the right place to be taking or dropping references. Furthermore, the rbd client already holds references to all bio structures it passes to the osd client, and holds them until the request is completed. So there's no need for this extra reference whatsoever. So remove the bio_put() call in ceph_osdc_release_request(), as well as its matching bio_get() call in rbd_osd_req_create(). This change could lead to a crash if old libceph.ko was used with new rbd.ko. Add a compatibility check at rbd initialization time to avoid this possibilty. This resolves: http://tracker.ceph.com/issues/3798 and http://tracker.ceph.com/issues/3799 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:11 -08:00
Alex Elder	b82d167be6	rbd: prevent open for image being removed An open request for a mapped rbd image can arrive while removal of that mapping is underway. We need to prevent such an open request from succeeding. (It appears that Maciej Galkiewicz ran into this problem.) Define and use a "removing" flag to indicate a mapping is getting removed. Set it in the remove path after verifying nothing holds the device open. And check it in the open path before allowing the open to proceed. Acquire the rbd device's lock around each of these spots to avoid any races accessing the flags and open_count fields. This addresses: http://tracker.newdream.net/issues/3427 Reported-by: Maciej Galkiewicz <maciejgalkiewicz@ragnarson.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:10 -08:00
Alex Elder	6d292906f8	rbd: define flags field, use it for exists flag Define a new rbd device flags field, manipulated using bit operations. Replace the use of the current "exists" flag with a bit in this new "flags" field. Add a little commentary about the "exists" flag, which does not need to be manipulated atomically. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:10 -08:00
Alex Elder	8eb8756530	rbd: don't drop watch requests on completion When we register an osd request to linger, it means that request will stay around (under control of the osd client) until we've unregistered it. We do that for an rbd image's header object, and we keep a pointer to the object request associated with it. Keep a reference to the watch object request for as long as it is registered to linger. Drop it again after we've removed the linger registration. This resolves: http://tracker.ceph.com/issues/3937 (Note: this originally came about because the osd client was issuing a callback more than once. But that behavior will be changing soon, documented in tracker issue 3967.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:10 -08:00
Alex Elder	25dcf954c3	rbd: decrement obj request count when deleting Decrement the obj_request_count value when deleting an object request from its image request's list. Rearrange a few lines in the surrounding code. This resolves: http://tracker.ceph.com/issues/3940 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:10 -08:00
Alex Elder	975241afcb	rbd: track object rather than osd request for watch Switch to keeping track of the object request pointer rather than the osd request used to watch the rbd image header object. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:10 -08:00
Alex Elder	6977c3f983	rbd: unregister linger in watch sync routine Move the code that unregisters an rbd device's lingering header object watch request into rbd_dev_header_watch_sync(), so it occurs in the same function that originally sets up that request. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:09 -08:00
Alex Elder	9f20e02a53	rbd: get rid of rbd_req_sync_exec() Get rid rbd_req_sync_exec() because it is no longer used. That eliminates the last use of rbd_req_sync_op(), so get rid of that too. And finally, that leaves rbd_do_request() unreferenced, so get rid of that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:09 -08:00
Alex Elder	36be9a7618	rbd: implement sync method with new code Reimplement synchronous object method calls using the new request tracking code. Use the name rbd_obj_method_sync() Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:09 -08:00
Alex Elder	cf81b60e4b	rbd: send notify ack asynchronously When we receive notification of a change to an rbd image's header object we need to refresh our information about the image (its size and snapshot context). Once we have refreshed our rbd image we need to acknowledge the notification. This acknowledgement was previously done synchronously, but there's really no need to wait for it to complete. Change it so the caller doesn't wait for the notify acknowledgement request to complete. And change the name to reflect it's no longer synchronous. This resolves: http://tracker.newdream.net/issues/3877 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:09 -08:00
Alex Elder	5ae9db81b4	rbd: get rid of rbd_req_sync_notify_ack() Get rid rbd_req_sync_notify_ack() because it is no longer used. As a result rbd_simple_req_cb() becomes unreferenced, so get rid of that too. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:09 -08:00
Alex Elder	b8d70035b3	rbd: use new code for notify ack Use the new object request tracking mechanism for handling a notify_ack request. Move the callback function below the definition of this so we don't have to do a pre-declaration. This resolves: http://tracker.newdream.net/issues/3754 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:08 -08:00
Alex Elder	ecf7a0318b	rbd: get rid of rbd_req_sync_watch() Get rid of rbd_req_sync_watch(), because it is no longer used. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:08 -08:00
Alex Elder	9969ebc5af	rbd: implement watch/unwatch with new code Implement a new function to set up or tear down a watch event for an mapped rbd image header using the new request code. Create a new object request type "nodata" to handle this. And define rbd_osd_trivial_callback() which simply marks a request done. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:08 -08:00
Alex Elder	86ea43bfcb	rbd: get rid of rbd_req_sync_read() Delete rbd_req_sync_read() is no longer used, so get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:08 -08:00
Alex Elder	788e2df3b9	rbd: implement sync object read with new code Reimplement the synchronous read operation used for reading a version 1 header using the new request tracking code. Name the resulting function rbd_obj_read_sync() to better reflect that it's a full object operation, not an object request. To do this, implement a new OBJ_REQUEST_PAGES object request type. This implements a new mechanism to allow the caller to wait for completion for an rbd_obj_request by calling rbd_obj_request_wait(). This partially resolves: http://tracker.newdream.net/issues/3755 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:08 -08:00
Alex Elder	7d250b949a	rbd: kill rbd_req_coll and rbd_request The two remaining callers of rbd_do_request() always pass a null collection pointer, so the "coll" and "coll_index" parameters are not needed. There is no other use of that data structure, so it can be eliminated. Deleting them means there is no need to allocate a rbd_request structure for the callback function. And since that's the only use of that structure, it too can be eliminated. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:07 -08:00
Alex Elder	2250a71b59	rbd: kill rbd_rq_fn() and all other related code Now that the request function has been replaced by one using the new request management data structures the old one can go away. Deleting it makes rbd_dev_do_request() no longer needed, and deleting that makes other functions unneeded, and so on. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:07 -08:00
Alex Elder	bf0d5f503d	rbd: new request tracking code This patch fully implements the new request tracking code for rbd I/O requests. Each I/O request to an rbd image will get an rbd_image_request structure allocated to track it. This provides access to all information about the original request, as well as access to the set of one or more object requests that are initiated as a result of the image request. An rbd_obj_request structure defines a request sent to a single osd object (possibly) as part of an rbd image request. An rbd object request refers to a ceph_osd_request structure built up to represent the request; for now it will contain a single osd operation. It also provides space to hold the result status and the version of the object when the osd request completes. An rbd_obj_request structure can also stand on its own. This will be used for reading the version 1 header object, for issuing acknowledgements to event notifications, and for making object method calls. All rbd object requests now complete asynchronously with respect to the osd client--they supply a common callback routine. This resolves: http://tracker.newdream.net/issues/3741 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-02-13 18:29:07 -08:00
Alex Elder	c04306471a	rbd: don't retry setting up header watch When an rbd image is initially mapped a watch event is registered so we can do something if the header object changes. The code that does this currently loops if initiating the watch request results in an ERANGE error. The osds will never return ERANGE, so there's no reason to do this loop, so get rid of it. This resolves: http://tracker.newdream.net/issues/3860 Note that the problem this loop was intended to solve is a race between collecting image header information and setting up the watch on the header object. The real fix for that problem is described here: http://tracker.newdream.net/issues/3871 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-25 17:33:37 -06:00
Alex Elder	38901e0f24	rbd: check for overflow in rbd_get_num_segments() The return type of rbd_get_num_segments() is int, but the values it operates on are u64. Although it's not likely, there's no guarantee the result won't exceed what can be respresented in an int. The function is already designed to return -ERANGE on error, so just add this possible overflow as another reason to return that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-25 17:33:14 -06:00
Alex Elder	98571b5aa7	rbd: small changes A few very minor changes to the rbd code: - RBD_MAX_OPT_LEN is unused, so get rid of it - Consolidate rbd options definitions - Make rbd_segment_name() return pointer to const char Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-25 17:32:46 -06:00
Alex Elder	e0b49868d3	rbd: fix type of snap_id in rbd_dev_v2_snap_info() The type of the snap_id local variable is defined with the wrong byte order. Fix that. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:35:55 -06:00
Alex Elder	8b84de7940	rbd: assign watch request more directly Both rbd_req_sync_op() and rbd_do_request() have a "linger" parameter, which is the address of a pointer that should refer to the osd request structure used to issue a request to an osd. Only one case ever supplies a non-null "linger" argument: an CEPH_OSD_OP_WATCH start. And in that one case it is assigned &rbd_dev->watch_request. Within rbd_do_request() (where the assignment ultimately gets made) we know the rbd_dev and therefore its watch_request field. We also know whether the op being sent is CEPH_OSD_OP_WATCH start. Stop opaquely passing down the "linger" pointer, and instead just assign the value directly inside rbd_do_request() when it's needed. This makes it unnecessary for rbd_req_sync_watch() to make arrangements to hold a value that's not available until a bit later. This more clearly separates setting up a watch request from submitting it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:59 -06:00
Alex Elder	5efea49a98	rbd: move remaining osd op setup into rbd_osd_req_op_create() The two remaining osd ops used by rbd are CEPH_OSD_OP_WATCH and CEPH_OSD_OP_NOTIFY_ACK. Move the setup of those operations into rbd_osd_req_op_create(), and get rid of rbd_create_rw_op() and rbd_destroy_op(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:59 -06:00
Alex Elder	2647ba3810	rbd: move call osd op setup into rbd_osd_req_op_create() Move the initialization of the CEPH_OSD_OP_CALL operation into rbd_osd_req_op_create(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:59 -06:00
Alex Elder	8d23bf2909	rbd: don't assign extent info in rbd_req_sync_op() Move the assignment of the extent offset and length and payload length out of rbd_req_sync_op() and into its caller in the one spot where a read (and note--no write) operation might be initiated. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:58 -06:00
Alex Elder	c561191813	rbd: don't assign extent info in rbd_do_request() In rbd_do_request() there's a sort of last-minute assignment of the extent offset and length and payload length for read and write operations. Move those assignments into the caller (in those spots that might initiate read or write operations) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:58 -06:00
Alex Elder	1821665749	rbd: don't leak rbd_req for rbd_req_sync_notify_ack() When rbd_req_sync_notify_ack() calls rbd_do_request() it supplies rbd_simple_req_cb() as its callback function. Because the callback is supplied, an rbd_req structure gets allocated and populated so it can be used by the callback. However rbd_simple_req_cb() is not freeing (or even using) the rbd_req structure, so it's getting leaked. Since rbd_simple_req_cb() has no need for the rbd_req structure, just avoid allocating one for this case. Of the three calls to rbd_do_request(), only the one from rbd_do_op() needs the rbd_req structure, and that call can be distinguished from the other two because it supplies a non-null rbd_collection pointer. So fix this leak by only allocating the rbd_req structure if a non-null "coll" value is provided to rbd_do_request(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:58 -06:00
Alex Elder	2e53c6c379	rbd: don't leak rbd_req on synchronous requests When rbd_do_request() is called it allocates and populates an rbd_req structure to hold information about the osd request to be sent. This is done for the benefit of the callback function (in particular, rbd_req_cb()), which uses this in processing when the request completes. Synchronous requests provide no callback function, in which case rbd_do_request() waits for the request to complete before returning. This case is not handling the needed free of the rbd_req structure like it should, so it is getting leaked. Note however that the synchronous case has no need for the rbd_req structure at all. So rather than simply freeing this structure for synchronous requests, just don't allocate it to begin with. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:58 -06:00
Alex Elder	907703d050	rbd: combine rbd sync watch/unwatch functions The rbd_req_sync_watch() and rbd_req_sync_unwatch() functions are nearly identical. Combine them into a single function with a flag indicating whether a watch is to be initiated or torn down. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:58 -06:00
Alex Elder	0903e875ca	rbd: use a common layout for each device Each osd message includes a layout structure, and for rbd it is always the same (at least for osd's in a given pool). Initialize a layout structure when an rbd_dev gets created and just copy that into osd requests for the rbd image. Replace an assertion that was done when initializing the layout structures with code that catches and handles anything that would trigger the assertion as soon as it is identified. This precludes that (bad) condition from ever occurring. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:58 -06:00
Alex Elder	47dba7ba26	rbd: don't bother calculating file mapping When rbd_do_request() has a request to process it initializes a ceph file layout structure and uses it to compute offsets and limits for the range of the request using ceph_calc_file_object_mapping(). The layout used is fixed, and is based on RBD_MAX_OBJ_ORDER (30). It sets the layout's object size and stripe unit to be 1 GB (2^30), and sets the stripe count to be 1. The job of ceph_calc_file_object_mapping() is to determine which of a sequence of objects will contain data covered by range, and within that object, at what offset the range starts. It also truncates the length of the range at the end of the selected object if necessary. This is needed for ceph fs, but for rbd it really serves no purpose. It does its own blocking of images into objects, echo of which is (1 << obj_order) in size, and as a result it ignores the "bno" value returned by ceph_calc_file_object_mapping(). In addition, by the point a request has reached this function, it is already destined for a single rbd object, and its length will not exceed that object's extent. Because of this, and because the mapping will result in blocking up the range using an integer multiple of the image's object order, ceph_calc_file_object_mapping() will never change the offset or length values defined by the request. In other words, this call is a big no-op for rbd data requests. There is one exception. We read the header object using this function, and in that case we will not have already limited the request size. However, the header is a single object (not a file or rbd image), and should not be broken into pieces anyway. So in fact we should not be calling ceph_calc_file_object_mapping() when operating on the header object. So... Don't call ceph_calc_file_object_mapping() in rbd_do_request(), because useless for image data and incorrect to do sofor the image header. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-01-17 16:34:58 -06:00

1 2 3 4 5 ...

293 Commits