Commit Graph

3678 Commits

Author SHA1 Message Date
Alex Elder
e93f315235 rbd: define parent image request routines
Define rbd_parent_request_create() and rbd_parent_request_destroy()
to handle the creation of parent image requests submitted for
layered image objects.  For simplicity, let rbd_img_request_put()
handle dropping the reference to any image request (parent or not),
and call whichever destructor is appropriate on the last put.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-13 15:06:44 -05:00
Alex Elder
fb65d2284c rbd: define rbd_dev_unparent()
Define rbd_dev_unparent() to encapsulate cleaning up parent data
structures from a layered rbd image.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-13 15:06:43 -05:00
Alex Elder
8785b1d487 rbd: don't release write request until necessary
Previously when a layered write was going to involve a copyup
request, the original osd request was released before submitting the
parent full-object read.  The osd request for the copyup would then
be allocated in rbd_img_obj_parent_read_full_callback().

Shortly we will be handling the event of mapped layered images
getting flattened, and when that occurs we need to resubmit the
original request.  We therefore don't want to release the osd
request until we really konw we're going to replace it--in the
callback function.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-13 15:06:43 -05:00
Alex Elder
642a25375f rbd: get parent info on refresh
Get parent info for format 2 images on every refresh (rather than
just during the initial probe).  This will be needed to detect the
disappearance of the parent image in the event a mapped image
becomes unlayered (i.e., flattened).  Avoid leaking the previous
parent spec on the second and subsequent times this information is
requested by dropping the previous one (if any) before updating it.
(Also, extract the pool id into a local variable before assigning
it into the parent spec.)

Switch to using a non-zero parent overlap value rather than the
existence of a parent (a non-null parent_spec pointer) to determine
whether to mark a request layered.  It will soon be possible for
a layered image to become unlayered while a request is in flight.

This means that the layered flag for an image request indicates that
there was a non-zero parent overlap at the time the image request
was created.  The parent overlap can change thereafter, which may
lead to special handling at request submission or completion time.

This and the next several patches are related to:
    http://tracker.ceph.com/issues/3763

NOTE:
If an error occurs while refreshing the parent info (i.e.,
requesting it after initial probe), the old parent info will
persist.  This is not really correct, and is a scenario that needs
to be addressed.  For now we'll assert that the failure mode is
unlikely, but the issue has been documented in tracker issue 5040.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-13 15:06:33 -05:00
Alex Elder
70cf49cfc7 rbd: ignore zero-overlap parent
An rbd clone image that has an overlap with its parent of 0 is
effectively not a layered image at all.  Detect this case and treat
such an image as non-layered.  Issue a warning to be sure the user
knows what's going on.

This resolves:
    http://tracker.ceph.com/issues/5028

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-13 14:12:41 -05:00
Alex Elder
b91f09f17b rbd: support reading parent page data for writes
Currently, rbd_img_obj_parent_read_full() assumes the incoming
object request contains bio data.  But if a layered image is part of
a multi-layer stack of images it will result in read requests of
page data to parent images.

This is handling the same kind of issue as was resolved by this
commit:
    5b2ab72d  rbd: support reading parent page data

This resolves:
    http://tracker.ceph.com/issues/5027

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-13 14:12:40 -05:00
Alex Elder
ebda6408f2 rbd: fix parent request size assumption
The code that reads object data from the parent for a copyup on
write request currently assumes that the size of that request is the
size of a "full" object from the original target image.

That is not necessarily the case.  The parent overlap could reduce
the request size below that.  To fix that assumption we need to
record the number of pages in the copyup_pages array, for both an
image request and an object request.  Rename a local variable in
rbd_img_obj_parent_read_full_callback() to reflect we're recording
the length of the parent read request, not the size of the target
object.

This resolves:
    http://tracker.ceph.com/issues/5038

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-13 14:09:01 -05:00
Alex Elder
c48f3f86e2 rbd: kill rbd_img_request_get()
Get rid of rbd_img_request_get(), because it isn't used, and maybe
won't ever be needed.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 20:17:00 -05:00
Alex Elder
1f3ef78861 rbd: only set up watch for mapped images
Any changes to parent images are immaterial to any mapped clone.
So there is no need to have a watch event registered on header
objects except for the header object of an image that is mapped.
In fact, a watch request is a write operation, and we may only
have read access to a parent image.

We can't set up the watch request until we know the name of the
header object though.  So pass a flag to rbd_dev_image_probe() to
indicate whether this probe is for a mapping or for a parent image.

Change the second parameter to rbd_dev_header_watch_sync() be
Boolean while we're at it.

This resolves:
    http://tracker.ceph.com/issues/4941

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 20:16:55 -05:00
Alex Elder
7ce4eef7b5 rbd: set mapping read-only flag in rbd_add()
The rbd_dev->mapping field for a parent image is not meaningful.
Since rbd_image_probe() is used both for images being mapped and
their parents, it doesn't make sense to set that flag in that
function.

So move the setting of the mapping.read_only flag out of
rbd_dev_image_probe() and into rbd_add() instead.

This resolves:
    http://tracker.ceph.com/issues/4940

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 20:16:50 -05:00
Alex Elder
5b2ab72d36 rbd: support reading parent page data
Currently, rbd_img_parent_read() assumes the incoming object request
contains bio data.  But if a layered image is part of a multi-layer
stack of images it will result in read requests of page data to parent
images.

Fortunately, it's not hard to add support for page data.

This resolves:
    http://tracker.ceph.com/issues/4939

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 20:16:25 -05:00
Alex Elder
91c6febb38 rbd: fix an incorrect assertion condition
In rbd_img_obj_parent_read_full_callback() there is an assertion
intended to verify the size of the image request for a full parent
read was the size of the original request's target object.  But
assertion was looking at the parent image order rather than the
original one, and these values can differ.

Fix that.

This resolves:
    http://tracker.ceph.com/issues/4938

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 20:16:10 -05:00
Alex Elder
2df3fac758 rbd: define rbd_dev_v2_header_info()
This rearranges rbd_dev_v2_refresh() so it works more like
rbd_dev_v1_header_info().  While format 1 images need to read the
whole header object to get any information, format 2 can collect
almost all information selectively.  So the one-time initialization
will remain in a separate function--based on rbd_dev_v2_probe().

Rename rbd_dev_v2_refresh() to be rbd_dev_v2_header_info(), and have
it call rbd_dev_v2_header_onetime() if it's being called for the
first time for the given rbd device.

Rename rbd_dev_v2_probe() to be rbd_dev_v2_header_onetime() and
remove the image size and snapshot context calls it held in
common with the refresh function.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 17:00:52 -05:00
Alex Elder
99a41ebcee rbd: get rid of trivial v1 header wrappers
Get rid of the trivial wrapper functions rbd_dev_v1_refresh() and
rbd_dev_v1_probe(), substituting rbd_dev_v1_header_read() calls
in their place.

Rename rbd_dev_v1_header_read() to be rbd_dev_v1_header_info(), to
be more generic (it will better reflect what happens with format 2
images).

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 17:00:46 -05:00
Alex Elder
30d60ba2f2 rbd: simplify rbd_dev_v1_probe()
An rbd_dev structure's fields are all zero-filled for an initial
probe, so there's no need to explicitly zero the parent_spec
and parent_overlap fields in rbd_dev_v1_probe().  Removing these
assignments makes rbd_dev_v1_probe() *almost* trivial.

Move the dout() message that announces discovery of an image into
rbd_dev_image_probe(), generalize to support images in either format
and only show it if an image is fully discovered.

This highlights that are some unnecessary cleanups in the error
path for rbd_dev_v1_probe(), so they can be removed.

Now rbd_dev_v1_probe() *is* a trivial wrapper function.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 17:00:41 -05:00
Alex Elder
662518b128 rbd: update in-core header directly
Now that rbd_header_from_disk() only fills in one-time fields once,
we can extend it slightly so it releases the other fields before
replacing their values.  This way there's no need to pass a
temporary buffer and then copy all the results in.  Just use the rbd
device header structure in rbd_header_from_disk() so its values get
updated directly.

Note that this means we need to take the header semaphore at the
point we update things.  So pass the rbd_dev rather than the address
of its header as its first argument to rbd_header_from_disk(), and
have it return an error code.

As a result, rbd_dev_v1_header_read() does all the work,
rbd_read_header() becomes unnecessary, and rbd_dev_v1_refresh()
becomes a very simple wrapper.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 17:00:37 -05:00
Alex Elder
bb23e37acb rbd: refactor rbd_header_from_disk()
This rearranges rbd_header_from_disk so that it:
    - allocates the snapshot context right away
    - keeps results in local variables, not changing the passed-in
      header until it's known we'll succeed
    - does initialization of set-once fields in a header only if
      they have not already been set

The last point is moot at the moment, because rbd_read_header()
(the only caller) always supplies a zero-filled header buffer.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 17:00:33 -05:00
Alex Elder
46578dcdca rbd: zero format 1 header structure earlier
The passed-in header structure is zeroed in rbd_header_from_disk().
Instead, have the caller do it.  Note that there are two callers,
rbd_dev_v1_refresh() and rbd_dev_v1_probe().  The latter already has
a zeroed header structure so zeroing it isn't necessary there.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 17:00:28 -05:00
Alex Elder
f35a4dee14 rbd: set the mapping size and features later
Defer setting the size and features fields of a mapped image until
after the Linux disk structure is set up.  Set the capacity of the
disk after that.

Rearrange the definition of rbd_image_header, separating the fields
that are set only once from those that can be updated.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 17:00:00 -05:00
Alex Elder
51344a38ba rbd: always set read-only flag in rbd_add()
Hold off setting the read-only flag in rbd_add() for an image being
mapped until we have successfully probed the image.  At that point
we know whether it's a snapshot mapping or not, so we can set the
read-only flag in that one place rather than doing so (for
snapshots) in rbd_dev_mapping_set().  To do this, pass a flag to the
image probe routine indicating whether we want a read-only mapping.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:48:12 -05:00
Alex Elder
6d80b130d5 rbd: kill rbd_dev_clear_mapping()
This function is a duplicate of rbd_dev_mapping_clear(), and was
added by mistake.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:48:12 -05:00
Alex Elder
8f4b7d9821 rbd: don't look up snapshot id in rbd_dev_mapping_set()
Currently rbd_dev_mapping_set() looks up the snapshot id for the
snapshot whose name is found in the rbd device's spec structure.

That function gets called by rbd_dev_device_setup(), which is
called by rbd_add() *after* rbd_dev_image_probe().  If the
image probe succeeds, the rbd device's spec will already have
been updated to include names and ids for all fields.

Therefore there's no need to look up the snapshot id in
rbd_dev_mapping_set().

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:48:11 -05:00
Alex Elder
c734b79655 rbd: don't print warning if not mapping a parent
The presence of the LAYERING bit in an rbd image's feature mask does
not guarantee the image actually has a parent image.  Currently that
bit is set only when a clone (i.e., image with a parent) is created,
but it is (currently) not cleared if that clone gets flattened back
into a "normal" image.  A "parent_id" query will leave the
parent_spec for the image being mapped a null pointer, but will not
return an error.

Currently, whenever an image with the LAYERED feature gets mapped, a
warning about the use of layered images gets printed.  But we don't
want to do this for a flattened image, so print the warning only
if we find there is a parent spec after the probe.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:48:11 -05:00
Alex Elder
29334ba49c rbd: kill rbd_update_mapping_size()
Since rbd_update_mapping_size() is now a trivial wrapper, just open
code it in its two callers.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:45:39 -05:00
Alex Elder
00a653e216 rbd: update capacity in rbd_dev_refresh()
When a mapped image changes size, we change the capacity recorded
for the Linux disk associated with it, in rbd_update_mapping_size().
That function is called in two places--the format 1 and format 2
refresh routines.

There is no need to set the capacity while holding the header
semaphore.  Instead, do it in the common rbd_dev_refresh(), using
the logic that's already there to initiate disk revalidation.

Add handling in the request function, just in case a request
that exceeds the capacity of the device comes in (perhaps one
that was started before a refresh shrunk the device).

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:45:30 -05:00
Alex Elder
e627db085e rbd: revalidate only for mapping size changes
This commit:
    d98df63e rbd: revalidate_disk upon rbd resize
instituted a call to revalidate_disk() to notify interested parties
that a mapped image has changed size.  This works well, as long as
the the rbd device doesn't map a snapshot.

A snapshot will never change size.  However, the base image the
snapshot is associated with can, and it can do so while the snapshot
is mapped.

The problem is that the test for the size is looking at the size of
the base image, not the size of the mapped snapshot.  This patch
corrects that.

Update the warning message shown in the event of error, and move
it into the callers.

This resolves:
    http://tracker.ceph.com/issues/4911

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:40:48 -05:00
Alex Elder
49ece55428 rbd: fix leak of format 2 snapshot context
When rbd_dev_v2_refresh() is called, the rbd device already has a
snapshot context associated with it.  But that never gets freed,
the pointer just gets overwritten.

Fix this by dropping the rbd device's reference to the snapshot
context before overwriting the pointer.

Because ceph_put_snap_context() already handles for a null pointer
we don't need to check for that (for the probe case, where no
context has yet been assigned).

This resolves:
    http://tracker.ceph.com/issues/4912

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-08 07:38:30 -05:00
Alex Elder
b5b09be30c rbd: fix image request leak on parent read
When a read for a layered image object finds the target object
doesn't exist, a read image request for the parent image is created
and submitted.  When that completes, the callback routine was
not releasing that parent image request.  Fix that.

The slab allocation stuff just added has greatly simplified the
search for the source of this memory leak.

This resolves:
    http://tracker.ceph.com/issues/4803

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-02 12:15:28 -05:00
Alex Elder
78c2a44aae rbd: allocate image object names with a slab allocator
The names of objects used for image object requests are always fixed
size.  So create a slab cache to manage them.  Define a new function
rbd_segment_name_free() to match rbd_segment_name() (which is what
supplies the dynamically-allocated name buffer).

This is part of:
    http://tracker.ceph.com/issues/3926

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-02 11:58:30 -05:00
Alex Elder
868311b1eb rbd: allocate object requests with a slab allocator
Create a slab cache to manage rbd_obj_request allocation.  We aren't
using a constructor, and we'll zero-fill object request structures
when they're allocated.

This is part of:
    http://tracker.ceph.com/issues/3926

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-02 11:58:30 -05:00
Alex Elder
f907ad5596 rbd: allocate name separate from obj_request
The next patch will define a slab allocator for a object requests.
To use that we'll need to allocate the name of an object separate
from the request structure itself.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-02 11:58:29 -05:00
Alex Elder
1c2a9dfe21 rbd: allocate image requests with a slab allocator
Create a slab cache to manage rbd_img_request allocation.  Nothing
too fancy at this point--we'll still initialize everything at
allocation time (no constructor)

This is part of:
    http://tracker.ceph.com/issues/3926

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-02 11:58:29 -05:00
Alex Elder
30d1cff817 rbd: use binary search for snapshot lookup
Use bsearch(3) to make snapshot lookup by id more efficient.  (There
could be thousands of snapshots, and conceivably many more.)

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-02 11:58:17 -05:00
Alex Elder
15228ede7d rbd: clear EXISTS flag if mapped snapshot disappears
This functionality inadvertently disappeared in the last patch.

Image snapshots can get removed at just about any time.  In
particular it can disappear even if it is in use by an rbd
client as a mapped image.

The rbd client deals with such a disappearance by responding to new
requests with ENXIO.  This is implemented by each rbd device
maintaining an EXISTS flag, which is normally set but cleared if a
snapshot disappears.

This patch (re-)implements the clearing of that flag.

Whenever mapped image header information is refreshed, if the
mapping is for a snapshot, verify the mapped snapshot is still
present in the updated snapshot context.  If it is not, clear the
flag.

It is not necessary to check this in the initial probe, because the
probe will not succeed if the snapshot doesn't exist.

This resolves:
    http://tracker.ceph.com/issues/4880

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-02 11:57:03 -05:00
Alex Elder
33dca39f5c rbd: kill off the snapshot list
We no longer use the snapshot list for anything.  When we need to
look up a snapshot name, id, size, or feature mask, we just do it
directly rather than relying on this list being updated with every
refresh.  The main reason it existed was for the benefit of the
device/sysfs entries that previously were associated with snapshots.

So get rid of the snapshot list, and struct rbd_snap, and the
hundreds of lines of code that supported them.

This resolves:
    http://tracker.ceph.com/issues/4868

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:22 -07:00
Alex Elder
2ad3d7167e rbd: define rbd_snap_size() and rbd_snap_features()
This patch defines a handful of new functions that will allow
us to get rid of the rbd device structure's list of snapshots.

Define rbd_snap_id_by_name() to look up a snapshot id given its
name.  This is efficient for format 1 images but not for format 2.
Fortunately it only gets called at mapping time so it's not that
critical.

Use rbd_snap_id_by_name() to find out the id for a snapshot getting
mapped, and pass that id to new functions rbd_snap_size() and
rbd_snap_features() to look up information about a given snapshot's
size and feature mask given its snapshot id.  All this gets done
in rbd_dev_mapping_set().

As a result, snap_by_name() is no longer needed, so get rid of it.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:20 -07:00
Alex Elder
54cac61fb6 rbd: use snap_id not index to look up snap info
In order to align with what was needed for format 1 rbd images,
rbd_dev_v2_snap_info() was set up to take as argument an index into
the array of snapshot ids in a rbd device's snapshot context.

This switches that around, so we pass the snapshot id instead.
In doing this, rbd_snap_name() now returns a dynamically-allocated
string rather than a fixed one, so there's no need to make a
duplicate in its caller, rbd_dev_spec_update().

This means the following functions take a snapshot id where they
previously used an index value:
    rbd_dev_snap_info()
    rbd_dev_v1_snap_info()
    rbd_dev_v2_snap_info()

A new function, rbd_dev_snap_index(), determines the snap index for
format 1 images and uses it to look up the name.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:19 -07:00
Alex Elder
9682fc6d3a rbd: look up snapshot name in names buffer
Rather than scanning the list of snapshot structures for it, scan
the snapshot context buffer containing snapshot names in order to
determine for a format 1 image the name associated with a given
snapshot id.

Pull out the part of rbd_dev_v1_snap_info() that does this scan into
a new function, _rbd_dev_v1_snap_name().  Have that function return
a dynamically-allocated copy of the name, and don't duplicate it in
rbd_dev_v1_snap_info().

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:18 -07:00
Alex Elder
dedc81ea84 rbd: drop obj_request->version
Nothing ever uses the version field maintained in the object request
structure any more, so get rid of it.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:17 -07:00
Alex Elder
e2a58ee55b rbd: drop rbd_obj_method_sync() version parameter
Only NULL is passed as the version argument to rbd_obj_method_sync(),
so get rid of it.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:16 -07:00
Alex Elder
cc4a38bdd5 rbd: more version parameter removal
Continued from the last patch, more parameters that can go away
because we no longer have a need to track object versions.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:15 -07:00
Alex Elder
7097f8df6e rbd: get rid of some version parameters
Several functions in rbd have parameters meant to allow the version
of an object to be passed in or out.  The purpose of those was to
allow the version of a header object to be maintained, but we no
longer do that.  As a result, these parameters are never actually
needed or used, so get rid of them.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:14 -07:00
Alex Elder
b21ebdddeb rbd: stop tracking header object version
The rbd code takes care to maintain the version of the header
object.  This was done in hopes of using it to detect a change in
the object between reading it and setting up a watch request to
be notified of changes.

The mechanism was never fully implemented, however.  And we now
avoid the original problem by setting up the watch request before
ever reading the content of the header.

The osd doesn't interpret the object version supplied with a WATCH
osd op, nor does it use the version supplied with a NOTIFY_ACK op
(we can just supply 0 for both).  There is therefore no need to
maintain the header's object version any more, so stop doing so.

We'll be able to simplify some more rbd code in the next few patches
as a result of this.

This resolves:
    http://tracker.ceph.com/issues/3952

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:13 -07:00
Alex Elder
cb75223d2b rbd: snap names are pointer to constant data
Make explicit that snapshot names don't change by making functions
return and take parameters that that point to const qualified data.

This resolves:
    http://tracker.ceph.com/issues/4867

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:12 -07:00
Alex Elder
a3fbe5d447 rbd: don't revalidate so much
Whenever a header object event causes a mapped rbd image to refresh
its header information, revalidate_disk() is being called.  This was
done in rbd_dev_refresh() outside the control mutex in order to
avoid a lock inversion.  Although a an event like this *might*
indicate the image has changed size, most of the time it does not.

Record the image size before and after the refresh, and only
call revalidate_disk() if it changes.

This resolves:
    http://tracker.ceph.com/issues/4867

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:11 -07:00
Alex Elder
96882f55c4 rbd: fix up the layering warning message
A warning gets spewed for any image being probed, including parent
images.  Set up a condition such that the warning message only gets
printed for the image being mapped, not any of its parents.

Also, I didn't like the way the warning ended up being so long.
Make it a terse warning instead.  People experimenting with layering
will know what the message means.

This is part of:
    http://tracker.ceph.com/issues/4867

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:10 -07:00
Alex Elder
812164f8c3 ceph: use ceph_create_snap_context()
Now that we have a library routine to create snap contexts, use it.

This is part of:
    http://tracker.ceph.com/issues/4857

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:09 -07:00
Alex Elder
b536f69a3a rbd: set up devices only for mapped images
Stop setting up Linux devices during the image probe operation.
Instead, set up the devices as a separate step after the image
probe, in rbd_add().

A consequence of this is that only mapped images get devices
assigned to them, which is pretty sweet.

This resolves:
    http://tracker.ceph.com/issues/4774

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:07 -07:00
Alex Elder
8ad42cd0c0 rbd: don't have device release destroy rbd_dev
Currently an rbd_device structure gets destroyed from the release
routine for the device embedded within it.  Stop doing that, instead
calling rbd_dev_image_release() right after rbd_bus_del_dev()
wherever the latter is called.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:05 -07:00
Alex Elder
6fd48b3be9 rbd: define rbd_dev_unprobe()
Define a new function rbd_dev_unprobe() which undoes state changes
that occur from calling rbd_dev_v1_probe() or rbd_dev_v2_probe().
Note that this is a superset of rbd_header_free(), which is now
getting removed (it seems to have been used improperly anyway).

Flesh out rbd_dev_image_release() so it undoes exactly what
rbd_dev_image_probe() does.

This means that:
    - rbd_dev_device_release() gets called when the last device
      reference gets dropped;
    - that undoes everything done by the rbd_dev_device_setup() call
      at the end of rbd_dev_image_probe() (and nothing more), ending
      by calling rbd_dev_image_release(); and
    - rbd_dev_image_release() undoes everything else done by
      rbd_dev_image_probe() (and this includes a call to
      rbd_dev_unprobe().

This means the image and device portions of an rbd device are fairly
cleanly separated now, so error paths should be a little easier to
verify than they used to be.

Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
2013-05-01 21:20:04 -07:00