linux/drivers/md
Nikos Tsironis 8b3fd1f53a dm clone: Flush destination device before committing metadata
dm-clone maintains an on-disk bitmap which records which regions are
valid in the destination device, i.e., which regions have already been
hydrated, or have been written to directly, via user I/O.

Setting a bit in the on-disk bitmap means the corresponding region is
valid in the destination device and we redirect all I/O regarding it to
the destination device.
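
As a rough illustration of the bitmap's role, the sketch below models the
redirect decision for reads. It is a toy model, not dm-clone's code: the
names `REGION_SIZE`, `region_of` and `target_for_read` are hypothetical,
and the region size is an arbitrary value chosen for the example.

```python
# Toy model of bitmap-driven redirection; all names are hypothetical.
REGION_SIZE = 8  # sectors per region (assumed value, for illustration only)


def region_of(sector):
    """Map a sector number to the index of the region that contains it."""
    return sector // REGION_SIZE


def target_for_read(bitmap, sector):
    """Pick the device that should service a read of `sector`.

    A set bit means the region is valid in the destination device;
    a clear bit means the data still lives only in the source device.
    """
    return "destination" if bitmap[region_of(sector)] else "source"


bitmap = [False] * 4   # four regions, none hydrated yet
bitmap[1] = True       # region 1 has been hydrated

print(target_for_read(bitmap, 3))   # sector 3 is in region 0
print(target_for_read(bitmap, 9))   # sector 9 is in region 1
```

The point of the sketch is that the bitmap alone decides where a read
goes, so a bit that is set too early sends reads to data that may not be
there.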

Suppose the destination device has a volatile write-back cache and the
following sequence of events occurs:

1. A region gets hydrated, either through the background hydration or
   because it was written to directly, via user I/O.

2. The commit timeout expires and we commit the metadata, marking that
   region as valid in the destination device.

3. The system crashes and the destination device's cache has not been
   flushed, meaning the region's data are lost.

The next time we read that region we read it from the destination
device, since the metadata have been successfully committed, but the
data are lost due to the crash, so we read garbage instead of the old
data.
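
The sequence above can be replayed with a small simulation. This is only a
sketch under simplifying assumptions: `CachedDevice` is a hypothetical
model of a device with a volatile write-back cache, the string "garbage"
stands in for whatever the destination held before hydration, and the
bitmap is assumed to be durable as soon as it is committed.

```python
class CachedDevice:
    """Hypothetical model of a block device with a volatile write-back cache."""

    def __init__(self, regions):
        self.stable = ["garbage"] * regions  # contents that survive a crash
        self.cache = {}                      # contents that are lost on a crash

    def write(self, region, data):
        self.cache[region] = data            # writes land in the cache only

    def read(self, region):
        return self.cache.get(region, self.stable[region])

    def flush(self):
        self.stable = [self.cache.get(r, d) for r, d in enumerate(self.stable)]
        self.cache.clear()

    def crash(self):
        self.cache.clear()                   # unflushed writes disappear


source = ["old-0", "old-1"]
dest = CachedDevice(2)
bitmap = [False, False]    # on-disk bitmap, assumed durable once committed

dest.write(0, source[0])   # 1. region 0 gets hydrated
bitmap[0] = True           # 2. metadata is committed without a flush
dest.crash()               # 3. crash before the cache is written back


def read_region(region):
    """Read from the destination if the bitmap says the region is valid."""
    return dest.read(region) if bitmap[region] else source[region]


print(read_region(0))
print(read_region(1))
```

In this model region 0 is read from the destination even though its data
never reached stable storage, while region 1, whose bit was never set, is
still read correctly from the source.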

This has several implications:

1. In case of background hydration or of writes with size smaller than
   the region size (which means we first copy the whole region and then
   issue the smaller write), we corrupt data that the user never
   touched.

2. In case of writes with size equal to the device's logical block size,
   we fail to provide atomic sector writes. When the system recovers the
   user will read garbage from the sector instead of the old data or the
   new data.

3. In case of writes without the FUA flag set, after the system
   recovers, the written sectors will contain garbage instead of a
   random mix of sectors containing either old data or new data, thus we
   fail again to provide atomic sector writes.

4. Even when the user flushes the dm-clone device, because we first
   commit the metadata and then pass down the flush, the same risk for
   corruption exists (if the system crashes after the metadata have been
   committed but before the flush is passed down).

The only case which is unaffected is that of writes with size equal to
the region size and with the FUA flag set. But, because FUA writes
trigger metadata commits, this case can trigger the corruption
indirectly.

To solve this and avoid the potential data corruption we flush the
destination device **before** committing the metadata.

This ensures that any freshly hydrated regions, for which we commit the
metadata, are properly written to non-volatile storage and won't be lost
in case of a crash.
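
A minimal sketch of why the ordering matters, again using a toy model
rather than the actual dm-clone code. `CachedDevice` and
`hydrate_commit_crash` are hypothetical names, "garbage" stands in for
the destination's prior contents, and the bitmap is assumed to be durable
once committed.

```python
class CachedDevice:
    """Hypothetical model of a block device with a volatile write-back cache."""

    def __init__(self, regions):
        self.stable = ["garbage"] * regions  # contents that survive a crash
        self.cache = {}                      # contents that are lost on a crash

    def write(self, region, data):
        self.cache[region] = data

    def read(self, region):
        return self.cache.get(region, self.stable[region])

    def flush(self):
        self.stable = [self.cache.get(r, d) for r, d in enumerate(self.stable)]
        self.cache.clear()

    def crash(self):
        self.cache.clear()


def hydrate_commit_crash(flush_first, crash_before_commit=False):
    """Hydrate one region, optionally flush, commit, crash, then read back."""
    source = ["old-0"]
    dest = CachedDevice(1)
    bitmap = [False]

    dest.write(0, source[0])        # hydrate region 0
    if flush_first:
        dest.flush()                # flush the destination before the commit
    if not crash_before_commit:
        bitmap[0] = True            # commit the metadata
    dest.crash()

    # After recovery, the bitmap decides which device services the read.
    return dest.read(0) if bitmap[0] else source[0]


print(hydrate_commit_crash(flush_first=False))
print(hydrate_commit_crash(flush_first=True))
print(hydrate_commit_crash(flush_first=True, crash_before_commit=True))
```

With the flush first, a crash at either point is harmless in this model:
either the bit is set and the data is already on stable storage, or the
bit is not set and the read falls back to the source device.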

Fixes: 7431b7835f ("dm: add clone target")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-12-05 17:05:23 -05:00
bcache for-5.4/block-2019-09-16 2019-09-17 16:57:47 -07:00
persistent-data dm btree: increase rebalance threshold in __rebalance2() 2019-12-05 15:27:52 -05:00
dm-bio-prison-v1.c dm bio prison: replace spin_lock_irqsave with spin_lock_irq 2019-11-05 14:53:03 -05:00
dm-bio-prison-v1.h
dm-bio-prison-v2.c dm bio prison: replace spin_lock_irqsave with spin_lock_irq 2019-11-05 14:53:03 -05:00
dm-bio-prison-v2.h
dm-bio-record.h
dm-bufio.c dm bufio: introduce a global cache replacement 2019-09-13 17:00:21 -04:00
dm-builtin.c
dm-cache-background-tracker.c
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c dm cache metadata: Fix loading discard bitset 2019-04-18 16:18:25 -04:00
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c dm cache: replace spin_lock_irqsave with spin_lock_irq 2019-11-05 14:53:04 -05:00
dm-clone-metadata.c dm clone metadata: Use a two phase commit 2019-12-05 15:27:54 -05:00
dm-clone-metadata.h dm clone metadata: Use a two phase commit 2019-12-05 15:27:54 -05:00
dm-clone-target.c dm clone: Flush destination device before committing metadata 2019-12-05 17:05:23 -05:00
dm-core.h dm: disable DISCARD if the underlying storage no longer supports it 2019-04-04 15:33:59 -04:00
dm-crypt.c Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues" 2019-11-20 17:27:39 -05:00
dm-delay.c dm delay: fix a crash when invalid device is specified 2019-04-26 11:29:32 -04:00
dm-dust.c dm dust: add limited write failure mode 2019-11-05 15:25:34 -05:00
dm-era-target.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
dm-exception-store.c
dm-exception-store.h - Improve DM snapshot target's scalability by using finer grained 2019-05-16 15:55:48 -07:00
dm-flakey.c block: Kill gfp_t argument of blkdev_report_zones() 2019-07-11 20:04:37 -06:00
dm-init.c docs: device-mapper: move it to the admin-guide 2019-07-15 11:03:01 -03:00
dm-integrity.c dm integrity: fix excessive alignment of metadata runs 2019-11-15 14:49:16 -05:00
dm-io.c
dm-ioctl.c dm: introduce DM_GET_TARGET_VERSION 2019-09-16 10:18:01 -04:00
dm-kcopyd.c dm kcopyd: always complete failed jobs 2019-08-15 15:57:39 -04:00
dm-linear.c block: Kill gfp_t argument of blkdev_report_zones() 2019-07-11 20:04:37 -06:00
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c dm log writes: fix incorrect comment about the logged sequence example 2019-07-09 14:13:33 -04:00
dm-log.c
dm-mpath.c dm mpath: remove harmful bio-based optimization 2019-11-26 10:22:46 -05:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c
dm-raid1.c dm raid1: use struct_size() with kzalloc() 2019-08-26 11:05:32 -04:00
dm-raid.c dm raid: Remove unnecessary negation of a shift in raid10_format_to_md_layout 2019-11-07 11:59:38 -05:00
dm-region-hash.c
dm-round-robin.c
dm-rq.c block: Delay default elevator initialization 2019-09-05 19:52:34 -06:00
dm-rq.h
dm-service-time.c
dm-snap-persistent.c
dm-snap-transient.c
dm-snap.c dm snapshot: rework COW throttling to fix deadlock 2019-10-10 09:46:05 -04:00
dm-stats.c dm stats: use struct_size() helper 2019-09-04 09:39:22 -04:00
dm-stats.h
dm-stripe.c dm stripe: use struct_size() in kmalloc() 2019-11-05 14:09:59 -05:00
dm-switch.c
dm-sysfs.c
dm-table.c dm table: do not allow request-based DM to stack on partitions 2019-11-05 11:22:52 -05:00
dm-target.c dm mpath: fix missing call of path selector type->end_io 2019-04-25 15:38:52 -04:00
dm-thin-metadata.c dm thin metadata: check if in fail_io mode when setting needs_check 2019-07-02 15:50:08 -04:00
dm-thin-metadata.h
dm-thin.c dm thin: wakeup worker only when deferred bios exist 2019-11-18 10:03:12 -05:00
dm-uevent.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
dm-uevent.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 156 2019-05-30 11:26:35 -07:00
dm-unstripe.c
dm-verity-fec.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
dm-verity-fec.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
dm-verity-target.c dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-verity-verify-sig.c dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-verity-verify-sig.h dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-verity.h dm verity: add root hash pkcs#7 signature verification 2019-08-23 10:13:14 -04:00
dm-writecache.c dm writecache: handle REQ_FUA 2019-11-05 14:21:40 -05:00
dm-zero.c
dm-zoned-metadata.c dm zoned: reduce overhead of backing device checks 2019-11-07 10:08:36 -05:00
dm-zoned-reclaim.c dm zoned: reduce overhead of backing device checks 2019-11-07 10:08:36 -05:00
dm-zoned-target.c dm zoned: reduce overhead of backing device checks 2019-11-07 10:08:36 -05:00
dm-zoned.h dm zoned: reduce overhead of backing device checks 2019-11-07 10:08:36 -05:00
dm.c dm: make dm_table_find_target return NULL 2019-08-23 10:13:12 -04:00
dm.h dm: make dm_table_find_target return NULL 2019-08-23 10:13:12 -04:00
Kconfig dm: Fix Kconfig indentation 2019-11-20 10:35:31 -05:00
Makefile dm: add clone target 2019-09-12 09:32:31 -04:00
md-bitmap.c md-bitmap: create and destroy wb_info_pool with the change of bitmap 2019-06-20 16:36:00 -07:00
md-bitmap.h
md-cluster.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 45 2019-05-24 17:27:12 +02:00
md-cluster.h
md-faulty.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 47 2019-05-24 17:27:13 +02:00
md-linear.c md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone 2019-09-03 14:49:28 -07:00
md-linear.h
md-multipath.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 47 2019-05-24 17:27:13 +02:00
md-multipath.h
md.c md: add feature flag MD_FEATURE_RAID0_LAYOUT 2019-09-13 13:10:06 -07:00
md.h md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone 2019-09-03 14:49:28 -07:00
raid0.c md/raid0: fix warning message for parameter default_layout 2019-10-16 09:43:02 -07:00
raid0.h md/raid0: avoid RAID0 data corruption due to layout confusion. 2019-09-13 13:10:05 -07:00
raid1-10.c md: raid1-10: Unify r{1,10}bio_pool_free 2019-06-15 01:37:35 -06:00
raid1.c md/raid1: fail run raid1 array when active disk less than one 2019-09-03 14:52:03 -07:00
raid1.h
raid5-cache.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
raid5-log.h raid5: set write hint for PPL 2019-03-12 10:15:18 -07:00
raid5-ppl.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
raid5.c raid5: remove STRIPE_OPS_REQ_PENDING 2019-09-13 13:14:39 -07:00
raid5.h raid5: use bio_end_sector in r5_next_bio 2019-09-13 13:14:43 -07:00
raid10.c md: allow last device to be forcibly removed from RAID1/RAID10. 2019-08-07 10:25:02 -07:00
raid10.h