linux/drivers
Nikos Tsironis 694cfe7f31 dm thin: Flush data device before committing metadata
The thin provisioning target maintains per thin device mappings that map
virtual blocks to data blocks in the data device.

When we write to a shared block, in case of internal snapshots, or
provision a new block, in case of external snapshots, we copy the shared
block to a new data block (COW), update the mapping for the relevant
virtual block and then issue the write to the new data block.

Suppose the data device has a volatile write-back cache and the
following sequence of events occur:

1. We write to a shared block
2. A new data block is allocated
3. We copy the shared block to the new data block using kcopyd (COW)
4. We insert the new mapping for the virtual block in the btree for that
   thin device.
5. The commit timeout expires and we commit the metadata, that now
   includes the new mapping from step (4).
6. The system crashes and the data device's cache has not been flushed,
   meaning that the COWed data are lost.

The next time we read that virtual block of the thin device we read it
from the data block allocated in step (2), since the metadata have been
successfully committed. The data are lost due to the crash, so we read
garbage instead of the old, shared data.

This has the following implications:

1. In case of writes to shared blocks, with size smaller than the pool's
   block size (which means we first copy the whole block and then issue
   the smaller write), we corrupt data that the user never touched.

2. In case of writes to shared blocks, with size equal to the device's
   logical block size, we fail to provide atomic sector writes. When the
   system recovers the user will read garbage from that sector instead
   of the old data or the new data.

3. Even for writes to shared blocks, with size equal to the pool's block
   size (overwrites), after the system recovers, the written sectors
   will contain garbage instead of a random mix of sectors containing
   either old data or new data, thus we fail again to provide atomic
   sectors writes.

4. Even when the user flushes the thin device, because we first commit
   the metadata and then pass down the flush, the same risk for
   corruption exists (if the system crashes after the metadata have been
   committed but before the flush is passed down to the data device.)

The only case which is unaffected is that of writes with size equal to
the pool's block size and with the FUA flag set. But, because FUA writes
trigger metadata commits, this case can trigger the corruption
indirectly.

Moreover, apart from internal and external snapshots, the same issue
exists for newly provisioned blocks, when block zeroing is enabled.
After the system recovers the provisioned blocks might contain garbage
instead of zeroes.

To solve this and avoid the potential data corruption we flush the
pool's data device **before** committing its metadata.

This ensures that the data blocks of any newly inserted mappings are
properly written to non-volatile storage and won't be lost in case of a
crash.

Cc: stable@vger.kernel.org
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-12-06 11:46:16 -05:00
..
accessibility
acpi Power management fix for 5.4-rc6 2019-11-01 09:30:48 -07:00
amba ARM updates for 5.4-rc: 2019-10-23 06:26:33 -04:00
android binder: Don't modify VMA bounds in ->mmap handler 2019-10-17 05:58:44 -07:00
ata ata: libahci_platform: Fix regulator_get_optional() misuse 2019-10-25 14:22:20 -06:00
atm atm: he: clean up an indentation issue 2019-09-25 13:54:45 +02:00
auxdisplay It's a somewhat calmer cycle for docs this time, as the churn of the mass 2019-09-17 16:22:26 -07:00
base PM: QoS: Drop frequency QoS types from device PM QoS 2019-10-21 02:05:21 +02:00
bcma bcma: make arrays pwr_info_offset and sprom_sizes static const, shrinks object size 2019-09-13 16:44:49 +03:00
block nbd: verify socket is supported during setup 2019-10-25 14:37:21 -06:00
bluetooth Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-09-15 14:17:27 +02:00
bus bus: ti-sysc: Fix watchdog quirk handling 2019-10-18 08:45:32 -07:00
cdrom
char char/random: Add a newline at the end of the file 2019-10-02 13:49:43 -07:00
clk Merge tag 'fix-missing-panels' into fixes 2019-10-04 09:06:41 -07:00
clocksource timer-of: don't use conditional expression with mixed 'void' types 2019-10-02 16:16:07 -07:00
connector
counter
cpufreq cpufreq: Cancel policy update work scheduled before freeing 2019-10-22 18:07:30 +02:00
cpuidle cpuidle: haltpoll: Take 'idle=' override into account 2019-10-22 11:43:17 +02:00
crypto inet: stop leaking jiffies on the wire 2019-11-01 14:57:52 -07:00
dax
dca
devfreq
dio
dma dmaengine: cppi41: Fix cppi41_dma_prep_slave_sg() when idle 2019-10-23 21:15:21 +05:30
dma-buf dma-buf/resv: fix exclusive fence get 2019-10-10 17:05:20 +02:00
edac EDAC/ghes: Fix Use after free in ghes_edac remove path 2019-10-17 11:27:05 +02:00
eisa
extcon chrome platform changes for v5.4 2019-09-19 14:14:28 -07:00
firewire
firmware efi/efi_test: Lock down /dev/efi_test and require CAP_SYS_ADMIN 2019-10-31 09:40:21 +01:00
fpga Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
fsi
gnss
gpio gpio: lynxpoint: set default handler to be handle_bad_irq() 2019-10-15 01:19:05 +02:00
gpu Merge tag 'drm-fixes-5.4-2019-10-30' of git://people.freedesktop.org/~agd5f/linux into drm-fixes 2019-11-01 11:27:39 +10:00
greybus
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid 2019-10-28 14:26:33 +01:00
hsi HSI changes for the 5.4 series 2019-09-22 12:02:21 -07:00
hv Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM_SLEEP 2019-10-01 14:49:45 -04:00
hwmon hwmon: (ina3221) Fix read timeout issue 2019-10-28 18:46:55 -07:00
hwspinlock
hwtracing Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
i2c i2c: stm32f7: remove warning when compiling with W=1 2019-10-24 20:52:21 +02:00
i3c
ide
idle
iio First set of IIO fixes for the 5.4 cycle. 2019-10-10 11:18:37 +02:00
infiniband RDMA/hns: Prevent memory leaks of eq->buf_list 2019-10-28 15:06:38 -03:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2019-10-25 17:31:53 -04:00
interconnect
iommu iommu/vt-d: Fix panic after kexec -p for kdump 2019-10-30 10:30:22 +01:00
ipack
irqchip irqchip updates for 5.4, take 2 2019-10-25 14:25:15 +02:00
isdn net: use skb_queue_empty_lockless() in poll() handlers 2019-10-28 13:33:41 -07:00
leds leds: lm3532: Fix optional led-max-microamp prop error handling 2019-09-12 20:45:52 +02:00
lightnvm lightnvm: print error when target is not found 2019-09-05 13:17:01 -06:00
macintosh cpufreq: Use per-policy frequency QoS 2019-10-21 02:05:21 +02:00
mailbox mailbox: qcom-apcs: fix max_register value 2019-09-17 00:54:29 -05:00
mcb
md dm thin: Flush data device before committing metadata 2019-12-06 11:46:16 -05:00
media media: stkwebcam: fix runtime PM after driver unbind 2019-10-04 14:38:46 +02:00
memory iommu/mediatek: Clean up struct mtk_smi_iommu 2019-08-30 15:57:27 +02:00
memstick memstick: jmb38x_ms: Fix an error handling path in 'jmb38x_ms_probe()' 2019-10-09 11:08:03 +02:00
message
mfd mfd: mt6397: Fix probe after changing mt6397-core 2019-10-24 08:49:25 +01:00
misc misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach 2019-10-04 18:22:14 +02:00
mmc mmc: mxs: fix flags passed to dmaengine_prep_slave_sg 2019-10-21 16:16:38 +02:00
mtd mtd: rawnand: au1550nd: Fix au_read_buf16() prototype 2019-10-07 09:56:36 +02:00
mux
net r8169: fix wrong PHY ID issue with RTL8168dp 2019-11-01 15:09:40 -07:00
nfc NFC: pn533: fix use-after-free and memleaks 2019-10-08 16:52:26 -07:00
ntb NTB: fix IDT Kconfig typos/spellos 2019-09-23 17:20:40 -04:00
nubus
nvdimm libnvdimm fixes v5.4-rc1 2019-09-29 10:33:41 -07:00
nvme Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-11-01 17:48:11 -07:00
nvmem Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
of of: reserved_mem: add missing of_node_put() for proper ref-counting 2019-10-23 15:15:05 -05:00
opp opp: Reinitialize the list_kref before adding the static OPPs again 2019-10-23 10:58:44 +05:30
oprofile
parisc parisc: Remove 32-bit DMA enforcement from sba_iommu 2019-10-14 21:44:26 +02:00
parport Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
pci PCI: PM: Fix pci_power_up() 2019-10-15 23:51:36 +02:00
pcmcia Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-09-28 08:14:15 -07:00
perf Merge branches 'for-next/52-bit-kva', 'for-next/cpu-topology', 'for-next/error-injection', 'for-next/perf', 'for-next/psci-cpuidle', 'for-next/rng', 'for-next/smpboot', 'for-next/tbi' and 'for-next/tlbi' into for-next/core 2019-08-30 12:46:12 +01:00
phy pci-v5.4-changes 2019-09-23 19:16:01 -07:00
pinctrl pinctrl: aspeed-g6: Rename SD3 to EMMC and rework pin groups 2019-10-16 15:58:27 +02:00
platform platform/x86: i2c-multi-instantiate: Fail the probe if no IRQ provided 2019-10-14 15:31:50 +03:00
pnp
power power supply and reset changes for the v5.4 series 2019-09-22 12:04:59 -07:00
powercap Power management updates for 5.4-rc1 2019-09-17 19:15:14 -07:00
pps
ps3
ptp ptp: fix typo of "mechanism" in Kconfig help text 2019-10-07 14:55:46 -04:00
pwm Revert "pwm: Let pwm_get_state() return the last implemented state" 2019-10-21 16:48:52 +02:00
rapidio
ras
regulator regulator: Fixes for v5.4 2019-10-23 15:31:17 -04:00
remoteproc remoteproc updates for v5.4 2019-09-22 10:55:08 -07:00
reset ARM: SoC fixes 2019-09-30 10:04:28 -07:00
rpmsg rpmsg: glink-smem: Name the edge based on parent remoteproc 2019-09-17 15:33:31 -07:00
rtc RTC for 5.4 2019-09-22 11:05:43 -07:00
s390 s390/zcrypt: fix memleak at release 2019-10-22 17:55:51 +02:00
sbus
scsi SCSI fixes on 20191101 2019-11-02 11:15:52 -07:00
sfi
sh
siox
slimbus
soc soc: imx: imx-scu: Getting UID from SCU should have response 2019-10-06 09:21:38 +08:00
soundwire soundwire updates for v5.4-rc1 2019-09-22 10:52:23 -07:00
spi LED updates for 5.4-rc1 2019-09-17 18:40:42 -07:00
spmi
ssb ssb: make array pwr_info_offset static const, makes object smaller 2019-09-13 17:23:18 +03:00
staging staging: wlan-ng: fix exit return when sme->key_idx >= NUM_WEPKEYS 2019-10-14 15:40:08 +02:00
target SCSI fixes on 20191101 2019-11-02 11:15:52 -07:00
tc
tee tee/shm: untag user pointers in tee_shm_register 2019-09-25 17:51:41 -07:00
thermal cpufreq: Use per-policy frequency QoS 2019-10-21 02:05:21 +02:00
thunderbolt
tty 8250-men-mcb: fix error checking when get_num_ports returns -ENODEV 2019-10-15 21:38:41 +02:00
uio Char/Misc driver patches for 5.4-rc1 2019-09-18 11:14:31 -07:00
usb usb: dwc3: gadget: fix race when disabling ep with cancelled xfers 2019-10-31 18:57:54 +01:00
vfio vfio/type1: Initialize resv_msi_base 2019-10-15 14:07:01 -06:00
vhost vringh: fix copy direction of vringh_iov_push_kern() 2019-10-28 04:25:04 -04:00
video video/logo: do not generate unneeded logo C files 2019-10-05 15:29:49 +09:00
virt virt: vbox: fix memory leak in hgcm_call_preprocess_linaddr 2019-10-10 14:50:32 +02:00
virtio virtio_ring: fix stalls for packed rings 2019-10-28 04:24:46 -04:00
visorbus
vlynq
vme
w1 w1: ds250x: Fix build error without CRC16 2019-10-10 15:35:41 +02:00
watchdog linux-watchdog 5.4-rc1 tag 2019-09-27 11:17:38 -07:00
xen Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2019-10-19 17:09:11 -04:00
zorro
Kconfig Staging/IIO driver patches for 5.4-rc1 2019-09-18 11:05:34 -07:00
Makefile Staging/IIO driver patches for 5.4-rc1 2019-09-18 11:05:34 -07:00