linux/fs/btrfs
Filipe Manana 2be63d5ce9 Btrfs: fix file loss on log replay after renaming a file and fsync
We have two cases where we end up deleting a file at log replay time
when we should not. For this to happen the file must have been renamed
and a directory inode must have been fsynced/logged.

Two examples that exercise these two cases are listed below.

  Case 1)

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkdir -p /mnt/a/b
  $ mkdir /mnt/c
  $ touch /mnt/a/b/foo
  $ sync
  $ mv /mnt/a/b/foo /mnt/c/
  # Create file bar just to make sure the fsync on directory a/ does
  # something and it's not a no-op.
  $ touch /mnt/a/bar
  $ xfs_io -c "fsync" /mnt/a
  < power fail / crash >

  The next time the filesystem is mounted, the log replay procedure
  deletes file foo.

  Case 2)

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt
  $ mkdir /mnt/a
  $ mkdir /mnt/b
  $ mkdir /mnt/c
  $ touch /mnt/a/foo
  $ ln /mnt/a/foo /mnt/b/foo_link
  $ touch /mnt/b/bar
  $ sync
  $ unlink /mnt/b/foo_link
  $ mv /mnt/b/bar /mnt/c/
  $ xfs_io -c "fsync" /mnt/a/foo
  < power fail / crash >

  The next time the filesystem is mounted, the log replay procedure
  deletes file bar.

The files are deleted because, when we log inodes other than the fsync
target inode, we ignore their last_unlink_trans value and leave the log
without enough information to later replay the rename operations. So we
need to look at the last_unlink_trans values and fall back to a
transaction commit if they are greater than the id of the last
committed transaction.
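
The check described above can be sketched in userspace C. This is only
an illustrative model, not the kernel code: struct model_inode, its
parent pointer and chain_needs_commit() are hypothetical names, and the
walk covers only the ancestor direction (as in case 2).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an in-memory inode. */
struct model_inode {
	uint64_t last_unlink_trans;  /* id of the transaction of the last unlink/rename */
	struct model_inode *parent;  /* NULL for the top of the chain */
};

/*
 * Walk from the given inode up through its ancestors. Return true as
 * soon as one of them has a last_unlink_trans greater than the id of
 * the last committed transaction, meaning the log alone may not hold
 * enough information and a full transaction commit is the safe choice.
 */
static bool chain_needs_commit(const struct model_inode *inode,
			       uint64_t last_committed)
{
	for (; inode != NULL; inode = inode->parent) {
		if (inode->last_unlink_trans > last_committed)
			return true;
	}
	return false;
}
```

A caller in this model would commit the transaction when the function
returns true and rely on the log otherwise.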

So fix this by looking at the last_unlink_trans values and falling back to
transaction commits when needed. Also, when logging other inodes (for
case 1 we logged descendants of the fsync target inode while for case 2
we logged ancestors) we need to care about concurrent tasks updating
the last_unlink_trans of inodes we are logging (which was already an
existing problem in check_parent_dirs_for_sync()). Since we cannot
acquire their inode mutex (vfs' struct inode ->i_mutex), as that causes
deadlocks with other concurrent operations that acquire the i_mutex of
2 inodes (other fsyncs or renames for example), we need to serialize on
the log_mutex of the inode we are logging. A task setting a new value for
an inode's last_unlink_trans must acquire the inode's log_mutex and it
must do this update before doing the actual unlink operation (which is
already the case except when deleting a snapshot). Conversely the task
logging the inode must first log the inode and then check the inode's
last_unlink_trans value while holding its log_mutex. If the value is
not greater than the id of the last committed transaction, the task
logged a safe state of the inode's items. If the value is greater than
the id of the last committed transaction, the inode state it has logged
might not be safe (the concurrent task might have just updated
last_unlink_trans but not yet done the unlink operation) and therefore
a transaction commit must be done.
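
The ordering rules above can be modelled in userspace with a pthread
mutex standing in for the inode's log_mutex. This is a minimal sketch
under that assumption, not the patch itself: struct sketch_inode,
sketch_record_unlink() and sketch_log_then_check() are hypothetical
names, and the logging and unlink steps are left as comments.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-inode state involved. */
struct sketch_inode {
	pthread_mutex_t log_mutex;   /* stands in for the inode's log_mutex */
	uint64_t last_unlink_trans;  /* protected by log_mutex */
};

/*
 * Unlinking task: publish the new last_unlink_trans under log_mutex
 * first. The actual unlink operation must only happen after this
 * function returns.
 */
static void sketch_record_unlink(struct sketch_inode *inode, uint64_t transid)
{
	pthread_mutex_lock(&inode->log_mutex);
	inode->last_unlink_trans = transid;
	pthread_mutex_unlock(&inode->log_mutex);
	/* ... the caller performs the unlink here ... */
}

/*
 * Logging task: log the inode first, then check last_unlink_trans
 * under log_mutex. Returns true when the logged state might not be
 * safe and a transaction commit is needed.
 */
static bool sketch_log_then_check(struct sketch_inode *inode,
				  uint64_t last_committed)
{
	bool need_commit;

	/* ... the inode's items would be written to the log here ... */

	pthread_mutex_lock(&inode->log_mutex);
	need_commit = inode->last_unlink_trans > last_committed;
	pthread_mutex_unlock(&inode->log_mutex);
	return need_commit;
}
```

Because the unlinking task updates the value before unlinking and the
logging task checks it after logging, a logger that raced with an
unlink observes the new value and falls back to a commit.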

Test cases for xfstests follow in separate patches.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2016-03-01 08:23:29 -08:00
tests btrfs: fix memory leak of fs_info in block group cache 2016-02-18 13:28:24 +01:00
acl.c Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-01-18 12:44:40 -08:00
async-thread.c btrfs: async-thread: Fix a use-after-free error for trace 2016-01-25 16:50:26 -08:00
async-thread.h btrfs: async_thread: Fix workqueue 'max_active' value when initializing 2015-08-31 11:46:40 -07:00
backref.c Merge branch 'cleanups-4.6' into for-chris-4.6 2016-02-26 15:38:33 +01:00
backref.h btrfs: cleanup, remove inode_item_info helper 2015-01-14 19:23:47 +01:00
btrfs_inode.h btrfs: put delayed item hook into inode 2016-01-07 14:26:58 +01:00
check-integrity.c btrfs: use list_for_each_entry* in check-integrity.c 2016-01-07 14:38:42 +01:00
check-integrity.h block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
compression.c Btrfs: remove no longer used function extent_read_full_page_nolock() 2016-02-03 19:27:10 +00:00
compression.h btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
ctree.c Merge branch 'dev/gfp-flags' into for-chris-4.6 2016-02-26 15:38:28 +01:00
ctree.h Merge branch 'dev/control-ioctl' into for-chris-4.6 2016-02-26 15:38:34 +01:00
delayed-inode.c Merge branch 'cleanups-4.6' into for-chris-4.6 2016-02-26 15:38:33 +01:00
delayed-inode.h btrfs: properly set the termination value of ctx->pos in readdir 2016-02-11 07:01:59 -08:00
delayed-ref.c btrfs: drop null testing before destroy functions 2016-02-18 11:46:03 +01:00
delayed-ref.h btrfs: better packing of btrfs_delayed_extent_op 2016-01-07 14:26:58 +01:00
dev-replace.c Merge branch 'foreign/liubo/replace-lockup' into for-chris-4.6 2016-02-26 15:38:32 +01:00
dev-replace.h Btrfs: fix lockdep deadlock warning due to dev_replace 2016-02-23 13:10:10 +01:00
dir-item.c Btrfs: make xattr replace operations atomic 2014-11-20 17:20:07 -08:00
disk-io.c Merge branch 'cleanups-4.6' into for-chris-4.6 2016-02-26 15:38:33 +01:00
disk-io.h Merge branch 'misc-cleanups-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 2016-01-11 06:08:37 -08:00
export.c BTRFS: support NFSv2 export 2015-10-06 06:55:23 -07:00
export.h
extent_io.c Merge branch 'cleanups-4.6' into for-chris-4.6 2016-02-26 15:38:33 +01:00
extent_io.h Merge branch 'cleanups-4.6' into for-chris-4.6 2016-02-26 15:38:33 +01:00
extent_map.c btrfs: drop null testing before destroy functions 2016-02-18 11:46:03 +01:00
extent_map.h btrfs: cleanup, stop casting for extent_map->lookup everywhere 2016-01-15 19:22:28 +01:00
extent-tree.c Merge branch 'cleanups-4.6' into for-chris-4.6 2016-02-26 15:38:33 +01:00
file-item.c Btrfs: Compute and look up csums based on sectorsized blocks 2016-02-01 19:23:47 +01:00
file.c Merge branch 'misc-4.6' into for-chris-4.6 2016-02-26 15:38:34 +01:00
free-space-cache.c Merge branch 'misc-cleanups-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 2016-01-11 06:08:37 -08:00
free-space-cache.h btrfs: constify remaining structs with function pointers 2016-01-07 15:01:14 +01:00
free-space-tree.c Revert "btrfs: synchronize incompat feature bits with sysfs files" 2016-01-29 08:19:37 -08:00
free-space-tree.h Btrfs: implement the free space B-tree 2015-12-17 12:16:47 -08:00
hash.c btrfs: LLVMLinux: Remove VLAIS 2014-10-14 10:51:22 +02:00
hash.h Btrfs: fix btrfs boot when compiled as built-in 2014-01-28 13:20:31 -08:00
inode-item.c Btrfs: consolidate btrfs_error() to btrfs_std_error() 2015-09-29 16:30:00 +02:00
inode-map.c Merge branch 'misc-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 2016-01-19 18:21:30 -08:00
inode-map.h Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-01-15 19:25:02 +01:00
inode.c Btrfs patchsets for 4.6 2016-03-01 08:13:56 -08:00
ioctl.c Btrfs: fix file loss on log replay after renaming a file and fsync 2016-03-01 08:23:29 -08:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
locking.c btrfs: cleanup, remove stray return statements 2016-01-07 14:30:52 +01:00
locking.h btrfs: fix lockups from btrfs_clear_path_blocking 2014-11-19 10:34:35 -08:00
lzo.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
Makefile Btrfs: add free space tree sanity tests 2015-12-17 12:16:47 -08:00
math.h btrfs: cleanup 64bit/32bit divs, compile time constants 2015-03-03 17:23:57 +01:00
ordered-data.c btrfs: drop null testing before destroy functions 2016-02-18 11:46:03 +01:00
ordered-data.h Btrfs: change how we wait for pending ordered extents 2015-10-21 18:51:40 -07:00
orphan.c btrfs: kill the key type accessor helpers 2014-09-17 13:37:12 -07:00
print-tree.c btrfs: teach print_leaf about temporary item subtypes 2016-02-11 16:15:43 +01:00
print-tree.h
props.c btrfs: cleanup iterating over prop_handlers array 2015-10-21 18:28:48 +02:00
props.h Btrfs: add support for inode properties 2014-01-28 13:20:24 -08:00
qgroup.c btrfs: qgroup: account shared subtree during snapshot delete 2015-11-25 05:27:33 -08:00
qgroup.h btrfs: qgroup: Check if qgroup reserved space leaked 2015-10-21 18:41:10 -07:00
raid56.c btrfs: raid56: Use raid_write_end_io for scrub 2016-01-20 07:22:18 -08:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c Merge branch 'foreign/liubo/replace-lockup' into for-chris-4.6 2016-02-26 15:38:32 +01:00
relocation.c Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-01-29 15:46:49 -08:00
root-tree.c btrfs: Replace CURRENT_TIME by current_fs_time() 2016-02-18 11:46:03 +01:00
scrub.c Merge branch 'foreign/liubo/replace-lockup' into for-chris-4.6 2016-02-26 15:38:32 +01:00
send.c btrfs: send: use GFP_KERNEL everywhere 2016-02-11 15:19:39 +01:00
send.h Btrfs: use linux/sizes.h to represent constants 2016-01-07 14:38:02 +01:00
struct-funcs.c
super.c Merge branch 'dev/control-ioctl' into for-chris-4.6 2016-02-26 15:38:34 +01:00
sysfs.c btrfs: sysfs: check initialization state before updating features 2016-01-27 05:40:10 -08:00
sysfs.h btrfs: sysfs: introduce helper for syncing bits with sysfs files 2016-01-21 18:50:40 +01:00
transaction.c Merge branch 'cleanups-4.6' into for-chris-4.6 2016-02-26 15:38:33 +01:00
transaction.h btrfs: preallocate path for snapshot creation at ioctl time 2016-01-07 15:20:55 +01:00
tree-defrag.c Btrfs: fix locking bugs when defragging leaves 2015-12-18 02:51:32 +00:00
tree-log.c Btrfs: fix file loss on log replay after renaming a file and fsync 2016-03-01 08:23:29 -08:00
tree-log.h Btrfs: fix unreplayable log after snapshot delete + parent dir fsync 2016-03-01 08:23:25 -08:00
ulist.c btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c Btrfs: make btrfs_search_forward return with nodes unlocked 2014-09-17 13:38:02 -07:00
volumes.c Merge branch 'foreign/liubo/replace-lockup' into for-chris-4.6 2016-02-26 15:38:32 +01:00
volumes.h Merge branch 'misc-cleanups-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.5 2016-01-11 06:08:37 -08:00
xattr.c btrfs: Replace CURRENT_TIME by current_fs_time() 2016-02-18 11:46:03 +01:00
xattr.h btrfs: Use xattr handler infrastructure 2015-12-06 21:34:14 -05:00
zlib.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00