linux/fs/btrfs
Liu Bo 0b43e04f70 Btrfs: fix leaf corruption after __btrfs_drop_extents
Several reports about leaf corruption has been floating on the list, one of them
points to __btrfs_drop_extents(), and we find that the leaf becomes corrupted
after __btrfs_drop_extents(), it's really a rare case but it does exist.

The problem turns out to be btrfs_next_leaf() called in __btrfs_drop_extents().

So in btrfs_next_leaf(), we release the current path to re-search the last key of
the leaf for locating next leaf, and we've taken it into account that there might
be balance operations between leafs during this 'unlock and re-lock' dance, so
we check the path again and advance it if there are now more items available.
But things are a bit different if that last key happens to be removed and balance
gets a bigger key as the last one, and btrfs_search_slot will return it with
ret > 0, IOW, nothing change in this leaf except the new last key, then we think
we're okay because there is no more item balanced in, fine, we thinks we can
go to the next leaf.

However, we should return that bigger key, otherwise we deserve leaf corruption,
for example, in endio, skipping that key means that __btrfs_drop_extents() thinks
it has dropped all extent matched the required range and finish_ordered_io can
safely insert a new extent, but it actually doesn't and ends up a leaf
corruption.

One may be asking that why our locking on extent io tree doesn't work as
expected, ie. it should avoid this kind of race situation.  But in
__btrfs_drop_extents(), we don't always find extents which are included within
our locking range, IOW, extents can start before our searching start, in this
case locking on extent io tree doesn't protect us from the race.

This takes the special case into account.

Reviewed-by: Filipe Manana <fdmanana@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
2014-06-09 17:21:10 -07:00
..
tests Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
acl.c btrfs: remove useless ACL check 2014-06-09 17:20:42 -07:00
async-thread.c btrfs: fix crash in remount(thread_pool=) case 2014-04-07 10:41:52 -07:00
async-thread.h btrfs: Add trace for btrfs_workqueue alloc/destroy 2014-03-20 17:15:28 -07:00
backref.c Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
backref.h Btrfs: rework qgroup accounting 2014-06-09 17:20:48 -07:00
btrfs_inode.h btrfs: Drop EXTENT_UPTODATE check in hole punching and direct locking 2014-06-09 17:20:57 -07:00
check-integrity.c btrfs: check_int: propagate out-of-memory error upwards 2014-06-09 17:20:21 -07:00
check-integrity.h block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
compression.c btrfs: return ptr error from compression workspace 2014-06-09 17:20:22 -07:00
compression.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
ctree.c Btrfs: fix leaf corruption after __btrfs_drop_extents 2014-06-09 17:21:10 -07:00
ctree.h btrfs: allocate raid type kobjects dynamically 2014-06-09 17:21:01 -07:00
delayed-inode.c btrfs: free delayed node outside of root->inode_lock 2014-06-09 17:21:08 -07:00
delayed-inode.h Btrfs: introduce the delayed inode ref deletion for the single link inode 2014-01-28 13:20:09 -08:00
delayed-ref.c Btrfs: rework qgroup accounting 2014-06-09 17:20:48 -07:00
delayed-ref.h Btrfs: rework qgroup accounting 2014-06-09 17:20:48 -07:00
dev-replace.c Btrfs: don't flush all delalloc inodes when we doesn't get s_umount lock 2014-03-10 15:17:27 -04:00
dev-replace.h Btrfs: add new sources for device replace code 2012-12-12 17:15:41 -05:00
dir-item.c Btrfs: convert printk to btrfs_ and fix BTRFS prefix 2014-01-28 13:20:05 -08:00
disk-io.c Btrfs: async delayed refs 2014-06-09 17:20:58 -07:00
disk-io.h Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
export.c btrfs: remove fs/btrfs/compat.h 2013-11-11 22:03:19 -05:00
export.h
extent_io.c Btrfs: split up __extent_writepage to lower stack usage 2014-06-09 17:20:58 -07:00
extent_io.h Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
extent_map.c Btrfs: more efficient btrfs_drop_extent_cache 2014-03-10 15:16:57 -04:00
extent_map.h Btrfs: more efficient btrfs_drop_extent_cache 2014-03-10 15:16:57 -04:00
extent-tree.c btrfs: allocate raid type kobjects dynamically 2014-06-09 17:21:01 -07:00
file-item.c Btrfs: don't access non-existent key when csum tree is empty 2014-06-09 17:20:44 -07:00
file.c Btrfs: fix transaction leak during fsync call 2014-06-09 17:21:06 -07:00
free-space-cache.c Btrfs: break up __btrfs_write_out_cache to cut down stack usage 2014-06-09 17:20:55 -07:00
free-space-cache.h Btrfs: remove path arg from btrfs_truncate_free_space_cache 2013-11-11 21:51:33 -05:00
hash.c Btrfs: fix btrfs boot when compiled as built-in 2014-01-28 13:20:31 -08:00
hash.h Btrfs: fix btrfs boot when compiled as built-in 2014-01-28 13:20:31 -08:00
inode-item.c btrfs: cleanup: removed unused 'btrfs_get_inode_ref_index' 2014-01-28 13:19:39 -08:00
inode-map.c btrfs: remove newline from inode cache kthread name 2014-06-09 17:20:53 -07:00
inode-map.h
inode.c Btrfs: async delayed refs 2014-06-09 17:20:58 -07:00
ioctl.c Btrfs: fix clone to deal with holes when NO_HOLES feature is enabled 2014-06-09 17:21:09 -07:00
Kconfig Btrfs: fix btrfs boot when compiled as built-in 2014-01-28 13:20:31 -08:00
locking.c btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
locking.h Btrfs: remove btrfs_try_spin_lock 2013-03-14 14:57:10 -04:00
lzo.c btrfs: return errno instead of -1 from compression 2014-06-09 17:20:21 -07:00
Makefile Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
math.h Btrfs: cleanup duplicated division functions 2012-12-11 13:31:30 -05:00
ordered-data.c btrfs: remove stale newlines from log messages 2014-06-09 17:20:53 -07:00
ordered-data.h btrfs: Cleanup the "_struct" suffix in btrfs_workequeue 2014-03-10 15:17:16 -04:00
orphan.c btrfs: expand btrfs_find_item() to include find_orphan_item functionality 2014-01-28 13:19:37 -08:00
print-tree.c Btrfs: don't use ram_bytes for uncompressed inline items 2014-01-29 07:06:29 -08:00
print-tree.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
props.c Btrfs: add support for inode properties 2014-01-28 13:20:24 -08:00
props.h Btrfs: add support for inode properties 2014-01-28 13:20:24 -08:00
qgroup.c Btrfs: free tmp ulist for qgroup rescan 2014-06-09 17:20:55 -07:00
qgroup.h Btrfs: rework qgroup accounting 2014-06-09 17:20:48 -07:00
raid56.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2014-04-04 15:31:36 -07:00
raid56.h Btrfs: RAID5 and RAID6 2013-02-01 14:24:23 -05:00
rcu-string.h
reada.c btrfs: Cleanup the "_struct" suffix in btrfs_workequeue 2014-03-10 15:17:16 -04:00
relocation.c btrfs: remove stale newlines from log messages 2014-06-09 17:20:53 -07:00
root-tree.c Btrfs: use bitfield instead of integer data type for the some variants in btrfs_root 2014-06-09 17:20:40 -07:00
scrub.c btrfs: Remove unnecessary check for NULL 2014-06-09 17:20:23 -07:00
send.c Btrfs: send, use the right limits for xattr names and values 2014-06-09 17:21:00 -07:00
send.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
struct-funcs.c
super.c btrfs: remove stale newlines from log messages 2014-06-09 17:20:53 -07:00
sysfs.c btrfs: allocate raid type kobjects dynamically 2014-06-09 17:21:01 -07:00
sysfs.h btrfs: add simple debugfs interface 2014-03-10 15:15:51 -04:00
transaction.c Btrfs: async delayed refs 2014-06-09 17:20:58 -07:00
transaction.h Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
tree-defrag.c Btrfs: use bitfield instead of integer data type for the some variants in btrfs_root 2014-06-09 17:20:40 -07:00
tree-log.c Btrfs: use helpers for last_trans_log_full_commit instead of opencode 2014-06-09 17:20:45 -07:00
tree-log.h Btrfs: use helpers for last_trans_log_full_commit instead of opencode 2014-06-09 17:20:45 -07:00
ulist.c Btrfs: do not export ulist functions 2014-01-29 07:06:27 -08:00
ulist.h Btrfs: do not export ulist functions 2014-01-29 07:06:27 -08:00
uuid-tree.c Btrfs: convert printk to btrfs_ and fix BTRFS prefix 2014-01-28 13:20:05 -08:00
volumes.c fs: btrfs: volumes.c: Fix for possible null pointer dereference 2014-06-09 17:21:01 -07:00
volumes.h btrfs: balance filter: add limit of processed chunks 2014-06-09 17:20:26 -07:00
xattr.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2014-01-30 20:08:20 -08:00
xattr.h btrfs: use generic posix ACL infrastructure 2014-01-25 23:58:18 -05:00
zlib.c btrfs: return errno instead of -1 from compression 2014-06-09 17:20:21 -07:00