linux/fs/btrfs
Qu Wenruo b3ff8f1d38 btrfs: Don't submit any btree write bio if the fs has errors
[BUG]
There is a fuzzed image which could cause KASAN report at unmount time.

  BUG: KASAN: use-after-free in btrfs_queue_work+0x2c1/0x390
  Read of size 8 at addr ffff888067cf6848 by task umount/1922

  CPU: 0 PID: 1922 Comm: umount Tainted: G        W         5.0.21 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
  Call Trace:
   dump_stack+0x5b/0x8b
   print_address_description+0x70/0x280
   kasan_report+0x13a/0x19b
   btrfs_queue_work+0x2c1/0x390
   btrfs_wq_submit_bio+0x1cd/0x240
   btree_submit_bio_hook+0x18c/0x2a0
   submit_one_bio+0x1be/0x320
   flush_write_bio.isra.41+0x2c/0x70
   btree_write_cache_pages+0x3bb/0x7f0
   do_writepages+0x5c/0x130
   __writeback_single_inode+0xa3/0x9a0
   writeback_single_inode+0x23d/0x390
   write_inode_now+0x1b5/0x280
   iput+0x2ef/0x600
   close_ctree+0x341/0x750
   generic_shutdown_super+0x126/0x370
   kill_anon_super+0x31/0x50
   btrfs_kill_super+0x36/0x2b0
   deactivate_locked_super+0x80/0xc0
   deactivate_super+0x13c/0x150
   cleanup_mnt+0x9a/0x130
   task_work_run+0x11a/0x1b0
   exit_to_usermode_loop+0x107/0x130
   do_syscall_64+0x1e5/0x280
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

[CAUSE]
The fuzzed image has a completely screwd up extent tree:

  leaf 29421568 gen 8 total ptrs 6 free space 3587 owner EXTENT_TREE
  refs 2 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 5938
          item 0 key (12587008 168 4096) itemoff 3942 itemsize 53
                  extent refs 1 gen 9 flags 1
                  ref#0: extent data backref root 5 objectid 259 offset 0 count 1
          item 1 key (12591104 168 8192) itemoff 3889 itemsize 53
                  extent refs 1 gen 9 flags 1
                  ref#0: extent data backref root 5 objectid 271 offset 0 count 1
          item 2 key (12599296 168 4096) itemoff 3836 itemsize 53
                  extent refs 1 gen 9 flags 1
                  ref#0: extent data backref root 5 objectid 259 offset 4096 count 1
          item 3 key (29360128 169 0) itemoff 3803 itemsize 33
                  extent refs 1 gen 9 flags 2
                  ref#0: tree block backref root 5
          item 4 key (29368320 169 1) itemoff 3770 itemsize 33
                  extent refs 1 gen 9 flags 2
                  ref#0: tree block backref root 5
          item 5 key (29372416 169 0) itemoff 3737 itemsize 33
                  extent refs 1 gen 9 flags 2
                  ref#0: tree block backref root 5

Note that leaf 29421568 doesn't have its backref in the extent tree.
Thus extent allocator can re-allocate leaf 29421568 for other trees.

In short, the bug is caused by:

- Existing tree block gets allocated to log tree
  This got its generation bumped.

- Log tree balance cleaned dirty bit of offending tree block
  It will not be written back to disk, thus no WRITTEN flag.

- Original owner of the tree block gets COWed
  Since the tree block has higher transid, no WRITTEN flag, it's reused,
  and not traced by transaction::dirty_pages.

- Transaction aborted
  Tree blocks get cleaned according to transaction::dirty_pages. But the
  offending tree block is not recorded at all.

- Filesystem unmount
  All pages are assumed to be are clean, destroying all workqueue, then
  call iput(btree_inode).
  But offending tree block is still dirty, which triggers writeback, and
  causes use-after-free bug.

The detailed sequence looks like this:

- Initial status
  eb: 29421568, header=WRITTEN bflags_dirty=0, page_dirty=0, gen=8,
      not traced by any dirty extent_iot_tree.

- New tree block is allocated
  Since there is no backref for 29421568, it's re-allocated as new tree
  block.
  Keep in mind that tree block 29421568 is still referred by extent
  tree.

- Tree block 29421568 is filled for log tree
  eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9 << (gen bumped)
      traced by btrfs_root::dirty_log_pages

- Some log tree operations
  Since the fs is using node size 4096, the log tree can easily go a
  level higher.

- Log tree needs balance
  Tree block 29421568 gets all its content pushed to right, thus now
  it is empty, and we don't need it.
  btrfs_clean_tree_block() from __push_leaf_right() get called.

  eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
      traced by btrfs_root::dirty_log_pages

- Log tree write back
  btree_write_cache_pages() goes through dirty pages ranges, but since
  page of tree block 29421568 gets cleaned already, it's not written
  back to disk. Thus it doesn't have WRITTEN bit set.
  But ranges in dirty_log_pages are cleared.

  eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9
      not traced by any dirty extent_iot_tree.

- Extent tree update when committing transaction
  Since tree block 29421568 has transid equal to running trans, and has
  no WRITTEN bit, should_cow_block() will use it directly without adding
  it to btrfs_transaction::dirty_pages.

  eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
      not traced by any dirty extent_iot_tree.

  At this stage, we're doomed. We have a dirty eb not tracked by any
  extent io tree.

- Transaction gets aborted due to corrupted extent tree
  Btrfs cleans up dirty pages according to transaction::dirty_pages and
  btrfs_root::dirty_log_pages.
  But since tree block 29421568 is not tracked by neither of them, it's
  still dirty.

  eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9
      not traced by any dirty extent_iot_tree.

- Filesystem unmount
  Since all cleanup is assumed to be done, all workqueus are destroyed.
  Then iput(btree_inode) is called, expecting no dirty pages.
  But tree 29421568 is still dirty, thus triggering writeback.
  Since all workqueues are already freed, we cause use-after-free.

This shows us that, log tree blocks + bad extent tree can cause wild
dirty pages.

[FIX]
To fix the problem, don't submit any btree write bio if the filesytem
has any error.  This is the last safe net, just in case other cleanup
haven't caught catch it.

Link: https://github.com/bobfuzzer/CVE/tree/master/CVE-2019-19377
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-03-23 17:01:46 +01:00
..
tests btrfs: rename btrfs_put_fs_root and btrfs_grab_fs_root 2020-03-23 17:01:33 +01:00
acl.c btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
async-thread.c btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
async-thread.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
backref.c btrfs: backref, use correct count to resolve normal data refs 2020-03-23 17:01:40 +01:00
backref.h btrfs: fiemap: preallocate ulists for btrfs_check_shared 2019-07-01 13:34:53 +02:00
block-group.c btrfs: switch to per-transaction pinned extents 2020-03-23 17:01:38 +01:00
block-group.h btrfs: Move and unexport btrfs_rmap_block 2020-01-23 17:24:34 +01:00
block-rsv.c btrfs: slightly simplify global block reserve calculations 2020-03-23 17:01:46 +01:00
block-rsv.h btrfs: migrate the global_block_rsv helpers to block-rsv.c 2019-07-02 12:30:55 +02:00
btrfs_inode.h btrfs: introduce per-inode file extent tree 2020-03-23 17:01:24 +01:00
check-integrity.c btrfs: remove buffer_heads form super block mirror integrity checking 2020-03-23 17:01:40 +01:00
check-integrity.h btrfs: remove btrfsic_submit_bh() 2020-03-23 17:01:39 +01:00
compression.c btrfs: use larger zlib buffer for s390 hardware compression 2020-01-31 10:30:40 -08:00
compression.h btrfs: compression: remove ops pointer from workspace_manager 2019-11-18 12:46:59 +01:00
ctree.c btrfs: simplify parameters of btrfs_set_disk_extent_flags 2020-03-23 17:01:45 +01:00
ctree.h btrfs: simplify parameters of btrfs_set_disk_extent_flags 2020-03-23 17:01:45 +01:00
delalloc-space.c btrfs: add a comment describing delalloc space reservation 2020-03-23 17:01:27 +01:00
delalloc-space.h btrfs: migrate the delalloc space stuff to it's own home 2019-07-04 17:26:17 +02:00
delayed-inode.c btrfs: add wrapper for transaction abort predicate 2020-03-23 17:01:34 +01:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c Btrfs: fix race between adding and putting tree mod seq elements and nodes 2020-01-31 14:01:20 +01:00
delayed-ref.h btrfs: migrate the delayed refs rsv code 2019-07-04 17:26:17 +02:00
dev-replace.c btrfs: sysfs, rename device_link add/remove functions 2020-03-23 17:01:35 +01:00
dev-replace.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
dir-item.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
discard.c btrfs: add correction to handle -1 edge case in async discard 2020-01-20 16:41:01 +01:00
discard.h btrfs: have multiple discard lists 2020-01-20 16:41:00 +01:00
disk-io.c btrfs: reduce pointer intdirections in btree_readpage_end_io_hook 2020-03-23 17:01:45 +01:00
disk-io.h btrfs: use the page cache for super block reading 2020-03-23 17:01:39 +01:00
export.c btrfs: export helpers for subvolume name/id resolution 2020-03-23 17:01:42 +01:00
export.h btrfs: export helpers for subvolume name/id resolution 2020-03-23 17:01:42 +01:00
extent_io.c btrfs: Don't submit any btree write bio if the fs has errors 2020-03-23 17:01:46 +01:00
extent_io.h btrfs: sink argument tree to extent_read_full_page 2020-03-23 17:01:35 +01:00
extent_map.c Btrfs: fix race between using extent maps and merging them 2020-02-12 17:16:46 +01:00
extent_map.h btrfs: remove extent_map::bdev 2019-11-18 23:43:44 +01:00
extent-io-tree.h btrfs: switch to per-transaction pinned extents 2020-03-23 17:01:38 +01:00
extent-tree.c btrfs: simplify parameters of btrfs_set_disk_extent_flags 2020-03-23 17:01:45 +01:00
file-item.c btrfs: introduce per-inode file extent tree 2020-03-23 17:01:24 +01:00
file.c btrfs: convert snapshot/nocow exlcusion to drew lock 2020-03-23 17:01:44 +01:00
free-space-cache.c btrfs: simplify error handling in __btrfs_write_out_cache() 2020-03-23 17:01:43 +01:00
free-space-cache.h btrfs: have multiple discard lists 2020-01-20 16:41:00 +01:00
free-space-tree.c btrfs: rename btrfs_put_fs_root and btrfs_grab_fs_root 2020-03-23 17:01:33 +01:00
free-space-tree.h btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
inode-item.c btrfs: Make btrfs_find_name_in_ext_backref return struct btrfs_inode_extref 2019-09-09 14:59:16 +02:00
inode-map.c btrfs: keep track of which extents have been discarded 2020-01-20 16:40:57 +01:00
inode-map.h
inode.c btrfs: convert snapshot/nocow exlcusion to drew lock 2020-03-23 17:01:44 +01:00
ioctl.c btrfs: ioctl: resize: only show message if size is changed 2020-03-23 17:01:46 +01:00
Kconfig btrfs: add Kconfig dependency for BLAKE2B 2019-12-09 17:56:06 +01:00
locking.c btrfs: Implement DREW lock 2020-03-23 17:01:43 +01:00
locking.h btrfs: Implement DREW lock 2020-03-23 17:01:43 +01:00
lzo.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00
Makefile btrfs: add the beginning of async discard, discard workqueue 2020-01-20 16:40:57 +01:00
misc.h btrfs: add 64bit safe helper for power of two checks 2019-11-18 12:46:50 +01:00
ordered-data.c btrfs: drop argument tree from btrfs_lock_and_flush_ordered_range 2020-03-23 17:01:34 +01:00
ordered-data.h btrfs: drop argument tree from btrfs_lock_and_flush_ordered_range 2020-03-23 17:01:34 +01:00
orphan.c
print-tree.c btrfs: Remove unneeded semicolon 2020-01-20 16:40:55 +01:00
print-tree.h
props.c btrfs: props: remove unnecessary hash_init() 2019-11-18 12:46:55 +01:00
props.h btrfs: delete unused function btrfs_set_prop_trans 2019-04-29 19:02:54 +02:00
qgroup.c btrfs: rename btrfs_put_fs_root and btrfs_grab_fs_root 2020-03-23 17:01:33 +01:00
qgroup.h btrfs: destroy qgroup extent records on transaction abort 2020-02-19 00:35:54 +01:00
raid56.c btrfs: use struct_size to calculate size of raid hash table 2020-03-23 17:01:44 +01:00
raid56.h btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes 2019-07-01 13:34:58 +02:00
rcu-string.h
reada.c btrfs: rename btrfs_block_group_cache 2019-11-18 17:51:51 +01:00
ref-verify.c btrfs: ref-verify: fix memory leaks 2020-02-12 17:16:31 +01:00
ref-verify.h btrfs: ref-verify: Use btrfs_ref to refactor btrfs_ref_tree_mod() 2019-04-29 19:02:49 +02:00
relocation.c btrfs: relocation: Remove is_cowonly_root() 2020-03-23 17:01:38 +01:00
root-tree.c btrfs: rename btrfs_put_fs_root and btrfs_grab_fs_root 2020-03-23 17:01:33 +01:00
scrub.c btrfs: rename btrfs_put_fs_root and btrfs_grab_fs_root 2020-03-23 17:01:33 +01:00
send.c btrfs: rename btrfs_put_fs_root and btrfs_grab_fs_root 2020-03-23 17:01:33 +01:00
send.h
space-info.c btrfs: describe the space reservation system in general 2020-03-23 17:01:27 +01:00
space-info.h btrfs: take overcommit into account in inc_block_group_ro 2020-01-31 14:02:01 +01:00
struct-funcs.c btrfs: tie extent buffer and it's token together 2019-09-09 14:59:16 +02:00
super.c btrfs: adjust message level for unrecognized mount option 2020-03-23 17:01:45 +01:00
sysfs.c btrfs: sysfs, unify handler name of devinfo/missing 2020-03-23 17:01:36 +01:00
sysfs.h btrfs: sysfs, rename device_link add/remove functions 2020-03-23 17:01:35 +01:00
transaction.c btrfs: merge unlocking to common exit block in btrfs_commit_transaction 2020-03-23 17:01:46 +01:00
transaction.h btrfs: switch to per-transaction pinned extents 2020-03-23 17:01:38 +01:00
tree-checker.c btrfs: tree-checker: Verify location key for DIR_ITEM/DIR_INDEX 2020-01-20 16:40:56 +01:00
tree-checker.h btrfs: get fs_info from eb in btrfs_check_chunk_valid 2019-04-29 19:02:39 +02:00
tree-defrag.c btrfs: open code now trivial btrfs_set_lock_blocking 2019-02-25 14:13:27 +01:00
tree-log.c btrfs: Make btrfs_pin_extent_for_log_replay take transaction handle 2020-03-23 17:01:37 +01:00
tree-log.h btrfs: get fs_info from trans in btrfs_set_log_full_commit 2019-04-29 19:02:41 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: bail out of uuid tree scanning if we're closing 2020-03-23 17:01:41 +01:00
volumes.c btrfs: replace u_long type cast with unsigned long 2020-03-23 17:01:45 +01:00
volumes.h btrfs: make btrfs_check_uuid_tree private to disk-io.c 2020-03-23 17:01:41 +01:00
xattr.c Btrfs: fix failure to persist compression property xattr deletion on fsync 2019-06-17 16:37:17 +02:00
xattr.h btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
zlib.c btrfs: use larger zlib buffer for s390 hardware compression 2020-01-31 10:30:40 -08:00
zstd.c btrfs: compression: inline free_workspace 2019-11-18 12:46:59 +01:00