linux/fs/btrfs
Filipe Manana 5f96bfb763 btrfs: fix race that results in logging old extents during a fast fsync
When logging the extents of an inode during a fast fsync, we have a time
window where we can log extents that are from the previous transaction and
already persisted. This only makes us waste time unnecessarily.

The following sequence of steps shows how this can happen:

1) We are at transaction 1000;

2) An ordered extent E from inode I completes, that is, it has gone
   through btrfs_finish_ordered_io(), which set the extent map's
   generation to 1000 when unpinning the extent. That is the generation
   of the current transaction;

3) Task A starts the commit for transaction 1000;

4) The task committing transaction 1000 sets the transaction state to
   unblocked, writes the dirty extent buffers and the super blocks, then
   unlocks tree_log_mutex;

5) Some change is made to inode I, resulting in creation of a new
   transaction with a generation of 1001;

6) The transaction 1000 commit starts unpinning extents. At this point
   fs_info->last_trans_committed still has a value of 999;

7) Task B starts an fsync on inode I, and when it gets to
   btrfs_log_changed_extents() it sees the extent map for extent E in the
   list of modified extents. The extent map has a generation of 1000 and
   fs_info->last_trans_committed has a value of 999, so it proceeds to
   log the respective file extent item and all the checksums covering
   its range.

   So we end up wasting time since the extent was already persisted and
   is reachable through the trees pointed to by the super block committed
   by transaction 1000.

So fix this by comparing the extent map's generation against the
generation of the transaction handle: if it is smaller than the id in the
handle, we know the extent was already persisted and we do not need to
log it.
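
The difference between the two comparisons can be sketched in plain C
using the values from the steps above. This is only an illustration, not
the code in tree-log.c: the struct and function names (em_sketch,
trans_sketch, needs_logging_old, needs_logging_new, replay_race) are
hypothetical stand-ins, and it assumes task B's transaction handle
carries the id of the new transaction, 1001.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct em_sketch {
	uint64_t generation;	/* transaction that last modified the extent */
};

struct trans_sketch {
	uint64_t transid;	/* generation of the transaction handle */
};

/* Check from step 7: racy while last_trans_committed lags behind. */
static bool needs_logging_old(const struct em_sketch *em,
			      uint64_t last_trans_committed)
{
	return em->generation > last_trans_committed;
}

/* Check described by the fix: skip anything older than the handle's id. */
static bool needs_logging_new(const struct em_sketch *em,
			      const struct trans_sketch *trans)
{
	return em->generation >= trans->transid;
}

/* Replays the values from steps 1) to 7). */
static void replay_race(void)
{
	struct em_sketch extent_e = { .generation = 1000 };
	struct trans_sketch handle = { .transid = 1001 };	/* assumed */
	uint64_t last_trans_committed = 999;	/* not yet updated, step 6 */

	/* The old comparison logs extent E even though it is persisted. */
	assert(needs_logging_old(&extent_e, last_trans_committed));
	/* The new comparison skips it. */
	assert(!needs_logging_new(&extent_e, &handle));
}
```

An extent map modified in the current transaction (generation 1001 in
this scenario) still passes the new comparison, so it is still logged.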

This patch belongs to a patch set that consists of the following
patches:

  btrfs: fix race causing unnecessary inode logging during link and rename
  btrfs: fix race that results in logging old extents during a fast fsync
  btrfs: fix race that causes unnecessary logging of ancestor inodes
  btrfs: fix race that makes inode logging fallback to transaction commit
  btrfs: fix race leading to unnecessary transaction commit when logging inode
  btrfs: do not block inode logging for so long during transaction commit

Performance results are mentioned in the change log of the last patch.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-12-09 19:16:06 +01:00
..
tests btrfs: remove recalc_thresholds from free space ops 2020-12-09 19:16:06 +01:00
acl.c
async-thread.c
async-thread.h
backref.c btrfs: pass root owner to read_tree_block 2020-12-08 15:54:07 +01:00
backref.h btrfs: rename BTRFS_ROOT_REF_COWS to BTRFS_ROOT_SHAREABLE 2020-05-25 11:25:35 +02:00
block-group.c btrfs: implement log-structured superblock for ZONED mode 2020-12-09 19:16:04 +01:00
block-group.h btrfs: load free space cache asynchronously 2020-12-08 15:54:03 +01:00
block-rsv.c btrfs: introduce mount option rescue=ignorebadroots 2020-12-08 15:53:41 +01:00
block-rsv.h
btrfs_inode.h btrfs: skip unnecessary searches for xattrs when logging an inode 2020-12-08 15:54:12 +01:00
check-integrity.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
check-integrity.h
compression.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
compression.h btrfs: compression: move declarations to header 2020-10-07 12:06:55 +02:00
ctree.c btrfs: simplify return values in setup_nodes_for_search 2020-12-08 15:54:13 +01:00
ctree.h btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
delalloc-space.c btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
delalloc-space.h btrfs: make btrfs_delalloc_reserve_space take btrfs_inode 2020-07-27 12:55:36 +02:00
delayed-inode.c btrfs: make btrfs_delayed_update_inode take btrfs_inode 2020-12-08 15:54:10 +01:00
delayed-inode.h btrfs: make btrfs_delayed_update_inode take btrfs_inode 2020-12-08 15:54:10 +01:00
delayed-ref.c
delayed-ref.h
dev-replace.c btrfs: check and enable ZONED mode 2020-12-09 19:16:03 +01:00
dev-replace.h
dir-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
discard.c btrfs: don't miss async discards after scheduled work override 2020-12-08 15:54:05 +01:00
discard.h btrfs: cleanup btrfs_discard_update_discardable usage 2020-12-08 15:54:02 +01:00
disk-io.c btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
disk-io.h btrfs: move btrfs_find_highest_objectid/btrfs_find_free_objectid to disk-io.c 2020-12-09 19:16:05 +01:00
export.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
export.h
extent_io.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
extent_io.h btrfs: use fixed width int type for extent_state::state 2020-12-08 15:54:13 +01:00
extent_map.c
extent_map.h
extent-io-tree.h btrfs: use fixed width int type for extent_state::state 2020-12-08 15:54:13 +01:00
extent-tree.c btrfs: set the lockdep class for extent buffers on creation 2020-12-08 15:54:07 +01:00
file-item.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
file.c btrfs: disable fallocate in ZONED mode 2020-12-09 19:16:04 +01:00
free-space-cache.c btrfs: remove recalc_thresholds from free space ops 2020-12-09 19:16:06 +01:00
free-space-cache.h btrfs: remove recalc_thresholds from free space ops 2020-12-09 19:16:06 +01:00
free-space-tree.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
free-space-tree.h
inode-item.c btrfs: locking: rip out path->leave_spinning 2020-12-08 15:54:02 +01:00
inode.c btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
ioctl.c btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
Kconfig btrfs: switch to iomap for direct IO 2020-10-07 12:06:57 +02:00
locking.c btrfs: remove the recurse parameter from __btrfs_tree_read_lock 2020-12-08 15:54:09 +01:00
locking.h btrfs: remove the recurse parameter from __btrfs_tree_read_lock 2020-12-08 15:54:09 +01:00
lzo.c
Makefile btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
misc.h
ordered-data.c btrfs: switch cached fs_info::csum_size from u16 to u32 2020-12-08 15:53:59 +01:00
ordered-data.h btrfs: remove unnecessary local variables for checksum size 2020-12-08 15:54:00 +01:00
orphan.c
print-tree.c btrfs: pass root owner to read_tree_block 2020-12-08 15:54:07 +01:00
print-tree.h btrfs: pretty print leaked root name 2020-10-07 12:12:20 +02:00
props.c btrfs: simplify iget helpers 2020-05-25 11:25:37 +02:00
props.h
qgroup.c btrfs: pass root owner to read_tree_block 2020-12-08 15:54:07 +01:00
qgroup.h btrfs: qgroup: export qgroups in sysfs 2020-07-27 12:55:37 +02:00
raid56.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
raid56.h
rcu-string.h
reada.c btrfs: pass the owner_root and level to alloc_extent_buffer 2020-12-08 15:54:07 +01:00
ref-verify.c btrfs: use btrfs_read_node_slot in walk_down_tree 2020-12-08 15:54:06 +01:00
ref-verify.h
reflink.c btrfs: make btrfs_cont_expand take btrfs_inode 2020-12-08 15:54:12 +01:00
reflink.h
relocation.c btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
root-tree.c btrfs: qgroup: fix qgroup meta rsv leak for subvolume operations 2020-10-07 12:12:13 +02:00
scrub.c btrfs: implement log-structured superblock for ZONED mode 2020-12-09 19:16:04 +01:00
send.c btrfs: send: use helpers to access root_item::ctransid 2020-12-08 15:53:51 +01:00
send.h btrfs: send: avoid copying file data 2020-10-07 12:13:17 +02:00
space-info.c btrfs: kill the RCU protection for fs_info->space_info 2020-10-07 12:13:19 +02:00
space-info.h btrfs: add btrfs_reserve_data_bytes and use it 2020-10-07 12:06:52 +02:00
struct-funcs.c btrfs: use unaligned helpers for stack and header set/get helpers 2020-10-07 12:13:23 +02:00
super.c btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
sysfs.c btrfs: introduce ZONED feature flag 2020-12-08 15:54:16 +01:00
sysfs.h btrfs: split and refactor btrfs_sysfs_remove_devices_dir 2020-10-07 12:12:21 +02:00
transaction.c btrfs: remove inode number cache feature 2020-12-09 19:16:05 +01:00
transaction.h btrfs: return bool from btrfs_should_end_transaction 2020-12-08 15:54:16 +01:00
tree-checker.c btrfs: tree-checker: annotate all error branches as unlikely 2020-12-08 15:54:15 +01:00
tree-checker.h
tree-defrag.c btrfs: locking: remove all the blocking helpers 2020-12-08 15:54:01 +01:00
tree-log.c btrfs: fix race that results in logging old extents during a fast fsync 2020-12-09 19:16:06 +01:00
tree-log.h btrfs: make fast fsyncs wait only for writeback 2020-10-07 12:06:56 +02:00
ulist.c
ulist.h
uuid-tree.c btrfs: remove unnecessary casts in printk 2020-12-08 15:53:52 +01:00
volumes.c btrfs: drop casts of bio bi_sector 2020-12-09 19:16:05 +01:00
volumes.h btrfs: get zone information of zoned block devices 2020-12-09 19:15:57 +01:00
xattr.c btrfs: skip unnecessary searches for xattrs when logging an inode 2020-12-08 15:54:12 +01:00
xattr.h
zlib.c
zoned.c btrfs: implement log-structured superblock for ZONED mode 2020-12-09 19:16:04 +01:00
zoned.h btrfs: implement log-structured superblock for ZONED mode 2020-12-09 19:16:04 +01:00
zstd.c