linux/fs/bcachefs
Linus Torvalds 999a36b52b bcachefs updates for 6.8:

Merge tag 'bcachefs-2024-01-10' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - btree write buffer rewrite: instead of adding keys to the btree write
   buffer at transaction commit time, we now journal them with a
   different journal entry type and copy them from the journal to the
   write buffer just prior to journal write.

   This reduces the number of atomic operations on shared cachelines in
   the transaction commit path and is a significant performance
   improvement on some workloads: multithreaded 4k random writes went
   from ~650k iops to ~850k iops.

 - Bring back optimistic spinning for six locks: the new implementation
   doesn't use osq locks; instead we add to the lock waitlist as normal,
   and then spin on the lock_acquired bit in the waitlist entry, _not_
   the lock itself.

 - New ioctls:

    - BCH_IOCTL_DEV_USAGE_V2, which allows for new data types

    - BCH_IOCTL_OFFLINE_FSCK, which runs the kernel implementation of
      fsck but without mounting: useful for transparently using the
      kernel version of fsck from 'bcachefs fsck' when the kernel
      version is a better match for the on-disk filesystem.

    - BCH_IOCTL_ONLINE_FSCK: online fsck. Not all passes are supported
      yet, but the passes that are supported are fully featured - errors
      may be corrected as normal.

   The new ioctls use the new 'thread_with_file' abstraction for kicking
   off a kthread that's tied to a file descriptor returned to userspace
   via the ioctl.

 - btree_paths within a btree_trans are now dynamically growable,
   instead of being limited to 64. This is important for the
   check_directory_structure phase of fsck, and also fixes some issues
   we were having with btree path overflow in the reflink btree.

 - Trigger refactoring; prep work for the upcoming disk space accounting
   rewrite

 - Numerous bugfixes :)
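
The write buffer change described in the first item can be pictured with a
small userspace model. This is only a sketch of the idea, not the kernel
code: the type names, the fixed-size arrays, and the functions commit_key()
and journal_keys_to_write_buffer() are all hypothetical. The commit path only
appends a tagged entry to the journal; the copy into the write buffer happens
in one pass just before the journal would be written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry types; the real journal entry types are not shown here. */
enum entry_type { ENTRY_BTREE_KEYS, ENTRY_WRITE_BUFFER_KEYS };

struct journal_entry {
	enum entry_type	type;
	unsigned long	key;	/* stand-in for a btree key */
};

#define JOURNAL_MAX	16
#define WB_MAX		16

struct journal {
	struct journal_entry	entries[JOURNAL_MAX];
	size_t			nr;
};

struct write_buffer {
	unsigned long	keys[WB_MAX];
	size_t		nr;
};

/* Commit path: append to the journal only; the write buffer is not touched. */
void commit_key(struct journal *j, enum entry_type type, unsigned long key)
{
	assert(j->nr < JOURNAL_MAX);
	j->entries[j->nr].type = type;
	j->entries[j->nr].key  = key;
	j->nr++;
}

/*
 * Just before the journal write: copy every write-buffer-typed entry from
 * the journal into the write buffer. Returns the number of keys copied.
 */
size_t journal_keys_to_write_buffer(const struct journal *j,
				    struct write_buffer *wb)
{
	size_t copied = 0;

	for (size_t i = 0; i < j->nr; i++) {
		if (j->entries[i].type != ENTRY_WRITE_BUFFER_KEYS)
			continue;
		assert(wb->nr < WB_MAX);
		wb->keys[wb->nr++] = j->entries[i].key;
		copied++;
	}
	return copied;
}
```

In this model the shared write buffer is touched once per journal write
rather than once per commit, which is the kind of reduction in shared
cacheline traffic the pull request describes.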

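The growable btree_paths item can be illustrated the same way. The sketch
below is a userspace model under stated assumptions, not the bcachefs
implementation: struct trans, struct path, and the helpers trans_init(),
path_alloc(), path_put() and trans_exit() are hypothetical names, and the
doubling policy is an assumption. It starts at the old limit of 64 and grows
on demand. Callers hold indices rather than pointers, because growing the
array may move it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_PATHS	64	/* the old fixed limit from the pull request */

struct path { int in_use; };	/* stand-in for a btree_path */

struct trans {
	struct path	*paths;
	size_t		nr_paths;	/* current capacity */
};

int trans_init(struct trans *t)
{
	t->paths = calloc(INITIAL_PATHS, sizeof(*t->paths));
	t->nr_paths = t->paths ? INITIAL_PATHS : 0;
	return t->paths ? 0 : -1;
}

void trans_exit(struct trans *t)
{
	free(t->paths);
	t->paths = NULL;
	t->nr_paths = 0;
}

/* Return the index of a free slot, doubling the array if none is free. */
long path_alloc(struct trans *t)
{
	for (size_t i = 0; i < t->nr_paths; i++)
		if (!t->paths[i].in_use) {
			t->paths[i].in_use = 1;
			return (long) i;
		}

	size_t old = t->nr_paths;
	size_t new_nr = old ? old * 2 : INITIAL_PATHS;
	struct path *p = realloc(t->paths, new_nr * sizeof(*p));

	if (!p)
		return -1;	/* the old array is still valid on failure */

	memset(p + old, 0, (new_nr - old) * sizeof(*p));
	t->paths = p;
	t->nr_paths = new_nr;
	t->paths[old].in_use = 1;
	return (long) old;
}

/* Release a slot so a later path_alloc() can reuse it. */
void path_put(struct trans *t, size_t idx)
{
	assert(idx < t->nr_paths && t->paths[idx].in_use);
	t->paths[idx].in_use = 0;
}
```

A caller that needs more than 64 paths at once, as the pull request says the
check_directory_structure phase of fsck can, gets a larger array from
path_alloc() instead of an overflow.
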
* tag 'bcachefs-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (226 commits)
  bcachefs: eytzinger0_find() search should be const
  bcachefs: move "ptrs not changing" optimization to bch2_trigger_extent()
  bcachefs: fix simulateously upgrading & downgrading
  bcachefs: Restart recovery passes more reliably
  bcachefs: bch2_dump_bset() doesn't choke on u64s == 0
  bcachefs: improve checksum error messages
  bcachefs: improve validate_bset_keys()
  bcachefs: print sb magic when relevant
  bcachefs: __bch2_sb_field_to_text()
  bcachefs: %pg is banished
  bcachefs: Improve would_deadlock trace event
  bcachefs: fsck_err()s don't need to manually check c->sb.version anymore
  bcachefs: Upgrades now specify errors to fix, like downgrades
  bcachefs: no thread_with_file in userspace
  bcachefs: Don't autofix errors we can't fix
  bcachefs: add missing bch2_latency_acct() call
  bcachefs: increase max_active on io_complete_wq
  bcachefs: add time_stats for btree_node_read_done()
  bcachefs: don't clear accessed bit in btree node fill
  bcachefs: Add an option to control btree node prefetching
  ...
2024-01-10 16:34:17 -08:00
acl.c bcachefs: make RO snapshots actually RO 2024-01-01 11:47:07 -05:00
acl.h
alloc_background.c bcachefs: fsck_err()s don't need to manually check c->sb.version anymore 2024-01-05 23:24:21 -05:00
alloc_background.h bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
alloc_foreground.c bcachefs: btree_iter -> btree_path_idx_t 2024-01-01 11:47:43 -05:00
alloc_foreground.h
alloc_types.h
backpointers.c bcachefs: fsck_err()s don't need to manually check c->sb.version anymore 2024-01-05 23:24:21 -05:00
backpointers.h bcachefs: No need to allocate keys for write buffer 2024-01-01 11:47:38 -05:00
bbpos_types.h
bbpos.h
bcachefs_format.h bcachefs: Upgrades now specify errors to fix, like downgrades 2024-01-05 23:24:20 -05:00
bcachefs_ioctl.h bcachefs: Replace zero-length array with flex-array member and use __counted_by 2024-01-01 11:47:41 -05:00
bcachefs.h bcachefs: add time_stats for btree_node_read_done() 2024-01-05 23:24:20 -05:00
bkey_buf.h
bkey_cmp.h
bkey_methods.c bcachefs: rebalance_work btree is not a snapshots btree 2023-11-04 22:19:13 -04:00
bkey_methods.h bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
bkey_sort.c bcachefs: bkey_copy() is no longer a macro 2023-11-05 13:12:18 -05:00
bkey_sort.h
bkey.c
bkey.h bcachefs: bkey_copy() is no longer a macro 2023-11-05 13:12:18 -05:00
bset.c bcachefs: bch2_dump_bset() doesn't choke on u64s == 0 2024-01-05 23:24:21 -05:00
bset.h
btree_cache.c bcachefs: don't clear accessed bit in btree node fill 2024-01-05 23:24:20 -05:00
btree_cache.h bcachefs: Include btree_trans in more tracepoints 2024-01-01 11:47:40 -05:00
btree_gc.c bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
btree_gc.h
btree_io.c bcachefs: improve checksum error messages 2024-01-05 23:24:21 -05:00
btree_io.h bcachefs: Include btree_trans in more tracepoints 2024-01-01 11:47:40 -05:00
btree_iter.c bcachefs: Add an option to control btree node prefetching 2024-01-05 23:24:20 -05:00
btree_iter.h bcachefs: growable btree_paths 2024-01-01 11:47:44 -05:00
btree_journal_iter.c bcachefs: __journal_keys_sort() refactoring 2024-01-05 23:24:19 -05:00
btree_journal_iter.h bcachefs: Proper refcounting for journal_keys 2023-11-24 02:43:12 -05:00
btree_key_cache_types.h bcachefs: Run btree key cache shrinker less aggressively 2023-11-13 21:45:01 -05:00
btree_key_cache.c bcachefs: trans_for_each_path() no longer uses path->idx 2024-01-01 11:47:43 -05:00
btree_key_cache.h bcachefs; kill bch2_btree_key_cache_flush() 2024-01-01 11:47:41 -05:00
btree_locking.c bcachefs: Improve would_deadlock trace event 2024-01-05 23:24:21 -05:00
btree_locking.h bcachefs: btree_trans always has stats 2024-01-05 23:24:19 -05:00
btree_trans_commit.c bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
btree_types.h bcachefs: growable btree_paths 2024-01-01 11:47:44 -05:00
btree_update_interior.c bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
btree_update_interior.h bcachefs: Fix interior update path btree_path uses 2024-01-01 11:47:44 -05:00
btree_update.c bcachefs: btree_trans always has stats 2024-01-05 23:24:19 -05:00
btree_update.h bcachefs: Clean up btree_trans 2024-01-01 11:47:44 -05:00
btree_write_buffer_types.h bcachefs: Inline btree write buffer sort 2024-01-01 11:47:41 -05:00
btree_write_buffer.c bcachefs: __bch2_journal_key_to_wb -> bch2_journal_key_to_wb_slowpath 2024-01-05 23:24:19 -05:00
btree_write_buffer.h bcachefs: __bch2_journal_key_to_wb -> bch2_journal_key_to_wb_slowpath 2024-01-05 23:24:19 -05:00
buckets_types.h bcachefs: Kill dev_usage->buckets_ec 2024-01-01 11:47:38 -05:00
buckets_waiting_for_journal_types.h
buckets_waiting_for_journal.c
buckets_waiting_for_journal.h
buckets.c bcachefs: move "ptrs not changing" optimization to bch2_trigger_extent() 2024-01-05 23:24:46 -05:00
buckets.h bcachefs: unify extent trigger 2024-01-05 23:24:20 -05:00
chardev.c bcachefs: Online fsck can now fix errors 2024-01-05 23:24:20 -05:00
chardev.h
checksum.c
checksum.h bcachefs: improve checksum error messages 2024-01-05 23:24:21 -05:00
clock_types.h
clock.c
clock.h
compress.c bcachefs: Remove obsolete comment about zstd 2024-01-01 11:47:40 -05:00
compress.h
counters.c
counters.h
darray.c bcachefs: DARRAY_PREALLOCATED() 2024-01-01 11:46:52 -05:00
darray.h bcachefs: Convert split_devs() to darray 2024-01-01 11:47:43 -05:00
data_update.c bcachefs: bkey_for_each_ptr() now declares loop iter 2024-01-01 11:47:43 -05:00
data_update.h bcachefs: Data update path won't accidentaly grow replicas 2023-11-25 21:48:42 -05:00
debug.c bcachefs: Improve would_deadlock trace event 2024-01-05 23:24:21 -05:00
debug.h
dirent.c bcachefs: Fix reattach_inode() for snapshots 2024-01-01 11:47:44 -05:00
dirent.h bcachefs: Fix reattach_inode() for snapshots 2024-01-01 11:47:44 -05:00
disk_groups_types.h
disk_groups.c bcachefs: %pg is banished 2024-01-05 23:24:21 -05:00
disk_groups.h
ec_types.h bcachefs: Rename bch_replicas_entry -> bch_replicas_entry_v1 2024-01-01 11:47:38 -05:00
ec.c bcachefs: unify stripe trigger 2024-01-05 23:24:20 -05:00
ec.h bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
errcode.c
errcode.h bcachefs: Split brain detection 2024-01-05 23:24:19 -05:00
error.c bcachefs: Don't autofix errors we can't fix 2024-01-05 23:24:20 -05:00
error.h bcachefs: fix BCH_FSCK_ERR enum 2023-12-19 19:01:52 -05:00
extent_update.c bcachefs: growable btree_paths 2024-01-01 11:47:44 -05:00
extent_update.h
extents_types.h
extents.c bcachefs: bkey_for_each_ptr() now declares loop iter 2024-01-01 11:47:43 -05:00
extents.h bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
eytzinger.h bcachefs: eytzinger0_find() search should be const 2024-01-05 23:24:46 -05:00
fifo.h
fs-common.c bcachefs: fsck_err()s don't need to manually check c->sb.version anymore 2024-01-05 23:24:21 -05:00
fs-common.h
fs-io-buffered.c bcachefs: Kill GFP_NOFAIL usage in readahead path 2024-01-01 11:47:43 -05:00
fs-io-buffered.h
fs-io-direct.c bcachefs: Delete dio read alignment check 2024-01-01 11:47:42 -05:00
fs-io-direct.h
fs-io-pagecache.c bcachefs: Use correct fgf_t type as function argument 2023-11-13 21:42:21 -05:00
fs-io-pagecache.h bcachefs: Use correct fgf_t type as function argument 2023-11-13 21:42:21 -05:00
fs-io.c bcachefs: return from fsync on writeback error to avoid early shutdown 2024-01-01 11:47:40 -05:00
fs-io.h
fs-ioctl.c bcachefs updates for 6.8: 2024-01-10 16:34:17 -08:00
fs-ioctl.h bcachefs: x-macro-ify inode flags enum 2023-11-05 13:12:18 -05:00
fs.c bcachefs updates for 6.8: 2024-01-10 16:34:17 -08:00
fs.h bcachefs: kill INODE_LOCK, use lock_two_nondirectories() 2024-01-01 11:47:36 -05:00
fsck.c bcachefs: fsck_err()s don't need to manually check c->sb.version anymore 2024-01-05 23:24:21 -05:00
fsck.h
inode.c bcachefs: unify inode trigger 2024-01-05 23:24:19 -05:00
inode.h bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
io_misc.c bcachefs: bkey_for_each_ptr() now declares loop iter 2024-01-01 11:47:43 -05:00
io_misc.h
io_read.c bcachefs: improve checksum error messages 2024-01-05 23:24:21 -05:00
io_read.h
io_write_types.h
io_write.c bcachefs: for_each_keylist_key() declares loop iter 2024-01-01 11:47:43 -05:00
io_write.h closures: CLOSURE_CALLBACK() to fix type punning 2023-11-24 00:29:58 -05:00
journal_io.c bcachefs: improve checksum error messages 2024-01-05 23:24:21 -05:00
journal_io.h closures: CLOSURE_CALLBACK() to fix type punning 2023-11-24 00:29:58 -05:00
journal_reclaim.c bcachefs: for_each_member_device_rcu() now declares loop iter 2024-01-01 11:47:42 -05:00
journal_reclaim.h bcachefs: btree write buffer now slurps keys from journal 2024-01-01 11:47:41 -05:00
journal_sb.c
journal_sb.h
journal_seq_blacklist.c bcachefs: convert bch_fs_flags to x-macro 2024-01-01 11:47:38 -05:00
journal_seq_blacklist.h
journal_types.h bcachefs: btree write buffer now slurps keys from journal 2024-01-01 11:47:41 -05:00
journal.c bcachefs: for_each_member_device_rcu() now declares loop iter 2024-01-01 11:47:42 -05:00
journal.h bcachefs: vstruct_for_each() now declares loop iter 2024-01-01 11:47:42 -05:00
Kconfig bcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONS 2024-01-01 11:47:44 -05:00
keylist_types.h
keylist.c bcachefs: for_each_keylist_key() declares loop iter 2024-01-01 11:47:43 -05:00
keylist.h bcachefs: for_each_keylist_key() declares loop iter 2024-01-01 11:47:43 -05:00
logged_ops.c bcachefs: for_each_btree_key() now declares loop iter 2024-01-01 11:47:42 -05:00
logged_ops.h
lru.c bcachefs: for_each_btree_key() now declares loop iter 2024-01-01 11:47:42 -05:00
lru.h
Makefile bcachefs: factor out thread_with_file, thread_with_stdio 2024-01-05 23:24:19 -05:00
mean_and_variance_test.c
mean_and_variance.c bcachefs: mean and variance: fix kernel-doc for function params 2024-01-01 11:47:42 -05:00
mean_and_variance.h bcachefs: Fixes for rust bindgen 2024-01-01 11:47:42 -05:00
migrate.c bcachefs: for_each_btree_key() now declares loop iter 2024-01-01 11:47:42 -05:00
migrate.h
move_types.h
move.c bcachefs: bkey_for_each_ptr() now declares loop iter 2024-01-01 11:47:43 -05:00
move.h bcachefs: rebalance should wakeup on shutdown if disabled 2024-01-01 11:47:39 -05:00
movinggc.c bcachefs: for_each_member_device() now declares loop iter 2024-01-01 11:47:42 -05:00
movinggc.h
nocow_locking_types.h
nocow_locking.c
nocow_locking.h
opts.c bcachefs: BCH_ERR_opt_parse_error 2024-01-01 11:47:40 -05:00
opts.h bcachefs: Add an option to control btree node prefetching 2024-01-05 23:24:20 -05:00
printbuf.c bcachefs: prt_bitflags_vector() 2024-01-01 11:47:07 -05:00
printbuf.h bcachefs: prt_bitflags_vector() 2024-01-01 11:47:07 -05:00
quota_types.h
quota.c bcachefs: for_each_btree_key() now declares loop iter 2024-01-01 11:47:42 -05:00
quota.h
rebalance_types.h
rebalance.c bcachefs: bch_err_(fn|msg) check if should print 2024-01-01 11:47:41 -05:00
rebalance.h
recovery_types.h bcachefs: check_directory_structure() can now be run online 2024-01-01 11:47:44 -05:00
recovery.c bcachefs: Restart recovery passes more reliably 2024-01-05 23:24:21 -05:00
recovery.h bcachefs: bch2_run_online_recovery_passes() 2024-01-01 11:47:40 -05:00
reflink.c bcachefs: move "ptrs not changing" optimization to bch2_trigger_extent() 2024-01-05 23:24:46 -05:00
reflink.h bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
replicas_types.h bcachefs: Replace zero-length arrays with flexible-array members 2024-01-01 11:47:39 -05:00
replicas.c bcachefs: simplify bch_devs_list 2024-01-01 11:47:42 -05:00
replicas.h bcachefs: Rename bch_replicas_entry -> bch_replicas_entry_v1 2024-01-01 11:47:38 -05:00
sb-clean.c bcachefs: for_each_member_device() now declares loop iter 2024-01-01 11:47:42 -05:00
sb-clean.h
sb-downgrade.c bcachefs: fsck_err()s don't need to manually check c->sb.version anymore 2024-01-05 23:24:21 -05:00
sb-downgrade.h bcachefs: Upgrades now specify errors to fix, like downgrades 2024-01-05 23:24:20 -05:00
sb-errors_types.h bcachefs: fsck_err()s don't need to manually check c->sb.version anymore 2024-01-05 23:24:21 -05:00
sb-errors.c bcachefs: bch_sb_field_downgrade 2024-01-01 11:47:07 -05:00
sb-errors.h bcachefs: bch_sb.recovery_passes_required 2024-01-01 11:47:07 -05:00
sb-members.c bcachefs: Fix printing of device durability 2024-01-05 23:24:19 -05:00
sb-members.h bcachefs: for_each_member_device_rcu() now declares loop iter 2024-01-01 11:47:42 -05:00
seqmutex.h
siphash.c
siphash.h
six.c bcachefs: six locks: Simplify optimistic spinning 2024-01-01 11:47:38 -05:00
six.h bcachefs: six lock: fix typos 2024-01-01 11:47:40 -05:00
snapshot.c bcachefs: fsck_err()s don't need to manually check c->sb.version anymore 2024-01-05 23:24:21 -05:00
snapshot.h bcachefs: Combine .trans_trigger, .atomic_trigger 2024-01-05 23:24:20 -05:00
str_hash.h bcachefs: bch_str_hash_flags_t 2024-01-01 11:47:37 -05:00
subvolume_types.h bcachefs: Fixes for rust bindgen 2024-01-01 11:47:42 -05:00
subvolume.c bcachefs: for_each_btree_key() now declares loop iter 2024-01-01 11:47:42 -05:00
subvolume.h bcachefs: make RO snapshots actually RO 2024-01-01 11:47:07 -05:00
super_types.h bcachefs updates for 6.8: 2024-01-10 16:34:17 -08:00
super-io.c bcachefs updates for 6.8: 2024-01-10 16:34:17 -08:00
super-io.h bcachefs: __bch2_sb_field_to_text() 2024-01-05 23:24:21 -05:00
super.c bcachefs: %pg is banished 2024-01-05 23:24:21 -05:00
super.h bcachefs: convert bch_fs_flags to x-macro 2024-01-01 11:47:38 -05:00
sysfs.c bcachefs: for_each_btree_key() now declares loop iter 2024-01-01 11:47:42 -05:00
sysfs.h
tests.c bcachefs: for_each_btree_key() now declares loop iter 2024-01-01 11:47:42 -05:00
tests.h
thread_with_file_types.h bcachefs: factor out thread_with_file, thread_with_stdio 2024-01-05 23:24:19 -05:00
thread_with_file.c bcachefs: no thread_with_file in userspace 2024-01-05 23:24:20 -05:00
thread_with_file.h bcachefs: factor out thread_with_file, thread_with_stdio 2024-01-05 23:24:19 -05:00
trace.c
trace.h bcachefs: Improve would_deadlock trace event 2024-01-05 23:24:21 -05:00
two_state_shared_lock.c
two_state_shared_lock.h
util.c bcachefs: Improve would_deadlock trace event 2024-01-05 23:24:21 -05:00
util.h bcachefs: %pg is banished 2024-01-05 23:24:21 -05:00
varint.c
varint.h
vstructs.h bcachefs: vstruct_for_each() now declares loop iter 2024-01-01 11:47:42 -05:00
xattr.c bcachefs: make RO snapshots actually RO 2024-01-01 11:47:07 -05:00
xattr.h