Commit Graph

469 Commits

Author SHA1 Message Date
Kent Overstreet
5ca8ff157d bcachefs: Use kvzalloc() when dynamically allocating btree paths
THis silences a mm/page_alloc.c warning about allocating more than a
page with GFP_NOFAIL - and there's no reason for this to not have a
vmalloc fallback anyways.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
83bd5985fa bcachefs: Track iter->ip_allocated at bch2_trans_copy_iter()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
3254c1b0e5 bcachefs: Save key_cache_path in peek_slot()
When bch2_btree_iter_peek_slot() clones the iterator to search for the
next key, and then discovers that the key from the cloned iterator is
the key we want to return - we also want to save the
iter->key_cache_path as well, for the update path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
52946d828a bcachefs: Kill more -EIO error codes
This converts -EIOs related to btree node errors to private error codes,
which will help with some ongoing debugging by giving us better error
messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
fc634d8e46 bcachefs: btree_and_journal_iter.trans
we now always have a btree_trans when using a btree_and_journal_iter;
prep work for adding prefetching to btree_and_journal_iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:08 -04:00
Kent Overstreet
fadc6067f2 bcachefs: Set path->uptodate when no node at level
We were failing to set path->uptodate when reaching the end of a btree
node iterator, causing the new prefetch code for backpointers gc to go
into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:30:56 -04:00
Kent Overstreet
ba89083e9f bcachefs: Fix journal replay with unreadable btree roots
When a btree root is unreadable, we still might be able to get some data
back by replaying what's in the journal. Previously though, we got
confused when journal replay would attempt to replay a key for a level
that didn't exist.

This adds bch2_btree_increase_depth(), so that journal replay can handle
this.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:18:13 -04:00
Kent Overstreet
204f45140f bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree
If we're in FILTER_SNAPSHOTS mode and we start scanning a range of the
keyspace where no keys are visible in the current snapshot, we have a
problem - we'll scan for a very long time before scanning terminates.

Awhile back, this was fixed for most cases with peek_upto() (and
assertions that enforce that it's being used).

But the fix missed the fact that the inodes btree is different - every
key offset is in a different snapshot tree, not just the inode field.

Fixes:
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-02-24 20:41:46 -05:00
Kent Overstreet
b97de45365 bcachefs: Improve trace_trans_restart_relock
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00
Kent Overstreet
49a5192c0e bcachefs: Add an option to control btree node prefetching
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:20 -05:00
Kent Overstreet
89056f245b bcachefs: track transaction durations
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
83322e8ca8 bcachefs: btree_trans always has stats
reserve slot 0 for unknown (when we overflow), to avoid some branches

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
c558c577cb bcachefs: bch2_btree_trans_peek_slot_updates
refactoring the BTREE_ITER_WITH_UPDATES code, prep for removing the flag
and making it always-on

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
359e89add5 bcachefs: bch2_btree_trans_peek_prev_updates
bch2_btree_iter_peek_prev() now supports BTREE_ITER_WITH_UPDATES

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
eb6863598a bcachefs: bch2_btree_trans_peek_updates
refactoring the BTREE_ITER_WITH_UPDATES code, prep for removing the flag
and making it always-on

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
0c99e17d3b bcachefs: growable btree_paths
XXX: we're allocating memory with btree locks held - bad

We need to plumb through an error path so we can do
allocate_dropping_locks() - but we're merging this now because it fixes
a transaction path overflow caused by indirect extent fragmentation, and
the resize path is rare.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
2c3b0fc3bd bcachefs: trans->nr_paths
Start to plumb through dynamically growable btree_paths; this patch
replaces most BTREE_ITER_MAX references with trans->nr_paths.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
5cc6daf749 bcachefs: trans->updates will also be resizable
the reflink triggers are also bumping up against the maximum number of
paths in a transaction - and generating proportional numbers of updates.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
31403dca5b bcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONS
- Some tweaks to greatly reduce locking overhead for the list of btree
   transactions, so that it can always be enabled: leave btree_trans
   objects on the list when they're on the percpu single item freelist,
   and only check for duplicates in the same process when
   CONFIG_BCACHEFS_DEBUG is enabled

 - don't zero out the full btree_trans() unless we allocated it from
   the mempool

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
fea153a845 bcachefs: rcu protect trans->paths
Upcoming patches are going to be changing trans->paths to a
reallocatable buffer. We need to guard against use after free when it's
used by other threads; this introduces RCU protection to those paths and
changes them to check for trans->paths == NULL

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
6474b70610 bcachefs: Clean up btree_trans
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
398c98347d bcachefs: kill btree_path.idx
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
542e639674 bcachefs: bch2_btree_iter_peek_prev() no longer uses path->idx
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
566eabd36f bcachefs: bch2_path_get() no longer uses path->idx
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
b0b6737822 bcachefs: trans_for_each_path_with_node() no longer uses path->idx
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
ccb7b08fbb bcachefs: trans_for_each_path() no longer uses path->idx
path->idx is now a code smell: we should be using path_idx_t, since it's
stable across btree path reallocation.

This is also a bit faster, using the same loop counter vs. fetching
path->idx from each path we iterate over.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
311e446a41 bcachefs: bch2_btree_path_to_text() -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
1f75ba4e65 bcachefs: struct trans_for_each_path_inorder_iter
reducing our usage of path->idx

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
7f9821a7c1 bcachefs: btree_insert_entry -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
07f383c71f bcachefs: btree_iter -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
788cc25d15 bcachefs: btree_path_alloc() -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
96ed47d130 bcachefs: bch2_btree_path_traverse() -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
f6363acaa6 bcachefs: bch2_btree_path_make_mut() -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
4617d94617 bcachefs: bch2_btree_path_set_pos() -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
74e600c19a bcachefs; bch2_path_put() -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
255ebbbf75 bcachefs: bch2_path_get() -> btree_path_idx_t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
5ce8b92da0 bcachefs: minor bch2_btree_path_set_pos() optimization
bpos_eq() is cheaper than bpos_cmp()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
0bc64d7e26 bcachefs: kill __bch2_btree_iter_peek_upto_and_restart()
dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:43 -05:00
Kent Overstreet
79904fa2bb bcachefs: bch2_trans_srcu_lock() should be static
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:42 -05:00
Kent Overstreet
559e6c2336 bcachefs: trans_for_each_update() now declares loop iter
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:42 -05:00
Kent Overstreet
e06af20719 bcachefs: fix userspace build errors
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:41 -05:00
Kent Overstreet
679972348d bcachefs: kill btree_trans->wb_updates
the btree write buffer path now creates a journal entry directly

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:41 -05:00
Kent Overstreet
f33600057f bcachefs: bch2_trans_node_add no longer uses trans_for_each_path()
In the future we'll be making trans->paths resizable and potentially
having _many_ more paths (for fsck); we need to start fixing algorithms
that walk each path in a transaction where possible.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:41 -05:00
Kent Overstreet
24de63dacb bcachefs: Improve trans->extra_journal_entries
Instead of using a darray, we now allocate journal entries for the
transaction commit path with our normal bump allocator - with an inlined
fastpath, and using btree_transaction_stats to remember how much to
initially allocate so as to avoid transaction restarts.

This is prep work for converting write buffer updates to use this
mechanism.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:41 -05:00
Kent Overstreet
a83b6c895c bcachefs: kill btree_path->(alloc_seq|downgrade_seq)
These were for extra info in tracepoints for debugging a specialized
issue - we do not want to bloat btree_path for this, at least in release
builds.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:41 -05:00
Kent Overstreet
f8fd5871be bcachefs: reserve path idx 0 for sentinal
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:40 -05:00
Kent Overstreet
6e92d15546 bcachefs: Refactor trans->paths_allocated to be standard bitmap
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:40 -05:00
Kent Overstreet
a564c9fad5 bcachefs: Include btree_trans in more tracepoints
This gives us more context information - e.g. which codepath is invoking
btree node reads.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:40 -05:00
Kent Overstreet
3398124444 bcachefs: Improve trace_trans_restart_would_deadlock
In the CI, we're seeing tests failing due to excessive would_deadlock
transaction restarts - the tracepoint now includes the lock cycle that
occured.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:39 -05:00
Kent Overstreet
e153a0d70b bcachefs: Improve trace_trans_restart_too_many_iters()
We now include the list of paths in use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:39 -05:00