linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-23 12:42:02 +00:00

Author	SHA1	Message	Date
Kent Overstreet	f82755e4e8	bcachefs: Data move path now uses bch2_trans_unlock_long() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-04 22:19:11 -04:00
Kent Overstreet	c4accde498	bcachefs: Ensure srcu lock is not held too long The SRCU read lock that btree_trans takes exists to make it safe for bch2_trans_relock() to deref pointers to btree nodes/key cache items we don't have locked, but as a side effect it blocks reclaim from freeing those items. Thus, it's important to not hold it for too long: we need to differentiate between bch2_trans_unlock() calls that will be only for a short duration, and ones that will be for an unbounded duration. This introduces bch2_trans_unlock_long(), to be used mainly by the data move paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-04 14:17:11 -04:00
Kent Overstreet	be9e782df3	bcachefs: Don't downgrade locks on transaction restart We should only be downgrading locks on success - otherwise, our transaction restarts won't be getting the correct locks and we'll livelock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-01 21:11:08 -04:00
Kent Overstreet	88dfe193bd	bcachefs: bch2_btree_id_str() Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-31 12:18:37 -04:00
Kent Overstreet	6bd68ec266	bcachefs: Heap allocate btree_trans We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocatation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:13 -04:00
Kent Overstreet	96dea3d599	bcachefs: Fix W=12 build errors Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:13 -04:00
Colin Ian King	6bf3766b52	bcachefs: Fix a handful of spelling mistakes in various messages There are several spelling mistakes in error messages. Fix these. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:13 -04:00
Kent Overstreet	5b7fbdcd5b	bcachefs: Fix silent enum conversion error This changes mark_btree_node_locked() to take an enum btree_node_locked_type, not a six_lock_type, since BTREE_NODE_UNLOCKED is -1 which may cause problems converting back and forth to six_lock_type if short enums are in use. With this change, we never store BTREE_NODE_UNLOCKED in a six_lock_type enum. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:12 -04:00
Kent Overstreet	8e877caaad	bcachefs: Split out snapshot.c subvolume.c has gotten a bit large, this splits out a separate file just for managing snapshot trees - BTREE_ID_snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:11 -04:00
Kent Overstreet	ff5b741c25	bcachefs: Zero btree_paths on allocation This fixes a bug in the cycle detector, bch2_check_for_deadlock() - we have to make sure the node pointers in the btree paths array are set to something not-garbage before another thread may see them. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:11 -04:00
Kent Overstreet	401585fe87	bcachefs: btree_journal_iter.c Split out a new file from recovery.c for managing the list of keys we read from the journal: before journal replay finishes the btree iterator code needs to be able to iterate over and return keys from the journal as well, so there's a fair bit of code here. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:10 -04:00
Kent Overstreet	1e81f89b02	bcachefs: Fix assorted checkpatch nits Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:10 -04:00
Kent Overstreet	bf5a261c7a	bcachefs: Assorted fixes for clang clang had a few more warnings about enum conversion, and also didn't like the opts.c initializer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:09 -04:00
Kent Overstreet	73bd774d28	bcachefs: Assorted sparse fixes - endianness fixes - mark some things static - fix a few __percpu annotations - fix silent enum conversions Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:06 -04:00
Kent Overstreet	faa6cb6c13	bcachefs: Allow for unknown btree IDs We need to allow filesystems with metadata from newer versions to be mountable and usable by older versions. This patch enables us to roll out new btrees without a new major version number; we can now handle btree roots for unknown btree types. The unknown btree roots will be retained, and fsck (including backpointers) will check them, the same as other btree types. We add a dynamic array for the extra, unknown btree roots, in addition to the fixed size btree root array, and add new helpers for looking up btree roots. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:05 -04:00
Kent Overstreet	a5b696ee6e	bcachefs: seqmutex; fix a lockdep splat We can't be holding btree_trans_lock while copying to user space, which might incur a page fault. To fix this, convert it to a seqmutex so we can unlock/relock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:04 -04:00
Kent Overstreet	49c7cd9d8d	bcachefs: More drop_locks_do() conversions Using drop_locks_do() ensures that every unlock() is paired with a relock(), with proper error checking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	78367aaa5a	bcachefs: bch2_trans_kmalloc no longer allocates memory with btree locks held When allocating memory, gfp flags should generally be - GFP_NOWAIT\|__GFP_NOWARN if btree locks are held - GFP_NOFS if in the IO path or otherwise holding resources needed for IO submission - GFP_KERNEL otherwise Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	b5fd75669a	bcachefs: drop_locks_do() Add a new helper for the common pattern of: - trans_unlock() - do something - trans_relock() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	f154c3eb42	bcachefs: trans_for_each_path_safe() bch2_btree_trans_to_text() is used on btree_trans objects that are owned by different threads - when printing out deadlock cycles - so we need a safe version of trans_for_each_path(), else we race with seeing a btree_path that was just allocated and not fully initialized: Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:02 -04:00
Kent Overstreet	32913f49f5	six locks: Seq now only incremented on unlock Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:02 -04:00
Kent Overstreet	1fb4fe6317	six locks: Kill six_lock_state union As suggested by Linus, this drops the six_lock_state union in favor of raw bitmasks. On the one hand, bitfields give more type-level structure to the code. However, a significant amount of the code was working with six_lock_state as a u64/atomic64_t, and the conversions from the bitfields to the u64 were deemed a bit too out-there. More significantly, because bitfield order is poorly defined (#ifdef __LITTLE_ENDIAN_BITFIELD can be used, but is gross), incrementing the sequence number would overflow into the rest of the bitfield if the compiler didn't put the sequence number at the high end of the word. The new code is a bit saner when we're on an architecture without real atomic64_t support - all accesses to lock->state now go through atomic64_*() operations. On architectures with real atomic64_t support, we additionally use atomic bit ops for setting/clearing individual bits. Text size: 7467 bytes -> 4649 bytes - compilers still suck at bitfields. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:02 -04:00
Kent Overstreet	f375d6ca58	bcachefs: Don't call local_clock() twice in trans_begin() local_clock() is not as cheap as we'd like it to be, alas Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:01 -04:00
Kent Overstreet	11f117374a	bcachefs: Call bch2_path_put_nokeep() before bch2_path_put() bch2_path_put_nokeep() is sketchy, and we should consider removing it: it unconditionally frees btree_paths once their ref hits 0. The assumption is that we only use it for paths that have never been visible outside the btree core btree code; i.e. higher level code will never be making assumptions about locking based on these paths. However, there's subtle brokenness with this approach: - If we call bch2_path_put(), then bch2_path_put_nokeep(), bch2_path_put() may free the first path on the assumption that we we have another path keeping a node locked - but then bch2_path_put_nokeep() just unconditionally frees it. The same bug may arise if we're calling bch2_path_put() and bch2_path_put_nokeep() on the same (refcounted) path, or two adjacent paths that point to the same btree node. This patch hacks around one of these bugs by calling bch2_path_put_nokeep() first in bch2_trans_iter_exit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:58 -04:00
Kent Overstreet	65d48e3525	bcachefs: Private error codes: ENOMEM This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	511b629aca	bcachefs: bch2_btree_iter_peek_node_and_restart() Minor refactoring for the Rust interface. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:56 -04:00
Kent Overstreet	1306f87de3	bcachefs: Plumb btree_trans through btree cache code Soon, __bch2_btree_node_write() is going to require a btree_trans: zoned device support is going to require a new allocation for every btree node write. This is a bit of prep work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:55 -04:00
Kent Overstreet	0f2ea6550f	bcachefs: bch2_btree_iter_peek_and_restart_outlined() Needed for interfacing with Rust - bindgen can't handle inline functions, alas. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:55 -04:00
Kent Overstreet	5e2d8be8bd	bcachefs: Split trans->last_begin_ip and trans->last_restarted_ip These are two different things - this improves our debug assert messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:53 -04:00
Kent Overstreet	6623c0fcdf	bcachefs: Add an assertion for using multiple btree_trans A thread should never be using more than one btree_trans - doing so is an invitation for deadlocks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:53 -04:00
Kent Overstreet	920e69bc3d	bcachefs: Btree write buffer This adds a new method of doing btree updates - a straight write buffer, implemented as a flat fixed size array. This is only useful when we don't need to read from the btree in order to do the update, and when reading is infrequent - perfect for the LRU btree. This will make LRU btree updates fast enough that we'll be able to use it for persistently indexing buckets by fragmentation, which will be a massive boost to copygc performance. Changes: - A new btree_insert_type enum, for btree_insert_entries. Specifies btree, btree key cache, or btree write buffer. - bch2_trans_update_buffered(): updates via the btree write buffer don't need a btree path, so we need a new update path. - Transaction commit path changes: The update to the btree write buffer both mutates global, and can fail if there isn't currently room. Therefore we do all write buffer updates in the transaction all at once, and also if it fails we have to revert filesystem usage counter changes. If there isn't room we flush the write buffer in the transaction commit error path and retry. - A new persistent option, for specifying the number of entries in the write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	60b5538877	bcachefs: trans->notrace_relock_fail When we unlock in order to submit IO, the next relock event is likely to fail if submit_bio() blocked - we shouldn't those events in our _fail stats, since those are expected events and shouldn't cause test failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	434b1c75a4	bcachefs: Switch a BUG_ON() to a panic() This assert is popping - rarely - in the CI, this will help us track it down from the logs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	992fa4e620	bcachefs: Fix btree_path_alloc() We need to call bch2_trans_update_max_paths() before marking the new path as allocated, since we're not initializing it yet. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	c72f687a1f	bcachefs: Use for_each_btree_key_upto() more consistently It's important that in BTREE_ITER_FILTER_SNAPSHOTS mode we always use peek_upto() and provide an end for the interval we're searching for - otherwise, when we hit the end of the inode the next inode be in a different subvolume and not have any keys in the current snapshot, and we'd iterate over arbitrarily many keys before returning one. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	94c69fafa7	bcachefs: Use six_lock_ip() This uses the new _ip() interface to six locks and hooks it up to btree_path->ip_allocated, when available. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	12344c7cb9	bcachefs: bch2_trans_in_restart_error() This replaces various BUG_ON() assertions with panics that tell us where the restart was done and the restart type. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	464b415539	bcachefs: Fix bch2_trans_reset_updates() This should have been resetting trans->fs_usage_deltas as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	4e3d18991a	bcachefs: Inline bch2_btree_path_traverse() fastpath Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	ee2c6ea776	bcachefs: btree_iter->ip_allocated In debug mode, we now track where btree iterators and paths are initialized/allocated - helpful in tracking down btree path overflows. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	6c36318cc7	bcachefs: key cache: Don't hold btree locks while using GFP_RECLAIM This is something we need to do more widely: instead of bothering with GFP_NOIO/GFP_NOFS, if we need to allocate memory while holding locks: - first attempt the allocation with GFP_NOWAIT - if that fails, drop btree locks with bch2_trans_unlock(), then retry with GFP_KERNEL. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	c82ed3047b	bcachefs: Fix bch2_btree_path_traverse_all() We need to take a ref on a path while we're traversing it: this fixes a bug with paths getting reused while being traversed, in the key cache fill code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	ee94c413a7	bcachefs: Delete a faulty assertion Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	9d7f2a4111	bcachefs: bch2_btree_trans_to_text(): print blocked time Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	e242b92af5	bcachefs: Fix for long running btree transactions & key cache While a btree transaction is running, we hold a SRCU read lock on the btree key cache that prevents btree key cache keys from being freed - this is so that relock() operations won't access freed memory. The downside of this is that long running btree transactions prevent memory from being freed from the key cache. This adds a check in bch2_trans_begin() - if the transaction has been running longer than 1 second, drop and retake the SRCU read lock and zero out pointers to unlock key cache paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	08f7803159	bcachefs: bch2_trans_revalidate_updates_in_node() When we started stashing the key being overwritten in btree_insert_entry, this introduced a typical iterator invalidation problem, triggered by btree node splits or resorts. Previously, dealt with this by unconditionally re-validating those stashed pointers in the transaction commit path. This patch gets rid of that by doing it only when needed, in bch2_trans_node_add() or bch2_trans_node_reinit_iter(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	321bdc73f3	bcachefs: bkey_min(), bkey_max() Parallel to bpos_min(), bpos_max() - trivial refactoring. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	ef0732861a	bcachefs: Add a missing bch2_btree_path_traverse() call bch2_btree_iter_peek_upto() in snapshots mode may need to keep a btree_path for the insert position, not just the position of the key we're returning. The code was incorrectly assuming this would be in the same btree node - we were missing a bch2_btree_path_traverse() call. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:47 -04:00
Kent Overstreet	5c792e1b64	bcachefs: Fix a btree iter assertion pop This fixes a (harmless) broken invariant in __bch2_btree_path_set_pos(): iterators to interior nodes should point to the first non whiteout. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:47 -04:00
Kent Overstreet	1617d56dc9	bcachefs: Key cache now works for snapshots btrees This switches btree_key_cache_fill() to use a btree iterator, not a btree path, so that it can search for keys in previous snapshots. We also add another iterator flag, BTREE_ITER_KEY_CACHE_FILL, to avoid recursion back into the key cache. This will allow us to re-enable the key cache for inodes in the next patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:47 -04:00

1 2 3 4 5 ...

410 Commits