Commit Graph

2880 Commits

Author SHA1 Message Date
Kent Overstreet
415e5107b0 bcachefs: Extra kthread_should_stop() calls for copygc
This fixes a bug where going read-only was taking longer than it should
have due to copygc forgetting to check kthread_should_stop()

Additionally: fix a missing is_kthread check in bch2_move_ratelimit().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 22:58:23 -05:00
Kent Overstreet
463086d998 bcachefs: Convert gc_alloc_start() to for_each_btree_key2()
This eliminates some SRCU warnings: for_each_btree_key2() runs every
loop iteration in a distinct transaction context.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 22:58:22 -05:00
Kent Overstreet
2111f39459 bcachefs: Fix race between btree writes and metadata drop
btree writes update the btree node key after every write, in order to
update sectors_written, and they also might need to drop pointers if one
of the writes failed in a replicated btree node.

But the btree node might also have had a pointer dropped while the write
was in flight, by bch2_dev_metadata_drop(), and thus there was a bug
where the btree node write would ovewrite the btree node's key with what
it had at the start of the write.

Fix this by dropping pointers not currently in the btree node key.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 22:58:22 -05:00
Kent Overstreet
ef0beeb8dd bcachefs: move journal seq assertion
journal_cur_seq() can legitimately be used outside of the journal lock,
where this assert can race

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 22:58:22 -05:00
Kent Overstreet
1b1bd0fd41 bcachefs: -EROFS doesn't count as move_extent_start_fail
The automated tests check if we've hit too many slowpath/error path
events and fail the test - if we're just shutting down, that naturally
shouldn't count.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 22:58:22 -05:00
Kent Overstreet
ae4d612cc1 bcachefs: trace_move_extent_start_fail() now includes errcode
Renamed from trace_move_extent_alloc_mem_fail, because there are other
reasons we colud fail (disk space allocation failure).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 17:18:24 -05:00
Kent Overstreet
5510a4af52 bcachefs: Fix split_race livelock
bch2_btree_update_start() calculates which nodes are going to have to be
split/rewritten, so that we know how many nodes to reserve and how deep
in the tree we have to take locks.

But btree node merges require inserting two keys into the parent node,
not just splits.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 17:18:24 -05:00
Kent Overstreet
03013bb0c6 bcachefs: Fix bucket data type for stripe buckets
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 17:18:24 -05:00
Kent Overstreet
d5bd37872a bcachefs: Add missing validation for jset_entry_data_usage
Validation was completely missing for replicas entries in the journal
(not the superblock replicas section) - we can't have replicas entries
pointing to invalid devices.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 17:18:24 -05:00
Kent Overstreet
bbc3a46065 bcachefs: Fix zstd compress workspace size
zstd apparently lies about the size of the compression workspace it
requires; if we double it compression succeeds.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-28 17:18:24 -05:00
Kent Overstreet
3f3ae1250e bcachefs: bpos is misaligned on big endian
bkey embeds a bpos that is misaligned on big endian; this is so that
bch2_bkey_swab() works correctly without having to differentiate between
packed and non-packed keys (a debatable design decision).

This means it can't have the __aligned() tag on big endian.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-25 21:48:42 -05:00
Kent Overstreet
e4f72bb46a bcachefs: Fix ec + durability calculation
Durability of an erasure coded pointer doesn't add the device
durability; durability is the same for any extent in that stripe so the
calculation only comes from the stripe.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-25 21:48:42 -05:00
Kent Overstreet
7d9f8468ff bcachefs: Data update path won't accidentaly grow replicas
Previously, there was a bug where if an extent had greater durability
than required (because we needed to move a durability=1 pointer and
ended up putting it on a durability 2 device), we would submit a write
for replicas=2 - the durability of the pointer being rewritten - instead
of the number of replicas required to bring it back up to the
data_replicas option.

This, plus the allocation path sometimes allocating on a greater
durability device than requested, meant that extents could continue
having more and more replicas added as they were being rewritten.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-25 21:48:42 -05:00
Kent Overstreet
0af8a06a4c bcachefs: deallocate_extra_replicas()
When allocating from devices with different durability, we might end up
with more replicas than required; this changes
bch2_alloc_sectors_start() to check for this, and drop replicas that
aren't needed to hit the number of replicas requested.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 03:03:47 -05:00
Kent Overstreet
8a443d3ea1 bcachefs: Proper refcounting for journal_keys
The btree iterator code overlays keys from the journal until journal
replay is finished; since we're now starting copygc/rebalance etc.
before replay is finished, this is multithreaded access and thus needs
refcounting.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 02:43:12 -05:00
Brian Foster
63807d9518 bcachefs: preserve device path as device name
Various userspace scripts/tools may expect mount entries in
/proc/mounts to reflect the device path names used to mount the
associated filesystem. bcachefs seems to normalize the device path
to the underlying device name based on the block device. This
confuses tools like fstests when the test devices might be lvm or
device-mapper based.

The default behavior for show_vfsmnt() appers to be to use the
string passed to alloc_vfsmnt(), so tweak bcachefs to copy the path
at device superblock read time and to display it via
->show_devname().

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 02:42:07 -05:00
Kent Overstreet
0a11adfb7a bcachefs: Fix an endianness conversion
cpu_to_le32(), not le32_to_cpu() - fixes a sparse complaint.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 02:42:07 -05:00
Kent Overstreet
468035ca4b bcachefs: Start gc, copygc, rebalance threads after initing writes ref
This fixes a bug where copygc would occasionally race with going
read-write and die, thinking we were read only, because it couldn't take
a ref on c->writes.

It's not necessary for copygc (or rebalance, or copygc) to take write
refs; they could run with BCH_TRANS_COMMIT_nocheck_rw, but this is an
easier fix that making sure that flag is passed correctly everywhere.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 02:30:32 -05:00
Kent Overstreet
202a7c292e bcachefs: Don't stop copygc thread on device resize
copygc no longer has to scan the buckets, so it's no longer a problem if
the number of buckets is changing while it's running.

This also fixes a bug where we forgot to restart copygc.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 02:30:32 -05:00
Kent Overstreet
261af2f1c6 bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops
This adds move_ctxt_wait_event_timeout(), which can sleep for a timeout
while also issueing pending moves as reads complete.

Co-developed-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 02:10:28 -05:00
Kent Overstreet
50e029c639 bcachefs: bch2_moving_ctxt_flush_all()
Introduce a new helper to flush all move IOs, and use it in a few places
where we should have been.

The new helper also drops btree locks before waiting on outstanding move
writes, avoiding potential deadlocks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 02:08:25 -05:00
Kent Overstreet
6201d91ee3 bcachefs: Put erasure coding behind an EXPERIMENTAL kconfig option
We still have disk space accounting changes coming for erasure coding,
and the changes won't be as strictly backwards compatible as they'd
ought to be - specifically, we need to start accounting striped data
under a separate counter in bch_alloc (which describes buckets).

A fsck will suffice for upgrading/downgrading, but since erasure coding
is the most incomplete major feature of bcachefs it still makes sense to
put behind a separate kconfig option, so that users are fully aware.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 00:29:58 -05:00
Kent Overstreet
d4e3b928ab closures: CLOSURE_CALLBACK() to fix type punning
Control flow integrity is now checking that type signatures match on
indirect function calls. That breaks closures, which embed a work_struct
in a closure in such a way that a closure_fn may also be used as a
workqueue fn by the underlying closure code.

So we have to change closure fns to take a work_struct as their
argument - but that results in a loss of clarity, as closure fns have
different semantics from normal workqueue functions (they run owning a
ref on the closure, which must be released with continue_at() or
closure_return()).

Thus, this patc introduces CLOSURE_CALLBACK() and closure_type() macros
as suggested by Kees, to smooth things over a bit.

Suggested-by: Kees Cook <keescook@chromium.org>
Cc: Coly Li <colyli@suse.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 00:29:58 -05:00
Kent Overstreet
ba276ce586 bcachefs: Fix missing locking for dentry->d_parent access
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-16 16:57:19 -05:00
Kent Overstreet
61b85cb0d7 bcachefs: six locks: Fix lost wakeup
In percpu reader mode, trylock() for read had a lost wakeup: on failure
to get the lock, we may have caused a writer to fail to get the lock,
because we temporarily elevated the reader count.

We need to check for waiters after decrementing the read count - not
before.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:44 -05:00
Kent Overstreet
62d73dfc44 bcachefs: Fix no_data_io mode checksum check
In no_data_io mode, we expect data checksums to be wrong - don't want to
spew the log with them.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:44 -05:00
Kent Overstreet
db18ef1a02 bcachefs: Fix bch2_check_nlinks() for snapshots
When searching the link table for the matching inode, we were searching
for a specific - incorrect - snapshot ID as well, causing us to fail to
find the inode.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:44 -05:00
Kent Overstreet
7125063fc6 bcachefs: Don't decrease BTREE_ITER_MAX when LOCKDEP=y
Running with fewer max btree paths doesn't work anymore when replication
is enabled - as we've added e.g. the freespace and bucket gens btrees,
we naturally end up needing more btree paths.

This is an issue with lockdep, we end up taking more locks than lockdep
will track (the MAX_LOCKD_DEPTH constant). But bcachefs as merged does
not yet support lockdep anyways, so we can leave that for later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:44 -05:00
Kent Overstreet
497c57a303 bcachefs: Disable debug log statements
The journal read path had some informational log statements preperatory
for ZNS support - they're not of interest to users, so we can turn them
off.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:44 -05:00
Kent Overstreet
f42fa17883 bcachefs: Fix missing transaction commit
In may_delete_deleted_inode(), there's a corner case when a snapshot was
taken while we had an unlinked inode: we don't want to delete the inode
in the internal (shared) snapshot node, since it might have been
reattached in a descendent snapshot.

Instead we propagate the key to any snapshot leaves it doesn't exist in,
so that it can be deleted there if necessary, and then clear the
unlinked flag in the internal node.

But we forgot to commit after clearing the unlinked flag, causing us to
go into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:43 -05:00
Kent Overstreet
178c4873fd bcachefs: Fix error path in bch2_mount()
This fixes a bug discovered by generic/388 where sb->s_fs_info was NULL
while the superblock was still active - the error path was entirely
fubar, and was trying to do something unclear and unecessary.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:43 -05:00
Daniel J Blueman
b783fc4d13 bcachefs: Fix potential sleeping during mount
During mount, bcachefs mount option processing may sleep while allocating a string buffer.

Fix this by reference counting in order to take the atomic path.

Signed-off-by: Daniel J Blueman <daniel@quora.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:43 -05:00
Kent Overstreet
069749688e bcachefs: Fix iterator leak in may_delete_deleted_inode()
may_delete_deleted_inode() was returning without exiting a btree
iterator, eventually causing propagate_key_to_snaphot_leaves() to go
into an infinite loop hitting btree_trans_too_many_iters().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:43 -05:00
Kent Overstreet
006ccc3090 bcachefs: Kill journal pre-reservations
This deletes the complicated and somewhat expensive journal
pre-reservation machinery in favor of just using journal watermarks:
when the journal is more than half full, we run journal reclaim more
aggressively, and when the journal is more than 3/4s full we only allow
journal reclaim to get new journal reservations.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:43 -05:00
Kent Overstreet
701ff57eb3 bcachefs: Check for nonce offset inconsistency in data_update path
We've rarely been seeing a nonce offset inconsistency that doesn't show
up in tests: this adds some extra verification code to the data update
path that prints out more relevant info when it occurs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:45:03 -05:00
Kent Overstreet
09b0283ee2 bcachefs: Make sure to drop/retake btree locks before reclaim
We really don't want to be invoking memory reclaim with btree locks
held: even aside from (solvable, but tricky) recursion issues, it can
cause painful to diagnose performance edge cases.

This fixes a recently reported issue in btree_key_can_insert_cached().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-by: Mateusz Guzik <mjguzik@gmail.com>
Fixes: https://lore.kernel.org/linux-bcachefs/CAGudoHEsb_hGRMeWeXh+UF6po0qQuuq_NKSEo+s1sEb6bDLjpA@mail.gmail.com/T/
2023-11-13 21:45:03 -05:00
Kent Overstreet
3b8c450777 bcachefs: btree_trans->write_locked
As prep work for the next patch to fix a key cache reclaim issue, we
need to start tracking whether we're currently holding write locks - so
that we can release and retake the before calling into memory reclaim.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:45:03 -05:00
Kent Overstreet
c65c13f0ea bcachefs: Run btree key cache shrinker less aggressively
The btree key cache maintains lists of items that have been freed, but
can't yet be reclaimed because a bch2_trans_relock() call might find
them - we're waiting for SRCU readers to release.

Previously, we wouldn't count these items against the number we're
attempting to scan for, which would mean we'd evict more live key cache
entries - doing quite a bit of potentially unecessary work.

With recent work to make sure we don't hold SRCU locks for too long, it
should be safe to count all the items on the freelists against number to
scan - even if we can't reclaim them yet, we will be able to soon.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:45:01 -05:00
Kent Overstreet
1bd5bcc9f5 bcachefs: Split out btree_key_cache_types.h
More consistent organization.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:44:14 -05:00
Kent Overstreet
4d6128dca6 bcachefs: Guard against insufficient devices to create stripes
We can't create stripes if we don't have enough devices - this
manifested as an integer underflow bug later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:42:22 -05:00
Kent Overstreet
03cc1e67a2 bcachefs: Fix null ptr deref in bch2_backpointer_get_node()
bch2_btree_iter_peek_node() can return a NULL ptr (when the tree is
shorter than the search depth); handle this with an early return.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: https://lore.kernel.org/linux-bcachefs/5fc3c28b-c232-4ec7-b0ac-4ef220ddf976@moroto.mountain/T/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:42:22 -05:00
Gustavo A. R. Silva
274c2f8fd2 bcachefs: Fix multiple -Warray-bounds warnings
Transform zero-length array `entries` into a proper flexible-array
member in `struct journal_seq_blacklist_table`; and fix the following
-Warray-bounds warnings:

fs/bcachefs/journal_seq_blacklist.c:148:26: warning: array subscript idx is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]
fs/bcachefs/journal_seq_blacklist.c:150:30: warning: array subscript idx is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]
fs/bcachefs/journal_seq_blacklist.c:154:27: warning: array subscript idx is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]
fs/bcachefs/journal_seq_blacklist.c:176:27: warning: array subscript i is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]
fs/bcachefs/journal_seq_blacklist.c:177:27: warning: array subscript i is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]
fs/bcachefs/journal_seq_blacklist.c:297:34: warning: array subscript i is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]
fs/bcachefs/journal_seq_blacklist.c:298:34: warning: array subscript i is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]
fs/bcachefs/journal_seq_blacklist.c:300:31: warning: array subscript i is outside array bounds of 'struct journal_seq_blacklist_table_entry[0]' [-Warray-bounds=]

This results in no differences in binary output.

This helps with the ongoing efforts to globally enable -Warray-bounds.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:42:22 -05:00
Gustavo A. R. Silva
1b8bc55628 bcachefs: Use DECLARE_FLEX_ARRAY() helper and fix multiple -Warray-bounds warnings
Transform zero-length array `s` into a proper flexible-array
member in `struct snapshot_table` via the DECLARE_FLEX_ARRAY()
helper; and fix tons of the following -Warray-bounds warnings:

fs/bcachefs/snapshot.h:36:21: warning: array subscript <unknown> is outside array bounds of 'struct snapshot_t[0]' [-Warray-bounds=]
fs/bcachefs/snapshot.h:36:21: warning: array subscript <unknown> is outside array bounds of 'struct snapshot_t[0]' [-Warray-bounds=]
fs/bcachefs/snapshot.c:135:70: warning: array subscript <unknown> is outside array bounds of 'struct snapshot_t[0]' [-Warray-bounds=]
fs/bcachefs/snapshot.h:36:21: warning: array subscript <unknown> is outside array bounds of 'struct snapshot_t[0]' [-Warray-bounds=]
fs/bcachefs/snapshot.h:36:21: warning: array subscript <unknown> is outside array bounds of 'struct snapshot_t[0]' [-Warray-bounds=]
fs/bcachefs/snapshot.h:36:21: warning: array subscript <unknown> is outside array bounds of 'struct snapshot_t[0]' [-Warray-bounds=]

This helps with the ongoing efforts to globally enable -Warray-bounds.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:42:21 -05:00
Kent Overstreet
c4f1f80a0e bcachefs: Use correct fgf_t type as function argument
This quiets a sparse complaint.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-13 21:42:21 -05:00
Jiapeng Chong
48d584b7f9 bcachefs: make bch2_target_to_text_sb static
The bch2_target_to_text_sb are not used outside the file disk_groups.c,
so the modification is defined as static.

fs/bcachefs/disk_groups.c:583:6: warning: no previous prototype for ‘bch2_target_to_text_sb’.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7144
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-11-13 21:42:21 -05:00
Linus Torvalds
c9d01179e1 Second bcachefs pull request for 6.7-rc1
Here's the second big bcachefs pull request. This brings your tree up to
 date with my master branch, which is what existing bcachefs users are
 currently running.
 
 All but the last few patches have been in linux-next, those being small
 fixes. Test results from my dashboard:
   https://evilpiepirate.org/~testdashboard/ci?commit=c7046ed0cf9bb33599aa7e72e7b67bba4be42d64
 
 New features:
  - rebalance_work btree (and metadata version 1.3): the rebalance thread
    no longer has to scan to find extents that need processing - big
    scalability improvement.
  - sb_errors superblock section: this adds counters for each fsck error
    type, since filesystem creation, along with the date of the most
    recent error. It'll get us better bug reports (since users do not
    typically report errors that fsck was able to fix), and I might add
    telemetry for this in the future.
 
 Fixes include:
  - multiple snapshot deletion fixes
  - members_v2 fixups
  - deleted_inodes btree fixes
  - copygc thread no longer spins when a device is full but has no
    fragmented buckets (i.e. rebalance needs to move data around instead)
  - a fix for a memory reclaim issue with the btree key cache: we're now
    careful not to hold the srcu read lock that blocks key cache reclaim
    for too long
  - an early allocator locking fix, from Brian
  - endianness fixes, from Brian
  - CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y, a big
    performance improvement on multithreaded workloads
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmVH9xYACgkQE6szbY3K
 bnahLRAAiNRZL73SQ+MW79o4yPqGwt0Eyy/mvoiGpZf1B8uXp0oZ55j2w3l887Uf
 LeM03mInAYCPdyp/d4vxqIr96j9BODmRRl8sEkkGdJDzokLG+22F0ovOe45KWTxL
 kBoNdng/O/oeOe/1K7taP3KzBvMx2nOF6oA+xfgyCjECMArAIXek0iocyEUR4Ywd
 vGKhLNn1k2c+94wacnDYwjjdcLBxoqxsFXlpu6V0BcaY+DX4J3aBaGmj75KEoCI0
 VbBOzxrOO4QzJrzW2+hxZZWgGyvReCkBJvqfORfuPxiSbFobTim10MdfZOAMQA1U
 Xr1FTEpK1wMX0/pPVgZRqaOsttC+yc/SsfPNgSxybgHPbDlMLaakDHjvYssbKOYG
 urDWSMG5yCsktSLj95SXsvUFKZaZFD72SKBNdgdt/nZjwTHuNQ7IkdrMwIrCQ/PT
 Ifn50UrR/Ahd8RAd5tyNCPw6U9VfwnxACSNl2KA7ONKpvHb+gSt1JsJTDyz1+gN9
 nFVrw1SHKQ6EIV6XhVon/5DEuRTzqoYGWoN08FHEUq9fBlvnVpmbJErCQMplOjz9
 OQnAfpJH4YqkpXyjFAjP1V0An+RUn8QvDgXNqC9TyvCYuOliVFuil4y7/c+7oIQU
 NEoz+jVLenqsGOGAbduI4/Q567COojRgwEvbebSIxSImXuhCNj4=
 =Lo4N
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2023-11-5' of https://evilpiepirate.org/git/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Here's the second big bcachefs pull request. This brings your tree up
  to date with my master branch, which is what existing bcachefs users
  are currently running.

  New features:
   - rebalance_work btree (and metadata version 1.3): the rebalance
     thread no longer has to scan to find extents that need processing -
     big scalability improvement.
   - sb_errors superblock section: this adds counters for each fsck
     error type, since filesystem creation, along with the date of the
     most recent error. It'll get us better bug reports (since users do
     not typically report errors that fsck was able to fix), and I might
     add telemetry for this in the future.

  Fixes include:
   - multiple snapshot deletion fixes
   - members_v2 fixups
   - deleted_inodes btree fixes
   - copygc thread no longer spins when a device is full but has no
     fragmented buckets (i.e. rebalance needs to move data around
     instead)
   - a fix for a memory reclaim issue with the btree key cache: we're
     now careful not to hold the srcu read lock that blocks key cache
     reclaim for too long
   - an early allocator locking fix, from Brian
   - endianness fixes, from Brian
   - CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y, a big
     performance improvement on multithreaded workloads"

* tag 'bcachefs-2023-11-5' of https://evilpiepirate.org/git/bcachefs: (70 commits)
  bcachefs: Improve stripe checksum error message
  bcachefs: Simplify, fix bch2_backpointer_get_key()
  bcachefs: kill thing_it_points_to arg to backpointer_not_found()
  bcachefs: bch2_ec_read_extent() now takes btree_trans
  bcachefs: bch2_stripe_to_text() now prints ptr gens
  bcachefs: Don't iterate over journal entries just for btree roots
  bcachefs: Break up bch2_journal_write()
  bcachefs: Replace ERANGE with private error codes
  bcachefs: bkey_copy() is no longer a macro
  bcachefs: x-macro-ify inode flags enum
  bcachefs: Convert bch2_fs_open() to darray
  bcachefs: Move __bch2_members_v2_get_mut to sb-members.h
  bcachefs: bch2_prt_datetime()
  bcachefs: CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y
  bcachefs: Add a comment for BTREE_INSERT_NOJOURNAL usage
  bcachefs: rebalance_work btree is not a snapshots btree
  bcachefs: Add missing printk newlines
  bcachefs: Fix recovery when forced to use JSET_NO_FLUSH journal entry
  bcachefs: .get_parent() should return an error pointer
  bcachefs: Fix bch2_delete_dead_inodes()
  ...
2023-11-07 11:38:38 -08:00
Kent Overstreet
c7046ed0cf bcachefs: Improve stripe checksum error message
We now include the name of the device in the error message - and also
increment the number of checksum errors on that device.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-05 13:14:22 -05:00
Kent Overstreet
853960d00b bcachefs: Simplify, fix bch2_backpointer_get_key()
- backpointer_not_found() checks backpointers_no_use_write_buffer, no
   need to do it inbackpointer_get_key().

 - always use backpointer_get_node() for pointers to nodes:
   backpointer_get_key() was sometimes returning the key from the root
   node unlocked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-05 13:14:22 -05:00
Kent Overstreet
daba90f2da bcachefs: kill thing_it_points_to arg to backpointer_not_found()
This can be calculated locally.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-05 13:14:22 -05:00
Kent Overstreet
aa98266588 bcachefs: bch2_ec_read_extent() now takes btree_trans
We're not supposed to have more than one btree_trans at a time in a
given thread - that causes recursive locking deadlocks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-05 13:13:57 -05:00