We shouldn't ever be writing past i_size - but apparently there's still
a bug to track down.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In order to avoid trying to allocate too many btree iterators,
bch2_extent_atomic_end() needs to count how many iterators are going to
be needed for insertions and overwrites - but we weren't counting the
iterators for deleting a reflink_v when the refcount goes to 0.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
If the user buffer isn't aligned to the filesystem block size, on a
large enough IO - where it won't fit into a single bio -
bio_iov_iter_get_pages() won't necessarily return a bio with the proper
alignment.
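
One possible way to cope with this, sketched below, is to trim the bio
back to a whole number of blocks and leave the tail for the next bio.
This is an illustration only, not necessarily what this patch does:
toy_unaligned_tail() and toy_trim_to_block() are hypothetical helpers,
and the block size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes past the last full block.
 * Assumes block_bytes is a nonzero power of two.
 */
static size_t toy_unaligned_tail(size_t bio_bytes, size_t block_bytes)
{
	assert(block_bytes && !(block_bytes & (block_bytes - 1)));

	return bio_bytes & (block_bytes - 1);
}

/*
 * Hypothetical helper: size the bio should be trimmed to so that it
 * covers only whole blocks; the caller would hand the tail back to the
 * iov_iter and pick it up in the next bio.
 */
static size_t toy_trim_to_block(size_t bio_bytes, size_t block_bytes)
{
	return bio_bytes - toy_unaligned_tail(bio_bytes, block_bytes);
}
```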
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When an extent is erasure coded, we need to record a replicas entry to
indicate that data is present on the devices that extent has pointers to
- but nr_required should be 0, because it's erasure coded.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Last of the basic operations for iterating forwards and backwards over
the btree: we now have
- peek(), returns key >= iter->pos
- next(), returns key > iter->pos
- peek_prev(), returns key <= iter->pos
- prev(), returns key < iter->pos
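
The semantics of the four operations above can be sketched over a plain
sorted array. This is a toy model, not the btree iterator code: struct
toy_iter and the toy_*() functions are hypothetical, keys are bare
unsigned integers, and each call that finds a key is assumed to leave
the iterator positioned at that key.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy iterator over a sorted array of unique keys. */
struct toy_iter {
	const unsigned	*keys;
	size_t		nr;
	unsigned	pos;
};

/* Index of the first key >= pos, or iter->nr if there is none. */
static size_t toy_lower_bound(const struct toy_iter *iter, unsigned pos)
{
	size_t l = 0, r = iter->nr;

	assert(iter->keys || !iter->nr);

	while (l < r) {
		size_t m = l + (r - l) / 2;

		if (iter->keys[m] < pos)
			l = m + 1;
		else
			r = m;
	}
	return l;
}

/* Move the iterator to keys[i] and return a pointer to that key. */
static const unsigned *toy_iter_set(struct toy_iter *iter, size_t i)
{
	iter->pos = iter->keys[i];
	return &iter->keys[i];
}

/* peek(): returns key >= iter->pos, or NULL */
static const unsigned *toy_peek(struct toy_iter *iter)
{
	size_t i = toy_lower_bound(iter, iter->pos);

	return i < iter->nr ? toy_iter_set(iter, i) : NULL;
}

/* next(): returns key > iter->pos, or NULL */
static const unsigned *toy_next(struct toy_iter *iter)
{
	size_t i = toy_lower_bound(iter, iter->pos);

	if (i < iter->nr && iter->keys[i] == iter->pos)
		i++;
	return i < iter->nr ? toy_iter_set(iter, i) : NULL;
}

/* peek_prev(): returns key <= iter->pos, or NULL */
static const unsigned *toy_peek_prev(struct toy_iter *iter)
{
	size_t i = toy_lower_bound(iter, iter->pos);

	if (i < iter->nr && iter->keys[i] == iter->pos)
		return toy_iter_set(iter, i);
	return i ? toy_iter_set(iter, i - 1) : NULL;
}

/* prev(): returns key < iter->pos, or NULL */
static const unsigned *toy_prev(struct toy_iter *iter)
{
	size_t i = toy_lower_bound(iter, iter->pos);

	return i ? toy_iter_set(iter, i - 1) : NULL;
}
```

When no key qualifies, the toy version returns NULL and leaves the
position untouched; the real iterator may behave differently at the
ends of the btree.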
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
bcachefs used to work mostly in terms of PAGE_SIZE, not block size at
the vfs level - but that has since been fixed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Call bch2_btree_iter_verify from bch2_btree_node_iter_fix(); also verify
in btree_iter_peek_uptodate() that iter->k matches what's in the btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Any time we're modifying what's in the btree, iterators potentially have
to be updated - this one was exposed by the reflink code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The allocator needs to make sure there are buckets available on the
RESERVE_NONE freelist if at all possible - otherwise foreground IO will
get stuck.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
.key_debugcheck no longer needs to take a pointer to the btree node.
Also, try to make sure it is called wherever we're inserting or
modifying keys in the btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
With multiple iterators, if another iterator points to the key being
modified, we need to call bch2_btree_node_iter_fix() to re-unpack the
key into the iter->k.
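
A toy model of the problem, under simplifying assumptions: each
iterator caches an unpacked copy of the key it points at, so an update
made through one iterator leaves the others stale unless they are fixed
up. The struct layouts and toy_*() names below are hypothetical and
much simpler than the real btree node iterators.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NODE_KEYS	4

/* Hypothetical node: "packed" keys are just unsigned values here. */
struct toy_node {
	unsigned	vals[TOY_NODE_KEYS];
};

struct toy_node_iter {
	const struct toy_node	*node;
	unsigned		idx;
	unsigned		k;	/* cached copy of node->vals[idx] */
};

static void toy_node_iter_init(struct toy_node_iter *iter,
			       const struct toy_node *node, unsigned idx)
{
	assert(idx < TOY_NODE_KEYS);

	iter->node	= node;
	iter->idx	= idx;
	iter->k		= node->vals[idx];
}

/* Refresh the cached key if this iterator points at the modified slot. */
static void toy_node_iter_fix(struct toy_node_iter *iter,
			      const struct toy_node *node, unsigned idx)
{
	if (iter->node == node && iter->idx == idx)
		iter->k = node->vals[idx];
}

/* Modify a key in place, then fix up every iterator that may see it. */
static void toy_node_update(struct toy_node *node, unsigned idx, unsigned val,
			    struct toy_node_iter *iters, size_t nr_iters)
{
	size_t i;

	assert(idx < TOY_NODE_KEYS);

	node->vals[idx] = val;

	for (i = 0; i < nr_iters; i++)
		toy_node_iter_fix(&iters[i], node, idx);
}
```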
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Move extents instead of copying them - this way, we can iterate over
only live extents, not the entire keyspace. Also, this means we can
mostly skip running triggers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Being more rigorous about noting when the key the iterator currently
points to has changed - which should also give us a nice performance
improvement due to not having to check if we have to skip other bsets
backwards as much.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This was spotted when the move_extent() path tried to allocate a bio for
a reflink_p extent, but adding pages to the bio failed because we
overflowed bi_max_vecs. Oops.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
bch2_mark_update() was correct, but bch2_trans_mark_update() wasn't
respecting BTREE_INSERT_NOMARK_OVERWRITES - key marking/triggers really
need to be cleaned up.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Importantly, we don't want to use bch2_fs_inconsistent_on() for errors
that fsck can repair, because that will just put us in RO mode and
prevent fsck from actually fixing stuff. Probably want to get rid of it
in the future.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When iterating over reflink pointers, we use the key we just emitted to
set the iterator position - which means we have to be setting the key's
inode field as well.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The continue statement in bch2_trans_mark_extent() was wrong - by
bailing out early, we'd be constructing the wrong replicas list to
update. Also, the assertion in update_replicas() was wrong - due to
rounding with compressed extents, it is possible for sectors to be 0
sometimes.
Also, change extent_to_replicas() in replicas.c to match the replicas
list we construct in buckets.c.
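
To illustrate how rounding can produce a sector count of 0: assume, for
the sake of the example, that the on-disk sectors charged to the live
part of a compressed extent are scaled by compressed_size /
uncompressed_size and rounded up. The toy_*() helpers below are
hypothetical and are not the accounting code in buckets.c.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/*
 * Hypothetical: on-disk sectors attributed to `live` sectors of an
 * extent that was compressed from uncompressed_size down to
 * compressed_size sectors.
 */
static int64_t toy_disk_sectors(uint32_t live, uint32_t compressed_size,
				uint32_t uncompressed_size)
{
	assert(uncompressed_size && live <= uncompressed_size);

	return DIV_ROUND_UP((int64_t) live * compressed_size,
			    uncompressed_size);
}

/* Change in on-disk sectors when the live range shrinks or grows. */
static int64_t toy_disk_sectors_delta(uint32_t old_live, uint32_t new_live,
				      uint32_t compressed_size,
				      uint32_t uncompressed_size)
{
	return toy_disk_sectors(new_live, compressed_size, uncompressed_size) -
	       toy_disk_sectors(old_live, compressed_size, uncompressed_size);
}
```

With a 128 sector extent compressed down to 16 sectors, trimming one
live sector (128 -> 127) gives a delta of 0 under this model, so a
caller asserting that sectors is nonzero would trip.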
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Major simplification - gets rid of the need for marking buckets as
dirty, instead we write buckets if the in memory mark is different from
what's in the btree.
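
The idea, as a minimal sketch: rather than tracking a per-bucket dirty
flag, compare the in-memory mark against the copy in the btree and
write only where they differ. struct toy_bucket_mark and the toy_*()
functions are hypothetical stand-ins, and the "btree" here is just an
array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified bucket mark. */
struct toy_bucket_mark {
	uint8_t		gen;
	uint8_t		data_type;
	uint16_t	dirty_sectors;
	uint16_t	cached_sectors;
};

static bool toy_mark_eq(struct toy_bucket_mark a, struct toy_bucket_mark b)
{
	return a.gen		== b.gen &&
	       a.data_type	== b.data_type &&
	       a.dirty_sectors	== b.dirty_sectors &&
	       a.cached_sectors	== b.cached_sectors;
}

/*
 * Write every bucket whose in-memory mark differs from the btree copy;
 * returns the number of buckets written.
 */
static size_t toy_write_changed_buckets(const struct toy_bucket_mark *mem,
					struct toy_bucket_mark *btree,
					size_t nr)
{
	size_t i, written = 0;

	assert((mem && btree) || !nr);

	for (i = 0; i < nr; i++)
		if (!toy_mark_eq(mem[i], btree[i])) {
			btree[i] = mem[i];
			written++;
		}

	return written;
}
```

A second pass with no intervening changes writes nothing, which is what
makes a separate dirty flag unnecessary in this model.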
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This fixes a bug in the journal replay -> extent_replay_key ->
split_compressed path, when we do an update that changes alloc info but
the alloc info in the btree isn't up to date yet.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
It's only updating timestamps, so this doubly doesn't make sense. fsync
will flush the journal, if necessary.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>