Commit Graph

39 Commits

Author SHA1 Message Date
Kent Overstreet
94817db956 bcachefs: Correctly validate k->u64s in btree node read path
validate_bset_keys() never properly validated k->u64s; it checked if it
was 0, but not if it was smaller than keys for the given packed format;
this fixes that small oversight.

This patch was backported, so it's adding quite a few error enums so
that they don't get renumbered and we don't have confusing gaps.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:21:04 -04:00
Kent Overstreet
a8958a1a95 bcachefs: bkey_copy() is no longer a macro
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-05 13:12:18 -05:00
Kent Overstreet
5cfd69775e bcachefs: Array bounds fixes
It's no longer legal to use a zero size array as a flexible array
member - this causes UBSAN to complain.

This patch switches our zero size arrays to normal flexible array
members when possible, and inserts casts in other places (e.g. where we
use the zero size array as a marker partway through an array).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:12 -04:00
Kent Overstreet
e08e63e44e bcachefs: BCH_COMPAT_bformat_overflow_done no longer required
Awhile back, we changed bkey_format generation to ensure that the packed
representation could never represent fields larger than the unpacked
representation.

This was to ensure that bkey_packed_successor() always gave a sensible
result, but in the current code bkey_packed_successor() is only used in
a debug assertion - not for anything important.

This kills the requirement that we've gotten rid of those weird bkey
formats, and instead changes the assertion to check if we're dealing
with an old weird bkey format.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:09 -04:00
Kent Overstreet
6c6439650e bcachefs: bkey_format helper improvements
- add a to_text() method for bkey_format

 - convert bch2_bkey_format_validate() to modern error message style,
   where we pass a printbuf for the error string instead of returning a
   static string

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:09 -04:00
Kent Overstreet
ab158fce47 bcachefs: Converting to typed bkeys is now allowed for err, null ptrs
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:00 -04:00
Kent Overstreet
b65499b7b1 bcachefs: bch2_btree_node_ondisk_to_text()
Pulling out a helper from cmd_list.c, as the rest is being rewritten in
Rust but we're not ready to rewrite lower-level btree code in Rust.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:56 -04:00
Kent Overstreet
ac2ccddc26 bcachefs: Drop some anonymous structs, unions
Rust bindgen doesn't cope well with anonymous structs and unions. This
patch drops the fancy anonymous structs & unions in bkey_i that let us
use the same helpers for bkey_i and bkey_packed; since bkey_packed is an
internal type that's never exposed to outside code, it's only a minor
inconvenienc.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:55 -04:00
Kent Overstreet
321bdc73f3 bcachefs: bkey_min(), bkey_max()
Parallel to bpos_min(), bpos_max() - trivial refactoring.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:48 -04:00
Kent Overstreet
e88a75ebe8 bcachefs: New bpos_cmp(), bkey_cmp() replacements
This patch introduces
 - bpos_eq()
 - bpos_lt()
 - bpos_le()
 - bpos_gt()
 - bpos_ge()

and equivalent replacements for bkey_cmp().

Looking at the generated assembly these could probably be improved
further, but we already see a significant code size improvement with
this patch.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:47 -04:00
Kent Overstreet
6530d89e0a bcachefs: extents no longer require special handling for packing
Extent overwrite used to  be handled differently, underneath the
journaling layer and within the core btree code. This imposed
restrictions on bkey packing/packed formats, which no longer apply.

This patch deletes those restrictions.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:47 -04:00
Kent Overstreet
0aba9eba76 bcachefs: Inline bch2_bkey_format_add_key()
This is only called in two places, and when it's used we use it in a
tight loop - it's definitely worth inlining.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:46 -04:00
Kent Overstreet
77671e8fff bcachefs: Move bkey bkey_unpack_key() to bkey.h
Long ago, bkey_unpack_key() was added to bset.h instead of bkey.h
because bkey.h didn't include btree_types.h, which it needs for the
compiled unpack function.

This patch finally moves it to the proper location.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:45 -04:00
Kent Overstreet
3e3e02e6bc bcachefs: Assorted checkpatch fixes
checkpatch.pl gives lots of warnings that we don't want - suggested
ignore list:

 ASSIGN_IN_IF
 UNSPECIFIED_INT	- bcachefs coding style prefers single token type names
 NEW_TYPEDEFS		- typedefs are occasionally good
 FUNCTION_ARGUMENTS	- we prefer to look at functions in .c files
			  (hopefully with docbook documentation), not .h
			  file prototypes
 MULTISTATEMENT_MACRO_USE_DO_WHILE
			- we have _many_ x-macros and other macros where
			  we can't do this

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:44 -04:00
Kent Overstreet
d0b50524f1 bcachefs: bch2_bkey_packed_to_binary_text()
For debugging the eytzinger search tree code, and low level bkey packing
code, it can be helpful to see things in binary: this patch improves our
helpers for doing so.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:38 -04:00
Kent Overstreet
15bc0948e7 bcachefs: Add an overflow check in set_bkey_val_u64s()
For now this is just a BUG_ON() - we may want to change this to return
an error in the future.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:38 -04:00
Kent Overstreet
7a7d17b2f7 bcachefs: Whiteouts for snapshots
This patch adds KEY_TYPE_whiteout, a new type of whiteout for snapshots,
when we're deleting and the key being deleted is in an ancestor
snapshot - and updates the transaction update/commit path to use it.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:12 -04:00
Kent Overstreet
807dda8c83 bcachefs: Kill bpos_diff() XXX check for perf regression
This improves the btree iterator lookup code by using
trans_for_each_iter_inorder().

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
2023-10-22 17:09:11 -04:00
Kent Overstreet
e751c01a8e bcachefs: Start using bpos.snapshot field
This patch starts treating the bpos.snapshot field like part of the key
in the btree code:

* bpos_successor() and bpos_predecessor() now include the snapshot field
* Keys in btrees that will be using snapshots (extents, inodes, dirents
  and xattrs) now always have their snapshot field set to U32_MAX

The btree iterator code gets a new flag, BTREE_ITER_ALL_SNAPSHOTS, that
determines whether we're iterating over keys in all snapshots or not -
internally, this controlls whether bkey_(successor|predecessor)
increment/decrement the snapshot field, or only the higher bits of the
key.

We add a new member to struct btree_iter, iter->snapshot: when
BTREE_ITER_ALL_SNAPSHOTS is not set, iter->pos.snapshot should always
equal iter->snapshot, which will be 0 for btrees that don't use
snapshots, and alsways U32_MAX for btrees that will use snapshots
(until we enable snapshot creation).

This patch also introduces a new metadata version number, and compat
code for reading from/writing to older versions - this isn't a forced
upgrade (yet).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:57 -04:00
Kent Overstreet
4cf91b0270 bcachefs: Split out bpos_cmp() and bkey_cmp()
With snapshots, we're going to need to differentiate between comparisons
that should and shouldn't include the snapshot field. bpos_cmp is now
the comparison function that does include the snapshot field, used by
core btree code.

Upper level filesystem code generally does _not_ want to compare against
the snapshot field - that code wants keys to compare as equal even when
one of them is in an ancestor snapshot.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:57 -04:00
Kent Overstreet
0390ea8ad8 bcachefs: Drop bkey noops
Bkey noops were introduced to deal with trimming inline data extents in
place in the btree: if the u64s field of a bkey was 0, that u64 was a
noop and we'd start looking for the next bkey immediately after it.

But extent handling has been lifted above the btree - we no longer
modify existing extents in place in the btree, and the compatibilty code
for old style extent btree nodes is gone, so we can completely drop this
code.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:57 -04:00
Kent Overstreet
61a19ce425 bcachefs: Fix bpos_diff()
Previously, bpos_diff() did not handle borrows correctly. Minor thing
considering how it was used, but worth fixing.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:55 -04:00
Kent Overstreet
79f88eba01 bcachefs: Rename KEY_TYPE_whiteout -> KEY_TYPE_hash_whiteout
Snapshots are going to need a different whiteout key type. Also, switch
to using BCH_BKEY_TYPES() to define the bkey value accessors.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:55 -04:00
Kent Overstreet
7f4e1d5d0f bcachefs: KEY_TYPE_alloc_v2
This introduces a new version of KEY_TYPE_alloc, which uses the new
varint encoding introduced for inodes. This means we'll eventually be
able to support much larger bucket sizes (for SMR devices), and the
read/write time fields are expanded to 64 bits - which will be used in
the next patch to get rid of the periodic rescaling of those fields.

Also, for buckets that are members of erasure coded stripes, this adds
persistent fields for the index of the stripe they're members of and the
stripe redundancy. This is part of work to get rid of having to scan and
read into memory the alloc and stripes btrees at mount time.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:52 -04:00
Kent Overstreet
33ccd7188e bcachefs: Don't allocate stripes at POS_MIN
In the future, stripe index 0 will be a sentinal value. This patch
doesn't disallow stripes at POS_MIN yet, leaving that for when we do the
on disk format changes.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:51 -04:00
Kent Overstreet
811d2bcd85 bcachefs: Drop typechecking from bkey_cmp_packed()
This only did anything in two places, and those can just be replaced
wiht bkey_cmp_left_packed()).

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:46 -04:00
Kent Overstreet
801a3de642 bcachefs: Indirect inline data extents
When inline data extents were added, reflink was forgotten about - we
need indirect inline data extents for reflink + inline data to work
correctly.

This patch adds them, and a new feature bit that's flipped when they're
used.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:45 -04:00
Kent Overstreet
47143a75e0 bcachefs: Disable extent merging
Extent merging is currently broken, and will be reimplemented
differently soon - right now it only happens when btree nodes are being
compacted, which makes it difficult to test.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:37 -04:00
Kent Overstreet
548b3d209f bcachefs: btree_ptr_v2
Add a new btree ptr type which contains the sequence number (random 64
bit cookie, actually) for that btree node - this lets us verify that
when we read in a btree node it really is the btree node we wanted.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:35 -04:00
Kent Overstreet
4be1a412ea bcachefs: Inline data extents
This implements extents that have their data inline, in the value,
instead of the bkey value being pointers to the data - and the read and
write paths are updated to read from these new extent types and write
them out, when the write size is small enough.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:32 -04:00
Kent Overstreet
ad44bdc351 bcachefs: bkey noops
For upcoming inline data extents, we're going to need to be able to
shorten the value of existing bkeys in the btree - and to make that work
we're going to be able to need to pad out the space the value previously
took up with something.

This patch changes the various code that iterates over bkeys to handle
k->u64s == 0 as meaning "skip the next 8 bytes".

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:32 -04:00
Kent Overstreet
b8098f36dd bcachefs: Don't use rep movsq for small memcopies
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:30 -04:00
Kent Overstreet
76426098e4 bcachefs: Reflink
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:25 -04:00
Kent Overstreet
99aaf57000 bcachefs: Refactor various code to not be extent specific
With reflink, various code now has to handle both KEY_TYPE_extent
or KEY_TYPE_reflink_v - so, convert it to be generic across all keys
with pointers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:24 -04:00
Kent Overstreet
1ae9733456 bcachefs: Don't overflow stack in bch2_extent_merge_inline()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:23 -04:00
Kent Overstreet
3ea2b1e128 bcachefs: cmp_int()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:20 -04:00
Kent Overstreet
26609b619f bcachefs: Make bkey types globally unique
this lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.

Also improve the on disk format versioning stuff.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:12 -04:00
Kent Overstreet
cd575ddf57 bcachefs: Erasure coding
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:11 -04:00
Kent Overstreet
1c6fdbd8f2 bcachefs: Initial commit
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:07 -04:00