The fragmentation_lru field hasn't been needed since we reworked the LRU
btrees to use the btree write buffer; previously it was used to resolve
collisions, but the revised LRU btree uses the backpointer (the bucket)
as part of the key.
It should have been deleted at the time of the LRU rework; since it
wasn't, that left places for bugs to hide, in check/repair.
This fixes LRU fsck on a filesystem image helpfully provided by a user
who disappeared before I could get his name for the reported-by.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Previously, check_inode() would delete unlinked inodes if they weren't
on the deleted list - this code dating from before there was a deleted
list.
But, if we crash during a logged op (truncate or finsert/fcollapse) of
an unlinked file, logged op resume will get confused if the inode has
already been deleted - instead, just add it to the deleted list if it
needs to be there; delete_dead_inodes runs after logged op resume.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
BCH_SB_ERRS() has a field for the actual enum val so that we can reorder
to reorganize, but the way BCH_SB_ERR_MAX was defined didn't allow for
this.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Most or all errors will be autofix in the future, we're currently just
doing the ones that we know are well tested.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
errors that are known to always be safe to fix should be autofix: this
should be most errors even at this point, but that will need some
thorough review.
note that errors are still logged in the superblock, so we'll still know
that they happened.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Implement bch2_accounting_invalid(); check for junk at the end, and
replicas accounting entries in particular need to be checked or we'll
pop asserts later.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Consolidate duplicated checks for extents/dirents/xattrs - these keys
should all have a corresponding inode of the correct type.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
fragmentation_lru derives from dirty_sectors, and wasn't being checked.
Co-developed-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
i.e. the start of automatic self healing:
If errors=continue or fix_safe, we now automatically fix simple errors
without user intervention.
New error action option: fix_safe
This replaces the existing errors=ro option, which gets a new slot, i.e.
existing errors=ro users now get errors=fix_safe.
This is currently only enabled for a limited set of errors - initially
just disk accounting; errors we would never not want to fix, and we
don't want to require user intervention (i.e. to make sure a bug report
gets filed).
Errors will still be counted in the superblock, so we (developers) will
still know they've been occuring if a bug report gets filed (as bug
reports typically include the errors superblock section).
Eventually we'll be enabling this for a much wider set of errors, after
we've done thorough error injection testing.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
LRUs only have 48 bits for the time field (i.e. LRU order); thus we need
overflow checks and guards.
Reported-by: syzbot+df3bf3f088dcaa728857@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We can only handle btree IDs up to 62, since the btree id (plus the type
for interior btree nodes) has to fit ito a 64 bit bitmask - check for
invalid ones to avoid invalid shifts later.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>