Since the scrub path has been fully moved to scrub_stripe based facilities,
no more scrub_bio would be submitted.
Thus we can remove it completely. This involves:
- SCRUB_SECTORS_PER_BIO macro
- SCRUB_BIOS_PER_SCTX macro
- SCRUB_MAX_PAGES macro
- BTRFS_MAX_MIRRORS macro
- scrub_bio structure
- scrub_ctx::bios member
- scrub_ctx::curr member
- scrub_ctx::bios_in_flight member
- scrub_ctx::workers_pending member
- scrub_ctx::list_lock member
- scrub_ctx::list_wait member
- function scrub_bio_end_io_worker()
- function scrub_pending_bio_inc()
- function scrub_pending_bio_dec()
- function scrub_throttle()
- function scrub_submit()
- function scrub_find_csum()
- function drop_csum_range()
- Some unnecessary flush and scrub pauses
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The scrub_block and scrub_sector structures are used to represent a bunch
of sectors for scrub, but now they are fully replaced by scrub_stripe in
one go, so we can remove them. This involves:
- structure scrub_block
- structure scrub_sector
- structure scrub_page_private
- function attach_scrub_page_private()
- function detach_scrub_page_private()
Now we no longer need to use page::private to handle subpage.
- function alloc_scrub_block()
- function alloc_scrub_sector()
- function scrub_sector_get_page()
- function scrub_sector_get_page_offset()
- function scrub_sector_get_kaddr()
- function bio_add_scrub_sector()
- function scrub_checksum_data()
- function scrub_checksum_tree_block()
- function scrub_checksum_super()
- function scrub_check_fsid()
- function scrub_block_get()
- function scrub_block_put()
- function scrub_sector_get()
- function scrub_sector_put()
- function scrub_bio_end_io()
- function scrub_block_complete()
- function scrub_add_sector_to_rd_bio()
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The old scrub code has a different entry point to verify the content, and
since we have removed the writeback path, we can now start removing the
re-check part, including:
- scrub_recover structure
- scrub_sector::recover member
- function scrub_setup_recheck_block()
- function scrub_recheck_block()
- function scrub_recheck_block_checksum()
- function scrub_repair_block_group_good_copy()
- function scrub_repair_sector_from_good_copy()
- function scrub_is_page_on_raid56()
- function full_stripe_lock()
- function search_full_stripe_lock()
- function get_full_stripe_logical()
- function insert_full_stripe_lock()
- function lock_full_stripe()
- function unlock_full_stripe()
- btrfs_block_group::full_stripe_locks_root member
- btrfs_full_stripe_locks_tree structure
This infrastructure ensured that RAID56 scrub handled recovery and P/Q
scrub correctly.
It is no longer needed: before P/Q scrub we will wait for all
the involved data stripes to be scrubbed first, and the RAID56 code has
an internal lock to ensure no race in the same full stripe.
- function scrub_print_warning()
- function scrub_get_recover()
- function scrub_put_recover()
- function scrub_handle_errored_block()
- function scrub_setup_recheck_block()
- function scrub_bio_wait_endio()
- function scrub_submit_raid56_bio_wait()
- function scrub_recheck_block_on_raid56()
- function scrub_recheck_block()
- function scrub_recheck_block_checksum()
- function scrub_repair_block_from_good_copy()
- function scrub_repair_sector_from_good_copy()
And two more functions exported temporarily for later cleanup:
- alloc_scrub_sector()
- alloc_scrub_block()
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since the whole scrub path has been switched to scrub_stripe based
solution, the old writeback path can be removed completely, which
involves:
- scrub_ctx::wr_curr_bio member
- scrub_ctx::flush_all_writes member
- function scrub_write_block_to_dev_replace()
- function scrub_write_sector_to_dev_replace()
- function scrub_add_sector_to_wr_bio()
- function scrub_wr_submit()
- function scrub_wr_bio_end_io()
- function scrub_wr_bio_end_io_worker()
And one more function needs to be exported temporarily:
- scrub_sector_get()
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The structure scrub_parity is used to indicate that some extents are
scrubbed for the purpose of RAID56 P/Q scrubbing.
Since the whole RAID56 P/Q scrubbing path has been replaced with new
scrub_stripe infrastructure, and we no longer need to use scrub_parity
to modify the behavior of data stripes, we can remove it completely.
This removal involves:
- scrub_parity_workers
Now only one worker would be utilized, scrub_workers, to do the read
and repair.
All writeback would happen at the main scrub thread.
- scrub_block::sparity member
- scrub_parity structure
- function scrub_parity_get()
- function scrub_parity_put()
- function scrub_free_parity()
- function __scrub_mark_bitmap()
- function scrub_parity_mark_sectors_error()
- function scrub_parity_mark_sectors_data()
These helpers are no longer needed, as scrub_stripe has its own bitmaps
and we can use bitmap helpers to get the error/data status.
- scrub_parity_bio_endio()
- scrub_parity_check_and_repair()
- function scrub_sectors_for_parity()
- function scrub_extent_for_parity()
- function scrub_raid56_data_stripe_for_parity()
- function scrub_raid56_parity()
The new code would reuse the scrub read-repair and writeback path, and
just skip the dev-replace phase.
And scrub_stripe infrastructure allows us to submit and wait for those
data stripes before scrubbing P/Q, without extra infrastructure.
The following two functions are temporarily exported for later cleanup:
- scrub_find_csum()
- scrub_add_sector_to_rd_bio()
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Implement the only missing part for scrub: RAID56 P/Q stripe scrub.
The workflow is pretty straightforward for the new function,
scrub_raid56_parity_stripe():
- Go through the regular scrub path for each data stripe
- Wait for the verification and repair to finish
- Writeback the repaired sectors to data stripes
- Make sure all stripes are properly repaired
If we have sectors unrepaired, we cannot continue, or we could further
corrupt the P/Q stripe.
- Submit the rbio for P/Q stripe
The dev-replace would be handled inside
raid56_parity_submit_scrub_rbio() path.
- Wait for the above bio to finish
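The "make sure all stripes are properly repaired" step above can be
illustrated with a minimal user-space sketch. The structure and function
names here are hypothetical and not taken from the patch, and the use of
-EIO is this sketch's own choice of error code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-data-stripe result after read-repair and writeback. */
struct data_stripe_result {
	unsigned long error_bitmap; /* bit i set: sector i is still bad */
};

/*
 * Gate for the P/Q step: return 0 when every data stripe is clean, or
 * -EIO (an assumption of this sketch) when any sector is still
 * unrepaired, since continuing could further corrupt the P/Q stripe.
 */
static int data_stripes_ready_for_parity(const struct data_stripe_result *res,
					  size_t nr_data_stripes)
{
	assert(res != NULL);
	for (size_t i = 0; i < nr_data_stripes; i++) {
		if (res[i].error_bitmap != 0)
			return -EIO;
	}
	return 0;
}
```

Only when this check passes would the rbio for the P/Q stripe be
submitted.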
Although the old code is no longer used, we still keep the declaration,
as the cleanup can be several times larger than this patch itself.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Switch scrub_simple_mirror() to the new scrub_stripe infrastructure.
Since scrub_simple_mirror() is the core part of scrub (only RAID56
P/Q stripes don't utilize it), we can get rid of a big chunk of code,
mostly scrub_extent(), scrub_sectors() and directly called functions.
There is a functionality change:
- Scrub speed throttle now only affects reads on the scrubbing device
Writes (for repair and replace) and reads from other mirrors won't
be limited by the set limits.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The new helper, queue_scrub_stripe(), would try to queue a stripe for
scrub. If all stripes are already in use, we will submit all the
existing ones and wait for them to finish.
Currently we would queue up to 8 stripes, enlarging the blocksize to
512KiB to improve performance. Sectors repaired on zoned devices need to
be relocated instead of fixed in place.
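The queueing behavior can be sketched in user-space C as below. This is
only an illustration under stated assumptions: struct stripe_queue,
queue_stripe() and flush_stripes() are hypothetical names, the "IO" is
just a byte counter, and STRIPE_LEN stands in for the 64KiB
BTRFS_STRIPE_LEN, so that 8 stripes add up to 512KiB:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STRIPE_LEN        (64u * 1024u) /* stands in for BTRFS_STRIPE_LEN */
#define STRIPES_PER_BATCH 8u            /* hypothetical name for the queue depth */

struct stripe_queue {
	uint64_t logical[STRIPES_PER_BATCH]; /* start of each queued stripe */
	unsigned int nr_queued;              /* slots currently in use */
	unsigned int nr_flushes;             /* batches submitted so far */
	uint64_t bytes_submitted;            /* total bytes handed to "IO" */
};

/* Stand-in for "submit all the existing ones and wait for them to finish". */
static void flush_stripes(struct stripe_queue *q)
{
	assert(q != NULL);
	if (q->nr_queued == 0)
		return;
	q->bytes_submitted += (uint64_t)q->nr_queued * STRIPE_LEN;
	q->nr_flushes++;
	q->nr_queued = 0;
}

/* Queue one stripe; if every slot is already in use, flush the batch first. */
static void queue_stripe(struct stripe_queue *q, uint64_t logical)
{
	assert(q != NULL);
	if (q->nr_queued == STRIPES_PER_BATCH)
		flush_stripes(q);
	q->logical[q->nr_queued++] = logical;
}
```

Queueing a ninth stripe triggers one flush of 8 * 64KiB = 512KiB and
leaves a single stripe pending for the next batch.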
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a new helper, scrub_write_sectors(), to submit write bios for
specified sectors to the target disk.
There are several differences compared to read path:
- Utilize btrfs_submit_scrub_write()
Now we still rely on the @mirror_num based writeback, but the
requirement is also a little different than regular writeback or read,
thus we have to call btrfs_submit_scrub_write().
- We cannot write the full stripe back
We can only write the sectors we have. There will be two call sites
later, one for repaired sectors, one for all utilized sectors of
dev-replace.
Thus the callers should specify their own write_bitmap.
This function only submits the bios and will not wait for them, except
in the zoned case.
The caller must explicitly wait for the IO to finish.
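How a caller-supplied write_bitmap maps to individual writes can be
sketched as follows. This is a user-space illustration, not the patch's
code: struct write_range and collect_write_ranges() are hypothetical,
and 16 sectors per stripe assumes 4KiB sectors in a 64KiB stripe:

```c
#include <assert.h>
#include <stddef.h>

#define SECTORS_PER_STRIPE 16u /* 64KiB stripe / 4KiB sectors, an assumption */

struct write_range {
	unsigned int first_sector;
	unsigned int nr_sectors;
};

/*
 * Turn the caller's write_bitmap into contiguous ranges, one per "bio".
 * Returns the number of ranges stored in @out, which must have room for
 * SECTORS_PER_STRIPE entries.
 */
static size_t collect_write_ranges(unsigned long write_bitmap,
				   struct write_range *out)
{
	size_t nr = 0;
	int in_range = 0;

	assert(out != NULL);
	for (unsigned int i = 0; i < SECTORS_PER_STRIPE; i++) {
		if (!(write_bitmap & (1UL << i))) {
			in_range = 0; /* a gap ends the current range */
			continue;
		}
		if (in_range) {
			out[nr - 1].nr_sectors++; /* extend the current range */
		} else {
			out[nr].first_sector = i; /* start a new range */
			out[nr].nr_sectors = 1;
			nr++;
			in_range = 1;
		}
	}
	return nr;
}
```

Sectors that are not set in the bitmap are never written, which is why
the full stripe cannot simply be written back in one go.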
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The new helper, scrub_stripe_read_repair_worker(), would handle the
read-repair part:
- Wait for the previously submitted read IO to finish
- Verify the contents of the stripe
- Go through the remaining mirrors, using as large blocksize as possible
At this stage, we just read out all the failed sectors from each
mirror and re-verify.
If there are no more failed sectors, we can exit.
- Go through all mirrors again, sector-by-sector
This time, we read sector by sector. This is to address cases where
one bad sector mismatches the drive's internal checksum and causes the
whole read range to fail.
We put this recovery method as the last resort, as sector-by-sector
reading is slow, and reading from other mirrors may have already fixed
the errors.
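The two repair passes can be modelled in user-space C as below. This is
a simplified sketch under stated assumptions: struct mirror_set and the
three functions are hypothetical, a set bit in bad[m] means the sector
cannot be read from mirror m, and the first pass is reduced to a single
large read per mirror that fails as a whole if it touches any bad
sector:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: bit i set in bad[m] means sector i is unreadable on mirror m. */
struct mirror_set {
	const unsigned long *bad;
	size_t nr_mirrors;
};

/* Pass 1: one large read per mirror; any bad sector fails the whole read. */
static unsigned long repair_bulk(const struct mirror_set *ms, size_t skip,
				 unsigned long error_bitmap)
{
	for (size_t m = 0; m < ms->nr_mirrors && error_bitmap; m++) {
		if (m == skip)
			continue;
		if ((ms->bad[m] & error_bitmap) == 0)
			error_bitmap = 0; /* the whole range was read and verified */
	}
	return error_bitmap;
}

/* Pass 2: sector by sector; each readable sector is repaired on its own. */
static unsigned long repair_per_sector(const struct mirror_set *ms, size_t skip,
				       unsigned long error_bitmap)
{
	for (size_t m = 0; m < ms->nr_mirrors && error_bitmap; m++) {
		if (m == skip)
			continue;
		error_bitmap &= ms->bad[m]; /* only sectors bad here remain */
	}
	return error_bitmap;
}

/* Returns the bitmap of sectors that are still bad after both passes. */
static unsigned long read_repair(const struct mirror_set *ms, size_t first_mirror,
				 unsigned long error_bitmap)
{
	assert(ms != NULL && first_mirror < ms->nr_mirrors);
	if (!error_bitmap)
		return 0;
	error_bitmap = repair_bulk(ms, first_mirror, error_bitmap);
	if (!error_bitmap)
		return 0;
	return repair_per_sector(ms, first_mirror, error_bitmap);
}
```

In this model, two failed sectors whose good copies sit on different
mirrors cannot be fixed by the large reads, but are fixed by the
sector-by-sector pass.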
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The new helper, scrub_verify_stripe(), shares the same main workflow as
the old scrub code.
The major differences are:
- How pages/page_offset is grabbed
Everything can be grabbed from scrub_stripe easily.
- When error report happens
Currently the helper only verifies the sectors, not really doing any
error reporting.
The error reporting would be done after we have done the repair.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The new helper, scrub_verify_one_metadata(), is almost the same as
scrub_checksum_tree_block().
The difference is in how we grab the pages from other structures.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The new helper will search the extent tree to find the first extent of a
logical range, then fill the sectors array by two loops:
- Loop 1 to fill common bits and metadata generation
- Loop 2 to fill csum data (only for data bgs)
This loop will use the new btrfs_lookup_csums_bitmap() to fill
the full csum buffer, and set scrub_sector_verification::csum.
With all the needed info filled by this function, later we only need to
submit and verify the stripe.
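The two loops can be sketched in user-space C as follows. All names
here (struct sector_info, struct stripe_info, fill_common(),
fill_csums()) are hypothetical, the 4-byte csum size is an assumption,
and the csum_bitmap argument stands in for the result of the csum
lookup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SECTORS 16u /* sectors per stripe, an assumption */
#define CSUM_SIZE  4u  /* bytes per csum, an assumption */

struct sector_info {
	bool is_metadata;
	uint64_t generation; /* only meaningful for metadata */
	const uint8_t *csum; /* only for data; NULL when there is no csum */
};

struct stripe_info {
	unsigned long extent_bitmap; /* sectors covered by an extent */
	struct sector_info sectors[NR_SECTORS];
	uint8_t csums[NR_SECTORS * CSUM_SIZE]; /* flat csum buffer */
};

/* Loop 1: fill the common bits and the metadata generation for one extent. */
static void fill_common(struct stripe_info *s, unsigned int first,
			unsigned int nr, bool is_metadata, uint64_t generation)
{
	assert(s != NULL && first + nr <= NR_SECTORS);
	for (unsigned int i = first; i < first + nr; i++) {
		s->extent_bitmap |= 1UL << i;
		s->sectors[i].is_metadata = is_metadata;
		s->sectors[i].generation = is_metadata ? generation : 0;
	}
}

/* Loop 2: point each data sector that has a csum into the flat buffer. */
static void fill_csums(struct stripe_info *s, unsigned long csum_bitmap)
{
	assert(s != NULL);
	for (unsigned int i = 0; i < NR_SECTORS; i++) {
		if (!((csum_bitmap >> i) & 1UL) || s->sectors[i].is_metadata)
			continue;
		s->sectors[i].csum = &s->csums[i * CSUM_SIZE];
	}
}
```

Data sectors without a csum keep a NULL pointer, so the verification
step can tell them apart from sectors that must be checked.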
Here we temporarily export the helper to avoid warning on unused static
function.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch introduces the following structures:
- scrub_sector_verification
Contains all the needed info to verify one sector (data or metadata).
- scrub_stripe
Contains all needed members (mostly bitmap based) to scrub one stripe
(with a length of BTRFS_STRIPE_LEN).
The basic idea is, we keep the existing per-device scrub behavior, but
merge all the scrub_bio/scrub_block into one generic structure, and read
the full BTRFS_STRIPE_LEN stripe on the first try.
This means we will read some sectors which are not scrub target, but
that's fine. At dev-replace time we only writeback the utilized and good
sectors, and for read-repair we only writeback the repaired sectors.
With every read submitted in BTRFS_STRIPE_LEN units, the need for complex
bio shaping would be gone.
However, to get the same performance as the old scrub behavior, we would
need to submit the initial read for two stripes at once.
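Which sectors get written back in each case can be expressed with a few
bitmap operations. The sketch below is a user-space illustration only:
struct stripe_bitmaps and both helpers are hypothetical, and a plain
unsigned long stands in for the bitmap members of the stripe:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the per-stripe bitmaps, one bit per sector. */
struct stripe_bitmaps {
	unsigned long extent_bitmap;     /* sectors that belong to some extent */
	unsigned long init_error_bitmap; /* errors seen on the first read */
	unsigned long error_bitmap;      /* errors still present after repair */
};

/* Read-repair: only write back sectors that were bad and are now good. */
static unsigned long repaired_sectors(const struct stripe_bitmaps *s)
{
	assert(s != NULL);
	return s->init_error_bitmap & ~s->error_bitmap;
}

/* Dev-replace: only write back sectors that are utilized and good. */
static unsigned long replace_sectors(const struct stripe_bitmaps *s)
{
	assert(s != NULL);
	return s->extent_bitmap & ~s->error_bitmap;
}
```

Sectors that were read only because they fall inside the stripe, but do
not belong to any extent, never appear in either result.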
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Move these out of ctree.h into scrub.h to cut down on code in ctree.h.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>