dm-raid
-------

The device-mapper RAID (dm-raid) target provides a bridge from DM to MD.
It allows the MD RAID drivers to be accessed using a device-mapper
interface.

The target is named "raid" and it accepts the following parameters:

  <raid_type> <#raid_params> <raid_params> \
    <#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]

<raid_type>:
  raid1     RAID1 mirroring
  raid4     RAID4 dedicated parity disk
  raid5_la  RAID5 left asymmetric
            - rotating parity 0 with data continuation
  raid5_ra  RAID5 right asymmetric
            - rotating parity N with data continuation
  raid5_ls  RAID5 left symmetric
            - rotating parity 0 with data restart
  raid5_rs  RAID5 right symmetric
            - rotating parity N with data restart
  raid6_zr  RAID6 zero restart
            - rotating parity zero (left-to-right) with data restart
  raid6_nr  RAID6 N restart
            - rotating parity N (right-to-left) with data restart
  raid6_nc  RAID6 N continue
            - rotating parity N (right-to-left) with data continuation
  raid10    Various RAID10 inspired algorithms chosen by additional params
            - RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
            - RAID1E: Integrated Adjacent Stripe Mirroring
            - and other similar RAID10 variants

  Reference: Chapter 4 of
  http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf

<#raid_params>: The number of parameters that follow.

<raid_params> consists of
    Mandatory parameters:
        <chunk_size>: Chunk size in sectors.  This parameter is often known as
                      "stripe size".  It is the only mandatory parameter and
                      is placed first.

    followed by optional parameters (in any order):
        [sync|nosync]   Force or prevent RAID initialization.

        [rebuild <idx>] Rebuild drive number idx (first drive is 0).

        [daemon_sleep <ms>]
                Interval between runs of the bitmap daemon that
                clear bits.  A longer interval means less bitmap I/O but
                resyncing after a failure is likely to take longer.

        [min_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [max_recovery_rate <kB/sec/disk>]  Throttle RAID initialization
        [write_mostly <idx>]               Drive index is write-mostly
        [max_write_behind <sectors>]       See '--write-behind=' (man mdadm)
        [stripe_cache <sectors>]           Stripe cache size (higher RAIDs only)
        [region_size <sectors>]
                The region_size multiplied by the number of regions is the
                logical size of the array.  The bitmap records the device
                synchronisation state for each region.

        [raid10_copies   <# copies>]
        [raid10_format   near]
                These two options are used to alter the default layout of
                a RAID10 configuration.  The number of copies can be
                specified, but the default is 2.  There are other variations
                to how the copies are laid down - the default and only current
                option is "near".  Near copies are what most people think of
                with respect to mirroring.  If these options are left
                unspecified, or 'raid10_copies 2' and/or 'raid10_format near'
                are given, then the layouts for 2, 3 and 4 devices are:

                2 drives         3 drives          4 drives
                --------         ----------        --------------
                A1  A1           A1  A1  A2        A1  A1  A2  A2
                A2  A2           A2  A3  A3        A3  A3  A4  A4
                A3  A3           A4  A4  A5        A5  A5  A6  A6
                A4  A4           A5  A6  A6        A7  A7  A8  A8
                ..  ..           ..  ..  ..        ..  ..  ..  ..

                The 2-device layout is equivalent to 2-way RAID1.  The
                4-device layout is what a traditional RAID10 would look
                like.  The 3-device layout is what might be called a
                'RAID1E - Integrated Adjacent Stripe Mirroring'.

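The "near" layouts tabulated above can be reproduced with a short sketch.
It assumes that the copies of each chunk occupy consecutive positions, and
that positions fill one row across all drives before moving to the next
row.  The function name near_layout and its arguments are illustrative
only; they are not part of dm-raid or MD.

```python
def near_layout(num_devices, copies=2, rows=4):
    """Return `rows` rows of chunk labels for a "near" RAID10 layout.

    Assumption: position p (counted row by row across the drives)
    holds chunk number p // copies, numbered from A1.
    """
    if not 1 <= copies <= num_devices:
        raise ValueError("copies must be between 1 and num_devices")
    layout = []
    for row in range(rows):
        line = []
        for dev in range(num_devices):
            position = row * num_devices + dev
            line.append("A%d" % (position // copies + 1))
        layout.append(line)
    return layout


if __name__ == "__main__":
    # Print the 2-, 3- and 4-drive layouts, one after another.
    for n in (2, 3, 4):
        print("%d drives" % n)
        for line in near_layout(n):
            print("  " + "  ".join(line))
```

With 3 drives and 2 copies, the second copy of A2 wraps onto the next row,
which gives the RAID1E-style pattern shown in the table.
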
<#raid_devs>: The number of devices composing the array.
        Each device consists of two entries.  The first is the device
        containing the metadata (if any); the second is the one containing the
        data.

        If a drive has failed or is missing at creation time, a '-' can be
        given for both the metadata and data drives for a given position.


Example tables
--------------
# RAID4 - 4 data drives, 1 parity (no metadata devices)
# No metadata devices specified to hold superblock/bitmap info
# Chunk size of 1MiB
# (Lines separated for easy reading)

0 1960893648 raid \
        raid4 1 2048 \
        5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

# RAID4 - 4 data drives, 1 parity (with metadata devices)
# Chunk size of 1MiB, force RAID initialization,
#       min recovery rate at 20 kiB/sec/disk

0 1960893648 raid \
        raid4 4 2048 sync min_recovery_rate 20 \
        5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82

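As a rough illustration of how the fields in the example tables fit
together, the following Python sketch assembles a table line from its
parts.  The helper name build_raid_table and its argument names are
hypothetical.  It only joins the fields in the documented order and counts
<#raid_params> and <#raid_devs>; it does not validate the result.

```python
def build_raid_table(start, length, raid_type, chunk_size, devices,
                     options=()):
    """Assemble a one-line dm-raid table (illustrative helper only).

    devices: list of (metadata_dev, data_dev) pairs; None stands for a
             missing device and is written as '-'.
    options: flat sequence of optional raid_params, for example
             ["sync", "min_recovery_rate", 20].
    """
    # chunk_size is the only mandatory raid_param and is placed first.
    raid_params = [str(chunk_size)] + [str(opt) for opt in options]

    dev_fields = []
    for metadata_dev, data_dev in devices:
        dev_fields.append(metadata_dev if metadata_dev else "-")
        dev_fields.append(data_dev if data_dev else "-")

    fields = [str(start), str(length), "raid", raid_type,
              str(len(raid_params))] + raid_params
    fields += [str(len(devices))] + dev_fields
    return " ".join(fields)


if __name__ == "__main__":
    data_devs = ["8:17", "8:33", "8:49", "8:65", "8:81"]
    # First example above: no metadata devices, chunk size 2048 sectors.
    print(build_raid_table(0, 1960893648, "raid4", 2048,
                           [(None, dev) for dev in data_devs]))
```

Each word of an optional parameter counts separately towards
<#raid_params>, which is why "2048 sync min_recovery_rate 20" in the second
example yields a count of 4.
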
'dmsetup table' displays the table used to construct the mapping.
The optional parameters are always printed in the order listed
above with "sync" or "nosync" always output ahead of the other
arguments, regardless of the order used when originally loading the table.
Arguments that can be repeated are ordered by value.

'dmsetup status' yields information on the state and health of the
array.
The output is as follows:
1: <s> <l> raid \
2:      <raid_type> <#devices> <1 health char for each dev> <resync_ratio>

Line 1 is the standard output produced by device-mapper.
Line 2 is produced by the raid target, and best explained by example:
        0 1960893648 raid raid4 5 AAAAA 2/490221568
Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live, and the array is 2/490221568 complete with recovery.
Faulty or missing devices are marked 'D'.  Devices that are out-of-sync
are marked 'a'.


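The status line described above can be split into its fields with a few
lines of Python.  This is a minimal sketch that assumes exactly the
seven-field format shown in this document; the function name
parse_raid_status and the dictionary keys are hypothetical.

```python
def parse_raid_status(line):
    """Split one 'dmsetup status' line for a raid target into fields."""
    fields = line.split()
    if len(fields) < 7 or fields[2] != "raid":
        raise ValueError("not a raid target status line")
    start, length, _target, raid_type, num_devs, health, ratio = fields[:7]
    if len(health) != int(num_devs):
        raise ValueError("expected one health character per device")
    done, total = ratio.split("/")
    return {
        "start": int(start),
        "length": int(length),
        "raid_type": raid_type,
        "num_devices": int(num_devs),
        "health": health,
        "alive": health.count("A"),        # in-sync, working devices
        "out_of_sync": health.count("a"),  # alive but not yet in sync
        "failed": health.count("D"),       # faulty or missing devices
        "resync_done": int(done),
        "resync_total": int(total),
    }


if __name__ == "__main__":
    status = parse_raid_status("0 1960893648 raid raid4 5 AAAAA 2/490221568")
    print(status["raid_type"], status["alive"], status["resync_done"])
```
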
Version History
---------------
1.0.0   Initial version.  Support for RAID 4/5/6
1.1.0   Added support for RAID 1
1.2.0   Handle creation of arrays that contain failed devices.
1.3.0   Added support for RAID 10
1.3.1   Allow device replacement/rebuild for RAID 10