Merge tag 'for-6.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mikulas Patocka:

 - Misc VDO fixes

 - Remove unused declarations dm_get_rq_mapinfo() and dm_zone_map_bio()

 - Dm-delay: Improve kernel documentation

 - Dm-crypt: Allow specifying the integrity key size as an option

 - Dm-bufio: Remove pointless NULL check

 - Small code cleanups: Use ERR_CAST; remove unlikely() around IS_ERR;
   use __assign_bit

 - Dm-integrity: Fix gcc 5 warning; convert comma to semicolon; fix
   smatch warning

 - Dm-integrity: Support recalculation in the 'I' mode

 - Revert "dm: requeue IO if mapping table not yet available"

 - Dm-crypt: Small refactoring to make the code more readable

 - Dm-cache: Remove pointless error check

 - Dm: Fix spelling errors

 - Dm-verity: Restart or panic on an I/O error if restart or panic was
   requested

 - Dm-verity: Fall back to the platform keyring also if the key in the
   trusted keyring is rejected

* tag 'for-6.12/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (26 commits)
  dm verity: fallback to platform keyring also if key in trusted keyring is rejected
  dm-verity: restart or panic on an I/O error
  dm: fix spelling errors
  dm-cache: remove pointless error check
  dm vdo: handle unaligned discards correctly
  dm vdo indexer: Convert comma to semicolon
  dm-crypt: Use common error handling code in crypt_set_keyring_key()
  dm-crypt: Use up_read() together with key_put() only once in crypt_set_keyring_key()
  Revert "dm: requeue IO if mapping table not yet available"
  dm-integrity: check mac_size against HASH_MAX_DIGESTSIZE in sb_mac()
  dm-integrity: support recalculation in the 'I' mode
  dm integrity: Convert comma to semicolon
  dm integrity: fix gcc 5 warning
  dm: Make use of __assign_bit() API
  dm integrity: Remove extra unlikely helper
  dm: Convert to use ERR_CAST()
  dm bufio: Remove NULL check of list_entry()
  dm-crypt: Allow to specify the integrity key size as option
  dm: Remove unused declaration and empty definition "dm_zone_map_bio"
  dm delay: enhance kernel documentation
  ...
Linus Torvalds 2024-09-27 09:12:51 -07:00
commit e477dba544
26 changed files with 488 additions and 147 deletions

View File

@@ -3,29 +3,52 @@ dm-delay
 ========
 
 Device-Mapper's "delay" target delays reads and/or writes
-and maps them to different devices.
+and/or flushs and optionally maps them to different devices.
 
-Parameters::
+Arguments::
 
     <device> <offset> <delay> [<write_device> <write_offset> <write_delay>
 	       [<flush_device> <flush_offset> <flush_delay>]]
 
-With separate write parameters, the first set is only used for reads.
+Table line has to either have 3, 6 or 9 arguments:
+
+3: apply offset and delay to read, write and flush operations on device
+6: apply offset and delay to device, also apply write_offset and write_delay
+   to write and flush operations on optionally different write_device with
+   optionally different sector offset
+9: same as 6 arguments plus define flush_offset and flush_delay explicitely
+   on/with optionally different flush_device/flush_offset.
+
 Offsets are specified in sectors.
 Delays are specified in milliseconds.
 
 Example scripts
 ===============
 ::
 
 	#!/bin/sh
-	# Create device delaying rw operation for 500ms
-	echo "0 `blockdev --getsz $1` delay $1 0 500" | dmsetup create delayed
+	#
+	# Create mapped device named "delayed" delaying read, write and flush operations for 500ms.
+	#
+	dmsetup create delayed --table "0 `blockdev --getsz $1` delay $1 0 500"
 
 ::
 
 	#!/bin/sh
-	# Create device delaying only write operation for 500ms and
-	# splitting reads and writes to different devices $1 $2
-	echo "0 `blockdev --getsz $1` delay $1 0 0 $2 0 500" | dmsetup create delayed
+	#
+	# Create mapped device delaying write and flush operations for 400ms and
+	# splitting reads to device $1 but writes and flushs to different device $2
+	# to different offsets of 2048 and 4096 sectors respectively.
+	#
+	dmsetup create delayed --table "0 `blockdev --getsz $1` delay $1 2048 0 $2 4096 400"
+
+::
+
+	#!/bin/sh
+	#
+	# Create mapped device delaying reads for 50ms, writes for 100ms and flushs for 333ms
+	# onto the same backing device at offset 0 sectors.
+	#
+	dmsetup create delayed --table "0 `blockdev --getsz $1` delay $1 0 50 $2 0 100 $1 0 333"

View File

@@ -160,6 +160,10 @@ iv_large_sectors
    The <iv_offset> must be multiple of <sector_size> (in 512 bytes units)
    if this flag is specified.
 
+integrity_key_size:<bytes>
+   Use an integrity key of <bytes> size instead of using an integrity key size
+   of the digest size of the used HMAC algorithm.
+
 Module parameters::
 
    max_read_size
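
A usage sketch, not taken from the patch itself: the new option goes in the
optional-argument list of a crypt table. The device, key, cipher specification
and tag size below are illustrative placeholders.

	#!/bin/sh
	#
	# Hedged example: $1 is assumed to be a dm-integrity device providing
	# 32 bytes of tag space per sector, $2 a hex-encoded key; the cipher
	# spec is illustrative. integrity_key_size:16 requests a 16-byte HMAC
	# key instead of the 32-byte digest-size default of hmac(sha256).
	dmsetup create crypt-auth --table "0 `blockdev --getsz $1` crypt \
		capi:authenc(hmac(sha256),cbc(aes))-essiv:sha256 $2 0 $1 0 \
		3 integrity:32:aead sector_size:512 integrity_key_size:16"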

View File

@@ -251,7 +251,12 @@ The messages are:
 		by the vdostats userspace program to interpret the output
 		buffer.
 
+	config:
+		Outputs useful vdo configuration information. Mostly used
+		by users who want to recreate a similar VDO volume and
+		want to know the creation configuration used.
+
 	dump:
 		Dumps many internal structures to the system log. This is
 		not always safe to run, so it should only be used to debug
 		a hung vdo. Optional parameters to specify structures to
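
A hedged invocation sketch (the device name vdo0 is a placeholder): target
messages are sent with dmsetup message, which dm-vdo expects with a sector
argument of 0, so the new message can be queried like the existing stats one.

	#!/bin/sh
	# Print the creation-time configuration of an existing VDO volume.
	dmsetup message vdo0 0 config
	# The existing stats message is retrieved the same way.
	dmsetup message vdo0 0 stats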

View File

@@ -529,9 +529,6 @@ static struct dm_buffer *list_to_buffer(struct list_head *l)
 {
 	struct lru_entry *le = list_entry(l, struct lru_entry, list);
 
-	if (!le)
-		return NULL;
-
 	return le_to_buffer(le);
 }

View File

@@ -1368,7 +1368,7 @@ static void mg_copy(struct work_struct *ws)
 	 */
 	bool rb = bio_detain_shared(mg->cache, mg->op->oblock, mg->overwrite_bio);
 
-	BUG_ON(rb); /* An exclussive lock must _not_ be held for this block */
+	BUG_ON(rb); /* An exclusive lock must _not_ be held for this block */
 	mg->overwrite_bio = NULL;
 	inc_io_migrations(mg->cache);
 	mg_full_copy(ws);
@@ -3200,8 +3200,6 @@ static int parse_cblock_range(struct cache *cache, const char *str,
 	 * Try and parse form (ii) first.
 	 */
 	r = sscanf(str, "%llu-%llu%c", &b, &e, &dummy);
-	if (r < 0)
-		return r;
 
 	if (r == 2) {
 		result->begin = to_cblock(b);
@@ -3213,8 +3211,6 @@ static int parse_cblock_range(struct cache *cache, const char *str,
 	 * That didn't work, try form (i).
 	 */
 	r = sscanf(str, "%llu%c", &b, &dummy);
-	if (r < 0)
-		return r;
 
 	if (r == 1) {
 		result->begin = to_cblock(b);

View File

@@ -530,10 +530,7 @@ static int __load_bitset_in_core(struct dm_clone_metadata *cmd)
 		return r;
 
 	for (i = 0; ; i++) {
-		if (dm_bitset_cursor_get_value(&c))
-			__set_bit(i, cmd->region_map);
-		else
-			__clear_bit(i, cmd->region_map);
+		__assign_bit(i, cmd->region_map, dm_bitset_cursor_get_value(&c));
 
 		if (i >= (cmd->nr_regions - 1))
 			break;

View File

@@ -147,6 +147,7 @@ enum cipher_flags {
 	CRYPT_MODE_INTEGRITY_AEAD,	/* Use authenticated mode for cipher */
 	CRYPT_IV_LARGE_SECTORS,		/* Calculate IV from sector_size, not 512B sectors */
 	CRYPT_ENCRYPT_PREPROCESS,	/* Must preprocess data for encryption (elephant) */
+	CRYPT_KEY_MAC_SIZE_SET,		/* The integrity_key_size option was used */
 };
 
 /*
@@ -2613,35 +2614,31 @@ static int crypt_set_keyring_key(struct crypt_config *cc, const char *key_string
 	key = request_key(type, key_desc + 1, NULL);
 	if (IS_ERR(key)) {
-		kfree_sensitive(new_key_string);
-		return PTR_ERR(key);
+		ret = PTR_ERR(key);
+		goto free_new_key_string;
 	}
 
 	down_read(&key->sem);
 	ret = set_key(cc, key);
-	if (ret < 0) {
-		up_read(&key->sem);
-		key_put(key);
-		kfree_sensitive(new_key_string);
-		return ret;
-	}
 	up_read(&key->sem);
 	key_put(key);
+	if (ret < 0)
+		goto free_new_key_string;
 
 	/* clear the flag since following operations may invalidate previously valid key */
 	clear_bit(DM_CRYPT_KEY_VALID, &cc->flags);
 
 	ret = crypt_setkey(cc);
+	if (ret)
+		goto free_new_key_string;
 
-	if (!ret) {
-		set_bit(DM_CRYPT_KEY_VALID, &cc->flags);
-		kfree_sensitive(cc->key_string);
-		cc->key_string = new_key_string;
-	} else
-		kfree_sensitive(new_key_string);
+	set_bit(DM_CRYPT_KEY_VALID, &cc->flags);
+	kfree_sensitive(cc->key_string);
+	cc->key_string = new_key_string;
+	return 0;
 
+free_new_key_string:
+	kfree_sensitive(new_key_string);
 	return ret;
 }
@@ -2937,7 +2934,8 @@ static int crypt_ctr_auth_cipher(struct crypt_config *cc, char *cipher_api)
 	if (IS_ERR(mac))
 		return PTR_ERR(mac);
 
-	cc->key_mac_size = crypto_ahash_digestsize(mac);
+	if (!test_bit(CRYPT_KEY_MAC_SIZE_SET, &cc->cipher_flags))
+		cc->key_mac_size = crypto_ahash_digestsize(mac);
 	crypto_free_ahash(mac);
 
 	cc->authenc_key = kmalloc(crypt_authenckey_size(cc), GFP_KERNEL);
@@ -3219,6 +3217,13 @@ static int crypt_ctr_optional(struct dm_target *ti, unsigned int argc, char **ar
 			cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
 			if (!cc->cipher_auth)
 				return -ENOMEM;
+		} else if (sscanf(opt_string, "integrity_key_size:%u%c", &val, &dummy) == 1) {
+			if (!val) {
+				ti->error = "Invalid integrity_key_size argument";
+				return -EINVAL;
+			}
+			cc->key_mac_size = val;
+			set_bit(CRYPT_KEY_MAC_SIZE_SET, &cc->cipher_flags);
 		} else if (sscanf(opt_string, "sector_size:%hu%c", &cc->sector_size, &dummy) == 1) {
 			if (cc->sector_size < (1 << SECTOR_SHIFT) ||
 			    cc->sector_size > 4096 ||
@@ -3607,10 +3612,10 @@ static void crypt_status(struct dm_target *ti, status_type_t type,
 		num_feature_args += test_bit(DM_CRYPT_NO_OFFLOAD, &cc->flags);
 		num_feature_args += test_bit(DM_CRYPT_NO_READ_WORKQUEUE, &cc->flags);
 		num_feature_args += test_bit(DM_CRYPT_NO_WRITE_WORKQUEUE, &cc->flags);
+		num_feature_args += !!cc->used_tag_size;
 		num_feature_args += cc->sector_size != (1 << SECTOR_SHIFT);
 		num_feature_args += test_bit(CRYPT_IV_LARGE_SECTORS, &cc->cipher_flags);
-		if (cc->used_tag_size)
-			num_feature_args++;
+		num_feature_args += test_bit(CRYPT_KEY_MAC_SIZE_SET, &cc->cipher_flags);
 		if (num_feature_args) {
 			DMEMIT(" %d", num_feature_args);
 			if (ti->num_discard_bios)
@@ -3631,6 +3636,8 @@ static void crypt_status(struct dm_target *ti, status_type_t type,
 				DMEMIT(" sector_size:%d", cc->sector_size);
 			if (test_bit(CRYPT_IV_LARGE_SECTORS, &cc->cipher_flags))
 				DMEMIT(" iv_large_sectors");
+			if (test_bit(CRYPT_KEY_MAC_SIZE_SET, &cc->cipher_flags))
+				DMEMIT(" integrity_key_size:%u", cc->key_mac_size);
 		}
 		break;
@@ -3758,7 +3765,7 @@ static void crypt_io_hints(struct dm_target *ti, struct queue_limits *limits)
 static struct target_type crypt_target = {
 	.name   = "crypt",
-	.version = {1, 27, 0},
+	.version = {1, 28, 0},
 	.module = THIS_MODULE,
 	.ctr    = crypt_ctr,
 	.dtr    = crypt_dtr,

View File

@ -284,6 +284,7 @@ struct dm_integrity_c {
mempool_t recheck_pool; mempool_t recheck_pool;
struct bio_set recheck_bios; struct bio_set recheck_bios;
struct bio_set recalc_bios;
struct notifier_block reboot_notifier; struct notifier_block reboot_notifier;
}; };
@ -321,7 +322,9 @@ struct dm_integrity_io {
struct dm_bio_details bio_details; struct dm_bio_details bio_details;
char *integrity_payload; char *integrity_payload;
unsigned payload_len;
bool integrity_payload_from_mempool; bool integrity_payload_from_mempool;
bool integrity_range_locked;
}; };
struct journal_completion { struct journal_completion {
@ -359,7 +362,7 @@ static struct kmem_cache *journal_io_cache;
#endif #endif
static void dm_integrity_map_continue(struct dm_integrity_io *dio, bool from_map); static void dm_integrity_map_continue(struct dm_integrity_io *dio, bool from_map);
static int dm_integrity_map_inline(struct dm_integrity_io *dio); static int dm_integrity_map_inline(struct dm_integrity_io *dio, bool from_map);
static void integrity_bio_wait(struct work_struct *w); static void integrity_bio_wait(struct work_struct *w);
static void dm_integrity_dtr(struct dm_target *ti); static void dm_integrity_dtr(struct dm_target *ti);
@ -491,7 +494,8 @@ static int sb_mac(struct dm_integrity_c *ic, bool wr)
__u8 *sb = (__u8 *)ic->sb; __u8 *sb = (__u8 *)ic->sb;
__u8 *mac = sb + (1 << SECTOR_SHIFT) - mac_size; __u8 *mac = sb + (1 << SECTOR_SHIFT) - mac_size;
if (sizeof(struct superblock) + mac_size > 1 << SECTOR_SHIFT) { if (sizeof(struct superblock) + mac_size > 1 << SECTOR_SHIFT ||
mac_size > HASH_MAX_DIGESTSIZE) {
dm_integrity_io_error(ic, "digest is too long", -EINVAL); dm_integrity_io_error(ic, "digest is too long", -EINVAL);
return -EINVAL; return -EINVAL;
} }
@ -1500,15 +1504,15 @@ static void dm_integrity_flush_buffers(struct dm_integrity_c *ic, bool flush_dat
if (!ic->meta_dev) if (!ic->meta_dev)
flush_data = false; flush_data = false;
if (flush_data) { if (flush_data) {
fr.io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC, fr.io_req.bi_opf = REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC;
fr.io_req.mem.type = DM_IO_KMEM, fr.io_req.mem.type = DM_IO_KMEM;
fr.io_req.mem.ptr.addr = NULL, fr.io_req.mem.ptr.addr = NULL;
fr.io_req.notify.fn = flush_notify, fr.io_req.notify.fn = flush_notify;
fr.io_req.notify.context = &fr; fr.io_req.notify.context = &fr;
fr.io_req.client = dm_bufio_get_dm_io_client(ic->bufio), fr.io_req.client = dm_bufio_get_dm_io_client(ic->bufio);
fr.io_reg.bdev = ic->dev->bdev, fr.io_reg.bdev = ic->dev->bdev;
fr.io_reg.sector = 0, fr.io_reg.sector = 0;
fr.io_reg.count = 0, fr.io_reg.count = 0;
fr.ic = ic; fr.ic = ic;
init_completion(&fr.comp); init_completion(&fr.comp);
r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL, IOPRIO_DEFAULT); r = dm_io(&fr.io_req, 1, &fr.io_reg, NULL, IOPRIO_DEFAULT);
@ -1946,8 +1950,13 @@ static int dm_integrity_map(struct dm_target *ti, struct bio *bio)
dio->bi_status = 0; dio->bi_status = 0;
dio->op = bio_op(bio); dio->op = bio_op(bio);
if (ic->mode == 'I') if (ic->mode == 'I') {
return dm_integrity_map_inline(dio); bio->bi_iter.bi_sector = dm_target_offset(ic->ti, bio->bi_iter.bi_sector);
dio->integrity_payload = NULL;
dio->integrity_payload_from_mempool = false;
dio->integrity_range_locked = false;
return dm_integrity_map_inline(dio, true);
}
if (unlikely(dio->op == REQ_OP_DISCARD)) { if (unlikely(dio->op == REQ_OP_DISCARD)) {
if (ti->max_io_len) { if (ti->max_io_len) {
@ -2397,15 +2406,13 @@ journal_read_write:
do_endio_flush(ic, dio); do_endio_flush(ic, dio);
} }
static int dm_integrity_map_inline(struct dm_integrity_io *dio) static int dm_integrity_map_inline(struct dm_integrity_io *dio, bool from_map)
{ {
struct dm_integrity_c *ic = dio->ic; struct dm_integrity_c *ic = dio->ic;
struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io)); struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io));
struct bio_integrity_payload *bip; struct bio_integrity_payload *bip;
unsigned payload_len, digest_size, extra_size, ret; unsigned ret;
sector_t recalc_sector;
dio->integrity_payload = NULL;
dio->integrity_payload_from_mempool = false;
if (unlikely(bio_integrity(bio))) { if (unlikely(bio_integrity(bio))) {
bio->bi_status = BLK_STS_NOTSUPP; bio->bi_status = BLK_STS_NOTSUPP;
@ -2418,28 +2425,67 @@ static int dm_integrity_map_inline(struct dm_integrity_io *dio)
return DM_MAPIO_REMAPPED; return DM_MAPIO_REMAPPED;
retry: retry:
payload_len = ic->tuple_size * (bio_sectors(bio) >> ic->sb->log2_sectors_per_block); if (!dio->integrity_payload) {
digest_size = crypto_shash_digestsize(ic->internal_hash); unsigned digest_size, extra_size;
extra_size = unlikely(digest_size > ic->tag_size) ? digest_size - ic->tag_size : 0; dio->payload_len = ic->tuple_size * (bio_sectors(bio) >> ic->sb->log2_sectors_per_block);
payload_len += extra_size; digest_size = crypto_shash_digestsize(ic->internal_hash);
dio->integrity_payload = kmalloc(payload_len, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN); extra_size = unlikely(digest_size > ic->tag_size) ? digest_size - ic->tag_size : 0;
if (unlikely(!dio->integrity_payload)) { dio->payload_len += extra_size;
const unsigned x_size = PAGE_SIZE << 1; dio->integrity_payload = kmalloc(dio->payload_len, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
if (payload_len > x_size) { if (unlikely(!dio->integrity_payload)) {
unsigned sectors = ((x_size - extra_size) / ic->tuple_size) << ic->sb->log2_sectors_per_block; const unsigned x_size = PAGE_SIZE << 1;
if (WARN_ON(!sectors || sectors >= bio_sectors(bio))) { if (dio->payload_len > x_size) {
bio->bi_status = BLK_STS_NOTSUPP; unsigned sectors = ((x_size - extra_size) / ic->tuple_size) << ic->sb->log2_sectors_per_block;
bio_endio(bio); if (WARN_ON(!sectors || sectors >= bio_sectors(bio))) {
return DM_MAPIO_SUBMITTED; bio->bi_status = BLK_STS_NOTSUPP;
bio_endio(bio);
return DM_MAPIO_SUBMITTED;
}
dm_accept_partial_bio(bio, sectors);
goto retry;
} }
dm_accept_partial_bio(bio, sectors);
goto retry;
} }
}
dio->range.logical_sector = bio->bi_iter.bi_sector;
dio->range.n_sectors = bio_sectors(bio);
if (!(ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING)))
goto skip_spinlock;
#ifdef CONFIG_64BIT
/*
* On 64-bit CPUs we can optimize the lock away (so that it won't cause
* cache line bouncing) and use acquire/release barriers instead.
*
* Paired with smp_store_release in integrity_recalc_inline.
*/
recalc_sector = le64_to_cpu(smp_load_acquire(&ic->sb->recalc_sector));
if (likely(dio->range.logical_sector + dio->range.n_sectors <= recalc_sector))
goto skip_spinlock;
#endif
spin_lock_irq(&ic->endio_wait.lock);
recalc_sector = le64_to_cpu(ic->sb->recalc_sector);
if (dio->range.logical_sector + dio->range.n_sectors <= recalc_sector)
goto skip_unlock;
if (unlikely(!add_new_range(ic, &dio->range, true))) {
if (from_map) {
spin_unlock_irq(&ic->endio_wait.lock);
INIT_WORK(&dio->work, integrity_bio_wait);
queue_work(ic->wait_wq, &dio->work);
return DM_MAPIO_SUBMITTED;
}
wait_and_add_new_range(ic, &dio->range);
}
dio->integrity_range_locked = true;
skip_unlock:
spin_unlock_irq(&ic->endio_wait.lock);
skip_spinlock:
if (unlikely(!dio->integrity_payload)) {
dio->integrity_payload = page_to_virt((struct page *)mempool_alloc(&ic->recheck_pool, GFP_NOIO)); dio->integrity_payload = page_to_virt((struct page *)mempool_alloc(&ic->recheck_pool, GFP_NOIO));
dio->integrity_payload_from_mempool = true; dio->integrity_payload_from_mempool = true;
} }
bio->bi_iter.bi_sector = dm_target_offset(ic->ti, bio->bi_iter.bi_sector);
dio->bio_details.bi_iter = bio->bi_iter; dio->bio_details.bi_iter = bio->bi_iter;
if (unlikely(!dm_integrity_check_limits(ic, bio->bi_iter.bi_sector, bio))) { if (unlikely(!dm_integrity_check_limits(ic, bio->bi_iter.bi_sector, bio))) {
@ -2449,7 +2495,7 @@ retry:
bio->bi_iter.bi_sector += ic->start + SB_SECTORS; bio->bi_iter.bi_sector += ic->start + SB_SECTORS;
bip = bio_integrity_alloc(bio, GFP_NOIO, 1); bip = bio_integrity_alloc(bio, GFP_NOIO, 1);
if (unlikely(IS_ERR(bip))) { if (IS_ERR(bip)) {
bio->bi_status = errno_to_blk_status(PTR_ERR(bip)); bio->bi_status = errno_to_blk_status(PTR_ERR(bip));
bio_endio(bio); bio_endio(bio);
return DM_MAPIO_SUBMITTED; return DM_MAPIO_SUBMITTED;
@ -2470,8 +2516,8 @@ retry:
} }
ret = bio_integrity_add_page(bio, virt_to_page(dio->integrity_payload), ret = bio_integrity_add_page(bio, virt_to_page(dio->integrity_payload),
payload_len, offset_in_page(dio->integrity_payload)); dio->payload_len, offset_in_page(dio->integrity_payload));
if (unlikely(ret != payload_len)) { if (unlikely(ret != dio->payload_len)) {
bio->bi_status = BLK_STS_RESOURCE; bio->bi_status = BLK_STS_RESOURCE;
bio_endio(bio); bio_endio(bio);
return DM_MAPIO_SUBMITTED; return DM_MAPIO_SUBMITTED;
@ -2522,7 +2568,7 @@ static void dm_integrity_inline_recheck(struct work_struct *w)
} }
bip = bio_integrity_alloc(outgoing_bio, GFP_NOIO, 1); bip = bio_integrity_alloc(outgoing_bio, GFP_NOIO, 1);
if (unlikely(IS_ERR(bip))) { if (IS_ERR(bip)) {
bio_put(outgoing_bio); bio_put(outgoing_bio);
bio->bi_status = errno_to_blk_status(PTR_ERR(bip)); bio->bi_status = errno_to_blk_status(PTR_ERR(bip));
bio_endio(bio); bio_endio(bio);
@ -2579,6 +2625,9 @@ static int dm_integrity_end_io(struct dm_target *ti, struct bio *bio, blk_status
struct dm_integrity_io *dio = dm_per_bio_data(bio, sizeof(struct dm_integrity_io)); struct dm_integrity_io *dio = dm_per_bio_data(bio, sizeof(struct dm_integrity_io));
if (dio->op == REQ_OP_READ && likely(*status == BLK_STS_OK)) { if (dio->op == REQ_OP_READ && likely(*status == BLK_STS_OK)) {
unsigned pos = 0; unsigned pos = 0;
if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING) &&
unlikely(dio->integrity_range_locked))
goto skip_check;
while (dio->bio_details.bi_iter.bi_size) { while (dio->bio_details.bi_iter.bi_size) {
char digest[HASH_MAX_DIGESTSIZE]; char digest[HASH_MAX_DIGESTSIZE];
struct bio_vec bv = bio_iter_iovec(bio, dio->bio_details.bi_iter); struct bio_vec bv = bio_iter_iovec(bio, dio->bio_details.bi_iter);
@ -2598,9 +2647,10 @@ static int dm_integrity_end_io(struct dm_target *ti, struct bio *bio, blk_status
bio_advance_iter_single(bio, &dio->bio_details.bi_iter, ic->sectors_per_block << SECTOR_SHIFT); bio_advance_iter_single(bio, &dio->bio_details.bi_iter, ic->sectors_per_block << SECTOR_SHIFT);
} }
} }
if (likely(dio->op == REQ_OP_READ) || likely(dio->op == REQ_OP_WRITE)) { skip_check:
dm_integrity_free_payload(dio); dm_integrity_free_payload(dio);
} if (unlikely(dio->integrity_range_locked))
remove_range(ic, &dio->range);
} }
return DM_ENDIO_DONE; return DM_ENDIO_DONE;
} }
@ -2608,8 +2658,26 @@ static int dm_integrity_end_io(struct dm_target *ti, struct bio *bio, blk_status
static void integrity_bio_wait(struct work_struct *w) static void integrity_bio_wait(struct work_struct *w)
{ {
struct dm_integrity_io *dio = container_of(w, struct dm_integrity_io, work); struct dm_integrity_io *dio = container_of(w, struct dm_integrity_io, work);
struct dm_integrity_c *ic = dio->ic;
dm_integrity_map_continue(dio, false); if (ic->mode == 'I') {
struct bio *bio = dm_bio_from_per_bio_data(dio, sizeof(struct dm_integrity_io));
int r = dm_integrity_map_inline(dio, false);
switch (r) {
case DM_MAPIO_KILL:
bio->bi_status = BLK_STS_IOERR;
fallthrough;
case DM_MAPIO_REMAPPED:
submit_bio_noacct(bio);
fallthrough;
case DM_MAPIO_SUBMITTED:
return;
default:
BUG();
}
} else {
dm_integrity_map_continue(dio, false);
}
} }
static void pad_uncommitted(struct dm_integrity_c *ic) static void pad_uncommitted(struct dm_integrity_c *ic)
@ -3081,6 +3149,133 @@ free_ret:
kvfree(recalc_tags); kvfree(recalc_tags);
} }
static void integrity_recalc_inline(struct work_struct *w)
{
struct dm_integrity_c *ic = container_of(w, struct dm_integrity_c, recalc_work);
size_t recalc_tags_size;
u8 *recalc_buffer = NULL;
u8 *recalc_tags = NULL;
struct dm_integrity_range range;
struct bio *bio;
struct bio_integrity_payload *bip;
__u8 *t;
unsigned int i;
int r;
unsigned ret;
unsigned int super_counter = 0;
unsigned recalc_sectors = RECALC_SECTORS;
retry:
recalc_buffer = kmalloc(recalc_sectors << SECTOR_SHIFT, GFP_NOIO | __GFP_NOWARN);
if (!recalc_buffer) {
oom:
recalc_sectors >>= 1;
if (recalc_sectors >= 1U << ic->sb->log2_sectors_per_block)
goto retry;
DMCRIT("out of memory for recalculate buffer - recalculation disabled");
goto free_ret;
}
recalc_tags_size = (recalc_sectors >> ic->sb->log2_sectors_per_block) * ic->tuple_size;
if (crypto_shash_digestsize(ic->internal_hash) > ic->tuple_size)
recalc_tags_size += crypto_shash_digestsize(ic->internal_hash) - ic->tuple_size;
recalc_tags = kmalloc(recalc_tags_size, GFP_NOIO | __GFP_NOWARN);
if (!recalc_tags) {
kfree(recalc_buffer);
recalc_buffer = NULL;
goto oom;
}
spin_lock_irq(&ic->endio_wait.lock);
next_chunk:
if (unlikely(dm_post_suspending(ic->ti)))
goto unlock_ret;
range.logical_sector = le64_to_cpu(ic->sb->recalc_sector);
if (unlikely(range.logical_sector >= ic->provided_data_sectors))
goto unlock_ret;
range.n_sectors = min((sector_t)recalc_sectors, ic->provided_data_sectors - range.logical_sector);
add_new_range_and_wait(ic, &range);
spin_unlock_irq(&ic->endio_wait.lock);
if (unlikely(++super_counter == RECALC_WRITE_SUPER)) {
recalc_write_super(ic);
super_counter = 0;
}
if (unlikely(dm_integrity_failed(ic)))
goto err;
DEBUG_print("recalculating: %llx - %llx\n", range.logical_sector, range.n_sectors);
bio = bio_alloc_bioset(ic->dev->bdev, 1, REQ_OP_READ, GFP_NOIO, &ic->recalc_bios);
bio->bi_iter.bi_sector = ic->start + SB_SECTORS + range.logical_sector;
__bio_add_page(bio, virt_to_page(recalc_buffer), range.n_sectors << SECTOR_SHIFT, offset_in_page(recalc_buffer));
r = submit_bio_wait(bio);
bio_put(bio);
if (unlikely(r)) {
dm_integrity_io_error(ic, "reading data", r);
goto err;
}
t = recalc_tags;
for (i = 0; i < range.n_sectors; i += ic->sectors_per_block) {
memset(t, 0, ic->tuple_size);
integrity_sector_checksum(ic, range.logical_sector + i, recalc_buffer + (i << SECTOR_SHIFT), t);
t += ic->tuple_size;
}
bio = bio_alloc_bioset(ic->dev->bdev, 1, REQ_OP_WRITE, GFP_NOIO, &ic->recalc_bios);
bio->bi_iter.bi_sector = ic->start + SB_SECTORS + range.logical_sector;
__bio_add_page(bio, virt_to_page(recalc_buffer), range.n_sectors << SECTOR_SHIFT, offset_in_page(recalc_buffer));
bip = bio_integrity_alloc(bio, GFP_NOIO, 1);
if (unlikely(IS_ERR(bip))) {
bio_put(bio);
DMCRIT("out of memory for bio integrity payload - recalculation disabled");
goto err;
}
ret = bio_integrity_add_page(bio, virt_to_page(recalc_tags), t - recalc_tags, offset_in_page(recalc_tags));
if (unlikely(ret != t - recalc_tags)) {
bio_put(bio);
dm_integrity_io_error(ic, "attaching integrity tags", -ENOMEM);
goto err;
}
r = submit_bio_wait(bio);
bio_put(bio);
if (unlikely(r)) {
dm_integrity_io_error(ic, "writing data", r);
goto err;
}
cond_resched();
spin_lock_irq(&ic->endio_wait.lock);
remove_range_unlocked(ic, &range);
#ifdef CONFIG_64BIT
/* Paired with smp_load_acquire in dm_integrity_map_inline. */
smp_store_release(&ic->sb->recalc_sector, cpu_to_le64(range.logical_sector + range.n_sectors));
#else
ic->sb->recalc_sector = cpu_to_le64(range.logical_sector + range.n_sectors);
#endif
goto next_chunk;
err:
remove_range(ic, &range);
goto free_ret;
unlock_ret:
spin_unlock_irq(&ic->endio_wait.lock);
recalc_write_super(ic);
free_ret:
kfree(recalc_buffer);
kfree(recalc_tags);
}
static void bitmap_block_work(struct work_struct *w) static void bitmap_block_work(struct work_struct *w)
{ {
struct bitmap_block_status *bbs = container_of(w, struct bitmap_block_status, work); struct bitmap_block_status *bbs = container_of(w, struct bitmap_block_status, work);
@ -4619,6 +4814,17 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned int argc, char **argv
r = -ENOMEM; r = -ENOMEM;
goto bad; goto bad;
} }
r = bioset_init(&ic->recalc_bios, 1, 0, BIOSET_NEED_BVECS);
if (r) {
ti->error = "Cannot allocate bio set";
goto bad;
}
r = bioset_integrity_create(&ic->recalc_bios, 1);
if (r) {
ti->error = "Cannot allocate bio integrity set";
r = -ENOMEM;
goto bad;
}
} }
ic->metadata_wq = alloc_workqueue("dm-integrity-metadata", ic->metadata_wq = alloc_workqueue("dm-integrity-metadata",
@ -4717,13 +4923,18 @@ static int dm_integrity_ctr(struct dm_target *ti, unsigned int argc, char **argv
ti->error = "Block size doesn't match the information in superblock"; ti->error = "Block size doesn't match the information in superblock";
goto bad; goto bad;
} }
if (!le32_to_cpu(ic->sb->journal_sections) != (ic->mode == 'I')) { if (ic->mode != 'I') {
r = -EINVAL; if (!le32_to_cpu(ic->sb->journal_sections)) {
if (ic->mode != 'I') r = -EINVAL;
ti->error = "Corrupted superblock, journal_sections is 0"; ti->error = "Corrupted superblock, journal_sections is 0";
else goto bad;
}
} else {
if (le32_to_cpu(ic->sb->journal_sections)) {
r = -EINVAL;
ti->error = "Corrupted superblock, journal_sections is not 0"; ti->error = "Corrupted superblock, journal_sections is not 0";
goto bad; goto bad;
}
} }
/* make sure that ti->max_io_len doesn't overflow */ /* make sure that ti->max_io_len doesn't overflow */
if (!ic->meta_dev) { if (!ic->meta_dev) {
@ -4830,7 +5041,7 @@ try_smaller_buffer:
r = -ENOMEM; r = -ENOMEM;
goto bad; goto bad;
} }
INIT_WORK(&ic->recalc_work, integrity_recalc); INIT_WORK(&ic->recalc_work, ic->mode == 'I' ? integrity_recalc_inline : integrity_recalc);
} else { } else {
if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING)) { if (ic->sb->flags & cpu_to_le32(SB_FLAG_RECALCULATING)) {
ti->error = "Recalculate can only be specified with internal_hash"; ti->error = "Recalculate can only be specified with internal_hash";
@ -4847,17 +5058,15 @@ try_smaller_buffer:
goto bad; goto bad;
} }
if (ic->mode != 'I') { ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev,
ic->bufio = dm_bufio_client_create(ic->meta_dev ? ic->meta_dev->bdev : ic->dev->bdev, 1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL, 0);
1U << (SECTOR_SHIFT + ic->log2_buffer_sectors), 1, 0, NULL, NULL, 0); if (IS_ERR(ic->bufio)) {
if (IS_ERR(ic->bufio)) { r = PTR_ERR(ic->bufio);
r = PTR_ERR(ic->bufio); ti->error = "Cannot initialize dm-bufio";
ti->error = "Cannot initialize dm-bufio"; ic->bufio = NULL;
ic->bufio = NULL; goto bad;
goto bad;
}
dm_bufio_set_sector_offset(ic->bufio, ic->start + ic->initial_sectors);
} }
dm_bufio_set_sector_offset(ic->bufio, ic->start + ic->initial_sectors);
if (ic->mode != 'R' && ic->mode != 'I') { if (ic->mode != 'R' && ic->mode != 'I') {
r = create_journal(ic, &ti->error); r = create_journal(ic, &ti->error);
@ -4979,6 +5188,7 @@ static void dm_integrity_dtr(struct dm_target *ti)
kvfree(ic->bbs); kvfree(ic->bbs);
if (ic->bufio) if (ic->bufio)
dm_bufio_client_destroy(ic->bufio); dm_bufio_client_destroy(ic->bufio);
bioset_exit(&ic->recalc_bios);
bioset_exit(&ic->recheck_bios); bioset_exit(&ic->recheck_bios);
mempool_exit(&ic->recheck_pool); mempool_exit(&ic->recheck_pool);
mempool_exit(&ic->journal_io_mempool); mempool_exit(&ic->journal_io_mempool);
@ -5033,7 +5243,7 @@ static void dm_integrity_dtr(struct dm_target *ti)
static struct target_type integrity_target = { static struct target_type integrity_target = {
.name = "integrity", .name = "integrity",
.version = {1, 12, 0}, .version = {1, 13, 0},
.module = THIS_MODULE, .module = THIS_MODULE,
.features = DM_TARGET_SINGLETON | DM_TARGET_INTEGRITY, .features = DM_TARGET_SINGLETON | DM_TARGET_INTEGRITY,
.ctr = dm_integrity_ctr, .ctr = dm_integrity_ctr,

View File

@@ -2519,7 +2519,7 @@ static int super_validate(struct raid_set *rs, struct md_rdev *rdev)
 			rdev->saved_raid_disk = rdev->raid_disk;
 	}
 
-	/* Reshape support -> restore repective data offsets */
+	/* Reshape support -> restore respective data offsets */
 	rdev->data_offset = le64_to_cpu(sb->data_offset);
 	rdev->new_data_offset = le64_to_cpu(sb->new_data_offset);

View File

@@ -496,8 +496,10 @@ static blk_status_t dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 	map = dm_get_live_table(md, &srcu_idx);
 	if (unlikely(!map)) {
+		DMERR_LIMIT("%s: mapping table unavailable, erroring io",
+			    dm_device_name(md));
 		dm_put_live_table(md, srcu_idx);
-		return BLK_STS_RESOURCE;
+		return BLK_STS_IOERR;
 	}
 	ti = dm_table_find_target(map, 0);
 	dm_put_live_table(md, srcu_idx);

View File

@@ -2948,7 +2948,7 @@ static struct pool *pool_create(struct mapped_device *pool_md,
 	pmd = dm_pool_metadata_open(metadata_dev, block_size, format_device);
 	if (IS_ERR(pmd)) {
 		*error = "Error creating metadata object";
-		return (struct pool *)pmd;
+		return ERR_CAST(pmd);
 	}
 
 	pool = kzalloc(sizeof(*pool), GFP_KERNEL);

View File

@ -501,6 +501,7 @@ static void launch_data_vio(struct data_vio *data_vio, logical_block_number_t lb
memset(&data_vio->record_name, 0, sizeof(data_vio->record_name)); memset(&data_vio->record_name, 0, sizeof(data_vio->record_name));
memset(&data_vio->duplicate, 0, sizeof(data_vio->duplicate)); memset(&data_vio->duplicate, 0, sizeof(data_vio->duplicate));
vdo_reset_completion(&data_vio->decrement_completion);
vdo_reset_completion(completion); vdo_reset_completion(completion);
completion->error_handler = handle_data_vio_error; completion->error_handler = handle_data_vio_error;
set_data_vio_logical_callback(data_vio, attempt_logical_block_lock); set_data_vio_logical_callback(data_vio, attempt_logical_block_lock);
@ -1273,12 +1274,14 @@ static void clean_hash_lock(struct vdo_completion *completion)
static void finish_cleanup(struct data_vio *data_vio) static void finish_cleanup(struct data_vio *data_vio)
{ {
struct vdo_completion *completion = &data_vio->vio.completion; struct vdo_completion *completion = &data_vio->vio.completion;
u32 discard_size = min_t(u32, data_vio->remaining_discard,
VDO_BLOCK_SIZE - data_vio->offset);
VDO_ASSERT_LOG_ONLY(data_vio->allocation.lock == NULL, VDO_ASSERT_LOG_ONLY(data_vio->allocation.lock == NULL,
"complete data_vio has no allocation lock"); "complete data_vio has no allocation lock");
VDO_ASSERT_LOG_ONLY(data_vio->hash_lock == NULL, VDO_ASSERT_LOG_ONLY(data_vio->hash_lock == NULL,
"complete data_vio has no hash lock"); "complete data_vio has no hash lock");
if ((data_vio->remaining_discard <= VDO_BLOCK_SIZE) || if ((data_vio->remaining_discard <= discard_size) ||
(completion->result != VDO_SUCCESS)) { (completion->result != VDO_SUCCESS)) {
struct data_vio_pool *pool = completion->vdo->data_vio_pool; struct data_vio_pool *pool = completion->vdo->data_vio_pool;
@ -1287,12 +1290,12 @@ static void finish_cleanup(struct data_vio *data_vio)
return; return;
} }
data_vio->remaining_discard -= min_t(u32, data_vio->remaining_discard, data_vio->remaining_discard -= discard_size;
VDO_BLOCK_SIZE - data_vio->offset);
data_vio->is_partial = (data_vio->remaining_discard < VDO_BLOCK_SIZE); data_vio->is_partial = (data_vio->remaining_discard < VDO_BLOCK_SIZE);
data_vio->read = data_vio->is_partial; data_vio->read = data_vio->is_partial;
data_vio->offset = 0; data_vio->offset = 0;
completion->requeue = true; completion->requeue = true;
data_vio->first_reference_operation_complete = false;
launch_data_vio(data_vio, data_vio->logical.lbn + 1); launch_data_vio(data_vio, data_vio->logical.lbn + 1);
} }
@ -1965,7 +1968,8 @@ static void allocate_block(struct vdo_completion *completion)
.state = VDO_MAPPING_STATE_UNCOMPRESSED, .state = VDO_MAPPING_STATE_UNCOMPRESSED,
}; };
if (data_vio->fua) { if (data_vio->fua ||
data_vio->remaining_discard > (u32) (VDO_BLOCK_SIZE - data_vio->offset)) {
prepare_for_dedupe(data_vio); prepare_for_dedupe(data_vio);
return; return;
} }
@ -2042,7 +2046,6 @@ void continue_data_vio_with_block_map_slot(struct vdo_completion *completion)
return; return;
} }
/* /*
* We don't need to write any data, so skip allocation and just update the block map and * We don't need to write any data, so skip allocation and just update the block map and
* reference counts (via the journal). * reference counts (via the journal).
@ -2051,7 +2054,7 @@ void continue_data_vio_with_block_map_slot(struct vdo_completion *completion)
if (data_vio->is_zero) if (data_vio->is_zero)
data_vio->new_mapped.state = VDO_MAPPING_STATE_UNCOMPRESSED; data_vio->new_mapped.state = VDO_MAPPING_STATE_UNCOMPRESSED;
if (data_vio->remaining_discard > VDO_BLOCK_SIZE) { if (data_vio->remaining_discard > (u32) (VDO_BLOCK_SIZE - data_vio->offset)) {
/* This is not the final block of a discard so we can't acknowledge it yet. */ /* This is not the final block of a discard so we can't acknowledge it yet. */
update_metadata_for_data_vio_write(data_vio, NULL); update_metadata_for_data_vio_write(data_vio, NULL);
return; return;

View File

@@ -729,6 +729,7 @@ static void process_update_result(struct data_vio *agent)
 	    !change_context_state(context, DEDUPE_CONTEXT_COMPLETE, DEDUPE_CONTEXT_IDLE))
 		return;
 
+	agent->dedupe_context = NULL;
 	release_context(context);
 }
@@ -1648,6 +1649,7 @@ static void process_query_result(struct data_vio *agent)
 	if (change_context_state(context, DEDUPE_CONTEXT_COMPLETE, DEDUPE_CONTEXT_IDLE)) {
 		agent->is_duplicate = decode_uds_advice(context);
+		agent->dedupe_context = NULL;
 		release_context(context);
 	}
 }
@@ -2321,6 +2323,7 @@ static void timeout_index_operations_callback(struct vdo_completion *completion)
 		 * send its requestor on its way.
 		 */
 		list_del_init(&context->list_entry);
+		context->requestor->dedupe_context = NULL;
 		continue_data_vio(context->requestor);
 		timed_out++;
 	}

View File

@@ -1105,6 +1105,9 @@ static int vdo_message(struct dm_target *ti, unsigned int argc, char **argv,
 	if ((argc == 1) && (strcasecmp(argv[0], "stats") == 0)) {
 		vdo_write_stats(vdo, result_buffer, maxlen);
 		result = 1;
+	} else if ((argc == 1) && (strcasecmp(argv[0], "config") == 0)) {
+		vdo_write_config(vdo, &result_buffer, &maxlen);
+		result = 1;
 	} else {
 		result = vdo_status_to_errno(process_vdo_message(vdo, argc, argv));
 	}
@@ -2293,6 +2296,14 @@ static void handle_load_error(struct vdo_completion *completion)
 		return;
 	}
 
+	if ((completion->result == VDO_UNSUPPORTED_VERSION) &&
+	    (vdo->admin.phase == LOAD_PHASE_MAKE_DIRTY)) {
+		vdo_log_error("Aborting load due to unsupported version");
+		vdo->admin.phase = LOAD_PHASE_FINISHED;
+		load_callback(completion);
+		return;
+	}
+
 	vdo_log_error_strerror(completion->result,
 			       "Entering read-only mode due to load error");
 	vdo->admin.phase = LOAD_PHASE_WAIT_FOR_READ_ONLY;
@@ -2737,6 +2748,19 @@ static int vdo_preresume_registered(struct dm_target *ti, struct vdo *vdo)
 		vdo_log_info("starting device '%s'", device_name);
 		result = perform_admin_operation(vdo, LOAD_PHASE_START, load_callback,
 						 handle_load_error, "load");
+		if (result == VDO_UNSUPPORTED_VERSION) {
+			/*
+			 * A component version is not supported. This can happen when the
+			 * recovery journal metadata is in an old version format. Abort the
+			 * load without saving the state.
+			 */
+			vdo->suspend_type = VDO_ADMIN_STATE_SUSPENDING;
+			perform_admin_operation(vdo, SUSPEND_PHASE_START,
+						suspend_callback, suspend_callback,
+						"suspend");
+			return result;
+		}
+
 		if ((result != VDO_SUCCESS) && (result != VDO_READ_ONLY)) {
 			/*
 			 * Something has gone very wrong. Make sure everything has drained and
@@ -2808,7 +2832,8 @@ static int vdo_preresume(struct dm_target *ti)
 	vdo_register_thread_device_id(&instance_thread, &vdo->instance);
 	result = vdo_preresume_registered(ti, vdo);
-	if ((result == VDO_PARAMETER_MISMATCH) || (result == VDO_INVALID_ADMIN_STATE))
+	if ((result == VDO_PARAMETER_MISMATCH) || (result == VDO_INVALID_ADMIN_STATE) ||
+	    (result == VDO_UNSUPPORTED_VERSION))
 		result = -EINVAL;
 	vdo_unregister_thread_device_id();
 	return vdo_status_to_errno(result);
@@ -2832,7 +2857,7 @@ static void vdo_resume(struct dm_target *ti)
 static struct target_type vdo_target_bio = {
 	.features = DM_TARGET_SINGLETON,
 	.name = "vdo",
-	.version = { 9, 0, 0 },
+	.version = { 9, 1, 0 },
 	.module = THIS_MODULE,
 	.ctr = vdo_ctr,
 	.dtr = vdo_dtr,

View File

@@ -177,7 +177,7 @@ int uds_pack_open_chapter_index_page(struct open_chapter_index *chapter_index,
 		if (list_number < 0)
 			return UDS_OVERFLOW;
 
-		next_list = first_list + list_number--,
+		next_list = first_list + list_number--;
 		result = uds_start_delta_index_search(delta_index, next_list, 0,
 						      &entry);
 		if (result != UDS_SUCCESS)

View File

@@ -346,7 +346,6 @@ void __submit_metadata_vio(struct vio *vio, physical_block_number_t physical,
 	VDO_ASSERT_LOG_ONLY(!code->quiescent, "I/O not allowed in state %s", code->name);
-	VDO_ASSERT_LOG_ONLY(vio->bio->bi_next == NULL, "metadata bio has no next bio");
 
 	vdo_reset_completion(completion);
 	completion->error_handler = error_handler;

View File

@@ -4,6 +4,7 @@
  */
 
 #include "dedupe.h"
+#include "indexer.h"
 #include "logger.h"
 #include "memory-alloc.h"
 #include "message-stats.h"
@@ -430,3 +431,50 @@ int vdo_write_stats(struct vdo *vdo, char *buf, unsigned int maxlen)
 	vdo_free(stats);
 	return VDO_SUCCESS;
 }
+
+static void write_index_memory(u32 mem, char **buf, unsigned int *maxlen)
+{
+	char *prefix = "memorySize : ";
+
+	/* Convert index memory to fractional value */
+	if (mem == (u32)UDS_MEMORY_CONFIG_256MB)
+		write_string(prefix, "0.25, ", NULL, buf, maxlen);
+	else if (mem == (u32)UDS_MEMORY_CONFIG_512MB)
+		write_string(prefix, "0.50, ", NULL, buf, maxlen);
+	else if (mem == (u32)UDS_MEMORY_CONFIG_768MB)
+		write_string(prefix, "0.75, ", NULL, buf, maxlen);
+	else
+		write_u32(prefix, mem, ", ", buf, maxlen);
+}
+
+static void write_index_config(struct index_config *config, char **buf,
+			       unsigned int *maxlen)
+{
+	write_string("index : ", "{ ", NULL, buf, maxlen);
+	/* index mem size */
+	write_index_memory(config->mem, buf, maxlen);
+	/* whether the index is sparse or not */
+	write_bool("isSparse : ", config->sparse, ", ", buf, maxlen);
+	write_string(NULL, "}", ", ", buf, maxlen);
+}
+
+int vdo_write_config(struct vdo *vdo, char **buf, unsigned int *maxlen)
+{
+	struct vdo_config *config = &vdo->states.vdo.config;
+
+	write_string(NULL, "{ ", NULL, buf, maxlen);
+	/* version */
+	write_u32("version : ", 1, ", ", buf, maxlen);
+	/* physical size */
+	write_block_count_t("physicalSize : ", config->physical_blocks * VDO_BLOCK_SIZE, ", ",
+			    buf, maxlen);
+	/* logical size */
+	write_block_count_t("logicalSize : ", config->logical_blocks * VDO_BLOCK_SIZE, ", ",
+			    buf, maxlen);
+	/* slab size */
+	write_block_count_t("slabSize : ", config->slab_size, ", ", buf, maxlen);
+	/* index config */
+	write_index_config(&vdo->geometry.index_config, buf, maxlen);
+	write_string(NULL, "}", NULL, buf, maxlen);
+	return VDO_SUCCESS;
+}

View File

@@ -8,6 +8,7 @@
 
 #include "types.h"
 
+int vdo_write_config(struct vdo *vdo, char **buf, unsigned int *maxlen);
 int vdo_write_stats(struct vdo *vdo, char *buf, unsigned int maxlen);
 
 #endif /* VDO_MESSAGE_STATS_H */

View File

@ -1202,17 +1202,14 @@ static bool __must_check is_valid_recovery_journal_block(const struct recovery_j
* @journal: The journal to use. * @journal: The journal to use.
* @header: The unpacked block header to check. * @header: The unpacked block header to check.
* @sequence: The expected sequence number. * @sequence: The expected sequence number.
* @type: The expected metadata type.
* *
* Return: True if the block matches. * Return: True if the block matches.
*/ */
static bool __must_check is_exact_recovery_journal_block(const struct recovery_journal *journal, static bool __must_check is_exact_recovery_journal_block(const struct recovery_journal *journal,
const struct recovery_block_header *header, const struct recovery_block_header *header,
sequence_number_t sequence, sequence_number_t sequence)
enum vdo_metadata_type type)
{ {
return ((header->metadata_type == type) && return ((header->sequence_number == sequence) &&
(header->sequence_number == sequence) &&
(is_valid_recovery_journal_block(journal, header, true))); (is_valid_recovery_journal_block(journal, header, true)));
} }
@ -1371,7 +1368,8 @@ static void extract_entries_from_block(struct repair_completion *repair,
get_recovery_journal_block_header(journal, repair->journal_data, get_recovery_journal_block_header(journal, repair->journal_data,
sequence); sequence);
if (!is_exact_recovery_journal_block(journal, &header, sequence, format)) { if (!is_exact_recovery_journal_block(journal, &header, sequence) ||
(header.metadata_type != format)) {
/* This block is invalid, so skip it. */ /* This block is invalid, so skip it. */
return; return;
} }
@ -1557,10 +1555,13 @@ static int parse_journal_for_recovery(struct repair_completion *repair)
sequence_number_t i, head; sequence_number_t i, head;
bool found_entries = false; bool found_entries = false;
struct recovery_journal *journal = repair->completion.vdo->recovery_journal; struct recovery_journal *journal = repair->completion.vdo->recovery_journal;
struct recovery_block_header header;
enum vdo_metadata_type expected_format;
head = min(repair->block_map_head, repair->slab_journal_head); head = min(repair->block_map_head, repair->slab_journal_head);
header = get_recovery_journal_block_header(journal, repair->journal_data, head);
expected_format = header.metadata_type;
for (i = head; i <= repair->highest_tail; i++) { for (i = head; i <= repair->highest_tail; i++) {
struct recovery_block_header header;
journal_entry_count_t block_entries; journal_entry_count_t block_entries;
u8 j; u8 j;
@ -1572,19 +1573,15 @@ static int parse_journal_for_recovery(struct repair_completion *repair)
}; };
header = get_recovery_journal_block_header(journal, repair->journal_data, i); header = get_recovery_journal_block_header(journal, repair->journal_data, i);
if (header.metadata_type == VDO_METADATA_RECOVERY_JOURNAL) { if (!is_exact_recovery_journal_block(journal, &header, i)) {
/* This is an old format block, so we need to upgrade */
vdo_log_error_strerror(VDO_UNSUPPORTED_VERSION,
"Recovery journal is in the old format, a read-only rebuild is required.");
vdo_enter_read_only_mode(repair->completion.vdo,
VDO_UNSUPPORTED_VERSION);
return VDO_UNSUPPORTED_VERSION;
}
if (!is_exact_recovery_journal_block(journal, &header, i,
VDO_METADATA_RECOVERY_JOURNAL_2)) {
/* A bad block header was found so this must be the end of the journal. */ /* A bad block header was found so this must be the end of the journal. */
break; break;
} else if (header.metadata_type != expected_format) {
/* There is a mix of old and new format blocks, so we need to rebuild. */
vdo_log_error_strerror(VDO_CORRUPT_JOURNAL,
"Recovery journal is in an invalid format, a read-only rebuild is required.");
vdo_enter_read_only_mode(repair->completion.vdo, VDO_CORRUPT_JOURNAL);
return VDO_CORRUPT_JOURNAL;
} }
block_entries = header.entry_count; block_entries = header.entry_count;
@ -1620,8 +1617,14 @@ static int parse_journal_for_recovery(struct repair_completion *repair)
break; break;
} }
if (!found_entries) if (!found_entries) {
return validate_heads(repair); return validate_heads(repair);
} else if (expected_format == VDO_METADATA_RECOVERY_JOURNAL) {
/* All journal blocks have the old format, so we need to upgrade. */
vdo_log_error_strerror(VDO_UNSUPPORTED_VERSION,
"Recovery journal is in the old format. Downgrade and complete recovery, then upgrade with a clean volume");
return VDO_UNSUPPORTED_VERSION;
}
/* Set the tail to the last valid tail block, if there is one. */ /* Set the tail to the last valid tail block, if there is one. */
if (repair->tail_recovery_point.sector_count == 0) if (repair->tail_recovery_point.sector_count == 0)

View File

@@ -28,7 +28,7 @@ const struct error_info vdo_status_list[] = {
 	{ "VDO_LOCK_ERROR", "A lock is held incorrectly" },
 	{ "VDO_READ_ONLY", "The device is in read-only mode" },
 	{ "VDO_SHUTTING_DOWN", "The device is shutting down" },
-	{ "VDO_CORRUPT_JOURNAL", "Recovery journal entries corrupted" },
+	{ "VDO_CORRUPT_JOURNAL", "Recovery journal corrupted" },
 	{ "VDO_TOO_MANY_SLABS", "Exceeds maximum number of slabs supported" },
 	{ "VDO_INVALID_FRAGMENT", "Compressed block fragment is invalid" },
 	{ "VDO_RETRY_AFTER_REBUILD", "Retry operation after rebuilding finishes" },

View File

@@ -52,7 +52,7 @@ enum vdo_status_codes {
 	VDO_READ_ONLY,
 	/* the VDO is shutting down */
 	VDO_SHUTTING_DOWN,
-	/* the recovery journal has corrupt entries */
+	/* the recovery journal has corrupt entries or corrupt metadata */
 	VDO_CORRUPT_JOURNAL,
 	/* exceeds maximum number of slabs supported */
 	VDO_TOO_MANY_SLABS,

View File

@@ -273,8 +273,10 @@ out:
 	if (v->mode == DM_VERITY_MODE_LOGGING)
 		return 0;
 
-	if (v->mode == DM_VERITY_MODE_RESTART)
-		kernel_restart("dm-verity device corrupted");
+	if (v->mode == DM_VERITY_MODE_RESTART) {
+		pr_emerg("dm-verity device corrupted\n");
+		emergency_restart();
+	}
 
 	if (v->mode == DM_VERITY_MODE_PANIC)
 		panic("dm-verity device corrupted");
@@ -597,6 +599,23 @@ static void verity_finish_io(struct dm_verity_io *io, blk_status_t status)
 	if (!static_branch_unlikely(&use_bh_wq_enabled) || !io->in_bh)
 		verity_fec_finish_io(io);
 
+	if (unlikely(status != BLK_STS_OK) &&
+	    unlikely(!(bio->bi_opf & REQ_RAHEAD)) &&
+	    !verity_is_system_shutting_down()) {
+		if (v->mode == DM_VERITY_MODE_RESTART ||
+		    v->mode == DM_VERITY_MODE_PANIC)
+			DMERR_LIMIT("%s has error: %s", v->data_dev->name,
+				    blk_status_to_str(status));
+
+		if (v->mode == DM_VERITY_MODE_RESTART) {
+			pr_emerg("dm-verity device corrupted\n");
+			emergency_restart();
+		}
+
+		if (v->mode == DM_VERITY_MODE_PANIC)
+			panic("dm-verity device corrupted");
+	}
+
 	bio_endio(bio);
 }

View File

@@ -127,7 +127,7 @@ int verity_verify_root_hash(const void *root_hash, size_t root_hash_len,
 #endif
 				VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
 #ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_PLATFORM_KEYRING
-	if (ret == -ENOKEY)
+	if (ret == -ENOKEY || ret == -EKEYREJECTED)
 		ret = verify_pkcs7_signature(root_hash, root_hash_len, sig_data,
 					     sig_len,
 					     VERIFY_USE_PLATFORM_KEYRING,

View File

@@ -2030,10 +2030,15 @@ static void dm_submit_bio(struct bio *bio)
 	struct dm_table *map;
 
 	map = dm_get_live_table(md, &srcu_idx);
+	if (unlikely(!map)) {
+		DMERR_LIMIT("%s: mapping table unavailable, erroring io",
+			    dm_device_name(md));
+		bio_io_error(bio);
+		goto out;
+	}
 
-	/* If suspended, or map not yet available, queue this IO for later */
-	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags)) ||
-	    unlikely(!map)) {
+	/* If suspended, queue this IO for later */
+	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
 		if (bio->bi_opf & REQ_NOWAIT)
 			bio_wouldblock_error(bio);
 		else if (bio->bi_opf & REQ_RAHEAD)

View File

@@ -109,7 +109,6 @@ void dm_zone_endio(struct dm_io *io, struct bio *clone);
 int dm_blk_report_zones(struct gendisk *disk, sector_t sector,
 			unsigned int nr_zones, report_zones_cb cb, void *data);
 bool dm_is_zone_write(struct mapped_device *md, struct bio *bio);
-int dm_zone_map_bio(struct dm_target_io *io);
 int dm_zone_get_reset_bitmap(struct mapped_device *md, struct dm_table *t,
 			     sector_t sector, unsigned int nr_zones,
 			     unsigned long *need_reset);
@@ -119,10 +118,6 @@ static inline bool dm_is_zone_write(struct mapped_device *md, struct bio *bio)
 {
 	return false;
 }
-static inline int dm_zone_map_bio(struct dm_target_io *tio)
-{
-	return DM_MAPIO_KILL;
-}
 #endif
 
 /*

View File

@@ -524,7 +524,6 @@ int dm_post_suspending(struct dm_target *ti);
 int dm_noflush_suspending(struct dm_target *ti);
 void dm_accept_partial_bio(struct bio *bio, unsigned int n_sectors);
 void dm_submit_bio_remap(struct bio *clone, struct bio *tgt_clone);
-union map_info *dm_get_rq_mapinfo(struct request *rq);
 
 #ifdef CONFIG_BLK_DEV_ZONED
 struct dm_report_zones_args {