linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-26 06:02:05 +00:00

Mateusz Jończyk 36a5c03f23 md/raid1: set max_sectors during early return from choose_slow_rdev()

Linux 6.9+ is unable to start a degraded RAID1 array with one drive, when that drive has a write-mostly flag set. During such an attempt, the following assertion in bio_split() is hit:

    BUG_ON(sectors <= 0);

Call Trace:
    ? bio_split+0x96/0xb0
    ? exc_invalid_op+0x53/0x70
    ? bio_split+0x96/0xb0
    ? asm_exc_invalid_op+0x1b/0x20
    ? bio_split+0x96/0xb0
    ? raid1_read_request+0x890/0xd20
    ? __call_rcu_common.constprop.0+0x97/0x260
    raid1_make_request+0x81/0xce0
    ? __get_random_u32_below+0x17/0x70
    ? new_slab+0x2b3/0x580
    md_handle_request+0x77/0x210
    md_submit_bio+0x62/0xa0
    __submit_bio+0x17b/0x230
    submit_bio_noacct_nocheck+0x18e/0x3c0
    submit_bio_noacct+0x244/0x670

After investigation, it turned out that choose_slow_rdev() does not set the value of max_sectors in some cases and because of it, raid1_read_request calls bio_split with sectors == 0.

Fix it by filling in this variable.

This bug was introduced in commit `dfa8ecd167` ("md/raid1: factor out choose_slow_rdev() from read_balance()") but apparently hidden until commit `0091c5a269` ("md/raid1: factor out helpers to choose the best rdev from read_balance()") shortly thereafter.

Cc: stable@vger.kernel.org # 6.9.x+
Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Fixes: `dfa8ecd167` ("md/raid1: factor out choose_slow_rdev() from read_balance()")
Cc: Song Liu <song@kernel.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Paul Luse <paul.e.luse@linux.intel.com>
Cc: Xiao Ni <xni@redhat.com>
Cc: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Link: https://lore.kernel.org/linux-raid/20240706143038.7253-1-mat.jonczyk@o2.pl/

--

Tested on both Linux 6.10 and 6.9.8.

Inside a VM, mdadm testsuite for RAID1 on 6.10 did not find any problems:
    ./test --dev=loop --no-error --raidtype=raid1
(on 6.9.8 there was one failure, caused by external bitmap support not compiled in).

Notes:
- I was reliably getting deadlocks when adding / removing devices on such an array - while the array was loaded with fsstress with 20 concurrent processes. When the array was idle or loaded with fsstress with 8 processes, no such deadlocks happened in my tests. This occurred also on unpatched Linux 6.8.0 though, but not on 6.1.97-rc1, so this is likely an independent regression (to be investigated).
- I was also getting deadlocks when adding / removing the bitmap on the array in similar conditions - this happened on Linux 6.1.97-rc1 also though. fsstress with 8 concurrent processes did cause it only once during many tests.
- in my testing, there was once a problem with hot adding an internal bitmap to the array:
    mdadm: Cannot add bitmap while array is resyncing or reshaping etc.
    mdadm: failed to set internal bitmap.
  even though no such reshaping was happening according to /proc/mdstat. This seems unrelated, though.

Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240711202316.10775-1-mat.jonczyk@o2.pl

2024-07-12 01:30:38 +00:00
..
bcache	bcache: work around a __bitwise to bool conversion sparse warning	2024-06-28 10:25:00 -06:00
dm-vdo	bd_inode series	2024-05-21 09:51:42 -07:00
persistent-data	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-audit.c
dm-audit.h
dm-bio-prison-v1.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-bio-prison-v1.h	dm bio prison v1: add dm_cell_key_has_valid_range	2023-03-30 15:57:51 -04:00
dm-bio-prison-v2.c	dm: use bio_list_merge_init	2024-04-01 11:53:37 -06:00
dm-bio-prison-v2.h
dm-bio-record.h
dm-bufio.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-builtin.c
dm-cache-background-tracker.c
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c	Many singleton patches against the MM code. The patch series which are	2023-11-02 19:38:47 -10:00
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c	block: remove the discard_alignment flag	2024-06-20 06:53:14 -06:00
dm-clone-metadata.c	bitmap: introduce generic optimized bitmap_size()	2024-04-01 10:49:28 +01:00
dm-clone-metadata.h
dm-clone-target.c	block: remove the discard_alignment flag	2024-06-20 06:53:14 -06:00
dm-core.h	block: move integrity information into queue_limits	2024-06-14 10:20:07 -06:00
dm-crypt.c	block: remove the blk_integrity_profile structure	2024-06-14 10:20:06 -06:00
dm-delay.c	dm-delay: remove timer_lock	2024-05-09 09:10:58 -04:00
dm-dust.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-ebs-target.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-era-target.c	dm: use bio_list_merge_init	2024-04-01 11:53:37 -06:00
dm-exception-store.c
dm-exception-store.h
dm-flakey.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-ima.c
dm-ima.h
dm-init.c	dm: open code dm_get_dev_t in dm_init_init	2023-06-05 10:57:40 -06:00
dm-integrity.c	block: move integrity information into queue_limits	2024-06-14 10:20:07 -06:00
dm-io-rewind.c
dm-io-tracker.h
dm-io.c	dm io: Support IO priority	2024-02-20 14:22:51 -05:00
dm-ioctl.c	dm ioctl: update DM_DRIVER_EMAIL to new dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-kcopyd.c	dm io: Support IO priority	2024-02-20 14:22:51 -05:00
dm-linear.c	dm: shortcut the calls to linear_map and stripe_map	2023-10-06 19:09:25 -04:00
dm-log-userspace-base.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c	dm: always manage discard support in terms of max_hw_discard_sectors	2024-05-20 15:51:19 -04:00
dm-log.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-mpath.c	dm: use bio_list_merge_init	2024-04-01 11:53:37 -06:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-ps-historical-service-time.c
dm-ps-io-affinity.c
dm-ps-queue-length.c
dm-ps-round-robin.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-ps-service-time.c
dm-raid1.c	dm io: Support IO priority	2024-02-20 14:22:51 -05:00
dm-raid.c	md: replace last_sync_action with new enum type	2024-06-12 16:32:57 +00:00
dm-region-hash.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-rq.c
dm-rq.h
dm-snap-persistent.c	dm io: Support IO priority	2024-02-20 14:22:51 -05:00
dm-snap-transient.c
dm-snap.c	dm: always manage discard support in terms of max_hw_discard_sectors	2024-05-20 15:51:19 -04:00
dm-stats.c	dm stats: limit the number of entries	2024-01-30 14:06:44 -05:00
dm-stats.h
dm-stripe.c	- Update DM core to directly call the map function for both the linear	2023-11-01 12:55:54 -10:00
dm-switch.c	dm: add helper macro for simple DM target module init and exit	2023-04-11 12:09:08 -04:00
dm-sysfs.c
dm-table.c	Merge branch 'for-6.11/block-limits' into for-6.11/block	2024-06-20 06:55:20 -06:00
dm-target.c	dm: always manage discard support in terms of max_hw_discard_sectors	2024-05-20 15:51:19 -04:00
dm-thin-metadata.c	- Update DM crypt to allocate compound pages if possible.	2023-06-30 12:16:00 -07:00
dm-thin-metadata.h
dm-thin.c	- Fix DM discard regressions due to DM core switching over to using	2024-05-21 11:43:11 -07:00
dm-uevent.c
dm-uevent.h
dm-unstripe.c	dm: add helper macro for simple DM target module init and exit	2023-04-11 12:09:08 -04:00
dm-verity-fec.c	dm verity: Fix IO priority lost when reading FEC and hash	2024-02-20 14:22:55 -05:00
dm-verity-fec.h
dm-verity-loadpin.c	dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter	2023-06-28 10:43:04 -07:00
dm-verity-target.c	- Convert the DM verity and crypt targets from (ab)using tasklets to	2024-03-13 09:46:51 -07:00
dm-verity-verify-sig.c
dm-verity-verify-sig.h
dm-verity.h	dm-verity: Convert from tasklet to BH workqueue	2024-03-02 10:30:36 -05:00
dm-writecache.c	dm: update relevant MODULE_AUTHOR entries to latest dm-devel mailing list	2024-02-20 14:22:55 -05:00
dm-zero.c	dm: always manage discard support in terms of max_hw_discard_sectors	2024-05-20 15:51:19 -04:00
dm-zone.c	dm: handle REQ_OP_ZONE_RESET_ALL	2024-07-05 00:42:04 -06:00
dm-zoned-metadata.c	block: remove gfp_flags from blkdev_zone_mgmt	2024-02-12 08:41:16 -07:00
dm-zoned-reclaim.c
dm-zoned-target.c	block: move the zoned flag into the features field	2024-06-19 07:58:28 -06:00
dm-zoned.h
dm.c	dm: handle REQ_OP_ZONE_RESET_ALL	2024-07-05 00:42:04 -06:00
dm.h	dm: handle REQ_OP_ZONE_RESET_ALL	2024-07-05 00:42:04 -06:00
Kconfig	Kbuild updates for v6.9	2024-03-21 14:41:00 -07:00
Makefile	dm vdo: use a proper Makefile for dm-vdo	2024-02-20 13:43:17 -05:00
md-autodetect.c	md: Remove deprecated CONFIG_MD_LINEAR	2023-12-19 10:16:51 -08:00
md-bitmap.c	md/md-bitmap: fix writing non bitmap pages	2024-06-11 21:22:21 +00:00
md-bitmap.h	md-bitmap: don't use ->index for pages backing the bitmap file	2023-07-27 00:13:29 -07:00
md-cluster.c	md-cluster: fix no recovery job when adding/re-adding a disk	2024-07-12 01:30:18 +00:00
md-cluster.h	md-cluster: fix no recovery job when adding/re-adding a disk	2024-07-12 01:30:18 +00:00
md.c	md-cluster: fix no recovery job when adding/re-adding a disk	2024-07-12 01:30:18 +00:00
md.h	md-cluster: Constify struct md_cluster_operations	2024-07-04 06:20:27 +00:00
raid0.c	md: set md-specific flags for all queue limits	2024-06-26 09:37:35 -06:00
raid0.h	md/raid0: add discard support for the 'original' layout	2023-06-30 15:43:50 -07:00
raid1-10.c	md/raid1-10: factor out a new helper raid1_should_read_first()	2024-02-29 22:49:46 -08:00
raid1.c	md/raid1: set max_sectors during early return from choose_slow_rdev()	2024-07-12 01:30:38 +00:00
raid1.h	md/raid1: record nonrot rdevs while adding/removing rdevs to conf	2024-02-29 22:49:45 -08:00
raid5-cache.c	md/raid5: remove rcu protection to access rdev from conf	2023-11-27 15:49:05 -08:00
raid5-log.h
raid5-ppl.c	md: remove mddev->queue	2024-03-06 08:59:53 -08:00
raid5.c	md/raid5: recheck if reshape has finished with device_lock held	2024-07-04 06:35:19 +00:00
raid5.h	md/raid5: remove rcu protection to access rdev from conf	2023-11-27 15:49:05 -08:00
raid10.c	md: set md-specific flags for all queue limits	2024-06-26 09:37:35 -06:00
raid10.h	md/raid10: switch to use md_account_bio() for io accounting	2023-07-27 00:13:29 -07:00