linux/block
Davide Zini 2d31c684a0 block, bfq: inject I/O to underutilized actuators
The main service scheme of BFQ for sync I/O is serving one sync
bfq_queue at a time, for a while. In particular, BFQ enforces this
scheme when it deems the latter necessary to boost throughput or
to preserve service guarantees. Unfortunately, when BFQ enforces
this policy, only one actuator at a time gets served for a while,
because each bfq_queue contains I/O only for one actuator. The
other actuators may remain underutilized.

Actually, BFQ may serve (inject) extra I/O, taken from other
bfq_queues, in parallel with that of the in-service queue. This
injection mechanism may provide the ground for dealing also with
the above actuator-underutilization problem. Yet BFQ does not take
the actuator load into account when choosing which queue to pick
extra I/O from. In addition, BFQ may happen to inject extra I/O
only when the in-service queue is temporarily empty.

In view of these facts, this commit extends the
injection mechanism in such a way that the latter:
(1) takes into account also the actuator load;
(2) checks such a load on each dispatch, and injects I/O for an
    underutilized actuator, if there is one and there is I/O for it.

To perform the check in (2), this commit introduces a load
threshold, currently set to 4.  A linear scan of each actuator is
performed, until an actuator is found for which the following two
conditions hold: the load of the actuator is below the threshold,
and there is at least one non-in-service queue that contains I/O
for that actuator. If such a pair (actuator, queue) is found, then
the head request of that queue is returned for dispatch, instead
of the head request of the in-service queue.

We have set the threshold, empirically, to the minimum possible
value for which an actuator is fully utilized, or close to be
fully utilized. By doing so, injected I/O 'steals' as few
drive-queue slots as possibile to the in-service queue. This
reduces as much as possible the probability that the service of
I/O from the in-service bfq_queue gets delayed because of slot
exhaustion, i.e., because all the slots of the drive queue are
filled with I/O injected from other queues (NCQ provides for 32
slots).

This new mechanism also counters actuator underutilization in the
case of asymmetric configurations of bfq_queues. Namely if there
are few bfq_queues containing I/O for some actuators and many
bfq_queues containing I/O for other actuators. Or if the
bfq_queues containing I/O for some actuators have lower weights
than the other bfq_queues.

Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Davide Zini <davidezini2@gmail.com>
Link: https://lore.kernel.org/r/20230103145503.71712-8-paolo.valente@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-01-29 15:18:33 -07:00
..
partitions block: don't add partitions if GD_SUPPRESS_PART_SCAN is set 2022-09-03 11:29:03 -06:00
badblocks.c block/badblocks: Remove redundant assignments 2022-04-23 07:15:26 -06:00
bdev.c block: bdev & blktrace: use consistent function doc. notation 2022-12-01 09:16:46 -07:00
bfq-cgroup.c block, bfq: inject I/O to underutilized actuators 2023-01-29 15:18:33 -07:00
bfq-iosched.c block, bfq: inject I/O to underutilized actuators 2023-01-29 15:18:33 -07:00
bfq-iosched.h block, bfq: inject I/O to underutilized actuators 2023-01-29 15:18:33 -07:00
bfq-wf2q.c block, bfq: inject I/O to underutilized actuators 2023-01-29 15:18:33 -07:00
bio-integrity.c block: pass struct queue_limits to the bio splitting helpers 2022-08-02 21:08:53 -06:00
bio.c Revert "block: bio_copy_data_iter" 2023-01-04 14:43:27 -07:00
blk-cgroup-fc-appid.c cgroup: Homogenize cgroup_get_from_id() return value 2022-08-26 10:57:41 -10:00
blk-cgroup-rwstat.c
blk-cgroup-rwstat.h block: Use the new blk_opf_t type 2022-07-14 12:14:30 -06:00
blk-cgroup.c blk-cgroup: fix missing pd_online_fn() while activating policy 2023-01-16 19:04:07 -07:00
blk-cgroup.h blk-cgroup: Optimize blkcg_rstat_flush() 2022-11-16 16:58:44 -07:00
blk-core.c block: Drop spurious might_sleep() from blk_put_queue() 2023-01-08 20:29:28 -07:00
blk-crypto-fallback.c treewide: use get_random_bytes() when possible 2022-10-11 17:42:58 -06:00
blk-crypto-internal.h blk-crypto: pass a gendisk to blk_crypto_sysfs_{,un}register 2022-11-30 11:09:00 -07:00
blk-crypto-profile.c blk-crypto: Add a missing include directive 2022-11-23 10:38:54 -07:00
blk-crypto-sysfs.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
blk-crypto.c for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
blk-flush.c block: change request end_io handler to pass back a return value 2022-09-30 07:49:09 -06:00
blk-ia-ranges.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
blk-integrity.c
blk-ioc.c block: fix default IO priority handling again 2022-06-27 06:29:12 -06:00
blk-iocost.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
blk-iolatency.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
blk-ioprio.c blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-ioprio.h blk-ioprio: pass a gendisk to blk_ioprio_init and blk_ioprio_exit 2022-09-26 19:09:31 -06:00
blk-lib.c blk-lib: fix blkdev_issue_secure_erase 2022-09-15 00:25:17 -06:00
blk-map.c block: set FOLL_PCI_P2PDMA in bio_map_user_iov() 2022-11-09 11:29:21 -07:00
blk-merge.c block: don't allow splitting of a REQ_NOWAIT bio 2023-01-04 13:24:44 -07:00
blk-mq-cpumap.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-debugfs-zoned.c block: move zone related fields to struct gendisk 2022-07-06 06:46:26 -06:00
blk-mq-debugfs.c for-6.1/block-2022-10-03 2022-10-07 09:19:14 -07:00
blk-mq-debugfs.h block: remove per-disk debugfs files in blk_unregister_queue 2022-06-17 07:31:05 -06:00
blk-mq-pci.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-rdma.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq-sched.c block: split elevator_switch 2022-11-01 09:12:24 -06:00
blk-mq-sched.h
blk-mq-sysfs.c blk-mq: fix possible memleak when register 'hctx' failed 2022-11-25 06:34:03 -07:00
blk-mq-tag.c sbitmap: fix batched wait_cnt accounting 2022-09-12 00:10:34 -06:00
blk-mq-tag.h blk-mq: blk_mq_tag_busy is no need to return a value 2022-06-27 06:29:12 -06:00
blk-mq-virtio.c block: Change the return type of blk_mq_map_queues() into void 2022-08-22 10:07:53 -06:00
blk-mq.c block: fix hctx checks for batch allocation 2023-01-17 09:56:52 -07:00
blk-mq.h blk-mq: move the srcu_struct used for quiescing to the tagset 2022-11-02 08:35:34 -06:00
blk-pm.c
blk-pm.h
blk-rq-qos.c block/rq_qos: Use atomic_try_cmpxchg in atomic_inc_below 2022-07-12 14:38:52 -06:00
blk-rq-qos.h block/blk-rq-qos: delete useless enmu RQ_QOS_IOPRIO 2022-09-21 19:50:53 -06:00
blk-settings.c for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
blk-stat.c
blk-stat.h
blk-sysfs.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
blk-throttle.c blk-throttle: Use more suitable time_after check for update of slice_start 2022-12-05 13:45:31 -07:00
blk-throttle.h blk-throttle: pass a gendisk to blk_throtl_cancel_bios 2022-09-26 19:17:28 -06:00
blk-timeout.c
blk-wbt.c blk-wbt: don't enable throttling if default elevator is bfq 2022-10-23 18:59:17 -06:00
blk-wbt.h blk-wbt: don't show valid wbt_lat_usec in sysfs while wbt is disabled 2022-10-23 18:59:17 -06:00
blk-zoned.c block: adapt blk_mq_plug() to not plug for writes that require a zone lock 2022-09-29 07:45:47 -06:00
blk.h for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
bounce.c block: change the blk_queue_bounce calling convention 2022-08-02 17:22:54 -06:00
bsg-lib.c blk-mq: move the call to blk_put_queue out of blk_mq_destroy_queue 2022-10-25 08:25:10 -06:00
bsg.c Driver Core changes for 6.2-rc1 2022-12-16 03:54:54 -08:00
disk-events.c block: remove genhd.h 2022-02-02 07:49:59 -07:00
elevator.c block: untangle request_queue refcounting from sysfs 2022-11-30 11:09:00 -07:00
elevator.h block: add proper helpers for elevator_type module refcount management 2022-10-23 18:59:17 -06:00
fops.c block: remove blkdev_writepages 2022-11-16 11:32:53 -07:00
genhd.c block-2023-01-06 2023-01-06 13:12:42 -08:00
holder.c block: don't allow a disk link holder to itself 2022-11-16 15:19:56 -07:00
ioctl.c block: Do not reread partition table on exclusively open device 2022-12-01 07:44:03 -07:00
ioprio.c block: Fix handling of tasks without ioprio in ioprio_get(2) 2022-06-27 06:29:12 -06:00
Kconfig block: Remove "select SRCU" 2023-01-05 08:50:10 -07:00
Kconfig.iosched
kyber-iosched.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
Makefile blk-cgroup: move blkcg_{get,set}_fc_appid out of line 2022-05-02 14:06:20 -06:00
mq-deadline.c block: mq-deadline: Rename deadline_is_seq_writes() 2022-11-28 19:27:45 -07:00
opal_proto.h block: sed-opal: Add ioctl to return device status 2022-08-22 07:52:51 -06:00
sed-opal.c for-6.2/block-2022-12-08 2022-12-13 10:43:59 -08:00
t10-pi.c block: add pi for extended integrity 2022-03-07 12:48:35 -07:00