b686631865
Requests that triggers flushing volatile writeback cache to disk (barriers) have significant effect to overall performance. Block layer has sophisticated engine for combining several flush requests into one. But there is no statistics for actual flushes executed by disk. Requests which trigger flushes usually are barriers - zero-size writes. This patch adds two iostat counters into /sys/class/block/$dev/stat and /proc/diskstats - count of completed flush requests and their total time. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
297 lines
11 KiB
Plaintext
297 lines
11 KiB
Plaintext
What: /sys/block/<disk>/stat
|
|
Date: February 2008
|
|
Contact: Jerome Marchand <jmarchan@redhat.com>
|
|
Description:
|
|
The /sys/block/<disk>/stat files displays the I/O
|
|
statistics of disk <disk>. They contain 11 fields:
|
|
1 - reads completed successfully
|
|
2 - reads merged
|
|
3 - sectors read
|
|
4 - time spent reading (ms)
|
|
5 - writes completed
|
|
6 - writes merged
|
|
7 - sectors written
|
|
8 - time spent writing (ms)
|
|
9 - I/Os currently in progress
|
|
10 - time spent doing I/Os (ms)
|
|
11 - weighted time spent doing I/Os (ms)
|
|
12 - discards completed
|
|
13 - discards merged
|
|
14 - sectors discarded
|
|
15 - time spent discarding (ms)
|
|
16 - flush requests completed
|
|
17 - time spent flushing (ms)
|
|
For more details refer Documentation/admin-guide/iostats.rst
|
|
|
|
|
|
What: /sys/block/<disk>/<part>/stat
|
|
Date: February 2008
|
|
Contact: Jerome Marchand <jmarchan@redhat.com>
|
|
Description:
|
|
The /sys/block/<disk>/<part>/stat files display the
|
|
I/O statistics of partition <part>. The format is the
|
|
same as the above-written /sys/block/<disk>/stat
|
|
format.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/format
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Metadata format for integrity capable block device.
|
|
E.g. T10-DIF-TYPE1-CRC.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/read_verify
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Indicates whether the block layer should verify the
|
|
integrity of read requests serviced by devices that
|
|
support sending integrity metadata.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/tag_size
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Number of bytes of integrity tag space available per
|
|
512 bytes of data.
|
|
|
|
|
|
What: /sys/block/<disk>/integrity/device_is_integrity_capable
|
|
Date: July 2014
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Indicates whether a storage device is capable of storing
|
|
integrity metadata. Set if the device is T10 PI-capable.
|
|
|
|
What: /sys/block/<disk>/integrity/protection_interval_bytes
|
|
Date: July 2015
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Describes the number of data bytes which are protected
|
|
by one integrity tuple. Typically the device's logical
|
|
block size.
|
|
|
|
What: /sys/block/<disk>/integrity/write_generate
|
|
Date: June 2008
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Indicates whether the block layer should automatically
|
|
generate checksums for write requests bound for
|
|
devices that support receiving integrity metadata.
|
|
|
|
What: /sys/block/<disk>/alignment_offset
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report a physical block size that is
|
|
bigger than the logical block size (for instance a drive
|
|
with 4KB physical sectors exposing 512-byte logical
|
|
blocks to the operating system). This parameter
|
|
indicates how many bytes the beginning of the device is
|
|
offset from the disk's natural alignment.
|
|
|
|
What: /sys/block/<disk>/<partition>/alignment_offset
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report a physical block size that is
|
|
bigger than the logical block size (for instance a drive
|
|
with 4KB physical sectors exposing 512-byte logical
|
|
blocks to the operating system). This parameter
|
|
indicates how many bytes the beginning of the partition
|
|
is offset from the disk's natural alignment.
|
|
|
|
What: /sys/block/<disk>/queue/logical_block_size
|
|
Date: May 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
This is the smallest unit the storage device can
|
|
address. It is typically 512 bytes.
|
|
|
|
What: /sys/block/<disk>/queue/physical_block_size
|
|
Date: May 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
This is the smallest unit a physical storage device can
|
|
write atomically. It is usually the same as the logical
|
|
block size but may be bigger. One example is SATA
|
|
drives with 4KB sectors that expose a 512-byte logical
|
|
block size to the operating system. For stacked block
|
|
devices the physical_block_size variable contains the
|
|
maximum physical_block_size of the component devices.
|
|
|
|
What: /sys/block/<disk>/queue/minimum_io_size
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report a granularity or preferred
|
|
minimum I/O size which is the smallest request the
|
|
device can perform without incurring a performance
|
|
penalty. For disk drives this is often the physical
|
|
block size. For RAID arrays it is often the stripe
|
|
chunk size. A properly aligned multiple of
|
|
minimum_io_size is the preferred request size for
|
|
workloads where a high number of I/O operations is
|
|
desired.
|
|
|
|
What: /sys/block/<disk>/queue/optimal_io_size
|
|
Date: April 2009
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Storage devices may report an optimal I/O size, which is
|
|
the device's preferred unit for sustained I/O. This is
|
|
rarely reported for disk drives. For RAID arrays it is
|
|
usually the stripe width or the internal track size. A
|
|
properly aligned multiple of optimal_io_size is the
|
|
preferred request size for workloads where sustained
|
|
throughput is desired. If no optimal I/O size is
|
|
reported this file contains 0.
|
|
|
|
What: /sys/block/<disk>/queue/nomerges
|
|
Date: January 2010
|
|
Contact:
|
|
Description:
|
|
Standard I/O elevator operations include attempts to
|
|
merge contiguous I/Os. For known random I/O loads these
|
|
attempts will always fail and result in extra cycles
|
|
being spent in the kernel. This allows one to turn off
|
|
this behavior on one of two ways: When set to 1, complex
|
|
merge checks are disabled, but the simple one-shot merges
|
|
with the previous I/O request are enabled. When set to 2,
|
|
all merge tries are disabled. The default value is 0 -
|
|
which enables all types of merge tries.
|
|
|
|
What: /sys/block/<disk>/discard_alignment
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may
|
|
internally allocate space in units that are bigger than
|
|
the exported logical block size. The discard_alignment
|
|
parameter indicates how many bytes the beginning of the
|
|
device is offset from the internal allocation unit's
|
|
natural alignment.
|
|
|
|
What: /sys/block/<disk>/<partition>/discard_alignment
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may
|
|
internally allocate space in units that are bigger than
|
|
the exported logical block size. The discard_alignment
|
|
parameter indicates how many bytes the beginning of the
|
|
partition is offset from the internal allocation unit's
|
|
natural alignment.
|
|
|
|
What: /sys/block/<disk>/queue/discard_granularity
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may
|
|
internally allocate space using units that are bigger
|
|
than the logical block size. The discard_granularity
|
|
parameter indicates the size of the internal allocation
|
|
unit in bytes if reported by the device. Otherwise the
|
|
discard_granularity will be set to match the device's
|
|
physical block size. A discard_granularity of 0 means
|
|
that the device does not support discard functionality.
|
|
|
|
What: /sys/block/<disk>/queue/discard_max_bytes
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Devices that support discard functionality may have
|
|
internal limits on the number of bytes that can be
|
|
trimmed or unmapped in a single operation. Some storage
|
|
protocols also have inherent limits on the number of
|
|
blocks that can be described in a single command. The
|
|
discard_max_bytes parameter is set by the device driver
|
|
to the maximum number of bytes that can be discarded in
|
|
a single operation. Discard requests issued to the
|
|
device must not exceed this limit. A discard_max_bytes
|
|
value of 0 means that the device does not support
|
|
discard functionality.
|
|
|
|
What: /sys/block/<disk>/queue/discard_zeroes_data
|
|
Date: May 2011
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Will always return 0. Don't rely on any specific behavior
|
|
for discards, and don't read this file.
|
|
|
|
What: /sys/block/<disk>/queue/write_same_max_bytes
|
|
Date: January 2012
|
|
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Description:
|
|
Some devices support a write same operation in which a
|
|
single data block can be written to a range of several
|
|
contiguous blocks on storage. This can be used to wipe
|
|
areas on disk or to initialize drives in a RAID
|
|
configuration. write_same_max_bytes indicates how many
|
|
bytes can be written in a single write same command. If
|
|
write_same_max_bytes is 0, write same is not supported
|
|
by the device.
|
|
|
|
What: /sys/block/<disk>/queue/write_zeroes_max_bytes
|
|
Date: November 2016
|
|
Contact: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
|
|
Description:
|
|
Devices that support write zeroes operation in which a
|
|
single request can be issued to zero out the range of
|
|
contiguous blocks on storage without having any payload
|
|
in the request. This can be used to optimize writing zeroes
|
|
to the devices. write_zeroes_max_bytes indicates how many
|
|
bytes can be written in a single write zeroes command. If
|
|
write_zeroes_max_bytes is 0, write zeroes is not supported
|
|
by the device.
|
|
|
|
What: /sys/block/<disk>/queue/zoned
|
|
Date: September 2016
|
|
Contact: Damien Le Moal <damien.lemoal@wdc.com>
|
|
Description:
|
|
zoned indicates if the device is a zoned block device
|
|
and the zone model of the device if it is indeed zoned.
|
|
The possible values indicated by zoned are "none" for
|
|
regular block devices and "host-aware" or "host-managed"
|
|
for zoned block devices. The characteristics of
|
|
host-aware and host-managed zoned block devices are
|
|
described in the ZBC (Zoned Block Commands) and ZAC
|
|
(Zoned Device ATA Command Set) standards. These standards
|
|
also define the "drive-managed" zone model. However,
|
|
since drive-managed zoned block devices do not support
|
|
zone commands, they will be treated as regular block
|
|
devices and zoned will report "none".
|
|
|
|
What: /sys/block/<disk>/queue/nr_zones
|
|
Date: November 2018
|
|
Contact: Damien Le Moal <damien.lemoal@wdc.com>
|
|
Description:
|
|
nr_zones indicates the total number of zones of a zoned block
|
|
device ("host-aware" or "host-managed" zone model). For regular
|
|
block devices, the value is always 0.
|
|
|
|
What: /sys/block/<disk>/queue/chunk_sectors
|
|
Date: September 2016
|
|
Contact: Hannes Reinecke <hare@suse.com>
|
|
Description:
|
|
chunk_sectors has different meaning depending on the type
|
|
of the disk. For a RAID device (dm-raid), chunk_sectors
|
|
indicates the size in 512B sectors of the RAID volume
|
|
stripe segment. For a zoned block device, either
|
|
host-aware or host-managed, chunk_sectors indicates the
|
|
size in 512B sectors of the zones of the device, with
|
|
the eventual exception of the last zone of the device
|
|
which may be smaller.
|
|
|
|
What: /sys/block/<disk>/queue/io_timeout
|
|
Date: November 2018
|
|
Contact: Weiping Zhang <zhangweiping@didiglobal.com>
|
|
Description:
|
|
io_timeout is the request timeout in milliseconds. If a request
|
|
does not complete in this time then the block driver timeout
|
|
handler is invoked. That timeout handler can decide to retry
|
|
the request, to fail it or to start a device recovery strategy.
|