linux/io_uring
Prasad Singamsetty c34fc6f26a fs: Initial atomic write support
An atomic write is a write issued with torn-write protection, meaning
that on a power failure or any other hardware failure, either all or none
of the data from the write will be stored, never a mix of old and new data.

Userspace may add flag RWF_ATOMIC to pwritev2() to indicate that the
write is to be issued with torn-write prevention, according to special
alignment and length rules.
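
As a minimal userspace sketch of the call described above (not part of the
patch): the wrapper name atomic_pwrite() is hypothetical, and the fallback
RWF_ATOMIC definition is an assumption for older libc headers, with the
value taken from include/uapi/linux/fs.h. A single iovec is used because
the initial implementation restricts iovcnt to 1.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/uio.h>

/* Assumed fallback for headers that predate RWF_ATOMIC (uapi value). */
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/*
 * Hypothetical wrapper: issue one buffer as a single torn-write-protected
 * write. Returns the number of bytes written, or -1 with errno set (for
 * example when the file or the running kernel does not support RWF_ATOMIC).
 */
static ssize_t atomic_pwrite(int fd, const void *buf, size_t len, off_t offset)
{
	struct iovec iov = {
		.iov_base = (void *)buf,	/* pwritev2() only reads from it */
		.iov_len = len,
	};

	/* Exactly one iovec, matching the initial iovcnt == 1 restriction. */
	return pwritev2(fd, &iov, 1, offset, RWF_ATOMIC);
}
```

The caller is still responsible for choosing a compliant length and offset;
the wrapper only sets the flag.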

For any syscall interface utilizing struct kiocb, add IOCB_ATOMIC for the
kiocb->ki_flags field to indicate the same.

A call to statx will give the relevant atomic write info for a file:
- atomic_write_unit_min
- atomic_write_unit_max
- atomic_write_segments_max

Both the min and max values must be powers of 2.

Applications can avail of the atomic write feature by ensuring that the
total length of a write is a power-of-2 in size and lies between
atomic_write_unit_min and atomic_write_unit_max, inclusive. Applications
must also ensure that the write is at a naturally-aligned offset in the
file with respect to the total write length. The value in
atomic_write_segments_max indicates the upper limit for iovcnt.
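
The length and alignment rules above can be sketched as a userspace
pre-check (an illustration only, not the kernel's validation code). The
function names are hypothetical, and unit_min/unit_max are assumed to be
the values reported by statx for the file.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when n is a non-zero power of two. */
static bool is_pow2_u64(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical pre-check mirroring the rules in the text:
 *  - total length is a power of two,
 *  - unit_min <= length <= unit_max,
 *  - offset is naturally aligned to the length.
 */
static bool atomic_write_params_ok(uint64_t offset, uint64_t len,
				   uint32_t unit_min, uint32_t unit_max)
{
	if (!is_pow2_u64(len))
		return false;
	if (len < unit_min || len > unit_max)
		return false;
	/* len is a power of two, so a mask test checks natural alignment. */
	return (offset & (len - 1)) == 0;
}
```

For example, with unit_min = 512 and unit_max = 65536, a 4096-byte write at
offset 8192 passes, while the same write at offset 2048 fails the alignment
rule and a 3072-byte write fails the power-of-2 rule.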

Add file mode flag FMODE_CAN_ATOMIC_WRITE, so files which do not have the
flag set will have RWF_ATOMIC rejected and not just ignored.

Add a type argument to kiocb_set_rw_flags() to allow reads which have
RWF_ATOMIC set to be rejected.

Helper function generic_atomic_write_valid() can be used by FSes to verify
compliant writes. There we check that the iov_iter type is ubuf, which
implies iovcnt==1 for pwritev2(); this is an initial restriction for
atomic_write_segments_max. Initially the only user will be the bdev file
operations write handler. We will rely on the block BIO submission path to
ensure write sizes are compliant for the bdev, so we don't need to check
atomic write sizes yet.

Signed-off-by: Prasad Singamsetty <prasad.singamsetty@oracle.com>
jpg: merge into single patch and much rewrite
Acked-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20240620125359.2684798-4-john.g.garry@oracle.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-06-20 15:19:17 -06:00
advise.c io_uring: always go async for unsupported fadvise flags 2023-01-29 15:18:26 -07:00
advise.h
alloc_cache.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
cancel.c io_uring: fix warnings on shadow variables 2024-04-15 08:10:26 -06:00
cancel.h io_uring/cancel: don't default to setting req->work.cancel_seq 2024-02-08 13:27:06 -07:00
epoll.c io_uring: undeprecate epoll_ctl support 2023-05-26 20:22:41 -06:00
epoll.h
fdinfo.c io_uring: fix warnings on shadow variables 2024-04-15 08:10:26 -06:00
fdinfo.h
filetable.c io_uring/filetable: don't unnecessarily clear/reset bitmap 2024-05-08 08:27:45 -06:00
filetable.h io_uring: expand main struct io_kiocb flags to 64-bits 2024-02-08 13:27:03 -07:00
fs.c io_uring/fs: consider link->flags when getting path for LINKAT 2023-11-20 09:01:42 -07:00
fs.h
futex.c io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
futex.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
io_uring.c io_uring: remove checks for NULL 'sq_offset' 2024-05-22 11:13:44 -06:00
io_uring.h io_uring: check for non-NULL file pointer in io_file_can_poll() 2024-06-01 12:25:35 -06:00
io-wq.c io_uring/io-wq: avoid garbage value of 'match' in io_wq_enqueue() 2024-06-04 07:39:00 -06:00
io-wq.h io_uring: break out of iowq iopoll on teardown 2023-09-07 09:02:27 -06:00
kbuf.c io_uring/kbuf: add helpers for getting/peeking multiple buffers 2024-04-22 11:26:01 -06:00
kbuf.h io_uring/kbuf: add helpers for getting/peeking multiple buffers 2024-04-22 11:26:01 -06:00
Makefile io_uring: move mapping/allocation helpers to a separate file 2024-04-15 08:10:26 -06:00
memmap.c io_uring: don't attempt to mmap larger than what the user asks for 2024-05-29 09:53:14 -06:00
memmap.h io_uring: move mapping/allocation helpers to a separate file 2024-04-15 08:10:26 -06:00
msg_ring.c io_uring/msg_ring: cleanup posting to IOPOLL vs !IOPOLL ring 2024-05-01 17:13:51 -06:00
msg_ring.h
napi.c io_uring/napi: fix timeout calculation 2024-06-04 07:32:45 -06:00
napi.h io_uring: add register/unregister napi function 2024-02-09 11:54:32 -07:00
net.c io_uring/net: assign kmsg inq/flags before buffer selection 2024-05-30 14:04:37 -06:00
net.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
nop.c io_uring: support to inject result for NOP 2024-05-10 06:09:45 -06:00
nop.h
notif.c io_uring/notif: disable LAZY_WAKE for linked notifs 2024-04-30 13:06:27 -06:00
notif.h io_uring/notif: implement notification stacking 2024-04-22 19:31:18 -06:00
opdef.c io_uring/rw: Free iovec before cleaning async data 2024-05-30 08:33:01 -06:00
opdef.h io_uring: drop ->prep_async() 2024-04-15 08:10:25 -06:00
openclose.c io_uring: enable audit and restrict cred override for IORING_OP_FIXED_FD_INSTALL 2024-01-23 15:25:14 -07:00
openclose.h io_uring/openclose: add support for IORING_OP_FIXED_FD_INSTALL 2023-12-12 07:42:57 -07:00
poll.c io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
poll.h io_uring/poll: shrink alloc cache size to 32 2024-04-15 08:10:25 -06:00
refs.h io_uring: kill dead code in io_req_complete_post 2024-04-15 08:10:26 -06:00
register.c io_uring: fix possible deadlock in io_register_iowq_max_workers() 2024-06-04 07:39:17 -06:00
register.h io_uring/register: move io_uring_register(2) related code to register.c 2023-12-19 08:54:20 -07:00
rsrc.c io_uring: move mapping/allocation helpers to a separate file 2024-04-15 08:10:26 -06:00
rsrc.h io_uring: remove io_req_put_rsrc_locked() 2024-04-15 08:10:26 -06:00
rw.c fs: Initial atomic write support 2024-06-20 15:19:17 -06:00
rw.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
slist.h io_uring: silence variable ‘prev’ set but not used warning 2023-03-09 10:10:58 -07:00
splice.c splice: return type ssize_t from all helpers 2023-12-12 16:19:59 +01:00
splice.h
sqpoll.c io_uring/sqpoll: ensure that normal task_work is also run timely 2024-05-21 13:41:14 -06:00
sqpoll.h io_uring/sqpoll: statistics of the true utilization of sq threads 2024-03-01 06:28:19 -07:00
statx.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
statx.h
sync.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
sync.h
tctx.c io_uring: Add io_uring_setup flag to pre-register ring fd and never install it 2023-05-16 08:06:00 -06:00
tctx.h
timeout.c io_uring/timeout: remove duplicate initialization of the io_timeout list. 2024-04-15 08:10:27 -06:00
timeout.h
truncate.c io_uring: add support for ftruncate 2024-02-09 09:04:39 -07:00
truncate.h io_uring: add support for ftruncate 2024-02-09 09:04:39 -07:00
uring_cmd.c io_uring: separate header for exported net bits 2024-04-15 08:10:26 -06:00
uring_cmd.h io_uring/alloc_cache: switch to array based caching 2024-04-15 08:10:25 -06:00
waitid.c io_uring: remove struct io_tw_state::locked 2024-04-15 08:10:24 -06:00
waitid.h io_uring: add IORING_OP_WAITID support 2023-09-21 12:04:45 -06:00
xattr.c io_uring: use file_mnt_idmap helper 2024-02-06 19:55:14 -07:00
xattr.h