linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-23 12:42:02 +00:00

Author	SHA1	Message	Date
Pavel Begunkov	46929b0868	io_uring: add io_commit_cqring_flush() Since __io_commit_cqring_flush users moved to different files, introduce io_commit_cqring_flush() helper and encapsulate all flags testing details inside. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0da03887435dd9869ffe46dcd3962bf104afcca3.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:15 -06:00
Pavel Begunkov	253993210b	io_uring: introduce locking helpers for CQE posting spin_lock(&ctx->completion_lock); /* post CQEs */ io_commit_cqring(ctx); spin_unlock(&ctx->completion_lock); io_cqring_ev_posted(ctx); We have many places repeating this sequence, and the three-function unlock section is not perfect from the maintenance perspective and also makes it harder to add new locking/sync tricks. Introduce two helpers. io_cq_lock(), which is simple and only grabs ->completion_lock, and io_cq_unlock_post() encapsulating the three call section. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/fe0c682bf7f7b55d9be55b0d034be9c1949277dc.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	305bef9887	io_uring: hide eventfd assumptions in eventfd paths Some io_uring-eventfd users assume that there won't be spurious wakeups. That assumption has to be honoured by all io_cqring_ev_posted() callers, which is inconvenient and from time to time leads to problems but should be maintained to not break the userspace. Instead of making the callers track whether a CQE was posted or not, hide it inside io_eventfd_signal(). It saves ->cached_cq_tail it saw last time and triggers the eventfd only when ->cached_cq_tail changed since then. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0ffc66bae37a2513080b601e4370e147faaa72c5.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	b321823a03	io_uring: fix io_poll_remove_all clang warnings clang complains on bitwise operations with bools, add a bit more verbosity to better show that we want to call io_poll_remove_all_table() twice but with different arguments. Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f11d21dcdf9233e0eeb15fa13b858a05a78eb310.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	ba3cdb6fbb	io_uring: improve task exit timeout cancellations Don't spin trying to cancel timeouts that are reachable but not cancellable, e.g. already executing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ab8a7440a60bbdf69ae514f672ad050e43dd1b03.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	affa87db90	io_uring: fix multi ctx cancellation io_uring_try_cancel_requests() loops until there is nothing left to do with the ring, however there might be several rings and they might have dependencies between them, e.g. via poll requests. Instead of cancelling rings one by one, try to cancel them all and only then loop over if we still potentially have some work to do. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8d491fe02d8ac4c77ff38061cf86b9a827e8845c.1655684496.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	d9dee4302a	io_uring: remove ->flush_cqes optimisation It's not clear how widely used IOSQE_CQE_SKIP_SUCCESS is, and how often the ->flush_cqes flag prevents a completion from being flushed. Sometimes it's a high level of concurrency that enables it at least for one CQE, but sometimes it doesn't save much because nobody is waiting on the CQ. Remove ->flush_cqes flag and the optimisation, it should benefit the normal use case. Note, that there is no spurious eventfd problem with that as checks for spuriousness were incorporated into io_eventfd_signal(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/692e81eeddccc096f449a7960365fa7b4a18f8e6.1655637157.git.asml.silence@gmail.com [axboe: remove now dead state->flush_cqes variable] Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	a830ffd287	io_uring: move io_eventfd_signal() Move io_eventfd_signal() in the sources without any changes and kill its forward declaration. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9ebebb3f6f56f5a5448a621e0b6a537720c43334.1655637157.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	9046c6415b	io_uring: reshuffle io_uring/io_uring.h It's a good idea to first do forward declarations and then inline helpers, otherwise we will keep stumbling on dependencies between them. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1d7fa6672ed43f20ccc0c54ae201369ebc3ebfab.1655637157.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	d142c3ec8d	io_uring: remove extra io_commit_cqring() We don't post events in __io_commit_cqring_flush() anymore but send all requests to tw, so no need to do io_commit_cqring() there. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/f2481e32375e749be89c42e4804268b608722cef.1655637157.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Jens Axboe	ad163a7e25	io_uring: move a few private types to local headers Commit 3a3d47fa9cfd ("io_uring: make io_uring_types.h public") moved a bunch of io_uring types to a kernel wide header, so we could make tracing a bit saner rather than pass in a ton of arguments. However, there are a few types in there that are not really needed to be system wide. Move the cancel data and mapped buffers back to the appropriate io_uring local headers. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	48863ffd3e	io_uring: clean up tracing events We have lots of trace events accepting an io_uring request and wanting to print some of its fields like user_data, opcode, flags and so on. However, as trace points were unaware of io_uring structures, we had to pass all the fields as arguments. Teach trace/events/io_uring.h about struct io_kiocb and stop the misery of passing a horde of arguments to trace helpers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/40ff72f92798114e56d400f2b003beb6cde6ef53.1655384063.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	ab1c84d855	io_uring: make io_uring_types.h public Move io_uring types to linux/include, need them public so tracing can see the definitions and we can clean trace/events/io_uring.h Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a15f12e8cb7289b2de0deaddcc7518d98a132d17.1655384063.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	27a9d66fec	io_uring: kill extra io_uring_types.h includes io_uring/io_uring.h already includes io_uring_types.h, no need to include it every time. Kill it in a bunch of places, it prepares us for following patches. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/94d8c943fbe0ef949981c508ddcee7fc1c18850f.1655384063.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	b3659a65be	io_uring: change ->cqe_cached invariant for CQE32 With IORING_SETUP_CQE32 ->cqe_cached doesn't store a real address but rather an implicit offset into cqes. Store the real cqe pointer and increment it accordingly if CQE32. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1ee1838cba16bed96381a006950b36ba640d998c.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	e8c328c391	io_uring: deduplicate io_get_cqe() calls Deduplicate calls to io_get_cqe() from __io_fill_cqe_req(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4fa077986cc3abab7c59ff4e7c390c783885465f.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	ae5735c69b	io_uring: deduplicate __io_fill_cqe_req tracing Deduplicate two trace_io_uring_complete() calls in __io_fill_cqe_req(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/277ed85dba5189ab7d932164b314013a0f0b0fdc.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	68494a65d0	io_uring: introduce io_req_cqe_overflow() __io_fill_cqe_req() is hot and inlined, we want it to be as small as possible. Add io_req_cqe_overflow() accepting only a request and doing all overflow accounting, and replace with it two calls to 6 argument io_cqring_event_overflow(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/048b9fbcce56814d77a1a540409c98c3d383edcb.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	faf88dde06	io_uring: don't inline __io_get_cqe() __io_get_cqe() is not as hot as io_get_cqe(), no need to inline it, it sheds ~500B from the binary. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c1ac829198a881b7af8710926f99a3559b9f24c0.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	d245bca637	io_uring: don't expose io_fill_cqe_aux() Deduplicate some code and add a helper for filling an aux CQE, locking and notification. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b7c6557c8f9dc5c4cfb01292116c682a0ff61081.1655455613.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Hao Xu	f09c8643f0	io_uring: kbuf: add comments for some tricky code Add comments to explain why it is always under uring lock when incrementing head in __io_kbuf_recycle. And rectify one comment about kbuf consuming in iowq case. Signed-off-by: Hao Xu <howeyxu@tencent.com> Link: https://lore.kernel.org/r/20220617050429.94293-1-hao.xu@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	9ca9fb24d5	io_uring: mutex locked poll hashing Currently we do two extra spin lock/unlock pairs to add a poll/apoll request to the cancellation hash table and remove it from there. On the submission side we often already hold ->uring_lock and tw completion is likely to hold it as well. Add a second cancellation hash table protected by ->uring_lock. Out of concern for latency, because of the need to have the mutex locked on the completion side, use the new table only in the following cases: 1) IORING_SETUP_SINGLE_ISSUER: only one task grabs uring_lock, so there is little to no contention and so the main tw handler will almost always end up grabbing it before calling callbacks. 2) IORING_SETUP_SQPOLL: same as with single issuer, only one task is a major user of ->uring_lock. 3) apoll: we normally grab the lock on the completion side anyway to execute the request, so it's free. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1bbad9c78c454b7b92f100bbf46730a37df7194f.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:14 -06:00
Pavel Begunkov	5d7943d99d	io_uring: propagate locking state to poll cancel Poll cancellation will soon need to grab ->uring_lock inside, pass the locking state, i.e. issue_flags, inside the cancellation functions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b86781d047727c07163443b57551a3fa57c7c5e1.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	e6f89be614	io_uring: introduce a struct for hash table Instead of passing around a pointer to hash buckets, add a bit of type safety and wrap it into a structure. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d65bc3faba537ec2aca9eabf334394936d44bd28.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	a2cdd51932	io_uring: pass hash table into poll_find In preparation for having multiple cancellation hash tables, pass a table pointer into io_poll_find() and other poll cancel functions. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a31c88502463dce09254240fa037352927d7ecc3.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	97bbdc06a4	io_uring: add IORING_SETUP_SINGLE_ISSUER Add a new IORING_SETUP_SINGLE_ISSUER flag and the userspace visible part of it, i.e. put limitations of submitters. Also, don't allow it together with IOPOLL as we're not going to put it to good use. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4bcc41ee467fdf04c8aab8baf6ce3ba21858c3d4.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	0ec6dca223	io_uring: use state completion infra for poll reqs Use io_req_task_complete() for poll request completions, so it can utilise state completions and save lots of unnecessary locking. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ced94cb5a728d8e386c640d052fd3da3f5d6891a.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	8b1dfd343a	io_uring: clean up io_ring_ctx_alloc Add a variable for the number of hash buckets in io_ring_ctx_alloc(), makes it more readable. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/993926ed0d614ba9a76b2a85bebae2babcb13983.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	4a07723fb4	io_uring: limit the number of cancellation buckets Don't allocate too many hash/cancellation buckets, there might be too many, clamp it to 8 bits, or 256 * 64B = 16KB. We don't usually have too many requests, and 256 buckets should be enough, especially since we do hash search only in the cancellation path. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b9620c8072ba61a2d50eba894b89bd93a94a9abd.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	4dfab8abb4	io_uring: clean up io_try_cancel Get rid of an unnecessary extra goto in io_try_cancel() and simplify the function. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/48cf5417b43a8386c6c364dba1ad9b4c7382d158.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	1ab1edb0a1	io_uring: pass poll_find lock back Instead of using implicit knowledge of what is locked or not after io_poll_find() and co returns, pass back a pointer to the locked bucket if any. If set, the user must unlock the spinlock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/dae1dc5749aa34367812ecf62f82fd3f053aae44.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Hao Xu	38513c464d	io_uring: switch cancel_hash to use per entry spinlock Add a new io_hash_bucket structure so that each bucket in cancel_hash has separate spinlock. Use per entry lock for cancel_hash, this removes some completion lock invocations and removes contention between different cancel_hash entries. Signed-off-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/05d1e135b0c8bce9d1441e6346776589e5783e26.1655371007.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Hao Xu	3654ab0c51	io_uring: poll: remove unnecessary req->ref set We now don't need to set req->refcount for poll requests since the reworked poll code ensures no request release race. Signed-off-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ec6fee45705890bdb968b0c175519242753c0215.1655371007.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	53ccf69bda	io_uring: don't inline io_put_kbuf io_put_kbuf() is huge, don't bloat the kernel with inlining. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/2e21ccf0be471ffa654032914b9430813cae53f8.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	7012c81593	io_uring: refactor io_req_task_complete() Clean up io_req_task_complete() and deduplicate io_put_kbuf() calls. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/ae3148ac7eb5cce3e06895cde306e9e959d6f6ae.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	75d7b3aec1	io_uring: kill REQ_F_COMPLETE_INLINE REQ_F_COMPLETE_INLINE is only needed to delay queueing into the completion list to io_queue_sqe() as __io_req_complete() is inlined and we don't want to bloat the kernel. As now we complete in a more centralised fashion in io_issue_sqe() we can get rid of the flag and queue to the list directly. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/600ba20a9338b8a39b249b23d3d177803613dde4.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	df9830d883	io_uring: rw: delegate sync completions to core io_uring io_issue_sqe() from the io_uring core knows how to complete requests based on the returned error code, we can delegate io_read()/io_write() completion to it. Make kiocb_done() return the right completion code and propagate it. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/32ef005b45d23bf6b5e6837740dc0331bb051bd4.1655371007.git.asml.silence@gmail.com Reviewed-by: Hao Xu <howeyxu@tencent.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Jens Axboe	bb8f870031	io_uring: remove unused IO_REQ_CACHE_SIZE defined Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	c65f5279ba	io_uring: don't set REQ_F_COMPLETE_INLINE in tw io_req_task_complete() enqueues requests for state completion itself, no need for REQ_F_COMPLETE_INLINE, which only serves the purpose of not bloating the kernel. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/aca80f71464ad02c06f1311d998a2d6ee0b31573.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	3a08576b96	io_uring: remove check_cq checking from hot paths All ctx->check_cq events are slow path, don't test every single flag one by one in the hot path, but add a common guarding if. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/dff026585cea7ff3a172a7c83894a3b0111bbf6a.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	aeaa72c694	io_uring: never defer-complete multi-apoll Luckily, nobody completes multi-apoll requests outside the polling functions, but don't set IO_URING_F_COMPLETE_DEFER in any case as there is nobody who is catching REQ_F_COMPLETE_INLINE, and so will leak requests if used. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a65ed3f5effd9321ee06e6edea294a03be3e15a0.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	6a02e4be81	io_uring: inline ->registered_rings There can be only 16 registered rings, no need to allocate an array for them separately but store it in tctx. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/495f0b953c87994dd9e13de2134019054fa5830d.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	48c13d8980	io_uring: explain io_wq_work::cancel_seq placement Add a comment on why we keep ->cancel_seq in struct io_wq_work instead of struct io_kiocb despite it being needed only by io_uring but not io-wq. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/988e87eec9dc700b5dae933df3aefef303502f6c.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	aa1e90f64e	io_uring: move small helpers to headers There is a bunch of inline helpers that will be useful not only to the core of io_uring, move them to headers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/22df99c83723e44cba7e945e8519e64e3642c064.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:13 -06:00
Pavel Begunkov	22eb2a3fde	io_uring: refactor ctx slow data placement Shove all slow path data at the end of ctx and get rid of extra indentation. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/bcaf200298dd469af20787650550efc66d89bef2.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Pavel Begunkov	aff5b2df9e	io_uring: better caching for ctx timeout fields Following timeout fields access patterns, move all of them into a separate cache line inside ctx, so they don't interfere with normal completion caching, especially since timeout removals and completion are separated and the latter is done via tw. It also sheds some bytes from io_ring_ctx, 1216B -> 1152B Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/4b163793072840de53b3cb66e0c2995e7226ff78.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Pavel Begunkov	b25436038f	io_uring: move defer_list to slow data draining is slow path, move defer_list to the end where slow data lives inside the context. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e16379391ca72b490afdd24e8944baab849b4a7b.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Pavel Begunkov	5ff4fdffad	io_uring: make reg buf init consistent The default (i.e. empty) state of register buffer is dummy_ubuf, so set it to dummy on init instead of NULL. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c5456aecf03d9627fbd6e65e100e2b5293a6151e.1655310733.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	61a2732af4	io_uring: deprecate epoll_ctl support As far as we know, nobody ever adopted the epoll_ctl management via io_uring. Deprecate it now with a warning, and plan on removing it in a later kernel version. When we do remove it, we can revert the following commits as well: `39220e8d4a` ("eventpoll: support non-blocking do_epoll_ctl() calls") `58e41a44c4` ("eventpoll: abstract out epoll_ctl() handler") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/io-uring/CAHk-=wiTyisXBgKnVHAGYCNvkmjk=50agS2Uk6nr+n3ssLZg2w@mail.gmail.com/ Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	b9ba8a4463	io_uring: add support for level triggered poll By default, the POLL_ADD command does edge triggered poll - if we get a non-zero mask on the initial poll attempt, we complete the request successfully. Support level triggered by always waiting for a notification, regardless of whether or not the initial mask matches the file state. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	d9b57aa3cf	io_uring: move opcode table to opdef.c We already have the declarations in opdef.h, move the rest into its own file rather than in the main io_uring.c file. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	f3b44f92e5	io_uring: move read/write related opcodes to its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	c98817e6cd	io_uring: move remaining file table manipulation to filetable.c Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	7357298448	io_uring: move rsrc related data, core, and commands Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	3b77495a97	io_uring: split provided buffers handling into its own file Move both the opcodes related to it, and the internals code dealing with it. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	7aaff708a7	io_uring: move cancelation into its own file This also helps cleanup the io_uring.h cancel parts, as we can make things static in the cancel.c file, mostly. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	329061d3e2	io_uring: move poll handling into its own file Add a io_poll_issue() rather than export the general task_work locking and io_issue_sqe(), and put the io_op_defs definition and structure into a separate header file so that poll can use it. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	cfd22e6b33	io_uring: add opcode name to io_op_defs This kills the last per-op switch. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	92ac8beaea	io_uring: include and forward-declaration sanitation Remove some dead headers we no longer need, and get rid of the io_ring_ctx and io_uring_fops forward declarations. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	c9f06aa7de	io_uring: move io_uring_task (tctx) helpers into its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	a4ad4f748e	io_uring: move fdinfo helpers to its own file This also means moving a bit more of the fixed file handling to the filetable side, which makes sense separately too. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	e5550a1447	io_uring: use io_is_uring_fops() consistently Convert the last spots that check for io_uring_fops to use the provided helper instead. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	17437f3114	io_uring: move SQPOLL related handling into its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	59915143e8	io_uring: move timeout opcodes and handling into its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	e418bbc97b	io_uring: move our reference counting into a header Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	36404b09aa	io_uring: move msg_ring into its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:12 -06:00
Jens Axboe	f9ead18c10	io_uring: split network related opcodes into its own file While at it, convert the handlers to just use io_eopnotsupp_prep() if CONFIG_NET isn't set. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	e0da14def1	io_uring: move statx handling to its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	a9c210cebe	io_uring: move epoll handler to its own file Would be nice to sort out Kconfig for this and don't even compile epoll.c if we don't have epoll configured. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	4cf9049528	io_uring: add a dummy -EOPNOTSUPP prep handler Add it and use it for the epoll handling, if epoll isn't configured. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	99f15d8d61	io_uring: move uring_cmd handling to its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	cd40cae29e	io_uring: split out open/close operations Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	453b329be5	io_uring: separate out file table handling code Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	f4c163dd7d	io_uring: split out fadvise/madvise operations Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	0d58472740	io_uring: split out fs related sync/fallocate functions This splits out sync_file_range, fsync, and fallocate. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	531113bbd5	io_uring: split out splice related operations This splits out splice and tee support. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	11aeb71406	io_uring: split out filesystem related operations This splits out renameat, unlinkat, mkdirat, symlinkat, and linkat. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	e28683bdfc	io_uring: move nop into its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	5e2a18d93f	io_uring: move xattr related opcodes to its own file Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	97b388d70b	io_uring: handle completions in the core Normally request handlers complete requests themselves, if they don't return an error. For the latter case, the core will complete it for them. This is unhandy for pushing opcode handlers further out, as we don't want a bunch of inline completion code and we don't want to make the completion path slower than it is now. Let the core handle any completion, unless the handler explicitly asks us not to. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	de23077eda	io_uring: set completion results upfront Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	e27f928ee1	io_uring: add io_uring_types.h This adds definitions of structs that both the core and the various opcode handlers need to know about. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	4d4c9cff4f	io_uring: define a request type cleanup handler This can move request type specific cleanup into a private handler, removing the need for the core io_uring parts to know what types they are dealing with. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	890968dc03	io_uring: unify struct io_symlink and io_hardlink They are really just a subset of each other, just use the one type. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	9a3a11f977	io_uring: convert iouring_cmd to io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	ceb452e1b4	io_uring: convert xattr to use io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	ea5af87d29	io_uring: convert rsrc_update to io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:11 -06:00
Jens Axboe	c1ee559501	io_uring: convert msg and nop to io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	2511d3030c	io_uring: convert splice to use io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	3e93a3571a	io_uring: convert epoll to io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	bb040a21fd	io_uring: convert file system request types to use io_cmd_type This converts statx, rename, unlink, mkdir, symlink, and hardlink to use io_cmd_type. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	37d4842f11	io_uring: convert madvise/fadvise to use io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	dd752582e3	io_uring: convert open/close path to use io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	a43714ace5	io_uring: convert timeout path to use io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	f38987f09a	io_uring: convert cancel path to use io_cmd_type Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	e4a71006ea	io_uring: convert the sync and fallocate paths to use io_cmd_type They all share the same struct io_sync, convert them to use the io_cmd_type approach instead. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	8ff86d85b7	io_uring: convert net related opcodes to use io_cmd_type This converts accept, connect, send/recv, sendmsg/recvmsg, shutdown, and socket to use io_cmd_type. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	bd8587e499	io_uring: remove recvmsg knowledge from io_arm_poll_handler() There's a special case for recvmsg with MSG_ERRQUEUE set. This is problematic as it means the core needs to know about this special request type. For now, just add a generic flag for it. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	c24b154967	io_uring: convert poll_update path to use io_cmd_type Remove struct io_poll_update from io_kiocb, and convert the poll path to use the io_cmd_type approach instead. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	8d4388d116	io_uring: convert poll path to use io_cmd_type Remove struct io_poll_iocb from io_kiocb, and convert the poll path to use the io_cmd_type approach instead. While at it, rename io_poll_iocb to io_poll which is consistent with the other request type private structures. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	3c306fb2f9	io_uring: convert read/write path to use io_cmd_type Remove struct io_rw from io_kiocb, and convert the read/write path to use the io_cmd_type approach instead. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	f49eca2156	io_uring: add generic command payload type to struct io_kiocb Each opcode generally has a command structure in io_kiocb which it can use to store data associated with that request. In preparation for having the core layer not know about what's inside these fields, add a generic io_cmd_data type and put in the union as well. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	dc919caff6	io_uring: move req async preparation into opcode handler Define an io_op_def->prep_async() handler and push the async preparation to there. Since we now have that, we can drop ->needs_async_setup, as they mean the same thing. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00
Jens Axboe	ed29b0b4fd	io_uring: move to separate directory In preparation for splitting io_uring up a bit, move it into its own top level directory. It didn't really belong in fs/ anyway, as it's not a file system only API. This adds io_uring/ and moves the core files in there, and updates the MAINTAINERS file for the new location. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-07-24 18:39:10 -06:00

204 Commits