seccomp: Support atomic "addfd + send reply"
Alban Crequy reported a race condition userspace faces when we want to add some fds and make the syscall return them[1] using seccomp notify. The problem is that currently two different ioctl() calls are needed by the process handling the syscalls (agent) for another userspace process (target): SECCOMP_IOCTL_NOTIF_ADDFD to allocate the fd and SECCOMP_IOCTL_NOTIF_SEND to return that value. Therefore, it is possible for the agent to do the first ioctl to add a file descriptor but the target is interrupted (EINTR) before the agent does the second ioctl() call. This patch adds a flag to the ADDFD ioctl() so it adds the fd and returns that value atomically to the target program, as suggested by Kees Cook[2]. This is done by simply allowing seccomp_do_user_notification() to add the fd and return it in this case. Therefore, in this case the target wakes up from the wait in seccomp_do_user_notification() either to interrupt the syscall or to add the fd and return it. This "allocate an fd and return" functionality is useful for syscalls that return a file descriptor only, like connect(2). Other syscalls that return a file descriptor but not as return value (or return more than one fd), like socketpair(), pipe(), recvmsg with SCM_RIGHTs, will not work with this flag. This effectively combines SECCOMP_IOCTL_NOTIF_ADDFD and SECCOMP_IOCTL_NOTIF_SEND into an atomic opteration. The notification's return value, nor error can be set by the user. Upon successful invocation of the SECCOMP_IOCTL_NOTIF_ADDFD ioctl with the SECCOMP_ADDFD_FLAG_SEND flag, the notifying process's errno will be 0, and the return value will be the file descriptor number that was installed. [1]: https://lore.kernel.org/lkml/CADZs7q4sw71iNHmV8EOOXhUKJMORPzF7thraxZYddTZsxta-KQ@mail.gmail.com/ [2]: https://lore.kernel.org/lkml/202012011322.26DCBC64F2@keescook/ Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io> Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210517193908.3113-4-sargun@sargun.me
This commit is contained in:
committed by
Kees Cook
parent
ddc4739169
commit
0ae71c7720
@@ -259,6 +259,18 @@ and ``ioctl(SECCOMP_IOCTL_NOTIF_SEND)`` a response, indicating what should be
|
||||
returned to userspace. The ``id`` member of ``struct seccomp_notif_resp`` should
|
||||
be the same ``id`` as in ``struct seccomp_notif``.
|
||||
|
||||
Userspace can also add file descriptors to the notifying process via
|
||||
``ioctl(SECCOMP_IOCTL_NOTIF_ADDFD)``. The ``id`` member of
|
||||
``struct seccomp_notif_addfd`` should be the same ``id`` as in
|
||||
``struct seccomp_notif``. The ``newfd_flags`` flag may be used to set flags
|
||||
like O_EXEC on the file descriptor in the notifying process. If the supervisor
|
||||
wants to inject the file descriptor with a specific number, the
|
||||
``SECCOMP_ADDFD_FLAG_SETFD`` flag can be used, and set the ``newfd`` member to
|
||||
the specific number to use. If that file descriptor is already open in the
|
||||
notifying process it will be replaced. The supervisor can also add an FD, and
|
||||
respond atomically by using the ``SECCOMP_ADDFD_FLAG_SEND`` flag and the return
|
||||
value will be the injected file descriptor number.
|
||||
|
||||
It is worth noting that ``struct seccomp_data`` contains the values of register
|
||||
arguments to the syscall, but does not contain pointers to memory. The task's
|
||||
memory is accessible to suitably privileged traces via ``ptrace()`` or
|
||||
|
||||
Reference in New Issue
Block a user