With all the IRQ machinery being in place, we can allow virtio
devices to additionally be configured as wakeup sources, in
which case basically any interrupt from them wakes us up. Note
that this requires a call FD because the VQs are all disabled.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
In external time-travel mode, where time is controlled via the
controller application socket, interrupt handling is a little
tricky. For example on virtio, the following happens:
* we receive a message (that requires an ACK) on the vhost-user socket
* we add a time-travel event to handle the interrupt
(this causes communication on the time socket)
* we ACK the original vhost-user message
* we then handle the interrupt once the event is triggered
This protocol ensures that the sender of the interrupt only continues
to run in the simulation when the time-travel event has been added.
So far, this was only done in the virtio driver, but it was actually
wrong, because only virtqueue interrupts were handled this way, and
config change interrupts were handled immediately. Additionally, the
messages were actually handled in the real Linux interrupt handler,
but Linux interrupt handlers are part of the simulation and shouldn't
run while there's no time event.
To really do this properly and only handle all kinds of interrupts in
the time-travel event when we are scheduled to run in the simulation,
rework this to plug in to the lower interrupt layers in UML directly:
Add a um_request_irq_tt() function that let's a time-travel aware
driver request an interrupt with an additional timetravel_handler()
that is called outside of the context of the simulation, to handle
the message only. It then adds an event to the time-travel calendar
if necessary, and no "real" Linux code runs outside of the time
simulation.
This also hooks in with suspend/resume properly now, since this new
timetravel_handler() can run while Linux is suspended and interrupts
are disabled, and decide to wake up (or not) the system based on the
message it received. Importantly in this case, it ACKs the message
before the system even resumes and interrupts are re-enabled, thus
allowing the simulation to progress properly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
If the system is suspended, the device shouldn't be able to send
anything to it. Disable virtqueues in suspend to simulate this,
and as we might be only using s2idle (kernel services are still
on), prevent sending anything on them as well.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
If we have a message without payload, we call full_read() with
len set to 0, which causes it to return -ECONNRESET. Catch this
case and explicitly return 0 for it so we can actually use the
zero-size config-changed message.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
There's no 'simtime' device, because implementing that through
virtio was just too much complexity. Clean up the comment that
still refers to it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Since struct device is refcounted, we shouldn't free the vu_dev
immediately when it's removed from the platform device, but only
when the references actually all go away. Move the freeing to
the release to accomplish that.
Fixes: 5d38f32499 ("um: drivers: Add virtio vhost-user driver")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We don't actually use this in um_request_irq(), so it can
never be assigned. It's also not clear what that would be
useful for, so just remove it.
This results in quite a number of cleanups, all the way to
removing the "SIGIO on close" startup check, since the data
it assigns (pty_close_sigio) is not used anymore.
While at it, also make this an enum so we get a minimum of
type checking, and remove the IRQ_NONE hack in virtio since
we now no longer have the name twice.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This separates the devices, which is better for debug and for
later suspend/resume and wakeup support, since there we'll
have to separate which IRQs can wake up the system and which
cannot.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
It's cumbersome and error-prone to keep adding fixed IRQ numbers,
and for proper device wakeup support for the virtio/vhost-user
support we need to have different IRQs for each device. Even if
in theory two IRQs (with and without wake) might be sufficient,
it's much easier to reason about it when we have dynamic number
assignment. It also makes it easier to add new devices that may
dynamically exist or depending on the configuration, etc.
Add support for this, up to 64 IRQs (the same limit as epoll FDs
we have right now). Since it's not easy to port all the existing
places to dynamic allocation (some data is statically initialized)
keep the low numbers are reserved for the existing hard-coded IRQ
numbers.
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Rename the bit to match latest virtio spec.
Add a compat macro to avoid breaking existing userspace.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Defining a macro with ... and __VA_ARGS__ (without ##) can cause
compilation errors if a macro use does not have additional args.
Add ## to __VA_ARGS__ in the macro definition.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This implements synchronized time-travel mode which - using a special
application on a unix socket - lets multiple machines take part in a
time-travelling simulation together.
The protocol for the unix domain socket is defined in the new file
include/uapi/linux/um_timetravel.h.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Implement in-band notifications that are necessary for running
vhost-user devices under externally synchronized time-travel
mode (which is in a follow-up patch). This feature makes what
usually should be eventfd notifications in-band messages.
We'll prefer this feature, under the assumption that only a
few (simulation) devices will ever support it, since it's not
very efficient.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This driver *can* be a module, but then its parameters (socket path)
are untrusted data from inside the VM, and that isn't allowed. Allow
the code to only be built-in to avoid that.
Fixes: 5d38f32499 ("um: drivers: Add virtio vhost-user driver")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
When we get an interrupt from the socket getting readable,
and start reading, there's a possibility for a race. This
depends on the implementation of the device, but e.g. with
qemu's libvhost-user, we can see:
device virtio_uml
---------------------------------------
write header
get interrupt
read header
read body -> returns -EAGAIN
write body
The -EAGAIN return is because the socket is non-blocking,
and then this leads us to abandon this message.
In fact, we've already read the header, so when the get
another signal/interrupt for the body, we again read it
as though it's a new message header, and also abandon it
for the same reason (wrong size etc.)
This essentially breaks things, and if that message was
one that required a response, it leads to a deadlock as
the device is waiting for the response but we'll never
reply.
Fix this by spinning on -EAGAIN as well when we read the
message body. We need to handle -EAGAIN as "no message"
while reading the header, since we share an interrupt.
Note that this situation is highly unlikely to occur in
normal usage, since there will be very few messages and
only in the startup phase. With the inband call feature
this does tend to happen (eventually) though.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
If the connection drops, just remove the device, we don't try
to recover from this right now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Implement the VHOST_USER_PROTOCOL_F_REPLY_ACK extension for both
slave requests (previous patch) where we have to reply and our
own requests where it helps understand if the slave failed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Implement the communication channel for the device to notify
us of some events, and notably implement the handling of the
config updates needed for the combination of this feature
and VHOST_USER_PROTOCOL_F_CONFIG.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This module allows virtio devices to be used over a vhost-user socket.
Signed-off-by: Erel Geron <erelx.geron@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>