linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-01 09:41:44 +00:00

Nadav Har'El b6f1250edb KVM: nVMX: Correct handling of interrupt injection

The code in this patch correctly emulates external-interrupt injection while a nested guest L2 is running.

Because of this code's relative un-obviousness, I include here a longer-than-usual justification for what it does - much longer than the code itself ;-)

To understand how to correctly emulate interrupt injection while L2 is running, let's look first at what we need to emulate: How would things look like if the extra L0 hypervisor layer is removed, and instead of L0 injecting an interrupt, we had hardware delivering an interrupt?

Now we have L1 running on bare metal with a guest L2, and the hardware generates an interrupt. Assuming that L1 set PIN_BASED_EXT_INTR_MASK to 1, and VM_EXIT_ACK_INTR_ON_EXIT to 0 (we'll revisit these assumptions below), what happens now is this: The processor exits from L2 to L1, with an external-interrupt exit reason but without an interrupt vector. L1 runs, with interrupts disabled, and it doesn't yet know what the interrupt was. Soon after, it enables interrupts and only at that moment, it gets the interrupt from the processor. When L1 is KVM, Linux handles this interrupt.

Now we need exactly the same thing to happen when that L1->L2 system runs on top of L0, instead of real hardware. This is how we do this:

When L0 wants to inject an interrupt, it needs to exit from L2 to L1, with external-interrupt exit reason (with an invalid interrupt vector), and run L1. Just like in the bare metal case, it likely can't deliver the interrupt to L1 now because L1 is running with interrupts disabled, in which case it turns on the interrupt window when running L1 after the exit. L1 will soon enable interrupts, and at that point L0 will gain control again and inject the interrupt to L1.

Finally, there is an extra complication in the code: when nested_run_pending, we cannot return to L1 now, and must launch L2. We need to remember the interrupt we wanted to inject (and not clear it now), and do it on the next exit.

The above explanation shows that the relative strangeness of the nested interrupt injection code in this patch, and the extra interrupt-window exit incurred, are in fact necessary for accurate emulation, and are not just an unoptimized implementation.

Let's revisit now the two assumptions made above:

If L1 turns off PIN_BASED_EXT_INTR_MASK (no hypervisor that I know does, by the way), things are simple: L0 may inject the interrupt directly to the L2 guest - using the normal code path that injects to any guest. We support this case in the code below.

If L1 turns on VM_EXIT_ACK_INTR_ON_EXIT, things look very different from the description above: L1 expects to see an exit from L2 with the interrupt vector already filled in the exit information, and does not expect to be interrupted again with this interrupt. The current code does not (yet) support this case, so we do not allow the VM_EXIT_ACK_INTR_ON_EXIT exit-control to be turned on by L1.

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

2011-07-12 11:45:17 +03:00
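The decision flow described in the commit message above can be sketched as a small stand-alone C model. This is not the code from the patch: the struct, the enum, and the function name decide_injection are hypothetical names chosen for illustration, and the model assumes VM_EXIT_ACK_INTR_ON_EXIT is off, since the message says L1 is not allowed to turn it on.

```c
#include <stdbool.h>

/* Hypothetical model of the state L0 consults; not a kernel structure. */
struct vcpu_model {
    bool in_l2;                 /* the nested guest L2 is currently running */
    bool l1_ext_intr_mask;      /* L1 set PIN_BASED_EXT_INTR_MASK to 1 */
    bool nested_run_pending;    /* L2 must be launched before exiting to L1 */
    bool l1_interrupts_enabled; /* L1 can accept an interrupt right now */
};

/* Hypothetical names for the outcomes the commit message describes. */
enum inject_action {
    INJECT_DIRECT,          /* normal injection path into the running guest */
    DEFER_UNTIL_NEXT_EXIT,  /* keep the interrupt pending, launch L2 first */
    EXIT_L2_TO_L1,          /* external-interrupt exit reason, no valid vector */
    OPEN_INTERRUPT_WINDOW   /* wait until L1 enables interrupts */
};

/* What L0 does when it has an external interrupt to deliver. */
enum inject_action decide_injection(const struct vcpu_model *v)
{
    if (v->in_l2) {
        /* L1 did not ask for exits on external interrupts:
         * inject straight into L2, as for any guest. */
        if (!v->l1_ext_intr_mask)
            return INJECT_DIRECT;

        /* We cannot return to L1 now; remember the interrupt. */
        if (v->nested_run_pending)
            return DEFER_UNTIL_NEXT_EXIT;

        /* Emulate the bare-metal exit from L2 to L1. */
        return EXIT_L2_TO_L1;
    }

    /* L1 is running: inject now if it can take the interrupt,
     * otherwise request an interrupt-window exit. */
    return v->l1_interrupts_enabled ? INJECT_DIRECT : OPEN_INTERRUPT_WINDOW;
}
```

In this model, an interrupt arriving while L2 runs with the mask set typically takes two steps: first EXIT_L2_TO_L1, then, once L1 is running with interrupts still disabled, OPEN_INTERRUPT_WINDOW, and only after L1 enables interrupts INJECT_DIRECT. That extra step is the interrupt-window exit the message argues is necessary for accurate emulation.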
..
boot	x86, setup: When probing memory with e801, use ax/bx as a pair	2011-04-25 14:52:37 -07:00
configs	cgroup: remove the ns_cgroup	2011-05-26 17:12:34 -07:00
crypto	crypto: aesni-intel - fix aesni build on i386	2011-05-18 09:03:34 +10:00
ia32	ns: Wire up the setns system call	2011-05-28 10:48:39 -07:00
include/asm	KVM: nVMX: vmcs12 checks on nested entry	2011-07-12 11:45:16 +03:00
kernel	x86 idle: APM requires pm_idle/default_idle unconditionally when a module	2011-06-14 13:42:20 -07:00
kvm	KVM: nVMX: Correct handling of interrupt injection	2011-07-12 11:45:17 +03:00
lguest	lguest: fix timer interrupt setup	2011-05-30 11:14:10 +09:30
lib	Merge branches 'x86-apic-for-linus', 'x86-asm-for-linus' and 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	2011-05-19 17:49:35 -07:00
math-emu
mm	x86, efi: Do not reserve boot services regions within reserved areas	2011-06-18 22:48:49 +02:00
net	net: filter: Just In Time compiler for x86-64	2011-04-27 23:05:08 -07:00
oprofile	Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent	2011-06-08 15:49:03 +02:00
pci	Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen	2011-07-01 13:25:56 -07:00
platform	x86, efi: Do not reserve boot services regions within reserved areas	2011-06-18 22:48:49 +02:00
power	x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states	2010-08-20 14:59:02 +02:00
tools
vdso	x86: vdso: Remove unused variable	2011-05-26 13:17:35 +02:00
video
xen	xen/mmu: Fix for linker errors when CONFIG_SMP is not defined.	2011-06-30 09:21:10 -04:00
.gitignore
Kbuild	net: filter: Just In Time compiler for x86-64	2011-04-27 23:05:08 -07:00
Kconfig	arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT}	2011-05-26 17:12:38 -07:00
Kconfig.cpu	x86, cpu: Move AMD Elan Kconfig under "Processor family"	2011-04-08 13:01:25 -07:00
Kconfig.debug	lib: consolidate DEBUG_STACK_USAGE option	2011-05-25 08:39:54 -07:00
Makefile	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip	2010-10-21 13:06:00 -07:00
Makefile_32.cpu	x86, cpu: Move AMD Elan Kconfig under "Processor family"	2011-04-08 13:01:25 -07:00