According to the PSCI specification and the SMC/HVC calling
convention, PSCI function_ids that are not implemented must
return NOT_SUPPORTED as return value.
Current KVM implementation takes an unhandled PSCI function_id
as an error and injects an undefined instruction into the guest
if PSCI implementation is called with a function_id that is not
handled by the resident PSCI version (ie it is not implemented),
which is not the behaviour expected by a guest when calling a
PSCI function_id that is not implemented.
This patch fixes this issue by returning NOT_SUPPORTED whenever
the kvm PSCI call is executed for a function_id that is not
implemented by the PSCI kvm layer.
Cc: <stable@vger.kernel.org> # 3.18+
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The elr_el2 and spsr_el2 registers in fact contain the processor state
before entry into EL2. In the case of guest state it could be in either
el0 or el1.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The KVM-VFIO device is used by the QEMU VFIO device. It is used to
record the list of in-use VFIO groups so that KVM can manipulate
them.
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Until now we have been calling kvm_guest_exit after re-enabling
interrupts when we come back from the guest, but this has the
unfortunate effect that CPU time accounting done in the context of timer
interrupts occurring while the guest is running doesn't properly notice
that the time since the last tick was spent in the guest.
Inspired by the comment in the x86 code, move the kvm_guest_exit() call
below the local_irq_enable() call and change __kvm_guest_exit() to
kvm_guest_exit(), because we are now calling this function with
interrupts enabled. We have to now explicitly disable preemption and
not enable preemption before we've called kvm_guest_exit(), since
otherwise we could be preempted and everything happening before we
eventually get scheduled again would be accounted for as guest time.
At the same time, move the trace_kvm_exit() call outside of the atomic
section, since there is no reason for us to do that with interrupts
disabled.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We already check KVM_CAP_IRQFD in generic once enable CONFIG_HAVE_KVM_IRQFD,
kvm_vm_ioctl_check_extension_generic()
|
+ switch (arg) {
+ ...
+ #ifdef CONFIG_HAVE_KVM_IRQFD
+ case KVM_CAP_IRQFD:
+ #endif
+ ...
+ return 1;
+ ...
+ }
|
+ kvm_vm_ioctl_check_extension()
So its not necessary to check this in arch again, and also fix one typo,
s/emlation/emulation.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.
On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.
If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.
The same condition exists when trapping to enable VFP for the guest.
The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only takes place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.
The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.
Reported-by: Vikram Sethi <vikrams@codeaurora.org>
Cc: stable@kernel.org # v3.9+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Commit 47a98b15ba ("arm/arm64: KVM: support for un-queuing active
IRQs") introduced handling of the GICD_I[SC]ACTIVER registers,
but only for the GICv2 emulation. For the sake of completeness and
as this is a pre-requisite for save/restore of the GICv3 distributor
state, we should also emulate their handling in the distributor and
redistributor frames of an emulated GICv3.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
No need to cast the void pointer returned by kmalloc() in
arch/arm/kvm/mmu.c::kvm_alloc_stage2_pgd().
Signed-off-by: Firo Yang <firogm@gmail.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
gfn_to_pfn_async is used in just one place, and because of x86-specific
treatment that place will need to look at the memory slot. Hence inline
it into try_async_pf and export __gfn_to_pfn_memslot.
The patch also switches the subsequent call to gfn_to_pfn_prot to use
__gfn_to_pfn_memslot. This is a small optimization. Finally, remove
the now-unused async argument of __gfn_to_pfn.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The argument to KVM_GET_DIRTY_LOG is a memslot id; it may not match the
position in the memslots array, which is sorted by gfn.
Cc: stable@vger.kernel.org
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CR0.CD and CR0.NW are not used by shadow page table so that need
not adjust mmu if these two bit are changed
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, whenever guest MTRR registers are changed
kvm_mmu_reset_context is called to switch to the new root shadow page
table, however, it's useless since:
1) the cache type is not cached into shadow page's attribute so that
the original root shadow page will be reused
2) the cache type is set on the last spte, that means we should sync
the last sptes when MTRR is changed
This patch fixs this issue by drop all the spte in the gfn range which
is being updated by MTRR
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are some bugs in current get_mtrr_type();
1: bit 1 of mtrr_state->enabled is corresponding bit 11 of
IA32_MTRR_DEF_TYPE MSR which completely control MTRR's enablement
that means other bits are ignored if it is cleared
2: the fixed MTRR ranges are controlled by bit 0 of
mtrr_state->enabled (bit 10 of IA32_MTRR_DEF_TYPE)
3: if MTRR is disabled, UC is applied to all of physical memory rather
than mtrr_state->def_type
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Split kvm_unmap_rmapp and introduce kvm_zap_rmapp which will be used in the
later patch
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
slot_handle_level and its helper functions are ready now, use them to
clean up the code
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are several places walking all rmaps for the memslot so that
introduce common functions to cleanup the code
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's used to abstract the code from kvm_handle_hva_range and it will be
used by later patch
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's used to walk all the sptes on the rmap to clean up the
code
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit ff7bbb9c6a.
Sasha Levin is seeing odd jump in time values during boot of a KVM guest:
[...]
[ 0.000000] tsc: Detected 2260.998 MHz processor
[3376355.247558] Calibrating delay loop (skipped) preset value..
[...]
and bisected them to this commit.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM may turn a user page to a kernel page when kernel writes a readonly
user page if CR0.WP = 1. This shadow page entry will be reused after
SMAP is enabled so that kernel is allowed to access this user page
Fix it by setting SMAP && !CR0.WP into shadow page's role and reset mmu
once CR4.SMAP is updated
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a REP-string is executed in 64-bit mode with an address-size prefix,
ECX/EDI/ESI are used as counter and pointers. When ECX is initially zero, Intel
CPUs clear the high 32-bits of RCX, and recent Intel CPUs update the high bits
of the pointers in MOVS/STOS. This behavior is specific to Intel according to
few experiments.
As one may guess, this is an undocumented behavior. Yet, it is observable in
the guest, since at least VMX traps REP-INS/OUTS even when ECX=0. Note that
VMware appears to get it right. The behavior can be observed using the
following code:
#include <stdio.h>
#define LOW_MASK (0xffffffff00000000ull)
#define ALL_MASK (0xffffffffffffffffull)
#define TEST(opcode) \
do { \
asm volatile(".byte 0xf2 \n\t .byte 0x67 \n\t .byte " opcode "\n\t" \
: "=S"(s), "=c"(c), "=D"(d) \
: "S"(ALL_MASK), "c"(LOW_MASK), "D"(ALL_MASK)); \
printf("opcode %s rcx=%llx rsi=%llx rdi=%llx\n", \
opcode, c, s, d); \
} while(0)
void main()
{
unsigned long long s, d, c;
iopl(3);
TEST("0x6c");
TEST("0x6d");
TEST("0x6e");
TEST("0x6f");
TEST("0xa4");
TEST("0xa5");
TEST("0xa6");
TEST("0xa7");
TEST("0xaa");
TEST("0xab");
TEST("0xae");
TEST("0xaf");
}
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When REP-string instruction is preceded with an address-size prefix,
ECX/EDI/ESI are used as the operation counter and pointers. When they are
updated, the high 32-bits of RCX/RDI/RSI are cleared, similarly to the way they
are updated on every 32-bit register operation. Fix it.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the host sets hardware breakpoints to debug the guest, and a task-switch
occurs in the guest, the architectural DR7 will not be updated. The effective
DR7 would be updated instead.
This fix puts the DR7 update during task-switch emulation, so it now uses the
standard DR setting mechanism instead of the one that was previously used. As a
bonus, the update of DR7 will now be effective for AMD as well.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mostly a bunch of fixes, reworks and optimizations for s390.
There is one new feature (EDAT-2 inside the guest), which boils
down to 2GB pages.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJVTMbEAAoJEBF7vIC1phx8GPUP/0Ayeqvevtz4pT3l24FZBkET
8se3wqi3rfo+AjXcv6mBYetNc2p2U/sui1iaPNkNnMPjCjjWsnla+7xF6ixw46fT
o7qkhAKSbaIGnsYrip/Lxx9N9ThUVeqlzEfLGUgT4qanad6hxhW22wRB79p2qpWL
BzvNkljkvHOmapDTr+/dxdmwqcbSEWuTSIeQWIK3FRYJ9Uid2VsVYYenvKLTyWxH
1QzHhViKx25t3OV/igAdskPlCI9S1Js/BVQ9hnJueTikFlZQu1svFhiibWnr0bTs
8fTbN0UyEDYjemd4jr8yxHkSF7PQLtHhcoSyMRufIv1YDBpskOVskScAd1L6aUTF
lEPaxcJZvG3mVppqLeVz+wWDHuPw2JXJQ7RAj7j4big5ST09BcqGVe96TMsNGE5w
D8xRcufn2vW5UjK8MhHdjQBDTR3eTfgupCud2/XGryc9UZaLbc+vhHdlHBhiiBU9
4whxzKiHJZ07AsIrZQJtV1ui81m6zN571YccTpW36JSDa4qckgJ3jxZf+BKP5Hex
3hrwe6mFCgBqSO0oiWbBLs1FTaACqaUBfRHK9eu40ibqTyU9nZ8DGv+wtZsAiE8I
EYYqI5uvP2bQ2P4rYXxsQvKG0FCzlvSPo5UpYXcEhicr1GUw4Vx1p8Ta7VLki10O
mF9/HVyw3FvS2AF3yRo1
=ovvI
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-20150508' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Fixes and features for 4.2 (kvm/next)
Mostly a bunch of fixes, reworks and optimizations for s390.
There is one new feature (EDAT-2 inside the guest), which boils
down to 2GB pages.
Our implementation will never trigger interception code 12 as the
responsible setting is never enabled - and never will be.
The handler is dead code. Let's get rid of it.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch factors out the search for a floating irq destination
VCPU as well as the kicking of the found VCPU. The search is optimized
in the following ways:
1. stopped VCPUs can't take any floating interrupts, so try to find an
operating one. We have to take care of the special case where all
VCPUs are stopped and we don't have any valid destination.
2. use online_vcpus, not KVM_MAX_VCPU. This speeds up the search
especially if KVM_MAX_VCPU is increased one day. As these VCPU
objects are initialized prior to increasing online_vcpus, we can be
sure that they exist.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We can avoid checking guest control registers and guest PSW as well
as all the masking and calculations on the interrupt masks when
no interrupts are pending.
Also, the check for IRQ_PEND_COUNT can be removed, because we won't
enter the while loop if no interrupts are pending and invalid interrupt
types can't be injected.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Some updates to the control blocks need to be done in a way that
ensures that no CPU is within SIE. Provide wrappers around the
s390_vcpu_block functions and adopt the TOD migration code to
update in a guaranteed fashion. Also rename these functions to
have the kvm_s390_ prefix as everything else.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
exit_sie_sync is used to kick CPUs out of SIE and prevent reentering at
any point in time. This is used to reload the prefix pages and to
set the IBS stuff in a way that guarantees that after this function
returns we are no longer in SIE. All current users trigger KVM requests.
The request must be set before we block the CPUs to avoid races. Let's
make this implicit by adding the request into a new function
kvm_s390_sync_requests that replaces exit_sie_sync and split out
s390_vcpu_block and s390_vcpu_unblock, that can be used to keep
CPUs out of SIE independent of requests.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
1. Enable EDAT2 in the list of KVM facilities
2. Handle 2G frames in pfmf instruction
If we support EDAT2, we may enable handling of 2G frames if not in 24
bit mode.
3. Enable EDAT2 in sie_block
If the EDAT2 facility is available we enable GED2 mode control in the
sie_block.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We should only enable EDAT1 for the guest if the host actually supports
it and the cpu model for the guest has EDAT-1 enabled.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Guenther Hutzl <hutzl@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The fast path for a sie exit is that no kvm reqest is pending.
Make an early check to skip all single bit checks.
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Commit ea5f496925 ("KVM: s390: only one external call may be pending
at a time") introduced a bug on machines that don't have SIGP
interpretation facility installed.
The injection of an external call will now always fail with -EBUSY
(if none is already pending).
This leads to the following symptoms:
- An external call will be injected but with the wrong "src cpu id",
as this id will not be remembered.
- The target vcpu will not be woken up, therefore the guest will hang if
it cannot deal with unexpected failures of the SIGP EXTERNAL CALL
instruction.
- If an external call is already pending, -EBUSY will not be reported.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.0
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
smep_andnot_wp is initialized in kvm_init_shadow_mmu and shadow pages
should not be reused for different values of it. Thus, it has to be
added to the mask in kvm_mmu_pte_write.
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current permission check assumes that RSVD bit in PFEC is always zero,
however, it is not true since MMIO #PF will use it to quickly identify
MMIO access
Fix it by clearing the bit if walking guest page table is needed
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On cpu hotplug only KVM emits an unconditional message that its notifier
has been called. It certainly can be assumed that calling cpu hotplug
notifiers work, therefore there is no added value if KVM prints a message.
If an error happens on cpu online KVM will still emit a warning.
So let's remove this superfluous message.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
vcpu->arch.apic is NULL when a userspace irqchip is active. But instead
of letting the test incorrectly depend on in-kernel irqchip mode,
open-code it to catch also userspace x2APICs.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Far call in 64-bit has a 32-bit operand size. Remove the marking of this
operation as Stack so it can be emulated correctly in 64-bit.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Caching memslot value and using mark_page_dirty_in_slot() avoids another
O(log N) search when dirtying the page.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1428695247-27603-1-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop unnecessary rdtsc_barrier(), as has been determined empirically,
see 057e6a8c66 for details.
Noticed by Andy Lutomirski.
Improves clock_gettime() by approximately 15% on
Intel i7-3520M @ 2.90GHz.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the null test is needed, the call to cancel_delayed_work_sync would have
already crashed. Normally, the destroy function should only be called
if the init function has succeeded, in which case ioapic is not null.
Problem found using Coccinelle.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
PAT should be 0007_0406_0007_0406h on RESET and not modified on INIT.
VMX used a wrong value (host's PAT) and while SVM used the right one,
it never got to arch.pat.
This is not an issue with QEMU as it will force the correct value.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently KVM will clear the FPU bits in CR0.TS in the VMCS, and trap to
re-load them every time the guest accesses the FPU after a switch back into
the guest from the host.
This patch copies the x86 task switch semantics for FPU loading, with the
FPU loaded eagerly after first use if the system uses eager fpu mode,
or if the guest uses the FPU frequently.
In the latter case, after loading the FPU for 255 times, the fpu_counter
will roll over, and we will revert to loading the FPU on demand, until
it has been established that the guest is still actively using the FPU.
This mirrors the x86 task switch policy, which seems to work.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
An MSI interrupt should only be delivered to the lowest priority CPU
when it has RH=1, regardless of the delivery mode. Modified
kvm_is_dm_lowest_prio() to check for either irq->delivery_mode == APIC_DM_LOWPRI
or irq->msi_redir_hint.
Moved kvm_is_dm_lowest_prio() into lapic.h and renamed to
kvm_lowest_prio_delivery().
Changed a check in kvm_irq_delivery_to_apic_fast() from
irq->delivery_mode == APIC_DM_LOWPRI to kvm_is_dm_lowest_prio().
Signed-off-by: James Sullivan <sullivan.james.f@gmail.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Extended struct kvm_lapic_irq with bool msi_redir_hint, which will
be used to determine if the delivery of the MSI should target only
the lowest priority CPU in the logical group specified for delivery.
(In physical dest mode, the RH bit is not relevant). Initialized the value
of msi_redir_hint to true when RH=1 in kvm_set_msi_irq(), and initialized
to false in all other cases.
Added value of msi_redir_hint to a debug message dump of an IRQ in
apic_send_ipi().
Signed-off-by: James Sullivan <sullivan.james.f@gmail.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Change to u16 if they only contain data in the low 16 bits.
Change the level field to bool, since we assign 1 sometimes, but
just mask icr_low with APIC_INT_ASSERT in apic_send_ipi.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86 architecture defines differences between the reset and INIT sequences.
INIT does not initialize the FPU (including MMX, XMM, YMM, etc.), TSC, PMU,
MSRs (in general), MTRRs machine-check, APIC ID, APIC arbitration ID and BSP.
References (from Intel SDM):
"If the MP protocol has completed and a BSP is chosen, subsequent INITs (either
to a specific processor or system wide) do not cause the MP protocol to be
repeated." [8.4.2: MP Initialization Protocol Requirements and Restrictions]
[Table 9-1. IA-32 Processor States Following Power-up, Reset, or INIT]
"If the processor is reset by asserting the INIT# pin, the x87 FPU state is not
changed." [9.2: X87 FPU INITIALIZATION]
"The state of the local APIC following an INIT reset is the same as it is after
a power-up or hardware reset, except that the APIC ID and arbitration ID
registers are not affected." [10.4.7.3: Local APIC State After an INIT Reset
("Wait-for-SIPI" State)]
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1428924848-28212-1-git-send-email-namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introducing KVM_CAP_DISABLE_QUIRKS for disabling x86 quirks that were previous
created in order to overcome QEMU issues. Those issue were mostly result of
invalid VM BIOS. Currently there are two quirks that can be disabled:
1. KVM_QUIRK_LINT0_REENABLED - LINT0 was enabled after boot
2. KVM_QUIRK_CD_NW_CLEARED - CD and NW are cleared after boot
These two issues are already resolved in recent releases of QEMU, and would
therefore be disabled by QEMU.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Message-Id: <1428879221-29996-1-git-send-email-namit@cs.technion.ac.il>
[Report capability from KVM_CHECK_EXTENSION too. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>