linux

Author	SHA1	Message	Date
Jason J. Herne	72f250206f	KVM: s390: Provide guest TOD Clock Get/Set Controls Provide controls for setting/getting the guest TOD clock based on the VM attribute interface. Provide TOD and TOD_HIGH vm attributes on s390 for managing guest Time Of Day clock value. TOD_HIGH is presently always set to 0. In the future it will contain a high order expansion of the tod clock value after it overflows the 64-bits of the TOD. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:40 +01:00
Jens Freimann	556cc0dab1	KVM: s390: trace correct values for set prefix and machine checks When injecting SIGP set prefix or a machine check, we trace the values in our per-vcpu local_int data structure instead of the parameters passed to the function. Fix this by changing the trace statement to use the correct values. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:39 +01:00
Jens Freimann	49538d1238	KVM: s390: fix bug in sigp emergency signal injection Currently we are always setting the wrong bit in the bitmap for pending emergency signals. Instead of using emerg.code from the passed in irq parameter, we use the value in our per-vcpu local_int structure, which is always zero. That means all emergency signals will have address 0 as parameter. If two CPUs send a SIGP to the same target, one might be lost. Let's fix this by using the value from the parameter and also trace the correct value. Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:39 +01:00
Thomas Huth	3cfad02380	KVM: s390: Take addressing mode into account for MVPG interception The handler for MVPG partial execution interception does not take the current CPU addressing mode into account yet, so addresses are always treated as 64-bit addresses. For correct behaviour, we should properly handle 24-bit and 31-bit addresses, too. Since MVPG is defined to work with logical addresses, we can simply use guest_translate_address() to achieve the required behaviour (since DAT is disabled here, guest_translate_address() skips the MMU translation and only translates the address via kvm_s390_logical_to_effective() and kvm_s390_real_to_abs(), which is exactly what we want here). Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:38 +01:00
Christian Borntraeger	69a8d45626	KVM: s390: no need to hold the kvm->mutex for floating interrupts The kvm mutex was (probably) used to protect against cpu hotplug. The current code no longer needs to protect against that, as we only rely on CPU data structures that are guaranteed to be available if we can access the CPU. (e.g. vcpu_create will put the cpu in the array AFTER the cpu is ready). Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>	2015-01-23 13:25:37 +01:00
David Hildenbrand	2444b352c3	KVM: s390: forward most SIGP orders to user space Most SIGP orders are handled partially in kernel and partially in user space. In order to: - Get a correct SIGP SET PREFIX handler that informs user space - Avoid race conditions between concurrently executed SIGP orders - Serialize SIGP orders per VCPU We need to handle all "slow" SIGP orders in user space. The remaining ones to be handled completely in kernel are: - SENSE - SENSE RUNNING - EXTERNAL CALL - EMERGENCY SIGNAL - CONDITIONAL EMERGENCY SIGNAL According to the PoP, they have to be fast. They can be executed without conflicting to the actions of other pending/concurrently executing orders (e.g. STOP vs. START). This patch introduces a new capability that will - when enabled - forward all but the mentioned SIGP orders to user space. The instruction counters in the kernel are still updated. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:37 +01:00
David Hildenbrand	9fbd80828c	KVM: s390: clear the pfault queue if user space sets the invalid token We need a way to clear the async pfault queue from user space (e.g. for resets and SIGP SET ARCHITECTURE). This patch simply clears the queue as soon as user space sets the invalid pfault token. The definition of the invalid token is moved to uapi. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:36 +01:00
David Hildenbrand	ea5f496925	KVM: s390: only one external call may be pending at a time Only one external call may be pending at a vcpu at a time. For this reason, we have to detect whether the SIGP externcal call interpretation facility is available. If so, all external calls have to be injected using this mechanism. SIGP EXTERNAL CALL orders have to return whether another external call is already pending. This check was missing until now. SIGP SENSE hasn't returned yet in all conditions whether an external call was pending. If a SIGP EXTERNAL CALL irq is to be injected and one is already pending, -EBUSY is returned. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:36 +01:00
David Hildenbrand	d614be05c8	s390/sclp: introduce check for the SIGP Interpretation Facility This patch introduces the infrastructure to check whether the SIGP Interpretation Facility is installed on all VCPUs in the configuration. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:35 +01:00
David Hildenbrand	a3a9c59a68	KVM: s390: SIGP SET PREFIX cleanup This patch cleanes up the the SIGP SET PREFIX code. A SIGP SET PREFIX irq may only be injected if the target vcpu is stopped. Let's move the checking code into the injection code and return -EBUSY if the target vcpu is not stopped. Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:34 +01:00
David Hildenbrand	9a022067ad	KVM: s390: a VCPU may only stop when no interrupts are left pending As a SIGP STOP is an interrupt with the least priority, it may only result in stop of the vcpu when no other interrupts are left pending. To detect whether a non-stop irq is pending, we need a way to mask out stop irqs from the general kvm_cpu_has_interrupt() function. For this reason, the existing function (with an outdated name) is replaced by kvm_s390_vcpu_has_irq() which allows to mask out pending stop irqs. Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:34 +01:00
David Hildenbrand	6cddd432e3	KVM: s390: handle stop irqs without action_bits This patch removes the famous action_bits and moves the handling of SIGP STOP AND STORE STATUS directly into the SIGP STOP interrupt. The new local interrupt infrastructure is used to track pending stop requests. STOP irqs are the only irqs that don't get actively delivered. They remain pending until the stop function is executed (=stop intercept). If another STOP irq is already pending, -EBUSY will now be returned (needed for the SIGP handling code). Migration of pending SIGP STOP (AND STORE STATUS) orders should now be supported out of the box. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:33 +01:00
David Hildenbrand	2822545f9f	KVM: s390: new parameter for SIGP STOP irqs In order to get rid of the action_flags and to properly migrate pending SIGP STOP irqs triggered e.g. by SIGP STOP AND STORE STATUS, we need to remember whether to store the status when stopping. For this reason, a new parameter (flags) for the SIGP STOP irq is introduced. These flags further define details of the requested STOP and can be easily migrated. Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:33 +01:00
David Hildenbrand	2d00f75942	KVM: s390: forward hrtimer if guest ckc not pending yet Patch `0759d0681c` ("KVM: s390: cleanup handle_wait by reusing kvm_vcpu_block") changed the way pending guest clock comparator interrupts are detected. It was assumed that as soon as the hrtimer wakes up, the condition for the guest ckc is satisfied. This is however only true as long as adjclock() doesn't speed up the monotonic clock. Reason is that the hrtimer is based on CLOCK_MONOTONIC, the guest clock comparator detection is based on the raw TOD clock. If CLOCK_MONOTONIC runs faster than the TOD clock, the hrtimer wakes the target VCPU up too early and the target VCPU will not detect any pending interrupts, therefore going back to sleep. It will never be woken up again because the hrtimer has finished. The VCPU is stuck. As a quick fix, we have to forward the hrtimer until the guest clock comparator is really due, to guarantee properly timed wake ups. As the hrtimer callback might be triggered on another cpu, we have to make sure that the timer is really stopped and not currently executing the callback on another cpu. This can happen if the vcpu thread is scheduled onto another physical cpu, but the timer base is not migrated. So lets use hrtimer_cancel instead of try_to_cancel. A proper fix might be to introduce a RAW based hrtimer. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:32 +01:00
David Hildenbrand	0ac96caf0f	KVM: s390: base hrtimer on a monotonic clock The hrtimer that handles the wait with enabled timer interrupts should not be disturbed by changes of the host time. This patch changes our hrtimer to be based on a monotonic clock. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:32 +01:00
David Hildenbrand	bda343ef14	KVM: s390: prevent sleep duration underflows in handle_wait() We sometimes get an underflow for the sleep duration, which most likely won't result in the short sleep time we wanted. So let's check for sleep duration underflows and directly continue to run the guest if we get one. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:31 +01:00
Dominik Dingel	8c0a7ce606	KVM: s390: Allow userspace to limit guest memory size With commit `c6c956b80b` ("KVM: s390/mm: support gmap page tables with less than 5 levels") we are able to define a limit for the guest memory size. As we round up the guest size in respect to the levels of page tables we get to guest limits of: 2048 MB, 4096 GB, 8192 TB and 16384 PB. We currently limit the guest size to 16 TB, which means we end up creating a page table structure supporting guest sizes up to 8192 TB. This patch introduces an interface that allows userspace to tune this limit. This may bring performance improvements for small guests. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:30 +01:00
Dominik Dingel	dafd032a15	KVM: s390: move vcpu specific initalization to a later point As we will allow in a later patch to recreate gmaps with new limits, we need to make sure that vcpus get their reference for that gmap after they increased the online_vcpu counter, so there is no possible race. While we are doing this, we also can simplify the vcpu_init function, by moving ucontrol specifics to an own function. That way we also start now setting the kvm_valid_regs for the ucontrol path. Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:30 +01:00
Christian Borntraeger	0675d92dcf	KVM: s390: make local function static sparse rightfully complains about warning: symbol '__inject_extcall' was not declared. Should it be static? Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:25:29 +01:00
Dominik Dingel	31928aa586	KVM: remove unneeded return value of vcpu_postcreate The return value of kvm_arch_vcpu_postcreate is not checked in its caller. This is okay, because only x86 provides vcpu_postcreate right now and it could only fail if vcpu_load failed. But that is not possible during KVM_CREATE_VCPU (kvm_arch_vcpu_load is void, too), so just get rid of the unchecked return value. Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2015-01-23 13:24:52 +01:00
Nicholas Krause	bab5bb3982	kvm: x86: Remove kvm_make_request from lapic.c Adds a function kvm_vcpu_set_pending_timer instead of calling kvm_make_request in lapic.c. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:08 +01:00
Nadav Amit	edccda7ca7	KVM: x86: Access to LDT/GDT that wraparound is incorrect When access to descriptor in LDT/GDT wraparound outside long-mode, the address of the descriptor should be truncated to 32-bit. Citing Intel SDM 2.1.1.1 "Global and Local Descriptor Tables in IA-32e Mode": "GDTR and LDTR registers are expanded to 64-bits wide in both IA-32e sub-modes (64-bit mode and compatibility mode)." So in other cases, we need to truncate. Creating new function to return a pointer to descriptor table to avoid too much code duplication. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> [Wrap 64-bit check with #ifdef CONFIG_X86_64, to avoid a "right shift count >= width of type" warning and consequent undefined behavior. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:08 +01:00
Nadav Amit	e2cefa746e	KVM: x86: Do not set access bit on accessed segments When segment is loaded, the segment access bit is set unconditionally. In fact, it should be set conditionally, based on whether the segment had the accessed bit set before. In addition, it can improve performance. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:07 +01:00
Nadav Amit	ab708099a0	KVM: x86: POP [ESP] is not emulated correctly According to Intel SDM: "If the ESP register is used as a base register for addressing a destination operand in memory, the POP instruction computes the effective address of the operand after it increments the ESP register." The current emulation does not behave so. The fix required to waste another of the precious instruction flags and to check the flag in decode_modrm. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:07 +01:00
Nadav Amit	80976dbb5c	KVM: x86: em_call_far should return failure result Currently, if em_call_far fails it returns success instead of the resulting error-code. Fix it. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:06 +01:00
Nadav Amit	3dc4bc4f6b	KVM: x86: JMP/CALL using call- or task-gate causes exception The KVM emulator does not emulate JMP and CALL that target a call gate or a task gate. This patch does not try to implement these scenario as they are presumably rare; yet it returns X86EMUL_UNHANDLEABLE error in such cases instead of generating an exception. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:05 +01:00
Nadav Amit	16bebefe29	KVM: x86: fnstcw and fnstsw may cause spurious exception Since the operand size of fnstcw and fnstsw is updated during the execution, the emulation may cause spurious exceptions as it reads the memory beforehand. Marking these instructions as Mov (since the previous value is ignored) and DstMem16 to simplify the setting of operand size. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:05 +01:00
Nadav Amit	3313bc4ee8	KVM: x86: pop sreg accesses only 2 bytes Although pop sreg updates RSP according to the operand size, only 2 bytes are read. The current behavior may result in incorrect #GP or #PF exceptions. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:04 +01:00
Paolo Bonzini	fa4a2c080e	KVM: x86: mmu: replace assertions with MMU_WARN_ON, a conditional WARN_ON This makes the direction of the conditions consistent with code that is already using WARN_ON. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:04 +01:00
Paolo Bonzini	4c1a50de92	KVM: x86: mmu: remove ASSERT(vcpu) Because ASSERT is just a printk, these would oops right away. The assertion thus hardly adds anything. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:03 +01:00
Paolo Bonzini	ad896af0b5	KVM: x86: mmu: remove argument to kvm_init_shadow_mmu and kvm_init_shadow_ept_mmu The initialization function in mmu.c can always use walk_mmu, which is known to be vcpu->arch.mmu. Only init_kvm_nested_mmu is used to initialize vcpu->arch.nested_mmu. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:02 +01:00
Paolo Bonzini	e0c6db3e22	KVM: x86: mmu: do not use return to tail-call functions that return void This is, pedantically, not valid C. It also looks weird. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:02 +01:00
Marcelo Tosatti	6c19b7538f	KVM: x86: add tracepoint to wait_lapic_expire Add tracepoint to wait_lapic_expire. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> [Remind reader if early or late. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:48:01 +01:00
Marcelo Tosatti	d0659d946b	KVM: x86: add option to advance tscdeadline hrtimer expiration For the hrtimer which emulates the tscdeadline timer in the guest, add an option to advance expiration, and busy spin on VM-entry waiting for the actual expiration time to elapse. This allows achieving low latencies in cyclictest (or any scenario which requires strict timing regarding timer expiration). Reduces average cyclictest latency from 12us to 8us on Core i5 desktop. Note: this option requires tuning to find the appropriate value for a particular hardware/guest combination. One method is to measure the average delay between apic_timer_fn and VM-entry. Another method is to start with 1000ns, and increase the value in say 500ns increments until avg cyclictest numbers stop decreasing. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:47:30 +01:00
Marcelo Tosatti	7c6a98dfa1	KVM: x86: add method to test PIR bitmap vector kvm_x86_ops->test_posted_interrupt() returns true/false depending whether 'vector' is set. Next patch makes use of this interface. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:45:18 +01:00
Tiejun Chen	b4eef9b36d	kvm: x86: vmx: NULL out hwapic_isr_update() in case of !enable_apicv In most cases calling hwapic_isr_update(), we always check if kvm_apic_vid_enabled() == 1, but actually, kvm_apic_vid_enabled() -> kvm_x86_ops->vm_has_apicv() -> vmx_vm_has_apicv() or '0' in svm case -> return enable_apicv && irqchip_in_kernel(kvm) So its a little cost to recall vmx_vm_has_apicv() inside hwapic_isr_update(), here just NULL out hwapic_isr_update() in case of !enable_apicv inside hardware_setup() then make all related stuffs follow this. Note we don't check this under that condition of irqchip_in_kernel() since we should make sure definitely any caller don't work without in-kernel irqchip. Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:45:17 +01:00
Nicholas Krause	5ff22e7ebf	KVM: x86: Remove FIXMEs in emulate.c for the function,task_switch_32 Remove FIXME comments about needing fault addresses to be returned. These are propaagated from walk_addr_generic to gva_to_gpa and from there to ops->read_std and ops->write_std. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:45:17 +01:00
Eugene Korenevsky	19d5f10b3a	KVM: nVMX: consult PFEC_MASK and PFEC_MATCH when generating #PF VM-exit When generating #PF VM-exit, check equality: (PFEC & PFEC_MASK) == PFEC_MATCH If there is equality, the 14 bit of exception bitmap is used to take decision about generating #PF VM-exit. If there is inequality, inverted 14 bit is used. Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:45:16 +01:00
Eugene Korenevsky	e9ac033e6b	KVM: nVMX: Improve nested msr switch checking This patch improve checks required by Intel Software Developer Manual. - SMM MSRs are not allowed. - microcode MSRs are not allowed. - check x2apic MSRs only when LAPIC is in x2apic mode. - MSR switch areas must be aligned to 16 bytes. - address of first and last byte in MSR switch areas should not set any bits beyond the processor's physical-address width. Also it adds warning messages on failures during MSR switch. These messages are useful for people who debug their VMMs in nVMX. Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:45:15 +01:00
Wincy Van	ff651cb613	KVM: nVMX: Add nested msr load/restore algorithm Several hypervisors need MSR auto load/restore feature. We read MSRs from VM-entry MSR load area which specified by L1, and load them via kvm_set_msr in the nested entry. When nested exit occurs, we get MSRs via kvm_get_msr, writing them to L1`s MSR store area. After this, we read MSRs from VM-exit MSR load area, and load them via kvm_set_msr. Signed-off-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-01-08 22:45:14 +01:00
Linus Torvalds	b1940cd21c	Linux 3.19-rc3	2015-01-05 17:05:20 -08:00
Linus Torvalds	79b8cb9737	powerpc fixes for 3.19 Wire up sys_execveat(). Tested on 32 & 64 bit. Fix for kdump on LE systems with cpus hot unplugged. Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this broke other platforms, we'll do a proper fix for 3.20. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUqillAAoJEFHr6jzI4aWA+aYP/0zMNyRbQehtpF9OLKw2TGJW LeSdQMN+g6JWKzG2hsNIiKga5kquuCi0zgtu3dW/rxlzbXY0gxwAY7HFkjzgE7Em qlt/iairlg8ifE436DRCkH2SXTCDN2AnlRLO35QZ2zKoo8osJuA6YjDiQRN3Y//p ELvWraYoVdhbfGPg+gYJn94U/OAmFu0E0VrwEmpWie1YrfTNbbfTkS/BieqhqJTu UEX53tW6FXbmit3ReQGGsPzcoVWbXfIg09gRy14Bn8dtA1EipHKr1m5o1JCjGu9J moJVB8cMkj5o4OcQrPtJZI76FkzaVueNUVTUdFtIt1q6NCyvX8tZe16OKPYY0TEN j8snpJfsk+oZij5caYSKqTeMAPrh6JbIjbkl8cIwc7ClRkRf/l5sW0zaFPS+GDWl c9OFu1CdKO59g+apQ16BmMRw8b/pV22WkxSpqm9GD3I7Y7ABoUN+1l07d5H+aGRq khD7M2rUWh4CwSRvg+ealpyAP++4Sj6mn/z4pe/gem4Wn/ynxGi6CSsYYijB6PWP bm47t+QqyomRiH65Yo41vKGNYu/JfenvAWyJp8AnIajWqxwCRLVNSgySjtylUESs f02qdvLsBTd3/CorZxigWgThM1wWeKyCDO6NBbjVUC/L9O8Ck9U0WbJUVQFDgG0b +52Wv5FRb/xBF9XdpWsw =1yze -----END PGP SIGNATURE----- Merge tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc fixes from Michael Ellerman: - Wire up sys_execveat(). Tested on 32 & 64 bit. - Fix for kdump on LE systems with cpus hot unplugged. - Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this broke other platforms, we'll do a proper fix for 3.20. * tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: Revert "powerpc: Secondary CPUs must set cpu_callin_map after setting active and online" powerpc/kdump: Ignore failure in enabling big endian exception during crash powerpc: Wire up sys_execveat() syscall	2015-01-05 14:49:02 -08:00
Linus Torvalds	f40bde8588	Add execveat syscall -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJUqwAbAAoJEKurIx+X31iBQCkP/3W3+bo2uabm8sjz6ivBi3N8 FW+xiBeyKoCsJIbsdr+txpWoSeIJSQ2Wgd3QwB5L4ztrsykZMSXDg5NzMnnsWh/A 30EfjraillriN9PHt+jcBBEr43VIMoxHA4E8HOuG08tiEaEQMXyGE0PirbNIB2lu Dh5dq5u+fPA9KiD6lHLLfYSAq2PAOl425tsRRRWqaivWjyu0N5pvZJ7zsNN1bFv1 zVlA++IK7XyiM8QuGIMsQ3hLbVs3tdWVeqlS3CcqHRNi2rHQaRI5oX4Rdfj573Tn M7LfYINdhEzzggOpAK9o3olzF9+Z7E9Dj+hSO70GercgD0m3uaRPIgTig9SGP8nq IctyFjhC0MfTPMql83fhmcqEyXJelzKXpELT1FG3MpPUlSMXKyOHWxwfqyfUIj1z 5JV8hurPME0p5FlfLWVaA0o3tncx26SYZ3y7Tdd5fHoOkaZj+Wqcdgs48h5o+1UL VjtCh9JUTx5jwn1+hviPxi4l0gfLwcBLVMqV1F3UCBLMeFD9lRuUs9HiX6tQKXAv E6RR5CdGZPgzqgcmoNQU73JjleZWlMqNDgn/8QpZ1NWEOq8wkPJqKkbPL1wfO7yC Fso4tsRwJn7ykpz4xbEoXYaoTycuIPgqoddxv/OqWw1bTAtlk/mo1sxalrg0XLrv mdp5VTjy4jf+/m6HcOid =0lKw -----END PGP SIGNATURE----- Merge tag 'please-pull-syscall' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull ia64 fixlet from Tony Luck: "Add execveat syscall" * tag 'please-pull-syscall' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: [IA64] Enable execveat syscall for ia64	2015-01-05 14:31:20 -08:00
Tony Luck	b739896dd2	[IA64] Enable execveat syscall for ia64 See commit `51f39a1f0c` syscalls: implement execveat() system call Signed-off-by: Tony Luck <tony.luck@intel.com>	2015-01-05 11:25:19 -08:00
Linus Torvalds	693a30b8f1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml Pull UML fixes from Richard Weinberger: "Two fixes for UML regressions. Nothing exciting" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml: x86, um: actually mark system call tables readonly um: Skip futex_atomic_cmpxchg_inatomic() test	2015-01-04 11:46:43 -08:00
Pavel Machek	4bf9636c39	Revert "ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo" Commit `9fc2105aea` ("ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo") breaks audio in python, and probably elsewhere, with message FATAL: cannot locate cpu MHz in /proc/cpuinfo I'm not the first one to hit it, see for example https://theredblacktree.wordpress.com/2014/08/10/fatal-cannot-locate-cpu-mhz-in-proccpuinfo/ https://devtalk.nvidia.com/default/topic/765800/workaround-for-fatal-cannot-locate-cpu-mhz-in-proc-cpuinf/?offset=1 Reading original changelog, I have to say "Stop breaking working setups. You know who you are!". Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-01-04 11:39:59 -08:00
Daniel Borkmann	b485342bd7	x86, um: actually mark system call tables readonly Commit `a074335a37` ("x86, um: Mark system call tables readonly") was supposed to mark the sys_call_table in UML as RO by adding the const, but it doesn't have the desired effect as it's nevertheless being placed into the data section since __cacheline_aligned enforces sys_call_table being placed into .data..cacheline_aligned instead. We need to use the ____cacheline_aligned version instead to fix this issue. Before: $ nm -v arch/x86/um/sys_call_table_64.o \| grep -1 "sys_call_table" U sys_writev 0000000000000000 D sys_call_table 0000000000000000 D syscall_table_size After: $ nm -v arch/x86/um/sys_call_table_64.o \| grep -1 "sys_call_table" U sys_writev 0000000000000000 R sys_call_table 0000000000000000 D syscall_table_size Fixes: `a074335a37` ("x86, um: Mark system call tables readonly") Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2015-01-04 14:21:25 +01:00
Richard Weinberger	f911d73105	um: Skip futex_atomic_cmpxchg_inatomic() test futex_atomic_cmpxchg_inatomic() does not work on UML because it triggers a copy_from_user() in kernel context. On UML copy_from_user() can only be used if the kernel was called by a real user space process such that UML can use ptrace() to fetch the value. Reported-by: Miklos Szeredi <miklos@szeredi.hu> Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Richard Weinberger <richard@nod.at> Tested-by: Daniel Walter <d.walter@0x90.at>	2015-01-04 14:20:26 +01:00
Linus Torvalds	d753856c9f	SCSI fixes on 20150102 This is a set of three fixes: one to correct an abort path thinko causing failures (and a panic) in USB on device misbehaviour, One to fix an out of order issue in the fnic driver and one to match discard expectations to qemu which otherwise cause Linux to behave badly as a guest. Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJUpsmHAAoJEDeqqVYsXL0MdToH/1UsmOdtxNh6AfBDWWYi45o8 kBno1gTnRrDgIlOiRu+BtmL25A1rNcCdCQKjG6JBEBruqUVAwZztktGjfuSvso6s EYbrnE+5DmDs6cW8pp6GK3QGV+R+AmbT8oBbe/Kpg5LdTsdnQOozSycUp7X3XgTi pLOm6rW/AgEV1QFcQz1bjI6cnbcOZMcGZnC5qwphiKgBnVYd+PZY24RSHDKCu/va z2lsa5yqXFHKZZZRhYG343YqCTf3Dkph78124JoNvVm3EjO+GQAAiojiUa7l59UF RqNRqeMxfz2cPmBnJxbNmWiP1YQGBgOaNDRgc7D7SPxaMwUe9444Gm4MkoZWzmQ= =706l -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of three fixes: one to correct an abort path thinko causing failures (and a panic) in USB on device misbehaviour, One to fix an out of order issue in the fnic driver and one to match discard expectations to qemu which otherwise cause Linux to behave badly as a guest" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: SCSI: fix regression in scsi_send_eh_cmnd() fnic: IOMMU Fault occurs when IO and abort IO is out of order sd: tweak discard heuristics to work around QEMU SCSI issue	2015-01-02 13:24:41 -08:00
Linus Torvalds	6a4bfa7c3f	sound fixes for 3.19-rc3 Nothing too exciting as a new year's start here: most of fixes are for ASoC, a boot crash fix on OMAP for deferred probe, a few driver specific fixes (Intel, dwc, rockchip, rt5677), in addition to typo fixes in kerneldoc comments for PCM. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJUplctAAoJEGwxgFQ9KSmkmjwP/0Ccw2+8sh3uqPbmiwCBOYYL AUo0nhjvsoeCFqgs1rUdQBEeVqvTw5wckhmnvDAYri4wSlnT078m9o4lJpTo/Bem 1VngAWjcA7CErhaU9G988znGAuIWhf9nr/DVhabIr22vmsPrGUDQrC3B/28596mn Pcj2H412vZRMOu6+ZDKX+9oxAWBnP9UDD3hjrKjTCUxwVlhfxAfdbsPNbkivIrLb QAkhqIHuMlb3D2MaBUjbQ+CQf+Q0N1szZoy9t83OCUgbNhmNObuwvso21mD92mwQ 4reW0TdFHXOk3HrY0RBMkvnmrRzZoPBWxBH0bCTLO87rBtZC11IdkMpQ9mZ7gO9Q p/P8z+S1I8nuVmFamW0IbSjslVl8+zh/hJ3H+t4bZlOJgXevcl+VwIocsNcdPhTo V8CqcgwKLwwMuPOgHsiHk+2uCDxm23zmlgo5jyOuKuSnKxo5qu1aH3Mw7mU3S3dd +AOWmM4ighCK2rXkFJSH4UV7/bFIFGS0wfC7adqHW2BaXS/CRStwoYw/7C48c9Mk bY3vSjk7HwOebdnhwXclqWpLDPnmAWWfQvYW+mYakIWX0+qLpL1BFQUr6fYwmLMP GpPBdgqZmrchwEKQM5Wg+Gz4jlkOXzs1qUMidq4f6JxM2A1H2A8BfTz3pScJqqDh +lHj5upG2yVpYQbf4DEf =ARQX -----END PGP SIGNATURE----- Merge tag 'sound-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Nothing too exciting as a new year's start here: most of fixes are for ASoC, a boot crash fix on OMAP for deferred probe, a few driver specific fixes (Intel, dwc, rockchip, rt5677), in addition to typo fixes in kerneldoc comments for PCM" * tag 'sound-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: pcm: Fix kerneldoc for params_*() functions ASoC: rockchip: i2s: fix maxburst of dma data to 4 ASoC: rockchip: i2s: fix error defination of transmit data level ASoC: Intel: correct the fixed free block allocation ASoC: rt5677: fixed rt5677_dsp_vad_put rt5677_dsp_vad_get panic ASoC: Intel: Fix BYTCR machine driver MODULE_ALIAS ASoC: Intel: Fix BYTCR firmware name ASoC: dwc: Iterate over all channels ASoC: dwc: Ensure FIFOs are flushed to prevent channel swap ASoC: Intel: Add I2C dependency to two new machines ASoC: dapm: Remove snd_soc_of_parse_audio_routing() due to deferred probe	2015-01-02 12:57:20 -08:00

1 2 3 4 5 ...

494812 Commits