linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-13 23:51:39 +00:00

Author	SHA1	Message	Date
Huang Ying	482908b49e	ACPI, APEI, Use ERST for persistent storage of MCE Traditionally, fatal MCE will cause Linux print error log to console then reboot. Because MCE registers will preserve their content after warm reboot, the hardware error can be logged to disk or network after reboot. But system may fail to warm reboot, then you may lose the hardware error log. ERST can help here. Through saving the hardware error log into flash via ERST before go panic, the hardware error log can be gotten from the flash after system boot successful again. The fatal MCE processing procedure with ERST involved is as follow: - Hardware detect error, MCE raised - MCE read MCE registers, check error severity (fatal), prepare error record - Write MCE error record into flash via ERST - Go panic, then trigger system reboot - System reboot, /sbin/mcelog run, it reads /dev/mcelog to check flash for error record of previous boot via ERST, and output and clear them if available - /sbin/mcelog logs error records into disk or network ERST only accepts CPER record format, but there is no pre-defined CPER section can accommodate all information in struct mce, so a customized section type is defined to hold struct mce inside a CPER record as an error section. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2010-05-19 22:41:40 -04:00
Huang Ying	d334a49113	ACPI, APEI, Generic Hardware Error Source memory error support Generic Hardware Error Source provides a way to report platform hardware errors (such as that from chipset). It works in so called "Firmware First" mode, that is, hardware errors are reported to firmware firstly, then reported to Linux by firmware. This way, some non-standard hardware error registers or non-standard hardware link can be checked by firmware to produce more valuable hardware error information for Linux. Now, only SCI notification type and memory errors are supported. More notification type and hardware error type will be added later. These memory errors are reported to user space through /dev/mcelog via faking a corrected Machine Check, so that the error memory page can be offlined by /sbin/mcelog if the error count for one page is beyond the threshold. On some machines, Machine Check can not report physical address for some corrected memory errors, but GHES can do that. So this simplified GHES is implemented firstly. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2010-05-19 22:41:16 -04:00
Linus Torvalds	6fa0fddd5f	Merge branch 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus-hpet' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hpet: Add reference to chipset erratum documentation for disable-hpet-msi-quirk x86, hpet: Restrict read back to affected ATI chipsets	2010-05-19 17:10:24 -07:00
Linus Torvalds	7d02093e29	Merge branch 'timers-for-linus-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: avr32: Fix typo in read_persistent_clock() sparc: Convert sparc to use read/update_persistent_clock cris: Convert cris to use read/update_persistent_clock m68k: Convert m68k to use read/update_persistent_clock m32r: Convert m32r to use read/update_peristent_clock blackfin: Convert blackfin to use read/update_persistent_clock ia64: Convert ia64 to use read/update_persistent_clock avr32: Convert avr32 to use read/update_persistent_clock h8300: Convert h8300 to use read/update_persistent_clock frv: Convert frv to use read/update_persistent_clock mn10300: Convert mn10300 to use read/update_persistent_clock alpha: Convert alpha to use read/update_persistent_clock xtensa: Fix unnecessary setting of xtime time: Clean up direct xtime usage in xen	2010-05-19 17:10:06 -07:00
Avi Kivity	8fbf065d62	KVM: x86: Add missing locking to arch specific vcpu ioctls Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:41:11 +03:00
Avi Kivity	3dbe141595	KVM: MMU: Segregate shadow pages with different cr0.wp When cr0.wp=0, we may shadow a gpte having u/s=1 and r/w=0 with an spte having u/s=0 and r/w=1. This allows excessive access if the guest sets cr0.wp=1 and accesses through this spte. Fix by making cr0.wp part of the base role; we'll have different sptes for the two cases and the problem disappears. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:09 +03:00
Sheng Yang	a3d204e285	KVM: x86: Check LMA bit before set_efer kvm_x86_ops->set_efer() would execute vcpu->arch.efer = efer, so the checking of LMA bit didn't work. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:09 +03:00
Avi Kivity	f78e917688	KVM: Don't allow lmsw to clear cr0.pe The current lmsw implementation allows the guest to clear cr0.pe, contrary to the manual, which breaks EMM386.EXE. Fix by ORing the old cr0.pe with lmsw's operand. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:08 +03:00
Glauber Costa	371bcf646d	KVM: x86: Tell the guest we'll warn it about tsc stability This patch puts up the flag that tells the guest that we'll warn it about the tsc being trustworthy or not. By now, we also say it is not. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:06 +03:00
Glauber Costa	3a0d7256a6	x86, paravirt: don't compute pvclock adjustments if we trust the tsc If the HV told us we can fully trust the TSC, skip any correction Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:05 +03:00
Glauber Costa	838815a787	x86: KVM guest: Try using new kvm clock msrs We now added a new set of clock-related msrs in replacement of the old ones. In theory, we could just try to use them and get a return value indicating they do not exist, due to our use of kvm_write_msr_save. However, kvm clock registration happens very early, and if we ever try to write to a non-existant MSR, we raise a lethal #GP, since our idt handlers are not in place yet. So this patch tests for a cpuid feature exported by the host to decide which set of msrs are supported. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:04 +03:00
Glauber Costa	84478c829d	KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID Right now, we were using individual KVM_CAP entities to communicate userspace about which cpuids we support. This is suboptimal, since it generates a delay between the feature arriving in the host, and being available at the guest. A much better mechanism is to list para features in KVM_GET_SUPPORTED_CPUID. This makes userspace automatically aware of what we provide. And if we ever add a new cpuid bit in the future, we have to do that again, which create some complexity and delay in feature adoption. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:03 +03:00
Glauber Costa	0e6ac58acb	KVM: x86: add new KVMCLOCK cpuid feature This cpuid, KVM_CPUID_CLOCKSOURCE2, will indicate to the guest that kvmclock is available through a new set of MSRs. The old ones are deprecated. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:02 +03:00
Glauber Costa	11c6bffa42	KVM: x86: change msr numbers for kvmclock Avi pointed out a while ago that those MSRs falls into the pentium PMU range. So the idea here is to add new ones, and after a while, deprecate the old ones. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:01 +03:00
Glauber Costa	489fb490db	x86, paravirt: Add a global synchronization point for pvclock In recent stress tests, it was found that pvclock-based systems could seriously warp in smp systems. Using ingo's time-warp-test.c, I could trigger a scenario as bad as 1.5mi warps a minute in some systems. (to be fair, it wasn't that bad in most of them). Investigating further, I found out that such warps were caused by the very offset-based calculation pvclock is based on. This happens even on some machines that report constant_tsc in its tsc flags, specially on multi-socket ones. Two reads of the same kernel timestamp at approx the same time, will likely have tsc timestamped in different occasions too. This means the delta we calculate is unpredictable at best, and can probably be smaller in a cpu that is legitimately reading clock in a forward ocasion. Some adjustments on the host could make this window less likely to happen, but still, it pretty much poses as an intrinsic problem of the mechanism. A while ago, I though about using a shared variable anyway, to hold clock last state, but gave up due to the high contention locking was likely to introduce, possibly rendering the thing useless on big machines. I argue, however, that locking is not necessary. We do a read-and-return sequence in pvclock, and between read and return, the global value can have changed. However, it can only have changed by means of an addition of a positive value. So if we detected that our clock timestamp is less than the current global, we know that we need to return a higher one, even though it is not exactly the one we compared to. OTOH, if we detect we're greater than the current time source, we atomically replace the value with our new readings. This do causes contention on big boxes (but big here means BIG), but it seems like a good trade off, since it provide us with a time source guaranteed to be stable wrt time warps. After this patch is applied, I don't see a single warp in time during 5 days of execution, in any of the machines I saw them before. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Zachary Amsden <zamsden@redhat.com> CC: Jeremy Fitzhardinge <jeremy@goop.org> CC: Avi Kivity <avi@redhat.com> CC: Marcelo Tosatti <mtosatti@redhat.com> CC: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:41:00 +03:00
Glauber Costa	424c32f1aa	x86, paravirt: Enable pvclock flags in vcpu_time_info structure This patch removes one padding byte and transform it into a flags field. New versions of guests using pvclock will query these flags upon each read. Flags, however, will only be interpreted when the guest decides to. It uses the pvclock_valid_flags function to signal that a specific set of flags should be taken into consideration. Which flags are valid are usually devised via HV negotiation. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: Jeremy Fitzhardinge <jeremy@goop.org> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:40:59 +03:00
Roedel, Joerg	b69e8caef5	KVM: x86: Inject #GP with the right rip on efer writes This patch fixes a bug in the KVM efer-msr write path. If a guest writes to a reserved efer bit the set_efer function injects the #GP directly. The architecture dependent wrmsr function does not see this, assumes success and advances the rip. This results in a #GP in the guest with the wrong rip. This patch fixes this by reporting efer write errors back to the architectural wrmsr function. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:39 +03:00
Joerg Roedel	0d945bd935	KVM: SVM: Don't allow nested guest to VMMCALL into host This patch disables the possibility for a l2-guest to do a VMMCALL directly into the host. This would happen if the l1-hypervisor doesn't intercept VMMCALL and the l2-guest executes this instruction. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:38 +03:00
Joerg Roedel	3f0fd2927b	KVM: x86: Fix exception reinjection forced to true The patch merged recently which allowed to mark an exception as reinjected has a bug as it always marks the exception as reinjected. This breaks nested-svm shadow-on-shadow implementation. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:37 +03:00
Avi Kivity	9ed3c444ab	KVM: Fix wallclock version writing race Wallclock writing uses an unprotected global variable to hold the version; this can cause one guest to interfere with another if both write their wallclock at the same time. Acked-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:36 +03:00
Avi Kivity	8facbbff07	KVM: MMU: Don't read pdptrs with mmu spinlock held in mmu_alloc_roots On svm, kvm_read_pdptr() may require reading guest memory, which can sleep. Push the spinlock into mmu_alloc_roots(), and only take it after we've read the pdptr. Tested-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:35 +03:00
Shane Wang	cafd66595d	KVM: VMX: enable VMXON check with SMX enabled (Intel TXT) Per document, for feature control MSR: Bit 1 enables VMXON in SMX operation. If the bit is clear, execution of VMXON in SMX operation causes a general-protection exception. Bit 2 enables VMXON outside SMX operation. If the bit is clear, execution of VMXON outside SMX operation causes a general-protection exception. This patch is to enable this kind of check with SMX for VMXON in KVM. Signed-off-by: Shane Wang <shane.wang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:34 +03:00
Marcelo Tosatti	f1d86e469b	KVM: x86: properly update ready_for_interrupt_injection The recent changes to emulate string instructions without entering guest mode exposed a bug where pending interrupts are not properly reflected in ready_for_interrupt_injection. The result is that userspace overwrites a previously queued interrupt, when irqchip's are emulated in userspace. Fix by always updating state before returning to userspace. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:33 +03:00
Avi Kivity	84ad33ef5d	KVM: VMX: Atomically switch efer if EPT && !EFER.NX When EPT is enabled, we cannot emulate EFER.NX=0 through the shadow page tables. This causes accesses through ptes with bit 63 set to succeed instead of failing a reserved bit check. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:32 +03:00
Avi Kivity	61d2ef2ce3	KVM: VMX: Add facility to atomically switch MSRs on guest entry/exit Some guest msr values cannot be used on the host (for example. EFER.NX=0), so we need to switch them atomically during guest entry or exit. Add a facility to program the vmx msr autoload registers accordingly. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:31 +03:00
Avi Kivity	5dfa3d170e	KVM: VMX: Add definitions for guest and host EFER autoswitch vmcs entries Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:31 +03:00
Avi Kivity	19b95dba03	KVM: VMX: Add definition for msr autoload entry Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:30 +03:00
Avi Kivity	0ee75bead8	KVM: Let vcpu structure alignment be determined at runtime vmx and svm vcpus have different contents and therefore may have different alignmment requirements. Let each specify its required alignment. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-19 11:36:29 +03:00
Xiao Guangrong	884a0ff0b6	KVM: MMU: cleanup invlpg code Using is_last_spte() to cleanup invlpg code Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:36:28 +03:00
Xiao Guangrong	5e1b3ddbf2	KVM: MMU: move unsync/sync tracpoints to proper place Move unsync/sync tracepoints to the proper place, it's good for us to obtain unsync page live time Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:36:27 +03:00
Xiao Guangrong	85f2067c31	KVM: MMU: convert mmu tracepoints Convert mmu tracepoints by using DECLARE_EVENT_CLASS Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:36:26 +03:00
Xiao Guangrong	22c9b2d166	KVM: MMU: fix for calculating gpa in invlpg code If the guest is 32-bit, we should use 'quadrant' to adjust gpa offset Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:36:25 +03:00
Gui Jianfeng	d35b8dd935	KVM: Fix mmu shrinker error kvm_mmu_remove_one_alloc_mmu_page() assumes kvm_mmu_zap_page() only reclaims only one sp, but that's not the case. This will cause mmu shrinker returns a wrong number. This patch fix the counting error. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:36:23 +03:00
Eric Northup	5a7388c2d2	KVM: MMU: fix hashing for TDP and non-paging modes For TDP mode, avoid creating multiple page table roots for the single guest-to-host physical address map by fixing the inputs used for the shadow page table hash in mmu_alloc_roots(). Signed-off-by: Eric Northup <digitaleric@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-19 11:36:22 +03:00
Cyrill Gorcunov	ce7f15452c	perf, x86: P4 PMU -- add missing bit in CCCR mask Should be there for the sake of RAW events. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Lin Ming <ming.m.lin@intel.com> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100518212439.354345151@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-19 09:41:06 +02:00
Cyrill Gorcunov	9d36dfcf21	perf, x86: P4_pmu_schedule_events -- use smp_processor_id instead of raw_ This snippet somehow escaped the commit: \| commit `137351e0fe` \| Author: Cyrill Gorcunov <gorcunov@openvz.org> \| Date: Sat May 8 15:25:52 2010 +0400 \| \| x86, perf: P4 PMU -- protect sensible procedures from preemption so bring it eventually back. It helps to catch preemption issue (if there will be, rule of thumb -- don't use raw_ if you can). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100518212439.167259349@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-19 09:41:05 +02:00
Cyrill Gorcunov	623aab896e	perf, x86: P4 PMU -- do a real check for ESCR address being in hash To prevent from clashes in future code modifications do a real check for ESCR address being in hash. At moment the callers are known to pass sane values but better to be on a safe side. And comment fix. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> CC: Lin Ming <ming.m.lin@intel.com> CC: Peter Zijlstra <a.p.zijlstra@chello.nl> CC: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20100518212439.004503600@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-19 09:41:05 +02:00
Feng Tang	a02ce953a1	x86/PCI: make ACPI MCFG reserved error messages ACPI specific Both ACPI and SFI firmwares will have MCFG space, but the error message isn't valid on SFI, so don't print the message in that case. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2010-05-18 15:03:27 -07:00
Linus Torvalds	537b60d178	Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV: uv_irq.c: Fix all sparse warnings x86, UV: Improve BAU performance and error recovery x86, UV: Delete unneeded boot messages x86, UV: Clean up UV headers for MMR definitions	2010-05-18 09:46:35 -07:00
Linus Torvalds	3ae684e1c4	Merge branch 'x86-txt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-txt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, tboot: Add support for S3 memory integrity protection	2010-05-18 09:28:24 -07:00
Linus Torvalds	c4fd308ed6	Merge branch 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pat: Update the page flags for memtype atomically instead of using memtype_lock x86, pat: In rbt_memtype_check_insert(), update new->type only if valid x86, pat: Migrate to rbtree only backend for pat memtype management x86, pat: Preparatory changes in pat.c for bigger rbtree change rbtree: Add support for augmented rbtrees	2010-05-18 09:28:04 -07:00
Linus Torvalds	96fbeb973a	Merge branch 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mrst: add nop functions to x86_init mpparse functions x86, mrst, pci: return 0 for non-present pci bars x86: Avoid check hlt for newer cpus	2010-05-18 09:27:49 -07:00
Linus Torvalds	1f8caa986a	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64: Combine SRAT regions when possible	2010-05-18 09:17:50 -07:00
Linus Torvalds	d6f3875252	Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/microcode: Use nonseekable_open() x86: Improve Intel microcode loader performance	2010-05-18 09:17:17 -07:00
Linus Torvalds	cb41838bbc	Merge branch 'core-hweight-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-hweight-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hweight: Use a 32-bit popcnt for __arch_hweight32() arch, hweight: Fix compilation errors x86: Add optimized popcnt variants bitops: Optimize hweight() by making use of compile-time evaluation	2010-05-18 09:17:01 -07:00
Linus Torvalds	98f01720cb	Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, acpi/irq: Define gsi_end when X86_IO_APIC is undefined x86, irq: Kill io_apic_renumber_irq x86, acpi/irq: Handle isa irqs that are not identity mapped to gsi's. x86, ioapic: Simplify probe_nr_irqs_gsi. x86, ioapic: Optimize pin_2_irq x86, ioapic: Move nr_ioapic_registers calculation to mp_register_ioapic. x86, ioapic: In mpparse use mp_register_ioapic x86, ioapic: Teach mp_register_ioapic to compute a global gsi_end x86, ioapic: Fix the types of gsi values x86, ioapic: Fix io_apic_redir_entries to return the number of entries. x86, ioapic: Only export mp_find_ioapic and mp_find_ioapic_pin in io_apic.h x86, acpi/irq: Generalize mp_config_acpi_legacy_irqs x86, acpi/irq: Fix acpi_sci_ioapic_setup so it has both bus_irq and gsi x86, acpi/irq: pci device dev->irq is an isa irq not a gsi x86, acpi/irq: Teach acpi_get_override_irq to take a gsi not an isa_irq x86, acpi/irq: Introduce apci_isa_irq_to_gsi	2010-05-18 09:15:57 -07:00
Linus Torvalds	41d59102e1	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, fpu: Use static_cpu_has() to implement use_xsave() x86: Add new static_cpu_has() function using alternatives x86, fpu: Use the proper asm constraint in use_xsave() x86, fpu: Unbreak FPU emulation x86: Introduce 'struct fpu' and related API x86: Eliminate TS_XSAVE x86-32: Don't set ignore_fpu_irq in simd exception x86: Merge kernel_math_error() into math_error() x86: Merge simd_math_error() into math_error() x86-32: Rework cache flush denied handler Fix trivial conflict in arch/x86/kernel/process.c	2010-05-18 08:58:16 -07:00
Linus Torvalds	07d77759c9	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hypervisor: add missing <linux/module.h> Modify the VMware balloon driver for the new x86_hyper API x86, hypervisor: Export the x86_hyper* symbols x86: Clean up the hypervisor layer x86, HyperV: fix up the license to mshyperv.c x86: Detect running on a Microsoft HyperV system x86, cpu: Make APERF/MPERF a normal table-driven flag x86, k8: Fix build error when K8_NB is disabled x86, cacheinfo: Disable index in all four subcaches x86, cacheinfo: Make L3 cache info per node x86, cacheinfo: Reorganize AMD L3 cache structure x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments x86, cacheinfo: Unify AMD L3 cache index disable checking cpufreq: Unify sysfs attribute definition macros powernow-k8: Fix frequency reporting x86, cpufreq: Add APERF/MPERF support for AMD processors x86: Unify APERF/MPERF support powernow-k8: Add core performance boost support x86, cpu: Add AMD core boosting feature flag to /proc/cpuinfo Fix up trivial conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c and drivers/cpufreq/cpufreq_ondemand.c	2010-05-18 08:49:13 -07:00
Linus Torvalds	b7723f9d21	Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Clean up arch/x86/Kconfig* x86-64: Don't export init_level4_pgt	2010-05-18 08:40:21 -07:00
Linus Torvalds	93c9d7f60c	Merge branch 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-atomic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix LOCK_PREFIX_HERE for uniprocessor build x86, atomic64: In selftest, distinguish x86-64 from 586+ x86-32: Fix atomic64_inc_not_zero return value convention lib: Fix atomic64_inc_not_zero test lib: Fix atomic64_add_unless return value convention x86-32: Fix atomic64_add_unless return value convention lib: Fix atomic64_add_unless test x86: Implement atomic[64]_dec_if_positive() lib: Only test atomic64_dec_if_positive on archs having it x86-32: Rewrite 32-bit atomic64 functions in assembly lib: Add self-test for atomic64_t x86-32: Allow UP/SMP lock replacement in cmpxchg64 x86: Add support for lock prefix in alternatives	2010-05-18 08:40:05 -07:00
Linus Torvalds	7421a10de7	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Use .cfi_sections for assembly code x86-64: Reduce SMP locks table size x86, asm: Introduce and use percpu_inc()	2010-05-18 08:35:37 -07:00
Linus Torvalds	4d7b4ac22f	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits) perf tools: Add mode to build without newt support perf symbols: symbol inconsistency message should be done only at verbose=1 perf tui: Add explicit -lslang option perf options: Type check all the remaining OPT_ variants perf options: Type check OPT_BOOLEAN and fix the offenders perf options: Check v type in OPT_U?INTEGER perf options: Introduce OPT_UINTEGER perf tui: Add workaround for slang < 2.1.4 perf record: Fix bug mismatch with -c option definition perf options: Introduce OPT_U64 perf tui: Add help window to show key associations perf tui: Make <- exit menus too perf newt: Add single key shortcuts for zoom into DSO and threads perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed perf newt: Fix the 'A'/'a' shortcut for annotate perf newt: Make <- exit the ui_browser x86, perf: P4 PMU - fix counters management logic perf newt: Make <- zoom out filters perf report: Report number of events, not samples perf hist: Clarify events_stats fields usage ... Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c	2010-05-18 08:19:03 -07:00
Linus Torvalds	3aaf51ace5	Merge branch 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits) oprofile/x86: make AMD IBS hotplug capable oprofile/x86: notify cpus only when daemon is running oprofile/x86: reordering some functions oprofile/x86: stop disabled counters in nmi handler oprofile/x86: protect cpu hotplug sections oprofile/x86: remove CONFIG_SMP macros oprofile/x86: fix uninitialized counter usage during cpu hotplug oprofile/x86: remove duplicate IBS capability check oprofile/x86: move IBS code oprofile/x86: return -EBUSY if counters are already reserved oprofile/x86: moving shutdown functions oprofile/x86: reserve counter msrs pairwise oprofile/x86: rework error handler in nmi_setup() oprofile: update file list in MAINTAINERS file oprofile: protect from not being in an IRQ context oprofile: remove double ring buffering ring-buffer: Add lost event count to end of sub buffer tracing: Show the lost events in the trace_pipe output ring-buffer: Add place holder recording of dropped events tracing: Fix compile error in module tracepoints when MODULE_UNLOAD not set ...	2010-05-18 08:18:07 -07:00
Linus Torvalds	1014cfe2fb	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: Reduce stack_trace usage lockdep: No need to disable preemption in debug atomic ops lockdep: Actually _dec_ in debug_atomic_dec lockdep: Provide off case for redundant_hardirqs_on increment lockdep: Simplify debug atomic ops lockdep: Fix redundant_hardirqs_on incremented with irqs enabled lockstat: Make lockstat counting per cpu i8253: Convert i8253_lock to raw_spinlock	2010-05-18 08:17:35 -07:00
Linus Torvalds	8123d8f17d	Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86/amd-iommu: Add amd_iommu=off command line option iommu-api: Remove iommu_{un}map_range functions x86/amd-iommu: Implement ->{un}map callbacks for iommu-api x86/amd-iommu: Make amd_iommu_iova_to_phys aware of multiple page sizes x86/amd-iommu: Make iommu_unmap_page and fetch_pte aware of page sizes x86/amd-iommu: Make iommu_map_page and alloc_pte aware of page sizes kvm: Change kvm_iommu_map_pages to map large pages VT-d: Change {un}map_range functions to implement {un}map interface iommu-api: Add ->{un}map callbacks to iommu_ops iommu-api: Add iommu_map and iommu_unmap functions iommu-api: Rename ->{un}map function pointers to ->{un}map_range	2010-05-18 07:22:37 -07:00
Cyrill Gorcunov	ef4f30f54e	perf, x86: P4 PMU -- fix typo in unflagged NMI handling Tested-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> LKML-Reference: <1274174954.22793.17.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-18 12:05:20 +02:00
Cyrill Gorcunov	0db1a7bc00	perf, x86: P4 PMU -- handle unflagged events It might happen that an event can overflow without the proper overflow flag set. Check the sign bit in the raw counter value to solve this problem. Tested-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: fweisbec@gmail.com Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1274083984.6540.15.camel@minggr.sh.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-18 08:25:34 +02:00
H. Peter Anvin	c59bd56882	x86, hweight: Use a 32-bit popcnt for __arch_hweight32() Use a 32-bit popcnt instruction for __arch_hweight32(), even on x86-64. Even though the input register will usually be zero-extended due to the standard operation of the hardware, it isn't necessarily so if the input value was the result of truncating a 64-bit operation. Note: the POPCNT32 variant used on x86-64 has a technically unnecessary REX prefix to make it five bytes long, the same as a CALL instruction, therefore avoiding an unnecessary NOP. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <alpine.LFD.2.00.1005171443060.4195@i5.linux-foundation.org>	2010-05-17 15:17:16 -07:00
Andreas Herrmann	fec84e3307	x86, hpet: Add reference to chipset erratum documentation for disable-hpet-msi-quirk (At the moment the "SB700 Family Product Errata" document is available at http://support.amd.com/us/Embedded_TechDocs/46837.pdf) Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20100517164324.GB10254@alberich.amd.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-05-17 10:04:43 -07:00
Sreedhara DS	9a58a33339	IPC driver for Intel Mobile Internet Device (MID) platforms The IPC (inter processor communications) is used to provide the communications between kernel and system control units on some embedded Intel x86 platforms. (Various bits of clean up and restructuring by Alan Cox) Signed-off-by: Sreedhara DS <sreedhara.ds@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com>	2010-05-17 12:06:07 -04:00
Anton Blanchard	f3d46f9d31	atomic_t: Cast to volatile when accessing atomic variables In preparation for removing volatile from the atomic_t definition, this patch adds a volatile cast to all the atomic read functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-17 07:57:27 -07:00
Gui Jianfeng	df2fb6e710	KVM: MMU: fix sp->unsync type error in trace event definition sp->unsync is bool now, so update trace event declaration. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:19:29 +03:00
Joerg Roedel	ff47a49b23	KVM: SVM: Handle MCE intercepts always on host level This patch prevents MCE intercepts from being propagated into the L1 guest if they happened in an L2 guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:27 +03:00
Joerg Roedel	ce7ddec4bb	KVM: x86: Allow marking an exception as reinjected This patch adds logic to kvm/x86 which allows to mark an injected exception as reinjected. This allows to remove an ugly hack from svm_complete_interrupts that prevented exceptions from being reinjected at all in the nested case. The hack was necessary because an reinjected exception into the nested guest could cause a nested vmexit emulation. But reinjected exceptions must not intercept. The downside of the hack is that a exception that in injected could get lost. This patch fixes the problem and puts the code for it into generic x86 files because. Nested-VMX will likely have the same problem and could reuse the code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:26 +03:00
Joerg Roedel	c2c63a4939	KVM: SVM: Report emulated SVM features to userspace This patch implements the reporting of the emulated SVM features to userspace instead of the real hardware capabilities. Every real hardware capability needs emulation in nested svm so the old behavior was broken. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:24 +03:00
Joerg Roedel	d4330ef2fb	KVM: x86: Add callback to let modules decide over some supported cpuid bits This patch adds the get_supported_cpuid callback to kvm_x86_ops. It will be used in do_cpuid_ent to delegate the decission about some supported cpuid bits to the architecture modules. Cc: stable@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:23 +03:00
Joerg Roedel	228070b1b3	KVM: SVM: Propagate nested entry failure into guest hypervisor This patch implements propagation of a failes guest vmrun back into the guest instead of killing the whole guest. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:21 +03:00
Joerg Roedel	2be4fc7a02	KVM: SVM: Sync cr0 and cr3 to kvm state before nested handling This patch syncs cr0 and cr3 from the vmcb to the kvm state before nested intercept handling is done. This allows to simplify the vmexit path. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:20 +03:00
Joerg Roedel	2041a06a50	KVM: SVM: Make sure rip is synced to vmcb before nested vmexit This patch fixes a bug where a nested guest always went over the same instruction because the rip was not advanced on a nested vmexit. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:18 +03:00
Joerg Roedel	924584ccb0	KVM: SVM: Fix nested nmi handling The patch introducing nested nmi handling had a bug. The check does not belong to enable_nmi_window but must be in nmi_allowed. This patch fixes this. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:17 +03:00
Avi Kivity	8d3b932309	Merge remote branch 'tip/perf/core' Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:15 +03:00
Lai Jiangshan	cdbecfc398	KVM: VMX: free vpid when fail to create vcpu Fix bug of the exception path, free allocated vpid when fail to create vcpu. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:19:10 +03:00
Wei Yongjun	77a1a71570	KVM: MMU: cleanup for function unaccount_shadowed() Since gfn is not changed in the for loop, we do not need to call gfn_to_memslot_unaliased() under the loop, and it is safe to move it out. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:12 +03:00
Gui Jianfeng	2a059bf444	KVM: Get rid of dead function gva_to_page() Nobody use gva_to_page() anymore, get rid of it. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:10 +03:00
Gui Jianfeng	b2fc15a5ef	KVM: MMU: Remove unused varialbe in rmap_next() Remove unused varialbe in rmap_next() Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:09 +03:00
Gui Jianfeng	814a59d207	KVM: MMU: Make use of is_large_pte() in walker Make use of is_large_pte() instead of checking PT_PAGE_SIZE_MASK bit directly. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:07 +03:00
Gui Jianfeng	51fb60d81b	KVM: MMU: Move sync_page() first pte address calculation out of loop Move first pte address calculation out of loop to save some cycles. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:06 +03:00
Avi Kivity	87bc3bf972	KVM: MMU: Drop cr4.pge from shadow page role Since commit `bf47a760f6`, we no longer handle ptes with the global bit set specially, so there is no reason to distinguish between shadow pages created with cr4.gpe set and clear. Such tracking is expensive when the guest toggles cr4.pge, so drop it. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:03 +03:00
Lai Jiangshan	90d83dc3d4	KVM: use the correct RCU API for PROVE_RCU=y The RCU/SRCU API have already changed for proving RCU usage. I got the following dmesg when PROVE_RCU=y because we used incorrect API. This patch coverts rcu_deference() to srcu_dereference() or family API. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by qemu-system-x86/8550: #0: (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm] #1: (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm] stack backtrace: Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27 Call Trace: [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm] [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm] [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm] [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm] [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel] [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm] [<ffffffff810a8692>] ? unlock_page+0x27/0x2c [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm] [<ffffffff81060cfa>] ? up_read+0x23/0x3d [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a [<ffffffff810021db>] system_call_fastpath+0x16/0x1b Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:18:01 +03:00
Avi Kivity	9beeaa2d68	Merge branch 'perf' Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:17:58 +03:00
Xiao Guangrong	3246af0ece	KVM: MMU: cleanup for hlist walk restart Quote from Avi: \|Just change the assignment to a 'goto restart;' please, \|I don't like playing with list_for_each internals. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:56 +03:00
Gleb Natapov	acb5451789	KVM: prevent spurious exit to userspace during task switch emulation. If kvm_task_switch() fails code exits to userspace without specifying exit reason, so the previous exit reason is reused by userspace. Fix this by specifying exit reason correctly. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:55 +03:00
Xiao Guangrong	6b18493d60	KVM: MMU: remove unused parameter in mmu_parent_walk() 'vcpu' is unused, remove it Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:53 +03:00
Xiao Guangrong	0571d366e0	KVM: MMU: reduce 'struct kvm_mmu_page' size Define 'multimapped' as 'bool'. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:52 +03:00
Xiao Guangrong	1b8c7934a4	KVM: MMU: remove unused struct kvm_unsync_walk Remove 'struct kvm_unsync_walk' since it's not used. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:50 +03:00
Gleb Natapov	19d0443726	KVM: fix emulator_task_switch() return value. emulator_task_switch() should return -1 for failure and 0 for success to the caller, just like x86_emulate_insn() does. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:49 +03:00
Avi Kivity	5b7e0102ae	KVM: MMU: Replace role.glevels with role.cr4_pae There is no real distinction between glevels=3 and glevels=4; both have exactly the same format and the code is treated exactly the same way. Drop role.glevels and replace is with role.cr4_pae (which is meaningful). This simplifies the code a bit. As a side effect, it allows sharing shadow page tables between pae and longmode guest page tables at the same guest page. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:47 +03:00
Jan Kiszka	e269fb2189	KVM: x86: Push potential exception error code on task switches When a fault triggers a task switch, the error code, if existent, has to be pushed on the new task's stack. Implement the missing bits. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:46 +03:00
Jan Kiszka	0760d44868	KVM: x86: Terminate early if task_switch_16/32 failed Stop the switch immediately if task_switch_16/32 returned an error. Only if that step succeeded, the switch should actually take place and update any register states. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:44 +03:00
Gleb Natapov	8f6abd06f5	KVM: x86: get rid of mmu_only parameter in emulator_write_emulated() We can call kvm_mmu_pte_write() directly from emulator_cmpxchg_emulated() instead of passing mmu_only down to emulator_write_emulated_onepage() and call it there. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:43 +03:00
Gleb Natapov	020df0794f	KVM: move DR register access handling into generic code Currently both SVM and VMX have their own DR handling code. Move it to x86.c. Acked-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:39 +03:00
Andre Przywara	6bc31bdc55	KVM: SVM: implement NEXTRIPsave SVM feature On SVM we set the instruction length of skipped instructions to hard-coded, well known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or AthlonII) have an explicit NEXTRIP field in the VMCB containing the desired information. Since it is cheap to do so, we use this field to override the guessed value on newer processors. A fix for older CPUs would be rather expensive, as it would require to fetch and partially decode the instruction. As the problem is not a security issue and needs special, handcrafted code to trigger (no compiler will ever generate such code), I omit a fix for older CPUs. If someone is interested, I have both a patch for these CPUs as well as demo code triggering this issue: It segfaults under KVM, but runs perfectly on native Linux. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:38 +03:00
Avi Kivity	f7a711971e	KVM: Fix MAXPHYADDR calculation when cpuid does not support it MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:36 +03:00
Avi Kivity	e46479f852	KVM: Trace emulated instructions Log emulated instructions in ftrace, especially if they failed. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:35 +03:00
Avi Kivity	2fb53ad811	KVM: x86 emulator: Don't overwrite decode cache Currently if we an instruction spans a page boundary, when we fetch the second half we overwrite the first half. This prevents us from tracing the full instruction opcodes. Fix by appending the second half to the first. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:33 +03:00
Xiao Guangrong	24222c2fec	KVM: MMU: remove unnecessary NX check in walk_addr After is_rsvd_bits_set() checks, EFER.NXE must be enabled if NX bit is seted Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:30 +03:00
Xiao Guangrong	f84cbb0561	KVM: MMU: remove unused field kvm_mmu_page.oos_link is not used, so remove it Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:29 +03:00
Xiao Guangrong	805d32dea4	KVM: MMU: cleanup/fix mmu audit code This patch does: - 'sp' parameter in inspect_spte_fn() is not used, so remove it - fix 'kvm' and 'slots' is not defined in count_rmaps() - fix a bug in inspect_spte_has_rmap() Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-05-17 12:17:27 +03:00
Avi Kivity	84b0c8c6a6	KVM: MMU: Disassociate direct maps from guest levels Direct maps are linear translations for a section of memory, used for real mode or with large pages. As such, they are independent of the guest levels. Teach the mmu about this by making page->role.glevels = 0 for direct maps. This allows direct maps to be shared among real mode and the various paging modes. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:16:44 +03:00
Xiao Guangrong	f815bce894	KVM: MMU: check reserved bits only if CR4.PSE=1 or CR4.PAE=1 - Check reserved bits only if CR4.PAE=1 or CR4.PSE=1 when guest #PF occurs - Fix a typo in reset_rsvds_bits_mask() Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-05-17 12:16:42 +03:00

1 2 3 4 5 ...

10673 Commits