linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-28 13:51:44 +00:00

Author	SHA1	Message	Date
Xiao Guangrong	0f53b5b1c0	KVM: MMU: cleanup pte write path This patch does: - call vcpu->arch.mmu.update_pte directly - use gfn_to_pfn_atomic in update_pte path The suggestion is from Avi. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:35 -03:00
Xiao Guangrong	5d163b1c9d	KVM: MMU: introduce a common function to get no-dirty-logged slot Cleanup the code of pte_prefetch_gfn_to_memslot and mapping_level_dirty_bitmap Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:34 -03:00
Xiao Guangrong	676646ee4b	KVM: MMU: remove unused macros These macros are not used, so removed Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	842f22ed9b	KVM: MMU: cleanup page alloc and free Using __get_free_page instead of alloc_page and page_address, using free_page instead of __free_page and virt_to_page Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	49b26e26e4	KVM: MMU: do not record gfn in kvm_mmu_pte_write No need to record the gfn to verify the pte has the same mode as current vcpu, because we speculatively update the pte only if the pte and vcpu have the same mode Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	1b7fd45c32	KVM: MMU: set spte accessed bit properly Set spte accessed bit only if guest_initiated == 1, which means the page was really accessed Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Xiao Guangrong	da8dc75f0c	KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits Only remove write access in the last sptes. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:32 -03:00
Jan Kiszka	e935b8372c	KVM: Convert kvm_lock to raw_spinlock Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-03-17 13:08:30 -03:00
Avi Kivity	8234b22e1c	KVM: MMU: Don't flush shadow when enabling dirty tracking Instead, drop large mappings, which were the reason we dropped shadow. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-03-17 13:08:24 -03:00
Andrea Arcangeli	8ee53820ed	thp: mmu_notifier_test_young For GRU and EPT, we need gup-fast to set referenced bit too (this is why it's correct to return 0 when shadow_access_mask is zero, it requires gup-fast to set the referenced bit). qemu-kvm access already sets the young bit in the pte if it isn't zero-copy, if it's zero copy or a shadow paging EPT minor fault we rely on gup-fast to signal the page is in use... We also need to check the young bits on the secondary pagetables for NPT and not nested shadow mmu as the data may never get accessed again by the primary pte. Without this closer accuracy, we'd have to remove the heuristic that avoids collapsing hugepages in hugepage virtual regions that have not even a single subpage in use. ->test_young is full backwards compatible with GRU and other usages that don't have young bits in pagetables set by the hardware and that should nuke the secondary mmu mappings when ->clear_flush_young runs just like EPT does. Removing the heuristic that checks the young bit in khugepaged/collapse_huge_page completely isn't so bad either probably but I thought it was worth it and this makes it reliable. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:46 -08:00
Andrea Arcangeli	936a5fe6e6	thp: kvm mmu transparent hugepage support This should work for both hugetlbfs and transparent hugepages. [akpm@linux-foundation.org: bring forward PageTransCompound() addition for bisectability] Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:41 -08:00
Xiao Guangrong	f8e453b00c	KVM: MMU: handle 'map_writable' in set_spte() function Move the operation of 'writable' to set_spte() to clean up code [avi: remove unneeded booleanification] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:31:19 +02:00
Xiao Guangrong	b034cf0105	KVM: MMU: audit: allow audit more guests at the same time It only allows to audit one guest in the system since: - 'audit_point' is a global variable - mmu_audit_disable() is called in kvm_mmu_destroy(), so audit is disabled after a guest exits. This patch fixes those issues and allows auditing more guests at the same time Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:31:17 +02:00
Avi Kivity	9f8fe5043f	KVM: Replace reads of vcpu->arch.cr3 by an accessor This allows us to keep cr3 in the VMCS, later on. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:31:15 +02:00
Marcelo Tosatti	e49146dce8	KVM: MMU: only write protect mappings at pagetable level If a pagetable contains a writeable large spte, all of its sptes will be write protected, including non-leaf ones, leading to endless pagefaults. Do not write protect pages above PT_PAGE_TABLE_LEVEL, as the spte fault paths assume non-leaf sptes are writable. Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:31:13 +02:00
Avi Kivity	c445f8ef43	KVM: MMU: Initialize base_role for tdp mmus Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:31:11 +02:00
Andre Przywara	dc25e89e07	KVM: SVM: copy instruction bytes from VMCB In case of a nested page fault or an intercepted #PF newer SVM implementations provide a copy of the faulting instruction bytes in the VMCB. Use these bytes to feed the instruction emulator and avoid the costly guest instruction fetch in this case. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:31:07 +02:00
Andre Przywara	51d8b66199	KVM: cleanup emulate_instruction emulate_instruction had many callers, but only one used all parameters. One parameter was unused, another one is now hidden by a wrapper function (required for a future addition anyway), so most callers use now a shorter parameter list. Signed-off-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:31:00 +02:00
Takuya Yoshikawa	d4dbf47009	KVM: MMU: Make the way of accessing lpage_info more generic Large page information has two elements but one of them, write_count, alone is accessed by a helper function. This patch replaces this helper function with more generic one which returns newly named kvm_lpage_info structure and use it to access the other element rmap_pde. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:47 +02:00
Xiao Guangrong	fb67e14fc9	KVM: MMU: retry #PF for softmmu Retry #PF for softmmu only when the current vcpu has the same cr3 as the time when #PF occurs Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:41 +02:00
Xiao Guangrong	2ec4739ddc	KVM: MMU: fix accessed bit set on prefault path Retry #PF is the speculative path, so don't set the accessed bit Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:40 +02:00
Xiao Guangrong	78b2c54aa4	KVM: MMU: rename 'no_apf' to 'prefault' It's the speculative path if 'no_apf = 1' and we will specially handle this speculative path in the later patch, so 'prefault' is better to fit the sense. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:38 +02:00
Takuya Yoshikawa	700e1b1219	KVM: MMU: Avoid dropping accessed bit while removing write access One more "KVM: MMU: Don't drop accessed bit while updating an spte." Sptes are accessed by both kvm and hardware. This patch uses update_spte() to fix the way of removing write access. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:21 +02:00
Avi Kivity	6389ee9463	KVM: Pull extra page fault information into struct x86_exception Currently page fault cr2 and nesting information are carried outside the fault data structure. Instead they are placed in the vcpu struct, which results in confusion as global variables are manipulated instead of passing parameters. Fix this issue by adding address and nested fields to struct x86_exception, so this struct can carry all information associated with a fault. Signed-off-by: Avi Kivity <avi@redhat.com> Tested-by: Joerg Roedel <joerg.roedel@amd.com> Tested-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:30:02 +02:00
Avi Kivity	ab9ae31387	KVM: Push struct x86_exception into the various gva_to_gpa variants Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:59 +02:00
Xiao Guangrong	407c61c6bd	KVM: MMU: abstract invalid guest pte mapping Introduce a common function to map invalid gpte Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:49 +02:00
Xiao Guangrong	a4a8e6f76e	KVM: MMU: remove 'clear_unsync' parameter Remove it since we can judge it by using sp->unsync Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:48 +02:00
Lai Jiangshan	9bdbba13b8	KVM: MMU: rename 'reset_host_protection' to 'host_writable' Rename it to fit its sense better Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:46 +02:00
Xiao Guangrong	b330aa0c7d	KVM: MMU: don't drop spte if overwrite it from W to RO We just need flush tlb if overwrite a writable spte with a read-only one. And we should move this operation to set_spte() for sync_page path Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:29:45 +02:00
Xiao Guangrong	c4806acdce	KVM: MMU: fix apf prefault if nested guest is enabled If apf is generated in L2 guest and is completed in L1 guest, it will prefault this apf in L1 guest's mmu context. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:29:14 +02:00
Xiao Guangrong	060c2abe6c	KVM: MMU: support apf for nonpaging guest Let's support apf for nonpaging guest Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:29:13 +02:00
Xiao Guangrong	5054c0de66	KVM: MMU: fix missing post sync audit Add AUDIT_POST_SYNC audit for long mode shadow page Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:29:11 +02:00
Xiao Guangrong	c9b263d2be	KVM: fix tracing kvm_try_async_get_page Tracing 'async' and 'pfn' is useless, since 'async' is always true and 'pfn' is always 'fault_pfn'. We can trace 'gva' and 'gfn' instead, which helps us see the life-cycle of an async_pf Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:28:56 +02:00
Marcelo Tosatti	612819c3c6	KVM: propagate fault r/w information to gup(), allow read-only memory As suggested by Andrea, pass r/w error code to gup(), upgrading read fault to writable if host pte allows it. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:28:40 +02:00
Marcelo Tosatti	7905d9a5ad	KVM: MMU: flush TLBs on writable -> read-only spte overwrite This can happen in the following scenario: vcpu0: read fault, gup(.write=0); vcpu1: gup(.write=1), reuse swap cache, no COW, set writable spte, use writable spte; vcpu0: set read-only spte Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:23:39 +02:00
Marcelo Tosatti	982c25658c	KVM: MMU: remove kvm_mmu_set_base_ptes Unused. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:23:38 +02:00
Jan Kiszka	7e1fbeac6f	KVM: x86: Mark kvm_arch_setup_async_pf static It has no user outside mmu.c and also no prototype. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:25 +02:00
Gleb Natapov	7c90705bf2	KVM: Inject asynchronous page fault into a PV guest if page is swapped out. Send async page fault to a PV guest if it accesses swapped out memory. Guest will choose another task to run upon receiving the fault. Allow async page fault injection only when guest is in user mode since otherwise guest may be in non-sleepable context and will not be able to reschedule. Vcpu will be halted if guest will fault on the same page again or if vcpu executes kernel code. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:17 +02:00
Gleb Natapov	56028d0861	KVM: Retry fault before vmentry When page is swapped in it is mapped into guest memory only after guest tries to access it again and generate another fault. To save this fault we can map it immediately since we know that guest is going to access the page. Do it only when tdp is enabled for now. Shadow paging case is more complicated. CR[034] and EFER registers should be switched before doing mapping and then switched back. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:06 +02:00
Gleb Natapov	af585b921e	KVM: Halt vcpu if page it tries to access is swapped out If a guest accesses swapped out memory do not swap it in from vcpu thread context. Schedule work to do swapping and put vcpu into halted state instead. Interrupts will still be delivered to the guest and if interrupt will cause reschedule guest will continue to run another task. [avi: remove call to get_user_pages_noio(), nacked by Linus; this makes everything synchronous again] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:21:39 +02:00
Avi Kivity	649497d1a3	KVM: MMU: Fix incorrect direct gfn for unpaged mode shadow We use the physical address instead of the base gfn for the four PAE page directories we use in unpaged mode. When the guest accesses an address above 1GB that is backed by a large host page, a BUG_ON() in kvm_mmu_set_gfn() triggers. Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=21962 Reported-and-tested-by: Nicolas Prochazka <prochazka.nicolas@gmail.com> KVM-Stable-Tag. Signed-off-by: Avi Kivity <avi@redhat.com>	2010-12-29 12:35:29 +02:00
Marcelo Tosatti	eb45fda45f	KVM: MMU: fix rmap_remove on non present sptes drop_spte should not attempt to rmap_remove a non present shadow pte. This fixes a BUG_ON seen on kvm-autotest. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reported-by: Lucas Meneghel Rodrigues <lmr@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-11-05 14:42:26 -02:00
Huang Ying	77db5cbd29	KVM: MCE: Send SRAR SIGBUS directly Originally, SRAR SIGBUS is sent to QEMU-KVM via touching the poisoned page. But commit `9605456919` prevents the signal from being sent. So now the signal is sent via force_sig_info_fault directly. [marcelo: use send_sig_info instead] Reported-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:53:15 +02:00
Nicolas Kaiser	9611c18777	KVM: fix typo in copyright notice Fix typo in copyright notice. Signed-off-by: Nicolas Kaiser <nikai@nikai.net> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:53:14 +02:00
Avi Kivity	7ebaf15eef	KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:53:14 +02:00
Xiao Guangrong	6903074c36	KVM: MMU: audit: check whether have unsync sps after root sync After the root is synced, all unsync sps are synced; this patch adds a check to make sure there are no unsync sps in the VCPU's page table Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:53:14 +02:00
Xiao Guangrong	c42fffe3a3	KVM: MMU: audit: unregister audit tracepoints before module unloaded fix: Call Trace: [<ffffffffa01e46ba>] ? kvm_mmu_pte_write+0x229/0x911 [kvm] [<ffffffffa01c6ba9>] ? gfn_to_memslot+0x39/0xa0 [kvm] [<ffffffffa01c6c26>] ? mark_page_dirty+0x16/0x2e [kvm] [<ffffffffa01c6d6f>] ? kvm_write_guest_page+0x67/0x7f [kvm] [<ffffffff81066fbd>] ? local_clock+0x2a/0x3b [<ffffffffa01d52ce>] emulator_write_phys+0x46/0x54 [kvm] ...... Code: Bad RIP value. RIP [<ffffffffa0172056>] 0xffffffffa0172056 RSP <ffff880134f69a70> CR2: ffffffffa0172056 Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:53:13 +02:00
Xiao Guangrong	33f91edb92	KVM: MMU: set access bit for direct mapping Set the access bit while setting up the direct page table if the guest is nonpaging or npt is enabled; this is good for the CPU's speculative access Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:53:11 +02:00
Xiao Guangrong	6292757fb0	KVM: MMU: update 'root_hpa' out of loop in PAE shadow path The value of 'vcpu->arch.mmu.pae_root' is not modified, so we can update 'root_hpa' out of the loop. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-10-24 10:53:09 +02:00
Hillf Danton	cb16a7b387	KVM: MMU: fix counting of rmap entries in rmap_add() It seems that rmap entries are under counted. Signed-off-by: Hillf Danton <dhillf@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2010-10-24 10:52:59 +02:00


345 Commits