linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-15 23:51:46 +00:00

Author	SHA1	Message	Date
Glauber Costa	d910f5c106	KVM guest: KVM Steal time registration This patch implements the kvm bits of the steal time infrastructure. The most important part of it, is the steal time clock. It is an continuous clock that shows the accumulated amount of steal time since vcpu creation. It is supposed to survive cpu offlining/onlining. [marcelo: fix build with CONFIG_KVM_GUEST=n] Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Avi Kivity <avi@redhat.com> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-24 11:49:36 +03:00
Glauber Costa	3c404b578f	KVM guest: Add a pv_ops stub for steal time This patch adds a function pointer in one of the many paravirt_ops structs, to allow guests to register a steal time function. Besides a steal time function, we also declare two jump_labels. They will be used to allow the steal time code to be easily bypassed when not in use. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-14 12:59:44 +03:00
Glauber Costa	c9aaa8957f	KVM: Steal time implementation To implement steal time, we need the hypervisor to pass the guest information about how much time was spent running other processes outside the VM, while the vcpu had meaningful work to do - halt time does not count. This information is acquired through the run_delay field of delayacct/schedstats infrastructure, that counts time spent in a runqueue but not running. Steal time is a per-cpu information, so the traditional MSR-based infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the memory area address containing information about steal time This patch contains the hypervisor part of the steal time infrasructure, and can be backported independently of the guest portion. [avi, yongjie: export delayacct_on, to avoid build failures in some configs] Signed-off-by: Glauber Costa <glommer@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Rik van Riel <riel@redhat.com> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Yongjie Ren <yongjie.ren@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-14 12:59:14 +03:00
Glauber Costa	9ddabbe72e	KVM: KVM Steal time guest/host interface To implement steal time, we need the hypervisor to pass the guest information about how much time was spent running other processes outside the VM. This is per-vcpu, and using the kvmclock structure for that is an abuse we decided not to make. In this patchset, I am introducing a new msr, KVM_MSR_STEAL_TIME, that holds the memory area address containing information about steal time This patch contains the headers for it. I am keeping it separate to facilitate backports to people who wants to backport the kernel part but not the hypervisor, or the other way around. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-12 13:17:03 +03:00
Glauber Costa	4b6b35f55c	KVM: Add constant to represent KVM MSRs enabled bit in guest/host interface This patch is simple, put in a different commit so it can be more easily shared between guest and hypervisor. It just defines a named constant to indicate the enable bit for KVM-specific MSRs. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-12 13:17:02 +03:00
Avi Kivity	411c588dfb	KVM: MMU: Adjust shadow paging to work when SMEP=1 and CR0.WP=0 When CR0.WP=0, we sometimes map user pages as kernel pages (to allow the kernel to write to them). Unfortunately this also allows the kernel to fetch from these pages, even if CR4.SMEP is set. Adjust for this by also setting NX on the spte in these circumstances. Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-12 13:16:26 +03:00
Yang, Wei	d9c3476d8a	KVM: Remove RDWRGSFS bit from CR4_RESERVED_BITS This patch removes RDWRGSFS bit from CR4_RESERVED_BITS. Signed-off-by: Yang, Wei <wei.y.yang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-12 13:16:22 +03:00
Yang, Wei Y	8d9c975fc5	KVM: Remove SMEP bit from CR4_RESERVED_BITS This patch removes SMEP bit from CR4_RESERVED_BITS. Signed-off-by: Yang, Wei <wei.y.yang@intel.com> Signed-off-by: Shan, Haitao <haitao.shan@intel.com> Signed-off-by: Li, Xin <xin.li@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-12 13:16:12 +03:00
Avi Kivity	9dac77fa40	KVM: x86 emulator: fold decode_cache into x86_emulate_ctxt This saves a lot of pointless casts x86_emulate_ctxt and decode_cache. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 13:16:09 +03:00
Avi Kivity	36dd9bb5ce	KVM: x86 emulator: rename decode_cache::eip to _eip The name eip conflicts with a field of the same name in x86_emulate_ctxt, which we plan to fold decode_cache into. The name _eip is unfortunate, but what's really needed is a refactoring here, not a better name. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 13:16:09 +03:00
Nadav Har'El	7c1779384a	KVM: nVMX: vmcs12 checks on nested entry This patch adds a bunch of tests of the validity of the vmcs12 fields, according to what the VMX spec and our implementation allows. If fields we cannot (or don't want to) honor are discovered, an entry failure is emulated. According to the spec, there are two types of entry failures: If the problem was in vmcs12's host state or control fields, the VMLAUNCH instruction simply fails. But a problem is found in the guest state, the behavior is more similar to that of an exit. Signed-off-by: Nadav Har'El <nyh@il.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:16 +03:00
Nadav Har'El	4704d0befb	KVM: nVMX: Exiting from L2 to L1 This patch implements nested_vmx_vmexit(), called when the nested L2 guest exits and we want to run its L1 parent and let it handle this exit. Note that this will not necessarily be called on every L2 exit. L0 may decide to handle a particular exit on its own, without L1's involvement; In that case, L0 will handle the exit, and resume running L2, without running L1 and without calling nested_vmx_vmexit(). The logic for deciding whether to handle a particular exit in L1 or in L0, i.e., whether to call nested_vmx_vmexit(), will appear in a separate patch below. Signed-off-by: Nadav Har'El <nyh@il.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:16 +03:00
Nadav Har'El	0140caea3b	KVM: nVMX: Success/failure of VMX instructions. VMX instructions specify success or failure by setting certain RFLAGS bits. This patch contains common functions to do this, and they will be used in the following patches which emulate the various VMX instructions. Signed-off-by: Nadav Har'El <nyh@il.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:12 +03:00
Nadav Har'El	b87a51ae28	KVM: nVMX: Implement reading and writing of VMX MSRs When the guest can use VMX instructions (when the "nested" module option is on), it should also be able to read and write VMX MSRs, e.g., to query about VMX capabilities. This patch adds this support. Signed-off-by: Nadav Har'El <nyh@il.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:11 +03:00
Nadav Har'El	5e1746d620	KVM: nVMX: Allow setting the VMXE bit in CR4 This patch allows the guest to enable the VMXE bit in CR4, which is a prerequisite to running VMXON. Whether to allow setting the VMXE bit now depends on the architecture (svm or vmx), so its checking has moved to kvm_x86_ops->set_cr4(). This function now returns an int: If kvm_x86_ops->set_cr4() returns 1, __kvm_set_cr4() will also return 1, and this will cause kvm_set_cr4() will throw a #GP. Turning on the VMXE bit is allowed only when the nested VMX feature is enabled, and turning it off is forbidden after a vmxon. Signed-off-by: Nadav Har'El <nyh@il.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:10 +03:00
Takuya Yoshikawa	b5c9ff731f	KVM: x86 emulator: Avoid clearing the whole decode_cache During tracing the emulator, we noticed that init_emulate_ctxt() sometimes took a bit longer time than we expected. This patch is for mitigating the problem by some degree. By looking into the function, we soon notice that it clears the whole decode_cache whose size is about 2.5K bytes now. Furthermore, most of the bytes are taken for the two read_cache arrays, which are used only by a few instructions. Considering the fact that we are not assuming the cache arrays have been cleared when we store actual data, we do not need to clear the arrays: 2K bytes elimination. In addition, we can avoid clearing the fetch_cache and regs arrays. This patch changes the initialization not to clear the arrays. On our 64-bit host, init_emulate_ctxt() becomes 0.3 to 0.5us faster with this patch applied. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-07-12 11:45:09 +03:00
Xiao Guangrong	67052b3508	KVM: MMU: remove the arithmetic of parent pte rmap Parent pte rmap and page rmap are very similar, so use the same arithmetic for them Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:07 +03:00
Xiao Guangrong	53c07b1878	KVM: MMU: abstract the operation of rmap Abstract the operation of rmap to spte_list, then we can use it for the reverse mapping of parent pte in the later patch Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:06 +03:00
Xiao Guangrong	332b207d65	KVM: MMU: optimize pte write path if don't have protected sp Simply return from kvm_mmu_pte_write path if no shadow page is write-protected, then we can avoid to walk all shadow pages and hold mmu-lock Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:02 +03:00
Avi Kivity	5e520e6278	KVM: VMX: Move VMREAD cleanup to exception handler We clean up a failed VMREAD by clearing the output register. Do it in the exception handler instead of unconditionally. This is worthwhile since there are more than a hundred call sites. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:45:00 +03:00
Takuya Yoshikawa	7b105ca290	KVM: x86 emulator: Stop passing ctxt->ops as arg of emul functions Dereference it in the actual users. This not only cleans up the emulator but also makes it easy to convert the old emulation functions to the new em_xxx() form later. Note: Remove some inline keywords to let the compiler decide inlining. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-07-12 11:44:59 +03:00
Jesper Juhl	e376fd664b	watchdog: Intel SCU Watchdog: Fix build and remove duplicate code Trying to build the Intel SCU Watchdog fails for me with gcc 4.6.0 - $ gcc --version \| head -n 1 gcc (GCC) 4.6.0 20110513 (prerelease) like this : CC drivers/watchdog/intel_scu_watchdog.o In file included from drivers/watchdog/intel_scu_watchdog.c:49:0: /home/jj/src/linux-2.6/arch/x86/include/asm/apb_timer.h: In function ‘apbt_time_init’: /home/jj/src/linux-2.6/arch/x86/include/asm/apb_timer.h:65:42: warning: ‘return’ with a value, in function returning void [enabled by default] drivers/watchdog/intel_scu_watchdog.c: In function ‘intel_scu_watchdog_init’: drivers/watchdog/intel_scu_watchdog.c:468:2: error: implicit declaration of function ‘sfi_get_mtmr’ [-Werror=implicit-function-declaration] drivers/watchdog/intel_scu_watchdog.c:468:32: warning: assignment makes pointer from integer without a cast [enabled by default] cc1: some warnings being treated as errors make[1]: * [drivers/watchdog/intel_scu_watchdog.o] Error 1 make: * [drivers/watchdog/intel_scu_watchdog.o] Error 2 Additionally, linux/types.h is needlessly being included twice in drivers/watchdog/intel_scu_watchdog.c Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>	2011-06-28 07:42:50 +00:00
KAMEZAWA Hiroyuki	c6830c2260	Fix node_start/end_pfn() definition for mm/page_cgroup.c commit `21a3c96` uses node_start/end_pfn(nid) for detection start/end of nodes. But, it's not defined in linux/mmzone.h but defined in /arch/???/include/mmzone.h which is included only under CONFIG_NEED_MULTIPLE_NODES=y. Then, we see mm/page_cgroup.c: In function 'page_cgroup_init': mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn' mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn' So, fixiing page_cgroup.c is an idea... But node_start_pfn()/node_end_pfn() is a very generic macro and should be implemented in the same manner for all archs. (m32r has different implementation...) This patch removes definitions of node_start/end_pfn() in each archs and defines a unified one in linux/mmzone.h. It's not under CONFIG_NEED_MULTIPLE_NODES, now. A result of macro expansion is here (mm/page_cgroup.c) for !NUMA start_pfn = ((&contig_page_data)->node_start_pfn); end_pfn = ({ pg_data_t __pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;}); for NUMA (x86-64) start_pfn = ((node_data[nid])->node_start_pfn); end_pfn = ({ pg_data_t __pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;}); Changelog: - fixed to avoid using "nid" twice in node_end_pfn() macro. Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com> Reported-and-tested-by: Ingo Molnar <mingo@elte.hu> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 14:13:09 -07:00
Linus Torvalds	10e18e6230	Merge branch 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/3.0' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Fix register corruption in pvclock_scale_delta KVM: MMU: fix opposite condition in mapping_level_dirty_bitmap KVM: VMX: do not overwrite uptodate vcpu->arch.cr3 on KVM_SET_SREGS KVM: MMU: Fix build warnings in walk_addr_generic()	2011-06-20 08:58:07 -07:00
Zachary Amsden	de2d1a524e	KVM: Fix register corruption in pvclock_scale_delta The 128-bit multiply in pvclock.h was missing an output constraint for EDX which caused a register corruption to appear. Thanks to Ulrich for diagnosing the EDX corruption and Avi for providing this fix. Signed-off-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-06-19 19:23:14 +03:00
Maarten Lankhorst	7d68dc3f10	x86, efi: Do not reserve boot services regions within reserved areas Commit `916f676f8d` started reserving boot service code since some systems require you to keep that code around until SetVirtualAddressMap is called. However, in some cases those areas will overlap with reserved regions. The proper medium-term fix is to fix the bootloader to prevent the conflicts from occurring by moving the kernel to a better position, but the kernel should check for this possibility, and only reserve regions which can be reserved. Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com> Link: http://lkml.kernel.org/r/4DF7A005.1050407@gmail.com Acked-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-18 22:48:49 +02:00
Linus Torvalds	f310642123	Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param x86 idle: deprecate "no-hlt" cmdline param x86 idle APM: deprecate CONFIG_APM_CPU_IDLE x86 idle floppy: deprecate disable_hlt() x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it x86 idle: clarify AMD erratum 400 workaround idle governor: Avoid lock acquisition to read pm_qos before entering idle cpuidle: menu: fixed wrapping timers at 4.294 seconds	2011-05-29 11:18:09 -07:00
Len Brown	02c68a0201	x86 idle: clarify AMD erratum 400 workaround The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting: 1. Intel C1E is somehow involved 2. All AMD processors with C1E are involved Use the string "amd_c1e" instead of simply "c1e" to clarify that this workaround is specific to AMD's version of C1E. Use the string "e400" to clarify that the workaround is specific to AMD processors with Erratum 400. This patch is text-substitution only, with no functional change. cc: x86@kernel.org Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-05-29 03:38:57 -04:00
Linus Torvalds	a947e23a8e	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, asm: Clean up desc.h a bit x86, amd: Do not enable ARAT feature on AMD processors below family 0x12 x86: Move do_page_fault()'s error path under unlikely() x86, efi: Retain boot service code until after switching to virtual mode x86: Remove unnecessary check in detect_ht() x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct x86, UV: Clean up uv_tlb.c x86, UV: Add support for SGI UV2 hub chip x86, cpufeature: Update CPU feature RDRND to RDRAND	2011-05-28 12:57:01 -07:00
Linus Torvalds	571503e100	Merge branch 'setns' * setns: ns: Wire up the setns system call Done as a merge to make it easier to fix up conflicts in arm due to addition of sendmmsg system call	2011-05-28 10:51:01 -07:00
Eric W. Biederman	7b21fddd08	ns: Wire up the setns system call 32bit and 64bit on x86 are tested and working. The rest I have looked at closely and I can't find any problems. setns is an easy system call to wire up. It just takes two ints so I don't expect any weird architecture porting problems. While doing this I have noticed that we have some architectures that are very slow to get new system calls. cris seems to be the slowest where the last system calls wired up were preadv and pwritev. avr32 is weird in that recvmmsg was wired up but never declared in unistd.h. frv is behind with perf_event_open being the last syscall wired up. On h8300 the last system call wired up was epoll_wait. On m32r the last system call wired up was fallocate. mn10300 has recvmmsg as the last system call wired up. The rest seem to at least have syncfs wired up which was new in the 2.6.39. v2: Most of the architecture support added by Daniel Lezcano <dlezcano@fr.ibm.com> v3: ported to v2.6.36-rc4 by: Eric W. Biederman <ebiederm@xmission.com> v4: Moved wiring up of the system call to another patch v5: ported to v2.6.39-rc6 v6: rebased onto parisc-next and net-next to avoid syscall conflicts. v7: ported to Linus's latest post 2.6.39 tree. > arch/blackfin/include/asm/unistd.h \| 3 ++- > arch/blackfin/mach-common/entry.S \| 1 + Acked-by: Mike Frysinger <vapier@gentoo.org> Oh - ia64 wiring looks good. Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-28 10:48:39 -07:00
Linus Torvalds	f23a5e1405	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM: Fix PM QOS's user mode interface to work with ASCII input PM / Hibernate: Update kerneldoc comments in hibernate.c PM / Hibernate: Remove arch_prepare_suspend() PM / Hibernate: Update some comments in core hibernate code	2011-05-27 14:27:34 -07:00
Ingo Molnar	9a3865b185	x86, asm: Clean up desc.h a bit I have looked at this file and found it rather ugly - improve readability a bit. No change in functionality. Link: http://lkml.kernel.org/n/tip-incpt6y26yd8586idx65t9ll@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-27 09:30:50 +02:00
Mike Frysinger	63ab25ebbc	kgdbts: unify/generalize gdb breakpoint adjustment The Blackfin arch, like the x86 arch, needs to adjust the PC manually after a breakpoint is hit as normally this is handled by the remote gdb. However, rather than starting another arch ifdef mess, create a common GDB_ADJUSTS_BREAK_OFFSET define for any arch to opt-in via their kgdb.h. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Dongdong Deng <dongdong.deng@windriver.com> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-26 17:12:36 -07:00
Mike Frysinger	c46dd6b48d	x86: convert to asm-generic ptrace.h Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Cc: Dongdong Deng <dongdong.deng@windriver.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-26 17:12:36 -07:00
Linus Torvalds	14587a2a25	Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: vdso: Remove unused variable x86-64: Optimize vDSO time() x86-64: Add time to vDSO x86-64: Turn off -pg and turn on -foptimize-sibling-calls for vDSO x86-64: Move vread_tsc into a new file with sensible options x86-64: Vclock_gettime(CLOCK_MONOTONIC) can't ever see nsec < 0 x86-64: Don't generate cmov in vread_tsc x86-64: Remove unnecessary barrier in vread_tsc x86-64: Clean up vdso/kernel shared variables	2011-05-26 12:19:31 -07:00
Linus Torvalds	f8d613e2a6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem: xen: cleancache shim to Xen Transcendent Memory ocfs2: add cleancache support ext4: add cleancache support btrfs: add cleancache support ext3: add cleancache support mm/fs: add hooks to support cleancache mm: cleancache core ops functions and config fs: add field to superblock to support cleancache mm/fs: cleancache documentation Fix up trivial conflict in fs/btrfs/extent_io.c due to includes	2011-05-26 10:50:56 -07:00
Dan Magenheimer	5bc20fc597	xen: cleancache shim to Xen Transcendent Memory This patch provides a shim between the kernel-internal cleancache API (see Documentation/mm/cleancache.txt) and the Xen Transcendent Memory ABI (see http://oss.oracle.com/projects/tmem). Xen tmem provides "hypervisor RAM" as an ephemeral page-oriented pseudo-RAM store for cleancache pages, shared cleancache pages, and frontswap pages. Tmem provides enterprise-quality concurrency, full save/restore and live migration support, compression and deduplication. A presentation showing up to 8% faster performance and up to 52% reduction in sectors read on a kernel compile workload, despite aggressive in-kernel page reclamation ("self-ballooning") can be found at: http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdf Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Rik Van Riel <riel@redhat.com> Cc: Jan Beulich <JBeulich@novell.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Andreas Dilger <adilger@sun.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Nitin Gupta <ngupta@vflare.org>	2011-05-26 10:02:21 -06:00
Ingo Molnar	de66ee979d	Merge branch 'linus' into x86/urgent Merge reason: we want to queue up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-26 13:51:35 +02:00
Roland Dreier	dbee8a0aff	x86: remove 32-bit versions of readq()/writeq() The presense of a writeq() implementation on 32-bit x86 that splits the 64-bit write into two 32-bit writes turns out to break the mpt2sas driver (and in general is risky for drivers as was discussed in <http://lkml.kernel.org/r/adaab6c1h7c.fsf@cisco.com>). To fix this, revert `2c5643b1c5` ("x86: provide readq()/writeq() on 32-bit too") and follow-on cleanups. This unfortunately leads to pushing non-atomic definitions of readq() and write() to various x86-only drivers that in the meantime started using the definitions in the x86 version of <asm/io.h>. However as discussed exhaustively, this is actually the right thing to do, because the right way to split a 64-bit transaction is hardware dependent and therefore belongs in the hardware driver (eg mpt2sas needs a spinlock to make sure no other accesses occur in between the two halves of the access). Build tested on 32- and 64-bit x86 allmodconfig. Link: http://lkml.kernel.org/r/x86-32-writeq-is-broken@mdm.bga.com Acked-by: Hitoshi Mitake <h.mitake@gmail.com> Cc: Kashyap Desai <Kashyap.Desai@lsi.com> Cc: Len Brown <lenb@kernel.org> Cc: Ravi Anand <ravi.anand@qlogic.com> Cc: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Jason Uhlenkott <juhlenko@akamai.com> Acked-by: James Bottomley <James.Bottomley@parallels.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-25 08:39:44 -07:00
Richard Kennedy	af6a25f0e1	x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct Reorder mm_context_t to remove alignment padding on 64 bit builds shrinking its size from 64 to 56 bytes. This allows mm_struct to shrink from 840 to 832 bytes, so using one fewer cache lines, and getting more objects per slab when using slub. slabinfo mm_struct reports before :- Sizes (bytes) Slabs ----------------------------------- Object : 840 Total : 7 SlabObj: 896 Full : 1 SlabSiz: 16384 Partial: 4 Loss : 56 CpuSlab: 2 Align : 64 Objects: 18 after :- Sizes (bytes) Slabs ---------------------------------- Object : 832 Total : 7 SlabObj: 832 Full : 1 SlabSiz: 16384 Partial: 4 Loss : 0 CpuSlab: 2 Align : 64 Objects: 19 Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Cc: wilsons@start.ca Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Pekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/1306244999.1999.5.camel@castor.rsk Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-25 16:16:41 +02:00
Cliff Wickman	f073cc8f39	x86, UV: Clean up uv_tlb.c SGI UV's uv_tlb.c driver has become rather hard to read, with overly large functions, non-standard coding style and (way) too long variable, constant and function names and non-obvious code flow sequences. This patch improves the readability and maintainability of the driver significantly, by doing the following strict code cleanups with no side effects: - Split long functions into shorter logical functions. - Shortened some variable and structure member names. - Added special functions for reads and writes of MMR regs with very long names. - Added the 'tunables' table to shortened tunables_write(). - Added the 'stat_description' table to shorten uv_ptc_proc_write(). - Pass fewer 'stat' arguments where it can be derived from the 'bcp' argument. - Function definitions consistent on one line, and inline in few (short) cases. - Moved some small structures and an atomic inline function to the header file. - Moved some local variables to the blocks where they are used. - Updated the copyright date. - Shortened uv_write_global_mmr64() etc. using some aliasing; no line breaks. Renamed many uv_.. functions that are not exported. - Aligned structure fields. [ note that not all structures are aligned the same way though; I'd like to keep the extensive commenting in some of them. ] - Shortened some long structure names. - Standard pass/fail exit from init_per_cpu() - Vertical alignment for mass initializations. - More separation between blocks of code. Tested on a 16-processor Altix UV. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: penberg@kernel.org Link: http://lkml.kernel.org/r/E1QOw12-0004MN-Lp@eag09.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-25 14:20:14 +02:00
Jack Steiner	2a919596c1	x86, UV: Add support for SGI UV2 hub chip This patch adds support for a new version of the SGI UV hub chip. The hub chip is the node controller that connects multiple blades into a larger coherent SSI. For the most part, UV2 is compatible with UV1. The majority of the changes are in the addresses of MMRs and in a few cases, the contents of MMRs. These changes are the result in changes in the system topology such as node configuration, processor types, maximum nodes, physical address sizes, etc. Signed-off-by: Jack Steiner <steiner@sgi.com> Link: http://lkml.kernel.org/r/20110511175028.GA18006@sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-25 14:20:13 +02:00
Kees Cook	7ccafc5f75	x86, cpufeature: Update CPU feature RDRND to RDRAND The Intel manual changed the name of the CPUID bit to match the instruction name. We should follow suit for sanity's sake. (See Intel SDM Volume 2, Table 3-20 "Feature Information Returned in the ECX Register".) [ hpa: we can only do this at this time because there are currently no CPUs with this feature on the market, hence this is pre-hardware enabling. However, Cc:'ing stable so that stable can present a consistent ABI. ] Signed-off-by: Kees Cook <kees.cook@canonical.com> Link: http://lkml.kernel.org/r/20110524232926.GA27728@outflux.net Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: <stable@kernel.org> v2.6.36-39	2011-05-24 16:37:27 -07:00
Rafael J. Wysocki	354258011e	PM / Hibernate: Remove arch_prepare_suspend() All architectures supporting hibernation define arch_prepare_suspend() as an empty function, so remove it. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>	2011-05-24 23:35:55 +02:00
Linus Torvalds	5129df03d0	Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: Unify input section names percpu: Avoid extra NOP in percpu_cmpxchg16b_double percpu: Cast away printk format warning percpu: Always align percpu output section to PAGE_SIZE Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun	2011-05-24 11:53:42 -07:00
Andy Lutomirski	44259b1abf	x86-64: Move vread_tsc into a new file with sensible options vread_tsc is short and hot, and it's userspace code so the usual reasons to enable -pg and turn off sibling calls don't apply. (OK, turning off sibling calls has no effect. But it might someday...) As an added benefit, tsc.c is profilable now. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C99c6d7f5efa3ccb65b4ac6eb443e1ab7bad47d7b.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-24 14:51:29 +02:00
Andy Lutomirski	8c49d9a74b	x86-64: Clean up vdso/kernel shared variables Variables that are shared between the vdso and the kernel are currently a bit of a mess. They are each defined with their own magic, they are accessed differently in the kernel, the vsyscall page, and the vdso, and one of them (vsyscall_clock) doesn't even really exist. This changes them all to use a common mechanism. All of them are delcared in vvar.h with a fixed address (validated by the linker script). In the kernel (as before), they look like ordinary read-write variables. In the vsyscall page and the vdso, they are accessed through a new macro VVAR, which gives read-only access. The vdso is now loaded verbatim into memory without any fixups. As a side bonus, access from the vdso is faster because a level of indirection is removed. While we're at it, pack jiffies and vgetcpu_mode into the same cacheline. Signed-off-by: Andy Lutomirski <luto@mit.edu> Cc: Andi Kleen <andi@firstfloor.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Borislav Petkov <bp@amd64.org> Link: http://lkml.kernel.org/r/%3C7357882fbb51fa30491636a7b6528747301b7ee9.1306156808.git.luto%40mit.edu%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-24 14:51:28 +02:00
Richard Weinberger	1b4ac2a935	x86: Get rid of asmregparm As UML does no longer need asmregparm we can remove it. Signed-off-by: Richard Weinberger <richard@nod.at> Cc: namhyung@gmail.com Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: dhowells@redhat.com Link: http://lkml.kernel.org/r/%3C1306189085-29896-1-git-send-email-richard%40nod.at%3E Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-05-24 14:33:35 +02:00
Linus Torvalds	5e152b4c9e	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits) PCI: Don't use dmi_name_in_vendors in quirk PCI: remove unused AER functions PCI/sysfs: move bus cpuaffinity to class dev_attrs PCI: add rescan to /sys/.../pci_bus/.../ PCI: update bridge resources to get more big ranges when allocating space (again) KVM: Use pci_store/load_saved_state() around VM device usage PCI: Add interfaces to store and load the device saved state PCI: Track the size of each saved capability data area PCI/e1000e: Add and use pci_disable_link_state_locked() x86/PCI: derive pcibios_last_bus from ACPI MCFG PCI: add latency tolerance reporting enable/disable support PCI: add OBFF enable/disable support PCI: add ID-based ordering enable/disable support PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot. PCI: Set PCIE maxpayload for card during hotplug insertion PCI/ACPI: Report _OSC control mask returned on failure to get control x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs PCI: handle positive error codes PCI: check pci_vpd_pci22_wait() return PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio ... Fix up trivial conflicts in include/linux/pci_ids.h: commit `a6e5e2be44` moved the intel SMBUS ID definitons to the i2c-i801.c driver.	2011-05-23 15:39:34 -07:00

1 2 3 4 5 ...

2808 Commits