linux

Author	SHA1	Message	Date
Alexander Graf	10474ae894	KVM: Activate Virtualization On Demand X86 CPUs need to have some magic happening to enable the virtualization extensions on them. This magic can result in unpleasant results for users, like blocking other VMMs from working (vmx) or using invalid TLB entries (svm). Currently KVM activates virtualization when the respective kernel module is loaded. This blocks us from autoloading KVM modules without breaking other VMMs. To circumvent this problem at least a bit, this patch introduces on demand activation of virtualization. This means, that instead virtualization is enabled on creation of the first virtual machine and disabled on destruction of the last one. So using this, KVM can be easily autoloaded, while keeping other hypervisors usable. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:10 +02:00
Avi Kivity	bfd99ff5d4	KVM: Move assigned device code to own file Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:09 +02:00
Avi Kivity	367e1319b2	KVM: Return -ENOTTY on unrecognized ioctls Not the incorrect -EINVAL. Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:08 +02:00
Gleb Natapov	680b3648ba	KVM: Drop kvm->irq_lock lock from irq injection path The only thing it protects now is interrupt injection into lapic and this can work lockless. Even now with kvm->irq_lock in place access to lapic is not entirely serialized since vcpu access doesn't take kvm->irq_lock. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:08 +02:00
Gleb Natapov	eba0226bdf	KVM: Move IO APIC to its own lock The allows removal of irq_lock from the injection path. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:08 +02:00
Gleb Natapov	136bdfeee7	KVM: Move irq ack notifier list to arch independent code Mask irq notifier list is already there. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:07 +02:00
Gleb Natapov	3e71f88bc9	KVM: Maintain back mapping from irqchip/pin to gsi Maintain back mapping from irqchip/pin to gsi to speedup interrupt acknowledgment notifications. [avi: build fix on non-x86/ia64] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-12-03 09:32:07 +02:00
Ilya Loginov	2d4dc890b5	block: add helpers to run flush_dcache_page() against a bio and a request's pages Mtdblock driver doesn't call flush_dcache_page for pages in request. So, this causes problems on architectures where the icache doesn't fill from the dcache or with dcache aliases. The patch fixes this. The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid pointless empty cache-thrashing loops on architectures for which flush_dcache_page() is a no-op. Every architecture was provided with this flush pages on architectires where ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is equal 1 or do nothing otherwise. See "fix mtd_blkdevs problem with caches on some architectures" discussion on LKML for more information. Signed-off-by: Ilya Loginov <isloginov@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Peter Horton <phorton@bitbox.co.uk> Cc: "Ed L. Cashin" <ecashin@coraid.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-26 09:16:19 +01:00
Lin Ming	2263576cfc	ACPICA: Add post-order callback to acpi_walk_namespace The existing interface only has a pre-order callback. This change adds an additional parameter for a post-order callback which will be more useful for bus scans. ACPICA BZ 779. Also update the external calls to acpi_walk_namespace. http://www.acpica.org/bugzilla/show_bug.cgi?id=779 Signed-off-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2009-11-24 21:31:10 -05:00
David S. Miller	3505d1a9fd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/sfc/sfe4001.c drivers/net/wireless/libertas/cmd.c drivers/staging/Kconfig drivers/staging/Makefile drivers/staging/rtl8187se/Kconfig drivers/staging/rtl8192e/Kconfig	2009-11-18 22:19:03 -08:00
Eric W. Biederman	6d4561110a	sysctl: Drop & in front of every proc_handler. For consistency drop & in front of every proc_handler. Explicity taking the address is unnecessary and it prevents optimizations like stubbing the proc_handlers to NULL. Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-18 08:37:40 -08:00
Lin Ming	0696b711e4	timekeeping: Fix clock_gettime vsyscall time warp Since commit `0a544198` "timekeeping: Move NTP adjusted clock multiplier to struct timekeeper" the clock multiplier of vsyscall is updated with the unmodified clock multiplier of the clock source and not with the NTP adjusted multiplier of the timekeeper. This causes user space observerable time warps: new CLOCK-warp maximum: 120 nsecs, 00000025c337c537 -> 00000025c337c4bf Add a new argument "mult" to update_vsyscall() and hand in the timekeeping internal NTP adjusted multiplier. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Tony Luck <tony.luck@intel.com> LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-11-17 11:52:34 +01:00
FUJITA Tomonori	6959450e56	swiotlb: Remove duplicate swiotlb_force extern declarations Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: tony.luck@intel.com LKML-Reference: <1258199198-16657-4-git-send-email-fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-15 09:03:10 +01:00
Eric W. Biederman	d00faf81af	sysctl ia64: Remove dead binary sysctl support Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name and .strategy members of sysctl tables are dead code. Remove them. Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-12 02:05:01 -08:00
FUJITA Tomonori	ad32e8cb86	swiotlb: Defer swiotlb init printing, export swiotlb_print_info() This enables us to avoid printing swiotlb memory info when we initialize swiotlb. After swiotlb initialization, we could find that we don't need swiotlb. This patch removes the code to print swiotlb memory info in swiotlb_init() and exports the function to do that. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: chrisw@sous-sol.org Cc: dwmw2@infradead.org Cc: joerg.roedel@amd.com Cc: muli@il.ibm.com Cc: tony.luck@intel.com Cc: benh@kernel.crashing.org LKML-Reference: <1257849980-22640-9-git-send-email-fujita.tomonori@lab.ntt.co.jp> [ -v2: merge up conflict ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 12:32:00 +01:00
Isaku Yamahata	5b5d94487d	ia64/xen: compilation fix This patch fixes the following compilation error introduced by a PCI related features. The change set of 5dd1af9f84c79bedd589db89e71ca733f3bf0ebd moves some xen related definitions from the arch header file (x86/include/asm/xen/hypervisor.h) to the common header file (include/xen/xen.h). So ia64/xen also follows it. In file included from linux-next/include/xen/grant_table.h:41, from linux-next/drivers/block/xen-blkfront.c:48: linux-next/arch/ia64/include/asm/xen/hypervisor.h:43: error: nested redefinition of 'enum xen_domain_type' linux-next/arch/ia64/include/asm/xen/hypervisor.h:43: error: redeclaration of 'enum xen_domain_type' linux-next/arch/ia64/include/asm/xen/hypervisor.h:44: error: redeclaration of enumerator 'XEN_NATIVE' linux-next/include/xen/xen.h:5: error: previous definition of 'XEN_NATIVE' was here linux-next/arch/ia64/include/asm/xen/hypervisor.h:45: error: redeclaration of enumerator 'XEN_PV_DOMAIN' linux-next/include/xen/xen.h:6: error: previous definition of 'XEN_PV_DOMAIN' was here linux-next/arch/ia64/include/asm/xen/hypervisor.h:46: error: redeclaration of enumerator 'XEN_HVM_DOMAIN' linux-next/include/xen/xen.h:7: error: previous definition of 'XEN_HVM_DOMAIN' was here Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-06 15:04:55 -08:00
Eric W. Biederman	6f589526df	sysctl: ia64 Use the compat_sys_sysctl Now that we have a generic 32bit compatibility implementation there is no need for ia64 to implement it's own. Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2009-11-06 03:53:18 -08:00
Bjorn Helgaas	c7dabef8a2	vsprintf: use %pR, %pr instead of %pRt, %pRf Jesse accidentally applied v1 [1] of the patchset instead of v2 [2]. This is the diff between v1 and v2. The changes in this patch are: - tidied vsprintf stack buffer to shrink and compute size more accurately - use %pR for decoding and %pr for "raw" (with type and flags) instead of adding %pRt and %pRf [1] http://lkml.org/lkml/2009/10/6/491 [2] http://lkml.org/lkml/2009/10/13/441 Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 13:06:41 -08:00
Bjorn Helgaas	637b363e86	ia64/PCI: print resources consistently with %pRt This uses %pRt to print additional resource information (type, size, prefetchability, etc.) consistently. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:19 -08:00
Jesse Barnes	ac1aa47b13	PCI: determine CLS more intelligently Till now, CLS has been determined either by arch code or as L1_CACHE_BYTES. Only x86 and ia64 set CLS explicitly and x86 doesn't always get it right. On most configurations, the chance is that firmware configures the correct value during boot. This patch makes pci_init() determine CLS by looking at what firmware has configured. It scans all devices and if all non-zero values agree, the value is used. If none is configured or there is a disagreement, pci_dfl_cache_line_size is used. arch can set the dfl value (via PCI_CACHE_LINE_BYTES or pci_dfl_cache_line_size) or override the actual one. ia64, x86 and sparc64 updated to set the default cls instead of the actual one. While at it, declare pci_cache_line_size and pci_dfl_cache_line_size in pci.h and drop private declarations from arch code. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Miller <davem@davemloft.net> Acked-by: Greg KH <gregkh@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-11-04 08:47:10 -08:00
Tony Luck	e8c93fc7b7	Revert "[IA64] fix percpu warnings" This reverts commit `b94b08081f`. genksyms currently cannot handle complicated types for exported percpu variables. Drop this patch for now as it prevents a module from being loaded on sn2 systems: xpc: no symbol version for per_cpu____sn_cnodeid_to_nasid xpc: Unknown symbol per_cpu____sn_cnodeid_to_nasid Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-11-02 09:23:08 -08:00
Tejun Heo	877105cc49	percpu: make percpu symbols in ia64 unique This patch updates percpu related symbols in ia64 such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/ia64/kernel/setup.c: s/cpu_info/ia64_cpu_info/ Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64@vger.kernel.org	2009-10-29 22:34:14 +09:00
Tejun Heo	c6e22f9e3e	percpu: make percpu symbols in xen unique This patch updates percpu related symbols in xen such that percpu symbols are unique and don't clash with local symbols. This serves two purposes of decreasing the possibility of global percpu symbol collision and allowing dropping per_cpu__ prefix from percpu symbols. * arch/x86/xen/smp.c, arch/x86/xen/time.c, arch/ia64/xen/irq_xen.c: add xen_ prefix to percpu variables * arch/ia64/xen/time.c: add xen_ prefix to percpu variables, drop processed_ prefix and make them static Partly based on Rusty Russell's "alloc_percpu: rename percpu vars which cause name clashes" patch. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Chris Wright <chrisw@sous-sol.org>	2009-10-29 22:34:13 +09:00
Tony Luck	8add570a70	Pull ticket-spinaphore into release branch	2009-10-23 17:59:06 -07:00
Tony Luck	48fade6c5a	Pull ticket4byte into release branch	2009-10-23 17:58:54 -07:00
Randy Dunlap	b94b08081f	[IA64] fix percpu warnings Fix percpu types warning in ia64/sn: arch/ia64/sn/kernel/setup.c:74: error: conflicting types for '__pcpu_scope___sn_cnodeid_to_nasid' arch/ia64/include/asm/sn/arch.h:74: error: previous declaration of '__pcpu_scope___sn_cnodeid_to_nasid' was here Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Jes Sorensen <jes@sgi.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-15 13:29:37 -07:00
Arnaldo Carvalho de Melo	67ca0e5fa8	ia64: Fix up the syscall table for recvmmsg Reported-by: "Tony Luck" <tony.luck@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-14 13:11:30 -07:00
Tony Luck	1502f08edc	[IA64] SMT friendly version of spin_unlock_wait() We can be kinder to SMT systems in spin_unlock_wait. Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-13 14:28:31 -07:00
Marcin Slusarz	54f8dd3c99	[IA64] use printk_once() unaligned.c/io_common.c Use printk_once() in a couple of places. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-13 12:48:25 -07:00
Matthew Wilcox	adcd740341	[IA64] Require SAL 3.2 in order to do extended config space ops We had assumed that SAL firmware would return an error if it didn't understand extended config space. Unfortunately, the SAL on the SGI 750 doesn't do that, it panics the machine. So, condition the extended PCI config space accesses on SAL revision 3.2. Signed-off-by: Matthew Wilcox <willy@linux.intel.com> Tested-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-13 10:44:42 -07:00
Roel Kluin	dec1798f81	[IA64] unsigned cannot be less than 0 in sn_hwperf_ioctl() struct sn_hwperf_ioctl_args member arg (u64) cannot be less than 0. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-13 10:40:00 -07:00
Takao Indoh	29e4e025be	[IA64] Restore registers in the stack on INIT Registers are not saved anywhere when INIT comes during fsys mode and we cannot know what happened when we investigate vmcore captured by kdump. This patch adds new function finish_pt_regs() so registers can be saved in such a case. Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-13 10:37:14 -07:00
Arnaldo Carvalho de Melo	a2e2725541	net: Introduce recvmmsg socket syscall Meaning receive multiple messages, reducing the number of syscalls and net stack entry/exit operations. Next patches will introduce mechanisms where protocols that want to optimize this operation will provide an unlocked_recvmsg operation. This takes into account comments made by: . Paul Moore: sock_recvmsg is called only for the first datagram, sock_recvmsg_nosec is used for the rest. . Caitlin Bestler: recvmmsg now has a struct timespec timeout, that works in the same fashion as the ppoll one. If the underlying protocol returns a datagram with MSG_OOB set, this will make recvmmsg return right away with as many datagrams (+ the OOB one) it has received so far. . Rémi Denis-Courmont & Steven Whitehouse: If we receive N < vlen datagrams and then recvmsg returns an error, recvmmsg will return the successfully received datagrams, store the error and return it in the next call. This paves the way for a subsequent optimization, sk_prot->unlocked_recvmsg, where we will be able to acquire the lock only at batch start and end, not at every underlying recvmsg call. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-12 23:40:10 -07:00
Neil Horman	3b885787ea	net: Generalize socket rx gap / receive queue overflow cmsg Create a new socket level option to report number of queue overflows Recently I augmented the AF_PACKET protocol to report the number of frames lost on the socket receive queue between any two enqueued frames. This value was exported via a SOL_PACKET level cmsg. AFter I completed that work it was requested that this feature be generalized so that any datagram oriented socket could make use of this option. As such I've created this patch, It creates a new SOL_SOCKET level option called SO_RXQ_OVFL, which when enabled exports a SOL_SOCKET level cmsg that reports the nubmer of times the sk_receive_queue overflowed between any two given frames. It also augments the AF_PACKET protocol to take advantage of this new feature (as it previously did not touch sk->sk_drops, which this patch uses to record the overflow count). Tested successfully by me. Notes: 1) Unlike my previous patch, this patch simply records the sk_drops value, which is not a number of drops between packets, but rather a total number of drops. Deltas must be computed in user space. 2) While this patch currently works with datagram oriented protocols, it will also be accepted by non-datagram oriented protocols. I'm not sure if thats agreeable to everyone, but my argument in favor of doing so is that, for those protocols which aren't applicable to this option, sk_drops will always be zero, and reporting no drops on a receive queue that isn't used for those non-participating protocols seems reasonable to me. This also saves us having to code in a per-protocol opt in mechanism. 3) This applies cleanly to net-next assuming that commit `977750076d` (my af packet cmsg patch) is reverted Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-10-12 13:26:31 -07:00
Tony Luck	883a3acf5b	[IA64] Re-implement spinaphores using ticket lock concepts Bound the wait time for the ptcg_sem by using similar idea to the ticket spin locks. In this case we have only one instance of a spinaphore, so make it 8 bytes rather than try to squeeze it into 4-bytes to keep the code simpler (and shorter). Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-09 10:52:39 -07:00
Tony Luck	9d40ee200a	[IA64] Squeeze ticket locks back into 4 bytes. Linus pointed out that other people have spent large amounts of time and effort to optimize the layout of frequently used structures. Often these have embedded locks, and the assumption is that a lock takes 4 bytes. Linus also pointed out how to work with the limited options for atomic instructions on Itanium. Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-10-07 10:54:19 -07:00
Tejun Heo	52594762a3	ia64: convert to dynamic percpu allocator Unlike other archs, ia64 reserves space for percpu areas during early memory initialization. These areas occupy a contiguous region indexed by cpu number on contiguous memory model or are grouped by node on discontiguous memory model. As allocation and initialization are done by the arch code, all that setup_per_cpu_areas() needs to do is communicating the determined layout to the percpu allocator. This patch implements setup_per_cpu_areas() for both contig and discontig memory models and drops HAVE_LEGACY_PER_CPU_AREA. Please note that for contig model, the allocation itself is modified only to allocate for possible cpus instead of NR_CPUS. As dynamic percpu allocator can handle non-direct mapping, there's no reason to allocate memory for cpus which aren't possible. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64 <linux-ia64@vger.kernel.org>	2009-10-02 13:28:56 +09:00
Tejun Heo	36886478f5	ia64: allocate percpu area for cpu0 like percpu areas for other cpus cpu0 used special percpu area reserved by the linker, __cpu0_per_cpu, which is set up early in boot by head.S. However, this doesn't guarantee that the area will be on the same node as cpu0 and the percpu area for cpu0 ends up very far away from percpu areas for other cpus which cause problems for congruent percpu allocator. This patch makes percpu area initialization allocate percpu area for cpu0 like any other cpus and copy it from __cpu0_per_cpu which now resides in the __init area. This means that for cpu0, percpu area is first setup at __cpu0_per_cpu early by head.S and then moved to an area in the linear mapping during memory initialization and it's not allowed to take a pointer to percpu variables between head.S and memory initialization. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64 <linux-ia64@vger.kernel.org>	2009-10-02 13:28:56 +09:00
Tejun Heo	12cda81777	ia64: initialize cpu maps early All information necessary to initialize cpu possible and present maps are available once early_acpi_boot_init() is complete. Reorganize setup_arch() and acpi init functions such that, * CPU information is printed after LAPIC entries are parsed in early_acpi_boot_init(). * smp_build_cpu_map() is called by setup_arch() instead of acpi functions. * smp_build_cpu_map() is called once all CPU related information is available before memory is initialized. This is primarily to allow find_memory() to use cpu maps but is also a general cleanup. Please note that with this change, the somewhat ad-hoc early_cpu_possible_map defined and used for NUMA configurations is probably unnecessary. Something to clean up another day. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64 <linux-ia64@vger.kernel.org>	2009-10-02 13:28:56 +09:00
Tejun Heo	126b3fcdec	ia64: don't alias VMALLOC_END to vmalloc_end If CONFIG_VIRTUAL_MEM_MAP is enabled, ia64 defines macro VMALLOC_END as unsigned long variable vmalloc_end which is adjusted to prepare room for vmemmap. This becomes probnlematic if a local variables vmalloc_end is defined in some function (not very unlikely) and VMALLOC_END is used in the function - the function thinks its referencing the global VMALLOC_END value but would be referencing its own local vmalloc_end variable. There's no reason VMALLOC_END should be a macro. Just define it as an unsigned long variable if CONFIG_VIRTUAL_MEM_MAP is set to avoid nasty surprises. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: linux-ia64 <linux-ia64@vger.kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org>	2009-10-02 13:28:55 +09:00
Alexey Dobriyan	f0f37e2f77	const: mark struct vm_struct_operations * mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-27 11:39:25 -07:00
Linus Torvalds	73964f6bc8	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: ACPI: IA64=y ACPI=n build fix ACPI: Kill overly verbose "power state" log messages ACPI: fix Compaq Evo N800c (Pentium 4m) boot hang regression ACPI: Clarify resource conflict message thinkpad-acpi: fix CONFIG_THINKPAD_ACPI_HOTKEY_POLL build problem	2009-09-27 10:38:48 -07:00
Len Brown	e56d953d19	ACPI: IA64=y ACPI=n build fix ia64's sim_defconfig uses CONFIG_ACPI=n which now #define's acpi_disabled in <linux/acpi.h> So we shouldn't re-define it here in <asm/acpi.h> Signed-off-by: Len Brown <len.brown@intel.com>	2009-09-27 04:17:21 -04:00
Tony Luck	2c86963b09	[IA64] implement ticket locks for Itanium Back in January 2008 Nick Piggin implemented "ticket" spinlocks for X86 (See commit `314cdbefd1`). IA64 implementation has a couple of differences because of the available atomic operations ... e.g. we have no fetchadd2 instruction that operates on a 16-bit quantity so we make ticket locks use a 32-bit word for each of the current ticket and now-serving values. Performance on uncontended locks is about 8% worse than the previous implementation, but this seems a good trade for determinism in the contended case. Performance impact on macro-level benchmarks is in the noise. Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-25 08:42:16 -07:00
Linus Torvalds	94a8d5caba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits) cpumask: Move deprecated functions to end of header. cpumask: remove unused deprecated functions, avoid accusations of insanity cpumask: use new-style cpumask ops in mm/quicklist. cpumask: use mm_cpumask() wrapper: x86 cpumask: use mm_cpumask() wrapper: um cpumask: use mm_cpumask() wrapper: mips cpumask: use mm_cpumask() wrapper: mn10300 cpumask: use mm_cpumask() wrapper: m32r cpumask: use mm_cpumask() wrapper: arm cpumask: Use accessors for cpu__mask: um cpumask: Use accessors for cpu__mask: powerpc cpumask: Use accessors for cpu__mask: mips cpumask: Use accessors for cpu__mask: m32r cpumask: remove arch_send_call_function_ipi cpumask: arch_send_call_function_ipi_mask: s390 cpumask: arch_send_call_function_ipi_mask: powerpc cpumask: arch_send_call_function_ipi_mask: mips cpumask: arch_send_call_function_ipi_mask: m32r cpumask: arch_send_call_function_ipi_mask: alpha cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64 ...	2009-09-23 18:14:11 -07:00
Rusty Russell	0748bd0177	cpumask: remove arch_send_call_function_ipi Now everyone is converted to arch_send_call_function_ipi_mask, remove the shim and the #defines. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-24 09:34:47 +09:30
Rusty Russell	e50a6f1953	cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64 There were replaced by topology_core_cpumask and topology_thread_cpumask. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-24 09:34:43 +09:30
Rusty Russell	da83a84b53	ia64: convert last user of smp_call_function_mask smp_call_function_many is the new version: it takes a pointer. Also, use mm accessor macro while we're changing this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-24 09:34:40 +09:30
Rusty Russell	29c337a034	cpumask: remove obsolete node_to_cpumask now everyone uses cpumask_of_node Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-24 09:34:34 +09:30
Linus Torvalds	c37efa9325	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits) Use macros for .data.page_aligned section. Use macros for .bss.page_aligned section. Use new __init_task_data macro in arch init_task.c files. kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts. arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0 kbuild: add static to prototypes kbuild: fail build if recordmcount.pl fails kbuild: set -fconserve-stack option for gcc 4.5 kbuild: echo the record_mcount command gconfig: disable "typeahead find" search in treeviews kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling checkincludes.pl: add option to remove duplicates in place markup_oops: use modinfo to avoid confusion with underscored module names checkincludes.pl: provide usage helper checkincludes.pl: close file as soon as we're done with it ctags: usability fix kernel hacking: move STRIP_ASM_SYMS from General gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma kbuild: Check if linker supports the -X option kbuild: introduce ld-option ... Fix trivial conflict in scripts/basic/fixdep.c	2009-09-23 15:37:02 -07:00
Linus Torvalds	b09a75fc5e	Merge git://git.infradead.org/iommu-2.6 * git://git.infradead.org/iommu-2.6: (23 commits) intel-iommu: Disable PMRs after we enable translation, not before intel-iommu: Kill DMAR_BROKEN_GFX_WA option. intel-iommu: Fix integer wrap on 32 bit kernels intel-iommu: Fix integer overflow in dma_pte_{clear_range,free_pagetable}() intel-iommu: Limit DOMAIN_MAX_PFN to fit in an 'unsigned long' intel-iommu: Fix kernel hang if interrupt remapping disabled in BIOS intel-iommu: Disallow interrupt remapping if not all ioapics covered intel-iommu: include linux/dmi.h to use dmi_ routines pci/dmar: correct off-by-one error in dmar_fault() intel-iommu: Cope with yet another BIOS screwup causing crashes intel-iommu: iommu init error path bug fixes intel-iommu: Mark functions with __init USB: Work around BIOS bugs by quiescing USB controllers earlier ia64: IOMMU passthrough mode shouldn't trigger swiotlb init intel-iommu: make domain_add_dev_info() call domain_context_mapping() intel-iommu: Unify hardware and software passthrough support intel-iommu: Cope with broken HP DC7900 BIOS iommu=pt is a valid early param intel-iommu: double kfree() intel-iommu: Kill pointless intel_unmap_single() function ... Fixed up trivial include lines conflict in drivers/pci/intel-iommu.c	2009-09-23 10:06:10 -07:00
Linus Torvalds	31bbb9b58d	Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: itimers: Add tracepoints for itimer hrtimer: Add tracepoint for hrtimers timers: Add tracepoints for timer_list timers cputime: Optimize jiffies_to_cputime(1) itimers: Simplify arm_timer() code a bit itimers: Fix periodic tics precision itimers: Merge ITIMER_VIRT and ITIMER_PROF Trivial header file include conflicts in kernel/fork.c	2009-09-23 09:46:15 -07:00
Linus Torvalds	c11f6c8258	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (119 commits) ACPI: don't pass handle for fixed hardware notifications ACPI: remove null pointer checks in deferred execution path ACPI: simplify deferred execution path acerhdf: additional BIOS versions acerhdf: convert to dev_pm_ops acerhdf: fix fan control for AOA150 model thermal: add missing Kconfig dependency acpi: switch /proc/acpi/{debug_layer,debug_level} to seq_file hp-wmi: fix rfkill memory leak on unload ACPI: remove unnecessary #ifdef CONFIG_DMI ACPI: linux/acpi.h should not include linux/dmi.h hwmon driver for ACPI 4.0 power meters topstar-laptop: add new driver for hotkeys support on Topstar N01 thinkpad_acpi: fix rfkill memory leak on unload thinkpad-acpi: report brightness events when required thinkpad-acpi: don't poll by default any of the reserved hotkeys thinkpad-acpi: Fix procfs hotkey reset command thinkpad-acpi: deprecate hotkey_bios_mask thinkpad-acpi: hotkey poll fixes thinkpad-acpi: be more strict when detecting a ThinkPad ...	2009-09-23 09:32:11 -07:00
KAMEZAWA Hiroyuki	3089aa1b0c	kcore: use registerd physmem information For /proc/kcore, each arch registers its memory range by kclist_add(). In usual, - range of physical memory - range of vmalloc area - text, etc... are registered but "range of physical memory" has some troubles. It doesn't updated at memory hotplug and it tend to include unnecessary memory holes. Now, /proc/iomem (kernel/resource.c) includes required physical memory range information and it's properly updated at memory hotplug. Then, it's good to avoid using its own code(duplicating information) and to rebuild kclist for physical memory based on /proc/iomem. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:41 -07:00
KAMEZAWA Hiroyuki	9492587cf3	kcore: register text area in generic way Some 64bit arch has special segment for mapping kernel text. It should be entried to /proc/kcore in addtion to direct-linear-map, vmalloc area. This patch unifies KCORE_TEXT entry scattered under x86 and ia64. I'm not familiar with other archs (mips has its own even after this patch) but range of [_stext ..._end) is a valid area of text and it's not in direct-map area, defining CONFIG_ARCH_PROC_KCORE_TEXT is only a necessary thing to do. Note: I left mips as it is now. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:41 -07:00
KAMEZAWA Hiroyuki	a0614da88b	kcore: register vmalloc area in generic way For /proc/kcore, vmalloc areas are registered per arch. But, all of them registers same range of [VMALLOC_START...VMALLOC_END) This patch unifies them. By this. archs which have no kclist_add() hooks can see vmalloc area correctly. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:41 -07:00
KAMEZAWA Hiroyuki	c30bb2a25f	kcore: add kclist types Presently, kclist_add() only eats start address and size as its arguments. Considering to make kclist dynamically reconfigulable, it's necessary to know which kclists are for System RAM and which are not. This patch add kclist types as KCORE_RAM KCORE_VMALLOC KCORE_TEXT KCORE_OTHER This "type" is used in a patch following this for detecting KCORE_RAM. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 07:39:41 -07:00
Linus Torvalds	342ff1a1b5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits) trivial: fix typo in aic7xxx comment trivial: fix comment typo in drivers/ata/pata_hpt37x.c trivial: typo in kernel-parameters.txt trivial: fix typo in tracing documentation trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c trivial: remove unnecessary semicolons trivial: Fix duplicated word "options" in comment trivial: kbuild: remove extraneous blank line after declaration of usage() trivial: improve help text for mm debug config options trivial: doc: hpfall: accept disk device to unload as argument trivial: doc: hpfall: reduce risk that hpfall can do harm trivial: SubmittingPatches: Fix reference to renumbered step trivial: fix typos "man[ae]g?ment" -> "management" trivial: media/video/cx88: add __init/__exit macros to cx88 drivers trivial: fix typo in CONFIG_DEBUG_FS in gcov doc trivial: fix missing printk space in amd_k7_smp_check trivial: fix typo s/ketymap/keymap/ in comment trivial: fix typo "to to" in multiple files trivial: fix typos in comments s/DGBU/DBGU/ ...	2009-09-22 07:51:45 -07:00
Arnd Bergmann	6e17b17f1f	mm: remove duplicate asm/mman.h files A number of architectures have identical asm/mman.h files so they can all be merged by using the new generic file. The remaining asm/mman.h files are substantially different from each other. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:42 -07:00
Arnd Bergmann	90f72aa58b	mm: add MAP_HUGETLB for mmaping pseudo-anonymous huge page regions Add a flag for mmap that will be used to request a huge page region that will look like anonymous memory to user space. This is accomplished by using a file on the internal vfsmount. MAP_HUGETLB is a modifier of MAP_ANONYMOUS and so must be specified with it. The region will behave the same as a MAP_ANONYMOUS region using small pages. The patch also adds the MAP_STACK flag, which was previously defined only on some architectures but not on others. Since MAP_STACK is meant to be a hint only, architectures can define it without assigning a specific meaning to it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: David Rientjes <rientjes@google.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:41 -07:00
Geert Uytterhoeven	cc013a8890	arches: drop superfluous casts in nr_free_pages() callers Commit `9617729941` ("Drop free_pages()") modified nr_free_pages() to return 'unsigned long' instead of 'unsigned int'. This made the casts to 'unsigned long' in most callers superfluous, so remove them. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <zankel@tensilica.com> Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:34 -07:00
Anand Gadiyar	fd589a8f0a	trivial: fix typo "to to" in multiple files Signed-off-by: Anand Gadiyar <gadiyar@ti.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-09-21 15:14:55 +02:00
Joe Perches	d200c922bc	Use new __init_task_data macro in arch init_task.c files. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Tim Abbott <tabbott@ksplice.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2009-09-21 06:27:08 +02:00
Sam Ravnborg	f86fd30660	kbuild: rename ld-option to cc-ldoption ld-option is misnamed as it test options to gcc, not to ld. Renamed it to reflect this. Cc: Andi Kleen <andi@firstfloor.org> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2009-09-20 12:27:42 +02:00
Sam Ravnborg	caa27b66bd	kbuild: use INSTALLKERNEL to select customized installkernel script Replace the use of CROSS_COMPILE to select a customized installkernel script with the possibility to set INSTALLKERNEL to select a custom installkernel script when running make: make INSTALLKERNEL=arm-installkernel install With this patch we are now more consistent across different architectures - they did not all support use of CROSS_COMPILE. The use of CROSS_COMPILE was a hack as this really belongs to gcc/binutils and the installkernel script does not change just because we change toolchain. The use of CROSS_COMPILE caused troubles with an upcoming patch that saves CROSS_COMPILE when a kernel is built - it would no longer be installable. [Thanks to Peter Z. for this hint] This patch undos what Ian did in commit: `0f8e2d62fa` ("use ${CROSS_COMPILE}installkernel in arch/*/boot/install.sh") The patch has been lightly tested on x86 only - but all changes looks obvious. Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Mike Frysinger <vapier@gentoo.org> [blackfin] Acked-by: Russell King <linux@arm.linux.org.uk> [arm] Acked-by: Paul Mundt <lethal@linux-sh.org> [sh] Acked-by: "H. Peter Anvin" <hpa@zytor.com> [x86] Cc: Ian Campbell <icampbell@arcom.com> Cc: Tony Luck <tony.luck@intel.com> [ia64] Cc: Fenghua Yu <fenghua.yu@intel.com> [ia64] Cc: Hirokazu Takata <takata@linux-m32r.org> [m32r] Cc: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Kyle McMartin <kyle@mcmartin.ca> [parisc] Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> [powerpc] Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390] Cc: Thomas Gleixner <tglx@linutronix.de> [x86] Cc: Ingo Molnar <mingo@redhat.com> [x86] Signed-off-by: Sam Ravnborg <sam@ravnborg.org>	2009-09-20 12:18:14 +02:00
Len Brown	985f38781d	Merge branch 'acpica' into release	2009-09-19 01:45:22 -04:00
Linus Torvalds	fa877c71e2	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Clean up linker script using standard macros. [IA64] Use standard macros for page-aligned data. [IA64] Use .ref.text, not .text.init for start_ap. [IA64] sgi-xp: fix printk format warnings [IA64] ioc4_serial: fix printk format warnings [IA64] mbcs: fix printk format warnings [IA64] pci_br, fix infinite loop in find_free_ate() [IA64] kdump: Short path to freeze CPUs [IA64] kdump: Try INIT regardless of [IA64] kdump: Mask INIT first in panic-kdump path [IA64] kdump: Don't return APs to SAL from kdump [IA64] kexec: Unregister MCA handler before kexec [IA64] kexec: Make INIT safe while transition to [IA64] kdump: Mask MCA/INIT on frozen cpus Fix up conflict in arch/ia64/kernel/vmlinux.lds.S as per Tony's suggestion.	2009-09-18 09:33:07 -07:00
Linus Torvalds	dcbf77b9e8	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits) sched: Fix SD_POWERSAVING_BALANCE\|SD_PREFER_LOCAL vs SD_WAKE_AFFINE sched: Stop buddies from hogging the system sched: Add new wakeup preemption mode: WAKEUP_RUNNING sched: Fix TASK_WAKING & loadaverage breakage sched: Disable wakeup balancing sched: Rename flags to wake_flags sched: Clean up the load_idx selection in select_task_rq_fair sched: Optimize cgroup vs wakeup a bit sched: x86: Name old_perf in a unique way sched: Implement a gentler fair-sleepers feature sched: Add SD_PREFER_LOCAL sched: Add a few SYNC hint knobs to play with sched: Fix sync wakeups again sched: Add WF_FORK sched: Rename sync arguments sched: Rename select_task_rq() argument sched: Feature to disable APERF/MPERF cpu_power x86: sched: Provide arch implementations using aperf/mperf x86: Add generic aperf/mperf code x86: Move APERF/MPERF into a X86_FEATURE ... Fix up trivial conflict in arch/x86/include/asm/processor.h due to nearby addition of amd_get_nb_id() declaration from the EDAC merge.	2009-09-17 21:00:02 -07:00
Anirban Sinha	f4c3f03838	Fix ia64 build breakage in head.S The "cleanup console_print()" patch in commit `353f6dd2de` introduced an "extern" declaration into an assembly language file. Remove it. Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-16 12:28:52 -07:00
Linus Torvalds	4406c56d0a	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits) PCI hotplug: clean up acpi_run_hpp() PCI hotplug: acpiphp: use generic pci_configure_slot() PCI hotplug: shpchp: use generic pci_configure_slot() PCI hotplug: pciehp: use generic pci_configure_slot() PCI hotplug: add pci_configure_slot() PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation PCI: Clear saved_state after the state has been restored PCI PM: Return error codes from pci_pm_resume() PCI: use dev_printk in quirk messages PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset() PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle PCI Hotplug: acpiphp: find bridges the easy way PCI: pcie portdrv: remove unused variable PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support ACPI PM: Replace wakeup.prepared with reference counter PCI PM: Introduce device flag wakeup_prepared PCI / ACPI PM: Rework some debug messages PCI PM: Simplify PCI wake-up code ... Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree scanning having been moved and merged for the 32- and 64-bit cases. The 'needs_freset' initialization added in `6e19314cc` ("PCI/powerpc: support PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.	2009-09-16 07:49:54 -07:00
Peter Zijlstra	182a85f8a1	sched: Disable wakeup balancing Sysbench thinks SD_BALANCE_WAKE is too agressive and kbuild doesn't really mind too much, SD_BALANCE_NEWIDLE picks up most of the slack. On a dual socket, quad core, dual thread nehalem system: sysbench (--num_threads=16): SD_BALANCE_WAKE-: 13982 tx/s SD_BALANCE_WAKE+: 15688 tx/s kbuild (-j16): SD_BALANCE_WAKE-: 47.648295846 seconds time elapsed ( +- 0.312% ) SD_BALANCE_WAKE+: 47.608607360 seconds time elapsed ( +- 0.026% ) (same within noise) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-16 16:44:33 +02:00
Nelson Elhage	6ae8635085	[IA64] Clean up linker script using standard macros. Aside from using fewer output sections and moving some data around, the main side effect of this change is changing the alignment of some sections. In particular: * cachline-aligned and read_mostly data are now aligned to SMP_CACHE_BYTES. (Previously, they were laid out consecutively after a PAGE_SIZE alignment) * .init.ramfs is now page-aligned, per the INIT_RAM_FS macro. (Previously it had no explicit alignment). Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-15 09:52:33 -07:00
Nelson Elhage	ed7af3e63b	[IA64] Use standard macros for page-aligned data. Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-15 09:40:27 -07:00
Linus Torvalds	ada3fa1505	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c	2009-09-15 09:39:44 -07:00
Tim Abbott	f172468a14	[IA64] Use .ref.text, not .text.init for start_ap. It seems that start_ap doesn't need to be in a special location in the kernel, but it references some init code so it should be in .ref.text. Since this is the only thing in the .text.head section, eliminate .text.head from the linker script. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-15 09:29:31 -07:00
Linus Torvalds	227423904c	Merge branch 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, pat: Fix cacheflush address in change_page_attr_set_clr() mm: remove !NUMA condition from PAGEFLAGS_EXTENDED condition set x86: Fix earlyprintk=dbgp for machines without NX x86, pat: Sanity check remap_pfn_range for RAM region x86, pat: Lookup the protection from memtype list on vm_insert_pfn() x86, pat: Add lookup_memtype to get the current memtype of a paddr x86, pat: Use page flags to track memtypes of RAM pages x86, pat: Generalize the use of page flag PG_uncached x86, pat: Add rbtree to do quick lookup in memtype tracking x86, pat: Add PAT reserve free to io_mapping* APIs x86, pat: New i/f for driver to request memtype for IO regions x86, pat: ioremap to follow same PAT restrictions as other PAT users x86, pat: Keep identity maps consistent with mmaps even when pat_disabled x86, mtrr: make mtrr_aps_delayed_init static bool x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init generic-ipi: Allow cpus not yet online to call smp_call_function with irqs disabled x86: Fix an incorrect argument of reserve_bootmem() x86: Fix system crash when loading with "reservetop" parameter	2009-09-15 09:19:38 -07:00
Linus Torvalds	66a4fe0cb8	Merge branch 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6 * 'agp-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6: agp/intel: remove restore in resume agp: fix uninorth build intel-agp: Set dma mask for i915 agp: kill phys_to_gart() and gart_to_phys() intel-agp: fix sglist allocation to avoid vmalloc() intel-agp: Move repeated sglist free into separate function agp: Switch agp_{un,}map_page() to take struct page * argument agp: tidy up handling of scratch pages w.r.t. DMA API intel_agp: Use PCI DMA API correctly on chipsets new enough to have IOMMU agp: Add generic support for graphics dma remapping agp: Switch mask_memory() method to take address argument again, not page	2009-09-15 09:18:07 -07:00
Jiri Slaby	9b6b93998a	[IA64] pci_br, fix infinite loop in find_free_ate() When * there is almost out of ates * one asks for more than one ate * there are some available at the end of ate array then the inner for loop will end without incrementing 'index'. This means the outer loop will start at the same point finding it's available and runs the inner loop again from the same index. This repeats forever. Hence make sure we check we were at the end of ate array and return an error in such case. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Found-by: Jeff Mahoney <jeffm@novell.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-15 08:40:49 -07:00
Peter Zijlstra	b8a543ea5a	sched: Reduce forkexec_idx If we're looking to place a new task, we might as well find the idlest position _now_, not 1 tick ago. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:51:23 +02:00
Mike Galbraith	0ec9fab3d1	sched: Improve latencies and throughput Make the idle balancer more agressive, to improve a x264 encoding workload provided by Jason Garrett-Glaser: NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 252.82 fps, 22096.60 kb/s encoded 600 frames, 250.69 fps, 22096.60 kb/s encoded 600 frames, 245.76 fps, 22096.60 kb/s NO_NEXT_BUDDY LB_BIAS encoded 600 frames, 344.44 fps, 22096.60 kb/s encoded 600 frames, 346.66 fps, 22096.60 kb/s encoded 600 frames, 352.59 fps, 22096.60 kb/s NO_NEXT_BUDDY NO_LB_BIAS encoded 600 frames, 425.75 fps, 22096.60 kb/s encoded 600 frames, 425.45 fps, 22096.60 kb/s encoded 600 frames, 422.49 fps, 22096.60 kb/s Peter pointed out that this is better done via newidle_idx, not via LB_BIAS, newidle balancing should look for where there is load _now_, not where there was load 2 ticks ago. Worst-case latencies are improved as well as no buddies means less vruntime spread. (as per prior lkml discussions) This change improves kbuild-peak parallelism as well. Reported-by: Jason Garrett-Glaser <darkshikari@gmail.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1253011667.9128.16.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:51:16 +02:00
Peter Zijlstra	78e7ed53c9	sched: Tweak wake_idx When merging select_task_rq_fair() and sched_balance_self() we lost the use of wake_idx, restore that and set them to 0 to make wake balancing more aggressive. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:01:07 +02:00
Peter Zijlstra	c88d591089	sched: Merge select_task_rq_fair() and sched_balance_self() The problem with wake_idle() is that is doesn't respect things like cpu_power, which means it doesn't deal well with SMT nor the recent RT interaction. To cure this, it needs to do what sched_balance_self() does, which leads to the possibility of merging select_task_rq_fair() and sched_balance_self(). Modify sched_balance_self() to: - update_shares() when walking up the domain tree, (it only called it for the top domain, but it should have done this anyway), which allows us to remove this ugly bit from try_to_wake_up(). - do wake_affine() on the smallest domain that contains both this (the waking) and the prev (the wakee) cpu for WAKE invocations. Then use the top-down balance steps it had to replace wake_idle(). This leads to the dissapearance of SD_WAKE_BALANCE and SD_WAKE_IDLE_FAR, with SD_WAKE_IDLE replaced with SD_BALANCE_WAKE. SD_WAKE_AFFINE needs SD_BALANCE_WAKE to be effective. Touch all topology bits to replace the old with new SD flags -- platforms might need re-tuning, enabling SD_BALANCE_WAKE conditionally on a NUMA distance seems like a good additional feature, magny-core and small nehalem systems would want this enabled, systems with slow interconnects would not. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-15 16:01:05 +02:00
Linus Torvalds	f86054c245	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (23 commits) at_hdmac: Rework suspend_late()/resume_early() PM: Reset transition_started at dpm_resume_noirq PM: Update kerneldoc comments in drivers/base/power/main.c PM: Add convenience macro to make switching to dev_pm_ops less error-prone hp-wmi: Switch driver to dev_pm_ops floppy: Switch driver to dev_pm_ops PM: Trivial fixes PM / Hibernate / Memory hotplug: Always use for_each_populated_zone() PM/Hibernate: Do not try to allocate too much memory too hard (rev. 2) PM/Hibernate: Do not release preallocated memory unnecessarily (rev. 2) PM/Hibernate: Rework shrinking of memory PM: Fix typo in label name s/Platofrm_finish/Platform_finish/ PM: Run-time PM platform device bus support PM: Introduce core framework for run-time PM of I/O devices (rev. 17) Driver Core: Make PM operations a const pointer PM: Remove platform device suspend_late()/resume_early() V2 USB: Rework musb suspend()/resume_early() I2C: Rework i2c-s3c2410 suspend_late()/resume() V2 I2C: Rework i2c-pxa suspend_late()/resume_early() DMA: Rework txx9dmac suspend_late()/resume_early() ... Fix trivial conflict in drivers/base/platform.c (due to same constification patch being merged in both sides, along with some other PM work in the PM branch)	2009-09-14 20:03:54 -07:00
Linus Torvalds	69def9f05d	Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (202 commits) MAINTAINERS: update KVM entry KVM: correct error-handling code KVM: fix compile warnings on s390 KVM: VMX: Check cpl before emulating debug register access KVM: fix misreporting of coalesced interrupts by kvm tracer KVM: x86: drop duplicate kvm_flush_remote_tlb calls KVM: VMX: call vmx_load_host_state() only if msr is cached KVM: VMX: Conditionally reload debug register 6 KVM: Use thread debug register storage instead of kvm specific data KVM guest: do not batch pte updates from interrupt context KVM: Fix coalesced interrupt reporting in IOAPIC KVM guest: fix bogus wallclock physical address calculation KVM: VMX: Fix cr8 exiting control clobbering by EPT KVM: Optimize kvm_mmu_unprotect_page_virt() for tdp KVM: Document KVM_CAP_IRQCHIP KVM: Protect update_cr8_intercept() when running without an apic KVM: VMX: Fix EPT with WP bit change during paging KVM: Use kvm_{read,write}_guest_virt() to read and write segment descriptors KVM: x86 emulator: Add adc and sbb missing decoder flags KVM: Add missing #include ...	2009-09-14 17:43:43 -07:00
Anirban Sinha	353f6dd2de	cleanup console_print() console_print() is an old legacy interface mostly unused in the entire kernel tree. It's best to clean up its existing use and let developers use their own implementation of it as they feel fit. Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-14 17:41:42 -07:00
Hidetoshi Seto	0cced40e7c	[IA64] kdump: Short path to freeze CPUs Setting monarch_cpu = -1 to let slaves frozen might not work, because there might be slaves being late, not entered the rendezvous yet. Such slaves might be caught in while (monarch_cpu == -1) loop. Use kdump_in_progress instead of monarch_cpus to break INIT rendezvous and let all slaves enter DIE_INIT_SLAVE_LEAVE smoothly. And monarch no longer need to manage rendezvous if once kdump_in_progress is set, catch the monarch in DIE_INIT_MONARCH_ENTER then. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-14 16:19:24 -07:00
Hidetoshi Seto	5959906ee9	[IA64] kdump: Try INIT regardless of kdump_on_init CPUs should be frozen if possible, otherwise it might hinder kdump. So if there are CPUs not respond to IPI, try INIT to stop them. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-14 16:19:10 -07:00
Hidetoshi Seto	1726b0883d	[IA64] kdump: Mask INIT first in panic-kdump path Summary: Asserting INIT might block kdump if the system is already going to start kdump via panic. Description: INIT can interrupt anywhere in panic path, so it can interrupt in middle of kdump kicked by panic. Therefore there is a race if kdump is kicked concurrently, via Panic and via INIT. INIT could fail to invoke kdump if the system is already going to start kdump via panic. It could not restart kdump from INIT handler if some of cpus are already playing dead with INIT masked. It also means that INIT could block kdump's progress if no monarch is entered in the INIT rendezvous. Panic+INIT is a rare, but possible situation since it can be assumed that the kernel or an internal agent decides to panic the unstable system while another external agent decides to send an INIT to the system at same time. How to reproduce: Assert INIT just after panic, before all other cpus have frozen Expected results: continue kdump invoked by panic, or restart kdump from INIT Actual results: might be hang, crashdump not retrieved Proposed Fix: This patch masks INIT first in panic path to take the initiative on kdump, and reuse atomic value kdump_in_progress to make sure there is only one initiator of kdump. All INITs asserted later should be used only for freezing all other cpus. This mask will be removed soon by rfi in relocate_kernel.S, before jump into kdump kernel, after all cpus are frozen and no-op INIT handler is registered. So if INIT was in the interval while it is masked, it will pend on the system and will received just after the rfi, and handled by the no-op handler. If there was a MCA event while psr.mc is 1, in theory the event will pend on the system and will received just after the rfi same as above. MCA handler is unregistered here at the time, so received MCA will not reach to OS_MCA and will result in warmboot by SAL. Note that codes in this masked interval are relatively simpler than that in MCA/INIT handler which also executed with the mask. So it can be said that probability of error in this interval is supposed not so higher than that in MCA/INIT handler. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-14 16:18:54 -07:00
Hidetoshi Seto	68cb14c7c4	[IA64] kdump: Don't return APs to SAL from kdump Summary: Asserting INIT on cpu going to be offline will result in unexpected behavior. It will be a real problem in kdump cases where INIT might be asserted to unstable APs going to be offline by returning to SAL. Description: Since psr.mc is cleared when bits in psr are set to SAL_PSR_BITS_TO_SET in ia64_jump_to_sal(), there is a small window (~few msecs) that the cpu can receive INIT even if the cpu enter there via INIT handler. In this window we do restore of registers for SAL, so INIT asserted here will not work properly. It is hard to remove this window by masking INIT (i.e. setting psr.mc) because we have to unmask it later in OS, because we have to use branch instruction (br.ret, not rfi) to return SAL, due to OS_BOOT_RENDEZ to SAL return convention. I suppose this window will not be a real problem on cpu offline if we can educate people not to push INIT button during hotplug operation. However, only exception is a race in kdump and INIT. Now kdump returns APs to SAL before processing dump, but the kernel might receive INIT at that point in time. Such INIT might be asserted by kdump itself if an AP doesn't react IPI soon and kdump decided to use INIT to stop the AP. Or it might be asserted by operator or an external agent to start dump on the unstable system. Such panic+INIT or INIT+INIT cases should be rare, but it will be happy if we can retrieve crashdump even in such cases. How to reproduce: panic+INIT or INIT+INIT, with kdump configured Expected results: crashdump is retrieved anyway Actual results: panic, hang etc. (unexpected) Proposed fix To avoid the window on the way to SAL, this patch stops returning APs to SAL in case of kdump. In other words, this patch makes APs spin in OS instead of spinning in SAL. (* Note: What impact would be there? If a cpu is spinning in SAL, the cpu is in BOOT_RENDEZ loop, as same as offlined cpu. In theory if an INIT is asserted there, cpus in the BOOT_RENDEZ loop should not invoke OS_INIT on it. So in either way, no matter where the cpu is spinning actually in, once cpu starts spin and act as "frozen," INIT on the cpu have no effects. From another point of view, all debug information on the cpu should have stored to memory before the cpu start to be frozen. So no more action on the cpu is required.) I confirmed that the kdump sometime hangs by concurrent INITs (another INIT after an INIT), and it doesn't hang after applying this patch. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-14 16:18:37 -07:00
Hidetoshi Seto	6cc3efcdf0	[IA64] kexec: Unregister MCA handler before kexec Summary: MCA on the beginning of kdump/kexec kernel will result in unexpected behavior because MCA handler for previous kernel is invoked on the kdump kernel. Description: Once a cpu is passed to new kernel, all resources in previous kernel should not be used from the cpu. Even the resources for MCA handler are no exception. So we cannot handle MCAs and its machine check errors during kernel transition, until new handler for new kernel is registered with new resources ready for handling the MCA. How to reproduce: Assert MCA while kdump kernel is booting, before new MCA handler for kdump kernel is registered. Expected(Desirable) results: No recovery, cancel kdump and reboot the system. Actual results: MCA handler for previous kernel is invoked on the kdump kernel. => panic, hang etc. (unexpected) Proposed fix: To avoid entering MCA handler from early stage of new kernel, unregister the entry point from SAL before leave from current kernel. Then SAL will make all MCAs to warmboot safely, without invoking OS_MCA. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-14 16:18:17 -07:00
Hidetoshi Seto	07a6a4ae82	[IA64] kexec: Make INIT safe while transition to kdump/kexec kernel Summary: Asserting INIT on the beginning of kdump/kexec kernel will result in unexpected behavior because INIT handler for previous kernel is invoked on new kernel. Description: In panic situation, we can receive INIT while kernel transition, i.e. from beginning of panic to bootstrap of kdump kernel. Since we initialize registers on leave from current kernel, no longer monarch/slave handlers of current kernel in virtual mode are called safely. (In fact system goes hang as far as I confirmed) How to Reproduce: Start kdump # echo c > /proc/sysrq-trigger Then assert INIT while kdump kernel is booting, before new INIT handler for kdump kernel is registered. Expected(Desirable) result: kdump kernel boots without any problem, crashdump retrieved Actual result: INIT handler for previous kernel is invoked on kdump kernel => panic, hang etc. (unexpected) Proposed fix: We can unregister these init handlers from SAL before jumping into new kernel, however then the INIT will fallback to default behavior, result in warmboot by SAL (according to the SAL specification) and we cannot retrieve the crashdump. Therefore this patch introduces a NOP init handler and register it to SAL before leave from current kernel, to start kdump safely by preventing INITs from entering virtual mode and resulting in warmboot. On the other hand, in case of kexec that not for kdump, it also has same problem with INIT while kernel transition. This patch handles this case differently, because for kexec unregistering handlers will be preferred than registering NOP handler, since the situation "no handlers registered" is usual state for kernel's entry. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-14 16:18:02 -07:00
Hidetoshi Seto	4295ab3488	[IA64] kdump: Mask MCA/INIT on frozen cpus Summary: INIT asserted on kdump kernel invokes INIT handler not only on a cpu that running on the kdump kernel, but also BSP of the panicked kernel, because the (badly) frozen BSP can be thawed by INIT. Description: The kdump_cpu_freeze() is called on cpus except one that initiates panic and/or kdump, to stop/offline the cpu (on ia64, it means we pass control of cpus to SAL, or put them in spinloop). Note that CPU0(BSP) always go to spinloop, so if panic was happened on an AP, there are at least 2cpus (= the AP and BSP) which not back to SAL. On the spinning cpus, interrupts are disabled (rsm psr.i), but INIT is still interruptible because psr.mc for mask them is not set unless kdump_cpu_freeze() is not called from MCA/INIT context. Therefore, assume that a panic was happened on an AP, kdump was invoked, new INIT handlers for kdump kernel was registered and then an INIT is asserted. From the viewpoint of SAL, there are 2 online cpus, so INIT will be delivered to both of them. It likely means that not only the AP (= a cpu executing kdump) enters INIT handler which is newly registered, but also BSP (= another cpu spinning in panicked kernel) enters the same INIT handler. Of course setting of registers in BSP are still old (for panicked kernel), so what happen with running handler with wrong setting will be extremely unexpected. I believe this is not desirable behavior. How to Reproduce: Start kdump on one of APs (e.g. cpu1) # taskset 0x2 echo c > /proc/sysrq-trigger Then assert INIT after kdump kernel is booted, after new INIT handler for kdump kernel is registered. Expected results: An INIT handler is invoked only on the AP. Actual results: An INIT handler is invoked on the AP and BSP. Sample of results: I got following console log by asserting INIT after prompt "root:/>". It seems that two monarchs appeared by one INIT, and one panicked at last. And it also seems that the panicked one supposed there were 4 online cpus and no one did rendezvous: : [ 0 %]dropping to initramfs shell exiting this shell will reboot your system root:/> Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=0 ia64_init_handler: Promoting cpu 0 to monarch. Delaying for 5 seconds... All OS INIT slaves have reached rendezvous Processes interrupted by INIT - 0 (cpu 0 task 0xa000000100af0000) : <<snip>> : Entered OS INIT handler. PSP=fff301a0 cpu=0 monarch=1 Delaying for 5 seconds... mlogbuf_finish: printing switched to urgent mode, MCA/INIT might be dodgy or fail. OS INIT slave did not rendezvous on cpu 1 2 3 INIT swapper 0[0]: bugcheck! 0 [1] : <<snip>> : Kernel panic - not syncing: Attempted to kill the idle task! Proposed fix: To avoid this problem, this patch inserts ia64_set_psr_mc() to mask INIT on cpus going to be frozen. This masking have no effect if the kdump_cpu_freeze() is called from INIT handler when kdump_on_init == 1, because psr.mc is already turned on to 1 before entering OS_INIT. I confirmed that weird log like above are disappeared after applying this patch. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Haren Myneni <hbabu@us.ibm.com> Cc: kexec@lists.infradead.org Acked-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2009-09-14 16:17:05 -07:00
Rafael J. Wysocki	ac8d513a68	Merge branch 'master' into for-linus	2009-09-14 20:26:05 +02:00
Linus Torvalds	d7e9660ad9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1623 commits) netxen: update copyright netxen: fix tx timeout recovery netxen: fix file firmware leak netxen: improve pci memory access netxen: change firmware write size tg3: Fix return ring size breakage netxen: build fix for INET=n cdc-phonet: autoconfigure Phonet address Phonet: back-end for autoconfigured addresses Phonet: fix netlink address dump error handling ipv6: Add IFA_F_DADFAILED flag net: Add DEVTYPE support for Ethernet based devices mv643xx_eth.c: remove unused txq_set_wrr() ucc_geth: Fix hangs after switching from full to half duplex ucc_geth: Rearrange some code to avoid forward declarations phy/marvell: Make non-aneg speed/duplex forcing work for 88E1111 PHYs drivers/net/phy: introduce missing kfree drivers/net/wan: introduce missing kfree net: force bridge module(s) to be GPL Subject: [PATCH] appletalk: Fix skb leak when ipddp interface is not loaded ... Fixed up trivial conflicts: - arch/x86/include/asm/socket.h converted to <asm-generic/socket.h> in the x86 tree. The generic header has the same new #define's, so that works out fine. - drivers/net/tun.c fix conflict between `89f56d1e9` ("tun: reuse struct sock fields") that switched over to using 'tun->socket.sk' instead of the redundantly available (and thus removed) 'tun->sk', and `2b980dbd` ("lsm: Add hooks to the TUN driver") which added a new 'tun->sk' use. Noted in 'next' by Stephen Rothwell.	2009-09-14 10:37:28 -07:00
Linus Torvalds	eee2775d99	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits) rcu: Move end of special early-boot RCU operation earlier rcu: Changes from reviews: avoid casts, fix/add warnings, improve comments rcu: Create rcutree plugins to handle hotplug CPU for multi-level trees rcu: Remove lockdep annotations from RCU's _notrace() API members rcu: Add #ifdef to suppress __rcu_offline_cpu() warning in !HOTPLUG_CPU builds rcu: Add CPU-offline processing for single-node configurations rcu: Add "notrace" to RCU function headers used by ftrace rcu: Remove CONFIG_PREEMPT_RCU rcu: Merge preemptable-RCU functionality into hierarchical RCU rcu: Simplify rcu_pending()/rcu_check_callbacks() API rcu: Use debugfs_remove_recursive() simplify code. rcu: Merge per-RCU-flavor initialization into pre-existing macro rcu: Fix online/offline indication for rcudata.csv trace file rcu: Consolidate sparse and lockdep declarations in include/linux/rcupdate.h rcu: Renamings to increase RCU clarity rcu: Move private definitions from include/linux/rcutree.h to kernel/rcutree.h rcu: Expunge lingering references to CONFIG_CLASSIC_RCU, optimize on !SMP rcu: Delay rcu_barrier() wait until beginning of next CPU-hotunplug operation. rcu: Fix typo in rcu_irq_exit() comment header rcu: Make rcupreempt_trace.c look at offline CPUs ...	2009-09-11 13:20:18 -07:00
Linus Torvalds	a66a50054e	Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (59 commits) x86/gart: Do not select AGP for GART_IOMMU x86/amd-iommu: Initialize passthrough mode when requested x86/amd-iommu: Don't detach device from pt domain on driver unbind x86/amd-iommu: Make sure a device is assigned in passthrough mode x86/amd-iommu: Align locking between attach_device and detach_device x86/amd-iommu: Fix device table write order x86/amd-iommu: Add passthrough mode initialization functions x86/amd-iommu: Add core functions for pd allocation/freeing x86/dma: Mark iommu_pass_through as __read_mostly x86/amd-iommu: Change iommu_map_page to support multiple page sizes x86/amd-iommu: Support higher level PTEs in iommu_page_unmap x86/amd-iommu: Remove old page table handling macros x86/amd-iommu: Use 2-level page tables for dma_ops domains x86/amd-iommu: Remove bus_addr check in iommu_map_page x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX x86/amd-iommu: Change alloc_pte to support 64 bit address space x86/amd-iommu: Introduce increase_address_space function x86/amd-iommu: Flush domains if address space size was increased x86/amd-iommu: Introduce set_dte_entry function x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices ...	2009-09-11 13:16:37 -07:00
James Morris	a3c8b97396	Merge branch 'next' into for-linus	2009-09-11 08:04:49 +10:00
Avi Kivity	27c2381068	KVM: Add __KERNEL__ guards to exported headers Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 10:46:48 +03:00
Gleb Natapov	a1b37100d9	KVM: Reduce runnability interface with arch support code Remove kvm_cpu_has_interrupt() and kvm_arch_interrupt_allowed() from interface between general code and arch code. kvm_arch_vcpu_runnable() checks for interrupts instead. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:13 +03:00
Michael S. Tsirkin	bda9020e24	KVM: remove in_range from io devices This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write functions. in_range now becomes unused so it is removed from device ops in favor of read/write callbacks performing range checks internally. This allows aliasing (mostly for in-kernel virtio), as well as better error handling by making it possible to pass errors up to userspace. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2009-09-10 08:33:05 +03:00

1 2 3 4 5 ...

2231 Commits