linux

Author	SHA1	Message	Date
Tejun Heo	86ef4dbf1f	x86, NUMA: Drop @start/last_pfn from initmem_init() initmem_init() extensively accesses and modifies global data structures and the parameters aren't even followed depending on which path is being used. Drop @start/last_pfn and let it deal with @max_pfn directly. This is in preparation for further NUMA init cleanups. - v2: x86-32 initmem_init() weren't updated breaking 32bit builds. Fixed. Found by Yinghai. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Shaohui Zheng <shaohui.zheng@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@linux.intel.com>	2011-02-16 12:13:06 +01:00
Ingo Molnar	275a88d3cf	Merge branch 'x86/amd-nb' into x86/mm Merge reason: consolidate it into the more generic x86/mm tree to prevent conflicts with ongoing NUMA work. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-16 09:45:47 +01:00
Ingo Molnar	52b8b8d725	Merge branch 'x86/numa' into x86/mm Merge reason: consolidate it into the more generic x86/mm tree to prevent conflicts. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-16 09:44:15 +01:00
Ingo Molnar	02ac81a812	Merge branch 'x86/bootmem' into x86/mm Merge reason: the topic is ready - consolidate it into the more generic x86/mm tree and prevent conflicts. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-16 09:43:54 +01:00
Linus Torvalds	1cecd791f2	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix text_poke_smp_batch() deadlock perf tools: Fix thread_map event synthesizing in top and record watchdog, nmi: Lower the severity of error messages ARM: oprofile: Fix backtraces in timer mode oprofile: Fix usage of CONFIG_HW_PERF_EVENTS for oprofile_perf_init and friends	2011-02-15 10:18:48 -08:00
Naga Chumbalkar	84e383b322	x86, dmi, debug: Log board name (when present) in dmesg/oops output The "Type 2" SMBIOS record that contains Board Name is not strictly required and may be absent in the SMBIOS on some platforms. ( Please note that Type 2 is not listed in Table 3 in Sec 6.2 ("Required Structures and Data") of the SMBIOS v2.7 Specification. ) Use the Manufacturer Name (aka System Vendor) name. Print Board Name only when it is present. Before the fix: (i) dmesg output: DMI: /ProLiant DL380 G6, BIOS P62 01/29/2011 (ii) oops output: Pid: 2170, comm: bash Not tainted 2.6.38-rc4+ #3 /ProLiant DL380 G6 After the fix: (i) dmesg output: DMI: HP ProLiant DL380 G6, BIOS P62 01/29/2011 (ii) oops output: Pid: 2278, comm: bash Not tainted 2.6.38-rc4+ #4 HP ProLiant DL380 G6 Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com> Reviewed-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: <stable@kernel.org> # .3x - good for debugging, please apply as far back as it applies cleanly LKML-Reference: <20110214224423.2182.13929.sendpatchset@nchumbalkar.americas.hpqcorp.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-15 04:20:57 +01:00
Paul Bolle	678301ecad	x86, ioapic: Don't warn about non-existing IOAPICs if we have none mp_find_ioapic() prints errors like: ERROR: Unable to locate IOAPIC for GSI 13 if it can't find the IOAPIC that manages that specific GSI. I see errors like that at every boot of a laptop that apparently doesn't have any IOAPICs. But if there are no IOAPICs it doesn't seem to be an error that none can be found. A solution that gets rid of this message is to directly return if nr_ioapics (still) is zero. (But keep returning -1 in that case, so nothing breaks from this change.) The call chain that generates this error is: pnpacpi_allocated_resource() case ACPI_RESOURCE_TYPE_IRQ: pnpacpi_parse_allocated_irqresource() acpi_get_override_irq() mp_find_ioapic() Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-15 04:15:04 +01:00
Borislav Petkov	9e81509efc	x86, amd: Initialize variable properly Commit `d518573de6` ("x86, amd: Normalize compute unit IDs on multi-node processors") introduced compute unit normalization but causes a compiler warning: arch/x86/kernel/cpu/amd.c: In function 'amd_detect_cmp': arch/x86/kernel/cpu/amd.c:268: warning: 'cores_per_cu' may be used uninitialized in this function arch/x86/kernel/cpu/amd.c:268: note: 'cores_per_cu' was declared here The compiler is right - initialize it with a proper value. Also, fix up a comment while at it. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20110214171451.GB10076@kryptos.osrc.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-15 03:03:19 +01:00
Feng Tang	6b617e224d	x86/platform: Add a wallclock_init func to x86_init.timers ops Some wall clock devices use MMIO based HW register, this new function will give them a chance to do some initialization work before their get/set_time service get called, which is usually in early kernel boot phase. Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-02-14 18:20:43 +01:00
Ingo Molnar	b366801c95	Merge commit 'v2.6.38-rc4' into x86/numa Merge reason: Merge latest fixes before applying new patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-14 13:28:31 +01:00
Yinghai Lu	e5fea868e6	x86: Fix and clean up generic_processor_info() One of the error printouts in generic_processor_info() prints out the APIC version instead of the cpu index the warning text describes. Move version validation down, after we get the right cpu index. -v2: add comments about reason why we can have cpu=0 there. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4D5240A9.4080703@kernel.org> [ Cleaned up and made the BIOS bug printouts more consistent ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-14 13:23:51 +01:00
Ingo Molnar	91e04ec058	Merge commit 'v2.6.38-rc4' into x86/cpu Merge reason: pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-14 13:18:56 +01:00
Kamal Mostafa	9a6d44b9ad	x86: Emit "mem=nopentium ignored" warning when not supported Emit warning when "mem=nopentium" is specified on any arch other than x86_32 (the only that arch supports it). Signed-off-by: Kamal Mostafa <kamal@canonical.com> BugLink: http://bugs.launchpad.net/bugs/553464 Cc: Yinghai Lu <yinghai@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> LKML-Reference: <1296783486-23033-2-git-send-email-kamal@canonical.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org>	2011-02-14 13:15:43 +01:00
Kamal Mostafa	77eed821ac	x86: Fix panic when handling "mem={invalid}" param Avoid removing all of memory and panicing when "mem={invalid}" is specified, e.g. mem=blahblah, mem=0, or mem=nopentium (on platforms other than x86_32). Signed-off-by: Kamal Mostafa <kamal@canonical.com> BugLink: http://bugs.launchpad.net/bugs/553464 Cc: Yinghai Lu <yinghai@kernel.org> Cc: Len Brown <len.brown@intel.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: <stable@kernel.org> # .3x: as far back as it applies LKML-Reference: <1296783486-23033-1-git-send-email-kamal@canonical.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-14 13:15:43 +01:00
Shaohua Li	3a09fb4570	x86: Allocate 32 tlb_invalidate_interrupt handler stubs Add up to 32 invalidate_interrupt handlers. How many handlers are added depends on NUM_INVALIDATE_TLB_VECTORS. So if NUM_INVALIDATE_TLB_VECTORS is smaller than 32, we reduce code size. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> LKML-Reference: <1295232725.1949.708.camel@sli10-conroe> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-14 13:03:08 +01:00
Borislav Petkov	1c9d16e359	x86: Fix mwait_usable section mismatch We use it in non __cpuinit code now too so drop marker. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20110211171754.GA21047@aftab> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-14 12:08:28 +01:00
Ingo Molnar	d2137d5af4	Merge branch 'linus' into x86/bootmem Conflicts: arch/x86/mm/numa_64.c Merge reason: fix the conflict, update to latest -rc and pick up this dependent fix from Yinghai: `e6d2e2b2b1`: memblock: don't adjust size in memblock_find_base() Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-14 11:55:18 +01:00
Thomas Gleixner	5117348dea	x86: Readd missing irq_to_desc() in fixup_irq() commit a3c08e5d(x86: Convert irq_chip access to new functions) accidentally zapped desc = irq_to_desc(irq); in the vector loop. So we lock some random irq descriptor. Add it back. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> # .37	2011-02-12 11:56:22 +01:00
Peter Zijlstra	d91309f69b	x86: Fix text_poke_smp_batch() deadlock Fix this deadlock - we are already holding the mutex: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.38-rc4-test+ #1 ------------------------------------------------------- bash/1850 is trying to acquire lock: (text_mutex){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f but task is already holding lock: (smp_alt){+.+...}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (smp_alt){+.+...}: [<ffffffff81082d02>] lock_acquire+0xcd/0xf8 [<ffffffff8192e119>] __mutex_lock_common+0x4c/0x339 [<ffffffff8192e4ca>] mutex_lock_nested+0x3e/0x43 [<ffffffff8101050f>] alternatives_smp_switch+0x77/0x1d8 [<ffffffff81926a6f>] do_boot_cpu+0xd7/0x762 [<ffffffff819277dd>] native_cpu_up+0xe6/0x16a [<ffffffff81928e28>] _cpu_up+0x9d/0xee [<ffffffff81928f4c>] cpu_up+0xd3/0xe7 [<ffffffff82268d4b>] kernel_init+0xe8/0x20a [<ffffffff8100ba24>] kernel_thread_helper+0x4/0x10 -> #1 (cpu_hotplug.lock){+.+.+.}: [<ffffffff81082d02>] lock_acquire+0xcd/0xf8 [<ffffffff8192e119>] __mutex_lock_common+0x4c/0x339 [<ffffffff8192e4ca>] mutex_lock_nested+0x3e/0x43 [<ffffffff810568cc>] get_online_cpus+0x41/0x55 [<ffffffff810a1348>] stop_machine+0x1e/0x3e [<ffffffff819314c1>] text_poke_smp_batch+0x3a/0x3c [<ffffffff81932b6c>] arch_optimize_kprobes+0x10d/0x11c [<ffffffff81933a51>] kprobe_optimizer+0x152/0x222 [<ffffffff8106bb71>] process_one_work+0x1d3/0x335 [<ffffffff8106cfae>] worker_thread+0x104/0x1a4 [<ffffffff810707c4>] kthread+0x9d/0xa5 [<ffffffff8100ba24>] kernel_thread_helper+0x4/0x10 -> #0 (text_mutex){+.+.+.}: other info that might help us debug this: 6 locks held by bash/1850: #0: (&buffer->mutex){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #1: (s_active#75){.+.+.+}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #2: (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #3: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #4: (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f #5: (smp_alt){+.+...}, at: [<ffffffff8100a9c1>] return_to_handler+0x0/0x2f stack backtrace: Pid: 1850, comm: bash Not tainted 2.6.38-rc4-test+ #1 Call Trace: [<ffffffff81080eb2>] print_circular_bug+0xa8/0xb7 [<ffffffff8192e4ca>] mutex_lock_nested+0x3e/0x43 [<ffffffff81010302>] alternatives_smp_unlock+0x3d/0x93 [<ffffffff81010630>] alternatives_smp_switch+0x198/0x1d8 [<ffffffff8102568a>] native_cpu_die+0x65/0x95 [<ffffffff818cc4ec>] _cpu_down+0x13e/0x202 [<ffffffff8117a619>] sysfs_write_file+0x108/0x144 [<ffffffff8111f5a2>] vfs_write+0xac/0xff [<ffffffff8111f7a9>] sys_write+0x4a/0x6e Reported-by: Steven Rostedt <rostedt@goodmis.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: mathieu.desnoyers@efficios.com Cc: rusty@rustcorp.com.au Cc: ananth@in.ibm.com Cc: masami.hiramatsu.pt@hitachi.com Cc: fweisbec@gmail.com Cc: jbeulich@novell.com Cc: jbaron@redhat.com Cc: mhiramat@redhat.com LKML-Reference: <1297458466.5226.93.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-12 02:34:34 +01:00
Ingo Molnar	3e86858133	Merge commit 'v2.6.38-rc4' into perf/core Merge reason: pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-12 02:24:25 +01:00
Jan Beulich	691269f0d9	x86: Adjust section placement in AMD northbridge related code amd_nb_misc_ids[] can live in .rodata, and enable_pci_io_ecs() can be moved into .cpuinit.text. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Andreas Herrmann <Andreas.Herrmann3@amd.com> Cc: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <4D525DDD0200007800030F07@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-10 13:32:52 +01:00
Jan Beulich	b82fef82d5	x86: Partly unify asm-offsets_{32,64}.c Just consolidating the common parts. Full unification would seem straight forward, but it's not clear the necessary #ifdef-s would be acceptable. Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4D525D520200007800030EE9@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-10 13:31:37 +01:00
Jan Beulich	94d1ac8b55	x86: Reduce back the alignment of the per-CPU data section This complements commit: `47f19a0814`: percpu: Remove the multi-page alignment facility reverting one leftover of: `fe8e0c25ca`: x86, 32-bit: Align percpu area and irq stacks to THREAD_SIZE Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Alexander van Heukelum <heukelum@fastmail.fm> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4D525CE60200007800030EE5@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Alexander van Heukelum <heukelum@fastmail.fm>	2011-02-10 13:31:36 +01:00
Jan Beulich	2fb270f321	x86: Fix section mismatch in LAPIC initialization Additionally doing things conditionally upon smp_processor_id() being zero is generally a bad idea, as this means CPU 0 cannot be offlined and brought back online later again. While there may be other places where this is done, I think adding more of those should be avoided so that some day SMP can really become "symmetrical". Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> LKML-Reference: <4D525C7E0200007800030EE1@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-10 13:26:53 +01:00
Borislav Petkov	44d60c0f5c	x86, microcode, AMD: Extend ucode size verification The different families have a different max size for the ucode patch, adjust size checking to the family we're running on. Also, do not vzalloc the max size of the ucode but only the actual size that is passed on from the firmware loader. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2011-02-10 12:24:03 +01:00
Borislav Petkov	258721ef34	x86, microcode, AMD: Cleanup dmesg output Unify pr_* to use pr_fmt, shorten messages, correct type formatting. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <Andreas.Herrmann3@amd.com>	2011-02-09 16:05:36 +01:00
Borislav Petkov	05ff02e4c0	x86, microcode, AMD: Remove unneeded memset call collect_cpu_info_amd() clears its csig arg but this is done in the microcode_core's collect_cpu_info() by clearing the embedding struct ucode_cpu_info. Drop it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <Andreas.Herrmann3@amd.com>	2011-02-09 16:05:35 +01:00
Borislav Petkov	7cc27349cb	x86, microcode, AMD: Simplify get_next_ucode Do not copy the section header but look at it directly through the pointer. Also, make it return a ptr to a ucode header directly thus dropping a bunch of unneeded casts. Finally, simplify generic_load_microcode(), while at it. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <Andreas.Herrmann3@amd.com>	2011-02-09 16:05:34 +01:00
Borislav Petkov	10de52d665	x86, microcode, AMD: Simplify install_equiv_cpu_table There's no need to memcpy the ucode header in order to look at it only in this function - use the original buffer instead. Also, fix return type semantics by returning a negative value on error and a positive otherwise. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <Andreas.Herrmann3@amd.com>	2011-02-09 16:05:33 +01:00
Borislav Petkov	ffc7e8ac82	x86, microcode, AMD: Release firmware on error When the ucode magic is wrong, for whatever reason, we don't release the loaded firmware binary and its related resources. Make sure we do. Also, fix function naming to fit this driver's convention and shorten variable names. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <Andreas.Herrmann3@amd.com>	2011-02-09 16:05:32 +01:00
Borislav Petkov	6c53cbfced	x86, microcode: Correct sysdev_add error path When we encounter an error while initting the microcode driver on a CPU, we must undo the previously added sysfs group. Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Andreas Herrmann <Andreas.Herrmann3@amd.com>	2011-02-09 16:05:31 +01:00
Hans Rosenfeld	cabb5bd7ff	x86, amd: Support L3 Cache Partitioning on AMD family 0x15 CPUs L3 Cache Partitioning allows selecting which of the 4 L3 subcaches can be used for evictions by the L2 cache of each compute unit. By writing a 4-bit hexadecimal mask into the the sysfs file /sys/devices/system/cpu/cpuX/cache/index3/subcaches, the user can set the enabled subcaches for a CPU. The settings are directly read from and written to the hardware, so there is no way to have contradicting settings for two CPUs belonging to the same compute unit. Writing will always overwrite any previous setting for a compute unit. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: <Andreas.Herrmann3@amd.com> LKML-Reference: <1297098639-431383-1-git-send-email-hans.rosenfeld@amd.com> [ -v3: minor style fixes ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-07 19:16:22 +01:00
H. Peter Anvin	d344e38b2c	x86, nx: Mark the ACPI resume trampoline code as +x We reserve lowmem for the things that need it, like the ACPI wakeup code, way early to guarantee availability. This happens before we set up the proper pagetables, so set_memory_x() has no effect. Until we have a better solution, use an initcall to mark the wakeup code executable. Originally-by: Matthieu Castet <castet.matthieu@free.fr> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Matthias Hopf <mhopf@suse.de> Cc: rjw@sisk.pl Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <4D4F8019.2090104@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-07 09:07:13 +01:00
Ingo Molnar	c7f9a6f377	Merge branch 'linus' into perf/core Merge reason: Pick up perf fixes that are now upstream Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-07 08:44:26 +01:00
Linus Torvalds	07675f484b	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-32: Make sure the stack is set up before we use it x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platforms x86, nx: Don't force pages RW when setting NX bits	2011-02-06 12:03:10 -08:00
H. Peter Anvin	11d4c3f9b6	x86-32: Make sure the stack is set up before we use it Since checkin `ebba638ae7` we call verify_cpu even in 32-bit mode. Unfortunately, calling a function means using the stack, and the stack pointer was not initialized in the 32-bit setup code! This code initializes the stack pointer, and simplifies the interface slightly since it is easier to rely on just a pointer value rather than a descriptor; we need to have different values for the segment register anyway. This retains start_stack as a virtual address, even though a physical address would be more convenient for 32 bits; the 64-bit code wants the other way around... Reported-by: Matthieu Castet <castet.matthieu@free.fr> LKML-Reference: <4D41E86D.8060205@free.fr> Tested-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-02-04 22:27:28 -08:00
Linus Torvalds	eb487ab4d5	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Fix reading in perf_event_read() watchdog: Don't change watchdog state on read of sysctl watchdog: Fix sysctl consistency watchdog: Fix broken nowatchdog logic perf: Fix Pentium4 raw event validation perf: Fix alloc_callchain_buffers()	2011-02-03 08:52:05 -08:00
Suresh Siddha	f7448548a9	x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platforms Markus Kohn ran into a hard hang regression on an acer aspire 1310, when acpi is enabled. git bisect showed the following commit as the bad one that introduced the boot regression. commit `d0af9eed5a` Author: Suresh Siddha <suresh.b.siddha@intel.com> Date: Wed Aug 19 18:05:36 2009 -0700 x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init Because of the UP configuration of that platform, native_smp_prepare_cpus() bailed out (in smp_sanity_check()) before doing the set_mtrr_aps_delayed_init() Further down the boot path, native_smp_cpus_done() will call the delayed MTRR initialization for the AP's (mtrr_aps_init()) with mtrr_aps_delayed_init not set. This resulted in the boot processor reprogramming its MTRR's to the values seen during the start of the OS boot. While this is not needed ideally, this shouldn't have caused any side-effects. This is because the reprogramming of MTRR's (set_mtrr_state() that gets called via set_mtrr()) will check if the live register contents are different from what is being asked to write and will do the actual write only if they are different. BP's mtrr state is read during the start of the OS boot and typically nothing would have changed when we ask to reprogram it on BP again because of the above scenario on an UP platform. So on a normal UP platform no reprogramming of BP MTRR MSR's happens and all is well. However, on this platform, bios seems to be modifying the fixed mtrr range registers between the start of OS boot and when we double check the live registers for reprogramming BP MTRR registers. And as the live registers are modified, we end up reprogramming the MTRR's to the state seen during the start of the OS boot. During ACPI initialization, something in the bios (probably smi handler?) don't like this fact and results in a hard lockup. We didn't see this boot hang issue on this platform before the commit `d0af9eed5a`, because only the AP's (if any) will program its MTRR's to the value that BP had at the start of the OS boot. Fix this issue by checking mtrr_aps_delayed_init before continuing further in the mtrr_aps_init(). Now, only AP's (if any) will program its MTRR's to the BP values during boot. Addresses https://bugzilla.novell.com/show_bug.cgi?id=623393 [ By the way, this behavior of the bios modifying MTRR's after the start of the OS boot is not common and the kernel is not prepared to handle this situation well. Irrespective of this issue, during suspend/resume, linux kernel will try to reprogram the BP's MTRR values to the values seen during the start of the OS boot. So suspend/resume might be already broken on this platform for all linux kernel versions. ] Reported-and-bisected-by: Markus Kohn <jabber@gmx.org> Tested-by: Markus Kohn <jabber@gmx.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Thomas Renninger <trenn@novell.com> Cc: Rafael Wysocki <rjw@novell.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: stable@kernel.org # [v2.6.32+] LKML-Reference: <1296694975.4418.402.camel@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-03 12:10:38 +01:00
Richard Cochran	ce26efdefa	x86: Add clock_adjtime for x86 This patch adds the clock_adjtime system call to the x86 architecture. Signed-off-by: Richard Cochran <richard.cochran@omicron.at> Acked-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <20110201134419.968905083@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-02-02 15:28:19 +01:00
Ingo Molnar	8104a4775a	Merge commit 'v2.6.38-rc3' into perf/core Merge reason: Pick up latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-02 07:10:06 +01:00
Tejun Heo	4e62445b90	x86: Fix build failure on X86_UP_APIC Commit `4c321ff8` (x86: Replace cpu_2_logical_apicid[] with early percpu variable) and following changes introduced and used x86_cpu_to_logical_apicid percpu variable. It was declared and defined inside CONFIG_SMP && CONFIG_X86_32 but if CONFIG_X86_UP_APIC is set UP configuration makes use of it and build fails. Fix it by declaring and defining it inside CONFIG_X86_LOCAL_APIC && CONFIG_X86_32. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ingo Molnar <mingo@elte.hu> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <20110128162248.GA25746@htj.dyndns.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 17:24:49 +01:00
Tejun Heo	8db78cc4b4	x86: Unify NUMA initialization between 32 and 64bit Now that everything else is unified, NUMA initialization can be unified too. * numa_init_array() and init_cpu_to_node() are moved from numa_64 to numa. * numa_32::initmem_init() is updated to call numa_init_array() and setup_arch() to call init_cpu_to_node() on 32bit too. * x86_cpu_to_node_map is now initialized to NUMA_NO_NODE on 32bit too. This is safe now as numa_init_array() will initialize it early during boot. This makes NUMA mapping fully initialized before setup_per_cpu_areas() on 32bit too and thus makes the first percpu chunk which contains all the static variables and some of dynamic area allocated with NUMA affinity correctly considered. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-17-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Pekka Enberg <penberg@kernel.org>	2011-01-28 14:54:10 +01:00
Tejun Heo	de2d9445f1	x86: Unify node_to_cpumask_map handling between 32 and 64bit x86_32 has been managing node_to_cpumask_map explicitly from map_cpu_to_node() and friends in a rather ugly way. With previous changes, it's now possible to share the code with 64bit. * When CONFIG_NUMA_EMU is disabled, numa_add/remove_cpu() are implemented in numa.c and shared by 32 and 64bit. CONFIG_NUMA_EMU versions still live in numa_64.c. NUMA_EMU's dependency on 64bit is planned to be removed and the above should go away together. * identify_cpu() now calls numa_add_cpu() for 32bit too. This makes the explicit mask management from map_cpu_to_node() unnecessary. * The whole x86_32 specific map_cpu_to_node() chunk is no longer necessary. Dropped. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-16-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: David Rientjes <rientjes@google.com> Cc: Shaohui Zheng <shaohui.zheng@intel.com>	2011-01-28 14:54:10 +01:00
Tejun Heo	645a79195f	x86: Unify CPU -> NUMA node mapping between 32 and 64bit Unlike 64bit, 32bit has been using its own cpu_to_node_map[] for CPU -> NUMA node mapping. Replace it with early_percpu variable x86_cpu_to_node_map and share the mapping code with 64bit. * USE_PERCPU_NUMA_NODE_ID is now enabled for 32bit too. * x86_cpu_to_node_map and numa_set/clear_node() are moved from numa_64 to numa. For now, on 32bit, x86_cpu_to_node_map is initialized with 0 instead of NUMA_NO_NODE. This is to avoid introducing unexpected behavior change and will be updated once init path is unified. * srat_detect_node() is now enabled for x86_32 too. It calls numa_set_node() and initializes the mapping making explicit cpu_to_node_map[] updates from map/unmap_cpu_to_node() unnecessary. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-15-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: David Rientjes <rientjes@google.com>	2011-01-28 14:54:09 +01:00
Tejun Heo	bbc9e2f452	x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bit The mapping between cpu/apicid and node is done via apicid_to_node[] on 64bit and apicid_2_node[] + apic->x86_32_numa_cpu_node() on 32bit. This difference makes it difficult to further unify 32 and 64bit NUMA handling. This patch unifies it by replacing both apicid_to_node[] and apicid_2_node[] with __apicid_to_node[] array, which is accessed by two accessors - set_apicid_to_node() and numa_cpu_node(). On 64bit, numa_cpu_node() always consults __apicid_to_node[] directly while 32bit goes through apic->numa_cpu_node() method to allow apic implementations to override it. srat_detect_node() for amd cpus contains workaround for broken NUMA configuration which assumes relationship between APIC ID, HT node ID and NUMA topology. Leave it to access __apicid_to_node[] directly as mapping through CPU might result in undesirable behavior change. The comment is reformatted and updated to note the ugliness. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-14-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: David Rientjes <rientjes@google.com>	2011-01-28 14:54:09 +01:00
Tejun Heo	89e5dc218e	x86: Replace apic->apicid_to_node() with ->x86_32_numa_cpu_node() apic->apicid_to_node() is 32bit specific apic operation which determines NUMA node for a CPU. Depending on the APIC implementation, it can be easier to determine NUMA node from either physical or logical apicid. Currently, ->apicid_to_node() takes @logical_apicid and calls hard_smp_processor_id() if the physical apicid is needed. This prevents NUMA mapping from being queried from a different CPU, which in turn makes it impossible to initialize NUMA mapping before SMP bringup. This patch replaces apic->apicid_to_node() with ->x86_32_numa_cpu_node() which takes @cpu, from which both logical and physical apicids can easily be determined. While at it, drop duplicate implementations from bigsmp_32 and summit_32, and use the default one. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-13-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:08 +01:00
Tejun Heo	df04cf011b	x86: Implement x86_32_early_logical_apicid() for numaq_32 Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-12-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:08 +01:00
Tejun Heo	3b39d93784	x86: Implement x86_32_early_logical_apicid() for summit_32 Factor out logical apic id calculation from summit_init_apic_ldr() and use it for the x86_32_early_logical_apicid() callback. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-11-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:07 +01:00
Tejun Heo	12bf24a47c	x86: Implement x86_32_early_logical_apicid() for bigsmp_32 Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-10-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:07 +01:00
Tejun Heo	3f6f679888	x86: Implement the default x86_32_early_logical_apicid() Implement x86_32_early_logical_apicid() for the default apic flat routing. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-9-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:07 +01:00
Tejun Heo	acb8bc09c6	x86: Add apic->x86_32_early_logical_apicid() On x86_32, the mapping between cpu and logical apic ID differs depending on the specific apic implementation in use. The mapping is initialized while bringing up CPUs; however, this makes early inits ignore memory topology. Add a x86_32 specific apic->x86_32_early_logical_apicid() which is called early during boot to query the mapping. The mapping is later verified against the result of init_apic_ldr(). The method is allowed to return BAD_APICID if it can't be determined early. noop variant which always returns BAD_APICID is implemented and added to all x86_32 apic implementations. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-8-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:06 +01:00
Tejun Heo	7632611f53	x86: Kill apic->cpu_to_logical_apicid() After the previous patch, apic->cpu_to_logical_apicid() is no longer used. Kill it. For apic types with custom cpu_to_logical_apicid() which is also used for other purposes, remove the function and modify its users to do the mapping directly. #ifdef's on CONFIG_SMP in es7000_32 and summit_32 are ignored during conversion as they are not used for UP kernels. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-7-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:06 +01:00
Tejun Heo	6f802c4bfa	x86: Always use x86_cpu_to_logical_apicid for cpu -> logical apic id Currently, cpu -> logical apic id translation is done by apic->cpu_to_logical_apicid() callback which may or may not use x86_cpu_to_logical_apicid. This is unnecessary as it should always equal logical_smp_processor_id() which is known early during CPU bring up. Initialize x86_cpu_to_logical_apicid after apic->init_apic_ldr() in setup_local_APIC() and always use x86_cpu_to_logical_apicid for cpu -> logical apic id mapping. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-6-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:05 +01:00
Tejun Heo	4c321ff8a0	x86: Replace cpu_2_logical_apicid[] with early percpu variable Unlike x86_64, on x86_32, the mapping from cpu to logical apicid may vary depending on apic in use. cpu_2_logical_apicid[] array is used for this mapping. Replace it with early percpu variable x86_cpu_to_logical_apicid to make it better aligned with other mappings. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: penberg@kernel.org Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-5-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:05 +01:00
Tejun Heo	1245e1668c	x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit only Both functions are used only in 32bit. Put them inside CONFIG_X86_32. This is to prepare for logical apicid handling update. - Cyrill Gorcunov spotted that I forgot to move declarations in ipi.h under CONFIG_X86_32. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: eric.dumazet@gmail.com Cc: brgerst@gmail.com Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-4-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:05 +01:00
Tejun Heo	b78aa66b1f	x86: Drop x86_32 MAX_APICID Commit `56d91f13` (x86, acpi: Add MAX_LOCAL_APIC for 32bit) added MAX_LOCAL_APIC for x86_32 but didn't replace MAX_APICID users with it. Convert MAX_APICID users to MAX_LOCAL_APIC and drop MAX_APICID. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-3-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:04 +01:00
Tejun Heo	bd22a2f198	x86: Kill unused static boot_cpu_logical_apicid in smpboot.c Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: Yinghai Lu <yinghai@kernel.org> Cc: eric.dumazet@gmail.com Cc: yinghai@kernel.org Cc: brgerst@gmail.com Cc: gorcunov@gmail.com Cc: shaohui.zheng@intel.com Cc: rientjes@google.com LKML-Reference: <1295789862-25482-2-git-send-email-tj@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-28 14:54:04 +01:00
Yinghai Lu	dda9911696	x86, perf: Change two init functions to static init_hw_perf_events() is called via early_initcall now. x86_pmu_event_init is x86_pmu member function. So we can change them to static. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> LKML-Reference: <4D3A16F9.109@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-27 19:23:07 +01:00
Stephane Eranian	d038b12c6d	perf: Fix Pentium4 raw event validation This patch fixes some issues with raw event validation on Pentium 4 (Netburst) based processors. As I was testing libpfm4 Netburst support, I ran into two problems in the p4_validate_raw_event() function: - the shared field must be checked ONLY when HT is on - the binding to ESCR register was missing The second item was causing raw events to not be encoded correctly compared to generic PMU events. With this patch, I can now pass Netburst events to libpfm4 examples and get meaningful results: $ task -e global_power_events🏃u noploop 1 noploop for 1 seconds 3,206,304,898 global_power_events:running Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: peterz@infradead.org Cc: paulus@samba.org Cc: davem@davemloft.net Cc: fweisbec@gmail.com Cc: perfmon2-devel@lists.sf.net Cc: eranian@gmail.com Cc: robert.richter@amd.com Cc: acme@redhat.com Cc: gorcunov@gmail.com Cc: ming.m.lin@intel.com LKML-Reference: <4d3efb2f.1252d80a.1a80.ffffc83f@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-27 19:21:53 +01:00
Yinghai Lu	792363d2be	x86: Don't copy per_cpu cpuinfo for BSP two times smp_store_cpu_info(0) will do that. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: Borislav Petkov <bp@alien8.de> LKML-Reference: <4D3A16F2.5090902@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-26 08:44:50 +01:00
Yinghai Lu	b3d7336db5	x86: Move llc_shared_map out of cpu_info cpu_info is already with per_cpu, We can take llc_shared_map out of cpu_info, and declare it as per_cpu variable directly. So later referencing could be simple and directly instead of diving to find cpu_info at first. Also could make smp_store_cpu_info() much simple to avoid to do save and restore trick. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Alok N Kataria <akataria@vmware.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Hans J. Koch <hjk@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> LKML-Reference: <4D3A16E8.5020608@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-26 08:44:49 +01:00
Hans Rosenfeld	41b2610c34	x86, amd: Extend AMD northbridge caching code to support "Link Control" devices "Link Control" devices (NB function 4) will be used by L3 cache partitioning on family 0x15. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: <andreas.herrmann3@amd.com> LKML-Reference: <1295881543-572552-4-git-send-email-hans.rosenfeld@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-26 08:28:23 +01:00
Hans Rosenfeld	b453de02b7	x86, amd: Enable L3 cache index disable on family 0x15 AMD family 0x15 CPUs support L3 cache index disable, so enable it on them. Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: <andreas.herrmann3@amd.com> LKML-Reference: <1295881543-572552-3-git-send-email-hans.rosenfeld@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-26 08:28:23 +01:00
Andreas Herrmann	d518573de6	x86, amd: Normalize compute unit IDs on multi-node processors On multi-node CPUs we don't need the socket wide compute unit ID but the node-wide compute unit ID. Thus we need to normalize the value. This is similar to what we do with cpu_core_id. A compute unit is then identified by physical_package_id, node_id, and compute_unit_id. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <1295881543-572552-2-git-send-email-hans.rosenfeld@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-26 08:28:22 +01:00
Fenghua Yu	9599ec0471	x86-64, mem: Convert memmove() to assembly file and fix return value bug memmove_64.c only implements memmove() function which is completely written in inline assembly code. Therefore it doesn't make sense to keep the assembly code in .c file. Currently memmove() doesn't store return value to rax. This may cause issue if caller uses the return value. The patch fixes this issue. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> LKML-Reference: <1295314755-6625-1-git-send-email-fenghua.yu@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-01-25 16:58:39 -08:00
Jesper Juhl	2e5aa6824d	x86-64: Don't use pointer to out-of-scope variable in dump_trace() In arch/x86/kernel/dumpstack_64.c::dump_trace() we have this code: ... if (!stack) { unsigned long dummy; stack = &dummy; if (task && task != current) stack = (unsigned long )task->thread.sp; } bp = stack_frame(task, regs); / * Print function call entries in all stacks, starting at the * current stack address. If the stacks consist of nested * exceptions / tinfo = task_thread_info(task); for (;;) { char id; unsigned long *estack_end; estack_end = in_exception_stack(cpu, (unsigned long)stack, &used, &id); ... You'll notice that we assign to 'stack' the address of the variable 'dummy' which is only in-scope inside the 'if (!stack)'. So when we later access stack (at the end of the above, and assuming we did not take the 'if (task && task != current)' branch) we'll be using the address of a variable that is no longer in scope. I believe this patch is the proper fix, but I freely admit that I'm not 100% certain. Signed-off-by: Jesper Juhl <jj@chaosbits.net> LKML-Reference: <alpine.LNX.2.00.1101242232590.10252@swampdragon.chaosbits.net> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-01-24 13:46:15 -08:00
Borislav Petkov	93789b32db	x86, hotplug: Fix powersavings with offlined cores on AMD `ea53069231` made a CPU use monitor/mwait when offline. This is not the optimal choice for AMD wrt to powersavings and we'd prefer our cores to halt (i.e. enter C1) instead. For this, the same selection whether to use monitor/mwait has to be used as when we select the idle routine for the machine. With this patch, offlining cores 1-5 on a X6 machine allows core0 to boost again. [ hpa: putting this in urgent since it is a (power) regression fix ] Reported-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: stable@kernel.org # 37.x Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.hl> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <1295534572-10730-1-git-send-email-bp@amd64.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-01-21 18:14:54 -08:00
Fenghua Yu	f21bbec9ff	x86, mcheck, therm_throt.c: Export symbol platform_thermal_notify to allow coretemp to handler intr In therm_throt.c, commit `9e76a97efd` patch doesn't export the symbol platform_thermal_notify. Other drivers (e.g. drivers/hwmon/coretemp.c) can not find the symbol platform_thermal_notify when defining threshould interrupt handler. Please apply this patch to allow threshold interrupt handler in coretemp. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Cc: R Durgadoss <durgadoss.r@intel.com> Cc: khali@linux-fr.org <khali@linux-fr.org> Cc: lm-sensors@lm-sensors.org <lm-sensors@lm-sensors.org> Cc: Guenter Roeck <guenter.roeck@ericsson.com> LKML-Reference: <20110121041239.GB26954@linux-os.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-21 14:11:12 +01:00
Dave Jones	fb87ec382f	x86: Update CPU cache attributes table descriptors Update to latest definitions in: http://www.intel.com/Assets/PDF/appnote/241618.pdf [ Note, this update of the doc has removed some old values which we have listed. I think until we have clarification that they were never used in production, they should be left there. ] Signed-off-by: Dave Jones <davej@redhat.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> LKML-Reference: <20110120012055.GA15985@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-20 12:13:20 +01:00
Ingo Molnar	6b35eb9ddc	Revert "x86: Make relocatable kernel work with new binutils" This reverts commit `86b1e8dd83` ("x86: Make relocatable kernel work with new binutils"). Markus Trippelsdorf reported a boot failure caused by this patch. The real solution to the original patch will likely involve an arch-generic solution to define an overlaid jiffies_64 and jiffies variables. Until that's done and tested on all architectures revert this commit to solve the regression. Reported-and-bisected-by: Markus Trippelsdorf <markus@trippelsdorf.de> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Shaohua Li <shaohua.li@intel.com> Cc: "Lu, Hongjiu" <hongjiu.lu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>, Cc: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <4D36A759.60704@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-19 10:09:42 +01:00
Linus Torvalds	404cbbd52f	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Clear irqstack thread_info x86: Make relocatable kernel work with new binutils	2011-01-18 14:29:21 -08:00
Brian Gerst	7b698ea377	x86: Clear irqstack thread_info Mathias Merz reported that v2.6.37 failed to boot on his system. Make sure that the thread_info part of the irqstack is initialized to zeroes. Reported-and-Tested-by: Matthias Merz <linux@merz-ka.de> Signed-off-by: Brian Gerst <brgerst@gmail.com> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <AANLkTimyKXfJ1x8tgwrr1hYnNLrPfgE1NTe4z7L6tUDm@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-18 14:58:37 +01:00
Shaohua Li	86b1e8dd83	x86: Make relocatable kernel work with new binutils The CONFIG_RELOCATABLE=y option is broken with new binutils, which will make boot panic. According to Lu Hongjiu, the affected binutils are from 2.20.51.0.12 to 2.21.51.0.3, which are release since Oct 22 this year. At least ubuntu 10.10 is using such binutils. See: http://sourceware.org/bugzilla/show_bug.cgi?id=12327 The reason of the boot panic is that we have 'jiffies = jiffies_64;' in vmlinux.lds.S. The jiffies isn't in any section. In kernel build, there is warning saying jiffies is an absolute address and can't be relocatable. At runtime, jiffies will have virtual address 0. Signed-off-by: Shaohua Li<shaohua.li@intel.com> Cc: Lu Hongjiu<hongjiu.lu@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Sam Ravnborg <sam@ravnborg.org> LKML-Reference: <1295312269.1949.725.camel@sli10-conroe> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-18 09:05:33 +01:00
Linus Torvalds	f9ee7f60d6	Merge branches 'core-fixes-for-linus', 'x86-fixes-for-linus', 'timers-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: avoid pointless blocked-task warnings rcu: demote SRCU_SYNCHRONIZE_DELAY from kernel-parameter status rtmutex: Fix comment about why new_owner can be NULL in wake_futex_pi() * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, olpc: Add missing Kconfig dependencies x86, mrst: Set correct APB timer IRQ affinity for secondary cpu x86: tsc: Fix calibration refinement conditionals to avoid divide by zero x86, ia64, acpi: Clean up x86-ism in drivers/acpi/numa.c * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timekeeping: Make local variables static time: Rename misnamed minsec argument of clocks_calc_mult_shift() * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Remove syscall_exit_fields tracing: Only process module tracepoints once perf record: Add "nodelay" mode, disabled by default perf sched: Fix list of events, dropping unsupported ':r' modifier Revert "perf tools: Emit clearer message for sys_perf_event_open ENOENT return" perf top: Fix annotate segv perf evsel: Fix order of event list deletion	2011-01-15 12:45:00 -08:00
Jacob Pan	6550904ddb	x86, mrst: Set correct APB timer IRQ affinity for secondary cpu Offlining the secondary CPU causes the timer irq affinity to be set to CPU 0. When the secondary CPU is back online again, the wrong irq affinity will be used. This patch ensures secondary per CPU timer always has the correct IRQ affinity when enabled. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> LKML-Reference: <1294963604-18111-1-git-send-email-jacob.jun.pan@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@kernel.org> 2.6.37	2011-01-14 11:53:44 -08:00
John Stultz	62627bec8a	x86: tsc: Fix calibration refinement conditionals to avoid divide by zero Konrad Wilk reported that the new delayed calibration crashes with a divide by zero on Xen. The reason is that Xen sets the pmtimer address, but reading from it returns 0xffffff. That results in the ref_start and ref_stop value being the same, so the delta is zero which causes the divide by zero later in the calculation. The conditional (!hpet && !ref_start && !ref_stop) which sanity checks the calibration reference values doesn't really make sense. If the refs are null, but hpet is on, we still want to break out. The div by zero would be possible to trigger by chance if both reads from the hardware provided the exact same value (due to hardware wrapping). So checking if both the ref values are the same should handle if we don't have hardware (both null) or if they are the same value (either by invalid hardware, or by chance), avoiding the div by zero issue. [ tglx: Applied the same fix to native_calibrate_tsc() where this check was copied from ] Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: John Stultz <johnstul@us.ibm.com> LKML-Reference: <1295024788-15619-1-git-send-email-johnstul@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-01-14 18:28:01 +01:00
Linus Torvalds	52cfd503ad	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (59 commits) ACPI / PM: Fix build problems for !CONFIG_ACPI related to NVS rework ACPI: fix resource check message ACPI / Battery: Update information on info notification and resume ACPI: Drop device flag wake_capable ACPI: Always check if _PRW is present before trying to evaluate it ACPI / PM: Check status of power resources under mutexes ACPI / PM: Rename acpi_power_off_device() ACPI / PM: Drop acpi_power_nocheck ACPI / PM: Drop acpi_bus_get_power() Platform / x86: Make fujitsu_laptop use acpi_bus_update_power() ACPI / Fan: Rework the handling of power resources ACPI / PM: Register power resource devices as soon as they are needed ACPI / PM: Register acpi_power_driver early ACPI / PM: Add function for updating device power state consistently ACPI / PM: Add function for device power state initialization ACPI / PM: Introduce __acpi_bus_get_power() ACPI / PM: Introduce function for refcounting device power resources ACPI / PM: Add functions for manipulating lists of power resources ACPI / PM: Prevent acpi_power_get_inferred_state() from making changes ACPICA: Update version to 20101209 ...	2011-01-13 20:15:35 -08:00
Linus Torvalds	dc8e7e3ec6	Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6 * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6: cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle events from the cpuidle layer intel_idle: open broadcast clock event cpuidle: CPUIDLE_FLAG_CHECK_BM is omap3_idle specific cpuidle: CPUIDLE_FLAG_TLB_FLUSHED is specific to intel_idle cpuidle: delete unused CPUIDLE_FLAG_SHALLOW, BALANCED, DEEP definitions SH, cpuidle: delete use of NOP CPUIDLE_FLAGS_SHALLOW cpuidle: delete NOP CPUIDLE_FLAG_POLL ACPI: processor_idle: delete use of NOP CPUIDLE_FLAGs cpuidle: Rename X86 specific idle poll state[0] from C0 to POLL ACPI, intel_idle: Cleanup idle= internal variables cpuidle: Make cpuidle_enable_device() call poll_idle_init() intel_idle: update Sandy Bridge core C-state residency targets	2011-01-13 20:15:18 -08:00
Andrea Arcangeli	bae9c19bf1	thp: split_huge_page_mm/vma split_huge_page_pmd compat code. Each one of those would need to be expanded to hundred of lines of complex code without a fully reliable split_huge_page_pmd design. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:41 -08:00
Andrea Arcangeli	8ac1f8320a	thp: pte alloc trans splitting pte alloc routines must wait for split_huge_page if the pmd is not present and not null (i.e. pmd_trans_splitting). The additional branches are optimized away at compile time by pmd_trans_splitting if the config option is off. However we must pass the vma down in order to know the anon_vma lock to wait for. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:40 -08:00
Andrea Arcangeli	331127f799	thp: add pmd paravirt ops Paravirt ops pmd_update/pmd_update_defer/pmd_set_at. Not all might be necessary (vmware needs pmd_update, Xen needs set_pmd_at, nobody needs pmd_update_defer), but this is to keep full simmetry with pte paravirt ops, which looks cleaner and simpler from a common code POV. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:39 -08:00
David Rientjes	d0a21265df	mm: unify module_alloc code for vmalloc Four architectures (arm, mips, sparc, x86) use __vmalloc_area() for module_init(). Much of the code is duplicated and can be generalized in a globally accessible function, __vmalloc_node_range(). __vmalloc_node() now calls into __vmalloc_node_range() with a range of [VMALLOC_START, VMALLOC_END) for functionally equivalent behavior. Each architecture may then use __vmalloc_node_range() directly to remove the duplication of code. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 17:32:34 -08:00
Linus Torvalds	e691d24e9c	Merge branch 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-olpc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, olpc: Speed up device tree creation during boot x86, olpc: Add OLPC device-tree support x86, of: Define irq functions to allow drivers/of/* to build on x86	2011-01-13 10:15:12 -08:00
Linus Torvalds	55065bc527	Merge branch 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm * 'kvm-updates/2.6.38' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (142 commits) KVM: Initialize fpu state in preemptible context KVM: VMX: when entering real mode align segment base to 16 bytes KVM: MMU: handle 'map_writable' in set_spte() function KVM: MMU: audit: allow audit more guests at the same time KVM: Fetch guest cr3 from hardware on demand KVM: Replace reads of vcpu->arch.cr3 by an accessor KVM: MMU: only write protect mappings at pagetable level KVM: VMX: Correct asm constraint in vmcs_load()/vmcs_clear() KVM: MMU: Initialize base_role for tdp mmus KVM: VMX: Optimize atomic EFER load KVM: VMX: Add definitions for more vm entry/exit control bits KVM: SVM: copy instruction bytes from VMCB KVM: SVM: implement enhanced INVLPG intercept KVM: SVM: enhance mov DR intercept handler KVM: SVM: enhance MOV CR intercept handler KVM: SVM: add new SVM feature bit names KVM: cleanup emulate_instruction KVM: move complete_insn_gp() into x86.c KVM: x86: fix CR8 handling KVM guest: Fix kvm clock initialization when it's configured out ...	2011-01-13 10:14:24 -08:00
Linus Torvalds	008d23e485	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.	2011-01-13 10:05:56 -08:00
Stephen Hemminger	3e5c12409c	set_rtc_mmss: show warning message only once Occasionally the system gets into a state where the CMOS clock has gotten slightly ahead of current time and the periodic update of RTC fails. The message is a nuisance and repeats spamming the log. See: http://www.ntp.org/ntpfaq/NTP-s-trbl-spec.htm#Q-LINUX-SET-RTC-MMSS Rather than just removing the message, make it show only once and reduce severity since it indicates a normal and non urgent condition. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 08:03:07 -08:00
Len Brown	43952886f0	Merge branch 'cpuidle-perf-events' into idle-test	2011-01-12 18:06:19 -05:00
Len Brown	56dbed129d	Merge branch 'linus' into idle-test	2011-01-12 18:06:06 -05:00
Thomas Renninger	f77cfe4ea2	cpuidle/x86/perf: fix power:cpu_idle double end events and throw cpu_idle events from the cpuidle layer Currently intel_idle and acpi_idle driver show double cpu_idle "exit idle" events -> this patch fixes it and makes cpu_idle events throwing less complex. It also introduces cpu_idle events for all architectures which use the cpuidle subsystem, namely: - arch/arm/mach-at91/cpuidle.c - arch/arm/mach-davinci/cpuidle.c - arch/arm/mach-kirkwood/cpuidle.c - arch/arm/mach-omap2/cpuidle34xx.c - arch/drivers/acpi/processor_idle.c (for all cases, not only mwait) - arch/x86/kernel/process.c (did throw events before, but was a mess) - drivers/idle/intel_idle.c (did throw events before) Convention should be: Fire cpu_idle events inside the current pm_idle function (not somewhere down the the callee tree) to keep things easy. Current possible pm_idle functions in X86: c1e_idle, poll_idle, cpuidle_idle_call, mwait_idle, default_idle -> this is really easy is now. This affects userspace: The type field of the cpu_idle power event can now direclty get mapped to: /sys/devices/system/cpu/cpuX/cpuidle/stateX/{name,desc,usage,time,...} instead of throwing very CPU/mwait specific values. This change is not visible for the intel_idle driver. For the acpi_idle driver it should only be visible if the vendor misses out C-states in his BIOS. Another (perf timechart) patch reads out cpuidle info of cpu_idle events from: /sys/.../cpuidle/stateX/*, then the cpuidle events are mapped to the correct C-/cpuidle state again, even if e.g. vendors miss out C-states in their BIOS and for example only export C1 and C3. -> everything is fine. Signed-off-by: Thomas Renninger <trenn@suse.de> CC: Robert Schoene <robert.schoene@tu-dresden.de> CC: Jean Pihet <j-pihet@ti.com> CC: Arjan van de Ven <arjan@linux.intel.com> CC: Ingo Molnar <mingo@elte.hu> CC: Frederic Weisbecker <fweisbec@gmail.com> CC: linux-pm@lists.linux-foundation.org CC: linux-acpi@vger.kernel.org CC: linux-kernel@vger.kernel.org CC: linux-perf-users@vger.kernel.org CC: linux-omap@vger.kernel.org Signed-off-by: Len Brown <len.brown@intel.com>	2011-01-12 18:05:16 -05:00
Thomas Renninger	d18960494f	ACPI, intel_idle: Cleanup idle= internal variables Having four variables for the same thing: idle_halt, idle_nomwait, force_mwait and boot_option_idle_overrides is rather confusing and unnecessary complex. if idle= boot param is passed, only set up one variable: boot_option_idle_overrides Introduces following functional changes/fixes: - intel_idle driver does not register if any idle=xy boot param is passed. - processor_idle.c will also not register a cpuidle driver and get active if idle=halt is passed. Before a cpuidle driver with one (C1, halt) state got registered Now the default_idle function will be used which finally uses the same idle call to enter sleep state (safe_halt()), but without registering a whole cpuidle driver. That means idle= param will always avoid cpuidle drivers to register with one exception (same behavior as before): idle=nomwait may still register acpi_idle cpuidle driver, but C1 will not use mwait, but hlt. This can be a workaround for IO based deeper sleep states where C1 mwait causes problems. Signed-off-by: Thomas Renninger <trenn@suse.de> cc: x86@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>	2011-01-12 12:47:30 -05:00
Avi Kivity	e5c3014282	KVM: Initialize fpu state in preemptible context init_fpu() (which is indirectly called by the fpu switching code) assumes it is in process context. Rather than makeing init_fpu() use an atomic allocation, which can cause a task to be killed, make sure the fpu is already initialized when we enter the run loop. KVM-Stable-Tag. Reported-and-tested-by: Kirill A. Shutemov <kas@openvz.org> Acked-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 12:02:26 +02:00
Len Brown	03b6e6e58d	Merge branch 'apei' into release	2011-01-12 05:02:22 -05:00
Avi Kivity	a63512a4d7	KVM guest: Fix kvm clock initialization when it's configured out Signed-off-by: Avi Kivity <avi@redhat.com>	2011-01-12 11:30:56 +02:00
Gleb Natapov	6adba52742	KVM: Let host know whether the guest can handle async PF in non-userspace context. If guest can detect that it runs in non-preemptable context it can handle async PFs at any time, so let host know that it can send async PF even if guest cpu is not in userspace. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:21 +02:00
Gleb Natapov	6c047cd982	KVM paravirt: Handle async PF in non preemptable context If async page fault is received by idle task or when preemp_count is not zero guest cannot reschedule, so do sti; hlt and wait for page to be ready. vcpu can still process interrupts while it waits for the page to be ready. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:19 +02:00
Gleb Natapov	631bc48782	KVM: Handle async PF in a guest. When async PF capability is detected hook up special page fault handler that will handle async page fault events and bypass other page faults to regular page fault handler. Also add async PF handling to nested SVM emulation. Async PF always generates exit to L1 where vcpu thread will be scheduled out until page is available. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:16 +02:00
Gleb Natapov	fd10cde929	KVM paravirt: Add async PF initialization to PV guest. Enable async PF in a guest if async PF capability is discovered. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:14 +02:00
Gleb Natapov	ca3f10172e	KVM paravirt: Move kvm_smp_prepare_boot_cpu() from kvmclock.c to kvm.c. Async PF also needs to hook into smp_prepare_boot_cpu so move the hook into generic code. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2011-01-12 11:23:10 +02:00
Huang Ying	81e88fdc43	ACPI, APEI, Generic Hardware Error Source POLL/IRQ/NMI notification type support Generic Hardware Error Source provides a way to report platform hardware errors (such as that from chipset). It works in so called "Firmware First" mode, that is, hardware errors are reported to firmware firstly, then reported to Linux by firmware. This way, some non-standard hardware error registers or non-standard hardware link can be checked by firmware to produce more valuable hardware error information for Linux. This patch adds POLL/IRQ/NMI notification types support. Because the memory area used to transfer hardware error information from BIOS to Linux can be determined only in NMI, IRQ or timer handler, but general ioremap can not be used in atomic context, so a special version of atomic ioremap is implemented for that. Known issue: - Error information can not be printed for recoverable errors notified via NMI, because printk is not NMI-safe. Will fix this via delay printing to IRQ context via irq_work or make printk NMI-safe. v2: - adjust printk format per comments. Signed-off-by: Huang Ying <ying.huang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>	2011-01-12 03:06:19 -05:00
Linus Torvalds	16ee8db6a9	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix Moorestown VRTC fixmap placement x86/gpio: Implement x86 gpio_to_irq convert function x86, UV: Fix APICID shift for Westmere processors x86: Use PCI method for enabling AMD extended config space before MSR method x86: tsc: Prevent delayed init if initial tsc calibration failed x86, lapic-timer: Increase the max_delta to 31 bits x86: Fix sparse non-ANSI function warnings in smpboot.c x86, numa: Fix CONFIG_DEBUG_PER_CPU_MAPS without NUMA emulation x86, AMD, PCI: Add AMD northbridge PCI device id for CPU families 12h and 14h x86, numa: Fix cpu to node mapping for sparse node ids x86, numa: Fake node-to-cpumask for NUMA emulation x86, numa: Fake apicid and pxm mappings for NUMA emulation x86, numa: Avoid compiling NUMA emulation functions without CONFIG_NUMA_EMU x86, numa: Reduce minimum fake node size to 32M Fix up trivial conflict in arch/x86/kernel/apic/x2apic_uv_x.c	2011-01-11 11:11:46 -08:00
Linus Torvalds	42776163e1	Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (28 commits) perf session: Fix infinite loop in __perf_session__process_events perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1) perf sched: Use PTHREAD_STACK_MIN to avoid pthread_attr_setstacksize() fail perf tools: Emit clearer message for sys_perf_event_open ENOENT return perf stat: better error message for unsupported events perf sched: Fix allocation result check perf, x86: P4 PMU - Fix unflagged overflows handling dynamic debug: Fix build issue with older gcc tracing: Fix TRACE_EVENT power tracepoint creation tracing: Fix preempt count leak tracepoint: Add __rcu annotation tracing: remove duplicate null-pointer check in skb tracepoint tracing/trivial: Add missing comma in TRACE_EVENT comment tracing: Include module.h in define_trace.h x86: Save rbp in pt_regs on irq entry x86, dumpstack: Fix unused variable warning x86, NMI: Clean-up default_do_nmi() x86, NMI: Allow NMI reason io port (0x61) to be processed on any CPU x86, NMI: Remove DIE_NMI_IPI x86, NMI: Add priorities to handlers ...	2011-01-11 11:02:13 -08:00
Jack Steiner	990a32d1e5	x86, UV: Fix APICID shift for Westmere processors Westmere processors use a different algorithm for assigning APICIDs on SGI UV systems. The location of the node number within the apicid is now a function of the processor type. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20110110195210.GA18737@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-11 12:44:45 +01:00
Jan Beulich	24d9b70b8c	x86: Use PCI method for enabling AMD extended config space before MSR method While both methods should work equivalently well for the native case, the Xen Dom0 case can't reliably work with the MSR one, since there's no guarantee that the virtual CPUs it has available fully cover all necessary physical ones. As per the suggestion of Robert Richter the patch only adds the PCI method, but leaves the MSR one as a fallback to cover new systems the PCI IDs of which may not have got added to the code base yet. The only change in v2 is the breaking out of the new CPI initialization method into a separate function, as requested by Ingo. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Robert Richter <robert.richter@amd.com> Cc: Andreas Herrmann3 <Andreas.Herrmann3@amd.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> LKML-Reference: <4D2B3FD7020000780002B67D@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-11 12:43:41 +01:00
Thomas Gleixner	29fe359ca2	x86: tsc: Prevent delayed init if initial tsc calibration failed commit `a8760ec` (x86: Check tsc available/disabled in the delayed init function) missed to prevent the setup of the delayed init function in case the initial tsc calibration failed. This results in the same divide by zero bug as we have seen without the tsc disabled check. Skip the delayed work setup when tsc_khz (the initial calibration value) is 0. Bisected-and-tested-by: Kirill A. Shutemov <kas@openvz.org> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-01-11 11:48:39 +01:00
Linus Torvalds	8c8ae4e8cd	Merge branch 'stable/generic' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/generic' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: HVM X2APIC support apic: Move hypervisor detection of x2apic to hypervisor.h	2011-01-10 08:48:29 -08:00
Pierre Tardy	4aed89d6b5	x86, lapic-timer: Increase the max_delta to 31 bits Latest atom socs(penwell) does not have hpet timer. As their local APIC timer is clocked at 400KHZ, and the current code limit their Initial Counter register to 23 bits, they cannot sleep more than 1.34 seconds which leads to ~2 spurious wakeup per second (1 per thread) These SOCs support 32bit timer so we change the max_delta to at least 31bits. So we can at least sleep for 300 seconds. We could not find any previous chip errata where lapic would only have 23 bit precision As powertop is suggesting to activate HPET to "sleep longer", this could mean this problem is already known. Problem is here since very first implementation of lapic timer as a clock event `e9e2cdb` [PATCH] clockevents: i386 drivers. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Pierre Tardy <pierre.tardy@intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Adrian Bunk <bunk@stusta.de> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: john stultz <johnstul@us.ibm.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Andi Kleen <ak@suse.de> LKML-Reference: <1294327409-19426-1-git-send-email-pierre.tardy@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-10 09:36:49 +01:00
Ingo Molnar	30285c6f03	Merge branch 'x86/apic-cleanups' into x86/urgent Merge reason: Topic is ready for upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-10 09:34:50 +01:00
Ingo Molnar	4385428a47	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent	2011-01-09 10:42:21 +01:00
Cyrill Gorcunov	047a3772fe	perf, x86: P4 PMU - Fix unflagged overflows handling Don found that P4 PMU reads CCCR register instead of counter itself (in attempt to catch unflagged event) this makes P4 NMI handler to consume all NMIs it observes. So the other NMI users such as kgdb simply have no chance to get NMI on their hands. Side note: at moment there is no way to run nmi-watchdog together with perf tool. This is because both 'perf top' and nmi-watchdog use same event. So while nmi-watchdog reserves one event/counter for own needs there is no room for perf tool left (there is a way to disable nmi-watchdog on boot of course). Ming has tested this patch with the following results \| 1. watchdog disabled \| \| kgdb tests on boot OK \| perf works OK \| \| 2. watchdog enabled, without patch perf-x86-p4-nmi-4 \| \| kgdb tests on boot hang \| \| 3. watchdog enabled, without patch perf-x86-p4-nmi-4 and do not run kgdb \| tests on boot \| \| "perf top" partialy works \| cpu-cycles no \| instructions yes \| cache-references no \| cache-misses no \| branch-instructions no \| branch-misses yes \| bus-cycles no \| \| 4. watchdog enabled, with patch perf-x86-p4-nmi-4 applied \| \| kgdb tests on boot OK \| perf does not work, NMI "Dazed and confused" messages show up \| Which means we still have problems with p4 box due to 'unknown' nmi happens but at least it should fix kgdb test cases. Reported-by: Jason Wessel <jason.wessel@windriver.com> Reported-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Lin Ming <ming.m.lin@intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4D275E7E.3040903@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-09 10:40:52 +01:00
Randy Dunlap	91d88ce22b	x86: Fix sparse non-ANSI function warnings in smpboot.c Fix sparse warning for non-ANSI function declaration: arch/x86/kernel/smpboot.c💯30: warning: non-ANSI function declaration of function 'cpu_hotplug_driver_lock' arch/x86/kernel/smpboot.c:105:32: warning: non-ANSI function declaration of function 'cpu_hotplug_driver_unlock' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> LKML-Reference: <20110108195914.95d366ea.randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-09 10:15:19 +01:00
Linus Torvalds	72eb6a7914	Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits) gameport: use this_cpu_read instead of lookup x86: udelay: Use this_cpu_read to avoid address calculation x86: Use this_cpu_inc_return for nmi counter x86: Replace uses of current_cpu_data with this_cpu ops x86: Use this_cpu_ops to optimize code vmstat: User per cpu atomics to avoid interrupt disable / enable irq_work: Use per cpu atomics instead of regular atomics cpuops: Use cmpxchg for xchg to avoid lock semantics x86: this_cpu_cmpxchg and this_cpu_xchg operations percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support percpu,x86: relocate this_cpu_add_return() and friends connector: Use this_cpu operations xen: Use this_cpu_inc_return taskstats: Use this_cpu_ops random: Use this_cpu_inc_return fs: Use this_cpu_inc_return in buffer.c highmem: Use this_cpu_xx_return() operations vmstat: Use this_cpu_inc_return for vm statistics x86: Support for this_cpu_add, sub, dec, inc_return percpu: Generic support for this_cpu_add, sub, dec, inc_return ... Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c} as per Tejun.	2011-01-07 17:02:58 -08:00
Frederic Weisbecker	625dbc3b8a	x86: Save rbp in pt_regs on irq entry From the x86_64 low level interrupt handlers, the frame pointer is saved right after the partial pt_regs frame. rbp is not supposed to be part of the irq partial saved registers, but it only requires to extend the pt_regs frame by 8 bytes to do so, plus a tiny stack offset fixup on irq exit. This changes a bit the semantics or get_irq_entry() that is supposed to provide only the value of caller saved registers and the cpu saved frame. However it's a win for unwinders that can walk through stack frames on top of get_irq_regs() snapshots. A noticeable impact is that it makes perf events cpu-clock and task-clock events based callchains working on x86_64. Let's then save rbp into the irq pt_regs. As a result with: perf record -e cpu-clock perf bench sched messaging perf report --stdio Before: 20.94% perf [kernel.kallsyms] [k] lock_acquire \| --- lock_acquire \| \|--44.01%-- __write_nocancel \| \|--43.18%-- __read \| \|--6.08%-- fork \| create_worker \| \|--0.88%-- _dl_fixup \| \|--0.65%-- do_lookup_x \| \|--0.53%-- __GI___libc_read --4.67%-- [...] After: 19.23% perf [kernel.kallsyms] [k] __lock_acquire \| --- __lock_acquire \| \|--97.74%-- lock_acquire \| \| \| \|--21.82%-- _raw_spin_lock \| \| \| \| \| \|--37.26%-- unix_stream_recvmsg \| \| \| sock_aio_read \| \| \| do_sync_read \| \| \| vfs_read \| \| \| sys_read \| \| \| system_call \| \| \| __read \| \| \| \| \| \|--24.09%-- unix_stream_sendmsg \| \| \| sock_aio_write \| \| \| do_sync_write \| \| \| vfs_write \| \| \| sys_write \| \| \| system_call \| \| \| __write_nocancel v2: Fix cfi annotations. Reported-by: Soeren Sandmann Pedersen <sandmann@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: Jan Beulich <JBeulich@novell.com>	2011-01-07 17:40:56 +01:00
Rakib Mullick	39a6eebda2	x86, dumpstack: Fix unused variable warning In dump_stack function, bp isn't used anymore, which is introduced by commit `9c0729dc80`. This patch removes bp completely. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Soeren Sandmann <sandmann@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <AANLkTik9U_Z0WSZ7YjrykER_pBUfPDdgUUmtYx=R74nL@mail.gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2011-01-07 16:59:49 +01:00
Sheng Yang	2904ed8dd5	apic: Move hypervisor detection of x2apic to hypervisor.h Then we can reuse it for Xen later. Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Avi Kivity <avi@redhat.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-01-07 10:03:49 -05:00
Don Zickus	f2fd43954a	x86, NMI: Clean-up default_do_nmi() Just re-arrange the code a bit to make it easier to follow what is going on. Basically un-negating the if-statement and swapping the code inside the if-statement with code outside. No functional changes. Originally-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-7-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-07 15:08:53 +01:00
Don Zickus	ab846f13f6	x86, NMI: Allow NMI reason io port (0x61) to be processed on any CPU In original NMI handler, NMI reason io port (0x61) is only processed on BSP. This makes it impossible to hot-remove BSP. To solve the issue, a raw spinlock is used to allow the port to be processed on any CPU. Originally-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-6-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-07 15:08:53 +01:00
Don Zickus	c410b83077	x86, NMI: Remove DIE_NMI_IPI With priorities in place and no one really understanding the difference between DIE_NMI and DIE_NMI_IPI, just remove DIE_NMI_IPI and convert everyone to DIE_NMI. This also simplifies default_do_nmi() a little bit. Instead of calling the die_notifier in both the if and else part, just pull it out and call it before the if-statement. This has the side benefit of avoiding a call to the ioport to see if there is an external NMI sitting around until after the (more frequent) internal NMIs are dealt with. Patch-Inspired-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-5-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-07 15:08:53 +01:00
Don Zickus	166d751479	x86, NMI: Add priorities to handlers In order to consolidate the NMI die_chain events, we need to setup the priorities for the die notifiers. I started by defining a bunch of common priorities that can be used by the notifier blocks. Then I modified the notifier blocks to use the newly created priorities. Now that the priorities are straightened out, it should be easier to remove the event DIE_NMI_IPI. Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-4-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-07 15:08:52 +01:00
Don Zickus	673a6092ce	x86: Convert some devices to use DIE_NMIUNKNOWN They are a handful of places in the code that register a die_notifier as a catch all in case no claims the NMI. Unfortunately, they trigger on events like DIE_NMI and DIE_NMI_IPI, which depending on when they registered may collide with other handlers that have the ability to determine if the NMI is theirs or not. The function unknown_nmi_error() makes one last effort to walk the die_chain when no one else has claimed the NMI before spitting out messages that the NMI is unknown. This is a better spot for these devices to execute any code without colliding with the other handlers. The two drivers modified are only compiled on x86 arches I believe, so they shouldn't be affected by other arches that may not have DIE_NMIUNKNOWN defined. Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Russ Anderson <rja@sgi.com> Cc: Corey Minyard <minyard@acm.org> Cc: openipmi-developer@lists.sourceforge.net Cc: dann frazier <dannf@hp.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-07 15:08:52 +01:00
Huang Ying	1c7b74d46f	x86, NMI: Add NMI symbol constants and rename memory parity to PCI SERR Replace the NMI related magic numbers with symbol constants. Memory parity error is only valid for IBM PC-AT, newer machine use bit 7 (0x80) of 0x61 port for PCI SERR. While memory error is usually reported via MCE. So corresponding function name and kernel log string is changed. But on some machines, PCI SERR line is still used to report memory errors. This is used by EDAC, so corresponding EDAC call is reserved. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1294348732-15030-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-07 15:08:51 +01:00
Ingo Molnar	1c2a48cf65	Merge branch 'linus' into x86/apic-cleanups Conflicts: arch/x86/include/asm/io_apic.h Merge reason: Resolve the conflict, update to a more recent -rc base Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-07 14:14:15 +01:00
Rafael J. Wysocki	976513dbfc	PM / ACPI: Move NVS saving and restoring code to drivers/acpi The saving of the ACPI NVS area during hibernation and suspend and restoring it during the subsequent resume is entirely specific to ACPI, so move it to drivers/acpi and drop the CONFIG_SUSPEND_NVS configuration option which is redundant. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>	2011-01-07 00:36:55 -05:00
Linus Torvalds	cb600d2f83	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, mm: Initialize initial_page_table before paravirt jumps	2011-01-06 11:12:17 -08:00
Linus Torvalds	47935a731b	Merge branches 'x86-alternatives-for-linus', 'x86-fpu-for-linus', 'x86-hwmon-for-linus', 'x86-paravirt-for-linus', 'core-locking-for-linus' and 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, suspend: Avoid unnecessary smp alternatives switch during suspend/resume * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86-64, asm: Use fxsaveq/fxrestorq in more places * 'x86-hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hwmon: Add core threshold notification to therm_throt.c * 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, paravirt: Use native_halt on a halt, not native_safe_halt * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking, lockdep: Convert sprintf_symbol to %pS * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: Better struct irqaction layout	2011-01-06 11:11:50 -08:00
Linus Torvalds	77a0dd54ba	Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, UV, BAU: Extend for more than 16 cpus per socket x86, UV: Fix the effect of extra bits in the hub nodeid register x86, UV: Add common uv_early_read_mmr() function for reading MMRs	2011-01-06 11:09:57 -08:00
Linus Torvalds	d7a5a18190	Merge branch 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-tsc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Check tsc available/disabled in the delayed init function x86: Improve TSC calibration using a delayed workqueue x86: Make tsc=reliable override boot time stability checks	2011-01-06 11:08:14 -08:00
Linus Torvalds	4f00b901d4	Merge branch 'x86-security-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-security-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: module: Move RO/NX module protection to after ftrace module update x86: Resume trampoline must be executable x86: Add RO/NX protection for loadable kernel modules x86: Add NX protection for kernel data x86: Fix improper large page preservation	2011-01-06 11:07:33 -08:00
Linus Torvalds	b4c6e2ea5e	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, earlyprintk: Move mrst early console to platform/ and fix a typo x86, apbt: Setup affinity for apb timers acting as per-cpu timer ce4100: Add errata fixes for UART on CE4100 x86: platform: Move iris to x86/platform where it belongs x86, mrst: Check platform_device_register() return code x86/platform: Add Eurobraille/Iris power off support x86, mrst: Add explanation for using 1960 as the year offset for vrtc x86, mrst: Fix dependencies of "select INTEL_SCU_IPC" x86, mrst: The shutdown for MRST requires the SCU IPC mechanism x86: Ce4100: Add reboot_fixup() for CE4100 ce4100: Add PCI register emulation for CE4100 x86: Add CE4100 platform support x86: mrst: Set vRTC's IRQ to level trigger type x86: mrst: Add audio driver bindings rtc: Add drivers/rtc/rtc-mrst.c x86: mrst: Add vrtc driver which serves as a wall clock device x86: mrst: Add Moorestown specific reboot/shutdown support x86: mrst: Parse SFI timer table for all timer configs x86/mrst: Add SFI platform device parsing code	2011-01-06 11:06:31 -08:00
Linus Torvalds	6f46b120a9	Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, microcode, AMD: Cleanup code a bit x86, microcode, AMD: Replace vmalloc+memset with vzalloc	2011-01-06 11:06:09 -08:00
Linus Torvalds	4e1db5e58a	Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: apic, amd: Make firmware bug messages more meaningful mce, amd: Remove goto in threshold_create_device() mce, amd: Add helper functions to setup APIC mce, amd: Shorten local variables mci_misc_{hi,lo} mce, amd: Implement mce_threshold_block_init() helper function	2011-01-06 11:05:21 -08:00
Linus Torvalds	37d9a8c5ea	Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix included-by file reference comments x86, cpu: Only CPU features determine NX capabilities x86, cpu: Call verify_cpu during 32bit CPU startup x86, cpu: Clear XD_DISABLED flag on Intel to regain NX x86, cpu: Rename verify_cpu_64.S to verify_cpu.S	2011-01-06 10:56:02 -08:00
Linus Torvalds	017892c341	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation x86, acpi: Add MAX_LOCAL_APIC for 32bit x86: io_apic: Split setup_ioapic_ids_from_mpc() x86: io_apic: Fix CONFIG_X86_IO_APIC=n breakage x86: apic: Move probe_nr_irqs_gsi() into ioapic_init_mappings() x86: Allow platforms to force enable apic	2011-01-06 10:51:36 -08:00
Linus Torvalds	42cbd8efb0	Merge branch 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-amd-nb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, cacheinfo: Cleanup L3 cache index disable support x86, amd-nb: Cleanup AMD northbridge caching code x86, amd-nb: Complete the rename of AMD NB and related code	2011-01-06 10:50:28 -08:00
Huang Ying	74d91e3c6a	x86, NMI: Add touch_nmi_watchdog to io_check_error delay Prevent the long delay in io_check_error making NMI watchdog timeout. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Don Zickus <dzickus@redhat.com> LKML-Reference: <1294198689-15447-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-05 14:22:58 +01:00
Dongdong Deng	554ec06398	x86: Avoid calling arch_trigger_all_cpu_backtrace() at the same time The spin_lock_debug/rcu_cpu_stall detector uses trigger_all_cpu_backtrace() to dump cpu backtrace. Therefore it is possible that trigger_all_cpu_backtrace() could be called at the same time on different CPUs, which triggers and 'unknown reason NMI' warning. The following case illustrates the problem: CPU1 CPU2 ... CPU N trigger_all_cpu_backtrace() set "backtrace_mask" to cpu mask \| generate NMI interrupts generate NMI interrupts ... \ \| / \ \| / The "backtrace_mask" will be cleaned by the first NMI interrupt at nmi_watchdog_tick(), then the following NMI interrupts generated by other cpus's arch_trigger_all_cpu_backtrace() will be taken as unknown reason NMI interrupts. This patch uses a test_and_set to avoid the problem, and stop the arch_trigger_all_cpu_backtrace() from calling to avoid dumping a double cpu backtrace info when there is already a trigger_all_cpu_backtrace() in progress. Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com> Reviewed-by: Bruce Ashfield <bruce.ashfield@windriver.com> Cc: fweisbec@gmail.com LKML-Reference: <1294198689-15447-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Don Zickus <dzickus@redhat.com>	2011-01-05 14:22:57 +01:00
Don Zickus	9ab181fa9f	x86: Only call smp_processor_id in non-preempt cases There are some paths that walk the die_chain with preemption on. Make sure we are in an NMI call before we start doing anything. This was triggered by do_general_protection calling notify_die with DIE_GPF. Reported-by: Jan Kiszka <jan.kiszka@web.de> Signed-off-by: Don Zickus <dzickus@redhat.com> LKML-Reference: <1294198689-15447-1-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-05 14:22:57 +01:00
Yinghai Lu	cb2ded37fd	x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion Found one x2apic pre-enabled system, x2apic_mode suddenly get corrupted after register some cpus, when compiled CONFIG_NR_CPUS=255 instead of 512. It turns out that generic_processor_info() ==> phyid_set(apicid, phys_cpu_present_map) causes the problem. phys_cpu_present_map is sized by MAX_APICS bits, and pre-enabled system some cpus have an apic id > 255. The variable after phys_cpu_present_map may get corrupted silently: ffffffff828e8420 B phys_cpu_present_map ffffffff828e8440 B apic_verbosity ffffffff828e8444 B local_apic_timer_c2_ok ffffffff828e8448 B disable_apic ffffffff828e844c B x2apic_mode ffffffff828e8450 B x2apic_disabled ffffffff828e8454 B num_processors ... Actually phys_cpu_present_map is referenced via apic id, instead index. We should use MAX_LOCAL_APIC instead MAX_APICS. For 64-bit it will be 32768 in all cases. BSS will increase by 4k bytes on 64-bit: text data bss dec filename 21696943 4193748 12787712 38678403 vmlinux.before 21696943 4193748 12791808 38682499 vmlinux.after No change on 32bit. Finally we can remove MAX_APCIS that was rather confusing. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> LKML-Reference: <4D23BD9C.3070102@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-05 14:09:23 +01:00
Yinghai Lu	f005fe12b9	x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping() It is not related to init_memory_mapping(), and init_memory_mapping() is getting more bigger. So make it as seperated function and call it from reserve_brk() and that is point when _brk_end is concluded. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4D1933E0.7090305@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-01-05 14:03:45 +01:00
Rusty Russell	d50d8fe192	x86, mm: Initialize initial_page_table before paravirt jumps v2.6.36-rc8-54-gb40827f (x86-32, mm: Add an initial page table for core bootstrapping) made x86 boot using initial_page_table and broke lguest. For 2.6.37 we simply cut & paste the initialization code into lguest (`da32dac101` "lguest: populate initial_page_table"), now we fix it properly by doing that initialization before the paravirt jump. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: lguest <lguest@ozlabs.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <201101041720.54535.rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-04 09:53:50 +01:00
Ingo Molnar	bc030d6cb9	Merge commit 'v2.6.37-rc8' into x86/apic Conflicts: arch/x86/include/asm/io_apic.h Merge reason: move to a fresh -rc, resolve the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-04 09:43:42 +01:00
Thomas Renninger	25e41933b5	perf: Clean up power events by introducing new, more generic ones Add these new power trace events: power:cpu_idle power:cpu_frequency power:machine_suspend The old C-state/idle accounting events: power:power_start power:power_end Have now a replacement (but we are still keeping the old tracepoints for compatibility): power:cpu_idle and power:power_frequency is replaced with: power:cpu_frequency power:machine_suspend is newly introduced. Jean Pihet has a patch integrated into the generic layer (kernel/power/suspend.c) which will make use of it. the type= field got removed from both, it was never used and the type is differed by the event type itself. perf timechart userspace tool gets adjusted in a separate patch. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Jean Pihet <jean.pihet@newoldbits.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rjw@sisk.pl LKML-Reference: <1294073445-14812-3-git-send-email-trenn@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1290072314-31155-2-git-send-email-trenn@suse.de>	2011-01-04 08:16:54 +01:00
Ingo Molnar	cc22219699	Merge commit 'v2.6.37-rc8' into perf/core Merge reason: pick up latest -rc. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-04 08:08:54 +01:00
R, Durgadoss	9e76a97efd	x86, hwmon: Add core threshold notification to therm_throt.c This patch adds code to therm_throt.c to notify core thermal threshold events. These thresholds are supported by the IA32_THERM_INTERRUPT register. The status/log for the same is monitored using the IA32_THERM_STATUS register. The necessary #defines are in msr-index.h. A call back is added to mce.h, to further notify the thermal stack, about the threshold events. Signed-off-by: Durgadoss R <durgadoss.r@intel.com> LKML-Reference: <D6D887BA8C9DFF48B5233887EF04654105C1251710@bgsmsx502.gar.corp.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2011-01-03 08:30:30 -08:00
Tejun Heo	c1955b5f3a	x86: Use this_cpu_inc_return for nmi counter this_cpu_inc_return() saves us a memory access there. Reviewed-by: Pekka Enberg <penberg@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2010-12-30 12:22:17 +01:00
Tejun Heo	7b543a5334	x86: Replace uses of current_cpu_data with this_cpu ops Replace all uses of current_cpu_data with this_cpu operations on the per cpu structure cpu_info. The scala accesses are replaced with the matching this_cpu ops which results in smaller and more efficient code. In the long run, it might be a good idea to remove cpu_data() macro too and use per_cpu macro directly. tj: updated description Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2010-12-30 12:22:03 +01:00
Tejun Heo	0a3aee0da4	x86: Use this_cpu_ops to optimize code Go through x86 code and replace __get_cpu_var and get_cpu_var instances that refer to a scalar and are not used for address determinations. Cc: Yinghai Lu <yinghai@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2010-12-30 12:20:28 +01:00
Yinghai Lu	1411e0ec31	x86-64, numa: Put pgtable to local node memory Introduce init_memory_mapping_high(), and use it with 64bit. It will go with every memory segment above 4g to create page table to the memory range itself. before this patch all page tables was on one node. with this patch, one RED-PEN is killed debug out for 8 sockets system after patch [ 0.000000] initial memory mapped : 0 - 20000000 [ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff] [ 0.000000] 0000000000 - 007f600000 page 2M [ 0.000000] 007f600000 - 007f750000 page 4k [ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff] [ 0.000000] RAMDISK: 7bc84000 - 7f745000 .... [ 0.000000] Adding active range (0, 0x10, 0x95) 0 entries of 3200 used [ 0.000000] Adding active range (0, 0x100, 0x7f750) 1 entries of 3200 used [ 0.000000] Adding active range (0, 0x100000, 0x1080000) 2 entries of 3200 used [ 0.000000] Adding active range (1, 0x1080000, 0x2080000) 3 entries of 3200 used [ 0.000000] Adding active range (2, 0x2080000, 0x3080000) 4 entries of 3200 used [ 0.000000] Adding active range (3, 0x3080000, 0x4080000) 5 entries of 3200 used [ 0.000000] Adding active range (4, 0x4080000, 0x5080000) 6 entries of 3200 used [ 0.000000] Adding active range (5, 0x5080000, 0x6080000) 7 entries of 3200 used [ 0.000000] Adding active range (6, 0x6080000, 0x7080000) 8 entries of 3200 used [ 0.000000] Adding active range (7, 0x7080000, 0x8080000) 9 entries of 3200 used [ 0.000000] init_memory_mapping: [0x00000100000000-0x0000107fffffff] [ 0.000000] 0100000000 - 1080000000 page 2M [ 0.000000] kernel direct mapping tables up to 1080000000 @ [0x107ffbd000-0x107fffffff] [ 0.000000] memblock_x86_reserve_range: [0x107ffc2000-0x107fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00001080000000-0x0000207fffffff] [ 0.000000] 1080000000 - 2080000000 page 2M [ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff] [ 0.000000] memblock_x86_reserve_range: [0x207ffc0000-0x207fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00002080000000-0x0000307fffffff] [ 0.000000] 2080000000 - 3080000000 page 2M [ 0.000000] kernel direct mapping tables up to 3080000000 @ [0x307ff3d000-0x307fffffff] [ 0.000000] memblock_x86_reserve_range: [0x307ffc0000-0x307fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00003080000000-0x0000407fffffff] [ 0.000000] 3080000000 - 4080000000 page 2M [ 0.000000] kernel direct mapping tables up to 4080000000 @ [0x407fefd000-0x407fffffff] [ 0.000000] memblock_x86_reserve_range: [0x407ffc0000-0x407fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00004080000000-0x0000507fffffff] [ 0.000000] 4080000000 - 5080000000 page 2M [ 0.000000] kernel direct mapping tables up to 5080000000 @ [0x507febd000-0x507fffffff] [ 0.000000] memblock_x86_reserve_range: [0x507ffc0000-0x507fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00005080000000-0x0000607fffffff] [ 0.000000] 5080000000 - 6080000000 page 2M [ 0.000000] kernel direct mapping tables up to 6080000000 @ [0x607fe7d000-0x607fffffff] [ 0.000000] memblock_x86_reserve_range: [0x607ffc0000-0x607fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00006080000000-0x0000707fffffff] [ 0.000000] 6080000000 - 7080000000 page 2M [ 0.000000] kernel direct mapping tables up to 7080000000 @ [0x707fe3d000-0x707fffffff] [ 0.000000] memblock_x86_reserve_range: [0x707ffc0000-0x707fffffff] PGTABLE [ 0.000000] init_memory_mapping: [0x00007080000000-0x0000807fffffff] [ 0.000000] 7080000000 - 8080000000 page 2M [ 0.000000] kernel direct mapping tables up to 8080000000 @ [0x807fdfc000-0x807fffffff] [ 0.000000] memblock_x86_reserve_range: [0x807ffbf000-0x807fffffff] PGTABLE [ 0.000000] Initmem setup node 0 [0000000000000000-000000107fffffff] [ 0.000000] NODE_DATA [0x0000107ffbd000-0x0000107ffc1fff] [ 0.000000] Initmem setup node 1 [0000001080000000-000000207fffffff] [ 0.000000] NODE_DATA [0x0000207ffbb000-0x0000207ffbffff] [ 0.000000] Initmem setup node 2 [0000002080000000-000000307fffffff] [ 0.000000] NODE_DATA [0x0000307ffbb000-0x0000307ffbffff] [ 0.000000] Initmem setup node 3 [0000003080000000-000000407fffffff] [ 0.000000] NODE_DATA [0x0000407ffbb000-0x0000407ffbffff] [ 0.000000] Initmem setup node 4 [0000004080000000-000000507fffffff] [ 0.000000] NODE_DATA [0x0000507ffbb000-0x0000507ffbffff] [ 0.000000] Initmem setup node 5 [0000005080000000-000000607fffffff] [ 0.000000] NODE_DATA [0x0000607ffbb000-0x0000607ffbffff] [ 0.000000] Initmem setup node 6 [0000006080000000-000000707fffffff] [ 0.000000] NODE_DATA [0x0000707ffbb000-0x0000707ffbffff] [ 0.000000] Initmem setup node 7 [0000007080000000-000000807fffffff] [ 0.000000] NODE_DATA [0x0000807ffba000-0x0000807ffbefff] Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4D1933D1.9020609@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-12-29 15:48:08 -08:00
Yinghai Lu	45635ab5e4	x86: Change get_max_mapped() to inline Move it into head file. to prepare use it in other files. [ hpa: added missing <linux/types.h> and changed type to phys_addr_t. ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4D1933BA.8000508@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-12-29 15:47:55 -08:00
Yinghai Lu	32e3f2b00c	x86-64, gart: Fix allocation with memblock When trying to change alloc_bootmem with memblock to go with real top-down Found one old system: [ 0.000000] Node 0: aperture @ ac000000 size 64 MB [ 0.000000] Aperture pointing to e820 RAM. Ignoring. [ 0.000000] Your BIOS doesn't leave a aperture memory hole [ 0.000000] Please enable the IOMMU option in the BIOS setup [ 0.000000] This costs you 64 MB of RAM [ 0.000000] memblock_x86_reserve_range: [0x2020000000-0x2023ffffff] aperture64 [ 0.000000] Cannot allocate aperture memory hole (ffff882020000000,65536K) [ 0.000000] memblock_x86_free_range: [0x2020000000-0x2023ffffff] [ 0.000000] Kernel panic - not syncing: Not enough memory for aperture [ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.37-rc5-tip-yh-06229-gb792dc2-dirty #331 [ 0.000000] Call Trace: [ 0.000000] [<ffffffff81cf50fe>] ? panic+0x91/0x1a3 [ 0.000000] [<ffffffff827c66b2>] ? gart_iommu_hole_init+0x3d7/0x4a3 [ 0.000000] [<ffffffff81d026a9>] ? _etext+0x0/0x3 [ 0.000000] [<ffffffff827ba940>] ? pci_iommu_alloc+0x47/0x71 [ 0.000000] [<ffffffff827c820b>] ? mem_init+0x19/0xec [ 0.000000] [<ffffffff827b3c40>] ? start_kernel+0x20a/0x3e8 [ 0.000000] [<ffffffff827b32cc>] ? x86_64_start_reservations+0x9c/0xa0 [ 0.000000] [<ffffffff827b33e4>] ? x86_64_start_kernel+0x114/0x11b it means __alloc_bootmem_nopanic() get too high for that aperture. Use memblock_find_in_range() with limit directly. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4D0C0740.90104@kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2010-12-29 14:46:54 -08:00
Jesper Juhl	5cdd2de0a7	x86/microcode: Fix double vfree() and remove redundant pointer checks before vfree() In arch/x86/kernel/microcode_intel.c::generic_load_microcode() we have this: while (leftover) { ... if (get_ucode_data(mc, ucode_ptr, mc_size) \|\| microcode_sanity_check(mc) < 0) { vfree(mc); break; } ... } if (mc) vfree(mc); This will cause a double free of 'mc'. This patch fixes that by just removing the vfree() call in the loop since 'mc' will be freed nicely just after we break out of the loop. There's also a second change in the patch. I noticed a lot of checks for pointers being NULL before passing them to vfree(). That's completely redundant since vfree() deals gracefully with being passed a NULL pointer. Removing the redundant checks yields a nice size decrease for the object file. Size before the patch: text data bss dec hex filename 4578 240 1032 5850 16da arch/x86/kernel/microcode_intel.o Size after the patch: text data bss dec hex filename 4489 240 984 5713 1651 arch/x86/kernel/microcode_intel.o Signed-off-by: Jesper Juhl <jj@chaosbits.net> Acked-by: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Cc: Shaohua Li <shaohua.li@intel.com> LKML-Reference: <alpine.LNX.2.00.1012251946100.10759@swampdragon.chaosbits.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-27 14:33:30 +01:00

1 2 3 4 5 ...

7273 Commits