Commit Graph

1023 Commits

Author SHA1 Message Date
Linus Torvalds
bb7762961d Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
  x86: fix system without memory on node0
  x86, mm: Fix node_possible_map logic
  mm, x86: remove MEMORY_HOTPLUG_RESERVE related code
  x86: make sparse mem work in non-NUMA mode
  x86: process.c, remove useless headers
  x86: merge process.c a bit
  x86: use sparse_memory_present_with_active_regions() on UMA
  x86: unify 64-bit UMA and NUMA paging_init()
  x86: Allow 1MB of slack between the e820 map and SRAT, not 4GB
  x86: Sanity check the e820 against the SRAT table using e820 map only
  x86: clean up and and print out initial max_pfn_mapped
  x86/pci: remove rounding quirk from e820_setup_gap()
  x86, e820, pci: reserve extra free space near end of RAM
  x86: fix typo in address space documentation
  x86: 46 bit physical address support on 64 bits
  x86, mm: fault.c, use printk_once() in is_errata93()
  x86: move per-cpu mmu_gathers to mm/init.c
  x86: move max_pfn_mapped and max_low_pfn_mapped to setup.c
  x86: unify noexec handling
  x86: remove (null) in /sys kernel_page_tables
  ...
2009-06-10 16:13:20 -07:00
Linus Torvalds
7dc3ca39cb Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, nmi: Use predefined numbers instead of hardcoded one
  x86: asm/processor.h: remove double declaration
  x86, mtrr: replace MTRRdefType_MSR with msr-index's MSR_MTRRdefType
  x86, mtrr: replace MTRRfix4K_C0000_MSR with msr-index's MSR_MTRRfix4K_C0000
  x86, mtrr: remove mtrr MSRs double declaration
  x86, mtrr: replace MTRRfix16K_80000_MSR with msr-index's MSR_MTRRfix16K_80000
  x86, mtrr: replace MTRRfix64K_00000_MSR with msr-index's MSR_MTRRfix64K_00000
  x86, mtrr: replace MTRRcap_MSR with msr-index's MSR_MTRRcap
  x86: mce: remove duplicated #include
  x86: msr-index.h remove duplicate MSR C001_0015 declaration
  x86: clean up arch/x86/kernel/tsc_sync.c a bit
  x86: use symbolic name for VM86_SIGNAL when used as vm86 default return
  x86: added 'ifndef _ASM_X86_IOMAP_H' to iomap.h
  x86: avoid multiple declaration of kstack_depth_to_print
  x86: vdso/vma.c declare vdso_enabled and arch_setup_additional_pages before they get used
  x86: clean up declarations and variables
  x86: apic/x2apic_cluster.c x86_cpu_to_logical_apicid should be static
  x86 early quirks: eliminate unused function
2009-06-10 15:49:36 -07:00
Mel Gorman
32b154c0b0 x86: ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13302

On x86 and x86-64, it is possible that page tables are shared beween
shared mappings backed by hugetlbfs.  As part of this,
page_table_shareable() checks a pair of vma->vm_flags and they must match
if they are to be shared.  All VMA flags are taken into account, including
VM_LOCKED.

The problem is that VM_LOCKED is cleared on fork().  When a process with a
shared memory segment forks() to exec() a helper, there will be shared
VMAs with different flags.  The impact is that the shared segment is
sometimes considered shareable and other times not, depending on what
process is checking.

What happens is that the segment page tables are being shared but the
count is inaccurate depending on the ordering of events.  As the page
tables are freed with put_page(), bad pmd's are found when some of the
children exit.  The hugepage counters also get corrupted and the Total and
Free count will no longer match even when all the hugepage-backed regions
are freed.  This requires a reboot of the machine to "fix".

This patch addresses the problem by comparing all flags except VM_LOCKED
when deciding if pagetables should be shared or not for hugetlbfs-backed
mapping.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <starlight@binnacle.cx>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-05-29 08:40:03 -07:00
Pallipadi, Venkatesh
2171787be2 x86: avoid back to back on_each_cpu in cpa_flush_array
Cleanup cpa_flush_array() to avoid back to back on_each_cpu() calls.

[ Impact: optimizes fix 0af48f42df ]

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-26 13:12:12 -07:00
venkatesh.pallipadi@intel.com
0af48f42df x86: cpa_flush_array wbinvd should be done on all CPUs
cpa_flush_array seems to prefer wbinvd() over clflush at 4M threshold.
clflush needs to be done on only one CPU as per instruction definition.
wbinvd() however, should be done on all CPUs.

[ Impact: fix missing flush which could cause data corruption ]

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-22 13:33:59 -07:00
venkatesh.pallipadi@intel.com
0b827537e3 x86: bugfix wbinvd() model check instead of family check
wbinvd is supported on all CPUs 486 or later. But,
pageattr.c is checking x86_model >= 4 before wbinvd(), which looks like
an oversight bug. It was first introduced at one place by changeset
d7c8f21a8c and got copied over to second
place in the same file later.

[ Impact: fix missing cache flush on early-model CPUs, potential data corruption ]

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-22 13:33:27 -07:00
Yinghai Lu
7c43769a97 x86, mm: Fix node_possible_map logic
Recently there were some changes to the meaning of node_possible_map,
and it is quite strange:

- the node without memory would be set in node_possible_map
- but some node with less NODE_MIN_SIZE will be kicked out of node_possible_map.

fix it by adding strict_setup_node_bootmem().

Also, remove unparse_node().

so result will be:

1. cpu_to_node() will return online node only (nearest one)
2. apicid_to_node() still returns the node that could be not online but is set
   in node_possible_map.
3. node_possible_map will include nodes that mem on it are less NODE_MIN_SIZE

v2: after move_cpus_to_node change.

[ Impact: get node_possible_map right ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <4A0C49BE.6080800@kernel.org>
[ v3: various small cleanups and comment clarifications ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 09:21:04 +02:00
Yinghai Lu
888a589f6b mm, x86: remove MEMORY_HOTPLUG_RESERVE related code
after:

 | commit b263295dbf
 | Author: Christoph Lameter <clameter@sgi.com>
 | Date:   Wed Jan 30 13:30:47 2008 +0100
 |
 |    x86: 64-bit, make sparsemem vmemmap the only memory model

we don't have MEMORY_HOTPLUG_RESERVE anymore.

Historically, x86-64 had an architecture-specific method for memory hotplug
whereby it scanned the SRAT for physical memory ranges that could be
potentially used for memory hot-add later. By reserving those ranges
without physical memory, the memmap would be allocated and left dormant
until needed. This depended on the DISCONTIG memory model which has been
removed so the code implementing HOTPLUG_RESERVE is now dead.

This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE.

(Changelog authored by Mel.)

v2: updated changelog, and remove hotadd= in doc

[ Impact: remove dead code ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: Mel Gorman <mel@csn.ul.ie>
Workflow-found-OK-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A0C4910.7090508@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 09:13:31 +02:00
Shaohua Li
ed077b58f6 x86: make sparse mem work in non-NUMA mode
With sparse memory, holes should not be marked present for memmap.
This patch makes sure sparsemem really works on SMP mode (!NUMA).

[ Impact: use less memory to map fragmented RAM, avoid boot-OOM/crash ]

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
LKML-Reference: <1242117600.22431.0.camel@sli10-desk.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-12 11:26:35 +02:00
Pekka Enberg
087fa4e964 x86: use sparse_memory_present_with_active_regions() on UMA
There's no need to use call memory_present() manually on UMA because
initmem_init() sets up early_node_map by calling
e820_register_active_regions().

[ Impact: cleanup ]

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <1241699742.17846.31.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 11:52:06 +02:00
Pekka Enberg
3551f88f64 x86: unify 64-bit UMA and NUMA paging_init()
64-bit UMA and NUMA versions of paging_init() are almost identical.
Therefore, merge the copy in mm/numa_64.c to mm/init_64.c to remove
duplicate code.

[ Impact: cleanup ]

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <1241699741.17846.30.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 11:52:06 +02:00
Yinghai Lu
0964b0562b x86: Allow 1MB of slack between the e820 map and SRAT, not 4GB
It is expected that there might be slight differences between the e820
map and the SRAT table and the intention was that 1MB of slack be allowed.

The calculation comparing e820ram and pxmram assumes the units are bytes,
when they are in fact pages. This means 4GB of slack is being allowed,
not 1MB. This patch makes the correct comparison.

comment is from Mel.

[ Impact: don't accept buggy SRATs that could dump up to 4G of RAM ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A03E13E.6050107@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 11:38:21 +02:00
Yinghai Lu
b37ab91907 x86: Sanity check the e820 against the SRAT table using e820 map only
node_cover_memory() sanity checks the SRAT table by ensuring that all
PXMs cover the memory reported in the e820.

However, when calculating the size of the holes in the e820, it uses
the early_node_map[] which contains information taken from both SRAT
and e820. If the SRAT is missing an entry, then it is not detected
that the SRAT table is incorrect and missing entries.

This patch uses the e820 map to calculate the holes instead of
early_node_map[].

comment is from Mel.

[ Impact: reject incorrect SRAT tables ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A03E10C.60906@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 11:35:07 +02:00
Yinghai Lu
80989ce064 x86: clean up and and print out initial max_pfn_mapped
Do this so we can check the range that is mapped before
init_memory_mapping().

To be able to print out meaningful info, we first have to fix
64-bit to have max_pfn_mapped assigned before that call. This
also unifies the code-path a bit.

[ Impact: print more debug info, cleanup ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49BF0978.40605@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 11:11:12 +02:00
Ingo Molnar
134cbf35c7 Merge commit 'v2.6.30-rc5' into x86/mm
Merge reason: this branch was on a .30-rc2 base - sync it up with
              all the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11 09:33:15 +02:00
Jan Beulich
4983439676 x86-64: finish cleanup_highmaps()'s job wrt. _brk_end
With the introduction of the .brk section, special care must be taken
that no unused page table entries remain if _brk_end and _end are
separated by a 2M page boundary. cleanup_highmap() runs very early and
hence cannot take care of that, hence potential entries needing to be
removed past _brk_end must be cleared once the brk allocator has done
its job.

[ Impact: avoids undesirable TLB aliases ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-05-07 21:51:34 -07:00
Nikanth Karthikesan
e0e5ea3268 x86: Fix a typo in a printk message
[ Impact: printk message cleanup ]

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
LKML-Reference: <200905040908.27299.knikanth@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06 12:23:12 +02:00
David Rientjes
7eccf7b227 x86, srat: do not register nodes beyond e820 map
The mem= option will truncate the memory map at a specified address so
it's not possible to register nodes with memory beyond the e820 upper
bound.

unparse_node() is only called when then node had memory associated with
it, although with the mem= option it is no longer addressable.

[ Impact: fix boot hang on certain (large) systems ]

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <alpine.DEB.2.00.0905051248150.20021@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-06 10:49:07 +02:00
Linus Torvalds
e91b3b2681 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tracing: x86, mmiotrace: fix range test
  tracing: fix ref count in splice pages
2009-05-05 12:08:02 -07:00
Ingo Molnar
a454ab3110 x86, mm: fault.c, use printk_once() in is_errata93()
Andrew pointed out that the 'once' variable has a needlessly
function-global scope. We can in fact eliminate it completely,
via the use of printk_once().

[ Impact: cleanup ]

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-03 10:09:03 +02:00
Pekka Enberg
9518e0e435 x86: move per-cpu mmu_gathers to mm/init.c
[ Impact: cleanup ]

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <1240923650.1982.22.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-30 10:12:37 +02:00
Pekka Enberg
2b72394e40 x86: move max_pfn_mapped and max_low_pfn_mapped to setup.c
This patch moves the max_pfn_mapped and max_low_pfn_mapped global
variables to kernel/setup.c where they're initialized.

[ Impact: cleanup ]

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
LKML-Reference: <1240923649.1982.21.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-30 10:12:36 +02:00
Stuart Bennett
33015c8599 tracing: x86, mmiotrace: fix range test
Matching on (addr == (p->addr + p->len)) causes problems when mappings
are adjacent.

[ Impact: fix mmiotrace confusion on adjacent iomaps ]

Signed-off-by: Stuart Bennett <stuart@freedesktop.org>
Acked-by: Pekka Paalanen <pq@iki.fi>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1240946271-7083-2-git-send-email-stuart@freedesktop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29 11:32:22 +02:00
Yinghai Lu
4c31e92b97 x86: check boundary in setup_node_bootmem()
Commit dc09855 ("x86/uv: fix init of memory-less nodes") causes a
two sockets system (where node-1 doesn't have RAM installed) to crash.

That commit makes node_possible include cpu nodes that do not have memory.
So check boundary in setup_node_bootmem().

[ Impact: fix boot crash on RAM-less NUMA node system ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jack Steiner <steiner@sgi.com>
LKML-Reference: <49EF89DF.9090404@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-23 09:58:56 +02:00
Pekka Enberg
89388913f2 x86: unify noexec handling
This patch unifies noexec handling on 32-bit and 64-bit.

[ Impact: cleanup ]

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
[ mingo@elte.hu: build fix ]
LKML-Reference: <1240303167.771.69.camel@penberg-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-21 10:48:08 +02:00
Ingo Molnar
8ecee4620e Merge branch 'linus' into x86/mm
Merge reason: refresh the topic: there's been 290 non-merges commits upstream
              to arch/x86 alone.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-21 10:46:54 +02:00
Ingo Molnar
62d1702909 Merge branch 'linus' into x86/urgent
Merge reason: We need the x86/uv updates from upstream, to queue up
              dependent fix.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-20 18:08:12 +02:00
Jaswinder Singh Rajput
a81b6314e0 x86: mm/numa_32.c calculate_numa_remap_pages should use __init
calculate_numa_remap_pages() is called only by __init initmem_init()
further calculate_numa_remap_pages is calling:
	__init find_e820_area() and __init reserve_early()

So calculate_numa_remap_pages() should be __init calculate_numa_remap_pages().

 WARNING: arch/x86/built-in.o(.text+0x82ea3): Section mismatch in reference from the function calculate_numa_remap_pages() to the function .init.text:find_e820_area()
 The function calculate_numa_remap_pages() references
 the function __init find_e820_area().
 This is often because calculate_numa_remap_pages lacks a __init
 annotation or the annotation of find_e820_area is wrong.

 WARNING: arch/x86/built-in.o(.text+0x82f5f): Section mismatch in reference from the function calculate_numa_remap_pages() to the function .init.text:reserve_early()
 The function calculate_numa_remap_pages() references
 the function __init reserve_early().
 This is often because calculate_numa_remap_pages lacks a __init
 annotation or the annotation of reserve_early is wrong.

[ Impact: save memory, address Section mismatch warning ]

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <1239991281.3153.4.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 22:43:13 +02:00
Jack Steiner
dc09855191 x86/uv: fix init of memory-less nodes
Add support for nodes that have cpus but no memory.
The current code was failing to add these nodes
to the nodes_present_map.

v2: Fixes case caught by David Rientjes - missed support
    for the x2apic SRAT table.

[ Impact: fix potential boot crash on memory-less UV nodes. ]

Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090417142242.GA23743@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 22:42:12 +02:00
Linus Torvalds
b9836e0837 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix microcode driver newly spewing warnings
  x86, PAT: Remove page granularity tracking for vm_insert_pfn maps
  x86: disable X86_PTRACE_BTS for now
  x86, documentation: kernel-parameters replace X86-32,X86-64 with X86
  x86: pci-swiotlb.c swiotlb_dma_ops should be static
  x86, PAT: Remove duplicate memtype reserve in devmem mmap
  x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype()
  x86, PAT: Changing memtype to WC ensuring no WB alias
  x86, PAT: Handle faults cleanly in set_memory_ APIs
  x86, PAT: Change order of cpa and free in set_memory_wb
  x86, CPA: Change idmap attribute before ioremap attribute setup
2009-04-17 09:56:11 -07:00
Pallipadi, Venkatesh
4b06504627 x86, PAT: Remove page granularity tracking for vm_insert_pfn maps
This change resolves the problem of too many single page entries
in pat_memtype_list and "freeing invalid memtype" errors with i915,
reported here:

  http://marc.info/?l=linux-kernel&m=123845244713183&w=2

Remove page level granularity track and untrack of vm_insert_pfn.
memtype tracking at page granularity does not scale and cleaner
approach would be for the driver to request a type for a bigger
IO address range or PCI io memory range for that device, either at
mmap time or driver init time and just use that type during
vm_insert_pfn.

This patch just removes the track/untrack of vm_insert_pfn. That
means we will be in same state as 2.6.28, with respect to these APIs.

Newer APIs for the drivers to request a memtype for a bigger region
is coming soon.

[ Impact: fix Xorg startup warnings and hangs ]

Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Tested-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <20090408223716.GC3493@linux-os.sc.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-17 00:44:22 +02:00
Yinghai Lu
6424fb3866 x86: remove (null) in /sys kernel_page_tables
Impact: cleanup

%p prints out 0x000000000000000 as (null)
so use %lx instead.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <49E43282.1090607@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-14 11:50:22 +02:00
Jaswinder Singh Rajput
2c1b284e4f x86: clean up declarations and variables
Impact: cleanup, no code changed

 - syscalls.h       update declarations due to unifications
 - irq.c            declare smp_generic_interrupt() before it gets used
 - process.c        declare sys_fork() and sys_vfork() before they get used
 - tsc.c            rename tsc_khz shadowed variable
 - apic/probe_32.c  declare apic_default before it gets used
 - apic/nmi.c       prev_nmi_count should be unsigned
 - apic/io_apic.c   declare smp_irq_move_cleanup_interrupt() before it gets used
 - mm/init.c        declare direct_gbpages and free_initrd_mem before they get used

Signed-off-by: Jaswinder Singh Rajput <jaswinder@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-12 15:20:16 +02:00
Marcin Slusarz
1ee4bd92a7 x86: fix wrong section of pat_disable & make it static
pat_disable cannot be __cpuinit anymore because it's called from pat_init
and the callchain looks like this:
pat_disable [cpuinit] <- pat_init <- generic_set_all <-
 ipi_handler <- set_mtrr <- (other non init/cpuinit functions)

WARNING: arch/x86/mm/built-in.o(.text+0x449e): Section mismatch in reference
from the function pat_init() to the function .cpuinit.text:pat_disable()
The function pat_init() references
the function __cpuinit pat_disable().
This is often because pat_init lacks a __cpuinit
annotation or the annotation of pat_disable is wrong.

Non CONFIG_X86_PAT version of pat_disable is static inline, so this version
can be static too (and there are no callers outside of this file).

Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <49DFB055.6070405@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-12 12:34:23 +02:00
Masami Hiramatsu
9b987aeb4a x86: fix set_fixmap to use phys_addr_t
Impact: fix kprobes crash on 32-bit with RAM above 4G

Use phys_addr_t for receiving a physical address argument
instead of unsigned long. This allows fixmap to handle
pages higher than 4GB on x86-32.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: systemtap-ml <systemtap@sources.redhat.com>
Cc: Gary Hade <garyhade@us.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <49DE3695.6040800@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 20:27:13 +02:00
Suresh Siddha
0c3c8a1836 x86, PAT: Remove duplicate memtype reserve in devmem mmap
/dev/mem mmap code was doing memtype reserve/free for a while now.
Recently we added memtype tracking in remap_pfn_range, and /dev/mem mmap
uses it indirectly. So, we don't need seperate tracking in /dev/mem code
any more. That means another ~100 lines of code removed :-).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20090409212709.085210000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:48 +02:00
Suresh Siddha
b6ff32d9aa x86, PAT: Consolidate code in pat_x_mtrr_type() and reserve_memtype()
Fix pat_x_mtrr_type() to use UC_MINUS when the mtrr type return UC. This
is to be  consistent with ioremap() and ioremap_nocache() which uses
UC_MINUS.

Consolidate the code such that reserve_memtype() also uses
pat_x_mtrr_type() when the caller doesn't specify any special attribute
(non WB attribute).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20090409212708.939936000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:48 +02:00
venkatesh.pallipadi@intel.com
3869c4aa18 x86, PAT: Changing memtype to WC ensuring no WB alias
As per SDM, there should not be any aliasing of a WC with any cacheable
type across CPUs. That is if one CPU is changing the identity map
memtype to _WC, no other CPU at the time of this change should not have a
TLB for this page that carries a WB attribute. SDM suggests to make the
page not present. But for that we will have to handle any page faults
that can potentially happen due to these pages being not present.

Other way to deal with this without having any WB mapping is to change
the page first to UC and then to WC. This ensures that we meet the SDM
requirement of no cacheable alais to WC page. This also has same or
lower overhead than marking the page not present and making it present
later.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20090409212708.797481000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:47 +02:00
venkatesh.pallipadi@intel.com
9fa3ab390a x86, PAT: Handle faults cleanly in set_memory_ APIs
Handle faults and do proper cleanups in set_memory_*() functions. In
some cases, these functions were not doing proper free on failure paths.

With the changes to tracking memtype of RAM pages in struct page instead
of pat list, we do not need the changes in commits c5e147. This patch
reverts that change.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20090409212708.653222000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:47 +02:00
venkatesh.pallipadi@intel.com
a5593e0b32 x86, PAT: Change order of cpa and free in set_memory_wb
To be free of aliasing due to races, set_memory_* interfaces should
follow ordering of reserving, changing memtype to UC/WC, changing
memtype back to WB followed by free.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20090409212708.512280000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:46 +02:00
Suresh Siddha
43a432b155 x86, CPA: Change idmap attribute before ioremap attribute setup
Change the identity mapping with the requested attribute first, before
we setup the virtual memory mapping with the new requested attribute.

This makes sure that there is no window when identity map'ed attribute
may disagree with ioremap range on the attribute type.

This also avoids doing cpa on the ioremap'ed address twice (first in
ioremap_page_range and then in ioremap_change_attr using vaddr), and
should improve ioremap performance a bit.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
LKML-Reference: <20090409212708.373330000@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:55:46 +02:00
Andy Grover
a0d22f485a x86: Document get_user_pages_fast()
While better than get_user_pages(), the usage of gupf(),
especially the return values and the fact that it can
potentially only partially pin the range, warranted some
documentation.

Signed-off-by: Andy Grover <andy.grover@oracle.com>
Cc: npiggin@suse.de
Cc: akpm@linux-foundation.org
LKML-Reference: <1239320729-3262-1-git-send-email-andy.grover@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-10 13:14:08 +02:00
Linus Torvalds
32fb6c1756 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (140 commits)
  ACPI: processor: use .notify method instead of installing handler directly
  ACPI: button: use .notify method instead of installing handler directly
  ACPI: support acpi_device_ops .notify methods
  toshiba-acpi: remove MAINTAINERS entry
  ACPI: battery: asynchronous init
  acer-wmi: Update copyright notice & documentation
  acer-wmi: Cleanup the failure cleanup handling
  acer-wmi: Blacklist Acer Aspire One
  video: build fix
  thinkpad-acpi: rework brightness support
  thinkpad-acpi: enhanced debugging messages for the fan subdriver
  thinkpad-acpi: enhanced debugging messages for the hotkey subdriver
  thinkpad-acpi: enhanced debugging messages for rfkill subdrivers
  thinkpad-acpi: restrict access to some firmware LEDs
  thinkpad-acpi: remove HKEY disable functionality
  thinkpad-acpi: add new debug helpers and warn of deprecated atts
  thinkpad-acpi: add missing log levels
  thinkpad-acpi: cleanup debug helpers
  thinkpad-acpi: documentation cleanup
  thinkpad-acpi: drop ibm-acpi alias
  ...
2009-04-05 11:16:25 -07:00
Linus Torvalds
714f83d5d9 Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
  tracing, net: fix net tree and tracing tree merge interaction
  tracing, powerpc: fix powerpc tree and tracing tree interaction
  ring-buffer: do not remove reader page from list on ring buffer free
  function-graph: allow unregistering twice
  trace: make argument 'mem' of trace_seq_putmem() const
  tracing: add missing 'extern' keywords to trace_output.h
  tracing: provide trace_seq_reserve()
  blktrace: print out BLK_TN_MESSAGE properly
  blktrace: extract duplidate code
  blktrace: fix memory leak when freeing struct blk_io_trace
  blktrace: fix blk_probes_ref chaos
  blktrace: make classic output more classic
  blktrace: fix off-by-one bug
  blktrace: fix the original blktrace
  blktrace: fix a race when creating blk_tree_root in debugfs
  blktrace: fix timestamp in binary output
  tracing, Text Edit Lock: cleanup
  tracing: filter fix for TRACE_EVENT_FORMAT events
  ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
  x86: kretprobe-booster interrupt emulation code fix
  ...

Fix up trivial conflicts in
 arch/parisc/include/asm/ftrace.h
 include/linux/memory.h
 kernel/extable.c
 kernel/module.c
2009-04-05 11:04:19 -07:00
Linus Torvalds
90975ef712 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
  cpumask: remove cpumask allocation from idle_balance, fix
  numa, cpumask: move numa_node_id default implementation to topology.h, fix
  cpumask: remove cpumask allocation from idle_balance
  x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
  x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
  x86: microcode: cleanup
  x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
  cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
  numa, cpumask: move numa_node_id default implementation to topology.h
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  cpumask: remove x86 cpumask_t uses.
  cpumask: use cpumask_var_t in uv_flush_tlb_others.
  cpumask: remove cpumask_t assignment from vector_allocation_domain()
  cpumask: make Xen use the new operators.
  cpumask: clean up summit's send_IPI functions
  cpumask: use new cpumask functions throughout x86
  x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
  cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
  cpumask: convert node_to_cpumask_map[] to cpumask_var_t
  x86: unify 32 and 64-bit node_to_cpumask_map
  ...
2009-04-05 10:33:07 -07:00
Len Brown
478c6a43fc Merge branch 'linus' into release
Conflicts:
	arch/x86/kernel/cpu/cpufreq/longhaul.c

Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-05 02:14:15 -04:00
Suresh Siddha
7237d3de78 x86, ACPI: add support for x2apic ACPI extensions
All logical processors with APIC ID values of 255 and greater will have their
APIC reported through Processor X2APIC structure (type-9 entry type) and all
logical processors with APIC ID less than 255 will have their APIC reported
through legacy Processor Local APIC (type-0 entry type) only. This is the
same case even for NMI structure reporting.
    
The Processor X2APIC Affinity structure provides the association between the
X2APIC ID of a logical processor and the proximity domain to which the logical
processor belongs.
    
For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device()
objects in the ACPI namespace.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-04-03 20:08:12 -04:00
Andrew Morton
6a491e2e3e x86: fix is_io_mapping_possible() build warning on i386 allnoconfig
i386 allnoconfig:

 arch/x86/mm/iomap_32.c: In function 'is_io_mapping_possible':
 arch/x86/mm/iomap_32.c:27: warning: comparison is always false due to limited range of data type

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-03 18:39:05 +02:00
Akinobu Mita
a7f8c50d90 x86, mm: fix misuse of debug_kmap_atomic
Impact: fix CONFIG_DEBUG_HIGHMEM=y breakage

Commit 7ca43e756 ("mm: use debug_kmap_atomic") introduced some
debug_kmap_atomic() calls in the wrong places.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20090402070126.GA3951@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 16:37:04 +02:00
Ingo Molnar
8302294f43 Merge branch 'tracing/core-v2' into tracing-for-linus
Conflicts:
	include/linux/slub_def.h
	lib/Kconfig.debug
	mm/slob.c
	mm/slub.c
2009-04-02 00:49:02 +02:00