Commit Graph

16599 Commits

Author SHA1 Message Date
Alexey Dobriyan
19c5870c0e Use helpers to obtain task pid in printks (arch code)
One of the easiest things to isolate is the pid printed in kernel log.
There was a patch, that made this for arch-independent code, this one makes
so for arch/xxx files.

It took some time to cross-compile it, but hopefully these are all the
printks in arch code.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:43 -07:00
Jiri Slaby
93043ece03 define global BIT macro
define global BIT macro

move all local BIT defines to the new globally define macro.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:42 -07:00
Jiri Slaby
1977f03272 remove asm/bitops.h includes
remove asm/bitops.h includes

including asm/bitops directly may cause compile errors. don't include it
and include linux/bitops instead. next patch will deny including asm header
directly.

Cc: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:41 -07:00
Pavel Emelyanov
b488893a39 pid namespaces: changes to show virtual ids to user
This is the largest patch in the set. Make all (I hope) the places where
the pid is shown to or get from user operate on the virtual pids.

The idea is:
 - all in-kernel data structures must store either struct pid itself
   or the pid's global nr, obtained with pid_nr() call;
 - when seeking the task from kernel code with the stored id one
   should use find_task_by_pid() call that works with global pids;
 - when showing pid's numerical value to the user the virtual one
   should be used, but however when one shows task's pid outside this
   task's namespace the global one is to be used;
 - when getting the pid from userspace one need to consider this as
   the virtual one and use appropriate task/pid-searching functions.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: nuther build fix]
[akpm@linux-foundation.org: yet nuther build fix]
[akpm@linux-foundation.org: remove unneeded casts]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Paul Menage <menage@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:40 -07:00
Serge E. Hallyn
b460cbc581 pid namespaces: define is_global_init() and is_container_init()
is_init() is an ambiguous name for the pid==1 check.  Split it into
is_global_init() and is_container_init().

A cgroup init has it's tsk->pid == 1.

A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns.  But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.

Changelog:

	2.6.22-rc4-mm2-pidns1:
	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
	  global init (/sbin/init) process. This would improve performance
	  and remove dependence on the task_pid().

	2.6.21-mm2-pidns2:

	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
	  This way, we kill only the cgroup if the cgroup's init has a
	  bug rather than force a kernel panic.

[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Acked-by: Pavel Emelianov <xemul@openvz.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Herbert Poetzel <herbert@13thfloor.at>
Cc: Kirill Korotaev <dev@sw.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:37 -07:00
Pavel Emelianov
a47afb0f9d pid namespaces: round up the API
The set of functions process_session, task_session, process_group and
task_pgrp is confusing, as the names can be mixed with each other when looking
at the code for a long time.

The proposals are to
* equip the functions that return the integer with _nr suffix to
  represent that fact,
* and to make all functions work with task (not process) by making
  the common prefix of the same name.

For monotony the routines signal_session() and set_signal_session() are
replaced with task_session_nr() and set_task_session(), especially since they
are only used with the explicit task->signal dereference.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Kirill Korotaev <dev@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Herbert Poetzl <herbert@13thfloor.at>
Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:37 -07:00
Paul Jackson
c2e2c7fa1c task cgroups: enable cgroups by default in some configs
In pre-cgroup cpusets, a few config files enabled cpusets by default.

Signed-off-by: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:36 -07:00
Coly Li
3ed75eb8f1 setup vma->vm_page_prot by vm_get_page_prot()
This patch uses vm_get_page_prot() to setup vma->vm_page_prot.

Though inside vm_get_page_prot() the protection flags is AND with
(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED), it does not hurt correct code.

Signed-off-by: Coly Li <coyli@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:34 -07:00
Fernando Luis Vázquez Cao
22124c9999 kmap leak fix for x86_32 kdump
copy_oldmem_page should not return leaving a page frame from the
previous kernel mapped.

Signed-off-by: Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp>
Acked-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:33 -07:00
Mike Travis
92cb7612ae x86: convert cpuinfo_x86 array to a per_cpu array
cpu_data is currently an array defined using NR_CPUS.  This means that
we overallocate since we will rarely really use maximum configured cpus.
When NR_CPU count is raised to 4096 the size of cpu_data becomes
3,145,728 bytes.

These changes were adopted from the sparc64 (and ia64) code.  An
additional field was added to cpuinfo_x86 to be a non-ambiguous cpu
index.  This corresponds to the index into a cpumask_t as well as the
per_cpu index.  It's used in various places like show_cpuinfo().

cpu_data is defined to be the boot_cpu_data structure for the NON-SMP
case.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:04 +02:00
Jan Blunck
f1df280f53 x86: introduce frame_pointer() and stack_pointer()
This patch defines frame_pointer() and stack_pointer() similar to the
already defined instruction_pointer(). Thus the oprofile code can be
written in a more readable fashion.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:04 +02:00
Stephane Eranian
124d395fd0 i386: do not BUG_ON() when MSR is unknown
Here is a small patch to change the behavior of the PMU msr allocator
to avoid BUG_ON() when the MSR is unknwon. Instead, it now returns
ok, which means "I do not manage". The current allocator is not
yet managing the full set of PMU registers (e.g., GLOBAL_* on Core 2).

[watchdog] do not BUG_ON() in the MSR allocator if MSR is unknown, return ok
instead

Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:04 +02:00
Mike Travis
71b31233a2 x86: acpi use cpu_physical_id
This is from an earlier message from Christoph Lameter:

    processor_core.c currently tries to determine the apicid by special casing
    for IA64 and x86. The desired information is readily available via

	    cpu_physical_id()

    on IA64, i386 and x86_64.

    Signed-off-by: Christoph Lameter <clameter@sgi.com>

Additionally, boot_cpu_id needed to be exported to fix compile errors in
dma code when !CONFIG_SMP.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-10-19 20:35:03 +02:00
Mike Travis
b6278470b7 x86: convert cpu_llc_id to be a per cpu variable
Convert cpu_llc_id from a static array sized by NR_CPUS to a per_cpu
variable. This saves sizeof(cpu_llc_id) * NR unused cpus.  Access is
mostly from startup and CPU HOTPLUG functions.

Note there's an additional change of the type of cpu_llc_id from int to
u8 for ARCH i386 to correspond with the same type in ARCH x86_64.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Mike Travis
71fff5e6ca x86: convert cpu_to_apicid to be a per cpu variable
This patch converts the x86_cpu_to_apicid array to be a per cpu
variable. This saves sizeof(apicid) * NR unused cpus.  Access is mostly
from startup and CPU HOTPLUG functions.

MP_processor_info() is one of the functions that require access to the
x86_cpu_to_apicid array before the per_cpu data area is setup.  For this
case, a pointer to the __initdata array is initialized in setup_arch()
and removed in smp_prepare_cpus() after the per_cpu data area is
initialized.

A second change is included to change the initial array value of ARCH
i386 from 0xff to BAD_APICID to be consistent with ARCH x86_64.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Rusty Russell
dbeb2be21d i386: introduce "used_vectors" bitmap which can be used to reserve vectors.
This simplifies the io_apic.c __assign_irq_vector() logic and removes
the explicit SYSCALL_VECTOR check, and also allows for vectors to be
reserved by other mechanisms (ie. lguest).

[ tglx: arch/x86 adaptation ]

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Andi Kleen
39743c9ef7 x86: use raw locks during oopses
Don't want any lockdep or other fragile machinery to run during oopses.
Use raw spinlocks directly for oops locking.
Also disables irq flag tracing there.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Jan Beulich
b1992df3f0 x86: honor _PAGE_PSE bit on page walks
[ tglx: arch/x86 adaptation ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Akinobu Mita
1f503e7743 i386: do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.
Do cpuid_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.

[ tglx: arch/x86 adaptation ]

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Laurent Vivier
66d16ed45d x86: implement missing x86_64 function smp_call_function_mask()
This patch defines the missing function smp_call_function_mask() for x86_64,
this is more or less a cut&paste of i386 function. It removes also some
duplicate code.

This function is needed by KVM to execute a function on some CPUs.

AK: Fixed description
AK: Moved WARN_ON(irqs_disabled) one level up to not warn in the panic case.
[ tglx: arch/x86 adaptation ]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Glauber de Oliveira Costa
9d1c6e7c86 x86: use descriptor's functions instead of inline assembly
This patch provides a new set of functions for managing the descriptor
tables that can be used instead of putting the raw assembly in .c files.

Remodeling of store_tr() suggested by Frederik Deweerdt.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Pavel Emelyanov
9d975ebda5 i386: consolidate show_regs and show_registers for i386
Both functions printk the same information, except for CRx and
debug registers in the show_registers() one and a bit different
manner. So move the common code into one place. This is already
done for x86_64, so I think it's worth having the same on i386.

This saves 100 bytes of .rodata section :) ...
   but only 8 from .text :(

[ tglx: arch/x86 adaptation ]

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Jan Blunck
574a60421c i386: make callgraph use dump_trace() on i386/x86_64
This patch improves oprofile callgraphs for i386/x86_64. The old
backtracing code was unable to produce kernel backtraces if the
kernel wasn't compiled with framepointers. The code now uses
dump_trace().

[ tglx: arch/x86 adaptation ]

Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Andi Kleen
9480626830 x86: enable iommu_merge by default
[ tglx: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Andi Kleen
54ef34009a x86: Unify i386 and x86-64 early quirks
They were already very similar; just use the same file now.

[ tglx: arch/x86 adaptation ]

Cc: lenb@kernel.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:03 +02:00
Udo A. Steinberg
158ad3260b x86: enable HPET on ICH3 and ICH4
ICH3 and ICH4 have undocumented HPET capabilities. This patch enables
HPET for platforms based around these ICHs.

Tested on various ICH3 and ICH4 platforms.

Because HPET is not officially documented for ICH3/4 and may not have
been validated by chipset folks, we're on thin ice here. I'd recommend
testing this patch in -hrt or -mm for a while and wait for
success/failure reports before feeding it upstream.

tglx: depends on the force_hpet command line option !

Signed-off-by: Udo A. Steinberg <us15@os.inf.tu-dresden.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Udo A. Steinberg
b196884e2f x86: force enable HPET on VT8235/8237 chipsets
This patch adds quirks to force enable HPET on Via VT8235 and
VT8237 chipsets. The datasheet for 8237 documents HPET
functionality (although wrongly) whereas HPET is undocumented
for 8235.

Tested on A7V880 (8237) and K7VT4A+ (8235) boards.

tglx: depends on the force_hept commandline option

Signed-off-by: Udo A. Steinberg <us15@os.inf.tu-dresden.de>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Thomas Gleixner
b17530bda2 x86: add force_hpet boot option
add force_hpet boot option.

(this will be useful to make the forced-enable quirks depend on.)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-19 20:35:02 +02:00
Thomas Gleixner
764922376c x86: quirk.c trivial coding style and white space cleanup
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-19 20:35:02 +02:00
Andi Kleen
b3f5dde15e x86: don't zero pad addresses in segfault message
don't zero pad addresses in segfault message. Matches the other trap
messages. This leaves some more space for the new file name.

[ mingo: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Roland McGrath
95d1b8f981 x86: Use linux/elfcore-compat.h
This makes x86-64's ia32 code use the new linux/elfcore-compat.h, reducing
some hand-copied duplication.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Hiroshi Shimamoto
7778887880 x86: merge init_task_32/64.c
Merge init_task_32/64.c.
Move 64bit per cpu data orig_ist to setup64.c.

[ mingo: fixed checkpatch trivialities. ]

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Andi Kleen
af93ebc0b3 x86: remove page_fault_trace
Old debugging code that is not really needed anymore. If someone
wants it it would be better replaced with a systemtap script or
kprobe.

This avoids a potential cache miss during page fault processing.

[ mingo: arch/x86 adaptation ]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Adrian Knoth
096708dcf7 Kconfig: Missing line breaks in arch/x86_64/Kconfig
The helptext for IA32_EMULATION in arch/x86_64/Kconfig is wider than 80
chars, thus failing to be displayed in 80x24 screens.

This patch re-breaks lines.


Signed-off-by: Adrian Knoth <adi@drcomp.erfurt.thur.de>
2007-10-19 20:35:02 +02:00
Thomas Gleixner
06b4f2a51c x86: move cpufreq Kconfigs to the same directory
Move the 64bit Kconfig file to arch/x86/kernel/cpu/cpufreq, so we
can unify them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Andres Salomon
0387f4511e GEODE: use symbolic constant in cs5536 reboot fixup
Simple cosmetic update for the cs5536 reboot fixup; define the MSR that's used
for rebooting in geode.h, and use the define.

Signed-off-by: Andres Salomon <dilinger@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-19 20:35:02 +02:00
Ingo Molnar
54ffaa45c5 x86: fix CONFIG_NUMA and nosmp | maxcpus=0/1 crash
x86 NUMA kernels crash in the scheduler setup code if "nosmp" or
"maxcpus=0" is passed on the boot command line:

| Brought up 1 CPUs
| BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
| printing eip: c011f0b5 *pde = 00000000
| Oops: 0000 [#1] SMP
|
| Pid: 1, comm: swapper Not tainted (2.6.23 #67)
| EIP: 0060:[<c011f0b5>] EFLAGS: 00010246 CPU: 0
| EIP is at sd_degenerate+0x35/0x40

the reason is sloppy spaghetti code in smpboot_32.c that resulted in a
missing map_cpu_to_logical_apicid() call - which also had the side-effect
of setting up the cpu_2_node[] entry for the lone CPU. That resulted in
node_to_cpumask(0) resulting in 00000000 - confusing the sched-domains
setup code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Sam Ravnborg
3ceba7815c x86: use relative symlink for bzImage
Use relative symlinks so we can refer bzImage from different tree
structures aka via NFS.

Moved the creation of directory + symlink after a successful build of
the boot parts.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-19 20:35:02 +02:00
Siddha, Suresh B
957ff882f9 x86, vsyscall: fix the oops crash with __pa_vsymbol()
Appended patch fixes an oops while changing the vsyscall sysctl.
I am sure no one tested this code before integrating into mainline :(

BTW, using ioremap() in vsyscall_sysctl_change() to get the virtual
address of a kernel symbol sounds like an over kill.. I wonder if we
can define a simple __va_vsymbol() which will return directly the
kernel direct mapping. comments in the code which says gcc has trouble
with __va(__pa()) sounds bogus to me. __pa() on a vsyscall address will
not work anyhow :(

And also, the whole nop out syscall in vsyscall page infrastructure
(vsyscall_sysctl_change()) is added to make some attacks difficult,
and yet I don't see this nop out being done by default. This area
requires more cleanups?

Fix an oops with __pa_vsymbol(). VSYSCALL_FIRST_PAGE is a fixmap index.
We want the starting virtual address of the vsyscall page and not the index.

[ mingo: arch/x86 adaptation ]

Reported-by: Yanmin Zhang <yanmin.zhang@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 20:35:02 +02:00
Thomas Gleixner
f322727b92 x86: update .gitignore entries
vdso / vsycall create .so.dbg files now.
Add *.so.dbg to the main .ignore file

Exclude the compile time created boot directory in arch/x86_64 as well

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-19 20:35:02 +02:00
Atsushi Nemoto
1d9ef3ecd7 [MIPS] Kill duplicated setup_irq() for cp0 timer
Also many plat_timer_setup() can be killed too.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-19 18:15:58 +01:00
Ralf Baechle
d527eef5b7 [MIPS] Sibyte: Finish conversion to modern time APIs.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-19 18:15:58 +01:00
Ralf Baechle
93c846f904 [MIPS] time: Helpers to compute clocksource/event shift and mult values.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-19 18:15:57 +01:00
Ralf Baechle
f887b93e17 [MIPS] SMTC: Build fix.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-19 18:15:57 +01:00
Ralf Baechle
9c9ad7917b [MIPS] time: Delete dead code.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-19 18:15:57 +01:00
Ralf Baechle
723ee050aa [MIPS] MIPSsim: Strip defconfig file to the bones.
MIPSsim simulates only a barebone system, no point in a fancy kernel.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-19 18:15:57 +01:00
Ingo Molnar
9a24d04a3c x86: fix global_flush_tlb() bug
While we were reviewing pageattr_32/64.c for unification,
Thomas Gleixner noticed the following serious SMP bug in
global_flush_tlb():

	down_read(&init_mm.mmap_sem);
	list_replace_init(&deferred_pages, &l);
	up_read(&init_mm.mmap_sem);

this is SMP-unsafe because list_replace_init() done on two CPUs in
parallel can corrupt the list.

This bug has been introduced about a year ago in the 64-bit tree:

       commit ea7322decb
       Author: Andi Kleen <ak@suse.de>
       Date:   Thu Dec 7 02:14:05 2006 +0100

       [PATCH] x86-64: Speed and clean up cache flushing in change_page_attr

                down_read(&init_mm.mmap_sem);
        -       dpage = xchg(&deferred_pages, NULL);
        +       list_replace_init(&deferred_pages, &l);
                up_read(&init_mm.mmap_sem);

the xchg() based version was SMP-safe, but list_replace_init() is not.
So this "cleanup" introduced a nasty bug.

why this bug never become prominent is a mystery - it can probably be
explained with the (still) relative obscurity of the x86_64 architecture.

the safe fix for now is to write-lock init_mm.mmap_sem.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-19 12:19:26 +02:00
Kyle McMartin
27db71a2f1 [PARISC] Fix palo target
Hunk missing from previous commit, oops.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 15:12:42 -07:00
Kyle McMartin
5feb4f39aa [PARISC] Restore palo target
Turns out, people were still using it, and it accidently works.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 15:09:59 -07:00
Linus Torvalds
32c15bb978 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  [MIPS] time: Move R4000 clockevent device code to separate configurable file
  [MIPS] time: Delete dead cycles_per_jiffy, mips_timer_ack and null_timer_ack
  [MIPS] IP32: Retire use of plat_timer_setup.
  [MIPS] Jazz: Retire use of plat_timer_setup.
  [MIPS] IP27: Convert to clock_event_device.
  [MIPS] JMR3927: Convert to clock_event_device.
  [MIPS] Always do the ARC64_TWIDDLE_PC thing.
2007-10-18 14:51:02 -07:00
Linus Torvalds
a57793651f Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (51 commits)
  [IPV6]: Fix again the fl6_sock_lookup() fixed locking
  [NETFILTER]: nf_conntrack_tcp: fix connection reopening fix
  [IPV6]: Fix race in ipv6_flowlabel_opt() when inserting two labels
  [IPV6]: Lost locking in fl6_sock_lookup
  [IPV6]: Lost locking when inserting a flowlabel in ipv6_fl_list
  [NETFILTER]: xt_sctp: fix mistake to pass a pointer where array is required
  [NET]: Fix OOPS due to missing check in dev_parse_header().
  [TCP]: Remove lost_retrans zero seqno special cases
  [NET]: fix carrier-on bug?
  [NET]: Fix uninitialised variable in ip_frag_reasm()
  [IPSEC]: Rename mode to outer_mode and add inner_mode
  [IPSEC]: Disallow combinations of RO and AH/ESP/IPCOMP
  [IPSEC]: Use the top IPv4 route's peer instead of the bottom
  [IPSEC]: Store afinfo pointer in xfrm_mode
  [IPSEC]: Add missing BEET checks
  [IPSEC]: Move type and mode map into xfrm_state.c
  [IPSEC]: Fix length check in xfrm_parse_spi
  [IPSEC]: Move ip_summed zapping out of xfrm6_rcv_spi
  [IPSEC]: Get nexthdr from caller in xfrm6_rcv_spi
  [IPSEC]: Move tunnel parsing for IPv4 out of xfrm4_input
  ...
2007-10-18 14:40:30 -07:00
Linus Torvalds
9cf52b2921 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC/64]: Consolidate of_register_driver
  [SPARC] Videopix Frame Grabber: Convert device_lock_sem to mutex
  [SPARC]: Support for new termios.
  [SPARC64]: Check of_get_property() return in pci_determine_mem_io_space().
  [SPARC64]: Fix boot failures due to bootmem.
  [SPARC64]: Implement atomic backoff.
2007-10-18 14:39:44 -07:00
Ralf Baechle
e8c44319c6 Replace __attribute_pure__ with __pure
To be consistent with the use of attributes in the rest of the kernel
replace all use of __attribute_pure__ with __pure and delete the definition
of __attribute_pure__.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Bryan Wu <bryan.wu@analog.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:32 -07:00
Stephen Hemminger
c80544dc0b sparse pointer use of zero as null
Get rid of sparse related warnings from places that use integer as NULL
pointer.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:31 -07:00
Satyam Sharma
38048983e1 x86 msr driver: Misc cpuinit annotations
msr_class_cpu_callback() can be marked __cpuinit, being the notifier callback
for a __cpuinitdata notifier_block.  So can be marked msr_device_create() too,
called only from the newly-__cpuinit msr_class_cpu_callback() or from
__init-marked msr_init().

Signed-off-by: Satyam Sharma <satyam@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:28 -07:00
Michael Neuling
4603ac180a powerpc: add scaled time accounting
This adds POWERPC specific hooks for scaled time accounting.

POWER6 includes a SPURR register.  The SPURR is based off the PURR register
but is scaled based on CPU frequency and issue rates.  This gives a more
accurate account of the instructions used per task.  The PURR and timebase
will be constant relative to the wall clock, irrespective of the CPU
frequency.

This implementation reads the SPURR register in account_system_vtime which
is only call called on context witch and hard and soft irq entry and exit.
The percentage of user and system time is then estimated using the ratio of
these accounted by the PURR.  If the SPURR is not present, the PURR read.

An earlier implementation of this patch read the SPURR whenever the PURR
was read, which included the system call entry and exit path.
Unfortunately this showed a performance regression on lmbench runs, so was
re-implemented.

I've included the lmbench results here when run bare metal on POWER6.  1st
column is the unpatch results.  2nd column is the results using the below
patch and the 3rd is the % diff of these results from the base.  4th and
5th columns are the results and % differnce from the base using the older
patch (SPURR read in syscall entry/exit path).

                              Base        Scaled-Acct     SPURR-in-syscall
                             Result      Result  % diff    Result % diff
Simple syscall:              0.3086      0.3086  0.0000    0.3452 11.8600
Simple read:                 0.4591      0.4671  1.7425    0.5044 9.86713
Simple write:                0.4364      0.4366  0.0458    0.4731 8.40971
Simple stat:                 2.0055      2.0295  1.1967    2.0669 3.06158
Simple fstat:                0.5962      0.5876  -1.442    0.6368 6.80979
Simple open/close:           3.1283      3.1009  -0.875    3.2088 2.57328
Select on 10 fd's:           0.8554      0.8457  -1.133    0.8667 1.32101
Select on 100 fd's:          3.5292      3.6329  2.9383    3.6664 3.88756
Select on 250 fd's:          7.9097      8.1881  3.5197    8.2242 3.97613
Select on 500 fd's:          15.2659     15.836  3.7357    15.873 3.97814
Select on 10 tcp fd's:       0.9576      0.9416  -1.670    0.9752 1.83792
Select on 100 tcp fd's:      7.248       7.2254  -0.311    7.2685 0.28283
Select on 250 tcp fd's:      17.7742     17.707  -0.375    17.749 -0.1406
Select on 500 tcp fd's:      35.4258     35.25   -0.496    35.286 -0.3929
Signal handler installation: 0.6131      0.6075  -0.913    0.647  5.52927
Signal handler overhead:     2.0919      2.1078  0.7600    2.1831 4.35967
Protection fault:            0.7345      0.7478  1.8107    0.8031 9.33968
Pipe latency:                33.006      16.398  -50.31    33.475 1.42368
AF_UNIX sock stream latency: 14.5093     30.910  113.03    30.715 111.692
Process fork+exit:           219.8       222.8   1.3648    229.37 4.35623
Process fork+execve:         876.14      873.28  -0.32     868.66 -0.8533
Process fork+/bin/sh -c:     2830        2876.5  1.6431    2958   4.52296
File /var/tmp/XXX write bw:  1193497     1195536 0.1708    118657 -0.5799
Pagefaults on /var/tmp/XXX:  3.1272      3.2117  2.7020    3.2521 3.99398

Also, kernel compile times show no difference with this patch applied.

[pbadari@us.ibm.com: Avoid unnecessary PURR reading]
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Jay Lan <jlan@engr.sgi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:28 -07:00
Joe Perches
898eb71cb1 Add missing newlines to some uses of dev_<level> messages
Found these while looking at printk uses.

Add missing newlines to dev_<level> uses
Add missing KERN_<level> prefixes to multiline dev_<level>s
Fixed a wierd->weird spelling typo
Added a newline to a printk

Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Smart <James.Smart@Emulex.Com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:28 -07:00
Eric W. Biederman
282a821f18 sysctl: x86_64 remove unnecessary binary paths
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:22 -07:00
Akinobu Mita
ef1d7151d2 cpu hotplug: intel_cacheinfo: fix cpu hotplug error handling
- Fix resource leakage in error case within detect_cache_attributes()

- Don't register hotcpu notifier when cache_add_dev() returns error

- Introduce cache_dev_map cpumask to track whether cache interface for
  CPU is successfully added by cache_add_dev() or not.

  cache_add_dev() may fail with out of memory error. In order to
  avoid cache_remove_dev() with that uninitialized cache interface when
  CPU_DEAD event is delivered we need to have the cache_dev_map cpumask.

  (We cannot change cache_add_dev() from CPU_ONLINE event handler
  to CPU_UP_PREPARE event handler. Because cache_add_dev() needs
  to do cpuid and store the results with its CPU online.)

[nix.or.die@googlemail.com: fix a section mismatch warning]
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andi Kleen <ak@suse.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:21 -07:00
Akinobu Mita
d435d862ba cpu hotplug: mce: fix cpu hotplug error handling
- Clear kobject in percpu device_mce before calling sysdev_register() with

  Because mce_create_device() may fail and it leaves kobject filled with
  junk. It will be the problem when mce_create_device() will be called
  next time.

- Fix error handling in mce_create_device()

  Error handling should not do sysdev_remove_file() with not yet added
  attributes.

- Don't register hotcpu notifier when mce_create_device() returns error

- Do mce_create_device() in CPU_UP_PREPARE instead of CPU_ONLINE

Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andi Kleen <ak@suse.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:21 -07:00
Akinobu Mita
881a841f4a cpu hotplug: msr: fix cpu hotplug error handling
Do msr_device_create() in CPU_UP_PREPARE instead of CPU_ONLINE.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andi Kleen <ak@suse.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:21 -07:00
Akinobu Mita
c7e38a9c27 cpu hotplug: thermal_throttle: fix cpu hotplug error handling
Do thermal_throttle_add_dev() in CPU_UP_PREPARE instead of CPU_ONLINE.

Cc: Dmitriy Zavin <dmitriyz@google.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Andi Kleen <ak@suse.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:21 -07:00
Tony Breeds
2c62214831 Fix discrepancy between VDSO based gettimeofday() and sys_gettimeofday().
On platforms that copy sys_tz into the vdso (currently only x86_64, soon to
include powerpc), it is possible for the vdso to get out of sync if a user
calls (admittedly unusual) settimeofday(NULL, ptr).

This patch adds a hook for architectures that set
CONFIG_GENERIC_TIME_VSYSCALL to ensure when sys_tz is updated they can also
updatee their copy in the vdso.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Tony Luck <tony.luck@intel.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:20 -07:00
Rafael J. Wysocki
efa4d2fb04 Hibernation: Use temporary page tables for kernel text mapping on x86_64
Use temporary page tables for the kernel text mapping during hibernation
restore on x86_64.

Without the patch, the original boot kernel's page tables that represent the
kernel text mapping are used while the core of the image kernel is being
restored.  However, in principle, if the boot kernel is not identical to the
image kernel, the location of these page tables in the image kernel need not
be the same, so we should create a safe copy of the kernel text mapping prior
to restoring the core of the image kernel.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:20 -07:00
Rafael J. Wysocki
c30bb68c26 Hibernation: Pass CR3 in the image header on x86_64
Since we already pass the address of restore_registers() in the image header,
we can also pass the value of the CR3 register from before the hibernation in
the same way.  This will allow us to avoid using init_level4_pgt page tables
during the restore.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
d158cbdf39 Hibernation: Arbitrary boot kernel support on x86_64
Make it possible to restore a hibernation image on x86_64 with the help of a
kernel different from the one in the image.

The idea is to split the core restoration code into two separate parts and to
place each of them in a different page.   The first part belongs to the boot
kernel and is executed as the last step of the image kernel's memory
restoration procedure.   Before being executed, it is relocated to a safe page
that won't be overwritten while copying the image kernel pages.

The final operation performed by it is a jump to the second part of the core
restoration code that belongs to the image kernel and has just been restored.
This code makes the CPU switch to the image kernel's page tables and restores
the state of general purpose registers (including the stack pointer) from
before the hibernation.

The main issue with this idea is that in order to jump to the second part of
the core restoration code the boot kernel needs to know its address.
 However, this address may be passed to it in the image header.   Namely, the
part of the image header previously used for checking if the version of the
image kernel is correct can be replaced with some architecture specific data
that will allow the boot kernel to jump to the right address within the image
kernel.   These data should also be used for checking if the image kernel is
compatible with the boot kernel (as far as the memory restroration procedure
is concerned).  It can be done, for example, with the help of a "magic" value
that has to be equal in both kernels, so that they can be regarded as
compatible.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Pavel Machek
50a1efe14f s2ram: kill old debugging junk
This removes old debugging stuff, that should be no longer neccessary.  It
accessed VGA hardware (which may not be ready at this point), and used LEDs
at port 80 for debugging.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Andres Salomon
8f4ce8c32f serial: turn serial console suspend a boot rather than compile time option
Currently, there's a CONFIG_DISABLE_CONSOLE_SUSPEND that allows one to stop
the serial console from being suspended when the rest of the machine goes
to sleep.  This is incredibly useful for debugging power management-related
things; however, having it as a compile-time option has proved to be
incredibly inconvenient for us (OLPC).  There are plenty of times that we
want serial console to not suspend, but for the most part we'd like serial
console to be suspended.

This drops CONFIG_DISABLE_CONSOLE_SUSPEND, and replaces it with a kernel
boot parameter (no_console_suspend).  By default, the serial console will
be suspended along with the rest of the system; by passing
'no_console_suspend' to the kernel during boot, serial console will remain
alive during suspend.

For now, this is pretty serial console specific; further fixes could be
applied to make this work for things like netconsole.

Signed-off-by: Andres Salomon <dilinger@debian.org>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@suspend2.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:19 -07:00
Rafael J. Wysocki
e6c5eb9541 PM: Rework struct platform_suspend_ops
There is no reason why the .prepare() and .finish() methods in 'struct
platform_suspend_ops' should take any arguments, since architectures don't use
these methods' argument in any practically meaningful way (ie.  either the
target system sleep state is conveyed to the platform by .set_target(), or
there is only one suspend state supported and it is indicated to the PM core
by .valid(), or .prepare() and .finish() aren't defined at all).   There also
is no reason why .finish() should return any result.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
26398a70ea PM: Rename struct pm_ops and related things
The name of 'struct pm_ops' suggests that it is related to the power
management in general, but in fact it is only related to suspend.   Moreover,
its name should indicate what this structure is used for, so it seems
reasonable to change it to 'struct platform_suspend_ops'.   In that case, the
name of the global variable of this type used by the PM core and the names of
related functions should be changed accordingly.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Rafael J. Wysocki
95d9ffbe01 PM: Move definition of struct pm_ops to suspend.h
Move the definition of 'struct pm_ops' and related functions from <linux/pm.h>
to <linux/suspend.h> .

There are, at least, the following reasons to do that:
* 'struct pm_ops' is specifically related to suspend and not to the power
  management in general.
* As long as 'struct pm_ops' is defined in <linux/pm.h>, any modification of it
  causes the entire kernel to be recompiled, which is unnecessary and annoying.
* Some suspend-related features are already defined in <linux/suspend.h>, so it
  is logical to move the definition of 'struct pm_ops' into there.
* 'struct hibernation_ops', being the hibernation-related counterpart of
  'struct pm_ops', is defined in <linux/suspend.h> .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-18 14:37:18 -07:00
Kyle McMartin
991b7d6e6f [PARISC] Attempt to clean up parisc/Makefile
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 13:54:51 -07:00
Matthew Wilcox
9611f61eb5 [PARISC] Fix infinite loop in /proc/iomem
pcibios_link_hba_resources() could corrupt the resource tree by inserting
resources in the wrong place.  Fix this by calling pci_claim_resource()
for PCI-PCI bridges.  Delete pcibios_link_hba_resources as we shouldn't
need it any more.  Also get rid of lba_claim_dev_resources() and just
call pci_claim_resource() directly.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 12:34:22 -07:00
Ralf Baechle
42f77542f4 [MIPS] time: Move R4000 clockevent device code to separate configurable file
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
2cfa7660db [MIPS] time: Delete dead cycles_per_jiffy, mips_timer_ack and null_timer_ack
cycles_per_jiffy was only ever getting assigned and the function pointer
not being called anymore and mips_timer_ack had gotten similarly stale.  I
leave the remaining assignments unfixed as a lighthouse pointing platform
maintainers to what needs a rewrite.  These changes make null_timer_ack()
unreferenced, so delete that too.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
76d3a7a54c [MIPS] IP32: Retire use of plat_timer_setup.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
89742e5376 [MIPS] Jazz: Retire use of plat_timer_setup.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
e887b24592 [MIPS] IP27: Convert to clock_event_device.
This separates the tick timer stuff from the generic MIPS time.c.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:47 +01:00
Ralf Baechle
832348ff0b [MIPS] JMR3927: Convert to clock_event_device.
This separate the tick timer stuff from the generic MIPS time.c.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:46 +01:00
Thomas Bogendoerfer
15ad838d28 [MIPS] Always do the ARC64_TWIDDLE_PC thing.
Always jump to the place where the kernel is linked to. This helps where
the bootloaders/proms ignores the start address inside the ELF header.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2007-10-18 18:11:46 +01:00
Kyle McMartin
6cc4525d29 [PARISC] Kill off broken irqstack code
It's been unfinished and broken long enough, and I have some ideas on how
to do it more cleanly.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:31 -07:00
Kyle McMartin
873d50e2e5 [PARISC] Remove hardcoded uses of PAGE_SIZE
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:27 -07:00
Kyle McMartin
6ebeafff64 [PARISC] Clean up pointless ASM_PAGE_SIZE_DIV use
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:23 -07:00
Kyle McMartin
80af087699 [PARISC] Kill off the last vestiges of ASM_PAGE_SIZE
Sam's previous patch missed a few uses of ASM_PAGE_SIZE.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:19 -07:00
Sam Ravnborg
1c59357109 [PARISC] Kill off ASM_PAGE_SIZE use
We have the macro _AC() generally available now
so the calculation of PAGE_SIZE can be made
assembler compatible.
Introduce use of _AC() and kill all users of
ASM_PAGE_SIZE.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:15 -07:00
Sam Ravnborg
be1b3d8cb1 [PARISC] Beautify parisc vmlinux.lds.S
Introduce a consistent layout of vmlinux.
The same layout has been introduced for most
architectures.

And the same time move a few label definitions inside
the curly brackets so they are assigned the correct
starting address. Before a ld inserted alignment
would have casued the label to pint before the actual
start of the section.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:12 -07:00
Kyle McMartin
7819994312 [PARISC] Kill incorrect cast warning in unwinder
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:04 -07:00
Kyle McMartin
f16484f0e3 [PARISC] Kill zone_to_nid printk warning
zone_to_nid returns int, always.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:59:00 -07:00
Kyle McMartin
f8b9e59457 [PARISC] Unbreak processor_probe when we have more than NR_CPUS
If we already have NR_CPUS worth of cpus online, we obviously shouldn't
be trying to bring up more... Fixes a particularly vexing issue I had when
running another machines kernel on my rp3440.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:58:57 -07:00
Kyle McMartin
730e844d57 [PARISC] Kill pointless variable use in time.c
Clean up a pointless use of a variable in update_cr16_clocksource. It just
looks silly.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:58:52 -07:00
Kyle McMartin
efb80e7e09 [PARISC] import necessary bits of libgcc.a
Currently we're hacking libs-y to include libgcc.a, but this has
unforeseen consequences since the userspace libgcc is linked with fpregs
enabled. We need the kernel to stop using fpregs in an uncontrolled manner
to implement lazy fpu state saves.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:58:49 -07:00
Christoph Lameter
6f7d998e94 [PARISC] Use page allocator instead of slab allocator in pci-dma.c
Slab pages obtained via kmalloc are not cacheline aligned.  Nor is it
advisable to perform VM operations designed for page allocator pages on
memory obtained via kmalloc.

So replace the page sized allocations in kernel/pci-dma.c with page allocator
pages.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:58:45 -07:00
Adrian Bunk
f13cec8447 [PARISC] parisc: "extern inline" -> "static inline"
"extern inline" will have different semantics with gcc 4.3, and "static
inline" is correct here.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:58:41 -07:00
Kyle McMartin
0ed5462927 [PARISC] Update defconfigs
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:58:37 -07:00
Kyle McMartin
2cfc5be7df [PARISC] Wire up sys_fallocate (and compat_sys_fallocate)
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2007-10-18 00:58:26 -07:00
Stephen Rothwell
5c45708352 [SPARC/64]: Consolidate of_register_driver
Also of_unregister_driver.  These will be shortly also used by the
PowerPC code.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 21:17:42 -07:00
Jeff Dike
be8ad5a409 [UMP]: header_ops conversion needed for non-ethernet drivers
UML's two non-ethernet drivers need some header_ops conversion.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 19:35:04 -07:00
David S. Miller
4209ab098c [SPARC64]: Check of_get_property() return in pci_determine_mem_io_space().
If the PCI controller lacks the 'ranges' property nothing
is going to work.

Noticed by Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 16:25:10 -07:00
David S. Miller
719023fb90 [SPARC64]: Fix boot failures due to bootmem.
Do not use *alloc_bootmem_low*(), because ARCH_LOW_ADDRESS_LIMIT
is 4GB and this results in boot failures if all of the physical
memory in the machine is above 4GB.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 16:25:00 -07:00
David S. Miller
24f287e412 [SPARC64]: Implement atomic backoff.
When the cpu count is high and contention hits an atomic object, the
processors can synchronize such that some cpus continually get knocked
out and cannot complete the atomic update.

So implement an exponential backoff when SMP.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-17 16:24:55 -07:00