Commit Graph

759 Commits

Author SHA1 Message Date
Yinghai Lu
f2cf8e085c x86_64: move iommu declaration from proto to iommu.h
[akpm@linux-foundation.org: build fix]
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:14 -07:00
Yinghai Lu
bc2cea6a34 x86_64: disable the GART in shutdown
For K8 system: 4G RAM with memory hole remapping enabled, or more than 4G
RAM installed.  when using kexec to load second kernel.  In the second
kernel, when mem is allocated for GART, it will do the memset for clear, it
will cause restart, because some device still used that for dma.  solution
will be:

in second kernel: disable that at first before we try to allocate mem for
it.  or in the first kernel: do disable that before shutdown.
Andi/Eric/Alan prefer to second one for clean shutdown in first kernel.
Andi also point out need to consider to AGP enable but mem less 4G case
too.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:13 -07:00
Andrew Morton
8f03d6ce4e x86_64: flush_tlb_kernel_range() warning fix
mm/vmalloc.c: In function 'unmap_kernel_range':
mm/vmalloc.c:75: warning: unused variable 'start'

make it a C function so that the compiler thinks it used its arguments.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:13 -07:00
Jiri Kosina
bdb345a4e3 x86_64: fix wrong comment regarding set_fixmap()
The function name is set_fixmap(), not fixmap_set() as stated in the comment.

Also fix a typo, punctuation and lower/uppercase a bit.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:12 -07:00
Thomas Gleixner
28318daf79 x86_64: use the global PIT lock
Replace the pcspkr private PIT lock by the global PIT lock to serialize the
PIT access all over the place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:12 -07:00
Glauber de Oliveira Costa
99253b8e73 x86_64: Move functions declarations to header file
Some interrupt entry points are currently defined in i8259.c They probably
belong in a header.  Right now, their only user is init_IRQ, justifying
their declaration in-file.  But when virtualization comes in, we may be
interested in using that functions in late initializations.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:12 -07:00
Muli Ben-Yehuda
8cb32dc748 x86_64: make dump_error_regs a chip op
Provide seperate versions for Calgary and CalIOC2

Also print out the PCIe Root Complex Status on CalIOC2 errors

Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:11 -07:00
Muli Ben-Yehuda
ff297b8c08 x86_64: introduce chipset specific ops
Calgary and CalIOC2 share most of the same logic. Introduce struct
cal_chipset_ops for quirks and tce flush logic which are

[akpm@linux-foundation.org: make calgary_chip_ops static]
Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:11 -07:00
Adrian Bunk
9596017e79 x86: remove support for the Rise CPU
The Rise CPUs were only very short-lived, and there are no reports of
anyone both owning one and running Linux on it.

Googling for the printk string "CPU: Rise iDragon" didn't find any dmesg
available online.

If it turns out that against all expectations there are actually users
reverting this patch would be easy.

This patch will make the kernel images smaller by a few bytes for all
i386 users.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
Nigel Cunningham
44bf4cea43 x86: PM_TRACE support
Signed-off-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
Tim Hockin
e02e68d31e x86_64: support poll() on /dev/mcelog
Background:
 /dev/mcelog is typically polled manually.  This is less than optimal for
 situations where accurate accounting of MCEs is important.  Calling
 poll() on /dev/mcelog does not work.

Description:
 This patch adds support for poll() to /dev/mcelog.  This results in
 immediate wakeup of user apps whenever the poller finds MCEs.  Because
 the exception handler can not take any locks, it can not call the wakeup
 itself.  Instead, it uses a thread_info flag (TIF_MCE_NOTIFY) which is
 caught at the next return from interrupt or exit from idle, calling the
 mce_user_notify() routine.  This patch also disables the "fake panic"
 path of the mce_panic(), because it results in printk()s in the exception
 handler and crashy systems.

 This patch also does some small cleanup for essentially unused variables,
 and moves the user notification into the body of the poller, so it is
 only called once per poll, rather than once per CPU.

Result:
 Applications can now poll() on /dev/mcelog.  When an error is logged
 (whether through the poller or through an exception) the applications are
 woken up promptly.  This should not affect any previous behaviors.  If no
 MCEs are being logged, there is no overhead.

Alternatives:
 I considered simply supporting poll() through the poller and not using
 TIF_MCE_NOTIFY at all.  However, the time between an uncorrectable error
 happening and the user application being notified is *the*most* critical
 window for us.  Many uncorrectable errors can be logged to the network if
 given a chance.

 I also considered doing the MCE poll directly from the idle notifier, but
 decided that was overkill.

Testing:
 I used an error-injecting DIMM to create lots of correctable DRAM errors
 and verified that my user app is woken up in sync with the polling interval.
 I also used the northbridge to inject uncorrectable ECC errors, and
 verified (printk() to the rescue) that the notify routine is called and the
 user app does wake up.  I built with PREEMPT on and off, and verified
 that my machine survives MCEs.

[wli@holomorphy.com: build fix]
Signed-off-by: Tim Hockin <thockin@google.com>
Signed-off-by: William Irwin <bill.irwin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
David Rientjes
3484d79813 x86_64: fake pxm-to-node mapping for fake numa
For NUMA emulation, our SLIT should represent the true NUMA topology of the
system but our proximity domain to node ID mapping needs to reflect the
emulated state.

When NUMA emulation has successfully setup fake nodes on the system, a new
function, acpi_fake_nodes() is called.  This function determines the proximity
domain (_PXM) for each true node found on the system.  It then finds which
emulated nodes have been allocated on this true node as determined by its
starting address.  The node ID to PXM mapping is changed so that each fake
node ID points to the PXM of the true node that it is located on.

If the machine failed to register a SLIT, then we assume there is no special
requirement for emulated node affinity so we use the default LOCAL_DISTANCE,
which is newly exported to this code, as our measurement if the emulated nodes
appear in the same PXM.  Otherwise, we use REMOTE_DISTANCE.

PXM_INVAL and NID_INVAL are also exported to the ACPI header file so that we
can compare node_to_pxm() results in generic code (in this case, the SRAT
code).

Cc: Len Brown <lenb@kernel.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:10 -07:00
Christoph Lameter
34feb2c83b x86_64: Quicklist support for x86_64
This adds caching of pgds and puds, pmds, pte.  That way we can avoid costly
zeroing and initialization of special mappings in the pgd.

A second quicklist is useful to separate out PGD handling.  We can carry the
initialized pgds over to the next process needing them.

Also clean up the pgd_list handling to use regular list macros.  There is no
need anymore to avoid the lru field.

Move the add/removal of the pgds to the pgdlist into the constructor /
destructor.  That way the implementation is congruent with i386.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Luck, Tony" <tony.luck@intel.com>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:09 -07:00
Thomas Gleixner
0655d7c32b x86: share hpet.h with i386
hpet.h in asm-i386 and asm-x86_64 contain tons of duplicated stuff.
Consolidate into one shared header file.

AK: Fix i386 compilation with !X86_IO_APIC

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:09 -07:00
Thomas Gleixner
f40f31bfe1 x86_64: Fix APIC typo
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:09 -07:00
Chris Wright
55f93afd89 x86_64: Untangle asm/hpet.h from asm/timex.h
When making changes to x86_64 timers, I noticed that touching hpet.h triggered
an unreasonably large rebuild.  Untangling it from timex.h quiets the extra
rebuild quite a bit.

Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:08 -07:00
Yinghai Lu
af3e9a2e33 x86_64: remove extra extern declaring about dmi_ioremap
Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:08 -07:00
Andi Kleen
2aae950b21 x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu
This implements new vDSO for x86-64.  The concept is similar
to the existing vDSOs on i386 and PPC.  x86-64 has had static
vsyscalls before,  but these are not flexible enough anymore.

A vDSO is a ELF shared library supplied by the kernel that is mapped into
user address space.  The vDSO mapping is randomized for each process
for security reasons.

Doing this was needed for clock_gettime, because clock_gettime
always needs a syscall fallback and having one at a fixed
address would have made buffer overflow exploits too easy to write.

The vdso can be disabled with vdso=0

It currently includes a new gettimeofday implemention and optimized
clock_gettime(). The gettimeofday implementation is slightly faster
than the one in the old vsyscall.  clock_gettime is significantly faster
than the syscall for CLOCK_MONOTONIC and CLOCK_REALTIME.

The new calls are generally faster than the old vsyscall.

Advantages over the old x86-64 vsyscalls:
- Extensible
- Randomized
- Cleaner
- Easier to virtualize (the old static address range previously causes
overhead e.g. for Xen because it has to create special page tables for it)

Weak points:
- glibc support still to be written

The VM interface is partly based on Ingo Molnar's i386 version.

Includes compile fix from Joachim Deguara

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:08 -07:00
Andi Kleen
aac57f81eb x86_64: Always use builtin memcpy on gcc 4.3
Jan asked to always use the builtin memcpy on gcc 4.3 mainline because
it should generate better code than the old macro. Let's try it.

Cc: Jan Hubicka <jh@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:08 -07:00
Jean Delvare
9d531cc119 x86_64: asm/ptrace.h needs linux/compiler.h
On x86_64, <asm/ptrace.h> uses __user but doesn't include
<linux/compiler.h>.  This could lead to build failures.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 18:37:07 -07:00
Nick Piggin
71133027fe x86_64: wbinvd macro fix
Too many semicolons in this macro.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-21 17:49:13 -07:00
Ralf Baechle
c41917df8a [PATCH] sched: sched_cacheflush is now unused
Since Ingo's recent scheduler rewrite which was merged as commit
0437e109e1 sched_cacheflush is unused.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-19 21:28:35 +02:00
Peter Zijlstra
b111757c50 arch: personality independent stack top
New arch macro STACK_TOP_MAX it gives the larges valid stack address for the
architecture in question.

It differs from STACK_TOP in that it will not distinguish between
personalities but will always return the largest possible address.

This is used to create the initial stack on execve, which we will move down to
the proper location once the binfmt code has figured out where that is.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ollie Wild <aaw@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:45 -07:00
Fenghua Yu
5fb7dc37dc define new percpu interface for shared data
per cpu data section contains two types of data.  One set which is
exclusively accessed by the local cpu and the other set which is per cpu,
but also shared by remote cpus.  In the current kernel, these two sets are
not clearely separated out.  This can potentially cause the same data
cacheline shared between the two sets of data, which will result in
unnecessary bouncing of the cacheline between cpus.

One way to fix the problem is to cacheline align the remotely accessed per
cpu data, both at the beginning and at the end.  Because of the padding at
both ends, this will likely cause some memory wastage and also the
interface to achieve this is not clean.

This patch:

Moves the remotely accessed per cpu data (which is currently marked
as ____cacheline_aligned_in_smp) into a different section, where all the data
elements are cacheline aligned. And as such, this differentiates the local
only data and remotely accessed data cleanly.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:44 -07:00
Michael Ellerman
9e367d8592 jprobes: remove JPROBE_ENTRY()
AFAICT now that jprobe.entry is a void *, JPROBE_ENTRY doesn't do anything
useful - so remove it ..

I've left a do-nothing version so that out-of-tree jprobes code will still
compile without modifications.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:44 -07:00
Amit Arora
97ac73506c sys_fallocate() implementation on i386, x86_64 and powerpc
fallocate() is a new system call being proposed here which will allow
applications to preallocate space to any file(s) in a file system.
Each file system implementation that wants to use this feature will need
to support an inode operation called ->fallocate().
Applications can use this feature to avoid fragmentation to certain
level and thus get faster access speed. With preallocation, applications
also get a guarantee of space for particular file(s) - even if later the
the system becomes full.

Currently, glibc provides an interface called posix_fallocate() which
can be used for similar cause. Though this has the advantage of working
on all file systems, but it is quite slow (since it writes zeroes to
each block that has to be preallocated). Without a doubt, file systems
can do this more efficiently within the kernel, by implementing
the proposed fallocate() system call. It is expected that
posix_fallocate() will be modified to call this new system call first
and incase the kernel/filesystem does not implement it, it should fall
back to the current implementation of writing zeroes to the new blocks.
ToDos:
1. Implementation on other architectures (other than i386, x86_64,
   and ppc). Patches for s390(x) and ia64 are already available from
   previous posts, but it was decided that they should be added later
   once fallocate is in the mainline. Hence not including those patches
   in this take.
2. Changes to glibc,
   a) to support fallocate() system call
   b) to make posix_fallocate() and posix_fallocate64() call fallocate()

Signed-off-by: Amit Arora <aarora@in.ibm.com>
2007-07-17 21:42:44 -04:00
Antonino A. Daplas
317b3c2167 fbdev: detect primary display device
Add function helper, fb_is_primary_device().  Given struct fb_info, it will
return a nonzero value if the device is the primary display.

Currently, only the i386 is supported where the function checks for the
IORESOURCE_ROM_SHADOW flag.

Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:11 -07:00
Antonino A. Daplas
10eb2659cc fbdev: move arch-specific bits to their respective subdirectories
Move arch-specific bits of fb_mmap() to their respective subdirectories

[bob.picco@hp.com: efi_range_is_wc is referenced but not declared]
[bunk@stusta.de: fix include/asm-m68k/fb.h]
Signed-off-by: Antonino Daplas <adaplas@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:11 -07:00
Mel Gorman
769848c038 Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated
It is often known at allocation time whether a page may be migrated or not.
This patch adds a flag called __GFP_MOVABLE and a new mask called
GFP_HIGH_MOVABLE.  Allocations using the __GFP_MOVABLE can be either migrated
using the page migration mechanism or reclaimed by syncing with backing
storage and discarding.

An API function very similar to alloc_zeroed_user_highpage() is added for
__GFP_MOVABLE allocations called alloc_zeroed_user_highpage_movable().  The
flags used by alloc_zeroed_user_highpage() are not changed because it would
change the semantics of an existing API.  After this patch is applied there
are no in-kernel users of alloc_zeroed_user_highpage() so it probably should
be marked deprecated if this patch is merged.

Note that this patch includes a minor cleanup to the use of __GFP_ZERO in
shmem.c to keep all flag modifications to inode->mapping in the
shmem_dir_alloc() helper function.  This clean-up suggestion is courtesy of
Hugh Dickens.

Additional credit goes to Christoph Lameter and Linus Torvalds for shaping the
concept.  Credit to Hugh Dickens for catching issues with shmem swap vector
and ramfs allocations.

[akpm@linux-foundation.org: build fix]
[hugh@veritas.com: __GFP_ZERO cleanup]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:22:59 -07:00
Martin Schwidefsky
e21ea246bc mm: remove ptep_test_and_clear_dirty and ptep_clear_flush_dirty
Nobody is using ptep_test_and_clear_dirty and ptep_clear_flush_dirty.  Remove
the functions from all architectures.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:22:59 -07:00
Arnd Bergmann
4b7775870b Introduce compat_u64 and compat_s64 types
One common problem with 32 bit system call and ioctl emulation is the
different alignment rules between i386 and 64 bit machines.  A number of
drivers work around this by marking the compat structures as
'attribute((packed))', which is not the right solution because it breaks
all the non-x86 architectures that want to use the same compat code.

Hopefully, this patch improves the situation, it introduces two new types,
compat_u64 and compat_s64.  These are defined on all architectures to have
the same size and alignment as the 32 bit version of u64 and s64.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Vasily Tarasov <vtaras@openvz.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:48 -07:00
Jan Beulich
45e98cdb6d page table handling cleanup
Kill pte_rdprotect(), pte_exprotect(), pte_mkread(), pte_mkexec(), pte_read(),
pte_exec(), and pte_user() except where arch-specific code is making use of
them.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:36 -07:00
Yinghai Lu
18a8bd949d serial: convert early_uart to earlycon for 8250
Beacuse SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and
include/asm-x86_64/serial.h.  the serial8250_ports need to be probed late in
serial initializing stage.  the console_init=>serial8250_console_init=>
register_console=>serial8250_console_setup will return -ENDEV, and console
ttyS0 can not be enabled at that time.  need to wait till uart_add_one_port in
drivers/serial/serial_core.c to call register_console to get console ttyS0.
that is too late.

Make early_uart to use early_param, so uart console can be used earlier.  Make
it to be bootconsole with CON_BOOT flag, so can use console handover feature.
and it will switch to corresponding normal serial console automatically.

new command line will be:
	console=uart8250,io,0x3f8,9600n8
	console=uart8250,mmio,0xff5e0000,115200n8
or
	earlycon=uart8250,io,0x3f8,9600n8
	earlycon=uart8250,mmio,0xff5e0000,115200n8

it will print in very early stage:
	Early serial console at I/O port 0x3f8 (options '9600n8')
	console [uart0] enabled
later for console it will print:
	console handover: boot [uart0] -> real [ttyS0]

Signed-off-by: <yinghai.lu@sun.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 09:05:35 -07:00
Linus Torvalds
21ba0f88ae Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
  PCI: Only build PCI syscalls on architectures that want them
  PCI: limit pci_get_bus_and_slot to domain 0
  PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
  PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
  PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
  PCI: hotplug: pciehp: wait for 1 second after power off slot
  PCI: pci_set_power_state(): check for PM capabilities earlier
  PCI: cpci_hotplug: Convert to use the kthread API
  PCI: add pci_try_set_mwi
  PCI: pcie: remove SPIN_LOCK_UNLOCKED
  PCI: ROUND_UP macro cleanup in drivers/pci
  PCI: remove pci_dac_dma_... APIs
  PCI: pci-x-pci-express-read-control-interfaces cleanups
  PCI: Fix typo in include/linux/pci.h
  PCI: pci_ids, remove double or more empty lines
  PCI: pci_ids, add atheros and 3com_2 vendors
  PCI: pci_ids, reorder some entries
  PCI: i386: traps, change VENDOR to DEVICE
  PCI: ATM: lanai, change VENDOR to DEVICE
  PCI: Change all drivers to use pci_device->revision
  ...
2007-07-12 13:40:57 -07:00
H. Peter Anvin
8afd2af889 x86-64: add symbolic constants for the boot segment selectors
Add symbolic constants for the segment selectors/GDT slots used by
the setup code, for consistency with i386.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 10:55:54 -07:00
H. Peter Anvin
48c7ae674f Make struct boot_params a real structure, and remove obsolete fields
Make struct boot_params a real structure, and remove the handling of
some obsolete fields, in particular hd*_info, which was only used by
the ST-506 driver, and likely to be wrong for that driver on any
modern BIOS.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 10:55:54 -07:00
H. Peter Anvin
9c25d134b3 Make definitions for struct e820entry and struct e820map consistent
Make definitions for struct e820entry and struct e820map
consistent between i386 and x86-64.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 10:55:54 -07:00
Venki Pallipadi
1d67953f2b Use a new CPU feature word to cover features that are spread around
Some Intel features are spread around in different CPUID leafs like 0x5,
0x6 and 0xA.  Make this feature detection code common across i386 and
x86_64.

Display Intel Dynamic Acceleration feature in /proc/cpuinfo. This feature
will be enabled automatically by current acpi-cpufreq driver.

Refer to Intel Software Developer's Manual for more details about the feature.

Thanks to hpa (H Peter Anvin) for the making the actual code detecting the
scattered features data-driven.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 10:55:54 -07:00
H. Peter Anvin
ec481536b1 Unify the CPU features vectors between i386 and x86-64
Unify the handling of the CPU features vectors between i386 and x86-64.
This also adopts the collapsing of features which are required at
compile-time into constant tests from x86-64 to i386.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-12 10:55:54 -07:00
Jan Beulich
caa5171622 PCI: remove pci_dac_dma_... APIs
Based on replies to a respective query, remove the pci_dac_dma_...() APIs
(except for pci_dac_dma_supported() on Alpha, where this function is used
in non-DAC PCI DMA code).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Jesse Barnes <jesse.barnes@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:11 -07:00
Michael Ellerman
575e3348cb PCI: Use a weak symbol for the empty version of pcibios_add_platform_entries()
I'm not sure if this is going to fly, weak symbols work on the compilers I'm
using, but whether they work for all of the affected architectures I can't say.
I've cc'ed as many arch maintainers/lists as I could find.

But assuming they do, we can use a weak empty definition of
pcibios_add_platform_entries() to avoid having an empty definition on every
arch.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:07 -07:00
Andi Kleen
0b62233021 x86_64: Fix eventd/timerfd syscalls
They had the same syscall number.

Pointed out by Davide Libenzi

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-20 14:27:25 -07:00
Benjamin Herrenschmidt
8dab5241d0 Rework ptep_set_access_flags and fix sun4c
Some changes done a while ago to avoid pounding on ptep_set_access_flags and
update_mmu_cache in some race situations break sun4c which requires
update_mmu_cache() to always be called on minor faults.

This patch reworks ptep_set_access_flags() semantics, implementations and
callers so that it's now responsible for returning whether an update is
necessary or not (basically whether the PTE actually changed).  This allow
fixing the sparc implementation to always return 1 on sun4c.

[akpm@linux-foundation.org: fixes, cleanups]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: David Miller <davem@davemloft.net>
Cc: Mark Fortescue <mark@mtfhpc.demon.co.uk>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-06-16 13:16:16 -07:00
Alexey Dobriyan
e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Linus Torvalds
faa8b6c3c2 Revert "ipmi: add new IPMI nmi watchdog handling"
This reverts commit f64da958df.

Andi Kleen is unhappy with the changes, and they really do not seem
worth it.  IPMI could use DIE_NMI_IPI instead of the new callback, even
though that ends up having its own set of problems too, mainly because
the IPMI code cannot really know the NMI was from IPMI or not.

Manually fix up conflicts in arch/x86_64/kernel/traps.c and
drivers/char/ipmi/ipmi_watchdog.c.

Cc: Andi Kleen <ak@suse.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Corey Minyard <minyard@acm.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-14 15:24:24 -07:00
Davide Libenzi
fdb902b122 signal/timer/event: eventfd wire up x86 arches
This patch wires the eventfd system call to the x86 architectures.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 08:29:37 -07:00
Davide Libenzi
57ac889850 signal/timer/event: timerfd wire up x86 arches
This patch wires the timerfd system call to the x86 architectures.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 08:29:36 -07:00
Davide Libenzi
2121e24bd8 signal/timer/event: signalfd wire up x86 arches
This patch wires the signalfd system call to the x86 architectures.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 08:29:36 -07:00
Stephen Rothwell
04dd08b45b Consolidate asm/poll.h
These files are almost all the same.

This patch could be made even simpler if we don't mind POLLREMOVE turning
up in a few architectures that didn't have it previously (which should be
OK as POLLREMOVE is not used anywhere in the current tree).

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 08:29:34 -07:00
Andi Kleen
9393e1dc8e x86_64: new syscall
Add epoll_pwait()

(akpm: stolen from Andi's queue, because I want to send the signalfd patches
which also add syscalls.  Not sure what the __IGNORE_getcpu is for).

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-11 08:29:32 -07:00