Commit Graph

68 Commits

Author SHA1 Message Date
Ingo Molnar
b5aa97e83b Merge branches 'x86/signal' and 'x86/irq' into perfcounters/core
Merge these pending x86 tree changes into the perfcounters tree
to avoid conflicts.
2008-12-08 15:46:36 +01:00
Hiroshi Shimamoto
4217458daf x86: signal: change type of paramter for sys_rt_sigreturn()
Impact: cleanup on 32-bit

Peter pointed this parameter can be changed.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-08 15:21:35 +01:00
Linus Torvalds
e948990f95 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix early panic with boot option "nosmp"
  x86/oprofile: fix Intel cpu family 6 detection
  oprofile: fix CPU unplug panic in ppro_stop()
  AMD IOMMU: fix possible race while accessing iommu->need_sync
  AMD IOMMU: set device table entry for aliased devices
  AMD IOMMU: struct amd_iommu remove padding on 64 bit
  x86: fix broken flushing in GART nofullflush path
  x86: fix dma_mapping_error for 32bit x86
2008-12-04 21:40:08 -08:00
Ingo Molnar
c36910c147 Merge branch 'iommu-fixes-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2008-12-03 12:54:45 +01:00
Richard Kennedy
eac9fbc6a9 AMD IOMMU: struct amd_iommu remove padding on 64 bit
Remove 16 bytes of padding from struct amd_iommu on 64bit builds
reducing its size to 120 bytes, allowing it to span one fewer
cachelines.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2008-12-03 12:20:46 +01:00
Mahesh Salgaonkar
43714539ea sched: don't export sched_mc_power_savings in laptops
Impact: do not expose a control that has no effect

Fix to prevent sched_mc_power_saving from being exported through sysfs
on single-socket systems. (Say multicore single socket (Laptop))

CPU core map of the boot cpu should be equal to possible number
of cpus for single socket system.

This fix has been developed at FOSS.in kernel workout.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-01 08:44:00 +01:00
Linus Torvalds
66a45cc4cc Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: always define DECLARE_PCI_UNMAP* macros
  x86: fixup config space size of CPU functions for AMD family 11h
  x86, bts: fix wrmsr and spinlock over kmalloc
  x86, pebs: fix PEBS record size configuration
  x86, bts: turn macro into static inline function
  x86, bts: exclude ds.c from build when disabled
  arch/x86/kernel/pci-calgary_64.c: change simple_strtol to simple_strtoul
  x86: use limited register constraint for setnz
  xen: pin correct PGD on suspend
  x86: revert irq number limitation
  x86: fixing __cpuinit/__init tangle, xsave_cntxt_init()
  x86: fix __cpuinit/__init tangle in init_thread_xstate()
  oprofile: fix an overflow in ppro code
2008-11-30 13:01:04 -08:00
Christoph Hellwig
96b8936a9e remove __ARCH_WANT_COMPAT_SYS_PTRACE
All architectures now use the generic compat_sys_ptrace, as should every
new architecture that needs 32bit compat (if we'll ever get another).

Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also
kill a comment about __ARCH_SYS_PTRACE that was added after
__ARCH_SYS_PTRACE was already gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 11:00:15 -08:00
Thomas Bogendoerfer
7b1dedca42 x86: fix dma_mapping_error for 32bit x86
Devices like b44 ethernet can't dma from addresses above 1GB. The driver
handles this cases by falling back to GFP_DMA allocation. But for detecting
the problem it needs to get an indication from dma_mapping_error.
The bug is triggered by using a VMSPLIT option of 2G/2G.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-29 21:00:38 +01:00
Ingo Molnar
3bdae4f464 Merge branch 'x86/debug' into x86/irq
We merge this branch because x86/debug touches code that we started
cleaning up in x86/irq. The two branches started out independent,
but as unexpected amount of activity went into x86/irq, they became
dependent. Resolve that by this cross-merge.
2008-11-28 15:00:48 +01:00
Joerg Roedel
b627c8b17c x86: always define DECLARE_PCI_UNMAP* macros
Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off

Currently these macros evaluate to a no-op except the kernel is compiled
with GART or Calgary support. But we also need these macros when we have
SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at
least with SWIOTLB we can define these macros always.

This patch is also for stable backport for the same reason the SWIOTLB
default selection patch is.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27 12:44:08 +01:00
Markus Metzger
e5e8ca633b x86, bts: turn macro into static inline function
Impact: cleanup

Replace a macro with a static inline function.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:28:51 +01:00
Cyrill Gorcunov
3b6c52b5b6 x86: introduce ENTRY(KPROBE_ENTRY)_X86 assembly helpers to catch unbalanced declaration v3
Impact: make ENTRY()/END() macros more capable

It's usefull to catch unbalanced or messed or mixed declarations of ENTRY and
KPROBES. These macros would help a bit.

For example the following code would compile without problems

        ENTRY_X86(mcount)
                retq
        END_X86(mcount)

But if you forget and mess the following form

        ENTRY_X86(mcount)
                retq
        END(mcount)

        ENTRY_X86(ftrace_caller)

The assembler will issue the following message:
Error: ENTRY_X86/KPROBE_X86 unbalanced,missed,mixed

Actually the checking is performed at every _X86 macro
so maybe it's good idea to put ENTRY_KPROBE_FINAL_X86
at the end of .S file to be sure you didn't miss anything.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Alexander van Heukelum <heukelum@mailshack.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 19:57:01 +01:00
Cyrill Gorcunov
8a2503fa4a x86: move dwarf2 related macro to dwarf2.h
Impact: cleanup

Move recently introduced dwarf2 macros to dwarf2.h file.
It allow us to not duplicate them in assembly files.

Active usage of _cfi macros don't make assembly files
more obvious to understand but we already have a lot of
macros there which requires to search the definitions
of them *anyway*. But at least it make every cfi usage
one line shorter.

Also some code alignment is done.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 13:20:52 +01:00
Linus Torvalds
0260da162f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: uaccess_64: fix return value in __copy_from_user()
  x86: quirk for reboot stalls on a Dell Optiplex 330
2008-11-20 13:09:32 -08:00
Ingo Molnar
c032a2de4c Merge branch 'x86/cleanups' into x86/irq
[ merged x86/cleanups into x86/irq to enable a wider IRQ entry code
  patch to be applied, which depends on a cleanup patch in x86/cleanups. ]
2008-11-20 10:48:31 +01:00
Linus Torvalds
3108864e2d Merge branch 'x86/numa' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/numa' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: make NUMA on 32-bit depend on EXPERIMENTAL again
  x86, hibernate: fix breakage on x86_32 with CONFIG_NUMA set
2008-11-19 18:53:02 -08:00
Linus Torvalds
4f7dbc7ff4 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: more general identifier for Phoenix BIOS
  AMD IOMMU: check for next_bit also in unmapped area
  AMD IOMMU: fix fullflush comparison length
  AMD IOMMU: enable device isolation per default
  AMD IOMMU: add parameter to disable device isolation
  x86, PEBS/DS: fix code flow in ds_request()
  x86: add rdtsc barrier to TSC sync check
  xen: fix scrub_page()
  x86: fix es7000 compiling
  x86, bts: fix unlock problem in ds.c
  x86, voyager: fix smp generic helper voyager breakage
  x86: move iomap.h to the new include location
2008-11-19 18:51:56 -08:00
Ulrich Drepper
de11defebf reintroduce accept4
Introduce a new accept4() system call.  The addition of this system call
matches analogous changes in 2.6.27 (dup3(), evenfd2(), signalfd4(),
inotify_init1(), epoll_create1(), pipe2()) which added new system calls
that differed from analogous traditional system calls in adding a flags
argument that can be used to access additional functionality.

The accept4() system call is exactly the same as accept(), except that
it adds a flags bit-mask argument.  Two flags are initially implemented.
(Most of the new system calls in 2.6.27 also had both of these flags.)

SOCK_CLOEXEC causes the close-on-exec (FD_CLOEXEC) flag to be enabled
for the new file descriptor returned by accept4().  This is a useful
security feature to avoid leaking information in a multithreaded
program where one thread is doing an accept() at the same time as
another thread is doing a fork() plus exec().  More details here:
http://udrepper.livejournal.com/20407.html "Secure File Descriptor Handling",
Ulrich Drepper).

The other flag is SOCK_NONBLOCK, which causes the O_NONBLOCK flag
to be enabled on the new open file description created by accept4().
(This flag is merely a convenience, saving the use of additional calls
fcntl(F_GETFL) and fcntl (F_SETFL) to achieve the same result.

Here's a test program.  Works on x86-32.  Should work on x86-64, but
I (mtk) don't have a system to hand to test with.

It tests accept4() with each of the four possible combinations of
SOCK_CLOEXEC and SOCK_NONBLOCK set/clear in 'flags', and verifies
that the appropriate flags are set on the file descriptor/open file
description returned by accept4().

I tested Ulrich's patch in this thread by applying against 2.6.28-rc2,
and it passes according to my test program.

/* test_accept4.c

  Copyright (C) 2008, Linux Foundation, written by Michael Kerrisk
       <mtk.manpages@gmail.com>

  Licensed under the GNU GPLv2 or later.
*/
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

#define PORT_NUM 33333

#define die(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)

/**********************************************************************/

/* The following is what we need until glibc gets a wrapper for
  accept4() */

/* Flags for socket(), socketpair(), accept4() */
#ifndef SOCK_CLOEXEC
#define SOCK_CLOEXEC    O_CLOEXEC
#endif
#ifndef SOCK_NONBLOCK
#define SOCK_NONBLOCK   O_NONBLOCK
#endif

#ifdef __x86_64__
#define SYS_accept4 288
#elif __i386__
#define USE_SOCKETCALL 1
#define SYS_ACCEPT4 18
#else
#error "Sorry -- don't know the syscall # on this architecture"
#endif

static int
accept4(int fd, struct sockaddr *sockaddr, socklen_t *addrlen, int flags)
{
   printf("Calling accept4(): flags = %x", flags);
   if (flags != 0) {
       printf(" (");
       if (flags & SOCK_CLOEXEC)
           printf("SOCK_CLOEXEC");
       if ((flags & SOCK_CLOEXEC) && (flags & SOCK_NONBLOCK))
           printf(" ");
       if (flags & SOCK_NONBLOCK)
           printf("SOCK_NONBLOCK");
       printf(")");
   }
   printf("\n");

#if USE_SOCKETCALL
   long args[6];

   args[0] = fd;
   args[1] = (long) sockaddr;
   args[2] = (long) addrlen;
   args[3] = flags;

   return syscall(SYS_socketcall, SYS_ACCEPT4, args);
#else
   return syscall(SYS_accept4, fd, sockaddr, addrlen, flags);
#endif
}

/**********************************************************************/

static int
do_test(int lfd, struct sockaddr_in *conn_addr,
       int closeonexec_flag, int nonblock_flag)
{
   int connfd, acceptfd;
   int fdf, flf, fdf_pass, flf_pass;
   struct sockaddr_in claddr;
   socklen_t addrlen;

   printf("=======================================\n");

   connfd = socket(AF_INET, SOCK_STREAM, 0);
   if (connfd == -1)
       die("socket");
   if (connect(connfd, (struct sockaddr *) conn_addr,
               sizeof(struct sockaddr_in)) == -1)
       die("connect");

   addrlen = sizeof(struct sockaddr_in);
   acceptfd = accept4(lfd, (struct sockaddr *) &claddr, &addrlen,
                      closeonexec_flag | nonblock_flag);
   if (acceptfd == -1) {
       perror("accept4()");
       close(connfd);
       return 0;
   }

   fdf = fcntl(acceptfd, F_GETFD);
   if (fdf == -1)
       die("fcntl:F_GETFD");
   fdf_pass = ((fdf & FD_CLOEXEC) != 0) ==
              ((closeonexec_flag & SOCK_CLOEXEC) != 0);
   printf("Close-on-exec flag is %sset (%s); ",
           (fdf & FD_CLOEXEC) ? "" : "not ",
           fdf_pass ? "OK" : "failed");

   flf = fcntl(acceptfd, F_GETFL);
   if (flf == -1)
       die("fcntl:F_GETFD");
   flf_pass = ((flf & O_NONBLOCK) != 0) ==
              ((nonblock_flag & SOCK_NONBLOCK) !=0);
   printf("nonblock flag is %sset (%s)\n",
           (flf & O_NONBLOCK) ? "" : "not ",
           flf_pass ? "OK" : "failed");

   close(acceptfd);
   close(connfd);

   printf("Test result: %s\n", (fdf_pass && flf_pass) ? "PASS" : "FAIL");
   return fdf_pass && flf_pass;
}

static int
create_listening_socket(int port_num)
{
   struct sockaddr_in svaddr;
   int lfd;
   int optval;

   memset(&svaddr, 0, sizeof(struct sockaddr_in));
   svaddr.sin_family = AF_INET;
   svaddr.sin_addr.s_addr = htonl(INADDR_ANY);
   svaddr.sin_port = htons(port_num);

   lfd = socket(AF_INET, SOCK_STREAM, 0);
   if (lfd == -1)
       die("socket");

   optval = 1;
   if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &optval,
                  sizeof(optval)) == -1)
       die("setsockopt");

   if (bind(lfd, (struct sockaddr *) &svaddr,
            sizeof(struct sockaddr_in)) == -1)
       die("bind");

   if (listen(lfd, 5) == -1)
       die("listen");

   return lfd;
}

int
main(int argc, char *argv[])
{
   struct sockaddr_in conn_addr;
   int lfd;
   int port_num;
   int passed;

   passed = 1;

   port_num = (argc > 1) ? atoi(argv[1]) : PORT_NUM;

   memset(&conn_addr, 0, sizeof(struct sockaddr_in));
   conn_addr.sin_family = AF_INET;
   conn_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
   conn_addr.sin_port = htons(port_num);

   lfd = create_listening_socket(port_num);

   if (!do_test(lfd, &conn_addr, 0, 0))
       passed = 0;
   if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, 0))
       passed = 0;
   if (!do_test(lfd, &conn_addr, 0, SOCK_NONBLOCK))
       passed = 0;
   if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, SOCK_NONBLOCK))
       passed = 0;

   close(lfd);

   exit(passed ? EXIT_SUCCESS : EXIT_FAILURE);
}

[mtk.manpages@gmail.com: rewrote changelog, updated test program]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: <linux-api@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-19 18:49:57 -08:00
Hiroshi Shimamoto
20a4a236c7 x86: uaccess_64: fix return value in __copy_from_user()
__copy_from_user() will return invalid value 16 when it fails to
access user space and the size is 10.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-18 22:28:58 +01:00
Ingo Molnar
73f56c0d35 Merge branch 'iommu-fixes-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2008-11-18 16:48:49 +01:00
Ingo Molnar
cbe9ee00ce Merge branch 'x86/urgent' into x86/cleanups 2008-11-18 15:41:36 +01:00
Ingo Molnar
9dacc71ff3 Merge commit 'v2.6.28-rc5' into x86/cleanups 2008-11-17 10:46:18 +01:00
David Woodhouse
52168e60f7 Revert "x86: blacklist DMAR on Intel G31/G33 chipsets"
This reverts commit e51af66308, which was
wrongly hoovered up and submitted about a month after a better fix had
already been merged.

The better fix is commit cbda1ba898
("PCI/iommu: blacklist DMAR on Intel G31/G33 chipsets"), where we do
this blacklisting based on the DMI identification for the offending
motherboard, since sometimes this chipset (or at least a chipset with
the same PCI ID) apparently _does_ actually have an IOMMU.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-15 11:37:16 -08:00
Rafael J. Wysocki
97a70e548b x86, hibernate: fix breakage on x86_32 with CONFIG_NUMA set
Impact: fix crash during hibernation on 32-bit NUMA

The NUMA code on x86_32 creates special memory mapping that allows
each node's pgdat to be located in this node's memory.  For this
purpose it allocates a memory area at the end of each node's memory
and maps this area so that it is accessible with virtual addresses
belonging to low memory.  As a result, if there is high memory,
these NUMA-allocated areas are physically located in high memory,
although they are mapped to low memory addresses.

Our hibernation code does not take that into account and for this
reason hibernation fails on all x86_32 systems with CONFIG_NUMA=y and
with high memory present.  Fix this by adding a special mapping for
the NUMA-allocated memory areas to the temporary page tables created
during the last phase of resume.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-12 23:28:51 +01:00
Len Brown
3e0fe36483 Merge branch 'misc' into release 2008-11-11 21:14:11 -05:00
Bjorn Helgaas
32836259ff ACPI: pci_link: remove acpi_irq_balance_set() interface
This removes the acpi_irq_balance_set() interface from the PCI
interrupt link driver.

x86 used acpi_irq_balance_set() to tell the PCI interrupt link
driver to configure links to minimize IRQ sharing.  But the link
driver can easily figure out whether to turn on IRQ balancing
based on the IRQ model (PIC/IOAPIC/etc), so we can get rid of
that external interface.

It's better for the driver to figure this out at init-time.  If
we set it externally via the x86 code, the interface reduces
modularity, and we depend on the fact that acpi_process_madt()
happens before we process the kernel command line.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-11 21:12:05 -05:00
H. Peter Anvin
939b787130 x86: 64 bits: shrink and align IRQ stubs
Move the IRQ stub generation to assembly to simplify it and for
consistency with 32 bits.  Doing it in a C file with asm() statements
doesn't help clarity, and it prevents some optimizations.

Shrink the IRQ stubs down to just over four bytes per (we fit seven
into a 32-byte chunk.)  This shrinks the total icache consumption of
the IRQ stubs down to an even kilobyte, if all of them are in active
use.

The downside is that we end up with a double jump, which could have a
negative effect on some pipelines.  The double jump is always inside
the same cacheline on any modern chips.

To get the most effect, cache-align the IRQ stubs.

This makes the 64-bit code match changes already done to the 32-bit
code, and should open up irqinit*.c for unification.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-11-11 13:51:52 -08:00
H. Peter Anvin
4687518c4c x86: 32 bit: interrupt stub consistency with 64 bit
Don't generate interrupt stubs for interrupt vectors below
FIRST_EXTERNAL_VECTOR, and make the table of interrupt vectors
(interrupt[]) __initconst.  Both of these changes both conserve memory
and improve consistency with 64 bits.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-11-11 13:03:07 -08:00
Harvey Harrison
19f47c634e x86: x86_32 has its own irq_regs definition
Impact: cleanup

Arches that have their own irq_regs definition are expected to
define ARCH_HAS_OWN_IRQ_REGS or else a generic (unused) set
will also be defined in lib/irq_regs.c

Sparse noticed the unused generic one had no prototype:
lib/irq_regs.c:15:1: warning: symbol 'per_cpu____irq_regs' was not declared. Should it be static?

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-10 08:41:47 +01:00
Ingo Molnar
4fcc50abdf x86: clean up vget_cycles()
Impact: remove unused variable

I forgot to remove the now unused "cycles_t cycles" parameter from
vget_cycles() - which triggers build warnings as tsc.h is included
in a number of files.

Remove it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-09 21:05:43 +01:00
Arjan van de Ven
3044646148 x86: move iomap.h to the new include location
a new file was accidentally added to include/asm-x86;
move it to the new arch/x86/include/asm location

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
2008-11-09 10:07:58 -08:00
Ingo Molnar
cb9e35dce9 x86: clean up rdtsc_barrier() use
Impact: cleanup

Move rdtsc_barrier() use to vsyscall_64.c where it's relied on,
and point out its role in the context of its use.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-08 20:27:00 +01:00
Ingo Molnar
895e031707 Merge branch 'linus' into x86/cleanups 2008-11-08 20:23:02 +01:00
Ingo Molnar
0d12cdd5f8 sched: improve sched_clock() performance
in scheduler-intense workloads native_read_tsc() overhead accounts for
20% of the system overhead:

 659567 system_call                              41222.9375
 686796 schedule                                 435.7843
 718382 __switch_to                              665.1685
 823875 switch_mm                                4526.7857
 1883122 native_read_tsc                          55385.9412
 9761990 total                                      2.8468

this is large part due to the rdtsc_barrier() that is done before
and after reading the TSC.

But sched_clock() is not a precise clock in the GTOD sense, using such
barriers is completely pointless. So remove the barriers and only use
them in vget_cycles().

This improves lat_ctx performance by about 5%.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-08 16:48:19 +01:00
Linus Torvalds
a15a82f42c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Revert "x86: default to reboot via ACPI"
  x86: align DirectMap in /proc/meminfo
  AMD IOMMU: fix lazy IO/TLB flushing in unmap path
  x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR
  x86: remove VISWS and PARAVIRT around NR_IRQS puzzle
  x86: mention ACPI in top-level Kconfig menu
  x86: size NR_IRQS on 32-bit systems the same way as 64-bit
  x86: don't allow nr_irqs > NR_IRQS
  x86/docs: remove noirqbalance param docs
  x86: don't use tsc_khz to calculate lpj if notsc is passed
  x86, voyager: fix smp_intr_init() compile breakage
  AMD IOMMU: fix detection of NP capable IOMMUs
2008-11-06 15:57:24 -08:00
Yinghai Lu
7db282fa67 x86: remove VISWS and PARAVIRT around NR_IRQS puzzle
Impact: fix warning message when PARAVIRT is set in config

Remove stale #ifdef components from our IRQ sizing logic.
x86/Voyager is the only holdout.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 09:35:34 +01:00
Yinghai Lu
1b48976880 x86: size NR_IRQS on 32-bit systems the same way as 64-bit
Impact: make NR_IRQS big enough for system with lots of apic/pins

If lots of IO_APIC's are there (or can be there), size the same way
as 64-bit, depending on MAX_IO_APICS and NR_CPUS.

This fixes the boot problem reported by Ben Hutchings on a 32-bit
server with 5 IO-APICs and 240 IO-APIC pins.

Signed-off-by: Yinghai <yinghai@kernel.org>
Tested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 07:23:22 +01:00
Uros Bizjak
838e8bb71d x86: Implement change_bit with immediate operand as "lock xorb"
Impact: Minor optimization.

Implement change_bit with immediate bit count as "lock xorb". This is
similar to  "lock orb" and "lock andb"  for set_bit and clear_bit
functions.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-11-05 09:51:02 -08:00
Ingo Molnar
9fcd18c9e6 sched: re-tune balancing
Impact: improve wakeup affinity on NUMA systems, tweak SMP systems

Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
balancing defaults on NUMA and SMP systems.

Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
why we would not want to have wakeup affinity across nodes as well.
(we already do this in the standard NUMA template.)

lat_ctx on a NUMA box is particularly happy about this change:

before:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.60
 |   2 5.70

after:

 |   phoenix:~/l> ./lat_ctx -s 0 2
 |   "size=0k ovr=2.65
 |   2 2.07

a 2.75x speedup.

pipe-test is similarly happy about it too:

 |  phoenix:~/sched-tests> ./pipe-test
 |   18.26 usecs/loop.
 |   14.70 usecs/loop.
 |   14.38 usecs/loop.
 |   10.55 usecs/loop.              # +WAKE_AFFINE on domain0+domain1
 |   8.63 usecs/loop.
 |   8.59 usecs/loop.
 |   9.03 usecs/loop.
 |   8.94 usecs/loop.
 |   8.96 usecs/loop.
 |   8.63 usecs/loop.

Also:

 - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings)
 - enable SD_WAKE_BALANCE on SMP domains

Sysbench+postgresql improves all around the board, quite significantly:

           .28-rc3-11474e2c  .28-rc3-11474e2c-tune
-------------------------------------------------
    1:             571              688    +17.08%
    2:            1236             1206    -2.55%
    4:            2381             2642    +9.89%
    8:            4958             5164    +3.99%
   16:            9580             9574    -0.07%
   32:            7128             8118    +12.20%
   64:            7342             8266    +11.18%
  128:            7342             8064    +8.95%
  256:            7519             7884    +4.62%
  512:            7350             7731    +4.93%
-------------------------------------------------
  SUM:           55412            59341    +6.62%

So it's a win both for the runup portion, the peak area and the tail.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 18:04:38 +01:00
Linus Torvalds
da4a22cba7 Merge branch 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  io mapping: clean up #ifdefs
  io mapping: improve documentation
  i915: use io-mapping interfaces instead of a variety of mapping kludges
  resources: add io-mapping functions to dynamically map large device apertures
  x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
2008-11-03 10:15:40 -08:00
James Bottomley
73557af5bf x86, voyager: fix smp_intr_init() compile breakage
Impact: fix x86/Voyager build

Looks like this became static on the rest of x86.  Fix it up by adding
an external definition to mach-voyager/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-03 10:52:21 +01:00
Venki Pallipadi
2576c99917 x86: fix AMDC1E and XTOPOLOGY conflict in cpufeature
Impact: fix xsave slowdown regression

Fix two features from conflicting in feature bits.

Fixes this performance regression:

   Subject: cpu2000(both float and int) 13% regression with 2.6.28-rc1
   http://lkml.org/lkml/2008/10/28/36

Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Bisected-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 11:01:40 +01:00
Keith Packard
fd94093435 x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
Impact: introduce new APIs, separate kmap code from CONFIG_HIGHMEM

This takes the code used for CONFIG_HIGHMEM memory mappings except that
it's designed for dynamic IO resource mapping.

These fixmaps are available even with CONFIG_HIGHMEM turned off.

Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 10:12:38 +01:00
Linus Torvalds
74c75f524e Merge branch 'x86-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: cpu_index build fix
  x86/voyager: fix missing cpu_index initialisation
  x86/voyager: fix compile breakage caused by dc1e35c6e9
  x86: fix /dev/mem mmap breakage when PAT is disabled
  x86/voyager: fix compile breakage casued by x86: move prefill_possible_map calling early
  x86: use CONFIG_X86_SMP instead of CONFIG_SMP
  x86/voyager: fix boot breakage caused by x86: boot secondary cpus through initial_code
  x86, uv: fix compile error in uv_hub.h
  i386/PAE: fix pud_page()
  x86: remove debug code from arch_add_memory()
  x86: start annotating early ioremap pointers with __iomem
  x86: two trivial sparse annotations
  x86: fix init_memory_mapping for [dc000000 - e0000000) - v2
2008-10-30 18:33:46 -07:00
James Bottomley
b3572e361b x86/voyager: fix compile breakage caused by dc1e35c6e9
Impact: build fix on x86/Voyager

Given commits like this:

| Author: Suresh Siddha <suresh.b.siddha@intel.com>
| Date:   Tue Jul 29 10:29:19 2008 -0700
|
|     x86, xsave: enable xsave/xrstor on cpus with xsave support

Which deliberately expose boot cpu dependence to pieces of the system,
I think it's time to explicitly have a variable for it to prevent this
continual misassumption that the boot CPU is zero.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:19:33 +01:00
Linus Torvalds
8bd93ca7b0 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, gart: fix gart detection for Fam11h CPUs
  x86: 64 bit print out absent pages num too
  x86, kdump: fix invalid access on i386 sparsemem
  x86: fix APIC_DEBUG with inquire_remote_apic
  x86: AMD microcode patch loader author update
  x86: microcode patch loader author update
  mailmap: add Peter Oruba
  x86, bts: improve help text for BTS config
  doc/x86: fix doc subdirs
2008-10-30 12:50:59 -07:00
Linus Torvalds
d6c3112abe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/PCI: build failure at x86/kernel/pci-dma.c with !CONFIG_PCI
2008-10-30 12:09:44 -07:00
Mike Travis
c08b6acc9b x86, uv: fix compile error in uv_hub.h
Impact: include file dependency cleanup

Fix compile errors of files that include asm/uv/uv_hub.h but do
not include linux/timer.h.

[ such files are not mainline right now. ]

Signed-of-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 19:38:46 +01:00
Jan Beulich
ab00fee30c i386/PAE: fix pud_page()
Impact: cleanup

To the unsuspecting user it is quite annoying that this broken and
inconsistent with x86-64 definition still exists.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 11:47:50 +01:00