Commit Graph

49672 Commits

Author SHA1 Message Date
Paul Mackerras
9f5f9ffe50 powerpc/perf: Fix sampling enable for PPC970
The logic to distinguish marked instruction events from ordinary events
on PPC970 and derivatives was flawed.  The result is that instruction
sampling didn't get enabled in the PMU for some marked instruction
events, so they would never trigger.  This fixes it by adding the
appropriate break statements in the switch statement.

Reported-by: David Binderman <dcb314@hotmail.com>
Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-23 17:03:56 +10:00
Steven Rostedt
258af47479 tracing/x86: Don't use mcount in kvmclock.c
The guest can use the paravirt clock in kvmclock.c which is used
by sched_clock(), which in turn is used by the tracing mechanism
for timestamps, which leads to infinite recursion.

Disable mcount/tracing for kvmclock.o.

Cc: stable@kernel.org
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-09-22 23:01:19 -04:00
Jeremy Fitzhardinge
9ecd4e1689 tracing/x86: Don't use mcount in pvclock.c
When using a paravirt clock, pvclock.c can be used by sched_clock(),
which in turn is used by the tracing mechanism for timestamps,
which leads to infinite recursion.

Disable mcount/tracing for pvclock.o.

Cc: stable@kernel.org
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
LKML-Reference: <4C9A9A3F.4040201@goop.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-09-22 23:00:50 -04:00
Richard Weinberger
cb1dcc0ff4 uml: fix compile warning
This fixes:
incompatible pointer type:  => 89
       arch/um/kernel/exec.c: warning: passing argument 2 of 'execve1' from
incompatible pointer type:  => 69, 85
       arch/um/kernel/exec.c: warning: passing argument 3 of 'execve1' from
incompatible pointer type:  => 69, 85

which was introduced by d7627467b7 ("Make do_execve() take a const
filename pointer")

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-22 17:22:39 -07:00
FUJITA Tomonori
710224fa27 arm: fix "arm: fix pci_set_consistent_dma_mask for dmabounce devices"
This fixes the regression caused by the commit 6fee48cd33
("dma-mapping: arm: use generic pci_set_dma_mask and
pci_set_consistent_dma_mask").

ARM needs to clip the dma coherent mask for dmabounce devices. This
restores the old trick.

Note that strictly speaking, the DMA API doesn't allow architectures to do
such but I'm not sure it's worth adding the new API to set the dma mask
that allows architectures to clip it.

Reported-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-22 17:22:38 -07:00
Linus Torvalds
c79bd89282 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Prevent no-handler signal syscall restart recursion.
  sparc: Don't mask signal when we can't setup signal frame.
  sparc64: Fix race in signal instruction flushing.
  sparc64: Support RAW perf events.
2010-09-22 12:09:46 -07:00
Al Viro
9a81c16b52 powerpc: fix double syscall restarts
Make sigreturn zero regs->trap, make do_signal() do the same on all
paths.  As it is, signal interrupting e.g. read() from fd 512 (==
ERESTARTSYS) with another signal getting unblocked when the first
handler finishes will lead to restart one insn earlier than it ought
to.  Same for multiple signals with in-kernel handlers interrupting
that sucker at the same time.  Same for multiple signals of any kind
interrupting that sucker on 64bit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-22 09:33:50 -07:00
David S. Miller
c278525978 sparc: Prevent no-handler signal syscall restart recursion.
Explicitly clear the "in-syscall" bit when we have no signal
handler and back up the program counters to back up the system
call.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-21 22:30:13 -07:00
David S. Miller
392c21802e sparc: Don't mask signal when we can't setup signal frame.
Don't invoke the signal handler tracehook in that situation
either.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-21 21:41:12 -07:00
Linus Torvalds
87ac6fa26e Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hw breakpoints: Fix pid namespace bug
  x86: Fix instruction breakpoint encoding
  oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540)
  kprobes: Fix Kconfig dependency
2010-09-21 13:21:42 -07:00
Linus Torvalds
4e24db5b1a Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  lguest: update comments to reflect LHCALL_LOAD_GDT_ENTRY.
  virtio: console: Prevent userspace from submitting NULL buffers
  virtio: console: Fix poll blocking even though there is data to read
2010-09-21 11:00:09 -07:00
Yinghai Lu
74b3c444a9 x86, setup: Fix earlyprintk=serial,0x3f8,115200
earlyprintk can take and I/O port, so we need to handle this case in
the setup code too, otherwise 0x3f8 will be treated as a baud rate.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4C7B05A6.4010801@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-21 10:18:33 -07:00
Yinghai Lu
83d9f65bda x86, setup: Fix earlyprintk=serial,ttyS0,115200
Torsten reported that there is garbage output,
after commit 8fee13a48e (x86,
setup: enable early console output from the decompressor)

It turns out we missed the offset for that case.

Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4C7B0578.8090807@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-21 10:18:14 -07:00
Jiri Olsa
bb7ab785ad oprofile: Add Support for Intel CPU Family 6 / Model 29
This patch adds CPU type detection for dunnington processor (Family 6
/ Model 29) to be identified as core 2 family cpu type (wikipedia
source).

I tested oprofile on Intel(R) Xeon(R) CPU E7440 reporting itself as
model 29, and it runs without an issue.

Spec:

 http://www.intel.com/Assets/en_US/PDF/specupdate/320336.pdf

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-09-21 12:22:48 +02:00
David S. Miller
05c5e7698b sparc64: Fix race in signal instruction flushing.
If another cpu does a very wide munmap() on the signal frame area,
it can tear down the page table hierarchy from underneath us.

Borrow an idea from the 64-bit fault path's get_user_insn(), and
disable cross call interrupts during the page table traversal
to lock them in place while we operate.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-20 23:24:52 -07:00
Rusty Russell
9b6efcd2e2 lguest: update comments to reflect LHCALL_LOAD_GDT_ENTRY.
We used to have a hypercall which reloaded the entire GDT, then we
switched to one which loaded a single entry (to match the IDT code).

Some comments were not updated, so fix them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported by: Eviatar Khen <eviatarkhen@gmail.com>
2010-09-21 10:54:02 +09:30
Al Viro
ed1cde6836 frv: double syscall restarts, syscall restart in sigreturn()
We need to make sure that only the first do_signal() to be handled on
the way out syscall will bother with syscall restarts; additionally, the
check on the "signal has user handler" path had been wrong - compare
with restart prevention in sigreturn()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-20 10:44:38 -07:00
Al Viro
44c7afffa4 frv: handling of restart into restart_syscall is fscked
do_signal() should place the syscall number in gr7, not gr8 when
handling ERESTART_WOULDBLOCK.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-20 10:44:38 -07:00
Al Viro
ad0acab455 frv: avoid infinite loop of SIGSEGV delivery
Use force_sigsegv() rather than force_sig(SIGSEGV, ...) as the former
resets the SEGV handler pointer which will kill the process, rather than
leaving it open to an infinite loop if the SEGV handler itself caused a
SEGV signal.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-20 10:44:37 -07:00
Al Viro
5f4ad04a1e frv: fix address verification holes in setup_frame/setup_rt_frame
a) sa_handler might be maliciously set to point to kernel memory;
   blindly dereferencing it in FDPIC case is a Bad Idea(tm).

b) I'm not sure you need that set_fs(USER_DS) there at all, but if you
   do, you'd better do it *before* checking the frame you've decided to
   use with access_ok(), lest sigaltstack() becomes a convenient
   roothole.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-20 10:44:37 -07:00
Al Viro
20cd514d0f frv: restart_block.fn needs to be reset on sigreturn
Reset restart_block.fn on executing a sigreturn such that any currently
pending system call restarts will be forced to return -EINTR.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-20 10:44:37 -07:00
Eric Miao
c4a90588fa ARM: dove: fix __io() definition to use bus based offset
Signed-off-by: Eric Miao <eric.miao@canonical.com>
Acked-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
2010-09-19 22:43:42 -04:00
Arnaud Patard
e4ff1c39ee ARM: kirkwood: Unbreak PCIe I/O port
The support for the 2 pcie port of the 6282 has broken i/o port by switching
*_IO_PHYS_BASE and *_IO_BUS_BASE. In fact, the patches reintroduced the same
bug solved by commit 35f029e251.
So, I'm adding back *_IO_BUS_BASE in resource declaration and fix definition
of KIRKWOOD_PCIE1_IO_BUS_BASE. With this change, the xgi card on my t5325 is
working again.

Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Cc: stable@kernel.org
2010-09-19 22:43:25 -04:00
Linus Torvalds
2422084a94 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: deal with multiple simultaneously pending signals
  alpha: fix a 14 years old bug in sigreturn tracing
  alpha: unb0rk sigsuspend() and rt_sigsuspend()
  alpha: belated ERESTART_RESTARTBLOCK race fix
  alpha: Shift perf event pending work earlier in timer interrupt
  alpha: wire up fanotify and prlimit64 syscalls
  alpha: kill big kernel lock
  alpha: fix build breakage in asm/cacheflush.h
  alpha: remove unnecessary cast from void* in assignment.
  alpha: Use static const char * const where possible
2010-09-19 11:09:23 -07:00
Linus Torvalds
f1c9c9797a Merge branch 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung
* 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: S3C64XX: Add IORESOURCE_IRQ_HIGHLEVEL flag to dm9000 on mach-real6410
  ARM: S3C64XX: Fix coding style errors on mach-real6410
  ARM: S3C64XX: Prototype SPI devices
  ARM: S3C64XX: Fix dev-spi build
  ARM: SAMSUNG: Fix on s5p_gpio_[get,set]_drvstr
  ARM: SAMSUNG: Fix on drive strength value
  ARM: S5PV210: Add FIMC clocks
  ARM: S5PV210: Reduce the iodesc length of systimer
  ARM: S5PV210: Update I2C-1 Clock Register Property.
  ARM: S5P: Decrease IO Registers memory region size on FIMC
  ARM: S5P: Fix DMA coherent mask for FIMC
2010-09-19 11:05:05 -07:00
Russell King
d93c333dc8 ARM: Fix build error when using KCONFIG_CONFIG
Jonathan Cameron reports that when using the environment
variable KCONFIG_CONFIG, he encounters this error:

make[2]: *** No rule to make target `.config', needed by `arch/arm/boot/compressed/vmlinux.lds'

Reported-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 16:18:37 +01:00
Catalin Marinas
d907387c42 ARM: 6383/1: Implement phys_mem_access_prot() to avoid attributes aliasing
ARMv7 onwards requires that there are no aliases to the same physical
location using different memory types (i.e. Normal vs Strongly Ordered).
Access to SO mappings when the unaligned accesses are handled in
hardware is also Unpredictable (pgprot_noncached() mappings in user
space).

The /dev/mem driver requires uncached mappings with O_SYNC. The patch
implements the phys_mem_access_prot() function which generates Strongly
Ordered memory attributes if !pfn_valid() (independent of O_SYNC) and
Normal Noncacheable (writecombine) if O_SYNC.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 12:19:18 +01:00
Peter Korsgaard
79e27dc067 ARM: 6400/1: at91: fix arch_gettimeoffset fallout
5cfc8ee0bb (ARM: convert arm to arch_gettimeoffset()) marked all of
at91 AND at91x40 as needing ARCH_USES_GETTIMEOFFSET, and hence no high
res timer support / accurate clock_gettime() - But only at91x40 needs it.

Cc: stable@kernel.org
Signed-off-by: Peter Korsgaard <peter.korsgaard@barco.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 12:16:27 +01:00
Al Viro
494486a1d2 alpha: deal with multiple simultaneously pending signals
Unlike the other targets, alpha sets _one_ sigframe and
buggers off until the next syscall/interrupt, even if
more signals are pending.  It leads to quite a few unpleasant
inconsistencies, starting with SIGSEGV potentially arriving
not where it should and including e.g. mess with sigsuspend();
consider two pending signals blocked until sigsuspend()
unblocks them.  We pick the first one; then, if we are hit
by interrupt while in the handler, we process the second one
as well.  If we are not, and if no syscalls had been made,
we get out of the first handler and leave the second signal
pending; normally sigreturn() would've picked it anyway, but
here it starts with restoring the original mask and voila -
the second signal is blocked again.  On everything else we
get both delivered consistently.

It's actually easy to fix; the only thing to watch out for
is prevention of double syscall restart.  Fortunately, the
idea I've nicked from arm fix by rmk works just fine...

Testcase demonstrating the behaviour in question; on alpha
we get one or both flags set (usually one), on everything
else both are always set.
	#include <signal.h>
	#include <stdio.h>
	int had1, had2;
	void f1(int sig) { had1 = 1; }
	void f2(int sig) { had2 = 1; }
	main()
	{
		sigset_t set1, set2;
		sigemptyset(&set1);
		sigemptyset(&set2);
		sigaddset(&set2, 1);
		sigaddset(&set2, 2);
		signal(1, f1);
		signal(2, f2);
		sigprocmask(SIG_SETMASK, &set2, NULL);
		raise(1);
		raise(2);
		sigsuspend(&set1);
		printf("had1:%d had2:%d\n", had1, had2);
	}

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:08:29 -04:00
Al Viro
5329363861 alpha: fix a 14 years old bug in sigreturn tracing
The way sigreturn() is implemented on alpha breaks PTRACE_SYSCALL,
all way back to 1.3.95 when alpha has grown PTRACE_SYSCALL support.

What happens is direct return to ret_from_syscall, in order to bypass
mangling of a3 (error indicator) and prevent other mutilations of
registers (e.g. by syscall restart).  That's fine, but... the entire
TIF_SYSCALL_TRACE codepath is kept separate on alpha and post-syscall
stopping/notifying the tracer is after the syscall.  And the normal
path we are forcibly switching to doesn't have it.

So we end up with *one* stop in traced sigreturn() vs. two in other
syscalls.  And yes, strace is visibly broken by that; try to strace
the following
	#include <signal.h>
	#include <stdio.h>
	void f(int sig) {}
	main()
	{
		signal(SIGHUP, f);
		raise(SIGHUP);
		write(1, "eeeek\n", 6);
	}
and watch the show.  The
	close(1)                                = 405
in the end of strace output is coming from return value of write() (6 ==
__NR_close on alpha) and syscall number of exit_group() (__NR_exit_group ==
405 there).

The fix is fairly simple - the only thing we end up missing is the call
of syscall_trace() and we can tell whether we'd been called from the
SYSCALL_TRACE path by checking ra value.  Since we are setting the
switch_stack up (that's what sys_sigreturn() does), we have the right
environment for calling syscall_trace() - just before we call
undo_switch_stack() and return.  Since undo_switch_stack() will overwrite
s0 anyway, we can use it to store the result of "has it been called from
SYSCALL_TRACE path?" check.  The same thing applies in rt_sigreturn().

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:08:28 -04:00
Al Viro
392fb6e354 alpha: unb0rk sigsuspend() and rt_sigsuspend()
Old code used to set regs->r0 and regs->r19 to force the right
return value.  Leaving that after switch to ERESTARTNOHAND
was a Bad Idea(tm), since now that screws the restart - if we
hit the case when get_signal_to_deliver() returns 0, we will
step back to syscall insn, with v0 set to EINTR and a3 to 1.
The latter won't matter, since EINTR is 4, aka __NR_write.

Testcase:

	#include <signal.h>
	#define _GNU_SOURCE
	#include <unistd.h>
	#include <sys/syscall.h>

	main()
	{
		sigset_t mask;
		sigemptyset(&mask);
		sigaddset(&mask, SIGCONT);
		sigprocmask(SIG_SETMASK, &mask, NULL);
		kill(0, SIGCONT);
		syscall(__NR_sigsuspend, 1, "b0rken\n", 7);
	}

results on alpha in immediate message to stdout...

Fix is obvious; moreover, since we don't need regs anymore, we can
switch to normal prototypes for these guys and lose the wrappers.
Even better, rt_sigsuspend() is identical to generic version in
kernel/signal.c now.

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:08:28 -04:00
Al Viro
2deba1bd71 alpha: belated ERESTART_RESTARTBLOCK race fix
same thing as had been done on other targets back in 2003 -
move setting ->restart_block.fn into {rt_,}sigreturn().

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:08:27 -04:00
Michael Cree
bdc8b8914b alpha: Shift perf event pending work earlier in timer interrupt
Pending work from the performance event subsystem is executed in
the timer interrupt.  This patch shifts the call to
perf_event_do_pending() before the call to update_process_times()
as the latter may call back into the perf event subsystem and it
is prudent to have the pending work executed first.

Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:06:19 -04:00
Mikael Pettersson
531f0474bf alpha: wire up fanotify and prlimit64 syscalls
The 2.6.36-rc kernel added three new system calls:
fanotify_init, fanotify_mark, and prlimit64.  This
patch wires them up on Alpha.

Built and booted on an XP900.  Untested beyond that.

Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:06:19 -04:00
Arnd Bergmann
12e750d956 alpha: kill big kernel lock
All uses of the BKL on alpha are totally bogus, nothing
is really protected by this. Remove the remaining users
so we don't have to mark alpha as 'depends on BKL'.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: linux-alpha@vger.kernel.org
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:06:18 -04:00
Tejun Heo
b97f897d60 alpha: fix build breakage in asm/cacheflush.h
Alpha SMP flush_icache_user_range() is implemented as an inline
function inside include/asm/cacheflush.h.  It dereferences @current
but doesn't include linux/sched.h and thus causes build failure if
linux/sched.h wasn't included previously.  Fix it by including the
needed header file explicitly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:06:18 -04:00
matt mooney
af96f8a340 alpha: remove unnecessary cast from void* in assignment.
Acked-by: Jan-Benedict Glaw <jbglaw@lug-owl.de>
Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:06:17 -04:00
Joe Perches
31019075f4 alpha: Use static const char * const where possible
Acked-by: Richard Henderson  <rth@twiddle.net>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2010-09-18 23:06:17 -04:00
Darius Augulis
4d89ecaae9 ARM: S3C64XX: Add IORESOURCE_IRQ_HIGHLEVEL flag to dm9000 on mach-real6410
Add IORESOURCE_IRQ_HIGHLEVEL irq flag to dm9000 driver
platform data in board mach-real6410.

Signed-off-by: Darius Augulis <augulis.darius@gmail.com>
[kgene.kim@samsung.com: minor title fix]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-18 09:54:55 +09:00
Darius Augulis
591cd25ee3 ARM: S3C64XX: Fix coding style errors on mach-real6410
Fix errors reported by checkpatch.pl script

Signed-off-by: Darius Augulis <augulis.darius@gmail.com>
[kgene.kim@samsung.com: minor title fix]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-18 09:54:55 +09:00
Mark Brown
5343795fda ARM: S3C64XX: Prototype SPI devices
Avoids build warnings due to the undeclared non-statics.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-18 09:54:54 +09:00
Al Viro
653d48b221 arm: fix really nasty sigreturn bug
If a signal hits us outside of a syscall and another gets delivered
when we are in sigreturn (e.g. because it had been in sa_mask for
the first one and got sent to us while we'd been in the first handler),
we have a chance of returning from the second handler to location one
insn prior to where we ought to return.  If r0 happens to contain -513
(-ERESTARTNOINTR), sigreturn will get confused into doing restart
syscall song and dance.

Incredible joy to debug, since it manifests as random, infrequent and
very hard to reproduce double execution of instructions in userland
code...

The fix is simple - mark it "don't bother with restarts" in wrapper,
i.e. set r8 to 0 in sys_sigreturn and sys_rt_sigreturn wrappers,
suppressing the syscall restart handling on return from these guys.
They can't legitimately return a restart-worthy error anyway.

Testcase:
	#include <unistd.h>
	#include <signal.h>
	#include <stdlib.h>
	#include <sys/time.h>
	#include <errno.h>

	void f(int n)
	{
		__asm__ __volatile__(
			"ldr r0, [%0]\n"
			"b 1f\n"
			"b 2f\n"
			"1:b .\n"
			"2:\n" : : "r"(&n));
	}

	void handler1(int sig) { }
	void handler2(int sig) { raise(1); }
	void handler3(int sig) { exit(0); }

	main()
	{
		struct sigaction s = {.sa_handler = handler2};
		struct itimerval t1 = { .it_value = {1} };
		struct itimerval t2 = { .it_value = {2} };

		signal(1, handler1);

		sigemptyset(&s.sa_mask);
		sigaddset(&s.sa_mask, 1);
		sigaction(SIGALRM, &s, NULL);

		signal(SIGVTALRM, handler3);

		setitimer(ITIMER_REAL, &t1, NULL);
		setitimer(ITIMER_VIRTUAL, &t2, NULL);

		f(-513); /* -ERESTARTNOINTR */

		write(1, "buggered\n", 9);
		return 1;
	}

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-17 10:22:18 -07:00
Daniel Walker
14eff18126 ARM: 6398/1: add proc info for ARM11MPCore/Cortex-A9 from ARM
Setting of these bits can cause issues on other SMP SoC's not produced
by ARM.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Daniel Walker <dwalker@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 16:44:24 +01:00
Russell King
b2b163bb82 ARM: prevent multiple syscall restarts
Al Viro reports that calling "sys_sigsuspend(-ERESTARTNOHAND, 0, 0)"
with two signals coming and being handled in kernel space results
in the syscall restart being done twice.

Avoid this by clearing the 'why' flag when we call the signal handling
code to prevent further syscall restarts after the first.

Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 14:56:16 +01:00
Catalin Marinas
1a8e41cd67 ARM: 6395/1: VExpress: Set bit 22 in the PL310 (cache controller) AuxCtlr register
Clearing bit 22 in the PL310 Auxiliary Control register (shared
attribute override enable) has the side effect of transforming Normal
Shared Non-cacheable reads into Cacheable no-allocate reads.

Coherent DMA buffers in Linux always have a Cacheable alias via the
kernel linear mapping and the processor can speculatively load cache
lines into the PL310 controller. With bit 22 cleared, Non-cacheable
reads would unexpectedly hit such cache lines leading to buffer
corruption.

Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 10:16:52 +01:00
Will Deacon
a672e99b12 ARM: 6389/1: errata: incorrect hazard handling in the SCU may lead to data corruption
On the r2p0, r2p1 and r2p2 versions of the Cortex-A9, data corruption
can occur if a shared cache line is replaced on one CPU as another CPU
is accessing it.

This workaround sets two bits in the diagnostic register of the Cortex-A9,
reducing the linefill issuing capabilities of the processor and
avoiding the erroneous behaviour.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 10:16:52 +01:00
Will Deacon
9f05027c7c ARM: 6388/1: errata: DMB operation may be faulty
On versions of the Cortex-A9 up to and including r2p2, under rare
circumstances, a DMB instruction between 2 write operations may not
ensure the correct visibility ordering of the 2 writes.

This workaround sets a bit in the diagnostic register of the Cortex-A9,
causing the DMB instruction to behave like a DSB, which functions
correctly on the affected cores.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 10:16:51 +01:00
Will Deacon
6491848d1a ARM: 6387/1: errata: check primary part ID in proc-v7.S
Kconfig doesn't have any knowledge of specific v7 cores, so it is possible
to select errata workarounds that may cause inadvertent behaviour when
executed on a core other than those targetted by the fix.

This patch improves the variant and revision checking in proc-v7.S so
that the primary part number is also considered when applying errata
workarounds.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 10:16:50 +01:00
Linus Walleij
63f469324f ARM: 6377/1: supply _cansleep gpio function to U300
We have to use _cansleep gpio accessors in the MMCI driver so as
to avoid slowpath warnings, now U300 has MMCI but doesn't have
these functions in place to siply wrap the existing non-sleeping
functions into sleepable variants.

Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 10:16:49 +01:00
Linus Walleij
a0719f52d9 ARM: 6376/1: plat-nomadik: MTU: Change prescaler limit and comment updates
The prescaler 16 is now used only when the timer runs at 32 MHz
or more. Some comment updates as well.

Acked-by: Alessandro Rubini <rubini@unipv.it>
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 10:16:48 +01:00
Linus Walleij
99f76891a3 ARM: 6375/1: plat-nomadik: MTU timer trivial bug fix
timer0 to 3 are all on mtu block 0, so don't calculate the clock event
rate based upon mtu block 1's clock speed.

Acked-by: Alessandro Rubini <rubini@unipv.it>
Signed-off-by: Jonas Aaberg <jonas.aberg@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-17 10:16:47 +01:00
Linus Torvalds
a5b617368c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: hpet: Work around hardware stupidity
  x86, build: Disable -fPIE when compiling with CONFIG_CC_STACKPROTECTOR=y
  x86, cpufeature: Suppress compiler warning with gcc 3.x
  x86, UV: Fix initialization of max_pnode
2010-09-16 19:38:08 -07:00
Frederic Weisbecker
89e45aac42 x86: Fix instruction breakpoint encoding
Lengths and types of breakpoints are encoded in a half byte
into CPU registers. However when we extract these values
and store them, we add a high half byte part to them: 0x40 to the
length and 0x80 to the type.
When that gets reloaded to the CPU registers, the high part
is masked.

While making the instruction breakpoints available for perf,
I zapped that high part on instruction breakpoint encoding
and that broke the arch -> generic translation used by ptrace
instruction breakpoints. Writing dr7 to set an inst breakpoint
was then failing.

There is no apparent reason for these high parts so we could get
rid of them altogether. That's an invasive change though so let's
do that later and for now fix the problem by restoring that inst
breakpoint high part encoding in this sole patch.

Reported-by: Kelvie Wong <kelvie@ieee.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
2010-09-17 03:24:13 +02:00
Linus Torvalds
7bb419041b Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Optimize ticket spinlocks in fsys_rt_sigprocmask
2010-09-16 12:58:44 -07:00
Linus Torvalds
8be7eb359d Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch/tile: fix formatting bug in register dumps
  arch/tile: fix memcpy_fromio()/memcpy_toio() signatures
  arch/tile: Save and restore extra user state for tilegx
  arch/tile: Change struct sigcontext to be more useful
  arch/tile: finish const-ifying sys_execve()
2010-09-16 12:54:54 -07:00
Ingo Molnar
0d2b54904d Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent 2010-09-16 16:36:19 +02:00
Patrick Simmons
c33f543d32 oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540)
This patch adds CPU type detection for the Intel Celeron 540, which is
part of the Core 2 family according to Wikipedia; the family and ID pair
is absent from the Volume 3B table referenced in the source code
comments.  I have tested this patch on an Intel Celeron 540 machine
reporting itself as Family 6 Model 22, and OProfile runs on the machine
without issue.

Spec:

 http://download.intel.com/design/mobile/SPECUPDT/317667.pdf

Signed-off-by: Patrick Simmons <linuxrocks123@netscape.net>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
2010-09-16 12:35:56 +02:00
Petr Tesarik
2d2b690164 [IA64] Optimize ticket spinlocks in fsys_rt_sigprocmask
Tony's fix (f574c84319) has a small bug,
it incorrectly uses "r3" as a scratch register in the first of the two
unlock paths ... it is also inefficient.  Optimize the fast path again.

Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-09-15 15:35:48 -07:00
Tony Lindgren
359f64f7b3 omap: Fix compile dependency to LEDS_CLASS
If we LEDS_CLASS is not selected, we will get undefined reference
to `led_classdev_register'.

Signed-off-by: Tony Lindgren <tony@atomide.com>
2010-09-15 10:18:51 -07:00
Chris Metcalf
7040dea4d2 arch/tile: fix formatting bug in register dumps
This cut-and-paste bug was caused by rewriting the register dump
code to use only a single printk per line of output.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-09-15 11:17:05 -04:00
Chris Metcalf
0fab59e5dd arch/tile: fix memcpy_fromio()/memcpy_toio() signatures
This tripped up a driver (not yet committed to git).  Fix it now.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-09-15 11:17:04 -04:00
Chris Metcalf
a802fc6854 arch/tile: Save and restore extra user state for tilegx
During context switch, save and restore a couple of additional bits of
tilegx user state that can be persistently modified by userspace.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-09-15 11:16:10 -04:00
Chris Metcalf
74fca9da09 arch/tile: Change struct sigcontext to be more useful
Rather than just using pt_regs, it now contains the actual saved
state explicitly, similar to pt_regs.  By doing it this way, we
provide a cleaner API for userspace (or equivalently, we avoid the
need for libc to provide its own definition of sigcontext).

While we're at it, move PT_FLAGS_xxx to where they are not visible
from userspace.  And always pass siginfo and mcontext to signal
handlers, even if they claim they don't need it, since sometimes
they actually try to use it anyway in practice.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-09-15 11:16:08 -04:00
Chris Metcalf
e6e6c46d75 arch/tile: finish const-ifying sys_execve()
The sys_execve() implementation was properly const-ified but not
the declaration, the syscall wrappers, or the compat version.
This change completes the constification process.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-09-15 11:16:05 -04:00
Linus Torvalds
9c03f1622a Merge ssh://master.kernel.org/home/hpa/tree/sec
* ssh://master.kernel.org/home/hpa/tree/sec:
  x86-64, compat: Retruncate rax after ia32 syscall entry tracing
  x86-64, compat: Test %rax for the syscall number, not %eax
  compat: Make compat_alloc_user_space() incorporate the access_ok()
2010-09-14 17:07:51 -07:00
David Howells
a4128b03ff MN10300: Fix up the IRQ names for the on-chip serial ports
Fix up the IRQ names for the MN10300 on-chip serial ports in the driver as
request_interrupt() no longer allows names containing slashes, giving a warning
like the following if one is encountered:

	------------[ cut here ]------------
	WARNING: at fs/proc/generic.c:323 __xlate_proc_name+0x62/0x7c()
	name 'ttySM0/Rx'

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-14 17:06:28 -07:00
Roland McGrath
eefdca043e x86-64, compat: Retruncate rax after ia32 syscall entry tracing
In commit d4d6715, we reopened an old hole for a 64-bit ptracer touching a
32-bit tracee in system call entry.  A %rax value set via ptrace at the
entry tracing stop gets used whole as a 32-bit syscall number, while we
only check the low 32 bits for validity.

Fix it by truncating %rax back to 32 bits after syscall_trace_enter,
in addition to testing the full 64 bits as has already been added.

Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-14 16:08:47 -07:00
H. Peter Anvin
36d001c70d x86-64, compat: Test %rax for the syscall number, not %eax
On 64 bits, we always, by necessity, jump through the system call
table via %rax.  For 32-bit system calls, in theory the system call
number is stored in %eax, and the code was testing %eax for a valid
system call number.  At one point we loaded the stored value back from
the stack to enforce zero-extension, but that was removed in checkin
d4d6715016.  An actual 32-bit process
will not be able to introduce a non-zero-extended number, but it can
happen via ptrace.

Instead of re-introducing the zero-extension, test what we are
actually going to use, i.e. %rax.  This only adds a handful of REX
prefixes to the code.

Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
2010-09-14 16:08:46 -07:00
H. Peter Anvin
c41d68a513 compat: Make compat_alloc_user_space() incorporate the access_ok()
compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area.  A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures.  This should be
followed by checking the return value of compat_access_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.

Reported-by: Ben Hawkes <hawkes@sota.gen.nz>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James Bottomley <jejb@parisc-linux.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: <stable@kernel.org>
2010-09-14 16:08:45 -07:00
Thomas Gleixner
54ff7e595d x86: hpet: Work around hardware stupidity
This more or less reverts commits 08be979 (x86: Force HPET
readback_cmp for all ATI chipsets) and 30a564be (x86, hpet: Restrict
read back to affected ATI chipsets) to the status of commit 8da854c
(x86, hpet: Erratum workaround for read after write of HPET
comparator).

The delta to commit 8da854c is mostly comments and the change from
WARN_ONCE to printk_once as we know the call path of this function
already.

This needs really in depth explanation:

First of all the HPET design is a complete failure. Having a counter
compare register which generates an interrupt on matching values
forces the software to do at least one superfluous readback of the
counter register.

While it is nice in theory to program "absolute" time events it is
practically useless because the timer runs at some absurd frequency
which can never be matched to real world units. So we are forced to
calculate a relative delta and this forces a readout of the actual
counter value, adding the delta and programming the compare
register. When the delta is small enough we run into the danger that
we program a compare value which is already in the past. Due to the
compare for equal nature of HPET we need to read back the counter
value after writing the compare rehgister (btw. this is necessary for
absolute timeouts as well) to make sure that we did not miss the timer
event. We try to work around that by setting the minimum delta to a
value which is larger than the theoretical time which elapses between
the counter readout and the compare register write, but that's only
true in theory. A NMI or SMI which hits between the readout and the
write can easily push us beyond that limit. This would result in
waiting for the next HPET timer interrupt until the 32bit wraparound
of the counter happens which takes about 306 seconds.

So we designed the next event function to look like:

   match = read_cnt() + delta;
   write_compare_ref(match);
   return read_cnt() < match ? 0 : -ETIME;

At some point we got into trouble with certain ATI chipsets. Even the
above "safe" procedure failed. The reason was that the write to the
compare register was delayed probably for performance reasons. The
theory was that they wanted to avoid the synchronization of the write
with the HPET clock, which is understandable. So the write does not
hit the compare register directly instead it goes to some intermediate
register which is copied to the real compare register in sync with the
HPET clock. That opens another window for hitting the dreaded "wait
for a wraparound" problem.

To work around that "optimization" we added a read back of the compare
register which either enforced the update of the just written value or
just delayed the readout of the counter enough to avoid the issue. We
unfortunately never got any affirmative info from ATI/AMD about this.

One thing is sure, that we nuked the performance "optimization" that
way completely and I'm pretty sure that the result is worse than
before some HW folks came up with those.

Just for paranoia reasons I added a check whether the read back
compare register value was the same as the value we wrote right
before. That paranoia check triggered a couple of years after it was
added on an Intel ICH9 chipset. Venki added a workaround (commit
8da854c) which was reading the compare register twice when the first
check failed. We considered this to be a penalty in general and
restricted the readback (thus the wasted CPU cycles) to the known to
be affected ATI chipsets.

This turned out to be a utterly wrong decision. 2.6.35 testers
experienced massive problems and finally one of them bisected it down
to commit 30a564be which spured some further investigation.

Finally we got confirmation that the write to the compare register can
be delayed by up to two HPET clock cycles which explains the problems
nicely. All we can do about this is to go back to Venki's initial
workaround in a slightly modified version.

Just for the record I need to say, that all of this could have been
avoided if hardware designers and of course the HPET committee would
have thought about the consequences for a split second. It's out of my
comprehension why designing a working timer is so hard. There are two
ways to achieve it:

 1) Use a counter wrap around aware compare_reg <= counter_reg
    implementation instead of the easy compare_reg == counter_reg

    Downsides:

	- It needs more silicon.

	- It needs a readout of the counter to apply a relative
	  timeout. This is necessary as the counter does not run in
	  any useful (and adjustable) frequency and there is no
	  guarantee that the counter which is used for timer events is
	  the same which is used for reading the actual time (and
	  therefor for calculating the delta)

    Upsides:

	- None

  2) Use a simple down counter for relative timer events

    Downsides:

	- Absolute timeouts are not possible, which is not a problem
	  at all in the context of an OS and the expected
	  max. latencies/jitter (also see Downsides of #1)

   Upsides:

	- It needs less or equal silicon.

	- It works ALWAYS

	- It is way faster than a compare register based solution (One
	  write versus one write plus at least one and up to four
	  reads)

I would not be so grumpy about all of this, if I would not have been
ignored for many years when pointing out these flaws to various
hardware folks. I really hate timers (at least those which seem to be
designed by janitors).

Though finally we got a reasonable explanation plus a solution and I
want to thank all the folks involved in chasing it down and providing
valuable input to this.

Bisected-by: Nix <nix@esperi.org.uk>
Reported-by: Artur Skawina <art.08.09@gmail.com>
Reported-by: Damien Wyart <damien.wyart@free.fr>
Reported-by: John Drescher <drescherjm@gmail.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: stable@kernel.org
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-09-15 00:55:13 +02:00
Mark Brown
23a07eb0e8 ARM: S3C64XX: Fix dev-spi build
The irqs.h usage here got missed in the Samsung platform reorganisation.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Jassi Brar <jassi.brar@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:59:51 +09:00
Kukjin Kim
cbd2780fce ARM: SAMSUNG: Fix on s5p_gpio_[get,set]_drvstr
This patch fixes bug on gpio drive strength helper function.

The offset should be like follwoing.
-       off = chip->chip.base - pin;
+       off = pin - chip->chip.base;

In the s5p_gpio_get_drvstr(),
the second line is unnecessary, because overwrite drvstr.
        drvstr = __raw_readl(reg);
-       drvstr = 0xffff & (0x3 << shift);

And need 2bit masking before return the drvstr value.
        drvstr = drvstr >> shift;
+       drvstr &= 0x3;

In the s5p_gpio_set_drvstr(), need relevant bit clear.
        tmp = __raw_readl(reg);
+       tmp &= ~(0x3 << shift);
        tmp |= drvstr << shift;

Reported-by: Jaecheol Lee <jc.lee@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:59:31 +09:00
Kukjin Kim
0770e5280e ARM: SAMSUNG: Fix on drive strength value
This patch fixes on defined drive strength value for GPIO.
According to data sheet, if we want drive strength 1x, the value
should be 00(b), if 2x should be 10(b), if 3x should be 01(b),
and if 4x should be 11(b). Also fixes comment(from S5C to S5P).

Reported-by: Janghyuck Kim <janghyuck.kim@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:59:23 +09:00
Marek Szyprowski
da01c2f733 ARM: S5PV210: Add FIMC clocks
These clocks enables FIMC driver to operate on machines, which
bootloader power gated FIMC devices to save power on boot.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
[kgene.kim@samsung.com: minor title fix]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:59:16 +09:00
Kyungmin Park
a203a13a88 ARM: S5PV210: Reduce the iodesc length of systimer
It's enough to use 4KiB.

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:58:35 +09:00
MyungJoo Ham
f1c894de47 ARM: S5PV210: Update I2C-1 Clock Register Property.
CLK_GATE_IP3[8] is RESERVED. The port "I2C_HDMI_DDC" of CLK_GATE_IP3[10] is
used as another I2C port. Therefore, defined the unused I2C-1 as another I2C
there was left undefined but used.

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:58:21 +09:00
Sylwester Nawrocki
80e2f36aab ARM: S5P: Decrease IO Registers memory region size on FIMC
IO registers region size of all FIMC versions is less than 1kB so there
is no need to reserve 1M.

Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
[kgene.kim@samsung.com: minor title fix]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:57:55 +09:00
Marek Szyprowski
0fe7f88504 ARM: S5P: Fix DMA coherent mask for FIMC
FIMC driver uses DMA_coherent allocator, which requires proper dma mask
to be set.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
[kgene.kim@samsung.com: minor title fix]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
2010-09-14 17:57:39 +09:00
basile@opensource.dyc.edu
08c2b394b9 x86, build: Disable -fPIE when compiling with CONFIG_CC_STACKPROTECTOR=y
The arch/x86/Makefile uses scripts/gcc-x86_$(BITS)-has-stack-protector.sh
to check if cc1 supports -fstack-protector.  When -fPIE is passed to cc1,
these scripts fail causing stack protection to be disabled even when it
is available.

This fix is similar to commit c47efe5548

Reported-by: Kai Dietrich <mail@cleeus.de>
Signed-off-by: Magnus Granberg <zorry@gentoo.org>
LKML-Reference: <20100913101319.748A1148E216@opensource.dyc.edu>
Signed-off-by: Anthony G. Basile <basile@opensource.dyc.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-13 15:53:16 -07:00
Tetsuo Handa
2fd818642a x86, cpufeature: Suppress compiler warning with gcc 3.x
Gcc 3.x generates a warning

  arch/x86/include/asm/cpufeature.h: In function `__static_cpu_has':
  arch/x86/include/asm/cpufeature.h:326: warning: asm operand 1 probably doesn't match constraints

on each file.
But static_cpu_has() for gcc 3.x does not need __static_cpu_has().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
LKML-Reference: <201008300127.o7U1RC6Z044051@www262.sakura.ne.jp>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2010-09-13 14:48:41 -07:00
Linus Torvalds
1d91686a47 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k,m68knommu: Wire up fanotify_init, fanotify_mark, and prlimit64
2010-09-13 12:51:22 -07:00
Linus Torvalds
ab22c17cd2 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] fix siglock

Quoth Tony:

 "I committed the fix for this last week prior to your -rc4 announcement
  reminding us to give proper "Reported-by:" credit.  This one should have
  had:

  Reported-by: Tony Ernst <tee@sgi.com>

  and also

  Much-useful-investigation-and-tracing-by: Hedi Berriche <hedi@sgi.com>
  Much-useful-investigation-and-tracing-by: Petr Tesarik <ptesarik@novell.com>"
2010-09-13 12:49:55 -07:00
Masami Hiramatsu
05ed160e89 kprobes: Fix Kconfig dependency
Fix Kconfig dependency among Kprobes, optprobe and kallsyms.

Kprobes uses kallsyms_lookup for finding target function and
checking instruction boundary, thus CONFIG_KPROBES should select
CONFIG_KALLSYMS.

Optprobe is an optional feature which is supported on x86 arch,
and it also uses kallsyms_lookup for checking instructions in
the target function. Since KALLSYMS_ALL just adds symbols of
kernel variables, it doesn't need to select KALLSYMS_ALL.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>,
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Felipe Contreras <felipe.contreras@gmail.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: akpm <akpm@linux-foundation.org>
LKML-Reference: <20100913102541.20260.85700.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-13 20:41:31 +02:00
Geert Uytterhoeven
edc805b7c5 m68k,m68knommu: Wire up fanotify_init, fanotify_mark, and prlimit64
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
2010-09-13 20:28:45 +02:00
David S. Miller
b343ae51c1 sparc64: Support RAW perf events.
Encoding is "(encoding << 16) | pic_mask"

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-12 17:20:24 -07:00
Linus Torvalds
10d90f2803 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Kill all BKL usage.
2010-09-11 08:01:09 -07:00
Peter Zijlstra
5ee5e97ee9 x86, tsc: Fix a preemption leak in restore_sched_clock_state()
A real life genuine preemption leak..

Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-10 18:17:45 -07:00
Jack Steiner
36ac4b987b x86, UV: Fix initialization of max_pnode
Fix calculation of "max_pnode" for systems where the the highest
blade has neither cpus or memory. (And, yes, although rare this
does occur).

Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20100910150808.GA19802@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-10 17:15:49 +02:00
Linus Torvalds
be6200aac9 Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Perform hardware_enable in CPU_STARTING callback
  KVM: i8259: fix migration
  KVM: fix i8259 oops when no vcpus are online
  KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts
2010-09-10 08:02:45 -07:00
Linus Torvalds
6ccaa31729 Merge branch 'at91-fixes-for-linus' of git://github.com/at91linux/linux-2.6-at91
* 'at91-fixes-for-linus' of git://github.com/at91linux/linux-2.6-at91:
  AT91: at91sam9261ek: remove C99 comments but keep information
  AT91: at91sam9261ek board: remove warnings related to use of SPI or SD/MMC
  AT91: dm9000 initialization update
  AT91: SAM9G45 - add a separate clock entry for every single TC block
  AT91: clock: peripheral clocks can have other parent than mck
  AT91: change dma resource index
2010-09-10 07:24:51 -07:00
Nicolas Ferre
4deb22a600 AT91: at91sam9261ek: remove C99 comments but keep information
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2010-09-10 14:36:06 +02:00
Nicolas Ferre
64d72bbeeb AT91: at91sam9261ek board: remove warnings related to use of SPI or SD/MMC
The sd/mmc data structure is not used if SPI is selected. The configuration
of PIO on the board prevent from using both interfaces at the same time
(board dependent).
Remove the warnings at compilation time adding a preprocessor condition.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2010-09-10 12:00:56 +02:00
Nicolas Ferre
1879c45cce AT91: dm9000 initialization update
Add information in dm9000 mac/phy chip initialization:
- irq resource details
- platform data details

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2010-09-10 11:39:23 +02:00
Ira W. Snyder
94131e174f arch/powerpc/include/asm/fsldma.h needs slab.h
The slab.h header is required to use the kmalloc() family of functions.
Due to recent kernel changes, this header must be directly included by
code that calls into the memory allocator.

Without this patch, any code which includes this header fails to build.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-09 18:57:24 -07:00
Linus Torvalds
152831be91 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm: (30 commits)
  ARM: Update mach-types
  ARM: Partially revert "Auto calculate ZRELADDR and provide option for exceptions"
  ARM: Ensure PTE modifications via dma_alloc_coherent are visible
  ARM: 6359/1: ep93xx: move clock initialization earlier
  Revert "[ARM] pxa: remove now unnecessary dma_needs_bounce()"
  ARM: 6352/1: perf: fix event validation
  ARM: 6344/1: Mark CPU_32v6K as depended on CPU_V7
  ARM: 6343/1: wire up fanotify and prlimit64 syscalls on ARM
  ARM: 6330/1: perf: reword comments relating to perf_event_do_pending
  ARM: pxa168fb: fix section mismatch
  ARM: pxa: Make id const in pwm_probe()
  ARM: pxa: fix CI_HSYNC and CI_VSYNC MFP defines for pxa300
  ARM: pxa: remove __init from cpufreq_driver->init()
  ARM: imx: set cache line size to 64 bytes for i.MX5
  mx5/clock: fix clear bit fields issue in _clk_ccgr_disable function
  mxc/tzic: add base address when accessing TZIC registers
  ARM: mach-shmobile: ap4evb: fix write protect for SDHI1
  ARM: mach-shmobile: ap4evb: modify FSI2 ID
  ARM: mach-shmobile: do not enable the PLLC2 clock on init
  ARM: mach-shmobile: Clock framework comment fix
  ...
2010-09-09 18:31:34 -07:00
Tony Luck
f574c84319 [IA64] fix siglock
When ia64 converted to using ticket locks, an inline implementation
of trylock/unlock in fsys.S was missed.  This was not noticed because
in most circumstances it simply resulted in using the slow path because
the siglock was apparently not available (under old spinlock rules).

Problems occur when the ticket spinlock has value 0x0 (when first
initialised, or when it wraps around). At this point the fsys.S
code acquires the lock (changing the 0x0 to 0x1. If another process
attempts to get the lock at this point, it will change the value from
0x1 to 0x2 (using new ticket lock rules). Then the fsys.S code will
free the lock using old spinlock rules by writing 0x0 to it. From
here a variety of bad things can happen.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2010-09-09 15:16:56 -07:00
Russell King
a14d040408 ARM: Update mach-types
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-09 22:49:26 +01:00
Russell King
9e84ed63dc ARM: Partially revert "Auto calculate ZRELADDR and provide option for exceptions"
Partially revert e69edc7, which introduced automatic zreladdr
support.  The change in the way the manual definition is defined
seems to be error and conflict prone.  Go back to the original way
we were handling this for the time being, while keeping the automatic
zreladdr facility.

Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-09 22:39:41 +01:00
Russell King
de9ea203d1 Merge branch 'origin' 2010-09-09 22:38:43 +01:00
Fabian Godehardt
ab64511cbb AT91: SAM9G45 - add a separate clock entry for every single TC block
Without this patch you will not be able to register the first block
because of the second association call on at91_add_device_tc().

Signed-off-by: Fabian Godehardt <fg@emlix.com>
[nicolas.ferre@atmel.com: change tcb1_clk to fake child clock of tcb0_clk]
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2010-09-09 20:14:43 +02:00
Nicolas Ferre
5afddee415 AT91: clock: peripheral clocks can have other parent than mck
While registering clock allow to set parent clock other
than mck. It is useful for clocks than can be seen as
child clock of a peripheral.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
2010-09-09 20:13:23 +02:00
David S. Miller
b19f820039 sparc: Kill all BKL usage.
They were all bogus artifacts and completely unnecessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 20:57:59 -07:00
Linus Torvalds
1faa6ec8cc Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, mcheck: Avoid duplicate sysfs links/files for thresholding banks
  io-mapping: Fix the address space annotations
  x86: Fix the address space annotations of iomap_atomic_prot_pfn()
  x86, mm: Fix CONFIG_VMSPLIT_1G and 2G_OPT trampoline
  x86, hwmon: Fix unsafe smp_processor_id() in thermal_throttle_add_dev
2010-09-08 11:14:10 -07:00
Linus Torvalds
899edae615 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf, x86: Try to handle unknown nmis with an enabled PMU
  perf, x86: Fix handle_irq return values
  perf, x86: Fix accidentally ack'ing a second event on intel perf counter
  oprofile, x86: fix init_sysfs() function stub
  lockup_detector: Sync touch_*_watchdog back to old semantics
  tracing: Fix a race in function profile
  oprofile, x86: fix init_sysfs error handling
  perf_events: Fix time tracking for events with pid != -1 and cpu != -1
  perf: Initialize callchains roots's childen hits
  oprofile: fix crash when accessing freed task structs
2010-09-08 11:13:16 -07:00
Eric Millbrandt
fa32154e47 powerpc/5200: tighten up ac97 reset timing
Tighten up time timing around the gpio reset functionality.  Add a 200ns
delay before remuxing the pins back to ac97 to comply with the ac97 spec.

Signed-off-by: Eric Millbrandt <emillbrandt@dekaresearch.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-09-08 11:55:26 -06:00
Gleb Natapov
eebb5f31b8 KVM: i8259: fix migration
Top of kvm_kpic_state structure should have the same memory layout as
kvm_pic_state since it is copied by memcpy.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-09-08 14:50:58 -03:00
Avi Kivity
ae0635b358 KVM: fix i8259 oops when no vcpus are online
If there are no vcpus, found will be NULL.  Check before doing anything with
it.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-09-08 14:50:56 -03:00
Avi Kivity
16518d5ada KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts
operand::val and operand::orig_val are 32-bit on i386, whereas cmpxchg8b
operands are 64-bit.

Fix by adding val64 and orig_val64 union members to struct operand, and
using them where needed.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-09-08 14:50:55 -03:00
Julia Lawall
915b96191f powerpc/5200: efika.c: Add of_node_put to avoid memory leak
This function is implemented as though the function of_get_next_child does
not increment the reference count of its result, but actually it does.
Thus the patch adds of_node_put in error handling code and drops a call to
of_node_get.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E1;
position p1,p2;
@@

x@p1 = of_get_next_child(...);
... when != x = E1
of_node_get@p2(x)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@

cocci.print_main("call",p1)
cocci.print_secs("get",p2)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-09-08 11:45:24 -06:00
Russell King
2be23c475a ARM: Ensure PTE modifications via dma_alloc_coherent are visible
Dave Hylands reports:
| We've observed a problem with dma_alloc_writecombine when the system
| is under heavy load (heavy bus traffic).  We've managed to reduce the
| problem to the following snippet, which is run from a kthread in a
| continuous loop:
|
|   void *virtAddr;
|   dma_addr_t physAddr;
|   unsigned int numBytes = 256;
|
|   for (;;) {
|       virtAddr = dma_alloc_writecombine(NULL,
|             numBytes, &physAddr, GFP_KERNEL);
|       if (virtAddr == NULL) {
|          printk(KERN_ERR "Running out of memory\n");
|          break;
|       }
|
|       /* access DMA memory allocated */
|       tmp = virtAddr;
|       *tmp = 0x77;
|
|       /* free DMA memory */
|       dma_free_writecombine(NULL,
|             numBytes, virtAddr, physAddr);
|
|         ...sleep here...
|     }
|
| By itself, the code will run forever with no issues. However, as we
| increase our bus traffic (typically using DMA) then the *tmp = 0x77
| line will eventually cause a page fault. If we add a small delay (a
| few microseconds) before the *tmp = 0x77, then we don't see a page
| fault, even under heavy load.

A dsb() is required after modifying the PTE entries to ensure that they
will always be visible.  Add this dsb().

Reported-by: Dave Hylands <dhylands@gmail.com>
Tested-by: Dave Hylands <dhylands@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-08 16:27:56 +01:00
Mika Westerberg
a387f0f540 ARM: 6359/1: ep93xx: move clock initialization earlier
Commit 7cfe24947 ("ARM: AMBA: Add pclk support to AMBA bus
infrastructure") changed AMBA bus to handle the PCLK automatically.
However, in EP93xx clock initialization is arch_initcall which is done
later than AMBA device identification. This causes
amba_get_enable_pclk() to fail resulting device where UARTs are not
functional.

So change ep93xx_clock_init() to be postcore_initcall.

Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Acked-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-08 12:28:39 +01:00
Russell King
0485e18bc4 Revert "[ARM] pxa: remove now unnecessary dma_needs_bounce()"
This reverts commit 4fa5518, which causes a compilation regression for
IXP4xx platforms.

Reported-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-08 12:28:39 +01:00
Linus Torvalds
d56557af19 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: bus speed strings should be const
  PCI hotplug: Fix build with CONFIG_ACPI unset
  PCI: PCIe: Remove the port driver module exit routine
  PCI: PCIe: Move PCIe PME code to the pcie directory
  PCI: PCIe: Disable PCIe port services during port initialization
  PCI: PCIe: Ask BIOS for control of all native services at once
  ACPI/PCI: Negotiate _OSC control bits before requesting them
  ACPI/PCI: Do not preserve _OSC control bits returned by a query
  ACPI/PCI: Make acpi_pci_query_osc() return control bits
  ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
  PCI: PCIe: Introduce commad line switch for disabling port services
  PCI: PCIe AER: Introduce pci_aer_available()
  x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
  PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs
2010-09-07 16:00:17 -07:00
Linus Torvalds
98e52c373c Merge branch 'for-linus' of git://android.kernel.org/kernel/tegra
* 'for-linus' of git://android.kernel.org/kernel/tegra:
  [ARM] tegra: Add ZRELADDR default for ARCH_TEGRA
2010-09-07 14:48:44 -07:00
Linus Torvalds
add2b10f2b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: Fix printk format errors
  alpha: convert perf_event to use local_t
  Fix call to replaced SuperIO functions
  alpha: remove homegrown L1_CACHE_ALIGN macro
2010-09-07 14:38:54 -07:00
Linus Torvalds
a44a553f82 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pseries: Correct rtas_data_buf locking in dlpar code
  powerpc/85xx: Add P1021 PCI IDs and quirks
  arch/powerpc/sysdev/qe_lib/qe.c: Add of_node_put to avoid memory leak
  arch/powerpc/platforms/83xx/mpc837x_mds.c: Add missing iounmap
  fsl_rio: fix compile errors
  powerpc/85xx: Fix compile issue with p1022_ds due to lmb rename to memblock
  powerpc/85xx: Fix compilation of mpc85xx_mds.c
  powerpc: Don't use kernel stack with translation off
  powerpc/perf_event: Reduce latency of calling perf_event_do_pending
  powerpc/kexec: Adds correct calling convention for kexec purgatory
2010-09-07 14:34:37 -07:00
Greg Ungerer
e6ba59bcae m68knommu: fix missing linker segments
Recent changes to linker segments that hold per-cpu data broke linking
for m68knommu targets:

  LD      vmlinux
/usr/local/bin/m68k-uclinux-ld.real: error: no memory region specified for loadable section `.data..shared_aligned'

Add missing segments into the m68knommu linker script.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-07 13:55:30 -07:00
David Howells
1e72910e24 h8300: Fix missing consts in kernel_execve()
Fix missing consts in h8300's kernel_execve():

  arch/h8300/kernel/sys_h8300.c: In function 'kernel_execve':
  arch/h8300/kernel/sys_h8300.c:59: warning: initialization from incompatible pointer type
  arch/h8300/kernel/sys_h8300.c:60: warning: initialization from incompatible pointer type

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-07 13:52:28 -07:00
David Howells
b857189d94 h8300: Fix die()
Fix h8300's die() to take care of a number of problems:

    CC      arch/h8300/kernel/traps.o
  In file included from arch/h8300/include/asm/bitops.h:10,
                   from include/linux/bitops.h:22,
                   from include/linux/kernel.h:17,
                   from include/linux/sched.h:54,
                   from arch/h8300/kernel/traps.c:18:
  arch/h8300/include/asm/system.h:136: warning: 'struct pt_regs' declared inside parameter list
  arch/h8300/include/asm/system.h:136: warning: its scope is only this definition or declaration, which is probably not what you want
  arch/h8300/kernel/traps.c💯 error: conflicting types for 'die'
  arch/h8300/include/asm/system.h:136: error: previous declaration of 'die' was here
  make[2]: *** [arch/h8300/kernel/traps.o] Error 1

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-07 13:52:28 -07:00
David Howells
3ab61eb9fd h8300: IRQ flags should be stored in an unsigned long
Fix h8300's asm/atomic.h to store the IRQ flags in an unsigned long to deal
with warnings of the following type:

  arch/h8300/include/asm/atomic.h: In function 'atomic_add_return':
  arch/h8300/include/asm/atomic.h:22: warning: comparison of distinct pointer types lacks a cast
  arch/h8300/include/asm/atomic.h:24: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-07 13:52:28 -07:00
Nicolas Ferre
8d2602e077 AT91: change dma resource index
Reported-by: Dan Liang <dan.liang@atmel.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
2010-09-07 18:53:33 +02:00
Andreas Herrmann
1389298f7d x86, mcheck: Avoid duplicate sysfs links/files for thresholding banks
kobject_add_internal failed for threshold_bank2 with -EEXIST,
don't try to register things with the same name in the same
directory:

  Pid: 1, comm: swapper Tainted: G        W  2.6.31 #1
  Call Trace:
  [<ffffffff81161b07>] ? kobject_add_internal+0x156/0x180
  [<ffffffff81161cc0>] ? kobject_add+0x66/0x6b
  [<ffffffff81161793>] ? kobject_init+0x42/0x82
  [<ffffffff81161cf9>] ? kobject_create_and_add+0x34/0x63
  [<ffffffff81393963>] ? threshold_create_bank+0x14f/0x259
  [<ffffffff8139310a>] ? mce_create_device+0x8d/0x1b8
  [<ffffffff81646497>] ? threshold_init_device+0x3f/0x80
  [<ffffffff81646458>] ? threshold_init_device+0x0/0x80
  [<ffffffff81009050>] ? do_one_initcall+0x4f/0x143
  [<ffffffff816413a0>] ? kernel_init+0x14c/0x1a2
  [<ffffffff8100c8da>] ? child_rip+0xa/0x20
  [<ffffffff81641254>] ? kernel_init+0x0/0x1a2
  [<ffffffff8100c8d0>] ? child_rip+0x0/0x20
  kobject_create_and_add: kobject_add error: -17

(Probably the for_each_cpu loop should be entirely removed.)

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20100827092006.GB5348@loge.amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-05 14:35:49 +02:00
Francisco Jerez
cc1a8e5233 x86: Fix the address space annotations of iomap_atomic_prot_pfn()
This patch fixes the sparse warnings when the return pointer of
iomap_atomic_prot_pfn() is used as an argument of iowrite32()
and friends.

Signed-off-by: Francisco Jerez <currojerez@riseup.net>
LKML-Reference: <1283633804-11749-1-git-send-email-currojerez@riseup.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-05 14:26:14 +02:00
Robert Richter
4177c42a63 perf, x86: Try to handle unknown nmis with an enabled PMU
When the PMU is enabled it is valid to have unhandled nmis, two
events could trigger 'simultaneously' raising two back-to-back
NMIs. If the first NMI handles both, the latter will be empty
and daze the CPU.

The solution to avoid an 'unknown nmi' massage in this case was
simply to stop the nmi handler chain when the PMU is enabled by
stating the nmi was handled. This has the drawback that a) we
can not detect unknown nmis anymore, and b) subsequent nmi
handlers are not called.

This patch addresses this. Now, we check this unknown NMI if it
could be a PMU back-to-back NMI. Otherwise we pass it and let
the kernel handle the unknown nmi.

This is a debug log:

 cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
 cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
 cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
 cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
 cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
 cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
 cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
 cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
 cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
 cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
 cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
 cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
 cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
 cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
 cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
 cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
 Uhhuh. NMI received for unknown reason 00 on CPU 6.
 Do you have a strange power saving mode enabled?
 Dazed and confused, but trying to continue

Deltas:

 nmi #32334 340186
 nmi #32335 1327704
 nmi #32336 1819      <<<< back-to-back nmi [1]
 nmi #32337 85961
 nmi #32338 284507
 nmi #32339 1578809
 nmi #32340 217616
 nmi #32341 1798      <<<< back-to-back nmi [2]
 nmi #32342 240913
 nmi #32343 1512809
 nmi #32344 116672
 nmi #32345 412453
 nmi #32346 1462095   <<<< 1st nmi (standard) handling 2 counters
 nmi #32347 2046      <<<< 2nd nmi (back-to-back) handling one
 counter nmi #32348 1773      <<<< 3rd nmi (back-to-back)
 handling no counter! [3]

For  back-to-back nmi detection there are the following rules:

The PMU nmi handler was handling more than one counter and no
counter was handled in the subsequent nmi (see [1] and [2]
above).

There is another case if there are two subsequent back-to-back
nmis [3]. The 2nd is detected as back-to-back because the first
handled more than one counter. If the second handles one counter
and the 3rd handles nothing, we drop the 3rd nmi because it
could be a back-to-back nmi.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[ renamed nmi variable to pmu_nmi to avoid clash with .nmi in entry.S ]
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference: <1283454469-1909-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-03 08:05:18 +02:00
Peter Zijlstra
de725dec9d perf, x86: Fix handle_irq return values
Now that we rely on the number of handled overflows, ensure all
handle_irq implementations actually return the right number.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference: <1283454469-1909-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-03 08:05:18 +02:00
Don Zickus
2e556b5b32 perf, x86: Fix accidentally ack'ing a second event on intel perf counter
During testing of a patch to stop having the perf subsytem
swallow nmis, it was uncovered that Nehalem boxes were randomly
getting unknown nmis when using the perf tool.

Moving the ack'ing of the PMI closer to when we get the status
allows the hardware to properly re-set the PMU bit signaling
another PMI was triggered during the processing of the first
PMI.  This allows the new logic for dealing with the
shortcomings of multiple PMIs to handle the extra NMI by
'eat'ing it later.

Now one can wonder why are we getting a second PMI when we
disable all the PMUs in the begining of the NMI handler to
prevent such a case, for that I do not know.  But I know the fix
below helps deal with this quirk.

Tested on multiple Nehalems where the problem was occuring.
With the patch, the code now loops a second time to handle the
second PMI (whereas before it was not).

Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference: <1283454469-1909-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-03 08:05:17 +02:00
Will Deacon
65b4711ff5 ARM: 6352/1: perf: fix event validation
The validate_event function in the ARM perf events backend has the
following problems:

1.) Events that are disabled count towards the cost.
2.) Events associated with other PMUs [for example, software events or
    breakpoints] do not count towards the cost, but do fail validation,
    causing the group to fail.

This patch changes validate_event so that it ignores events in the
PERF_EVENT_STATE_OFF state or that are scheduled for other PMUs.

Reported-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Jamie Iles <jamie.iles@picochip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-02 16:36:42 +01:00
Catalin Marinas
026b5ca3b6 ARM: 6344/1: Mark CPU_32v6K as depended on CPU_V7
CPU_32v6K is selected by CPU_V7 but it only depends on CPU_V6.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-02 15:32:13 +01:00
Sean MacLennan
cd64d1697c powerpc: mtmsrd not defined
Replace the BOOK3S_64 specific mtmsrd with the generic MTMSRD macro.
Only enable ldstfp when CONFIG_PPC_FPU is set.

Signed-off-by: Sean MacLennan <smaclennan@pikatech.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:34 +10:00
Sean MacLennan
025c0186a0 powerpc: Fix incorrect .stabs entry for copy_32.S
Signed-off-by: Sean MacLennan <smaclennan@pikatech.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:34 +10:00
Matthew McClintock
0d35e1620d powerpc/mm: Assume first cpu is boot_cpuid not 0
arch/powerpc/mm/mmu_context_nohash.c assumes the boot cpu
will always have smp_processor_id() == 0. This patch fixes
that assumption

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:34 +10:00
Ian Munsie
86250b9d12 powerpc: Wire up direct socket system calls
This patch wires up the various socket system calls on PowerPC so that
userspace can call them directly, rather than by going through the
multiplexed socketcall system call.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:34 +10:00
Julia Lawall
7cf9bac559 powerpc/chrp/nvram.c: Add of_node_put to avoid memory leak
Add a call to of_node_put in the error handling code following a call to
of_find_node_by_type.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E,E1,E2;
statement S;
@@

*x =
(of_find_node_by_path
|of_find_node_by_name
|of_find_node_by_phandle
|of_get_parent
|of_get_next_parent
|of_get_next_child
|of_find_compatible_node
|of_match_node
|of_find_node_by_type
|of_find_node_with_property
|of_find_matching_node
|of_parse_phandle
)(...);
...
if (x == NULL) S
<... when != x = E
*if (...) {
  ... when != of_node_put(x)
      when != if (...) { ... of_node_put(x); ... }
(
  return <+...x...+>;
|
*  return ...;
)
}
...>
(
E2 = x;
|
of_node_put(x);
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:34 +10:00
Julia Lawall
182f30e4b9 powerpc/cell: Add of_node_put to avoid memory leak
Add calls to of_node_put in the error handling code following calls to
of_find_node_by_path and of_find_node_by_phandle.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E,E1;
statement S;
@@

*x =
(of_find_node_by_path
|of_find_node_by_name
|of_find_node_by_phandle
|of_get_parent
|of_get_next_parent
|of_get_next_child
|of_find_compatible_node
|of_match_node
)(...);
...
if (x == NULL) S
<... when != x = E
*if (...) {
  ... when != of_node_put(x)
      when != if (...) { ... of_node_put(x); ... }
(
  return <+...x...+>;
|
*  return ...;
)
}
...>
of_node_put(x);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:33 +10:00
Julia Lawall
0373721b19 powerpc/powermac/pfunc_core.c: Add of_node_put to avoid memory leak
Add a call to of_node_put in the error handling code following a call to
of_find_node_by_phandle.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E,E1;
statement S;
@@

*x =
(of_find_node_by_path
|of_find_node_by_name
|of_find_node_by_phandle
|of_get_parent
|of_get_next_parent
|of_get_next_child
|of_find_compatible_node
|of_match_node
)(...);
...
if (x == NULL) S
<... when != x = E
*if (...) {
  ... when != of_node_put(x)
      when != if (...) { ... of_node_put(x); ... }
(
  return <+...x...+>;
|
*  return ...;
)
}
...>
of_node_put(x);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:33 +10:00
Julia Lawall
a8e25c6154 powerpc/maple: Add of_node_put to avoid memory leak
Add a call to of_node_put in the error handling code following a call to
of_find_node_by_path.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
local idexpression x;
expression E,E1;
statement S;
@@

*x =
(of_find_node_by_path
|of_find_node_by_name
|of_find_node_by_phandle
|of_get_parent
|of_get_next_parent
|of_get_next_child
|of_find_compatible_node
|of_match_node
)(...);
...
if (x == NULL) S
<... when != x = E
*if (...) {
  ... when != of_node_put(x)
      when != if (...) { ... of_node_put(x); ... }
(
  return <+...x...+>;
|
*  return ...;
)
}
...>
of_node_put(x);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:33 +10:00
Benjamin Herrenschmidt
8fb07c0444 powerpc/dart_iommu: Support for 64-bit iommu bypass window on PCIe
The PCI-Express bus off the U4/CPC945 bridge supports direct DMA to
all of memory, bypassing the DART iommu, for 64-bit capable devices.

This adds support for it on Bimini and Apple Quad G5's in order to
improve DMA performances of cards using that slot (the x16 graphics
slot). Tested with an Intel ixgbe 10GE card.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Benjamin Herrenschmidt
5b6e9ff6de powerpc/dma: Add optional platform override of dma_set_mask()
Some platforms may want to override dma_set_mask() to take into
account some specific "features" such as the availability of
a direct-map window in addition to an iommu.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Denis Kirjanov
cab175f9fa powerpc: Use is_32bit_task() helper to test 32-bit binary
This patch removes all explicit tests for the TIF_32BIT flag

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Andreas Schwab
05d77ac90c powerpc: Remove fpscr use from [kvm_]cvt_{fd,df}
Neither lfs nor stfs touch the fpscr, so remove the restore/save of it
around them.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Paul Mackerras
872e439a45 powerpc/pseries: Re-enable dispatch trace log userspace interface
Since the cpu accounting code uses the hypervisor dispatch trace log
now when CONFIG_VIRT_CPU_ACCOUNTING = y, the previous commit disabled
access to it via files in the /sys/kernel/debug/powerpc/dtl/ directory
in that case.  This restores those files.

To do this, we now have a hook that the cpu accounting code will call
as it processes each entry from the hypervisor dispatch trace log.
The code in dtl.c now uses that to fill up its ring buffer, rather
than having the hypervisor fill the ring buffer directly.

This also fixes dtl_file_read() to handle overflow conditions a bit
better and adds a spinlock to ensure that race conditions (multiple
processes opening or reading the file concurrently) are handled
correctly.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Paul Mackerras
cf9efce0ce powerpc: Account time using timebase rather than PURR
Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the
PURR register for measuring the user and system time used by
processes, as well as other related times such as hardirq and
softirq times.  This turns out to be quite confusing for users
because it means that a program will often be measured as taking
less time when run on a multi-threaded processor (SMT2 or SMT4 mode)
than it does when run on a single-threaded processor (ST mode), even
though the program takes longer to finish.  The discrepancy is
accounted for as stolen time, which is also confusing, particularly
when there are no other partitions running.

This changes the accounting to use the timebase instead, meaning that
the reported user and system times are the actual number of real-time
seconds that the program was executing on the processor thread,
regardless of which SMT mode the processor is in.  Thus a program will
generally show greater user and system times when run on a
multi-threaded processor than on a single-threaded processor.

On pSeries systems on POWER5 or later processors, we measure the
stolen time (time when this partition wasn't running) using the
hypervisor dispatch trace log.  We check for new entries in the
log on every entry from user mode and on every transition from
kernel process context to soft or hard IRQ context (i.e. when
account_system_vtime() gets called).  So that we can correctly
distinguish time stolen from user time and time stolen from system
time, without having to check the log on every exit to user mode,
we store separate timestamps for exit to user mode and entry from
user mode.

On systems that have a SPURR (POWER6 and POWER7), we read the SPURR
in account_system_vtime() (as before), and then apportion the SPURR
ticks since the last time we read it between scaled user time and
scaled system time according to the relative proportions of user
time and system time over the same interval.  This avoids having to
read the SPURR on every kernel entry and exit.  On systems that have
PURR but not SPURR (i.e., POWER5), we do the same using the PURR
rather than the SPURR.

This disables the DTL user interface in /sys/debug/kernel/powerpc/dtl
for now since it conflicts with the use of the dispatch trace log
by the time accounting code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Paul Mackerras
93c22703ef powerpc: Dynamically allocate most lppaca structs
This arranges for the lppaca structs for most cpus to be dynamically
allocated in the same manner as the paca structs.  If we don't include
support for legacy iSeries, only the first lppaca is statically
allocated; the rest are dynamically allocated.  If we include legacy
iSeries support, then we statically allocate the first 64 lppaca
structs, since the iSeries hypervisor requires that the lppaca
structs be present in the data section of the kernel image, but
legacy iSeries supports at most 64 cpus.

With CONFIG_NR_CPUS, the kernel image size for a typical pSeries config
went from:

   text    data     bss     dec     hex filename
9524478 4734564 8469944 22728986        15ad11a ../test-1024/vmlinux

to:

   text    data     bss     dec     hex filename
9524482 3751508 8469944 21745934        14bd10e ../test-1024/vmlinux

a reduction of 983052 bytes overall.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Paul Mackerras
8154c5d22d powerpc: Abstract indexing of lppaca structs
Currently we have the lppaca structs as a simple array of NR_CPUS
entries, taking up space in the data section of the kernel image.
In future we would like to allocate them dynamically, so this
abstracts out the accesses to the array, making it easier to
change how we locate the lppaca for a given cpu in future.
Specifically, lppaca[cpu] changes to lppaca_of(cpu).

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Michael Neuling
e1f0ece113 powerpc: Move arch_sd_sibling_asym_packing() to smp.c
Simple cleanup by moving arch_sd_sibling_asym_packing from process.c to
smp.c to save an #ifdef CONFIG_SMP

No functionality change.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Anton Blanchard
28b549905b powerpc: Check end of stack canary at oops time
Add a check for the stack canary when we oops, similar to x86. This should make
it clear that we overran our stack:

Unable to handle kernel paging request for data at address 0x24652f63700ac689
Faulting instruction address: 0xc000000000063d24
Thread overran stack, or stack corrupted

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:30 +10:00
Anton Blanchard
f89451fbd2 powerpc: Feature nop out reservation clear when stcx checks address
The POWER architecture does not require stcx to check that it is operating
on the same address as the larx. This means it is possible for an
an exception handler to execute a larx, get a reservation, decide
not to do the stcx and then return back with an active reservation. If the
interrupted code was in the middle of a larx/stcx sequence the stcx could
incorrectly succeed.

All recent POWER CPUs check the address before letting the stcx succeed
so we can create a CPU feature and nop it out. As Ben suggested, we can
only do this in our syscall path because there is a remote possibility
some kernel code gets interrupted by an exception that ends up operating
on the same cacheline.

Thanks to Paul Mackerras and Derek Williams for the idea.

To test this I used a very simple null syscall (actually getppid) testcase
at http://ozlabs.org/~anton/junkcode/null_syscall.c

I tested against 2.6.35-git10 with the following changes against the
pseries_defconfig:

CONFIG_VIRT_CPU_ACCOUNTING=n
CONFIG_AUDIT=n
CONFIG_PPC_4K_PAGES=n
CONFIG_PPC_64K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=9
CONFIG_PPC_SUBPAGE_PROT=n
CONFIG_FUNCTION_TRACER=n
CONFIG_FUNCTION_GRAPH_TRACER=n
CONFIG_IRQSOFF_TRACER=n
CONFIG_STACK_TRACER=n

to remove the overhead of virtual CPU accounting, syscall auditing and
the ftrace mcount tracers. 64kB pages were enabled to minimise TLB misses.

POWER6: +8.2%
POWER7: +7.0%

Another suggestion was to use a larx to something in the L1 instead of a stcx.
This was almost as fast as removing the larx on POWER6, but only 3.5% faster
on POWER7. We can use this to speed up the reservation clear in our
exception exit code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:30 +10:00
Anton Blanchard
8c77391475 powerpc: Add 64bit csum_and_copy_to_user
This adds the equivalent of csum_and_copy_from_user for the receive side so we
can copy and checksum in one pass. It is modelled on the generic checksum
routine.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:30 +10:00
Anton Blanchard
fdd374b62c powerpc: Optimise 64bit csum_partial_copy_generic and add csum_and_copy_from_user
We use the same core loop as the new csum_partial, adding in the
stores and exception handling code. To keep things simple we do all the
exception fixup in csum_and_copy_from_user. This wrapper function is
modelled on the generic checksum code and is careful to always calculate
a complete checksum even if we only copied part of the data to userspace.

To test this I forced checksumming on over loopback and ran socklib (a
simple TCP benchmark). On a POWER6 575 throughput improved by 19% with
this patch. If I forced both the sender and receiver onto the same cpu
(with the hope of shifting the benchmark from being cache bandwidth limited
to cpu limited), adding this patch improved performance by 55%

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:30 +10:00
Anton Blanchard
9b83ecb0a3 powerpc: Optimise 64bit csum_partial
The main loop of csum_partial runs very slowly on recent POWER CPUs. After some
analysis on both POWER6 and POWER7 I came up with routine below. First we get
the source aligned to a double word, ignoring any odd alignment to keep things
simple. Then we do 64 bytes at a time, with an entry and exit limb of a further
64 bytes. On both POWER6 and POWER7 this should be as fast as we can go since
we are limited by the latency of the adde instructions.

To test this I forced checksumming on over loopback and ran socklib (a
simple TCP benchmark). On a POWER6 575 throughput improved by 11% with
this patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:29 +10:00