Commit Graph

3550 Commits

Author SHA1 Message Date
Michael Ellerman
698193d85a powerpc: Consolidate obj-y assignments
No need to have three of them.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:22 +11:00
Nishanth Aravamudan
6d283d782f powerpc/vio: Use dma ops helpers
Use the set_dma_ops helper. Instead of modifying vio_dma_mapping_ops,
just create a trivial wrapper for dma_supported.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:20 +11:00
Vaidyanathan Srinivasan
99d8670525 powerpc: Cleanup APIs for cpu/thread/core mappings
These APIs take logical cpu number as input
Change cpu_first_thread_in_core() to cpu_first_thread_sibling()
Change cpu_last_thread_in_core() to cpu_last_thread_sibling()

These APIs convert core number (index) to logical cpu/thread numbers
Add cpu_first_thread_of_core(int core)
Changed cpu_thread_to_core() to cpu_core_index_of_thread(int cpu)

The goal is to make 'threads_per_core' accessible to the
pseries_energy module.  Instead of making an API to read
threads_per_core, this is a higher level wrapper function to
convert from logical cpu number to core number.

The current APIs cpu_first_thread_in_core() and
cpu_last_thread_in_core() returns logical CPU number while
cpu_thread_to_core() returns core number or index which is
not a logical CPU number.  The new APIs are now clearly named to
distinguish 'core number' versus first and last 'logical cpu
number' in that core.

The new APIs cpu_{first,last}_thread_sibling() work on
logical cpu numbers.  While cpu_first_thread_of_core() and
cpu_core_index_of_thread() work on core index.

Example usage:  (4 threads per core system)

cpu_first_thread_sibling(5) = 4
cpu_last_thread_sibling(5) = 7
cpu_core_index_of_thread(5) = 1
cpu_first_thread_of_core(1) = 4

cpu_core_index_of_thread() is used in cpu_to_drc_index() in the
module and cpu_first_thread_of_core() is used in
drc_index_to_cpu() in the module.

Make API changes to few callers.  Export symbols for use in modules.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:19 +11:00
Anton Blanchard
d72e063bb3 powerpc/kdump: Override crash_free_reserved_phys_range to avoid freeing RTAS
The crashkernel region will almost always overlap RTAS. If we free the
crashkernel region via "echo 0 > /sys/kernel/kexec_crash_size" then we will
free RTAS and the machine will crash in confusing and exciting ways.

Override crash_free_reserved_phys_range and check for overlap with RTAS.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:17 +11:00
Anton Blanchard
64ff312876 powerpc: Add support for popcnt instructions
POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:17 +11:00
Peter Zijlstra
004417a6d4 perf, arch: Cleanup perf-pmu init vs lockup-detector
The perf hardware pmu got initialized at various points in the boot,
some before early_initcall() some after (notably arch_initcall).

The problem is that the NMI lockup detector is ran from early_initcall()
and expects the hardware pmu to be present.

Sanitize this by moving all architecture hardware pmu implementations to
initialize at early_initcall() and move the lockup detector to an explicit
initcall right after that.

Cc: paulus <paulus@samba.org>
Cc: davem <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290707759.2145.119.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-26 15:14:56 +01:00
Linus Torvalds
2d42dc3feb Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb,ppc: Fix regression in evr register handling
  kgdb,x86: fix regression in detach handling
  kdb: fix crash when KDB_BASE_CMD_MAX is exceeded
  kdb: fix memory leak in kdb_main.c
2010-11-18 08:24:58 -08:00
Alessio Igor Bogani
0f6b77ca12 powerpc: Update a BKL related comment
The commit 5e3d20a remove bkl from startup code so setup_arch() it isn't called
with bkl held anymore. Update the comment on top of that function.
Fix also a typo.

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-18 14:54:24 +11:00
Dongdong Deng
e3839ed8e8 kgdb,ppc: Fix regression in evr register handling
Commit ff10b88b5a (kgdb,ppc: Individual
register get/set for ppc) introduced a problem where memcpy was used
incorrectly to read and write the evr registers with a kernel that
has:

CONFIG_FSL_BOOKE=y
CONFIG_SPE=y
CONFIG_KGDB=y

This patch also fixes the following compilation problems:

arch/powerpc/kernel/kgdb.c: In function 'dbg_get_reg':
arch/powerpc/kernel/kgdb.c:341: error: passing argument 2 of 'memcpy' makes pointer from integer without a cast
arch/powerpc/kernel/kgdb.c: In function 'dbg_set_reg':
arch/powerpc/kernel/kgdb.c:366: error: passing argument 1 of 'memcpy' makes pointer from integer without a cast

[jason.wessel@windriver.com: Remove void * casts and fix patch header]
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
CC: linuxppc-dev@lists.ozlabs.org
2010-11-17 13:54:58 -06:00
Arnd Bergmann
451a3c24b0 BKL: remove extraneous #include <smp_lock.h>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-17 08:59:32 -08:00
Scott Wood
a36be1003a PPC: KVM: Book E doesn't have __end_interrupts.
Fix an unresolved symbol with CONFIG_KVM_GUEST plus CONFIG_RELOCATABLE on
Book E.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2010-11-05 14:42:27 -02:00
David Daney
4b6ba8aacb of/net: Move of_get_mac_address() to a common source file.
There are two identical implementations of of_get_mac_address(), one
each in arch/powerpc/kernel/prom_parse.c and
arch/microblaze/kernel/prom_parse.c.  Move this function to a new
common file of_net.{c,h} and adjust all the callers to include the new
header.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
[grant.likely@secretlab.ca: protect header with #ifdef]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-11-01 01:08:14 -04:00
Linus Torvalds
1e431a9d64 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb,ppc: Individual register get/set for ppc
  kgdbts: prevent re-entry to kgdbts before it unregisters
  debug_core,x86,blackfin: Clean up hw debug disable API
  kdb: Fix early debugging crash regression
  kgdb,arm: fix register dump
  kdb: fix per_cpu command to remove supress mask
  kdb: Add kdb kernel module sample
2010-10-29 11:49:38 -07:00
Dongdong Deng
ff10b88b5a kgdb,ppc: Individual register get/set for ppc
commit 534af1082329392bc29f6badf815e69ae2ae0f4c(kgdb,kdb: individual
register set and and get API) introduce dbg_get_reg/dbg_set_reg API
for individual register get and set.

This patch implement those APIs for ppc.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-29 13:14:42 -05:00
Namhyung Kim
f68d204820 ptrace: cleanup arch_ptrace() on powerpc
Use new 'datavp' and 'datalp' variables in order to remove unnecessary
castings.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:11 -07:00
Namhyung Kim
9b05a69e05 ptrace: change signature of arch_ptrace()
Fix up the arguments to arch_ptrace() to take account of the fact that
@addr and @data are now unsigned long rather than long as of a preceding
patch in this series.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:10 -07:00
Hagen Paul Pfeifer
732eacc054 replace nested max/min macros with {max,min}3 macro
Use the new {max,min}3 macros to save some cycles and bytes on the stack.
This patch substitutes trivial nested macros with their counterpart.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:12 -07:00
Linus Torvalds
51f00a471c Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
  mtd/m25p80: add support to parse the partitions by OF node
  of/irq: of_irq.c needs to include linux/irq.h
  of/mips: Cleanup some include directives/files.
  of/mips: Add device tree support to MIPS
  of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch
  of/device: Rework to use common platform_device_alloc() for allocating devices
  of/xsysace: Fix OF probing on little-endian systems
  of: use __be32 types for big-endian device tree data
  of/irq: remove references to NO_IRQ in drivers/of/platform.c
  of/promtree: add package-to-path support to pdt
  of/promtree: add of_pdt namespace to pdt code
  of/promtree: no longer call prom_ functions directly; use an ops structure
  of/promtree: make drivers/of/pdt.c no longer sparc-only
  sparc: break out some PROM device-tree building code out into drivers/of
  of/sparc: convert various prom_* functions to use phandle
  sparc: stop exporting openprom.h header
  powerpc, of_serial: Endianness issues setting up the serial ports
  of: MTD: Fix OF probing on little-endian systems
  of: GPIO: Fix OF probing on little-endian systems
2010-10-25 08:19:14 -07:00
Linus Torvalds
1765a1fe5d Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
  KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
  KVM: Fix signature of kvm_iommu_map_pages stub
  KVM: MCE: Send SRAR SIGBUS directly
  KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
  KVM: fix typo in copyright notice
  KVM: Disable interrupts around get_kernel_ns()
  KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
  KVM: MMU: move access code parsing to FNAME(walk_addr) function
  KVM: MMU: audit: check whether have unsync sps after root sync
  KVM: MMU: audit: introduce audit_printk to cleanup audit code
  KVM: MMU: audit: unregister audit tracepoints before module unloaded
  KVM: MMU: audit: fix vcpu's spte walking
  KVM: MMU: set access bit for direct mapping
  KVM: MMU: cleanup for error mask set while walk guest page table
  KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
  KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
  KVM: x86: Fix constant type in kvm_get_time_scale
  KVM: VMX: Add AX to list of registers clobbered by guest switch
  KVM guest: Move a printk that's using the clock before it's ready
  KVM: x86: TSC catchup mode
  ...
2010-10-24 12:47:25 -07:00
Alexander Graf
591bd8e7b4 KVM: PPC: Enable napping only for Book3s_64
Before I incorrectly enabled napping also for BookE, which would result in
needless dcache flushes. Since we only need to force enable napping on
Book3s_64 because it doesn't go into MSR_POW otherwise, we can just #ifdef
that code to this particular platform.

Reported-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:19 +02:00
Alexander Graf
ad0873763a KVM: PPC: Force enable nap on KVM
There are some heuristics in the PPC power management code that try to find
out if the particular hardware we're running on supports proper power management
or just hangs the machine when going into nap mode.

Since we know that KVM is safe with nap, let's force enable it in the PV code
once we're certain that we are on a KVM VM.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:15 +02:00
Alexander Graf
df08bd1026 KVM: PPC: Make PV mtmsrd L=1 work with r30 and r31
We had an arbitrary limitation in mtmsrd L=1 that kept us from using r30 and
r31 as input registers. Let's get rid of that and get more potential speedups!

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:14 +02:00
Alexander Graf
512ba59ed9 KVM: PPC: Make PV mtmsr work with r30 and r31
So far we've been restricting ourselves to r0-r29 as registers an mtmsr
instruction could use. This was bad, as there are some code paths in
Linux actually using r30.

So let's instead handle all registers gracefully and get rid of that
stupid limitation

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:13 +02:00
Alexander Graf
cbe487fac7 KVM: PPC: Add mtsrin PV code
This is the guest side of the mtsr acceleration. Using this a guest can now
call mtsrin with almost no overhead as long as it ensures that it only uses
it with (MSR_IR|MSR_DR) == 0. Linux does that, so we're good.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:12 +02:00
Alexander Graf
7508e16c9f KVM: PPC: Add feature bitmap for magic page
We will soon add SR PV support to the shared page, so we need some
infrastructure that allows the guest to query for features KVM exports.

This patch adds a second return value to the magic mapping that
indicated to the guest which features are available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:09 +02:00
Alexander Graf
989044ee0f KVM: PPC: Fix CONFIG_KVM_GUEST && !CONFIG_KVM case
When CONFIG_KVM_GUEST is selected, but CONFIG_KVM is not, we were missing
some defines in asm-offsets.c and included too many headers at other places.

This patch makes above configuration work.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:51:44 +02:00
Alexander Graf
a58ddea556 KVM: PPC: Move KVM trampolines before __end_interrupts
When using a relocatable kernel we need to make sure that the trampline code
and the interrupt handlers are both copied to low memory. The only way to do
this reliably is to put them in the copied section.

This patch should make relocated kernels work with KVM.

KVM-Stable-Tag
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:59 +02:00
Alexander Graf
644bfa013f KVM: PPC: PV wrteei
On BookE the preferred way to write the EE bit is the wrteei instruction. It
already encodes the EE bit in the instruction.

So in order to get BookE some speedups as well, let's also PV'nize thati
instruction.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:57 +02:00
Alexander Graf
7810927760 KVM: PPC: PV mtmsrd L=0 and mtmsr
There is also a form of mtmsr where all bits need to be addressed. While the
PPC64 Linux kernel behaves resonably well here, on PPC32 we do not have an
L=1 form. It does mtmsr even for simple things like only changing EE.

So we need to hook into that one as well and check for a mask of bits that we
deem safe to change from within guest context.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:56 +02:00
Alexander Graf
819a63dc79 KVM: PPC: PV mtmsrd L=1
The PowerPC ISA has a special instruction for mtmsr that only changes the EE
and RI bits, namely the L=1 form.

Since that one is reasonably often occuring and simple to implement, let's
go with this first. Writing EE=0 is always just a store. Doing EE=1 also
requires us to check for pending interrupts and if necessary exit back to the
hypervisor.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:56 +02:00
Alexander Graf
92234722ed KVM: PPC: PV assembler helpers
When we hook an instruction we need to make sure we don't clobber any of
the registers at that point. So we write them out to scratch space in the
magic page. To make sure we don't fall into a race with another piece of
hooked code, we need to disable interrupts.

To make the later patches and code in general easier readable, let's introduce
a set of defines that save and restore r30, r31 and cr. Let's also define some
helpers to read the lower 32 bits of a 64 bit field on 32 bit systems.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:55 +02:00
Alexander Graf
71ee8e34fe KVM: PPC: Introduce branch patching helper
We will need to patch several instruction streams over to a different
code path, so we need a way to patch a single instruction with a branch
somewhere else.

This patch adds a helper to facilitate this patching.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:54 +02:00
Alexander Graf
2d4f567103 KVM: PPC: Introduce kvm_tmp framework
We will soon require more sophisticated methods to replace single instructions
with multiple instructions. We do that by branching to a memory region where we
write replacement code for the instruction to.

This region needs to be within 32 MB of the patched instruction though, because
that's the furthest we can jump with immediate branches.

So we keep 1MB of free space around in bss. After we're done initing we can just
tell the mm system that the unused pages are free, but until then we have enough
space to fit all our code in.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:54 +02:00
Alexander Graf
d1290b15e7 KVM: PPC: PV tlbsync to nop
With our current MMU scheme we don't need to know about the tlbsync instruction.
So we can just nop it out.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:53 +02:00
Alexander Graf
d1293c9275 KVM: PPC: PV instructions to loads and stores
Some instructions can simply be replaced by load and store instructions to
or from the magic page.

This patch replaces often called instructions that fall into the above category.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:52 +02:00
Alexander Graf
73a1810982 KVM: PPC: KVM PV guest stubs
We will soon start and replace instructions from the text section with
other, paravirtualized versions. To ease the readability of those patches
I split out the generic looping and magic page mapping code out.

This patch still only contains stubs. But at least it loops through the
text section :).

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:51 +02:00
Alexander Graf
d17051cb8d KVM: PPC: Generic KVM PV guest support
We have all the hypervisor pieces in place now, but the guest parts are still
missing.

This patch implements basic awareness of KVM when running Linux as guest. It
doesn't do anything with it yet though.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:50 +02:00
Alexander Graf
2a342ed577 KVM: PPC: Implement hypervisor interface
To communicate with KVM directly we need to plumb some sort of interface
between the guest and KVM. Usually those interfaces use hypercalls.

This hypercall implementation is described in the last patch of the series
in a special documentation file. Please read that for further information.

This patch implements stubs to handle KVM PPC hypercalls on the host and
guest side alike.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:45 +02:00
Alexander Graf
666e7252a1 KVM: PPC: Convert MSR to shared page
One of the most obvious registers to share with the guest directly is the
MSR. The MSR contains the "interrupts enabled" flag which the guest has to
toggle in critical sections.

So in order to bring the overhead of interrupt en- and disabling down, let's
put msr into the shared page. Keep in mind that even though you can fully read
its contents, writing to it doesn't always update all state. There are a few
safe fields that don't require hypervisor interaction. See the documentation
for a list of MSR bits that are safe to be set from inside the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:43 +02:00
Alexander Graf
96bc451a15 KVM: PPC: Introduce shared page
For transparent variable sharing between the hypervisor and guest, I introduce
a shared page. This shared page will contain all the registers the guest can
read and write safely without exiting guest context.

This patch only implements the stubs required for the basic structure of the
shared page. The actual register moving follows.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:42 +02:00
Linus Torvalds
092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Linus Torvalds
d4429f608a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (71 commits)
  powerpc/44x: Update ppc44x_defconfig
  powerpc/watchdog: Make default timeout for Book-E watchdog a Kconfig option
  fsl_rio: Add comments for sRIO registers.
  powerpc/fsl-booke: Add e55xx (64-bit) smp defconfig
  powerpc/fsl-booke: Add p5020 DS board support
  powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips
  powerpc/fsl-booke: Add support for FSL Arch v1.0 MMU in setup_page_sizes
  powerpc/fsl-booke: Add support for FSL 64-bit e5500 core
  powerpc/85xx: add cache-sram support
  powerpc/85xx: add ngPIXIS FPGA device tree node to the P1022DS board
  powerpc: Fix compile error with paca code on ppc64e
  powerpc/fsl-booke: Add p3041 DS board support
  oprofile/fsl emb: Don't set MSR[PMM] until after clearing the interrupt.
  powerpc/fsl-booke: Add PCI device ids for P2040/P3041/P5010/P5020 QoirQ chips
  powerpc/mpc8xxx_gpio: Add support for 'qoriq-gpio' controllers
  powerpc/fsl_booke: Add support to boot from core other than 0
  powerpc/p1022: Add probing for individual DMA channels
  powerpc/fsl_soc: Search all global-utilities nodes for rstccr
  powerpc: Fix invalid page flags in create TLB CAM path for PTE_64BIT
  powerpc/mpc83xx: Support for MPC8308 P1M board
  ...

Fix up conflict with the generic irq_work changes in arch/powerpc/kernel/time.c
2010-10-21 21:19:54 -07:00
Linus Torvalds
3044100e58 Merge branch 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
  x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S
  xen: Cope with unmapped pages when initializing kernel pagetable
  memblock, bootmem: Round pfn properly for memory and reserved regions
  memblock: Annotate memblock functions with __init_memblock
  memblock: Allow memblock_init to be called early
  memblock/arm: Fix memblock_region_is_memory() typo
  x86, memblock: Remove __memblock_x86_find_in_range_size()
  memblock: Fix wraparound in find_region()
  x86-32, memblock: Make add_highpages honor early reserved ranges
  x86, memblock: Fix crashkernel allocation
  arm, memblock: Fix the sparsemem build
  memblock: Fix section mismatch warnings
  powerpc, memblock: Fix memblock API change fallout
  memblock, microblaze: Fix memblock API change fallout
  x86: Remove old bootmem code
  x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve
  x86: Remove not used early_res code
  x86, memblock: Replace e820_/_early string with memblock_
  x86: Use memblock to replace early_res
  x86, memblock: Use memblock_debug to control debug message print out
  ...

Fix up trivial conflicts in arch/x86/kernel/setup.c and kernel/Makefile
2010-10-21 18:52:11 -07:00
Linus Torvalds
e36f561a2c Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-irqflags
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-irqflags:
  Fix IRQ flag handling naming
  MIPS: Add missing #inclusions of <linux/irq.h>
  smc91x: Add missing #inclusion of <linux/irq.h>
  Drop a couple of unnecessary asm/system.h inclusions
  SH: Add missing consts to sys_execve() declaration
  Blackfin: Rename IRQ flags handling functions
  Blackfin: Add missing dep to asm/irqflags.h
  Blackfin: Rename DES PC2() symbol to avoid collision
  Blackfin: Split the BF532 BFIN_*_FIO_FLAG() functions to their own header
  Blackfin: Split PLL code from mach-specific cdef headers
2010-10-21 14:37:27 -07:00
Grant Likely
32c97689c4 of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch
This patch refactors the early init parsing of the chosen node so that
architectures aren't forced to provide an empty implementation of
early_init_dt_scan_chosen_arch.  Instead, if an architecture wants to
do something different, it can either use a wrapper function around
early_init_dt_scan_chosen(), or it can replace it altogether.

This patch was written in preparation to adding device tree support to
both x86 ad MIPS.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: David Daney <ddaney@caviumnetworks.com>
2010-10-21 11:10:10 -06:00
Grant Likely
7096d04221 of/device: Rework to use common platform_device_alloc() for allocating devices
The current code allocates and manages platform_devices created from
the device tree manually.  It also uses an unsafe shortcut for
allocating the platform_device and the resource table at the same
time. (which I added in the last rework; sorry).

This patch refactors the code to use platform_device_alloc() for
allocating new devices.  This reduces the amount of custom code
implemented by of_platform, eliminates the unsafe alloc trick, and has
the side benefit of letting the platform_bus code manage freeing the
device data and resources when the device is freed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michal Simek <monstr@monstr.eu>
2010-10-21 11:10:10 -06:00
Paul Mackerras
57fa721433 perf, powerpc: Fix power_pmu_event_init to not use event->ctx
Commit c3f00c70 ("perf: Separate find_get_context() from event
initialization") changed the generic perf_event code to call
perf_event_alloc, which calls the arch-specific event_init code,
before looking up the context for the new event.  Unfortunately,
power_pmu_event_init uses event->ctx->task to see whether the
new event is a per-task event or a system-wide event, and thus
crashes since event->ctx is NULL at the point where
power_pmu_event_init gets called.

(The reason it needs to know whether it is a per-task event is
because there are some hardware events on Power systems which
only count when the processor is not idle, and there are some
fixed-function counters which count such events.  For example,
the "run cycles" event counts cycles when the processor is not
idle.  If the user asks to count cycles, we can use "run cycles"
if this is a per-task event, since the processor is running when
the task is running, by definition.  We can't use "run cycles"
if the user asks for "cycles" on a system-wide counter.)

Fortunately the information we need is in the
event->attach_state field, so we just use that instead.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101019055535.GA10398@drongo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-19 09:18:34 +02:00
Peter Zijlstra
e360adbe29 irq_work: Add generic hardirq context callbacks
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-18 19:58:50 +02:00
Arnd Bergmann
6038f373a3 llseek: automatically add .llseek fop
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15 15:53:27 +02:00
Benjamin Herrenschmidt
6a1c9dfe41 Merge remote branch 'jwb/next' into next 2010-10-15 10:45:03 +11:00
Kumar Gala
55fd766b5f powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips
On Freescale parts typically have TLB array for large mappings that we can
bolt the linear mapping into.  We utilize the code that already exists
on PPC32 on the 64-bit side to setup the linear mapping to be cover by
bolted TLB entries.  We utilize a quarter of the variable size TLB array
for this purpose.

Additionally, we limit the amount of memory to what we can cover via
bolted entries so we don't get secondary faults in the TLB miss
handlers.  We should fix this limitation in the future.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:55:14 -05:00
Kumar Gala
4490c06b58 powerpc/fsl-booke: Add support for FSL 64-bit e5500 core
The new e5500 core is similar to the e500mc core but adds 64-bit
support.  We support running it in 32-bit mode as it is identical to the
e500mc.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:55:03 -05:00
Kumar Gala
3c4b76449b powerpc: Fix compile error with paca code on ppc64e
arch/powerpc/kernel/paca.c: In function 'allocate_lppacas':
arch/powerpc/kernel/paca.c:111:1: error: parameter name omitted
arch/powerpc/kernel/paca.c:111:1: error: parameter name omitted

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:53:08 -05:00
Matthew McClintock
2ed38b2359 powerpc/fsl_booke: Add support to boot from core other than 0
First we check to see if we are the first core booting up. This
is accomplished by comparing the boot_cpuid with -1, if it is we
assume this is the first core coming up.

Secondly, we need to update the initial thread info structure
to reflect the actual cpu we are running on otherwise
smp_processor_id() and related functions will return the default
initialization value of the struct or 0.

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:52:58 -05:00
Matthew McClintock
c71635d288 powerpc/kexec: make masking/disabling interrupts generic
Right now just the kexec crash pathway turns turns off the interrupts.
Pull that out and make a generic version for use elsewhere

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:52:46 -05:00
Timur Tabi
55ec2fca3e powerpc: export ppc_proc_freq and ppc_tb_freq as GPL symbols
Export the global variable 'ppc_tb_freq', so that modules (like the Book-E
watchdog driver) can use it.  To maintain consistency, ppc_proc_freq is
changed to a GPL-only export.  This is okay, because any module that needs
this symbol should be an actual Linux driver, which must be GPL-licensed.

Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:52:43 -05:00
Tirumala Marri
6edc323db7 powerpc/44x: Add support for the AMCC APM821xx SoC
This patch adds CPU, device tree, defconfig and bluestone board
support for APM821xx SoC.

Signed-off-by: Tirumala R Marri <tmarri@apm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-10-13 08:47:09 -04:00
matt mooney
4108d9ba90 powerpc/Makefiles: Change to new flag variables
Replace EXTRA_CFLAGS with ccflags-y and EXTRA_AFLAGS with asflags-y.

Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:22 +11:00
Nishanth Aravamudan
bc0df9ec4c powerpc/pci: Cleanup device dma setup code
Use set_dma_ops and remove unused oddly-named temp pointer sd.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:22 +11:00
Nishanth Aravamudan
45848e0fc1 powerpc/viobus: Free TCE table on device release
Release the TCE table as the XXX suggests, except on FW_FEATURE_ISERIES,
where the tables are allocated globally and reused.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Nishanth Aravamudan
edea8f6f48 powerpc/vio: Use put_device() on device_register failure
The kernel doc for device_register (and device_initialize) very clearly
state to call put_device not kfree after calling, even on error.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Nishanth Aravamudan
ffa56e555a powerpc/dma: Fix check for direct DMA support
The current check is wrong because it does not take the DMA offset intot
account, and in the case of a driver which doesn't actually support
64bits would falsely report that device as working.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Nishanth Aravamudan
1cb8e85a9d powerpc/dma: Fix dma_iommu_dma_supported compare
The table offset is in entries, each of which imply a dma address of
an IOMMU page.

Also, we should check the device can reach the whole IOMMU table.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Julia Lawall
a655237fa2 powerpc/irq.c: Add of_node_put to avoid memory leak
In this case, a device_node structure is stored in another structure that
is then freed without first decrementing the reference count of the
device_node structure.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
expression x;
identifier f;
position p1,p2;
@@

x@p1->f = \(of_find_node_by_path\|of_find_node_by_name\|of_find_node_by_phandle\|of_get_parent\|of_get_next_parent\|of_get_next_child\|of_find_compatible_node\|of_match_node\|of_find_node_by_type\|of_find_node_with_property\|of_find_matching_node\|of_parse_phandle\|of_node_get\)(...);
... when != of_node_put(x)
kfree@p2(x)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@
cocci.print_main("call",p1)
cocci.print_secs("free",p2)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:04 +11:00
Joe Perches
4e74fd7d0a powerpc: Use static const char arrays
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:03 +11:00
Nathan Fontenot
d8862be122 powerpc/pseries: Export rtas_ibm_suspend_me()
Export the rtas_ibm_suspend_me() routine.  This is needed to perform
partition migration in the kernel.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:02 +11:00
Benjamin Herrenschmidt
4783f393de Merge remote branch 'kumar/merge' into next 2010-10-13 16:18:36 +11:00
Ingo Molnar
7cd2541cf2 Merge commit 'v2.6.36-rc7' into perf/core
Conflicts:
	arch/x86/kernel/module.c

Merge reason: Resolve the conflict, pick up fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08 10:46:27 +02:00
Ingo Molnar
153db80f8c Merge commit 'v2.6.36-rc7' into core/memblock
Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08 09:15:00 +02:00
Ian Munsie
f14362d1fe powerpc, of_serial: Endianness issues setting up the serial ports
The speed and clock of the serial ports is retrieved from the device
tree in both the PowerPC legacy serial code and the Open Firmware serial
driver, therefore they need to handle the fact that the device tree is
always big endian, while the CPU may not be.

Also fix other device tree references in the legacy serial code.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-10-07 17:21:15 -06:00
David Howells
df9ee29270 Fix IRQ flag handling naming
Fix the IRQ flag handling naming.  In linux/irqflags.h under one configuration,
it maps:

	local_irq_enable() -> raw_local_irq_enable()
	local_irq_disable() -> raw_local_irq_disable()
	local_irq_save() -> raw_local_irq_save()
	...

and under the other configuration, it maps:

	raw_local_irq_enable() -> local_irq_enable()
	raw_local_irq_disable() -> local_irq_disable()
	raw_local_irq_save() -> local_irq_save()
	...

This is quite confusing.  There should be one set of names expected of the
arch, and this should be wrapped to give another set of names that are expected
by users of this facility.

Change this to have the arch provide:

	flags = arch_local_save_flags()
	flags = arch_local_irq_save()
	arch_local_irq_restore(flags)
	arch_local_irq_disable()
	arch_local_irq_enable()
	arch_irqs_disabled_flags(flags)
	arch_irqs_disabled()
	arch_safe_halt()

Then linux/irqflags.h wraps these to provide:

	raw_local_save_flags(flags)
	raw_local_irq_save(flags)
	raw_local_irq_restore(flags)
	raw_local_irq_disable()
	raw_local_irq_enable()
	raw_irqs_disabled_flags(flags)
	raw_irqs_disabled()
	raw_safe_halt()

with type checking on the flags 'arguments', and then wraps those to provide:

	local_save_flags(flags)
	local_irq_save(flags)
	local_irq_restore(flags)
	local_irq_disable()
	local_irq_enable()
	irqs_disabled_flags(flags)
	irqs_disabled()
	safe_halt()

with tracing included if enabled.

The arch functions can now all be inline functions rather than some of them
having to be macros.

Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300]
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile]
Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze]
Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR]
Acked-by: Tony Luck <tony.luck@intel.com> [IA-64]
Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R]
Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU]
Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC]
Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC]
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390]
Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score]
Acked-by: Matt Fleming <matt@console-pimps.org> [SH]
Acked-by: David S. Miller <davem@davemloft.net> [Sparc]
Acked-by: Chris Zankel <chris@zankel.net> [Xtensa]
Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha]
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300]
Cc: starvik@axis.com [CRIS]
Cc: jesper.nilsson@axis.com [CRIS]
Cc: linux-cris-kernel@axis.com
2010-10-07 14:08:55 +01:00
Stephen Rothwell
7c6d45e665 powerpc: remove unused variable
Since powerpc uses -Werror on arch powerpc, the build was broken like
this:

  cc1: warnings being treated as errors
  arch/powerpc/kernel/module.c: In function 'module_finalize':
  arch/powerpc/kernel/module.c:66: error: unused variable 'err'

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-05 17:27:54 -07:00
Linus Torvalds
5336377d62 modules: Fix module_bug_list list corruption race
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.

However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling.  That code was
doubly obscure because (a) the code itself lives in lib/bugs.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.

Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.

So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.

Future fixups:
 - move the module list handling code into kernel/module.c where it
   belongs.
 - get rid of 'module_bug_list' and just use the regular list of modules
   (called 'modules' - imagine that) that we already create and maintain
   for other reasons.

Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-05 11:29:27 -07:00
Paul Mackerras
9f5f9ffe50 powerpc/perf: Fix sampling enable for PPC970
The logic to distinguish marked instruction events from ordinary events
on PPC970 and derivatives was flawed.  The result is that instruction
sampling didn't get enabled in the PMU for some marked instruction
events, so they would never trigger.  This fixes it by adding the
appropriate break statements in the switch statement.

Reported-by: David Binderman <dcb314@hotmail.com>
Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-23 17:03:56 +10:00
Ingo Molnar
d0303d71c2 Merge branch 'linus' into perf/core
Conflicts:
	arch/sparc/kernel/perf_event.c

Merge reason: Resolve the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-23 08:02:09 +02:00
Al Viro
9a81c16b52 powerpc: fix double syscall restarts
Make sigreturn zero regs->trap, make do_signal() do the same on all
paths.  As it is, signal interrupting e.g. read() from fd 512 (==
ERESTARTSYS) with another signal getting unblocked when the first
handler finishes will lead to restart one insn earlier than it ought
to.  Same for multiple signals with in-kernel handlers interrupting
that sucker at the same time.  Same for multiple signals of any kind
interrupting that sucker on 64bit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-22 09:33:50 -07:00
Ingo Molnar
3aabae7d9d Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2010-09-15 10:27:31 +02:00
Peter Zijlstra
a4eaf7f146 perf: Rework the PMU methods
Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.

The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.

This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).

It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).

The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:

 1) We disable the counter:
    a) the pmu has per-counter enable bits, we flip that
    b) we program a NOP event, preserving the counter state

 2) We store the counter state and ignore all read/overflow events

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:30 +02:00
Peter Zijlstra
33696fc0d1 perf: Per PMU disable
Changes perf_disable() into perf_pmu_disable().

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:29 +02:00
Peter Zijlstra
24cd7f54a0 perf: Reduce perf_disable() usage
Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:29 +02:00
Peter Zijlstra
b0a873ebbf perf: Register PMU implementations
Simple registration interface for struct pmu, this provides the
infrastructure for removing all the weak functions.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:28 +02:00
Peter Zijlstra
51b0fe3954 perf: Deconstify struct pmu
sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:27 +02:00
Benjamin Herrenschmidt
5b6e9ff6de powerpc/dma: Add optional platform override of dma_set_mask()
Some platforms may want to override dma_set_mask() to take into
account some specific "features" such as the availability of
a direct-map window in addition to an iommu.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Denis Kirjanov
cab175f9fa powerpc: Use is_32bit_task() helper to test 32-bit binary
This patch removes all explicit tests for the TIF_32BIT flag

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Andreas Schwab
05d77ac90c powerpc: Remove fpscr use from [kvm_]cvt_{fd,df}
Neither lfs nor stfs touch the fpscr, so remove the restore/save of it
around them.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Paul Mackerras
872e439a45 powerpc/pseries: Re-enable dispatch trace log userspace interface
Since the cpu accounting code uses the hypervisor dispatch trace log
now when CONFIG_VIRT_CPU_ACCOUNTING = y, the previous commit disabled
access to it via files in the /sys/kernel/debug/powerpc/dtl/ directory
in that case.  This restores those files.

To do this, we now have a hook that the cpu accounting code will call
as it processes each entry from the hypervisor dispatch trace log.
The code in dtl.c now uses that to fill up its ring buffer, rather
than having the hypervisor fill the ring buffer directly.

This also fixes dtl_file_read() to handle overflow conditions a bit
better and adds a spinlock to ensure that race conditions (multiple
processes opening or reading the file concurrently) are handled
correctly.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Paul Mackerras
cf9efce0ce powerpc: Account time using timebase rather than PURR
Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the
PURR register for measuring the user and system time used by
processes, as well as other related times such as hardirq and
softirq times.  This turns out to be quite confusing for users
because it means that a program will often be measured as taking
less time when run on a multi-threaded processor (SMT2 or SMT4 mode)
than it does when run on a single-threaded processor (ST mode), even
though the program takes longer to finish.  The discrepancy is
accounted for as stolen time, which is also confusing, particularly
when there are no other partitions running.

This changes the accounting to use the timebase instead, meaning that
the reported user and system times are the actual number of real-time
seconds that the program was executing on the processor thread,
regardless of which SMT mode the processor is in.  Thus a program will
generally show greater user and system times when run on a
multi-threaded processor than on a single-threaded processor.

On pSeries systems on POWER5 or later processors, we measure the
stolen time (time when this partition wasn't running) using the
hypervisor dispatch trace log.  We check for new entries in the
log on every entry from user mode and on every transition from
kernel process context to soft or hard IRQ context (i.e. when
account_system_vtime() gets called).  So that we can correctly
distinguish time stolen from user time and time stolen from system
time, without having to check the log on every exit to user mode,
we store separate timestamps for exit to user mode and entry from
user mode.

On systems that have a SPURR (POWER6 and POWER7), we read the SPURR
in account_system_vtime() (as before), and then apportion the SPURR
ticks since the last time we read it between scaled user time and
scaled system time according to the relative proportions of user
time and system time over the same interval.  This avoids having to
read the SPURR on every kernel entry and exit.  On systems that have
PURR but not SPURR (i.e., POWER5), we do the same using the PURR
rather than the SPURR.

This disables the DTL user interface in /sys/debug/kernel/powerpc/dtl
for now since it conflicts with the use of the dispatch trace log
by the time accounting code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Paul Mackerras
93c22703ef powerpc: Dynamically allocate most lppaca structs
This arranges for the lppaca structs for most cpus to be dynamically
allocated in the same manner as the paca structs.  If we don't include
support for legacy iSeries, only the first lppaca is statically
allocated; the rest are dynamically allocated.  If we include legacy
iSeries support, then we statically allocate the first 64 lppaca
structs, since the iSeries hypervisor requires that the lppaca
structs be present in the data section of the kernel image, but
legacy iSeries supports at most 64 cpus.

With CONFIG_NR_CPUS, the kernel image size for a typical pSeries config
went from:

   text    data     bss     dec     hex filename
9524478 4734564 8469944 22728986        15ad11a ../test-1024/vmlinux

to:

   text    data     bss     dec     hex filename
9524482 3751508 8469944 21745934        14bd10e ../test-1024/vmlinux

a reduction of 983052 bytes overall.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Paul Mackerras
8154c5d22d powerpc: Abstract indexing of lppaca structs
Currently we have the lppaca structs as a simple array of NR_CPUS
entries, taking up space in the data section of the kernel image.
In future we would like to allocate them dynamically, so this
abstracts out the accesses to the array, making it easier to
change how we locate the lppaca for a given cpu in future.
Specifically, lppaca[cpu] changes to lppaca_of(cpu).

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Michael Neuling
e1f0ece113 powerpc: Move arch_sd_sibling_asym_packing() to smp.c
Simple cleanup by moving arch_sd_sibling_asym_packing from process.c to
smp.c to save an #ifdef CONFIG_SMP

No functionality change.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Anton Blanchard
f89451fbd2 powerpc: Feature nop out reservation clear when stcx checks address
The POWER architecture does not require stcx to check that it is operating
on the same address as the larx. This means it is possible for an
an exception handler to execute a larx, get a reservation, decide
not to do the stcx and then return back with an active reservation. If the
interrupted code was in the middle of a larx/stcx sequence the stcx could
incorrectly succeed.

All recent POWER CPUs check the address before letting the stcx succeed
so we can create a CPU feature and nop it out. As Ben suggested, we can
only do this in our syscall path because there is a remote possibility
some kernel code gets interrupted by an exception that ends up operating
on the same cacheline.

Thanks to Paul Mackerras and Derek Williams for the idea.

To test this I used a very simple null syscall (actually getppid) testcase
at http://ozlabs.org/~anton/junkcode/null_syscall.c

I tested against 2.6.35-git10 with the following changes against the
pseries_defconfig:

CONFIG_VIRT_CPU_ACCOUNTING=n
CONFIG_AUDIT=n
CONFIG_PPC_4K_PAGES=n
CONFIG_PPC_64K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=9
CONFIG_PPC_SUBPAGE_PROT=n
CONFIG_FUNCTION_TRACER=n
CONFIG_FUNCTION_GRAPH_TRACER=n
CONFIG_IRQSOFF_TRACER=n
CONFIG_STACK_TRACER=n

to remove the overhead of virtual CPU accounting, syscall auditing and
the ftrace mcount tracers. 64kB pages were enabled to minimise TLB misses.

POWER6: +8.2%
POWER7: +7.0%

Another suggestion was to use a larx to something in the L1 instead of a stcx.
This was almost as fast as removing the larx on POWER6, but only 3.5% faster
on POWER7. We can use this to speed up the reservation clear in our
exception exit code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:30 +10:00
Ingo Molnar
daab7fc734 Merge commit 'v2.6.36-rc3' into x86/memblock
Conflicts:
	arch/x86/kernel/trampoline.c
	mm/memblock.c

Merge reason: Resolve the conflicts, update to latest upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-31 09:45:46 +02:00
Michael Neuling
54a8340433 powerpc: Don't use kernel stack with translation off
In f761622e59 we changed
early_setup_secondary so it's called using the proper kernel stack
rather than the emergency one.

Unfortunately, this stack pointer can't be used when translation is off
on PHYP as this stack pointer might be outside the RMO.  This results in
the following on all non zero cpus:
  cpu 0x1: Vector: 300 (Data Access) at [c00000001639fd10]
      pc: 000000000001c50c
      lr: 000000000000821c
      sp: c00000001639ff90
     msr: 8000000000001000
     dar: c00000001639ffa0
   dsisr: 42000000
    current = 0xc000000016393540
    paca    = 0xc000000006e00200
      pid   = 0, comm = swapper

The original patch was only tested on bare metal system, so it never
caught this problem.

This changes __secondary_start so that we calculate the new stack
pointer but only start using it after we've called early_setup_secondary.

With this patch, the above problem goes away.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31 11:35:13 +10:00
Paul Mackerras
b0d278b7d3 powerpc/perf_event: Reduce latency of calling perf_event_do_pending
Commit 0fe1ac48 ("powerpc/perf_event: Fix oops due to
perf_event_do_pending call") moved the call to perf_event_do_pending
in timer_interrupt() down so that it was after the irq_enter() call.
Unfortunately this moved it after the code that checks whether it
is time for the next decrementer clock event.  The result is that
the call to perf_event_do_pending() won't happen until the next
decrementer clock event is due.  This was pointed out by Milton
Miller.

This fixes it by moving the check for whether it's time for the
next decrementer clock event down to the point where we're about
to call the event handler, after we've called perf_event_do_pending.

This has the side effect that on old pre-Core99 Powermacs where we
use the ppc_n_lost_interrupts mechanism to replay interrupts, a
replayed interrupt will incur a little more latency since it will
now do the code from the irq_enter down to the irq_exit, that it
used to skip.  However, these machines are now old and rare enough
that this doesn't matter.  To make it clear that ppc_n_lost_interrupts
is only used on Powermacs, and to speed up the code slightly on
non-Powermac ppc32 machines, the code that tests ppc_n_lost_interrupts
is now conditional on CONFIG_PMAC as well as CONFIG_PPC32.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31 11:35:13 +10:00
Matthew McClintock
4562c986f0 powerpc/kexec: Adds correct calling convention for kexec purgatory
Call kexec purgatory code correctly. We were getting lucky before.
If you examine the powerpc 32bit kexec "purgatory" code you will
see it expects the following:

>From kexec-tools: purgatory/arch/ppc/v2wrap_32.S
-> calling convention:
->   r3 = physical number of this cpu (all cpus)
->   r4 = address of this chunk (master only)

As such, we need to set r3 to the current core, r4 happens to be
unused by purgatory at the moment but we go ahead and set it
here as well

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31 11:35:12 +10:00
Ingo Molnar
7de5d895b2 Merge branch 'linus' into perf/core
Merge reason: pick up perf fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-25 13:10:00 +02:00
Andreas Schwab
bcc30d3758 powerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscalls
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:28:28 +10:00
Grant Likely
76ec01dbb7 powerpc/pci: Fix checking for child bridges in PCI code.
pci_device_to_OF_node() can return null, and list_for_each_entry will
never enter the loop when dev is NULL, so it looks like this test is
a typo.

Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:28:27 +10:00
Matt Evans
f761622e59 powerpc: Initialise paca->kstack before early_setup_secondary
As early setup calls down to slb_initialize(), we must have kstack
initialised before checking "should we add a bolted SLB entry for our kstack?"

Failing to do so means stack access requires an SLB miss exception to refill
an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text
& static data).  It's not always allowable to take such a miss, and
intermittent crashes will result.

Primary CPUs don't have this issue; an SLB entry is not bolted for their
stack anyway (as that lives within SLB(0)).  This patch therefore only
affects the init of secondaries.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:31 +10:00
Anton Blanchard
7aa241fdce powerpc: Fix bogus it_blocksize in VIO iommu code
When looking at some issues with the virtual ethernet driver I noticed
that TCE allocation was following a very strange pattern:

address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048

Huge unexplained gaps in what should be an empty TCE table. It turns out
it_blocksize, the amount we want to align the next allocation to, was
c0000000fe903b20. Completely bogus.

Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc
everywhere to protect against this when we next add a non compulsary
field to iommu code and forget to initialise it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:31 +10:00
Anton Blanchard
4138d65333 powerpc: Inline ppc64_runlatch_off
I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it
into the callers. To avoid a mess of circular includes I didn't add
it as an inline function.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:30 +10:00
Nathan Fontenot
954e6da54b powerpc: Correct smt_enabled=X boot option for > 2 threads per core
The 'smt_enabled=X' boot option does not handle values of X > 2.
For Power 7 processors with smt modes of 0,1,2,3, and 4 this does
not work.  This patch allows the smt_enabled option to be set to
any value limited to a max equal to the number of threads per
core.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:30 +10:00
Signed-off-by: Darren Hart
6685a47749 powerpc: Silence __cpu_up() under normal operation
During CPU offline/online tests __cpu_up would flood the logs with
the following message:

Processor 0 found.

This provides no useful information to the user as there is no context
provided, and since the operation was a success (to this point) it is expected
that the CPU will come back online, providing all the feedback necessary.

Change the "Processor found" message to DBG() similar to other such messages in
the same function. Also, add an appropriate log level for the "Processor is
stuck" message.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:29 +10:00
Signed-off-by: Darren Hart
a7c2bb8279 powerpc: Re-enable preemption before cpu_die()
start_secondary() is called shortly after _start and also via

cpu_idle()->cpu_die()->pseries_mach_cpu_die()

start_secondary() expects a preempt_count() of 0. pseries_mach_cpu_die() is
called via the cpu_idle() routine with preemption disabled, resulting in the
following repeating message during rapid cpu offline/online tests
with CONFIG_PREEMPT=y:

BUG: scheduling while atomic: swapper/0/0x00000002
Modules linked in: autofs4 binfmt_misc dm_mirror dm_region_hash dm_log [last unloaded: scsi_wait_scan]
Call Trace:
[c00000010e7079c0] [c0000000000133ec] .show_stack+0xd8/0x218 (unreliable)
[c00000010e707aa0] [c0000000006a47f0] .dump_stack+0x28/0x3c
[c00000010e707b20] [c00000000006e7a4] .__schedule_bug+0x7c/0x9c
[c00000010e707bb0] [c000000000699d9c] .schedule+0x104/0x800
[c00000010e707cd0] [c000000000015b24] .cpu_idle+0x1c4/0x1d8
[c00000010e707d70] [c0000000006aa1b4] .start_secondary+0x398/0x3d4
[c00000010e707e30] [c000000000008278] .start_secondary_resume+0x10/0x14

Move the cpu_die() call inside the existing preemption enabled block of
cpu_idle(). This is safe as the idle task is affined to a single CPU so the
debug_smp_processor_id() tests (from cpu_should_die()) won't trigger as we are
in a "migration disabled" region.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:29 +10:00
Julia Lawall
da9bef6735 powerpc/pci: Drop unnecessary null test
list_for_each_entry binds its first argument to a non-null value, and thus
any null test on the value of that argument is superfluous.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
iterator I;
expression x,E,E1,E2;
statement S,S1,S2;
@@

I(x,...) { <...
- if (x != NULL || ...)
  S
  ...> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:28 +10:00
Anton Blanchard
249ec22875 powerpc/kdump: Stop all other CPUs before running crash handlers
During kdump we run the crash handlers first then stop all other CPUs.
We really want to stop all CPUs as close to the fail as possible and also
have a very controlled environment for running the crash handlers, so it
makes sense to reverse the order.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:27 +10:00
Denis Kirjanov
9904b00593 powerpc: Use is_32bit_task() helper to test 32 bit binary
Use is_32bit_task() helper to test 32 bit binary.

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:27 +10:00
Benjamin Herrenschmidt
b1515af291 Merge remote branch 'jwb/merge' into merge 2010-08-24 14:36:45 +10:00
Dave Kleikamp
3e7f45ad52 powerpc/4xx: Index interrupt stacks by physical cpu
The interrupt stacks need to be indexed by the physical cpu since the
critical, debug and machine check handlers use the contents of SPRN_PIR to
index the critirq_ctx, dbgirq_ctx, and mcheckirq_ctx arrays.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23 07:37:53 -04:00
Dave Kleikamp
66477466b8 powerpc/47x: Remove redundant line from cputable.c
There are two entries for .cpu_user_features in
arch/powerpc/kernel/cputable.c.  Remove the one that doesn't belong

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23 07:37:01 -04:00
Dave Kleikamp
029b8f662b powerpc/47x: Make sure mcsr is cleared before enabling machine check interrupts
Clear the machine check syndrom register before enabling machine check
interrupts.  The initial state of the tlb can lead to parity errors being
flagged early after a cold boot.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23 07:36:58 -04:00
Ingo Molnar
c8710ad389 Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2010-08-19 12:48:09 +02:00
Frederic Weisbecker
f72c1a931e perf: Factorize callchain context handling
Store the kernel and user contexts from the generic layer instead
of archs, this gathers some repetitive code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
2010-08-19 01:32:11 +02:00
Frederic Weisbecker
56962b4449 perf: Generalize some arch callchain code
- Most archs use one callchain buffer per cpu, except x86 that needs
  to deal with NMIs. Provide a default perf_callchain_buffer()
  implementation that x86 overrides.

- Centralize all the kernel/user regs handling and invoke new arch
  handlers from there: perf_callchain_user() / perf_callchain_kernel()
  That avoid all the user_mode(), current->mm checks and so...

- Invert some parameters in perf_callchain_*() helpers: entry to the
  left, regs to the right, following the traditional (dst, src).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
2010-08-19 01:30:59 +02:00
Frederic Weisbecker
70791ce9ba perf: Generalize callchain_store()
callchain_store() is the same on every archs, inline it in
perf_event.h and rename it to perf_callchain_store() to avoid
any collision.

This removes repetitive code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
2010-08-19 01:30:11 +02:00
David Howells
d7627467b7 Make do_execve() take a const filename pointer
Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:

arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type

This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to.  This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel().  A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().

do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as they are some of them in .rodata, so marking these strings as
const should be fine.

Further kernel_execve() and sys_execve() need to be changed to match.

This has been test built on x86_64, frv, arm and mips.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-17 18:07:43 -07:00
David Howells
c788732523 Mark arguments to certain syscalls as being const
Mark arguments to certain system calls as being const where they should be but
aren't.  The list includes:

 (*) The filename arguments of various stat syscalls, execve(), various utimes
     syscalls and some mount syscalls.

 (*) The filename arguments of some syscall helpers relating to the above.

 (*) The buffer argument of various write syscalls.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-13 16:53:13 -07:00
Grant Likely
ee11006613 powerpc: fix i8042 module build error
of_i8042_{kbd,aux}_irq needs to be exported

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-06 20:49:20 -06:00
Linus Torvalds
b62ad9ab18 Merge branch 'timers-timekeeping-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-timekeeping-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  um: Fix read_persistent_clock fallout
  kgdb: Do not access xtime directly
  powerpc: Clean up obsolete code relating to decrementer and timebase
  powerpc: Rework VDSO gettimeofday to prevent time going backwards
  clocksource: Add __clocksource_updatefreq_hz/khz methods
  x86: Convert common clocksources to use clocksource_register_hz/khz
  timekeeping: Make xtime and wall_to_monotonic static
  hrtimer: Cleanup direct access to wall_to_monotonic
  um: Convert to use read_persistent_clock
  timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset
  powerpc: Cleanup xtime usage
  powerpc: Simplify update_vsyscall
  time: Kill off CONFIG_GENERIC_TIME
  time: Implement timespec_add
  x86: Fix vtime/file timestamp inconsistencies

Trivial conflicts in Documentation/feature-removal-schedule.txt

Much less trivial conflicts in arch/powerpc/kernel/time.c resolved as
per Thomas' earlier merge commit 47916be4e2 ("Merge branch
'powerpc.cherry-picks' into timers/clocksource")
2010-08-06 13:18:29 -07:00
Linus Torvalds
c4efd6b569 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  sched: Use correct macro to display sched_child_runs_first in /proc/sched_debug
  sched: No need for bootmem special cases
  sched: Revert nohz_ratelimit() for now
  sched: Reduce update_group_power() calls
  sched: Update rq->clock for nohz balanced cpus
  sched: Fix spelling of sibling
  sched, cpuset: Drop __cpuexit from cpu hotplug callbacks
  sched: Fix the racy usage of thread_group_cputimer() in fastpath_timer_check()
  sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()
  sched: thread_group_cputime: Simplify, document the "alive" check
  sched: Remove the obsolete exit_state/signal hacks
  sched: task_tick_rt: Remove the obsolete ->signal != NULL check
  sched: __sched_setscheduler: Read the RLIMIT_RTPRIO value lockless
  sched: Fix comments to make them DocBook happy
  sched: Fix fix_small_capacity
  powerpc: Exclude arch_sd_sibiling_asym_packing() on UP
  powerpc: Enable asymmetric SMT scheduling on POWER7
  sched: Add asymmetric group packing option for sibling domain
  sched: Fix capacity calculations for SMT4
  sched: Change nohz idle load balancing logic to push model
  ...
2010-08-06 09:39:22 -07:00
Linus Torvalds
4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Linus Torvalds
89a6c8cb9e Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  debug_core,kdb: fix crash when arch does not have single step
  kgdb,x86: use macro HBP_NUM to replace magic number 4
  kgdb,mips: remove unused kgdb_cpu_doing_single_step operations
  mm,kdb,kgdb: Add a debug reference for the kdb kmap usage
  KGDB: Remove set but unused newPC
  ftrace,kdb: Allow dumping a specific cpu's buffer with ftdump
  ftrace,kdb: Extend kdb to be able to dump the ftrace buffer
  kgdb,powerpc: Replace hardcoded offset by BREAK_INSTR_SIZE
  arm,kgdb: Add ability to trap into debugger on notify_die
  gdbstub: do not directly use dbg_reg_def[] in gdb_cmd_reg_set()
  gdbstub: Implement gdbserial 'p' and 'P' packets
  kgdb,arm: Individual register get/set for arm
  kgdb,mips: Individual register get/set for mips
  kgdb,x86: Individual register get/set for x86
  kgdb,kdb: individual register set and and get API
  gdbstub: Optimize kgdb's "thread:" response for the gdb serial protocol
  kgdb: remove custom hex_to_bin()implementation
2010-08-05 15:59:48 -07:00
Linus Torvalds
03c0c29aff Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (63 commits)
  of/platform: Register of_platform_drivers with an "of:" prefix
  of/address: Clean up function declarations
  of/spi: call of_register_spi_devices() from spi core code
  of: Provide default of_node_to_nid() implementation.
  of/device: Make of_device_make_bus_id() usable by other code.
  of/irq: Fix endian issues in parsing interrupt specifiers
  of: Fix phandle endian issues
  of/flattree: fix of_flat_dt_is_compatible() to match the full compatible string
  of: remove of_default_bus_ids
  of: make of_find_device_by_node generic
  microblaze: remove references to of_device and to_of_device
  sparc: remove references to of_device and to_of_device
  powerpc: remove references to of_device and to_of_device
  of/device: Replace of_device with platform_device in includes and core code
  of/device: Protect against binding of_platform_drivers to non-OF devices
  of: remove asm/of_device.h
  of: remove asm/of_platform.h
  of/platform: remove all of_bus_type and of_platform_bus_type references
  of: Merge of_platform_bus_type with platform_bus_type
  drivercore/of: Add OF style matching to platform bus
  ...

Fix up trivial conflicts in arch/microblaze/kernel/Makefile due to just
some obj-y removals by the devicetree branch, while the microblaze
updates added a new file.
2010-08-05 15:57:35 -07:00
Linus Torvalds
cdd854bc42 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (79 commits)
  powerpc/8xx: Add support for the MPC8xx based boards from TQC
  powerpc/85xx: Introduce support for the Freescale P1022DS reference board
  powerpc/85xx: Adding DTS for the STx GP3-SSA MPC8555 board
  powerpc/85xx: Change deprecated binding for 85xx-based boards
  powerpc/tqm85xx: add a quirk for ti1520 PCMCIA bridge
  powerpc/tqm85xx: update PCI interrupt-map attribute
  powerpc/mpc8308rdb: support for MPC8308RDB board from Freescale
  powerpc/fsl_pci: add quirk for mpc8308 pcie bridge
  powerpc/85xx: Cleanup QE initialization for MPC85xxMDS boards
  powerpc/85xx: Fix booting for P1021MDS boards
  powerpc/85xx: Fix SWIOTLB initalization for MPC85xxMDS boards
  powerpc/85xx: kexec for SMP 85xx BookE systems
  powerpc/5200/i2c: improve i2c bus error recovery
  of/xilinxfb: update tft compatible versions
  powerpc/fsl-diu-fb: Support setting display mode using EDID
  powerpc/5121: doc/dts-bindings: update doc of FSL DIU bindings
  powerpc/5121: shared DIU framebuffer support
  powerpc/5121: move fsl-diu-fb.h to include/linux
  powerpc/5121: fsl-diu-fb: fix issue with re-enabling DIU area descriptor
  powerpc/512x: add clock structure for Video-IN (VIU) unit
  ...
2010-08-05 09:03:46 -07:00
Michal Simek
3f0a55e357 kgdb,powerpc: Replace hardcoded offset by BREAK_INSTR_SIZE
kgdb_handle_breakpoint checks the first arch_kgdb_breakpoint
which is not known by gdb that's why is necessary jump over
it. The jump lenght is equal to BREAK_INSTR_SIZE that's
why is cleaner to use defined macro instead of hardcoded
non-described offset.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 09:22:22 -05:00
Benjamin Herrenschmidt
cd3db0c4ca memblock: Remove rmo_size, burry it in arch/powerpc where it belongs
The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact
server ppc64 though I hijack it on embedded ppc64 for similar purposes)
and represents the area of memory that can be accessed in real mode
(aka with MMU off), or on embedded, from the exception vectors (which
is bolted in the TLB) which pretty much boils down to the same thing.

We take that out of the generic MEMBLOCK data structure and move it into
arch/powerpc where it belongs, renaming it to "RMA" while at it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:08 +10:00
Benjamin Herrenschmidt
e63075a3c9 memblock: Introduce default allocation limit and use it to replace explicit ones
This introduce memblock.current_limit which is used to limit allocations
from memblock_alloc() or memblock_alloc_base(..., MEMBLOCK_ALLOC_ACCESSIBLE).

The old MEMBLOCK_ALLOC_ANYWHERE changes value from 0 to ~(u64)0 and can still
be used with memblock_alloc_base() to allocate really anywhere.

It is -no-longer- cropped to MEMBLOCK_REAL_LIMIT which disappears.

Note to archs: I'm leaving the default limit to MEMBLOCK_ALLOC_ANYWHERE. I
strongly recommend that you ensure that you set an appropriate limit
during boot in order to guarantee that an memblock_alloc() at any time
results in something that is accessible with a simple __va().

The reason is that a subsequent patch will introduce the ability for
the array to resize itself by reallocating itself. The MEMBLOCK core will
honor the current limit when performing those allocations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:07 +10:00
Linus Torvalds
3cfc2c42c1 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
  Documentation: update broken web addresses.
  fix comment typo "choosed" -> "chosen"
  hostap:hostap_hw.c Fix typo in comment
  Fix spelling contorller -> controller in comments
  Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
  fs/Kconfig: Fix typo Userpace -> Userspace
  Removing dead MACH_U300_BS26
  drivers/infiniband: Remove unnecessary casts of private_data
  fs/ocfs2: Remove unnecessary casts of private_data
  libfc: use ARRAY_SIZE
  scsi: bfa: use ARRAY_SIZE
  drm: i915: use ARRAY_SIZE
  drm: drm_edid: use ARRAY_SIZE
  synclink: use ARRAY_SIZE
  block: cciss: use ARRAY_SIZE
  comment typo fixes: charater => character
  fix comment typos concerning "challenge"
  arm: plat-spear: fix typo in kerneldoc
  reiserfs: typo comment fix
  update email address
  ...
2010-08-04 15:31:02 -07:00
Jiri Kosina
d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Benjamin Herrenschmidt
412a4ac5e9 Merge commit 'gcl/next' into next 2010-08-04 10:26:03 +10:00
Scott Wood
69e77a8b04 perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
Commit 6b95ed345b changed from
a struct initializer to perf_sample_data_init(), but the setting
of the .period member was left out.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-08-03 10:56:45 +10:00
Peter Zijlstra
09f86cd093 perf, powerpc: Convert the FSL driver to use local64_t
For some reason the FSL driver got left out when we converted perf
to use local64_t instead of atomic64_t.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-08-03 10:24:03 +10:00
Ingo Molnar
3772b73472 Merge commit 'v2.6.35' into perf/core
Conflicts:
	tools/perf/Makefile
	tools/perf/util/hist.c

Merge reason: Resolve the conflicts and update to latest upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-02 08:31:54 +02:00
Grant Likely
22ae782f86 of/address: Clean up function declarations
This patch moves the declaration of of_get_address(), of_get_pci_address(),
and of_pci_address_to_resource() out of arch code and into the common
linux/of_address header file.

This patch also fixes some of the asm/prom.h ordering issues.  It still
includes some header files that it ideally shouldn't be, but at least the
ordering is consistent now so that of_* overrides work.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 01:42:42 -06:00
Andreas Schwab
49f6be8ea1 KVM: PPC: elide struct thread_struct instances from stack
Instead of instantiating a whole thread_struct on the stack use only the
required parts of it.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:39:24 +03:00
Matt Evans
e8e5c2155b powerpc/kexec: Fix orphaned offline CPUs across kexec
When CPU hotplug is used, some CPUs may be offline at the time a kexec is
performed.  The subsequent kernel may expect these CPUs to be already running,
and will declare them stuck.  On pseries, there's also a soft-offline (cede)
state that CPUs may be in; this can also cause problems as the kexeced kernel
may ask RTAS if they're online -- and RTAS would say they are.  The CPU will
either appear stuck, or will cause a crash as we replace its cede loop beneath
it.

This patch kicks each present offline CPU awake before the kexec, so that
none are forever lost to these assumptions in the subsequent kernel.

Now, the behaviour is that all available CPUs that were offlined are now
online & usable after the kexec.  This mimics the behaviour of a full reboot
(on which all CPUs will be restarted).

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 15:05:22 +10:00
Matt Evans
e2f7f73717 powerpc/kexec: Add to and tidy debug/comments in machine_kexec64.c
Tidies some typos, KERN_INFO-ise an info msg, and add a debug msg showing
when the final sequence starts.

Also adds a comment to kexec_prepare_cpus_wait() to make note of a possible
problem; the need for kexec to deal with CPUs that failed to originally start
up.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 15:05:21 +10:00
Michael Neuling
2c48a7d615 powerpc: Print decimal values in prom_init.c
Currently we look pretty stupid when printing out a bunch of things in
prom_init.c.  eg.

  Max number of cores passed to firmware: 0x0000000000000080

So I've change this to print in decimal:

  Max number of cores passed to firmware: 128 (NR_CPUS = 256)

This required adding a prom_print_dec() function and changing some
prom_printk() calls from %x to %lu.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 15:05:20 +10:00
Tiejun Chen
d77cb21b57 powerpc/smp: remove the incorrect decrementer initial codes for AP
We already defined start_cpu_decrementer() to invoke decrementer for AP as
the following path:

start_secondary() -> secondary_cpu_time_init() -> start_cpu_decrementer()

So remove these incorrect codes introduced from commit:
e7f75ad0 powerpc/47x: Base ppc476 support

And actually we really should not enable decrementer before calling set_dec().

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:31 +10:00
Neil Horman
67238fb721 powerpc: Add vmcoreinfo symbols to allow makdumpfile to filter core files properly
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

 machine_kexec.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:31 +10:00
Matthew McClintock
bbc8e30f17 powerpc/crashdump: Fix issues with kexec and 36bit physical addr
Fix sizes of variables so correct values are exported via /proc.
Cast variable in comparison to avoid compiler error.

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:30 +10:00
Matt Evans
fc53b4202e powerpc/kexec: Switch to a static PACA on the way out
With dynamic PACAs, the kexecing CPU's PACA won't lie within the kernel
static data and there is a chance that something may stomp it when preparing
to kexec.  This patch switches this final CPU to a static PACA just before
we pull the switch.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:30 +10:00
Benjamin Herrenschmidt
7e3f36c3e1 Merge commit 'jwb/next' into next 2010-07-30 15:02:32 +10:00
Thomas Gleixner
47916be4e2 Merge branch 'powerpc.cherry-picks' into timers/clocksource
Conflicts:
	arch/powerpc/kernel/time.c

Reason: The powerpc next tree contains two commits which conflict with
the timekeeping changes:

8fd63a9e powerpc: Rework VDSO gettimeofday to prevent time going backwards
c1aa687d powerpc: Clean up obsolete code relating to decrementer and timebase

John Stultz identified them and provided the conflict resolution.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-28 21:49:22 +02:00
Paul Mackerras
d75d68cfef powerpc: Clean up obsolete code relating to decrementer and timebase
Since the decrementer and timekeeping code was moved over to using
the generic clockevents and timekeeping infrastructure, several
variables and functions have been obsolete and effectively unused.
This deletes them.

In particular, wakeup_decrementer() is no longer needed since the
generic code reprograms the decrementer as part of the process of
resuming the timekeeping code, which happens during sysdev resume.
Thus the wakeup_decrementer calls in the suspend_enter methods for
52xx platforms have been removed.  The call in the powermac cpu
frequency change code has been replaced by set_dec(1), which will
cause a timer interrupt as soon as interrupts are enabled, and the
generic code will then reprogram the decrementer with the correct
value.

This also simplifies the generic_suspend_en/disable_irqs functions
and makes them static since they are not referenced outside time.c.
The preempt_enable/disable calls are removed because the generic
code has disabled all but the boot cpu at the point where these
functions are called, so we can't be moved to another cpu.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-28 21:07:12 +02:00
Paul Mackerras
0e469db8f7 powerpc: Rework VDSO gettimeofday to prevent time going backwards
Currently it is possible for userspace to see the result of
gettimeofday() going backwards by 1 microsecond, assuming that
userspace is using the gettimeofday() in the VDSO.  The VDSO
gettimeofday() algorithm computes the time in "xsecs", which are
units of 2^-20 seconds, or approximately 0.954 microseconds,
using the algorithm

	now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec

and then converts the time in xsecs to seconds and microseconds.

The kernel updates the tb_orig_stamp and stamp_xsec values every
tick in update_vsyscall().  If the length of the tick is not an
integer number of xsecs, then some precision is lost in converting
the current time to xsecs.  For example, with CONFIG_HZ=1000, the
tick is 1ms long, which is 1048.576 xsecs.  That means that
stamp_xsec will advance by either 1048 or 1049 on each tick.
With the right conditions, it is possible for userspace to get
(timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is
slightly late in updating the vdso_datapage, and then for stamp_xsec
to advance by 1048 when the kernel does update it, and for userspace
to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to
integer truncation.  The result is that time appears to go backwards
by 1 microsecond.

To fix this we change the VDSO gettimeofday to use a new field in the
VDSO datapage which stores the nanoseconds part of the time as a
fractional number of seconds in a 0.32 binary fraction format.
(Or put another way, as a 32-bit number in units of 0.23283 ns.)
This is convenient because we can use the mulhwu instruction to
convert it to either microseconds or nanoseconds.

Since it turns out that computing the time of day using this new field
is simpler than either using stamp_xsec (as gettimeofday does) or
stamp_xtime.tv_nsec (as clock_gettime does), this converts both
gettimeofday and clock_gettime to use the new field.  The existing
__do_get_tspec function is converted to use the new field and take
a parameter in r7 that indicates the desired resolution, 1,000,000
for microseconds or 1,000,000,000 for nanoseconds.  The __do_get_xsec
function is then unused and is deleted.

The new algorithm is

	now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs
		+ (stamp_xtime_seconds << 32) + stamp_sec_fraction

with 'now' in units of 2^-32 seconds.  That is then converted to
seconds and either microseconds or nanoseconds with

	seconds = now >> 32
	partseconds = ((now & 0xffffffff) * resolution) >> 32

The 32-bit VDSO code also makes a further simplification: it ignores
the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary
fraction.  Doing so gets rid of 4 multiply instructions.  Assuming
a timebase frequency of 1GHz or less and an update interval of no
more than 10ms, the upper 32 bits of tb_to_xs will be at least
4503599, so the error from ignoring the low 32 bits will be at most
2.2ns, which is more than an order of magnitude less than the time
taken to do gettimeofday or clock_gettime on our fastest processors,
so there is no possibility of seeing inconsistent values due to this.

This also moves update_gtod() down next to its only caller, and makes
update_vsyscall use the time passed in via the wall_time argument rather
than accessing xtime directly.  At present, wall_time always points to
xtime, but that could change in future.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-28 21:06:47 +02:00
Peter Zijlstra
6b95ed345b perf, powerpc: Use perf_sample_data_init() for the FSL code
We should use perf_sample_data_init() to initialize struct
perf_sample_data.  As explained in the description of commit dc1d628a
("perf: Provide generic perf_sample_data initialization"), it is
possible for userspace to get the kernel to dereference data.raw,
so if it is not initialized, that means that unprivileged userspace
can possibly oops the kernel.  Using perf_sample_data_init makes sure
it gets initialized to NULL.

This conversion should have been included in commit dc1d628a, but it
got missed.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-07-27 22:20:09 +10:00
John Stultz
7615856ebf timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset
update_vsyscall() did not provide the wall_to_monotoinc offset,
so arch specific implementations tend to reference wall_to_monotonic
directly. This limits future cleanups in the timekeeping core, so
this patch fixes the update_vsyscall interface to provide
wall_to_monotonic, allowing wall_to_monotonic to be made static
as planned in Documentation/feature-removal-schedule.txt

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1279068988-21864-7-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-27 12:40:54 +02:00
John Stultz
06d518e3df powerpc: Cleanup xtime usage
This removes powerpc's direct xtime usage, allowing for further
generic timeekeping cleanups

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1279068988-21864-6-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-27 12:40:54 +02:00
John Stultz
b0797b60d0 powerpc: Simplify update_vsyscall
Currently powerpc's update_vsyscall calls an inline update_gtod.
However, both are straightforward, and there are no other users,
so this patch merges update_gtod into update_vsyscall.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1279068988-21864-5-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-27 12:40:54 +02:00
Lee Nipper
ff34910396 powerpc/40x: Distinguish AMCC PowerPC 405EX and 405EXr correctly
The recent AMCC 405EX Rev D without Security uses a PVR value
that matches the old 405EXr Rev A/B with Security.
The 405EX Rev D without Security would be shown
incorrectly as an 405EXr. The pvr_mask of 0xffff0004
is no longer sufficient to distinguish the 405EX from 405EXr.

This patch replaces 2 entries in the cpu_specs table
and adds 8 more, each using pvr_mask of 0xffff000f
and appropriate pvr_value to distinguish the AMCC
PowerPC 405EX and 405EXr instances.
The cpu_name for these entries now includes the
Rev, in similar fashion to the 440GX.

Signed-off-by: Lee Nipper <lee.nipper@gmail.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-07-26 09:07:24 -04:00
Jonas Bonn
c0dd394ca5 of: remove of_default_bus_ids
This list used was by only two platforms with all other platforms defining an
own list of valid bus id's to pass to of_platform_bus_probe.  This patch:

i)   copies the default list to the two platforms that depended on it (powerpc)
ii)  remove the usage of of_default_bus_ids in of_platform_bus_probe
iii) removes the definition of the list from all architectures that defined it

Passing a NULL 'matches' parameter to of_platform_bus_probe is still valid; the
function returns no error in that case as the NULL value is equivalent to an
empty list.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
[grant.likely@secretlab.ca: added __initdata annotations, warn on and return error on missing match table, and fix whitespace errors]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-07-24 09:58:22 -06:00
Jonas Bonn
c608558407 of: make of_find_device_by_node generic
There's no need for this function to be architecture specific and all four
architectures defining it had the same definition.  The function has been
moved to drivers/of/platform.c.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
[grant.likely@secretlab.ca: moved to drivers/of/platform.c, simplified code, and added kerneldoc comment]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24 09:58:22 -06:00
Grant Likely
a454dc5059 powerpc: remove references to of_device and to_of_device
of_device is just a #define alias to platform_device.  This patch
replaces all references to it with platform_device.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-07-24 09:58:21 -06:00
Grant Likely
1ab1d63a85 of/platform: remove all of_bus_type and of_platform_bus_type references
Both of_bus_type and of_platform_bus_type are just #define aliases
for the platform bus.  This patch removes all references to them and
switches to the of_register_platform_driver()/of_unregister_platform_driver()
API for registering.

Subsequent patches will convert each user of of_register_platform_driver()
into plain platform_drivers without the of_platform_driver shim.  At which
point the of_register_platform_driver()/of_unregister_platform_driver()
functions can be removed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24 09:57:52 -06:00
Grant Likely
eca3930163 of: Merge of_platform_bus_type with platform_bus_type
of_platform_bus was being used in the same manner as the platform_bus.
The only difference being that of_platform_bus devices are generated
from data in the device tree, and platform_bus devices are usually
statically allocated in platform code.  Having them separate causes
the problem of device drivers having to be registered twice if it
was possible for the same device to appear on either bus.

This patch removes of_platform_bus_type and registers all of_platform
bus devices and drivers on the platform bus instead.  A previous patch
made the of_device structure an alias for the platform_device structure,
and a shim is used to adapt of_platform_drivers to the platform bus.

After all of of_platform_bus drivers are converted to be normal platform
drivers, the shim code can be removed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24 09:57:51 -06:00
Grant Likely
4e4f62bf73 Merge commit 'v2.6.35-rc6' into devicetree/next
Conflicts:
	arch/sparc/kernel/prom_64.c
2010-07-24 09:49:13 -06:00
Benjamin Herrenschmidt
3fdfd99051 powerpc: Fix erroneous lmb->memblock conversions
Oooops... we missed these. We incorrectly converted strings
used when parsing the device-tree on pseries, thus breaking
access to drconf memory and hotplug memory.

While at it, also revert some variable names that represent
something the FW calls "lmb" and thus don't need to be converted
to "memblock".

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
2010-07-23 12:56:57 +10:00
Ingo Molnar
dca45ad8af Merge branch 'linus' into sched/core
Merge reason: Move from the -rc3 to the almost-rc6 base.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-21 21:45:08 +02:00
Ingo Molnar
9dcdbf7a33 Merge branch 'linus' into perf/core
Merge reason: Pick up the latest perf fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-21 21:43:06 +02:00
Pavel Machek
a2531293db update email address
pavel@suse.cz no longer works, replace it with working address.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-19 10:56:54 +02:00
Grant Likely
c5f5849bff of: Remove unused of_find_device_by_phandle()
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-18 22:39:36 -06:00
Linus Torvalds
6f7dd68b75 Merge branch 'lmb-to-memblock' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'lmb-to-memblock' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  lmb: rename to memblock
2010-07-14 17:27:44 -07:00
Yinghai Lu
95f72d1ed4 lmb: rename to memblock
via following scripts

      FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

      sed -i \
        -e 's/lmb/memblock/g' \
        -e 's/LMB/MEMBLOCK/g' \
        $FILES

      for N in $(find . -name lmb.[ch]); do
        M=$(echo $N | sed 's/lmb/memblock/g')
        mv $N $M
      done

and remove some wrong change like lmbench and dlmb etc.

also move memblock.c from lib/ to mm/

Suggested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-14 17:14:00 +10:00
Benjamin Herrenschmidt
ff82c319e6 powerpc/book3e: Fix single step when using HW page tables
We patch the TLB miss exception vectors to point to alternate
functions when using HW page table on BookE.

However, we were patching in a new branch in the first instruction
of the exception handler instead of the second one, thus overriding
the nop that is in the first instruction.

This cause problems when single stepping as we rely on that nop for
the single step to stop properly within the exception vector range
rather than on the target of the branch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-14 14:13:51 +10:00
Benjamin Herrenschmidt
34d97e07cc powerpc/book3e: Add generic 64-bit idle powersave support
We use a similar technique to ppc32: We set a thread local flag
to indicate that we are about to enter or have entered the stop
state, and have fixup code in the async interrupt entry code that
reacts to this flag to make us return to a different location
(sets NIP to LINK in our case).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
--
v2. Fix lockdep bug
    Re-mask interrupts when coming back from idle
2010-07-14 14:13:18 +10:00
Matthew McClintock
77154a2026 powerpc/fsl-booke: Fix address issue when using relocatable kernels
When booting a relocatable kernel it needs to jump to the correct
start address, which for BookE parts is usually unchanged
regardless of the physical memory offset.

Recent changes cause problems with how we calculate the start
address, it was always adding the RMO into the start address
which is incorrect. This patch only adds in the RMO offset
if we are in the kexec code path, as it needs the RMO to work
correctly.

Instead of adding the RMO offset in in the common code path, we
can just set r6 to the RMO offset in the kexec code path instead
of to zero, and finally perform the masking in the common code
path

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-07-11 11:04:08 -05:00
Michael Ellerman
850f22d568 powerpc/book3e: Resend doorbell exceptions to ourself
If we are soft disabled and receive a doorbell exception we don't process
it immediately. This means we need to check on the way out of irq restore
if there are any doorbell exceptions to process.

The problem is at that point we don't know what our regs are, and that
in turn makes xmon unhappy. To workaround the problem, instead of checking
for and processing doorbells, we check for any doorbells and if there were
any we send ourselves another.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:19 +10:00
David Gibson
0e37d25950 powerpc/book3e: Use set_irq_regs() in the msgsnd/msgrcv IPI path
include/asm-generic/irq_regs.h declares per-cpu irq_regs variables and
get_irq_regs() and set_irq_regs() helper functions to maintain them.
These can be used to access the proper pt_regs structure related to the
current interrupt entry (if any).

In the powerpc arch code, this is used to maintain irq regs on
decrementer and external interrupt exceptions.  However, for the
doorbell exceptions used by the msgsnd/msgrcv IPI mechanism of newer
BookE CPUs, the irq_regs are not kept up to date.

In particular this means that xmon will not work properly on SMP,
because the secondary xmon instances started by IPI will blow up when
they cannot retrieve the irq regs.

This patch fixes the problem by adding calls to maintain the irq regs
across doorbell exceptions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:18 +10:00
Benjamin Herrenschmidt
89c81797d4 powerpc/book3e: Hookup doorbells exceptions on 64-bit Book3E
Note that critical doorbells are an unimplemented stub just like
other critical or machine check handlers, since we haven't done
support for "levelled" exceptions yet.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:17 +10:00
Benjamin Herrenschmidt
e8775d4aa1 powerpc/book3e: Don't re-trigger decrementer on lazy irq restore
The decrementer on BookE acts as a level interrupt and doesn't
need to be re-triggered when going negative. It doesn't go
negative anyways (unless programmed to auto-reload with a
negative value) as it stops when reaching 0.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:09 +10:00
Benjamin Herrenschmidt
b9f1cd71db powerpc/book3e: More doorbell cleanups. Sample the PIR register
The doorbells use the content of the PIR register to match messages
from other CPUs. This may or may not be the same as our linux CPU
number, so using that as the "target" is no right.

Instead, we sample the PIR register at boot on every processor
and use that value subsequently when sending IPIs.

We also use a per-cpu message mask rather than a global array which
should limit cache line contention.

Note: We could use the CPU number in the device-tree instead of
the PIR register, as they are supposed to be equivalent. This
might prove useful if doorbells are to be used to kick CPUs out
of FW at boot time, thus before we can sample the PIR. This is
however not the case now and using the PIR just works.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 15:29:53 +10:00
Benjamin Herrenschmidt
e3145b387a powerpc/book3e: Move doorbell_exception from traps.c to dbell.c
... where it belongs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 15:25:18 +10:00
Benjamin Herrenschmidt
a2e198116f powerpc/book3e: Hack to get gdb moving along on Book3E 64-bit
Our handling of debug interrupts on Book3E 64-bit is not quite
the way it should be just yet. This is a workaround to let gdb
work at least for now. We ensure that when context switching,
we set the appropriate DBCR0 value for the new task. We also
make sure that we turn off MSR[DE] within the kernel, and set
it as part of the bits that get set when going back to userspace.

In the long run, we will probably set the userspace DBCR0 on the
exception exit code path and ensure we have some proper kernel
value to set on the way into the kernel, a bit like ppc32 does,
but that will take more work.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 15:24:47 +10:00
Martyn Welch
540c6c392f powerpc: Add i8042 keyboard and mouse irq parsing
Currently the irqs for the i8042, which historically provides keyboard and
mouse (aux) support, is hardwired in the driver rather than parsing the
dts.  This patch modifies the powerpc legacy IO code to attempt to parse
the device tree for this information, failing back to the hardcoded values
if it fails.

Signed-off-by: Martyn Welch <martyn.welch@ge.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:28:33 +10:00
Anton Blanchard
ae01f84b93 powerpc: Optimise per cpu accesses on 64bit
Now we dynamically allocate the paca array, it takes an extra load
whenever we want to access another cpu's paca. One place we do that a lot
is per cpu variables. A simple example:

DEFINE_PER_CPU(unsigned long, vara);
unsigned long test4(int cpu)
{
	return per_cpu(vara, cpu);
}

This takes 4 loads, 5 if you include the actual load of the per cpu variable:

    ld r11,-32760(r30)  # load address of paca pointer
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,9       # get offset into paca (each entry is 512 bytes)
    ld r0,0(r11)        # load paca pointer
    add r3,r0,r3        # paca + offset
    ld r11,64(r3)       # load paca[cpu].data_offset

    ldx r3,r9,r11       # load per cpu variable

If we remove the ppc64 specific per_cpu_offset(), we get the generic one
which indexes into a statically allocated array. This removes one load and
one add:

    ld r11,-32760(r30)  # load address of __per_cpu_offset
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,3       # get offset into __per_cpu_offset (each entry 8 bytes)
    ldx r11,r11,r3      # load __per_cpu_offset[cpu]

    ldx r3,r9,r11       # load per cpu variable

Having all the offsets in one array also helps when iterating over a per cpu
variable across a number of cpus, such as in the scheduler. Before we would
need to load one paca cacheline when calculating each per cpu offset. Now we
have 16 (128 / sizeof(long)) per cpu offsets in each cacheline.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:28:30 +10:00
Brian King
8fe93f8d85 powerpc/pseries: Migration code reorganization / hibernation prep
Partition hibernation will use some of the same code as is
currently used for Live Partition Migration. This function
further abstracts this code such that code outside of rtas.c
can utilize it. It also changes the error field in the suspend
me data structure to be an atomic type, since it is set and
checked on different cpus without any barriers or locking.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:26:17 +10:00
Paul Mackerras
c1aa687d49 powerpc: Clean up obsolete code relating to decrementer and timebase
Since the decrementer and timekeeping code was moved over to using
the generic clockevents and timekeeping infrastructure, several
variables and functions have been obsolete and effectively unused.
This deletes them.

In particular, wakeup_decrementer() is no longer needed since the
generic code reprograms the decrementer as part of the process of
resuming the timekeeping code, which happens during sysdev resume.
Thus the wakeup_decrementer calls in the suspend_enter methods for
52xx platforms have been removed.  The call in the powermac cpu
frequency change code has been replaced by set_dec(1), which will
cause a timer interrupt as soon as interrupts are enabled, and the
generic code will then reprogram the decrementer with the correct
value.

This also simplifies the generic_suspend_en/disable_irqs functions
and makes them static since they are not referenced outside time.c.
The preempt_enable/disable calls are removed because the generic
code has disabled all but the boot cpu at the point where these
functions are called, so we can't be moved to another cpu.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:26:16 +10:00
Paul Mackerras
8fd63a9ea7 powerpc: Rework VDSO gettimeofday to prevent time going backwards
Currently it is possible for userspace to see the result of
gettimeofday() going backwards by 1 microsecond, assuming that
userspace is using the gettimeofday() in the VDSO.  The VDSO
gettimeofday() algorithm computes the time in "xsecs", which are
units of 2^-20 seconds, or approximately 0.954 microseconds,
using the algorithm

	now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec

and then converts the time in xsecs to seconds and microseconds.

The kernel updates the tb_orig_stamp and stamp_xsec values every
tick in update_vsyscall().  If the length of the tick is not an
integer number of xsecs, then some precision is lost in converting
the current time to xsecs.  For example, with CONFIG_HZ=1000, the
tick is 1ms long, which is 1048.576 xsecs.  That means that
stamp_xsec will advance by either 1048 or 1049 on each tick.
With the right conditions, it is possible for userspace to get
(timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is
slightly late in updating the vdso_datapage, and then for stamp_xsec
to advance by 1048 when the kernel does update it, and for userspace
to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to
integer truncation.  The result is that time appears to go backwards
by 1 microsecond.

To fix this we change the VDSO gettimeofday to use a new field in the
VDSO datapage which stores the nanoseconds part of the time as a
fractional number of seconds in a 0.32 binary fraction format.
(Or put another way, as a 32-bit number in units of 0.23283 ns.)
This is convenient because we can use the mulhwu instruction to
convert it to either microseconds or nanoseconds.

Since it turns out that computing the time of day using this new field
is simpler than either using stamp_xsec (as gettimeofday does) or
stamp_xtime.tv_nsec (as clock_gettime does), this converts both
gettimeofday and clock_gettime to use the new field.  The existing
__do_get_tspec function is converted to use the new field and take
a parameter in r7 that indicates the desired resolution, 1,000,000
for microseconds or 1,000,000,000 for nanoseconds.  The __do_get_xsec
function is then unused and is deleted.

The new algorithm is

	now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs
		+ (stamp_xtime_seconds << 32) + stamp_sec_fraction

with 'now' in units of 2^-32 seconds.  That is then converted to
seconds and either microseconds or nanoseconds with

	seconds = now >> 32
	partseconds = ((now & 0xffffffff) * resolution) >> 32

The 32-bit VDSO code also makes a further simplification: it ignores
the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary
fraction.  Doing so gets rid of 4 multiply instructions.  Assuming
a timebase frequency of 1GHz or less and an update interval of no
more than 10ms, the upper 32 bits of tb_to_xs will be at least
4503599, so the error from ignoring the low 32 bits will be at most
2.2ns, which is more than an order of magnitude less than the time
taken to do gettimeofday or clock_gettime on our fastest processors,
so there is no possibility of seeing inconsistent values due to this.

This also moves update_gtod() down next to its only caller, and makes
update_vsyscall use the time passed in via the wall_time argument rather
than accessing xtime directly.  At present, wall_time always points to
xtime, but that could change in future.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:26:16 +10:00
Benjamin Herrenschmidt
5f07aa7524 Merge commit 'paulus-perf/master' into next 2010-07-09 11:25:48 +10:00
Paul E. McKenney
c2be05481f powerpc: Fix default_machine_crash_shutdown #ifdef botch
crash_kexec_wait_realmode() is defined only if CONFIG_PPC_STD_MMU_64
and CONFIG_SMP, but is called if CONFIG_PPC_STD_MMU_64 even if !CONFIG_SMP.
Fix the conditional compilation around the invocation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:45 +10:00
Johannes Berg
3cd8519248 powerpc: Fix logic error in fixup_irqs
When SPARSE_IRQ is set, irq_to_desc() can
return NULL. While the code here has a
check for NULL, it's not really correct.
Fix it by separating the check for it.

This fixes CPU hot unplug for me.

Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Cc: stable@kernel.org [2.6.32+]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:44 +10:00
Anton Blanchard
33ad5e4b6c powerpc: Linux cannot run with 0 cores
If we configure with CONFIG_SMP=n or set NR_CPUS less than the number of
SMT threads we will set the max cores property to 0 in the
ibm,client-architecture-support structure. On new versions of firmware that
understand this property it obliges and terminates our partition.

Use DIV_ROUND_UP so we handle not only the CONFIG_SMP=n case but also the
case where NR_CPUS isn't a multiple of the number of SMT threads.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:42 +10:00
Stephen Rothwell
5afd878a95 powerpc: Fix compile errors in prom_init_check for gcc 4.5
Just whitelist these extra compiler generated symbols.
Fixes these errors:

Error: External symbol '_restgpr0_14' referenced from prom_init.c
Error: External symbol '_restgpr0_20' referenced from prom_init.c
Error: External symbol '_restgpr0_22' referenced from prom_init.c
Error: External symbol '_restgpr0_24' referenced from prom_init.c
Error: External symbol '_restgpr0_25' referenced from prom_init.c
Error: External symbol '_restgpr0_26' referenced from prom_init.c
Error: External symbol '_restgpr0_27' referenced from prom_init.c
Error: External symbol '_restgpr0_28' referenced from prom_init.c
Error: External symbol '_restgpr0_29' referenced from prom_init.c
Error: External symbol '_restgpr0_31' referenced from prom_init.c
Error: External symbol '_savegpr0_14' referenced from prom_init.c
Error: External symbol '_savegpr0_20' referenced from prom_init.c
Error: External symbol '_savegpr0_22' referenced from prom_init.c
Error: External symbol '_savegpr0_24' referenced from prom_init.c
Error: External symbol '_savegpr0_25' referenced from prom_init.c
Error: External symbol '_savegpr0_26' referenced from prom_init.c
Error: External symbol '_savegpr0_27' referenced from prom_init.c
Error: External symbol '_savegpr0_28' referenced from prom_init.c
Error: External symbol '_savegpr0_29' referenced from prom_init.c
Error: External symbol '_savegpr0_31' referenced from prom_init.c

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:39 +10:00
Matt Evans
219a92a4c4 powerpc/perf_event: Fix for power_pmu_disable()
When power_pmu_disable() removes the given event from a particular index into
cpuhw->event[], it shuffles down higher event[] entries.  But, this array is
paired with cpuhw->events[] and cpuhw->flags[] so should shuffle them
similarly.

If these arrays get out of sync, code such as power_check_constraints() will
fail.  This caused a bug where events were temporarily disabled and then failed
to be re-enabled; subsequent code tried to write_pmc() with its (disabled) idx
of 0, causing a message "oops trying to write PMC0".  This triggers this bug on
POWER7, running a miss-heavy test:

  perf record -e L1-dcache-load-misses -e L1-dcache-store-misses ./misstest

Signed-off-by: Matt Evans <matt@ozlabs.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:37 +10:00
Grant Likely
94c0931983 of: Merge of_device_alloc() and of_device_make_bus_id()
This patch merges the common routines of_device_alloc() and
of_device_make_bus_id() from powerpc and microblaze.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Grant Likely <grant.likely@secretlab.ca>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linuxppc-dev@ozlabs.org
CC: devicetree-discuss@lists.ozlabs.org
2010-07-05 16:14:29 -06:00
Grant Likely
5fd200f3b3 of/device: Merge of_platform_bus_probe()
Merge common code between PowerPC and microblaze.  This patch merges
the code that scans the tree and registers devices.  The functions
merged are of_platform_bus_probe(), of_platform_bus_create(), and
of_platform_device_create().

This patch also move the of_default_bus_ids[] table out of a Microblaze
header file and makes it non-static.  The device ids table isn't merged
because powerpc and microblaze use different default data.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Grant Likely <grant.likely@secretlab.ca>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linuxppc-dev@ozlabs.org
2010-07-05 16:14:28 -06:00
Grant Likely
dd27dcda37 of/device: merge of_device_uevent
Merge common code between powerpc and microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linuxppc-dev@ozlabs.org
2010-07-05 16:14:28 -06:00
Grant Likely
dbbdee9473 of/address: Merge all of the bus translation code
Microblaze and PowerPC share a large chunk of code for translating
OF device tree data into usable addresses.  Differences between the two
consist of cosmetic differences, and the addition of dma-ranges support
code to powerpc but not microblaze.  This patch moves the powerpc
version into common code and applies many of the cosmetic (non-functional)
changes from the microblaze version.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Michal Simek <monstr@monstr.eu>
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:26 -06:00
Grant Likely
1f5bef30cf of/address: merge of_address_to_resource()
Merge common code between PowerPC and Microblaze.  This patch also
moves the prototype of pci_address_to_pio() out of pci-bridge.h and
into prom.h because the only user of pci_address_to_pio() is
of_address_to_resource().

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Michal Simek <monstr@monstr.eu>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:26 -06:00
Grant Likely
6b884a8d50 of/address: merge of_iomap()
Merge common code between Microblaze and PowerPC.  This patch creates
new of_address.h and address.c files to containing address translation
and mapping routines.  First routine to be moved it of_iomap()

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Michal Simek <monstr@monstr.eu>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:26 -06:00
Grant Likely
7dc2e1134a of/irq: merge irq mapping code
Merge common irq mapping code between PowerPC and Microblaze.

This patch merges of_irq_find_parent(), of_irq_map_raw() and
of_irq_map_one().  The functions are dependent on one another, so all
three are merged in a single patch.  Other than cosmetic difference
(ie. DBG() vs. pr_debug()), the implementations are identical.

of_irq_to_resource() is also merged, but in this case the
implementations are different.  This patch drops the microblaze version
and uses the powerpc implementation unchanged.  The microblaze version
essentially open-coded irq_of_parse_and_map() which it does not need
to do.  Therefore the powerpc version is safe to adopt.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:25 -06:00
Grant Likely
b83da291b4 of/powerpc: Move Powermac irq quirk code into powermac pic driver code
The code that figures out what is wrong with the powermac irq device
tree data belongs with the rest of the powermac irq code.  This patch
moves it out of prom_parse.c and into powermac/pic.c so that it is only
compiled in when actually needed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:25 -06:00
Paul Mackerras
d09ec73871 powerpc, hw_breakpoint: Tell generic code we have no instruction breakpoints
At present, hw_breakpoint_slots() returns 1 regardless of what
type of breakpoint is specified in the type argument.  Since we
don't define CONFIG_HAVE_MIXED_BREAKPOINTS_REGS, there are
separate values for TYPE_INST and TYPE_DATA, and hw_breakpoint_slots()
returns 1 for both, effectively advertising instruction breakpoint
support which doesn't exist.

This fixes it by making hw_breakpoint_slots return 1 for TYPE_DATA
and 0 for TYPE_INST.  This moves hw_breakpoint_slots() from the
powerpc hw_breakpoint.h to hw_breakpoint.c because the definitions
of TYPE_INST and TYPE_DATA aren't available in <asm/hw_breakpoint.h>.
They are defined in <linux/hw_breakpoint.h> but we can't include
that header in <asm/hw_breakpoint.h>, and nor can we rely on
<linux/hw_breakpoint.h> being included before <asm/hw_breakpoint.h>.
Since hw_breakpoint_slots() is only called at boot time, there is
no performance impact from making it a real function rather than
a static inline.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-30 13:54:58 +10:00
Michael Neuling
2ec57d448b sched: Fix spelling of sibling
No logic changes, only spelling.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: linuxppc-dev@ozlabs.org
Cc: David Howells <dhowells@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <15249.1277776921@neuling.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-29 10:44:29 +02:00
Thomas Gleixner
f384c954c9 Merge branch 'linus' into perf/core
Reason: Further changes conflict with upstream fixes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-06-28 22:33:24 +02:00
Grant Likely
e387344499 of/irq: Move irq_of_parse_and_map() to common code
Merge common code between PowerPC and Microblaze.  SPARC implements
irq_of_parse_and_map(), but the implementation is different, so it
does not use this code.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Jeremy Kerr <jeremy.kerr@canonical.com>
2010-06-28 12:41:33 -07:00
Paul Mackerras
76b0f13376 powerpc, hw_breakpoint: Cooperate better with other single-steppers
The code we had to clear the MSR_SE bit was not doing anything because
the caller (ultimately single_step_exception() in traps.c) had already
cleared.  Instead of trying to leave MSR_SE set if the TIF_SINGLESTEP
flag is set (which indicates that the process is being single-stepped
by ptrace), we instead return NOTIFY_DONE in that case, which means
the caller will generate a SIGTRAP for the process.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-23 15:46:55 +10:00
Paul Mackerras
574cb24899 powerpc, hw_breakpoint: Fix off-by-one in checking access address
The code would accept an access to an address one byte past the end
of the requested range as legitimate, due to having a "<=" rather than
a "<".  This fixes that and cleans up the code a bit.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-23 15:42:43 +10:00
K.Prasad
e3e94084ad powerpc, hw_breakpoint: Discard extraneous interrupt due to accesses outside symbol length
Many a times, the requested breakpoint length can be less than the
fixed breakpoint length i.e. 8 bytes supported by PowerPC 64-bit
server (Book III S) processors.  This could lead to extraneous
interrupts resulting in false breakpoint notifications.  This
detects and discards such interrupts for non-ptrace requests.
We don't change ptrace behaviour to avoid breaking compatability.

[Suggestion from Paul Mackerras <paulus@samba.org> to add a new flag in
'struct arch_hw_breakpoint' to identify extraneous interrupts]

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-22 19:40:51 +10:00
K.Prasad
06532a6743 powerpc, hw_breakpoint: Enable hw-breakpoints while handling intervening signals
A signal delivered between a hw_breakpoint_handler() and the
single_step_dabr_instruction() will not have the breakpoint active
while the signal handler is running -- the signal delivery will
set up a new MSR value which will not have MSR_SE set, so we
won't get the signal step interrupt until and unless the signal
handler returns (which it may never do).

To fix this, we restore the breakpoint when delivering a signal --
we clear the MSR_SE bit and set the DABR again.  If the signal
handler returns, the DABR interrupt will occur again when the
instruction that we were originally trying to single-step gets
re-executed.

[Paul Mackerras <paulus@samba.org> pointed out the need to do this.]

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-22 19:40:50 +10:00
K.Prasad
2538c2d08f powerpc, hw_breakpoint: Handle concurrent alignment interrupts
If an alignment interrupt occurs on an instruction that is being
single-stepped, the alignment interrupt handler currently handles
the single-step condition by unconditionally sending a SIGTRAP to
the process.  Other synchronous interrupts that result in the
instruction being emulated do likewise.

With hw_breakpoint support, the hw_breakpoint code needs to be able
to intercept these single-step events as well as those where the
instruction executes normally and a trace interrupt happens.

Fix this by making emulate_single_step() use the existing
single_step_exception() function instead of calling _exception()
directly.  We then make single_step_exception() use the abstracted
clear_single_step() rather than clearing bits in the MSR image
directly so that emulate_single_step() will continue to work
correctly on Book 3E processors.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-22 19:40:50 +10:00
K.Prasad
5aae8a5370 powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors
Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras <paulus@samba.org> to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-22 19:40:50 +10:00
Ingo Molnar
646b1db495 Merge commit 'v2.6.35-rc3' into perf/core
Merge reason: Go from -rc1 base to -rc3 base, merge in fixes.
2010-06-18 10:53:19 +02:00
Ingo Molnar
4cb6948e53 Merge commit 'v2.6.35-rc3' into sched/core
Merge reason: Update to the latest -rc.
2010-06-18 10:46:35 +02:00
Milton Miller
bd2b64a12b powerpc: rtas_flash needs to use rtas_data_buf
When trying to flash a machine via the update_flash command, Anton received the
following error:

    Restarting system.
    FLASH: kernel bug...flash list header addr above 4GB

The code in question has a comment that the flash list should be in
the kernel data and therefore under 4GB:

        /* NOTE: the "first" block list is a global var with no data
         * blocks in the kernel data segment.  We do this because
         * we want to ensure this block_list addr is under 4GB.
         */

Unfortunately the Kconfig option is marked tristate which means the variable
may not be in the kernel data and could be above 4GB.

Instead of relying on the data segment being below 4GB, use the static
data buffer allocated by the kernel for use by rtas.  Since we don't
use the header struct directly anymore, convert it to a simple pointer.

Reported-By: Anton Blanchard <anton@samba.org>
Signed-Off-By: Milton Miller <miltonm@bga.com
Tested-By: Anton Blanchard <anton@samba.org>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-15 15:02:37 +10:00
Christoph Hellwig
f1ba9a5b2a powerpc: Unconditionally enabled irq stacks
Irq stacks provide an essential protection from stack overflows through
external interrupts, at the cost of two additionals stacks per CPU.

Enable them unconditionally to simplify the kernel build and prevent
people from accidentally disabling them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-15 15:02:37 +10:00
Matt Evans
b636f1379e powerpc/kexec: Wait for online/possible CPUs only.
kexec_perpare_cpus_wait() iterates i through NR_CPUS to check
paca[i].kexec_state of each to make sure they have quiesced.
However now we have dynamic PACA allocation, paca[NR_CPUS] is not necessarily
valid and we overrun the array;  spurious "cpu is not possible, ignoring"
errors result.  This patch iterates for_each_online_cpu so stays
within the bounds of paca[] -- and every CPU is now 'possible'.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-15 15:02:33 +10:00
Linus Torvalds
eda054770e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: clear bridge resource range if BIOS assigned bad one
  PCI: hotplug/cpqphp, fix NULL dereference
  Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/"
  PCI: change resource collision messages from KERN_ERR to KERN_INFO
2010-06-11 14:15:44 -07:00
Yinghai Lu
837c4ef13c PCI: clear bridge resource range if BIOS assigned bad one
Yannick found that video does not work with 2.6.34.  The cause of this
bug was that the BIOS had assigned the wrong range to the PCI bridge
above the video device.  Before 2.6.34 the kernel would have shrunk
the size of the bridge window, but since
  d65245c PCI: don't shrink bridge resources
the kernel will avoid shrinking BIOS ranges.

So zero out the old range if we fail to claim it at boot time; this will
cause us to allocate a new range at startup, restoring the 2.6.34
behavior.

Fixes regression https://bugzilla.kernel.org/show_bug.cgi?id=16009.

Reported-by: Yannick <yannick.roehlly@free.fr>
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-06-11 13:24:51 -07:00
Ingo Molnar
c726b61c6a Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2010-06-09 18:55:57 +02:00
Peter Zijlstra
89275d59b5 powerpc: Exclude arch_sd_sibiling_asym_packing() on UP
Only SMP systems care about load-balance features, plus this
saves some .text space on UP and also fixes the build.

Reported-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Neuling <mikey@neuling.org>
LKML-Reference: <tip-76cbd8a8f8b0dddbff89a6708bd5bd13c0d21a00@git.kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-09 16:31:39 +02:00
Michael Neuling
76cbd8a8f8 powerpc: Enable asymmetric SMT scheduling on POWER7
The POWER7 core has dynamic SMT mode switching which is controlled by
the hypervisor.  There are 3 SMT modes:
	SMT1 uses thread  0
	SMT2 uses threads 0 & 1
	SMT4 uses threads 0, 1, 2 & 3
When in any particular SMT mode, all threads have the same performance
as each other (ie. at any moment in time, all threads perform the same).

The SMT mode switching works such that when linux has threads 2 & 3 idle
and 0 & 1 active, it will cede (H_CEDE hypercall) threads 2 and 3 in the
idle loop and the hypervisor will automatically switch to SMT2 for that
core (independent of other cores).  The opposite is not true, so if
threads 0 & 1 are idle and 2 & 3 are active, we will stay in SMT4 mode.

Similarly if thread 0 is active and threads 1, 2 & 3 are idle, we'll go
into SMT1 mode.

If we can get the core into a lower SMT mode (SMT1 is best), the threads
will perform better (since they share less core resources).  Hence when
we have idle threads, we want them to be the higher ones.

This adds a feature bit for asymmetric packing to powerpc and then
enables it on POWER7.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
LKML-Reference: <20100608045702.31FB5CC8C7@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-09 11:13:14 +02:00
Peter Zijlstra
e78505958c perf: Convert perf_event to local_t
Since now all modification to event->count (and ->prev_count
and ->period_left) are local to a cpu, change then to local64_t so we
avoid the LOCK'ed ops.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-09 11:12:37 +02:00
Peter Zijlstra
8d2cacbbb8 perf: Cleanup {start,commit,cancel}_txn details
Clarify some of the transactional group scheduling API details
and change it so that a successfull ->commit_txn also closes
the transaction.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1274803086.5882.1752.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-09 11:12:34 +02:00
Frederic Weisbecker
b0f82b81fe perf: Drop the skip argument from perf_arch_fetch_regs_caller
Drop this argument now that we always want to rewind only to the
state of the first caller.
It means frame pointers are not necessary anymore to reliably get
the source of an event. But this also means we need this helper
to be a macro now, as an inline function is not an option since
we need to know when to provide a default implentation.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: David Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-08 23:31:27 +02:00
Ananth N Mavinakayanahalli
db97bc7f99 powerpc/kprobes: Remove resume_execution() in kprobes
emulate_step() in kprobe_handler() would've already determined if the
probed instruction can be emulated. We single-step in hardware only if
the instruction couldn't be emulated. resume_execution() therefore is
superfluous -- all we need is to fix up the instruction pointer after
single-stepping.

Thanks to Paul Mackerras for catching this.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-06-02 17:50:37 +10:00
Linus Torvalds
aef4b9aaae Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Don't export cvt_fd & _df when CONFIG_PPC_FPU is not set
  powerpc/44x: icon: select SM502 and frame buffer console support
  powerpc/85xx: Add P1021MDS board support
  powerpc/85xx: Change MPC8572DS camp dtses for MSI sharing
  powerpc/fsl_msi: add removal path and probe failing path
  powerpc/fsl_msi: enable msi sharing through AMP OSes
  powerpc/fsl_msi: enable msi allocation in all banks
  powerpc/fsl_msi: fix the conflict of virt_msir's chip_data
  powerpc/fsl_msi: Add multiple MSI bank support
  powerpc/kexec: Add support for FSL-BookE
  powerpc/fsl-booke: Move the entry setup code into a seperate file
  powerpc/fsl-booke: fix the case where we are not in the first page
  powerpc/85xx: Enable support for ports 3 and 4 on 8548 CDS
  powerpc/fsl-booke: Add hibernation support for FSL BookE processors
  powerpc/e500mc: Implement machine check handler.
  powerpc/44x: Add basic ICON PPC440SPe board support
  powerpc/44x: Fix UART clocks on 440SPe
  powerpc/44x: Add reset-type to katmai.dts
  powerpc/44x: Adding PCI-E support for PowerPC 460SX based SOC.
2010-06-01 14:13:14 -07:00
Linus Torvalds
1f73897861 Merge branch 'for-35' of git://repo.or.cz/linux-kbuild
* 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits)
  kbuild: Revert part of e8d400a to resolve a conflict
  kbuild: Fix checking of scm-identifier variable
  gconfig: add support to show hidden options that have prompts
  menuconfig: add support to show hidden options which have prompts
  gconfig: remove show_debug option
  gconfig: remove dbg_print_ptype() and dbg_print_stype()
  kconfig: fix zconfdump()
  kconfig: some small fixes
  add random binaries to .gitignore
  kbuild: Include gen_initramfs_list.sh and the file list in the .d file
  kconfig: recalc symbol value before showing search results
  .gitignore: ignore *.lzo files
  headerdep: perlcritic warning
  scripts/Makefile.lib: Align the output of LZO
  kbuild: Generate modules.builtin in make modules_install
  Revert "kbuild: specify absolute paths for cscope"
  kbuild: Do not unnecessarily regenerate modules.builtin
  headers_install: use local file handles
  headers_check: fix perl warnings
  export_report: fix perl warnings
  ...
2010-06-01 08:55:52 -07:00
Benjamin Herrenschmidt
a7fed9f736 powerpc: Don't export cvt_fd & _df when CONFIG_PPC_FPU is not set
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-31 11:51:54 +10:00
Benjamin Herrenschmidt
ecca1a34be Merge commit 'kumar/next' into next
Conflicts:
	arch/powerpc/sysdev/fsl_msi.c
2010-05-31 10:01:50 +10:00
FUJITA Tomonori
712d3e22a8 powerpc: remove unnecessary sync_single_range_* in swiotlb_dma_ops
sync_single_range_for_cpu and sync_single_range_for_device hooks in
swiotlb_dma_ops are unnecessary because sync_single_for_cpu and
sync_single_for_device are used there.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-27 09:12:52 -07:00
Sebastian Andrzej Siewior
b3df895aeb powerpc/kexec: Add support for FSL-BookE
This adds support kexec on FSL-BookE where the MMU can not be simply
switched off. The code borrows the initial MMU-setup code to create the
identical mapping mapping. The only difference to the original boot code
is the size of the mapping(s) and the executeable address.
The kexec code maps the first 2 GiB of memory in 256 MiB steps. This
should work also on e500v1 boxes.
SMP support is still not available.

(Kumar: Added minor change to build to ifdef CONFIG_PPC_STD_MMU_64 some
code that was PPC64 specific)

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-24 21:25:32 -05:00
Sebastian Andrzej Siewior
7c08ce718f powerpc/fsl-booke: Move the entry setup code into a seperate file
This patch only moves the initial entry code which setups the mapping
from what ever to KERNELBASE into a seperate file. No code change has
been made here.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-24 14:01:12 -05:00
Sebastian Andrzej Siewior
2289d2d1a8 powerpc/fsl-booke: fix the case where we are not in the first page
During boot we change the mapping a few times until we have a "defined"
mapping. During this procedure a small 4KiB mapping is created and after
that one a 64MiB. Currently the offset of the 4KiB page in that we run
is zero because the complete startup up code is in first page which
starts at RPN zero.
If the code is recycled and moved to another location then its execution
will fail because the start address in 64 MiB mapping is computed
wrongly. It does not consider the offset to the page from the begin of
the memory.
This patch fixes this. Usually (system boot) r25 is zero so this does
not change anything unless the code is recycled.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-24 14:00:53 -05:00
Grant Likely
cf9b59e9d3 Merge remote branch 'origin' into secretlab/next-devicetree
Merging in current state of Linus' tree to deal with merge conflicts and
build failures in vio.c after merge.

Conflicts:
	drivers/i2c/busses/i2c-cpm.c
	drivers/i2c/busses/i2c-mpc.c
	drivers/net/gianfar.c

Also fixed up one line in arch/powerpc/kernel/vio.c to use the
correct node pointer.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22 00:36:56 -06:00
Grant Likely
4018294b53 of: Remove duplicate fields from of_platform_driver
.name, .match_table and .owner are duplicated in both of_platform_driver
and device_driver.  This patch is a removes the extra copies from struct
of_platform_driver and converts all users to the device_driver members.

This patch is a pretty mechanical change.  The usage model doesn't change
and if any drivers have been missed, or if anything has been fixed up
incorrectly, then it will fail with a compile time error, and the fixup
will be trivial.  This patch looks big and scary because it touches so
many files, but it should be pretty safe.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Sean MacLennan <smaclennan@pikatech.com>
2010-05-22 00:10:40 -06:00
Grant Likely
597b9d1e44 drivercore: Add of_match_table to the common device drivers
OF-style matching can be available to any device, on any type of bus.
This patch allows any driver to provide an OF match table when CONFIG_OF
is enabled so that drivers can be bound against devices described in
the device tree.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-22 00:10:40 -06:00
Grant Likely
cb6dc512b7 arch/powerpc: Move dma_mask from of_device into pdev_archdata
By moving dma_mask into pdev_archdata, and adding archdata to
struct of_device, it makes it possible to substitute of_device
with struct platform_device, which is a stepping stone to
removing the of_platform bus entirely.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-22 00:10:40 -06:00
Linus Torvalds
98edb6ca41 Merge branch 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.35' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (269 commits)
  KVM: x86: Add missing locking to arch specific vcpu ioctls
  KVM: PPC: Add missing vcpu_load()/vcpu_put() in vcpu ioctls
  KVM: MMU: Segregate shadow pages with different cr0.wp
  KVM: x86: Check LMA bit before set_efer
  KVM: Don't allow lmsw to clear cr0.pe
  KVM: Add cpuid.txt file
  KVM: x86: Tell the guest we'll warn it about tsc stability
  x86, paravirt: don't compute pvclock adjustments if we trust the tsc
  x86: KVM guest: Try using new kvm clock msrs
  KVM: x86: export paravirtual cpuid flags in KVM_GET_SUPPORTED_CPUID
  KVM: x86: add new KVMCLOCK cpuid feature
  KVM: x86: change msr numbers for kvmclock
  x86, paravirt: Add a global synchronization point for pvclock
  x86, paravirt: Enable pvclock flags in vcpu_time_info structure
  KVM: x86: Inject #GP with the right rip on efer writes
  KVM: SVM: Don't allow nested guest to VMMCALL into host
  KVM: x86: Fix exception reinjection forced to true
  KVM: Fix wallclock version writing race
  KVM: MMU: Don't read pdptrs with mmu spinlock held in mmu_alloc_roots
  KVM: VMX: enable VMXON check with SMX enabled (Intel TXT)
  ...
2010-05-21 17:16:21 -07:00
Linus Torvalds
79c4581262 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (92 commits)
  powerpc: Remove unused 'protect4gb' boot parameter
  powerpc: Build-in e1000e for pseries & ppc64_defconfig
  powerpc/pseries: Make request_ras_irqs() available to other pseries code
  powerpc/numa: Use ibm,architecture-vec-5 to detect form 1 affinity
  powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim
  powerpc: Use smt_snooze_delay=-1 to always busy loop
  powerpc: Remove check of ibm,smt-snooze-delay OF property
  powerpc/kdump: Fix race in kdump shutdown
  powerpc/kexec: Fix race in kexec shutdown
  powerpc/kexec: Speedup kexec hash PTE tear down
  powerpc/pseries: Add hcall to read 4 ptes at a time in real mode
  powerpc: Use more accurate limit for first segment memory allocations
  powerpc/kdump: Use chip->shutdown to disable IRQs
  powerpc/kdump: CPUs assume the context of the oopsing CPU
  powerpc/crashdump: Do not fail on NULL pointer dereferencing
  powerpc/eeh: Fix oops when probing in early boot
  powerpc/pci: Check devices status property when scanning OF tree
  powerpc/vio: Switch VIO Bus PM to use generic helpers
  powerpc: Avoid bad relocations in iSeries code
  powerpc: Use common cpu_die (fixes SMP+SUSPEND build)
  ...
2010-05-21 11:17:05 -07:00
Anton Vorontsov
90103f932f powerpc/fsl-booke: Add hibernation support for FSL BookE processors
This is started as swsusp_32.S modifications, but the amount of #ifdefs
made the whole file horribly unreadable, so let's put the support into
its own separate file.

The code should be relatively easy to modify to support 44x BookEs as
well, but since I don't have any 44x to test, let's confine the code to
FSL BookE. (The only FSL-specific part so far is 'flush_dcache_L1'.)

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-21 07:41:53 -05:00
Scott Wood
fe04b11215 powerpc/e500mc: Implement machine check handler.
Most of the MSCR bit assigments are different in e500mc versus
e500, and they are now write-one-to-clear.

Some e500mc machine check conditions are made recoverable (as long as
they aren't stuck on), most notably L1 instruction cache parity errors.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-21 07:41:52 -05:00
FUJITA Tomonori
99ec28f183 powerpc: Remove unused 'protect4gb' boot parameter
'protect4gb' boot parameter was introduced to avoid allocating dma
space acrossing 4GB boundary in 2007 (the commit
569975591c).

In 2008, the IOMMU was fixed to use the boundary_mask parameter per
device properly. So 'protect4gb' workaround was removed (the
383af9525b). But somehow I messed the
'protect4gb' boot parameter that was used to enable the
workaround.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:13 +10:00
Anton Blanchard
b878dc0059 powerpc: Use smt_snooze_delay=-1 to always busy loop
Right now if we want to busy loop and not give up any time to the hypervisor
we put a very large value into smt_snooze_delay. This is sometimes useful
when running a single partition and you want to avoid any latencies due
to the hypervisor or CPU power state transitions. While this works, it's a bit
ugly - how big a number is enough now we have NO_HZ and can be idle for a very
long time.

The patch below makes smt_snooze_delay signed, and a negative value means loop
forever:

echo -1 > /sys/devices/system/cpu/cpu0/smt_snooze_delay

This change shouldn't affect the existing userspace tools (eg ppc64_cpu), but
I'm cc-ing Nathan just to be sure.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:12 +10:00
Anton Blanchard
dd04c63c96 powerpc: Remove check of ibm,smt-snooze-delay OF property
I'm not sure why we have code for parsing an ibm,smt-snooze-delay OF
property. Since we have a smt-snooze-delay= boot option and we can
also set it at runtime via sysfs, it should be safe to get rid of
this code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:11 +10:00
Michael Neuling
60adec6226 powerpc/kdump: Fix race in kdump shutdown
When we are crashing, the crashing/primary CPU IPIs the secondaries to
turn off IRQs, go into real mode and wait in kexec_wait.  While this
is happening, the primary tears down all the MMU maps.  Unfortunately
the primary doesn't check to make sure the secondaries have entered
real mode before doing this.

On PHYP machines, the secondaries can take a long time shutting down
the IRQ controller as RTAS calls are need.  These RTAS calls need to
be serialised which resilts in the secondaries contending in
lock_rtas() and hence taking a long time to shut down.

We've hit this on large POWER7 machines, where some secondaries are
still waiting in lock_rtas(), when the primary tears down the HPTEs.

This patch makes sure all secondaries are in real mode before the
primary tears down the MMU.  It uses the new kexec_state entry in the
paca.  It times out if the secondaries don't reach real mode after
10sec.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:11 +10:00
Michael Neuling
1fc711f7ff powerpc/kexec: Fix race in kexec shutdown
In kexec_prepare_cpus, the primary CPU IPIs the secondary CPUs to
kexec_smp_down().  kexec_smp_down() calls kexec_smp_wait() which sets
the hw_cpu_id() to -1.  The primary does this while leaving IRQs on
which means the primary can take a timer interrupt which can lead to
the IPIing one of the secondary CPUs (say, for a scheduler re-balance)
but since the secondary CPU now has a hw_cpu_id = -1, we IPI CPU
-1... Kaboom!

We are hitting this case regularly on POWER7 machines.

There is also a second race, where the primary will tear down the MMU
mappings before knowing the secondaries have entered real mode.

Also, the secondaries are clearing out any pending IPIs before
guaranteeing that no more will be received.

This changes kexec_prepare_cpus() so that we turn off IRQs in the
primary CPU much earlier.  It adds a paca flag to say that the
secondaries have entered the kexec_smp_down() IPI and turned off IRQs,
rather than overloading hw_cpu_id with -1.  This new paca flag is
again used to in indicate when the secondaries has entered real mode.

It also ensures that all CPUs have their IRQs off before we clear out
any pending IPI requests (in kexec_cpu_down()) to ensure there are no
trailing IPIs left unacknowledged.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:11 +10:00
Anton Blanchard
095c7965f4 powerpc: Use more accurate limit for first segment memory allocations
Author: Milton Miller <miltonm@bga.com>

On large machines we are running out of room below 256MB. In some cases we
only need to ensure the allocation is in the first segment, which may be
256MB or 1TB.

Add slb0_limit and use it to specify the upper limit for the irqstack and
emergency stacks.

On a large ppc64 box, this fixes a panic at boot when the crashkernel=
option is specified (previously we would run out of memory below 256MB).

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:10 +10:00
Anton Blanchard
5d7a87217d powerpc/kdump: Use chip->shutdown to disable IRQs
I saw this in a kdump kernel:

IOMMU table initialized, virtual merging enabled
Interrupt 155954 (real) is invalid, disabling it.
Interrupt 155953 (real) is invalid, disabling it.

ie we took some spurious interrupts. default_machine_crash_shutdown tries
to disable all interrupt sources but uses chip->disable which maps to
the default action of:

static void default_disable(unsigned int irq)
{
}

If we use chip->shutdown, then we actually mask the IRQ:

static void default_shutdown(unsigned int irq)
{
        struct irq_desc *desc = irq_to_desc(irq);

        desc->chip->mask(irq);
        desc->status |= IRQ_MASKED;
}

Not sure why we don't implement a ->disable action for xics.c, or why
default_disable doesn't mask the interrupt.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:10 +10:00
Anton Blanchard
0644079410 powerpc/kdump: CPUs assume the context of the oopsing CPU
We wrap the crash_shutdown_handles[] calls with longjmp/setjmp, so if any
of them fault we can recover. The problem is we add a hook to the debugger
fault handler hook which calls longjmp unconditionally.

This first part of kdump is run before we marshall the other CPUs, so there
is a very good chance some CPU on the box is going to page fault. And when
it does it hits the longjmp code and assumes the context of the oopsing CPU.
The machine gets very confused when it has 10 CPUs all with the same stack,
all thinking they have the same CPU id. I get even more confused trying
to debug it.

The patch below adds crash_shutdown_cpu and uses it to specify which cpu is
in the protected region. Since it can only be -1 or the oopsing CPU, we don't
need to use memory barriers since it is only valid on the local CPU - no other
CPU will ever see a value that matches it's local CPU id.

Eventually we should switch the order and marshall all CPUs before doing the
crash_shutdown_handles[] calls, but that is a bigger fix.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:10 +10:00
Maxim Uvarov
426b6cb478 powerpc/crashdump: Do not fail on NULL pointer dereferencing
Signed-off-by: Maxim Uvarov <muvarov@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:09 +10:00
Sonny Rao
5b339bdf16 powerpc/pci: Check devices status property when scanning OF tree
We ran into an issue where it looks like we're not properly ignoring a
pci device with a non-good status property when we walk the device tree
and instanciate the Linux side PCI devices.

However, the EEH init code does look for the property and disables EEH
on these devices. This leaves us in an inconsistent where we are poking
at a supposedly bad piece of hardware and RTAS will block our config
cycles because EEH isn't enabled anyway.

Signed-of-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:09 +10:00
Brian King
a1263c7144 powerpc/vio: Switch VIO Bus PM to use generic helpers
Switch to use the generic power management helpers.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:09 +10:00
Milton Miller
abb17f9c3a powerpc: Use common cpu_die (fixes SMP+SUSPEND build)
Configuring a powerpc 32 bit kernel for both SMP and SUSPEND turns on
CPU_HOTPLUG to enable disable_nonboot_cpus to be called by the common
suspend code.  Previously the definition of cpu_die for ppc32 was in
the powermac platform code, causing it to be undefined if that platform
as not selected.

arch/powerpc/kernel/built-in.o: In function 'cpu_idle':
arch/powerpc/kernel/idle.c:98: undefined reference to 'cpu_die'

Move the code from setup_64 to smp.c and rename the power mac
versions to their specific names.

Note that this does not setup the cpu_die pointers in either
smp_ops (request a given cpu die) or ppc_md (make this cpu die),
for other platforms but there are generic versions in smp.c.

Reported-by: Matt Sealey <matt@genesi-usa.com>
Reported-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:31:08 +10:00
Michael Ellerman
7358650e9e powerpc/rtasd: Don't start event scan if scan rate is zero
There appear to be Pegasos systems which have the rtas-event-scan
RTAS tokens, but on which the event scan always fails. They also
have an event-scan-rate property containing 0, which means call
event scan 0 times per minute.

So interpret a scan rate of 0 to mean don't scan at all. This fixes
the problem on the Pegasos machines and makes sense as well.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-21 17:29:39 +10:00
Jason Wessel
ba797b2813 powerpc,kgdb: Introduce low level trap catching
The only way the debugger can handle a trap in inside rcu_lock,
notify_die, or atomic_notifier_call_chain without a recursive fault is
to allow the kernel debugger to handle the exception first in
program_check_exception().

The other change here is to make sure that kgdb_handle_exception() is
called with correct parameters when catching an oops, because kdb
needs to know if the entry was an oops, single step, or breakpoint
exception.

[benh@kernel.crashing.org: move debugger_bpt instead of #ifdef]

CC: Paul Mackerras <paulus@samba.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-20 21:04:25 -05:00
Jason Wessel
dcc7871128 kgdb: core changes to support kdb
These are the minimum changes to the kgdb core in order to enable an
API to connect a new front end (kdb) to the debug core.

This patch introduces the dbg_kdb_mode variable controls where the
user level I/O is routed.  It will be routed to the gdbstub (kgdb) or
to the kdb front end which is a simple shell available over the kgdboc
connection.

You can switch back and forth between kdb or the gdb stub mode of
operation dynamically.  From gdb stub mode you can blindly type
"$3#33", or from the kdb mode you can enter "kgdb" to switch to the
gdb stub.

The logic in the debug core depends on kdb to look for the typical gdb
connection sequences and return immediately with KGDB_PASS_EVENT if a
gdb serial command sequence is detected.  That should allow a
reasonably seamless transition between kdb -> gdb without leaving the
kernel exception state.  The two gdb serial queries that kdb is
responsible for detecting are the "?" and "qSupported" packets.

CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Martin Hicks <mort@sgi.com>
2010-05-20 21:04:21 -05:00
Grant Likely
58f9b0b024 of: eliminate of_device->node and dev_archdata->{of,prom}_node
This patch eliminates the node pointer from struct of_device and the
of_node (or prom_node) pointer from struct dev_archdata since the node
pointer is now part of struct device proper when CONFIG_OF is set, and
all users of the old pointer locations have already been converted over
to use device->of_node.

Also remove dev_archdata_{get,set}_node() as it is no longer used by
anything.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-18 16:10:45 -06:00
Grant Likely
61c7a080a5 of: Always use 'struct device.of_node' to get device node pointer.
The following structure elements duplicate the information in
'struct device.of_node' and so are being eliminated.  This patch
makes all readers of these elements use device.of_node instead.

(struct of_device *)->node
(struct dev_archdata *)->prom_node (sparc)
(struct dev_archdata *)->of_node (powerpc & microblaze)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-05-18 16:10:44 -06:00
Linus Torvalds
4d7b4ac22f Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits)
  perf tools: Add mode to build without newt support
  perf symbols: symbol inconsistency message should be done only at verbose=1
  perf tui: Add explicit -lslang option
  perf options: Type check all the remaining OPT_ variants
  perf options: Type check OPT_BOOLEAN and fix the offenders
  perf options: Check v type in OPT_U?INTEGER
  perf options: Introduce OPT_UINTEGER
  perf tui: Add workaround for slang < 2.1.4
  perf record: Fix bug mismatch with -c option definition
  perf options: Introduce OPT_U64
  perf tui: Add help window to show key associations
  perf tui: Make <- exit menus too
  perf newt: Add single key shortcuts for zoom into DSO and threads
  perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
  perf newt: Fix the 'A'/'a' shortcut for annotate
  perf newt: Make <- exit the ui_browser
  x86, perf: P4 PMU - fix counters management logic
  perf newt: Make <- zoom out filters
  perf report: Report number of events, not samples
  perf hist: Clarify events_stats fields usage
  ...

Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
2010-05-18 08:19:03 -07:00
Kumar Gala
78f622377f powerpc/fsl-booke: Move loadcam_entry back to asm code to fix SMP ftrace
When we build with ftrace enabled its possible that loadcam_entry would
have used the stack pointer (even though the code doesn't need it).  We
call loadcam_entry in __secondary_start before the stack is setup.  To
ensure that loadcam_entry doesn't use the stack pointer the easiest
solution is to just have it in asm code.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-17 10:56:20 -05:00
Li Yang
78e2e68a2b powerpc/fsl-booke: Fix InstructionTLBError execute permission check
In CONFIG_PTE_64BIT the PTE format has unique permission bits for user
and supervisor execute.  However on !CONFIG_PTE_64BIT we overload the
supervisor bit to imply user execute with _PAGE_USER set.  This allows
us to use the same permission check mask for user or supervisor code on
!CONFIG_PTE_64BIT.

However, on CONFIG_PTE_64BIT we map _PAGE_EXEC to _PAGE_BAP_UX so we
need a different permission mask based on the fault coming from a kernel
address or user space.

Without unique permission masks we see issues like the following with
modules:

Unable to handle kernel paging request for instruction fetch
Faulting instruction address: 0xf938d040
Oops: Kernel access of bad area, sig: 11 [#1]

Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Jin Qing <b24347@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-17 10:56:16 -05:00
Alexander Graf
251585b5d0 KVM: PPC: Find HTAB ourselves
For KVM we need to find the location of the HTAB. We can either rely
on internal data structures of the kernel or ask the hardware.

Ben issued complaints about the internal data structure method, so
let's switch it to our own inquiry of the HTAB. Now we're fully
independend :-).

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:19:07 +03:00
Alexander Graf
dd84c21748 KVM: PPC: Add KVM intercept handlers
When an interrupt occurs we don't know yet if we're in guest context or
in host context. When in guest context, KVM needs to handle it.

So let's pull the same trick we did on Book3S_64: Just add a macro to
determine if we're in guest context or not and if so jump on to KVM code.

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:52 +03:00
Alexander Graf
218d169c4c PPC: Export SWITCH_FRAME_SIZE
We need the SWITCH_FRAME_SIZE define on Book3S_32 now too.
So let's export it unconditionally.

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:49 +03:00
Alexander Graf
be85669886 KVM: PPC: Export MMU variables
Our shadow MMU code needs to know where the HTAB is located and how
big it is. So we need some variables from the kernel exported to
module space if KVM is built as a module.

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:48 +03:00
Alexander Graf
97e492558f KVM: PPC: Add SVCPU to Book3S_32
We need to keep the pointer to the shadow vcpu somewhere accessible from
within really early interrupt code. The best fit I found was the thread
struct, as that resides in an SPRG.

So let's put a pointer to the shadow vcpu in the thread struct and add
an asm-offset so we can find it.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:43 +03:00
Alexander Graf
0604675fe1 KVM: PPC: Use now shadowed vcpu fields
The shadow vcpu now contains some fields we don't use from the vcpu anymore.
Access to them happens using inline functions that happily use the shadow
vcpu fields.

So let's now ifdef them out to booke only and add asm-offsets.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:32 +03:00
Alexander Graf
00c3a37ca3 KVM: PPC: Use CONFIG_PPC_BOOK3S define
Upstream recently added a new name for PPC64: Book3S_64.

So instead of using CONFIG_PPC64 we should use CONFIG_PPC_BOOK3S consotently.
That makes understanding the code easier (I hope).

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:29 +03:00
Alexander Graf
2191d657c9 KVM: PPC: Name generic 64-bit code generic
We have quite some code that can be used by Book3S_32 and Book3S_64 alike,
so let's call it "Book3S" instead of "Book3S_64", so we can later on
use it from the 32 bit port too.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:14 +03:00
Alexander Graf
306d071f17 KVM: PPC: Don't export Book3S symbols on BookE
Book3S knows how to convert floats to doubles and vice versa. BookE doesn't.
So let's make sure we don't export them on BookE.

This fixes a link error on BookE with CONFIG_KVM=y.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:25 +03:00
Benjamin Herrenschmidt
131c6c9edd Merge commit 'kumar/merge' into merge 2010-05-13 11:42:40 +10:00
Paul Mackerras
0fe1ac48be powerpc/perf_event: Fix oops due to perf_event_do_pending call
Anton Blanchard found that large POWER systems would occasionally
crash in the exception exit path when profiling with perf_events.
The symptom was that an interrupt would occur late in the exit path
when the MSR[RI] (recoverable interrupt) bit was clear.  Interrupts
should be hard-disabled at this point but they were enabled.  Because
the interrupt was not recoverable the system panicked.

The reason is that the exception exit path was calling
perf_event_do_pending after hard-disabling interrupts, and
perf_event_do_pending will re-enable interrupts.

The simplest and cleanest fix for this is to use the same mechanism
that 32-bit powerpc does, namely to cause a self-IPI by setting the
decrementer to 1.  This means we can remove the tests in the exception
exit path and raw_local_irq_restore.

This also makes sure that the call to perf_event_do_pending from
timer_interrupt() happens within irq_enter/irq_exit.  (Note that
calling perf_event_do_pending from timer_interrupt does not mean that
there is a possible 1/HZ latency; setting the decrementer to 1 ensures
that the timer interrupt will happen immediately, i.e. within one
timebase tick, which is a few nanoseconds or 10s of nanoseconds.)

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-12 14:34:00 +10:00
Lin Ming
8e6d5573af perf, powerpc: Implement group scheduling transactional APIs
[paulus@samba.org: Set cpuhw->event[i]->hw.config in
power_pmu_commit_txn.]

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100508102841.GA10650@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-11 17:08:24 +02:00
Benjamin Herrenschmidt
1ed31d6db9 Merge commit 'origin/master' into next 2010-05-07 11:29:25 +10:00
Anton Blanchard
828a69869b powerpc/cpumask: Update some comments
Since the *_map cpumask variants are deprecated, change the comments to
instead refer to *_mask.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06 17:41:59 +10:00
Anton Blanchard
cc1ba8ea6d powerpc/cpumask: Dynamically allocate cpu_sibling_map and cpu_core_map cpumasks
Dynamically allocate cpu_sibling_map and cpu_core_map cpumasks.

We don't need to set_cpu_online() the boot cpu in smp_prepare_boot_cpu,
init/main.c does it for us.

We also postpone setting of the boot cpu in cpu_sibling_map and cpu_core_map
until when the memory allocator is available (smp_prepare_cpus), similar
to x86.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06 17:41:56 +10:00
Anton Blanchard
e6532c63cc powerpc/cpumask: Convert /proc/cpuinfo to new cpumask API
Use new cpumask API in /proc/cpuinfo code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06 17:41:55 +10:00
Anton Blanchard
2c2df03845 powerpc/cpumask: Refactor /proc/cpuinfo code
This separates the per cpu output from the summary output at the end of the
file, making it easier to convert to the new cpumask API in a subsequent
patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06 17:41:54 +10:00
Anton Blanchard
b6decb7079 powerpc/cpumask: Convert fixup_irqs to new cpumask API
Use new cpumask_* functions, and dynamically allocate cpumask in fixup_irqs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06 17:16:14 +10:00
Anton Blanchard
bfb9126def powerpc/cpumask: Convert smp_cpus_done to new cpumask API
Use the new cpumask_* functions and dynamically allocate the cpumask in
smp_cpus_done.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06 17:16:13 +10:00
Anton Blanchard
d5f86fe345 powerpc/cpumask: Convert rtasd to new cpumask API
Use cpumask_first, cpumask_next in rtasd code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-05-06 17:16:13 +10:00
Torez Smith
b4e8c8dd84 powerpc/4xx: Simple platform for the ISS 4xx simulator
This is a trivial 4xx plaform that uses the new simple bsp from
Josh and is handy to use in simulators such as ISS or even Mambo
who don't properly implement most of the actual devices in the
SoC but really only the core.

Signed-off-by: Torez Smith  <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-05-05 11:11:56 -04:00
Dave Kleikamp
221c185d4e powerpc/476: Add isync after loading mmu and debug spr's
476 requires an isync after loading MMU and debug related SPR's.  Some of
these are in performance-critical paths and may need to be optimized, but
initially, we're playing it safe.

Signed-off-by: Torez Smith  <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-05-05 10:39:13 -04:00
Dave Kleikamp
fc5e709731 powerpc/476: add machine check handler for 47x core
The 47x core's MCSR varies from 44x, so it needs it's own machine check
handler.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-05-05 09:27:22 -04:00
Dave Kleikamp
e7f75ad01d powerpc/47x: Base ppc476 support
This patch adds the base support for the 476 processor.  The code was
primarily written by Ben Herrenschmidt and Torez Smith, but I've been
maintaining it for a while.

The goal is to have a single binary that will run on 44x and 47x, but
we still have some details to work out.  The biggest is that the L1 cache
line size differs on the two platforms, but it's currently a compile-time
option.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Torez Smith  <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-05-05 09:11:10 -04:00
Dave Kleikamp
795033c344 powerpc/44x: break out cpu init code into stand-alone function
The 47x platform supports multiple cores and shares code with 44x.
Break out code that is common for initializing the primary and secondary
cpus into a function which can be called for both.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-05-05 08:04:36 -04:00
Torez Smith
471c70ff39 powerpc/booke: Add Stack Marking support to Booke Exception Prolog
This patch adds a marker to the exception stack frame to aid in debugging.
It's already inserted on other platforms and xmon recognizes it and
identifies exception frames when showing stack traces.

Signed-off-by: Torez Smith  <lnxtorez@linux.vnet.ibm.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-05-05 08:01:52 -04:00
Kumar Gala
b8b14c6676 powerpc/swiotlb: Fix off by one in determining boundary of which ops to use
When we compare the devices DMA mask to the amount of memory we need to
make sure we treat the DMA mask as an address boundary.  For example if
the DMA_MASK(32) and we have 4G of memory we'd incorrectly set the dma
ops to swiotlb.  We need to add one to the dma mask when we convert it.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-05-04 01:27:18 -05:00
Grant Likely
d706c1b050 driver-core: Add device node pointer to struct device
Currently, platforms using CONFIG_OF add a 'struct device_node *of_node'
to dev->archdata.  However, with CONFIG_OF becoming generic for all
architectures, it makes sense for commonality to move it out of archdata
and into struct device proper.

This patch adds a struct device_node *of_node member to struct device
and updates all locations which currently write the device_node pointer
into archdata to also update dev->of_node.  Subsequent patches will
modify callers to use the archdata location and ultimately remove
the archdata member entirely.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
CC: Michal Simek <monstr@monstr.eu>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: "David S. Miller" <davem@davemloft.net>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Jeremy Kerr <jeremy.kerr@canonical.com>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linux-kernel@vger.kernel.org
CC: linuxppc-dev@ozlabs.org
CC: sparclinux@vger.kernel.org
2010-04-28 18:20:57 -06:00
Anton Blanchard
4b83c330b4 powerpc/numa: Add form 1 NUMA affinity
Firmware changed the way it represents memory and cpu affinity on POWER7.
Unfortunately the old method now caps the topology to work around issues
with legacy operating systems. For Linux to get the correct topology we
need to use the new form 1 affinity information.

We set the form 1 field in the client architecture, and if we see "1" in the
ibm,associativity-form property firmware supports form 1 affinity and
we should look at the first field in the ibm,associativity-reference-points
array. If not we use the second field as we always have.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-28 16:22:33 +10:00
Alexander Graf
963cf3dc63 KVM: PPC: Add helpers to call FPU instructions
To emulate paired single instructions, we need to be able to call FPU
operations from within the kernel. Since we don't want gcc to spill
arbitrary FPU code everywhere, we tell it to use a soft fpu.

Since we know we can really call the FPU in safe areas, let's also add
some calls that we can later use to actually execute real world FPU
operations on the host's FPU.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-25 12:35:15 +03:00
Mahesh Salgaonkar
359e4284a3 powerpc: Add kprobe-based event tracer
This patch ports the kprobe-based event tracer to powerpc. This patch
is based on x86 port. This brings powerpc on par with x86.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:11:43 +10:00
Benjamin Herrenschmidt
a32aaf1451 powerpc/vio: Add power management support
Adds support for suspend/resume for VIO devices. This is needed for
support for HMC initiated hibernation.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:09:15 +10:00
Anton Blanchard
e9bbc8cde0 powerpc/pseries: Call ibm,os-term if the ibm,extended-os-term is present
We have had issues in the past with ibm,os-term initiating shutdown of a
partition. This is confusing to the user, especially if panic_timeout is
non zero.

The temporary fix was to avoid calling ibm,os-term if a panic_timeout was set
and since we set it on every boot we basically never call ibm,os-term.

An extended version of ibm,os-term has since been implemented which gives us
the behaviour we want:

  "When the platform supports extended ibm,os-term behavior, the return to the
  RTAS will always occur unless there is a kernel assisted dump active as
  initiated by an ibm,configure-kernel-dump call."

This patch checks for the ibm,extended-os-term property and calls ibm,os-term
if it exists.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:48 +10:00
Julia Lawall
21dbeb91a2 powerpc: Use set_cpus_allowed_ptr
Use set_cpus_allowed_ptr rather than set_cpus_allowed.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
expression E1,E2;
@@

- set_cpus_allowed(E1, cpumask_of_cpu(E2))
+ set_cpus_allowed_ptr(E1, cpumask_of(E2))

@@
expression E;
identifier I;
@@

- set_cpus_allowed(E, I)
+ set_cpus_allowed_ptr(E, &I)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:45 +10:00
Julia Lawall
f6d8c8bb1d powerpc/vio: Add missing unlock in error path
Add an unlock before exiting the function.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
expression E1;
identifier f;
@@

f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:43 +10:00
Joakim Tjernlund
469d62be92 powerpc/8xx: Use SPRG2 and DAR registers to stash r11 and cr.
This avoids storing these registers in memory.
CPU6 errata will still use the old way.
Remove some G2 leftover accesses from 2.4

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:34 +10:00
Joakim Tjernlund
d069cb4373 powerpc/8xx: Don't touch ACCESSED when no SWAP.
Only the swap function cares about the ACCESSED bit in
the pte. Do not waste cycles updateting ACCESSED when swap
is not compiled into the kernel.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:33 +10:00
Joakim Tjernlund
4afb0be7cc powerpc/8xx: Avoid testing for kernel space in ITLB Miss.
Only modules will cause ITLB Misses as we always pin
the first 8MB of kernel memory.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:32 +10:00
Joakim Tjernlund
fe1691e3f4 powerpc/8xx: Optimze TLB Miss handlers
This removes a couple of insn's from the TLB Miss
handlers whithout changing functionality.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:31 +10:00
Vaidyanathan Srinivasan
6fe9d1facb powerpc/pseries: Export data from new hcall H_EM_GET_PARMS
Add support for H_EM_GET_PARMS hcall that will return data
related to power modes from the platform.  Export the data
directly to user space for administrative tools to interpret
and use.

cat /proc/powerpc/lparcfg will export power mode data

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 18:00:29 +10:00
K.Prasad
9c7cc234dc powerpc: Disable interrupts for data breakpoint exceptions
Data address breakpoint exceptions are currently handled along with page-faults
which require interrupts to remain in enabled state. Since exception handling
for data breakpoints aren't pre-empt safe, we handle them separately.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 14:44:38 +10:00
Benjamin Herrenschmidt
578b7cd151 powerpc/vio: Add modalias support
BenH: Added to vio_cmo_dev_attrs as well

Provide a modalias entry for VIO devices in sysfs.  I believe
this was another initrd generation bugfix for anaconda.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-04-07 14:44:28 +10:00
Tejun Heo
336f5899d2 Merge branch 'master' into export-slabh 2010-04-05 11:37:28 +09:00
Frederic Weisbecker
6e03bb5ad3 perf: Always build the powerpc perf_arch_fetch_caller_regs version
Now that software events use perf_arch_fetch_caller_regs() too, we
need the powerpc version to be always built.

Fixes the following build error:

	(.text+0x3210): undefined reference to `perf_arch_fetch_caller_regs'
	(.text+0x3324): undefined reference to `perf_arch_fetch_caller_regs'
	(.text+0x33bc): undefined reference to `perf_arch_fetch_caller_regs'
	(.text+0x33ec): undefined reference to `perf_arch_fetch_caller_regs'
	(.text+0xd4a0): undefined reference to `perf_arch_fetch_caller_regs'
	arch/powerpc/kernel/built-in.o:(.text+0xd528): more undefined references to `perf_arch_fetch_caller_regs' follow
	make[1]: *** [.tmp_vmlinux1] Error 1
	make: *** [sub-make] Error 2

Reported-by: Michael Ellerman <michael@ellerman.id.au>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2010-04-03 12:42:00 +02:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Linus Torvalds
6fa41366c1 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regs
  perf top: Add missing initialization to zero
  perf probe: Use original address instead of CU-based address
  perf probe: Fix offset to allow signed value
  perf top: Improve the autosizing of column lenghts
  perf probe: Fix need_dwarf flag if lazy matching is used
  perf probe: Fix probe_point buffer overrun
2010-03-26 15:09:33 -07:00
Linus Torvalds
95c46afe60 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Remove IOMMU_VMERGE config option
  powerpc: Fix swiotlb to respect the boot option
  powerpc: Do not call prink when CONFIG_PRINTK is not defined
  powerpc: Use correct ccr bit for syscall error status
  powerpc/fsl-booke: Get coherent bit from PTE
  powerpc/85xx: Make sure lwarx hint isn't set on ppc32
2010-03-19 13:42:43 -07:00
FUJITA Tomonori
191aee58b6 powerpc: Remove IOMMU_VMERGE config option
The description says:

 Cause IO segments sent to a device for DMA to be merged virtually
 by the IOMMU when they happen to have been allocated contiguously.
 This doesn't add pressure to the IOMMU allocator. However, some
 drivers don't support getting large merged segments coming back
 from *_map_sg().

 Most drivers don't have this problem; it is safe to say Y here.

It's out of date. Long ago, drivers didn't have a way to tell IOMMUs
about their segment length limit (that is, the maximum segment length
that they can handle). So IOMMUs merged as many segments as possible
and gave too large segments to drivers.

dma_get_max_seg_size() was introduced to solve the above
problem. Device drives can use the API to tell IOMMU about the maximum
segment length that they can handle. In addition, the default limit
(64K) should be safe for everyone.

So this config option seems to be unnecessary.

Note that this config option just enables users to disable the virtual
merging by default. Users can still disable the virtual merging by the
boot parameter.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-19 16:38:16 +11:00
FUJITA Tomonori
a93272969c powerpc: Fix swiotlb to respect the boot option
powerpc initializes swiotlb before parsing the kernel boot options so
swiotlb options (e.g. specifying the swiotlb buffer size) are ignored.

Any time before freeing bootmem works for swiotlb so this patch moves
powerpc's swiotlb initialization after parsing the kernel boot
options, mem_init (as x86 does).

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Tested-by: Becky Bruce <beckyb@kernel.crashing.org>
Tested-by: Albert Herranz <albert_herranz@yahoo.es>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-19 16:38:16 +11:00
Márton Németh
09156a7a40 powerpc: Do not call prink when CONFIG_PRINTK is not defined
When printk() is disabled (CONFIG_PRINTK) at menu item
 General setup
 -> Configure standard kernel features (for small systems)
    -> Enable support for printk
then there should be no printk() calls at all.

Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-19 16:38:16 +11:00
Benjamin Herrenschmidt
d6a8536a93 Merge commit 'kumar/merge' into merge 2010-03-19 16:23:55 +11:00
Linus Torvalds
f82c37e7bb Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits)
  perf: Fix unexported generic perf_arch_fetch_caller_regs
  perf record: Don't try to find buildids in a zero sized file
  perf: export perf_trace_regs and perf_arch_fetch_caller_regs
  perf, x86: Fix hw_perf_enable() event assignment
  perf, ppc: Fix compile error due to new cpu notifiers
  perf: Make the install relative to DESTDIR if specified
  kprobes: Calculate the index correctly when freeing the out-of-line execution slot
  perf tools: Fix sparse CPU numbering related bugs
  perf_event: Fix oops triggered by cpu offline/online
  perf: Drop the obsolete profile naming for trace events
  perf: Take a hot regs snapshot for trace events
  perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot
  perf/x86-64: Use frame pointer to walk on irq and process stacks
  lockdep: Move lock events under lockdep recursion protection
  perf report: Print the map table just after samples for which no map was found
  perf report: Add multiple event support
  perf session: Change perf_session post processing functions to take histogram tree
  perf session: Add storage for seperating event types in report
  perf session: Change add_hist_entry to take the tree root instead of session
  perf record: Add ID and to recorded event data when recording multiple events
  ...
2010-03-18 16:52:46 -07:00
Paul Mackerras
9eff26ea48 powerpc/perf_events: Fix call-graph recording, add perf_arch_fetch_caller_regs
This implements a powerpc version of perf_arch_fetch_caller_regs
to get correct call-graphs.

It's implemented in assembly because that way we can be sure there isn't
a stack frame for perf_arch_fetch_caller_regs.  If it was in C, gcc might
or might not create a stack frame for it, which would affect the number
of levels we have to skip.

With this, we see results from perf record -e lock:lock_acquire like
this:

 # Samples: 24878
 #
 # Overhead         Command      Shared Object  Symbol
 # ........  ..............  .................  ......
 #
    14.99%            perf  [kernel.kallsyms]  [k] ._raw_spin_lock
                      |
                      --- ._raw_spin_lock
                         |
                         |--25.00%-- .alloc_fd
                         |          (nil)
                         |          |
                         |          |--50.00%-- .anon_inode_getfd
                         |          |          .sys_perf_event_open
                         |          |          syscall_exit
                         |          |          syscall
                         |          |          create_counter
                         |          |          __cmd_record
                         |          |          run_builtin
                         |          |          main
                         |          |          0xfd2e704
                         |          |          0xfd2e8c0
                         |          |          (nil)

... etc.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: anton@samba.org
Cc: linuxppc-dev@ozlabs.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100318050513.GA6575@drongo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18 06:48:29 +01:00
Kumar Gala
9d296cfa69 powerpc/fsl-booke: Get coherent bit from PTE
We shouldn't be always setting 'M' in the TLB entry since its reasonable
for somethings to be mapped non-coherent.  The PTE should have 'M' set
properly.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-03-16 23:39:56 -05:00
Linus Torvalds
9fdfbc2bff Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Provide generic perf_sample_data initialization
  MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer
  perf trace: Don't use pager if scripting
  perf trace/scripting: Remove extraneous header read
  perf, ARM: Modify kuser rmb() call to compile for Thumb-2
  x86/stacktrace: Don't dereference bad frame pointers
  perf archive: Don't try to collect files without a build-id
  perf_events, x86: Fixup fixed counter constraints
  perf, x86: Restrict the ANY flag
  perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE
  perf, x86: add some IBS macros to perf_event.h
  perf, x86: make IBS macros available in perf_event.h
  hw-breakpoints: Remove stub unthrottle callback
  x86/hw-breakpoints: Remove the name field
  perf: Remove pointless breakpoint union
  perf lock: Drop the buffers multiplexing dependency
  perf lock: Fix and add misc documentally things
  percpu: Add __percpu sparse annotations to hw_breakpoint
2010-03-13 14:39:42 -08:00
Linus Torvalds
b6fedfd2a1 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/booke: Fix breakpoint/watchpoint one-shot behavior
  powerpc: Reduce printk from pseries_mach_cpu_die()
  powerpc: Move checks in pseries_mach_cpu_die()
  powerpc: Reset kernel stack on cpu online from cede state
  powerpc: Fix G5 thermal shutdown
  powerpc/pseries: Pass CPPR value to H_XIRR hcall
  powerpc/booke: Fix a couple typos in the advanced ptrace code
  powerpc: Fix SMP build with disabled CPU hotplugging.
  powerpc: Dynamically allocate pacas
  powerpc/perf: e500 support
  powerpc/perf: Build callchain code regardless of hardware event support.
  powerpc/cpm2: Checkpatch cleanup
  powerpc/86xx: Renaming following split of GE Fanuc joint venture
  powerpc/86xx: Convert gef_pic_lock to raw_spinlock
  powerpc/qe: Convert qe_ic_lock to raw_spinlock
  powerpc/82xx: Convert pci_pic_lock to raw_spinlock
  powerpc/85xx: Convert socrates_fpga_pic_lock to raw_spinlock
2010-03-12 16:06:51 -08:00
Linus Torvalds
c32da02342 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (56 commits)
  doc: fix typo in comment explaining rb_tree usage
  Remove fs/ntfs/ChangeLog
  doc: fix console doc typo
  doc: cpuset: Update the cpuset flag file
  Fix of spelling in arch/sparc/kernel/leon_kernel.c no longer needed
  Remove drivers/parport/ChangeLog
  Remove drivers/char/ChangeLog
  doc: typo - Table 1-2 should refer to "status", not "statm"
  tree-wide: fix typos "ass?o[sc]iac?te" -> "associate" in comments
  No need to patch AMD-provided drivers/gpu/drm/radeon/atombios.h
  devres/irq: Fix devm_irq_match comment
  Remove reference to kthread_create_on_cpu
  tree-wide: Assorted spelling fixes
  tree-wide: fix 'lenght' typo in comments and code
  drm/kms: fix spelling in error message
  doc: capitalization and other minor fixes in pnp doc
  devres: typo fix s/dev/devm/
  Remove redundant trailing semicolons from macros
  fix typo "definetly" -> "definitely" in comment
  tree-wide: s/widht/width/g typo in comments
  ...

Fix trivial conflict in Documentation/laptops/00-INDEX
2010-03-12 16:04:50 -08:00
FUJITA Tomonori
6e6c70e691 dma-mapping: powerpc: use generic pci_set_dma_mask and pci_set_consistent_dma_mask
This converts powerpc to use the generic pci_set_dma_mask and
pci_set_consistent_dma_mask (drivers/pci/pci.c).

The generic pci_set_dma_mask does what powerpc's pci_set_dma_mask does.

Unlike powerpc's pci_set_consistent_dma_mask, the gneric
pci_set_consistent_dma_mask sets only coherent_dma_mask.  It doesn't work
for powerpc?  pci_set_consistent_dma_mask API should set only
coherent_dma_mask?

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:42 -08:00
Christoph Hellwig
5cacdb4add Add generic sys_olduname()
Add generic implementations of the old and really old uname system calls.
Note that sh only implements sys_olduname but not sys_oldolduname, but I'm
not going to bother with another ifdef for that special case.

m32r implemented an old uname but never wired it up, so kill it, too.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:32 -08:00
Christoph Hellwig
e28cbf2293 improve sys_newuname() for compat architectures
On an architecture that supports 32-bit compat we need to override the
reported machine in uname with the 32-bit value.  Instead of doing this
separately in every architecture introduce a COMPAT_UTS_MACHINE define in
<asm/compat.h> and apply it directly in sys_newuname().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:32 -08:00
Christoph Hellwig
baed7fc9b5 Add generic sys_ipc wrapper
Add a generic implementation of the ipc demultiplexer syscall.  Except for
s390 and sparc64 all implementations of the sys_ipc are nearly identical.

There are slight differences in the types of the parameters, where mips
and powerpc as the only 64-bit architectures with sys_ipc use unsigned
long for the "third" argument as it gets casted to a pointer later, while
it traditionally is an "int" like most other paramters.  frv goes even
further and uses unsigned long for all parameters execept for "ptr" which
is a pointer type everywhere.  The change from int to unsigned long for
"third" and back to "int" for the others on frv should be fine due to the
in-register calling conventions for syscalls (we already had a similar
issue with the generic sys_ptrace), but I'd prefer to have the arch
maintainers looks over this in details.

Except for that h8300, m68k and m68knommu lack an impplementation of the
semtimedop sub call which this patch adds, and various architectures have
gets used - at least on i386 it seems superflous as the compat code on
x86-64 and ia64 doesn't even bother to implement it.

[akpm@linux-foundation.org: add sys_ipc to sys_ni.c]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-03-12 15:52:32 -08:00
Peter Zijlstra
85cfabbcd1 perf, ppc: Fix compile error due to new cpu notifiers
Fix:

  arch/powerpc/kernel/perf_event.c:1334: error: 'power_pmu_notifier' undeclared (first use in this function)
  arch/powerpc/kernel/perf_event.c:1334: error: (Each undeclared identifier is reported only once
  arch/powerpc/kernel/perf_event.c:1334: error: for each function it appears in.)
  arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'power_pmu_notifier'
  arch/powerpc/kernel/perf_event.c:1334: error: implicit declaration of function 'register_cpu_notifier'

Due to commit 3f6da390 (perf: Rework and fix the arch CPU-hotplug hooks).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 15:21:27 +01:00
Peter Zijlstra
3f6da39053 perf: Rework and fix the arch CPU-hotplug hooks
Remove the hw_perf_event_*() hotplug hooks in favour of per PMU hotplug
notifiers. This has the advantage of reducing the static weak interface
as well as exposing all hotplug actions to the PMU.

Use this to fix x86 hotplug usage where we did things in ONLINE which
should have been done in UP_PREPARE or STARTING.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: paulus@samba.org
Cc: eranian@google.com
Cc: robert.richter@amd.com
Cc: fweisbec@gmail.com
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <20100305154128.736225361@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10 13:22:24 +01:00
Peter Zijlstra
dc1d628a67 perf: Provide generic perf_sample_data initialization
This makes it easier to extend perf_sample_data and fixes a bug on arm
and sparc, which failed to set ->raw to NULL, which can cause crashes
when combined with PERF_SAMPLE_RAW.

It also optimizes PowerPC and tracepoint, because the struct
initialization is forced to zero out the whole structure.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jean Pihet <jpihet@mvista.com>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Jamie Iles <jamie.iles@picochip.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@kernel.org
LKML-Reference: <20100304140100.315416040@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-10 13:22:23 +01:00
Dave Kleikamp
30124d1109 powerpc/booke: Fix breakpoint/watchpoint one-shot behavior
Another fix for the extended ptrace patches in the -next tree.

The handling of breakpoints and watchpoints is inconsistent.  When a
breakpoint or watchpoint is hit, the interrupt handler is clearing the
proper bits in the dbcr* registers, but leaving the dac* and iac* registers
alone.  The ptrace code to delete the break/watchpoints checks the dac* and
iac* registers for zero to determine if they are enabled.  Instead, they
should check the dbcr* bits.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-09 11:57:10 +11:00
Vaidyanathan Srinivasan
8dbce53cc2 powerpc: Reset kernel stack on cpu online from cede state
Cpu hotplug (offline) without dlpar operation will place cpu
in cede state and the extended_cede_processor() function will
return when resumed.

Kernel stack pointer needs to be reset before
start_secondary() is called to continue the online operation.

Added new function start_secondary_resume() to do the above
steps.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-09 11:57:10 +11:00
Dave Kleikamp
856f70a368 powerpc/booke: Fix a couple typos in the advanced ptrace code
powerpc/booke: Fix a couple typos in the advanced ptrace code

Found and fixed a couple typos in the advanced ptrace patches.
(These patches are currently in benh's next tree.)

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-09 11:54:18 +11:00
Michael Ellerman
1426d5a3bd powerpc: Dynamically allocate pacas
On 64-bit kernels we currently have a 512 byte struct paca_struct for
each cpu (usually just called "the paca"). Currently they are statically
allocated, which means a kernel built for a large number of cpus will
waste a lot of space if it's booted on a machine with few cpus.

We can avoid that by only allocating the number of pacas we need at
boot. However this is complicated by the fact that we need to access
the paca before we know how many cpus there are in the system.

The solution is to dynamically allocate enough space for NR_CPUS pacas,
but then later in boot when we know how many cpus we have, we free any
unused pacas.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-03-09 11:52:52 +11:00
Benjamin Herrenschmidt
59603b9ae4 Merge commit 'kumar/next' into merge 2010-03-09 11:51:57 +11:00
Jiri Kosina
318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
Emese Revfy
52cf25d0ab Driver core: Constify struct sysfs_ops in struct kobj_type
Constify struct sysfs_ops.

This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

 * prevents modification of data that is shared
   (referenced) by many other structure instances
   at runtime

 * detects/prevents accidental (but not intentional)
   modification attempts on archs that enforce
   read-only kernel data at runtime

 * potentially better optimized code as the compiler
   can assume that the const data cannot be changed

 * the compiler/linker move const data into .rodata
   and therefore exclude them from false sharing

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-03-07 17:04:49 -08:00
Scott Wood
a11106544f powerpc/perf: e500 support
This implements perf_event support for the Freescale embedded performance
monitor, based on the existing perf_event.c that supports server/classic
chips.

Some limitations:
- Performance monitor interrupts are regular EE interrupts, and thus you
  can't profile places with interrupts disabled.  We may want to implement
  soft IRQ-disabling, with perfmon interrupts exempted and treated as NMIs.
- When trying to schedule multiple event groups at once, and using
  restricted events, situations could arise where scheduling fails even
  though it would be possible.  Consider three groups, each with two events.
  One group has restricted events, the others don't.  The two non-restricted
  groups are scheduled, then one is removed, which happens to occupy the two
  counters that can't do restricted events.  The remaining non-restricted
  group will not be moved to the non-restricted-capable counters to make
  room if the restricted group tries to be scheduled.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-03-05 03:04:08 -06:00
Scott Wood
9d6df3fdfc powerpc/perf: Build callchain code regardless of hardware event support.
It's also useful for software events, as well as future support for
other types of hardware counters.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-03-04 10:46:29 -06:00
Denys Vlasenko
54cb27a71f Rename .data.read_mostly to .data..read_mostly.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-03-03 11:26:00 +01:00
Tim Abbott
75b1348372 Rename .data.page_aligned to .data..page_aligned.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-03-03 11:25:59 +01:00
Tim Abbott
2af7687f1a Rename .data.init_task to .data..init_task.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-03-03 11:25:58 +01:00
Tim Abbott
4af57b787b Rename .data.cacheline_aligned to .data..cacheline_aligned.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-03-03 11:25:58 +01:00
Alexander Graf
f7adbba1e5 KVM: PPC: Keep SRR1 flags around in shadow_msr
SRR1 stores more information that just the MSR value. It also stores
valuable information about the type of interrupt we received, for
example whether the storage interrupt we just got was because of a
missing htab entry or not.

We use that information to speed up the exit path.

Now if we get preempted before we can interpret the shadow_msr values,
we get into vcpu_put which then calls the MSR handler, which then sets
all the SRR1 information bits in shadow_msr to 0. Great.

So let's preserve the SRR1 specific bits in shadow_msr whenever we set
the MSR. They don't hurt.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:56 -03:00
Alexander Graf
fbad5f1dfd KVM: PPC: Export __giveup_vsx
We need to explicitly only giveup VSX in KVM, so let's export that
specific function to module space.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:52 -03:00
Alexander Graf
021ec9c69f KVM: PPC: Call SLB patching code in interrupt safe manner
Currently we're racy when doing the transition from IR=1 to IR=0, from
the module memory entry code to the real mode SLB switching code.

To work around that I took a look at the RTAS entry code which is faced
with a similar problem and did the same thing:

  A small helper in linear mapped memory that does mtmsr with IR=0 and
  then RFIs info the actual handler.

Thanks to that trick we can safely take page faults in the entry code
and only need to be really wary of what to do as of the SLB switching
part.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:49 -03:00
Alexander Graf
7e57cba060 KVM: PPC: Use PACA backed shadow vcpu
We're being horribly racy right now. All the entry and exit code hijacks
random fields from the PACA that could easily be used by different code in
case we get interrupted, for example by a #MC or even page fault.

After discussing this with Ben, we figured it's best to reserve some more
space in the PACA and just shove off some vcpu state to there.

That way we can drastically improve the readability of the code, make it
less racy and less complex.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:48 -03:00
Linus Torvalds
6556a67435 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (172 commits)
  perf_event, amd: Fix spinlock initialization
  perf_event: Fix preempt warning in perf_clock()
  perf tools: Flush maps on COMM events
  perf_events, x86: Split PMU definitions into separate files
  perf annotate: Handle samples not at objdump output addr boundaries
  perf_events, x86: Remove superflous MSR writes
  perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in()
  perf_events, x86: AMD event scheduling
  perf_events: Add new start/stop PMU callbacks
  perf_events: Report the MMAP pgoff value in bytes
  perf annotate: Defer allocating sym_priv->hist array
  perf symbols: Improve debugging information about symtab origins
  perf top: Use a macro instead of a constant variable
  perf symbols: Check the right return variable
  perf/scripts: Tag syscall_name helper as not yet available
  perf/scripts: Add perf-trace-python Documentation
  perf/scripts: Remove unnecessary PyTuple resizes
  perf/scripts: Add syscall tracing scripts
  perf/scripts: Add Python scripting engine
  perf/scripts: Remove check-perf-trace from listed scripts
  ...

Fix trivial conflict in tools/perf/util/probe-event.c
2010-02-28 10:20:25 -08:00
Linus Torvalds
ef1a8de8ea Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (88 commits)
  powerpc: Fix lwsync feature fixup vs. modules on 64-bit
  powerpc: Convert pmc_owner_lock to raw_spinlock
  powerpc: Convert die.lock to raw_spinlock
  powerpc: Convert tlbivax_lock to raw_spinlock
  powerpc: Convert mpic locks to raw_spinlock
  powerpc: Convert pmac_pic_lock to raw_spinlock
  powerpc: Convert big_irq_lock to raw_spinlock
  powerpc: Convert feature_lock to raw_spinlock
  powerpc: Convert i8259_lock to raw_spinlock
  powerpc: Convert beat_htab_lock to raw_spinlock
  powerpc: Convert confirm_error_lock to raw_spinlock
  powerpc: Convert ipic_lock to raw_spinlock
  powerpc: Convert native_tlbie_lock to raw_spinlock
  powerpc: Convert beatic_irq_mask_lock to raw_spinlock
  powerpc: Convert nv_lock to raw_spinlock
  powerpc: Convert context_lock to raw_spinlock
  powerpc/85xx: Add NOR, LEDs and PIB support for MPC8568E-MDS boards
  powerpc/86xx: Enable VME driver on the GE SBC610
  powerpc/86xx: Enable VME driver on the GE PPC9A
  powerpc/86xx: Add MSI section to GE PPC9A DTS
  ...
2010-02-27 13:26:18 -08:00
Linus Torvalds
68c6b85984 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (48 commits)
  x86/PCI: Prevent mmconfig memory corruption
  ACPI: Use GPE reference counting to support shared GPEs
  x86/PCI: use host bridge _CRS info by default on 2008 and newer machines
  PCI: augment bus resource table with a list
  PCI: add pci_bus_for_each_resource(), remove direct bus->resource[] refs
  PCI: read bridge windows before filling in subtractive decode resources
  PCI: split up pci_read_bridge_bases()
  PCIe PME: use pci_pcie_cap()
  PCI PM: Run-time callbacks for PCI bus type
  PCIe PME: use pci_is_pcie()
  PCI / ACPI / PM: Platform support for PCI PME wake-up
  ACPI / ACPICA: Multiple system notify handlers per device
  ACPI / PM: Add more run-time wake-up fields
  ACPI: Use GPE reference counting to support shared GPEs
  PCI PM: Make it possible to force using INTx for PCIe PME signaling
  PCI PM: PCIe PME root port service driver
  PCI PM: Add function for checking PME status of devices
  PCI: mark is_pcie obsolete
  PCI: set PCI_PREF_RANGE_TYPE_64 in pci_bridge_check_ranges
  PCI: pciehp: second try to get big range for pcie devices
  ...
2010-02-26 10:35:27 -08:00
Peter Zijlstra
6e37738a2f perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in()
Since the cpu argument to hw_perf_group_sched_in() is always
smp_processor_id(), simplify the code a little by removing this argument
and using the current cpu where needed.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1265890918.5396.3.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-02-26 10:56:53 +01:00
Benjamin Herrenschmidt
874f2f997d Merge commit 'origin/master' into next
Manual merge of:
	drivers/char/hvc_console.c
	drivers/char/hvc_console.h
2010-02-26 14:41:00 +11:00
Linus Torvalds
6ebdc661b6 Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (41 commits)
  of: remove undefined request_OF_resource & release_OF_resource
  of/sparc: Remove sparc-local declaration of allnodes and devtree_lock
  of: move definition of of_chosen into common code.
  of: remove unused extern reference to devtree_lock
  of: put default string compare and #a/s-cell values into common header
  of/flattree: Don't assume HAVE_LMB
  of: protect linux/of.h with CONFIG_OF
  proc_devtree: fix THIS_MODULE without module.h
  of: Remove old and misplaced function declarations
  of/flattree: Make the kernel accept ePAPR style phandle information
  of/flattree: endian-convert members of boot_param_header
  of: assume big-endian properties, adding conversions where necessary
  of: use __be32 for cell value accessors
  of/flattree: use OF_ROOT_NODE_{SIZE,ADDR}_CELLS DEFAULT for fdt parsing
  of/flattree: use callback to setup initrd from /chosen
  proc_devtree: include linux/of.h
  of: make set_node_proc_entry private to proc_devtree.c
  of: include linux/proc_fs.h
  of/flattree: merge early_init_dt_scan_memory() common code
  of: add 'of_' prefix to machine_is_compatible()
  ...
2010-02-25 15:38:37 -08:00
Bjorn Helgaas
89a74ecccd PCI: add pci_bus_for_each_resource(), remove direct bus->resource[] refs
No functional change; this converts loops that iterate from 0 to
PCI_BUS_NUM_RESOURCES through pci_bus resource[] table to use the
pci_bus_for_each_resource() iterator instead.

This doesn't change the way resources are stored; it merely removes
dependencies on the fact that they're in a table.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-23 09:43:31 -08:00
Dominik Brodowski
3b7a17fcda resource/PCI: mark struct resource as const
Now that we return the new resource start position, there is no
need to update "struct resource" inside the align function.
Therefore, mark the struct resource as const.

Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:16:57 -08:00
Dominik Brodowski
b26b2d494b resource/PCI: align functions now return start of resource
As suggested by Linus, align functions should return the start
of a resource, not void. An update of "res->start" is no longer
necessary.

Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-02-22 16:16:56 -08:00
Thomas Gleixner
071c06cb57 powerpc: Convert pmc_owner_lock to raw_spinlock
pmc_owner_lock needs to be a real spinlock in RT. Convert it to
raw_spinlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-19 14:52:33 +11:00
Thomas Gleixner
b8f87782e8 powerpc: Convert die.lock to raw_spinlock
die.lock needs to be a real spinlock in RT. Convert it to
raw_spinlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-19 14:52:33 +11:00
Thomas Gleixner
f95e085b25 powerpc: Convert big_irq_lock to raw_spinlock
big_irq_lock needs to be a real spinlock in RT. Convert it to
raw_spinlock.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-19 14:52:32 +11:00
Sebastian Andrzej Siewior
51adc548cb powerpc/fsl-booke: replace a hardcoded constant
24 is offset between the opcode past bl and past rfi. This makes it more
obvious.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-02-17 21:10:25 -06:00
Dave Kleikamp
3bffb6529c powerpc/booke: Add support for advanced debug registers
powerpc/booke: Add support for advanced debug registers

From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>

Based on patches originally written by Torez Smith.

This patch defines context switch and trap related functionality
for BookE specific Debug Registers. It adds support to ptrace()
for setting and getting BookE related Debug Registers

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Torez Smith  <lnxtorez@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Gibson <dwg@au1.ibm.com>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Sergio Durigan Junior <sergiodj@br.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@br.ibm.com>
Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:03:17 +11:00
Dave Kleikamp
3162d92dfb powerpc: Extended ptrace interface
powerpc: Extended ptrace interface

From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>

Based on patches originally written by Torez Smith.

Add a new extended ptrace interface so that user-space has a single
interface for powerpc, without having to know the specific layout
of the debug registers.

Implement:
PPC_PTRACE_GETHWDEBUGINFO
PPC_PTRACE_SETHWDEBUG
PPC_PTRACE_DELHWDEBUG

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: David Gibson <dwg@au1.ibm.com>
Cc: Torez Smith  <lnxtorez@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Sergio Durigan Junior <sergiodj@br.ibm.com>
Cc: Thiago Jung Bauermann <bauerman@br.ibm.com>
Cc: linuxppc-dev list <Linuxppc-dev@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:03:17 +11:00
Dave Kleikamp
172ae2e7f8 powerpc/booke: Introduce new CONFIG options for advanced debug registers
powerpc/booke: Introduce new CONFIG options for advanced debug registers

From: Dave Kleikamp <shaggy@linux.vnet.ibm.com>

Introduce new config options to simplify the ifdefs pertaining to the
advanced debug registers for booke and 40x processors:

CONFIG_PPC_ADV_DEBUG_REGS - boolean: true for dac-based processors
CONFIG_PPC_ADV_DEBUG_IACS - number of IAC registers
CONFIG_PPC_ADV_DEBUG_DACS - number of DAC registers
CONFIG_PPC_ADV_DEBUG_DVCS - number of DVC registers
CONFIG_PPC_ADV_DEBUG_DAC_RANGE - DAC ranges supported

Beginning conservatively, since I only have the facilities to test 440
hardware.  I believe all 40x and booke platforms support at least 2 IAC
and 2 DAC registers.  For 440, 4 IAC and 2 DVC registers are enabled, as
well as the DAC ranges.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:03:16 +11:00
Anton Blanchard
17081102a6 powerpc: Convert global "BAD" interrupt to per cpu spurious
I often get asked if BAD interrupts are really bad. On some boxes (eg
IBM machines running a hypervisor) there are valid cases where are
presented with an interrupt that is not for us. These cases are common
enough to show up as thousands of BAD interrupts a day.

Tone them down by calling them spurious. Since they can be a significant cause
of OS jitter, we may as well log them per cpu so we know where they are
occurring.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:02:49 +11:00
Anton Blanchard
89713ed108 powerpc: Add timer, performance monitor and machine check counts to /proc/interrupts
With NO_HZ it is useful to know how often the decrementer is going off. The
patch below adds an entry for it and also adds it into the /proc/stat
summaries.

While here, I added performance monitoring and machine check exceptions.
I found it useful to keep an eye on the PMU exception rate
when using the perf tool. Since it's possible to take a completely
handled machine check on a System p box it also sounds like a good idea to
keep a machine check summary.

The event naming matches x86 to keep gratuitous differences to a minimum.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:02:49 +11:00
Anton Blanchard
c86845ede8 powerpc: Rework /proc/interrupts
On a large machine I noticed the columns of /proc/interrupts failed to line up
with the header after CPU9. At sufficiently large numbers of CPUs it becomes
impossible to line up the CPU number with the counts.

While fixing this I noticed x86 has a number of updates that we may as well
pull in. On PowerPC we currently omit an interrupt completely if there is no
active handler, whereas on x86 it is printed if there is a non zero count.

The x86 code also spaces the first column correctly based on nr_irqs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:02:48 +11:00
Anton Blanchard
8c007bfdf1 powerpc: Reduce footprint of irq_stat
PowerPC is currently using asm-generic/hardirq.h which statically allocates an
NR_CPUS irq_stat array. Switch to an arch specific implementation which uses
per cpu data:

On a kernel with NR_CPUS=1024, this saves quite a lot of memory:

   text    data     bss      dec         hex    filename
8767938 2944132 1636796 13348866         cbb002 vmlinux.baseline
8767779 2944260 1505724 13217763         c9afe3 vmlinux.irq_cpustat

A saving of around 128kB.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-17 14:02:48 +11:00
Grant Likely
fc0bdae49d of: move definition of of_chosen into common code.
Rather than defining of_chosen in each arch, it can be defined for all
in driver/of/base.c

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
2010-02-14 07:13:55 -07:00
Grant Likely
22d5579e66 of: remove unused extern reference to devtree_lock
Neither the powerpc nor the microblaze code use devtree_lock anymore.
Remove the extern reference.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
2010-02-14 07:13:52 -07:00
Jeremy Kerr
4ef7b373df of/flattree: Don't assume HAVE_LMB
We don't always have lmb available, so make arches provide an
early_init_dt_alloc_memory_arch() to handle the allocation of
memory in the fdt code.

When we don't have lmb.h included, we need asm/page.h for __va.

Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
2010-02-14 07:13:47 -07:00
Jeremy Kerr
087f79c48c of/flattree: endian-convert members of boot_param_header
The boot_param_header has big-endian fields, so change the types to
__be32, and perform endian conversion when we access them.

Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-02-09 08:34:10 -07:00
Jeremy Kerr
1406bc2f57 of/flattree: use callback to setup initrd from /chosen
At present, the fdt code sets the kernel-wide initrd_start and
initrd_end variables when parsing /chosen. On ARM, we only set these
once the bootmem has been reserved.

This change adds an arch hook to setup the initrd from the device
tree:

 void early_init_dt_setup_initrd_arch(unsigned long start,
				      unsigned long end);

The arch-specific code can then setup the initrd however it likes.

Compiled on powerpc, with CONFIG_BLK_DEV_INITRD=y and =n.

Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-02-09 08:34:10 -07:00
Grant Likely
51975db0b7 of/flattree: merge early_init_dt_scan_memory() common code
Merge common code between PowerPC and Microblaze architectures.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Michal Simek <monstr@monstr.eu>
2010-02-09 08:33:10 -07:00
Grant Likely
71a157e8ed of: add 'of_' prefix to machine_is_compatible()
machine is compatible is an OF-specific call.  It should have
the of_ prefix to protect the global namespace.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Michal Simek <monstr@monstr.eu>
2010-02-09 08:33:00 -07:00
Jeremy Kerr
89751a7cb7 of: merge of_find_node_by_phandle
Merge common function between powerpc, sparc and microblaze. Code is
identical for powerpc and microblaze, but adds a lock (and release) of
the devtree_lock on sparc.

Signed-off-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-02-09 08:32:48 -07:00
Grant Likely
fcdeb7fedf of: merge of_attach_node() & of_detach_node()
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-09 08:32:42 -07:00
Anton Blanchard
b919ee827e powerpc: Only print clockevent settings once
The clockevent multiplier and shift is useful information, but we
only need to print it once.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-09 13:56:24 +11:00
Anton Blanchard
44c9f3cc1a powerpc: Clear MSR_RI during RTAS calls
RTAS should never cause an exception but if it does (for example accessing
outside our RMO) then we might go a long way through the kernel before
oopsing. If we unset MSR_RI we should at least stop things on exception
exit.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-09 13:56:24 +11:00
Frans Pop
8354be9c10 powerpc: Remove trailing space in messages
Signed-off-by: Frans Pop <elendil@planet.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-09 13:56:23 +11:00
Anton Blanchard
0b9612c210 powerpc: Make powerpc_firmware_features __read_mostly
We use firmware_has_feature quite a lot these days, so it's worth putting
powerpc_firmware_features into __read_mostly.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-09 13:56:07 +11:00
Anton Blanchard
66fcb1059d powerpc: Add last sysfs file and dump of ftrace buffer to oops printout
Add printout of last accessed sysfs file, added to x86 in
ae87221d3c (sysfs: crash debugging)

Also add the notify_die hook that allows us to print out the ftrace
buffer on oops. This is useful in conjunction with ftrace function_graph:

Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=128 NUMA pSeries
last sysfs file: /sys/class/net/tunl0/type
Dumping ftrace buffer:

...

  0)               |                .sysrq_handle_crash() {
  0)   0.476 us    |                  .hash_page();
  0)   0.488 us    |                  .xmon_fault_handler();
  0)               |                  .bad_page_fault() {
  0)               |                    .search_exception_tables() {
  0)   0.590 us    |                      .search_module_extables();
  0)   2.546 us    |                    }
  0)               |                    .printk() {
  0)               |                      .vprintk() {
  0)   0.488 us    |                        ._raw_spin_lock();
  0)   0.572 us    |                        .emit_log_char();

Showing the function graph of a sysrq-c crash.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-09 13:56:06 +11:00
Joe Perches
5a2ad98e92 arch/powerpc: Fix continuation line formats
String constants that are continued on subsequent lines with \
are not good.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-09 13:55:05 +11:00
Stefan Weil
947af29435 Fix spelling of 'platform' in comments and doc
Replace platfrom -> platform.

This is a frequent spelling bug.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-05 12:22:34 +01:00
Benjamin Herrenschmidt
efec959f63 powerpc/pseries: Pass more accurate number of supported cores to firmware
Updated variant of a patch by Joel Schopp.

The field containing the number of supported cores which we pass to
firmware via the ibm,client-architecture call was set by a previous
patch statically as high as is possible (NR_CPUS).

However, that value isn't quite right for a system that supports
multiple threads per core, thus permitting the firmware to assign
more cores to a Linux partition than it can really cope with.

This patch improves it by using the device-tree to determine the
number of threads supported by the processors in order to adjust
the value passed to firmware.

Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-04 14:33:54 +11:00
jschopp@austin.ibm.com
28bb9ee13a powerpc: Add static fields to ibm,client-architecture call
This patch adds 2 fields to the ibm_architecture_vec array.

The first of these fields indicates the number of cores which Linux can
boot.  It does not account for SMT, so it may result in cpus assigned to
Linux which cannot be booted.  A second patch follows that dynamically
updates this for SMT.

The second field just indicates that our OS is Linux, and not another
OS.  The system may or may not use this hint to performance tune
settings for Linux.

Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-03 17:41:13 +11:00
Anton Blanchard
5be3492f97 powerpc: Mark some variables in the page fault path __read_mostly
Using perf to trace L1 dcache misses and dumping data addresses I found a few
variables taking a lot of misses. Since they are almost never written, they
should go into the __read_mostly section.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-03 17:39:48 +11:00
Anton Blanchard
61c03ddbdf powerpc: Replace per_cpu(, smp_processor_id()) with __get_cpu_var()
The cputime code has a few places that do per_cpu(, smp_processor_id()).
Replace them with __get_cpu_var().

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-03 17:39:48 +11:00
Andreas Schwab
94f28da840 powerpc: TIF_ABI_PENDING bit removal
Here are the powerpc bits to remove TIF_ABI_PENDING now that
set_personality() is called at the appropriate place in exec.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-02-01 14:00:30 +11:00
Ingo Molnar
ae7f6711d6 Merge branch 'perf/urgent' into perf/core
Merge reason: We want to queue up a dependent patch. Also update to
              later -rc's.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-29 10:36:22 +01:00
Benjamin Herrenschmidt
94afc008e1 powerpc/pci: Add missing call to header fixup
Add missing call to  pci_fixup_device(pci_fixup_early, ...) when
building the pci_dev from scratch off the Open Firmware device-tree

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-29 16:51:11 +11:00
Benjamin Herrenschmidt
26b4a0ca46 powerpc/pci: Add missing hookup to pci_slot
Add missing hookup to existing pci_slot when building the pci_dev from
scratch off the Open Firmware device-tree

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-29 16:51:10 +11:00
Benjamin Herrenschmidt
bb209c8287 powerpc/pci: Add calls to set_pcie_port_type() and set_pcie_hotplug_bridge()
We are missing these when building the pci_dev from scratch off
the Open Firmware device-tree

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2010-01-29 16:51:10 +11:00
Grant Likely
0ada0a7312 Merge commit 'v2.6.33-rc5' into secretlab/test-devicetree 2010-01-28 14:38:25 -07:00
Grant Likely
6016a363f6 of: unify phandle name in struct device_node
In struct device_node, the phandle is named 'linux_phandle' for PowerPC
and MicroBlaze, and 'node' for SPARC.  There is no good reason for the
difference, it is just an artifact of the code diverging over a couple
of years.  This patch renames both to simply .phandle.

Note: the .node also existed in PowerPC/MicroBlaze, but the only user
seems to be arch/powerpc/platforms/powermac/pfunc_core.c.  It doesn't
look like the assignment between .linux_phandle and .node is
significantly different enough to warrant the separate code paths
unless ibm,phandle properties actually appear in Apple device trees.

I think it is safe to eliminate the old .node property and use
phandle everywhere.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-28 14:06:53 -07:00
Grant Likely
923f7e30b4 of: Merge of_node_get() and of_node_put()
Merge common code between PowerPC and MicroBlaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-28 13:52:53 -07:00
Grant Likely
1f43cfb947 of: merge machine_is_compatible()
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-28 13:47:25 -07:00
Anton Blanchard
339ce1a4dc perf: Fix inconsistency between IP and callchain sampling
When running perf across all cpus with backtracing (-a -g), sometimes we
get samples without associated backtraces:

    23.44%         init  [kernel]                     [k] restore
    11.46%         init                       eeba0c  [k] 0x00000000eeba0c
     6.77%      swapper  [kernel]                     [k] .perf_ctx_adjust_freq
     5.73%         init  [kernel]                     [k] .__trace_hcall_entry
     4.69%         perf  libc-2.9.so                  [.] 0x0000000006bb8c
                       |
                       |--11.11%-- 0xfffa941bbbc

It turns out the backtrace code has a check for the idle task and the IP
sampling does not. This creates problems when profiling an interrupt
heavy workload (in my case 10Gbit ethernet) since we get no backtraces
for interrupts received while idle (ie most of the workload).

Right now x86 and sh check that current is not NULL, which should never
happen so remove that too.

Idle task's exclusion must be performed from the core code, on top
of perf_event_attr:exclude_idle.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
LKML-Reference: <20100118054707.GT12666@kryten>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-01-28 14:31:20 +01:00
Nathan Fontenot
d0174c7219 powerpc: Move cpu hotplug driver lock from pseries to powerpc
Move the defintion and lock helper routines for the cpu hotplug driver
lock from pseries to powerpc code to avoid build breaks for platforms
other than pseries that use cpu hotplug.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-15 13:26:18 +11:00
Nathan Fontenot
9becd2a0d6 powerpc: Move /proc/ppc64 to /proc/powerpc update
It looks like the previous patch sent out to move RTAS and
other items from /proc/ppc64 to /proc/powerpc missed a few
files needed for RAS and DLPAR functionality.

Original Patch here:
http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-September/076096.html

This patch updates the remaining files to be created under /proc/powerpc.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-15 13:26:17 +11:00
Joakim Tjernlund
061ec9599f powerpc/8xx: Fix user space TLB walk in dcbX fixup
The newly added fixup for buggy dcbX insn's has
a bug that always trigger a kernel TLB walk so a user space
dcbX insn will cause a Kernel Machine Check if it hits DTLB error.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-15 13:26:16 +11:00
Stefan Roese
3e7b484354 powerpc: Fix decrementer setup on 1GHz boards
We noticed that recent kernels didn't boot on our 1GHz Canyonlands 460EX
boards anymore. As it seems, patch 8d165db1 [powerpc: Improve
decrementer accuracy] introduced this problem. The routine div_sc()
overflows with shift = 32 resulting in this incorrect setup:

time_init: decrementer frequency = 1000.000012 MHz
time_init: processor frequency   = 1000.000012 MHz
clocksource: timebase mult[400000] shift[22] registered
clockevent: decrementer mult[33] shift[32] cpu[0]

This patch now introduces a local div_dc64() version of this function
so that this overflow doesn't happen anymore.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Detlev Zundel <dzu@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-15 13:26:15 +11:00
Anton Vorontsov
e443ed3560 powerpc/swsusp_32: Fix TLB invalidation
It seems there is a thinko in the TLB invalidation code that makes the
tlbie in the loop executed just once. The intended check was probably
'gt', not 'lt'.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-15 13:20:07 +11:00
Joakim Tjernlund
9f4f04ba2b powerpc/8xx: Always pin kernel instruction TLB
Various kernel asm modifies SRR0/SRR1 just before executing
a rfi. If such code crosses a page boundary you risk a TLB miss
which will clobber SRR0/SRR1. Avoid this by always pinning
kernel instruction TLB space.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-01-15 13:20:07 +11:00
Linus Torvalds
d661d76b02 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI/cardbus: Add a fixup hook and fix powerpc
  PCI: change PCI nomenclature in drivers/pci/ (non-comment changes)
  PCI: change PCI nomenclature in drivers/pci/ (comment changes)
  PCI: fix section mismatch on update_res()
  PCI: add Intel 82599 Virtual Function specific reset method
  PCI: add Intel USB specific reset method
  PCI: support device-specific reset methods
  PCI: Handle case when no pci device can provide cache line size hint
  PCI/PM: Propagate wake-up enable for PCIe devices too
  vgaarbiter: fix a typo in the vgaarbiter Documentation
2009-12-30 13:13:24 -08:00
Neil Campbell
bb7f20b1c6 powerpc: Handle VSX alignment faults correctly in little-endian mode
This patch fixes the handling of VSX alignment faults in little-endian
mode (the current code assumes the processor is in big-endian mode).

The patch also makes the handlers clear the top 8 bytes of the register
when handling an 8 byte VSX load.

This is based on 2.6.32.

Signed-off-by: Neil Campbell <neilc@linux.vnet.ibm.com>
Cc: <stable@kernel.org>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-18 14:55:43 +11:00
Benjamin Herrenschmidt
2d1c861871 PCI/cardbus: Add a fixup hook and fix powerpc
The cardbus code creates PCI devices without ever going through the
necessary fixup bits and pieces that normal PCI devices go through.

There's in fact a commented out call to pcibios_fixup_bus() in there,
it's commented because ... it doesn't work.

I could make pcibios_fixup_bus() do the right thing on powerpc easily
but I felt it cleaner instead to provide a specific hook pci_fixup_cardbus
for which a weak empty implementation is provided by the PCI core.

This fixes cardbus on powerbooks and probably all other PowerPC
platforms which was broken completely for ever on some platforms and
since 2.6.31 on others such as PowerBooks when we made the DMA ops
mandatory (since those are setup by the fixups).

Acked-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-12-16 18:55:51 -08:00
Linus Torvalds
a73611b6aa Merge branch 'next' of git://git.secretlab.ca/git/linux-2.6
* 'next' of git://git.secretlab.ca/git/linux-2.6: (23 commits)
  powerpc: fix up for mmu_mapin_ram api change
  powerpc: wii: allow ioremap within the memory hole
  powerpc: allow ioremap within reserved memory regions
  wii: use both mem1 and mem2 as ram
  wii: bootwrapper: add fixup to calc useable mem2
  powerpc: gamecube/wii: early debugging using usbgecko
  powerpc: reserve fixmap entries for early debug
  powerpc: wii: default config
  powerpc: wii: platform support
  powerpc: wii: hollywood interrupt controller support
  powerpc: broadway processor support
  powerpc: wii: bootwrapper bits
  powerpc: wii: device tree
  powerpc: gamecube: default config
  powerpc: gamecube: platform support
  powerpc: gamecube/wii: flipper interrupt controller support
  powerpc: gamecube/wii: udbg support for usbgecko
  powerpc: gamecube/wii: do not include PCI support
  powerpc: gamecube/wii: declare as non-coherent platforms
  powerpc: gamecube/wii: introduce GAMECUBE_COMMON
  ...

Fix up conflicts in arch/powerpc/mm/fsl_booke_mmu.c.

Hopefully even close to correctly.
2009-12-16 13:26:53 -08:00
Linus Torvalds
74f3ae7434 Merge branch 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'module' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  modpost: fix segfault with short symbol names
  module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y
  Kbuild: clear marker out of modpost
  module: make MODULE_SYMBOL_PREFIX into a CONFIG option
  ARM: unexport symbols used to implement floating point emulation
  ARM: use unified discard definition in linker script
  x86: don't export inline function
  sparc64: don't export static inline pci_ functions
2009-12-16 10:47:24 -08:00
Akinobu Mita
a66022c457 iommu-helper: use bitmap library
Use bitmap library and kill some unused iommu helper functions.

1. s/iommu_area_free/bitmap_clear/

2. s/iommu_area_reserve/bitmap_set/

3. Use bitmap_find_next_zero_area instead of find_next_zero_area

  This cannot be simple substitution because find_next_zero_area
  doesn't check the last bit of the limit in bitmap

4. Remove iommu_area_free, iommu_area_reserve, and find_next_zero_area

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:18 -08:00
Oleg Nesterov
25baa35bef ptrace: powerpc: implement user_single_step_siginfo()
Suggested by Roland.

Implement user_single_step_siginfo() for powerpc.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-16 07:20:08 -08:00
Rusty Russell
d4703aefdb module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y
powerpc applies relocations to the kcrctab.  They're absolute symbols,
but it's not completely unreasonable: other archs may too, but the
relocation is often 0.

http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-November/077972.html

Inspired-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Paul Mackerras <paulus@samba.org>
2009-12-15 16:28:34 +10:30
Thomas Gleixner
239007b844 genirq: Convert irq_desc.lock to raw_spinlock
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 23:55:33 +01:00
Thomas Gleixner
0199c4e68d locking: Convert __raw_spin* functions to arch_spin*
Name space cleanup. No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
edc35bd72e locking: Rename __RAW_SPIN_LOCK_UNLOCKED to __ARCH_SPIN_LOCK_UNLOCKED
Further name space cleanup. No functional change

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Thomas Gleixner
445c89514b locking: Convert raw_spinlock to arch_spinlock
The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.

Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
2009-12-14 23:55:32 +01:00
Linus Torvalds
d0316554d3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c
2009-12-14 09:58:24 -08:00
Albert Herranz
d1d56f8c1d powerpc: gamecube/wii: early debugging using usbgecko
Add support for using the USB Gecko adapter as an early debugging
console on the Nintendo GameCube and Wii video game consoles.
The USB Gecko is a 3rd party memory card interface adapter that provides
a EXI (External Interface) to USB serial converter.

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-12-12 22:24:31 -07:00
Albert Herranz
45158dc7d6 powerpc: broadway processor support
This patch extends the cputable entry of the 750CL to also match
the 750CL-based "Broadway" cpu found on the Nintendo Wii.

As of this patch, the following "Broadway" design revision levels have
been seen in the wild:
- DD1.2 (87102)
- DD2.0 (87200)

Signed-off-by: Albert Herranz <albert_herranz@yahoo.es>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-12-12 22:24:29 -07:00
Linus Torvalds
09cea96caa Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (151 commits)
  powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
  MAINTAINERS: Add PowerPC patterns
  powerpc/pseries: Track previous CPPR values to correctly EOI interrupts
  powerpc/pseries: Correct pseries/dlpar.c build break without CONFIG_SMP
  powerpc: Make "intspec" pointers in irq_host->xlate() const
  powerpc/8xx: DTLB Miss cleanup
  powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
  powerpc/8xx: Start using dcbX instructions in various copy routines
  powerpc/8xx: Restore _PAGE_WRITETHRU
  powerpc/8xx: Add missing Guarded setting in DTLB Error.
  powerpc/8xx: Fixup DAR from buggy dcbX instructions.
  powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
  powerpc/8xx: Invalidate non present TLBs
  powerpc/pseries: Serialize cpu hotplug operations during deactivate Vs deallocate
  pseries/pseries: Add code to online/offline CPUs of a DLPAR node
  powerpc: stop_this_cpu: remove the cpu from the online map.
  powerpc/pseries: Add kernel based CPU DLPAR handling
  sysfs/cpu: Add probe/release files
  powerpc/pseries: Kernel DLPAR Infrastructure
  ...
2009-12-12 14:27:24 -08:00
Al Viro
f8b7256096 Unify sys_mmap*
New helper - sys_mmap_pgoff(); switch syscalls to using it.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-11 06:44:29 -05:00
Grant Likely
86e0322134 of/flattree: merge early_init_dt_scan_chosen()
Merge common code between PowerPC and Microblaze.  This patch
splits the arch-specific stuff out into a new function,
early_init_dt_scan_chosen_arch().

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-10 23:42:21 -07:00
Grant Likely
0f0b56c3f2 of/flattree: eliminate cell_t typedef
A cell is firmly established as a u32.  No need to do an ugly typedef
to redefine it to cell_t.  Eliminate the unnecessary typedef so that
it doesn't have to be added to the of_fdt header file

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-10 23:42:17 -07:00
Grant Likely
83f7a06eb4 of/flattree: merge dt_mem_next_cell
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
2009-12-10 15:23:15 -07:00
Grant Likely
f00abd9491 of/flattree: Merge earlyinit_dt_scan_root()
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
2009-12-10 15:23:15 -07:00
Grant Likely
f7b3a8355b of/flattree: Merge early_init_dt_check_for_initrd()
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-10 15:18:03 -07:00
Linus Torvalds
4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Benjamin Herrenschmidt
e090aa8032 powerpc: Fix usage of 64-bit instruction in 32-bit altivec code
e821ea70f3 introduced a bug by copying
some 64-bit originated code as-is to be used by both 32 and 64-bit
but this code contains a 64-bit ony "cmpdi" instruction.

This changes it to cmpwi, which is fine since VRSAVE can only contains
a 32-bit value anyway.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@kernel.org>
2009-12-09 18:10:12 +11:00
Benjamin Herrenschmidt
bcd6acd51f Merge commit 'origin/master' into next
Conflicts:
	include/linux/kvm.h
2009-12-09 17:14:38 +11:00
Roman Fietze
40d50cf7ca powerpc: Make "intspec" pointers in irq_host->xlate() const
Writing a driver using SCLPC on the MPC5200B I detected, that the
intspec arrays to map irqs to Linux virq cannot be const, because the
mapping and xlate functions only take non const pointers. All those
functions do not modify the intspec, so a const pointer could be used.

Signed-off-by: Roman Fietze <roman.fietze@telemotive.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:37 +11:00
Joakim Tjernlund
990d89c663 powerpc/8xx: DTLB Miss cleanup
Use symbolic constant for PRESENT and avoid branching.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:37 +11:00
Joakim Tjernlund
2321f33790 powerpc/8xx: Remove DIRTY pte handling in DTLB Error.
There is no need to do set the DIRTY bit directly in DTLB Error.
Trap to do_page_fault() and let the generic MM code do the work.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:37 +11:00
Joakim Tjernlund
15d914d72a powerpc/8xx: Start using dcbX instructions in various copy routines
Now that 8xx can fixup dcbX instructions, start using them
where possible like every other PowerPc arch do.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:37 +11:00
Joakim Tjernlund
0c4661698c powerpc/8xx: Restore _PAGE_WRITETHRU
8xx has not had WRITETHRU due to lack of bits in the pte.
After the recent rewrite of the 8xx TLB code, there are
two bits left. Use one of them to WRITETHRU.

Perhaps use the last SW bit to PAGE_SPECIAL or PAGE_FILE?

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:37 +11:00
Joakim Tjernlund
4a280a413c powerpc/8xx: Add missing Guarded setting in DTLB Error.
only DTLB Miss did set this bit, DTLB Error needs too otherwise
the setting is lost when the page becomes dirty.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:36 +11:00
Joakim Tjernlund
0a2ab51ffb powerpc/8xx: Fixup DAR from buggy dcbX instructions.
This is an assembler version to fixup DAR not being set
by dcbX, icbi instructions. There are two versions, one
uses selfmodifing code, the other uses a
jump table but is much bigger(default).

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:36 +11:00
Joakim Tjernlund
60e071fee9 powerpc/8xx: Tag DAR with 0x00f0 to catch buggy instructions.
dcbz, dcbf, dcbi, dcbst and icbi do not set DAR when they
cause a DTLB Error. Dectect this by tagging DAR with 0x00f0
at every exception exit that modifies DAR.
Test for DAR=0x00f0 in DataTLBError and bail
to handle_page_fault().

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:36 +11:00
Joakim Tjernlund
fe11dc3f96 powerpc/8xx: Update TLB asm so it behaves as linux mm expects.
Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
Get rid of _PAGE_HWWRITE too.
Pros:
 - I/D TLB Miss never needs to write to the linux pte.
 - _PAGE_ACCESSED is only set on TLB Error fixing accounting
 - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
    when a page has been made dirty.
 - Proper RO/RW mapping of user space.
 - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
 - kernel RO/user NA support.
Cons:
 - A few more instructions in the TLB Miss routines.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:10:36 +11:00
Benjamin Herrenschmidt
8c82da5e24 Merge commit 'gcl/next' into next 2009-12-09 17:10:22 +11:00
Valentine Barshak
8389b37dff powerpc: stop_this_cpu: remove the cpu from the online map.
Remove the CPU from the online map to prevent smp_call_function
from sending messages to a stopped CPU.

Signed-off-by: Valentine Barshak <vbarshak@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:09:34 +11:00
Nathan Fontenot
12633e803a sysfs/cpu: Add probe/release files
Version 3 of this patch is updated with documentation added to
Documentation/ABI.  There are no changes to any of the C code from v2
of the patch.

In order to support kernel DLPAR of CPU resources we need to provide an
interface to add (probe) and remove (release) the resource from the system.
This patch Creates new generic probe and release sysfs files to facilitate
cpu probe/release.  The probe/release interface provides for allowing each
arch to supply their own routines for implementing the backend of adding
and removing cpus to/from the system.

This also creates the powerpc specific stubs to handle the arch callouts
from writes to the sysfs files.

The creation and use of these files is regulated by the
CONFIG_ARCH_CPU_PROBE_RELEASE option so that only architectures that need the
capability will have the files created.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-12-09 17:09:33 +11:00
Linus Torvalds
fbf07eac7b Merge branch 'timers-for-linus-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hrtimer: Fix /proc/timer_list regression
  itimers: Fix racy writes to cpu_itimer fields
  timekeeping: Fix clock_gettime vsyscall time warp
2009-12-08 19:28:09 -08:00
Linus Torvalds
60d8ce2cd6 Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  timers, init: Limit the number of per cpu calibration bootup messages
  posix-cpu-timers: optimize and document timer_create callback
  clockevents: Add missing include to pacify sparse
  x86: vmiclock: Fix printk format
  x86: Fix printk format due to variable type change
  sparc: fix printk for change of variable type
  clocksource/events: Fix fallout of generic code changes
  nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
  nohz: Track last do_timer() cpu
  nohz: Prevent clocksource wrapping during idle
  nohz: Type cast printk argument
  mips: Use generic mult/shift factor calculation for clocks
  clocksource: Provide a generic mult/shift factor calculation
  clockevents: Use u32 for mult and shift factors
  nohz: Introduce arch_needs_cpu
  nohz: Reuse ktime in sub-functions of tick_check_idle.
  time: Remove xtime_cache
  time: Implement logarithmic time accumulation
2009-12-08 19:27:08 -08:00
Linus Torvalds
3ad1f3b35e Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
  of: merge of_find_all_nodes() implementations
  of: merge other miscellaneous prototypes
  of: merge of_*_flat_dt*() functions
  of: merge of_node_get(), of_node_put() and of_find_all_nodes()
  of: merge of_read_number() an of_read_ulong()
  of: merge of_node_*_flag() and set_node_proc_entry()
  of: merge struct boot_param_header from Microblaze and PowerPC
  of: add common header for flattened device tree representation
  of: Move OF_IS_DYNAMIC and OF_MARK_DYNAMIC macros to of.h
  of: merge struct device_node
  of: merge phandle, ihandle and struct property
  of: Rework linux/of.h and asm/prom.h include ordering
2009-12-08 07:46:56 -08:00
Linus Torvalds
1557d33007 Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits)
  security/tomoyo: Remove now unnecessary handling of security_sysctl.
  security/tomoyo: Add a special case to handle accesses through the internal proc mount.
  sysctl: Drop & in front of every proc_handler.
  sysctl: Remove CTL_NONE and CTL_UNNUMBERED
  sysctl: kill dead ctl_handler definitions.
  sysctl: Remove the last of the generic binary sysctl support
  sysctl net: Remove unused binary sysctl code
  sysctl security/tomoyo: Don't look at ctl_name
  sysctl arm: Remove binary sysctl support
  sysctl x86: Remove dead binary sysctl support
  sysctl sh: Remove dead binary sysctl support
  sysctl powerpc: Remove dead binary sysctl support
  sysctl ia64: Remove dead binary sysctl support
  sysctl s390: Remove dead sysctl binary support
  sysctl frv: Remove dead binary sysctl support
  sysctl mips/lasat: Remove dead binary sysctl support
  sysctl drivers: Remove dead binary sysctl support
  sysctl crypto: Remove dead binary sysctl support
  sysctl security/keys: Remove dead binary sysctl support
  sysctl kernel: Remove binary sysctl logic
  ...
2009-12-08 07:38:50 -08:00
Jiri Kosina
d014d04386 Merge branch 'for-next' into for-linus
Conflicts:

	kernel/irq/chip.c
2009-12-07 18:36:35 +01:00
Linus Torvalds
c3fa27d136 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (470 commits)
  x86: Fix comments of register/stack access functions
  perf tools: Replace %m with %a in sscanf
  hw-breakpoints: Keep track of user disabled breakpoints
  tracing/syscalls: Make syscall events print callbacks static
  tracing: Add DEFINE_EVENT(), DEFINE_SINGLE_EVENT() support to docbook
  perf: Don't free perf_mmap_data until work has been done
  perf_event: Fix compile error
  perf tools: Fix _GNU_SOURCE macro related strndup() build error
  trace_syscalls: Remove unused syscall_name_to_nr()
  trace_syscalls: Simplify syscall profile
  trace_syscalls: Remove duplicate init_enter_##sname()
  trace_syscalls: Add syscall_nr field to struct syscall_metadata
  trace_syscalls: Remove enter_id exit_id
  trace_syscalls: Set event_enter_##sname->data to its metadata
  trace_syscalls: Remove unused event_syscall_enter and event_syscall_exit
  perf_event: Initialize data.period in perf_swevent_hrtimer()
  perf probe: Simplify event naming
  perf probe: Add --list option for listing current probe events
  perf probe: Add argv_split() from lib/argv_split.c
  perf probe: Move probe event utility functions to probe-event.c
  ...
2009-12-05 15:30:21 -08:00
André Goddard Rosa
af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
FUJITA Tomonori
7c31629fd6 powerpc: Kill unused swiotlb variable
We can kill unused swiotlb variable.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-24 17:00:24 +11:00
Stephen Rothwell
c38c2044b2 powerpc: do not export pci_alloc/free_consistent
Since they are static inline functions.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-24 17:00:23 +11:00
Benjamin Herrenschmidt
a0592d42fe powerpc: kill the obsolete code under is_global_init()
The code under "if (is_global_init())" is bogus, and is_global_init()
itself is not right in mt case.

Contrary to what the comment says, nowadays force_sig_info() does kill
init even if the handler is SIG_DFL. Note that force_sig_info() clears
SIGNAL_UNKILLABLE exactly for this case.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-24 17:00:22 +11:00
Thomas Gleixner
b27df67248 powerpc: Fixup last users of irq_chip->typename
The typename member of struct irq_chip was kept for migration purposes
and is obsolete since more than 2 years. Fix up the leftovers.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-24 14:32:45 +11:00
Thomas Gleixner
3b03fecd12 powerpc: Use unlocked ioctl in nvram_64
The ioctl is only used for powermac systems and reads a partition
number from an array which is initialized at boot time way before the
nvram code is initialized. So it's safe to switch to unlocked_ioctl.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-24 14:31:26 +11:00
Grant Likely
02af11b03f of: merge prom_{add,remove,modify}_property
Merge common code between PowerPC and MicroBlaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 20:16:45 -07:00
Grant Likely
41f880091c of/flattree: Merge unflatten_device_tree
Merge common code between PowerPC and MicroBlaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 20:07:01 -07:00
Grant Likely
bbd33931a0 of/flattree: Merge unflatten_dt_node
Merge common code between PowerPC and MicroBlaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 20:07:00 -07:00
Grant Likely
00e38efd90 of/flattree: Merge of_flat_dt_is_compatible
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 20:07:00 -07:00
Grant Likely
ca900cfa29 of/flattree: merge of_get_flat_dt_prop
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 20:06:59 -07:00
Grant Likely
819d281930 of/flattree: merge of_get_flat_dt_root
Merge common code between PowerPC and MicroBlaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 19:44:23 -07:00
Grant Likely
c8cb7a5984 of/flattree: merge of_scan_flat_dt
Merge common code between PowerPC and Microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 18:54:23 -07:00
Grant Likely
e169cfbef4 of/flattree: merge find_flat_dt_string and initial_boot_params
Merge common code between Microblaze and PowerPC.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Wolfram Sang <w.sang@pengutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu>
2009-11-23 14:53:09 -07:00
Grant Likely
2cfcadde83 Merge commit 'v2.6.32-rc8' 2009-11-23 14:49:36 -07:00
Kumar Gala
8b27f0b61d powerpc/fsl-booke: Rework TLB CAM code
Re-write the code so its more standalone and fixed some issues:
* Bump'd # of CAM entries to 64 to support e500mc
* Make the code handle MAS7 properly
* Use pr_cont instead of creating a string as we go

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-11-20 16:45:33 -06:00
Eric W. Biederman
6d4561110a sysctl: Drop & in front of every proc_handler.
For consistency drop & in front of every proc_handler.  Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-18 08:37:40 -08:00
Lin Ming
0696b711e4 timekeeping: Fix clock_gettime vsyscall time warp
Since commit 0a544198 "timekeeping: Move NTP adjusted clock multiplier
to struct timekeeper" the clock multiplier of vsyscall is updated with
the unmodified clock multiplier of the clock source and not with the
NTP adjusted multiplier of the timekeeper.

This causes user space observerable time warps:
new CLOCK-warp maximum: 120 nsecs,  00000025c337c537 -> 00000025c337c4bf

Add a new argument "mult" to update_vsyscall() and hand in the
timekeeping internal NTP adjusted multiplier.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: "Zhang Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1258436990.17765.83.camel@minggr.sh.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-11-17 11:52:34 +01:00
Eric W. Biederman
bb9074ff58 Merge commit 'v2.6.32-rc7'
Resolve the conflict between v2.6.32-rc7 where dn_def_dev_handler
gets a small bug fix and the sysctl tree where I am removing all
sysctl strategy routines.
2009-11-17 01:01:34 -08:00
Ingo Molnar
0ffa798d94 Merge branches 'perf/powerpc' and 'perf/bench' into perf/core
Merge reason: Both 'perf bench' and the pending PowerPC changes
              are now ready for the next merge window.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-15 09:51:24 +01:00
Thomas Gleixner
a362c638bd clocksource/events: Fix fallout of generic code changes
powerpc grew a new warning due to the type change of clockevent->mult.

The architectures which use parts of the generic time keeping
infrastructure tripped over my wrong assumption that
clocksource_register is only used when GENERIC_TIME=y.

I should have looked and also I should have known better. These
renitent Gaul villages are racking my nerves. Some serious deprecating
is due.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-11-14 00:35:52 +01:00
Eric W. Biederman
26ea513558 sysctl powerpc: Remove dead binary sysctl support
Now that sys_sysctl is a generic wrapper around /proc/sys  .ctl_name
and .strategy members of sysctl tables are dead code.  Remove them.

Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-12 02:05:02 -08:00
Benjamin Herrenschmidt
0526484aa3 Merge commit 'origin/master' into next 2009-11-12 10:59:04 +11:00
FUJITA Tomonori
ad32e8cb86 swiotlb: Defer swiotlb init printing, export swiotlb_print_info()
This enables us to avoid printing swiotlb memory info when we
initialize swiotlb. After swiotlb initialization, we could find
that we don't need swiotlb.

This patch removes the code to print swiotlb memory info in
swiotlb_init() and exports the function to do that.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
Cc: tony.luck@intel.com
Cc: benh@kernel.crashing.org
LKML-Reference: <1257849980-22640-9-git-send-email-fujita.tomonori@lab.ntt.co.jp>
[ -v2: merge up conflict ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:32:00 +01:00
Dirk Hohndel
06fe9fb418 tree-wide: fix a very frequent spelling mistake
something-bility is spelled as something-blity
so a grep for 'blit' would find these lines

this is so trivial that I didn't split it by subsystem / copy
additional maintainers - all changes are to comments
The only purpose is to get fewer false positives when grepping
around the kernel sources.

Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-11-09 09:40:54 +01:00
Eric W. Biederman
da3f6f9b3e sysctl: Introduce a generic compat sysctl sysctl
This uses compat_alloc_userspace to remove the various
hacks to allow do_sysctl to write to throuh oldlenp.

The rest of our mature compat syscall helper facitilies
are used as well to ensure we have a nice clean maintainable
compat syscall that can be used on all architectures.

The motiviation for a generic compat sysctl (besides the
obvious hack removal) is to reduce the number of compat
sysctl defintions out there so I can refactor the
binary sysctl implementation.

ppc already used the name compat_sys_sysctl so I remove the
ppcs version here.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2009-11-06 03:52:55 -08:00
Benjamin Herrenschmidt
978d7eb31d powerpc: Avoid giving out RTC dates below EPOCH
Doing so causes xtime to be negative which crashes the timekeeping
code in funny ways when doing suspend/resume

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 17:06:21 +11:00
Alexander Graf
55c758840a Export new PACA constants in asm-offsets
In order to access fields in the PACA from assembly code, we need
to generate offsets using asm-offsets.c.

So let's add the new PACA related bits, we just introduced!

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:50:26 +11:00
Alexander Graf
4ab79aa801 Export symbols for KVM module
We want to be able to build KVM as a module. To enable us doing so, we
need some more exports from core Linux parts.

This patch exports all functions and variables that are required for KVM.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:50:24 +11:00
Alexander Graf
62908905b2 Add Book3s_64 offsets to asm-offsets.c
We need to access some VCPU fields from assembly code. In order to get
the proper offsets, we have to define them in asm-offsets.c.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:57 +11:00
Alexander Graf
842f2fedcd Make head_64.S aware of KVM real mode code
We need to run some KVM trampoline code in real mode. Unfortunately, real mode
only covers 8MB on Cell so we need to squeeze ourselves as low as possible.

Also, we need to trap interrupts to get us back from guest state to host state
without telling Linux about it.

This patch adds interrupt traps and includes the KVM code that requires real
mode in the real mode parts of Linux.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-11-05 16:49:57 +11:00
Albrecht Dreß
a22c65f81a powerpc: tiny memcpy_(to|from)io optimisation
This trivial patch changes memcpy_(to|from)io as to transfer as many
32-bit words as possible in 32-bit accesses (in the current solution,
the last 32-bit word was transferred as 4 byte accesses).

Signed-off-by: Albrecht Dreß <albrecht.dress@arcor.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-11-04 16:43:12 -07:00
Michael Ellerman
cd01570717 powerpc: Enable sparse irq_descs on powerpc
Defining CONFIG_SPARSE_IRQ enables generic code that gets rid of the
static irq_desc array, and replaces it with an array of pointers to
irq_descs.

It also allows node local allocation of irq_descs, however we
currently don't have the information available to do that, so we just
allocate them on all on node 0.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:21:31 +11:00
Michael Ellerman
750ab11291 powerpc: Rearrange and fix show_interrupts() for sparse irq_descs
Move the default case out of the if, ie. when we're just displaying
an irq. And consolidate all the odd cases at the top, ie. printing
the header and footer.

And in the process cope with sparse irq_descs.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:21:31 +11:00
Michael Ellerman
76f1d94f3e powerpc: Make virq_debug_show() cope with sparse irq_descs
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:21:30 +11:00
Thomas Gleixner
32c105c378 powerpc/nvram_64: Mark init code __init
Mark all functions which are only called from nvram_init() __init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:21:29 +11:00
Thomas Gleixner
fd62c6c448 powerpc/nvram_64: Check nvram_error_log_index in nvram_clear_error_log()
nvram_clear_error_log() calls ppc_md.nvram_write() even when
nvram_error_log_index is -1 (invalid). The nvram_write() function does
not check for a negative offset.

Check nvram_error_log_index as the other nvram log functions do.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:21:28 +11:00
Thomas Gleixner
ae7dd0208f powerpc/nvram_64: Remove unused code
nvram_find_partition() has no user. The call site was removed in the
arch/powerpc move, but the function stayed. Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:21:28 +11:00
David Gibson
a4fe3ce769 powerpc/mm: Allow more flexible layouts for hugepage pagetables
Currently each available hugepage size uses a slightly different
pagetable layout: that is, the bottem level table of pointers to
hugepages is a different size, and may branch off from the normal page
tables at a different level.  Every hugepage aware path that needs to
walk the pagetables must therefore look up the hugepage size from the
slice info first, and work out the correct way to walk the pagetables
accordingly.  Future hardware is likely to add more possible hugepage
sizes, more layout options and more mess.

This patch, therefore reworks the handling of hugepage pagetables to
reduce this complexity.  In the new scheme, instead of having to
consult the slice mask, pagetable walking code can check a flag in the
PGD/PUD/PMD entries to see where to branch off to hugepage pagetables,
and the entry also contains the information (eseentially hugepage
shift) necessary to then interpret that table without recourse to the
slice mask.  This scheme can be extended neatly to handle multiple
levels of self-describing "special" hugepage pagetables, although for
now we assume only one level exists.

This approach means that only the pagetable allocation path needs to
know how the pagetables should be set out.  All other (hugepage)
pagetable walking paths can just interpret the structure as they go.

There already was a flag bit in PGD/PUD/PMD entries for hugepage
directory pointers, but it was only used for debug.  We alter that
flag bit to instead be a 0 in the MSB to indicate a hugepage pagetable
pointer (normally it would be 1 since the pointer lies in the linear
mapping).  This means that asm pagetable walking can test for (and
punt on) hugepage pointers with the same test that checks for
unpopulated page directory entries (beq becomes bge), since hugepage
pointers will always be positive, and normal pointers always negative.

While we're at it, we get rid of the confusing (and grep defeating)
#defining of hugepte_shift to be the same thing as mmu_huge_psizes.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:20:58 +11:00
Michael Ellerman
6cff46f4bc powerpc: Remove get_irq_desc()
get_irq_desc() is a powerpc-specific version of irq_to_desc(). That
is reason enough to remove it, but it also doesn't know about sparse
irq_desc support which irq_to_desc() does (when we enable it).

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:20:55 +11:00
Benjamin Herrenschmidt
3d541c4b7f powerpc/chrp: Use the same RTAS daemon as pSeries
The CHRP code has some fishy timer based code to scan the RTAS event
log, which uses a 1KB stack buffer and doesn't even use the results.

The pSeries code as a nicer daemon that allows userspace to read the
event log and basically uses the same RTAS interface

This patch moves rtasd.c out of platform/pseries and makes it usable
by CHRP, after removing the old crufty event log mechanism in there.

The nvram logging part of the daemon is still only available on 64-bit
since the underlying nvram management routines aren't currently shared.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:20:53 +11:00
Benjamin Herrenschmidt
188917e183 powerpc: Move /proc/ppc64 to /proc/powerpc and add symlink
Some of the stuff in /proc/ppc64 such as the RTAS bits are actually
useful to some 32-bit platforms. Rename the file, and create a
symlink on 64-bit for backward compatibility

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-30 17:20:53 +11:00
Tejun Heo
6b7487fc65 percpu: make percpu symbols in powerpc unique
This patch updates percpu related symbols in powerpc such that percpu
symbols are unique and don't clash with local symbols.  This serves
two purposes of decreasing the possibility of global percpu symbol
collision and allowing dropping per_cpu__ prefix from percpu symbols.

* arch/powerpc/kernel/perf_callchain.c: s/callchain/cpu_perf_callchain/

* arch/powerpc/kernel/setup-common.c: s/pvr/cpu_pvr/

* arch/powerpc/platforms/pseries/dtl.c: s/dtl/cpu_dtl/

* arch/powerpc/platforms/cell/interrupt.c: s/iic/cpu_iic/

Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
which cause name clashes" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
2009-10-29 22:34:14 +09:00
Anton Blanchard
c86e2eaded powerpc: perf_event: Cleanup output by adding symbols
Add some dummy symbols for the branches at 0xf00, 0xf20 and 0xf40,
otherwise hits end up in trap_0e which is confusing to the user.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:05 +11:00
Anton Blanchard
917e407c76 powerpc: perf_event: Hide iseries_check_pending_irqs
If CONFIG_PPC_ISERIES isn't defined we end up with
iseries_check_pending_irqs and do_work at the same address.
perf ends up picking iseries_check_pending_irqs which creates
confusing backtraces.  Hide it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:05 +11:00
Anton Blanchard
907b1f45d9 powerpc: Export powerpc_debugfs_root
Kernel modules should be able to place their debug output inside our
powerpc debugfs directory.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:04 +11:00
Anton Blanchard
6795b85c6a powerpc: tracing: Add powerpc tracepoints for timer entry and exit
We can monitor the effectiveness of our power management of both the
kernel and hypervisor by probing the timer interrupt. For example, on
this box we see 10.37s timer interrupts on an idle core:

<idle>-0     [010]  3900.671297: timer_interrupt_entry: pt_regs=c0000000ce1e7b10
<idle>-0     [010]  3900.671302: timer_interrupt_exit: pt_regs=c0000000ce1e7b10

<idle>-0     [010]  3911.042963: timer_interrupt_entry: pt_regs=c0000000ce1e7b10
<idle>-0     [010]  3911.042968: timer_interrupt_exit: pt_regs=c0000000ce1e7b10

<idle>-0     [010]  3921.414630: timer_interrupt_entry: pt_regs=c0000000ce1e7b10
<idle>-0     [010]  3921.414635: timer_interrupt_exit: pt_regs=c0000000ce1e7b10

Since we have a 207MHz decrementer it will go negative and fire every 10.37s
even if Linux is completely idle.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:03 +11:00
Anton Blanchard
1bf4af1650 powerpc: tracing: Add powerpc tracepoints for interrupt entry and exit
This adds powerpc-specific tracepoints for interrupt entry and exit.

While we already have generic irq_handler_entry and irq_handler_exit
tracepoints there are cases on our virtualised powerpc machines where an
interrupt is presented to the OS, but subsequently handled by the hypervisor.
This means no OS interrupt handler is invoked.

Here is an example on a POWER6 machine with the patch below applied:

<idle>-0     [006]  3243.949840744: irq_entry: pt_regs=c0000000ce31fb10
<idle>-0     [006]  3243.949850520: irq_exit: pt_regs=c0000000ce31fb10

<idle>-0     [007]  3243.950218208: irq_entry: pt_regs=c0000000ce323b10
<idle>-0     [007]  3243.950224080: irq_exit: pt_regs=c0000000ce323b10

<idle>-0     [000]  3244.021879320: irq_entry: pt_regs=c000000000a63aa0
<idle>-0     [000]  3244.021883616: irq_handler_entry: irq=87 handler=eth0
<idle>-0     [000]  3244.021887328: irq_handler_exit: irq=87 return=handled
<idle>-0     [000]  3244.021897408: irq_exit: pt_regs=c000000000a63aa0

Here we see two phantom interrupts (no handler was invoked), followed
by a real interrupt for eth0. Without the tracepoints in this patch we
would have missed the phantom interrupts.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:03 +11:00
Anton Blanchard
eecff81d1f powerpc: Create PPC_WARN_ALIGNMENT to match PPC_WARN_EMULATED
perf_event wants a separate event for alignment and emulation faults,
so create another emulation event.  This will make it easy to hook in
perf_event at one spot.

We pass in regs which will be required for these events.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:03 +11:00
Anton Blanchard
81cd5ae303 powerpc: perf_event: Enable SDAR in continous sample mode
In continuous sampling mode we want the SDAR to update.  While we can
select between dcache misses and ERAT (L1-TLB) misses, a decent default
is to enable both.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:02 +11:00
Anton Blanchard
bc284e5d9d powerpc: perf_event: Log invalid data addresses as all 1s
When we take an exception and the SDAR isn't synchronised we currently
log 0 as the address.  Unfortunately this is a pretty common value, so
use ~0UL instead.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-10-28 16:13:02 +11:00
Benjamin Herrenschmidt
4f917ba3d5 powerpc/ppc64: Use preempt_schedule_irq instead of preempt_schedule
Based on an original patch by Valentine Barshak <vbarshak@ru.mvista.com>

Use preempt_schedule_irq to prevent infinite irq-entry and
eventual stack overflow problems with fast-paced IRQ sources.

This kind of problems has been observed on the PASemi Electra IDE
controller. We have to make sure we are soft-disabled before calling
preempt_schedule_irq and hard disable interrupts after that
to avoid unrecoverable exceptions.

This patch also moves the "clrrdi r9,r1,THREAD_SHIFT" out of
the #ifdef CONFIG_PPC_BOOK3E scope, since r9 is clobbered
and has to be restored in both cases.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:43 +11:00
Kumar Gala
ce7a35c73a powerpc: Fix compile errors found by new ppc64e_defconfig
Fix the following 3 issues:

arch/powerpc/kernel/process.c: In function 'arch_randomize_brk':
arch/powerpc/kernel/process.c:1183: error: 'mmu_highuser_ssize' undeclared (first use in this function)
arch/powerpc/kernel/process.c:1183: error: (Each undeclared identifier is reported only once
arch/powerpc/kernel/process.c:1183: error: for each function it appears in.)
arch/powerpc/kernel/process.c:1183: error: 'MMU_SEGSIZE_1T' undeclared (first use in this function)

In file included from arch/powerpc/kernel/setup_64.c:60:
arch/powerpc/include/asm/mmu-hash64.h:132: error: redefinition of 'struct mmu_psize_def'
arch/powerpc/include/asm/mmu-hash64.h:159: error: expected identifier or '(' before numeric constant
arch/powerpc/include/asm/mmu-hash64.h:396: error: conflicting types for 'mm_context_t'
arch/powerpc/include/asm/mmu-book3e.h:184: error: previous declaration of 'mm_context_t' was here

cc1: warnings being treated as errors
arch/powerpc/kernel/pci_64.c: In function 'pcibios_unmap_io_space':
arch/powerpc/kernel/pci_64.c💯 error: unused variable 'res'

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:41 +11:00
Andreas Schwab
348aa30300 powerpc: Align vDSO base address
The ABI specifies a 64K alignment, we need to map the vDSO accordingly

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:40 +11:00
Andreas Schwab
7de80284d6 powerpc: Fix segment mapping in vdso32
Due to missing segment assignments the .text section was put in the NOTES
segment (and marked as NOTE section), and the .got was put in the DYNAMIC
segment.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:40 +11:00
Michael Neuling
7abb840b49 powerpc/perf_events: Fix priority of MSR HV vs PR bits
The architecture defines that if MSR PR is set we are in problem state
irrespective of the HV bit.  This fixes perf events to reflect this.

Also, on bare metal systems, samples taken in Linux will now be reported
as kernel rather than hypervisor.

Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: paulus@samba.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-27 16:42:38 +11:00
Grant Likely
e91edcf5a2 of: merge of_find_all_nodes() implementations
Merge common code between Microblaze and PowerPC, and make it available
to Sparc

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Stephen Neuendorffer <stephen.neuendorffer@xilinx.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
2009-10-15 10:58:09 -06:00
Benjamin Herrenschmidt
b734dd5b57 Merge commit 'ftrace/ppc' into merge 2009-10-15 14:09:11 +11:00
Heiko Schocher
0f6023d599 powerpc/pci: Fix MODPOST warning
making a powerpc target with PCI support, shows the
following warning:

  MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x10430): Section mismatch in reference from the
function pcibios_allocate_bus_resources() to the function .init.text:reparent_resources()

The function pcibios_allocate_bus_resources() references
the function __init reparent_resources().

This is often because pcibios_allocate_bus_resources lacks a __init
annotation or the annotation of reparent_resources is wrong.

This patch fix this warning by removing the __init
annotation before reparent_resources.

Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-14 16:58:40 +11:00
Dragos Tatulea
04f5653477 powerpc/oprofile: Add ppc750 CL as supported by oprofile
Here's a patch that adds the ppc750 CL cpu as supported by oprofile.

Signed-off-by: Dragos Tatulea <dtatulea@ixiacom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-14 16:58:39 +11:00
Sean MacLennan
5be2a213b1 powerpc: warning: allocated section `.data_nosave' not in segment
We need to align before the output section. Having the align inside
the output section causes the linker to put some filler in there,
which makes it a non-empty section, but this section isn't assigned to
a segment so you get a warning from the linker.

Signed-off-by: Sean MacLennan <smaclennan@pikatech.com>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-14 16:58:39 +11:00
Anton Vorontsov
cf50f447b2 powerpc/kgdb: Fix build failure caused by "kgdb.c: unused variable 'acc'"
'acc' isn't used anywhere and thus triggers gcc warning, which causes
build error with CONFIG_PPC_DISABLE_WERROR=n (default):

  cc1: warnings being treated as errors
  arch/powerpc/kernel/kgdb.c: In function 'gdb_regs_to_pt_regs':
  arch/powerpc/kernel/kgdb.c:289: warning: unused variable 'acc'
  make[1]: *** [arch/powerpc/kernel/kgdb.o] Error 1

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-10-14 16:58:38 +11:00
Steven Rostedt
be10ab1090 powerpc64/ftrace: use PACA to retrieve TOC in mod_return_to_handler
The mod_return_to_handler needs to switch to the kernel TOC before
jumping to a the kernel code. It currently does this by looking
at the kernel function data and retrieves the TOC that way.

Not only is this inefficient, it also breaks with a relocatable kernel.
The PACA contains the kernel TOC and we can easily retrieve it that
way.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-13 14:20:56 -07:00
Steven Rostedt
9135c3cc5a powerpc/ftrace: show real return addresses in modules
When the function graph tracer is enabled, it replaces the return address
with a hook back to the tracer. This makes back traces see the hook instead
of the actual return address.

The current code also shows the real address by checking if the return
address jumps to the return_to_handler. If it is, is also prints out
the saved real return address.

On powerpc64, some modules may return to mod_return_to_handler, which
is not checked. This patch will also show the real address if a return
is to mod_return_to_handler as well.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-10-13 14:20:55 -07:00
Tim Abbott
142597dbbd powerpc: Cleanup linker script using new linker script macros.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:48 +10:00
Anton Blanchard
049d049706 powerpc: Fix ibm,client-architecture-support printout
On machines without the ibm,client-architecture-support call we were missing a
newline. We may as well print the full name in all its glory too - its
ibm,client-architecture-support, not ibm,client-architecture as I mistakenly
wrote (a name only an IBM architect could love).

For my penance I will write out ibm,client-architecture-support 100 times.

Before:

Calling ibm,client-architecture...command line: root=/dev/sda6 console=hvc0  quiet

After:

Calling ibm,client-architecture-support... not implemented
command line: root=/dev/sda6 console=hvc0

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:47 +10:00
Anton Blanchard
f2053f1a7b powerpc/perf_counter: Fix vdso detection
perf_counter uses arch_vma_name() to detect a vdso region which in turn uses
current->mm->context.vdso_base. We need to initialise this before doing
the mmap or else we fail to detect the vdso.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:45 +10:00
Anton Blanchard
8bbde7a706 powerpc: Move 64bit heap above 1TB on machines with 1TB segments
If we are using 1TB segments and we are allowed to randomise the heap, we can
put it above 1TB so it is backed by a 1TB segment. Otherwise the heap will be
in the bottom 1TB which always uses 256MB segments and this may result in a
performance penalty.

This functionality is disabled when heap randomisation is turned off:

echo 1 > /proc/sys/kernel/randomize_va_space

which may be useful when trying to allocate the maximum amount of 16M or 16G
pages.

On a microbenchmark that repeatedly touches 32GB of memory with a stride of
256MB + 4kB (designed to stress 256MB segments while still mapping nicely into
the L1 cache), we see the improvement:

Force malloc to use heap all the time:
# export MALLOC_MMAP_MAX_=0 MALLOC_TRIM_THRESHOLD_=-1

Disable heap randomization:
# echo 1 > /proc/sys/kernel/randomize_va_space
# time ./test
12.51s

Enable heap randomization:
# echo 2 > /proc/sys/kernel/randomize_va_space
# time ./test
1.70s

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:44 +10:00
Becky Bruce
738ef42e32 powerpc: Change archdata dma_data to a union
Sometimes this is used to hold a simple offset, and sometimes
it is used to hold a pointer.  This patch changes it to a union containing
void * and dma_addr_t.  get/set accessors are also provided, because it was
getting a bit ugly to get to the actual data.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:43 +10:00
Becky Bruce
1cebd7a0f6 powerpc: Rename get_dma_direct_offset get_dma_offset
The former is no longer really accurate with the swiotlb case now
a possibility.  I also move it into dma-mapping.h - it no longer
needs to be in dma.c, and there are about to be some more accessors
that should all end up in the same place.  A comment is added to
indicate that this function is not used in configs where there is no
simple dma offset, such as the iommu case.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:43 +10:00
Huang Weiyi
5c8f382c0b powerpc/book3e-64: Remove duplicated #include
Remove duplicated #include('s) in
  arch/powerpc/kernel/exceptions-64e.S

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:41 +10:00
roel kluin
0f3372741f powerpc: kmalloc failure ignored in vio_build_iommu_table()
Prevent NULL dereference if kmalloc() fails.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-24 15:31:38 +10:00
Linus Torvalds
94a8d5caba Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits)
  cpumask: Move deprecated functions to end of header.
  cpumask: remove unused deprecated functions, avoid accusations of insanity
  cpumask: use new-style cpumask ops in mm/quicklist.
  cpumask: use mm_cpumask() wrapper: x86
  cpumask: use mm_cpumask() wrapper: um
  cpumask: use mm_cpumask() wrapper: mips
  cpumask: use mm_cpumask() wrapper: mn10300
  cpumask: use mm_cpumask() wrapper: m32r
  cpumask: use mm_cpumask() wrapper: arm
  cpumask: Use accessors for cpu_*_mask: um
  cpumask: Use accessors for cpu_*_mask: powerpc
  cpumask: Use accessors for cpu_*_mask: mips
  cpumask: Use accessors for cpu_*_mask: m32r
  cpumask: remove arch_send_call_function_ipi
  cpumask: arch_send_call_function_ipi_mask: s390
  cpumask: arch_send_call_function_ipi_mask: powerpc
  cpumask: arch_send_call_function_ipi_mask: mips
  cpumask: arch_send_call_function_ipi_mask: m32r
  cpumask: arch_send_call_function_ipi_mask: alpha
  cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64
  ...
2009-09-23 18:14:11 -07:00
Alexey Dobriyan
2bcd57ab61 headers: utsname.h redux
* remove asm/atomic.h inclusion from linux/utsname.h --
   not needed after kref conversion
 * remove linux/utsname.h inclusion from files which do not need it

NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 18:13:10 -07:00
Rusty Russell
ea0f1cab6e cpumask: Use accessors for cpu_*_mask: powerpc
Use the accessors rather than frobbing bits directly (the new versions
are const).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
2009-09-24 09:34:48 +09:30
Rusty Russell
f063ea02fb cpumask: arch_send_call_function_ipi_mask: powerpc
We're weaning the core code off handing cpumask's around on-stack.
This introduces arch_send_call_function_ipi_mask(), and by defining
it, the old arch_send_call_function_ipi is defined by the core code.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-09-24 09:34:45 +09:30
Linus Torvalds
c37efa9325 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits)
  Use macros for .data.page_aligned section.
  Use macros for .bss.page_aligned section.
  Use new __init_task_data macro in arch init_task.c files.
  kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts.
  arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0
  kbuild: add static to prototypes
  kbuild: fail build if recordmcount.pl fails
  kbuild: set -fconserve-stack option for gcc 4.5
  kbuild: echo the record_mcount command
  gconfig: disable "typeahead find" search in treeviews
  kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling
  checkincludes.pl: add option to remove duplicates in place
  markup_oops: use modinfo to avoid confusion with underscored module names
  checkincludes.pl: provide usage helper
  checkincludes.pl: close file as soon as we're done with it
  ctags: usability fix
  kernel hacking: move STRIP_ASM_SYMS from General
  gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma
  kbuild: Check if linker supports the -X option
  kbuild: introduce ld-option
  ...

Fix trivial conflict in scripts/basic/fixdep.c
2009-09-23 15:37:02 -07:00
Linus Torvalds
31bbb9b58d Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  itimers: Add tracepoints for itimer
  hrtimer: Add tracepoint for hrtimers
  timers: Add tracepoints for timer_list timers
  cputime: Optimize jiffies_to_cputime(1)
  itimers: Simplify arm_timer() code a bit
  itimers: Fix periodic tics precision
  itimers: Merge ITIMER_VIRT and ITIMER_PROF

Trivial header file include conflicts in kernel/fork.c
2009-09-23 09:46:15 -07:00
James Morris
88e9d34c72 seq_file: constify seq_operations
Make all seq_operations structs const, to help mitigate against
revectoring user-triggerable function pointers.

This is derived from the grsecurity patch, although generated from scratch
because it's simpler than extracting the changes from there.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 07:39:29 -07:00
Linus Torvalds
7fa07729e4 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf_event, powerpc: Fix compilation after big perf_counter rename
2009-09-22 08:11:04 -07:00
Linus Torvalds
342ff1a1b5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
  trivial: fix typo in aic7xxx comment
  trivial: fix comment typo in drivers/ata/pata_hpt37x.c
  trivial: typo in kernel-parameters.txt
  trivial: fix typo in tracing documentation
  trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
  trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
  trivial: remove unnecessary semicolons
  trivial: Fix duplicated word "options" in comment
  trivial: kbuild: remove extraneous blank line after declaration of usage()
  trivial: improve help text for mm debug config options
  trivial: doc: hpfall: accept disk device to unload as argument
  trivial: doc: hpfall: reduce risk that hpfall can do harm
  trivial: SubmittingPatches: Fix reference to renumbered step
  trivial: fix typos "man[ae]g?ment" -> "management"
  trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
  trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
  trivial: fix missing printk space in amd_k7_smp_check
  trivial: fix typo s/ketymap/keymap/ in comment
  trivial: fix typo "to to" in multiple files
  trivial: fix typos in comments s/DGBU/DBGU/
  ...
2009-09-22 07:51:45 -07:00
Paul Mackerras
a8f90e9067 perf_event, powerpc: Fix compilation after big perf_counter rename
This fixes two places in the powerpc perf_event (perf_counter) code
where 'list_entry' needs to be changed to 'group_entry', but were
missed in commit 65abc865 ("perf_counter: Rename list_entry ->
group_entry, counter_list -> group_list").

This also changes 'event' back to 'counter' in a couple of
contexts:

* Field and function names that deal with the limited-function
  counters: it's really the hardware counters whose function is
  limited, not the events that they count.  Hence:

  MAX_LIMITED_HWEVENTS -> MAX_LIMITED_HWCOUNTERS
  limited_event -> limited_counter
  freeze/thaw_limited_events -> freeze/thaw_limited_counters

* The machine-specific PMU description struct (struct power_pmu): this
  renames 'n_event' back to 'n_counter' since it really describes how
  many hardware counters the machine has.  (Renaming this back avoids
  a compile error in each of the machine-specific PMU back-ends where
  they initialize their power_pmu struct.)

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <19128.4280.813369.589704@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22 09:30:40 +02:00
Anand Gadiyar
411c940385 trivial: fix typo "for for" in multiple files
trivial: fix typo "for for" in multiple files

Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-21 15:14:54 +02:00
Ingo Molnar
57c0c15b52 perf: Tidy up after the big rename
- provide compatibility Kconfig entry for existing PERF_COUNTERS .config's

 - provide courtesy copy of old perf_counter.h, for user-space projects

 - small indentation fixups

 - fix up MAINTAINERS

 - fix small x86 printout fallout

 - fix up small PowerPC comment fallout (use 'counter' as in register)

Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 14:34:11 +02:00
Ingo Molnar
cdd6c482c9 perf: Do the big rename: Performance Counters -> Performance Events
Bye-bye Performance Counters, welcome Performance Events!

In the past few months the perfcounters subsystem has grown out its
initial role of counting hardware events, and has become (and is
becoming) a much broader generic event enumeration, reporting, logging,
monitoring, analysis facility.

Naming its core object 'perf_counter' and naming the subsystem
'perfcounters' has become more and more of a misnomer. With pending
code like hw-breakpoints support the 'counter' name is less and
less appropriate.

All in one, we've decided to rename the subsystem to 'performance
events' and to propagate this rename through all fields, variables
and API names. (in an ABI compatible fashion)

The word 'event' is also a bit shorter than 'counter' - which makes
it slightly more convenient to write/handle as well.

Thanks goes to Stephane Eranian who first observed this misnomer and
suggested a rename.

User-space tooling and ABI compatibility is not affected - this patch
should be function-invariant. (Also, defconfigs were not touched to
keep the size down.)

This patch has been generated via the following script:

  FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

  sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

  for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
  done

  FILES=$(find . -name perf_event.*)

  sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

... to keep it as correct as possible. This script can also be
used by anyone who has pending perfcounters patches - it converts
a Linux kernel tree over to the new naming. We tried to time this
change to the point in time where the amount of pending patches
is the smallest: the end of the merge window.

Namespace clashes were fixed up in a preparatory patch - and some
stylistic fallout will be fixed up in a subsequent patch.

( NOTE: 'counters' are still the proper terminology when we deal
  with hardware registers - and these sed scripts are a bit
  over-eager in renaming them. I've undone some of that, but
  in case there's something left where 'counter' would be
  better than 'event' we can undo that on an individual basis
  instead of touching an otherwise nicely automated patch. )

Suggested-by: Stephane Eranian <eranian@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <linux-arch@vger.kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 14:28:04 +02:00
Ingo Molnar
ae82bfd61c Merge branch 'linus' into perfcounters/rename
Merge reason: pull in all the latest code before doing the rename.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 12:51:42 +02:00
Paul Mackerras
cd74c86bdf perf_counter, powerpc, sparc: Fix compilation after perf_counter_overflow() change
Commit 5622f295 ("x86, perf_counter, bts: Optimize BTS overflow
handling") removed the regs field from struct perf_sample_data and
added a regs parameter to perf_counter_overflow().  This breaks the
build on powerpc (and Sparc) as reported by Sachin Sant:

  arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
  arch/powerpc/kernel/perf_counter.c:1165: error: unknown field 'regs' specified in initializer

This adjusts arch/powerpc/kernel/perf_counter.c to correspond with the
new struct perf_sample_data and perf_counter_overflow().

[ v2: also fix Sparc, Markus Metzger <markus.t.metzger@intel.com> ]

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Markus Metzger <markus.t.metzger@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: benh@kernel.crashing.org
Cc: linuxppc-dev@ozlabs.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <19127.8400.376239.586120@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-21 09:28:40 +02:00
Tim Abbott
abe1ee3a22 Use macros for .data.page_aligned section.
This patch changes the remaining direct references to
.data.page_aligned in C and assembly code to use the macros in
include/linux/linkage.h.

Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-09-21 06:27:08 +02:00
Joe Perches
d200c922bc Use new __init_task_data macro in arch init_task.c files.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-09-21 06:27:08 +02:00
Sam Ravnborg
f86fd30660 kbuild: rename ld-option to cc-ldoption
ld-option is misnamed as it test options to gcc, not to ld.
Renamed it to reflect this.

Cc: Andi Kleen <andi@firstfloor.org>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-09-20 12:27:42 +02:00
Linus Torvalds
a03fdb7612 Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (34 commits)
  time: Prevent 32 bit overflow with set_normalized_timespec()
  clocksource: Delay clocksource down rating to late boot
  clocksource: clocksource_select must be called with mutex locked
  clocksource: Resolve cpu hotplug dead lock with TSC unstable, fix crash
  timers: Drop a function prototype
  clocksource: Resolve cpu hotplug dead lock with TSC unstable
  timer.c: Fix S/390 comments
  timekeeping: Fix invalid getboottime() value
  timekeeping: Fix up read_persistent_clock() breakage on sh
  timekeeping: Increase granularity of read_persistent_clock(), build fix
  time: Introduce CLOCK_REALTIME_COARSE
  x86: Do not unregister PIT clocksource on PIT oneshot setup/shutdown
  clocksource: Avoid clocksource watchdog circular locking dependency
  clocksource: Protect the watchdog rating changes with clocksource_mutex
  clocksource: Call clocksource_change_rating() outside of watchdog_lock
  timekeeping: Introduce read_boot_clock
  timekeeping: Increase granularity of read_persistent_clock()
  timekeeping: Update clocksource with stop_machine
  timekeeping: Add timekeeper read_clock helper functions
  timekeeping: Move NTP adjusted clock multiplier to struct timekeeper
  ...

Fix trivial conflict due to MIPS lemote -> loongson renaming.
2009-09-18 09:15:24 -07:00
Linus Torvalds
4406c56d0a Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits)
  PCI hotplug: clean up acpi_run_hpp()
  PCI hotplug: acpiphp: use generic pci_configure_slot()
  PCI hotplug: shpchp: use generic pci_configure_slot()
  PCI hotplug: pciehp: use generic pci_configure_slot()
  PCI hotplug: add pci_configure_slot()
  PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface
  PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge
  PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation
  PCI: Clear saved_state after the state has been restored
  PCI PM: Return error codes from pci_pm_resume()
  PCI: use dev_printk in quirk messages
  PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset()
  PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle
  PCI Hotplug: acpiphp: find bridges the easy way
  PCI: pcie portdrv: remove unused variable
  PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support
  ACPI PM: Replace wakeup.prepared with reference counter
  PCI PM: Introduce device flag wakeup_prepared
  PCI / ACPI PM: Rework some debug messages
  PCI PM: Simplify PCI wake-up code
  ...

Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree
scanning having been moved and merged for the 32- and 64-bit cases.  The
'needs_freset' initialization added in 6e19314cc ("PCI/powerpc: support
PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.
2009-09-16 07:49:54 -07:00
Linus Torvalds
723e9db7a4 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (134 commits)
  powerpc/nvram: Enable use Generic NVRAM driver for different size chips
  powerpc/iseries: Fix oops reading from /proc/iSeries/mf/*/cmdline
  powerpc/ps3: Workaround for flash memory I/O error
  powerpc/booke: Don't set DABR on 64-bit BookE, use DAC1 instead
  powerpc/perf_counters: Reduce stack usage of power_check_constraints
  powerpc: Fix bug where perf_counters breaks oprofile
  powerpc/85xx: Fix SMP compile error and allow NULL for smp_ops
  powerpc/irq: Improve nanodoc
  powerpc: Fix some late PowerMac G5 with PCIe ATI graphics
  powerpc/fsl-booke: Use HW PTE format if CONFIG_PTE_64BIT
  powerpc/book3e: Add missing page sizes
  powerpc/pseries: Fix to handle slb resize across migration
  powerpc/powermac: Thermal control turns system off too eagerly
  powerpc/pci: Merge ppc32 and ppc64 versions of phb_scan()
  powerpc/405ex: support cuImage via included dtb
  powerpc/405ex: provide necessary fixup function to support cuImage
  powerpc/40x: Add support for the ESTeem 195E (PPC405EP) SBC
  powerpc/44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support.
  powerpc/44x: Update Arches defconfig
  powerpc/44x: Update Arches dts
  ...

Fix up conflicts in drivers/char/agp/uninorth-agp.c
2009-09-15 09:51:09 -07:00
Linus Torvalds
ada3fa1505 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits)
  powerpc64: convert to dynamic percpu allocator
  sparc64: use embedding percpu first chunk allocator
  percpu: kill lpage first chunk allocator
  x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA
  percpu: update embedding first chunk allocator to handle sparse units
  percpu: use group information to allocate vmap areas sparsely
  vmalloc: implement pcpu_get_vm_areas()
  vmalloc: separate out insert_vmalloc_vm()
  percpu: add chunk->base_addr
  percpu: add pcpu_unit_offsets[]
  percpu: introduce pcpu_alloc_info and pcpu_group_info
  percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward
  percpu: add @align to pcpu_fc_alloc_fn_t
  percpu: make @dyn_size mandatory for pcpu_setup_first_chunk()
  percpu: drop @static_size from first chunk allocators
  percpu: generalize first chunk allocator selection
  percpu: build first chunk allocators selectively
  percpu: rename 4k first chunk allocator to page
  percpu: improve boot messages
  percpu: fix pcpu_reclaim() locking
  ...

Fix trivial conflict as by Tejun Heo in kernel/sched.c
2009-09-15 09:39:44 -07:00
Linus Torvalds
4f0ac85416 Merge branch 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
  perf tools: Avoid unnecessary work in directory lookups
  perf stat: Clean up statistics calculations a bit more
  perf stat: More advanced variance computation
  perf stat: Use stddev_mean in stead of stddev
  perf stat: Remove the limit on repeat
  perf stat: Change noise calculation to use stddev
  x86, perf_counter, bts: Do not allow kernel BTS tracing for now
  x86, perf_counter, bts: Correct pointer-to-u64 casts
  x86, perf_counter, bts: Fail if BTS is not available
  perf_counter: Fix output-sharing error path
  perf trace: Fix read_string()
  perf trace: Print out in nanoseconds
  perf tools: Seek to the end of the header area
  perf trace: Fix parsing of perf.data
  perf trace: Sample timestamps as well
  perf_counter: Introduce new (non-)paranoia level to allow raw tracepoint access
  perf trace: Sample the CPU too
  perf tools: Work around strict aliasing related warnings
  perf tools: Clean up warnings list in the Makefile
  perf tools: Complete support for dynamic strings
  ...
2009-09-11 13:22:43 -07:00
Linus Torvalds
a66a50054e Merge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (59 commits)
  x86/gart: Do not select AGP for GART_IOMMU
  x86/amd-iommu: Initialize passthrough mode when requested
  x86/amd-iommu: Don't detach device from pt domain on driver unbind
  x86/amd-iommu: Make sure a device is assigned in passthrough mode
  x86/amd-iommu: Align locking between attach_device and detach_device
  x86/amd-iommu: Fix device table write order
  x86/amd-iommu: Add passthrough mode initialization functions
  x86/amd-iommu: Add core functions for pd allocation/freeing
  x86/dma: Mark iommu_pass_through as __read_mostly
  x86/amd-iommu: Change iommu_map_page to support multiple page sizes
  x86/amd-iommu: Support higher level PTEs in iommu_page_unmap
  x86/amd-iommu: Remove old page table handling macros
  x86/amd-iommu: Use 2-level page tables for dma_ops domains
  x86/amd-iommu: Remove bus_addr check in iommu_map_page
  x86/amd-iommu: Remove last usages of IOMMU_PTE_L0_INDEX
  x86/amd-iommu: Change alloc_pte to support 64 bit address space
  x86/amd-iommu: Introduce increase_address_space function
  x86/amd-iommu: Flush domains if address space size was increased
  x86/amd-iommu: Introduce set_dte_entry function
  x86/amd-iommu: Add a gneric version of amd_iommu_flush_all_devices
  ...
2009-09-11 13:16:37 -07:00
Martyn Welch
d331d8305c powerpc/nvram: Enable use Generic NVRAM driver for different size chips
Remove the reliance on a staticly defined NVRAM size, allowing
platforms to support NVRAMs with sizes differing from the standard.

A fall back value is provided for platforms not supporting this extension.

Signed-off-by: Martyn Welch <martyn.welch@gefanuc.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-11 16:02:11 +10:00
Benjamin Herrenschmidt
c6c9eacef0 powerpc/booke: Don't set DABR on 64-bit BookE, use DAC1 instead
Also remove a duplicate setting of it in the context switch path
on BookE.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-11 11:27:59 +10:00
Paul Mackerras
e51ee31e8a powerpc/perf_counters: Reduce stack usage of power_check_constraints
Michael Ellerman reported stack-frame size warnings being produced
for power_check_constraints(), which uses an 8*8 array of u64 and
two 8*8 arrays of unsigned long, which are currently allocated on the
stack, along with some other smaller variables.  These arrays come
to 1.5kB on 64-bit or 1kB on 32-bit, which is a bit too much for the
stack.

This fixes the problem by putting these arrays in the existing
per-cpu cpu_hw_counters struct.  This is OK because two of the call
sites have interrupts disabled already; for the third call site we
use get_cpu_var, which disables preemption, so we know we won't
get a context switch while we're in power_check_constraints().
Note that power_check_constraints() can be called during context
switch but is not called from interrupts.

Reported-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-11 11:27:59 +10:00
Paul Mackerras
a6dbf93a2a powerpc: Fix bug where perf_counters breaks oprofile
Currently there is a bug where if you use oprofile on a pSeries
machine, then use perf_counters, then use oprofile again, oprofile
will not work correctly; it will lose the PMU configuration the next
time the hypervisor does a partition context switch, and thereafter
won't count anything.

Maynard Johnson identified the sequence causing the problem:
- oprofile setup calls ppc_enable_pmcs(), which calls
  pseries_lpar_enable_pmcs, which tells the hypervisor that we want
  to use the PMU, and sets the "PMU in use" flag in the lppaca.
  This flag tells the hypervisor whether it needs to save and restore
  the PMU config.
- The perf_counter code sets and clears the "PMU in use" flag directly
  as it context-switches the PMU between tasks, and leaves it clear
  when it finishes.
- oprofile setup, called for a new oprofile run, calls ppc_enable_pmcs,
  which does nothing because it has already been called.  In particular
  it doesn't set the "PMU in use" flag.

This fixes the problem by arranging for ppc_enable_pmcs to always set
the "PMU in use" flag.  It makes the perf_counter code call
ppc_enable_pmcs also rather than calling the lower-level function
directly, and removes the setting of the "PMU in use" flag from
pseries_lpar_enable_pmcs, since that is now done in its caller.

This also removes the declaration of pasemi_enable_pmcs because it
isn't defined anywhere.

Reported-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: <stable@kernel.org)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-11 11:27:58 +10:00
Kumar Gala
757cbd46d1 powerpc/85xx: Fix SMP compile error and allow NULL for smp_ops
The following commit introduced a compile error since it removed
the implementation of smp_85xx_basic_setup:

commit 77c0a700c1
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Fri Aug 28 14:25:04 2009 +1000

    powerpc: Properly start decrementer on BookE secondary CPUs

Make it so that smp_ops probe() and setup_cpu() can be set to NULL.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-11 11:27:57 +10:00
Mike Mason
6e19314cc9 PCI/powerpc: support PCIe fundamental reset
By default, the EEH framework on powerpc does what's known as a "hot
reset" during recovery of a PCI Express device.  We've found a case
where the device needs a "fundamental reset" to recover properly.  The
current PCI error recovery and EEH frameworks do not support this
distinction.

The attached patch makes changes to EEH to utilize the new bit field.

Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
Signed-off-by: Richard Lary <rlary@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-09 13:29:41 -07:00
Ingo Molnar
695a461296 Merge branch 'amd-iommu/2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu 2009-09-04 14:44:16 +02:00
Paul Mackerras
a3df6f7d30 perf_counter/powerpc: Fix cache event codes for POWER7
I had the codes for L1 D-cache load accesses and misses swapped
around, and the wrong codes for LL-cache accesses and misses.
This corrects them.

Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
LKML-Reference: <19103.8514.709300.585484@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-03 08:41:53 +02:00
Kumar Gala
76acc2c1a7 powerpc/fsl-booke: Use HW PTE format if CONFIG_PTE_64BIT
Switch to using the Power ISA defined PTE format when we have a 64-bit
PTE.  This makes the code handling between fsl-booke and book3e-64
similiar for TLB faults.

Additionally this lets use take advantage of the page size encodings and
full permissions that the HW PTE defines.

Also defined _PMD_PRESENT, _PMD_PRESENT_MASK, and _PMD_BAD since the
32-bit ppc arch code expects them.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-02 16:20:41 +10:00
Brian King
46db2f86a3 powerpc/pseries: Fix to handle slb resize across migration
The SLB can change sizes across a live migration, which was not
being handled, resulting in possible machine crashes during
migration if migrating to a machine which has a smaller max SLB
size than the source machine. Fix this by first reducing the
SLB size to the minimum possible value, which is 32, prior to
migration. Then during the device tree update which occurs after
migration, we make the call to ensure the SLB gets updated. Also
add the slb_size to the lparcfg output so that the migration
tools can check to make sure the kernel has this capability
before allowing migration in scenarios where the SLB size will change.

BenH: Fixed #include <asm/mmu-hash64.h> -> <asm/mmu.h> to avoid
      breaking ppc32 build

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-02 16:19:01 +10:00
Grant Likely
0ed2c722c6 powerpc/pci: Merge ppc32 and ppc64 versions of phb_scan()
The two versions are doing almost exactly the same thing.  No need to
maintain them as separate files.  This patch also has the side effect
of making the PCI device tree scanning code available to 32 bit powerpc
machines, but no board ports actually make use of this feature at this
point.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-09-02 15:45:53 +10:00
Thomas Gleixner
f71bb0ac5e Merge branch 'timers/posixtimers' into timers/tracing
Merge reason: timer tracepoint patches depend on both branches

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-29 10:34:29 +02:00
Benjamin Herrenschmidt
77c0a700c1 powerpc: Properly start decrementer on BookE secondary CPUs
This moves the code to start the decrementer on 40x and BookE into
a separate function which is now called from time_init() and
secondary_time_init(), before the respective clock sources are
registered. We also remove the 85xx specific code for doing it
from the platform code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:25:04 +10:00
Kumar Gala
89c2dd62a3 powerpc/pci: Pull ppc32 PCI features into common
Some of the PCI features we have in ppc32 we will need on ppc64
platforms in the future.  These include support for:

* ppc_md.pci_exclude_device
* indirect config cycles
* early config cycles

We also simplified the logic in fake_pci_bus() to assume it will always
get a valid pci_controller.  Since all current callers seem to pass it
one.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:15 +10:00
Grant Likely
fbe6544719 powerpc/pci: move pci_64.c device tree scanning code into pci-common.c
The PCI device tree scanning code in pci_64.c is some useful functionality.
It allows PCI devices to be described in the device tree instead of being
probed for, which in turn allows pci devices to use all of the device tree
facilities to describe complex PCI bus architectures like GPIO and IRQ
routing (perhaps not a common situation for desktop or server systems,
but useful for embedded systems with on-board PCI devices).

This patch moves the device tree scanning into pci-common.c so it is
available for 32-bit powerpc machines too.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:15 +10:00
Grant Likely
ae14e13a4c powerpc/pci: Remove dead checks for CONFIG_PPC_OF
PPC_OF is always selected for arch/powerpc.  This patch removes the stale
#defines

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:14 +10:00
Kumar Gala
bb1af71ecb powerpc/book3e-64: Add support to initial_tlb_book3e for non-HES TLB
We now search through TLBnCFG looking for the first array that has IPROT
support (we assume that there is only one).  If that TLB has hardware
entry select (HES) support we use the existing code and with the proper
TLB select (the HES code still needs to clean up bolted entries from
firmware).  The non-HES code is pretty similiar to the 32-bit FSL Book-E
code but does make some new assumtions (like that we have tlbilx) and
simplifies things down a bit.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:14 +10:00
Kumar Gala
4b98d9e713 powerpc/book3e-64: Add helper function to setup IVORs
Not all 64-bit Book-3E parts will have fixed IVORs so add a function that
cpusetup code can call to setup the base IVORs (0..15) to match the fixed
offsets.  We need to 'or' part of interrupt_base_book3e into the IVORs
since on parts that have them the IVPR doesn't extend as far down.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:13 +10:00
Kumar Gala
6c188829d2 powerpc/book3e-64: Wait til generic_calibrate_decr to enable decrementer
Match what we do on 32-bit Book-E processors and enable the decrementer
in generic_calibrate_decr.  We need to make sure we disable the
decrementer early in boot since we currently use lazy (soft) interrupt
on 64-bit Book-E and possible get a decrementer exception before we
are ready for it.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:13 +10:00
Kumar Gala
f45c4486f7 powerpc/book3e-64: Move the default cpu table entry
Move the default cpu entry table for CONFIG_PPC_BOOK3E_64 to the
very end since we will probably want to support both 32-bit and
64-bit kernels for some processors that are higher up in the list.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:13 +10:00
FUJITA Tomonori
80d3e8abb7 powerpc: Add CONFIG_DMA_API_DEBUG support
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:11 +10:00
FUJITA Tomonori
4a9a6bfe70 powerpc: Handle SWIOTLB mapping error properly
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:11 +10:00
FUJITA Tomonori
45223c5492 powerpc: use dma_map_ops struct
This converts uses dma_map_ops struct (in include/linux/dma-mapping.h)
instead of POWERPC homegrown dma_mapping_ops.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:10 +10:00
FUJITA Tomonori
3702977fa7 powerpc: Remove swiotlb_pci_dma_ops
Now swiotlb_pci_dma_ops is identical to swiotlb_dma_ops; we can use
swiotlb_dma_ops with any devices. This removes swiotlb_pci_dma_ops.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:09 +10:00
FUJITA Tomonori
762afb7317 powerpc: Remove addr_needs_map in struct dma_mapping_ops
This patch adds max_direct_dma_addr to struct dev_archdata to remove
addr_needs_map in struct dma_mapping_ops. It also converts
dma_capable() to use max_direct_dma_addr.

max_direct_dma_addr is initialized in pci_dma_dev_setup_swiotlb(),
called via ppc_md.pci_dma_dev_setup hook.

For further information:
http://marc.info/?t=124719060200001&r=1&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-28 14:24:09 +10:00
Benjamin Herrenschmidt
2864697cef Merge commit 'tip/iommu-for-powerpc' into next 2009-08-28 14:23:06 +10:00
Gautham R Shenoy
6776426320 powerpc/pseries: Reduce the polling interval in __cpu_up()
Time time taken for a single cpu online operation on a pseries machine
is as follows:
Dedicated LPAR (POWER6): ~220ms.
Shared LPAR (POWER5)   : ~240ms.

Of this time, approximately 200ms is taken up by __cpu_up(). This is because
we poll every 200ms to check if the new cpu has notified it's presence
through the cpu_callin_map. We repeat this operation until the new cpu sets
the value in cpu_callin_map or 5 seconds elapse, whichever comes earlier.

However, using completion_structs instead of polling loops,
the time taken by the new processor to indicate it's presence has
found to be less than 1ms on pseries. This method however may not
work on all powerpc platforms due to the time-base synchronization code.

Keeping this in mind, we could reduce msleep polling interval from
200ms to 1ms while retaining the 5 second timeout.

With this, the time taken for a cpu online operation changes as follows:
Dedicated LPAR (POWER6): 20-25ms.
Shared LPAR (POWER5)   : 60-80ms.

In both these cases, it was found that the code polls through the loop
only once indicating that 1ms is a reasonable value, atleast on pseries.

The code needs testing on other powerpc platforms.

Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
Acked-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-27 13:12:54 +10:00
Josh Boyer
14d757520a powerpc: Fix __flush_icache_range on 44x
The ptrace POKETEXT interface allows a process to modify the text pages of
a child process being ptraced, usually to insert breakpoints via trap
instructions.  The kernel eventually calls copy_to_user_page, which in turn
calls __flush_icache_range to invalidate the icache lines for the child
process.

However, this function does not work on 44x due to the icache being virtually
indexed.  This was noticed by a breakpoint being triggered after it had been
cleared by ltrace on a 440EPx board.  The convenient solution is to do a
flash invalidate of the icache in the __flush_icache_range function.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-27 13:12:52 +10:00
Benjamin Herrenschmidt
ea3cc330ac powerpc/mm: Cleanup handling of execute permission
This is an attempt at cleaning up a bit the way we handle execute
permission on powerpc. _PAGE_HWEXEC is gone, _PAGE_EXEC is now only
defined by CPUs that can do something with it, and the myriad of
#ifdef's in the I$/D$ coherency code is reduced to 2 cases that
hopefully should cover everything.

The logic on BookE is a little bit different than what it was though
not by much. Since now, _PAGE_EXEC will be set by the generic code
for executable pages, we need to filter out if they are unclean and
recover it. However, I don't expect the code to be more bloated than
it already was in that area due to that change.

I could boast that this brings proper enforcing of per-page execute
permissions to all BookE and 40x but in fact, we've had that now for
some time as a side effect of my previous rework in that area (and
I didn't even know it :-) We would only enable execute permission if
the page was cache clean and we would only cache clean it if we took
and exec fault. Since we now enforce that the later only work if
VM_EXEC is part of the VMA flags, we de-fact already enforce per-page
execute permissions... Unless I missed something

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-27 13:12:51 +10:00
Martin Schwidefsky
d90246cd8e timekeeping: Increase granularity of read_persistent_clock(), build fix
Fix the following build problem on powerpc:

  arch/powerpc/kernel/time.c: In function 'read_persistent_clock':
  arch/powerpc/kernel/time.c:788: error: 'return' with a value, in function returning void
  arch/powerpc/kernel/time.c:791: error: 'return' with a value, in function returning void

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: dwalker@fifo99.com
Cc: johnstul@us.ibm.com
LKML-Reference: <20090822222313.74b9619c@skybase>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-23 10:49:48 +02:00
Benjamin Herrenschmidt
4f0dbc2781 Merge commit 'paulus-perf/master' into next 2009-08-20 11:07:56 +10:00
Michael Ellerman
903444e429 powerpc/vmlinux.lds: Move _edata down
Currently _edata does not include several data sections, this causes
the kernel's report of memory usage at boot to not match reality, and
also prevents kmemleak from working - because it scan between _sdata
and _edata for pointers to allocated memory.

This mirrors a similar change made recently to the x86 linker script.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:29:29 +10:00
Michael Ellerman
a15098c90d powerpc: Enable GCOV
Make it possible to enable GCOV code coverage measurement on powerpc.

Lightly tested on 64-bit, seems to work as expected.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:29:28 +10:00
Julia Lawall
14ea58ad79 powerpc: Use DIV_ROUND_CLOSEST in time init code
The kernel.h macro DIV_ROUND_CLOSEST performs the computation (x + d/2)/d
but is perhaps more readable.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@haskernel@
@@

#include <linux/kernel.h>

@depends on haskernel@
expression x,__divisor;
@@

- (((x) + ((__divisor) / 2)) / (__divisor))
+ DIV_ROUND_CLOSEST(x,__divisor)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:29:26 +10:00
Benjamin Krill
cf68787b68 powerpc/prom_init: Evaluate mem kernel parameter for early allocation
Evaluate mem kernel parameter for early memory allocations. If mem is set
no allocation in the region above the given boundary is allowed. The current
code doesn't take care about this and allocate memory above the given mem
boundary.

Signed-off-by: Benjamin Krill <ben@codiert.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:29:25 +10:00
Stefan Roese
20d70345f1 powerpc: Add AMCC 460EX/460GT Rev. B support to cputable.c
Signed-off-by: Stefan Roese <sr@denx.de>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:25:18 +10:00
Benjamin Herrenschmidt
2d27cfd328 powerpc: Remaining 64-bit Book3E support
This contains all the bits that didn't fit in previous patches :-) This
includes the actual exception handlers assembly, the changes to the
kernel entry, other misc bits and wiring it all up in Kconfig.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:25:11 +10:00
Benjamin Herrenschmidt
25d21ad6e7 powerpc: Add TLB management code for 64-bit Book3E
This adds the TLB miss handler assembly, the low level TLB flush routines
along with the necessary hook for dealing with our virtual page tables
or indirect TLB entries that need to be flushes when PTE pages are freed.

There is currently no support for hugetlbfs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:25:09 +10:00
Benjamin Herrenschmidt
dce6670aaa powerpc: Add PACA fields specific to 64-bit Book3E processors
This adds various fields in the PACA that are for use specifically
by Book3E processors, such as exception save areas, current pgd
pointer, special exceptions kernel stacks etc...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:25:08 +10:00
Benjamin Herrenschmidt
57e2a99f74 powerpc: Add memory management headers for new 64-bit BookE
This adds the PTE and pgtable format definitions, along with changes
to the kernel memory map and other definitions related to implementing
support for 64-bit Book3E. This also shields some asm-offset bits that
are currently only relevant on 32-bit

We also move the definition of the "linux" page size constants to
the common mmu.h file and add a few sizes that are relevant to
embedded processors.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:25:06 +10:00
Benjamin Herrenschmidt
cf54dc7cd4 powerpc: Move definitions of secondary CPU spinloop to header file
Those definitions are currently declared extern in the .c file where
they are used, move them to a header file instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:44 +10:00
Benjamin Herrenschmidt
747bea91b7 powerpc: Clean ifdef usage in copy_thread()
Currently, a single ifdef covers SLB related bits and more generic ppc64
related bits, split this in two separate ifdef's since 64-bit BookE will
need one but not the other.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:43 +10:00
Benjamin Herrenschmidt
6f0ef0f505 powerpc/mm: Call mmu_context_init() from ppc64
Our 64-bit hash context handling has no init function, but 64-bit Book3E
will use the common mmu_context_nohash.c code which does, so define an
empty inline mmu_context_init() for 64-bit server and call it from
our 64-bit setup_arch()

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
2009-08-20 10:12:42 +10:00
Benjamin Herrenschmidt
6c1719942e powerpc/of: Remove useless register save/restore when calling OF back
enter_prom() used to save and restore registers such as CTR, XER etc..
which are volatile, or SRR0,1... which we don't care about. This
removes a bunch of useless code and while at it turns an mtmsrd into
an MTMSRD macro which will be useful to Book3E.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:36 +10:00
Benjamin Herrenschmidt
dd90bbd5fb powerpc: Add compat_sys_truncate
The truncate syscall has a signed long parameter, so when using a 32-
bit userspace with a 64-bit kernel the argument is zero-extended
instead of sign-extended. Adding the compat_sys_truncate function
fixes the issue.

This was noticed during an LSB truncate test failure. The test was
checking for the correct error number set when truncate is called with
a length of -1. The test can be found at:

http://bzr.linuxfoundation.org/lsb/devel/runtime-test?cmd=inventory;rev=stewb%40linux-foundation.org-20090626205411-sfb23cc0tjj7jzgm;path=modules/vsx-pcts/tset/POSIX.os/files/truncate/

BenH: Added compat_sys_ftruncate() as well, same issue.

Signed-off-by: Chase Douglas <cndougla@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:34 +10:00
Benjamin Herrenschmidt
c5a8c0c99f powerpc: Remove use of a second scratch SPRG in STAB code
The STAB code used on Power3 and RS/64 uses a second scratch SPRG to
save a GPR in order to decide whether to go to do_stab_bolted_* or
to handle a normal data access exception.

This prevents our scheme of freeing SPRG3 which is user visible for
user uses since we cannot use SPRG0 which, on RS/64, seems to be
read-only for supervisor mode (like POWER4).

This reworks the STAB exception entry to use the PACA as temporary
storage instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:28 +10:00
Benjamin Herrenschmidt
ee43eb788b powerpc: Use names rather than numbers for SPRGs (v2)
The kernel uses SPRG registers for various purposes, typically in
low level assembly code as scratch registers or to hold per-cpu
global infos such as the PACA or the current thread_info pointer.

We want to be able to easily shuffle the usage of those registers
as some implementations have specific constraints realted to some
of them, for example, some have userspace readable aliases, etc..
and the current choice isn't always the best.

This patch should not change any code generation, and replaces the
usage of SPRN_SPRGn everywhere in the kernel with a named replacement
and adds documentation next to the definition of the names as to
what those are used for on each processor family.

The only parts that still use the original numbers are bits of KVM
or suspend/resume code that just blindly needs to save/restore all
the SPRGs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:27 +10:00
Benjamin Herrenschmidt
8aa34ab8b2 powerpc: Rename exception.h to exception-64s.h
The file include/asm/exception.h contains definitions
that are specific to exception handling on 64-bit server
type processors.

This renames the file to exception-64s.h to reflect that
fact and avoid confusion.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:26 +10:00
Anton Blanchard
30d0b36828 powerpc: Move 64bit VDSO to improve context switch performance
On 64bit applications the VDSO is the only thing in segment 0. Since the VDSO
is position independent we can remove the hint and let get_unmapped_area pick
an area. This will mean the vdso will be near other mmaps and will share
an SLB entry:

10000000-10001000 r-xp 00000000 08:06 5778459        /root/context_switch_64
10010000-10011000 r--p 00000000 08:06 5778459        /root/context_switch_64
10011000-10012000 rw-p 00001000 08:06 5778459        /root/context_switch_64
fffa92ae000-fffa92b0000 rw-p 00000000 00:00 0
fffa92b0000-fffa9453000 r-xp 00000000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa9453000-fffa9462000 ---p 001a3000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa9462000-fffa9466000 r--p 001a2000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa9466000-fffa947c000 rw-p 001a6000 08:06 4334051  /lib64/power6/libc-2.9.so
fffa947c000-fffa9480000 rw-p 00000000 00:00 0
fffa9480000-fffa94a8000 r-xp 00000000 08:06 4333852  /lib64/ld-2.9.so
fffa94b3000-fffa94b4000 rw-p 00000000 00:00 0

fffa94b4000-fffa94b7000 r-xp 00000000 00:00 0        [vdso] <----- here I am

fffa94b7000-fffa94b8000 r--p 00027000 08:06 4333852  /lib64/ld-2.9.so
fffa94b8000-fffa94bb000 rw-p 00028000 08:06 4333852  /lib64/ld-2.9.so
fffa94bb000-fffa94bc000 rw-p 00000000 00:00 0
fffe4c10000-fffe4c25000 rw-p 00000000 00:00 0        [stack]

On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), our context switch
rate goes from 268k to 277k ctx switches/sec (tested on a 4GHz POWER6).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-20 10:12:24 +10:00
Paul Mackerras
20002ded4d perf_counter: powerpc: Add callchain support
This adds support for tracing callchains for powerpc, both 32-bit
and 64-bit, and both in the kernel and userspace, from PMU interrupt
context.

The first three entries stored for each callchain are the NIP (next
instruction pointer), LR (link register), and the contents of the LR
save area in the second stack frame (the first is ignored because the
ABI convention on powerpc is that functions save their return address
in their caller's stack frame).  Because leaf functions don't have to
save their return address (LR value) and don't have to establish a
stack frame, it's possible for either or both of LR and the second
stack frame's LR save area to have valid return addresses in them.
This is basically impossible to disambiguate without either reading
the code or looking at auxiliary information such as CFI tables.
Since we don't want to do either of those things at interrupt time,
we store both LR and the second stack frame's LR save area.

Once we get past the second stack frame, there is no ambiguity; all
return addresses we get are reliable.

For kernel traces, we check whether they are valid kernel instruction
addresses and store zero instead if they are not (rather than
omitting them, which would make it impossible for userspace to know
which was which).  We also store zero instead of the second stack
frame's LR save area value if it is the same as LR.

For kernel traces, we check for interrupt frames, and for user traces,
we check for signal frames.  In each case, since we're starting a new
trace, we store a PERF_CONTEXT_KERNEL/USER marker so that userspace
knows that the next three entries are NIP, LR and the second stack frame
for the interrupted context.

We read user memory with __get_user_inatomic.  On 64-bit, if this
PMU interrupt occurred while interrupts are soft-disabled, and
there is no MMU hash table entry for the page, we will get an
-EFAULT return from __get_user_inatomic even if there is a valid
Linux PTE for the page, since hash_page isn't reentrant.  Thus we
have code here to read the Linux PTE and access the page via the
kernel linear mapping.  Since 64-bit doesn't use (or need) highmem
there is no need to do kmap_atomic.  On 32-bit, we don't do soft
interrupt disabling, so this complication doesn't occur and there
is no need to fall back to reading the Linux PTE, since hash_page
(or the TLB miss handler) will get called automatically if necessary.

Note that we cannot get PMU interrupts in the interval during
context switch between switch_mm (which switches the user address
space) and switch_to (which actually changes current to the new
process).  On 64-bit this is because interrupts are hard-disabled
in switch_mm and stay hard-disabled until they are soft-enabled
later, after switch_to has returned.  So there is no possibility
of trying to do a user stack trace when the user address space is
not current's address space.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-08-18 14:48:47 +10:00
Paul Mackerras
9c1e105238 powerpc: Allow perf_counters to access user memory at interrupt time
This provides a mechanism to allow the perf_counters code to access
user memory in a PMU interrupt routine.  Such an access can cause
various kinds of interrupt: SLB miss, MMU hash table miss, segment
table miss, or TLB miss, depending on the processor.  This commit
only deals with 64-bit classic/server processors, which use an MMU
hash table.  32-bit processors are already able to access user memory
at interrupt time.  Since we don't soft-disable on 32-bit, we avoid
the possibility of reentering hash_page or the TLB miss handlers,
since they run with interrupts disabled.

On 64-bit processors, an SLB miss interrupt on a user address will
update the slb_cache and slb_cache_ptr fields in the paca.  This is
OK except in the case where a PMU interrupt occurs in switch_slb,
which also accesses those fields.  To prevent this, we hard-disable
interrupts in switch_slb.  Interrupts are already soft-disabled at
this point, and will get hard-enabled when they get soft-enabled
later.

This also reworks slb_flush_and_rebolt: to avoid hard-disabling twice,
and to make sure that it clears the slb_cache_ptr when called from
other callers than switch_slb, the existing routine is renamed to
__slb_flush_and_rebolt, which is called by switch_slb and the new
version of slb_flush_and_rebolt.

Similarly, switch_stab (used on POWER3 and RS64 processors) gets a
hard_irq_disable() to protect the per-cpu variables used there and
in ste_allocate.

If a MMU hashtable miss interrupt occurs, normally we would call
hash_page to look up the Linux PTE for the address and create a HPTE.
However, hash_page is fairly complex and takes some locks, so to
avoid the possibility of deadlock, we check the preemption count
to see if we are in a (pseudo-)NMI handler, and if so, we don't call
hash_page but instead treat it like a bad access that will get
reported up through the exception table mechanism.  An interrupt
whose handler runs even though the interrupt occurred when
soft-disabled (such as the PMU interrupt) is considered a pseudo-NMI
handler, which should use nmi_enter()/nmi_exit() rather than
irq_enter()/irq_exit().

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-08-18 14:48:43 +10:00
Martin Schwidefsky
d4f587c67f timekeeping: Increase granularity of read_persistent_clock()
The persistent clock of some architectures (e.g. s390) have a
better granularity than seconds. To reduce the delta between the
host clock and the guest clock in a virtualized system change the 
read_persistent_clock function to return a struct timespec.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Daniel Walker <dwalker@fifo99.com>
LKML-Reference: <20090814134811.013873340@de.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-08-15 10:55:46 +02:00
Tejun Heo
c2a7e81801 powerpc64: convert to dynamic percpu allocator
Now that percpu allows arbitrary embedding of the first chunk,
powerpc64 can easily be converted to dynamic percpu allocator.
Convert it.  powerpc supports several large page sizes.  Cap atom_size
at 1M.  There isn't much to gain by going above that anyway.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-14 15:00:53 +09:00
Tejun Heo
384be2b18a Merge branch 'percpu-for-linus' into percpu-for-next
Conflicts:
	arch/sparc/kernel/smp_64.c
	arch/x86/kernel/cpu/perf_counter.c
	arch/x86/kernel/setup_percpu.c
	drivers/cpufreq/cpufreq_ondemand.c
	mm/percpu.c

Conflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids.  As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-08-14 14:45:31 +09:00
Linus Torvalds
d00aa6695b Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  perf_counter: Zero dead bytes from ftrace raw samples size alignment
  perf_counter: Subtract the buffer size field from the event record size
  perf_counter: Require CAP_SYS_ADMIN for raw tracepoint data
  perf_counter: Correct PERF_SAMPLE_RAW output
  perf tools: callchain: Fix bad rounding of minimum rate
  perf_counter tools: Fix libbfd detection for systems with libz dependency
  perf: "Longum est iter per praecepta, breve et efficax per exempla"
  perf_counter: Fix a race on perf_counter_ctx
  perf_counter: Fix tracepoint sampling to be part of generic sampling
  perf_counter: Work around gcc warning by initializing tracepoint record unconditionally
  perf tools: callchain: Fix sum of percentages to be 100% by displaying amount of ignored chains in fractal mode
  perf tools: callchain: Fix 'perf report' display to be callchain by default
  perf tools: callchain: Fix spurious 'perf report' warnings: ignore empty callchains
  perf record: Fix the -A UI for empty or non-existent perf.data
  perf util: Fix do_read() to fail on EOF instead of busy-looping
  perf list: Fix the output to not include tracepoints without an id
  perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support
  perf stat: Fix tool option consistency: rename -S/--scale to -c/--scale
  perf report: Add debug help for the finding of symbol bugs - show the symtab origin (DSO, build-id, kernel, etc)
  perf report: Fix per task mult-counter stat reporting
  ...
2009-08-10 11:48:51 -07:00
Benjamin Herrenschmidt
b2f2e8fee3 powerpc/dma: pci_set_dma_mask() shouldn't fail if mask fits in RAM
On an iMac G5, the b43 driver is failing to initialise because trying to
set the dma mask to 30-bit fails. Even though there's only 512MiB of RAM
in the machine anyway:
	https://bugzilla.redhat.com/show_bug.cgi?id=514787

We should probably let it succeed if the available RAM in the system
doesn't exceed the requested limit.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-08-10 16:36:38 +10:00
Paul Mackerras
f36a1a133a perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support
If we have the powerpc perf_counter backend compiled in, but
the cpu we are running on is one where we don't support the
PMU, we currently oops in hw_perf_group_sched_in if we try to
use any counters, because ppmu is NULL in that case, and we
unconditionally dereference ppmu.

This fixes the problem by adding a check if ppmu is NULL at the
beginning of hw_perf_group_sched_in, and also at the beginning
of the other functions that get called from the perf_counter
core, i.e. hw_perf_disable, hw_perf_enable, and
hw_perf_counter_setup.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: benh@kernel.crashing.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-09 12:54:37 +02:00
Benjamin Herrenschmidt
e0d82a0a4e perf_counter/powerpc: Check oprofile_cpu_type for NULL before using it
If the current CPU doesn't support performance counters,
cur_cpu_spec->oprofile_cpu_type can be NULL. The current
perf_counter modules don't test for that case and would thus
crash at boot time.

Bug reported by David Woodhouse.

Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul Mackerras <paulus@samba.org>
LKML-Reference: <19066.48028.446975.501454@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-06 13:55:09 +02:00
Stanislaw Gruszka
a42548a188 cputime: Optimize jiffies_to_cputime(1)
For powerpc with CONFIG_VIRT_CPU_ACCOUNTING
jiffies_to_cputime(1) is not compile time constant and run time
calculations are quite expensive. To optimize we use
precomputed value. For all other architectures is is
preprocessor definition.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <1248862529-6063-5-git-send-email-sgruszka@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-03 14:48:36 +02:00
FUJITA Tomonori
8ab7ff42c9 powerpc: remove unused swiotlb_phys_to_bus() and swiotlb_bus_to_phys()
phys_to_dma() and dma_to_phys() are used instead of
swiotlb_phys_to_bus() and swiotlb_bus_to_phys().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
2009-07-28 14:19:20 +09:00
FUJITA Tomonori
30945fdf6a powerpc: remove unncesary swiotlb_arch_address_needs_mapping
swiotlb doesn't use swiotlb_arch_address_needs_mapping(); it uses
dma_capalbe(). We can remove unnecessary
swiotlb_arch_address_needs_mapping().

We can remove swiotlb_addr_needs_map() and is_buffer_dma_capable() in
swiotlb_pci_addr_needs_map() too; dma_capable() handles the features
that both provide.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
2009-07-28 14:19:19 +09:00
FUJITA Tomonori
02ca646e73 swiotlb: remove unnecessary swiotlb_bus_to_virt
swiotlb_bus_to_virt is unncessary; we can use swiotlb_bus_to_phys and
phys_to_virt instead.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Becky Bruce <beckyb@kernel.crashing.org>
2009-07-28 14:19:18 +09:00
Andreas Schwab
0115cb544b powerpc: Fix another bug in move of altivec code to vector.S
When moving load_up_altivec to vector.S a typo in a comment caused a
thinko setting the wrong variable.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-07-15 17:41:46 +10:00
Dave Kleikamp
28477fb1ed powerpc: Fix booke user_disable_single_step()
On booke processors, gdb is seeing spurious SIGTRAPs when setting a
watchpoint.

user_disable_single_step() simply quits when the DAC is non-zero.  It should
be clearing the DBCR0_IC and DBCR0_BT bits from the dbcr0 register and
TIF_SINGLESTEP from the thread flag.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-07-15 17:41:45 +10:00
Alexey Dobriyan
405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
Linus Torvalds
85be928c41 Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
  perf report: Add "Fractal" mode output - support callchains with relative overhead rate
  perf_counter tools: callchains: Manage the cumul hits on the fly
  perf report: Change default callchain parameters
  perf report: Use a modifiable string for default callchain options
  perf report: Warn on callchain output request from non-callchain file
  x86: atomic64: Inline atomic64_read() again
  x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative()
  x86: atomic64: Improve atomic64_xchg()
  x86: atomic64: Export APIs to modules
  x86: atomic64: Improve atomic64_read()
  x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP
  x86: atomic64: Fix unclean type use in atomic64_xchg()
  x86: atomic64: Make atomic_read() type-safe
  x86: atomic64: Reduce size of functions
  x86: atomic64: Improve atomic64_add_return()
  x86: atomic64: Improve cmpxchg8b()
  x86: atomic64: Improve atomic64_read()
  x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file
  x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too
  perf report: Annotate variable initialization
  ...
2009-07-10 14:25:03 -07:00
Tejun Heo
023bf6f1b8 linker script: unify usage of discard definition
Discarded sections in different archs share some commonality but have
considerable differences.  This led to linker script for each arch
implementing its own /DISCARD/ definition, which makes maintaining
tedious and adding new entries error-prone.

This patch makes all linker scripts to move discard definitions to the
end of the linker script and use the common DISCARDS macro.  As ld
uses the first matching section definition, archs can include default
discarded sections by including them earlier in the linker script.

ia64 is notable because it first throws away some ia64 specific
subsections and then include the rest of the sections into the final
image, so those sections must be discarded before the inclusion.

defconfig compile tested for x86, x86-64, powerpc, powerpc64, ia64,
alpha, sparc, sparc64 and s390.  Michal Simek tested microblaze.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Tested-by: Michal Simek <monstr@monstr.eu>
Cc: linux-arch@vger.kernel.org
Cc: Michal Simek <monstr@monstr.eu>
Cc: microblaze-uclinux@itee.uq.edu.au
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Tony Luck <tony.luck@intel.com>
2009-07-09 11:27:40 +09:00
Huang Weiyi
3665ee36fa powerpc/perf_counter: Remove duplicated #include
Remove duplicated #include('s) in
  arch/powerpc/kernel/mpc7450-pmu.c
  arch/powerpc/kernel/ppc970-pmu.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-07-08 13:50:22 +10:00
Tejun Heo
c43768cbb7 Merge branch 'master' into for-next
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes.  As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.

Conflicts:
	arch/alpha/include/asm/percpu.h
	arch/mn10300/kernel/vmlinux.lds.S
	include/linux/percpu-defs.h
2009-07-04 07:13:18 +09:00
Anton Blanchard
0a456fc58f powerpc/perf_counter: Enable alternate PR/HV bits for POWER7
POWER7 has the same PR/HV bit layout as POWER6, so set the flag.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: a.p.zijlstra@chello.nl
Cc: benh@kernel.crashing.org
LKML-Reference: <20090701030701.GI3563@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 10:20:28 +02:00
Benjamin Herrenschmidt
f694cda892 powerpc/440: Fix warning early debug code
The function udbg_44x_as1_flush() has the wrong prototype causing
a warning when enabling 440 early debug.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26 16:55:35 +10:00
Benjamin Herrenschmidt
03c01aa740 powerpc/of: Fix usage of dev_set_name() in of_device_alloc()
dev_set_name() takes a format string, so use it properly and avoid
a warning with recent gcc's

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26 16:55:35 +10:00
Benjamin Herrenschmidt
c4007a2fbf powerpc: Use one common impl. of RTAS timebase sync and use raw spinlock
Several platforms use their own copy of what is essentially the same code,
using RTAS to synchronize the timebases when bringing up new CPUs. This
moves it all into a single common implementation and additionally
turns the spinlock into a raw spinlock since the former can rely on
the timebase not being frozen when spinlock debugging is enabled, and finally
masks interrupts while the timebase is disabled.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26 16:55:25 +10:00
Benjamin Herrenschmidt
f97bb36f70 powerpc/rtas: Turn rtas lock into a raw spinlock
RTAS currently uses a normal spinlock. However it can be called from
contexts where this is not necessarily a good idea. For example, it
can be called while syncing timebases, with the core timebase being
frozen. Unfortunately, that will deadlock in case of lock contention
when spinlock debugging is enabled as the spin lock debugging code
will try to use __delay() which ... relies on the timebase being
enabled.

Also RTAS can be used in some low level IRQ handling code path so it
may as well be a raw spinlock for -rt sake.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26 14:37:27 +10:00
Benjamin Herrenschmidt
5d38902c48 powerpc: Add irqtrace support for 32-bit powerpc
Based on initial work from: Dale Farnsworth <dale@farnsworth.org>

Add the low level irq tracing hooks for 32-bit powerpc needed
to enable full lockdep functionality.

The approach taken to deal with the code in entry_32.S is that
we don't trace all the transitions of MSR:EE when we just turn
it off to peek at TI_FLAGS without races. Only when we are
calling into C code or returning from exceptions with a state
that have changed from what lockdep thinks.

There's a little bugger though: If we take an exception that
keeps interrupts enabled (such as an alignment exception) while
interrupts are enabled, we will call trace_hardirqs_on() on the
way back spurriously. Not a big deal, but to get rid of it would
require remembering in pt_regs that the exception was one of the
type that kept interrupts enabled which we don't know at this
stage. (Well, we could test all cases for regs->trap but that
sucks too much).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Kumar Gala <galak@kernel.crashing.org>
2009-06-26 14:37:27 +10:00
Benjamin Herrenschmidt
4a5cbf17c4 powerpc: Map more memory early on 601 processors
The 32-bit kernel relies on some memory being mapped covering
the kernel text,data and bss at least, early during boot before
the full MMU setup is done. On 32-bit "classic" processors, this
is done using BAT registers.

On 601, the size of BATs is limited to 8M and we use 2 of them
for that initial mapping. This can become quite tight when enabling
features like lockdep, so let's use a 3rd one to bump that mapping
from 16M to 24M. We keep the 4th BAT free as it can be useful for
debugging early boot code to map things like serial ports.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26 14:37:25 +10:00
Kumar Gala
a236719418 powerpc: Fix output from show_regs
For some reason we've had an explicit KERN_INFO for GPR dumps.  With
recent changes we get output like:

<6>GPR00: 00000000 ef855eb0 ef858000 00000001 000000d0 f1000000 ffbc8000 ffffffff

The KERN_INFO is causing the <6>.  Don't see any reason to keep it
around.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26 14:37:24 +10:00
Benjamin Herrenschmidt
7ccbe504b5 powerpc/pmac: Fix issues with PowerMac "PowerSurge" SMP
The old PowerSurge SMP (ie, dual or quad 604 machines) code has
numerous issues in modern world.

One is cpu_possible_map is set too late (the device-tree is bogus)
so we fail to allocate the interrupt stacks and crash. Another
problem is the fact the timebase is frozen by the bringup of the
second CPU so the delays in the generic code will hang, we need
to move some of the calling procedure to inside the powermac code.

This makes it boot again for me

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-26 14:37:24 +10:00
Tejun Heo
405d967dc7 linker script: throw away .discard section
x86 throws away .discard section but no other archs do.  Also,
.discard is not thrown away while linking modules.  Make every arch
and module linking throw it away.  This will be used to define dummy
variables for percpu declarations and definitions.

This patch is based on Ivan Kokshaysky's alpha percpu patch.

[ Impact: always throw away everything in .discard ]

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Bryan Wu <cooloney@kernel.org>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@elte.hu>
2009-06-24 15:13:38 +09:00
Linus Torvalds
12e24f34cb Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits)
  perfcounter: Handle some IO return values
  perf_counter: Push perf_sample_data through the swcounter code
  perf_counter tools: Define and use our own u64, s64 etc. definitions
  perf_counter: Close race in perf_lock_task_context()
  perf_counter, x86: Improve interactions with fast-gup
  perf_counter: Simplify and fix task migration counting
  perf_counter tools: Add a data file header
  perf_counter: Update userspace callchain sampling uses
  perf_counter: Make callchain samples extensible
  perf report: Filter to parent set by default
  perf_counter tools: Handle lost events
  perf_counter: Add event overlow handling
  fs: Provide empty .set_page_dirty() aop for anon inodes
  perf_counter: tools: Makefile tweaks for 64-bit powerpc
  perf_counter: powerpc: Add processor back-end for MPC7450 family
  perf_counter: powerpc: Make powerpc perf_counter code safe for 32-bit kernels
  perf_counter: powerpc: Change how processor-specific back-ends get selected
  perf_counter: powerpc: Use unsigned long for register and constraint values
  perf_counter: powerpc: Enable use of software counters on 32-bit powerpc
  perf_counter tools: Add and use isprint()
  ...
2009-06-20 11:29:32 -07:00
Linus Torvalds
b0b7065b64 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
  tracing/urgent: warn in case of ftrace_start_up inbalance
  tracing/urgent: fix unbalanced ftrace_start_up
  function-graph: add stack frame test
  function-graph: disable when both x86_32 and optimize for size are configured
  ring-buffer: have benchmark test print to trace buffer
  ring-buffer: do not grab locks in nmi
  ring-buffer: add locks around rb_per_cpu_empty
  ring-buffer: check for less than two in size allocation
  ring-buffer: remove useless compile check for buffer_page size
  ring-buffer: remove useless warn on check
  ring-buffer: use BUF_PAGE_HDR_SIZE in calculating index
  tracing: update sample event documentation
  tracing/filters: fix race between filter setting and module unload
  tracing/filters: free filter_string in destroy_preds()
  ring-buffer: use commit counters for commit pointer accounting
  ring-buffer: remove unused variable
  ring-buffer: have benchmark test handle discarded events
  ring-buffer: prevent adding write in discarded area
  tracing/filters: strloc should be unsigned short
  tracing/filters: operand can be negative
  ...

Fix up kmemcheck-induced conflict in kernel/trace/ring_buffer.c manually
2009-06-20 10:56:46 -07:00
Linus Torvalds
773d7a09e1 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (35 commits)
  powerpc/5121: make clock debug output more readable
  powerpc/5xxx: Add common mpc5xxx_get_bus_frequency() function
  powerpc/5200: Update pcm030.dts to add i2c eeprom and delete cruft
  powerpc/5200: convert mpc52xx_psc_spi to use cs_control callback
  fbdev/xilinxfb: Fix improper casting and tighen up probe path
  usb/ps3: Add missing annotations
  powerpc: Add memory clobber to mtspr()
  powerpc: Fix invalid construct in our CPU selection Kconfig
  ps3rom: Use ps3_system_bus_[gs]et_drvdata() instead of direct access
  powerpc: Add configurable -Werror for arch/powerpc
  of_serial: Add UPF_FIXED_TYPE flag
  drivers/hvc: Add missing __devexit_p()
  net/ps3: gelic - Add missing annotations
  powerpc: Introduce macro spin_event_timeout()
  powerpc/warp: Fix ISA_DMA_THRESHOLD default
  powerpc/bootwrapper: Custom build options for XPedite52xx targets
  powerpc/85xx: Add defconfig for X-ES MPC85xx boards
  powerpc/85xx: Add dts files for X-ES MPC85xx boards
  powerpc/85xx: Add platform support for X-ES MPC85xx boards
  83xx: add support for the kmeter1 board.
  ...
2009-06-19 17:40:40 -07:00
Steven Rostedt
71e308a239 function-graph: add stack frame test
In case gcc does something funny with the stack frames, or the return
from function code, we would like to detect that.

An arch may implement passing of a variable that is unique to the
function and can be saved on entering a function and can be tested
when exiting the function. Usually the frame pointer can be used for
this purpose.

This patch also implements this for x86. Where it passes in the stack
frame of the parent function, and will test that frame on exit.

There was a case in x86_32 with optimize for size (-Os) where, for a
few functions, gcc would align the stack frame and place a copy of the
return address into it. The function graph tracer modified the copy and
not the actual return address. On return from the funtion, it did not go
to the tracer hook, but returned to the parent. This broke the function
graph tracer, because the return of the parent (where gcc did not do
this funky manipulation) returned to the location that the child function
was suppose to. This caused strange kernel crashes.

This test detected the problem and pointed out where the issue was.

This modifies the parameters of one of the functions that the arch
specific code calls, so it includes changes to arch code to accommodate
the new prototype.

Note, I notice that the parsic arch implements its own push_return_trace.
This is now a generic function and the ftrace_push_return_trace should be
used instead. This patch does not touch that code.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-18 18:40:18 -04:00
Harry Ciao
8f101a051e edac: cpc925 MC platform device setup
Fix up the number of cells for the values of CPC925 Memory Controller,
and setup related platform device during system booting up, against
which CPC925 Memory Controller EDAC driver would be matched.

Signed-off-by: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-18 13:03:57 -07:00
Paul Mackerras
7325927e5a perf_counter: powerpc: Add processor back-end for MPC7450 family
This adds support for the performance monitor hardware on the
MPC7450 family of processors (7450, 7451, 7455, 7447/7457, 7447A,
7448), used in the later Apple G4 powermacs/powerbooks and other
machines.  These machines have 6 hardware counters with a unique
set of events which can be counted on each counter, with some
events being available on multiple counters.

Raw event codes for these processors are (PMC << 8) + PMCSEL.
If PMC is non-zero then the event is that selected by the given
PMCSEL value for that PMC (hardware counter).  If PMC is zero
then the event selected is one of the low-numbered ones that are
common to several PMCs.  In this case PMCSEL must be <= 22 and
the event is what that PMCSEL value would select on PMC1 (but
it may be placed any other PMC that has the same event for that
PMCSEL value).

For events that count cycles or occurrences that exceed a threshold,
the threshold requested can be specified in the 0x3f000 bits of the
raw event codes.  If the event uses the threshold multiplier bit
and that bit should be set, that is indicated with the 0x40000 bit
of the raw event code.

This fills in some of the generic cache events.  Unfortunately there
are quite a few blank spaces in the table, partly because these
processors tend to count cache hits rather than cache accesses.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: benh@kernel.crashing.org
LKML-Reference: <19000.55631.802122.696927@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 11:11:46 +02:00
Paul Mackerras
98fb1807b9 perf_counter: powerpc: Make powerpc perf_counter code safe for 32-bit kernels
This abstracts a few things in arch/powerpc/kernel/perf_counter.c
that are specific to 64-bit kernels, and provides definitions for
32-bit kernels.  In particular,

* Only 64-bit has MMCRA and the bits in it that give information
  about a PMU interrupt (sampled PR, HV, slot number etc.)
* Only 64-bit has the lppaca and the lppaca->pmcregs_in_use field
* Use of SDAR is confined to 64-bit for now
* Only 64-bit has soft/lazy interrupt disable and therefore
  pseudo-NMIs (interrupts that occur while interrupts are soft-disabled)
* Only 64-bit has PMC7 and PMC8
* Only 64-bit has the MSR_HV bit.

This also fixes the types used in a couple of places, where we were
using long types for things that need to be 64-bit.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: benh@kernel.crashing.org
LKML-Reference: <19000.55590.634126.876084@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 11:11:46 +02:00
Paul Mackerras
079b3c569c perf_counter: powerpc: Change how processor-specific back-ends get selected
At present, the powerpc generic (processor-independent) perf_counter
code has list of processor back-end modules, and at initialization,
it looks at the PVR (processor version register) and has a switch
statement to select a suitable processor-specific back-end.

This is going to become inconvenient as we add more processor-specific
back-ends, so this inverts the order: now each back-end checks whether
it applies to the current processor, and registers itself if so.
Furthermore, instead of looking at the PVR, back-ends now check the
cur_cpu_spec->oprofile_cpu_type string and match on that.

Lastly, each back-end now specifies a name for itself so the core can
print a nice message when a back-end registers itself.

This doesn't provide any support for unregistering back-ends, but that
wouldn't be hard to do and would allow back-ends to be modules.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: benh@kernel.crashing.org
LKML-Reference: <19000.55529.762227.518531@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 11:11:45 +02:00
Paul Mackerras
448d64f8f4 perf_counter: powerpc: Use unsigned long for register and constraint values
This changes the powerpc perf_counter back-end to use unsigned long
types for hardware register values and for the value/mask pairs used
in checking whether a given set of events fit within the hardware
constraints.  This is in preparation for adding support for the PMU
on some 32-bit powerpc processors.  On 32-bit processors the hardware
registers are only 32 bits wide, and the PMU structure is generally
simpler, so 32 bits should be ample for expressing the hardware
constraints.  On 64-bit processors, unsigned long is 64 bits wide,
so using unsigned long vs. u64 (unsigned long long) makes no actual
difference.

This makes some other very minor changes: adjusting whitespace to line
things up in initialized structures, and simplifying some code in
hw_perf_disable().

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: benh@kernel.crashing.org
LKML-Reference: <19000.55473.26174.331511@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 11:11:45 +02:00
Paul Mackerras
105988c015 perf_counter: powerpc: Enable use of software counters on 32-bit powerpc
This enables the perf_counter subsystem on 32-bit powerpc.  Since we
don't have any support for hardware counters on 32-bit powerpc yet,
only software counters can be used.

Besides selecting HAVE_PERF_COUNTERS for 32-bit powerpc as well as
64-bit, the main thing this does is add an implementation of
set_perf_counter_pending().  This needs to arrange for
perf_counter_do_pending() to be called when interrupts are enabled.
Rather than add code to local_irq_restore as 64-bit does, the 32-bit
set_perf_counter_pending() generates an interrupt by setting the
decrementer to 1 so that a decrementer interrupt will become pending
in 1 or 2 timebase ticks (if a decrementer interrupt isn't already
pending).  When interrupts are enabled, timer_interrupt() will be
called, and some new code in there calls perf_counter_do_pending().
We use a per-cpu array of flags to indicate whether we need to call
perf_counter_do_pending() or not.

This introduces a couple of new Kconfig symbols: PPC_HAVE_PMU_SUPPORT,
which is selected by processor families for which we have hardware PMU
support (currently only PPC64), and PPC_PERF_CTRS, which enables the
powerpc-specific perf_counter back-end.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linuxppc-dev@ozlabs.org
Cc: benh@kernel.crashing.org
LKML-Reference: <19000.55404.103840.393470@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-18 11:11:44 +02:00
Benjamin Herrenschmidt
4b337c5f24 Merge commit 'origin/master' into next 2009-06-18 11:16:55 +10:00
Ingo Molnar
a3d06cc6aa Merge branch 'linus' into perfcounters/core
Conflicts:
	arch/x86/include/asm/kmap_types.h
	include/linux/mm.h

	include/asm-generic/kmap_types.h

Merge reason: We crossed changes with kmap_types.h cleanups in mainline.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-17 13:06:17 +02:00
Linus Torvalds
517d08699b Merge branch 'akpm'
* akpm: (182 commits)
  fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
  fbdev: *bfin*: fix __dev{init,exit} markings
  fbdev: *bfin*: drop unnecessary calls to memset
  fbdev: bfin-t350mcqb-fb: drop unused local variables
  fbdev: blackfin has __raw I/O accessors, so use them in fb.h
  fbdev: s1d13xxxfb: add accelerated bitblt functions
  tcx: use standard fields for framebuffer physical address and length
  fbdev: add support for handoff from firmware to hw framebuffers
  intelfb: fix a bug when changing video timing
  fbdev: use framebuffer_release() for freeing fb_info structures
  radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
  s3c-fb: CPUFREQ frequency scaling support
  s3c-fb: fix resource releasing on error during probing
  carminefb: fix possible access beyond end of carmine_modedb[]
  acornfb: remove fb_mmap function
  mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
  mb862xxfb: restrict compliation of platform driver to PPC
  Samsung SoC Framebuffer driver: add Alpha Channel support
  atmel-lcdc: fix pixclock upper bound detection
  offb: use framebuffer_alloc() to allocate fb_info struct
  ...

Manually fix up conflicts due to kmemcheck in mm/slab.c
2009-06-16 19:50:13 -07:00
Geert Uytterhoeven
ae52bb2384 fbdev: move logo externs to header file
Now we have __initconst, we can finally move the external declarations for
the various Linux logo structures to <linux/linux_logo.h>.

James' ack dates back to the previous submission (way to long ago), when the
logos were still __initdata, which caused failures on some platforms with some
toolchain versions.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: James Simmons <jsimmons@infradead.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16 19:47:57 -07:00
Alexey Dobriyan
bb1f17b037 mm: consolidate init_mm definition
* create mm/init-mm.c, move init_mm there
* remove INIT_MM, initialize init_mm with C99 initializer
* unexport init_mm on all arches:

  init_mm is already unexported on x86.

  One strange place is some OMAP driver (drivers/video/omap/) which
  won't build modular, but it's already wants get_vm_area() export.
  Somebody should look there.

[akpm@linux-foundation.org: add missing #includes]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16 19:47:28 -07:00
Michael Ellerman
ba55bd7436 powerpc: Add configurable -Werror for arch/powerpc
Add the option to build the code under arch/powerpc with -Werror.

The intention is to make it harder for people to inadvertantly introduce
warnings in the arch/powerpc code. It needs to be configurable so that
if a warning is introduced, people can easily work around it while it's
being fixed.

The option is a negative, ie. don't enable -Werror, so that it will be
turned on for allyes and allmodconfig builds.

The default is n, in the hope that developers will build with -Werror,
that will probably lead to some build breaks, I am prepared to be flamed.

It's not enabled for math-emu, which is a steaming pile of warnings.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-16 14:15:45 +10:00
Gerhard Pircher
f1f8b4948d powerpc: Enable additional BAT registers in setup_745x_specifics()
Currently the kernel expects the additional four IBAT and DBAT registers
to be available, but doesn't enable these registers on 745x CPUs, which
have them disabled after reset. Thus set the HIGH_BAT_EN bit in HID0
register, if the corresponding MMU feature is defined.

Signed-off-by: Gerhard Pircher <gerhard_pircher@gmx.net>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-06-15 21:45:31 -05:00
Nate Case
cab888e678 powerpc/fsl-booke: Enable L1 cache on e500v1/e500v2/e500mc CPUs
Some boot loaders may not enable L1 instruction/data cache.  Check if
data and instruction caches are enabled, and enable them if needed.

Signed-off-by: Nate Case <ncase@xes-inc.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-06-15 21:45:30 -05:00
Linus Torvalds
0fa213310c Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (103 commits)
  powerpc: Fix bug in move of altivec code to vector.S
  powerpc: Add support for swiotlb on 32-bit
  powerpc/spufs: Remove unused error path
  powerpc: Fix warning when printing a resource_size_t
  powerpc/xmon: Remove unused variable in xmon.c
  powerpc/pseries: Fix warnings when printing resource_size_t
  powerpc: Shield code specific to 64-bit server processors
  powerpc: Separate PACA fields for server CPUs
  powerpc: Split exception handling out of head_64.S
  powerpc: Introduce CONFIG_PPC_BOOK3S
  powerpc: Move VMX and VSX asm code to vector.S
  powerpc: Set init_bootmem_done on NUMA platforms as well
  powerpc/mm: Fix a AB->BA deadlock scenario with nohash MMU context lock
  powerpc/mm: Fix some SMP issues with MMU context handling
  powerpc: Add PTRACE_SINGLEBLOCK support
  fbdev: Add PLB support and cleanup DCR in xilinxfb driver.
  powerpc/virtex: Add ml510 reference design device tree
  powerpc/virtex: Add Xilinx ML510 reference design support
  powerpc/virtex: refactor intc driver and add support for i8259 cascading
  powerpc/virtex: Add support for Xilinx PCI host bridge
  ...
2009-06-15 09:32:52 -07:00
Paul Mackerras
90c8f95453 perf_counter: powerpc: Fix two compile warnings
This fixes a couple of compile warnings that crept into the powerpc
perf_counter code recently:

   CC      arch/powerpc/kernel/perf_counter.o
 arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
 arch/powerpc/kernel/perf_counter.c:1016: warning: unused variable 'addr'
 arch/powerpc/kernel/perf_counter.c: In function 'hw_perf_counter_init':
 arch/powerpc/kernel/perf_counter.c:891: warning: 'ev' may be used uninitialized in this function

Stephen Rothwell reported this against linux-next as well.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18998.12884.787039.22202@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-15 16:12:25 +02:00
Michael Ellerman
7719ed7ce8 powerpc: Only build prom_init.o when CONFIG_PPC_OF_BOOT_TRAMPOLINE=y
Commit 28794d34 ("powerpc/kconfig: Kill PPC_MULTIPLATFORM"), added
CONFIG_PPC_OF_BOOT_TRAMPOLINE to control the buliding of prom_init.o

However the Makefile still unconditionally builds prom_init_check,
the script that checks prom_init.o for symbol usage, and so in turn
prom_init.o is still always being built. (it's not linked though)

So surround all the prom_init_check logic with an ifeq block testing
if CONFIG_PPC_OF_BOOT_TRAMPOLINE is set.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:27:36 +10:00
Michael Ellerman
e468455e58 powerpc: Fix warning in setup_64.c when CONFIG_RELOCATABLE=y
When CONFIG_RELOCATABLE is enabled, PHYSICAL_START is actually a
variable of type phys_addr_t. That means to print it we need to
cast to unsigned long long and use llx.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:26:21 +10:00
Benjamin Herrenschmidt
177996e6e2 powerpc: Don't do generic calibrate_delay()
Currently we are wasting time calling the generic calibrate_delay()
function. We don't need it since our implementation of __delay() is
based on the CPU timebase. So instead, we use our own small
implementation that initializes loops_per_jiffy to something sensible
to make the few users like spinlock debug be happy

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-15 13:26:17 +10:00
Benjamin Herrenschmidt
7dafd239ab Merge commit 'origin/master' into next 2009-06-15 10:36:54 +10:00
Linus Torvalds
4ddbac9898 Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf_counter: Start documenting HAVE_PERF_COUNTERS requirements
  perf_counter: Add forward/backward attribute ABI compatibility
  perf record: Explicity program a default counter
  perf_counter: Remove PERF_TYPE_RAW special casing
  perf_counter: PERF_TYPE_HW_CACHE is a hardware counter too
  powerpc, perf_counter: Fix performance counter event types
  perf_counter/x86: Add a quirk for Atom processors
  perf_counter tools: Remove one L1-data alias
2009-06-12 13:16:52 -07:00
Jaswinder Singh Rajput
4c921126fe powerpc, perf_counter: Fix performance counter event types
Sachin Sant reported these compiler errors:

 CC      arch/powerpc/kernel/power7-pmu.o
arch/powerpc/kernel/power7-pmu.c:297: error: PERF_COUNT_CPU_CYCLES undeclared here (not in a function)

Which happened because a last-minute rename of symbols crossed with
the Power7 support patch.

Fix this by using the new symbol names.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
LKML-Reference: <1244788494.5554.1.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12 14:21:11 +02:00
Rusty Russell
5933048c69 module: cleanup FIXME comments about trimming exception table entries.
Everyone cut and paste this comment from my original one.  We now do
it generically, so cut the comments.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Amerigo Wang <amwang@redhat.com>
2009-06-12 21:47:05 +09:30
Benjamin Herrenschmidt
bc47ab0241 Merge commit 'origin/master' into next
Manual merge of:
	arch/powerpc/kernel/asm-offsets.c
2009-06-12 16:53:38 +10:00
Benjamin Herrenschmidt
37f9ef553b powerpc: Fix bug in move of altivec code to vector.S
The patch that moved to vector.S and made common between 32 and 64-bit the
altivec code had a nasty bug on 32-bit (did I really test that ?) which
causes the kernel to blr back into userspace ... oops :-)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-12 16:51:41 +10:00
Stephen Rothwell
e14112d1bd perfcounters: remove powerpc definitions of perf_counter_do_pending
Commit 925d519ab8 ("perf_counter:
unify and fix delayed counter wakeup") added global definitions.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-11 20:03:13 -07:00
Peter Zijlstra
8be6e8f3c3 perf_counter: Rename L2 to LL cache
The top (fastest) and last level (biggest) caches are the most
interesting ones, performance wise.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
[ Fixed the Nehalem LL table to LLC Reference/Miss events ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 17:54:17 +02:00
Peter Zijlstra
f4dbfa8f31 perf_counter: Standardize event names
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 17:54:15 +02:00
Paul Mackerras
106b506c3a perf_counter: powerpc: Implement generalized cache events for POWER processors
This adds tables of event codes for the generalized cache events for
all the currently supported powerpc processors: POWER{4,5,5+,6,7} and
PPC970*, plus powerpc-specific code to use these tables when a
generalized cache event is requested.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18992.36430.933526.742969@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 16:48:37 +02:00
Paul Mackerras
4da52960fd perf_counters: powerpc: Add support for POWER7 processors
This adds the back-end for the PMU on POWER7 processors.  POWER7
has 4 fully-programmable counters and two fixed-function counters
(which do respect the freeze conditions, can generate interrupts,
and are writable, unlike PMC5/6 on POWER5+/6).

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18992.36329.189378.17992@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 16:48:37 +02:00
Peter Zijlstra
9e350de37a perf_counter: Accurate period data
We currently log hw.sample_period for PERF_SAMPLE_PERIOD, however this is
incorrect. When we adjust the period, it will only take effect the next
cycle but report it for the current cycle. So when we adjust the period
for every cycle, we're always wrong.

Solve this by keeping track of the last_period.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 02:39:02 +02:00
Peter Zijlstra
df1a132bf3 perf_counter: Introduce struct for sample data
For easy extension of the sample data, put it in a structure.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-11 02:39:02 +02:00
Becky Bruce
ec3cf2ece2 powerpc: Add support for swiotlb on 32-bit
This patch includes the basic infrastructure to use swiotlb
bounce buffering on 32-bit powerpc.  It is not yet enabled on
any platforms.  Probably the most interesting bit is the
addition of addr_needs_map to dma_ops - we need this as
a dma_op because the decision of whether or not an addr
can be mapped by a device is device-specific.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09 16:49:18 +10:00
Stephen Rothwell
bcba0778ee powerpc: Fix warning when printing a resource_size_t
resource_size_t is 64 bits on PowerPC 64.

Gets rid of this warning:

arch/powerpc/kernel/pci_64.c: In function 'pcibios_map_io_space':
arch/powerpc/kernel/pci_64.c:504: warning: format '%016lx' expects type 'long unsigned int', but argument 2 has type 'resource_size_t'

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09 16:47:41 +10:00
Benjamin Herrenschmidt
944916858a powerpc: Shield code specific to 64-bit server processors
This is a random collection of added ifdef's around portions of
code that only mak sense on server processors. Using either
CONFIG_PPC_STD_MMU_64 or CONFIG_PPC_BOOK3S as seems appropriate.

This is meant to make the future merging of Book3E 64-bit support
easier.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09 16:47:38 +10:00
Benjamin Herrenschmidt
91c60b5b82 powerpc: Separate PACA fields for server CPUs
This patch has no effect other than re-ordering PACA fields on
current server CPUs. It however is a pre-requisite for future
support of BookE 64-bit processors. Various parts of the PACA
struct are now moved under some ifdef's, either the new
CONFIG_PPC_BOOK3S or CONFIG_PPC_STD_MMU_64, whatever seems more
appropriate.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.craashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09 16:47:38 +10:00
Benjamin Herrenschmidt
0ebc4cdaa3 powerpc: Split exception handling out of head_64.S
To prepare for future support of Book3E 64-bit PowerPC processors,
which use a completely different exception handling, we move that
code to a new exceptions-64s.S file.

This file is #included from head_64.S due to some of the absolute
address requirements which can currently only be fulfilled from
within that file.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09 16:47:37 +10:00
Benjamin Herrenschmidt
e821ea70f3 powerpc: Move VMX and VSX asm code to vector.S
Currently, load_up_altivec and give_up_altivec are duplicated
in 32-bit and 64-bit. This creates a common implementation that
is moved away from head_32.S, head_64.S and misc_64.S and into
vector.S, using the same macros we already use for our common
implementation of load_up_fpu.

I also moved the VSX code over to vector.S though in that case
I didn't make it build on 32-bit (yet).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09 16:46:25 +10:00
Roland McGrath
ec097c84df powerpc: Add PTRACE_SINGLEBLOCK support
Reworked by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

This adds block-step support on powerpc, including a PTRACE_SINGLEBLOCK
request for ptrace.

The BookE implementation is tweaked to fire a single step after a
block step in order to mimmic the server behaviour.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-09 13:29:25 +10:00
Ingo Molnar
a21ca2cac5 perf_counter: Separate out attr->type from attr->config
Counter type is a frequently used value and we do a lot of
bit juggling by encoding and decoding it from attr->config.

Clean this up by creating a separate attr->type field.

Also clean up the various similarly complex user-space bits
all around counter attribute management.

The net improvement is significant, and it will be easier
to add a new major type (which is what triggered this cleanup).

(This changes the ABI, all tools are adapted.)
(PowerPC build-tested.)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-06 11:37:22 +02:00
Paul Mackerras
1b58c2515b perf_counter: powerpc: Use new identifier names in powerpc-specific code
Commit b23f3325 ("perf_counter: Rename various fields") fixed up
most of the uses of the renamed fields, but missed one instance
of "record_type" in powerpc-specific code which needs to be changed
to "sample_type", and a "PERF_RECORD_ADDR" in the same statement that
needs to be changed to "PERF_SAMPLE_ADDR", causing compilation
errors on powerpc.  This fixes it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18983.3111.770392.800486@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-04 13:20:11 +02:00
Paul Mackerras
dcd945e0d8 perf_counter: powerpc: Fix race causing "oops trying to read PMC0" errors
When using interrupting counters and limited (non-interrupting)
counters at the same time, it's possible that we get an
interrupt in write_mmcr0() after writing MMCR0 but before we
have set up the counters using limited PMCs.  What happens then
is that we get into perf_counter_interrupt() with
counter->hw.idx = 0 for the limited counters, leading to the
"oops trying to read PMC0" error message being printed.

This fixes the problem by making perf_counter_interrupt()
robust against counter->hw.idx being zero (the counter is just
ignored in that case) and also by changing write_mmcr0() to
write MMCR0 initially with the counter overflow interrupt
enable bits masked (set to 0).  If the MMCR0 value requested by
the caller has either of those bits set, we write MMCR0 again
with the requested value of those bits after setting up the
limited counters properly.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <18982.17684.138182.954599@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-03 11:49:53 +02:00
Paul Mackerras
6984efb692 perf_counter: powerpc: Fix event alternative code generation on POWER5/5+
Commit ef923214 ("perf_counter: powerpc: use u64 for event
codes internally") introduced a bug where the return value from
function find_alternative_bdecode gets put into a u64 variable
and later tested to see if it is < 0.  The effect is that we
get extra, bogus event code alternatives on POWER5 and POWER5+,
leading to error messages such as "oops compute_mmcr failed"
being printed and counters not counting properly.

This fixes it by using s64 for the return type of
find_alternative_bdecode and for the local variable that the
caller puts the value in.  It also makes the event argument a
u64 on POWER5+ for consistency.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <18982.17586.666132.90983@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-03 11:49:52 +02:00
Peter Zijlstra
0d48696f87 perf_counter: Rename perf_counter_hw_event => perf_counter_attr
The structure isn't hw only and when I read event, I think about those
things that fall out the other end. Rename the thing.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Cc: Stephane Eranian <eranian@googlemail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-02 21:45:33 +02:00
Peter Zijlstra
b23f3325ed perf_counter: Rename various fields
A few renames:

  s/irq_period/sample_period/
  s/irq_freq/sample_freq/
  s/PERF_RECORD_/PERF_SAMPLE_/
  s/record_type/sample_type/

And change both the new sample_type and read_format to u64.

Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-02 21:45:30 +02:00
Stephen Rothwell
baf75b0a42 powerpc/pci: Fix annotation of pcibios_claim_one_bus
It was __devinit, but it is also within a CONFIG_HOTPLUG guarded section
of code, so the __devinit does nothing but cause the following warning:

WARNING: vmlinux.o(.text+0x107a8): Section mismatch in reference from the function pcibios_finish_adding_to_bus() to the function .devinit.text:pcibios_claim_one_bus()
The function pcibios_finish_adding_to_bus() references
the function __devinit pcibios_claim_one_bus().
This is often because pcibios_finish_adding_to_bus lacks a __devinit
annotation or the annotation of pcibios_claim_one_bus is wrong.

It is also only (externally) used in arch/powerpc/kernel/of_platform.c
which cannot be built as a module so don't export it.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-02 11:09:12 +10:00
Michael Ellerman
92e02a5125 powerpc/ftrace: Use PPC_INST_NOP directly
There's no need to wrap PPC_INST_NOP in a static inline.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-02 10:36:53 +10:00
Michael Ellerman
898b160fe9 powerpc/ftrace: Remove unused macros
These macros were used in the original port, but since commit
e4486fe316 (ftrace, use probe_kernel API to modify code) they
are unused.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-02 10:36:46 +10:00
Michael Ellerman
4a9e3f8e94 powerpc/ftrace: Use ppc_function_entry() instead of GET_ADDR
Use ppc_function_entry() from code-patching.h.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-02 10:36:32 +10:00
Nathan Fontenot
f03cdb3a66 powerpc: Display processor virtualization resource allocs in lparcfg
This patch updates the output from /proc/ppc64/lparcfg to display the
processor virtualization resource allocations for a shared processor
partition.

This information is already gathered via the h_get_ppp call, we just
have to make sure that the ibm,partition-performance-parameters-level
property is >= 1 to ensure that the information is valid.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-06-02 10:36:10 +10:00
Ingo Molnar
23db9f430b Merge branch 'linus' into perfcounters/core
Merge reason: merge almost-rc8 into perfcounters/core, which was -rc6
              based - to pick up the latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-01 10:01:39 +02:00
Benjamin Herrenschmidt
435462c6e6 Merge branch 'merge' into next 2009-05-29 13:54:52 +10:00
Benjamin Herrenschmidt
8b31e49d1d powerpc: Fix up dma_alloc_coherent() on platforms without cache coherency.
The implementation we just revived has issues, such as using a
Kconfig-defined virtual address area in kernel space that nothing
actually carves out (and thus will overlap whatever is there),
or having some dependencies on being self contained in a single
PTE page which adds unnecessary constraints on the kernel virtual
address space.

This fixes it by using more classic PTE accessors and automatically
locating the area for consistent memory, carving an appropriate hole
in the kernel virtual address space, leaving only the size of that
area as a Kconfig option. It also brings some dma-mask related fixes
from the ARM implementation which was almost identical initially but
grew its own fixes.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-27 16:33:59 +10:00
Paul Mackerras
8a7b8cb91f perf_counter: powerpc: Implement interrupt throttling
This implements interrupt throttling on powerpc.  Since we don't have
individual count enable/disable or interrupt enable/disable controls
per counter, this simply sets the hardware counter to 0, meaning that
it will not interrupt again until it has counted 2^31 counts, which
will take at least 2^30 cycles assuming a maximum of 2 counts per
cycle.  Also, we set counter->hw.period_left to the maximum possible
value (2^63 - 1), so we won't report overflows for this counter for
the forseeable future.

The unthrottle operation restores counter->hw.period_left and the
hardware counter so that we will once again report a counter overflow
after counter->hw.irq_period counts.

[ Impact: new perfcounters robustness feature on PowerPC ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18971.35823.643362.446774@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-26 09:43:59 +02:00
Geert Uytterhoeven
80947e7c99 powerpc: Keep track of emulated instructions
If CONFIG_PPC_EMULATED_STATS is enabled, make available counters for the
various classes of emulated instructions under
/sys/kernel/debug/powerpc/emulated_instructions/ (assumed debugfs is mounted on
/sys/kernel/debug).  Optionally (controlled by
/sys/kernel/debug/powerpc/emulated_instructions/do_warn), rate-limited warnings
can be printed to the console when instructions are emulated.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:26 +10:00
Anton Blanchard
8d165db107 powerpc: Improve decrementer accuracy
I have been looking at sources of OS jitter and notice that after a long
NO_HZ idle period we wakeup too early:

relative time (us)    event
                      timer irq exit
    999946.405        timer irq entry
         4.835        timer irq exit
        21.685        timer irq entry
         3.540          timer (tick_sched_timer) entry

Here we slept for just under a second then took a timer interrupt that did
nothing. 21.685 us later we wake up again and do the work.

We set a rather low shift value of 16 for the decrementer clockevent, which I
think is causing this issue. On this box we have a 207MHz decrementer and see:

clockevent: decrementer mult[3501] shift[16] cpu[0]

For calculations of large intervals this mult/shift combination could be
off by a significant amount. I notice the sparc code has a loop that iterates
to find a mult/shift combination that maximises the shift value while
keeping mult under 32bit. With the patch below we get:

clockevent: decrementer mult[35015c20] shift[32] cpu[15]

And we no longer see the spurious wakeups.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
9aa4e7b169 powerpc/pci: Remove redundant pcnet32 fixup
The pcnet32 driver has had in its device id table for some time a match
against the "broken" vendor ID.  No need to fake out the vendor ID
with an explicit fixup.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
cf6692c07a powerpc/pci: Cleanup some minor cruft
* Removed building setup-irq on ppc32, we don't use it anymore
* Remove duplicate prototype for setup_grackle() code that needs it
  gets it from <asm/grackle.h>
* Removed gratuitous pci_io_size type differences between ppc32/ppc64

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
2eb4afb69f powerpc/pci: Move pseries code into pseries platform specific area
There doesn't appear to be any specific reason that we need to setup the
pseries specific notifier in generic arch pci code.  Move it into pseries
land.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:24 +10:00
Kumar Gala
2f52297665 powerpc/pci: Clean up direct access to sysdata by RTAS
We shouldn't directly access sysdata to get the device node but call
pci_bus_to_OF_node() for this purpose.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:23 +10:00
Milton Miller
60dbf43851 powerpc: Add 2.06 tlbie mnemonics
This adds the PowerPC 2.06 tlbie mnemonics and keeps backwards
compatibilty for CPUs before 2.06.

Only useful for bare metal systems.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:21 +10:00
Michael Ellerman
835363e67d powerpc/irq: Remove fallback to __do_IRQ()
We should no longer have any irq code that needs __do_IRQ(), so
remove the fallback to __do_IRQ().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:44:20 +10:00
Michael Ellerman
9b647a30cb powerpc/irq: Move get_irq() comment into header
The guts of do_IRQ() isn't really the right place to be documenting
the ppc_md.get_irq() interface. So move the comment into machdep.h

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:59 +10:00
Michael Ellerman
d7cb10d6d2 powerpc/irq: Move stack overflow check into a separate function
Makes do_IRQ() shorter and clearer.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:58 +10:00
Michael Ellerman
f2694ba568 powerpc/irq: Move #ifdef'ed body of do_IRQ() into a separate function
Rather than a giant ifdef in the body of do_IRQ(), including a
dangling else, move the irq stack logic into a separate routine and
do the ifdef there.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-21 15:43:58 +10:00
Paul Mackerras
c0daaf3f1f perf_counter: powerpc: initialize cpuhw pointer before use
Commit 9e35ad38 ("perf_counter: Rework the perf counter
disable/enable") added code to the powerpc hw_perf_enable (renamed
from hw_perf_restore) to test cpuhw->disabled and return immediately
if it is not set (i.e. if the PMU is already enabled).

Unfortunately the test got added before cpuhw was initialized,
resulting in an oops the first time hw_perf_enable got called.
This fixes it by moving the initialization of cpuhw to before
cpuhw->disabled is tested.

[ Impact: fix oops-causing bug on powerpc ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <18960.56772.869734.304631@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 07:38:42 +02:00
Ingo Molnar
dc3f81b129 Merge commit 'v2.6.30-rc6' into perfcounters/core
Merge reason: this branch was on an -rc4 base, merge it up to -rc6
              to get the latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 07:37:49 +02:00
Benjamin Herrenschmidt
0e337b42d6 powerpc: Explicit alignment for .data.cacheline_aligned
I don't think anything guarantees that the objects in data.page_aligned
are a multiple of PAGE_SIZE, thus the section may end on any boundary.

So the following section, .data.cacheline_aligned needs an explicit
alignment.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:05 +10:00
Steven Rostedt
c3cf8667ed powerpc/ftrace: Fix constraint to be early clobber
After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
graph tracer broke. This was discovered on my x86 boxes.

The issue is that gcc used the same register for an output as it did for
an input in an asm statement. I first thought this was a bug in gcc and
reported it. I was notified that gcc was correct and that the output had
to be flagged as an "early clobber".

I noticed that powerpc had the same issue and this patch fixes it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:05 +10:00
Michael Ellerman
021376a3b6 powerpc/ftrace: Use pr_devel() in ftrace.c
pr_debug() can now result in code being generated even when #DEBUG
is not defined. That's not really desirable in the ftrace code
which we want to be snappy.

With CONFIG_DYNAMIC_DEBUG=y:

size before:
   text	   data	    bss	    dec	    hex	filename
   3334	    672	      4	   4010	    faa	arch/powerpc/kernel/ftrace.o

size after:
   text	   data	    bss	    dec	    hex	filename
   2616	    360	      4	   2980	    ba4	arch/powerpc/kernel/ftrace.o

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-18 15:19:04 +10:00
Paul Mackerras
0bbd0d4be8 perf_counter: powerpc: supply more precise information on counter overflow events
This uses values from the MMCRA, SIAR and SDAR registers on
powerpc to supply more precise information for overflow events,
including a data address when PERF_RECORD_ADDR is specified.

Since POWER6 uses different bit positions in MMCRA from earlier
processors, this converts the struct power_pmu limited_pmc5_6
field, which only had 0/1 values, into a flags field and
defines bit values for its previous use (PPMU_LIMITED_PMC5_6)
and a new flag (PPMU_ALT_SIPR) to indicate that the processor
uses the POWER6 bit positions rather than the earlier
positions.  It also adds definitions in reg.h for the new and
old positions of the bit that indicates that the SIAR and SDAR
values come from the same instruction.

For the data address, the SDAR value is supplied if we are not
doing instruction sampling.  In that case there is no guarantee
that the address given in the PERF_RECORD_ADDR subrecord will
correspond to the instruction whose address is given in the
PERF_RECORD_IP subrecord.

If instruction sampling is enabled (e.g. because this counter
is counting a marked instruction event), then we only supply
the SDAR value for the PERF_RECORD_ADDR subrecord if it
corresponds to the instruction whose address is in the
PERF_RECORD_IP subrecord.  Otherwise we supply 0.

[ Impact: support more PMU hardware features on PowerPC ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18955.37028.48861.555309@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15 16:38:57 +02:00
Paul Mackerras
ef923214a4 perf_counter: powerpc: use u64 for event codes internally
Although the perf_counter API allows 63-bit raw event codes,
internally in the powerpc back-end we had been using 32-bit
event codes.  This expands them to 64 bits so that we can add
bits for specifying threshold start/stop events and instruction
sampling modes later.

This also corrects the return value of can_go_on_limited_pmc;
we were returning an event code rather than just a 0/1 value in
some circumstances. That didn't particularly matter while event
codes were 32-bit, but now that event codes are 64-bit it
might, so this fixes it.

[ Impact: extend PowerPC perfcounter interfaces from u32 to u64 ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <18955.36874.472452.353104@drongo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15 16:38:55 +02:00
Peter Zijlstra
60db5e09c1 perf_counter: frequency based adaptive irq_period
Instead of specifying the irq_period for a counter, provide a target interrupt
frequency and dynamically adapt the irq_period to match this frequency.

[ Impact: new perf-counter attribute/feature ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20090515132018.646195868@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15 15:26:56 +02:00
Peter Zijlstra
9e35ad388b perf_counter: Rework the perf counter disable/enable
The current disable/enable mechanism is:

	token = hw_perf_save_disable();
	...
	/* do bits */
	...
	hw_perf_restore(token);

This works well, provided that the use nests properly. Except we don't.

x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore
provide a reference counter disable/enable interface, where the first disable
disables the hardware, and the last enable enables the hardware again.

[ Impact: refactor, simplify the PMU disable/enable logic ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-15 09:47:02 +02:00
Benjamin Herrenschmidt
ad892a63f6 powerpc: Fix PCI ROM access
A couple of issues crept in since about 2.6.27 related to accessing PCI
device ROMs on various powerpc machines.

First, historically, we don't allocate the ROM resource in the resource
tree. I'm not entirely certain of why, I susepct they often contained
garbage on x86 but it's hard to tell. This causes the current generic
code to always call pci_assign_resource() when trying to access the said
ROM from sysfs, which will try to re-assign some new address regardless
of what the ROM BAR was already set to at boot time. This can be a
problem on hypervisor platforms like pSeries where we aren't supposed
to move PCI devices around (and in fact probably can't).

Second, our code that generates the PCI tree from the OF device-tree
(instead of doing config space probing) which we mostly use on pseries
at the moment, didn't set the (new) flag IORESOURCE_SIZEALIGN on any
resource. That means that any attempt at re-assigning such a resource
with pci_assign_resource() would fail due to resource_alignment()
returning 0.

This fixes this by doing these two things:

 - The code that calculates resource flags based on the OF device-node
is improved to set IORESOURCE_SIZEALIGN on any valid BAR, and while at
it also set IORESOURCE_READONLY for ROMs since we were lacking that too

 - We now allocate ROM resources as part of the resource tree. However
to limit the chances of nasty conflicts due to busted firmwares, we
only do it on the second pass of our two-passes allocation scheme,
so that all valid and enabled BARs get precedence.

This brings pSeries back the ability to access PCI ROMs via sysfs (and
thus initialize various video cards from X etc...).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-15 16:43:42 +10:00
Benjamin Herrenschmidt
b173f03d7c powerpc/pseries: Really fix the oprofile CPU type on pseries
My previous pach for fixing the oprofile CPU type got somewhat mismerged
(by my fault) when it collided with another related patch. This should
finally (fingers crossed) fix the whole thing.

We make sure we keep the -old- oprofile type and CPU type whenever
one of them was specified in the first pass through the function.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-15 16:43:42 +10:00
Becky Bruce
49a8496525 powerpc: Allow mem=x cmdline to work with 4G+
We're currently choking on mem=4g (and above) due to memory_limit
being specified as an unsigned long. Make memory_limit
phys_addr_t to fix this.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-15 16:43:41 +10:00
Benjamin Herrenschmidt
0203d6ec4e powerpc: Fix setting of oprofile cpu type
commit 2657dd4e30 introduced a
bug where we would now always override the "real" oprofile CPU
type with the "compatible" one provided by a pseudo-PVR in the
device-tree which is incorrect and breaks oprofile on all current
configs since the "compatible" ones aren't yet recognized.

This fixes it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-01 15:12:05 +10:00
Michael Wolf
79af6c49a9 powerpc adjust oprofile_cpu_type version 3
Oprofile is changing the naming it is using for the compatibility modes.
Instead of having compat-power<x>, oprofile will go to family naming
convention and use ibm-compat-v<x>.  Currently only ibm-compat-v1 will
be defined.
The notion of compatibility events just started with POWER6. So there is
no way that any other tool could exist that is using these
oprofile_cpu_type strings we want to change.

Signed-off-by: Mike Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-05-01 15:12:05 +10:00
Paul Mackerras
ab7ef2e50a perf_counter: powerpc: allow use of limited-function counters
POWER5+ and POWER6 have two hardware counters with limited functionality:
PMC5 counts instructions completed in run state and PMC6 counts cycles
in run state.  (Run state is the state when a hardware RUN bit is 1;
the idle task clears RUN while waiting for work to do and sets it when
there is work to do.)

These counters can't be written to by the kernel, can't generate
interrupts, and don't obey the freeze conditions.  That means we can
only use them for per-task counters (where we know we'll always be in
run state; we can't put a per-task counter on an idle task), and only
if we don't want interrupts and we do want to count in all processor
modes.

Obviously some counters can't go on a limited hardware counter, but there
are also situations where we can only put a counter on a limited hardware
counter - if there are already counters on that exclude some processor
modes and we want to put on a per-task cycle or instruction counter that
doesn't exclude any processor mode, it could go on if it can use a
limited hardware counter.

To keep track of these constraints, this adds a flags argument to the
processor-specific get_alternatives() functions, with three bits defined:
one to say that we can accept alternative event codes that go on limited
counters, one to say we only want alternatives on limited counters, and
one to say that this is a per-task counter and therefore events that are
gated by run state are equivalent to those that aren't (e.g. a "cycles"
event is equivalent to a "cycles in run state" event).  These flags
are computed for each counter and stored in the counter->hw.counter_base
field (slightly wonky name for what it does, but it was an existing
unused field).

Since the limited counters don't freeze when we freeze the other counters,
we need some special handling to avoid getting skew between things counted
on the limited counters and those counted on normal counters.  To minimize
this skew, if we are using any limited counters, we read PMC5 and PMC6
immediately after setting and clearing the freeze bit.  This is done in
a single asm in the new write_mmcr0() function.

The code here is specific to PMC5 and PMC6 being the limited hardware
counters.  Being more general (e.g. having a bitmap of limited hardware
counter numbers) would have meant more complex code to read the limited
counters when freezing and unfreezing the normal counters, with
conditional branches, which would have increased the skew.  Since it
isn't necessary for the code to be more general at this stage, it isn't.

This also extends the back-ends for POWER5+ and POWER6 to be able to
handle up to 6 counters rather than the 4 they previously handled.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <18936.19035.163066.892208@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29 14:58:35 +02:00
Robert Richter
4aeb0b4239 perfcounters: rename struct hw_perf_counter_ops into struct pmu
This patch renames struct hw_perf_counter_ops into struct pmu. It
introduces a structure to describe a cpu specific pmu (performance
monitoring unit). It may contain ops and data. The new name of the
structure fits better, is shorter, and thus better to handle. Where it
was appropriate, names of function and variable have been changed too.

[ Impact: cleanup ]

Signed-off-by: Robert Richter <robert.richter@amd.com>
Cc: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1241002046-8832-7-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29 14:51:03 +02:00
Ingo Molnar
e7fd5d4b3d Merge branch 'linus' into perfcounters/core
Merge reason: This brach was on -rc1, refresh it to almost-rc4 to pick up
              the latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-29 14:47:05 +02:00
Linus Torvalds
c3310e7766 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc/ps3: Fix build error on UP
  powerpc/cell: Select PCI for IBM_CELL_BLADE AND CELLEB
  powerpc: ppc32 needs elf_read_implies_exec()
  powerpc/86xx: Add device_type entry to soc for ppc9a
  powerpc/44x: Correct memory size calculation for denali-based boards
  maintainers: Fix PowerPC 4xx git tree
  powerpc: fix for long standing bug noticed by gcc 4.4.0
  Revert "powerpc: Add support for early tlbilx opcode"
2009-04-28 15:55:32 -07:00
Tim Abbott
13beadd91f powerpc: Revert switch to TEXT_TEXT in linker script
Commit edada399 broke the build on 64-bit powerpc because it moved the
__ftr_alt_* sections of a file away from the .text section, causing
link failures due to relative conditional branch targets being too far
away from the branch instructions.  This happens on pretty much all
64-bit powerpc configs.

This change reverts commit edada399 while preserving the update from
the *.refok sections to .ref.text that has happened since.

Signed-off-by: Tim Abbott <tabbott@mit.edu>
Requested-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-28 15:55:14 -07:00
Tim Abbott
edada399e8 powerpc: Use TEXT_TEXT macro in linker script.
Rather than adding .ref.text to the powerpc linker script so that we
can use __REF on the powerpc architecture, it seems simpler to switch
to using the generic TEXT_TEXT macro.

Signed-off-by: Tim Abbott <tabbott@mit.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-27 19:51:58 -07:00
Tim Abbott
e703984587 powerpc: convert to use __HEAD and HEAD_TEXT macros.
This has the consequence of changing the section name use for head
code from ".text.head" to ".head.text".  Since this commit changes all
users in the architecture, this change should be harmless.

Signed-off-by: Tim Abbott <tabbott@mit.edu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-26 09:20:38 -07:00
Linus Torvalds
f8c3301e83 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: Fix modular build of ide-pmac when mediabay is built in
  powerpc/pasemi: Fix build error on UP
  powerpc: Make macintosh/mediabay driver depend on CONFIG_BLOCK
  maintainers: Fix PS3 patterns
  powerpc/ps3: Fix CONFIG_PS3_FLASH=n build warning
  powerpc/32: Don't clobber personality flags on exec
  powerpc: Fix crash on CPU hotplug
  powerpc/85xx: Remove defconfigs that mpc85xx_{smp_}defconfig cover
  powerpc/85xx: Added SMP defconfig
  powerpc/85xx: Enabled a bunch of FSL specific drivers/options
  powerpc/85xx: Updated generic mpc85xx_defconfig
  powerpc: don't disable SATA interrupts on Freescale MPC8610 HPCD
  fsl_rio: Pass the proper device to dma mapping routines
  powerpc: Fix of_node_put() exit path in of_irq_map_one()
  powerpc/5200: defconfig updates
  powerpc/5200: Add FLASH nodes to lite5200 device tree
  powerpc/device-tree: Document MTD nodes with multiple "reg" tuples
  powerpc/of-device-tree: Factor MTD physmap bindings out of booting-without-of
  powerpc/5200: Bring the legacy fsl_spi_platform_data hooks back
2009-04-24 07:44:58 -07:00
Kumar Gala
323d23aeac Revert "powerpc: Add support for early tlbilx opcode"
This reverts commit e996557740.  Our HW
guys were able to fix this so it never sees the light of day.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-04-23 08:51:22 -05:00
Paul Mackerras
5bd3ef84d7 Merge branch 'merge' of git://git.secretlab.ca/git/linux-2.6 into merge 2009-04-22 13:02:09 +10:00
Magnus Damm
8e19608e8b clocksource: pass clocksource to read() callback
Pass clocksource pointer to the read() callback for clocksources.  This
allows us to share the callback between multiple instances.

[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-21 13:41:47 -07:00
Ilpo Järvinen
6d25b688ec powerpc: Fix of_node_put() exit path in of_irq_map_one()
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-04-20 12:18:43 -06:00
Linus Torvalds
a23c218bd3 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: pseries/dtl.c should include asm/firmware.h
  powerpc: Fix data-corrupting bug in __futex_atomic_op
  powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset()
  powerpc: Allow 256kB pages with SHMEM
  powerpc: Document new FSL I2C bindings and cleanup
  powerpc/mm: Fix compile warning
  powerpc/85xx: TQM8548: update defconfig
  powerpc/85xx: TQM8548: use proper phy-handles for enet2 and enet3
  powerpc/85xx: TQM85xx: correct address of LM75 I2C device nodes
  powerpc: Add support for early tlbilx opcode
  powerpc: Fix tlbilx opcode
2009-04-15 08:42:40 -07:00
Paul Mackerras
ca8f2d7f01 perf_counter: powerpc: add nmi_enter/nmi_exit calls
Impact: fix potential deadlocks on powerpc

Now that the core is using in_nmi() (added in e30e08f6, "perf_counter:
fix NMI race in task clock"), we need the powerpc perf_counter_interrupt
to call nmi_enter() and nmi_exit() in those cases where the interrupt
happens when interrupts are soft-disabled.

If interrupts were soft-enabled, we can treat it as a regular interrupt
and do irq_enter/irq_exit around the whole routine. This lets us get rid
of the test_perf_counter_pending() call at the end of
perf_counter_interrupt, thus simplifying things a little.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18909.31952.873098.336615@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-09 07:56:08 +02:00
Peter Zijlstra
78f13e9525 perf_counter: allow for data addresses to be recorded
Paul suggested we allow for data addresses to be recorded along with
the traditional IPs as power can provide these.

For now, only the software pagefault events provide data addresses,
but in the future power might as well for some events.

x86 doesn't seem capable of providing this atm.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090408130409.394816925@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 19:05:56 +02:00
Paul Mackerras
f708223d49 perf_counter: powerpc: set sample enable bit for marked instruction events
Impact: enable access to hardware feature

POWER processors have the ability to "mark" a subset of the instructions
and provide more detailed information on what happens to the marked
instructions as they flow through the pipeline.  This marking is
enabled by the "sample enable" bit in MMCRA, and there are
synchronization requirements around setting and clearing the bit.

This adds logic to the processor-specific back-ends so that they know
which events relate to marked instructions and set the sampling enable
bit if any event that we want to put on the PMU is a marked instruction
event.  It also adds logic to the generic powerpc code to do the
necessary synchronization if that bit is set.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18908.31930.1024.228867@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 12:39:28 +02:00
Paul Mackerras
dc66270b51 perf_counter: fix powerpc build
Commit 4af4998b ("perf_counter: rework context time") changed struct
perf_counter_context to have a 'time' field instead of a 'time_now'
field, but neglected to fix the place in the powerpc perf_counter.c
where the time_now field was accessed.  This fixes it.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <18908.31922.411398.147810@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 12:39:27 +02:00
Ingo Molnar
5ea472a77f Merge commit 'v2.6.30-rc1' into perfcounters/core
Conflicts:
	arch/powerpc/include/asm/systbl.h
	arch/powerpc/include/asm/unistd.h
	include/linux/init_task.h

Merge reason: the conflicts are non-trivial: PowerPC placement
              of sys_perf_counter_open has to be mixed with the
	      new preadv/pwrite syscalls.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-08 10:35:30 +02:00
Yang Hongyang
284901a90a dma-mapping: replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)
Replace all DMA_32BIT_MASK macro with DMA_BIT_MASK(32)

Signed-off-by: Yang Hongyang<yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-07 08:31:11 -07:00
Peter Zijlstra
f6c7d5fe58 perf_counter: theres more to overflow than writing events
Prepare for more generic overflow handling. The new perf_counter_overflow()
method will handle the generic bits of the counter overflow, and can return
a !0 return value, in which case the counter should be (soft) disabled, so
that it won't count until it's properly disabled.

XXX: do powerpc and swcounter

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
LKML-Reference: <20090406094517.812109629@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-07 10:48:56 +02:00
Kumar Gala
e996557740 powerpc: Add support for early tlbilx opcode
During the ISA 2.06 development the opcode for tlbilx changed and some
early implementations used to old opcode.  Add support for a MMU_FTR
fixup to deal with this.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-04-07 01:36:30 -05:00
Michael Ellerman
7ddb7ad11f powerpc/ftrace: Fix printf format warning
'tramp' is an unsigned long, so print it with %lx.

Fixes the following build warning:
arch/powerpc/kernel/ftrace.c:291: error: format ‘%x’ expects type ‘unsigned int’, but argument 2 has type ‘long unsigned int’

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:19:00 +10:00
Michael Ellerman
f4952f6cbe powerpc/ftrace: Fix #if that should be #ifdef
Commit bb7253403f ("powerpc64,
ftrace: save toc only on modules for function graph"), added an
#if CONFIG_PPC64.  This changes it to #ifdef.

Fixes the following warning on 32-bit builds:
 arch/powerpc/kernel/ftrace.c:562:5: error: "CONFIG_PPC64" is not defined

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:19:00 +10:00
Michael Neuling
bc826666e4 powerpc: Fix ptrace compat wrapper for FPU register access
The ptrace compat wrapper mishandles access to the fpu registers.  The
PTRACE_PEEKUSR and PTRACE_POKEUSR requests miscalculate the index into
the fpr array due to the broken FPINDEX macro.  The
PPC_PTRACE_PEEKUSR_3264 request needs to use the same formula that the
native ptrace interface uses when operating on the register number (as
opposed to the 4-byte offset).  The PPC_PTRACE_POKEUSR_3264 request
didn't take TS_FPRWIDTH into account.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:19:00 +10:00
Michael Ellerman
c7d07fdd5a powerpc: Print information about mapping hw irqs to virtual irqs
The irq remapping layer seems to cause some confusion when people
see a different irq number in /proc/interrupts vs the one they
request in their driver or DTS.

So have the irq remapping layer print out a message when we map an
irq. The message is only printed the first time the irq is mapped,
and it's KERN_DEBUG so most people won't see it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:19:00 +10:00
Michael Neuling
7e875e9dc8 powerpc: Disable VSX or current process in giveup_fpu/altivec
When we call giveup_fpu, we need to need to turn off VSX for the
current process.  If we don't, on return to userspace it may execute a
VSX instruction before the next FP instruction, and not have its
register state refreshed correctly from the thread_struct.  Ditto for
altivec.

This caused a bug where an unaligned lfs or stfs results in
fix_alignment calling giveup_fpu so it can use the FPRs (in order to
do a single <-> double conversion), and then returning to userspace
with FP off but VSX on.  Then if a VSX instruction is executed, before
another FP instruction, it will proceed without another exception and
hence have the incorrect register state for VSX registers 0-31.

   lfs unaligned   <- alignment exception turns FP off but leaves VSX on

   VSX instruction <- no exception since VSX on, hence we get the
                      wrong VSX register values for VSX registers 0-31,
                      which overlap the FPRs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:18:59 +10:00
Anton Blanchard
856cc2f0be powerpc/pseries: Fix ibm,client-architecture comment
We specify a 64MB RMO, but the comment says 128MB.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:18:59 +10:00
Anton Blanchard
0559f0a761 powerpc/pseries: Add dispatch dispersion statistics
PHYP tells us how often a shared processor dispatch changed physical cpus.
This can highlight performance problems caused by the hypervisor.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:18:59 +10:00
Anton Blanchard
1f8737aab3 powerpc: Clean up some prom printouts
Make all messages consistent, some have spaces before the "...", some do not.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:18:59 +10:00
Anton Blanchard
4da727ae2a powerpc: Print progress of ibm,client-architecture method
The ibm,client-architecture method will often cause a reconfiguration reboot.
When this happens the last thing we see is:

	Hypertas detected, assuming LPAR !

Which doesn't explain what just happened.  Wrap the ibm,client-architecture
so it's clear what is going on:

	Calling ibm,client-architecture... done

In order to maintain the law of conservation of screen real estate, downgrade
two other messages to debug.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:18:58 +10:00
Huang Weiyi
85701e6ac1 powerpc: Remove duplicated #include's
Remove duplicated #include's in
  - arch/powerpc/include/asm/ps3fb.h
  - arch/powerpc/kernel/setup-common.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-07 15:18:58 +10:00
Paul Mackerras
d5d2bc0dd0 perf_counter: make it possible for hw_perf_counter_init to return error codes
Impact: better error reporting

At present, if hw_perf_counter_init encounters an error, all it can do
is return NULL, which causes sys_perf_counter_open to return an EINVAL
error to userspace.  This isn't very informative for userspace; it means
that userspace can't tell the difference between "sorry, oprofile is
already using the PMU" and "we don't support this CPU" and "this CPU
doesn't support the requested generic hardware event".

This commit uses the PTR_ERR/ERR_PTR/IS_ERR set of macros to let
hw_perf_counter_init return an error code on error rather than just NULL
if it wishes.  If it does so, that error code will be returned from
sys_perf_counter_open to userspace.  If it returns NULL, an EINVAL
error will be returned to userspace, as before.

This also adapts the powerpc hw_perf_counter_init to make use of this
to return ENXIO, EINVAL, EBUSY, or EOPNOTSUPP as appropriate.  It would
be good to add extra error numbers in future to allow userspace to
distinguish the various errors that are currently reported as EINVAL,
i.e. irq_period < 0, too many events in a group, conflict between
exclude_* settings in a group, and PMU resource conflict in a group.

[ v2: fix a bug pointed out by Corey Ashford where error returns from
      hw_perf_counter_init were not handled correctly in the case of
      raw hardware events.]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090330171023.682428180@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:40 +02:00
Paul Mackerras
7595d63b3a perf_counter: powerpc: only reserve PMU hardware when we need it
Impact: cooperate with oprofile

At present, on PowerPC, if you have perf_counters compiled in, oprofile
doesn't work.  There is code to allow the PMU to be shared between
competing subsystems, such as perf_counters and oprofile, but currently
the perf_counter subsystem reserves the PMU for itself at boot time,
and never releases it.

This makes perf_counter play nicely with oprofile.  Now we keep a count
of how many perf_counter instances are counting hardware events, and
reserve the PMU when that count becomes non-zero, and release the PMU
when that count becomes zero.  This means that it is possible to have
perf_counters compiled in and still use oprofile, as long as there are
no hardware perf_counters active.  This also means that if oprofile is
active, sys_perf_counter_open will fail if the hw_event specifies a
hardware event.

To avoid races with other tasks creating and destroying perf_counters,
we use a mutex.  We use atomic_inc_not_zero and atomic_add_unless to
avoid having to take the mutex unless there is a possibility of the
count going between 0 and 1.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090330171023.627912475@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:39 +02:00
Peter Zijlstra
925d519ab8 perf_counter: unify and fix delayed counter wakeup
While going over the wakeup code I noticed delayed wakeups only work
for hardware counters but basically all software counters rely on
them.

This patch unifies and generalizes the delayed wakeup to fix this
issue.

Since we're dealing with NMI context bits here, use a cmpxchg() based
single link list implementation to track counters that have pending
wakeups.

[ This should really be generic code for delayed wakeups, but since we
  cannot use cmpxchg()/xchg() in generic code, I've let it live in the
  perf_counter code. -- Eric Dumazet could use it to aggregate the
  network wakeups. ]

Furthermore, the x86 method of using TIF flags was flawed in that its
quite possible to end up setting the bit on the idle task, loosing the
wakeup.

The powerpc method uses per-cpu storage and does appear to be
sufficient.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:36 +02:00
Paul Mackerras
53cfbf5937 perf_counter: record time running and time enabled for each counter
Impact: new functionality

Currently, if there are more counters enabled than can fit on the CPU,
the kernel will multiplex the counters on to the hardware using
round-robin scheduling.  That isn't too bad for sampling counters, but
for counting counters it means that the value read from a counter
represents some unknown fraction of the true count of events that
occurred while the counter was enabled.

This remedies the situation by keeping track of how long each counter
is enabled for, and how long it is actually on the cpu and counting
events.  These times are recorded in nanoseconds using the task clock
for per-task counters and the cpu clock for per-cpu counters.

These values can be supplied to userspace on a read from the counter.
Userspace requests that they be supplied after the counter value by
setting the PERF_FORMAT_TOTAL_TIME_ENABLED and/or
PERF_FORMAT_TOTAL_TIME_RUNNING bits in the hw_event.read_format field
when creating the counter.  (There is no way to change the read format
after the counter is created, though it would be possible to add some
way to do that.)

Using this information it is possible for userspace to scale the count
it reads from the counter to get an estimate of the true count:

true_count_estimate = count * total_time_enabled / total_time_running

This also lets userspace detect the situation where the counter never
got to go on the cpu: total_time_running == 0.

This functionality has been requested by the PAPI developers, and will
be generally needed for interpreting the count values from counting
counters correctly.

In the implementation, this keeps 5 time values (in nanoseconds) for
each counter: total_time_enabled and total_time_running are used when
the counter is in state OFF or ERROR and for reporting back to
userspace.  When the counter is in state INACTIVE or ACTIVE, it is the
tstamp_enabled, tstamp_running and tstamp_stopped values that are
relevant, and total_time_enabled and total_time_running are determined
from them.  (tstamp_stopped is only used in INACTIVE state.)  The
reason for doing it like this is that it means that only counters
being enabled or disabled at sched-in and sched-out time need to be
updated.  There are no new loops that iterate over all counters to
update total_time_enabled or total_time_running.

This also keeps separate child_total_time_running and
child_total_time_enabled fields that get added in when reporting the
totals to userspace.  They are separate fields so that they can be
atomic.  We don't want to use atomics for total_time_running,
total_time_enabled etc., because then we would have to use atomic
sequences to update them, which are slower than regular arithmetic and
memory accesses.

It is possible to measure total_time_running by adding a task_clock
counter to each group of counters, and total_time_enabled can be
measured approximately with a top-level task_clock counter (though
inaccuracies will creep in if you need to disable and enable groups
since it is not possible in general to disable/enable the top-level
task_clock counter simultaneously with another group).  However, that
adds extra overhead - I measured around 15% increase in the context
switch latency reported by lat_ctx (from lmbench) when a task_clock
counter was added to each of 2 groups, and around 25% increase when a
task_clock counter was added to each of 4 groups.  (In both cases a
top-level task-clock counter was also added.)

In contrast, the code added in this commit gives better information
with no overhead that I could measure (in fact in some cases I
measured lower times with this code, but the differences were all less
than one standard deviation).

[ v2: address review comments by Andrew Morton. ]

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Orig-LKML-Reference: <18890.6578.728637.139402@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:36 +02:00
Peter Zijlstra
7b732a7504 perf_counter: new output ABI - part 1
Impact: Rework the perfcounter output ABI

use sys_read() only for instant data and provide mmap() output for all
async overflow data.

The first mmap() determines the size of the output buffer. The mmap()
size must be a PAGE_SIZE multiple of 1+pages, where pages must be a
power of 2 or 0. Further mmap()s of the same fd must have the same
size. Once all maps are gone, you can again mmap() with a new size.

In case of 0 extra pages there is no data output and the first page
only contains meta data.

When there are data pages, a poll() event will be generated for each
full page of data. Furthermore, the output is circular. This means
that although 1 page is a valid configuration, its useless, since
we'll start overwriting it the instant we report a full page.

Future work will focus on the output format (currently maintained)
where we'll likey want each entry denoted by a header which includes a
type and length.

Further future work will allow to splice() the fd, also containing the
async overflow data -- splice() would be mutually exclusive with
mmap() of the data.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.470536358@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:27 +02:00
Paul Mackerras
37d8182838 perf_counter: add an mmap method to allow userspace to read hardware counters
Impact: new feature giving performance improvement

This adds the ability for userspace to do an mmap on a hardware counter
fd and get access to a read-only page that contains the information
needed to translate a hardware counter value to the full 64-bit
counter value that would be returned by a read on the fd.  This is
useful on architectures that allow user programs to read the hardware
counters, such as PowerPC.

The mmap will only succeed if the counter is a hardware counter
monitoring the current process.

On my quad 2.5GHz PowerPC 970MP machine, userspace can read a counter
and translate it to the full 64-bit value in about 30ns using the
mmapped page, compared to about 830ns for the read syscall on the
counter, so this does give a significant performance improvement.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Orig-LKML-Reference: <20090323172417.297057964@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:26 +02:00
Peter Zijlstra
f4a2deb486 perf_counter: remove the event config bitfields
Since the bitfields turned into a bit of a mess, remove them and rely on
good old masks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Orig-LKML-Reference: <20090323172417.059499915@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:25 +02:00
Paul Mackerras
9aaa131a27 perf_counter: fix type/event_id layout on big-endian systems
Impact: build fix for powerpc

Commit db3a944aca35ae61 ("perf_counter: revamp syscall input ABI")
expanded the hw_event.type field into a union of structs containing
bitfields.  In particular it introduced a type field and a raw_type
field, with the intention that the 1-bit raw_type field should
overlay the most-significant bit of the 8-bit type field, and in fact
perf_counter_alloc() now assumes that (or at least, assumes that
raw_type doesn't overlay any of the bits that are 1 in the values of
PERF_TYPE_{HARDWARE,SOFTWARE,TRACEPOINT}).

Unfortunately this is not true on big-endian systems such as PowerPC,
where bitfields are laid out from left to right, i.e. from most
significant bit to least significant.  This means that setting
hw_event.type = PERF_TYPE_SOFTWARE will set hw_event.raw_type to 1.

This fixes it by making the layout depend on whether or not
__BIG_ENDIAN_BITFIELD is defined.  It's a bit ugly, but that's what
we get for using bitfields in a user/kernel ABI.

Also, that commit didn't fix up some places in arch/powerpc/kernel/
perf_counter.c where hw_event.raw and hw_event.event_id were used.
This fixes them too.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-06 09:30:18 +02:00
Paul Mackerras
db4fb5acf2 perf_counter: powerpc: clean up perc_counter_interrupt
Impact: cleanup

This updates the powerpc perf_counter_interrupt following on from the
"perf_counter: unify irq output code" patch.  Since we now use the
generic perf_counter_output code, which sets the perf_counter_pending
flag directly, we no longer need the need_wakeup variable.

This removes need_wakeup and makes perf_counter_interrupt use
get_perf_counter_pending() instead.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194234.024464535@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:18 +02:00
Peter Zijlstra
0322cd6ec5 perf_counter: unify irq output code
Impact: cleanup

Having 3 slightly different copies of the same code around does nobody
any good. First step in revamping the output format.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.929962222@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:17 +02:00
Peter Zijlstra
b8e83514b6 perf_counter: revamp syscall input ABI
Impact: modify ABI

The hardware/software classification in hw_event->type became a little
strained due to the addition of tracepoint tracing.

Instead split up the field and provide a type field to explicitly specify
the counter type, while using the event_id field to specify which event to
use.

Raw counters still work as before, only the raw config now goes into
raw_event.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Orig-LKML-Reference: <20090319194233.836807573@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:30:17 +02:00
Paul Mackerras
b6c5a71da1 perf_counter: abstract wakeup flag setting in core to fix powerpc build
Impact: build fix for powerpc

Commit bd753921015e7905 ("perf_counter: software counter event
infrastructure") introduced a use of TIF_PERF_COUNTERS into the core
perfcounter code.  This breaks the build on powerpc because we use
a flag in a per-cpu area to signal wakeups on powerpc rather than
a thread_info flag, because the thread_info flags have to be
manipulated with atomic operations and are thus slower than per-cpu
flags.

This fixes the by changing the core to use an abstracted
set_perf_counter_pending() function, which is defined on x86 to set
the TIF_PERF_COUNTERS flag and on powerpc to set the per-cpu flag
(paca->perf_counter_pending).  It changes the previous powerpc
definition of set_perf_counter_pending to not take an argument and
adds a clear_perf_counter_pending, so as to simplify the definition
on x86.

On x86, set_perf_counter_pending() is defined as a macro.  Defining
it as a static inline in arch/x86/include/asm/perf_counters.h causes
compile failures because <asm/perf_counters.h> gets included early in
<linux/sched.h>, and the definitions of set_tsk_thread_flag etc. are
therefore not available in <asm/perf_counters.h>.  (On powerpc this
problem is avoided by defining set_perf_counter_pending etc. in
<asm/hw_irq.h>.)

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-04-06 09:30:14 +02:00
Ingo Molnar
f541ae326f Merge branch 'linus' into perfcounters/core-v2
Merge reason: we have gathered quite a few conflicts, need to merge upstream

Conflicts:
	arch/powerpc/kernel/Makefile
	arch/x86/ia32/ia32entry.S
	arch/x86/include/asm/hardirq.h
	arch/x86/include/asm/unistd_32.h
	arch/x86/include/asm/unistd_64.h
	arch/x86/kernel/cpu/common.c
	arch/x86/kernel/irq.c
	arch/x86/kernel/syscall_table_32.S
	arch/x86/mm/iomap_32.c
	include/linux/sched.h
	kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-06 09:02:57 +02:00
Linus Torvalds
714f83d5d9 Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
  tracing, net: fix net tree and tracing tree merge interaction
  tracing, powerpc: fix powerpc tree and tracing tree interaction
  ring-buffer: do not remove reader page from list on ring buffer free
  function-graph: allow unregistering twice
  trace: make argument 'mem' of trace_seq_putmem() const
  tracing: add missing 'extern' keywords to trace_output.h
  tracing: provide trace_seq_reserve()
  blktrace: print out BLK_TN_MESSAGE properly
  blktrace: extract duplidate code
  blktrace: fix memory leak when freeing struct blk_io_trace
  blktrace: fix blk_probes_ref chaos
  blktrace: make classic output more classic
  blktrace: fix off-by-one bug
  blktrace: fix the original blktrace
  blktrace: fix a race when creating blk_tree_root in debugfs
  blktrace: fix timestamp in binary output
  tracing, Text Edit Lock: cleanup
  tracing: filter fix for TRACE_EVENT_FORMAT events
  ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
  x86: kretprobe-booster interrupt emulation code fix
  ...

Fix up trivial conflicts in
 arch/parisc/include/asm/ftrace.h
 include/linux/memory.h
 kernel/extable.c
 kernel/module.c
2009-04-05 11:04:19 -07:00
Linus Torvalds
bad6a5c08c Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-parisc
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/rtc-parisc:
  powerpc/ps3: Add rtc-ps3
  powerpc: Hook up rtc-generic, and kill rtc-ppc
  m68k: Hook up rtc-generic
  parisc: rtc: Rename rtc-parisc to rtc-generic
  parisc: rtc: Add missing module alias
  parisc: rtc: platform_driver_probe() fixups
  parisc: rtc: get_rtc_time() returns unsigned int
2009-04-03 09:51:35 -07:00
Alexey Dobriyan
6f2c55b843 Simplify copy_thread()
First argument unused since 2.3.11.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:04:51 -07:00
Jean Delvare
bf6aede712 workqueue: add to_delayed_work() helper function
It is a fairly common operation to have a pointer to a work and to need a
pointer to the delayed work it is contained in.  In particular, all
delayed works which want to rearm themselves will have to do that.  So it
would seem fair to offer a helper function for this operation.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 19:04:50 -07:00
Geert Uytterhoeven
bcd68a70cb powerpc: Hook up rtc-generic, and kill rtc-ppc
PowerPC has been a long time user of the generic RTC abstraction, so hook up
rtc-generic:
  - Create the "rtc-generic" platform device if ppc_md.get_rtc_time is set,
  - Kill rtc-ppc, as rtc-generic offers the same functionality in a more
    generic way, and supports autoloading through udev.

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2009-04-02 01:05:31 +00:00
Stephen Rothwell
a095bdbb13 tracing, powerpc: fix powerpc tree and tracing tree interaction
Today's linux-next build (powerpc allyesconfig) failed like this:

arch/powerpc/kernel/ftrace.c: In function 'prepare_ftrace_return':
arch/powerpc/kernel/ftrace.c:612: warning: passing argument 3 of 'ftrace_push_return_trace' makes pointer from integer without a cast
arch/powerpc/kernel/ftrace.c:612: error: too many arguments to function 'ftrace_push_return_trace'

Caused by commit 5d1a03dc54
("function-graph: moved the timestamp from arch to generic code") from
the tracing tree which (removed an argument from
ftrace_push_return_trace()) interacting with commit
6794c78243 ("powerpc64: port of the
function graph tracer") from the powerpc tree.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <linuxppc-dev@ozlabs.org>
LKML-Reference: <20090327230834.93d0221d.sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-02 00:50:24 +02:00
Linus Torvalds
e76e5b2c66 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (88 commits)
  PCI: fix HT MSI mapping fix
  PCI: don't enable too much HT MSI mapping
  x86/PCI: make pci=lastbus=255 work when acpi is on
  PCI: save and restore PCIe 2.0 registers
  PCI: update fakephp for bus_id removal
  PCI: fix kernel oops on bridge removal
  PCI: fix conflict between SR-IOV and config space sizing
  powerpc/PCI: include pci.h in powerpc MSI implementation
  PCI Hotplug: schedule fakephp for feature removal
  PCI Hotplug: rename legacy_fakephp to fakephp
  PCI Hotplug: restore fakephp interface with complete reimplementation
  PCI: Introduce /sys/bus/pci/devices/.../rescan
  PCI: Introduce /sys/bus/pci/devices/.../remove
  PCI: Introduce /sys/bus/pci/rescan
  PCI: Introduce pci_rescan_bus()
  PCI: do not enable bridges more than once
  PCI: do not initialize bridges more than once
  PCI: always scan child buses
  PCI: pci_scan_slot() returns newly found devices
  PCI: don't scan existing devices
  ...

Fix trivial append-only conflict in Documentation/feature-removal-schedule.txt
2009-04-01 09:47:12 -07:00
Alexey Dobriyan
99b7623380 proc 2/2: remove struct proc_dir_entry::owner
Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
as correctly noted at bug #12454. Someone can lookup entry with NULL
->owner, thus not pinning enything, and release it later resulting
in module refcount underflow.

We can keep ->owner and supply it at registration time like ->proc_fops
and ->data.

But this leaves ->owner as easy-manipulative field (just one C assignment)
and somebody will forget to unpin previous/pin current module when
switching ->owner. ->proc_fops is declared as "const" which should give
some thoughts.

->read_proc/->write_proc were just fixed to not require ->owner for
protection.

rmmod'ed directories will be empty and return "." and ".." -- no harm.
And directories with tricky enough readdir and lookup shouldn't be modular.
We definitely don't want such modular code.

Removing ->owner will also make PDE smaller.

So, let's nuke it.

Kudos to Jeff Layton for reminding about this, let's say, oversight.

http://bugzilla.kernel.org/show_bug.cgi?id=12454

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
2009-03-31 01:14:44 +04:00
Benjamin Herrenschmidt
9ff9a26b78 Merge commit 'origin/master' into next
Manual merge of:
	arch/powerpc/include/asm/elf.h
	drivers/i2c/busses/i2c-mpc.c
2009-03-30 14:04:53 +11:00
Ingo Molnar
6e15cf0486 Merge branch 'core/percpu' into percpu-cpumask-x86-for-linus-2
Conflicts:
	arch/parisc/kernel/irq.c
	arch/x86/include/asm/fixmap_64.h
	arch/x86/include/asm/setup.h
	kernel/irq/handle.c

Semantic merge:
        arch/x86/include/asm/fixmap.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-27 17:28:43 +01:00
Benjamin Herrenschmidt
ec78c8ac16 powerpc: Fix bugs introduced by sysfs changes
Rusty's patch to change our sysfs access to various registers
to use smp_call_function_single() introduced a whole bunch of
warnings. This fixes them. This version also fixes an actual
bug in here where it did mtspr instead of mfspr when reading
the files

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-27 16:58:24 +11:00
Josh Boyer
efbda86098 powerpc: Sanitize stack pointer in signal handling code
On powerpc64 machines running 32-bit userspace, we can get garbage bits in the
stack pointer passed into the kernel.  Most places handle this correctly, but
the signal handling code uses the passed value directly for allocating signal
stack frames.

This fixes the issue by introducing a get_clean_sp function that returns a
sanitized stack pointer.  For 32-bit tasks on a 64-bit kernel, the stack
pointer is masked correctly.  In all other cases, the stack pointer is simply
returned.

Additionally, we pass an 'is_32' parameter to get_sigframe now in order to
get the properly sanitized stack.  The callers are know to be 32 or 64-bit
statically.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-27 16:58:24 +11:00
Linus Torvalds
a8416961d3 Merge branch 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits)
  x86: disable __do_IRQ support
  sparseirq, powerpc/cell: fix unused variable warning in interrupt.c
  genirq: deprecate obsolete typedefs and defines
  genirq: deprecate __do_IRQ
  genirq: add doc to struct irqaction
  genirq: use kzalloc instead of explicit zero initialization
  genirq: make irqreturn_t an enum
  genirq: remove redundant if condition
  genirq: remove unused hw_irq_controller typedef
  irq: export remove_irq() and setup_irq() symbols
  irq: match remove_irq() args with setup_irq()
  irq: add remove_irq() for freeing of setup_irq() irqs
  genirq: assert that irq handlers are indeed running in hardirq context
  irq: name 'p' variables a bit better
  irq: further clean up the free_irq() code flow
  irq: refactor and clean up the free_irq() code flow
  irq: clean up manage.c
  irq: use GFP_KERNEL for action allocation in request_irq()
  kernel/irq: fix sparse warning: make symbol static
  irq: optimize init_kstat_irqs/init_copy_kstat_irqs
  ...
2009-03-26 16:06:50 -07:00
Jesse Barnes
ceb93a9ff1 powerpc/PCI: include pci.h in powerpc MSI implementation
This file uses PCI MSI defines and so needs pci.h.

Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-25 08:54:29 -07:00
Hollis Blanchard
366d4b9b9f KVM: ppc: No need to include core-header for KVM in asm-offsets.c currently
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-03-24 11:02:58 +02:00
Benjamin Herrenschmidt
757c74d298 powerpc/mm: Introduce early_init_mmu() on 64-bit
This moves some MMU related init code out of setup_64.c into hash_utils_64.c
and calls it early_init_mmu() and early_init_mmu_secondary(). This will
make it easier to plug in a new MMU type.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:34 +11:00
Kumar Gala
2319f12395 powerpc/mm: e300c2/c3/c4 TLB errata workaround
Complete workaround for DTLB errata in e300c2/c3/c4 processors.

Due to the bug, the hardware-implemented LRU algorythm always goes to way
1 of the TLB. This fix implements the proposed software workaround in
form of a LRW table for chosing the TLB-way.

Based on patch from David Jander <david@protonic.nl>

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:32 +11:00
Kumar Gala
eb3436a013 powerpc/mm: Used free register to save a few cycles in SW TLB miss handling
Now that r0 is free we can keep the value of I/DMISS in r3 and not reload
it before doing the tlbli/d.  This saves us a few cycles in the fast path
case.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:31 +11:00
Kumar Gala
00fcb14703 powerpc/mm: Remove unused register usage in SW TLB miss handling
Long ago we had some code that actually used the CTR in the SW TLB
miss handlers (603/e300).  Since we don't use it no reason to waste
cycles saving it off and restoring it (we actually didn't restore it
in the fast path case).

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:31 +11:00
Kumar Gala
d746286c1f powerpc: setup default archdata for {of_}platform via bus_register_notifier
Since a number of powerpc chips are SoCs we end up having dma-able
devices that are registered as platform or of_platform devices.  We need
to hook the archdata to setup proper dma_ops for these devices.

Rather than having to add a bus_notify to each platform we add a default
one at the highest priority (called first) to set the default dma_ops for
of_platform and platform devices to dma_direct_ops.  This allows platform
code to override the ops by providing their own notifier call back.

In the future to enable >4G DMA support on ppc32 we can hook swiotlb ops.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:30 +11:00
Kumar Gala
32ac57668d powerpc/pci: Default to dma_direct_ops for pci dma_ops
This will allow us to remove the ppc32 specific checks in get_dma_ops()
that defaults to dma_direct_ops if the archdata is NULL.  We really
should always have archdata set to something going forward.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:30 +11:00
Rusty Russell
9a3719341a powerpc: Make sysfs code use smp_call_function_single
Impact: performance improvement

This fixes 'powerpc: avoid cpumask games in arch/powerpc/kernel/sysfs.c'
which talked about using smp_call_function_single, but actually used
work_on_cpu (an older version of the patch).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:47:27 +11:00
Benjamin Herrenschmidt
151a9f4aef powerpc: Fix prom_init on 32-bit OF machines
Commit e7943fbbfd broke ppc32 using
Open Firmware client interface due to using the wrong relocation
macro when accessing the variable "linux_banner".

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-24 13:43:35 +11:00
Benjamin Herrenschmidt
9e41d9597e Merge commit 'origin/master' into next 2009-03-24 13:38:30 +11:00
Kumar Gala
345953cf9a powerpc/mm: Fix Respect _PAGE_COHERENT on classic ppc32 SW TLB load machines
Grant picked up the wrong version of "Respect _PAGE_COHERENT on classic
ppc32 SW" (commit a4bd6a93c3)

It was missing the code to actually deal with the fixup of
_PAGE_COHERENT based on the CPU feature.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-03-23 08:38:26 -05:00
Ingo Molnar
b3e3b302cf Merge branches 'irq/sparseirq' and 'linus' into irq/core 2009-03-23 10:07:49 +01:00
Matthew Wilcox
1c8d7b0a56 PCI MSI: Add support for multiple MSI
Add the new API pci_enable_msi_block() to allow drivers to
request multiple MSI and reimplement pci_enable_msi in terms of
pci_enable_msi_block.  Ensure that the architecture back ends don't
have to know about multiple MSI.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-03-20 10:48:14 -07:00
Kumar Gala
a4bd6a93c3 powerpc/mm: Respect _PAGE_COHERENT on classic ppc32 SW
Since we now set _PAGE_COHERENT in the Linux PTE we shouldn't be clearing
it out before we setup the SW TLB.  Today all the SW TLB machines
(603/e300) that we support are non-SMP, however there are some errata on
some devices that cause us to set _PAGE_COHERENT via CPU_FTR_NEED_COHERENT.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2009-03-17 09:17:50 -06:00
Ingo Molnar
edb35028e4 Merge branches 'irq/genirq' and 'linus' into irq/core 2009-03-16 09:20:13 +01:00
Benjamin Herrenschmidt
28794d34ec powerpc/kconfig: Kill PPC_MULTIPLATFORM
CONFIG_PPC_MULTIPLATFORM is a remain of the pre-powerpc days and isn't
really meaningful anymore. It was basically equivalent to PPC64 || 6xx.

This removes it along with the following changes:

 - 32-bit platforms that relied on PPC32 && PPC_MULTIPLATFORM now rely
   on 6xx which is what they want anyway.

 - A new symbol, PPC_BOOK3S, is defined that represent compliance with
   the "Server" variant of the architecture. This is set when either 6xx
   or PPC64 is set and open the door for future BOOK3E 64-bit.

 - 64-bit platforms that relied on PPC64 && PPC_MULTIPLATFORM now use
   PPC64 && PPC_BOOK3S

 - A separate and selectable CONFIG_PPC_OF_BOOT_TRAMPOLINE option is now
   used to control the use of prom_init.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:35 +11:00
Thomas Gleixner
97f7d6bcc1 powerpc/irq: Convert obsolete irq_desc_t to struct irq_desc
Impact: cleanup

Convert the last remaining users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: linuxppc-dev@ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:34 +11:00
Andrew Klossner
af9c724907 powerpc/udbg: Fix lost byte during console handover; change LFCR to CRLF
When the console is on a serial port to be driven by serial8250, a
character can be lost from the end of the first line in the two-line
sequence

	serial8250.0: ttyS0 at MMIO 0xe0004500 (irq = 42) is a 16550A
	console handover: boot [udbg0] -> real [ttyS0]

This happens because udbg_puts or udbg_write stuff the last byte of
the line into the Tx FIFO and return, whereupon the serial8250
initialization code immediately empties that FIFO.  The fix: udbg_puts
and udbg_write now wait for the Tx FIFO to clear before returning.
This delays the system by one additional serial frame time for each
line written by udbg, but the effect is not noticeable, a cumulative
17 milliseconds for 200 lines of early printk output at 115200 baud.

Also, the routines in udbg_16550.c now emit CRLF instead of LFCR.
Linux makes a point of emitting CRLF because, when serial output is
captured to a file, LFCR sequences can confuse text editors.  See
http://lkml.org/lkml/2006/2/4/50 for some history.

Signed-off-by: Andrew Klossner <andrew@cesa.opbu.xerox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:34 +11:00
Wolfram Sang
a77acda0b7 powerpc/pci: Fix typo: s/resouces/resources/ in a pr_debug
Fix typo: s/resouces/resources/ in a pr_debug

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:34 +11:00
Michael Ellerman
e7943fbbfd powerpc: Print linux_banner in prom_init
So at least you can see what kernel you're booting if you die
before the kernel prints it mid-way through start_kernel().

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:33 +11:00
Octavian Purdila
7c9583a4db powerpc/oprofile: Enable support for ppc750 processors
This patch enables oprofile for all 3 FX variants and GX variant of the
750 processor.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:11:32 +11:00
Michael Ellerman
9e1e3723be powerpc: Remove unused asm-offsets entries for cpu_spec
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:10:15 +11:00
Michael Ellerman
2657dd4e30 powerpc: Make sure we copy all cpu_spec features except PMC related ones
When identify_cpu() is called a second time with a logical PVR, it
only copies a subset of the cpu_spec fields so as to avoid overwriting
the performance monitor fields that were initialized based on the
real PVR.

However some of the other, non performance monitor related fields are
also not copied:
 * pvr_mask
 * pvr_value
 * mmu_features
 * machine_check

The fact that pvr_mask is not copied can result in show_cpuinfo()
showing the cpu as "unknown", if we override an unknown PVR with a
logical one - as reported by Shaggy.

So change the logic to copy all fields, and then put back the PMC
related ones in the case that we're overwriting a real PVR with a
logical one.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:10:14 +11:00
Michael Ellerman
666435bbf3 powerpc: Deindentify identify_cpu()
The for-loop body of identify_cpu() has gotten a little big, so move the
loop body logic into a separate function. No other changes.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-03-11 17:10:14 +11:00
Tejun Heo
19390c4d03 linker script: define __per_cpu_load on all SMP capable archs
Impact: __per_cpu_load available on all SMP capable archs

Percpu now requires three symbols to be defined - __per_cpu_load,
__per_cpu_start and __per_cpu_end.  There were three archs which
didn't have it.  Update them as follows.

* powerpc: can use generic PERCPU() macro.  Compile tested for
  powerpc32, compile/boot tested for powerpc64.

* ia64: can use generic PERCPU_VADDR() macro.  __phys_per_cpu_start is
  identical to __per_cpu_load.  Compile tested and symbol table looks
  identical after the change except for the additional __per_cpu_load.

* arm: added explicit __per_cpu_load definition.  Currently uses
  unified .init output section so can't use the generic macro.  Dunno
  whether the unified .init ouput section is required by arch
  peculiarity so I left it alone.  Please break it up and use PERCPU()
  if possible.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Pat Gefre <pfg@sgi.com>
Cc: Russell King <rmk@arm.linux.org.uk>
2009-03-10 16:27:48 +09:00
Kumar Gala
c3071951d0 powerpc/fsl-booke: Add support for tlbilx instructions
The e500mc core supports the new tlbilx instructions that do core
local invalidates and also provide us the ability to take down
all TLB entries matching a given PID.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-03-09 09:25:38 -05:00
Paul Mackerras
880860e392 perfcounters/powerpc: add support for POWER4 processors
Impact: more hardware support

This adds the back-end for the PMU on the POWER4 and POWER4+ processors
(GP and GQ).  This is quite similar to the PPC970, with 8 PMCs, but has
fewer events than the PPC970.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-03-06 16:30:57 +11:00
Paul Mackerras
aabbaa6036 perfcounters/powerpc: add support for POWER5+ processors
Impact: more hardware support

This adds the back-end for the PMU on the POWER5+ processors (i.e. GS,
including GS DD3 aka POWER5++).  This doesn't use the fixed-function
PMC5 and PMC6 since they don't respect the freeze conditions and don't
generate interrupts, as on POWER6.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-03-06 16:28:37 +11:00
Paul Mackerras
86028598de perfcounters/powerpc: fix oops with multiple counters in a group
Impact: fix oops-causing bug

This fixes a bug in the powerpc hw_perf_counter_init where the code
didn't initialize ctrs[n] before passing the ctrs array to check_excludes,
leading to possible oopses and other incorrect behaviour.  This fixes it
by initializing ctrs[n] correctly.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-03-06 08:07:13 +11:00
Ingo Molnar
8163d88c79 Merge commit 'v2.6.29-rc7' into perfcounters/core
Conflicts:
	arch/x86/mm/iomap_32.c
2009-03-04 11:42:31 +01:00
Benjamin Herrenschmidt
652e8f8d57 Merge commit 'jwb/next' into next 2009-03-03 13:30:03 +11:00
Ingo Molnar
55f2b78995 Merge branch 'x86/urgent' into x86/pat 2009-03-01 12:47:58 +01:00
Ingo Molnar
8e818179eb Merge branch 'x86/core' into perfcounters/core
Conflicts:
	arch/x86/kernel/apic/apic.c
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-26 13:02:23 +01:00
Paul Mackerras
742bd95ba9 perfcounters/powerpc: Add support for POWER5 processors
This adds the back-end for the PMU on the POWER5 processor.  This knows
how to use the fixed-function PMC5 and PMC6 (instructions completed and
run cycles).  Unlike POWER6, PMC5/6 obey the freeze conditions and can
generate interrupts, so their use doesn't impose any extra restrictions.

POWER5+ is different and is not supported by this patch.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-26 15:36:48 +11:00
Michael Neuling
49f297f8df powerpc: Fix load/store float double alignment handler
When we introduced VSX, we changed the way FPRs are stored in the
thread_struct.  Unfortunately we missed the load/store float double
alignment handler code when updating how we access FPRs in the
thread_struct.

Below fixes this and merges the little/big endian case.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-26 14:02:53 +11:00
Paul Mackerras
d095cd46da perfcounters/powerpc: Make exclude_kernel bit work on Apple G5 processors
Currently, setting hw_event.exclude_kernel does nothing on the PPC970
variants used in Apple G5 machines, because they have the HV (hypervisor)
bit in the MSR forced to 1, so as far as the PMU is concerned, the
kernel runs in hypervisor mode.  Thus we have to use the MMCR0_FCHV
(freeze counters in hypervisor mode) bit rather than the MMCR0_FCS
(freeze counters in supervisor mode) bit.

This checks the MSR.HV bit at startup, and if it is set, we set the
freeze_counters_kernel variable to MMCR0_FCHV (it was initialized to
MMCR0_FCS).  We then use that whenever we need to exclude kernel events.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-23 23:01:28 +11:00
Anton Blanchard
501cb16d3c powerpc: Randomise PIEs
Randomise ELF_ET_DYN_BASE, which is used when loading position independent
executables.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:21 +11:00
Anton Blanchard
912f9ee21c powerpc: Randomise the brk region
Randomize the heap.

before:
tundro2:~ # sleep 1 & cat /proc/${!}/maps | grep heap
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]
10017000-10118000 rw-p 10017000 00:00 0                                  [heap]

after
tundro2:~ # sleep 1 & cat /proc/${!}/maps | grep heap
19419000-1951a000 rw-p 19419000 00:00 0                                  [heap]
325ff000-32700000 rw-p 325ff000 00:00 0                                  [heap]
1a97c000-1aa7d000 rw-p 1a97c000 00:00 0                                  [heap]
1cc60000-1cd61000 rw-p 1cc60000 00:00 0                                  [heap]
1afa9000-1b0aa000 rw-p 1afa9000 00:00 0                                  [heap]

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:20 +11:00
Anton Blanchard
d839088cae powerpc: Randomise lower bits of stack address
Randomise the lower bits of the stack address. More randomisation is good for
security but the scatter can also help with SMT threads that share an L1. A
quick test case shows this working:

int main()
{
	int sp;
	printf("%x\n", (unsigned long)&sp & 4095);
}

before:
80
80
80
80
80

after:
610
490
300
6b0
d80

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:20 +11:00
Anton Blanchard
a465f9b694 powerpc: Move is_32bit_task
Move is_32bit_task into asm/thread_info.h, that allows us to test for
32/64bit tasks without an ugly CONFIG_PPC64 ifdef.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:06 +11:00
Michael Neuling
553631e25f powerpc: Fix load/store float double alignment handler
When we introduced VSX, we changed the way FPRs are stored in the
thread_struct.  Unfortunately we missed the load/store float double
alignment handler code when updating how we access FPRs in the
thread_struct.

Below fixes this and merges the little/big endian case.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:05 +11:00
Michael Neuling
545bba1824 powerpc: Add alignment handler for new lfiwzx instruction
lfiwzx is a new floating point load instruction in 2.06 that needs an
alignment handler for Linux.

Turns out to be the worlds easiest handler to add.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:04 +11:00
Brian King
f52862f407 powerpc/pseries: Fix partition migration hang under load
While testing partition migration with heavy CPU load using
shared processors, it was observed that sometimes the migration
would never complete and would appear to hang. Currently, the
migration code assumes that if H_SUCCESS is returned from the H_JOIN
then the migration is complete and the processor is waking up on
the target system. If there was an outstanding PROD to the processor
when the H_JOIN is called, however, it will return H_SUCCESS on the source
system, causing the migration to hang, or in some scenarios cause
the kernel to crash on the complete call waking the caller
of rtas_percpu_suspend_me. Fix this by calling H_JOIN multiple times
if necessary during the migration.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:04 +11:00
Kumar Gala
620165f971 powerpc: Add support for using doorbells for SMP IPI
The e500mc supports the new msgsnd/doorbell mechanisms that were added in
the Power ISA 2.05 architecture.  We use the normal level doorbell for
doing SMP IPIs at this point.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 15:53:03 +11:00
Tom Arbuckle
f81786913a powerpc/pci: Fix PCI<->OF matching of old style multifunc devices
Old OF variants used to create a 'dummy' parent node "multifunc-device"
for devices with more than one PCI function. Our code that matches OF
nodes to PCI devices dealt with that in one place but not in another,
this fixes it.

This has the practical effect of fixing interrupt routing of multifunction
PCI cards on some older PowerMac machines.

Signed-off-by: Tom Arbuckle <tom.d.arbuckle@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:57 +11:00
Kumar Gala
16c57b3620 powerpc: Unify opcode definitions and support
Create a new header that becomes a single location for defining PowerPC
opcodes used by code that is either generationg instructions
at runtime (fixups, debug, etc.), emulating instructions, or just
compiling instructions old assemblers don't know about.

We currently don't handle the floating point emulation or alignment decode
as both are better handled by the specific decode support they already
have.

Added support for the new dcbzl, dcbal, msgsnd, tlbilx, & wait instructions
since older assemblers don't know about them.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:56 +11:00
Steven Rostedt
bb9b903527 powerpc, ftrace: use create_branch lib function
Impact: clean up, remove duplicate code

When ftrace was first ported to PowerPC, there existed a
create_function_call that would create the instruction to make a call
to a given address. Unfortunately, this call expected to write to
the address it was given, and since it used the address to calculate
the offset, it could not be faked.

ftrace needed a way to create the instruction without actually writing
that instruction to the text section. So ftrace had to implement its
own code.

Now we have create_branch in the code patching library, which does
exactly what ftrace needs. This patch replaces ftrace's implementation
with the library function.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:56 +11:00
Steven Rostedt
b54dcfe108 powerpc, ftrace: use unsigned int for instruction manipulation
The original port of ftrace to PowerPC kept a lot of the code used
by x86. Some of this code was to handle x86's 5 byte instruction.
This was handled by using character arrays to manipulate the
code.

PowerPC has a consistent 4 byte instruction. Using unsigned ints
makes the code more efficient as well as more readable.
By converting to use unsigned ints to represent instructions,
I was able to remove the side effects that were needed for
manipulating character strings.

  i.e. memcpy and memcmp

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:55 +11:00
Steven Rostedt
60ce8f7260 powerpc32, ftrace: dynamic function graph tracer
This patch gets function graph tracing working with dynamic function
tracer on PowerPC32.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:55 +11:00
Steven Rostedt
fad4f47cc8 powerpc32, ftrace: port function graph tracer to ppc32, static only
This patch ports the function graph tracer for PowerPC, but only
for static function tracing.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:55 +11:00
Steven Rostedt
bf528a3a9b powerpc32, ftrace: save and restore mcount regs with macro
Impact: clean up

Use a macro to save and restore the registers for PowerPC32,
since that code is duplicated.

This is similar to the work done by Cyrill Gorcunov for the
mcount code in x86_64.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:54 +11:00
Steven Rostedt
bb7253403f powerpc64, ftrace: save toc only on modules for function graph
The TOCS used by modules are different than the one used by
the core kernel code. The function graph tracer must save and
restore the TOC whenever it traces a module call. But this
is an added overhead to burden the majority of core kernel
code being traced.

Benjamin Herrenschmidt suggested in testing the entry of
the call to tell if it is a core kernel function or a module.
He recommended using the REGION_ID() macro to perform this test.

This patch implements Benjamin's idea, and uses a different
return_to_handler routine dependent on if the entry is a core
kernel function or not. The module version saves the TOC, where as
the core kernel version does not.

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand <geoffrey.levand@am.sony.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:54 +11:00
Steven Rostedt
4654288847 powerpc64, tracing: add function graph tracer with dynamic tracing
This is the port of the function graph tracer to PowerPC with
dynamic tracing.

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand <geoffrey.levand@am.sony.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:54 +11:00
Steven Rostedt
6794c78243 powerpc64: port of the function graph tracer
This is a port of the function graph tracer that was written by
Frederic Weisbecker for the x86.

This only works for PPC64 at the moment and only for static tracing.
PPC32 and dynamic function graph tracing support will come later.

The trace produces a visual calling of functions:

 # tracer: function_graph
 #
 # CPU  DURATION                  FUNCTION CALLS
 # |     |   |                     |   |   |   |
  0)   2.224 us    |                        }
  0) ! 271.024 us  |                      }
  0) ! 320.080 us  |                    }
  0) ! 324.656 us  |                  }
  0) ! 329.136 us  |                }
  0)               |                .put_prev_task_fair() {
  0)               |                  .update_curr() {
  0)   2.240 us    |                    .update_min_vruntime();
  0)   6.512 us    |                  }
  0)   2.528 us    |                  .__enqueue_entity();
  0) + 15.536 us   |                }
  0)               |                .pick_next_task_fair() {
  0)   2.032 us    |                  .__pick_next_entity();
  0)   2.064 us    |                  .__clear_buddies();
  0)               |                  .set_next_entity() {
  0)   2.672 us    |                    .__dequeue_entity();
  0)   6.864 us    |                  }

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand <geoffrey.levand@am.sony.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:53 +11:00
Steven Rostedt
17be5b3ddf powerpc, ftrace: fix compile error when modules not configured
Michael Neuling reported a compile bug when dynamic ftrace was
configured in and modules were not. This was due to the ftrace
code referencing module specific structures.

Reported-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:53 +11:00
Steven Rostedt
44e1d064b9 ftrace, powerpc: replace debug macro with proper pr_deug
Impact: cleanup

The PowerPC ftrace code uses a hacked up DEBUGP macro for prints.
This patch converts it to the standard pr_debug.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-23 10:48:52 +11:00
Ingo Molnar
fc6fc7f1b1 Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic conflict resolution:
	arch/x86/kernel/setup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-22 20:05:19 +01:00
Benjamin Herrenschmidt
3b7faeb49e Merge commit 'kumar/next' into next 2009-02-18 13:23:30 +11:00
Benjamin Herrenschmidt
82a0a1cc8f Merge commit 'origin/master' into next
Manual merge of:
	arch/powerpc/include/asm/pgtable-ppc32.h
2009-02-18 13:19:25 +11:00
Madhulika Madishetty
6c71209023 AMCC PPC 460SX redwood SoC platform initial framework
This patch contains initial framework for the AMCC Redwood board.

Signed-off-by: Madhulika Madishetty <mmadishetty@amcc.com>
Signed-off-by: Tirumala Marri <tmarri@amcc.com>
Signed-off-by: Feng Kan <fkan@amcc.com>
Signed-off-by: Vidhyananth Venkatasamy <vvenkatasamy@amcc.com>
Signed-off-by: Preetesh Parekh <pparekh@amcc.com>
Acked-by: Loc Ho <lho@amcc.com>
Acked-by: Feng Kan <fkan@amcc.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2009-02-14 14:41:29 -05:00
Yuri Tikhonov
e12401222f powerpc/44x: Support for 256KB PAGE_SIZE
This patch adds support for 256KB pages on ppc44x-based boards.

For simplification of implementation with 256KB pages we still assume
2-level paging. As a side effect this leads to wasting extra memory space
reserved for PTE tables: only 1/4 of pages allocated for PTEs are
actually used. But this may be an acceptable trade-off to achieve the
high performance we have with big PAGE_SIZEs in some applications (e.g.
RAID).

Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
the risk of stack overflows in the cases of on-stack arrays, which size
depends on the page size (e.g. multipage BIOs, NTFS, etc.).

With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
to 9, otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) we'll be
occupied by PKMAP addresses leaving no place for vmalloc. We do not
separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
value of 10 in support for 16K/64K had been selected rather intuitively.
Thus now for all cases of PAGE_SIZE on ppc44x (including the default, 4KB,
one) we have 512 pages for PKMAP.

Because ELF standard supports only page sizes up to 64K, then you should
use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
for building applications, which are to be run with the 256KB-page sized
kernel. If using the older binutils, then you should patch them like follows:

	--- binutils/bfd/elf32-ppc.c.orig
	+++ binutils/bfd/elf32-ppc.c

	-#define ELF_MAXPAGESIZE                0x10000
	+#define ELF_MAXPAGESIZE                0x40000

One more restriction we currently have with 256KB page sizes is inability
to use shmem safely, so, for now, the 256KB is available only if you turn
the CONFIG_SHMEM option off (another variant is to use BROKEN).
Though, if you need shmem with 256KB pages, you can always remove the !SHMEM
dependency in 'config PPC_256K_PAGES', and use the workaround available here:
 http://lkml.org/lkml/2008/12/19/20

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2009-02-14 14:40:04 -05:00
Ingo Molnar
8f8573ae9f Merge branches 'irq/genirq', 'irq/sparseirq' and 'irq/urgent' into irq/core 2009-02-13 11:57:18 +01:00
Ingo Molnar
f8a6b2b9ce Merge branch 'linus' into x86/apic
Conflicts:
	arch/x86/kernel/acpi/boot.c
	arch/x86/mm/fault.c
2009-02-13 09:44:22 +01:00
Ingo Molnar
e9c4ffb11f Merge branch 'linus' into perfcounters/core
Conflicts:
	arch/x86/kernel/acpi/boot.c
2009-02-13 09:34:07 +01:00
Michael Neuling
26456dcfb8 powerpc/vsx: Fix VSX alignment handler for regs 32-63
Fix the VSX alignment handler for VSX registers > 32.  32-63 are stored
in the VMX part of the thread_struct not the FPR part.

Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: stable@kernel.org (2.6.27 & .28 please)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-13 16:37:45 +11:00
Kumar Gala
70fe3af840 powerpc/book-3e: Introduce concept of Book-3e MMU
The Power ISA 2.06 spec introduces a standard MMU programming model that
is based on the Freescale Book-E MMU programing model.  The Freescale
version is pretty backwards compatiable with the ISA 2.06 definition so
we are starting to refactor some of the Freescale code so it can be
easily shared.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-02-12 16:51:33 -06:00
Kumar Gala
d66c82ea45 powerpc/fsl-booke: Add new ISA 2.06 page sizes and MAS defines
The Power ISA 2.06 added power of two page sizes to the embedded MMU
architecture.  Its done it such a way to be code compatiable with the
existing HW.  Made the minor code changes to support both power of two
and power of four page sizes.  Also added some new MAS bits and macros
that are defined as part of the 2.06 ISA.  Renamed some things to use
the 'Book-3e' concept to convey the new MMU that is based on the
Freescale Book-E MMU programming model.

Note, its still invalid to try and use a page size that isn't supported
by cpu.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-02-12 16:37:11 -06:00
Ingo Molnar
ffc0467293 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters into perfcounters/core 2009-02-11 09:22:14 +01:00
Ingo Molnar
95fd4845ed Merge commit 'v2.6.29-rc4' into perfcounters/core
Conflicts:
	arch/x86/kernel/setup_percpu.c
	arch/x86/mm/fault.c
	drivers/acpi/processor_idle.c
	kernel/irq/handle.c
2009-02-11 09:22:04 +01:00
Milton Miller
c3bd517de6 powerpc/pci: Move hose_list and pci_address_to_pio to pci-common
move the definition of hose_list next to its hotplug spinlock.

create pcibios_io_size to encapsulate ifdef in existing pci-common
function pcibios_vaddr_is_ioport

move pci_address_to_pio to pci-common, using new pcibios_io_size, and
protect this GPL exported function against concurrent hotplug removal

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-11 16:00:07 +11:00
Paul Mackerras
0475f9ea8e perf_counters: allow users to count user, kernel and/or hypervisor events
Impact: new perf_counter feature

This extends the perf_counter_hw_event struct with bits that specify
that events in user, kernel and/or hypervisor mode should not be
counted (i.e. should be excluded), and adds code to program the PMU
mode selection bits accordingly on x86 and powerpc.

For software counters, we don't currently have the infrastructure to
distinguish which mode an event occurs in, so we currently fail the
counter initialization if the setting of the hw_event.exclude_* bits
would require us to distinguish.  Context switches and CPU migrations
are currently considered to occur in kernel mode.

On x86, this changes the previous policy that only root can count
kernel events.  Now non-root users can count kernel events or exclude
them.  Non-root users still can't use NMI events, though.  On x86 we
don't appear to have any way to control whether hypervisor events are
counted or not, so hw_event.exclude_hv is ignored.

On powerpc, the selection of whether to count events in user, kernel
and/or hypervisor mode is PMU-wide, not per-counter, so this adds a
check that the hw_event.exclude_* settings are the same as other events
on the PMU.  Counters being added to a group have to have the same
settings as the other hardware counters in the group.  Counters and
groups can only be enabled in hw_perf_group_sched_in or power_perf_enable
if they have the same settings as any other counters already on the
PMU.  If we are not running on a hypervisor, the exclude_hv setting
is ignored (by forcing it to 0) since we can't ever get any
hypervisor events.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-02-11 15:06:59 +11:00
Michael Ellerman
059f134f84 powerpc: Allow debugging of LMBs with lmb=debug
The lmb debugging can be turned on at boottime with lmb=debug on the
command line. However on powerpc that doesn't work, because we don't
necessarily call lmb_dump_all().

So always call lmb_dump_all() after lmb_analyze(), no output is
generated unless lmb=debug is found on the command line.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-11 13:38:00 +11:00
Michael Ellerman
33642d31d1 powerpc: Remove unused ppc64_terminate_msg()
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-11 13:38:00 +11:00
Benjamin Herrenschmidt
edbc29d76d Merge commit 'kumar/next' into next 2009-02-11 13:37:44 +11:00
Benjamin Herrenschmidt
5b11abfdb5 powerpc/pci: mmap anonymous memory when legacy_mem doesn't exist
The new legacy_mem file in sysfs is causing problems with X on machines
that don't support legacy memory access. The way I initially implemented
it, we would fail with -ENXIO when trying to mmap it, thus exposing to
X that we do support the API but there is no legacy memory.

Unfortunately, X poor error handling is causing it to fail to start when
it gets this error.

This implements a workaround hack that instead maps anonymous memory
instead (using shmem if VM_SHARED is set, just like /dev/zero does).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:08 +11:00
Steven Rostedt
f25f9074c2 powerpc/ftrace: Fix math to calculate offset in TOC
Impact: fix dynamic ftrace with large modules in PPC64

The math to calculate the offset into the TOC that is taken from reading
the trampoline is incorrect. The bottom half of the offset is a signed
extended short. The current code was using an OR to create the offset
when it should have been using an addition.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-10 14:39:08 +11:00
Ingo Molnar
9d45cf9e36 Merge branch 'x86/urgent' into x86/apic
Conflicts:
	arch/x86/mach-default/setup.c

Semantic merge:
	arch/x86/kernel/irqinit_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-05 22:30:01 +01:00
Benjamin Herrenschmidt
59b608c2c3 powerpc: Fix oops on some machines due to incorrect pr_debug()
Recently, a patch left DEBUG enabled in the powerpc common PCI code,
resulting in an old bug in a pr_debug() statement to show up and cause
a NULL dereference on some machines.

This fixes the pr_debug() statement and reverts to DEBUG not being
force-enabled in that file.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-02-02 17:08:25 +11:00
Kumar Gala
105c31df6f powerpc/fsl-booke: Cleanup init/exception setup to be runtime
We currently have a few variants of fsl-booke processors (e500v1, e500v2,
e500mc, and e200).  They all have minor differences that we had previously
been handling via ifdefs.

To move towards having this support the following changes have been made:

* PID1, PID2 only exist on e500v1 & e500v2 and should not be accessed on
  e500mc or e200.  We use MMUCFG[NPIDS] to determine which case we are
  since we only touch PID1/2 in extremely early init code.

* Not all IVORs exist on all the processors so introduce cpu_setup
  functions for each variant to setup the proper IVORs that are either
  unique or exist but have some variations between the processors

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-01-28 18:16:50 -06:00
Ingo Molnar
6a385db5ce Merge branch 'core/percpu' into x86/core
Conflicts:
	kernel/irq/handle.c
2009-01-28 23:12:55 +01:00
Robert Jennings
69b052e828 powerpc/pseries: Correct VIO bus accounting problem in CMO env.
In the VIO bus code the wrappers for dma alloc_coherent and free_coherent
calls are rounding to IOMMU_PAGE_SIZE.  Taking a look at the underlying
calls, the actual mapping is promoted to PAGE_SIZE.  Changing the
rounding in these two functions fixes under-reporting the entitlement
used by the system.  Without this change, the system could run out of
entitlement before it believes it has and incur mapping failures at the
firmware level.

Also in the VIO bus code, the wrapper for dma map_sg is not exiting in
an error path where it should.  Rather than fall through to code for the
success case, this patch adds the return that is needed in the error path.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-28 17:15:52 +11:00
Ingo Molnar
77835492ed Merge commit 'v2.6.29-rc2' into perfcounters/core
Conflicts:
	include/linux/syscalls.h
2009-01-21 16:37:27 +01:00
Ingo Molnar
198030782c Merge branch 'x86/mm' into core/percpu
Conflicts:
	arch/x86/mm/fault.c
2009-01-21 10:39:51 +01:00
Ingo Molnar
af37501c79 Merge branch 'core/percpu' into perfcounters/core
Conflicts:
	arch/x86/include/asm/pda.h

We merge tip/core/percpu into tip/perfcounters/core because of a
semantic and contextual conflict: the former eliminates the PDA,
while the latter extends it with apic_perf_irqs field.

Resolve the conflict by moving the new field to the irq_cpustat
structure on 64-bit too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 18:15:49 +01:00
Tejun Heo
74e7904559 linker script: add missing .data.percpu.page_aligned
arm, arm/mach-integrator and powerpc were missing
.data.percpu.page_aligned in their percpu output section definitions.
Add it.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-17 15:26:32 +09:00
Michael Neuling
b60c31d85a powerpc: Get the number of SLBs from "slb-size" property
The PAPR says that the property for specifying the number of SLBs should
be called "slb-size".  We currently only look for "ibm,slb-size" because
this is what firmware actually presents.

This patch makes us look for the "slb-size" property as well and in
preference to the "ibm,slb-size".  This should future proof us if
firmware changes to match PAPR.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-16 16:15:16 +11:00
Paul Mackerras
3b6f9e5cb2 perf_counter: Add support for pinned and exclusive counter groups
Impact: New perf_counter features

A pinned counter group is one that the user wants to have on the CPU
whenever possible, i.e. whenever the associated task is running, for
a per-task group, or always for a per-cpu group.  If the system
cannot satisfy that, it puts the group into an error state where
it is not scheduled any more and reads from it return EOF (i.e. 0
bytes read).  The group can be released from error state and made
readable again using prctl(PR_TASK_PERF_COUNTERS_ENABLE).  When we
have finer-grained enable/disable controls on counters we'll be able
to reset the error state on individual groups.

An exclusive group is one that the user wants to be the only group
using the CPU performance monitor hardware whenever it is on.  The
counter group scheduler will not schedule an exclusive group if there
are already other groups on the CPU and will not schedule other groups
onto the CPU if there is an exclusive group scheduled (that statement
does not apply to groups containing only software counters, which can
always go on and which do not prevent an exclusive group from going on).
With an exclusive group, we will be able to let users program PMU
registers at a low level without the concern that those settings will
perturb other measurements.

Along the way this reorganizes things a little:
- is_software_counter() is moved to perf_counter.h.
- cpuctx->active_oncpu now records the number of hardware counters on
  the CPU, i.e. it now excludes software counters.  Nothing was reading
  cpuctx->active_oncpu before, so this change is harmless.
- A new cpuctx->exclusive field records whether we currently have an
  exclusive group on the CPU.
- counter_sched_out moves higher up in perf_counter.c and gets called
  from __perf_counter_remove_from_context and __perf_counter_exit_task,
  where we used to have essentially the same code.
- __perf_counter_sched_in now goes through the counter list twice, doing
  the pinned counters in the first loop and the non-pinned counters in
  the second loop, in order to give the pinned counters the best chance
  to be scheduled in.

Note that only a group leader can be exclusive or pinned, and that
attribute applies to the whole group.  This avoids some awkwardness in
some corner cases (e.g. where a group leader is closed and the other
group members get added to the context list).  If we want to relax that
restriction later, we can, and it is easier to relax a restriction than
to apply a new one.

This doesn't yet handle the case where a pinned counter is inherited
and goes into error state in the child - the error state is not
propagated up to the parent when the child exits, and arguably it
should.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-01-14 21:00:30 +11:00
Paul Mackerras
01d0287f06 powerpc/perf_counter: Make sure PMU gets enabled properly
This makes sure that we call the platform-specific ppc_md.enable_pmcs
function on each CPU before we try to use the PMU on that CPU.  If the
CPU goes off-line and then on-line, we need to do the enable_pmcs call
again, so we use the hw_perf_counter_setup hook to ensure that.  It gets
called as each CPU comes online, but it isn't called on the CPU that is
coming up, so this adds the CPU number as an argument to it (there were
no non-empty instances of hw_perf_counter_setup before).

This also arranges to set the pmcregs_in_use field of the lppaca (data
structure shared with the hypervisor) on each CPU when we are using the
PMU and clear it when we are not.  This allows the hypervisor to optimize
partition switches by not saving/restoring the PMU registers when we
aren't using the PMU.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-01-14 13:44:19 +11:00
Kumar Gala
5597b25c30 powerpc/e500mc: Doorbells need to be taken w/exceptions disabled
We use Doorbell interrupts for IPIs and thus we need to make sure we aren't
interrupted in the process of processing the IPI.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Dave Liu <daveliu@freescale.com>
2009-01-13 17:46:24 -06:00
Benjamin Herrenschmidt
c478b58135 powerpc/powermac: Fix occasional SMP boot failure
The PowerMac kernel occasionally fails to bring up the secondary CPUs on
SMP, the trigger factor seem to be fairly random and related to location
of code and data.

This appears to be due to the initial loading of the TOC value by the
secondary processor which now happens before we clear HID4:RM_CI (Real
Mode Cache Invalidate). This bit should really be cleared before we do
any load or store other than fetching code.

This fix works based on the assumption that all SMP 64-bit PowerMacs use
variants of the 970, which fortunately is true, by explicitely clearing
that bit, adding an slbia for good measure as RM_CI mode is known to
create bogus ERAT entries.

I also removed some spurrious debug output that was left enabled by
mistake while at it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-13 14:48:03 +11:00
Nathan Lynch
fc7a9feb9c powerpc/cacheinfo: Rename cache_dir per-cpu variable
The per_cpu__ prefix on DECLARE_PER_CPU'd variables is going away;
rename cache_dir to cache_dir_pcpu.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-13 14:48:02 +11:00
Stephen Rothwell
9477e455b4 powerpc: Cleanup from l64 to ll64 change: arch code
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-13 14:47:59 +11:00
Ingo Molnar
fe333321e2 powerpc: Change u64/s64 to a long long integer type
Convert arch/powerpc/ over to long long based u64:

 -#ifdef __powerpc64__
 -# include <asm-generic/int-l64.h>
 -#else
 -# include <asm-generic/int-ll64.h>
 -#endif
 +#include <asm-generic/int-ll64.h>

This will avoid reoccuring spurious warnings in core kernel code that
comes when people test on their own hardware. (i.e. x86 in ~98% of the
cases) This is what x86 uses and it generally helps keep 64-bit code
32-bit clean too.

[Adjusted to not impact user mode (from paulus) - sfr]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-13 14:47:59 +11:00
Milton Miller
66c721e184 powerpc/kexec: Check crash_base for relocatable kernel
Enforce that the crash kernel region never overlaps the current kernel,
as it will be written directly on kexec load.

Also, default to the previous KDUMP_KERNELBASE if the start is 0.

Other architectures (x86, ia64) state that specifying the start address
0 (or omitting it) will result in the kernel allocating it.  Before the
relocatable patch in 2.6.28, powerpc would adjust any other start value
to the hardcoded KDUMP_KERNELBASE of 32M.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-13 14:47:59 +11:00
Milton Miller
e16459c6b7 powerpc: Make dummy section a valid note header
We are declaring the dummy section (used to work around a binutils
bug) as PT_NOTE, but we don't have enough bytes for it to be a valid
note header, and kexec userspace complains:

Warning: Elf Note name is not null terminated
Warning: append= option is not passed. Using the first kernel root partition
Warning: Elf Note name is not null terminated

Instead of using the arbitray value 0xf177 (aka "fill"), declare a
no-name no-description note of type 0.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-13 14:47:58 +11:00
Benjamin Herrenschmidt
30aae739a9 Merge commit 'kumar/kumar-next' into next 2009-01-13 13:59:03 +11:00
Mike Travis
e65e49d0f3 irq: update all arches for new irq_desc
Impact: cleanup, update to new cpumask API

Irq_desc.affinity and irq_desc.pending_mask are now cpumask_var_t's
so access to them should be using the new cpumask API.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-12 15:27:13 -08:00
Yinghai Lu
dee4102a9a sparseirq: use kstat_irqs_cpu instead
Impact: build fix

Ingo Molnar wrote:

> tip/arch/blackfin/kernel/irqchip.c: In function 'show_interrupts':
> tip/arch/blackfin/kernel/irqchip.c:85: error: 'struct kernel_stat' has no member named 'irqs'
> make[2]: *** [arch/blackfin/kernel/irqchip.o] Error 1
> make[2]: *** Waiting for unfinished jobs....
>

So could move kstat_irqs array to irq_desc struct.

(s390, m68k, sparc) are not touched yet, because they don't support genirq

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-11 15:53:13 +01:00
Ingo Molnar
c0d362a832 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/perfcounters into perfcounters/core 2009-01-11 02:44:08 +01:00
Paul Mackerras
f78628374a powerpc/perf_counter: Add support for POWER6
This adds the back-end for the PMU on the POWER6 processor.
Fortunately, the event selection hardware is somewhat simpler on
POWER6 than on other POWER family processors, so the constraints
fit into only 32 bits.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-01-10 16:35:01 +11:00
Paul Mackerras
16b067993d powerpc/perf_counter: Add support for PPC970 family
This adds the back-end for the PMU on the PPC970 family.

The PPC970 allows events from the ISU to be selected in two different
ways.  Rather than use alternative event codes to express this, we
instead use a single encoding for ISU events and express the
resulting constraint (that you can't select events from all three
of FPU/IFU/VPU, ISU and IDU/STS at the same time, since they all come
in through only 2 multiplexers) using a NAND constraint field, and
work out which multiplexer is used for ISU events at compute_mmcr
time.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-01-10 16:34:07 +11:00
Paul Mackerras
4574910e50 powerpc/perf_counter: Add generic support for POWER-family PMU hardware
This provides the architecture-specific functions needed to access
PMU hardware on the 64-bit PowerPC processors.  It has been designed
for the IBM POWER family (POWER 4/4+/5/5+/6 and PPC970) but will
hopefully also suit other 64-bit PowerPC machines (although probably
not Cell given how different it is in this area).  This doesn't
include back-ends for any specific processors.

This implements a system which allows back-ends to express the
constraints that their hardware has on what events can be counted
simultaneously.  The constraints are expressed as a 64-bit mask +
64-bit value for each event, and the encoding is capable of
expressing the constraints arising from having a set of multiplexers
feeding an event bus, with some events being available through
multiple multiplexer settings, such as we get on POWER4 and PPC970.
Furthermore, the back-end can supply alternative event codes for
each event, and the constraint checking code will try all possible
combinations of alternative event codes to try to find a combination
that will fit.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-01-10 16:32:05 +11:00
Paul Mackerras
93a6d3ce69 powerpc: Provide a way to defer perf counter work until interrupts are enabled
Because 64-bit powerpc uses lazy (soft) interrupt disabling, it is
possible for a performance monitor exception to come in when the
kernel thinks interrupts are disabled (i.e. when they are
soft-disabled but hard-enabled).  In such a situation the performance
monitor exception handler might have some processing to do (such as
process wakeups) which can't be done in what is effectively an NMI
handler.

This provides a way to defer that work until interrupts get enabled,
either in raw_local_irq_restore() or by returning from an interrupt
handler to code that had interrupts enabled.  We have a per-processor
flag that indicates that there is work pending to do when interrupts
subsequently get re-enabled.  This flag is checked in the interrupt
return path and in raw_local_irq_restore(), and if it is set,
perf_counter_do_pending() is called to do the pending work.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2009-01-09 19:48:17 +11:00
Kumar Gala
1edda9c795 powerpc: Export cacheable_memzero as its now used in a driver
The Freescale PowerPC specific gianfar driver (gig-e) uses
cacheable_memzero for performance reasons we need to export
the symbol to allow the driver to be built as a module.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-08 16:25:17 +11:00
Ingo Molnar
2b931fb67e powerpc: Use correct type in prom_init.c
tce_entryp is a "u64 *" not an "unsigned long *".

[Split from a large patch -sfr]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-08 16:25:16 +11:00
Stephen Rothwell
6327716131 powerpc: Remove unnecessary casts
of_get_flat_dt_prop() returns a "void *", so we don't need to cast when
assigning its result to a pointer variable.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-08 16:25:16 +11:00
Paul Mackerras
16124f10df powerpc: Fix pciconfig_iobase system call on PCI-Express powermac
X has been failing to start on my quad G5 powermac since commit
1fd0f52583 ("powerpc: Fix domain numbers
in /proc on 64-bit") went in.  The reason is that the change allows X
to see the PCI-PCI bridge above the video card (previously it was
obscured by the fact that there were two "00" directories in
/proc/bus/pci), and the pciconfig_iobase system call on the bridge is
failing because of a hack that we have to return information about the
AGP bus when X asks about bus 0.  This machine doesn't have an AGP bus
(it has PCI Express) and so the pciconfig_iobase call is returning -1,
which ultimately causes X to fail to start.

This fixes it by checking that we have an AGP bridge before
redirecting the pciconfig_iobase call to return information about the
AGP bus.  With this, X starts successfully both on a quad G5 with
PCI Express and on an older dual G5 with AGP.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-08 16:25:11 +11:00
Nathan Lynch
93197a36a9 powerpc: Rewrite sysfs processor cache info code
The current code for providing processor cache information in sysfs
has the following deficiencies:
- several complex functions that are hard to understand
- implicit recursion (cache_desc_release -> kobject_put -> cache_desc_release)
- explicit recursion (create_cache_index_info)
- use of two per-cpu arrays when one would suffice
- duplication of work on systems where CPUs share cache

Also, when I looked at implementing support for a shared_cpu_map
attribute, it was pretty much impossible to handle hotplug without
checking every single online CPU's cache_desc list and fixing things
up... not that this is a hot path, but it would have introduced
O(n^2)-ish behavior during boot.  Addressing this involved rethinking
the core data structures used, which didn't lend itself to an
incremental approach.

This implementation maintains a "forest" (potentially more than one
tree) of cache objects which reflects the system's cache topology.
Cache objects are instantiated as needed as CPUs come online.  A
per-cpu array is used mainly for sysfs-related bookkeeping; the
objects in the array just point to the appropriate points in the
forest.

This maintains compatibility with the existing code and includes some
enhancements:
- Implement the shared_cpu_map attribute, which is essential for
  enabling userspace to discover the system's overall cache topology.
- Use cache-block-size properties if cache-line-size is not available.

I chose to place this implementation in a new file since it would have
roughly doubled the size of sysfs.c, which is already kind of messy.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-08 16:25:10 +11:00
Benjamin Herrenschmidt
c1f343028d powerpc/pci: Reserve legacy regions on PCI
There's a problem on some embedded platforms when we re-assign
everything on PCI, such as 44x. The generic code tries to avoid
assigning devices to addresses overlapping the low legacy
addresses such as VGA hard decoded areas using constants that
are unfortunately no good for us, as they don't take into account
the address translation we do to access PCI busses.

Thus we end up allocating things like IO BARs to 0, which is
technically legal, but will shadow hard decoded ports for use
by things like VGA cards.

This works around it by attempting to reserve legacy regions
before we try to assign addresses.

NOTE: This may have nasty side effects in cases I haven't tested
yet:

 - We try to use FW mappings (ie. powermac) and the FW has allocated
a conflicting address over those legacy regions. This will typically
happen. I would expect the new code to just fail with an informative
message without harm but I haven't had a chance to test that scenario
yet.

 - A device with fixed BARs overlapping those legacy addresses such
as an IDE controller in legacy mode is in the system. I don't know
for sure yet what will happen there, I have to test :-)

Ideally, we should change PCIBIOS_MIN_IO/MIN_MEM accross the board
to take a bus pointer so they can provide appropriate per-bus translated
values to the generic code but that's a more invasive patch. I will
do that in the future, but in the meantime, this fixes the problem
locally

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2009-01-08 16:25:07 +11:00
Linus Torvalds
b424e8d3b4 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (98 commits)
  PCI PM: Put PM callbacks in the order of execution
  PCI PM: Run default PM callbacks for all devices using new framework
  PCI PM: Register power state of devices during initialization
  PCI PM: Call pci_fixup_device from legacy routines
  PCI PM: Rearrange code in pci-driver.c
  PCI PM: Avoid touching devices behind bridges in unknown state
  PCI PM: Move pci_has_legacy_pm_support
  PCI PM: Power-manage devices without drivers during suspend-resume
  PCI PM: Add suspend counterpart of pci_reenable_device
  PCI PM: Fix poweroff and restore callbacks
  PCI: Use msleep instead of cpu_relax during ASPM link retraining
  PCI: PCIe portdrv: Add kerneldoc comments to remining core funtions
  PCI: PCIe portdrv: Rearrange code so that related things are together
  PCI: PCIe portdrv: Fix suspend and resume of PCI Express port services
  PCI: PCIe portdrv: Add kerneldoc comments to some core functions
  x86/PCI: Do not use interrupt links for devices using MSI-X
  net: sfc: Use pci_clear_master() to disable bus mastering
  PCI: Add pci_clear_master() as opposite of pci_set_master()
  PCI hotplug: remove redundant test in cpq hotplug
  PCI: pciehp: cleanup register and field definitions
  ...
2009-01-07 15:41:01 -08:00
Trent Piepho
6fd8be4bf7 powerpc/fsl-booke: Remove num_tlbcam_entries
This is a global variable defined in fsl_booke_mmu.c with a value that gets
initialized in assembly code in head_fsl_booke.S.

It's never used.

If some code ever does want to know the number of entries in TLB1, then
"numcams = mfspr(SPRN_TLB1CFG) & 0xfff", is a whole lot simpler than a
global initialized during kernel boot from assembly.

Signed-off-by: Trent Piepho <tpiepho@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-01-07 15:33:07 -06:00
Trent Piepho
19f5465e82 powerpc/fsl-booke: Don't hard-code size of struct tlbcam
Some assembly code in head_fsl_booke.S hard-coded the size of struct tlbcam
to 20 when it indexed the TLBCAM table.  Anyone changing the size of struct
tlbcam would not know to expect that.

The kernel already has a system to get the size of C structures into
assembly language files, asm-offsets, so let's use it.

The definition of the struct gets moved to a header, so that asm-offsets.c
can include it.

Signed-off-by: Trent Piepho <tpiepho@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2009-01-07 15:33:06 -06:00
Linus Torvalds
57c44c5f6f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits)
  trivial: chack -> check typo fix in main Makefile
  trivial: Add a space (and a comma) to a printk in 8250 driver
  trivial: Fix misspelling of "firmware" in docs for ncr53c8xx/sym53c8xx
  trivial: Fix misspelling of "firmware" in powerpc Makefile
  trivial: Fix misspelling of "firmware" in usb.c
  trivial: Fix misspelling of "firmware" in qla1280.c
  trivial: Fix misspelling of "firmware" in a100u2w.c
  trivial: Fix misspelling of "firmware" in megaraid.c
  trivial: Fix misspelling of "firmware" in ql4_mbx.c
  trivial: Fix misspelling of "firmware" in acpi_memhotplug.c
  trivial: Fix misspelling of "firmware" in ipw2100.c
  trivial: Fix misspelling of "firmware" in atmel.c
  trivial: Fix misspelled firmware in Kconfig
  trivial: fix an -> a typos in documentation and comments
  trivial: fix then -> than typos in comments and documentation
  trivial: update Jesper Juhl CREDITS entry with new email
  trivial: fix singal -> signal typo
  trivial: Fix incorrect use of "loose" in event.c
  trivial: printk: fix indentation of new_text_line declaration
  trivial: rtc-stk17ta8: fix sparse warning
  ...
2009-01-07 11:31:52 -08:00
Bjorn Helgaas
3f9455d488 PCI: powerpc: use generic pci_swizzle_interrupt_pin()
Use the generic pci_swizzle_interrupt_pin() instead of arch-specific code.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-01-07 11:12:52 -08:00
Masami Hiramatsu
1294156078 kprobes: add kprobe_insn_mutex and cleanup arch_remove_kprobe()
Add kprobe_insn_mutex for protecting kprobe_insn_pages hlist, and remove
kprobe_mutex from architecture dependent code.

This allows us to call arch_remove_kprobe() (and free_insn_slot) while
holding kprobe_mutex.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-06 15:59:20 -08:00
Frederik Schwarzer
025dfdafe7 trivial: fix then -> than typos in comments and documentation
- (better, more, bigger ...) then -> (...) than

Signed-off-by: Frederik Schwarzer <schwarzerf@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-01-06 11:28:06 +01:00
Linus Torvalds
61420f59a5 Merge branch 'cputime' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'cputime' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [PATCH] fast vdso implementation for CLOCK_THREAD_CPUTIME_ID
  [PATCH] improve idle cputime accounting
  [PATCH] improve precision of idle time detection.
  [PATCH] improve precision of process accounting.
  [PATCH] idle cputime accounting
  [PATCH] fix scaled & unscaled cputime accounting
2009-01-03 11:56:24 -08:00
Linus Torvalds
b840d79631 Merge branch 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (66 commits)
  x86: export vector_used_by_percpu_irq
  x86: use logical apicid in x2apic_cluster's x2apic_cpu_mask_to_apicid_and()
  sched: nominate preferred wakeup cpu, fix
  x86: fix lguest used_vectors breakage, -v2
  x86: fix warning in arch/x86/kernel/io_apic.c
  sched: fix warning in kernel/sched.c
  sched: move test_sd_parent() to an SMP section of sched.h
  sched: add SD_BALANCE_NEWIDLE at MC and CPU level for sched_mc>0
  sched: activate active load balancing in new idle cpus
  sched: bias task wakeups to preferred semi-idle packages
  sched: nominate preferred wakeup cpu
  sched: favour lower logical cpu number for sched_mc balance
  sched: framework for sched_mc/smt_power_savings=N
  sched: convert BALANCE_FOR_xx_POWER to inline functions
  x86: use possible_cpus=NUM to extend the possible cpus allowed
  x86: fix cpu_mask_to_apicid_and to include cpu_online_mask
  x86: update io_apic.c to the new cpumask code
  x86: Introduce topology_core_cpumask()/topology_thread_cpumask()
  x86: xen: use smp_call_function_many()
  x86: use work_on_cpu in x86/kernel/cpu/mcheck/mce_amd_64.c
  ...

Fixed up trivial conflict in kernel/time/tick-sched.c manually
2009-01-02 11:44:09 -08:00
Linus Torvalds
597b0d2162 Merge branch 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (140 commits)
  KVM: MMU: handle large host sptes on invlpg/resync
  KVM: Add locking to virtual i8259 interrupt controller
  KVM: MMU: Don't treat a global pte as such if cr4.pge is cleared
  MAINTAINERS: Maintainership changes for kvm/ia64
  KVM: ia64: Fix kvm_arch_vcpu_ioctl_[gs]et_regs()
  KVM: x86: Rework user space NMI injection as KVM_CAP_USER_NMI
  KVM: VMX: Fix pending NMI-vs.-IRQ race for user space irqchip
  KVM: fix handling of ACK from shared guest IRQ
  KVM: MMU: check for present pdptr shadow page in walk_shadow
  KVM: Consolidate userspace memory capability reporting into common code
  KVM: Advertise the bug in memory region destruction as fixed
  KVM: use cpumask_var_t for cpus_hardware_enabled
  KVM: use modern cpumask primitives, no cpumask_t on stack
  KVM: Extract core of kvm_flush_remote_tlbs/kvm_reload_remote_mmus
  KVM: set owner of cpu and vm file operations
  anon_inodes: use fops->owner for module refcount
  x86: KVM guest: kvm_get_tsc_khz: return khz, not lpj
  KVM: MMU: prepopulate the shadow on invlpg
  KVM: MMU: skip global pgtables on sync due to cr3 switch
  KVM: MMU: collapse remote TLB flushes on root sync
  ...
2009-01-02 11:41:11 -08:00
Al Viro
18d8fda7c3 take init_fs to saner place
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-12-31 18:07:42 -05:00
Hollis Blanchard
73e75b416f KVM: ppc: Implement in-kernel exit timing statistics
Existing KVM statistics are either just counters (kvm_stat) reported for
KVM generally or trace based aproaches like kvm_trace.
For KVM on powerpc we had the need to track the timings of the different exit
types. While this could be achieved parsing data created with a kvm_trace
extension this adds too much overhead (at least on embedded PowerPC) slowing
down the workloads we wanted to measure.

Therefore this patch adds a in-kernel exit timing statistic to the powerpc kvm
code. These statistic is available per vm&vcpu under the kvm debugfs directory.
As this statistic is low, but still some overhead it can be enabled via a
.config entry and should be off by default.

Since this patch touched all powerpc kvm_stat code anyway this code is now
merged and simplified together with the exit timing statistic code (still
working with exit timing disabled in .config).

Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:41 +02:00
Hollis Blanchard
7924bd4109 KVM: ppc: directly insert shadow mappings into the hardware TLB
Formerly, we used to maintain a per-vcpu shadow TLB and on every entry to the
guest would load this array into the hardware TLB. This consumed 1280 bytes of
memory (64 entries of 16 bytes plus a struct page pointer each), and also
required some assembly to loop over the array on every entry.

Instead of saving a copy in memory, we can just store shadow mappings directly
into the hardware TLB, accepting that the host kernel will clobber these as
part of the normal 440 TLB round robin. When we do that we need less than half
the memory, and we have decreased the exit handling time for all guest exits,
at the cost of increased number of TLB misses because the host overwrites some
guest entries.

These savings will be increased on processors with larger TLBs or which
implement intelligent flush instructions like tlbivax (which will avoid the
need to walk arrays in software).

In addition to that and to the code simplification, we have a greater chance of
leaving other host userspace mappings in the TLB, instead of forcing all
subsequent tasks to re-fault all their mappings.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:55:09 +02:00
Hollis Blanchard
db93f5745d KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessor
This patch doesn't yet move all 44x-specific data into the new structure, but
is the first step down that path. In the future we may also want to create a
struct kvm_vcpu_booke.

Based on patch from Liu Yu <yu.liu@freescale.com>.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:52:22 +02:00
Hollis Blanchard
0f55dc481e KVM: ppc: Rename "struct tlbe" to "struct kvmppc_44x_tlbe"
This will ease ports to other cores.

Also remove unused "struct kvm_tlb" while we're at it.

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-12-31 16:51:50 +02:00
Martin Schwidefsky
79741dd357 [PATCH] idle cputime accounting
The cpu time spent by the idle process actually doing something is
currently accounted as idle time. This is plain wrong, the architectures
that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
time spent doing nothing and the time spent by idle doing work. The first
is accounted with account_idle_time and the second with account_system_time.
The architectures that use the account_xxx_time interface directly and not
the account_xxx_ticks interface now need to do the check for the idle
process in their arch code. In particular to improve the system vs true
idle time accounting the arch code needs to measure the true idle time
instead of just testing for the idle process.
To improve the tick based accounting as well we would need an architecture
primitive that can tell us if the pt_regs of the interrupted context
points to the magic instruction that halts the cpu.

In addition idle time is no more added to the stime of the idle process.
This field now contains the system time of the idle process as it should
be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
every tick that occurs while idle is running will be accounted as idle
time.

This patch contains the necessary common code changes to be able to
distinguish idle system time and true idle time. The architectures with
support for VIRT_CPU_ACCOUNTING need some changes to exploit this.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-31 15:11:46 +01:00
Martin Schwidefsky
457533a7d3 [PATCH] fix scaled & unscaled cputime accounting
The utimescaled / stimescaled fields in the task structure and the
global cpustat should be set on all architectures. On s390 the calls
to account_user_time_scaled and account_system_time_scaled never have
been added. In addition system time that is accounted as guest time
to the user time of a process is accounted to the scaled system time
instead of the scaled user time.
To fix the bugs and to prevent future forgetfulness this patch merges
account_system_time_scaled into account_system_time and
account_user_time_scaled into account_user_time.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: Michael Neuling <mikey@neuling.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-31 15:11:46 +01:00
Linus Torvalds
3c92ec8ae9 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits)
  powerpc/44x: Support 16K/64K base page sizes on 44x
  powerpc: Force memory size to be a multiple of PAGE_SIZE
  powerpc/32: Wire up the trampoline code for kdump
  powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M
  powerpc/32: Allow __ioremap on RAM addresses for kdump kernel
  powerpc/32: Setup OF properties for kdump
  powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs()
  powerpc: Prepare xmon_save_regs for use with kdump
  powerpc: Remove default kexec/crash_kernel ops assignments
  powerpc: Make default kexec/crash_kernel ops implicit
  powerpc: Setup OF properties for ppc32 kexec
  powerpc/pseries: Fix cpu hotplug
  powerpc: Fix KVM build on ppc440
  powerpc/cell: add QPACE as a separate Cell platform
  powerpc/cell: fix build breakage with CONFIG_SPUFS disabled
  powerpc/mpc5200: fix error paths in PSC UART probe function
  powerpc/mpc5200: add rts/cts handling in PSC UART driver
  powerpc/mpc5200: Make PSC UART driver update serial errors counters
  powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver
  powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver
  ...

Fix trivial conflict in drivers/char/Makefile as per Paul's directions
2008-12-28 16:54:33 -08:00
Ilya Yanok
ca9153a3a2 powerpc/44x: Support 16K/64K base page sizes on 44x
This adds support for 16k and 64k page sizes on PowerPC 44x processors.

The PGDIR table is much smaller than a page when using 16k or 64k
pages (512 and 32 bytes respectively) so we allocate the PGDIR with
kzalloc() instead of __get_free_pages().

One PTE table covers rather a large memory area when using 16k or 64k
pages (32MB or 512MB respectively), so we can easily put FIXMAP and
PKMAP in the area covered by one PTE table.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Vladimir Panfilov <pvr@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-29 09:53:25 +11:00
Hollis Blanchard
6ca4f7494b powerpc: Force memory size to be a multiple of PAGE_SIZE
Ensure that total memory size is page-aligned, because otherwise
mark_bootmem() gets upset.

This error case was triggered by using 64 KiB pages in the kernel
while arch/powerpc/boot/4xx.c arbitrarily reduced the amount of memory
by 4096 (to work around a chip bug that affects the last 256 bytes of
physical memory).

Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-29 09:53:14 +11:00
Linus Torvalds
1db2a5c11e Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (85 commits)
  [S390] provide documentation for hvc_iucv kernel parameter.
  [S390] convert ctcm printks to dev_xxx and pr_xxx macros.
  [S390] convert zfcp printks to pr_xxx macros.
  [S390] convert vmlogrdr printks to pr_xxx macros.
  [S390] convert zfcp dumper printks to pr_xxx macros.
  [S390] convert cpu related printks to pr_xxx macros.
  [S390] convert qeth printks to dev_xxx and pr_xxx macros.
  [S390] convert sclp printks to pr_xxx macros.
  [S390] convert iucv printks to dev_xxx and pr_xxx macros.
  [S390] convert ap_bus printks to pr_xxx macros.
  [S390] convert dcssblk and extmem printks messages to pr_xxx macros.
  [S390] convert monwriter printks to pr_xxx macros.
  [S390] convert s390 debug feature printks to pr_xxx macros.
  [S390] convert monreader printks to pr_xxx macros.
  [S390] convert appldata printks to pr_xxx macros.
  [S390] convert setup printks to pr_xxx macros.
  [S390] convert hypfs printks to pr_xxx macros.
  [S390] convert time printks to pr_xxx macros.
  [S390] convert cpacf printks to pr_xxx macros.
  [S390] convert cio printks to pr_xxx macros.
  ...
2008-12-28 12:33:21 -08:00
Martin Schwidefsky
fc5243d98a [S390] arch_setup_additional_pages arguments
arch_setup_additional_pages currently gets two arguments, the binary
format descripton and an indication if the process uses an executable
stack or not. The second argument is not used by anybody, it could
be removed without replacement.

What actually does make sense is to pass an indication if the process
uses the elf interpreter or not. The glibc code will not use anything
from the vdso if the process does not use the dynamic linker, so for
statically linked binaries the architecture backend can choose not
to map the vdso.

Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:54 +01:00
Dale Farnsworth
f8f50b1bdd powerpc/32: Wire up the trampoline code for kdump
Wire up the trampoline code for ppc32 to relay exceptions from the
vectors at address 0 to vectors at address 32MB, and modify Kconfig
to enable Kdump support for all classic powerpcs.

Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-23 15:13:29 +11:00
Dale Farnsworth
ccdcef72c2 powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M
Add the ability for a classic ppc kernel to be loaded at an address
of 32MB.  This done by fixing a few places that assume we are loaded
at address 0, and by changing several uses of KERNELBASE to use
PAGE_OFFSET, instead.

Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-23 15:13:29 +11:00
Dale Farnsworth
6f29c3298b powerpc/32: Setup OF properties for kdump
Refactor the setting of kdump OF properties, moving the common code
from machine_kexec_64.c to machine_kexec.c where it can be used on
both ppc64 and ppc32.  This will be needed for kdump to work on ppc32
platforms.

Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-23 15:13:29 +11:00
Anton Vorontsov
7375331388 powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs()
This replaces the dummy crash_setup_regs function with full-fledged
crash_setup_regs implementation.  On PPC32 we simply use the new
ppc_save_regs function to dump the registers.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-23 15:13:28 +11:00
Anton Vorontsov
322b439455 powerpc: Prepare xmon_save_regs for use with kdump
Today the arch/powerpc/xmon/setjmp.S file contains only the
xmon_save_regs function.  We want to use it for kdump purposes, so
let's move the file into arch/powerpc/kernel/ and give the function a
more generic name (ppc_save_regs).

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-23 15:13:28 +11:00
Anton Vorontsov
77733f8a33 powerpc: Make default kexec/crash_kernel ops implicit
This removes the need for each platform to specify default kexec and
crash kernel ops, thus effectively adds a working kexec support for
most 6xx/7xx/7xxx-based boards.

Platforms that can't cope with default ops will explode in some weird
way (a hang or reboot is most likely), which means that the board's
kexec support should be fixed or blacklisted via dummy _prepare
callback returning -ENOSYS.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-23 15:13:28 +11:00
Dale Farnsworth
2e8e4f5b80 powerpc: Setup OF properties for ppc32 kexec
Refactor the setting of kexec OF properties, moving the common code
from machine_kexec_64.c to machine_kexec.c where it can be used on
both ppc64 and ppc32.  This is needed for kexec to work on ppc32
platforms.

Signed-off-by: Dale Farnsworth <dale@farnsworth.org>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-23 15:13:28 +11:00
Benjamin Herrenschmidt
9dce3ce5c5 powerpc/44x: 44x TLB doesn't need "Guarded" set for all pages
After discussing with chip designers, it appears that it's not
necessary to set G everywhere on 440 cores. The various core
errata related to prefetch should be sorted out by firmware by
disabling icache prefetching in CCR0. We add the workaround to
the kernel however just in case oooold firmwares don't do it.

This is valid for -all- 4xx core variants. Later ones hard wire
the absence of prefetch but it doesn't harm to clear the bits
in CCR0 (they should already be cleared anyway).

We still leave G=1 on the linear mapping for now, we need to
stop over-mapping RAM to be able to remove it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:16 +11:00
Benjamin Herrenschmidt
64b3d0e812 powerpc/mm: Rework usage of _PAGE_COHERENT/NO_CACHE/GUARDED
Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in
in the hash code based on some CPU feature bit.  We also manipulate
_PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.

This changes the logic so that instead, the PTE now contains
_PAGE_COHERENT for all normal RAM pages thay have I = 0 on platforms
that need it.  The hash code clears it if the feature bit is not set.

It also adds some clean accessors to setup various valid combinations
of access flags and change various bits of code to use them instead.

This should help having the PTE actually containing the bit
combinations that we really want.

I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead
set it explicitely from the TLB miss.  I will ultimately remove it
completely as it appears that it might not be needed after all
but in the meantime, having it in the TLB miss makes things a
lot easier.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:16 +11:00
Benjamin Herrenschmidt
7752035180 powerpc/mm: Runtime allocation of mmu context maps for nohash CPUs
This makes the MMU context code used for CPUs with no hash table
(except 603) dynamically allocate the various maps used to track
the state of contexts.

Only the main free map and CPU 0 stale map are allocated at boot
time.  Other CPU maps are allocated when those CPUs are brought up
and freed if they are unplugged.

This also moves the initialization of the MMU context management
slightly later during the boot process, which should be fine as
it's really only needed when userland if first started anyways.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:16 +11:00
Benjamin Herrenschmidt
2a4aca1144 powerpc/mm: Split low level tlb invalidate for nohash processors
Currently, the various forms of low level TLB invalidations are all
implemented in misc_32.S for 32-bit processors, in a fairly scary
mess of #ifdef's and with interesting duplication such as a whole
bunch of code for FSL _tlbie and _tlbia which are no longer used.

This moves things around such that _tlbie is now defined in
hash_low_32.S and is only used by the 32-bit hash code, and all
nohash CPUs use the various _tlbil_* forms that are now moved to
a new file, tlb_nohash_low.S.

I moved all the definitions for that stuff out of
include/asm/tlbflush.h as they are really internal mm stuff, into
mm/mmu_decl.h

The code should have no functional changes.  I kept some variants
inline for trivial forms on things like 40x and 8xx.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:16 +11:00
Benjamin Herrenschmidt
f048aace29 powerpc/mm: Add SMP support to no-hash TLB handling
This commit moves the whole no-hash TLB handling out of line into a
new tlb_nohash.c file, and implements some basic SMP support using
IPIs and/or broadcast tlbivax instructions.

Note that I'm using local invalidations for D->I cache coherency.

At worst, if another processor is trying to execute the same and
has the old entry in its TLB, it will just take a fault and re-do
the TLB flush locally (it won't re-do the cache flush in any case).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:16 +11:00
Benjamin Herrenschmidt
7c03d653cd powerpc/mm: Introduce MMU features
We're soon running out of CPU features and I need to add some new
ones for various MMU related bits, so this patch separates the MMU
features from the CPU features.  I moved over the 32-bit MMU related
ones, added base features for MMU type families, but didn't move
over any 64-bit only feature yet.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:16 +11:00
Benjamin Herrenschmidt
5e696617c4 powerpc/mm: Split mmu_context handling
This splits the mmu_context handling between 32-bit hash based
processors, 64-bit hash based processors and everybody else.  This is
preliminary work for adding SMP support for BookE processors.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:15 +11:00
Benjamin Herrenschmidt
6d2170be45 powerpc/4xx: Extended DCR support v2
This adds supports to the "extended" DCR addressing via the indirect
mfdcrx/mtdcrx instructions supported by some 4xx cores (440H6 and
later).

I enabled the feature for now only on AMCC 460 chips.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:15 +11:00
Nathan Lynch
13ba3c0092 powerpc: Convert sysfs cache code to of_find_next_cache_node()
Using the common code means that more complete cache information will
provided in sysfs on platforms that don't use the l2-cache property
convention.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:14 +11:00
Nathan Lynch
b2ea25b958 powerpc: Convert cpu_to_l2cache() to of_find_next_cache_node()
The smp code uses cache information to populate cpu_core_map; change
it to use common code for cache lookup.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:14 +11:00
Nathan Lynch
e523f723d6 powerpc: Add of_find_next_cache_node()
We have more than one piece of code that looks up cache nodes manually
using the "l2-cache" property.  Add a common helper routine which does
this and handles ePAPR's "next-level-cache" property as well as
powermac.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-21 14:21:14 +11:00
Ingo Molnar
30cd324e97 Merge branches 'tracing/ftrace', 'tracing/ring-buffer' and 'tracing/urgent' into tracing/core
Conflicts:
	include/linux/ftrace.h
2008-12-19 09:42:40 +01:00
Ingo Molnar
b9974dc6bd Merge branch 'linus' into cpus4096 2008-12-18 11:48:30 +01:00