Commit Graph

3550 Commits

Author SHA1 Message Date
Anton Blanchard
68568add2c powerpc/time: Remove unnecessary sanity check of decrementer expiration
The clockevents code uses max_delta_ns to avoid calling a
clockevent with too large a value.

Remove the redundant version of this in the timer_interrupt
code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:59 +11:00
Anton Blanchard
11b8633ada powerpc/time: Use clocksource_register_hz
Use clocksource_register_hz which calculates the shift/mult
factors for us. Also remove the shift = 22 assumption in
vsyscall_update - thanks to Paul Mackerras and John Stultz for
catching that.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:59 +11:00
Anton Blanchard
d8afc6fd95 powerpc/time: Use clockevents_calc_mult_shift
We can use clockevents_calc_mult_shift instead of doing all
the work ourselves.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:59 +11:00
Anton Blanchard
37fb9a0231 powerpc/time: Handle wrapping of decrementer
When re-enabling interrupts we have code to handle edge sensitive
decrementers by resetting the decrementer to 1 whenever it is negative.
If interrupts were disabled long enough that the decrementer wrapped to
positive we do nothing. This means interrupts can be delayed for a long
time until it finally goes negative again.

While we hope interrupts are never be disabled long enough for the
decrementer to go positive, we have a very good test team that can
drive any kernel into the ground. The softlockup data we get back
from these fails could be seconds in the future, completely missing
the cause of the lockup.

We already keep track of the timebase of the next event so use that
to work out if we should trigger a decrementer exception.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:58 +11:00
Jason Jin
05737c7c5b powerpc/fsl-pci: Don't hide resource for pci/e when configured as Agent/EP
Current pci/pcie init code will hide the pci/pcie host resource.
But did not judge it is host/RC or agent/EP. If configured as
agent/EP, we should avoid hiding its resource in the host side.

In PCI system, the Programing Interface can be used to judge the
host/agent status:
Programing Interface = 0: host
Programing Interface = 1: Agent

In PCIE system, both the Programing Interface and Header type can
be used to judge the RC/EP status.
Header Type = 0: EP
Header Type = 1: RC

Signed-off-by: Jason Jin <Jason.jin@freescale.com>
Signed-off-by: Mingkai Hu <Mingkai.hu@freescale.com>
Signed-off-by: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-11-24 02:01:40 -06:00
Will Deacon
a313f4c55d powerpc/signal32: Fix sigset_t conversion when copying to user
On PPC64, put_sigset_t converts a sigset_t to a compat_sigset_t
before copying it to userspace. There is a typo in the case that
we have 4 words to copy, meaning that we corrupt the compat_sigset_t.

It appears that _NSIG_WORDS can't be greater than 2 at the moment
so this code is probably always optimised away anyway.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:41:10 +11:00
Kumar Gala
187b9f2aa7 powerpc/book3e-64: Fix debug support for userspace
With the introduction of CONFIG_PPC_ADV_DEBUG_REGS user space debug is
broken on Book-E 64-bit parts that support delayed debug events.  When
switch_booke_debug_regs() sets DBCR0 we'll start getting debug events as
MSR_DE is also set and we aren't able to handle debug events from kernel
space.

We can remove the hack that always enables MSR_DE and loads up DBCR0 and
just utilize switch_booke_debug_regs() to get user space debug working
again.

We still need to handle critical/debug exception stacks & proper
save/restore of state for those exception levles to support debug events
from kernel space like we have on 32-bit.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:26:07 +11:00
Kumar Gala
b95bc21914 powerpc: Remove extraneous CONFIG_PPC_ADV_DEBUG_REGS define
All of DebugException is already protected by CONFIG_PPC_ADV_DEBUG_REGS
there is no need to have another such ifdef inside the function.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:26:07 +11:00
Kumar Gala
ba28c9aae2 powerpc: Revert show_regs() define for readability
We had an existing ifdef for 4xx & BOOKE processors that got changed to
CONFIG_PPC_ADV_DEBUG_REGS.  The define has nothing to do with
CONFIG_PPC_ADV_DEBUG_REGS.  The define really should be:

 #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)

and not

 #ifdef CONFIG_PPC_ADV_DEBUG_REGS

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:26:07 +11:00
Kevin Hao
2cd76629f6 powerpc/trace: Add a dummy stack frame for trace_hardirqs_off
The trace_hardirqs_off will use CALLER_ADDR0 and CALLER_ADDR1.
If an exception occurs in user mode, there is only one stack frame
on the stack and accessing the CALLER_ADDR1 will causes the following
call trace. So we create a dummy stack frame to make
trace_hardirqs_off happy.

WARNING: at kernel/smp.c:459
Modules linked in:
NIP: c0093280 LR: c00930a0 CTR: c0010780
REGS: edb87ae0 TRAP: 0700   Not tainted  (3.1.0)
MSR: 00021002 <ME,CE>  CR: 28002888  XER: 00000000
TASK = edce2ac0[17658] 'mthread-lock-on' THREAD: edb86000 CPU: 5
GPR00: 00000001 edb87b90 edce2ac0 00000005 c0019594 edb87bd8 00000001 00000fe3
GPR08: 00041000 c084138c 4e20120d edb87b90 48002888 1001aa7c 00000000 00000000
GPR16: 48830000 10012a8c 00000000 10000af4 00000001 c0810000 00000000 00000000
GPR24: ee9aa920 c0816a18 00000000 00000005 c0019594 edb87bd8 ee20178c edb87b90
NIP [c0093280] smp_call_function_many+0x214/0x2b4
LR [c00930a0] smp_call_function_many+0x34/0x2b4
Call Trace:
[edb87b90] [c00930a0] smp_call_function_many+0x34/0x2b4 (unreliable)
[edb87bd0] [c00194ec] __flush_tlb_page+0xac/0x100
[edb87c00] [c001957c] flush_tlb_page+0x3c/0x54
[edb87c10] [c00180ac] ptep_set_access_flags+0x74/0x12c
[edb87c40] [c0128068] handle_pte_fault+0x2f0/0x9ac
[edb87cb0] [c0128c3c] handle_mm_fault+0x104/0x1dc
[edb87ce0] [c05f40f4] do_page_fault+0x2dc/0x630
[edb87e50] [c001078c] handle_page_fault+0xc/0x80

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:47:54 +11:00
Anton Blanchard
d715e433b7 powerpc: Copy down exception vectors after feature fixups
kdump fails because we try to execute an HV only instruction. Feature
fixups are being applied after we copy the exception vectors down to 0
so they miss out on any updates.

We have always had this issue but it only became critical in v3.0
when we added CFAR support (breaks POWER5) and v3.1 when we added
POWERNV (breaks everyone).

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> [v3.0+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:47:54 +11:00
Anton Blanchard
6d1e2c6c1a powerpc: panic if we can't instantiate RTAS
I had to debug a strange situation where all manner of things were
failing. SMT threads, storage and network were all completely broken.

The root cause was we couldn't find enough memory to instantiate RTAS -
this was a network install so the initrd was huge.

Instead of limping along and failing in mysterious ways we should just
panic up front if RTAS exists and we can't allocate space for it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:47:54 +11:00
Suzuki Poulose
bbc24a25e2 powerpc/4xx: Fix typos in kexec config dependencies
Kexec is not supported on 47x. 47x is a variant of 44x with slightly
different MMU and SMP support. There was a typo in the config dependency
for kexec. This patch fixes the same.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc:	Kumar Gala <galak@kernel.crashing.org>
Cc:	Josh Boyer <jwboyer@gmail.com>
Cc:	linux ppc dev <linuxppc-dev@lists.ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:47:54 +11:00
Al Viro
9c8b39077b powerpc: Fix build breakage in jump_label.c
Should do what other architectures do and wrap all that code into
the appropriate ifdef

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:25:19 +11:00
Alexander Graf
5ccf55dd81 powerpc/kvm: Fix build failure with HV KVM and CBE
When running with HV KVM and CBE config options enabled, I get
build failures like the following:

  arch/powerpc/kernel/head_64.o: In function `cbe_system_error_hv':
  (.text+0x1228): undefined reference to `do_kvm_0x1202'
  arch/powerpc/kernel/head_64.o: In function `cbe_maintenance_hv':
  (.text+0x1628): undefined reference to `do_kvm_0x1602'
  arch/powerpc/kernel/head_64.o: In function `cbe_thermal_hv':
  (.text+0x1828): undefined reference to `do_kvm_0x1802'

This is because we jump to a KVM handler when HV is enabled, but we
only generate the handler with PR KVM mode.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-08 15:34:04 +11:00
Yong Zhang
a3a9f3b47d powerpc/irq: Remove IRQF_DISABLED
Since commit [e58aa3d2: genirq: Run irq handlers with interrupts disabled],
We run all interrupt handlers with interrupts disabled
and we even check and yell when an interrupt handler
returns with interrupts enabled (see commit [b738a50a:
genirq: Warn when handler enables interrupts]).

So now this flag is a NOOP and can be removed.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-08 14:51:46 +11:00
Linus Torvalds
32aaeffbd4 Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and  removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 19:44:47 -08:00
Linus Torvalds
1197ab2942 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (106 commits)
  powerpc/p3060qds: Add support for P3060QDS board
  powerpc/83xx: Add shutdown request support to MCU handling on MPC8349 MITX
  powerpc/85xx: Make kexec to interate over online cpus
  powerpc/fsl_booke: Fix comment in head_fsl_booke.S
  powerpc/85xx: issue 15 EOI after core reset for FSL CoreNet devices
  powerpc/8xxx: Fix interrupt handling in MPC8xxx GPIO driver
  powerpc/85xx: Add 'fsl,pq3-gpio' compatiable for GPIO driver
  powerpc/86xx: Correct Gianfar support for GE boards
  powerpc/cpm: Clear muram before it is in use.
  drivers/virt: add ioctl for 32-bit compat on 64-bit to fsl-hv-manager
  powerpc/fsl_msi: add support for "msi-address-64" property
  powerpc/85xx: Setup secondary cores PIR with hard SMP id
  powerpc/fsl-booke: Fix settlbcam for 64-bit
  powerpc/85xx: Adding DCSR node to dtsi device trees
  powerpc/85xx: clean up FPGA device tree nodes for Freecsale QorIQ boards
  powerpc/85xx: fix PHYS_64BIT selection for P1022DS
  powerpc/fsl-booke: Fix setup_initial_memory_limit to not blindly map
  powerpc: respect mem= setting for early memory limit setup
  powerpc: Update corenet64_smp_defconfig
  powerpc: Update mpc85xx/corenet 32-bit defconfigs
  ...

Fix up trivial conflicts in:
 - arch/powerpc/configs/40x/hcu4_defconfig
	removed stale file, edited elsewhere
 - arch/powerpc/include/asm/udbg.h, arch/powerpc/kernel/udbg.c:
	added opal and gelic drivers vs added ePAPR driver
 - drivers/tty/serial/8250.c
	moved UPIO_TSI to powerpc vs removed UPIO_DWAPB support
2011-11-06 17:12:03 -08:00
Matthew McClintock
7d0d3ad5e3 powerpc/fsl_booke: Fix comment in head_fsl_booke.S
Fix typo in comments introduced by:

commit 6dece0eb69
Author: Scott Wood <scottwood@freescale.com>
Date:   Mon Jul 25 11:29:33 2011 +0000

    powerpc/32: Pass device tree address as u64 to machine_init

Signed-off-by: Matthew McClintock <msm@freescale.com>
cc: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-11-03 13:12:28 -05:00
Paul Gortmaker
ead53f22dc powerpc: remove non-required uses of include <linux/module.h>
None of the files touched here are modules, and they are not
exporting any symbols either -- so there is no need to be including
the module.h.  Builds of all the files remains successful.

Even kernel/module.c does not need to include it, since it includes
linux/moduleloader.h instead.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:44 -04:00
Paul Gortmaker
4b16f8e2d6 powerpc: various straight conversions from module.h --> export.h
All these files were including module.h just for the basic
EXPORT_SYMBOL infrastructure.  We can shift them off to the
export.h header which is a way smaller footprint and thus
realize some compile time gains.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:44 -04:00
Paul Gortmaker
cab2e05271 powerpc: fix implicit use of cache.h in kernel/firmware.c
This file only needs export.h to get EXPORT_SYMBOL, but in doing
so, it uncovers an implicit use of linux/cache.h as follows:

 CC      arch/powerpc/kernel/firmware.o
arch/powerpc/kernel/firmware.c:20: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__read_mostly'
arch/powerpc/kernel/firmware.c:21: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__used'
make[2]: *** [arch/powerpc/kernel/firmware.o] Error 1

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:41 -04:00
Paul Gortmaker
62fe91bba2 powerpc: Fix up implicit sched.h users
They are getting it through device.h --> module.h path, but we want
to clean that up.  This is a sample of what will happen if we don't:

  pseries/iommu.c: In function 'tce_build_pSeriesLP':
  pseries/iommu.c:136: error: implicit declaration of function 'show_stack'

  pseries/eeh.c: In function 'eeh_token_to_phys':
  pseries/eeh.c:359: error: 'init_mm' undeclared (first use in this function)

  pseries/eeh_event.c: In function 'eeh_event_handler':
  pseries/eeh_event.c:63: error: implicit declaration of function 'daemonize'
  pseries/eeh_event.c:64: error: implicit declaration of function 'set_current_state'
  pseries/eeh_event.c:64: error: 'TASK_INTERRUPTIBLE' undeclared (first use in this function)
  pseries/eeh_event.c:64: error: (Each undeclared identifier is reported only once
  pseries/eeh_event.c:64: error: for each function it appears in.)
  pseries/eeh_event.c: In function 'eeh_thread_launcher':
  pseries/eeh_event.c:109: error: 'CLONE_KERNEL' undeclared (first use in this function)

  hotplug-cpu.c: In function 'pseries_mach_cpu_die':
  hotplug-cpu.c:115: error: implicit declaration of function 'idle_task_exit'

  kernel/swsusp_64.c: In function 'do_after_copyback':
  kernel/swsusp_64.c:17: error: implicit declaration of function 'touch_softlockup_watchdog'

  cell/spufs/context.c: In function 'alloc_spu_context':
  cell/spufs/context.c:60: error: implicit declaration of function 'get_task_mm'
  cell/spufs/context.c:60: warning: assignment makes pointer from integer without a cast
  cell/spufs/context.c: In function 'spu_forget':
  cell/spufs/context.c:127: error: implicit declaration of function 'mmput'

  pasemi/dma_lib.c: In function 'pasemi_dma_stop_chan':
  pasemi/dma_lib.c:332: error: implicit declaration of function 'cond_resched'

  sysdev/fsl_lbc.c: In function 'fsl_lbc_ctrl_irq':
  sysdev/fsl_lbc.c:247: error: 'TASK_NORMAL' undeclared (first use in this function)

Add in sched.h so these get the definitions they are looking for.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:40 -04:00
Paul Gortmaker
b56eade55d powerpc: Fix up implicit stat.h users
They get it via module.h (via device.h) but we want to clean that up.
When we do, we'll get things like:

ibmebus.c:314: error: 'S_IWUSR' undeclared here (not in a function)
vio.c:972: error: 'S_IWUSR' undeclared here (not in a function)

so add in the stat header it is using explicitly in advance.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:39 -04:00
Paul Gortmaker
9308794884 powerpc: include export.h for files using EXPORT_SYMBOL/THIS_MODULE
Fix failures in powerpc associated with the previously allowed
implicit module.h presence that now lead to things like this:

arch/powerpc/mm/mmu_context_hash32.c:76:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/powerpc/mm/tlb_hash32.c:48:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/powerpc/kernel/pci_32.c:51:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
arch/powerpc/kernel/iomap.c:36:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/powerpc/platforms/44x/canyonlands.c:126:1: error: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/powerpc/kvm/44x.c:168:59: error: 'THIS_MODULE' undeclared (first use in this function)

[with several contibutions from Stephen Rothwell <sfr@canb.auug.org.au>]

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:38 -04:00
Paul Gortmaker
66b15db69c powerpc: add export.h to files making use of EXPORT_SYMBOL
With module.h being implicitly everywhere via device.h, the absence
of explicitly including something for EXPORT_SYMBOL went unnoticed.
Since we are heading to fix things up and clean module.h from the
device.h file, we need to explicitly include these files now.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:37 -04:00
Paul Gortmaker
333a151822 powerpc: io-workarounds.c was implicitly getting init_mm
It was coming in via device.h --> module.h etc. but we want to
clean that up.  So explicitly include the header where init_mm
is being declared.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:37 -04:00
Linus Torvalds
1bc87b0055 Merge branch 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (75 commits)
  KVM: SVM: Keep intercepting task switching with NPT enabled
  KVM: s390: implement sigp external call
  KVM: s390: fix register setting
  KVM: s390: fix return value of kvm_arch_init_vm
  KVM: s390: check cpu_id prior to using it
  KVM: emulate lapic tsc deadline timer for guest
  x86: TSC deadline definitions
  KVM: Fix simultaneous NMIs
  KVM: x86 emulator: convert push %sreg/pop %sreg to direct decode
  KVM: x86 emulator: switch lds/les/lss/lfs/lgs to direct decode
  KVM: x86 emulator: streamline decode of segment registers
  KVM: x86 emulator: simplify OpMem64 decode
  KVM: x86 emulator: switch src decode to decode_operand()
  KVM: x86 emulator: qualify OpReg inhibit_byte_regs hack
  KVM: x86 emulator: switch OpImmUByte decode to decode_imm()
  KVM: x86 emulator: free up some flag bits near src, dst
  KVM: x86 emulator: switch src2 to generic decode_operand()
  KVM: x86 emulator: expand decode flags to 64 bits
  KVM: x86 emulator: split dst decode to a generic decode_operand()
  KVM: x86 emulator: move memop, memopp into emulation context
  ...
2011-10-30 15:36:45 -07:00
Kumar Gala
ba14f64917 powerpc: respect mem= setting for early memory limit setup
For those MMUs that have some form of bolt'd linear mapping (TLB)
required its rare that one ever sets mem= smaller than the size of that
mapping.

However, on Book-E 64 parts the initial linear mapping is quite large
(1G) so its quite reasonable that mem= is set smaller than that.

We need to parse the command line for mem= limit and constrain the
amount of memory we map initially by it if need be.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-10-11 23:30:40 -05:00
Bharat Bhushan
e33ee8b6f4 powerpc: e500mc: Fix: use CONFIG_PPC_E500MC in idle_e500.S
It is wrongly using undefined CONFIG_E500MC.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-10-11 23:23:24 -05:00
Kumar Gala
37caf9f2a1 powerpc/fsl-booke: Handle L1 D-cache parity error correctly on e500mc
If the L1 D-Cache is in write shadow mode the HW will auto-recover the
error.  However we might still log the error and cause a machine check
(if L1CSR0[CPE] - Cache error checking enable).  We should only treat
the non-write shadow case as non-recoverable.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-10-06 23:36:55 -05:00
Benjamin Herrenschmidt
7680057cc4 powerpc: Don't try OPAL takeover on old 970 blades
The firmware on old 970 blades supports some kind of takeover called
"TNK takeover" which will crash if we try to probe for OPAL takeover,
so don't do it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-29 17:04:59 +10:00
Carl E. Love
d15f02eb4e powerpc/perf_event: Fix Power6 L1 cache read & write event codes]
The current L1 cache read event code 0x80082 only counts for thread 0. The
event code 0x280030 should be used to count events on thread 0 and 1. The
patch fixes the event code for the L1 cache read.

The current L1 cache write event code 0x80086 only counts for thread 0. The
event code 0x180032 should be used to count events on thread 0 and 1. The
patch fixes the event code for the L1 cache write.

FYI, the documentation lists three event codes for the L1 cache read event
and three event codes for the L1 cache write event.  The event description
for the event codes is as follows:

L1 cache read requests  0x80082  LSU 0 only
L1 cache read requests  0x8008A  LSU 1 only
L1 cache read requests  0x80030  LSU 1 or LSU 0, counter 2 only.

L1 cache store requests 0x80086  LSU 0 only
L1 cache store requests 0x8008E  LSU 1 only
L1 cache store requests 0x80032  LSU 0 or LSU 1, counter 1 only.

There can only be one request from either LSU 0 or 1 active at a time.

Signed-off-by: Carl Love <cel@us.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-29 17:04:59 +10:00
Benjamin Herrenschmidt
e69b742a67 powerpc/ptrace: Fix build with gcc 4.6
gcc (rightfully) complains that we are accessing beyond the
end of the fpr array (we do, to access the fpscr).

The only sane thing to do (whether anything in that code can be
called remotely sane is debatable) is to special case fpscr and
handle it as a separate statement.

I initially tried to do it it by making the array access conditional
to index < PT_FPSCR and using a 3rd else leg but for some reason gcc
was unable to understand it and still spewed the warning.

So I ended up with something a tad more intricated but it seems to
build on 32-bit and on 64-bit with and without VSX.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-29 17:02:04 +10:00
Benjamin Herrenschmidt
bb36c44557 powerpc/pci: Don't configure PCIe settings when PCI_PROBE_ONLY is set
We don't want to configure PCI Express Max Payload Size or
Max Read Request Size on systems that set that flag. The
firmware will have done it for us, and under hypervisors such
as pHyp we don't even see the parent switches and bridges and
thus can make no assumption on what values are safe to use.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-26 14:22:39 +10:00
Paul Mackerras
19ccb76a19 KVM: PPC: Implement H_CEDE hcall for book3s_hv in real-mode code
With a KVM guest operating in SMT4 mode (i.e. 4 hardware threads per
core), whenever a CPU goes idle, we have to pull all the other
hardware threads in the core out of the guest, because the H_CEDE
hcall is handled in the kernel.  This is inefficient.

This adds code to book3s_hv_rmhandlers.S to handle the H_CEDE hcall
in real mode.  When a guest vcpu does an H_CEDE hcall, we now only
exit to the kernel if all the other vcpus in the same core are also
idle.  Otherwise we mark this vcpu as napping, save state that could
be lost in nap mode (mainly GPRs and FPRs), and execute the nap
instruction.  When the thread wakes up, because of a decrementer or
external interrupt, we come back in at kvm_start_guest (from the
system reset interrupt vector), find the `napping' flag set in the
paca, and go to the resume path.

This has some other ramifications.  First, when starting a core, we
now start all the threads, both those that are immediately runnable and
those that are idle.  This is so that we don't have to pull all the
threads out of the guest when an idle thread gets a decrementer interrupt
and wants to start running.  In fact the idle threads will all start
with the H_CEDE hcall returning; being idle they will just do another
H_CEDE immediately and go to nap mode.

This required some changes to kvmppc_run_core() and kvmppc_run_vcpu().
These functions have been restructured to make them simpler and clearer.
We introduce a level of indirection in the wait queue that gets woken
when external and decrementer interrupts get generated for a vcpu, so
that we can have the 4 vcpus in a vcore using the same wait queue.
We need this because the 4 vcpus are being handled by one thread.

Secondly, when we need to exit from the guest to the kernel, we now
have to generate an IPI for any napping threads, because an HDEC
interrupt doesn't wake up a napping thread.

Thirdly, we now need to be able to handle virtual external interrupts
and decrementer interrupts becoming pending while a thread is napping,
and deliver those interrupts to the guest when the thread wakes.
This is done in kvmppc_cede_reentry, just before fast_guest_return.

Finally, since we are not using the generic kvm_vcpu_block for book3s_hv,
and hence not calling kvm_arch_vcpu_runnable, we can remove the #ifdef
from kvm_arch_vcpu_runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:30 +03:00
Paul Mackerras
0214394760 KVM: PPC: book3s_pr: Simplify transitions between virtual and real mode
This simplifies the way that the book3s_pr makes the transition to
real mode when entering the guest.  We now call kvmppc_entry_trampoline
(renamed from kvmppc_rmcall) in the base kernel using a normal function
call instead of doing an indirect call through a pointer in the vcpu.
If kvm is a module, the module loader takes care of generating a
trampoline as it does for other calls to functions outside the module.

kvmppc_entry_trampoline then disables interrupts and jumps to
kvmppc_handler_trampoline_enter in real mode using an rfi[d].
That then uses the link register as the address to return to
(potentially in module space) when the guest exits.

This also simplifies the way that we call the Linux interrupt handler
when we exit the guest due to an external, decrementer or performance
monitor interrupt.  Instead of turning on the MMU, then deciding that
we need to call the Linux handler and turning the MMU back off again,
we now go straight to the handler at the point where we would turn the
MMU on.  The handler will then return to the virtual-mode code
(potentially in the module).

Along the way, this moves the setting and clearing of the HID5 DCBZ32
bit into real-mode interrupts-off code, and also makes sure that
we clear the MSR[RI] bit before loading values into SRR0/1.

The net result is that we no longer need any code addresses to be
stored in vcpu->arch.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:29 +03:00
Paul Mackerras
177339d7f7 KVM: PPC: Assemble book3s{,_hv}_rmhandlers.S separately
This makes arch/powerpc/kvm/book3s_rmhandlers.S and
arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as
separate compilation units rather than having them #included in
arch/powerpc/kernel/exceptions-64s.S.  We no longer have any
conditional branches between the exception prologs in
exceptions-64s.S and the KVM handlers, so there is no need to
keep their contents close together in the vmlinux image.

In their current location, they are using up part of the limited
space between the first-level interrupt handlers and the firmware
NMI data area at offset 0x7000, and with some kernel configurations
this area will overflow (e.g. allyesconfig), leading to an
"attempt to .org backwards" error when compiling exceptions-64s.S.

Moving them out requires that we add some #includes that the
book3s_{,hv_}rmhandlers.S code was previously getting implicitly
via exceptions-64s.S.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-09-25 19:52:28 +03:00
Thadeu Lima de Souza Cascardo
d12b524f8b powerpc: Reserve iommu page 0
Some devices have a dma-window that starts at the address 0. This allows
DMA addresses to be mapped to this address and returned to drivers as a
valid DMA address. Some drivers may not behave well in this case, since
the address 0 is considered an error or not allocated.

The solution to avoid this kind of error from happening is reserve the
page addressed as 0 so it cannot be allocated for a DMA mapping.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-23 10:27:03 +10:00
Anshuman Khandual
a120db06c3 perf events, powerpc: Add POWER7 stalled-cycles-frontend/backend events
perf events, powerpc: Add POWER7 stalled-cycles-frontend/backend events

Extent the POWER7 PMU driver with definitions for generic front-end and back-end
stall events.

As explained in Ingo's original comment(8f62242246
), the exact definitions of the stall events are very much processor specific as

different things mean different in their respective instruction pipeline. These
two Power7 raw events are the closest approximation to the concept detailed in
Ingo's comment.

[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] = 0x100f8, /* GCT_NOSLOT_CYC */
It means cycles when the Global Completion Table has no slots from this thread

[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] = 0x4000a,  /* CMPLU_STALL */
It means no groups completed and GCT not empty for this thread

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:12:56 +10:00
Benjamin Herrenschmidt
ed79ba9e15 powerpc/powernv: Machine check and other system interrupts
OPAL can handle various interrupt for us such as Machine Checks (it
performs all sorts of recovery tasks and passes back control to us with
informations about the error), Hardware Management Interrupts and Softpatch
interrupts.

This wires up the mechanisms and prints out specific informations returned
by HAL when a machine check occurs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:10:03 +10:00
Benjamin Herrenschmidt
daea1175a9 powerpc/powernv: Support for OPAL console
This adds a udbg and an hvc console backend for supporting a console
using the OPAL console interfaces.

On OPAL v1 we have hvc0 mapped to whatever console the system was
configured for (network or hvsi serial port) via the service
processor.

On OPAL v2 we have hvcN mapped to the Nth console provided by OPAL
which generally corresponds to:

	hvc0 : network console (raw protocol)
	hvc1 : serial port S1 (hvsi)
	hvc2 : serial port S2 (hvsi)

Note: At this point, early debug console only works with OPAL v1
and shouldn't be enabled in a normal kernel.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:54 +10:00
Benjamin Herrenschmidt
6e35d5dac0 powerpc/powernv: Add support for instanciating OPAL v2 from Open Firmware
OPAL v2 is instantiated in a way similar to RTAS using Open Firmware
client interface calls, and the resulting address and entry point are
put in the device-tree

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:52 +10:00
Benjamin Herrenschmidt
14a43e69ed powerpc/powernv: Basic support for OPAL
Add definition of OPAL interfaces along with  the wrappers to call
into OPAL runtime and the early device-tree parsing hook to locate
the OPAL runtime firmware.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:50 +10:00
Benjamin Herrenschmidt
817c21ad9a powerpc/powernv: Get kernel command line accross OPAL takeover
We stash it in boot_command_line which isn't in BSS and so won't
be overwritten. We then use that as a default cmd_line before
we walk the device-tree.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:48 +10:00
Benjamin Herrenschmidt
27f4488872 powerpc/powernv: Add OPAL takeover from PowerVM
On machines supporting the OPAL firmware version 1, the system
is initially booted under pHyp. We then use a special hypercall
to verify if OPAL is available and if it is, we then trigger
a "takeover" which disables pHyp and loads the OPAL runtime
firmware, giving control to the kernel in hypervisor mode.

This patch add the necessary code to detect that the OPAL takeover
capability is present when running under PowerVM (aka pHyp) and
perform said takeover to get hypervisor control of the processor.

To perform the takeover, we must first use RTAS (within Open
Firmware runtime environment) to start all processors & threads,
in order to give control to OPAL on all of them. We then call
the takeover hypercall on everybody, OPAL will re-enter the kernel
main entry point passing it a flat device-tree.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 16:09:47 +10:00
Benjamin Herrenschmidt
e550592e68 powerpc/powernv: Don't clobber r9 in relative_toc()
With OPAL, r8 and r9 will be used to pass the OPAL base and entry
for debugging purposes (those informations are also in the
device-tree). We don't want to clobber those registers that
early.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:24 +10:00
Benjamin Herrenschmidt
781fb7a3e4 powerpc/pci: Call pcie_bus_configure_settings()
This new function is used to properly setup the PCI Express Max Payload Size
(and in some circumstances Max Read Request Size).

Some systems will not operate properly if these aren't set correctly and
the firmware doesn't always do it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:24 +10:00
Benjamin Herrenschmidt
fb82b83970 powerpc/smp: More generic support for "soft hotplug"
This adds more generic support for doing CPU hotplug with a simple
idle loop and no actual reset of the processors. The generic
smp_generic_kick_cpu() does the hotplug bringup trick if the PACA
shows that the CPU has already been started at boot and we provide
an accessor for the CPU state.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:24 +10:00
Anton Blanchard
dfbe93a222 powerpc: Coding style cleanups
While converting code to use for_each_node_by_type I noticed a
number of coding style issues.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:23 +10:00
Anton Blanchard
94db7c5e14 powerpc: Use for_each_node_by_type instead of open coding it
Use for_each_node_by_type instead of open coding it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 15:53:23 +10:00
Hector Martin
c26afe9e85 powerpc/ps3: Add gelic udbg driver
Add a new udbg driver for the PS3 gelic Ehthernet device.

This driver shares only a few stucture and constant definitions with the
gelic Ethernet device driver, so is implemented as a stand-alone driver
with no dependencies on the gelic Ethernet device driver.

Signed-off-by: Hector Martin <hector@marcansoft.com>
Signed-off-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:20:05 +10:00
Tang Yuantian
0330581ab3 powerpc/mm: Fix the call trace when resumed from hibernation
In SMP mode, the kernel would produce call trace when resumed
	from hibernation. The reason is when the function destroy_context
	is called to drop the resuming mm context, the mm->context.active
	is 1 which is wrong and should be zero.
	We pass the current->active_mm as previous mm context to function
	switch_mmu_context to decrease the context.active by 1.

	In UP mode, there is no effect.

Signed-off-by: Tang Yuantian <b29983@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:58 +10:00
Scott Wood
6dece0eb69 powerpc/32: Pass device tree address as u64 to machine_init
u64 is used rather than phys_addr_t to keep things simple, as
this is called from assembly code.

Update callers to pass a 64-bit address in r3/r4.  Other unused
register assignments that were once parameters to machine_init
are dropped.

For FSL BookE, look up the physical address of the device tree from the
effective address passed in r3 by the loader.  This is required for
situations where memory does not start at zero (due to AMP or IOMMU-less
virtualization), and thus the IMA doesn't start at zero, and thus the
device tree effective address does not equal the physical address.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:47 +10:00
Becky Bruce
41151e77a4 powerpc: Hugetlb for BookE
Enable hugepages on Freescale BookE processors.  This allows the kernel to
use huge TLB entries to map pages, which can greatly reduce the number of
TLB misses and the amount of TLB thrashing experienced by applications with
large memory footprints.  Care should be taken when using this on FSL
processors, as the number of large TLB entries supported by the core is low
(16-64) on current processors.

The supported set of hugepage sizes include 4m, 16m, 64m, 256m, and 1g.
Page sizes larger than the max zone size are called "gigantic" pages and
must be allocated on the command line (and cannot be deallocated).

This is currently only fully implemented for Freescale 32-bit BookE
processors, but there is some infrastructure in the code for
64-bit BooKE.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:40 +10:00
Arnd Bergmann
7df5659eef serial/8250: Move UPIO_TSI to powerpc
This iotype is only used by the legacy_serial code in powerpc, so the
code should live there, rather than be compiled in for every 8250
driver.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: linux-serial@vger.kernel.org
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:39 +10:00
Milton Miller
2eccacd097 powerpc: Tidy up dma_map_ops after adding new hook
The new get_required_mask hook name is longer than many of but not all
of the prior ops.  Tidy the struct initializers to align the equal signs
using the local whitespace.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Cc: benh@kernel.crashing.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:37 +10:00
Milton Miller
d24f9c6999 powerpc: Use the newly added get_required_mask dma_map_ops hook
Now that the generic code has dma_map_ops set, instead of having a
messy ifdef & if block in the base dma_get_required_mask hook push
the computation into the dma ops.

If the ops fails to set the get_required_mask hook default to the
width of dma_addr_t.

This also corrects ibmbus ibmebus_dma_supported to require a 64
bit mask.  I doubt anything is checking or setting the dma mask on
that bus.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Cc: benh@kernel.crashing.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-20 09:19:35 +10:00
Milton Miller
6a5c7be5e4 powerpc: Override dma_get_required_mask by platform hook and ops
The hook dma_get_required_mask is supposed to return the mask required
by the platform to operate efficently.  The generic version of
dma_get_required_mask in driver/base/platform.c returns a mask based
only on max_pfn.  However, this is likely too big for iommu systems
and could be too small for platforms that require a dma offset or have
a secondary window at a high offset.

Override the default, provide a hook in ppc_md used by pseries lpar and
cell, and provide the default answer based on memblock_end_of_DRAM(),
with hooks for get_dma_offset, and provide an implementation for iommu
that looks at the defined table size.  Coverting from the end address
to the required bit mask is based on the generic implementation.

The need for this was discovered when the qla2xxx driver switched to
64 bit dma then reverted to 32 bit when dma_get_required_mask said
32 bits was sufficient.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Cc: benh@kernel.crashing.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-09-01 16:00:19 +10:00
Benjamin Herrenschmidt
9bb7361d99 Merge remote-tracking branch 'jwb/next' into next 2011-08-30 15:14:46 +10:00
Timur Tabi
dcd83aaff1 tty/powerpc: introduce the ePAPR embedded hypervisor byte channel driver
The ePAPR embedded hypervisor specification provides an API for "byte
channels", which are serial-like virtual devices for sending and receiving
streams of bytes.  This driver provides Linux kernel support for byte
channels via three distinct interfaces:

1) An early-console (udbg) driver.  This provides early console output
through a byte channel.  The byte channel handle must be specified in a
Kconfig option.

2) A normal console driver.  Output is sent to the byte channel designated
for stdout in the device tree.  The console driver is for handling kernel
printk calls.

3) A tty driver, which is used to handle user-space input and output.  The
byte channel used for the console is designated as the default tty.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-23 10:32:56 -07:00
Suzuki Poulose
674bfa4855 powerpc/44x: Kexec support for PPC440X chipsets
This patch adds kexec support for PPC440 based chipsets.  This work is based
on the KEXEC patches for FSL BookE.

The FSL BookE patch and the code flow could be found at the link below:

	http://patchwork.ozlabs.org/patch/49359/

Steps:

1) Invalidate all the TLB entries except the one this code is run from
2) Create a tmp mapping for our code in the other address space and jump to it
3) Invalidate the entry we used
4) Create a 1:1 mapping for 0-2GiB in blocks of 256M
5) Jump to the new 1:1 mapping and invalidate the tmp mapping

I have tested this patches on Ebony, Sequoia boards and Virtex on QEMU.

You need kexec-tools commit e8b7939b1e or newer for ppc440x support, 
available at:

 git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git

Signed-off-by: 	Suzuki Poulose <suzuki@in.ibm.com>
Cc:	Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-08-11 13:50:37 -04:00
Benjamin Herrenschmidt
a85fe3fce8 powerpc: Really fix build without CONFIG_PCI
Brown paper bag day, previous commit wouldn't work very well with modules
enabled. Move the exports into the ifdef.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-11 01:15:44 +10:00
Benjamin Herrenschmidt
81210c2062 powerpc: Fix build without CONFIG_PCI
Commit fea80311a9
"iomap: make IOPORT/PCI mapping functions conditional"

Broke powerpc build without CONFIG_PCI as we would still define
pci_iomap(), which overlaps with the new empty inline in the headers.

Make our implementation conditional on CONFIG_PCI

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 16:01:20 +10:00
Anton Blanchard
8aa6d35929 powerpc: Move kdump default base address to half RMO size on 64bit
We are seeing boot failures on some very large boxes even with
commit b5416ca9f8 (powerpc: Move kdump default base address to
64MB on 64bit).

This patch halves the RMO so both kernels get about the same
amount of RMO memory. On large machines this region will be
at least 256MB, so each kernel will get 128MB.

We cap it at 256MB (small SLB size) since some early allocations need
to be in the bolted SLB region. We could relax this on machines with
1TB SLBs in a future patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:56 +10:00
David Ahern
b59a1bfcc2 powerpc/perf: Disable pagefaults during callchain stack read
Panic observed on an older kernel when collecting call chains for
the context-switch software event:

 [<b0180e00>]rb_erase+0x1b4/0x3e8
 [<b00430f4>]__dequeue_entity+0x50/0xe8
 [<b0043304>]set_next_entity+0x178/0x1bc
 [<b0043440>]pick_next_task_fair+0xb0/0x118
 [<b02ada80>]schedule+0x500/0x614
 [<b02afaa8>]rwsem_down_failed_common+0xf0/0x264
 [<b02afca0>]rwsem_down_read_failed+0x34/0x54
 [<b02aed4c>]down_read+0x3c/0x54
 [<b0023b58>]do_page_fault+0x114/0x5e8
 [<b001e350>]handle_page_fault+0xc/0x80
 [<b0022dec>]perf_callchain+0x224/0x31c
 [<b009ba70>]perf_prepare_sample+0x240/0x2fc
 [<b009d760>]__perf_event_overflow+0x280/0x398
 [<b009d914>]perf_swevent_overflow+0x9c/0x10c
 [<b009db54>]perf_swevent_ctx_event+0x1d0/0x230
 [<b009dc38>]do_perf_sw_event+0x84/0xe4
 [<b009dde8>]perf_sw_event_context_switch+0x150/0x1b4
 [<b009de90>]perf_event_task_sched_out+0x44/0x2d4
 [<b02ad840>]schedule+0x2c0/0x614
 [<b0047dc0>]__cond_resched+0x34/0x90
 [<b02adcc8>]_cond_resched+0x4c/0x68
 [<b00bccf8>]move_page_tables+0xb0/0x418
 [<b00d7ee0>]setup_arg_pages+0x184/0x2a0
 [<b0110914>]load_elf_binary+0x394/0x1208
 [<b00d6e28>]search_binary_handler+0xe0/0x2c4
 [<b00d834c>]do_execve+0x1bc/0x268
 [<b0015394>]sys_execve+0x84/0xc8
 [<b001df10>]ret_from_syscall+0x0/0x3c

A page fault occurred walking the callchain while creating a perf
sample for the context-switch event. To handle the page fault the
mmap_sem is needed, but it is currently held by setup_arg_pages.
(setup_arg_pages calls shift_arg_pages with the mmap_sem held.
shift_arg_pages then calls move_page_tables which has a cond_resched
at the top of its for loop - hitting that cond_resched is what caused
the context switch.)

This is an extension of Anton's proposed patch:
https://lkml.org/lkml/2011/7/24/151
adding case for 32-bit ppc.

Tested on the system that first generated the panic and then again
with latest kernel using a PPC VM. I am not able to test the 64-bit
path - I do not have H/W for it and 64-bit PPC VMs (qemu on Intel)
is horribly slow.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:56 +10:00
Anton Blanchard
fbafd72815 powerpc: Clean up some panic messages in prom_init
Add a newline to the panic messages in make_room. Also fix a
comment that suggested our chunk size is 4Mb. It's 1MB.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:55 +10:00
Anton Blanchard
966728dd88 powerpc: Fix device tree claim code
I have a box that fails in OF during boot with:

DEFAULT CATCH!, exception-handler=fff00400
at   %SRR0: 49424d2c4c6f6768   %SRR1: 800000004000b002

ie "IBM,Logh". OF got corrupted with a device tree string.

Looking at make_room and alloc_up, we claim the first chunk (1 MB)
but we never claim any more. mem_end is always set to alloc_top
which is the top of our available address space, guaranteeing we will
never call alloc_up and claim more memory.

Also alloc_up wasn't setting alloc_bottom to the bottom of the
available address space.

This doesn't help the box to boot, but we at least fail with
an obvious error. We could relocate the device tree in a future
patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:54 +10:00
Scott Wood
26ee97672e powerpc: Return the_cpu_ spec from identify_cpu
Commit af9eef3c7b caused cpu_setup to see
the_cpu_spec, rather than the source struct.  However, on 32-bit, the
return value of identify_cpu was being used for feature fixups, and
identify_cpu was returning the source struct.  So if cpu_setup patches
the feature bits, the update won't affect the fixups.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-08-05 14:47:54 +10:00
Linus Torvalds
3960ef326a Merge branch 'next/cross-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'next/cross-platform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  ARM: Consolidate the clkdev header files
  ARM: set vga memory base at run-time
  ARM: convert PCI defines to variables
  ARM: pci: make pcibios_assign_all_busses use pci_has_flag
  ARM: remove unnecessary mach/hardware.h includes
  pci: move microblaze and powerpc pci flag functions into asm-generic
  powerpc: rename ppc_pci_*_flags to pci_*_flags

Fix up conflicts in arch/microblaze/include/asm/pci-bridge.h
2011-07-26 17:12:10 -07:00
Arun Sharma
60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Linus Torvalds
184475029a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (99 commits)
  drivers/virt: add missing linux/interrupt.h to fsl_hypervisor.c
  powerpc/85xx: fix mpic configuration in CAMP mode
  powerpc: Copy back TIF flags on return from softirq stack
  powerpc/64: Make server perfmon only built on ppc64 server devices
  powerpc/pseries: Fix hvc_vio.c build due to recent changes
  powerpc: Exporting boot_cpuid_phys
  powerpc: Add CFAR to oops output
  hvc_console: Add kdb support
  powerpc/pseries: Fix hvterm_raw_get_chars to accept < 16 chars, fixing xmon
  powerpc/irq: Quieten irq mapping printks
  powerpc: Enable lockup and hung task detectors in pseries and ppc64 defeconfigs
  powerpc: Add mpt2sas driver to pseries and ppc64 defconfig
  powerpc: Disable IRQs off tracer in ppc64 defconfig
  powerpc: Sync pseries and ppc64 defconfigs
  powerpc/pseries/hvconsole: Fix dropped console output
  hvc_console: Improve tty/console put_chars handling
  powerpc/kdump: Fix timeout in crash_kexec_wait_realmode
  powerpc/mm: Fix output of total_ram.
  powerpc/cpufreq: Add cpufreq driver for Momentum Maple boards
  powerpc: Correct annotations of pmu registration functions
  ...

Fix up trivial Kconfig/Makefile conflicts in arch/powerpc, drivers, and
drivers/cpufreq
2011-07-25 22:59:39 -07:00
Linus Torvalds
45b583b10a Merge 'akpm' patch series
* Merge akpm patch series: (122 commits)
  drivers/connector/cn_proc.c: remove unused local
  Documentation/SubmitChecklist: add RCU debug config options
  reiserfs: use hweight_long()
  reiserfs: use proper little-endian bitops
  pnpacpi: register disabled resources
  drivers/rtc/rtc-tegra.c: properly initialize spinlock
  drivers/rtc/rtc-twl.c: check return value of twl_rtc_write_u8() in twl_rtc_set_time()
  drivers/rtc: add support for Qualcomm PMIC8xxx RTC
  drivers/rtc/rtc-s3c.c: support clock gating
  drivers/rtc/rtc-mpc5121.c: add support for RTC on MPC5200
  init: skip calibration delay if previously done
  misc/eeprom: add eeprom access driver for digsy_mtc board
  misc/eeprom: add driver for microwire 93xx46 EEPROMs
  checkpatch.pl: update $logFunctions
  checkpatch: make utf-8 test --strict
  checkpatch.pl: add ability to ignore various messages
  checkpatch: add a "prefer __aligned" check
  checkpatch: validate signature styles and To: and Cc: lines
  checkpatch: add __rcu as a sparse modifier
  checkpatch: suggest using min_t or max_t
  ...

Did this as a merge because of (trivial) conflicts in
 - Documentation/feature-removal-schedule.txt
 - arch/xtensa/include/asm/uaccess.h
that were just easier to fix up in the merge than in the patch series.
2011-07-25 21:00:19 -07:00
Amerigo Wang
c5f41752fd notifiers: sys: move reboot notifiers into reboot.h
It is not necessary to share the same notifier.h.

This patch already moves register_reboot_notifier() and
unregister_reboot_notifier() from kernel/notifier.c to kernel/sys.c.

[amwang@redhat.com: make allyesconfig succeed on ppc64]
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-25 20:57:14 -07:00
Linus Torvalds
d3ec4844d4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  fs: Merge split strings
  treewide: fix potentially dangerous trailing ';' in #defined values/expressions
  uwb: Fix misspelling of neighbourhood in comment
  net, netfilter: Remove redundant goto in ebt_ulog_packet
  trivial: don't touch files that are removed in the staging tree
  lib/vsprintf: replace link to Draft by final RFC number
  doc: Kconfig: `to be' -> `be'
  doc: Kconfig: Typo: square -> squared
  doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
  drivers/net: static should be at beginning of declaration
  drivers/media: static should be at beginning of declaration
  drivers/i2c: static should be at beginning of declaration
  XTENSA: static should be at beginning of declaration
  SH: static should be at beginning of declaration
  MIPS: static should be at beginning of declaration
  ARM: static should be at beginning of declaration
  rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
  Update my e-mail address
  PCIe ASPM: forcedly -> forcibly
  gma500: push through device driver tree
  ...

Fix up trivial conflicts:
 - arch/arm/mach-ep93xx/dma-m2p.c (deleted)
 - drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
 - drivers/net/r8169.c (just context changes)
2011-07-25 13:56:39 -07:00
Linus Torvalds
fcda12e7f6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  modpost: Fix modpost's license checking V3
  module: add /sys/module/<name>/uevent files
  module: change attr callbacks to take struct module_kobject
  modules: make arch's use default loader hooks
  modules: add default loader hook implementations
  param: fix return value handling in param_set_*
2011-07-24 09:54:54 -07:00
Linus Torvalds
5fabc487c9 Merge branch 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  KVM: IOMMU: Disable device assignment without interrupt remapping
  KVM: MMU: trace mmio page fault
  KVM: MMU: mmio page fault support
  KVM: MMU: reorganize struct kvm_shadow_walk_iterator
  KVM: MMU: lockless walking shadow page table
  KVM: MMU: do not need atomicly to set/clear spte
  KVM: MMU: introduce the rules to modify shadow page table
  KVM: MMU: abstract some functions to handle fault pfn
  KVM: MMU: filter out the mmio pfn from the fault pfn
  KVM: MMU: remove bypass_guest_pf
  KVM: MMU: split kvm_mmu_free_page
  KVM: MMU: count used shadow pages on prepareing path
  KVM: MMU: rename 'pt_write' to 'emulate'
  KVM: MMU: cleanup for FNAME(fetch)
  KVM: MMU: optimize to handle dirty bit
  KVM: MMU: cache mmio info on page fault path
  KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code
  KVM: MMU: do not update slot bitmap if spte is nonpresent
  KVM: MMU: fix walking shadow page table
  KVM guest: KVM Steal time registration
  ...
2011-07-24 09:07:03 -07:00
Jonas Bonn
66574cc054 modules: make arch's use default loader hooks
This patch removes all the module loader hook implementations in the
architecture specific code where the functionality is the same as that
now provided by the recently added default hooks.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-07-24 22:06:04 +09:30
Linus Torvalds
4d4abdcb1d Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (123 commits)
  perf: Remove the nmi parameter from the oprofile_perf backend
  x86, perf: Make copy_from_user_nmi() a library function
  perf: Remove perf_event_attr::type check
  x86, perf: P4 PMU - Fix typos in comments and style cleanup
  perf tools: Make test use the preset debugfs path
  perf tools: Add automated tests for events parsing
  perf tools: De-opt the parse_events function
  perf script: Fix display of IP address for non-callchain path
  perf tools: Fix endian conversion reading event attr from file header
  perf tools: Add missing 'node' alias to the hw_cache[] array
  perf probe: Support adding probes on offline kernel modules
  perf probe: Add probed module in front of function
  perf probe: Introduce debuginfo to encapsulate dwarf information
  perf-probe: Move dwarf library routines to dwarf-aux.{c, h}
  perf probe: Remove redundant dwarf functions
  perf probe: Move strtailcmp to string.c
  perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END
  tracing/kprobe: Update symbol reference when loading module
  tracing/kprobes: Support module init function probing
  kprobes: Return -ENOENT if probe point doesn't exist
  ...
2011-07-22 16:44:39 -07:00
Linus Torvalds
acb41c0f92 Merge branch 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  pci/of: Consolidate pci_bus_to_OF_node()
  pci/of: Consolidate pci_device_to_OF_node()
  x86/devicetree: Use generic PCI <-> OF matching
  microblaze/pci: Move the remains of pci_32.c to pci-common.c
  microblaze/pci: Remove powermac originated cruft
  pci/of: Match PCI devices to OF nodes dynamically
2011-07-22 14:54:02 -07:00
Benjamin Herrenschmidt
50d2a4223b powerpc: Copy back TIF flags on return from softirq stack
We already did it for hard IRQs but it looks like we forgot
to do it for softirqs. Without this, we would lose flags
such as TIF_NEED_RESCHED set using current_thread_info()
by something running of a softirq.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-22 13:38:58 +10:00
Benjamin Herrenschmidt
4b575f3e8a Merge remote-tracking branch 'jwb/next' into next 2011-07-22 13:16:41 +10:00
Andrew Gabbasov
9974eec2b8 powerpc: Exporting boot_cpuid_phys
Kernel loadable module can use hard_smp_processor_id() if building with SMP
kernel. In order to make it work for UP kernels too, boot_cpuid_phys
symbol (which is what hard_smp_processor_id() macro resolves to
in non-SMP configuration) must be exported.

Signed-off-by: Andrew Gabbasov <andrew_gabbasov@mentor.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-19 15:13:34 +10:00
Michael Neuling
5115a026ce powerpc: Add CFAR to oops output
Now we have the CFAR saved add it to the oops output.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-19 15:13:33 +10:00
Anton Blanchard
8896293422 powerpc/irq: Quieten irq mapping printks
HFI creates interrupts each time a window is setup. This results in
a lot of messages in the kernel log buffer:

    irq: irq 199007 on host null mapped to virtual irq 351

This box has over 3500 of them, causing more important kernel
messages to be overwritten. We can get at this information via
debugfs now so we may as well turn it into a pr_debug.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-19 15:13:07 +10:00
Michael Neuling
63f21a56f1 powerpc/kdump: Fix timeout in crash_kexec_wait_realmode
The existing code it pretty ugly.  How about we clean it up even more
like this?

From: Anton Blanchard <anton@samba.org>

We check for timeout expiry in the outer loop, but we also need to
check it in the inner loop or we can lock up forever waiting for a
CPU to hit real mode.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-19 15:13:05 +10:00
Dmitry Eremin-Solenikov
77c2342a57 powerpc: Correct annotations of pmu registration functions
This fixes the following warning:
WARNING: arch/powerpc/kernel/built-in.o(.text+0x29768): Section mismatch in reference from the function .register_power_pmu() to the function .cpuinit.text:.power_pmu_notifier()
The function .register_power_pmu() references
the function __cpuinit .power_pmu_notifier().
This is often because .register_power_pmu lacks a __cpuinit
annotation or the annotation of .power_pmu_notifier is wrong.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-19 15:12:40 +10:00
Mathias Krause
0e0ebdb9c2 powerpc: Remove redundant set_fs(USER_DS)
The address limit is already set in flush_old_exec() so this
set_fs(USER_DS) is redundant.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-19 15:12:40 +10:00
Dave Kleikamp
9661534d6a powerpc/47x: allow kernel to be loaded in higher physical memory
The 44x code (which is shared by 47x) assumes the available physical memory
begins at 0x00000000.  This is not necessarily the case in an AMP
environment.

Support CONFIG_RELOCATABLE for 476 in order to allow the kernel to be
loaded into a higher memory range.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2011-07-12 10:34:24 -04:00
Rob Herring
0e47ff1ce6 powerpc: rename ppc_pci_*_flags to pci_*_flags
This renames pci flags functions and enums in preparation for creating
generic version in asm-generic/pci-bridge.h. The following search and
replace is done:

s/ppc_pci_/pci_/
s/PPC_PCI_/PCI_/

Direct accesses to ppc_pci_flag variable are replaced with helper
functions.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
2011-07-12 09:28:04 -05:00
Dave Kleikamp
91b191c71e powerpc/44x: don't use tlbivax on AMP systems
Since other OS's may be running on the other cores don't use tlbivax

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2011-07-12 09:21:55 -04:00
Paul Mackerras
9e368f2915 KVM: PPC: book3s_hv: Add support for PPC970-family processors
This adds support for running KVM guests in supervisor mode on those
PPC970 processors that have a usable hypervisor mode.  Unfortunately,
Apple G5 machines have supervisor mode disabled (MSR[HV] is forced to
1), but the YDL PowerStation does have a usable hypervisor mode.

There are several differences between the PPC970 and POWER7 in how
guests are managed.  These differences are accommodated using the
CPU_FTR_ARCH_201 (PPC970) and CPU_FTR_ARCH_206 (POWER7) CPU feature
bits.  Notably, on PPC970:

* The LPCR, LPID or RMOR registers don't exist, and the functions of
  those registers are provided by bits in HID4 and one bit in HID0.

* External interrupts can be directed to the hypervisor, but unlike
  POWER7 they are masked by MSR[EE] in non-hypervisor modes and use
  SRR0/1 not HSRR0/1.

* There is no virtual RMA (VRMA) mode; the guest must use an RMO
  (real mode offset) area.

* The TLB entries are not tagged with the LPID, so it is necessary to
  flush the whole TLB on partition switch.  Furthermore, when switching
  partitions we have to ensure that no other CPU is executing the tlbie
  or tlbsync instructions in either the old or the new partition,
  otherwise undefined behaviour can occur.

* The PMU has 8 counters (PMC registers) rather than 6.

* The DSCR, PURR, SPURR, AMR, AMOR, UAMOR registers don't exist.

* The SLB has 64 entries rather than 32.

* There is no mediated external interrupt facility, so if we switch to
  a guest that has a virtual external interrupt pending but the guest
  has MSR[EE] = 0, we have to arrange to have an interrupt pending for
  it so that we can get control back once it re-enables interrupts.  We
  do that by sending ourselves an IPI with smp_send_reschedule after
  hard-disabling interrupts.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:59 +03:00
Paul Mackerras
969391c58a powerpc, KVM: Split HVMODE_206 cpu feature bit into separate HV and architecture bits
This replaces the single CPU_FTR_HVMODE_206 bit with two bits, one to
indicate that we have a usable hypervisor mode, and another to indicate
that the processor conforms to PowerISA version 2.06.  We also add
another bit to indicate that the processor conforms to ISA version 2.01
and set that for PPC970 and derivatives.

Some PPC970 chips (specifically those in Apple machines) have a
hypervisor mode in that MSR[HV] is always 1, but the hypervisor mode
is not useful in the sense that there is no way to run any code in
supervisor mode (HV=0 PR=0).  On these processors, the LPES0 and LPES1
bits in HID4 are always 0, and we use that as a way of detecting that
hypervisor mode is not useful.

Where we have a feature section in assembly code around code that
only applies on POWER7 in hypervisor mode, we use a construct like

END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)

The definition of END_FTR_SECTION_IFSET is such that the code will
be enabled (not overwritten with nops) only if all bits in the
provided mask are set.

Note that the CPU feature check in __tlbie() only needs to check the
ARCH_206 bit, not the HVMODE bit, because __tlbie() can only get called
if we are running bare-metal, i.e. in hypervisor mode.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:58 +03:00
Paul Mackerras
aa04b4cc5b KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests
This adds infrastructure which will be needed to allow book3s_hv KVM to
run on older POWER processors, including PPC970, which don't support
the Virtual Real Mode Area (VRMA) facility, but only the Real Mode
Offset (RMO) facility.  These processors require a physically
contiguous, aligned area of memory for each guest.  When the guest does
an access in real mode (MMU off), the address is compared against a
limit value, and if it is lower, the address is ORed with an offset
value (from the Real Mode Offset Register (RMOR)) and the result becomes
the real address for the access.  The size of the RMA has to be one of
a set of supported values, which usually includes 64MB, 128MB, 256MB
and some larger powers of 2.

Since we are unlikely to be able to allocate 64MB or more of physically
contiguous memory after the kernel has been running for a while, we
allocate a pool of RMAs at boot time using the bootmem allocator.  The
size and number of the RMAs can be set using the kvm_rma_size=xx and
kvm_rma_count=xx kernel command line options.

KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability
of the pool of preallocated RMAs.  The capability value is 1 if the
processor can use an RMA but doesn't require one (because it supports
the VRMA facility), or 2 if the processor requires an RMA for each guest.

This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the
pool and returns a file descriptor which can be used to map the RMA.  It
also returns the size of the RMA in the argument structure.

Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION
ioctl calls from userspace.  To cope with this, we now preallocate the
kvm->arch.ram_pginfo array when the VM is created with a size sufficient
for up to 64GB of guest memory.  Subsequently we will get rid of this
array and use memory associated with each memslot instead.

This moves most of the code that translates the user addresses into
host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level
to kvmppc_core_prepare_memory_region.  Also, instead of having to look
up the VMA for each page in order to check the page size, we now check
that the pages we get are compound pages of 16MB.  However, if we are
adding memory that is mapped to an RMA, we don't bother with calling
get_user_pages_fast and instead just offset from the base pfn for the
RMA.

Typically the RMA gets added after vcpus are created, which makes it
inconvenient to have the LPCR (logical partition control register) value
in the vcpu->arch struct, since the LPCR controls whether the processor
uses RMA or VRMA for the guest.  This moves the LPCR value into the
kvm->arch struct and arranges for the MER (mediated external request)
bit, which is the only bit that varies between vcpus, to be set in
assembly code when going into the guest if there is a pending external
interrupt request.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
Paul Mackerras
371fefd6f2 KVM: PPC: Allow book3s_hv guests to use SMT processor modes
This lifts the restriction that book3s_hv guests can only run one
hardware thread per core, and allows them to use up to 4 threads
per core on POWER7.  The host still has to run single-threaded.

This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
capability.  The return value of the ioctl querying this capability
is the number of vcpus per virtual CPU core (vcore), currently 4.

To use this, the host kernel should be booted with all threads
active, and then all the secondary threads should be offlined.
This will put the secondary threads into nap mode.  KVM will then
wake them from nap mode and use them for running guest code (while
they are still offline).  To wake the secondary threads, we send
them an IPI using a new xics_wake_cpu() function, implemented in
arch/powerpc/sysdev/xics/icp-native.c.  In other words, at this stage
we assume that the platform has a XICS interrupt controller and
we are using icp-native.c to drive it.  Since the woken thread will
need to acknowledge and clear the IPI, we also export the base
physical address of the XICS registers using kvmppc_set_xics_phys()
for use in the low-level KVM book3s code.

When a vcpu is created, it is assigned to a virtual CPU core.
The vcore number is obtained by dividing the vcpu number by the
number of threads per core in the host.  This number is exported
to userspace via the KVM_CAP_PPC_SMT capability.  If qemu wishes
to run the guest in single-threaded mode, it should make all vcpu
numbers be multiples of the number of threads per core.

We distinguish three states of a vcpu: runnable (i.e., ready to execute
the guest), blocked (that is, idle), and busy in host.  We currently
implement a policy that the vcore can run only when all its threads
are runnable or blocked.  This way, if a vcpu needs to execute elsewhere
in the kernel or in qemu, it can do so without being starved of CPU
by the other vcpus.

When a vcore starts to run, it executes in the context of one of the
vcpu threads.  The other vcpu threads all go to sleep and stay asleep
until something happens requiring the vcpu thread to return to qemu,
or to wake up to run the vcore (this can happen when another vcpu
thread goes from busy in host state to blocked).

It can happen that a vcpu goes from blocked to runnable state (e.g.
because of an interrupt), and the vcore it belongs to is already
running.  In that case it can start to run immediately as long as
the none of the vcpus in the vcore have started to exit the guest.
We send the next free thread in the vcore an IPI to get it to start
to execute the guest.  It synchronizes with the other threads via
the vcore->entry_exit_count field to make sure that it doesn't go
into the guest if the other vcpus are exiting by the time that it
is ready to actually enter the guest.

Note that there is no fixed relationship between the hardware thread
number and the vcpu number.  Hardware threads are assigned to vcpus
as they become runnable, so we will always use the lower-numbered
hardware threads in preference to higher-numbered threads if not all
the vcpus in the vcore are runnable, regardless of which vcpus are
runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
Paul Mackerras
a8606e20e4 KVM: PPC: Handle some PAPR hcalls in the kernel
This adds the infrastructure for handling PAPR hcalls in the kernel,
either early in the guest exit path while we are still in real mode,
or later once the MMU has been turned back on and we are in the full
kernel context.  The advantage of handling hcalls in real mode if
possible is that we avoid two partition switches -- and this will
become more important when we support SMT4 guests, since a partition
switch means we have to pull all of the threads in the core out of
the guest.  The disadvantage is that we can only access the kernel
linear mapping, not anything vmalloced or ioremapped, since the MMU
is off.

This also adds code to handle the following hcalls in real mode:

H_ENTER       Add an HPTE to the hashed page table
H_REMOVE      Remove an HPTE from the hashed page table
H_READ        Read HPTEs from the hashed page table
H_PROTECT     Change the protection bits in an HPTE
H_BULK_REMOVE Remove up to 4 HPTEs from the hashed page table
H_SET_DABR    Set the data address breakpoint register

Plus code to handle the following hcalls in the kernel:

H_CEDE        Idle the vcpu until an interrupt or H_PROD hcall arrives
H_PROD        Wake up a ceded vcpu
H_REGISTER_VPA Register a virtual processor area (VPA)

The code that runs in real mode has to be in the base kernel, not in
the module, if KVM is compiled as a module.  The real-mode code can
only access the kernel linear mapping, not vmalloc or ioremap space.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:55 +03:00
Paul Mackerras
de56a948b9 KVM: PPC: Add support for Book3S processors in hypervisor mode
This adds support for KVM running on 64-bit Book 3S processors,
specifically POWER7, in hypervisor mode.  Using hypervisor mode means
that the guest can use the processor's supervisor mode.  That means
that the guest can execute privileged instructions and access privileged
registers itself without trapping to the host.  This gives excellent
performance, but does mean that KVM cannot emulate a processor
architecture other than the one that the hardware implements.

This code assumes that the guest is running paravirtualized using the
PAPR (Power Architecture Platform Requirements) interface, which is the
interface that IBM's PowerVM hypervisor uses.  That means that existing
Linux distributions that run on IBM pSeries machines will also run
under KVM without modification.  In order to communicate the PAPR
hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
to include/linux/kvm.h.

Currently the choice between book3s_hv support and book3s_pr support
(i.e. the existing code, which runs the guest in user mode) has to be
made at kernel configuration time, so a given kernel binary can only
do one or the other.

This new book3s_hv code doesn't support MMIO emulation at present.
Since we are running paravirtualized guests, this isn't a serious
restriction.

With the guest running in supervisor mode, most exceptions go straight
to the guest.  We will never get data or instruction storage or segment
interrupts, alignment interrupts, decrementer interrupts, program
interrupts, single-step interrupts, etc., coming to the hypervisor from
the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
exception entry path so that we don't have to do the KVM test on entry
to those exception handlers.

We do however get hypervisor decrementer, hypervisor data storage,
hypervisor instruction storage, and hypervisor emulation assist
interrupts, so we have to handle those.

In hypervisor mode, real-mode accesses can access all of RAM, not just
a limited amount.  Therefore we put all the guest state in the vcpu.arch
and use the shadow_vcpu in the PACA only for temporary scratch space.
We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
We don't have a shared page with the guest, but we still need a
kvm_vcpu_arch_shared struct to store the values of various registers,
so we include one in the vcpu_arch struct.

The POWER7 processor has a restriction that all threads in a core have
to be in the same partition.  MMU-on kernel code counts as a partition
(partition 0), so we have to do a partition switch on every entry to and
exit from the guest.  At present we require the host and guest to run
in single-thread mode because of this hardware restriction.

This code allocates a hashed page table for the guest and initializes
it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
require that the guest memory is allocated using 16MB huge pages, in
order to simplify the low-level memory management.  This also means that
we can get away without tracking paging activity in the host for now,
since huge pages can't be paged or swapped.

This also adds a few new exports needed by the book3s_hv code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:54 +03:00
Paul Mackerras
3c42bf8a71 KVM: PPC: Split host-state fields out of kvmppc_book3s_shadow_vcpu
There are several fields in struct kvmppc_book3s_shadow_vcpu that
temporarily store bits of host state while a guest is running,
rather than anything relating to the particular guest or vcpu.
This splits them out into a new kvmppc_host_state structure and
modifies the definitions in asm-offsets.c to suit.

On 32-bit, we have a kvmppc_host_state structure inside the
kvmppc_book3s_shadow_vcpu since the assembly code needs to be able
to get to them both with one pointer.  On 64-bit they are separate
fields in the PACA.  This means that on 64-bit we don't need to
copy the kvmppc_host_state in and out on vcpu load/unload, and
in future will mean that the book3s_hv code doesn't need a
shadow_vcpu struct in the PACA at all.  That does mean that we
have to be careful not to rely on any values persisting in the
hstate field of the paca across any point where we could block
or get preempted.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:53 +03:00
Paul Mackerras
923c53caea powerpc: Set up LPCR for running guest partitions
In hypervisor mode, the LPCR controls several aspects of guest
partitions, including virtual partition memory mode, and also controls
whether the hypervisor decrementer interrupts are enabled.  This sets
up LPCR at boot time so that guest partitions will use a virtual real
memory area (VRMA) composed of 16MB large pages, and hypervisor
decrementer interrupts are disabled.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:52 +03:00
Paul Mackerras
b01c8b54a1 powerpc, KVM: Rework KVM checks in first-level interrupt handlers
Instead of branching out-of-line with the DO_KVM macro to check if we
are in a KVM guest at the time of an interrupt, this moves the KVM
check inline in the first-level interrupt handlers.  This speeds up
the non-KVM case and makes sure that none of the interrupt handlers
are missing the check.

Because the first-level interrupt handlers are now larger, some things
had to be move out of line in exceptions-64s.S.

This all necessitated some minor changes to the interrupt entry code
in KVM.  This also streamlines the book3s_32 KVM test.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:48 +03:00
Liu Yu
dd9ebf1f94 KVM: PPC: e500: Add shadow PID support
Dynamically assign host PIDs to guest PIDs, splitting each guest PID into
multiple host (shadow) PIDs based on kernel/user and MSR[IS/DS].  Use
both PID0 and PID1 so that the shadow PIDs for the right mode can be
selected, that correspond both to guest TID = zero and guest TID = guest
PID.

This allows us to significantly reduce the frequency of needing to
invalidate the entire TLB.  When the guest mode or PID changes, we just
update the host PID0/PID1.  And since the allocation of shadow PIDs is
global, multiple guests can share the TLB without conflict.

Note that KVM does not yet support the guest setting PID1 or PID2 to
a value other than zero.  This will need to be fixed for nested KVM
to work.  Until then, we enforce the requirement for guest PID1/PID2
to stay zero by failing the emulation if the guest tries to set them
to something else.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:39 +03:00
Scott Wood
4cd35f675b KVM: PPC: e500: Save/restore SPE state
This is done lazily.  The SPE save will be done only if the guest has
used SPE since the last preemption or heavyweight exit.  Restore will be
done only on demand, when enabling MSR_SPE in the shadow MSR, in response
to an SPE fault or mtmsr emulation.

For SPEFSCR, Linux already switches it on context switch (non-lazily), so
the only remaining bit is to save it between qemu and the guest.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:32 +03:00
Scott Wood
ecee273fc4 KVM: PPC: booke: use shadow_msr
Keep the guest MSR and the guest-mode true MSR separate, rather than
modifying the guest MSR on each guest entry to produce a true MSR.

Any bits which should be modified based on guest MSR must be explicitly
propagated from vcpu->arch.shared->msr to vcpu->arch.shadow_msr in
kvmppc_set_msr().

While we're modifying the guest entry code, reorder a few instructions
to bury some load latencies.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:32 +03:00
Scott Wood
c51584d52e powerpc/e500: SPE register saving: take arbitrary struct offset
Previously, these macros hardcoded THREAD_EVR0 as the base of the save
area, relative to the base register passed.  This base offset is now
passed as a separate macro parameter, allowing reuse with other SPE
save areas, such as used by KVM.

Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:31 +03:00
yu liu
685659ee70 powerpc/e500: Save SPEFCSR in flush_spe_to_thread()
giveup_spe() saves the SPE state which is protected by MSR[SPE].
However, modifying SPEFSCR does not trap when MSR[SPE]=0.
And since SPEFSCR is already saved/restored in _switch(),
not all the callers want to save SPEFSCR again.
Thus, saving SPEFSCR should not belong to giveup_spe().

This patch moves SPEFSCR saving to flush_spe_to_thread(),
and cleans up the caller that needs to save SPEFSCR accordingly.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:30 +03:00
Jiri Kosina
b7e9c223be Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply pending patches that
are based on newer code already present upstream.
2011-07-11 14:15:55 +02:00
Kumar Gala
6471fc6630 powerpc: Dont require a dma_ops struct to set dma mask
The only reason to require a dma_ops struct is to see if it has
implemented set_dma_mask.  If not we can fall back to setting the mask
directly.

This resolves an issue with how to sequence the setting of a DMA mask
for platform devices.  Before we had an issue in that we have no way of
setting the DMA mask before the various low level bus notifiers get
called that might check it (swiotlb).

So now we can do:

	pdev = platform_device_alloc("foobar", 0);
	dma_set_mask(&pdev->dev, DMA_BIT_MASK(37));
	platform_device_add(pdev);

And expect the right thing to happen with the bus notifiers get called
via platform_device_add.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:36 -05:00
Kumar Gala
314b02f503 powerpc: implement arch_setup_pdev_archdata
We have a long standing issues with platform devices not have a valid
dma_mask pointer.  This hasn't been an issue to date as no platform
device has tried to set its dma_mask value to a non-default value.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:36 -05:00
Becky Bruce
3160b09796 powerpc: Create next_tlbcam_idx percpu variable for FSL_BOOKE
This is used to round-robin TLBCAM entries.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:34 -05:00
Avi Kivity
4dc0da8696 perf: Add context field to perf_event
The perf_event overflow handler does not receive any caller-derived
argument, so many callers need to resort to looking up the perf_event
in their local data structure.  This is ugly and doesn't scale if a
single callback services many perf_events.

Fix by adding a context parameter to perf_event_create_kernel_counter()
(and derived hardware breakpoints APIs) and storing it in the perf_event.
The field can be accessed from the callback as event->overflow_handler_context.
All callers are updated.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:38 +02:00
Peter Zijlstra
89d6c0b5bd perf, arch: Add generic NODE cache events
Add a NODE level to the generic cache events which is used to measure
local vs remote memory accesses. Like all other cache events, an
ACCESS is HIT+MISS, if there is no way to distinguish between reads
and writes do reads only etc..

The below needs filling out for !x86 (which I filled out with
unsupported events).

I'm fairly sure ARM can leave it like that since it doesn't strike me as
an architecture that even has NUMA support. SH might have something since
it does appear to have some NUMA bits.

Sparc64, PowerPC and MIPS certainly want a good look there since they
clearly are NUMA capable.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Miller <davem@davemloft.net>
Cc: Anton Blanchard <anton@samba.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:38 +02:00
Peter Zijlstra
a8b0ca17b8 perf: Remove the nmi parameter from the swevent and overflow interface
The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:35 +02:00
Peter Zijlstra
4f8b50bbbe irq_work, ppc: Fix up arch hooks
Commit e360adbe29 ("irq_work: Add generic hardirq context
callbacks") fouled up the ppc bit, not properly naming the
arch specific function that raises the 'self-IPI'.

Cc: Huang Ying <ying.huang@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: stable@kernel.org # 37+
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-eg0aqien8p1aqvzu9dft6dtv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:02:22 +02:00
Michael Ellerman
ac5f89c7d8 powerpc: Add jump label support
This patch adds support for the new "jump label" feature.

Unlike x86 and sparc we just merrily patch the code with no locks etc,
as far as I know this is safe, but I'm not really sure what the x86/sparc
code is protecting against so maybe it's not.

I also don't see any reason for us to implement the poke_early() routine,
even though sparc does.

[BenH: Updated the patch to upstream generic changes]

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-07-01 13:48:55 +10:00
Dave Carroll
a9c0f41b3a powerpc: Add printk companion for ppc_md.progress
This patch adds a printk companion to replace the udbg progress function
when initmem is freed.

Suggested-by: Milton Miller <miltonm@bga.com>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Dave Carroll <dcarroll@astekcorp.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-30 15:28:05 +10:00
Benjamin Herrenschmidt
6da49a2925 Merge remote branch 'origin/master' into next 2011-06-30 15:23:59 +10:00
Benjamin Herrenschmidt
4d2bb3f500 powerpc/pseries: Re-implement HVSI as part of hvc_vio
On pseries machines, consoles are provided by the hypervisor using
a low level get_chars/put_chars type interface. However, this is
really just a transport to the service processor which implements
them either as "raw" console (networked consoles, HMC, ...) or as
"hvsi" serial ports.

The later is a simple packet protocol on top of the raw character
interface that is supposed to convey additional "serial port" style
semantics. In practice however, all it does is provide a way to
read the CD line and set/clear our DTR line, that's it.

We currently implement the "raw" protocol as an hvc console backend
(/dev/hvcN) and the "hvsi" protocol using a separate tty driver
(/dev/hvsi0).

However this is quite impractical. The arbitrary difference between
the two type of devices has been a major source of user (and distro)
confusion. Additionally, there's an additional mini -hvsi implementation
in the pseries platform code for our low level debug console and early
boot kernel messages, which means code duplication, though that low
level variant is impractical as it's incapable of doing the initial
protocol negociation to establish the link to the FSP.

This essentially replaces the dedicated hvsi driver and the platform
udbg code completely by extending the existing hvc_vio backend used
in "raw" mode so that:

 - It now supports HVSI as well
 - We add support for hvc backend providing tiocm{get,set}
 - It also provides a udbg interface for early debug and boot console

This is overall less code, though this will only be obvious once we
remove the old "hvsi" driver, which is still available for now. When
the old driver is enabled, the new code still kicks in for the low
level udbg console, replacing the old mini implementation in the platform
code, it just doesn't provide the higher level "hvc" interface.

In addition to producing generally simler code, this has several benefits
over our current situation:

 - The user/distro only has to deal with /dev/hvcN for the hypervisor
console, avoiding all sort of confusion that has plagued us in the past

 - The tty, kernel and low level debug console all use the same code
base which supports the full protocol establishment process, thus the
console is now available much earlier than it used to be with the
old HVSI driver. The kernel console works much earlier and udbg is
available much earlier too. Hackers can enable a hard coded very-early
debug console as well that works with HVSI (previously that was only
supported for the "raw" mode).

I've tried to keep the same semantics as hvsi relative to how I react
to things like CD changes, with some subtle differences though:

 - I clear DTR on close if HUPCL is set

 - Current hvsi triggers a hangup if it detects a up->down transition
   on CD (you can still open a console with CD down). My new implementation
   triggers a hangup if the link to the FSP is severed, and severs it upon
   detecting a up->down transition on CD.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:35 +10:00
Benjamin Herrenschmidt
dd2e356a3d powerpc/udbg: Register udbg console generically
When CONFIG_PPC_EARLY_DEBUG is set, call register_early_udbg_console()
early from generic code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:32 +10:00
Akinobu Mita
de2780a3d8 powerpc/pseries: Improve error code on reconfiguration notifier failure
Reconfiguration notifier call for device node may fail by several reasons,
but it always assumes kmalloc failures.

This enables reconfiguration notifier call chain to get the actual error
code rather than -ENOMEM by converting all reconfiguration notifier calls
to return encapsulate error code with notifier_from_errno().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:24 +10:00
Dmitry Eremin-Solenikov
e48f7eb27f powerpc/maple: Enable scom access functions on Maple
Enable functions used to access SCOM if PPC_MAPLE is defined: they are
used by cpufreq driver to control hardware.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 17:48:20 +10:00
Scott Wood
3d97a619ac powerpc/book3e-64: Reraise doorbell when masked by soft-irq-disable
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 16:40:59 +10:00
Christian Dietrich
76462232c2 arch/powerpc: use printk_ratelimited instead of printk_ratelimit
Since printk_ratelimit() shouldn't be used anymore (see comment in
include/linux/printk.h), replace it with printk_ratelimited.

Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 15:31:01 +10:00
Christian Dietrich
9a8f99fab0 powerpc/rtas-rtc: remove sideeffects of printk_ratelimit
Don't use printk_ratelimit() as an additional condition for returning
on an error. Because when the ratelimit is reached, printk_ratelimit
will return 0 and e.g. in rtas_get_boot_time won't check for an error
condition.

Signed-off-by: Christian Dietrich <christian.dietrich@informatik.uni-erlangen.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-29 15:30:43 +10:00
Scott Wood
ebf714ff37 powerpc/e500mc: Add support for the wait instruction in e500_idle
e500mc cannot doze or nap due to an erratum (as well as having a
different mechanism than previous e500), but it has a "wait" instruction
that is similar to doze.

On 64-bit, due to the soft-irq-disable mechanism, the existing
book3e_idle should be used instead.

Signed-off-by: Vakul Garg <vakul@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-27 08:36:15 -05:00
Stuart Yoder
6ec36b5848 powerpc: make irq_choose_cpu() available to all PIC drivers
Move irq_choose_cpu() into arch/powerpc/kernel/irq.c so that it can be used
by other PIC drivers.  The function is not MPIC-specific.

Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-22 21:44:59 -05:00
Kumar Gala
c065488f1a powerpc/pci: Move FSL fixup from 32-bit to common
We need the FSL specific header fixup code on both 32-bit and 64-bit
platforms so just move the code into pci-common.c.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-22 21:44:57 -05:00
Ashish Kalra
1325a684b5 powerpc/85xx: Save scratch registers to thread info instead of using SPRGs.
We expect this is actually faster, and we end up needing more space than we
can get from the SPRGs in some instances.  This is also useful when running
as a guest OS - SPRGs4-7 do not have guest versions.

8 slots are allocated in thread_info for this even though we only actually
use 4 of them - this allows space for future code to have more scratch
space (and we know we'll need it for things like hugetlb).

Signed-off-by: Ashish Kalra <Ashish.Kalra@freescale.com>
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-22 21:44:55 -05:00
Scott Wood
82a9a4809f powerpc/e500: fix breakage with fsl_rio_mcheck_exception
The wrong MCSR bit was being used on e500mc.  MCSR_BUS_RBERR only exists
on e500v1/v2.  Use MCSR_LD on e500mc, and remove all MCSR checking
in fsl_rio_mcheck_exception as we now no longer call that function
if the appropriate bit in MCSR is not set.

If RIO support was enabled at compile-time, but was never probed, just
return from fsl_rio_mcheck_exception rather than dereference a NULL
pointer.

TODO: There is still a remaining, though comparitively minor, issue in
that this recovery mechanism will falsely engage if there's an unrelated
MCSR_LD event at the same time as a RIO error.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-22 06:15:16 -05:00
Paul Mackerras
9ca980dce5 powerpc: Avoid extra indirect function call in sending IPIs
On many platforms (including pSeries), smp_ops->message_pass is always
smp_muxed_ipi_message_pass.  This changes arch/powerpc/kernel/smp.c so
that if smp_ops->message_pass is NULL, it calls smp_muxed_ipi_message_pass
directly.

This means that a platform doesn't need to set both .message_pass and
.cause_ipi, only one of them.  It is a slight performance improvement
in that it gets rid of an indirect function call at the expense of a
predictable conditional branch.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-20 11:21:32 +10:00
Matt Evans
7ac87abb81 powerpc: Fix early boot accounting of CPUs
smp_release_cpus() waits for all cpus (including the bootcpu) due to an
off-by-one count on boot_cpu_count (which is all CPUs).  This patch replaces
that with spinning_secondaries (which is all secondary CPUs).

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-17 16:19:51 +10:00
Joe Perches
28f65c11f2 treewide: Convert uses of struct resource to resource_size(ptr)
Several fixes as well where the +1 was missing.

Done via coccinelle scripts like:

@@
struct resource *ptr;
@@

- ptr->end - ptr->start + 1
+ resource_size(ptr)

and some grep and typing.

Mostly uncompiled, no cross-compilers.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-10 14:55:36 +02:00
Benjamin Herrenschmidt
307cfe7153 powerpc: Force page alignment for initrd reserved memory
When using 64K pages with a separate cpio rootfs, U-Boot will align
the rootfs on a 4K page boundary. When the memory is reserved, and
subsequent early memblock_alloc is called, it will allocate memory
between the 64K page alignment and reserved memory. When the reserved
memory is subsequently freed, it is done so by pages, causing the
early memblock_alloc requests to be re-used, which in my case, caused
the device-tree to be clobbered.

This patch forces the reserved memory for initrd to be kernel page
aligned, and will move the device tree if it overlaps with the range
extension of initrd. This patch will also consolidate the identical
function free_initrd_mem() from mm/init_32.c, init_64.c to mm/mem.c,
and adds the same range extension when freeing initrd. free_initrd_mem()
is also moved to the __init section.

Many thanks to Milton Miller for his input on this patch.

[BenH: Fixed build without CONFIG_BLK_DEV_INITRD]

Signed-off-by: Dave Carroll <dcarroll@astekcorp.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-06-09 16:52:38 +10:00
Benjamin Herrenschmidt
d660474e84 Merge remote branch 'kumar/merge' into merge 2011-06-09 14:46:32 +10:00
Benjamin Herrenschmidt
98d9f30c82 pci/of: Match PCI devices to OF nodes dynamically
powerpc has two different ways of matching PCI devices to their
corresponding OF node (if any) for historical reasons. The ppc64 one
does a scan looking for matching bus/dev/fn, while the ppc32 one does a
scan looking only for matching dev/fn on each level in order to be
agnostic to busses being renumbered (which Linux does on some
platforms).

This removes both and instead moves the matching code to the PCI core
itself. It's the most logical place to do it: when a pci_dev is created,
we know the parent and thus can do a single level scan for the matching
device_node (if any).

The benefit is that all archs now get the matching for free. There's one
hook the arch might want to provide to match a PHB bus to its device
node. A default weak implementation is provided that looks for the
parent device device node, but it's not entirely reliable on powerpc for
various reasons so powerpc provides its own.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:17 +10:00
Kumar Gala
fb9be2349f powerpc/book3e: Fix CPU feature handling on e5500 in 32-bit mode
We are missing FPU feature bit that user space may require.  In the
64-bit mode this gets set since we pull it in via COMMON_USER_PPC64.  We
just explicitly set it so user space will be happy again.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-06-02 15:29:09 -05:00
Linus Torvalds
2a56d22202 Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (45 commits)
  ARM: 6945/1: Add unwinding support for division functions
  ARM: kill pmd_off()
  ARM: 6944/1: mm: allow ASID 0 to be allocated to tasks
  ARM: 6943/1: mm: use TTBR1 instead of reserved context ID
  ARM: 6942/1: mm: make TTBR1 always point to swapper_pg_dir on ARMv6/7
  ARM: 6941/1: cache: ensure MVA is cacheline aligned in flush_kern_dcache_area
  ARM: add sendmmsg syscall
  ARM: 6863/1: allow hotplug on msm
  ARM: 6832/1: mmci: support for ST-Ericsson db8500v2
  ARM: 6830/1: mach-ux500: force PrimeCell revisions
  ARM: 6829/1: amba: make hardcoded periphid override hardware
  ARM: 6828/1: mach-ux500: delete SSP PrimeCell ID
  ARM: 6827/1: mach-netx: delete hardcoded periphid
  ARM: 6940/1: fiq: Briefly document driver responsibilities for suspend/resume
  ARM: 6938/1: fiq: Refactor {get,set}_fiq_regs() for Thumb-2
  ARM: 6914/1: sparsemem: fix highmem detection when using SPARSEMEM
  ARM: 6913/1: sparsemem: allow pfn_valid to be overridden when using SPARSEMEM
  at91: drop at572d940hf support
  at91rm9200: introduce at91rm9200_set_type to specficy cpu package
  at91: drop boot_params and PLAT_PHYS_OFFSET
  ...
2011-05-27 19:51:32 -07:00
Russell King
239df0fd5e Merge branches 'devel', 'devel-stable' and 'fixes' into for-linus 2011-05-27 22:59:57 +01:00
Linus Torvalds
f23a5e1405 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM: Fix PM QOS's user mode interface to work with ASCII input
  PM / Hibernate: Update kerneldoc comments in hibernate.c
  PM / Hibernate: Remove arch_prepare_suspend()
  PM / Hibernate: Update some comments in core hibernate code
2011-05-27 14:27:34 -07:00
Benjamin Herrenschmidt
9693ebd481 Merge remote branch 'kumar/merge' into merge 2011-05-27 09:58:22 +10:00
Milton Miller
4dd6029001 powerpc: Fix irq_free_virt by adjusting bounds before loop
Instead of looping over each irq and checking against the irq array
bounds, adjust the bounds before looping.

The old code will not free any irq if the irq + count is above
irq_virq_count because the test in the loop is testing irq + count
instead of irq + i.

This code checks the limits to avoid unsigned integer overflows.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:59 +10:00
Milton Miller
9b78825158 powerpc/irq: Protect irq_radix_revmap_lookup against irq_free_virt
The radix-tree code uses call_rcu when freeing internal elements.
We must protect against the elements being freed while we traverse
the tree, even if the returned pointer will still be valid.

While preparing a patch to expand the context in which
irq_radix_revmap_lookup will be called, I realized that the
radix tree was not locked.

When asked

    For a normal call_rcu usage, is it allowed to read the structure in
    irq_enter / irq_exit, without additional rcu_read_lock?  Could an
    element freed with call_rcu advance with the cpu still between
    irq_enter/irq_exit (and irq_disabled())?

Paul McKenney replied:

    Absolutely illegal to do so. OK for call_rcu_sched(), but a
    flaming bug for call_rcu().

    And thank you very much for finding this!!!

Further analysis:

In the current CONFIG_TREE_RCU implementation. CONFIG_TREE_PREEMPT_RCU
(and CONFIG_TINY_PREEMPT_RCU) uses explicit counters.

These counters are reflected from per-CPU to global in the
scheduling-clock-interrupt handler, so disabling irq does prevent the
grace period from completing. But there are real-time implementations
(such as the one use by the Concurrent guys) where disabling irq
does -not- prevent the grace period from completing.

While an alternative fix would be to switch radix-tree to rcu_sched, I
don't want to audit the other users of radix trees (nor put alternative
freeing in the library).  The normal overhead for rcu_read_lock and
unlock are a local counter increment and decrement.

This does not show up in the rcu lockdep because in 2.6.34 commit
2676a58c98 (radix-tree: Disable RCU lockdep checking in radix tree)
deemed it too hard to pass the condition of the protecting lock
to the library.

Signed-off-by: Milton Miller <miltonm@bga.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:59 +10:00
Milton Miller
2e455257d1 powerpc/irq: Check desc in handle_one_irq and expand generic_handle_irq
Look up the descriptor and check that it is found in handle_one_irq
before checking if we are on the irq stack, and call the handler
directly using the descriptor if we are on the stack.

We need check irq_to_desc finds the descriptor to avoid a NULL
pointer dereference.  It could have failed because the number from
ppc_md.get_irq was above NR_IRQS, or various exceptional conditions
with sparse irqs (eg race conditions while freeing an irq if its was
not shutdown in the controller).

fe12bc2c99 (genirq: Uninline and sanity check generic_handle_irq())
moved generic_handle_irq out of line to allow its use by interrupt
controllers in modules.  However, handle_one_irq is core arch code.
It already knows the details of struct irq_desc and handling irqs in
the nested irq case.  This will avoid the extra stack frame to return
the value we don't check.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:59 +10:00
Milton Miller
3d1b5e206a powerpc/irq: Always free duplicate IRQ_LEGACY hosts
Since kmem caches are allocated before init_IRQ as noted in 3af259d155
(powerpc: Radix trees are available before init_IRQ), we now call
kmalloc in all cases and can can always call kfree if we are asked
to allocate a duplicate or conflicting IRQ_HOST_MAP_LEGACY host.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:59 +10:00
Milton Miller
8142f032a9 powerpc/irq: Remove stale and misleading comment
The comment claims we will call host->ops->map() to update the flags if
we find a previously established mapping, but we never did.  We used
to call remap, but that call was removed in da05198002 (powerpc: Remove
irq_host_ops->remap hook).

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:58 +10:00
Milton Miller
7ef71d753e powerpc/cell: Use common smp ipi actions
The cell iic interrupt controller has enough software caused interrupts
to use a unique interrupt for each of the 4 messages powerpc uses.
This means each interrupt gets its own irq action/data combination.

Use the seperate, optimized, arch common ipi action functions
registered via the helper smp_request_message_ipi instead passing the
message as action data to a single action that then demultipexes to
the required acton via a switch statement.

smp_request_message_ipi will register the action as IRQF_PER_CPU
and IRQF_DISABLED, and WARN if the allocation fails for some reason,
so no need to print on that failure.  It will return positive if
the message will not be used by the kernel, in which case we can
free the virq.

In addition to elimiating inefficient code, this also corrects the
error that a kernel built with kexec but without a debugger would
not register the ipi for kdump to notify the other cpus of a crash.

This also restores the debugger action to be static to kernel/smp.c.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:58 +10:00
Ian Munsie
02424d8966 powerpc/ftrace: Implement raw syscall tracepoints on PowerPC
This patch implements the raw syscall tracepoints on PowerPC and exports
them for ftrace syscalls to use.

To minimise reworking existing code, I slightly re-ordered the thread
info flags such that the new TIF_SYSCALL_TRACEPOINT bit would still fit
within the 16 bits of the andi. instruction's UI field. The instructions
in question are in /arch/powerpc/kernel/entry_{32,64}.S to and the
_TIF_SYSCALL_T_OR_A with the thread flags to see if system call tracing
is enabled.

In the case of 64bit PowerPC, arch_syscall_addr and
arch_syscall_match_sym_name are overridden to allow ftrace syscalls to
work given the unusual system call table structure and symbol names that
start with a period.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-26 13:38:57 +10:00
Russell King
ae1d3b974e Merge branch 'for-rmk' of git://github.com/at91linux/linux-2.6-at91 into devel-stable 2011-05-26 00:41:21 +01:00
Peter Zijlstra
d6bf29b44d powerpc: mmu_gather rework
Fix up powerpc to the new mmu_gather stuff.

PPC has an extra batching queue to RCU free the actual pagetable
allocations, use the ARCH extentions for that for now.

For the ppc64_tlb_batch, which tracks the vaddrs to unhash from the
hardware hash-table, keep using per-cpu arrays but flush on context switch
and use a TLF bit to track the lazy_mmu state.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-25 08:39:13 -07:00
Russell King
03eb14199e Merge branch 'devicetree/arm-next' of git://git.secretlab.ca/git/linux-2.6 into devel-stable 2011-05-25 00:08:17 +01:00
Rafael J. Wysocki
354258011e PM / Hibernate: Remove arch_prepare_suspend()
All architectures supporting hibernation define
arch_prepare_suspend() as an empty function, so remove it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-05-24 23:35:55 +02:00
Linus Torvalds
5129df03d0 Merge branch 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: Unify input section names
  percpu: Avoid extra NOP in percpu_cmpxchg16b_double
  percpu: Cast away printk format warning
  percpu: Always align percpu output section to PAGE_SIZE

Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun
2011-05-24 11:53:42 -07:00
Linus Torvalds
57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
Linus Torvalds
f4b10bc60a Merge branch 'kvm-updates/2.6.40' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.40' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (131 commits)
  KVM: MMU: Use ptep_user for cmpxchg_gpte()
  KVM: Fix kvm mmu_notifier initialization order
  KVM: Add documentation for KVM_CAP_NR_VCPUS
  KVM: make guest mode entry to be rcu quiescent state
  KVM: x86 emulator: Make jmp far emulation into a separate function
  KVM: x86 emulator: Rename emulate_grpX() to em_grpX()
  KVM: x86 emulator: Remove unused arg from emulate_pop()
  KVM: x86 emulator: Remove unused arg from writeback()
  KVM: x86 emulator: Remove unused arg from read_descriptor()
  KVM: x86 emulator: Remove unused arg from seg_override()
  KVM: Validate userspace_addr of memslot when registered
  KVM: MMU: Clean up gpte reading with copy_from_user()
  KVM: PPC: booke: add sregs support
  KVM: PPC: booke: save/restore VRSAVE (a.k.a. USPRG0)
  KVM: PPC: use ticks, not usecs, for exit timing
  KVM: PPC: fix exit accounting for SPRs, tlbwe, tlbsx
  KVM: PPC: e500: emulate SVR
  KVM: VMX: Cache vmcs segment fields
  KVM: x86 emulator: consolidate segment accessors
  KVM: VMX: Avoid reading %rip unnecessarily when handling exceptions
  ...
2011-05-23 08:42:08 -07:00
Scott Wood
eab176722f KVM: PPC: booke: save/restore VRSAVE (a.k.a. USPRG0)
Linux doesn't use USPRG0 (now renamed VRSAVE in the architecture, even
when Altivec isn't involved), but a guest might.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-05-22 08:47:50 -04:00
Josh Boyer
6de06f313a powerpc: Fix 32-bit SMP build
Commit 69e3cea8d5 ("powerpc/smp: Make start_secondary_resume
available to all CPU variants") introduced start_secondary_resume to
misc_32.S, however it uses a 64-bit instruction which is not valid on
32-bit platforms.  Use 'stw' instead.

Reported-by: Richard Cochran <richardcochran@gmail.com>
Tested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-20 16:23:19 -07:00
Shaohui Xie
cce1f106c6 powerpc/fsl_rio: move machine_check handler
Add support for machine_check support into machine_check_e500 and
machine_check_e500mc.

Signed-off-by: Shaohui Xie <b21989@freescale.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-20 08:46:57 -05:00
Benjamin Herrenschmidt
208b3a4c19 powerpc: Fix hard CPU IDs detection
commit 9d07bc841c
"powerpc: Properly handshake CPUs going out of boot spin loop"

Would cause a miscalculation of the hard CPU ID. It removes breaking
out of the loop when finding a match with a processor, thus the "i"
used as an index in the intserv array is always incorrect

This broke interrupt on my PowerMac laptop.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-20 17:50:18 +10:00
Benjamin Herrenschmidt
880102e785 Merge remote branch 'origin/master' into merge
Manual merge of arch/powerpc/kernel/smp.c and add missing scheduler_ipi()
call to arch/powerpc/platforms/cell/interrupt.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-20 15:36:52 +10:00
Benjamin Herrenschmidt
3d07f0e83d Merge remote branch 'kumar/next' into next 2011-05-20 13:43:47 +10:00
Linus Torvalds
39ab05c8e0 Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (44 commits)
  debugfs: Silence DEBUG_STRICT_USER_COPY_CHECKS=y warning
  sysfs: remove "last sysfs file:" line from the oops messages
  drivers/base/memory.c: fix warning due to "memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION"
  memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION
  SYSFS: Fix erroneous comments for sysfs_update_group().
  driver core: remove the driver-model structures from the documentation
  driver core: Add the device driver-model structures to kerneldoc
  Translated Documentation/email-clients.txt
  RAW driver: Remove call to kobject_put().
  reboot: disable usermodehelper to prevent fs access
  efivars: prevent oops on unload when efi is not enabled
  Allow setting of number of raw devices as a module parameter
  Introduce CONFIG_GOOGLE_FIRMWARE
  driver: Google Memory Console
  driver: Google EFI SMI
  x86: Better comments for get_bios_ebda()
  x86: get_bios_ebda_length()
  misc: fix ti-st build issues
  params.c: Use new strtobool function to process boolean inputs
  debugfs: move to new strtobool
  ...

Fix up trivial conflicts in fs/debugfs/file.c due to the same patch
being applied twice, and an unrelated cleanup nearby.
2011-05-19 18:24:11 -07:00
Sebastian Siewior
f38aa70877 powerpc: Remove last piece of GEMINI
It seems that Adrian is getting old. He removed almost everything of
GEMINI in commit c53653130 ("[POWERPC] Remove the broken Gemini
support") except this piece.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 17:32:29 +10:00
Gabriel Paubert
2c78027a62 powerpc: Fix for Pegasos keyboard and mouse
[See http://lists.ozlabs.org/pipermail/linuxppc-dev/2010-October/086424.html
and followups. Part of the commit message is directly copied from that.]

Commit 540c6c392f tries to find i8042 IRQs in
the device-tree but doesn't fall back to the old hardcoded 1 and 12 in all
failure cases.

Specifically, the case where the device-tree contains nothing matching
pnpPNP,303 or pnpPNP,f03 doesn't seem to be handled well. It sort of falls
through to the old code, but leaves the IRQs set to 0.

Signed-off-by: Gabriel Paubert <paubert@iram.es>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 17:32:27 +10:00
Benjamin Herrenschmidt
03bf469add powerpc: Make early memory scan more resilient to out of order nodes
We keep track of the size of the lowest block of memory and call
setup_initial_memory_limit() only after we've parsed them all

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Milton Miller <miltonm@bga.com>
2011-05-19 17:32:25 +10:00
Benjamin Herrenschmidt
4c8440666b Merge branch 'merge' into next 2011-05-19 17:00:06 +10:00
Milton Miller
41fb5e6260 powerpc: Make IRQ_NOREQUEST last to clear, first to set
When creating an irq, don't allow a concurent driver request until
we have caled map, which will likley call set_chip_and_handler to
change the irq_chip and its operations.

Similarly, when tearing down an IRQ, make sure no new uses come
along while we change the irq back to the nop chip and then reset
the descriptor to freed status.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 16:54:46 +10:00
Scott Wood
3a6e9bd7f6 powerpc/e5500: set non-base IVORs
Without this, we attempt to use doorbells for IPIs, and end up
branching to some bad address.  Plus, even for the exceptions
we don't implement, it's good to handle it and get a message out.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-19 00:36:43 -05:00
Kumar Gala
d36b4c4f3c powerpc/fsl-booke64: Add support for Debug Level exception handler
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-05-19 00:36:42 -05:00
Kumar Gala
134c428e5a Merge remote branch 'benh/merge' into benh-next 2011-05-19 00:36:21 -05:00
Milton Miller
1e8c23013e powerpc: Remove virq_to_host
The only references to the irq_map[].host field are internal to
arch/powerpc/kernel/irq.c

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:32:01 +10:00
Milton Miller
3ee62d365b powerpc: Add virq_is_host to reduce virq_to_host usage
Some irq_host implementations are using virq_to_host to check if
they are the irq_host for a virtual irq.  To allow us to make space
versus time tradeoffs, replace this usage with an assertive
virq_is_host that confirms or denies the irq is associated with the
given irq_host.

Signed-off-by: Milton Miller <miltonm@bga.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:59 +10:00
Milton Miller
da05198002 powerpc: Remove irq_host_ops->remap hook
It was called from irq_create_mapping if that was called for a host
and hwirq that was previously mapped, "to update the flags".  But the
only implementation was in beat_interrupt and all it did was repeat a
hypervisor call without error checking that was performed with error
checking at the beginning of the map hook.  In addition, the comment on
the beat remap hook says it will only called once for a given mapping,
which would apply to map not remap.

All flags should be known by the time the match hook is called, before
we call the map hook.  Removing this mostly unused hook will simpify
the requirements of irq_domain concept.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:53 +10:00
Milton Miller
2d441681a4 powerpc: Return early if irq_host lookup type is wrong
If for some reason the code incrorectly calls the wrong function to
manage the revmap, not only should we warn, we should take action.
However, in the paths we expect to be taken every delivered interrupt
change to WARN_ON_ONCE.  Use the if (WARN_ON(x)) format to get the
unlikely for free.

Signed-off-by: Milton Miller <miltonm@bga.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:37 +10:00
Milton Miller
3af259d155 powerpc: Radix trees are available before init_IRQ
Since the generic irq code uses a radix tree for sparse interrupts,
the initcall ordering has been changed to initialize radix trees before
irqs.   We no longer need to defer creating revmap radix trees to the
arch_initcall irq_late_init.

Also, the kmem caches are allocated so we don't need to use
zalloc_maybe_bootmem.

Signed-off-by: Milton Miller <miltonm@bga.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:35 +10:00
Milton Miller
714542721b powerpc: Use bytes instead of bitops in smp ipi multiplexing
Since there are only 4 messages, we can replace the atomic bit set
(which uses atomic load reserve and store conditional sequence) with
a byte stores to seperate bytes.  We still have to perform a load
reserve and store conditional sequence to avoid loosing messages on
reception but we can do that with a single call to xchg.

The do {} while and __BIG_ENDIAN specific mask testing was chosen by
looking at the generated asm code.  On gcc-4.4, the bit masking becomes
a simple bit mask and test of the register returned from xchg without
storing and loading the value to the stack like attempts with a union
of bytes and an int (or worse, loading single bit constants from the
constant pool into non-voliatle registers that had to be preseved on
the stack).  The do {} while avoids an unconditional branch to the
end of the loop to test the entry / repeat condition of a while loop
and instead optimises for the expected single iteration of the loop.

We have a full mb() at the beginning to cover ordering between send,
ipi, and receive so we can use xchg_local and forgo the further
acquire and release barriers of xchg.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:31 +10:00
Milton Miller
1ece355b68 powerpc: Add kconfig for muxed smp ipi support
Compile the new smp ipi mux and demux code only if a platform
will make use of it.  The new config is selected as required.

The new cause_ipi smp op is only available conditionally to point out
configs where the select is required; this makes setting the op an
immediate fail instead of a deferred unresolved symbol at link.

This also creates a new config for power surge powermac upgrade support
that can be disabled in expert mode but is default on.

I also removed the depends / default y on CONFIG_XICS since it is selected
by PSERIES.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:05 +10:00
Milton Miller
23d72bfd8f powerpc: Consolidate ipi message mux and demux
Consolidate the mux and demux of ipi messages into smp.c and call
a new smp_ops callback to actually trigger the ipi.

The powerpc architecture code is optimised for having 4 distinct
ipi triggers, which are mapped to 4 distinct messages (ipi many, ipi
single, scheduler ipi, and enter debugger).  However, several interrupt
controllers only provide a single software triggered interrupt that
can be delivered to each cpu.  To resolve this limitation, each smp_ops
implementation created a per-cpu variable that is manipulated with atomic
bitops.  Since these lines will be contended they are optimialy marked as
shared_aligned and take a full cache line for each cpu.  Distro kernels
may have 2 or 3 of these in their config, each taking per-cpu space
even though at most one will be in use.

This consolidation removes smp_message_recv and replaces the single call
actions cases with direct calls from the common message recognition loop.
The complicated debugger ipi case with its muxed crash handling code is
moved to debug_ipi_action which is now called from the demux code (instead
of the multi-message action calling smp_message_recv).

I put a call to reschedule_action to increase the likelyhood of correctly
merging the anticipated scheduler_ipi() hook coming from the scheduler
tree; that single required call can be inlined later.

The actual message decode is a copy of the old pseries xics code with its
memory barriers and cache line spacing, augmented with a per-cpu unsigned
long based on the book-e doorbell code.  The optional data is set via a
callback from the implementation and is passed to the new cause-ipi hook
along with the logical cpu number.  While currently only the doorbell
implemntation uses this data it should be almost zero cost to retrieve and
pass it -- it adds a single register load for the argument from the same
cache line to which we just completed a store and the register is dead
on return from the call.  I extended the data element from unsigned int
to unsigned long in case some other code wanted to associate a pointer.

The doorbell check_self is replaced by a call to smp_muxed_ipi_resend,
conditioned on the CPU_DBELL feature.  The ifdef guard could be relaxed
to CONFIG_SMP but I left it with BOOKE for now.

Also, the doorbell interrupt vector for book-e was not calling irq_enter
and irq_exit, which throws off cpu accounting and causes code to not
realize it is running in interrupt context.  Add the missing calls.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:31:03 +10:00
Milton Miller
a56555e573 powerpc: Remove alloc_maybe_bootmem for zalloc version
Replace all remaining callers of alloc_maybe_bootmem with
zalloc_maybe_bootmem.   The callsite in pci_dn is followed with a
memset to clear the memory, and not zeroing at the other callsites
in the celleb fake pci code could lead to following uninitialized
memory as pointers or even freeing said pointers on error paths.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 15:30:57 +10:00
Milton Miller
f1072939b6 powerpc: Remove checks for MSG_ALL and MSG_ALL_BUT_SELF
Now that smp_ops->smp_message_pass is always called with an (online) cpu
number for the target remove the checks for MSG_ALL and MSG_ALL_BUT_SELF.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:46 +10:00
Milton Miller
e047637132 powerpc: Remove call sites of MSG_ALL_BUT_SELF
The only user of MSG_ALL_BUT_SELF in the whole kernel tree is powerpc,
and it only uses it to start the debugger. Both debuggers always call
smp_send_debugger_break with MSG_ALL_BUT_SELF, and only mpic can do
anything more optimal than a loop over all online cpus, but all message
passing implementations have to code for this special delivery target.

Convert smp_send_debugger_break to take void and loop calling the smp_ops
message_pass function for each of the other cpus in the online cpumask.

Use raw_smp_processor_id() because we are either entering the debugger
or trying to start kdump and the additional warning it not useful were
it to trigger.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:46 +10:00
Milton Miller
aa79bc2167 powerpc: Call no-longer static setup_nr_cpu_ids instead of replicating it
c1854e0072 (powerpc: Set nr_cpu_ids early
and use it to free PACAs) copied the formerly static setup_nr_cpu_ids
from init/main.c but 34db18a054 (smp:
move smp setup functions to kernel/smp.c) moved it to kernel/smp.c
with a declaration in include/linux/smp.h, so we can call it instead of
replicating it.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:45 +10:00
Milton Miller
2cd947f175 powerpc: Use nr_cpu_ids in initial paca allocation
Now that we never set a cpu above nr_cpu_ids possible we can
limit our initial paca allocation to nr_cpu_ids.  We can then
clamp the number of cpus in platforms/iseries/setup.c.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:44 +10:00
Milton Miller
8657ae28dd powerpc: Respect nr_cpu_ids when calling set_cpu_possible and set_cpu_present
We should not set cpus above nr_cpu_ids to possible.  While we
will trigger a warning with CONFIG_CPUMASK_DEBUG, even then the mask
initializers will set the bits beyond what the iterators check and cause
nr_cpu_ids to increase.

Respecting nr_cpu_ids during setup will allow us to use it in our initial
paca allocation.  It can be reduced from NR_CPUS by the existing early param
nr_cpus=, which was added in 2b633e3fac (smp:
Use nr_cpus= to set nr_cpu_ids early).  We already call parse_early_parms
between finding the command line and allocating the pacas.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:44 +10:00
Milton Miller
bd9e5eefec powerpc/kdump64: Don't reference freed memory as pacas
Starting with 1426d5a3bd (powerpc:
Dynamically allocate pacas) the space for pacas beyond cpu_possible
is freed, but we failed to update the loop in crash.c.

Since c1854e0072 (powerpc: Set nr_cpu_ids
early and use it to free PACAs) the number of pacas allocated is
always nr_cpu_ids.

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <stable@kernel.org> # .34.x
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:44 +10:00
Milton Miller
768d18ad6d powerpc: Don't search for paca in freed memory
Starting with 1426d5a3bd (powerpc:
Dynamically allocate pacas) we free the memory for pacas beyond
cpu_possible, but we failed to update the loop the secondary cpus use
to find their paca.  If the system has running cpu threads for which
the kernel did not allocate a paca for they will search the memory that
was freed.  For instance this could happen when the device tree for
a kdump kernel was not updated after a cpu hotplug, or the kernel is
running with more cpus than the kernel was configured.

Since c1854e0072 (powerpc: Set nr_cpu_ids
early and use it to free PACAs) we set nr_cpu_ids before telling the
cpus to advance, so use that to limit the search.

We can't reference nr_cpu_ids without CONFIG_SMP because it is defined
as 1 instead of a memory location, but any extra threads should be sent
to kexec_wait in that case anyways, so make that explicit and remove
the search loop for UP.

Note to stable: The fix also requires
c1854e0072 (powerpc: Set
nr_cpu_ids early and use it to free PACAs) to function.  Also
9d07bc841c (Properly handshake CPUs going
out of boot spin loop) affects the second chunk, specifically the branch
target was 3b before and is 4b after that patch, and there was a blank
line before the #ifdef CONFIG_SMP that was removed

Cc: <stable@kernel.org> # .34.x: c1854e0072 powerpc: Set nr_cpu_ids early
Cc: <stable@kernel.org> # .34.x
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Milton Miller
3d2cea732d powerpc/kexec: Fix memory corruption from unallocated slaves
Commit 1fc711f7ff (powerpc/kexec: Fix race
in kexec shutdown) moved the write to signal the cpu had exited the kernel
from before the transition to real mode in kexec_smp_wait to kexec_wait.

Unfornately it missed that kexec_wait is used both by cpus leaving the
kernel and by secondary slave cpus that were not allocated a paca for
what ever reason -- they could be beyond nr_cpus or not described in
the current device tree for whatever reason (for example, kexec-load
was not refreshed after a cpu hotplug operation).  Cpus coming through
that path they will write to paca[NR_CPUS] which is beyond the space
allocated for the paca data and overwrite memory not allocated to pacas
but very likely still real mode accessable).

Move the write back to kexec_smp_wait, which is used only by cpus that
found their paca, but after the transition to real mode.

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: <stable@kernel.org> # (1fc711f was backported to 2.6.32)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:43 +10:00
Anton Blanchard
f5f0307f42 powerpc: Improve scheduling of system call entry instructions
After looking at our system call path, Mary Brown suggested that we
should put all mfspr SRR* instructions before any mtspr SRR*.

To test this I used a very simple null syscall (actually getppid)
testcase at http://ozlabs.org/~anton/junkcode/null_syscall.c

I tested with the following changes against the pseries_defconfig:

CONFIG_VIRT_CPU_ACCOUNTING=n
CONFIG_AUDIT=n

to remove the overhead of virtual CPU accounting and syscall
auditing.

POWER6:
baseline:       mean = 757.2 cycles       sd = 2.108
modified:       mean = 759.1 cycles       sd = 2.020

POWER7:
baseline:       mean = 411.4 cycles       sd = 0.138
modified:       mean = 404.1 cycles       sd = 0.109

So we have 1.77% improvement on POWER7 which looks significant. The
POWER6 suggest a 0.25% slowdown, but the results are within 1
standard deviation and may be in the noise.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Anton Blanchard
ba00ce1d6e powerpc: Remove static branch hint in giveup_altivec
A static branch hint will override dynamic branch prediction on
recent POWER CPUs. Since we are about to use more altivec in the
kernel remove the static hint in giveup_altivec that assumes
a userspace task is using altivec.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Anton Blanchard
d988f0e3f8 powerpc: Simplify 4k/64k copy_page logic
To make it easier to add optimised versions of copy_page, remove
the 4kB loop for 64kB pages and just do all the work in copy_page.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:42 +10:00
Justin Mattock
93395febdb powerpc: Remove unused config in the Makefile
The patch below removes an unused config variable found by using a kernel
cleanup script.
Note: I did try to cross compile these but hit erros while doing so..
(gcc is not setup to cross compile) and am unsure if anymore needs to be done.
Please have a look if/when anybody has free time.

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 14:30:40 +10:00
kerstin jonsson
c560bbceaf powerpc/4xx: Fix regression in SMP on 476
commit c56e58537d breaks SMP support in PPC_47x chip.
 secondary_ti must be set to current thread info before callin kick_cpu or else
 start_secondary_47x will jump into void when trying to return to c-code.
 In the current setup secondary_ti is initialized before the CPU idle task is started
 and only the boot core will start. I am not sure this is the correct solution, but it
 makes SMP possible in my chip.
 Note! The HOTPLUG support probably need some fixing to, There is no trampoline code
 available in head_44x.S - start_secondary_resume?

Signed-off-by: Kerstin Jonsson <kerstin.jonsson@ericsson.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 13:09:22 +10:00
Ben Hutchings
35d215fbe4 powerpc/kexec: Fix build failure on 32-bit SMP
Commit b987812b3f left
crash_kexec_wait_realmode() undefined for UP.

Commit 7c7a81b53e defined it for UP but
left it undefined for 32-bit SMP.

Seems like people are getting confused by nested #ifdef's, so move the
definitions of crash_kexec_wait_realmode() after the #ifdef CONFIG_SMP
section.

Compile-tested with 32-bit UP, 32-bit SMP and 64-bit SMP configurations.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 13:09:21 +10:00
Benjamin Herrenschmidt
69e3cea8d5 powerpc/smp: Make start_secondary_resume available to all CPU variants
This should fix SMP & Hotplug builds on FSL BookE and 476

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-19 13:07:12 +10:00
Greg Kroah-Hartman
82a3242e11 sysfs: remove "last sysfs file:" line from the oops messages
On some arches (x86, sh, arm, unicore, powerpc) the oops message would
print out the last sysfs file accessed.

This was very useful in finding a number of sysfs and driver core bugs
in the 2.5 and early 2.6 development days, but it has been a number of
years since this file has actually helped in debugging anything that
couldn't also be trivially determined from the stack traceback.

So it's time to delete the line.  This is good as we need all the space
we can get for oops messages at times on consoles.

Acked-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-05-13 16:05:51 -07:00
Ingo Molnar
9cb5baba5e Merge commit 'v2.6.39-rc7' into sched/core 2011-05-12 09:36:18 +02:00
Grant Likely
85f60ae4ee dt/flattree: explicitly pass command line pointer to early_init_dt_scan_chosen
This patch drops the reference to a global 'cmd_line' variable from
early_init_dt_scan_chosen, and instead passes the pointer to the command
line string via the *data argument.  Each architecture does something
slightly different with the initial command line, so it makes sense for
the architecture to be able to specify the variable name.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-11 14:53:18 +02:00
Justin P. Mattock
70f23fd66b treewide: fix a few typos in comments
- kenrel -> kernel
- whetehr -> whether
- ttt -> tt
- sss -> ss

Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-10 10:16:21 +02:00
Frederic Weisbecker
925f83c085 hw_breakpoints, powerpc: Fix CONFIG_HAVE_HW_BREAKPOINT off-case in ptrace_set_debugreg()
We make use of ptrace_get_breakpoints() / ptrace_put_breakpoints() to
protect ptrace_set_debugreg() even if CONFIG_HAVE_HW_BREAKPOINT if off.
However in this case, these APIs are not implemented.

To fix this, push the protection down inside the relevant ifdef.
Best would be to export the code inside
CONFIG_HAVE_HW_BREAKPOINT into a standalone function to cleanup
the ifdefury there and call the breakpoint ref API inside. But
as it is more invasive, this should be rather made in an -rc1.

Fixes this build error:

  arch/powerpc/kernel/ptrace.c:1594: error: implicit declaration of function 'ptrace_get_breakpoints' make[2]: ***

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: LPPC <linuxppc-dev@lists.ozlabs.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: v2.6.33.. <stable@kernel.org>
Link: http://lkml.kernel.org/r/1304639598-4707-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-06 11:24:46 +02:00
Jack Miller
a0496d450a powerpc: Add early debug for WSP platforms
Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:41 +10:00
David Gibson
a1d0d98daf powerpc: Add WSP platform
Add a platform for the Wire Speed Processor, based on the PPC A2.

This includes code for the ICS & OPB interrupt controllers, as well
as a SCOM backend, and SCOM based cpu bringup.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:35 +10:00
Benjamin Herrenschmidt
40bd587a88 powerpc: Rename slb0_limit() to safe_stack_limit() and add Book3E support
slb0_limit() wasn't a very descriptive name. This changes it along with
a comment explaining what it's used for, and provides a 64-bit BookE
implementation.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-06 13:32:24 +10:00
Brian King
9ee820fa00 powerpc/pseries: Add page coalescing support
Adds support for page coalescing, which is a feature on IBM Power servers
which allows for coalescing identical pages between logical partitions.
Hint text pages as coalesce candidates, since they are the most likely
pages to be able to be coalesced between partitions. This patch also
exports some page coalescing statistics available from firmware via
lparcfg.

[BenH: Moved a couple of things around to fix compile problems]

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 16:02:21 +10:00
Ben Hutchings
7707e4110e powerpc/kexec: Fix build failure on 32-bit SMP
Commit b987812b3f left
crash_kexec_wait_realmode() undefined for UP.

Commit 7c7a81b53e defined it for UP but
left it undefined for 32-bit SMP.

Seems like people are getting confused by nested #ifdef's, so move the
definitions of crash_kexec_wait_realmode() after the #ifdef CONFIG_SMP
section.

Compile-tested with 32-bit UP, 32-bit SMP and 64-bit SMP configurations.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:59 +10:00
KOSAKI Motohiro
104699c0ab powerpc: Convert old cpumask API into new one
Adapt new API.

Almost change is trivial. Most important change is the below line
because we plan to change task->cpus_allowed implementation.

-       ctx->cpus_allowed = current->cpus_allowed;

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:59 +10:00
Paul Mackerras
48404f2e95 powerpc: Save Come-From Address Register (CFAR) in exception frame
Recent 64-bit server processors (POWER6 and POWER7) have a "Come-From
Address Register" (CFAR), that records the address of the most recent
branch or rfid (return from interrupt) instruction for debugging purposes.

This saves the value of the CFAR in the exception entry code and stores
it in the exception frame.  We also make xmon print the CFAR value in
its register dump code.

Rather than extend the pt_regs struct at this time, we steal the orig_gpr3
field, which is only used for system calls, and use it for the CFAR value
for all exceptions/interrupts other than system calls.  This means we
don't save the CFAR on system calls, which is not a great problem since
system calls tend not to happen unexpectedly, and also avoids adding the
overhead of reading the CFAR to the system call entry path.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:22:09 +10:00
Paul Mackerras
1977b50212 powerpc: Save register r9-r13 values accurately on interrupt with bad stack
When we take an interrupt or exception from kernel mode and the stack
pointer is obviously not a kernel address (i.e. the top bit is 0), we
switch to an emergency stack, save register values and panic.  However,
on 64-bit server machines, we don't actually save the values of r9 - r13
at the time of the interrupt, but rather values corrupted by the
exception entry code for r12-r13, and nothing at all for r9-r11.

This fixes it by passing a pointer to the register save area in the paca
through to the bad_stack code in r3.  The register values are saved in
one of the paca register save areas (depending on which exception this
is).  Using the pointer in r3, the bad_stack code now retrieves the
saved values of r9 - r13 and stores them in the exception frame on the
emergency stack.  This also stores the normal exception frame marker
("regshere") in the exception frame.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:19:27 +10:00
Grant Likely
476eb49126 powerpc/irq: Stop exporting irq_map
First step in eliminating irq_map[] table entirely

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-05-04 15:02:15 +10:00
Matt Evans
44ae3ab335 powerpc: Free up some CPU feature bits by moving out MMU-related features
Some of the 64bit PPC CPU features are MMU-related, so this patch moves
them to MMU_FTR_ bits.  All cpu_has_feature()-style tests are moved to
mmu_has_feature(), and seven feature bits are freed as a result.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:52 +10:00
Anton Blanchard
eca590f402 powerpc/rtas: Only sleep in rtas_busy_delay if we have useful work to do
RTAS returns extended error codes as a hint of how long the
OS might want to wait before retrying a call. If we have nothing
else useful to do we may as well call back straight away.

This was found when testing the new dynamic dma window feature.
Firmware split the zeroing of the TCE table into 32k chunks but
returned 9901 (which is a suggested wait of 10ms). All up this took
about 10 minutes to complete since msleep is jiffies based and will
round 10ms up to 20ms.

With the patch below we take 3 seconds to complete the same test.
The hint firmware is returning in the RTAS call should definitely
be decreased, but even if we slept 1ms each iteration this would
take 32s.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:50 +10:00
Michael Ellerman
9f0b079320 powerpc: Use MSR_64BIT in places
Use the new MSR_64BIT in a few places. Some of these are already ifdef'ed
for BOOKE vs BOOKS, but it's still clearer, MSR_SF does not immediately
parse as "MSR bit for 64bit".

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:44 +10:00
Michael Ellerman
73706c3283 powerpc/irq: Dump chip data pointer in virq_mapping
This can be useful for differentiating interrupts on the same host
but with different chip data.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:37 +10:00
Michael Ellerman
69b123684b powerpc/pci: Properly initialize IO workaround "private"
Even when no initfunc is provided.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:33 +10:00
Michael Ellerman
d1109b7529 powerpc/pci: Make IO workarounds init implicit when first bus is registered
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:31 +10:00
Michael Ellerman
3cc30d0726 powerpc/pci: Move IO workarounds to the common kernel dir
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:29 +10:00
Alexey Kardashevskiy
efcac6589a powerpc: Per process DSCR + some fixes (try#4)
The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allow some control over the prefetch
of data streams.

This patch allows the value to be specified per thread by emulating
the corresponding mfspr and mtspr instructions. Children of such
threads inherit the value. Other threads use a default value that
can be specified in sysfs - /sys/devices/system/cpu/dscr_default.

If a thread starts with non default value in the sysfs entry,
all children threads inherit this non default value even if
the sysfs value is changed later.

Signed-off-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 14:18:19 +10:00
Jack Miller
f0aae3238f powerpc/book3e: Flush IPROT protected TLB entries leftover by firmware
When we set up the TLB for ourselves on Book3E, we need to flush out any
old mappings established by the firmware or bootloader.  At present we
attempt this with a tlbilx to flush everything, but this will leave behind
any entries with the IPROT bit set.

There are several good reason firmware might establish mappings with IPROT,
and in fact ePAPR compliant firmwares are required to establish their
initial mapped area with IPROT.

This patch, therefore adds more complex code to scan through the TLB upon
entry and flush away any entries that are not our own.

Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:16 +10:00
Benjamin Herrenschmidt
1a51dde139 powerpc/book3e: Use way 3 for linear mapping bolted entry
An erratum on A2 can lead to the bolted entry we insert for the linear
mapping being evicted, to avoid that write the bolted entry to way 3.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:14 +10:00
Michael Ellerman
ca1769f7a3 powerpc: Index crit/dbg/mcheck stacks using cpu number on 64bit
In exc_lvl_ctx_init() we index into the crit/dbg/mcheck stacks using
the hard cpu id, but that assumes the hard cpu id is zero based and
contiguous. That is not the case on A2.

The root of the problem is that the 32bit code has no equivalent of the
paca to allow it to do the hard->soft mapping in assembler. Until the
32bit code is updated to handle that, index the stacks using the soft
cpu ids on 64bit and hard on 32 bit.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:12 +10:00
Benjamin Herrenschmidt
76b4eda866 powerpc: Add A2 cpu support
Add the cputable entry, regs and setup & restore entries for
the PowerPC A2 core.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-27 13:02:02 +10:00
Frederic Weisbecker
07fa7a0a8a powerpc, hw_breakpoints: Fix racy access to ptrace breakpoints
While the tracer accesses ptrace breakpoints, the child task may
concurrently exit due to a SIGKILL and thus release its breakpoints
at the same time. We can then dereference some freed pointers.

To fix this, hold a reference on the child breakpoints before
manipulating them.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Prasad <prasad@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: v2.6.33.. <stable@kernel.org>
Link: http://lkml.kernel.org/r/1302284067-7860-4-git-send-email-fweisbec@gmail.com
2011-04-25 17:33:54 +02:00
Ingo Molnar
42ac9e87fd Merge commit 'v2.6.39-rc4' into sched/core
Merge reason: Pick up upstream fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-21 11:39:28 +02:00
Michael Ellerman
de30097476 powerpc/smp: smp_ops->kick_cpu() should be able to fail
When we start a cpu we use smp_ops->kick_cpu(), which currently
returns void, it should be able to fail. Convert it to return
int, and update all uses.

Convert all the current error cases to return -ENOENT, which is
what would eventually be returned by __cpu_up() currently when
it doesn't detect the cpu as coming up in time.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 17:01:18 +10:00
Benjamin Herrenschmidt
af2771493a powerpc: Improve prom_printf()
Adds the ability to print decimal numbers and adds some more
format string variants

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:25 +10:00
Benjamin Herrenschmidt
dd79773864 powerpc: Perform an isync to synchronize CPUs coming out of secondary_hold
We need to do that to guarantee they see any code change done by
dynamic patching during boot.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:25 +10:00
Benjamin Herrenschmidt
948cf67c47 powerpc: Add NAP mode support on Power7 in HV mode
Wakeup comes from the system reset handler with a potential loss of
the non-hypervisor CPU state. We save the non-volatile state on the
stack and a pointer to it in the PACA, which the system reset handler
uses to restore things

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt
9d07bc841c powerpc: Properly handshake CPUs going out of boot spin loop
We need to wait a bit for them to have done their CPU setup
or we might end up with translation and EE on with different
LPCR values between threads

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt
ad0693ee72 powerpc: Call CPU ->restore callback earlier on secondary CPUs
We do it before we loop on the PACA start flag. This way, we get a
chance to set critical SPRs on all CPUs before Linux tries to start
them up, which avoids problems when changing some bits such as LPCR
bits that need to be identical on all threads of a core or similar
things like that. Ideally, some of that should also be done before
the MMU is enabled, but that's a separate issue which would require
moving some of the SMP startup code earlier, let's not get there
for now, it works with that change alone.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt
b144871cb5 powerpc: Initialize TLB and LPID register on HV mode Power7
In case entry from the bootloader isn't "clean"

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:24 +10:00
Benjamin Herrenschmidt
895796a8ab powerpc: Initialize LPCR:DPFD on power7 to a sane default
This sets the default data stream prefetch size for operating
systems that don't set their own value in DSCR. We use 4 which
is "medium".

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Paul Mackerras
673b189a2e powerpc: Always use SPRN_SPRG_HSCRATCH0 when running in HV mode
This uses feature sections to arrange that we always use HSPRG1
as the scratch register in the interrupt entry code rather than
SPRG2 when we're running in hypervisor mode on POWER7.  This will
ensure that we don't trash the guest's SPRG2 when we are running
KVM guests.  To simplify the code, we define GET_SCRATCH0() and
SET_SCRATCH0() macros like the GET_PACA/SET_PACA macros.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Benjamin Herrenschmidt
b3e6b5dfcf powerpc: More work to support HV exceptions
Rework exception macros a bit to split offset from vector and add
some basic support for HDEC, HDSI, HISI and a few more.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:23 +10:00
Benjamin Herrenschmidt
a5d4f3ad3a powerpc: Base support for exceptions using HSRR0/1
Pass the register type to the prolog, also provides alternate "HV"
version of hardware interrupt (0x500) and adjust LPES accordingly

We tag those interrupts by setting bit 0x2 in the trap number

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt
2dd60d79e0 powerpc: In HV mode, use HSPRG0 for PACA
When running in Hypervisor mode (arch 2.06 or later), we store the PACA
in HSPRG0 instead of SPRG1. The architecture specifies that SPRGs may be
lost during a "nap" power management operation (though they aren't
currently on POWER7) and this enables use of SPRG1 by KVM guests.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Benjamin Herrenschmidt
24cc67de62 powerpc: Define CPU feature for Architected 2.06 HV mode
This bit indicates that we are operating in hypervisor mode on a CPU
compliant to architecture 2.06 or later (currently server only).

We set it on POWER7 and have a boot-time CPU setup function that
clears it if MSR:HV isn't set (booting under a hypervisor).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-20 11:03:22 +10:00
Eric B Munson
86c74ab317 powerpc/perf_event: Skip updating kernel counters if register value shrinks
Because of speculative event roll back, it is possible for some event coutners
to decrease between reads on POWER7.  This causes a problem with the way that
counters are updated.  Delta calues are calculated in a 64 bit value and the
top 32 bits are masked.  If the register value has decreased, this leaves us
with a very large positive value added to the kernel counters.  This patch
protects against this by skipping the update if the delta would be negative.
This can lead to a lack of precision in the coutner values, but from my testing
the value is typcially fewer than 10 samples at a time.

Signed-off-by: Eric B Munson <emunson@mgebm.net>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:08:23 +10:00
Anton Blanchard
84ffae55af powerpc: Fix oops if scan_dispatch_log is called too early
We currently enable interrupts before the dispatch log for the boot
cpu is setup. If a timer interrupt comes in early enough we oops in
scan_dispatch_log:

Unable to handle kernel paging request for data at address 0x00000010

...

.scan_dispatch_log+0xb0/0x170
.account_system_vtime+0xa0/0x220
.irq_enter+0x88/0xc0
.do_IRQ+0x48/0x230

The patch below adds a check to scan_dispatch_log to ensure the
dispatch log has been allocated.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:08:19 +10:00
Paul Gortmaker
7c7a81b53e powerpc/kexec: Fix regression causing compile failure on UP
Recent commit b987812b3f caused
a compile failure on UP because a considerably large block
of the file was included within CONFIG_SMP, hence making a stub
function not exposed on UP builds when it needed to be.

Relocate the stub to the #else /* ! CONFIG_SMP */ section
and also annotate the relevant else/endif so that nobody
else falls into the same trap I did.

Reported-by: Michael Guntsche <mike@it-loops.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-18 13:06:45 +10:00
Benjamin Herrenschmidt
8f3dda75cb Merge remote branch 'kumar/merge' into merge 2011-04-18 12:09:37 +10:00
Peter Zijlstra
184748cc50 sched: Provide scheduler_ipi() callback in response to smp_send_reschedule()
For future rework of try_to_wake_up() we'd like to push part of that
function onto the CPU the task is actually going to run on.

In order to do so we need a generic callback from the existing scheduler IPI.

This patch introduces such a generic callback: scheduler_ipi() and
implements it as a NOP.

BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl
2011-04-14 08:52:32 +02:00
Kumar Gala
11ed0db9f6 powerpc/book3e: Fix CPU feature handling on 64-bit e5500
The CPU_FTRS_POSSIBLE and CPU_FTRS_ALWAYS defines did not encompass
e5500 CPU features when built for 64-bit.  This causes issues with
cpu_has_feature() as it utilizes the POSSIBLE & ALWAYS defines as part
of its check.

Create a unique CPU_FTRS_E5500 (as its different from CPU_FTRS_E500MC),
created a new group for 64-bit Book3e based CPUs and add CPU_FTRS_E5500
to that group.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Prabhakar Kushwaha
07d9fce24d powerpc: Check device status before adding serial device
serial port nodes with the property status="disabled" are not usable and so
avoid adding "disabled" port with the system.

Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-04-12 06:29:21 -05:00
Rafael J. Wysocki
1f112cee07 PM / Hibernate: Introduce CONFIG_HIBERNATE_CALLBACKS
Xen save/restore is going to use hibernate device callbacks for
quiescing devices and putting them back to normal operations and it
would need to select CONFIG_HIBERNATION for this purpose.  However,
that also would cause the hibernate interfaces for user space to be
enabled, which might confuse user space, because the Xen kernels
don't support hibernation.  Moreover, it would be wasteful, as it
would make the Xen kernels include a substantial amount of code that
they would never use.

To address this issue introduce new power management Kconfig option
CONFIG_HIBERNATE_CALLBACKS, such that it will only select the code
that is necessary for the hibernate device callbacks to work and make
CONFIG_HIBERNATION select it.  Then, Xen save/restore will be able to
select CONFIG_HIBERNATE_CALLBACKS without dragging the entire
hibernate code along with it.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
2011-04-11 22:54:42 +02:00
Linus Torvalds
42933bac11 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
2011-04-07 11:14:49 -07:00
Ryan Grimm
c1854e0072 powerpc: Set nr_cpu_ids early and use it to free PACAs
Without this, "holes" in the CPU numbering can cause us to
free too many PACAs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-05 16:22:11 +10:00
Paul Gortmaker
b987812b3f powerpc/kexec: Fix mismatched ifdefs for PPC64/SMP.
Commit b3df895aeb "powerpc/kexec: Add support for FSL-BookE"
introduced the original PPC_STD_MMU_64 checks around the function
crash_kexec_wait_realmode().   Then commit c2be05481f
"powerpc: Fix default_machine_crash_shutdown #ifdef botch" changed
the ifdef around the calling site to add a check on SMP, but the
ifdef around the function itself was left unchanged, leaving an
unused function for PPC_STD_MMU_64=y and SMP=n

Rather than have two ifdefs that can get out of sync like this,
simply put the corrected conditional around the function and use
a stub to get rid of one set of ifdefs completely.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-05 16:22:10 +10:00
Benjamin Herrenschmidt
aeeafbfa7a powerpc/smp: Increase vdso_data->processorCount, not just decrease it
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:36 +11:00
Benjamin Herrenschmidt
c56e58537d powerpc/smp: Create idle threads on demand and properly reset them
Instead of creating idle threads at boot for all possible CPUs, we
create them on demand, like x86 or ARM, and we properly call init_idle
to re-initialize an idle thread when a CPU was unplugged and is now
re-plugged.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:34 +11:00
Benjamin Herrenschmidt
105765f451 powerpc/smp: Don't expose per-cpu "cpu_state" array
Instead, keep it static, expose an accessor and use that from
the PowerMac code. Avoids easy namespace collisions and will
make it easier to consolidate with other implementations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:33 +11:00
Benjamin Herrenschmidt
d72944457b powerpc/smp: Add a smp_ops->bringup_up() done callback
This allows us to stop abusing smp_ops->setup_cpu() for cleanup
tasks that have to take place after the initial boot time CPU
bringup.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:29 +11:00
Benjamin Herrenschmidt
62cc67b9df powerpc/pmac/smp: Properly NAP offlined CPU on G5
The current code soft-disables, and then goes to NAP mode which
turns interrupts on. That means that if an interrupt occurs, we
will hit the masked interrupt code path which isn't what we want,
as it will return with EE off, which will either get us out of
NAP mode, or fail to enter it (according to spec).

Instead, let's just rely on the fact that it is safe to take
decrementer interrupts on an offline CPU and leave interrupts
enabled. We can also get rid of the special case in asm for
power4_cpu_offline_powersave() and just use power4_idle().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:25 +11:00
Benjamin Herrenschmidt
1c91cc5705 powerpc/pmac/smp: Rename fixup_irqs() to migrate_irqs() and use it on ppc32
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:18 +11:00
Benjamin Herrenschmidt
7a53a4fe70 powerpc/smp: Remove unused smp_ops->cpu_enable()
Remove the last remnants of cpu_enable(), everybody uses the normal
__cpu_up() path now

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:14 +11:00
Benjamin Herrenschmidt
b527d07114 powerpc/smp: Remove unused generic_cpu_enable()
Nobody uses it, besides we should always use the normal __cpu_up
path anyways

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:12 +11:00
Benjamin Herrenschmidt
4fcb8833af powerpc/smp: Fix generic_mach_cpu_die()
This is used by some "soft" hotplug implementations. I needs to
call idle_task_exit() when the CPU is going away, and we remove
the now no-longer needed set_cpu_online() and local_irq_enable()
which are handled by the return to start_secondary

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:10 +11:00
Benjamin Herrenschmidt
fa3f82c8bb powerpc/smp: soft-replugged CPUs must go back to start_secondary
Various thing are torn down when a CPU is hot-unplugged. That CPU
is expected to go back to start_secondary when re-plugged to re
initialize everything, such as clock sources, maps, ...

Some implementations just return from cpu_die() callback
in the idle loop when the CPU is "re-plugged". This is not enough.

We fix it using a little asm trampoline which resets the stack
and calls back into start_secondary as if we were all fresh from
boot. The trampoline already existed on ppc64, but we add it for
ppc32

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:09 +11:00
Benjamin Herrenschmidt
963e5d3b76 powerpc: Make decrementer interrupt robust against offlined CPUs
With some implementations, it is possible that a timer interrupt
occurs every few seconds on an offline CPU. In this case, just
re-arm the decrementer and return immediately

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-04-01 15:37:07 +11:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Benjamin Herrenschmidt
84493804bb powerpc/mm: Move the STAB0 location to 0x8000 to make room in low memory
Recent upstream builds with allmodconfig fail due to lack of space
between 0x3000 and 0x6000. We have a hard block at 0x7000 but we can
spare a page by moving the STAB0 from 0x6000 to 0x8000.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:20 +11:00
Anton Blanchard
ad5d1c888e powerpc: Fix accounting of softirq time when idle
commit cf9efce0ce (powerpc: Account time using timebase rather
than PURR) used in_irq() to detect if the time was spent in
interrupt processing. This only catches hardirq context so if we
are in softirq context and in the idle loop we end up accounting it
as idle time. If we instead use in_interrupt() we catch both softirq
and hardirq time.

The issue was found when running a network intensive workload. top
showed the following:

0.0%us,  1.1%sy,  0.0%ni, 85.7%id,  0.0%wa,  9.9%hi,  3.3%si,  0.0%st

85.7% idle. But this was wildly different to the perf events data.
To confirm the suspicion I ran something to keep the core busy:

# yes > /dev/null &

8.2%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa, 10.3%hi, 81.4%si,  0.0%st

We only got 8.2% of the CPU for the userspace task and softirq has
shot up to 81.4%.

With the patch below top shows the correct stats:

0.0%us,  0.0%sy,  0.0%ni,  5.3%id,  0.0%wa, 13.3%hi, 81.3%si,  0.0%st

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:18 +11:00
Benjamin Herrenschmidt
6090912c4a powerpc: Implement dma_mmap_coherent()
This is used by Alsa to mmap buffers allocated with dma_alloc_coherent()
into userspace. We need a special variant to handle machines with
non-coherent DMAs as those buffers have "special" virt addresses and
require non-cachable mappings

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-30 10:44:00 +11:00
Thomas Gleixner
433c9c67c5 powerpc: Use generic show_interrupts()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:13 +02:00
Thomas Gleixner
ec775d0e70 powerpc: Convert to new irq_* function names
Scripted with coccinelle.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:12 +02:00
Thomas Gleixner
7bfbc1f283 powerpc: irq: Use irqdata based information
We want to tighten the irq_desc access. So use the new accessors for
the same information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:12 +02:00
Thomas Gleixner
98488db9ff powerpc: Use proper accessors for IRQ_* flags
Use the proper accessors instead of open access to irq_desc.
Converted with coccinelle.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-29 14:48:08 +02:00
Tejun Heo
0415b00d17 percpu: Always align percpu output section to PAGE_SIZE
Percpu allocator honors alignment request upto PAGE_SIZE and both the
percpu addresses in the percpu address space and the translated kernel
addresses should be aligned accordingly.  The calculation of the
former depends on the alignment of percpu output section in the kernel
image.

The linker script macros PERCPU_VADDR() and PERCPU() are used to
define this output section and the latter takes @align parameter.
Several architectures are using @align smaller than PAGE_SIZE breaking
percpu memory alignment.

This patch removes @align parameter from PERCPU(), renames it to
PERCPU_SECTION() and makes it always align to PAGE_SIZE.  While at it,
add PCPU_SETUP_BUG_ON() checks such that alignment problems are
reliably detected and remove percpu alignment comment recently added
in workqueue.c as the condition would trigger BUG way before reaching
there.

For um, this patch raises the alignment of percpu area.  As the area
is in .init, there shouldn't be any noticeable difference.

This problem was discovered by David Howells while debugging boot
failure on mn10300.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Cc: uclinux-dist-devel@blackfin.uclinux.org
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
2011-03-24 18:50:09 +01:00
Linus Torvalds
b81a618dcd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  deal with races in /proc/*/{syscall,stack,personality}
  proc: enable writing to /proc/pid/mem
  proc: make check_mem_permission() return an mm_struct on success
  proc: hold cred_guard_mutex in check_mem_permission()
  proc: disable mem_write after exec
  mm: implement access_remote_vm
  mm: factor out main logic of access_process_vm
  mm: use mm_struct to resolve gate vma's in __get_user_pages
  mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
  mm: arch: make in_gate_area take an mm_struct instead of a task_struct
  mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
  x86: mark associated mm when running a task in 32 bit compatibility mode
  x86: add context tag to mark mm when running a task in 32-bit compatibility mode
  auxv: require the target to be tracable (or yourself)
  close race in /proc/*/environ
  report errors in /proc/*/*map* sanely
  pagemap: close races with suid execve
  make sessionid permissions in /proc/*/task/* match those in /proc/*
  fix leaks in path_lookupat()

Fix up trivial conflicts in fs/proc/base.c
2011-03-23 20:51:42 -07:00
Olaf Hering
93a72052be crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state.  To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.

Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.

Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
to early_param().  It adds an address range check from x86 also on ia64
and powerpc.

[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:19 -07:00
Alexandre Bounine
388b78adc9 rapidio: modify configuration to support PCI-SRIO controller
1. Add an option to include RapidIO support if the PCI is available.
2. Add FSL_RIO configuration option to enable controller selection.
3. Add RapidIO support option into x86 and MIPS architectures.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:46:42 -07:00
Stephen Wilson
cae5d39032 mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
Now that gate vma's are referenced with respect to a particular mm and not a
particular task it only makes sense to propagate the change to this predicate as
well.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23 16:36:55 -04:00
Stephen Wilson
83b964bbf8 mm: arch: make in_gate_area take an mm_struct instead of a task_struct
Morally, the question of whether an address lies in a gate vma should be asked
with respect to an mm, not a particular task.  Moreover, dropping the dependency
on task_struct will help make existing and future operations on mm's more
flexible and convenient.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23 16:36:54 -04:00
Stephen Wilson
31db58b3ab mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
Morally, the presence of a gate vma is more an attribute of a particular mm than
a particular task.  Moreover, dropping the dependency on task_struct will help
make both existing and future operations on mm's more flexible and convenient.

Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-03-23 16:36:54 -04:00
Eric Dumazet
b6a84016bd mm: NUMA aware alloc_thread_info_node()
Add a node parameter to alloc_thread_info(), and change its name to
alloc_thread_info_node()

This change is needed to allow NUMA aware kthread_create_on_cpu()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 17:44:01 -07:00
Mike Wolf
a71f5d5d27 powerpc/ptrace: Remove BUG_ON when full register set not available
In some cases during a threaded core dump not all the threads will have
a full register set. This happens when the signal causing the core dump
races with a thread exiting.  The race happens when the exiting thread
has entered the kernel for the last time before the signal arrives, but
doesn't get far enough through the exit code to avoid being included
in the core dump.

So we get a thread included in the core dump which is never going to go
out to userspace again and only has a partial register set recorded

Normally we would catch each thread as it is about to go into userspace
and capture the full register set then.

However, this exiting thread is never going to go out to userspace
again, so we have no way to capture its full register set.  It doesn't
really matter, though, as this is a thread which is effectively
already dead.

So instead of hitting a BUG() in this case (a really bad choice of
action in the first place), we use a poison value for the register
values.

[BenH]: Some cosmetic/stylistic changes and fix build on ppc32

Signed-off-by: Mike Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-21 11:18:14 +11:00
Benjamin Herrenschmidt
90407c9976 powerpc/pci: Fix crash in PCI code on ppc64 when matching device nodes
Commit b5d937de03 has a bug which causes
basically a NULL dereference in the PCI code during boot on ppc64
machines.

fetch_dev_dn() is called when dev->dev.of_node is NULL, so using that
as the starting point for the search makes no sense. It should instead
start from the device node of the PHB.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-21 10:57:57 +11:00
Linus Torvalds
619297855a Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
  trace, filters: Initialize the match variable in process_ops() properly
  trace, documentation: Fix branch profiling location in debugfs
  oprofile, s390: Cleanups
  oprofile, s390: Remove hwsampler_files.c and merge it into init.c
  perf: Fix tear-down of inherited group events
  perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
  perf: Handle stopped state with tracepoints
  perf: Fix the software events state check
  perf, powerpc: Handle events that raise an exception without overflowing
  perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
  perf, x86: Clean up SandyBridge PEBS events
  perf lock: Fix sorting by wait_min
  perf tools: Version incorrect with some versions of grep
  perf evlist: New command to list the names of events present in a perf.data file
  perf script: Add support for H/W and S/W events
  perf script: Add support for dumping symbols
  perf script: Support custom field selection for output
  perf script: Move printing of 'common' data from print_event and rename
  perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
  perf script: Change process_event prototype
  ...
2011-03-18 10:38:34 -07:00
Linus Torvalds
0a95d92c00 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (62 commits)
  powerpc/85xx: Fix signedness bug in cache-sram
  powerpc/fsl: 85xx: document cache sram bindings
  powerpc/fsl: define binding for fsl mpic interrupt controllers
  powerpc/fsl_msi: Handle msi-available-ranges better
  drivers/serial/ucc_uart.c: Add of_node_put to avoid memory leak
  powerpc/85xx: Fix SPE float to integer conversion failure
  powerpc/85xx: Update sata controller compatible for p1022ds board
  ATA: Add FSL sata v2 controller support
  powerpc/mpc8xxx_gpio: simplify searching for 'fsl, qoriq-gpio' compatiable
  powerpc/8xx: remove obsolete mgsuvd board
  powerpc/82xx: rename and update mgcoge board support
  powerpc/83xx: rename and update kmeter1
  powerpc/85xx: Workaroudn e500 CPU erratum A005
  powerpc/fsl_pci: Add support for FSL PCIe controllers v2.x
  powerpc/85xx: Fix writing to spin table 'cpu-release-addr' on ppc64e
  powerpc/pseries: Disable MSI using new interface if possible
  powerpc: Enable GENERIC_HARDIRQS_NO_DEPRECATED.
  powerpc: core irq_data conversion.
  powerpc: sysdev/xilinx_intc irq_data conversion.
  powerpc: sysdev/uic irq_data conversion.
  ...

Fix up conflicts in arch/powerpc/sysdev/fsl_msi.c (due to getting rid of
of_platform_driver in arch/powerpc)
2011-03-18 06:31:43 -07:00
Benjamin Herrenschmidt
831532035b Merge remote branch 'jwb/next' into next 2011-03-17 17:59:01 +11:00
Linus Torvalds
4c5811bf46 Merge branch 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6: (21 commits)
  tty: serial: altera_jtaguart: Add device tree support
  tty: serial: altera_uart: Add devicetree support
  dt: eliminate of_platform_driver shim code
  dt: Eliminate of_platform_{,un}register_driver
  dt/serial: Eliminate users of of_platform_{,un}register_driver
  dt/usb: Eliminate users of of_platform_{,un}register_driver
  dt/video: Eliminate users of of_platform_{,un}register_driver
  dt/net: Eliminate users of of_platform_{,un}register_driver
  dt/sound: Eliminate users of of_platform_{,un}register_driver
  dt/spi: Eliminate users of of_platform_{,un}register_driver
  dt: uartlite: merge platform and of_platform driver bindings
  dt: xilinx_hwicap: merge platform and of_platform driver bindings
  ipmi: convert OF driver to platform driver
  leds/leds-gpio: merge platform_driver with of_platform_driver
  dt/sparc: Eliminate users of of_platform_{,un}register_driver
  dt/powerpc: Eliminate users of of_platform_{,un}register_driver
  dt/powerpc: move of_bus_type infrastructure to ibmebus
  drivercore/dt: add a match table pointer to struct device
  dt: Typo fix.
  altera_ps2: Add devicetree support
  ...
2011-03-16 17:28:10 -07:00
Linus Torvalds
79d8a8f736 Merge branch 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-2.6.39' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
  percpu: Generic support for this_cpu_cmpxchg_double()
  alpha: use L1_CACHE_BYTES for cacheline size in the linker script
  percpu: align percpu readmostly subsection to cacheline

Fix up trivial conflict in arch/x86/kernel/vmlinux.lds.S due to the
percpu alignment having changed ("x86: Reduce back the alignment of the
per-CPU data section")
2011-03-16 08:22:41 -07:00
Anton Blanchard
0837e3242c perf, powerpc: Handle events that raise an exception without overflowing
Events on POWER7 can roll back if a speculative event doesn't
eventually complete. Unfortunately in some rare cases they will
raise a performance monitor exception. We need to catch this to
ensure we reset the PMC. In all cases the PMC will be 256 or less
cycles from overflow.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org> # as far back as it applies cleanly
LKML-Reference: <20110309143842.6c22845e@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-16 14:04:13 +01:00
Linus Torvalds
d10902812c Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  x86: Clean up apic.c and apic.h
  x86: Remove superflous goal definition of tsc_sync
  x86: dt: Correct local apic documentation in device tree bindings
  x86: dt: Cleanup local apic setup
  x86: dt: Fix OLPC=y/INTEL_CE=n build
  rtc: cmos: Add OF bindings
  x86: ce4100: Use OF to setup devices
  x86: ioapic: Add OF bindings for IO_APIC
  x86: dtb: Add generic bus probe
  x86: dtb: Add support for PCI devices backed by dtb nodes
  x86: dtb: Add device tree support for HPET
  x86: dtb: Add early parsing of IO_APIC
  x86: dtb: Add irq domain abstraction
  x86: dtb: Add a device tree for CE4100
  x86: Add device tree support
  x86: e820: Remove conditional early mapping in parse_e820_ext
  x86: OLPC: Make OLPC=n build again
  x86: OLPC: Remove extra OLPC_OPENFIRMWARE_DT indirection
  x86: OLPC: Cleanup config maze completely
  x86: OLPC: Hide OLPC_OPENFIRMWARE config switch
  ...

Fix up conflicts in arch/x86/platform/ce4100/ce4100.c
2011-03-15 20:01:36 -07:00
Lennert Buytenhek
e11802872d powerpc: core irq_data conversion.
Signed-off-by: Lennert Buytenhek <buytenh@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-10 11:04:04 +11:00
Benjamin Herrenschmidt
f2f6dad6ca powerpc/iseries: Fix early init access to lppaca
The combination of commit

8154c5d22d and
93c22703ef

Broke boot on iSeries.

The problem is that iSeries very early boot code, which generates
the device-tree and runs before our normal early initializations
does need access the lppaca's very early, before the PACA array is
initialized, and in fact even before the boot PACA has been
initialized (it contains all 0's at this stage).

However, the first patch above makes that code use the new
llpaca_of(cpu) accessor, which itself is changed by the second patch to
use the PACA array.

We fix that by reverting iSeries to directly dereferencing the array. In
addition, we fix all iterators in the iSeries code to always skip CPU
whose number is above 63 which is the maximum size of that array and
the maximum number of supported CPUs on these machines.

Additionally, we make sure the boot_paca is properly initialized
in our early startup code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-10 10:06:02 +11:00
Jim Keniston
0f4ac13236 powerpc/nvram: Generalize code for OS partitions in NVRAM
Adapt the functions used to create and write to the RTAS-log partition
to work with any OS-type partition.

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-04 18:19:04 +11:00
Scott Wood
6dd2270029 powerpc: Fix memory limits when starting at a non-zero address
memblock_enforce_memory_limit() takes the desired maximum quantity of memory
to end up with, not an address above which memory will not be used.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 16:56:15 +11:00
Paul E. McKenney
9ff0c61d08 powerpc: Mask smp_processor_id() false positive
The rtas_event_scan() function uses smp_processor_id() to select a
starting point in cpu_online_mask, and does so under the protection
of get_online_cpus().  This might not select the current processor
in any case, so switch to raw_smp_processor_id().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 16:50:25 +11:00
Thomas Gleixner
a9d8946b4a powerpc: Use new irq allocator
Use the new functions and free the descriptor when the virq is
destroyed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 16:50:24 +11:00
Thomas Gleixner
089fb442f3 powerpc: Use ARCH_IRQ_INIT_FLAGS
Define the ARCH_IRQ_INIT_FLAGS instead of fixing it up in a loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 16:50:24 +11:00
K.Prasad
e0780b720f powerpc: Fix call to flush_ptrace_hw_breakpoint()
Fix the error in spelling the config option for hw-breakpoints and fix
the build issue that follows.

Signed-off by: K.Prasad <prasad@linux.vnet.ibm.com>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 14:56:49 +11:00
Anton Blanchard
357574c482 powerpc/kexec: Restore ppc_md.machine_kexec
Kyle Moffett points out that mpc85xx has started using the
ppc_md.machine_kexec hook. As such, revert patch c94868788c
(powerpc/kexec: Remove ppc_md.machine_kexec).

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-03-02 14:56:48 +11:00
Grant Likely
000061245a dt/powerpc: Eliminate users of of_platform_{,un}register_driver
Get rid of old users of of_platform_driver in arch/powerpc.  Most
of_platform_driver users can be converted to use the platform_bus
directly.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-28 01:36:39 -07:00
Grant Likely
710ac54be4 dt/powerpc: move of_bus_type infrastructure to ibmebus
arch/powerpc/kernel/ibmebus.c is the only remaining user of the
of_bus_type support code for initializing the bus and registering
drivers.  All others have either been switched to the vanilla platform
bus or already have their own infrastructure.

This patch moves the functionality that ibmebus is using out of
drivers/of/{platform,device}.c and into ibmebus.c where it is actually
used.  Also renames the moved symbols from of_platform_* to
ibmebus_bus_* to reflect the actual usage.

This patch is part of moving all of the of_platform_bus_type users
over to the platform_bus_type.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-28 01:36:38 -07:00
Grant Likely
38a5d6736e Merge commit 'v2.6.38-rc6' into devicetree/next
Conflicts:
	drivers/spi/pxa2xx_spi_pci.c
2011-02-28 01:36:21 -07:00
Thomas Gleixner
7acdbb3f35 Merge branch 'linus' into x86/platform
Reason: Import mainline device tree changes on which further patches
        depend on or conflict.

Trivial conflict in: drivers/spi/pxa2xx_spi_pci.c

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-23 09:21:41 +01:00
Benjamin Herrenschmidt
1f1936ff3f powerpc: Fix some 6xx/7xxx CPU setup functions
Some of those functions try to adjust the CPU features, for example
to remove NAP support on some revisions. However, they seem to use
r5 as an index into the CPU table entry, which might have been right
a long time ago but no longer is. r4 is the right register to use.

This probably caused some off behaviours on some PowerMac variants
using 750cx or 7455 processor revisions.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@kernel.org
2011-02-07 12:57:11 +11:00
Benjamin Herrenschmidt
af9eef3c7b powerpc: Pass the right cpu_spec to ->setup_cpu() on 64-bit
When calling setup_cpu() on 64-bit, we pass a pointer to the
cputable entry we have found. This used to be fine when cur_cpu_spec
was a pointer to that entry, but nowadays, we copy the entry into
a separate variable, and we do so before we call the setup_cpu()
callback. That means that any attempt by that callback at patching
the CPU table entry (to adjust CPU features for example) will patch
the wrong table.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-02-07 12:47:57 +11:00
Grant Likely
b5d937de03 powerpc/pci: Make both ppc32 and ppc64 use sysdata for pci_controller
Currently, ppc32 uses sysdata for the pci_controller pointer, and
ppc64 uses it to hold the device_node pointer.  This patch moves the
of_node pointer into (struct pci_bus*)->dev.of_node and
(struct pci_dev*)->dev.of_node so that sysdata can be converted to always
use the pci_controller pointer instead.  It also fixes up the
allocating of pci devices so that the of_node pointer gets assigned
consistently and increments the ref count.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-04 11:46:51 -07:00
Sebastian Andrzej Siewior
04bea68b2f of/pci: move of_irq_map_pci() into generic code
There is a tiny difference between PPC32 and PPC64. Microblaze uses the
PPC32 variant.

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
[grant.likely@secretlab.ca: Added comment to #endif, moved documentation
	block to function implementation, fixed for non ppc and microblaze
	compiles]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-02-04 11:46:50 -07:00
Dave Kleikamp
c48d0dbaac powerpc/476: define specific cpu table entry DD2 core
The DD2 core still has some unstability.  Define CPU_FTR_476_DD2 to
enable workarounds in later patches.

This is based on an earlier, unreleased patch for DD1 by Ben Herrenschmidt.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2011-02-02 06:58:53 -05:00
Tejun Heo
19df0c2fef percpu: align percpu readmostly subsection to cacheline
Currently percpu readmostly subsection may share cachelines with other
percpu subsections which may result in unnecessary cacheline bounce
and performance degradation.

This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR()
linker macros, makes each arch linker scripts specify its cacheline
size and use it to align percpu subsections.

This is based on Shaohua's x86 only patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Shaohua Li <shaohua.li@intel.com>
2011-01-25 14:26:50 +01:00
Linus Torvalds
500d85ce39 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix time function double declaration with glibc
  perf tools: Fix build by checking if extra warnings are supported
  perf tools: Fix build when using gcc 3.4.6
  perf tools: Add missing header, fixes build
  perf tools: Fix 64 bit integer format strings
  perf test: Fix build on older glibcs
  perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/
  perf test: Use cpu_map->[cpu] when setting affinity
  perf symbols: Fix annotation of thumb code
  perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat
  powerpc, perf: Fix frequency calculation for overflowing counters (FSL version)
  perf: Fix perf_event_init_task()/perf_event_free_task() interaction
  perf: Fix find_get_context() vs perf_event_exit_task() race
2011-01-25 05:26:47 +10:00
Anton Blanchard
fbe754ca3a powerpc: machine_check_generic is wrong on 64bit
Decoding machine checks is CPU specific and so machine_check_generic doesn't
do the right thing on 64bit chips. Luckily we never call into this code
because we call ppc_md.machine_check_exception instead if available.

Since we check cur_cpu_spec->machine_check before calling it, we may as
well remove machine_check_generic from 64bit archs.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:38 +11:00
Anton Blanchard
7f32c9c600 powerpc: Check RTAS extended log flag before checking length
The spec suggests we should first check the extended log flag before checking
the length field.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:38 +11:00
Anton Blanchard
e49b1fae0b powerpc: Don't silently handle machine checks from userspace
If a machine check comes from userspace we send a SIGBUS to the task and
fail to printk anything.

If we are taking machine checks due to bad hardware we want to know about
it right away. Furthermore if we don't complain loudly then it will look
a lot like a bug in the userspace application, potentially causing a lot
of confusion.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:37 +11:00
Anton Blanchard
dfb5509f8f powerpc: Remove duplicate debugger hook in machine_check_exception
We are calling debugger_fault_handler twice in machine_check_exception.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:37 +11:00
Anton Blanchard
a443506b85 powerpc: Don't force MSR_RI in machine_check_exception
We should never force MSR_RI on. If we take a machine check with MSR_RI off
then we have no chance of recovering safely.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:37 +11:00
Anton Blanchard
7071854bb2 powerpc: Print 32 bits of DSISR in show_regs
We were printing 64 bits of DSISR in show_regs even though it is 32 bit.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:37 +11:00
Anton Blanchard
ac4414e4d3 powerpc/kdump: Disable ftrace during kexec
We should disable ftrace during kexec, some of the tracers are very invasive
and we do not want them going off while doing the low level work of swapping
one kernel out for another. This mirrors what we do on x86.

Even though we cannot return from a kexec on powerpc (since we do not implement
CONFIG_KEXEC_JUMP), add the restore code in case we do one day.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:36 +11:00
Anton Blanchard
158d5b5e36 powerpc/kdump: Move crash_kexec_stop_spus to kdump crash handler
Use the crash handler hooks to run the SPU stop code, just like we do for
ehea and cell RAS code.

While I'm here I noticed "CPUSs reliabally"

so fix the spelling MISTAKESs reliabally.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:36 +11:00
Anton Blanchard
c1f784e553 powerpc/kdump: Remove ppc_md.machine_crash_shutdown
No one uses ppc_md.machine_crash_shutdown, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:35 +11:00
Anton Blanchard
c94868788c powerpc/kexec: Remove ppc_md.machine_kexec
No one uses ppc_md.machine_kexec, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:35 +11:00
Anton Blanchard
619b267724 powerpc/kexec: Remove ppc_md.machine_kexec_cleanup
No one uses ppc_md.machine_kexec_cleanup, so remove it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:35 +11:00
Tejun Heo
b18ae08dea powerpc/cell: Use system_wq in cpufreq_spudemand
With cmwq, there's no reason to use a separate workqueue in
cpufreq_spudemand.  Use system_wq instead.  The work items are already
sync canceled on stop, so it's already guaranteed that no work is
running when spu_gov_exit() is entered.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Dave Jones <davej@redhat.com>
Cc: cpufreq@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:34 +11:00
Akinobu Mita
4c4a5cf64b powerpc/rtas_flash: Use simple_read_from_buffer
Simplify read file operation for /proc/powerpc/rtas/* interface
by using simple_read_from_buffer.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:34 +11:00
Steven Rostedt
06ca2188ec powerpc/ppc32/tracing: Add stack frame to calls of trace_hardirqs_on/off
32-bit variant of the previous patch for 64-bit:

<<
    When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
    With one level stack. But if we have irqsoff tracing enabled,
    it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
    goes two stack frames up. If this is from user space, then there may
    not exist a second stack....
>>

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-01-21 14:08:33 +11:00
Benjamin Herrenschmidt
50f4df4e6a Merge remote branch 'kumar/next' into merge 2011-01-21 11:00:44 +11:00
Anton Blanchard
8c8a9b25b5 powerpc, perf: Fix frequency calculation for overflowing counters (FSL version)
When fixing the frequency calculations for perf on powerpc I
forgot to fix the FSL version.

If we dont set event->hw.last_period the frequency to period
calculations in perf go haywire and we continually
throttle/unthrottle the PMU.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110118214404.2f42e634@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-19 20:05:42 +01:00
Linus Torvalds
c6fa63c659 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix tracepoint id to string perf.data header table
  perf tools: Fix handling of wildcards in tracepoint event selectors
  powerpc: perf: Fix frequency calculation for overflowing counters
2011-01-18 08:04:30 -08:00
Anton Blanchard
4bca770ede powerpc: perf: Fix frequency calculation for overflowing counters
When profiling a benchmark that is almost 100% userspace, I noticed some wildly
inaccurate profiles that showed almost all time spent in the kernel.

Closer examination shows we were programming a tiny number of cycles into the
PMU after each overflow (about ~200 away from the next overflow). This gets us
stuck in a loop which we eventually break out of by throttling the PMU (there
are regular throttle/unthrottle events in the log).

It looks like we aren't setting event->hw.last_period to something same and the
frequency to period calculations in perf are going haywire.

With the following patch we find the correct period after a few interrupts and
stay there. I also see no more throttle events.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
LKML-Reference: <20110117161742.5feb3761@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-17 11:43:02 +01:00
Grant Likely
672c54466d dt/flattree: Return virtual address from early_init_dt_alloc_memory_arch()
The physical address is never used by the device tree code when
allocating memory for unflattening.  Change the architecture's alloc
hook to return the virutal address instead.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-01-15 22:01:58 -07:00
Shaohui Xie
b5fb0cc7f1 powerpc/fsl_rio: Fix non-standard HID1 register access
Moved setting of RFXE bit so we get machine checks on RIO errors into
cpu_setup so that the RIO code isn't core specific.

Signed-off-by: Shaohui Xie <b21989@freescale.com>
Cc: Li Yang <leoli@freescale.com>
Cc: Roy Zang <tie-fei.zang@freescale.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-01-12 18:00:29 -06:00
Linus Torvalds
5a62f99544 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (72 commits)
  powerpc/pseries: Fix build of topology stuff without CONFIG_NUMA
  powerpc/pseries: Fix VPHN build errors on non-SMP systems
  powerpc/83xx: add mpc8308_p1m DMA controller device-tree node
  powerpc/83xx: add DMA controller to mpc8308 device-tree node
  powerpc/512x: try to free dma descriptors in case of allocation failure
  powerpc/512x: add MPC8308 dma support
  powerpc/512x: fix the hanged dma transfer issue
  powerpc/512x: scatter/gather dma fix
  powerpc/powermac: Make auto-loading of therm_pm72 possible
  of/address: Use propper endianess in get_flags
  powerpc/pci: Use printf extension %pR for struct resource
  powerpc: Remove unnecessary casts of void ptr
  powerpc: Disable VPHN polling during a suspend operation
  powerpc/pseries: Poll VPA for topology changes and update NUMA maps
  powerpc: iommu: Add device name to iommu error printks
  powerpc: Record vma->phys_addr in ioremap()
  powerpc: Update compat_arch_ptrace
  powerpc: Fix PPC_PTRACE_SETHWDEBUG on PPC_BOOK3S
  powerpc/time: printk time stamp init not correct
  powerpc: Minor cleanups for machdep.h
  ...
2011-01-11 16:31:41 -08:00
Linus Torvalds
0bd2cbcdfa Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (29 commits)
  of/flattree: forward declare struct device_node in of_fdt.h
  ipmi: explicitly include of_address.h and of_irq.h
  sparc: explicitly cast negative phandle checks to s32
  powerpc/405: Fix missing #{address,size}-cells in i2c node
  powerpc/5200: dts: refactor dts files
  powerpc/5200: dts: Change combatible strings on localbus
  powerpc/5200: dts: remove unused properties
  powerpc/5200: dts: rename nodes to prepare for refactoring dts files
  of/flattree: Update dtc to current mainline.
  of/device: Don't register disabled devices
  powerpc/dts: fix syntax bugs in bluestone.dts
  of: Fixes for OF probing on little endian systems
  of: make drivers depend on CONFIG_OF instead of CONFIG_PPC_OF
  of/flattree: Add of_flat_dt_match() helper function
  of_serial: explicitly include of_irq.h
  of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree
  of/flattree: Reorder unflatten_dt_node
  of/flattree: Refactor unflatten_dt_node
  of/flattree: Add non-boottime device tree functions
  of/flattree: Add Kconfig for EARLY_FLATTREE
  ...

Fix up trivial conflict in arch/sparc/prom/tree_32.c as per Grant.
2011-01-10 08:57:03 -08:00
Grant Likely
cfb13c5db0 Merge commit 'v2.6.37-rc7' into devicetree/next 2010-12-23 00:41:14 -07:00
Peter Zijlstra
2e80a82a49 perf: Dynamic pmu types
Extend the perf_pmu_register() interface to allow for named and
dynamic pmu types.

Because we need to support the existing static types we cannot use
dynamic types for everything, hence provide a type argument.

If we want to enumerate the PMUs they need a name, provide one.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.259707703@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16 11:36:43 +01:00
Joe Perches
518fdae26a powerpc/pci: Use printf extension %pR for struct resource
Using %pR standardizes the struct resource output.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:36:30 +11:00
Jesse Larrew
3b7a27db3b powerpc: Disable VPHN polling during a suspend operation
Tie the polling mechanism into the ibm,suspend-me rtas call to
stop/restart polling before/after a suspend, hibernate, migrate,
or checkpoint restart operation. This ensures that the system has a
chance to disable the polling if the partition is migrated to a system
that does not support VPHN (and vice versa).

Signed-off-by: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:36:30 +11:00
Anton Blanchard
4dfa9c4748 powerpc: iommu: Add device name to iommu error printks
Right now its difficult to see which device is running out of iommu space:

iommu_alloc failed, tbl c00000076e096660 vaddr c000000768806600 npages 1

Use dev_info() so we get the device name and location:

ipr 0000:00:01.0: iommu_alloc failed, tbl c00000076e096660 vaddr c000000768806600 npages 1

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:32 +11:00
Andreas Schwab
bb2c458b8b powerpc: Update compat_arch_ptrace
Update compat_arch_ptrace to follow recent changes in
PTRACE_GET_DEBUGREG and the addition of
PPC_PTRACE_{GETHWDBGINFO|{SET|DEL}HWDEBUG}.  The latter three can be
forwarded to arch_ptrace unchanged.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:32 +11:00
Andreas Schwab
4dfbf290ae powerpc: Fix PPC_PTRACE_SETHWDEBUG on PPC_BOOK3S
Properly set the DABR_TRANSLATION/DABR_DATA_READ/DABR_DATA_READ bits in
the dabr when setting the debug register via PPC_PTRACE_SETHWDEBUG.  Also
don't reject trigger type of PPC_BREAKPOINT_TRIGGER_READ.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:31 +11:00
Heiko Schocher
364a124652 powerpc/time: printk time stamp init not correct
problem:

I see sometimes on my mpc5200 based board such printk timing
information:

[    0.000000] NR_IRQS:512 nr_irqs:512 16
[    0.000000] MPC52xx PIC is up and running!
[    0.000000] clocksource: timebase mult[79364d9] shift[22] registered
[    0.000000] console [ttyPSC0] enabled
[  130.300633] pid_max: default: 32768 minimum: 301
[  130.305647] Mount-cache hash table entries: 512
[  130.315818] NET: Registered protocol family 16

reason:
if the tbu not starts from 0 when linux boots, boot_tb
maybe could not store the real 64 bit tbu value, because
boot_tp is only a 32 bit unsigned long.

solution:
change boot_tb to u64

[BenH: Made it u64 instead of unsigned long long]

Signed-off-by: Heiko Schocher <hs@denx.de>
cc: Wolfgang Denk <wd@denx.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:31 +11:00
Sonny Rao
928a319781 Powerpc: separate CONFIG_RELOCATABLE from CONFIG_CRASHDUMP in boot code
Fix head_64.S so that we can build a relocatable kernel
that isn't necessarily a crash-dump kernel

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Sonny Rao <sonnyrao@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:31 +11:00
Anton Blanchard
8f4da26e9b powerpc: Fix incorrect comment about interrupt stack allocation
We now allow interrupt stacks anywhere in the first segment which can be
256M or 1TB. Fix the comment.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:35:31 +11:00
Nishanth Aravamudan
b3c73856ae powerpc/iommu: Use coherent_dma_mask for alloc_coherent
The IOMMU code has been passing the dma-mask instead of the
coherent_dma_mask to the iommu allocator.  Coherent allocations should
be made using the coherent_dma_mask.

Also update the vio code to ensure the coherent_dma_mask is set. Without
this change drivers, such as ibmvscsi, fail to load with the corrected
dma_iommu_alloc_coherent().

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-12-09 15:17:50 +11:00
Benjamin Herrenschmidt
f4b9841595 Merge branch 'nvram' into next 2010-12-09 14:36:38 +11:00
Jim Keniston
6024ede9ba powerpc/nvram: Handle partition names >= 12 chars
The name field in the nvram_header can be < 12 chars, null-terminated,
or 12 chars without the null.  Handle this safely.

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:43:51 +11:00
Jim Keniston
690d1a9bd1 powerpc/nvram: Fix NVRAM partition list setup
Simplify creation and use of the NVRAM partition list.

Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:43:51 +11:00
Benjamin Herrenschmidt
edc79a2f3e powerpc/nvram: Move the log partition stuff to pseries
The nvram log partition stuff currently in nvram_64.c is really
pseries specific. It isn't actually used on anything else (despite
the fact that we ran the code to setup the partition on anything
except powermac) and the log format is specific to pseries RTAS
implementation. So move it where it belongs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:37:45 +11:00
Benjamin Herrenschmidt
d9626947f2 powerpc/nvram: Change nvram_setup_partition() to use new helper
This changes the function to use nvram_find_partition() instead
of doing the lookup "by hand". It also makes some of the logic
clearer and prints out more useful diagnostic information.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:35:08 +11:00
Benjamin Herrenschmidt
cf5cbf9f80 powerpc/nvram: Add nvram_find_partition()
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:34:05 +11:00
Benjamin Herrenschmidt
fa2b4e54d4 powerpc/nvram: Improve partition removal
Existing code is nasty, has bugs etc... rewrite the function
more simply, and make it take the signature and optional
name of the partitions to remove as arguments, thus making
it a more generic utility.

We also try to remove a log partition that we find and is too
small rather than creating a duplicate.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:34:03 +11:00
Benjamin Herrenschmidt
e49e2e8723 powerpc/nvram: Shuffle code around in nvram_create_partition()
This error log stuff is really pseries specific. As a first step we move
the initialization of these variables to the caller of
nvram_create_partition(), which is also slightly reorganized so we
setup the free partition before we clear the new partition, so the
chance of an error during clear leaving us with invalid headers
is lessened.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:33:58 +11:00
Benjamin Herrenschmidt
cef0d5ad62 powerpc/nvram: Completely clear a new partition
When creating a partition, we clear it entirely rather than
just the first two words since the previous code was rather
specific to the pseries log partition format.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:32:10 +11:00
Benjamin Herrenschmidt
578914cffc powerpc/nvram: Ensure that the partition header/block size is right
Use BUILD_BUG_ON to ensure the structure representing a partition
header have the right size.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:32:08 +11:00
Benjamin Herrenschmidt
36673307ae powerpc/nvram: nvram_create_partitions() now uses bytes
This converts nvram_create_partition() to use a size in bytes
rather than blocks. It does the appropriate alignment internally

The size passed is also the data size (ie. doesn't include the
header anymore).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:32:06 +11:00
Benjamin Herrenschmidt
4e7c77a385 powerpc/nvram: More flexible nvram_create_partition()
Replace nvram_create_os_partition() with a variant that takes
the partition name, signature and size as arguments.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:31:51 +11:00
Benjamin Herrenschmidt
74d51d0298 powerpc/nvram: Move things out of asm/nvram.h
This moves a bunch of definitions out of asm/nvram.h to the files
that use them or just outright remove completely unused stuff.

We leave the partition signatures definitions, they will be useful

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-30 15:09:19 +11:00
Stephen Rothwell
46f5221049 powerpc: Remove second definition of STACK_FRAME_OVERHEAD
Since STACK_FRAME_OVERHEAD is defined in asm/ptrace.h and that
is ASSEMBER safe, we can just include that instead of going via
asm-offsets.h.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:23 +11:00
Michael Neuling
6f08cb3be6 powerpc: Add POWER7+ cputable entry
This adds the POWER7+ cputable entry for the PVR 0x004a0000.  Rest is
the same as vanilla POWER7.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:23 +11:00
Michael Neuling
1d32bb1827 powerpc: Remove POWER6 oprofile workarounds for POWER7
These are not needed on POWER7 so remove them.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:22 +11:00
Michael Neuling
93fe56e99f powerpc: Remove unneeded cpu_setup/restore from POWER7 cputable entry
These are not needed so just remove them

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:22 +11:00
Michael Ellerman
698193d85a powerpc: Consolidate obj-y assignments
No need to have three of them.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:22 +11:00
Nishanth Aravamudan
6d283d782f powerpc/vio: Use dma ops helpers
Use the set_dma_ops helper. Instead of modifying vio_dma_mapping_ops,
just create a trivial wrapper for dma_supported.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:20 +11:00
Vaidyanathan Srinivasan
99d8670525 powerpc: Cleanup APIs for cpu/thread/core mappings
These APIs take logical cpu number as input
Change cpu_first_thread_in_core() to cpu_first_thread_sibling()
Change cpu_last_thread_in_core() to cpu_last_thread_sibling()

These APIs convert core number (index) to logical cpu/thread numbers
Add cpu_first_thread_of_core(int core)
Changed cpu_thread_to_core() to cpu_core_index_of_thread(int cpu)

The goal is to make 'threads_per_core' accessible to the
pseries_energy module.  Instead of making an API to read
threads_per_core, this is a higher level wrapper function to
convert from logical cpu number to core number.

The current APIs cpu_first_thread_in_core() and
cpu_last_thread_in_core() returns logical CPU number while
cpu_thread_to_core() returns core number or index which is
not a logical CPU number.  The new APIs are now clearly named to
distinguish 'core number' versus first and last 'logical cpu
number' in that core.

The new APIs cpu_{first,last}_thread_sibling() work on
logical cpu numbers.  While cpu_first_thread_of_core() and
cpu_core_index_of_thread() work on core index.

Example usage:  (4 threads per core system)

cpu_first_thread_sibling(5) = 4
cpu_last_thread_sibling(5) = 7
cpu_core_index_of_thread(5) = 1
cpu_first_thread_of_core(1) = 4

cpu_core_index_of_thread() is used in cpu_to_drc_index() in the
module and cpu_first_thread_of_core() is used in
drc_index_to_cpu() in the module.

Make API changes to few callers.  Export symbols for use in modules.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:19 +11:00
Anton Blanchard
d72e063bb3 powerpc/kdump: Override crash_free_reserved_phys_range to avoid freeing RTAS
The crashkernel region will almost always overlap RTAS. If we free the
crashkernel region via "echo 0 > /sys/kernel/kexec_crash_size" then we will
free RTAS and the machine will crash in confusing and exciting ways.

Override crash_free_reserved_phys_range and check for overlap with RTAS.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:17 +11:00
Anton Blanchard
64ff312876 powerpc: Add support for popcnt instructions
POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-29 15:48:17 +11:00
Peter Zijlstra
004417a6d4 perf, arch: Cleanup perf-pmu init vs lockup-detector
The perf hardware pmu got initialized at various points in the boot,
some before early_initcall() some after (notably arch_initcall).

The problem is that the NMI lockup detector is ran from early_initcall()
and expects the hardware pmu to be present.

Sanitize this by moving all architecture hardware pmu implementations to
initialize at early_initcall() and move the lockup detector to an explicit
initcall right after that.

Cc: paulus <paulus@samba.org>
Cc: davem <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1290707759.2145.119.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-11-26 15:14:56 +01:00
Linus Torvalds
2d42dc3feb Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb,ppc: Fix regression in evr register handling
  kgdb,x86: fix regression in detach handling
  kdb: fix crash when KDB_BASE_CMD_MAX is exceeded
  kdb: fix memory leak in kdb_main.c
2010-11-18 08:24:58 -08:00
Alessio Igor Bogani
0f6b77ca12 powerpc: Update a BKL related comment
The commit 5e3d20a remove bkl from startup code so setup_arch() it isn't called
with bkl held anymore. Update the comment on top of that function.
Fix also a typo.

This work was supported by a hardware donation from the CE Linux Forum.

Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-11-18 14:54:24 +11:00
Dongdong Deng
e3839ed8e8 kgdb,ppc: Fix regression in evr register handling
Commit ff10b88b5a (kgdb,ppc: Individual
register get/set for ppc) introduced a problem where memcpy was used
incorrectly to read and write the evr registers with a kernel that
has:

CONFIG_FSL_BOOKE=y
CONFIG_SPE=y
CONFIG_KGDB=y

This patch also fixes the following compilation problems:

arch/powerpc/kernel/kgdb.c: In function 'dbg_get_reg':
arch/powerpc/kernel/kgdb.c:341: error: passing argument 2 of 'memcpy' makes pointer from integer without a cast
arch/powerpc/kernel/kgdb.c: In function 'dbg_set_reg':
arch/powerpc/kernel/kgdb.c:366: error: passing argument 1 of 'memcpy' makes pointer from integer without a cast

[jason.wessel@windriver.com: Remove void * casts and fix patch header]
Reported-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
CC: linuxppc-dev@lists.ozlabs.org
2010-11-17 13:54:58 -06:00
Arnd Bergmann
451a3c24b0 BKL: remove extraneous #include <smp_lock.h>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.

Remove this too as a cleanup.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-17 08:59:32 -08:00
Scott Wood
a36be1003a PPC: KVM: Book E doesn't have __end_interrupts.
Fix an unresolved symbol with CONFIG_KVM_GUEST plus CONFIG_RELOCATABLE on
Book E.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2010-11-05 14:42:27 -02:00
David Daney
4b6ba8aacb of/net: Move of_get_mac_address() to a common source file.
There are two identical implementations of of_get_mac_address(), one
each in arch/powerpc/kernel/prom_parse.c and
arch/microblaze/kernel/prom_parse.c.  Move this function to a new
common file of_net.{c,h} and adjust all the callers to include the new
header.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
[grant.likely@secretlab.ca: protect header with #ifdef]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-11-01 01:08:14 -04:00
Linus Torvalds
1e431a9d64 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb,ppc: Individual register get/set for ppc
  kgdbts: prevent re-entry to kgdbts before it unregisters
  debug_core,x86,blackfin: Clean up hw debug disable API
  kdb: Fix early debugging crash regression
  kgdb,arm: fix register dump
  kdb: fix per_cpu command to remove supress mask
  kdb: Add kdb kernel module sample
2010-10-29 11:49:38 -07:00
Dongdong Deng
ff10b88b5a kgdb,ppc: Individual register get/set for ppc
commit 534af1082329392bc29f6badf815e69ae2ae0f4c(kgdb,kdb: individual
register set and and get API) introduce dbg_get_reg/dbg_set_reg API
for individual register get and set.

This patch implement those APIs for ppc.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-29 13:14:42 -05:00
Namhyung Kim
f68d204820 ptrace: cleanup arch_ptrace() on powerpc
Use new 'datavp' and 'datalp' variables in order to remove unnecessary
castings.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:11 -07:00
Namhyung Kim
9b05a69e05 ptrace: change signature of arch_ptrace()
Fix up the arguments to arch_ptrace() to take account of the fact that
@addr and @data are now unsigned long rather than long as of a preceding
patch in this series.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Acked-by: Roland McGrath <roland@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-27 18:03:10 -07:00
Hagen Paul Pfeifer
732eacc054 replace nested max/min macros with {max,min}3 macro
Use the new {max,min}3 macros to save some cycles and bytes on the stack.
This patch substitutes trivial nested macros with their counterpart.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:12 -07:00
Linus Torvalds
51f00a471c Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
  mtd/m25p80: add support to parse the partitions by OF node
  of/irq: of_irq.c needs to include linux/irq.h
  of/mips: Cleanup some include directives/files.
  of/mips: Add device tree support to MIPS
  of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch
  of/device: Rework to use common platform_device_alloc() for allocating devices
  of/xsysace: Fix OF probing on little-endian systems
  of: use __be32 types for big-endian device tree data
  of/irq: remove references to NO_IRQ in drivers/of/platform.c
  of/promtree: add package-to-path support to pdt
  of/promtree: add of_pdt namespace to pdt code
  of/promtree: no longer call prom_ functions directly; use an ops structure
  of/promtree: make drivers/of/pdt.c no longer sparc-only
  sparc: break out some PROM device-tree building code out into drivers/of
  of/sparc: convert various prom_* functions to use phandle
  sparc: stop exporting openprom.h header
  powerpc, of_serial: Endianness issues setting up the serial ports
  of: MTD: Fix OF probing on little-endian systems
  of: GPIO: Fix OF probing on little-endian systems
2010-10-25 08:19:14 -07:00
Linus Torvalds
1765a1fe5d Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
  KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
  KVM: Fix signature of kvm_iommu_map_pages stub
  KVM: MCE: Send SRAR SIGBUS directly
  KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
  KVM: fix typo in copyright notice
  KVM: Disable interrupts around get_kernel_ns()
  KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
  KVM: MMU: move access code parsing to FNAME(walk_addr) function
  KVM: MMU: audit: check whether have unsync sps after root sync
  KVM: MMU: audit: introduce audit_printk to cleanup audit code
  KVM: MMU: audit: unregister audit tracepoints before module unloaded
  KVM: MMU: audit: fix vcpu's spte walking
  KVM: MMU: set access bit for direct mapping
  KVM: MMU: cleanup for error mask set while walk guest page table
  KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
  KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
  KVM: x86: Fix constant type in kvm_get_time_scale
  KVM: VMX: Add AX to list of registers clobbered by guest switch
  KVM guest: Move a printk that's using the clock before it's ready
  KVM: x86: TSC catchup mode
  ...
2010-10-24 12:47:25 -07:00
Alexander Graf
591bd8e7b4 KVM: PPC: Enable napping only for Book3s_64
Before I incorrectly enabled napping also for BookE, which would result in
needless dcache flushes. Since we only need to force enable napping on
Book3s_64 because it doesn't go into MSR_POW otherwise, we can just #ifdef
that code to this particular platform.

Reported-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:19 +02:00
Alexander Graf
ad0873763a KVM: PPC: Force enable nap on KVM
There are some heuristics in the PPC power management code that try to find
out if the particular hardware we're running on supports proper power management
or just hangs the machine when going into nap mode.

Since we know that KVM is safe with nap, let's force enable it in the PV code
once we're certain that we are on a KVM VM.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:15 +02:00
Alexander Graf
df08bd1026 KVM: PPC: Make PV mtmsrd L=1 work with r30 and r31
We had an arbitrary limitation in mtmsrd L=1 that kept us from using r30 and
r31 as input registers. Let's get rid of that and get more potential speedups!

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:14 +02:00
Alexander Graf
512ba59ed9 KVM: PPC: Make PV mtmsr work with r30 and r31
So far we've been restricting ourselves to r0-r29 as registers an mtmsr
instruction could use. This was bad, as there are some code paths in
Linux actually using r30.

So let's instead handle all registers gracefully and get rid of that
stupid limitation

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:13 +02:00
Alexander Graf
cbe487fac7 KVM: PPC: Add mtsrin PV code
This is the guest side of the mtsr acceleration. Using this a guest can now
call mtsrin with almost no overhead as long as it ensures that it only uses
it with (MSR_IR|MSR_DR) == 0. Linux does that, so we're good.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:12 +02:00
Alexander Graf
7508e16c9f KVM: PPC: Add feature bitmap for magic page
We will soon add SR PV support to the shared page, so we need some
infrastructure that allows the guest to query for features KVM exports.

This patch adds a second return value to the magic mapping that
indicated to the guest which features are available.

Signed-off-by: Alexander Graf <agraf@suse.de>
2010-10-24 10:52:09 +02:00
Alexander Graf
989044ee0f KVM: PPC: Fix CONFIG_KVM_GUEST && !CONFIG_KVM case
When CONFIG_KVM_GUEST is selected, but CONFIG_KVM is not, we were missing
some defines in asm-offsets.c and included too many headers at other places.

This patch makes above configuration work.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:51:44 +02:00
Alexander Graf
a58ddea556 KVM: PPC: Move KVM trampolines before __end_interrupts
When using a relocatable kernel we need to make sure that the trampline code
and the interrupt handlers are both copied to low memory. The only way to do
this reliably is to put them in the copied section.

This patch should make relocated kernels work with KVM.

KVM-Stable-Tag
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:59 +02:00
Alexander Graf
644bfa013f KVM: PPC: PV wrteei
On BookE the preferred way to write the EE bit is the wrteei instruction. It
already encodes the EE bit in the instruction.

So in order to get BookE some speedups as well, let's also PV'nize thati
instruction.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:57 +02:00
Alexander Graf
7810927760 KVM: PPC: PV mtmsrd L=0 and mtmsr
There is also a form of mtmsr where all bits need to be addressed. While the
PPC64 Linux kernel behaves resonably well here, on PPC32 we do not have an
L=1 form. It does mtmsr even for simple things like only changing EE.

So we need to hook into that one as well and check for a mask of bits that we
deem safe to change from within guest context.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:56 +02:00
Alexander Graf
819a63dc79 KVM: PPC: PV mtmsrd L=1
The PowerPC ISA has a special instruction for mtmsr that only changes the EE
and RI bits, namely the L=1 form.

Since that one is reasonably often occuring and simple to implement, let's
go with this first. Writing EE=0 is always just a store. Doing EE=1 also
requires us to check for pending interrupts and if necessary exit back to the
hypervisor.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:56 +02:00
Alexander Graf
92234722ed KVM: PPC: PV assembler helpers
When we hook an instruction we need to make sure we don't clobber any of
the registers at that point. So we write them out to scratch space in the
magic page. To make sure we don't fall into a race with another piece of
hooked code, we need to disable interrupts.

To make the later patches and code in general easier readable, let's introduce
a set of defines that save and restore r30, r31 and cr. Let's also define some
helpers to read the lower 32 bits of a 64 bit field on 32 bit systems.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:55 +02:00
Alexander Graf
71ee8e34fe KVM: PPC: Introduce branch patching helper
We will need to patch several instruction streams over to a different
code path, so we need a way to patch a single instruction with a branch
somewhere else.

This patch adds a helper to facilitate this patching.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:54 +02:00
Alexander Graf
2d4f567103 KVM: PPC: Introduce kvm_tmp framework
We will soon require more sophisticated methods to replace single instructions
with multiple instructions. We do that by branching to a memory region where we
write replacement code for the instruction to.

This region needs to be within 32 MB of the patched instruction though, because
that's the furthest we can jump with immediate branches.

So we keep 1MB of free space around in bss. After we're done initing we can just
tell the mm system that the unused pages are free, but until then we have enough
space to fit all our code in.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:54 +02:00
Alexander Graf
d1290b15e7 KVM: PPC: PV tlbsync to nop
With our current MMU scheme we don't need to know about the tlbsync instruction.
So we can just nop it out.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:53 +02:00
Alexander Graf
d1293c9275 KVM: PPC: PV instructions to loads and stores
Some instructions can simply be replaced by load and store instructions to
or from the magic page.

This patch replaces often called instructions that fall into the above category.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:52 +02:00
Alexander Graf
73a1810982 KVM: PPC: KVM PV guest stubs
We will soon start and replace instructions from the text section with
other, paravirtualized versions. To ease the readability of those patches
I split out the generic looping and magic page mapping code out.

This patch still only contains stubs. But at least it loops through the
text section :).

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:51 +02:00
Alexander Graf
d17051cb8d KVM: PPC: Generic KVM PV guest support
We have all the hypervisor pieces in place now, but the guest parts are still
missing.

This patch implements basic awareness of KVM when running Linux as guest. It
doesn't do anything with it yet though.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:50 +02:00
Alexander Graf
2a342ed577 KVM: PPC: Implement hypervisor interface
To communicate with KVM directly we need to plumb some sort of interface
between the guest and KVM. Usually those interfaces use hypercalls.

This hypercall implementation is described in the last patch of the series
in a special documentation file. Please read that for further information.

This patch implements stubs to handle KVM PPC hypercalls on the host and
guest side alike.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:45 +02:00
Alexander Graf
666e7252a1 KVM: PPC: Convert MSR to shared page
One of the most obvious registers to share with the guest directly is the
MSR. The MSR contains the "interrupts enabled" flag which the guest has to
toggle in critical sections.

So in order to bring the overhead of interrupt en- and disabling down, let's
put msr into the shared page. Keep in mind that even though you can fully read
its contents, writing to it doesn't always update all state. There are a few
safe fields that don't require hypervisor interaction. See the documentation
for a list of MSR bits that are safe to be set from inside the guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:43 +02:00
Alexander Graf
96bc451a15 KVM: PPC: Introduce shared page
For transparent variable sharing between the hypervisor and guest, I introduce
a shared page. This shared page will contain all the registers the guest can
read and write safely without exiting guest context.

This patch only implements the stubs required for the basic structure of the
shared page. The actual register moving follows.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-10-24 10:50:42 +02:00
Linus Torvalds
092e0e7e52 Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
  vfs: make no_llseek the default
  vfs: don't use BKL in default_llseek
  llseek: automatically add .llseek fop
  libfs: use generic_file_llseek for simple_attr
  mac80211: disallow seeks in minstrel debug code
  lirc: make chardev nonseekable
  viotape: use noop_llseek
  raw: use explicit llseek file operations
  ibmasmfs: use generic_file_llseek
  spufs: use llseek in all file operations
  arm/omap: use generic_file_llseek in iommu_debug
  lkdtm: use generic_file_llseek in debugfs
  net/wireless: use generic_file_llseek in debugfs
  drm: use noop_llseek
2010-10-22 10:52:56 -07:00
Linus Torvalds
d4429f608a Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (71 commits)
  powerpc/44x: Update ppc44x_defconfig
  powerpc/watchdog: Make default timeout for Book-E watchdog a Kconfig option
  fsl_rio: Add comments for sRIO registers.
  powerpc/fsl-booke: Add e55xx (64-bit) smp defconfig
  powerpc/fsl-booke: Add p5020 DS board support
  powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips
  powerpc/fsl-booke: Add support for FSL Arch v1.0 MMU in setup_page_sizes
  powerpc/fsl-booke: Add support for FSL 64-bit e5500 core
  powerpc/85xx: add cache-sram support
  powerpc/85xx: add ngPIXIS FPGA device tree node to the P1022DS board
  powerpc: Fix compile error with paca code on ppc64e
  powerpc/fsl-booke: Add p3041 DS board support
  oprofile/fsl emb: Don't set MSR[PMM] until after clearing the interrupt.
  powerpc/fsl-booke: Add PCI device ids for P2040/P3041/P5010/P5020 QoirQ chips
  powerpc/mpc8xxx_gpio: Add support for 'qoriq-gpio' controllers
  powerpc/fsl_booke: Add support to boot from core other than 0
  powerpc/p1022: Add probing for individual DMA channels
  powerpc/fsl_soc: Search all global-utilities nodes for rstccr
  powerpc: Fix invalid page flags in create TLB CAM path for PTE_64BIT
  powerpc/mpc83xx: Support for MPC8308 P1M board
  ...

Fix up conflict with the generic irq_work changes in arch/powerpc/kernel/time.c
2010-10-21 21:19:54 -07:00
Linus Torvalds
3044100e58 Merge branch 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
  x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S
  xen: Cope with unmapped pages when initializing kernel pagetable
  memblock, bootmem: Round pfn properly for memory and reserved regions
  memblock: Annotate memblock functions with __init_memblock
  memblock: Allow memblock_init to be called early
  memblock/arm: Fix memblock_region_is_memory() typo
  x86, memblock: Remove __memblock_x86_find_in_range_size()
  memblock: Fix wraparound in find_region()
  x86-32, memblock: Make add_highpages honor early reserved ranges
  x86, memblock: Fix crashkernel allocation
  arm, memblock: Fix the sparsemem build
  memblock: Fix section mismatch warnings
  powerpc, memblock: Fix memblock API change fallout
  memblock, microblaze: Fix memblock API change fallout
  x86: Remove old bootmem code
  x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve
  x86: Remove not used early_res code
  x86, memblock: Replace e820_/_early string with memblock_
  x86: Use memblock to replace early_res
  x86, memblock: Use memblock_debug to control debug message print out
  ...

Fix up trivial conflicts in arch/x86/kernel/setup.c and kernel/Makefile
2010-10-21 18:52:11 -07:00
Linus Torvalds
e36f561a2c Merge git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-irqflags
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-irqflags:
  Fix IRQ flag handling naming
  MIPS: Add missing #inclusions of <linux/irq.h>
  smc91x: Add missing #inclusion of <linux/irq.h>
  Drop a couple of unnecessary asm/system.h inclusions
  SH: Add missing consts to sys_execve() declaration
  Blackfin: Rename IRQ flags handling functions
  Blackfin: Add missing dep to asm/irqflags.h
  Blackfin: Rename DES PC2() symbol to avoid collision
  Blackfin: Split the BF532 BFIN_*_FIO_FLAG() functions to their own header
  Blackfin: Split PLL code from mach-specific cdef headers
2010-10-21 14:37:27 -07:00
Grant Likely
32c97689c4 of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch
This patch refactors the early init parsing of the chosen node so that
architectures aren't forced to provide an empty implementation of
early_init_dt_scan_chosen_arch.  Instead, if an architecture wants to
do something different, it can either use a wrapper function around
early_init_dt_scan_chosen(), or it can replace it altogether.

This patch was written in preparation to adding device tree support to
both x86 ad MIPS.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: David Daney <ddaney@caviumnetworks.com>
2010-10-21 11:10:10 -06:00
Grant Likely
7096d04221 of/device: Rework to use common platform_device_alloc() for allocating devices
The current code allocates and manages platform_devices created from
the device tree manually.  It also uses an unsafe shortcut for
allocating the platform_device and the resource table at the same
time. (which I added in the last rework; sorry).

This patch refactors the code to use platform_device_alloc() for
allocating new devices.  This reduces the amount of custom code
implemented by of_platform, eliminates the unsafe alloc trick, and has
the side benefit of letting the platform_bus code manage freeing the
device data and resources when the device is freed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michal Simek <monstr@monstr.eu>
2010-10-21 11:10:10 -06:00
Paul Mackerras
57fa721433 perf, powerpc: Fix power_pmu_event_init to not use event->ctx
Commit c3f00c70 ("perf: Separate find_get_context() from event
initialization") changed the generic perf_event code to call
perf_event_alloc, which calls the arch-specific event_init code,
before looking up the context for the new event.  Unfortunately,
power_pmu_event_init uses event->ctx->task to see whether the
new event is a per-task event or a system-wide event, and thus
crashes since event->ctx is NULL at the point where
power_pmu_event_init gets called.

(The reason it needs to know whether it is a per-task event is
because there are some hardware events on Power systems which
only count when the processor is not idle, and there are some
fixed-function counters which count such events.  For example,
the "run cycles" event counts cycles when the processor is not
idle.  If the user asks to count cycles, we can use "run cycles"
if this is a per-task event, since the processor is running when
the task is running, by definition.  We can't use "run cycles"
if the user asks for "cycles" on a system-wide counter.)

Fortunately the information we need is in the
event->attach_state field, so we just use that instead.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101019055535.GA10398@drongo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-19 09:18:34 +02:00
Peter Zijlstra
e360adbe29 irq_work: Add generic hardirq context callbacks
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-18 19:58:50 +02:00
Arnd Bergmann
6038f373a3 llseek: automatically add .llseek fop
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+>
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+>
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15 15:53:27 +02:00
Benjamin Herrenschmidt
6a1c9dfe41 Merge remote branch 'jwb/next' into next 2010-10-15 10:45:03 +11:00
Kumar Gala
55fd766b5f powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips
On Freescale parts typically have TLB array for large mappings that we can
bolt the linear mapping into.  We utilize the code that already exists
on PPC32 on the 64-bit side to setup the linear mapping to be cover by
bolted TLB entries.  We utilize a quarter of the variable size TLB array
for this purpose.

Additionally, we limit the amount of memory to what we can cover via
bolted entries so we don't get secondary faults in the TLB miss
handlers.  We should fix this limitation in the future.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:55:14 -05:00
Kumar Gala
4490c06b58 powerpc/fsl-booke: Add support for FSL 64-bit e5500 core
The new e5500 core is similar to the e500mc core but adds 64-bit
support.  We support running it in 32-bit mode as it is identical to the
e500mc.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:55:03 -05:00
Kumar Gala
3c4b76449b powerpc: Fix compile error with paca code on ppc64e
arch/powerpc/kernel/paca.c: In function 'allocate_lppacas':
arch/powerpc/kernel/paca.c:111:1: error: parameter name omitted
arch/powerpc/kernel/paca.c:111:1: error: parameter name omitted

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:53:08 -05:00
Matthew McClintock
2ed38b2359 powerpc/fsl_booke: Add support to boot from core other than 0
First we check to see if we are the first core booting up. This
is accomplished by comparing the boot_cpuid with -1, if it is we
assume this is the first core coming up.

Secondly, we need to update the initial thread info structure
to reflect the actual cpu we are running on otherwise
smp_processor_id() and related functions will return the default
initialization value of the struct or 0.

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:52:58 -05:00
Matthew McClintock
c71635d288 powerpc/kexec: make masking/disabling interrupts generic
Right now just the kexec crash pathway turns turns off the interrupts.
Pull that out and make a generic version for use elsewhere

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:52:46 -05:00
Timur Tabi
55ec2fca3e powerpc: export ppc_proc_freq and ppc_tb_freq as GPL symbols
Export the global variable 'ppc_tb_freq', so that modules (like the Book-E
watchdog driver) can use it.  To maintain consistency, ppc_proc_freq is
changed to a GPL-only export.  This is okay, because any module that needs
this symbol should be an actual Linux driver, which must be GPL-licensed.

Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-10-14 00:52:43 -05:00
Tirumala Marri
6edc323db7 powerpc/44x: Add support for the AMCC APM821xx SoC
This patch adds CPU, device tree, defconfig and bluestone board
support for APM821xx SoC.

Signed-off-by: Tirumala R Marri <tmarri@apm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-10-13 08:47:09 -04:00
matt mooney
4108d9ba90 powerpc/Makefiles: Change to new flag variables
Replace EXTRA_CFLAGS with ccflags-y and EXTRA_AFLAGS with asflags-y.

Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:22 +11:00
Nishanth Aravamudan
bc0df9ec4c powerpc/pci: Cleanup device dma setup code
Use set_dma_ops and remove unused oddly-named temp pointer sd.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:22 +11:00
Nishanth Aravamudan
45848e0fc1 powerpc/viobus: Free TCE table on device release
Release the TCE table as the XXX suggests, except on FW_FEATURE_ISERIES,
where the tables are allocated globally and reused.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Nishanth Aravamudan
edea8f6f48 powerpc/vio: Use put_device() on device_register failure
The kernel doc for device_register (and device_initialize) very clearly
state to call put_device not kfree after calling, even on error.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Nishanth Aravamudan
ffa56e555a powerpc/dma: Fix check for direct DMA support
The current check is wrong because it does not take the DMA offset intot
account, and in the case of a driver which doesn't actually support
64bits would falsely report that device as working.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Nishanth Aravamudan
1cb8e85a9d powerpc/dma: Fix dma_iommu_dma_supported compare
The table offset is in entries, each of which imply a dma address of
an IOMMU page.

Also, we should check the device can reach the whole IOMMU table.

Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:21 +11:00
Julia Lawall
a655237fa2 powerpc/irq.c: Add of_node_put to avoid memory leak
In this case, a device_node structure is stored in another structure that
is then freed without first decrementing the reference count of the
device_node structure.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
expression x;
identifier f;
position p1,p2;
@@

x@p1->f = \(of_find_node_by_path\|of_find_node_by_name\|of_find_node_by_phandle\|of_get_parent\|of_get_next_parent\|of_get_next_child\|of_find_compatible_node\|of_match_node\|of_find_node_by_type\|of_find_node_with_property\|of_find_matching_node\|of_parse_phandle\|of_node_get\)(...);
... when != of_node_put(x)
kfree@p2(x)

@script:python@
p1 << r.p1;
p2 << r.p2;
@@
cocci.print_main("call",p1)
cocci.print_secs("free",p2)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:04 +11:00
Joe Perches
4e74fd7d0a powerpc: Use static const char arrays
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:03 +11:00
Nathan Fontenot
d8862be122 powerpc/pseries: Export rtas_ibm_suspend_me()
Export the rtas_ibm_suspend_me() routine.  This is needed to perform
partition migration in the kernel.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13 16:19:02 +11:00
Benjamin Herrenschmidt
4783f393de Merge remote branch 'kumar/merge' into next 2010-10-13 16:18:36 +11:00
Ingo Molnar
7cd2541cf2 Merge commit 'v2.6.36-rc7' into perf/core
Conflicts:
	arch/x86/kernel/module.c

Merge reason: Resolve the conflict, pick up fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08 10:46:27 +02:00
Ingo Molnar
153db80f8c Merge commit 'v2.6.36-rc7' into core/memblock
Merge reason: Update from -rc3 to -rc7.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08 09:15:00 +02:00
Ian Munsie
f14362d1fe powerpc, of_serial: Endianness issues setting up the serial ports
The speed and clock of the serial ports is retrieved from the device
tree in both the PowerPC legacy serial code and the Open Firmware serial
driver, therefore they need to handle the fact that the device tree is
always big endian, while the CPU may not be.

Also fix other device tree references in the legacy serial code.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-10-07 17:21:15 -06:00
David Howells
df9ee29270 Fix IRQ flag handling naming
Fix the IRQ flag handling naming.  In linux/irqflags.h under one configuration,
it maps:

	local_irq_enable() -> raw_local_irq_enable()
	local_irq_disable() -> raw_local_irq_disable()
	local_irq_save() -> raw_local_irq_save()
	...

and under the other configuration, it maps:

	raw_local_irq_enable() -> local_irq_enable()
	raw_local_irq_disable() -> local_irq_disable()
	raw_local_irq_save() -> local_irq_save()
	...

This is quite confusing.  There should be one set of names expected of the
arch, and this should be wrapped to give another set of names that are expected
by users of this facility.

Change this to have the arch provide:

	flags = arch_local_save_flags()
	flags = arch_local_irq_save()
	arch_local_irq_restore(flags)
	arch_local_irq_disable()
	arch_local_irq_enable()
	arch_irqs_disabled_flags(flags)
	arch_irqs_disabled()
	arch_safe_halt()

Then linux/irqflags.h wraps these to provide:

	raw_local_save_flags(flags)
	raw_local_irq_save(flags)
	raw_local_irq_restore(flags)
	raw_local_irq_disable()
	raw_local_irq_enable()
	raw_irqs_disabled_flags(flags)
	raw_irqs_disabled()
	raw_safe_halt()

with type checking on the flags 'arguments', and then wraps those to provide:

	local_save_flags(flags)
	local_irq_save(flags)
	local_irq_restore(flags)
	local_irq_disable()
	local_irq_enable()
	irqs_disabled_flags(flags)
	irqs_disabled()
	safe_halt()

with tracing included if enabled.

The arch functions can now all be inline functions rather than some of them
having to be macros.

Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300]
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile]
Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze]
Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR]
Acked-by: Tony Luck <tony.luck@intel.com> [IA-64]
Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R]
Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU]
Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS]
Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC]
Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC]
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390]
Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score]
Acked-by: Matt Fleming <matt@console-pimps.org> [SH]
Acked-by: David S. Miller <davem@davemloft.net> [Sparc]
Acked-by: Chris Zankel <chris@zankel.net> [Xtensa]
Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha]
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300]
Cc: starvik@axis.com [CRIS]
Cc: jesper.nilsson@axis.com [CRIS]
Cc: linux-cris-kernel@axis.com
2010-10-07 14:08:55 +01:00
Stephen Rothwell
7c6d45e665 powerpc: remove unused variable
Since powerpc uses -Werror on arch powerpc, the build was broken like
this:

  cc1: warnings being treated as errors
  arch/powerpc/kernel/module.c: In function 'module_finalize':
  arch/powerpc/kernel/module.c:66: error: unused variable 'err'

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-05 17:27:54 -07:00
Linus Torvalds
5336377d62 modules: Fix module_bug_list list corruption race
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.

However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling.  That code was
doubly obscure because (a) the code itself lives in lib/bugs.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.

Calling it from arch-specific code makes no sense what-so-ever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.

So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.

Future fixups:
 - move the module list handling code into kernel/module.c where it
   belongs.
 - get rid of 'module_bug_list' and just use the regular list of modules
   (called 'modules' - imagine that) that we already create and maintain
   for other reasons.

Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-05 11:29:27 -07:00
Paul Mackerras
9f5f9ffe50 powerpc/perf: Fix sampling enable for PPC970
The logic to distinguish marked instruction events from ordinary events
on PPC970 and derivatives was flawed.  The result is that instruction
sampling didn't get enabled in the PMU for some marked instruction
events, so they would never trigger.  This fixes it by adding the
appropriate break statements in the switch statement.

Reported-by: David Binderman <dcb314@hotmail.com>
Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-23 17:03:56 +10:00
Ingo Molnar
d0303d71c2 Merge branch 'linus' into perf/core
Conflicts:
	arch/sparc/kernel/perf_event.c

Merge reason: Resolve the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-23 08:02:09 +02:00
Al Viro
9a81c16b52 powerpc: fix double syscall restarts
Make sigreturn zero regs->trap, make do_signal() do the same on all
paths.  As it is, signal interrupting e.g. read() from fd 512 (==
ERESTARTSYS) with another signal getting unblocked when the first
handler finishes will lead to restart one insn earlier than it ought
to.  Same for multiple signals with in-kernel handlers interrupting
that sucker at the same time.  Same for multiple signals of any kind
interrupting that sucker on 64bit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-22 09:33:50 -07:00
Ingo Molnar
3aabae7d9d Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2010-09-15 10:27:31 +02:00
Peter Zijlstra
a4eaf7f146 perf: Rework the PMU methods
Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.

The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.

This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).

It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).

The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:

 1) We disable the counter:
    a) the pmu has per-counter enable bits, we flip that
    b) we program a NOP event, preserving the counter state

 2) We store the counter state and ignore all read/overflow events

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:30 +02:00
Peter Zijlstra
33696fc0d1 perf: Per PMU disable
Changes perf_disable() into perf_pmu_disable().

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:29 +02:00
Peter Zijlstra
24cd7f54a0 perf: Reduce perf_disable() usage
Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:29 +02:00
Peter Zijlstra
b0a873ebbf perf: Register PMU implementations
Simple registration interface for struct pmu, this provides the
infrastructure for removing all the weak functions.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:28 +02:00
Peter Zijlstra
51b0fe3954 perf: Deconstify struct pmu
sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus <paulus@samba.org>
Cc: stephane eranian <eranian@googlemail.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Yanmin <yanmin_zhang@linux.intel.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Michael Cree <mcree@orcon.net.nz>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:46:27 +02:00
Benjamin Herrenschmidt
5b6e9ff6de powerpc/dma: Add optional platform override of dma_set_mask()
Some platforms may want to override dma_set_mask() to take into
account some specific "features" such as the availability of
a direct-map window in addition to an iommu.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Denis Kirjanov
cab175f9fa powerpc: Use is_32bit_task() helper to test 32-bit binary
This patch removes all explicit tests for the TIF_32BIT flag

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Andreas Schwab
05d77ac90c powerpc: Remove fpscr use from [kvm_]cvt_{fd,df}
Neither lfs nor stfs touch the fpscr, so remove the restore/save of it
around them.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Paul Mackerras
872e439a45 powerpc/pseries: Re-enable dispatch trace log userspace interface
Since the cpu accounting code uses the hypervisor dispatch trace log
now when CONFIG_VIRT_CPU_ACCOUNTING = y, the previous commit disabled
access to it via files in the /sys/kernel/debug/powerpc/dtl/ directory
in that case.  This restores those files.

To do this, we now have a hook that the cpu accounting code will call
as it processes each entry from the hypervisor dispatch trace log.
The code in dtl.c now uses that to fill up its ring buffer, rather
than having the hypervisor fill the ring buffer directly.

This also fixes dtl_file_read() to handle overflow conditions a bit
better and adds a spinlock to ensure that race conditions (multiple
processes opening or reading the file concurrently) are handled
correctly.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:32 +10:00
Paul Mackerras
cf9efce0ce powerpc: Account time using timebase rather than PURR
Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the
PURR register for measuring the user and system time used by
processes, as well as other related times such as hardirq and
softirq times.  This turns out to be quite confusing for users
because it means that a program will often be measured as taking
less time when run on a multi-threaded processor (SMT2 or SMT4 mode)
than it does when run on a single-threaded processor (ST mode), even
though the program takes longer to finish.  The discrepancy is
accounted for as stolen time, which is also confusing, particularly
when there are no other partitions running.

This changes the accounting to use the timebase instead, meaning that
the reported user and system times are the actual number of real-time
seconds that the program was executing on the processor thread,
regardless of which SMT mode the processor is in.  Thus a program will
generally show greater user and system times when run on a
multi-threaded processor than on a single-threaded processor.

On pSeries systems on POWER5 or later processors, we measure the
stolen time (time when this partition wasn't running) using the
hypervisor dispatch trace log.  We check for new entries in the
log on every entry from user mode and on every transition from
kernel process context to soft or hard IRQ context (i.e. when
account_system_vtime() gets called).  So that we can correctly
distinguish time stolen from user time and time stolen from system
time, without having to check the log on every exit to user mode,
we store separate timestamps for exit to user mode and entry from
user mode.

On systems that have a SPURR (POWER6 and POWER7), we read the SPURR
in account_system_vtime() (as before), and then apportion the SPURR
ticks since the last time we read it between scaled user time and
scaled system time according to the relative proportions of user
time and system time over the same interval.  This avoids having to
read the SPURR on every kernel entry and exit.  On systems that have
PURR but not SPURR (i.e., POWER5), we do the same using the PURR
rather than the SPURR.

This disables the DTL user interface in /sys/debug/kernel/powerpc/dtl
for now since it conflicts with the use of the dispatch trace log
by the time accounting code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Paul Mackerras
93c22703ef powerpc: Dynamically allocate most lppaca structs
This arranges for the lppaca structs for most cpus to be dynamically
allocated in the same manner as the paca structs.  If we don't include
support for legacy iSeries, only the first lppaca is statically
allocated; the rest are dynamically allocated.  If we include legacy
iSeries support, then we statically allocate the first 64 lppaca
structs, since the iSeries hypervisor requires that the lppaca
structs be present in the data section of the kernel image, but
legacy iSeries supports at most 64 cpus.

With CONFIG_NR_CPUS, the kernel image size for a typical pSeries config
went from:

   text    data     bss     dec     hex filename
9524478 4734564 8469944 22728986        15ad11a ../test-1024/vmlinux

to:

   text    data     bss     dec     hex filename
9524482 3751508 8469944 21745934        14bd10e ../test-1024/vmlinux

a reduction of 983052 bytes overall.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Paul Mackerras
8154c5d22d powerpc: Abstract indexing of lppaca structs
Currently we have the lppaca structs as a simple array of NR_CPUS
entries, taking up space in the data section of the kernel image.
In future we would like to allocate them dynamically, so this
abstracts out the accesses to the array, making it easier to
change how we locate the lppaca for a given cpu in future.
Specifically, lppaca[cpu] changes to lppaca_of(cpu).

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Michael Neuling
e1f0ece113 powerpc: Move arch_sd_sibling_asym_packing() to smp.c
Simple cleanup by moving arch_sd_sibling_asym_packing from process.c to
smp.c to save an #ifdef CONFIG_SMP

No functionality change.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:31 +10:00
Anton Blanchard
f89451fbd2 powerpc: Feature nop out reservation clear when stcx checks address
The POWER architecture does not require stcx to check that it is operating
on the same address as the larx. This means it is possible for an
an exception handler to execute a larx, get a reservation, decide
not to do the stcx and then return back with an active reservation. If the
interrupted code was in the middle of a larx/stcx sequence the stcx could
incorrectly succeed.

All recent POWER CPUs check the address before letting the stcx succeed
so we can create a CPU feature and nop it out. As Ben suggested, we can
only do this in our syscall path because there is a remote possibility
some kernel code gets interrupted by an exception that ends up operating
on the same cacheline.

Thanks to Paul Mackerras and Derek Williams for the idea.

To test this I used a very simple null syscall (actually getppid) testcase
at http://ozlabs.org/~anton/junkcode/null_syscall.c

I tested against 2.6.35-git10 with the following changes against the
pseries_defconfig:

CONFIG_VIRT_CPU_ACCOUNTING=n
CONFIG_AUDIT=n
CONFIG_PPC_4K_PAGES=n
CONFIG_PPC_64K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=9
CONFIG_PPC_SUBPAGE_PROT=n
CONFIG_FUNCTION_TRACER=n
CONFIG_FUNCTION_GRAPH_TRACER=n
CONFIG_IRQSOFF_TRACER=n
CONFIG_STACK_TRACER=n

to remove the overhead of virtual CPU accounting, syscall auditing and
the ftrace mcount tracers. 64kB pages were enabled to minimise TLB misses.

POWER6: +8.2%
POWER7: +7.0%

Another suggestion was to use a larx to something in the L1 instead of a stcx.
This was almost as fast as removing the larx on POWER6, but only 3.5% faster
on POWER7. We can use this to speed up the reservation clear in our
exception exit code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:30 +10:00
Ingo Molnar
daab7fc734 Merge commit 'v2.6.36-rc3' into x86/memblock
Conflicts:
	arch/x86/kernel/trampoline.c
	mm/memblock.c

Merge reason: Resolve the conflicts, update to latest upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-31 09:45:46 +02:00
Michael Neuling
54a8340433 powerpc: Don't use kernel stack with translation off
In f761622e59 we changed
early_setup_secondary so it's called using the proper kernel stack
rather than the emergency one.

Unfortunately, this stack pointer can't be used when translation is off
on PHYP as this stack pointer might be outside the RMO.  This results in
the following on all non zero cpus:
  cpu 0x1: Vector: 300 (Data Access) at [c00000001639fd10]
      pc: 000000000001c50c
      lr: 000000000000821c
      sp: c00000001639ff90
     msr: 8000000000001000
     dar: c00000001639ffa0
   dsisr: 42000000
    current = 0xc000000016393540
    paca    = 0xc000000006e00200
      pid   = 0, comm = swapper

The original patch was only tested on bare metal system, so it never
caught this problem.

This changes __secondary_start so that we calculate the new stack
pointer but only start using it after we've called early_setup_secondary.

With this patch, the above problem goes away.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31 11:35:13 +10:00
Paul Mackerras
b0d278b7d3 powerpc/perf_event: Reduce latency of calling perf_event_do_pending
Commit 0fe1ac48 ("powerpc/perf_event: Fix oops due to
perf_event_do_pending call") moved the call to perf_event_do_pending
in timer_interrupt() down so that it was after the irq_enter() call.
Unfortunately this moved it after the code that checks whether it
is time for the next decrementer clock event.  The result is that
the call to perf_event_do_pending() won't happen until the next
decrementer clock event is due.  This was pointed out by Milton
Miller.

This fixes it by moving the check for whether it's time for the
next decrementer clock event down to the point where we're about
to call the event handler, after we've called perf_event_do_pending.

This has the side effect that on old pre-Core99 Powermacs where we
use the ppc_n_lost_interrupts mechanism to replay interrupts, a
replayed interrupt will incur a little more latency since it will
now do the code from the irq_enter down to the irq_exit, that it
used to skip.  However, these machines are now old and rare enough
that this doesn't matter.  To make it clear that ppc_n_lost_interrupts
is only used on Powermacs, and to speed up the code slightly on
non-Powermac ppc32 machines, the code that tests ppc_n_lost_interrupts
is now conditional on CONFIG_PMAC as well as CONFIG_PPC32.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31 11:35:13 +10:00
Matthew McClintock
4562c986f0 powerpc/kexec: Adds correct calling convention for kexec purgatory
Call kexec purgatory code correctly. We were getting lucky before.
If you examine the powerpc 32bit kexec "purgatory" code you will
see it expects the following:

>From kexec-tools: purgatory/arch/ppc/v2wrap_32.S
-> calling convention:
->   r3 = physical number of this cpu (all cpus)
->   r4 = address of this chunk (master only)

As such, we need to set r3 to the current core, r4 happens to be
unused by purgatory at the moment but we go ahead and set it
here as well

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31 11:35:12 +10:00
Ingo Molnar
7de5d895b2 Merge branch 'linus' into perf/core
Merge reason: pick up perf fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-25 13:10:00 +02:00
Andreas Schwab
bcc30d3758 powerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscalls
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:28:28 +10:00
Grant Likely
76ec01dbb7 powerpc/pci: Fix checking for child bridges in PCI code.
pci_device_to_OF_node() can return null, and list_for_each_entry will
never enter the loop when dev is NULL, so it looks like this test is
a typo.

Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:28:27 +10:00
Matt Evans
f761622e59 powerpc: Initialise paca->kstack before early_setup_secondary
As early setup calls down to slb_initialize(), we must have kstack
initialised before checking "should we add a bolted SLB entry for our kstack?"

Failing to do so means stack access requires an SLB miss exception to refill
an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text
& static data).  It's not always allowable to take such a miss, and
intermittent crashes will result.

Primary CPUs don't have this issue; an SLB entry is not bolted for their
stack anyway (as that lives within SLB(0)).  This patch therefore only
affects the init of secondaries.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:31 +10:00
Anton Blanchard
7aa241fdce powerpc: Fix bogus it_blocksize in VIO iommu code
When looking at some issues with the virtual ethernet driver I noticed
that TCE allocation was following a very strange pattern:

address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048

Huge unexplained gaps in what should be an empty TCE table. It turns out
it_blocksize, the amount we want to align the next allocation to, was
c0000000fe903b20. Completely bogus.

Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc
everywhere to protect against this when we next add a non compulsary
field to iommu code and forget to initialise it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:31 +10:00
Anton Blanchard
4138d65333 powerpc: Inline ppc64_runlatch_off
I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it
into the callers. To avoid a mess of circular includes I didn't add
it as an inline function.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:30 +10:00
Nathan Fontenot
954e6da54b powerpc: Correct smt_enabled=X boot option for > 2 threads per core
The 'smt_enabled=X' boot option does not handle values of X > 2.
For Power 7 processors with smt modes of 0,1,2,3, and 4 this does
not work.  This patch allows the smt_enabled option to be set to
any value limited to a max equal to the number of threads per
core.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:30 +10:00
Signed-off-by: Darren Hart
6685a47749 powerpc: Silence __cpu_up() under normal operation
During CPU offline/online tests __cpu_up would flood the logs with
the following message:

Processor 0 found.

This provides no useful information to the user as there is no context
provided, and since the operation was a success (to this point) it is expected
that the CPU will come back online, providing all the feedback necessary.

Change the "Processor found" message to DBG() similar to other such messages in
the same function. Also, add an appropriate log level for the "Processor is
stuck" message.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:29 +10:00
Signed-off-by: Darren Hart
a7c2bb8279 powerpc: Re-enable preemption before cpu_die()
start_secondary() is called shortly after _start and also via

cpu_idle()->cpu_die()->pseries_mach_cpu_die()

start_secondary() expects a preempt_count() of 0. pseries_mach_cpu_die() is
called via the cpu_idle() routine with preemption disabled, resulting in the
following repeating message during rapid cpu offline/online tests
with CONFIG_PREEMPT=y:

BUG: scheduling while atomic: swapper/0/0x00000002
Modules linked in: autofs4 binfmt_misc dm_mirror dm_region_hash dm_log [last unloaded: scsi_wait_scan]
Call Trace:
[c00000010e7079c0] [c0000000000133ec] .show_stack+0xd8/0x218 (unreliable)
[c00000010e707aa0] [c0000000006a47f0] .dump_stack+0x28/0x3c
[c00000010e707b20] [c00000000006e7a4] .__schedule_bug+0x7c/0x9c
[c00000010e707bb0] [c000000000699d9c] .schedule+0x104/0x800
[c00000010e707cd0] [c000000000015b24] .cpu_idle+0x1c4/0x1d8
[c00000010e707d70] [c0000000006aa1b4] .start_secondary+0x398/0x3d4
[c00000010e707e30] [c000000000008278] .start_secondary_resume+0x10/0x14

Move the cpu_die() call inside the existing preemption enabled block of
cpu_idle(). This is safe as the idle task is affined to a single CPU so the
debug_smp_processor_id() tests (from cpu_should_die()) won't trigger as we are
in a "migration disabled" region.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:29 +10:00
Julia Lawall
da9bef6735 powerpc/pci: Drop unnecessary null test
list_for_each_entry binds its first argument to a non-null value, and thus
any null test on the value of that argument is superfluous.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
iterator I;
expression x,E,E1,E2;
statement S,S1,S2;
@@

I(x,...) { <...
- if (x != NULL || ...)
  S
  ...> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:28 +10:00
Anton Blanchard
249ec22875 powerpc/kdump: Stop all other CPUs before running crash handlers
During kdump we run the crash handlers first then stop all other CPUs.
We really want to stop all CPUs as close to the fail as possible and also
have a very controlled environment for running the crash handlers, so it
makes sense to reverse the order.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:27 +10:00
Denis Kirjanov
9904b00593 powerpc: Use is_32bit_task() helper to test 32 bit binary
Use is_32bit_task() helper to test 32 bit binary.

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24 15:26:27 +10:00
Benjamin Herrenschmidt
b1515af291 Merge remote branch 'jwb/merge' into merge 2010-08-24 14:36:45 +10:00
Dave Kleikamp
3e7f45ad52 powerpc/4xx: Index interrupt stacks by physical cpu
The interrupt stacks need to be indexed by the physical cpu since the
critical, debug and machine check handlers use the contents of SPRN_PIR to
index the critirq_ctx, dbgirq_ctx, and mcheckirq_ctx arrays.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23 07:37:53 -04:00
Dave Kleikamp
66477466b8 powerpc/47x: Remove redundant line from cputable.c
There are two entries for .cpu_user_features in
arch/powerpc/kernel/cputable.c.  Remove the one that doesn't belong

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23 07:37:01 -04:00
Dave Kleikamp
029b8f662b powerpc/47x: Make sure mcsr is cleared before enabling machine check interrupts
Clear the machine check syndrom register before enabling machine check
interrupts.  The initial state of the tlb can lead to parity errors being
flagged early after a cold boot.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23 07:36:58 -04:00
Ingo Molnar
c8710ad389 Merge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2010-08-19 12:48:09 +02:00
Frederic Weisbecker
f72c1a931e perf: Factorize callchain context handling
Store the kernel and user contexts from the generic layer instead
of archs, this gathers some repetitive code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
2010-08-19 01:32:11 +02:00
Frederic Weisbecker
56962b4449 perf: Generalize some arch callchain code
- Most archs use one callchain buffer per cpu, except x86 that needs
  to deal with NMIs. Provide a default perf_callchain_buffer()
  implementation that x86 overrides.

- Centralize all the kernel/user regs handling and invoke new arch
  handlers from there: perf_callchain_user() / perf_callchain_kernel()
  That avoid all the user_mode(), current->mm checks and so...

- Invert some parameters in perf_callchain_*() helpers: entry to the
  left, regs to the right, following the traditional (dst, src).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
2010-08-19 01:30:59 +02:00
Frederic Weisbecker
70791ce9ba perf: Generalize callchain_store()
callchain_store() is the same on every archs, inline it in
perf_event.h and rename it to perf_callchain_store() to avoid
any collision.

This removes repetitive code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
2010-08-19 01:30:11 +02:00
David Howells
d7627467b7 Make do_execve() take a const filename pointer
Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:

arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type

This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to.  This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel().  A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().

do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as they are some of them in .rodata, so marking these strings as
const should be fine.

Further kernel_execve() and sys_execve() need to be changed to match.

This has been test built on x86_64, frv, arm and mips.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-17 18:07:43 -07:00
David Howells
c788732523 Mark arguments to certain syscalls as being const
Mark arguments to certain system calls as being const where they should be but
aren't.  The list includes:

 (*) The filename arguments of various stat syscalls, execve(), various utimes
     syscalls and some mount syscalls.

 (*) The filename arguments of some syscall helpers relating to the above.

 (*) The buffer argument of various write syscalls.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-13 16:53:13 -07:00
Grant Likely
ee11006613 powerpc: fix i8042 module build error
of_i8042_{kbd,aux}_irq needs to be exported

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-06 20:49:20 -06:00
Linus Torvalds
b62ad9ab18 Merge branch 'timers-timekeeping-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-timekeeping-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  um: Fix read_persistent_clock fallout
  kgdb: Do not access xtime directly
  powerpc: Clean up obsolete code relating to decrementer and timebase
  powerpc: Rework VDSO gettimeofday to prevent time going backwards
  clocksource: Add __clocksource_updatefreq_hz/khz methods
  x86: Convert common clocksources to use clocksource_register_hz/khz
  timekeeping: Make xtime and wall_to_monotonic static
  hrtimer: Cleanup direct access to wall_to_monotonic
  um: Convert to use read_persistent_clock
  timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset
  powerpc: Cleanup xtime usage
  powerpc: Simplify update_vsyscall
  time: Kill off CONFIG_GENERIC_TIME
  time: Implement timespec_add
  x86: Fix vtime/file timestamp inconsistencies

Trivial conflicts in Documentation/feature-removal-schedule.txt

Much less trivial conflicts in arch/powerpc/kernel/time.c resolved as
per Thomas' earlier merge commit 47916be4e2 ("Merge branch
'powerpc.cherry-picks' into timers/clocksource")
2010-08-06 13:18:29 -07:00
Linus Torvalds
c4efd6b569 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  sched: Use correct macro to display sched_child_runs_first in /proc/sched_debug
  sched: No need for bootmem special cases
  sched: Revert nohz_ratelimit() for now
  sched: Reduce update_group_power() calls
  sched: Update rq->clock for nohz balanced cpus
  sched: Fix spelling of sibling
  sched, cpuset: Drop __cpuexit from cpu hotplug callbacks
  sched: Fix the racy usage of thread_group_cputimer() in fastpath_timer_check()
  sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()
  sched: thread_group_cputime: Simplify, document the "alive" check
  sched: Remove the obsolete exit_state/signal hacks
  sched: task_tick_rt: Remove the obsolete ->signal != NULL check
  sched: __sched_setscheduler: Read the RLIMIT_RTPRIO value lockless
  sched: Fix comments to make them DocBook happy
  sched: Fix fix_small_capacity
  powerpc: Exclude arch_sd_sibiling_asym_packing() on UP
  powerpc: Enable asymmetric SMT scheduling on POWER7
  sched: Add asymmetric group packing option for sibling domain
  sched: Fix capacity calculations for SMT4
  sched: Change nohz idle load balancing logic to push model
  ...
2010-08-06 09:39:22 -07:00
Linus Torvalds
4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Linus Torvalds
89a6c8cb9e Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  debug_core,kdb: fix crash when arch does not have single step
  kgdb,x86: use macro HBP_NUM to replace magic number 4
  kgdb,mips: remove unused kgdb_cpu_doing_single_step operations
  mm,kdb,kgdb: Add a debug reference for the kdb kmap usage
  KGDB: Remove set but unused newPC
  ftrace,kdb: Allow dumping a specific cpu's buffer with ftdump
  ftrace,kdb: Extend kdb to be able to dump the ftrace buffer
  kgdb,powerpc: Replace hardcoded offset by BREAK_INSTR_SIZE
  arm,kgdb: Add ability to trap into debugger on notify_die
  gdbstub: do not directly use dbg_reg_def[] in gdb_cmd_reg_set()
  gdbstub: Implement gdbserial 'p' and 'P' packets
  kgdb,arm: Individual register get/set for arm
  kgdb,mips: Individual register get/set for mips
  kgdb,x86: Individual register get/set for x86
  kgdb,kdb: individual register set and and get API
  gdbstub: Optimize kgdb's "thread:" response for the gdb serial protocol
  kgdb: remove custom hex_to_bin()implementation
2010-08-05 15:59:48 -07:00
Linus Torvalds
03c0c29aff Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (63 commits)
  of/platform: Register of_platform_drivers with an "of:" prefix
  of/address: Clean up function declarations
  of/spi: call of_register_spi_devices() from spi core code
  of: Provide default of_node_to_nid() implementation.
  of/device: Make of_device_make_bus_id() usable by other code.
  of/irq: Fix endian issues in parsing interrupt specifiers
  of: Fix phandle endian issues
  of/flattree: fix of_flat_dt_is_compatible() to match the full compatible string
  of: remove of_default_bus_ids
  of: make of_find_device_by_node generic
  microblaze: remove references to of_device and to_of_device
  sparc: remove references to of_device and to_of_device
  powerpc: remove references to of_device and to_of_device
  of/device: Replace of_device with platform_device in includes and core code
  of/device: Protect against binding of_platform_drivers to non-OF devices
  of: remove asm/of_device.h
  of: remove asm/of_platform.h
  of/platform: remove all of_bus_type and of_platform_bus_type references
  of: Merge of_platform_bus_type with platform_bus_type
  drivercore/of: Add OF style matching to platform bus
  ...

Fix up trivial conflicts in arch/microblaze/kernel/Makefile due to just
some obj-y removals by the devicetree branch, while the microblaze
updates added a new file.
2010-08-05 15:57:35 -07:00
Linus Torvalds
cdd854bc42 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (79 commits)
  powerpc/8xx: Add support for the MPC8xx based boards from TQC
  powerpc/85xx: Introduce support for the Freescale P1022DS reference board
  powerpc/85xx: Adding DTS for the STx GP3-SSA MPC8555 board
  powerpc/85xx: Change deprecated binding for 85xx-based boards
  powerpc/tqm85xx: add a quirk for ti1520 PCMCIA bridge
  powerpc/tqm85xx: update PCI interrupt-map attribute
  powerpc/mpc8308rdb: support for MPC8308RDB board from Freescale
  powerpc/fsl_pci: add quirk for mpc8308 pcie bridge
  powerpc/85xx: Cleanup QE initialization for MPC85xxMDS boards
  powerpc/85xx: Fix booting for P1021MDS boards
  powerpc/85xx: Fix SWIOTLB initalization for MPC85xxMDS boards
  powerpc/85xx: kexec for SMP 85xx BookE systems
  powerpc/5200/i2c: improve i2c bus error recovery
  of/xilinxfb: update tft compatible versions
  powerpc/fsl-diu-fb: Support setting display mode using EDID
  powerpc/5121: doc/dts-bindings: update doc of FSL DIU bindings
  powerpc/5121: shared DIU framebuffer support
  powerpc/5121: move fsl-diu-fb.h to include/linux
  powerpc/5121: fsl-diu-fb: fix issue with re-enabling DIU area descriptor
  powerpc/512x: add clock structure for Video-IN (VIU) unit
  ...
2010-08-05 09:03:46 -07:00
Michal Simek
3f0a55e357 kgdb,powerpc: Replace hardcoded offset by BREAK_INSTR_SIZE
kgdb_handle_breakpoint checks the first arch_kgdb_breakpoint
which is not known by gdb that's why is necessary jump over
it. The jump lenght is equal to BREAK_INSTR_SIZE that's
why is cleaner to use defined macro instead of hardcoded
non-described offset.

Signed-off-by: Michal Simek <monstr@monstr.eu>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 09:22:22 -05:00
Benjamin Herrenschmidt
cd3db0c4ca memblock: Remove rmo_size, burry it in arch/powerpc where it belongs
The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact
server ppc64 though I hijack it on embedded ppc64 for similar purposes)
and represents the area of memory that can be accessed in real mode
(aka with MMU off), or on embedded, from the exception vectors (which
is bolted in the TLB) which pretty much boils down to the same thing.

We take that out of the generic MEMBLOCK data structure and move it into
arch/powerpc where it belongs, renaming it to "RMA" while at it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:08 +10:00
Benjamin Herrenschmidt
e63075a3c9 memblock: Introduce default allocation limit and use it to replace explicit ones
This introduce memblock.current_limit which is used to limit allocations
from memblock_alloc() or memblock_alloc_base(..., MEMBLOCK_ALLOC_ACCESSIBLE).

The old MEMBLOCK_ALLOC_ANYWHERE changes value from 0 to ~(u64)0 and can still
be used with memblock_alloc_base() to allocate really anywhere.

It is -no-longer- cropped to MEMBLOCK_REAL_LIMIT which disappears.

Note to archs: I'm leaving the default limit to MEMBLOCK_ALLOC_ANYWHERE. I
strongly recommend that you ensure that you set an appropriate limit
during boot in order to guarantee that an memblock_alloc() at any time
results in something that is accessible with a simple __va().

The reason is that a subsequent patch will introduce the ability for
the array to resize itself by reallocating itself. The MEMBLOCK core will
honor the current limit when performing those allocations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-05 12:56:07 +10:00
Linus Torvalds
3cfc2c42c1 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (48 commits)
  Documentation: update broken web addresses.
  fix comment typo "choosed" -> "chosen"
  hostap:hostap_hw.c Fix typo in comment
  Fix spelling contorller -> controller in comments
  Kconfig.debug: FAIL_IO_TIMEOUT: typo Faul -> Fault
  fs/Kconfig: Fix typo Userpace -> Userspace
  Removing dead MACH_U300_BS26
  drivers/infiniband: Remove unnecessary casts of private_data
  fs/ocfs2: Remove unnecessary casts of private_data
  libfc: use ARRAY_SIZE
  scsi: bfa: use ARRAY_SIZE
  drm: i915: use ARRAY_SIZE
  drm: drm_edid: use ARRAY_SIZE
  synclink: use ARRAY_SIZE
  block: cciss: use ARRAY_SIZE
  comment typo fixes: charater => character
  fix comment typos concerning "challenge"
  arm: plat-spear: fix typo in kerneldoc
  reiserfs: typo comment fix
  update email address
  ...
2010-08-04 15:31:02 -07:00
Jiri Kosina
d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Benjamin Herrenschmidt
412a4ac5e9 Merge commit 'gcl/next' into next 2010-08-04 10:26:03 +10:00
Scott Wood
69e77a8b04 perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
Commit 6b95ed345b changed from
a struct initializer to perf_sample_data_init(), but the setting
of the .period member was left out.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: stable@kernel.org
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-08-03 10:56:45 +10:00
Peter Zijlstra
09f86cd093 perf, powerpc: Convert the FSL driver to use local64_t
For some reason the FSL driver got left out when we converted perf
to use local64_t instead of atomic64_t.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-08-03 10:24:03 +10:00
Ingo Molnar
3772b73472 Merge commit 'v2.6.35' into perf/core
Conflicts:
	tools/perf/Makefile
	tools/perf/util/hist.c

Merge reason: Resolve the conflicts and update to latest upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-02 08:31:54 +02:00
Grant Likely
22ae782f86 of/address: Clean up function declarations
This patch moves the declaration of of_get_address(), of_get_pci_address(),
and of_pci_address_to_resource() out of arch code and into the common
linux/of_address header file.

This patch also fixes some of the asm/prom.h ordering issues.  It still
includes some header files that it ideally shouldn't be, but at least the
ordering is consistent now so that of_* overrides work.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-08-01 01:42:42 -06:00
Andreas Schwab
49f6be8ea1 KVM: PPC: elide struct thread_struct instances from stack
Instead of instantiating a whole thread_struct on the stack use only the
required parts of it.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-08-01 10:39:24 +03:00
Matt Evans
e8e5c2155b powerpc/kexec: Fix orphaned offline CPUs across kexec
When CPU hotplug is used, some CPUs may be offline at the time a kexec is
performed.  The subsequent kernel may expect these CPUs to be already running,
and will declare them stuck.  On pseries, there's also a soft-offline (cede)
state that CPUs may be in; this can also cause problems as the kexeced kernel
may ask RTAS if they're online -- and RTAS would say they are.  The CPU will
either appear stuck, or will cause a crash as we replace its cede loop beneath
it.

This patch kicks each present offline CPU awake before the kexec, so that
none are forever lost to these assumptions in the subsequent kernel.

Now, the behaviour is that all available CPUs that were offlined are now
online & usable after the kexec.  This mimics the behaviour of a full reboot
(on which all CPUs will be restarted).

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 15:05:22 +10:00
Matt Evans
e2f7f73717 powerpc/kexec: Add to and tidy debug/comments in machine_kexec64.c
Tidies some typos, KERN_INFO-ise an info msg, and add a debug msg showing
when the final sequence starts.

Also adds a comment to kexec_prepare_cpus_wait() to make note of a possible
problem; the need for kexec to deal with CPUs that failed to originally start
up.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 15:05:21 +10:00
Michael Neuling
2c48a7d615 powerpc: Print decimal values in prom_init.c
Currently we look pretty stupid when printing out a bunch of things in
prom_init.c.  eg.

  Max number of cores passed to firmware: 0x0000000000000080

So I've change this to print in decimal:

  Max number of cores passed to firmware: 128 (NR_CPUS = 256)

This required adding a prom_print_dec() function and changing some
prom_printk() calls from %x to %lu.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 15:05:20 +10:00
Tiejun Chen
d77cb21b57 powerpc/smp: remove the incorrect decrementer initial codes for AP
We already defined start_cpu_decrementer() to invoke decrementer for AP as
the following path:

start_secondary() -> secondary_cpu_time_init() -> start_cpu_decrementer()

So remove these incorrect codes introduced from commit:
e7f75ad0 powerpc/47x: Base ppc476 support

And actually we really should not enable decrementer before calling set_dec().

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:31 +10:00
Neil Horman
67238fb721 powerpc: Add vmcoreinfo symbols to allow makdumpfile to filter core files properly
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>

 machine_kexec.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:31 +10:00
Matthew McClintock
bbc8e30f17 powerpc/crashdump: Fix issues with kexec and 36bit physical addr
Fix sizes of variables so correct values are exported via /proc.
Cast variable in comparison to avoid compiler error.

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:30 +10:00
Matt Evans
fc53b4202e powerpc/kexec: Switch to a static PACA on the way out
With dynamic PACAs, the kexecing CPU's PACA won't lie within the kernel
static data and there is a chance that something may stomp it when preparing
to kexec.  This patch switches this final CPU to a static PACA just before
we pull the switch.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-31 14:56:30 +10:00
Benjamin Herrenschmidt
7e3f36c3e1 Merge commit 'jwb/next' into next 2010-07-30 15:02:32 +10:00
Thomas Gleixner
47916be4e2 Merge branch 'powerpc.cherry-picks' into timers/clocksource
Conflicts:
	arch/powerpc/kernel/time.c

Reason: The powerpc next tree contains two commits which conflict with
the timekeeping changes:

8fd63a9e powerpc: Rework VDSO gettimeofday to prevent time going backwards
c1aa687d powerpc: Clean up obsolete code relating to decrementer and timebase

John Stultz identified them and provided the conflict resolution.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-28 21:49:22 +02:00
Paul Mackerras
d75d68cfef powerpc: Clean up obsolete code relating to decrementer and timebase
Since the decrementer and timekeeping code was moved over to using
the generic clockevents and timekeeping infrastructure, several
variables and functions have been obsolete and effectively unused.
This deletes them.

In particular, wakeup_decrementer() is no longer needed since the
generic code reprograms the decrementer as part of the process of
resuming the timekeeping code, which happens during sysdev resume.
Thus the wakeup_decrementer calls in the suspend_enter methods for
52xx platforms have been removed.  The call in the powermac cpu
frequency change code has been replaced by set_dec(1), which will
cause a timer interrupt as soon as interrupts are enabled, and the
generic code will then reprogram the decrementer with the correct
value.

This also simplifies the generic_suspend_en/disable_irqs functions
and makes them static since they are not referenced outside time.c.
The preempt_enable/disable calls are removed because the generic
code has disabled all but the boot cpu at the point where these
functions are called, so we can't be moved to another cpu.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-28 21:07:12 +02:00
Paul Mackerras
0e469db8f7 powerpc: Rework VDSO gettimeofday to prevent time going backwards
Currently it is possible for userspace to see the result of
gettimeofday() going backwards by 1 microsecond, assuming that
userspace is using the gettimeofday() in the VDSO.  The VDSO
gettimeofday() algorithm computes the time in "xsecs", which are
units of 2^-20 seconds, or approximately 0.954 microseconds,
using the algorithm

	now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec

and then converts the time in xsecs to seconds and microseconds.

The kernel updates the tb_orig_stamp and stamp_xsec values every
tick in update_vsyscall().  If the length of the tick is not an
integer number of xsecs, then some precision is lost in converting
the current time to xsecs.  For example, with CONFIG_HZ=1000, the
tick is 1ms long, which is 1048.576 xsecs.  That means that
stamp_xsec will advance by either 1048 or 1049 on each tick.
With the right conditions, it is possible for userspace to get
(timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is
slightly late in updating the vdso_datapage, and then for stamp_xsec
to advance by 1048 when the kernel does update it, and for userspace
to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to
integer truncation.  The result is that time appears to go backwards
by 1 microsecond.

To fix this we change the VDSO gettimeofday to use a new field in the
VDSO datapage which stores the nanoseconds part of the time as a
fractional number of seconds in a 0.32 binary fraction format.
(Or put another way, as a 32-bit number in units of 0.23283 ns.)
This is convenient because we can use the mulhwu instruction to
convert it to either microseconds or nanoseconds.

Since it turns out that computing the time of day using this new field
is simpler than either using stamp_xsec (as gettimeofday does) or
stamp_xtime.tv_nsec (as clock_gettime does), this converts both
gettimeofday and clock_gettime to use the new field.  The existing
__do_get_tspec function is converted to use the new field and take
a parameter in r7 that indicates the desired resolution, 1,000,000
for microseconds or 1,000,000,000 for nanoseconds.  The __do_get_xsec
function is then unused and is deleted.

The new algorithm is

	now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs
		+ (stamp_xtime_seconds << 32) + stamp_sec_fraction

with 'now' in units of 2^-32 seconds.  That is then converted to
seconds and either microseconds or nanoseconds with

	seconds = now >> 32
	partseconds = ((now & 0xffffffff) * resolution) >> 32

The 32-bit VDSO code also makes a further simplification: it ignores
the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary
fraction.  Doing so gets rid of 4 multiply instructions.  Assuming
a timebase frequency of 1GHz or less and an update interval of no
more than 10ms, the upper 32 bits of tb_to_xs will be at least
4503599, so the error from ignoring the low 32 bits will be at most
2.2ns, which is more than an order of magnitude less than the time
taken to do gettimeofday or clock_gettime on our fastest processors,
so there is no possibility of seeing inconsistent values due to this.

This also moves update_gtod() down next to its only caller, and makes
update_vsyscall use the time passed in via the wall_time argument rather
than accessing xtime directly.  At present, wall_time always points to
xtime, but that could change in future.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-28 21:06:47 +02:00
Peter Zijlstra
6b95ed345b perf, powerpc: Use perf_sample_data_init() for the FSL code
We should use perf_sample_data_init() to initialize struct
perf_sample_data.  As explained in the description of commit dc1d628a
("perf: Provide generic perf_sample_data initialization"), it is
possible for userspace to get the kernel to dereference data.raw,
so if it is not initialized, that means that unprivileged userspace
can possibly oops the kernel.  Using perf_sample_data_init makes sure
it gets initialized to NULL.

This conversion should have been included in commit dc1d628a, but it
got missed.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-07-27 22:20:09 +10:00
John Stultz
7615856ebf timkeeping: Fix update_vsyscall to provide wall_to_monotonic offset
update_vsyscall() did not provide the wall_to_monotoinc offset,
so arch specific implementations tend to reference wall_to_monotonic
directly. This limits future cleanups in the timekeeping core, so
this patch fixes the update_vsyscall interface to provide
wall_to_monotonic, allowing wall_to_monotonic to be made static
as planned in Documentation/feature-removal-schedule.txt

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tony Luck <tony.luck@intel.com>
LKML-Reference: <1279068988-21864-7-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-27 12:40:54 +02:00
John Stultz
06d518e3df powerpc: Cleanup xtime usage
This removes powerpc's direct xtime usage, allowing for further
generic timeekeping cleanups

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
LKML-Reference: <1279068988-21864-6-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-27 12:40:54 +02:00
John Stultz
b0797b60d0 powerpc: Simplify update_vsyscall
Currently powerpc's update_vsyscall calls an inline update_gtod.
However, both are straightforward, and there are no other users,
so this patch merges update_gtod into update_vsyscall.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1279068988-21864-5-git-send-email-johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-07-27 12:40:54 +02:00
Lee Nipper
ff34910396 powerpc/40x: Distinguish AMCC PowerPC 405EX and 405EXr correctly
The recent AMCC 405EX Rev D without Security uses a PVR value
that matches the old 405EXr Rev A/B with Security.
The 405EX Rev D without Security would be shown
incorrectly as an 405EXr. The pvr_mask of 0xffff0004
is no longer sufficient to distinguish the 405EX from 405EXr.

This patch replaces 2 entries in the cpu_specs table
and adds 8 more, each using pvr_mask of 0xffff000f
and appropriate pvr_value to distinguish the AMCC
PowerPC 405EX and 405EXr instances.
The cpu_name for these entries now includes the
Rev, in similar fashion to the 440GX.

Signed-off-by: Lee Nipper <lee.nipper@gmail.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-07-26 09:07:24 -04:00
Jonas Bonn
c0dd394ca5 of: remove of_default_bus_ids
This list used was by only two platforms with all other platforms defining an
own list of valid bus id's to pass to of_platform_bus_probe.  This patch:

i)   copies the default list to the two platforms that depended on it (powerpc)
ii)  remove the usage of of_default_bus_ids in of_platform_bus_probe
iii) removes the definition of the list from all architectures that defined it

Passing a NULL 'matches' parameter to of_platform_bus_probe is still valid; the
function returns no error in that case as the NULL value is equivalent to an
empty list.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
[grant.likely@secretlab.ca: added __initdata annotations, warn on and return error on missing match table, and fix whitespace errors]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-07-24 09:58:22 -06:00
Jonas Bonn
c608558407 of: make of_find_device_by_node generic
There's no need for this function to be architecture specific and all four
architectures defining it had the same definition.  The function has been
moved to drivers/of/platform.c.

Signed-off-by: Jonas Bonn <jonas@southpole.se>
[grant.likely@secretlab.ca: moved to drivers/of/platform.c, simplified code, and added kerneldoc comment]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24 09:58:22 -06:00
Grant Likely
a454dc5059 powerpc: remove references to of_device and to_of_device
of_device is just a #define alias to platform_device.  This patch
replaces all references to it with platform_device.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-07-24 09:58:21 -06:00
Grant Likely
1ab1d63a85 of/platform: remove all of_bus_type and of_platform_bus_type references
Both of_bus_type and of_platform_bus_type are just #define aliases
for the platform bus.  This patch removes all references to them and
switches to the of_register_platform_driver()/of_unregister_platform_driver()
API for registering.

Subsequent patches will convert each user of of_register_platform_driver()
into plain platform_drivers without the of_platform_driver shim.  At which
point the of_register_platform_driver()/of_unregister_platform_driver()
functions can be removed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24 09:57:52 -06:00
Grant Likely
eca3930163 of: Merge of_platform_bus_type with platform_bus_type
of_platform_bus was being used in the same manner as the platform_bus.
The only difference being that of_platform_bus devices are generated
from data in the device tree, and platform_bus devices are usually
statically allocated in platform code.  Having them separate causes
the problem of device drivers having to be registered twice if it
was possible for the same device to appear on either bus.

This patch removes of_platform_bus_type and registers all of_platform
bus devices and drivers on the platform bus instead.  A previous patch
made the of_device structure an alias for the platform_device structure,
and a shim is used to adapt of_platform_drivers to the platform bus.

After all of of_platform_bus drivers are converted to be normal platform
drivers, the shim code can be removed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
2010-07-24 09:57:51 -06:00
Grant Likely
4e4f62bf73 Merge commit 'v2.6.35-rc6' into devicetree/next
Conflicts:
	arch/sparc/kernel/prom_64.c
2010-07-24 09:49:13 -06:00
Benjamin Herrenschmidt
3fdfd99051 powerpc: Fix erroneous lmb->memblock conversions
Oooops... we missed these. We incorrectly converted strings
used when parsing the device-tree on pseries, thus breaking
access to drconf memory and hotplug memory.

While at it, also revert some variable names that represent
something the FW calls "lmb" and thus don't need to be converted
to "memblock".

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
2010-07-23 12:56:57 +10:00
Ingo Molnar
dca45ad8af Merge branch 'linus' into sched/core
Merge reason: Move from the -rc3 to the almost-rc6 base.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-21 21:45:08 +02:00
Ingo Molnar
9dcdbf7a33 Merge branch 'linus' into perf/core
Merge reason: Pick up the latest perf fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-21 21:43:06 +02:00
Pavel Machek
a2531293db update email address
pavel@suse.cz no longer works, replace it with working address.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-19 10:56:54 +02:00
Grant Likely
c5f5849bff of: Remove unused of_find_device_by_phandle()
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-18 22:39:36 -06:00
Linus Torvalds
6f7dd68b75 Merge branch 'lmb-to-memblock' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'lmb-to-memblock' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  lmb: rename to memblock
2010-07-14 17:27:44 -07:00
Yinghai Lu
95f72d1ed4 lmb: rename to memblock
via following scripts

      FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

      sed -i \
        -e 's/lmb/memblock/g' \
        -e 's/LMB/MEMBLOCK/g' \
        $FILES

      for N in $(find . -name lmb.[ch]); do
        M=$(echo $N | sed 's/lmb/memblock/g')
        mv $N $M
      done

and remove some wrong change like lmbench and dlmb etc.

also move memblock.c from lib/ to mm/

Suggested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-14 17:14:00 +10:00
Benjamin Herrenschmidt
ff82c319e6 powerpc/book3e: Fix single step when using HW page tables
We patch the TLB miss exception vectors to point to alternate
functions when using HW page table on BookE.

However, we were patching in a new branch in the first instruction
of the exception handler instead of the second one, thus overriding
the nop that is in the first instruction.

This cause problems when single stepping as we rely on that nop for
the single step to stop properly within the exception vector range
rather than on the target of the branch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-14 14:13:51 +10:00
Benjamin Herrenschmidt
34d97e07cc powerpc/book3e: Add generic 64-bit idle powersave support
We use a similar technique to ppc32: We set a thread local flag
to indicate that we are about to enter or have entered the stop
state, and have fixup code in the async interrupt entry code that
reacts to this flag to make us return to a different location
(sets NIP to LINK in our case).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
--
v2. Fix lockdep bug
    Re-mask interrupts when coming back from idle
2010-07-14 14:13:18 +10:00
Matthew McClintock
77154a2026 powerpc/fsl-booke: Fix address issue when using relocatable kernels
When booting a relocatable kernel it needs to jump to the correct
start address, which for BookE parts is usually unchanged
regardless of the physical memory offset.

Recent changes cause problems with how we calculate the start
address, it was always adding the RMO into the start address
which is incorrect. This patch only adds in the RMO offset
if we are in the kexec code path, as it needs the RMO to work
correctly.

Instead of adding the RMO offset in in the common code path, we
can just set r6 to the RMO offset in the kexec code path instead
of to zero, and finally perform the masking in the common code
path

Signed-off-by: Matthew McClintock <msm@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2010-07-11 11:04:08 -05:00
Michael Ellerman
850f22d568 powerpc/book3e: Resend doorbell exceptions to ourself
If we are soft disabled and receive a doorbell exception we don't process
it immediately. This means we need to check on the way out of irq restore
if there are any doorbell exceptions to process.

The problem is at that point we don't know what our regs are, and that
in turn makes xmon unhappy. To workaround the problem, instead of checking
for and processing doorbells, we check for any doorbells and if there were
any we send ourselves another.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:19 +10:00
David Gibson
0e37d25950 powerpc/book3e: Use set_irq_regs() in the msgsnd/msgrcv IPI path
include/asm-generic/irq_regs.h declares per-cpu irq_regs variables and
get_irq_regs() and set_irq_regs() helper functions to maintain them.
These can be used to access the proper pt_regs structure related to the
current interrupt entry (if any).

In the powerpc arch code, this is used to maintain irq regs on
decrementer and external interrupt exceptions.  However, for the
doorbell exceptions used by the msgsnd/msgrcv IPI mechanism of newer
BookE CPUs, the irq_regs are not kept up to date.

In particular this means that xmon will not work properly on SMP,
because the secondary xmon instances started by IPI will blow up when
they cannot retrieve the irq regs.

This patch fixes the problem by adding calls to maintain the irq regs
across doorbell exceptions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:18 +10:00
Benjamin Herrenschmidt
89c81797d4 powerpc/book3e: Hookup doorbells exceptions on 64-bit Book3E
Note that critical doorbells are an unimplemented stub just like
other critical or machine check handlers, since we haven't done
support for "levelled" exceptions yet.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:17 +10:00
Benjamin Herrenschmidt
e8775d4aa1 powerpc/book3e: Don't re-trigger decrementer on lazy irq restore
The decrementer on BookE acts as a level interrupt and doesn't
need to be re-triggered when going negative. It doesn't go
negative anyways (unless programmed to auto-reload with a
negative value) as it stops when reaching 0.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 16:11:09 +10:00
Benjamin Herrenschmidt
b9f1cd71db powerpc/book3e: More doorbell cleanups. Sample the PIR register
The doorbells use the content of the PIR register to match messages
from other CPUs. This may or may not be the same as our linux CPU
number, so using that as the "target" is no right.

Instead, we sample the PIR register at boot on every processor
and use that value subsequently when sending IPIs.

We also use a per-cpu message mask rather than a global array which
should limit cache line contention.

Note: We could use the CPU number in the device-tree instead of
the PIR register, as they are supposed to be equivalent. This
might prove useful if doorbells are to be used to kick CPUs out
of FW at boot time, thus before we can sample the PIR. This is
however not the case now and using the PIR just works.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 15:29:53 +10:00
Benjamin Herrenschmidt
e3145b387a powerpc/book3e: Move doorbell_exception from traps.c to dbell.c
... where it belongs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 15:25:18 +10:00
Benjamin Herrenschmidt
a2e198116f powerpc/book3e: Hack to get gdb moving along on Book3E 64-bit
Our handling of debug interrupts on Book3E 64-bit is not quite
the way it should be just yet. This is a workaround to let gdb
work at least for now. We ensure that when context switching,
we set the appropriate DBCR0 value for the new task. We also
make sure that we turn off MSR[DE] within the kernel, and set
it as part of the bits that get set when going back to userspace.

In the long run, we will probably set the userspace DBCR0 on the
exception exit code path and ensure we have some proper kernel
value to set on the way into the kernel, a bit like ppc32 does,
but that will take more work.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 15:24:47 +10:00
Martyn Welch
540c6c392f powerpc: Add i8042 keyboard and mouse irq parsing
Currently the irqs for the i8042, which historically provides keyboard and
mouse (aux) support, is hardwired in the driver rather than parsing the
dts.  This patch modifies the powerpc legacy IO code to attempt to parse
the device tree for this information, failing back to the hardcoded values
if it fails.

Signed-off-by: Martyn Welch <martyn.welch@ge.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:28:33 +10:00
Anton Blanchard
ae01f84b93 powerpc: Optimise per cpu accesses on 64bit
Now we dynamically allocate the paca array, it takes an extra load
whenever we want to access another cpu's paca. One place we do that a lot
is per cpu variables. A simple example:

DEFINE_PER_CPU(unsigned long, vara);
unsigned long test4(int cpu)
{
	return per_cpu(vara, cpu);
}

This takes 4 loads, 5 if you include the actual load of the per cpu variable:

    ld r11,-32760(r30)  # load address of paca pointer
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,9       # get offset into paca (each entry is 512 bytes)
    ld r0,0(r11)        # load paca pointer
    add r3,r0,r3        # paca + offset
    ld r11,64(r3)       # load paca[cpu].data_offset

    ldx r3,r9,r11       # load per cpu variable

If we remove the ppc64 specific per_cpu_offset(), we get the generic one
which indexes into a statically allocated array. This removes one load and
one add:

    ld r11,-32760(r30)  # load address of __per_cpu_offset
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,3       # get offset into __per_cpu_offset (each entry 8 bytes)
    ldx r11,r11,r3      # load __per_cpu_offset[cpu]

    ldx r3,r9,r11       # load per cpu variable

Having all the offsets in one array also helps when iterating over a per cpu
variable across a number of cpus, such as in the scheduler. Before we would
need to load one paca cacheline when calculating each per cpu offset. Now we
have 16 (128 / sizeof(long)) per cpu offsets in each cacheline.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:28:30 +10:00
Brian King
8fe93f8d85 powerpc/pseries: Migration code reorganization / hibernation prep
Partition hibernation will use some of the same code as is
currently used for Live Partition Migration. This function
further abstracts this code such that code outside of rtas.c
can utilize it. It also changes the error field in the suspend
me data structure to be an atomic type, since it is set and
checked on different cpus without any barriers or locking.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:26:17 +10:00
Paul Mackerras
c1aa687d49 powerpc: Clean up obsolete code relating to decrementer and timebase
Since the decrementer and timekeeping code was moved over to using
the generic clockevents and timekeeping infrastructure, several
variables and functions have been obsolete and effectively unused.
This deletes them.

In particular, wakeup_decrementer() is no longer needed since the
generic code reprograms the decrementer as part of the process of
resuming the timekeeping code, which happens during sysdev resume.
Thus the wakeup_decrementer calls in the suspend_enter methods for
52xx platforms have been removed.  The call in the powermac cpu
frequency change code has been replaced by set_dec(1), which will
cause a timer interrupt as soon as interrupts are enabled, and the
generic code will then reprogram the decrementer with the correct
value.

This also simplifies the generic_suspend_en/disable_irqs functions
and makes them static since they are not referenced outside time.c.
The preempt_enable/disable calls are removed because the generic
code has disabled all but the boot cpu at the point where these
functions are called, so we can't be moved to another cpu.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:26:16 +10:00
Paul Mackerras
8fd63a9ea7 powerpc: Rework VDSO gettimeofday to prevent time going backwards
Currently it is possible for userspace to see the result of
gettimeofday() going backwards by 1 microsecond, assuming that
userspace is using the gettimeofday() in the VDSO.  The VDSO
gettimeofday() algorithm computes the time in "xsecs", which are
units of 2^-20 seconds, or approximately 0.954 microseconds,
using the algorithm

	now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec

and then converts the time in xsecs to seconds and microseconds.

The kernel updates the tb_orig_stamp and stamp_xsec values every
tick in update_vsyscall().  If the length of the tick is not an
integer number of xsecs, then some precision is lost in converting
the current time to xsecs.  For example, with CONFIG_HZ=1000, the
tick is 1ms long, which is 1048.576 xsecs.  That means that
stamp_xsec will advance by either 1048 or 1049 on each tick.
With the right conditions, it is possible for userspace to get
(timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is
slightly late in updating the vdso_datapage, and then for stamp_xsec
to advance by 1048 when the kernel does update it, and for userspace
to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to
integer truncation.  The result is that time appears to go backwards
by 1 microsecond.

To fix this we change the VDSO gettimeofday to use a new field in the
VDSO datapage which stores the nanoseconds part of the time as a
fractional number of seconds in a 0.32 binary fraction format.
(Or put another way, as a 32-bit number in units of 0.23283 ns.)
This is convenient because we can use the mulhwu instruction to
convert it to either microseconds or nanoseconds.

Since it turns out that computing the time of day using this new field
is simpler than either using stamp_xsec (as gettimeofday does) or
stamp_xtime.tv_nsec (as clock_gettime does), this converts both
gettimeofday and clock_gettime to use the new field.  The existing
__do_get_tspec function is converted to use the new field and take
a parameter in r7 that indicates the desired resolution, 1,000,000
for microseconds or 1,000,000,000 for nanoseconds.  The __do_get_xsec
function is then unused and is deleted.

The new algorithm is

	now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs
		+ (stamp_xtime_seconds << 32) + stamp_sec_fraction

with 'now' in units of 2^-32 seconds.  That is then converted to
seconds and either microseconds or nanoseconds with

	seconds = now >> 32
	partseconds = ((now & 0xffffffff) * resolution) >> 32

The 32-bit VDSO code also makes a further simplification: it ignores
the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary
fraction.  Doing so gets rid of 4 multiply instructions.  Assuming
a timebase frequency of 1GHz or less and an update interval of no
more than 10ms, the upper 32 bits of tb_to_xs will be at least
4503599, so the error from ignoring the low 32 bits will be at most
2.2ns, which is more than an order of magnitude less than the time
taken to do gettimeofday or clock_gettime on our fastest processors,
so there is no possibility of seeing inconsistent values due to this.

This also moves update_gtod() down next to its only caller, and makes
update_vsyscall use the time passed in via the wall_time argument rather
than accessing xtime directly.  At present, wall_time always points to
xtime, but that could change in future.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-09 11:26:16 +10:00
Benjamin Herrenschmidt
5f07aa7524 Merge commit 'paulus-perf/master' into next 2010-07-09 11:25:48 +10:00
Paul E. McKenney
c2be05481f powerpc: Fix default_machine_crash_shutdown #ifdef botch
crash_kexec_wait_realmode() is defined only if CONFIG_PPC_STD_MMU_64
and CONFIG_SMP, but is called if CONFIG_PPC_STD_MMU_64 even if !CONFIG_SMP.
Fix the conditional compilation around the invocation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:45 +10:00
Johannes Berg
3cd8519248 powerpc: Fix logic error in fixup_irqs
When SPARSE_IRQ is set, irq_to_desc() can
return NULL. While the code here has a
check for NULL, it's not really correct.
Fix it by separating the check for it.

This fixes CPU hot unplug for me.

Reported-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Cc: stable@kernel.org [2.6.32+]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:44 +10:00
Anton Blanchard
33ad5e4b6c powerpc: Linux cannot run with 0 cores
If we configure with CONFIG_SMP=n or set NR_CPUS less than the number of
SMT threads we will set the max cores property to 0 in the
ibm,client-architecture-support structure. On new versions of firmware that
understand this property it obliges and terminates our partition.

Use DIV_ROUND_UP so we handle not only the CONFIG_SMP=n case but also the
case where NR_CPUS isn't a multiple of the number of SMT threads.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:42 +10:00
Stephen Rothwell
5afd878a95 powerpc: Fix compile errors in prom_init_check for gcc 4.5
Just whitelist these extra compiler generated symbols.
Fixes these errors:

Error: External symbol '_restgpr0_14' referenced from prom_init.c
Error: External symbol '_restgpr0_20' referenced from prom_init.c
Error: External symbol '_restgpr0_22' referenced from prom_init.c
Error: External symbol '_restgpr0_24' referenced from prom_init.c
Error: External symbol '_restgpr0_25' referenced from prom_init.c
Error: External symbol '_restgpr0_26' referenced from prom_init.c
Error: External symbol '_restgpr0_27' referenced from prom_init.c
Error: External symbol '_restgpr0_28' referenced from prom_init.c
Error: External symbol '_restgpr0_29' referenced from prom_init.c
Error: External symbol '_restgpr0_31' referenced from prom_init.c
Error: External symbol '_savegpr0_14' referenced from prom_init.c
Error: External symbol '_savegpr0_20' referenced from prom_init.c
Error: External symbol '_savegpr0_22' referenced from prom_init.c
Error: External symbol '_savegpr0_24' referenced from prom_init.c
Error: External symbol '_savegpr0_25' referenced from prom_init.c
Error: External symbol '_savegpr0_26' referenced from prom_init.c
Error: External symbol '_savegpr0_27' referenced from prom_init.c
Error: External symbol '_savegpr0_28' referenced from prom_init.c
Error: External symbol '_savegpr0_29' referenced from prom_init.c
Error: External symbol '_savegpr0_31' referenced from prom_init.c

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:39 +10:00
Matt Evans
219a92a4c4 powerpc/perf_event: Fix for power_pmu_disable()
When power_pmu_disable() removes the given event from a particular index into
cpuhw->event[], it shuffles down higher event[] entries.  But, this array is
paired with cpuhw->events[] and cpuhw->flags[] so should shuffle them
similarly.

If these arrays get out of sync, code such as power_check_constraints() will
fail.  This caused a bug where events were temporarily disabled and then failed
to be re-enabled; subsequent code tried to write_pmc() with its (disabled) idx
of 0, causing a message "oops trying to write PMC0".  This triggers this bug on
POWER7, running a miss-heavy test:

  perf record -e L1-dcache-load-misses -e L1-dcache-store-misses ./misstest

Signed-off-by: Matt Evans <matt@ozlabs.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-07-08 18:11:37 +10:00
Grant Likely
94c0931983 of: Merge of_device_alloc() and of_device_make_bus_id()
This patch merges the common routines of_device_alloc() and
of_device_make_bus_id() from powerpc and microblaze.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Grant Likely <grant.likely@secretlab.ca>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linuxppc-dev@ozlabs.org
CC: devicetree-discuss@lists.ozlabs.org
2010-07-05 16:14:29 -06:00
Grant Likely
5fd200f3b3 of/device: Merge of_platform_bus_probe()
Merge common code between PowerPC and microblaze.  This patch merges
the code that scans the tree and registers devices.  The functions
merged are of_platform_bus_probe(), of_platform_bus_create(), and
of_platform_device_create().

This patch also move the of_default_bus_ids[] table out of a Microblaze
header file and makes it non-static.  The device ids table isn't merged
because powerpc and microblaze use different default data.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Grant Likely <grant.likely@secretlab.ca>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linuxppc-dev@ozlabs.org
2010-07-05 16:14:28 -06:00
Grant Likely
dd27dcda37 of/device: merge of_device_uevent
Merge common code between powerpc and microblaze

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linuxppc-dev@ozlabs.org
2010-07-05 16:14:28 -06:00
Grant Likely
dbbdee9473 of/address: Merge all of the bus translation code
Microblaze and PowerPC share a large chunk of code for translating
OF device tree data into usable addresses.  Differences between the two
consist of cosmetic differences, and the addition of dma-ranges support
code to powerpc but not microblaze.  This patch moves the powerpc
version into common code and applies many of the cosmetic (non-functional)
changes from the microblaze version.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Michal Simek <monstr@monstr.eu>
CC: Wolfram Sang <w.sang@pengutronix.de>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:26 -06:00
Grant Likely
1f5bef30cf of/address: merge of_address_to_resource()
Merge common code between PowerPC and Microblaze.  This patch also
moves the prototype of pci_address_to_pio() out of pci-bridge.h and
into prom.h because the only user of pci_address_to_pio() is
of_address_to_resource().

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Michal Simek <monstr@monstr.eu>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:26 -06:00
Grant Likely
6b884a8d50 of/address: merge of_iomap()
Merge common code between Microblaze and PowerPC.  This patch creates
new of_address.h and address.c files to containing address translation
and mapping routines.  First routine to be moved it of_iomap()

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Michal Simek <monstr@monstr.eu>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:26 -06:00
Grant Likely
7dc2e1134a of/irq: merge irq mapping code
Merge common irq mapping code between PowerPC and Microblaze.

This patch merges of_irq_find_parent(), of_irq_map_raw() and
of_irq_map_one().  The functions are dependent on one another, so all
three are merged in a single patch.  Other than cosmetic difference
(ie. DBG() vs. pr_debug()), the implementations are identical.

of_irq_to_resource() is also merged, but in this case the
implementations are different.  This patch drops the microblaze version
and uses the powerpc implementation unchanged.  The microblaze version
essentially open-coded irq_of_parse_and_map() which it does not need
to do.  Therefore the powerpc version is safe to adopt.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
CC: Michal Simek <monstr@monstr.eu>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:25 -06:00
Grant Likely
b83da291b4 of/powerpc: Move Powermac irq quirk code into powermac pic driver code
The code that figures out what is wrong with the powermac irq device
tree data belongs with the rest of the powermac irq code.  This patch
moves it out of prom_parse.c and into powermac/pic.c so that it is only
compiled in when actually needed.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
2010-07-05 16:14:25 -06:00
Paul Mackerras
d09ec73871 powerpc, hw_breakpoint: Tell generic code we have no instruction breakpoints
At present, hw_breakpoint_slots() returns 1 regardless of what
type of breakpoint is specified in the type argument.  Since we
don't define CONFIG_HAVE_MIXED_BREAKPOINTS_REGS, there are
separate values for TYPE_INST and TYPE_DATA, and hw_breakpoint_slots()
returns 1 for both, effectively advertising instruction breakpoint
support which doesn't exist.

This fixes it by making hw_breakpoint_slots return 1 for TYPE_DATA
and 0 for TYPE_INST.  This moves hw_breakpoint_slots() from the
powerpc hw_breakpoint.h to hw_breakpoint.c because the definitions
of TYPE_INST and TYPE_DATA aren't available in <asm/hw_breakpoint.h>.
They are defined in <linux/hw_breakpoint.h> but we can't include
that header in <asm/hw_breakpoint.h>, and nor can we rely on
<linux/hw_breakpoint.h> being included before <asm/hw_breakpoint.h>.
Since hw_breakpoint_slots() is only called at boot time, there is
no performance impact from making it a real function rather than
a static inline.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-30 13:54:58 +10:00
Michael Neuling
2ec57d448b sched: Fix spelling of sibling
No logic changes, only spelling.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: linuxppc-dev@ozlabs.org
Cc: David Howells <dhowells@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <15249.1277776921@neuling.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-29 10:44:29 +02:00
Thomas Gleixner
f384c954c9 Merge branch 'linus' into perf/core
Reason: Further changes conflict with upstream fixes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-06-28 22:33:24 +02:00
Grant Likely
e387344499 of/irq: Move irq_of_parse_and_map() to common code
Merge common code between PowerPC and Microblaze.  SPARC implements
irq_of_parse_and_map(), but the implementation is different, so it
does not use this code.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Jeremy Kerr <jeremy.kerr@canonical.com>
2010-06-28 12:41:33 -07:00
Paul Mackerras
76b0f13376 powerpc, hw_breakpoint: Cooperate better with other single-steppers
The code we had to clear the MSR_SE bit was not doing anything because
the caller (ultimately single_step_exception() in traps.c) had already
cleared.  Instead of trying to leave MSR_SE set if the TIF_SINGLESTEP
flag is set (which indicates that the process is being single-stepped
by ptrace), we instead return NOTIFY_DONE in that case, which means
the caller will generate a SIGTRAP for the process.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-23 15:46:55 +10:00
Paul Mackerras
574cb24899 powerpc, hw_breakpoint: Fix off-by-one in checking access address
The code would accept an access to an address one byte past the end
of the requested range as legitimate, due to having a "<=" rather than
a "<".  This fixes that and cleans up the code a bit.

Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-23 15:42:43 +10:00
K.Prasad
e3e94084ad powerpc, hw_breakpoint: Discard extraneous interrupt due to accesses outside symbol length
Many a times, the requested breakpoint length can be less than the
fixed breakpoint length i.e. 8 bytes supported by PowerPC 64-bit
server (Book III S) processors.  This could lead to extraneous
interrupts resulting in false breakpoint notifications.  This
detects and discards such interrupts for non-ptrace requests.
We don't change ptrace behaviour to avoid breaking compatability.

[Suggestion from Paul Mackerras <paulus@samba.org> to add a new flag in
'struct arch_hw_breakpoint' to identify extraneous interrupts]

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-06-22 19:40:51 +10:00