Commit Graph

3189 Commits

Author SHA1 Message Date
Michael Ellerman
922b9f86a0 powerpc: Fix program check handling when lockdep is enabled
In commit 54321242af ("Disable interrupts early in Program Check"), we
switched from enabling to disabling interrupts in program_check_common.

Whereas ENABLE_INTS leaves r3 untouched, if lockdep is enabled DISABLE_INTS
calls into lockdep code and will clobber r3. That means we pass a bogus
struct pt_regs* into program_check_exception() and all hell breaks loose.

So load our regs pointer into r3 after we call DISABLE_INTS.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-22 16:48:49 +11:00
Grant Likely
cc79ca691c irq_domain: Move irq_domain code from powerpc to kernel/irq
This patch only moves the code.  It doesn't make any changes, and the
code is still only compiled for powerpc.  Follow-on patches will generalize
the code for other architectures.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
2012-02-16 01:37:49 -07:00
Grant Likely
6d9285b00f irq_domain/powerpc: Eliminate virq_is_host()
There is only one user, and it is trivial to open-code.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
2012-02-16 01:36:46 -07:00
Anton Blanchard
9a45a9407c powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events
perf on POWER stopped working after commit e050e3f0a7 (perf: Fix
broken interrupt rate throttling). That patch exposed a bug in
the POWER perf_events code.

Since the PMCs count upwards and take an exception when the top bit
is set, we want to write 0x80000000 - left in power_pmu_start. We were
instead programming in left which effectively disables the counter
until we eventually hit 0x80000000. This could take seconds or longer.

With the patch applied I get the expected number of samples:

          SAMPLE events:       9948

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@kernel.org>
2012-02-16 16:24:35 +11:00
Benjamin Herrenschmidt
54321242af powerpc: Disable interrupts early in Program Check
Program Check exceptions are the result of WARNs, BUGs, some
type of breakpoints, kprobe, and other illegal instructions.

We want interrupts (and thus preemption) to remain disabled
while doing the initial stage of testing the reason and
branching off to a debugger or kprobe, so we are still on
the original CPU which makes debugging easier in various cases.

This is how the code was intended, hence the local_irq_enable()
right in the middle of program_check_exception().

However, the assembly exception prologue for that exception was
incorrectly marked as enabling interrupts, which defeats that
(and records a redundant enable with lockdep).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-16 16:15:10 +11:00
Ira Snyder
40c8cefaaf powerpc: Fix kernel log of oops/panic instruction dump
A kernel oops/panic prints an instruction dump showing several
instructions before and after the instruction which caused the
oops/panic.

The code intended that the faulting instruction be enclosed in angle
brackets, however a bug caused the faulting instruction to be
interpreted by printk() as the message log level.

To fix this, the KERN_CONT log level is added before the actual text of
the printed message.

=== Before the patch ===

[ 1081.587266] Instruction dump:
[ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
[ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000
[ 1081.602500]  4e800020 3803ffd0 2b800009

<4>[ 1081.587266] Instruction dump:
<4>[ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
<4>[ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000
<98090000>[ 1081.602500]  4e800020 3803ffd0 2b800009

=== After the patch ===

[   51.385216] Instruction dump:
[   51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
[   51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009

<4>[   51.385216] Instruction dump:
<4>[   51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
<4>[   51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-16 16:11:23 +11:00
Grant Likely
4bbdd45afd irq_domain/powerpc: eliminate irq_map; use irq_alloc_desc() instead
This patch drops the powerpc-specific irq_map table and replaces it with
directly using the irq_alloc_desc()/irq_free_desc() interfaces for allocating
and freeing irq_desc structures.

This patch is a preparation step for generalizing the powerpc-specific virq
infrastructure to become irq_domains.

As part of this change, the irq_big_lock is changed to a mutex from a raw
spinlock.  There is no longer any need to use a spin lock since the irq_desc
allocation code is now responsible for the critical section of finding
an unused range of irq numbers.

The radix lookup table is also changed to store the irq_data pointer instead
of the irq_map entry since the irq_map is removed.  This should end up being
functionally equivalent since only allocated irq_descs are ever added to the
radix tree.

v5: - Really don't ever allocate virq 0.  The previous version could still
      do it if hint == 0
    - Respect irq_virq_count setting for NOMAP.  Some NOMAP domains cannot
      use virq values above irq_virq_count.
    - Use numa_node_id() when allocating irq_descs.  Ideally the API should
      obtain that value from the caller, but that touches a lot of call sites
      so will be deferred to a follow-on patch.
    - Fix irq_find_mapping() to include irq numbers lower than
      NUM_ISA_INTERRUPTS.  With the switch to irq_alloc_desc*(), the lowest
      possible allocated irq is now returned by arch_probe_nr_irqs().
v4: - Fix incorrect access to irq_data structure in debugfs code
    - Don't ever allocate virq 0

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
2012-02-14 14:06:51 -07:00
Grant Likely
bae1d8f199 irq_domain/powerpc: Use common irq_domain structure instead of irq_host
This patch drops the powerpc-specific irq_host structures and uses the common
irq_domain strucutres defined in linux/irqdomain.h.  It also fixes all
the users to use the new structure names.

Renaming irq_host to irq_domain has been discussed for a long time, and this
patch is a step in the process of generalizing the powerpc virq code to be
usable by all architecture.

An astute reader will notice that this patch actually removes the irq_host
structure instead of renaming it.  This is because the irq_domain structure
already exists in include/linux/irqdomain.h and has the needed data members.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
2012-02-14 14:06:50 -07:00
Brian King
444080d13d powerpc/pseries: Fix partition migration hang in stop_topology_update
This fixes a hang that was observed during live partition migration.
Since stop_topology_update must not be called from an interrupt
context, call it earlier in the migration process. The hang observed
can be seen below:

WARNING: at kernel/timer.c:1011
Modules linked in: ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 fuse loop ibmveth sg ext3 jbd mbcache raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc dm_round_robin dm_multipath scsi_dh sd_mod crc_t10dif ibmvfc scsi_transport_fc scsi_tgt scsi_mod dm_snapshot dm_mod
NIP: c0000000000c52d8 LR: c00000000004be28 CTR: 0000000000000000
REGS: c00000005ffd77d0 TRAP: 0700   Not tainted  (3.2.0-git-00001-g07d106d)
MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 48000084  XER: 00000001
CFAR: c00000000004be20
TASK = c00000005ec78860[0] 'swapper/3' THREAD: c00000005ec98000 CPU: 3
GPR00: 0000000000000001 c00000005ffd7a50 c000000000fbbc98 c000000000ec8340
GPR04: 00000000282a0020 0000000000000000 0000000000004000 0000000000000101
GPR08: 0000000000000012 c00000005ffd4000 0000000000000020 c000000000f3ba88
GPR12: 0000000000000000 c000000007f40900 0000000000000001 0000000000000004
GPR16: 0000000000000001 0000000000000000 0000000000000000 c000000001022310
GPR20: 0000000000000001 0000000000000000 0000000000200200 c000000001029e14
GPR24: 0000000000000000 0000000000000001 0000000000000040 c00000003f74bc80
GPR28: c00000003f74bc84 c000000000f38038 c000000000f16b58 c000000000ec8340
NIP [c0000000000c52d8] .del_timer_sync+0x28/0x60
LR [c00000000004be28] .stop_topology_update+0x20/0x38
Call Trace:
[c00000005ffd7a50] [c00000005ec78860] 0xc00000005ec78860 (unreliable)
[c00000005ffd7ad0] [c00000000004be28] .stop_topology_update+0x20/0x38
[c00000005ffd7b40] [c000000000028378] .__rtas_suspend_last_cpu+0x58/0x260
[c00000005ffd7bf0] [c0000000000fa230] .generic_smp_call_function_interrupt+0x160/0x358
[c00000005ffd7cf0] [c000000000036ec8] .smp_ipi_demux+0x88/0x100
[c00000005ffd7d80] [c00000000005c154] .icp_hv_ipi_action+0x5c/0x80
[c00000005ffd7e00] [c00000000012a088] .handle_irq_event_percpu+0x100/0x318
[c00000005ffd7f00] [c00000000012e774] .handle_percpu_irq+0x84/0xd0
[c00000005ffd7f90] [c000000000022ba8] .call_handle_irq+0x1c/0x2c
[c00000005ec9ba20] [c00000000001157c] .do_IRQ+0x22c/0x2a8
[c00000005ec9bae0] [c0000000000054bc] hardware_interrupt_entry+0x18/0x1c
Exception: 501 at .cpu_idle+0x194/0x2f8
    LR = .cpu_idle+0x194/0x2f8
[c00000005ec9bdd0] [c000000000017e58] .cpu_idle+0x188/0x2f8 (unreliable)
[c00000005ec9be90] [c00000000067ec18] .start_secondary+0x3e4/0x524
[c00000005ec9bf90] [c0000000000093e8] .start_secondary_prolog+0x10/0x14
Instruction dump:
ebe1fff8 4e800020 fbe1fff8 7c0802a6 f8010010 7c7f1b78 f821ff81 78290464
80090014 5400019e 7c0000d0 78000fe0 <0b000000> 4800000c 7c210b78 7c421378

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:39 +11:00
Benjamin Herrenschmidt
6fe5f5f3ff powerpc: Fix WARN_ON in decrementer_check_overflow
We use __get_cpu_var() which triggers a false positive warning
in smp_processor_id() thinking interrupts are enabled (at this
point, they are soft-enabled but hard-disabled).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-14 15:01:38 +11:00
Grant Likely
07d57a32fb drivercore: Output common devicetree information in uevent
When userspace needs to find a specific device, it currently isn't easy to
resolve a /sys/devices/ path from a specific device tree node.  Nor is it
easy to obtain the compatible list for devices.

This patch generalizes the code that inserts OF_* values into the uevent
device attribute so that any device that is attached to an OF node will
have that information exported to userspace.  Without this patch only
platform devices and some powerpc-specific busses have access to this
data.

The original function also creates a MODALIAS property for the compatible
list, but that code has not been generalized into the common case because
it has the potential to break module loading on a lot of bus types.  Bus
types are still responsible for their own MODALIAS properties.

Boot tested on ARM and compile tested on PowerPC and SPARC.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Frederic Lambert <frdrc66@gmail.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Mark Brown <broonie@sirena.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-02-01 14:26:30 -07:00
Ingo Molnar
44a6839711 Merge branch 'perf/fast' into perf/core
Merge reason: Lets ready it for v3.4

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-27 12:08:09 +01:00
Benjamin Herrenschmidt
3493c85366 powerpc: Fix build on some non-freescale platforms
Commit 9deaa53ac7 broke build
on platforms that use legacy_serial.c without also having
CONFIG_SERIAL_8250_FSL enabled due to an unconditional code
to a routine in that module.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-01-25 13:33:22 +11:00
Christian Kujau
897e01a08c powerpc/crash: Fix build error without SMP
I could not find cpus_in_crash anywhere in the sourcetree, except for
arch/powerpc/kernel/crash.c. Moving the definition into the CONFIG_SMP
fixes it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-01-25 09:47:45 +11:00
Linus Torvalds
f429ee3b80 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits)
  audit: no leading space in audit_log_d_path prefix
  audit: treat s_id as an untrusted string
  audit: fix signedness bug in audit_log_execve_info()
  audit: comparison on interprocess fields
  audit: implement all object interfield comparisons
  audit: allow interfield comparison between gid and ogid
  audit: complex interfield comparison helper
  audit: allow interfield comparison in audit rules
  Kernel: Audit Support For The ARM Platform
  audit: do not call audit_getname on error
  audit: only allow tasks to set their loginuid if it is -1
  audit: remove task argument to audit_set_loginuid
  audit: allow audit matching on inode gid
  audit: allow matching on obj_uid
  audit: remove audit_finish_fork as it can't be called
  audit: reject entry,always rules
  audit: inline audit_free to simplify the look of generic code
  audit: drop audit_set_macxattr as it doesn't do anything
  audit: inline checks for not needing to collect aux records
  audit: drop some potentially inadvisable likely notations
  ...

Use evil merge to fix up grammar mistakes in Kconfig file.

Bad speling and horrible grammar (and copious swearing) is to be
expected, but let's keep it to commit messages and comments, rather than
expose it to users in config help texts or printouts.
2012-01-17 16:41:31 -08:00
Eric Paris
b05d8447e7 audit: inline audit_syscall_entry to reduce burden on archs
Every arch calls:

if (unlikely(current->audit_context))
	audit_syscall_entry()

which requires knowledge about audit (the existance of audit_context) in
the arch code.  Just do it all in static inline in audit.h so that arch's
can remain blissfully ignorant.

Signed-off-by: Eric Paris <eparis@redhat.com>
2012-01-17 16:16:56 -05:00
Eric Paris
d7e7528bcd Audit: push audit success and retcode into arch ptrace.h
The audit system previously expected arches calling to audit_syscall_exit to
supply as arguments if the syscall was a success and what the return code was.
Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
by converting from negative retcodes to an audit internal magic value stating
success or failure.  This helper was wrong and could indicate that a valid
pointer returned to userspace was a failed syscall.  The fix is to fix the
layering foolishness.  We now pass audit_syscall_exit a struct pt_reg and it
in turns calls back into arch code to collect the return value and to
determine if the syscall was a success or failure.  We also define a generic
is_syscall_success() macro which determines success/failure based on if the
value is < -MAX_ERRNO.  This works for arches like x86 which do not use a
separate mechanism to indicate syscall failure.

We make both the is_syscall_success() and regs_return_value() static inlines
instead of macros.  The reason is because the audit function must take a void*
for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
function takes a void* we need to use static inlines to cast it back to the
arch correct structure to dereference it.

The other major change is that on some arches, like ia64, MIPS and ppc, we
change regs_return_value() to give us the negative value on syscall failure.
THE only other user of this macro, kretprobe_example.c, won't notice and it
makes the value signed consistently for the audit functions across all archs.

In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
audit code as the return value.  But the ptrace_64.h code defined the macro
regs_return_value() as regs[3].  I have no idea which one is correct, but this
patch now uses the regs_return_value() function, so it now uses regs[3].

For powerpc we previously used regs->result but now use the
regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
always positive so the regs_return_value(), much like ia64 makes it negative
before calling the audit code when appropriate.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
Acked-by: Richard Weinberger <richard@nod.at> [for uml]
Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
2012-01-17 16:16:56 -05:00
Linus Torvalds
8e63dd6e1c Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix unpaired __trace_hcall_entry and __trace_hcall_exit
  powerpc: Fix RCU idle and hcall tracing
2012-01-14 12:26:23 -08:00
Joe Perches
ff2d8b19a3 treewide: convert uses of ATTRIB_NORETURN to __noreturn
Use the more commonly used __noreturn instead of ATTRIB_NORETURN.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-12 20:13:03 -08:00
Joe Perches
9402c95f34 treewide: remove useless NORET_TYPE macro and uses
It's a very old and now unused prototype marking so just delete it.

Neaten panic pointer argument style to keep checkpatch quiet.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-12 20:13:03 -08:00
Linus Torvalds
7b67e75147 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (80 commits)
  x86/PCI: Expand the x86_msi_ops to have a restore MSIs.
  PCI: Increase resource array mask bit size in pcim_iomap_regions()
  PCI: DEVICE_COUNT_RESOURCE should be equal to PCI_NUM_RESOURCES
  PCI: pci_ids: add device ids for STA2X11 device (aka ConneXT)
  PNP: work around Dell 1536/1546 BIOS MMCONFIG bug that breaks USB
  x86/PCI: amd: factor out MMCONFIG discovery
  PCI: Enable ATS at the device state restore
  PCI: msi: fix imbalanced refcount of msi irq sysfs objects
  PCI: kconfig: English typo in pci/pcie/Kconfig
  PCI/PM/Runtime: make PCI traces quieter
  PCI: remove pci_create_bus()
  xtensa/PCI: convert to pci_scan_root_bus() for correct root bus resources
  x86/PCI: convert to pci_create_root_bus() and pci_scan_root_bus()
  x86/PCI: use pci_scan_bus() instead of pci_scan_bus_parented()
  x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan
  sparc32, leon/PCI: convert to pci_scan_root_bus() for correct root bus resources
  sparc/PCI: convert to pci_create_root_bus()
  sh/PCI: convert to pci_scan_root_bus() for correct root bus resources
  powerpc/PCI: convert to pci_create_root_bus()
  powerpc/PCI: split PHB part out of pcibios_map_io_space()
  ...

Fix up conflicts in drivers/pci/msi.c and include/linux/pci_regs.h due
to the same patches being applied in other branches.
2012-01-11 18:50:26 -08:00
Linus Torvalds
e343a895a9 lib: use generic pci_iomap on all architectures
Many architectures don't want to pull in iomap.c,
 so they ended up duplicating pci_iomap from that file.
 That function isn't trivial, and we are going to modify it
 https://lkml.org/lkml/2011/11/14/183
 so the duplication hurts.
 
 This reduces the scope of the problem significantly,
 by moving pci_iomap to a separate file and
 referencing that from all architectures.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJPBZXBAAoJECgfDbjSjVRpuuYIAIMD0wE96MuTOSBJX4VG8VAP
 UyjL9dsfMRy8CKioQo5/fxpTY07YBCWmNauSSX7pzgcoUKBfYIGn4Z1qwGYsWK9M
 CzLs6PXLTugw0FtKobHZl/klRTWEBS6YOUjp9x568rplwF+Ppk7b993uj7eS/g+e
 T0mUKzqg4/UavbHd9+W5KgC4drQ5hgtu2WZHoUxBK4umnd3C2G+U82Sthg50o/XU
 SC8IGm39K8I36HoIWgXj3Y7nkOP3mQELohOT4ZPiVSmLvGS4i47+ix75anO+8ZvZ
 jxHr8RC85IK1Nd89NZhbKOyvx0QQiwoKUZaTwcWXJNSOADzZnM6icdIsodc+Elo=
 =ccQZ
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

lib: use generic pci_iomap on all architectures

Many architectures don't want to pull in iomap.c,
so they ended up duplicating pci_iomap from that file.
That function isn't trivial, and we are going to modify it
https://lkml.org/lkml/2011/11/14/183
so the duplication hurts.

This reduces the scope of the problem significantly,
by moving pci_iomap to a separate file and
referencing that from all architectures.

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  alpha: drop pci_iomap/pci_iounmap from pci-noop.c
  mn10300: switch to GENERIC_PCI_IOMAP
  mn10300: add missing __iomap markers
  frv: switch to GENERIC_PCI_IOMAP
  tile: switch to GENERIC_PCI_IOMAP
  tile: don't panic on iomap
  sparc: switch to GENERIC_PCI_IOMAP
  sh: switch to GENERIC_PCI_IOMAP
  powerpc: switch to GENERIC_PCI_IOMAP
  parisc: switch to GENERIC_PCI_IOMAP
  mips: switch to GENERIC_PCI_IOMAP
  microblaze: switch to GENERIC_PCI_IOMAP
  arm: switch to GENERIC_PCI_IOMAP
  alpha: switch to GENERIC_PCI_IOMAP
  lib: add GENERIC_PCI_IOMAP
  lib: move GENERIC_IOMAP to lib/Kconfig

Fix up trivial conflicts due to changes nearby in arch/{m68k,score}/Kconfig
2012-01-10 18:04:27 -08:00
Anton Blanchard
a5ccfee05a powerpc: Fix RCU idle and hcall tracing
Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit
region. Since pSeries calls H_CEDE in the idle loop, we were violating
this rule.

commit a7b152d534 (powerpc: Tell RCU about idle after hcall tracing)
tried to work around it by delaying the rcu_idle_enter until after we
called the hcall tracepoint, but there are a number of issues with it.

The hcall tracepoint trampoline code is called conditionally when the
tracepoint is enabled. If the tracepoint is not enabled we never call
rcu_idle_enter. The idle_uses_rcu check was also done at compile time
which breaks multiplatform builds.

The simple fix is to avoid tracing H_CEDE and rely on other tracepoints
and the hypervisor dispatch trace log to work out if we called H_CEDE.

This fixes a hang during boot on pSeries.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-01-11 11:54:20 +11:00
Linus Torvalds
5983faf942 Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (65 commits)
  tty: serial: imx: move del_timer_sync() to avoid potential deadlock
  imx: add polled io uart methods
  imx: Add save/restore functions for UART control regs
  serial/imx: let probing fail for the dt case without a valid alias
  serial/imx: propagate error from of_alias_get_id instead of using -ENODEV
  tty: serial: imx: Allow UART to be a source for wakeup
  serial: driver for m32 arch should not have DEC alpha errata
  serial/documentation: fix documented name of DCD cpp symbol
  atmel_serial: fix spinlock lockup in RS485 code
  tty: Fix memory leak in virtual console when enable unicode translation
  serial: use DIV_ROUND_CLOSEST instead of open coding it
  serial: add support for 400 and 800 v3 series Titan cards
  serial: bfin-uart: Remove ASYNC_CTS_FLOW flag for hardware automatic CTS.
  serial: bfin-uart: Enable hardware automatic CTS only when CTS pin is available.
  serial: make FSL errata depend on 8250_CONSOLE, not just 8250
  serial: add irq handler for Freescale 16550 errata.
  serial: manually inline serial8250_handle_port
  serial: make 8250 timeout use the specified IRQ handler
  serial: export the key functions for an 8250 IRQ handler
  serial: clean up parameter passing for 8250 Rx IRQ handling
  ...
2012-01-09 12:09:24 -08:00
Linus Torvalds
eb59c505f8 Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
* 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
  PM / Hibernate: Implement compat_ioctl for /dev/snapshot
  PM / Freezer: fix return value of freezable_schedule_timeout_killable()
  PM / shmobile: Allow the A4R domain to be turned off at run time
  PM / input / touchscreen: Make st1232 use device PM QoS constraints
  PM / QoS: Introduce dev_pm_qos_add_ancestor_request()
  PM / shmobile: Remove the stay_on flag from SH7372's PM domains
  PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume
  PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode
  PM: Drop generic_subsys_pm_ops
  PM / Sleep: Remove forward-only callbacks from AMBA bus type
  PM / Sleep: Remove forward-only callbacks from platform bus type
  PM: Run the driver callback directly if the subsystem one is not there
  PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers
  PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412.
  PM / Sleep: Merge internal functions in generic_ops.c
  PM / Sleep: Simplify generic system suspend callbacks
  PM / Hibernate: Remove deprecated hibernation snapshot ioctls
  PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()
  ARM: S3C64XX: Implement basic power domain support
  PM / shmobile: Use common always on power domain governor
  ...

Fix up trivial conflict in fs/xfs/xfs_buf.c due to removal of unused
XBT_FORCE_SLEEP bit
2012-01-08 13:10:57 -08:00
Linus Torvalds
972b2c7199 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
  reiserfs: Properly display mount options in /proc/mounts
  vfs: prevent remount read-only if pending removes
  vfs: count unlinked inodes
  vfs: protect remounting superblock read-only
  vfs: keep list of mounts for each superblock
  vfs: switch ->show_options() to struct dentry *
  vfs: switch ->show_path() to struct dentry *
  vfs: switch ->show_devname() to struct dentry *
  vfs: switch ->show_stats to struct dentry *
  switch security_path_chmod() to struct path *
  vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
  vfs: trim includes a bit
  switch mnt_namespace ->root to struct mount
  vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
  vfs: opencode mntget() mnt_set_mountpoint()
  vfs: spread struct mount - remaining argument of next_mnt()
  vfs: move fsnotify junk to struct mount
  vfs: move mnt_devname
  vfs: move mnt_list to struct mount
  vfs: switch pnode.h macros to struct mount *
  ...
2012-01-08 12:19:57 -08:00
Linus Torvalds
7affca3537 Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
* 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (73 commits)
  arm: fix up some samsung merge sysdev conversion problems
  firmware: Fix an oops on reading fw_priv->fw in sysfs loading file
  Drivers:hv: Fix a bug in vmbus_driver_unregister()
  driver core: remove __must_check from device_create_file
  debugfs: add missing #ifdef HAS_IOMEM
  arm: time.h: remove device.h #include
  driver-core: remove sysdev.h usage.
  clockevents: remove sysdev.h
  arm: convert sysdev_class to a regular subsystem
  arm: leds: convert sysdev_class to a regular subsystem
  kobject: remove kset_find_obj_hinted()
  m86k: gpio - convert sysdev_class to a regular subsystem
  mips: txx9_sram - convert sysdev_class to a regular subsystem
  mips: 7segled - convert sysdev_class to a regular subsystem
  sh: dma - convert sysdev_class to a regular subsystem
  sh: intc - convert sysdev_class to a regular subsystem
  power: suspend - convert sysdev_class to a regular subsystem
  power: qe_ic - convert sysdev_class to a regular subsystem
  power: cmm - convert sysdev_class to a regular subsystem
  s390: time - convert sysdev_class to a regular subsystem
  ...

Fix up conflicts with 'struct sysdev' removal from various platform
drivers that got changed:
 - arch/arm/mach-exynos/cpu.c
 - arch/arm/mach-exynos/irq-eint.c
 - arch/arm/mach-s3c64xx/common.c
 - arch/arm/mach-s3c64xx/cpu.c
 - arch/arm/mach-s5p64x0/cpu.c
 - arch/arm/mach-s5pv210/common.c
 - arch/arm/plat-samsung/include/plat/cpu.h
 - arch/powerpc/kernel/sysfs.c
and fix up cpu_is_hotpluggable() as per Greg in include/linux/cpu.h
2012-01-07 12:03:30 -08:00
Linus Torvalds
e4e88f31bc Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (185 commits)
  powerpc: fix compile error with 85xx/p1010rdb.c
  powerpc: fix compile error with 85xx/p1023_rds.c
  powerpc/fsl: add MSI support for the Freescale hypervisor
  arch/powerpc/sysdev/fsl_rmu.c: introduce missing kfree
  powerpc/fsl: Add support for Integrated Flash Controller
  powerpc/fsl: update compatiable on fsl 16550 uart nodes
  powerpc/85xx: fix PCI and localbus properties in p1022ds.dts
  powerpc/85xx: re-enable ePAPR byte channel driver in corenet32_smp_defconfig
  powerpc/fsl: Update defconfigs to enable some standard FSL HW features
  powerpc: Add TBI PHY node to first MDIO bus
  sbc834x: put full compat string in board match check
  powerpc/fsl-pci: Allow 64-bit PCIe devices to DMA to any memory address
  powerpc: Fix unpaired probe_hcall_entry and probe_hcall_exit
  offb: Fix setting of the pseudo-palette for >8bpp
  offb: Add palette hack for qemu "standard vga" framebuffer
  offb: Fix bug in calculating requested vram size
  powerpc/boot: Change the WARN to INFO for boot wrapper overlap message
  powerpc/44x: Fix build error on currituck platform
  powerpc/boot: Change the load address for the wrapper to fit the kernel
  powerpc/44x: Enable CRASH_DUMP for 440x
  ...

Fix up a trivial conflict in arch/powerpc/include/asm/cputime.h due to
the additional sparse-checking code for cputime_t.
2012-01-06 17:58:22 -08:00
Bjorn Helgaas
45a709f890 powerpc/PCI: convert to pci_create_root_bus()
Convert from pci_create_bus() to pci_create_root_bus().  This way the root
bus resources are correct immediately.  This patch doesn't fix a problem
because powerpc fixed the resources before scanning the bus, but it makes
powerpc more consistent with other architectures.

v2: fix build error with resource pointer passing

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:09 -08:00
Bjorn Helgaas
49a6cba4eb powerpc/PCI: split PHB part out of pcibios_map_io_space()
No functional change.  This is so we can use pcibios_phb_map_io_space()
before we have a struct pci_bus.

v2: fix map io phb typo

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:08 -08:00
Bjorn Helgaas
a46770f5b9 powerpc/PCI: make pcibios_setup_phb_resources() static
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:11:08 -08:00
Myron Stowe
79c8be8384 PCI: PowerPC: convert pcibios_set_master() to a non-inlined function
This patch converts PowerPC's architecture-specific
'pcibios_set_master()' routine to a non-inlined function.  This will
allow follow on patches to create a generic 'pcibios_set_master()'
function using the '__weak' attribute which can be used by all
architectures as a default which, if necessary, can then be over-
ridden by architecture-specific code.

Converting 'pci_bios_set_master()' to a non-inlined function will
allow PowerPC's 'pcibios_set_master()' implementation to remain
architecture-specific after the generic version is introduced and
thus, not change current behavior.

No functional change.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2012-01-06 12:10:38 -08:00
Greg Kroah-Hartman
ff4b8a57f0 Merge branch 'driver-core-next' into Linux 3.2
This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file,
and it fixes the build error in the arch/x86/kernel/microcode_core.c
file, that the merge did not catch.

The microcode_core.c patch was provided by Stephen Rothwell
<sfr@canb.auug.org.au> who was invaluable in the merge issues involved
with the large sysdev removal process in the driver-core tree.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-06 11:42:52 -08:00
Linus Torvalds
423d091dfe Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
  cpu: Export cpu_up()
  rcu: Apply ACCESS_ONCE() to rcu_boost() return value
  Revert "rcu: Permit rt_mutex_unlock() with irqs disabled"
  docs: Additional LWN links to RCU API
  rcu: Augment rcu_batch_end tracing for idle and callback state
  rcu: Add rcutorture tests for srcu_read_lock_raw()
  rcu: Make rcutorture test for hotpluggability before offlining CPUs
  driver-core/cpu: Expose hotpluggability to the rest of the kernel
  rcu: Remove redundant rcu_cpu_stall_suppress declaration
  rcu: Adaptive dyntick-idle preparation
  rcu: Keep invoking callbacks if CPU otherwise idle
  rcu: Irq nesting is always 0 on rcu_enter_idle_common
  rcu: Don't check irq nesting from rcu idle entry/exit
  rcu: Permit dyntick-idle with callbacks pending
  rcu: Document same-context read-side constraints
  rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass
  rcu: Remove dynticks false positives and RCU failures
  rcu: Reduce latency of rcu_prepare_for_idle()
  rcu: Eliminate RCU_FAST_NO_HZ grace-period hang
  rcu: Avoid needlessly IPIing CPUs at GP end
  ...
2012-01-06 08:02:40 -08:00
Al Viro
d161a13f97 switch procfs to umode_t use
both proc_dir_entry ->mode and populating functions

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-03 22:54:56 -05:00
Kay Sievers
10fbcf4c6c convert 'memory' sysdev_class to a regular subsystem
This moves the 'memory sysdev_class' over to a regular 'memory' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 14:48:43 -08:00
Kay Sievers
8a25a2fd12 cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem
This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 14:29:42 -08:00
Rafael J. Wysocki
90363ddf0a PM: Drop generic_subsys_pm_ops
Since the PM core is now going to execute driver callbacks directly
if the corresponding subsystem callbacks are not present,
forward-only subsystem callbacks (i.e. such that only execute the
corresponding driver callbacks) are not necessary any more.  Thus
it is possible to remove generic_subsys_pm_ops, because the only
callback in there that is not forward-only, .runtime_idle, is not
really used by the only user of generic_subsys_pm_ops, which is
vio_bus_type.

However, the generic callback routines themselves cannot be removed
from generic_ops.c, because they are used individually by a number
of subsystems.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-12-21 22:03:32 +01:00
Peter Zijlstra
35edc2a509 perf, arch: Rework perf_event_index()
Put the logic to compute the event index into a per pmu method. This
is required because the x86 rules are weird and wonderful and don't
match the capabilities of the current scheme.

AFAIK only powerpc actually has a usable userspace read of the PMCs
but I'm not at all sure anybody actually used that.

ARM is restored to the default since it currently does not support
userspace access at all. And all software events are provided with a
method that reports their index as 0 (disabled).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arun Sharma <asharma@fb.com>
Link: http://lkml.kernel.org/n/tip-dfydxodki16lylkt3gl2j7cw@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-21 11:01:07 +01:00
Suzuki Poulose
26ecb6c44b powerpc/44x: Enable CONFIG_RELOCATABLE for PPC44x
The following patch adds relocatable kernel support - based on processing
of dynamic relocations - for PPC44x kernel.

We find the runtime address of _stext and relocate ourselves based
on the following calculation.

	virtual_base = ALIGN(KERNELBASE,256M) +
			MODULO(_stext.run,256M)

relocate() is called with the Effective Virtual Base Address (as
shown below)

            | Phys. Addr| Virt. Addr |
Page (256M) |------------------------|
Boundary    |           |            |
            |           |            |
            |           |            |
Kernel Load |___________|_ __ _ _ _ _|<- Effective
Addr(_stext)|           |      ^     |Virt. Base Addr
            |           |      |     |
            |           |      |     |
            |           |reloc_offset|
            |           |      |     |
            |           |      |     |
            |           |______v_____|<-(KERNELBASE)%256M
            |           |            |
            |           |            |
            |           |            |
Page(256M)  |-----------|------------|
Boundary    |           |            |

The virt_phys_offset is updated accordingly, i.e,

	virt_phys_offset = effective. kernel virt base - kernstart_addr

I have tested the patches on 440x platforms only. However this should
work fine for PPC_47x also, as we only depend on the runtime address
and the current TLB XLAT entry for the startup code, which is available
in r25. I don't have access to a 47x board yet. So, it would be great if
somebody could test this on 47x.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Tony Breeds <tony@bakeyournoodle.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-12-20 10:21:57 -05:00
Suzuki Poulose
9c5f7d39a8 powerpc: Process dynamic relocations for kernel
The following patch implements the dynamic relocation processing for
PPC32 kernel. relocate() accepts the target virtual address and relocates
 the kernel image to the same.

Currently the following relocation types are handled :

	R_PPC_RELATIVE
	R_PPC_ADDR16_LO
	R_PPC_ADDR16_HI
	R_PPC_ADDR16_HA

The last 3 relocations in the above list depends on value of Symbol indexed
whose index is encoded in the Relocation entry. Hence we need the Symbol
Table for processing such relocations.

Note: The GNU ld for ppc32 produces buggy relocations for relocation types
that depend on symbols. The value of the symbols with STB_LOCAL scope
should be assumed to be zero. - Alan Modra

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Alan Modra <amodra@au1.ibm.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-12-20 10:21:08 -05:00
Suzuki Poulose
2391324545 powerpc/44x: Enable DYNAMIC_MEMSTART for 440x
DYNAMIC_MEMSTART(old RELOCATABLE) was restricted only to PPC_47x variants
of 44x. This patch enables DYNAMIC_MEMSTART for 440x based chipsets.

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux ppc dev <linuxppc-dev@lists.ozlabs.org>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-12-20 10:20:38 -05:00
Suzuki Poulose
0f890c8d20 powerpc: Rename mapping based RELOCATABLE to DYNAMIC_MEMSTART for BookE
The current implementation of CONFIG_RELOCATABLE in BookE is based
on mapping the page aligned kernel load address to KERNELBASE. This
approach however is not enough for platforms, where the TLB page size
is large (e.g, 256M on 44x). So we are renaming the RELOCATABLE used
currently in BookE to DYNAMIC_MEMSTART to reflect the actual method.

The CONFIG_RELOCATABLE for PPC32(BookE) based on processing of the
dynamic relocations will be introduced in the later in the patch series.

This change would allow the use of the old method of RELOCATABLE for
platforms which can afford to enforce the page alignment (platforms with
smaller TLB size).

Changes since v3:

* Introduced a new config, NONSTATIC_KERNEL, to denote a kernel which is
  either a RELOCATABLE or DYNAMIC_MEMSTART(Suggested by: Josh Boyer)

Suggested-by: Scott Wood <scottwood@freescale.com>
Tested-by: Scott Wood <scottwood@freescale.com>

Signed-off-by: Suzuki K. Poulose <suzuki@in.ibm.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linux ppc dev <linuxppc-dev@lists.ozlabs.org>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-12-20 10:20:19 -05:00
Benjamin Herrenschmidt
3f53638c80 powerpc: Fix old bug in prom_init setting of the color
We have an array of 16 entries and a loop of 32 iterations... oops.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-19 14:41:25 +11:00
Paul Mackerras
64968f60e7 powerpc: Only use initrd_end as the limit for alloc_bottom if it's inside the RMO.
As the kernels and initrd's get bigger boot-loaders and possibly
kexec-tools will need to place the initrd outside the RMO.  When this
happens we end up with no lowmem and the boot doesn't get very far.

Only use initrd_end as the limit for alloc_bottom if it's inside the
RMO.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-19 14:41:24 +11:00
Andreas Schwab
9f5072d4f6 powerpc: Fix wrong divisor in usecs_to_cputime
Commit d57af9b (taskstats: use real microsecond granularity for CPU times)
renamed msecs_to_cputime to usecs_to_cputime, but failed to update all
numbers on the way.  This causes nonsensical cpu idle/iowait values to be
displayed in /proc/stat (the only user of usecs_to_cputime so far).

This also renames __cputime_msec_factor to __cputime_usec_factor, adapting
its value and using it directly in cputime_to_usecs instead of doing two
multiplications.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-19 14:41:20 +11:00
Benjamin Herrenschmidt
1e7342e778 Merge remote-tracking branch 'jwb/next' into next
Conflicts:
	arch/powerpc/platforms/40x/ppc40x_simple.c
2011-12-16 11:24:25 +11:00
Benjamin Herrenschmidt
43ca5d347a Merge branch 'kexec' into next 2011-12-16 11:09:21 +11:00
Benjamin Herrenschmidt
efdad722ef Merge branch 'ps3' into next 2011-12-16 11:09:15 +11:00
Benjamin Herrenschmidt
e6f08d37e6 Merge branch 'cpuidle' into next 2011-12-16 11:09:11 +11:00
Frederic Weisbecker
1268fbc746 nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
Those two APIs were provided to optimize the calls of
tick_nohz_idle_enter() and rcu_idle_enter() into a single
irq disabled section. This way no interrupt happening in-between would
needlessly process any RCU job.

Now we are talking about an optimization for which benefits
have yet to be measured. Let's start simple and completely decouple
idle rcu and dyntick idle logics to simplify.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:57 -08:00
Paul E. McKenney
a7b152d534 powerpc: Tell RCU about idle after hcall tracing
The PowerPC pSeries platform (CONFIG_PPC_PSERIES=y) enables
hypervisor-call tracing for CONFIG_TRACEPOINTS=y kernels.  One of the
hypervisor calls that is traced is the H_CEDE call in the idle loop
that tells the hypervisor that this OS instance no longer needs the
current CPU.  However, tracing uses RCU, so this combination of kernel
configuration variables needs to avoid telling RCU about the current CPU's
idleness until after the H_CEDE-entry tracing completes on the one hand,
and must tell RCU that the the current CPU is no longer idle before the
H_CEDE-exit tracing starts.

In all other cases, it suffices to inform RCU of CPU idleness upon
idle-loop entry and exit.

This commit makes the required adjustments.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11 10:31:39 -08:00
Frederic Weisbecker
2bbb6817c0 nohz: Allow rcu extended quiescent state handling seperately from tick stop
It is assumed that rcu won't be used once we switch to tickless
mode and until we restart the tick. However this is not always
true, as in x86-64 where we dereference the idle notifiers after
the tick is stopped.

To prepare for fixing this, add two new APIs:
tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().

If no use of RCU is made in the idle loop between
tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch
must instead call the new *_norcu() version such that the arch doesn't
need to call rcu_idle_enter() and rcu_idle_exit().

Otherwise the arch must call tick_nohz_enter_idle() and
tick_nohz_exit_idle() and also call explicitly:

- rcu_idle_enter() after its last use of RCU before the CPU is put
to sleep.
- rcu_idle_exit() before the first use of RCU after the CPU is woken
up.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11 10:31:36 -08:00
Frederic Weisbecker
280f06774a nohz: Separate out irq exit and idle loop dyntick logic
The tick_nohz_stop_sched_tick() function, which tries to delay
the next timer tick as long as possible, can be called from two
places:

- From the idle loop to start the dytick idle mode
- From interrupt exit if we have interrupted the dyntick
idle mode, so that we reprogram the next tick event in
case the irq changed some internal state that requires this
action.

There are only few minor differences between both that
are handled by that function, driven by the ts->inidle
cpu variable and the inidle parameter. The whole guarantees
that we only update the dyntick mode on irq exit if we actually
interrupted the dyntick idle mode, and that we enter in RCU extended
quiescent state from idle loop entry only.

Split this function into:

- tick_nohz_idle_enter(), which sets ts->inidle to 1, enters
dynticks idle mode unconditionally if it can, and enters into RCU
extended quiescent state.

- tick_nohz_irq_exit() which only updates the dynticks idle mode
when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).

To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
into tick_nohz_idle_exit().

This simplifies the code and micro-optimize the irq exit path (no need
for local_irq_save there). This also prepares for the split between
dynticks and rcu extended quiescent state logics. We'll need this split to
further fix illegal uses of RCU in extended quiescent states in the idle
loop.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: David Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11 10:31:35 -08:00
Paul Gortmaker
9deaa53ac7 serial: add irq handler for Freescale 16550 errata.
Sending a break on the SOC UARTs found in some MPC83xx/85xx/86xx
chips seems to cause a short lived IRQ storm (/proc/interrupts
typically shows somewhere between 300 and 1500 events).  Unfortunately
this renders SysRQ over the serial console completely inoperable.

The suggested workaround in the errata is to read the Rx register,
wait one character period, and then read the Rx register again.
We achieve this by tracking the old LSR value, and on the subsequent
interrupt event after a break, we don't read LSR, instead we just
read the RBR again and return immediately.

The "fsl,ns16550" is used in the compatible field of the serial
device to mark UARTs known to have this issue.

Thanks to Scott Wood for providing the errata data which led to
a much cleaner fix.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-09 19:14:13 -08:00
Tony Breeds
df777bd39a powerpc/476fpe: Add 476fpe SoC code
Based on original work by David 'Shaggy' Kleikamp.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-12-09 07:51:02 -05:00
Tejun Heo
1aadc0560f memblock: s/memblock_analyze()/memblock_allow_resize()/ and update users
The only function of memblock_analyze() is now allowing resize of
memblock region arrays.  Rename it to memblock_allow_resize() and
update its users.

* The following users remain the same other than renaming.

  arm/mm/init.c::arm_memblock_init()
  microblaze/kernel/prom.c::early_init_devtree()
  powerpc/kernel/prom.c::early_init_devtree()
  openrisc/kernel/prom.c::early_init_devtree()
  sh/mm/init.c::paging_init()
  sparc/mm/init_64.c::paging_init()
  unicore32/mm/init.c::uc32_memblock_init()

* In the following users, analyze was used to update total size which
  is no longer necessary.

  powerpc/kernel/machine_kexec.c::reserve_crashkernel()
  powerpc/kernel/prom.c::early_init_devtree()
  powerpc/mm/init_32.c::MMU_init()
  powerpc/mm/tlb_nohash.c::__early_init_mmu()  
  powerpc/platforms/ps3/mm.c::ps3_mm_add_memory()
  powerpc/platforms/embedded6xx/wii.c::wii_memory_fixups()
  sh/kernel/machine_kexec.c::reserve_crashkernel()

* x86/kernel/e820.c::memblock_x86_fill() was directly setting
  memblock_can_resize before populating memblock and calling analyze
  afterwards.  Call memblock_allow_resize() before start populating.

memblock_can_resize is now static inside memblock.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-12-08 10:22:08 -08:00
Tejun Heo
6fbef13c4f powerpc: Cleanup memblock usage
* early_init_devtree(): Total memory size is aligned to PAGE_SIZE;
  however, alignment isn't enforced if memory_limit is explicitly
  specified.  Simplify the logic and always apply PAGE_SIZE alignment.

* MMU_init(): memblock regions is truncated by directly modifying
  memblock.memory.cnt.  This is incomplete (reserved array is not
  truncated) and unnecessarily low level hindering further memblock
  improvments.  Use memblock_enforce_memory_limit() instead.

* wii_memory_fixups(): Unnecessarily low level direct manipulation of
  memblock regions.  The same result can be achieved using properly
  abstracted operations.  Reimplement using memblock API.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
2011-12-08 10:22:07 -08:00
Tejun Heo
fe091c208a memblock: Kill memblock_init()
memblock_init() initializes arrays for regions and memblock itself;
however, all these can be done with struct initializers and
memblock_init() can be removed.  This patch kills memblock_init() and
initializes memblock with struct initializer.

The only difference is that the first dummy entries don't have .nid
set to MAX_NUMNODES initially.  This doesn't cause any behavior
difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "H. Peter Anvin" <hpa@zytor.com>
2011-12-08 10:22:07 -08:00
Paul Mackerras
2fde6d20bb powerpc: Provide a way for KVM to indicate that NV GPR values are lost
This fixes a problem where a CPU thread coming out of nap mode can
think it has valid values in the nonvolatile GPRs (r14 - r31) as saved
away in power7_idle, but in fact the values have been trashed because
the thread was used for KVM in the mean time.  The result is that the
thread crashes because code that called power7_idle (e.g.,
pnv_smp_cpu_kill_self()) goes to use values in registers that have
been trashed.

The bit field in SRR1 that tells whether state was lost only reflects
the most recent nap, which may not have been the nap instruction in
power7_idle.  So we need an extra PACA field to indicate that state
has been lost even if SRR1 indicates that the most recent nap didn't
lose state.  We clear this field when saving the state in power7_idle,
we set it to a non-zero value when we use the thread for KVM, and we
test it in power7_wakeup_noloss.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:22:53 +11:00
Paul Mackerras
cba313da5c powerpc/powernv: Fix problems in onlining CPUs
At present, on the powernv platform, if you off-line a CPU that was
online, and then try to on-line it again, the kernel generates a
warning message "OPAL Error -1 starting CPU n".  Furthermore, if the
CPU is a secondary thread that was used by KVM while it was off-line,
the CPU fails to come online.

The first problem is fixed by only calling OPAL to start the CPU the
first time it is on-lined, as indicated by the cpu_start field of its
PACA being zero.  The second problem is fixed by restoring the
cpu_start field to 1 instead of 0 when using the CPU within KVM.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:22:53 +11:00
Anton Blanchard
3339264042 powerpc/pseries: Increase minimum RMO size from 64MB to 256MB
The minimum RMO size field in ibm,client-architecture is currently
ignored, but a future firmware version will rectify that. Since we
always get at least 128MB of RMO right now, asking for 64MB is
likely to result in boot failures.

We should bump it to at least 128MB, but considering all the boot
issues we have on 128MB RMO boxes and all new machines have virtual
RMO, we may as well set our minimum to 256MB.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:22:53 +11:00
Geoff Levand
816cb49a4b powerpc/ps3: Fix hcall lv1_get_version_info
The lv1_get_version_info hcall takes 2, not 1 output
arguments.  Adjust the lv1 hcall table and all calls.

Usage:

  int lv1_get_version_info(u64 *version_number, u64 *vendor_id)

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:05:54 +11:00
Anton Blanchard
2440c01e10 powerpc/kdump: Only save CPU state first time through the secondary CPU capture code
We might enter the secondary CPU capture code twice, eg if we have to
unstick some CPUs with a system reset. In this case we don't want to
overwrite the state on CPUs that had made it into the capture code OK,
so use the cpus_state_saved cpumask for that and make it local to
crash_ipi_callback.

For controlling progress now use atomic_t cpus_in_crash to count how
many CPUs have made it into the kdump code, and time_to_dump to tell
everyone it's time to dump.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:02:24 +11:00
Anton Blanchard
549e88a134 powerpc/kdump: Delay before sending IPI on a system reset
If we enter the kdump code via system reset, wait a bit before
sending the IPI to capture all secondary CPUs. Without it we race
with the hypervisor that is issuing the system reset to each CPU.
If the IPI gets there first the system reset oops output then shows
the register state of the IPI handler which is not what we want.

I took the opportunity to add defines for all the various delays
we have. There's no need for cpu_relax when we are doing an mdelay,
so remove them too.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:02:24 +11:00
Anton Blanchard
760ca4dc90 powerpc: Rework die()
Our die() code was based off a very old x86 version. Update it to
mirror the current x86 code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:02:23 +11:00
Anton Blanchard
8c27474a25 powerpc: Cleanup crash/kexec code
Remove some unnecessary defines and fix some spelling mistakes.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:02:23 +11:00
Anton Blanchard
07fe0c6132 powerpc/kdump: Use setjmp/longjmp to handle kdump and system reset recursion
We can handle recursion caused by system reset by reusing the crash
shutdown fault handler.

Since we don't have an OS triggerable NMI, if all CPUs don't make it
into kdump then we tell the user to issue a system reset. However if
we have a panic timeout set we cannot wait forever and must continue
the kdump.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:02:22 +11:00
Anton Blanchard
9b00ac0697 powerpc: Remove broken and complicated kdump system reset code
We have a lot of complicated logic that handles possible recursion between
kdump and a system reset exception. We can solve this in a much simpler
way using the same setjmp/longjmp tricks xmon does.

As a first step, this patch removes the old system reset code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:02:22 +11:00
Anton Blanchard
58154c8ce7 powerpc: Give us time to get all oopses out before panicking
I've been seeing truncated output when people send system reset info
to me. We should see a backtrace for every CPU, but the panic() code
takes the box down before they all make it out to the console. The
panic code runs unlocked so we also see corrupted console output.

If we are going to panic, then delay 1 second before calling into the
panic code. Move oops_exit inside the die lock and put a newline
between oopses for clarity.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 14:02:22 +11:00
Deepthi Dharwar
707827f338 powerpc/cpuidle: cpuidle driver for pSeries
This patch implements a back-end cpuidle driver for pSeries
based on pseries_dedicated_idle_loop and pseries_shared_idle_loop
routines.  The driver is built only if CONFIG_CPU_IDLE is set. This
cpuidle driver uses global registration of idle states and
not per-cpu.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 13:56:31 +11:00
Deepthi Dharwar
771dae8189 powerpc/cpuidle: Add cpu_idle_wait() to allow switching of idle routines
This patch provides cpu_idle_wait() routine for the powerpc
platform which is required by the cpuidle subsystem. This
routine is required to change the idle handler on SMP systems.
The equivalent routine for x86 is in arch/x86/kernel/process.c
but the powerpc implementation is different.

cpuidle_disable variable is to enable/disable cpuidle
framework if power_save option is set during the boot
time.

Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-08 13:54:58 +11:00
Benjamin Herrenschmidt
faa8bf8878 Merge branch 'booke-hugetlb' into next 2011-12-08 13:20:34 +11:00
Benjamin Herrenschmidt
4666ca2aa3 powerpc/pci: Make pci_read_irq_line() static
It's only used inside the same file where it's defined. There's
also no point exporting it anymore.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-07 18:04:58 +11:00
Benjamin Herrenschmidt
40dfef66a9 powerpc/powernv: Workaround OFW issues in prom_init.c
Open Firmware on OPAL machines seems to have issues if we close
stdin and/or we try to print things after calling "quiesce" so
we avoid doing both.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-07 18:04:53 +11:00
Becky Bruce
a6146888be powerpc: Add gpages reservation code for 64-bit FSL BOOKE
For 64-bit FSL_BOOKE implementations, gigantic pages need to be
reserved at boot time by the memblock code based on the command line.
This adds the call that handles the reservation, and fixes some code
comments.

It also removes the previous pr_err when reserve_hugetlb_gpages
is called on a system without hugetlb enabled - the way the code is
structured, the call is unconditional and the resulting error message
spurious and confusing.

Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-12-07 16:26:23 +11:00
Tanmay Inamdar
d5b9ee7b51 powerpc/40x: Add APM8018X SOC support
The AppliedMicro APM8018X embedded processor targets embedded applications that
require low power and a small footprint. It features a PowerPC 405 processor
core built in a 65nm low-power CMOS process with a five-stage pipeline executing
up to one instruction per cycle. The family has 128-kbytes of on-chip memory,
a 128-bit local bus and on-chip DDR2 SDRAM controller with 16-bit interface.

Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
Signed-off-by: Josh Boyer <jwboyer@gmail.com>
2011-11-30 10:02:15 -05:00
Michael S. Tsirkin
335b8cf7c3 powerpc: switch to GENERIC_PCI_IOMAP
powerpc copied pci_iomap from generic code, probably to avoid
pulling the rest of iomap.c in.  Since that's in
a separate file now, we can reuse the common implementation.

The only difference is handling of nocache flag,
that turns out to be done correctly by the
generic code since arch/powerpc/include/asm/io.h
defines ioremap_nocache same as ioremap.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-11-28 21:13:18 +02:00
Anton Blanchard
3bfd0c9c8f powerpc: Decode correct MSR bits in oops output
On a 64bit book3s machine I have an oops from a system reset that
claims the book3e CE bit was set:

MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 24004082  XER: 00000010

On a book3s machine system reset sets IBM bit 46 and 47 depending on
the power saving mode. Separate the definitions by type and for
completeness add the rest of the bits in.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-28 11:42:09 +11:00
Benjamin Herrenschmidt
56368797d6 Merge remote-tracking branch 'kumar/next' into next 2011-11-25 15:25:39 +11:00
Ananth N Mavinakayanahalli
595fe91447 powerpc: Export PIR data through sysfs
On Fri, Nov 11, 2011 at 10:17:55AM +0530, Ananth N Mavinakayanahalli wrote:
> >
> > At this rate we're going to end up with no bits left for CPU features
> > way too quickly... Especially for something we only care about once at
> > boot time.
> >
> > Wouldn't CPU_FTR_PPCAS_ARCH_V2 be a good enough test ?
>
> /me checks Cell manuals... yes, that test would be good enough. I will
> cook up a patch to use this.

Here it is...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:53:23 +11:00
Benjamin Herrenschmidt
184cd4a3b9 powerpc/powernv: PCI support for p7IOC under OPAL v2
This adds support for p7IOC (and possibly other IODA v1 IO Hubs)
using OPAL v2 interfaces.

We completely take over resource assignment and assign them using an
algorithm that hands out device BARs in a way that makes them fit in
individual segments of the M32 window of the bridge, which enables us
to assign individual PEs to devices and functions.

The current implementation gives out a PE per functions on PCIe, and a
PE for the entire bridge for PCIe to PCI-X bridges.

This can be adjusted / fine tuned later.

We also setup DMA resources (32-bit only for now) and MSIs (both 32-bit
and 64-bit MSI are supported).

The DMA allocation tries to divide the available 256M segments of the
32-bit DMA address space "fairly" among PEs. This is done using a
"weight" heuristic which assigns less value to things like OHCI USB
controllers than, for example SCSI RAID controllers. This algorithm
will probably want some fine tuning for specific devices or device
types.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:53:15 +11:00
Benjamin Herrenschmidt
48c2ce97fa powerpc/pci: Change how re-assigning resouces work
When PCI_REASSIGN_ALL_RSRC is set, we used to clear all bus resources
at the beginning of survey and re-allocate them later.

This changes it so instead, during early fixup, we mark all resources
as IORESOURCE_UNSET and move them down to be 0-based.

Later, if bus resources are still unset at the beginning of the survey,
then we clear them.

This shouldn't impact the re-assignment case on 4xx, but will enable
us to have the platform do some custom resource assignment before the
survey, by clearing individual resources IORESOURCE_UNSET bit.

Also limits the clutter in the kernel log from fixup when re-assigning
since we don't care about the offset applied to the BAR values in this
case.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:32:55 +11:00
Benjamin Herrenschmidt
491b98c315 powerpc/pci: Add a platform hook after probe and before resource survey
Some platforms need to perform resource allocation using a custom algorithm
due to HW constraints, or may want to tweak things globally below a host
bridge. For example OPAL support for IODA will need to perform a
resource allocation pass that applies IODA specific segmentation
constraints to MMIO which cannot be done simply using the kernel generic
resource management code.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:32:53 +11:00
Thomas Gleixner
3b5e16d7ad powerpc: Mark IPI interrupts IRQF_NO_THREAD
IPI handlers cannot be threaded. Remove the obsolete IRQF_DISABLED
flag (see commit e58aa3d2) while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:14:38 +11:00
Ravi K. Nittala
df17f56d8a powerpc/pseries: Cancel RTAS event scan before firmware flash
The RTAS firmware flash update is conducted using an RTAS call that is
serialized by lock_rtas() which uses spin_lock. While the flash is in
progress, rtasd performs scan for any RTAS events that are generated by
the system. rtasd keeps scanning for the RTAS events generated on the
machine. This is performed via workqueue mechanism. The rtas_event_scan()
also uses an RTAS call to scan the events, eventually trying to acquire
the spin_lock before issuing the request.

The flash update takes a while to complete and during this time, any other
RTAS call has to wait. In this case, rtas_event_scan() waits for a long time
on the spin_lock resulting in a soft lockup.

Fix: Just before the flash update is performed, the queued rtas_event_scan()
work item is cancelled from the work queue so that there is no other RTAS
call issued while the flash is in progress. After the flash completes, the
system reboots and the rtas_event_scan() is rescheduled.

Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com>
Signed-off-by: Ravi Nittala <ravi.nittala@in.ibm.com>
Reported-by: Divya Vikas <divya.vikas@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:11:29 +11:00
Jimi Xenidis
fac26ad4f9 powerpc/book3e: Add ICSWX/ACOP support to Book3e cores like A2
ICSWX is also used by the A2 processor to access coprocessors,
although not all "chips" that contain A2s have coprocessors.

Signed-off-by: Jimi Xenidis <jimix@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:11:28 +11:00
Anton Blanchard
7df1027542 powerpc/time: Optimise decrementer_check_overflow
decrementer_check_overflow is called from arch_local_irq_restore so
we want to make it as light weight as possible. As such, turn
decrementer_check_overflow into an inline function.

To avoid a circular mess of includes, separate out the two components
of struct decrementer_clock and keep the struct clock_event_device
part local to time.c.

The fast path improves from:

arch_local_irq_restore
     0:       mflr    r0
     4:       std     r0,16(r1)
     8:       stdu    r1,-112(r1)
     c:       stb     r3,578(r13)
    10:       cmpdi   cr7,r3,0
    14:       beq-    cr7,24 <.arch_local_irq_restore+0x24>
...
    24:       addi    r1,r1,112
    28:       ld      r0,16(r1)
    2c:       mtlr    r0
    30:       blr

to:

arch_local_irq_restore
    0:       std     r30,-16(r1)
    4:       ld      r30,0(r2)
    8:       stb     r3,578(r13)
    c:       cmpdi   cr7,r3,0
   10:       beq-    cr7,6c <.arch_local_irq_restore+0x6c>
...
   6c:       ld      r30,-16(r1)
   70:       blr

Unfortunately we still setup a local TOC (due to -mminimal-toc). Yet
another sign we should be moving to -mcmodel=medium.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:11:26 +11:00
Anton Blanchard
621692cb7e powerpc/time: Fix some style issues
Fix some formatting issues and use the DECREMENTER_MAX
define instead of 0x7fffffff.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:10:00 +11:00
Anton Blanchard
68568add2c powerpc/time: Remove unnecessary sanity check of decrementer expiration
The clockevents code uses max_delta_ns to avoid calling a
clockevent with too large a value.

Remove the redundant version of this in the timer_interrupt
code.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:59 +11:00
Anton Blanchard
11b8633ada powerpc/time: Use clocksource_register_hz
Use clocksource_register_hz which calculates the shift/mult
factors for us. Also remove the shift = 22 assumption in
vsyscall_update - thanks to Paul Mackerras and John Stultz for
catching that.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:59 +11:00
Anton Blanchard
d8afc6fd95 powerpc/time: Use clockevents_calc_mult_shift
We can use clockevents_calc_mult_shift instead of doing all
the work ourselves.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:59 +11:00
Anton Blanchard
37fb9a0231 powerpc/time: Handle wrapping of decrementer
When re-enabling interrupts we have code to handle edge sensitive
decrementers by resetting the decrementer to 1 whenever it is negative.
If interrupts were disabled long enough that the decrementer wrapped to
positive we do nothing. This means interrupts can be delayed for a long
time until it finally goes negative again.

While we hope interrupts are never be disabled long enough for the
decrementer to go positive, we have a very good test team that can
drive any kernel into the ground. The softlockup data we get back
from these fails could be seconds in the future, completely missing
the cause of the lockup.

We already keep track of the timebase of the next event so use that
to work out if we should trigger a decrementer exception.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-25 14:09:58 +11:00
Jason Jin
05737c7c5b powerpc/fsl-pci: Don't hide resource for pci/e when configured as Agent/EP
Current pci/pcie init code will hide the pci/pcie host resource.
But did not judge it is host/RC or agent/EP. If configured as
agent/EP, we should avoid hiding its resource in the host side.

In PCI system, the Programing Interface can be used to judge the
host/agent status:
Programing Interface = 0: host
Programing Interface = 1: Agent

In PCIE system, both the Programing Interface and Header type can
be used to judge the RC/EP status.
Header Type = 0: EP
Header Type = 1: RC

Signed-off-by: Jason Jin <Jason.jin@freescale.com>
Signed-off-by: Mingkai Hu <Mingkai.hu@freescale.com>
Signed-off-by: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-11-24 02:01:40 -06:00
Will Deacon
a313f4c55d powerpc/signal32: Fix sigset_t conversion when copying to user
On PPC64, put_sigset_t converts a sigset_t to a compat_sigset_t
before copying it to userspace. There is a typo in the case that
we have 4 words to copy, meaning that we corrupt the compat_sigset_t.

It appears that _NSIG_WORDS can't be greater than 2 at the moment
so this code is probably always optimised away anyway.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:41:10 +11:00
Kumar Gala
187b9f2aa7 powerpc/book3e-64: Fix debug support for userspace
With the introduction of CONFIG_PPC_ADV_DEBUG_REGS user space debug is
broken on Book-E 64-bit parts that support delayed debug events.  When
switch_booke_debug_regs() sets DBCR0 we'll start getting debug events as
MSR_DE is also set and we aren't able to handle debug events from kernel
space.

We can remove the hack that always enables MSR_DE and loads up DBCR0 and
just utilize switch_booke_debug_regs() to get user space debug working
again.

We still need to handle critical/debug exception stacks & proper
save/restore of state for those exception levles to support debug events
from kernel space like we have on 32-bit.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:26:07 +11:00
Kumar Gala
b95bc21914 powerpc: Remove extraneous CONFIG_PPC_ADV_DEBUG_REGS define
All of DebugException is already protected by CONFIG_PPC_ADV_DEBUG_REGS
there is no need to have another such ifdef inside the function.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:26:07 +11:00
Kumar Gala
ba28c9aae2 powerpc: Revert show_regs() define for readability
We had an existing ifdef for 4xx & BOOKE processors that got changed to
CONFIG_PPC_ADV_DEBUG_REGS.  The define has nothing to do with
CONFIG_PPC_ADV_DEBUG_REGS.  The define really should be:

 #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)

and not

 #ifdef CONFIG_PPC_ADV_DEBUG_REGS

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-17 16:26:07 +11:00
Kevin Hao
2cd76629f6 powerpc/trace: Add a dummy stack frame for trace_hardirqs_off
The trace_hardirqs_off will use CALLER_ADDR0 and CALLER_ADDR1.
If an exception occurs in user mode, there is only one stack frame
on the stack and accessing the CALLER_ADDR1 will causes the following
call trace. So we create a dummy stack frame to make
trace_hardirqs_off happy.

WARNING: at kernel/smp.c:459
Modules linked in:
NIP: c0093280 LR: c00930a0 CTR: c0010780
REGS: edb87ae0 TRAP: 0700   Not tainted  (3.1.0)
MSR: 00021002 <ME,CE>  CR: 28002888  XER: 00000000
TASK = edce2ac0[17658] 'mthread-lock-on' THREAD: edb86000 CPU: 5
GPR00: 00000001 edb87b90 edce2ac0 00000005 c0019594 edb87bd8 00000001 00000fe3
GPR08: 00041000 c084138c 4e20120d edb87b90 48002888 1001aa7c 00000000 00000000
GPR16: 48830000 10012a8c 00000000 10000af4 00000001 c0810000 00000000 00000000
GPR24: ee9aa920 c0816a18 00000000 00000005 c0019594 edb87bd8 ee20178c edb87b90
NIP [c0093280] smp_call_function_many+0x214/0x2b4
LR [c00930a0] smp_call_function_many+0x34/0x2b4
Call Trace:
[edb87b90] [c00930a0] smp_call_function_many+0x34/0x2b4 (unreliable)
[edb87bd0] [c00194ec] __flush_tlb_page+0xac/0x100
[edb87c00] [c001957c] flush_tlb_page+0x3c/0x54
[edb87c10] [c00180ac] ptep_set_access_flags+0x74/0x12c
[edb87c40] [c0128068] handle_pte_fault+0x2f0/0x9ac
[edb87cb0] [c0128c3c] handle_mm_fault+0x104/0x1dc
[edb87ce0] [c05f40f4] do_page_fault+0x2dc/0x630
[edb87e50] [c001078c] handle_page_fault+0xc/0x80

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:47:54 +11:00
Anton Blanchard
d715e433b7 powerpc: Copy down exception vectors after feature fixups
kdump fails because we try to execute an HV only instruction. Feature
fixups are being applied after we copy the exception vectors down to 0
so they miss out on any updates.

We have always had this issue but it only became critical in v3.0
when we added CFAR support (breaks POWER5) and v3.1 when we added
POWERNV (breaks everyone).

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org> [v3.0+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2011-11-16 14:47:54 +11:00