This changes the writeX family of functions to have a sync instruction
before the MMIO store rather than after, because the generally expected
behaviour is that the device receiving the MMIO store can be guaranteed
to see the effects of any preceding writes to normal memory.
To preserve ordering between writeX and readX, and to preserve ordering
between preceding stores and the readX, the readX family of functions
have had an sync added before the load.
Although writeX followed by spin_unlock is not officially guaranteed
to keep the writeX inside the spin-locked region unless an mmiowb()
is used, there are currently drivers that depend on the previous
behaviour on powerpc, which was that the mmiowb wasn't actually required.
Therefore we have a per-cpu flag that is set by writeX, cleared by
__raw_spin_lock and mmiowb, and tested by __raw_spin_unlock. If it is
set, __raw_spin_unlock does a sync and clears it.
This changes both 32-bit and 64-bit readX/writeX. 32-bit already has a
sync in __raw_spin_unlock (since lwsync doesn't exist on 32-bit), and thus
doesn't need the per-cpu flag.
Tested on G5 (PPC970) and POWER5.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add instrumentation for hypervisor calls on pseries. Call statistics
include number of calls, wall time and cpu cycles (if available) and
are made available via debugfs. Instrumentation code is behind the
HCALL_STATS config option and has no impact if not enabled.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The performance monitor implementation (including CTRL register behaviour)
is just included in PPC v2 as an example, it's not truly part of the base.
It's actually a somewhat misleading feature, but I'll leave that be for
now: The presence of the register is not what the feature bit is used
for, but instead it's used to determine if it contains the runlatch
bit for idle reporting of the performance monitor. For alternative
implementations, the register might still exist but the bit might have
different meaning (or no meaning at all).
For now, split it off and don't include it in CPU_FTR_PPCAS_ARCH_V2_BASE.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is required to generate proper core files using kdump on ppc64.
Create a backup region of 64K size irrespective of the PAGE SIZE.
At present 32K was used as backup size. In the case of 64K page size,
second PT_LOAD segments starts at 32K and the first one is not page
aligned. __ioremap() (crash_dump.c) fails if pfn = 0 which is the
case for the second PT_LOAD segment. This is not an issue for 4K page
size because the the first page (32K backup) is copied to second
kernel memory and thus referencing with the second kernel pfn.
Signed-off-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The sys_[gs]et_robust_list() syscalls were wired up on PowerPC but
didn't work correctly because futex_atomic_cmpxchg_inatomic() wasn't
implemented. Implement it, based on __cmpxchg_u32().
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
These are build fixes that enable (for example) libata and the ide
code to actually build on iSeries. The associated hardware will never
be supported on legacy iSeries, so the code paths don't actually need
to work, but it is useful (especially for a combined kernel) if the
code can build.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a problem introduced in 5db9fa9593.
The last_jiffy per-cpu variable is only 32 bits on 32-bit machines, but it
was being compared with a 64-bit quantity (tb_next_jiffy), which resulted in
time not advancing.
This fixes it by changing last_jiffy to be 64 bits on all platforms. With
this, we no longer need tb_last_stamp as a 32-bit version of tb_last_jiffy,
so this gets rid of tb_last_stamp and we just use tb_last_jiffy instead.
This also fixes a bug when the boot cpu is not online, because using
tb_last_stamp could have caused the wrong timebase origin value to be used
when calculating the time of day.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Device-tree bugs on js20 with some versions of SLOF were causing the
interrupt for IDE to not be parsed correctly and fail to boot. This
patch adds a bit more sanity checking to the parser to detect some of
those errors and fail instead of returning bogus information. The
powerpc PCI code can then trigger a fallback that works on those
machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds a new hardware information table for mpic. This enables
the mpic code to deal with mpic controllers with different register
layouts and hardware behaviours.
This introduces CONFIG_MPIC_WEIRD. For boards with non standard mpic
controllers, select CONFIG_MPIC_WEIRD and add its hardware information
in the mpic_infos[] array.
TSI108/109 PIC takes the first index of weird hardware information
table. :) The table can be extended. The Tsi108/109 PIC looks like
standard OpenPIC but, in fact, is different in register mapping and
behavior.
The patch does not affect the behavior of standard mpic. If
CONFIG_MPIC_WEIRD is not defined, the code is essentially identical to
the current code.
[benh@kernel.crashing.org:
This patch is a slightly cleaned up version of Zang Roy's support for
the TSI108 MPIC variant. It also fixes up MPC7448_hpc2 to use the new
version of the type macros and changes the way MPIC is selected in
Kconfig to better match what is done for other system devices.
]
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Keep from breaking 83xx arch/ppc build. Back up old school arch/powerpc/sysdev/ipic.[hc] to arch/ppc/syslib.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
After going through the trouble of setting up the PIC base
address in the pic@40000 device tree node, use it instead
of the obsolete hard-coded value.
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Several RTAS calls take a "config_addr" parameter, which is a particular
way of specifying a PCI busno, devfn and register number into a 32-bit word.
Currently these are open-coded, and I'll be adding another soon, replace
them with a helper that encapsulates the logic. Be more strict about masking
the busno too, just in case.
Booted on P5 LPAR.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cleanup CPU inits a bit more, Geoff Levand already did some earlier.
* Move CPU state save to cpu_setup, since cpu_setup is only ever done
on cpu 0 on 64-bit and save is never done more than once.
* Rename __restore_cpu_setup to __restore_cpu_ppc970 and add
function pointers to the cputable to use instead. Powermac always
has 970 so no need to check there.
* Rename __970_cpu_preinit to __cpu_preinit_ppc970 and check PVR before
calling it instead of in it, it's too early to use cputable.
* Rename pSeries_secondary_smp_init to generic_secondary_smp_init since
everyone but powermac and iSeries use it.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
On Tue, 2006-08-15 at 08:22 -0700, Dave Hansen wrote:
> kernel BUG in cache_free_debugcheck at mm/slab.c:2748!
Alright, this one is only triggered when slab debugging is enabled. The
slabs are assumed to be aligned on a HUGEPTE_TABLE_SIZE boundary. The free
path makes use of this assumption and uses the lowest nibble to pass around
an index into an array of kmem_cache pointers. With slab debugging turned
on, the slab is still aligned, but the "working" object pointer is not.
This would break the assumption above that a full nibble is available for
the PGF_CACHENUM_MASK.
The following patch reduces PGF_CACHENUM_MASK to cover only the two least
significant bits, which is enough to cover the current number of 4 pgtable
cache types. Then use this constant to mask out the appropriate part of
the huge pte pointer.
Signed-off-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The patch rewrites mpc7448hpc2 board irq support according to the new
mpic device tree interface.
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There are two problems in the powerpc gettimeofday code which can
cause incorrect results to be returned.
The first is that there is a race between do_gettimeofday and the
timer interrupt:
1. do_gettimeofday does get_tb()
2. decrementer exception on boot cpu which runs timer_recalc_offset,
which also samples the timebase and updates the do_gtod structure
with a greater timebase value.
3. do_gettimeofday calls __do_gettimeofday, which leads to the
negative result from tb_val - temp_varp->tb_orig_stamp.
The second is caused by taking the boot cpu offline, which can cause
the value of tb_last_jiffy to be increased past the currently
available timebase, causing the same underflow as above.
[paulus@samba.org - define and use data_barrier() instead of mb().]
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
To compile kexec on 32-bit we need a few more bits and pieces. Rather
than add empty definitions, we can make crash.c work on 32-bit, with
only a couple of kludges.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds a shadow buffer for the SLBs and regsiters it with PHYP.
Only the bolted SLB entries (top 3) are shadowed.
The SLB shadow buffer tells the hypervisor what the kernel needs to
have in the SLB for the kernel to be able to function. The hypervisor
can use this information to speed up partition context switches.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We don't have much in the way of doc comments, but some of those we do have
don't work because they start with "/***" or "/*", not "/**" which is what
kernel-doc requires.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Noticing the following might_sleep warning (dump_stack()) during kdump
testing when CONFIG_DEBUG_SPINLOCK_SLEEP is enabled. All secondary CPUs
will be calling rtas_set_indicator with interrupts disabled to remove
them from global interrupt queue.
BUG: sleeping function called from invalid context at
arch/powerpc/kernel/rtas.c:463
in_atomic():1, irqs_disabled():1
Call Trace:
[C00000000FFFB970] [C000000000010234] .show_stack+0x68/0x1b0 (unreliable)
[C00000000FFFBA10] [C000000000059354] .__might_sleep+0xd8/0xf4
[C00000000FFFBA90] [C00000000001D1BC] .rtas_busy_delay+0x20/0x5c
[C00000000FFFBB20] [C00000000001D8A8] .rtas_set_indicator+0x6c/0xcc
[C00000000FFFBBC0] [C000000000048BF4] .xics_teardown_cpu+0x118/0x134
[C00000000FFFBC40] [C00000000004539C]
.pseries_kexec_cpu_down_xics+0x74/0x8c
[C00000000FFFBCC0] [C00000000002DF08] .crash_ipi_callback+0x15c/0x188
[C00000000FFFBD50] [C0000000000296EC] .smp_message_recv+0x84/0xdc
[C00000000FFFBDC0] [C000000000048E08] .xics_ipi_dispatch+0xf0/0x130
[C00000000FFFBE50] [C00000000009EF10] .handle_IRQ_event+0x7c/0xf8
[C00000000FFFBF00] [C0000000000A0A14] .handle_percpu_irq+0x90/0x10c
[C00000000FFFBF90] [C00000000002659C] .call_handle_irq+0x1c/0x2c
[C00000000058B9C0] [C00000000000CA10] .do_IRQ+0xf4/0x1a4
[C00000000058BA50] [C0000000000044EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x14/0x1c
LR = .pseries_dedicated_idle_sleep+0x190/0x1d4
[C00000000058BD40] [C00000000058BDE0] 0xc00000000058bde0 (unreliable)
[C00000000058BDF0] [C00000000001270C] .cpu_idle+0x10c/0x1e0
[C00000000058BE70] [C000000000009274] .rest_init+0x44/0x5c
To fix this issue, rtas_set_indicator_fast() is added so that will not
wait for RTAS 'busy' delay and this new function is used for kdump (in
xics_teardown_cpu()) and for CPU hotplug ( xics_migrate_irqs_away() and
xics_setup_cpu()).
Note that the platform architecture spec says that set-indicator
on the indicator we're using here is not permitted to return the
busy or extended busy status codes.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Our pseries hcall interfaces are out of control:
plpar_hcall_norets
plpar_hcall
plpar_hcall_8arg_2ret
plpar_hcall_4out
plpar_hcall_7arg_7ret
plpar_hcall_9arg_9ret
Create 3 interfaces to cover all cases:
plpar_hcall_norets: 7 arguments no returns
plpar_hcall: 6 arguments 4 returns
plpar_hcall9: 9 arguments 9 returns
There are only 2 cases in the kernel that need plpar_hcall9, hopefully
we can keep it that way.
Pass in a buffer to stash return parameters so we avoid the &dummy1,
&dummy2 madness.
Signed-off-by: Anton Blanchard <anton@samba.org>
--
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch fixes several problems:
- The legacy backlight value might be set at interrupt time. Introduced
a worker to prevent it from directly calling the backlight code.
- via-pmu allows the backlight to be grabbed, in which case we need to
prevent other kernel code from changing the brightness.
- Don't send PMU requests in via-pmu-backlight when the machine is about
to sleep or waking up.
- More Kconfig fixes.
Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kprobe inserts breakpoint instruction in probepoint and then jumps to
instruction slot when breakpoint is hit, the instruction slot icache must
be consistent with dcache. Here is the patch which invalidates instruction
slot icache area.
Without this patch, in some machines there will be fault when executing
instruction slot where icache content is inconsistent with dcache.
Signed-off-by: bibo,mao <bibo.mao@intel.com>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Keshavamurthy Anil S <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move all the Hypervisor call definitions to to a single header file.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Previous changes have treated the return values of get_property as
const, so now we can make the actual change to get_property(). There
shouldn't be a need to cast the return values anymore.
We will now get compiler warnings when property values are assigned to
a non-const variable.
If properties need to be updated, there's still the of_find_property
function.
Built for cell_defconfig, chrp32_defconfig, g5_defconfig,
iseries_defconfig, maple_defconfig, pmac32_defconfig, ppc64_defconfig
and pseries_defconfig.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that get_property() returns a void *, there's no need to cast its
return value. Also, treat the return value as const, so we can
constify get_property later.
powermac platform & macintosh driver changes.
Built for pmac32_defconfig, g5_defconfig
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that get_property() returns a void *, there's no need to cast its
return value. Also, treat the return value as const, so we can
constify get_property later.
cell platform changes.
Built for cell_defconfig
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that get_property() returns a void *, there's no need to cast its
return value. Also, treat the return value as const, so we can
constify get_property later.
powerpc core changes.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
set_wmb should not be used in the kernel because it just confuses the
code more and has no benefit. Since it is not currently used in the
kernel this patch removes it so that new code does not include it.
All archs define set_wmb(var, value) to do { var = value; wmb(); }
while(0) except ia64 and sparc which use a mb() instead. But this is
still moot since it is not used anyway.
Hasn't been tested on any archs but x86 and x86_64 (and only compiled
tested)
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Although we pass the address of an iommu_table_cb to HvCallXm_getTceTableParms,
we don't actually need the structure definition anywhere except in the
iseries iommu code, so move the struct in there.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
This driver uses the hvc_console.c infrastructure that is used by the
pSeries virtual and RTAS consoles. This will allow us to make viocons.c
obsolete and is another step along the way to a combined kernel (as
viocons could not coexist with CONFIG_VT).
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Move ItLpNaca into platforms/iseries now that it's not used elsewhere.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
HvLpConfig_get(Primary)LpIndex are currently static inlines that return
fields from the itLpNaca, if we make them real functions we can make the
itLpNaca private to iSeries.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
No one outside platforms/iseries needs ItExtVpdPanel anymore, so move
it in there. It used to be needed by lparcfg, and so was exported, but
isn't needed anymore, so unexport it.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
The ASCII -> EBCDIC functions, e2a() and strne2a() are now only used in
dt.c, so move them in there.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
This patch fixes several problems:
- pmac_backlight_key() is called under interrupt context, and therefore
can't use mutexes or semaphores, so defer the backlight level for
later, as it's not critical (original code by Aristeu S. Rozanski F.
<aris@valeta.org>).
- Add exports for functions that might be called from modules
- Fix Kconfig depdencies on PMAC_BACKLIGHT.
- Fix locking issues on calls from inside the driver (reported by
Aristeu S. Rozanski F., too)
- Fix wrong calculation of backlight values in some of the drivers
- Replace pmac_backlight_key_up/down by inline functions
[akpm@osdl.org: fix function prototypes]
Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch>
Acked-by: Aristeu S. Rozanski F. <aris@valeta.org>
Acked-by: Rene Nussbaumer <linux-kernel@killerfox.forkbomb.ch>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch slightly reworks the new irq code to fix a small design error. I
removed the passing of the trigger to the map() calls entirely, it was not a
good idea to have one call do two different things. It also fixes a couple of
corner cases.
Mapping a linux virtual irq to a physical irq now does only that. Setting the
trigger is a different action which has a different call.
The main changes are:
- I no longer call host->ops->map() for an already mapped irq, I just return
the virtual number that was already mapped. It was called before to give an
opportunity to change the trigger, but that was causing issues as that could
happen while the interrupt was in use by a device, and because of the
trigger change, map would potentially muck around with things in a racy way.
That was causing much burden on a given's controller implementation of
map() to get it right. This is much simpler now. map() is only called on
the initial mapping of an irq, meaning that you know that this irq is _not_
being used. You can initialize the hardware if you want (though you don't
have to).
- Controllers that can handle different type of triggers (level/edge/etc...)
now implement the standard irq_chip->set_type() call as defined by the
generic code. That means that you can use the standard set_irq_type() to
configure an irq line manually if you wish or (though I don't like that
interface), pass explicit trigger flags to request_irq() as defined by the
generic kernel interfaces. Also, using those interfaces guarantees that
your controller set_type callback is called with the descriptor lock held,
thus providing locking against activity on the same interrupt (including
mask/unmask/etc...) automatically. A result is that, for example, MPIC's
own map() implementation calls irq_set_type(NONE) to configure the hardware
to the default triggers.
- To allow the above, the irq_map array entry for the new mapped interrupt
is now set before map() callback is called for the controller.
- The irq_create_of_mapping() (also used by irq_of_parse_and_map()) function
for mapping interrupts from the device-tree now also call the separate
set_irq_type(), and only does so if there is a change in the trigger type.
- While I was at it, I changed pci_read_irq_line() (which is the helper I
would expect most archs to use in their pcibios_fixup() to get the PCI
interrupt routing from the device tree) to also handle a fallback when the
DT mapping fails consisting of reading the PCI_INTERRUPT_PIN to know wether
the device has an interrupt at all, and the the PCI_INTERRUPT_LINE to get an
interrupt number from the device. That number is then mapped using the
default controller, and the trigger is set to level low. That default
behaviour works for several platforms that don't have a proper interrupt
tree like Pegasos. If it doesn't work for your platform, then either
provide a proper interrupt tree from the firmware so that fallback isn't
needed, or don't call pci_read_irq_line()
- Add back a bit that got dropped by my main rework patch for properly
clearing pending IPIs on pSeries when using a kexec
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>