Most distros use it so we may as well enable it and get regular compile
testing.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When issuing a system reset we almost always oops in the oops_to_nvram
code because multiple CPUs are using the deflate work area. Add a
spinlock to protect it.
To play it safe I'm using trylock to avoid locking up if the NVRAM
code oopses. This means we might miss multiple CPUs oopsing at exactly
the same time but I think it's best to play it safe for now. Once we
are happy with the reliability we can change it to a full spinlock.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On most 68k Macs the SCC IRQ is an autovector interrupt and cannot be
masked. This can be a problem when pmac_zilog starts up.
For example, the serial debugging code in arch/m68k/kernel/head.S may be
used beforehand. It disables the SCC interrupts at the chip but doesn't
ack them. Then when a pmac_zilog port is used, the machine locks up with
"unexpected interrupt".
This can happen in pmz_shutdown() since the irq is freed before the
channel interrupts are disabled.
Fix this by clearing interrupt enable bits before the handler is
uninstalled. Also move the interrupt control bit flipping into a separate
pmz_interrupt_control() routine. Replace all instances of these operations
with calls to this routine. Omit the zssync() calls that seem to serve no
purpose.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This fixes a problem where a CPU thread coming out of nap mode can
think it has valid values in the nonvolatile GPRs (r14 - r31) as saved
away in power7_idle, but in fact the values have been trashed because
the thread was used for KVM in the mean time. The result is that the
thread crashes because code that called power7_idle (e.g.,
pnv_smp_cpu_kill_self()) goes to use values in registers that have
been trashed.
The bit field in SRR1 that tells whether state was lost only reflects
the most recent nap, which may not have been the nap instruction in
power7_idle. So we need an extra PACA field to indicate that state
has been lost even if SRR1 indicates that the most recent nap didn't
lose state. We clear this field when saving the state in power7_idle,
we set it to a non-zero value when we use the thread for KVM, and we
test it in power7_wakeup_noloss.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
At present, on the powernv platform, if you off-line a CPU that was
online, and then try to on-line it again, the kernel generates a
warning message "OPAL Error -1 starting CPU n". Furthermore, if the
CPU is a secondary thread that was used by KVM while it was off-line,
the CPU fails to come online.
The first problem is fixed by only calling OPAL to start the CPU the
first time it is on-lined, as indicated by the cpu_start field of its
PACA being zero. The second problem is fixed by restoring the
cpu_start field to 1 instead of 0 when using the CPU within KVM.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The minimum RMO size field in ibm,client-architecture is currently
ignored, but a future firmware version will rectify that. Since we
always get at least 128MB of RMO right now, asking for 64MB is
likely to result in boot failures.
We should bump it to at least 128MB, but considering all the boot
issues we have on 128MB RMO boxes and all new machines have virtual
RMO, we may as well set our minimum to 256MB.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With CONFIG_STRICT_DEVMEM=y, user space cannot read any part of /dev/mem.
Since this breaks librtas, punch a hole in /dev/mem to allow access to the
rmo_buffer that librtas needs.
Anton Blanchard reported the problem and helped with the fix.
A quick test for this patch:
# cat /proc/rtas/rmo_buffer
000000000f190000 10000
# python -c "print 0x000000000f190000 / 0x10000"
3865
# dd if=/dev/mem of=/tmp/foo count=1 bs=64k skip=3865
1+0 records in
1+0 records out
65536 bytes (66 kB) copied, 0.000205235 s, 319 MB/s
# dd if=/dev/mem of=/tmp/foo
dd: reading `/dev/mem': Operation not permitted
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00022519 s, 0.0 kB/s
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
So I've had one of these for a while and it looks like the vendor never
bothered submitting the support upstream.
This adds it using ppc40x_simple and provides a device-tree.
There are some changes to the boot wrapper because the way u-boot works
on this thing, it seems to expect a multipart image with the kernel,
initrd and dtb in it.
The USB support is missing as it needs the yet unmerged driver for
the DWC OTG part and the GPIOs may need further definition in the dts.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Refresh ps3_defconfig to latest kernel sources and
change the options:
CONFIG_PPP=m to CONFIG_PPP=n.
CONFIG_NAMESPACES=y to CONFIG_NAMESPACES=n
CONFIG_NUMA=y to CONFIG_NUMA=n
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add an __init annotation to the ps3_smp_probe() routine.
Fixes build warnings like these when
CONFIG_DEBUG_SECTION_MISMATCH=y:
WARNING: Section mismatch in reference from the function .ps3_smp_probe()
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix some PS3 repository.c build warnings when DEBUG is
defined. Also change most pr_debug calls to pr_devel calls.
Fixes warnings like these:
format '%lx' expects type 'long unsigned int', but argument 7 has type 'u64'
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1 hcall #91 should be named lv1_read_repository_node, and
not lv1_get_repository_node_value. Adjust the lv1 hcall table
and all calls.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1_get_version_info hcall takes 2, not 1 output
arguments. Adjust the lv1 hcall table and all calls.
Usage:
int lv1_get_version_info(u64 *version_number, u64 *vendor_id)
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1_get_virtual_address_space_id_of_ppe hcall takes 0, not 1 input
arguments. Adjust the lv1 hcall table and all calls.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The lv1_net_stop_tx_dma and net_stop_rx_dma hcalls take 2, not 3 input
arguments. Adjust the lv1 hcall table and all calls.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
General code cleanup for PS3 interrupt.c:
o Fill out comments for structure members.
o Move variables ipi_debug_brk_mask and lock from struct ps3_bmp to
struct ps3_private.
o Fix pr_debug build errors when DEBUG is defined.
o Convert bit operation to set_bit().
o Convert DBG macro from pr_debug to pr_devel
o Add new macro FAIL to replace pr_debug calls
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We might enter the secondary CPU capture code twice, eg if we have to
unstick some CPUs with a system reset. In this case we don't want to
overwrite the state on CPUs that had made it into the capture code OK,
so use the cpus_state_saved cpumask for that and make it local to
crash_ipi_callback.
For controlling progress now use atomic_t cpus_in_crash to count how
many CPUs have made it into the kdump code, and time_to_dump to tell
everyone it's time to dump.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If we enter the kdump code via system reset, wait a bit before
sending the IPI to capture all secondary CPUs. Without it we race
with the hypervisor that is issuing the system reset to each CPU.
If the IPI gets there first the system reset oops output then shows
the register state of the IPI handler which is not what we want.
I took the opportunity to add defines for all the various delays
we have. There's no need for cpu_relax when we are doing an mdelay,
so remove them too.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I have an intermittent kdump fail where the hypervisor fails an H_EOI.
As a result our CPPR is never reset to 0xff and we no longer accept
interrupts.
This patch calls icp_hv_set_cppr to reset the CPPR if H_EOI fails,
fixing the kdump fail.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We've had a 180 second panic timeout on ppc64 for as long as I
can remember. This patch reduces it to 10 seconds on pseries for a few
reasons:
- Almost all pseries machines have a hypervisor console so panic
output will be available in a scrollback buffer.
- The 180 seconds impacts our availability, users (other than
kernel hackers) just want the box to come back around so it
can continue its work.
- I spend a lot of my life staring at the 180 second panic timeout.
Many pseries machines take minutes to power cycle, so it's quicker
to sit through the 180 seconds than it is to power cycle.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Our die() code was based off a very old x86 version. Update it to
mirror the current x86 code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Remove some unnecessary defines and fix some spelling mistakes.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We can handle recursion caused by system reset by reusing the crash
shutdown fault handler.
Since we don't have an OS triggerable NMI, if all CPUs don't make it
into kdump then we tell the user to issue a system reset. However if
we have a panic timeout set we cannot wait forever and must continue
the kdump.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have a lot of complicated logic that handles possible recursion between
kdump and a system reset exception. We can solve this in a much simpler
way using the same setjmp/longjmp tricks xmon does.
As a first step, this patch removes the old system reset code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I've been seeing truncated output when people send system reset info
to me. We should see a backtrace for every CPU, but the panic() code
takes the box down before they all make it out to the console. The
panic code runs unlocked so we also see corrupted console output.
If we are going to panic, then delay 1 second before calling into the
panic code. Move oops_exit inside the die lock and put a newline
between oopses for clarity.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch makes pseries_idle_driver not to be registered when
power_save=off kernel boot option is specified. The
cpuidle_disable variable used here is similar to
its usage on x86. If cpuidle_disable is set then
sysfs entries for cpuidle framework are not created
and the required drivers are not loaded.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch enables cpuidle for pSeries and pSeries_idle is
directly called from the idle loop. As a result of pSeries_idle, cpuidle
driver registered with cpuidle subsystem comes into action. On
failure of loading of the driver or cpuidle framework default idle
is executed as part of the function. This patch
also removes the routines pseries_shared_idle_sleep and
pseries_dedicated_idle_sleep as they are now implemented as part of
pseries_idle cpuidle driver.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch implements a back-end cpuidle driver for pSeries
based on pseries_dedicated_idle_loop and pseries_shared_idle_loop
routines. The driver is built only if CONFIG_CPU_IDLE is set. This
cpuidle driver uses global registration of idle states and
not per-cpu.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch provides cpu_idle_wait() routine for the powerpc
platform which is required by the cpuidle subsystem. This
routine is required to change the idle handler on SMP systems.
The equivalent routine for x86 is in arch/x86/kernel/process.c
but the powerpc implementation is different.
cpuidle_disable variable is to enable/disable cpuidle
framework if power_save option is set during the boot
time.
Signed-off-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Signed-off-by: Trinabh Gupta <g.trinabh@gmail.com>
Signed-off-by: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
It's only used inside the same file where it's defined. There's
also no point exporting it anymore.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Open Firmware on OPAL machines seems to have issues if we close
stdin and/or we try to print things after calling "quiesce" so
we avoid doing both.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Define HUGETLB_NEED_PRELOAD in mmu-book3e.h for CONFIG_PPC64 instead
of having a much more complicated #if block. This is easier to read
and maintain.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This avoids an extra find_vma() and is less error-prone.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Allow hugetlb to be enabled on 64b FSL_BOOK3E. No platforms enable
it by default yet.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For 64-bit FSL_BOOKE implementations, gigantic pages need to be
reserved at boot time by the memblock code based on the command line.
This adds the call that handles the reservation, and fixes some code
comments.
It also removes the previous pr_err when reserve_hugetlb_gpages
is called on a system without hugetlb enabled - the way the code is
structured, the call is unconditional and the resulting error message
spurious and confusing.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Before hugetlb, at each level of the table, we test for
!0 to determine if we have a valid table entry. With hugetlb, this
compare becomes:
< 0 is a normal entry
0 is an invalid entry
> 0 is huge
This works because the hugepage code pulls the top bit off the entry
(which for non-huge entries always has the top bit set) as an
indicator that we have a hugepage.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I happened to comment this code while I was digging through it;
we might as well commit that. I also made some whitespace
changes - the existing code had a lot of unnecessary newlines
that I found annoying when I was working on my tiny laptop.
No functional changes.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The original 32-bit hugetlb implementation used PPC64 vs PPC32 to
determine which code path to take. However, the final hugetlb
implementation for 64-bit FSL ended up shared with the FSL
32-bit code so the actual check needs to be FSL_BOOK3E vs
everything else. This patch changes the include protections to
reflect this.
There are also a couple of related comment fixes.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This updates the hugetlb page table code to handle 64-bit FSL_BOOKE.
The previous 32-bit work counted on the inner levels of the page table
collapsing.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch does 2 things: It corrects the code that determines the
size to write into MAS1 for the PPC_MM_SLICES case (this originally
came from David Gibson and I had incorrectly altered it), and it
changes the methodolody used to calculate the size for !PPC_MM_SLICES
to work for 64-bit as well as 32-bit.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There was an unconditional return of "1" in the original code
from David Gibson, and I dropped it because it wasn't needed
for FSL BOOKE 32-bit. However, not all systems (including 64-bit
FSL BOOKE) do loading of the hpte from the fault handler asm
and depend on this function returning 1, which causes a call
to update_mmu_cache() that writes an entry into the tlb.
Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>