ISA 3 allows for prevention of instruction fetch and execution
of user mode pages. If such an error occurs, SRR1 bit 35 reports the
error. We catch and report the error in do_page_fault().
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Setup AMOR (Authority Mask Override Register) in HV mode so that the
host and guest kernel can in turn setup IAMR.
This allows us to enable key 0 in a following patch.
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Ensure that PSSCR is set to a safe value corresponding to no
state-loss each time a POWER9 CPU comes online.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is a nice interface for asking ftrace to dump all its tracing
buffers. The only down side for use in xmon is that it uses printk.
Depending on circumstances printk may not work when in xmon, but it also
may, so add a 'dt' command which dumps the ftrace buffers, and add a
note to the help to mentiont that it uses printk.
Calling this routine also disables tracing, which is problematic if you
return from xmon and expect the system to keep operating normally. So
after we do the dump turn tracing back on.
Both functions already have nop versions defined for when ftrace is not
enabled, so we don't need any extra #ifdefs.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Use builtin_platform_driver() helper to simplify the code.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit d0563a1297 ("powerpc: Implement {cmp}xchg for u8 and u16")
we removed the volatile from __cmpxchg().
This is leading to warnings such as:
drivers/gpu/drm/drm_lock.c: In function ‘drm_lock_take’:
arch/powerpc/include/asm/cmpxchg.h:484:37: warning: passing argument 1
of ‘__cmpxchg’ discards ‘volatile’ qualifier from pointer target
(__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_, \
There doesn't seem to be consensus across architectures whether the
argument is volatile or not, so at least for now put the volatile back.
Fixes: d0563a1297 ("powerpc: Implement {cmp}xchg for u8 and u16")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Partially copied from commit df0698be14 ("ARM: stack protector:
change the canary value per task")
A new random value for the canary is stored in the task struct whenever
a new task is forked. This is meant to allow for different canary values
per task. On powerpc, GCC expects the canary value to be found in a global
variable called __stack_chk_guard. So this variable has to be updated
with the value stored in the task struct whenever a task switch occurs.
Because the variable GCC expects is global, this cannot work on SMP
unfortunately. So, on SMP, the same initial canary value is kept
throughout, making this feature a bit less effective although it is still
useful.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Partialy copied from commit c743f38013 ("ARM: initial stack protector
(-fstack-protector) support")
This is the very basic stuff without the changing canary upon
task switch yet. Just the Kconfig option and a constant canary
value initialized at boot time.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Implement xchg{u8,u16}{local,relaxed}, and
cmpxchg{u8,u16}{,local,acquire,relaxed}.
It works on all ppc.
remove volatile of first parameter in __cmpxchg_local and __cmpxchg
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There are no ibmebus driver that make use of legacy suspend/resume. This
patch removes the support for it from ibmebus framework, new ibmebus
driver (as unlikely as they are) wanting to use suspend/resume should
use dev_pm_ops.
Since there aren't any special bus specific things to do during
suspend/resume and since the PM core will automatically fallback
directly to using the device's PM ops if no bus PM ops are specified
there is no need to have any special ibmebus PM ops at all.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Invoke the kprobe handlers directly rather than through notify_die(), to
reduce path taken for handling kprobes. Similar to commit 6f6343f53d
("kprobes/x86: Call exception handlers directly from do_int3/do_debug").
While at it, rename post_kprobe_handler() to kprobe_post_handler() for
more uniform naming.
Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 03465f899b ("powerpc: Use kprobe blacklist for exception
handlers") removed __kprobes annotation from some of the prototypes,
but left the kprobes header include directive unchanged. Remove it as it
is no longer needed.
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Define and set the POWER9 HFSCR doorbell bit so that guests can use
msgsndp.
ISA 3.0 calls this MSGP, so name it accordingly in the code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ISA 3.0 defines a new PECE (Power-saving mode Exit Cause Enable) field
in the LPCR (Logical Partitioning Control Register), called
LPCR_PECE_HVEE (Hypervisor Virtualization Exit Enable).
KVM code will need to know about this bit, so add a definition for it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ISA 3.00 adds the logical PVR value 0x0f000005, so add a definition for
this.
Define PCR_ARCH_207 to reflect ISA 2.07 compatibility mode in the processor
compatibility register (PCR).
[paulus@ozlabs.org - moved dummy PCR_ARCH_300 value into next patch]
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi()
and opal_int_set_mfrr(), for use by KVM real-mode code.
It also exports opal_int_set_mfrr() so that the modular part of KVM
can use it to send IPIs.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 requires the host to set up a partition table, which is a
table in memory indexed by logical partition ID (LPID) which
contains the pointers to page tables and process tables for the
host and each guest.
This factors out the initialization of the partition table into
a single function. This code was previously duplicated between
hash_utils_64.c and pgtable-radix.c.
This provides a function for setting a partition table entry,
which is used in early MMU initialization, and will be used by
KVM whenever a guest is created. This function includes a tlbie
instruction which will flush all TLB entries for the LPID and
all caches of the partition table entry for the LPID, across the
system.
This also moves a call to memblock_set_current_limit(), which was
in radix_init_partition_table(), but has nothing to do with the
partition table. By analogy with the similar code for hash, the
call gets moved to near the end of radix__early_init_mmu(). It
now gets called when running as a guest, whereas previously it
would only be called if the kernel is running as the host.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
VSID 0 is bad address. Don't create slb entries on coproc fault for
bad address
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
eeh_pe_reset and eeh_reset_pe are two different functions in the same
file which do mostly the same thing. Not only is this confusing, but
potentially causes disrepancies in functionality, notably eeh_reset_pe
as it does not check return values for failure.
Refactor this into the following:
- eeh_pe_reset(): stays as is, performs a single operation, exported
- eeh_pe_reset_full(): new, full reset process that calls eeh_pe_reset()
- eeh_reset_pe(): removed and replaced by eeh_pe_reset_full()
- eeh_reset_pe_once(): removed
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
PHB, PE (and by association MVE) numbers are printed as a mix of decimal
and hexadecimal throughout the kernel. This can be misleading, so make
them all hexadecimal.
Standardising on hex instead of dec because:
- PHB numbers are presented in hex in sysfs/debugfs (and lspci, etc)
- PE numbers are presented as hex in sysfs and parsed in hex in debugfs
The only place I think this could cause confusing are the messages during
boot, i.e.
pci 000a:01 : [PE# 000] Secondary bus 1 associated with PE#0
which can be a quick way to check PE numbers. pe_level_printk() will
only print two characters instead of three, so the above would be
pci 000a:01 : [PE# 00] Secondary bus 1 associated with PE#0
which gives a hint it's in hex.
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Whenever a PE is initialised in powernv, opal_pci_eeh_freeze_clear() is
called. This is to remove any existing freeze, and has no negative side
effects if the PE is already in an unfrozen state. On PHB backends that
don't support this operation and return OPAL_UNSUPPORTED, this creates a
scary and misleading warning message.
Skip the warning message on init if OPAL_UNSUPPORTED is returned.
As far as I'm aware, this currently only affects NPUs.
Fixes: 313483d ("powerpc/powernv: Unfreeze PE on allocation")
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The ibm_pa_features array consists of structures that describe which bit
and byte in the ibm,pa-features property toggles one or more flags in
either the CPU, MMU, or user visible feature flags.
Each one consists of 7 values, which are all unsigned long, int or char,
meaning the compiler gives us no warning if we assign the wrong values
to the wrong elements. In fact we have had a bug here in the past, where
we were setting incorrect bits, see commit 6997e57d69 ("powerpc:
scan_features() updates incorrect bits for REAL_LE").
So switch to using named initialisers for the structure elements, to
reduce the likelihood of future bugs, and hopefully improve readability
also.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
These are the PPC optimised versions of various crypto algorithms, so we
should turn them on by default to get test coverage.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The IBMEBUS code supports the GX bus found on Power7 and earlier CPUs.
On Power8 it has been replaced, and so we have no need for it.
We don't actually have a config symbol for Power8 vs Power7 etc., but
we only support booting little endian on Power8 or later, so use that as
a reasonable approximation.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When ending an oops, don't clear die_owner unless the nest count
went to zero. This prevents a second nested oops from hanging forever
on the die_lock.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When exiting xmon with 'x' (exit and recover), oops_begin bails
out immediately, but die then calls __die() and oops_end(), which
cause a lot of bad things to happen.
If the debugger was attached then went to graceful recovery, exit
from die() immediately.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add an option to use thin archives to build the kernel.
Thin archives are explained in commit a5967db9af ("kbuild: allow
architectures to use thin archives instead of ld -r").
This is a gradual way to introduce the option to testers.
Some change to the way we invoke ar is required so it can be used
by scripts/link-vmlinux.sh.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Make it an explicit option not dependant on COMPILE_TEST]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Under some configs we need to explicitly include cpu_has_feature.h,
otherwise we fail with:
arch/powerpc/lib/sstep.c:1992:7: error: implicit declaration of function 'cpu_has_feature'
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The pasrsing of data written to the dlpar file in sysfs does not correctly
account for the possibility of reading past the end of the buffer. The code
assumes that all pieces of the command witten to the sysfs file are present
in the form "<resource> <action> <id_type> <id>".
Correct this by updating the buffer parsing code to make a local copy and
use the strsep() and sysfs_streq() routines to parse the buffer. This patch
also separates the parsing code into subroutines for each piece of the
command.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Remove the unused but set variable srr1 in save_mce_event() to
fix the following GCC warning when building with 'W=1':
arch/powerpc/kernel/mce.c:75:11: warning: variable 'srr1' set but not used
It has never been used.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Fix two [-Wold-style-declaration] GCC warnings by moving the inline
keyword before the return type.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
MIN_HUGEPTE_SHIFT hasn't been used since commit d1837cba5d
("powerpc/mm: Cleanup initialization of hugepages on powerpc")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Version 3.00 of the ISA states that the PATS (partition table size) field
of the PTCR (partition table control register) and the PRTS (process table
size) field of the partition table entry must both be less than or equal
to 24. However the actual size of the partition and process tables is equal
to 2 to the power of 12 plus the PATS and PRTS fields, respectively. This
means that the max allowable size of each of these tables is 2^36 or 64GB
for both.
Thus when checking the size shift for each we should be checking for values
of greater than 36 instead of the current check for shifts larger than 24
and 23.
Fixes: 2bfd65e45e
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Useful to be able to dump the kernel hash page table to check
which pages are hashed along with their sizes and other details.
Add a debugfs file to check the hash page table. If radix is enabled
(and so there is no hash page table) then this file doesn't exist. To
use this the PPC_PTDUMP config option must be selected.
Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
[mpe: Fix build with SPARSEMEM_VMEMMAP=n & PSERIES=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Useful to be able to dump the kernels page tables to check permissions
and memory types - derived from arm64's implementation.
Add a debugfs file to check the page tables. To use this the PPC_PTDUMP
config option must be selected.
Signed-off-by: Rashmica Gupta <rashmicy@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently there's some CMO (Cooperative Memory Overcommit) code, in
plpar_wrappers.h. Some of it is #ifdef CONFIG_PSERIES and some of it
isn't. The end result being if a file includes plpar_wrappers.h it won't
build with CONFIG_PSERIES=n.
Fix it by moving the CMO code into platforms/pseries. The two hcall
wrappers can just be moved into their only caller, cmm.c, and the
accessors can go in pseries.h.
Note we need the accessors because cmm.c can be built as a module, so
there needs to be a split between the built-in code vs the module, and
that's achieved by using those accessors.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This changes the way that we support the new ISA v3.00 HPTE format.
Instead of adapting everything that uses HPTE values to handle either
the old format or the new format, depending on which CPU we are on,
we now convert explicitly between old and new formats if necessary
in the low-level routines that actually access HPTEs in memory.
This limits the amount of code that needs to know about the new
format and makes the conversions explicit. This is OK because the
old format contains all the information that is in the new format.
This also fixes operation under a hypervisor, because the H_ENTER
hypercall (and other hypercalls that deal with HPTEs) will continue
to require the HPTE value to be supplied in the old format. At
present the kernel will not boot in HPT mode on POWER9 under a
hypervisor.
This fixes and partially reverts commit 50de596de8
("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29).
Fixes: 50de596de8 ("powerpc/mm/hash: Add support for Power9 Hash")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
IDE subsystem has been deprecated since 2009 and the majority
(if not all) of Linux distributions have switched to use
libata for ATA support exclusively. However there are still
some users (mostly old or/and embedded non-x86 systems) that
have not converted from using IDE subsystem to libata PATA
drivers. This doesn't seem to be good thing in the long-term
for Linux as while there is less and less PATA systems left
in use:
* testing efforts are divided between two subsystems
* having duplicate drivers for same hardware confuses users
This patch converts storcenter_defconfig to use libata PATA
drivers.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
IDE subsystem has been deprecated since 2009 and the majority
(if not all) of Linux distributions have switched to use
libata for ATA support exclusively. However there are still
some users (mostly old or/and embedded non-x86 systems) that
have not converted from using IDE subsystem to libata PATA
drivers. This doesn't seem to be good thing in the long-term
for Linux as while there is less and less PATA systems left
in use:
* testing efforts are divided between two subsystems
* having duplicate drivers for same hardware confuses users
This patch converts pseries_defconfig to use libata PATA
drivers.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
IDE subsystem has been deprecated since 2009 and the majority
(if not all) of Linux distributions have switched to use
libata for ATA support exclusively. However there are still
some users (mostly old or/and embedded non-x86 systems) that
have not converted from using IDE subsystem to libata PATA
drivers. This doesn't seem to be good thing in the long-term
for Linux as while there is less and less PATA systems left
in use:
* testing efforts are divided between two subsystems
* having duplicate drivers for same hardware confuses users
This patch converts ppc6xx_defconfig to use libata PATA
drivers.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
IDE subsystem has been deprecated since 2009 and the majority
(if not all) of Linux distributions have switched to use
libata for ATA support exclusively. However there are still
some users (mostly old or/and embedded non-x86 systems) that
have not converted from using IDE subsystem to libata PATA
drivers. This doesn't seem to be good thing in the long-term
for Linux as while there is less and less PATA systems left
in use:
* testing efforts are divided between two subsystems
* having duplicate drivers for same hardware confuses users
This patch converts ppc64e_defconfig to use libata PATA
drivers.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
IDE subsystem has been deprecated since 2009 and the majority
(if not all) of Linux distributions have switched to use
libata for ATA support exclusively. However there are still
some users (mostly old or/and embedded non-x86 systems) that
have not converted from using IDE subsystem to libata PATA
drivers. This doesn't seem to be good thing in the long-term
for Linux as while there is less and less PATA systems left
in use:
* testing efforts are divided between two subsystems
* having duplicate drivers for same hardware confuses users
This patch converts ppc64_defconfig to use libata PATA
drivers.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
IDE subsystem has been deprecated since 2009 and the majority
(if not all) of Linux distributions have switched to use
libata for ATA support exclusively. However there are still
some users (mostly old or/and embedded non-x86 systems) that
have not converted from using IDE subsystem to libata PATA
drivers. This doesn't seem to be good thing in the long-term
for Linux as while there is less and less PATA systems left
in use:
* testing efforts are divided between two subsystems
* having duplicate drivers for same hardware confuses users
This patch converts pmac32_defconfig to use libata PATA
drivers.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
This patch disables deprecated IDE subsystem in pasemi_defconfig
(no IDE host drivers are selected in this config so there is no valid
reason to enable IDE subsystem itself).
Cc: Olof Johansson <olof@lixom.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
IDE subsystem has been deprecated since 2009 and the majority
(if not all) of Linux distributions have switched to use
libata for ATA support exclusively. However there are still
some users (mostly old or/and embedded non-x86 systems) that
have not converted from using IDE subsystem to libata PATA
drivers. This doesn't seem to be good thing in the long-term
for Linux as while there is less and less PATA systems left
in use:
* testing efforts are divided between two subsystems
* having duplicate drivers for same hardware confuses users
This patch converts maple_defconfig to use libata PATA
drivers.
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>