This introduces a new KVM capability to control how KVM behaves
on machine check exception (MCE) in HV KVM guests.
If this capability has not been enabled, KVM redirects machine check
exceptions to guest's 0x200 vector, if the address in error belongs to
the guest. With this capability enabled, KVM will cause a guest exit
with the exit reason indicating an NMI.
The new capability is required to avoid problems if a new kernel/KVM
is used with an old QEMU, running a guest that doesn't issue
"ibm,nmi-register". As old QEMU does not understand the NMI exit
type, it treats it as a fatal error. However, the guest could have
handled the machine check error if the exception was delivered to
guest's 0x200 interrupt vector instead of NMI exit in case of old
QEMU.
[paulus@ozlabs.org - Reworded the commit message to be clearer,
enable only on HV KVM.]
Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Implement support for the hypervisor diagnose 0x26c
('Access Certain System Information').
It passes a request buffer and a subfunction code, and receives
a response buffer and a return code.
Also add the scaffolding for the 'MAC Services' subfunction.
It may be used by network devices to obtain a hypervisor-managed
MAC address.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using sparse on the arm64 tree we get many thousands of
warnings like 'constant ... is so big it is unsigned long long'
or 'shift too big (32) for type unsigned long'. This happens
because by default sparse considers the machine as 32bit and
defines the size of the types accordingly.
Fix this by passing the '-m64' flag to sparse so that
sparse can correctly define longs as being 64bit.
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Will Deacon <will.deacon@arm.com>
CC: linux-arm-kernel@lists.infradead.org
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This commit fixes a "maybe-uninitialized" build failure in
arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all
enabled. The failure is:
In file included from ./include/linux/printk.h:329:0,
from ./include/linux/kernel.h:13,
from ./include/asm-generic/bug.h:15,
from ./arch/mips/include/asm/bug.h:41,
from ./include/linux/bug.h:4,
from ./include/linux/thread_info.h:11,
from ./include/asm-generic/current.h:4,
from ./arch/mips/include/generated/asm/current.h:1,
from ./include/linux/sched.h:11,
from arch/mips/kvm/tlb.c:13:
arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’:
./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
__dynamic_pr_debug(&descriptor, pr_fmt(fmt), \
^~~~~~~~~~~~~~~~~~
arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here
int idx_user, idx_kernel;
^~~~~~~~~~
There is a similar error relating to "idx_user". Both errors were
observed with GCC 6.
As far as I can tell, it is impossible for either idx_user or idx_kernel
to be uninitialized when they are later read in the calls to kvm_debug,
but to satisfy the compiler, add zero initializers to both variables.
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Fixes: 57e3869cfa ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID")
Cc: <stable@vger.kernel.org> # 4.11+
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Enable gpio support for CP and AP on the Marvell Armada 7K/8K SoCs.
The Armada 8K has two CP110 blocks, each having two GPIO controllers.
However, in each CP110 block, one of the GPIO controller cannot be
used: in the master CP110, only the second GPIO controller can be used,
while on the slave CP110, only the first GPIO controller can be used.
On the other side, the Armada 7K has only one CP110, but both its GPIO
controllers can be used.
For this reason, the GPIO controllers are marked as "disabled" in the
armada-cp110-master.dtsi and armada-cp110-slave.dtsi files, and only
enabled in the per-SoC dtsi files.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Enable pinctrl support for CP and AP on the Armada 7K/8K SoCs.
The CP master being different between Armada 7k and Armada 8k. This
commit introduces the intermediates files armada-70x0.dtsi and
armada-80x0.dtsi.
These new files will provide different compatible strings depending of
the SoC family. They will also be the location for the pinmux
configuration at the SoC level.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
The new binding for the system controller on cp110 moved the clock
controller into a subnode. This preliminary step will allow to add gpio
and pinctrl subnodes.
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
* fix problems that could cause hangs or crashes in the host on POWER9
* fix problems that could allow guests to potentially affect or disrupt
the execution of the controlling userspace
EX_R3 is used only for a small section of the bad stack handler.
Merge it with EX_DAR.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
EX_LR is used only for a small section of the SLB miss handler.
Merge it with EX_DAR.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Rather than open-coding it 4 times.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Move __ASSEMBLY__ guards into head-64.h where they're really needed]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The SLB miss handler uses r3 for the faulting address but r12 is
mostly able to be freed up to save r3 in. It just requires SRR1
be reloaded again on error.
It would be more conventional to use r12 for SRR1 (and use r11 to
save r3), but slb_allocate_realmode clobbers r11 and not r12.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The EXCEPTION_PROLOG_1 used by SLB miss already saves CTR when the
kernel is built with CONFIG_RELOCATABLE. So it does not have to be
saved and reloaded when branching to slb_miss_realmode. It can be
restored from the PACA as usual.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The EX_DAR save area is only used in exceptional cases. With r3 no
longer clobbered by slb_allocate_realmode, saving faulting address to
EX_DAR can be deferred to those cases.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
One fewer registers clobbered by this function means the SLB miss
handler can save one fewer.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This patch factors out the allocator for signal frame optional
records into a separate function, to ensure consistency and
facilitate later expansion.
No overrun checking is currently done, because the allocation is in
user memory and anyway the kernel never tries to allocate enough
space in the signal frame yet for an overrun to occur. This
behaviour will be refined in future patches.
The approach taken in this patch to allocation of the terminator
record is not very clean: this will also be replaced in subsequent
patches.
For future extension, a comment is added in sigcontext.h
documenting the current static allocations in __reserved[]. This
will be important for determining under what circumstances
userspace may or may not see an expanded signal frame.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.
The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records. The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.
The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.
As a result of these changes, it becomes more natural to represent
locations in the signal frame by a base pointer and an offset,
since the absolute address of each location is not known during the
layout pass. To be more consistent with this logic,
parse_user_sigframe() is refactored to describe signal frame
locations in a similar way.
This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently, rt_sigreturn does very limited checking on the
sigcontext coming from userspace.
Future additions to the sigcontext data will increase the potential
for surprises. Also, it is not clear whether the sigcontext
extension records are supposed to occur in a particular order.
To allow the parsing code to be extended more easily, this patch
factors out the sigcontext parsing into a separate function, and
adds extra checks to validate the well-formedness of the sigcontext
structure.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In order to be able to increase the amount of the data currently
written to the __reserved[] array in the signal frame, it is
necessary to overwrite the locations currently occupied by the
{fp,lr} frame link record pushed at the top of the signal stack.
In order for this to work, this patch detaches the frame link
record from struct rt_sigframe and places it separately at the top
of the signal stack. This will allow subsequent patches to insert
data between it and __reserved[].
This change relies on the non-ABI status of the placement of the
frame record with respect to struct sigframe: this status is
undocumented, but the placement is not declared or described in the
user headers, and known unwinder implementations (libgcc,
libunwind, gdb) appear not to rely on it.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
To avoid issues with the /proc/kcore code getting confused about the
kernels block mappings in the VMALLOC region, enable the existing
facility that describes the [_text, _end) interval as a separate
KCORE_TEXT region, which supersedes the KCORE_VMALLOC region that
it intersects with on arm64.
Reported-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Normally, when the initrd is gone, we can't search it for microcode
blobs to apply anymore. For that we need to stash away the patch in our
own storage.
And save_microcode_in_initrd_intel() looks like the proper place to
do that from. So in order for early loading to work, invalidate the
intel_ucode_patch pointer to the patch *before* scanning the initrd one
last time.
If the scanning code finds a microcode patch, it will assign that
pointer again, this time with our own storage's address.
This way, early microcode application during resume-from-RAM works too,
even after the initrd is long gone.
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170614140626.4462-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Early during boot, the BSP finds the ramdisk's position from boot_params
but by the time the APs get to boot, the BSP has continued in the mean
time and has potentially managed to relocate that ramdisk.
And in that case, the APs need to find the ramdisk at its new position,
in *physical* memory as they're running before paging has been enabled.
Thus, get the updated physical location of the ramdisk which is in the
relocated_ramdisk variable.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170614140626.4462-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
kernel/sched/Makefile
Pick up the waitqueue related renames - it didn't get much feedback,
so it appears to be uncontroversial. Famous last words? ;-)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When hpet=force is supplied on the kernel command line and the HPET
supports the Legacy Replacement Interrupt Route option (HPET_ID_LEGSUP),
the legacy interrupts init code uses the boot CPU's mask initially by
calling smp_processor_id() assuming that it is running on the BSP.
It does run on the BSP but the code region is preemptible and the
preemption check fires.
Simply use the BSP's id directly to avoid the warning.
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20170620093154.18472-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
New bindings are used for the system controller on the ap806, which
means all clock properties must be converted. Use the new bindings in
the xor nodes.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Since the mdio nodes are disabled by default now, we should explicitly
enable these nodes at the board level when they are used. Enable the
cpm_mdio node for the 8040-mcbin.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
The dma alloc interface returns an error by return NULL, and the
mapping interfaces rely on the mapping_error method, which the dummy
ops already implement correctly.
Thus remove the DMA_ERROR_CODE define.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
ARM and x86 had duplicated versions of the dma_ops structure, the
only difference is that x86 hasn't wired up the set_dma_mask,
mmap, and get_sgtable ops yet. On x86 all of them are identical
to the generic version, so they aren't needed but harmless.
All the symbols used only for xen_swiotlb_dma_ops can now be marked
static as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Recently vDSO support for CLOCK_MONOTONIC_RAW was added in
49eea433b3 ("arm64: Add support for CLOCK_MONOTONIC_RAW in
clock_gettime() vDSO"). Noticing that the core timekeeping code
never set tkr_raw.xtime_nsec, the vDSO implementation didn't
bother exposing it via the data page and instead took the
unshifted tk->raw_time.tv_nsec value which was then immediately
shifted left in the vDSO code.
Unfortunately, by accellerating the MONOTONIC_RAW clockid, it
uncovered potential 1ns time inconsistencies caused by the
timekeeping core not handing sub-ns resolution.
Now that the core code has been fixed and is actually setting
tkr_raw.xtime_nsec, we need to take that into account in the
vDSO by adding it to the shifted raw_time value, in order to
fix the user-visible inconsistency. Rather than do that at each
use (and expand the data page in the process), instead perform
the shift/addition operation when populating the data page and
remove the shift from the vDSO code entirely.
[jstultz: minor whitespace tweak, tried to improve commit
message to make it more clear this fixes a regression]
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Acked-by: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
commit 94647a3012 ("ARM: dts: omap3-overo: Enable WiFi/BT combo")
while enabling WiFi/BT combo added regulator to trigger the nReset
signal of the Bluetooth module in vqmmc-supply. However BT should be
handled by UART. Moreover "vqmmc" is not a defined binding for
omap_hsmmc. While "vqmmc" in mmc2 hasn't caused any issues so far,
mmc2 will start to mis-behave once omap_hsmmc defines "vqmmc"
binding.
Remove "vqmmc-supply" property in mmc2 here.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
On a POWER9 system, it is possible for an interrupt to become pending
for a VCPU when that VCPU is about to cede (execute a H_CEDE hypercall)
and has already disabled interrupts, or in the H_CEDE processing up
to the point where the XIVE context is pulled from the hardware. In
such a case, the H_CEDE should not sleep, but should return immediately
to the guest. However, the conditions tested in kvmppc_vcpu_woken()
don't include the condition that a XIVE interrupt is pending, so the
VCPU could sleep until the next decrementer interrupt.
To fix this, we add a new xive_interrupt_pending() helper which looks
in the XIVE context that was pulled from the hardware to see if the
priority of any pending interrupt is higher (numerically lower than)
the CPU priority. If so then kvmppc_vcpu_woken() will return true.
If the XIVE context has never been used, then both the pipr and the
cppr fields will be zero and the test will indicate that no interrupt
is pending.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The external request lines are used by tusb6010 on OMAP24xx platforms.
Update the map so the driver can use dmaengine API to request the DMA
channel. At the same time add temporary map containing only the external
DMA request numbers for DT booted case on omap24xx since the tusb6010 stack
is not yet supports DT boot.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Missing symbol type for few functions prevents genksyms from generating
symbol versions for those functions. This patch fixes them.
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following regex in Makefile.build matches only one ___EXPORT_SYMBOL per line.
sed
's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/'
ATOMIC_OPS macro in atomic_64.S expands multiple symbols in same line hence
version generation is done only for the last matched symbol. This patch adds
new line between the symbol expansions.
Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>