Each source is associated with an Event State Buffer (ESB) with an
even/odd pair of pages which provides commands to manage the source:
to trigger, to EOI, or to turn off the source, for instance.
The custom VM fault handler will deduce the guest IRQ number from the
offset of the fault, and the ESB page of the associated XIVE interrupt
will be inserted into the VMA using the internal structure caching
information on the interrupts.
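As a rough illustration of how a fault offset can be turned into an interrupt number, the sketch below assumes that the ESB mapping is laid out as consecutive even/odd page pairs, one pair per source. The helper names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Assumption: each source owns two consecutive ESB pages (even, then odd). */
#define ESB_PAGES_PER_SOURCE 2u

/* Hypothetical helper: page offset inside the ESB mapping -> guest IRQ number. */
static inline uint64_t esb_pgoff_to_irq(uint64_t pgoff)
{
    return pgoff / ESB_PAGES_PER_SOURCE;
}

/* Hypothetical helper: returns 1 when the fault hit the odd page of the pair. */
static inline int esb_pgoff_is_odd_page(uint64_t pgoff)
{
    return (int)(pgoff % ESB_PAGES_PER_SOURCE);
}
```

With this layout, page offsets 6 and 7 both belong to guest IRQ 3, on its even and odd page respectively.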
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Each thread has an associated Thread Interrupt Management context
composed of a set of registers. These registers let the thread handle
priority management and interrupt acknowledgment. The most important
are:
- Interrupt Pending Buffer (IPB)
- Current Processor Priority (CPPR)
- Notification Source Register (NSR)
They are exposed to software in four different pages each proposing a
view with a different privilege. The first page is for the physical
thread context and the second for the hypervisor. Only the third
(operating system) and the fourth (user level) are exposed to the guest.
A custom VM fault handler will populate the VMA with the appropriate
pages, which should only be the OS page for now.
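The page ordering described above can be written down as a small enum, with a policy check that only lets the OS view through. This is a minimal sketch; the enum and function names are hypothetical, and the real fault handler may express the check differently.

```c
#include <stdbool.h>

/* The four TIMA views, in page order, as described in the text above. */
enum tima_view {
    TIMA_VIEW_PHYS = 0, /* physical thread context */
    TIMA_VIEW_HV   = 1, /* hypervisor */
    TIMA_VIEW_OS   = 2, /* operating system */
    TIMA_VIEW_USER = 3, /* user level */
};

/* Hypothetical policy check: for now only the OS page may be populated. */
static bool tima_view_may_be_mapped(unsigned long pgoff)
{
    return pgoff == TIMA_VIEW_OS;
}
```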
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The state of the thread interrupt management registers needs to be
collected for migration. These registers are cached under the
'xive_saved_state.w01' field of the VCPU when the VCPU context is
pulled from the HW thread. An OPAL call retrieves the backup of the
IPB register in the underlying XIVE NVT structure and merges it in the
KVM state.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
When migration of a VM is initiated, a first copy of the RAM is
transferred to the destination before the VM is stopped, but there is
no guarantee that the EQ pages in which the event notifications are
queued have not been modified.
To make sure migration will capture a consistent memory state, the
XIVE device should perform a XIVE quiesce sequence to stop the flow of
event notifications and stabilize the EQs. This is the purpose of the
KVM_DEV_XIVE_EQ_SYNC control which will also mark the EQ pages dirty
to force their transfer.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This control will be used by the H_INT_SYNC hcall from QEMU to flush
event notifications on the XIVE IC owning the source.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This control is to be used by the H_INT_RESET hcall from QEMU. Its
purpose is to clear all configuration of the sources and EQs. This is
necessary in case of a kexec (for a kdump kernel for instance) to make
sure that no remaining configuration is left from the previous boot
setup so that the new kernel can start safely from a clean state.
Queue 7 is ignored when the XIVE device is configured to run in
single escalation mode, because priority 7 is used by escalations.
The XIVE VP is kept enabled as the vCPU is still active and connected
to the XIVE device.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
These controls will be used by the H_INT_SET_QUEUE_CONFIG and
H_INT_GET_QUEUE_CONFIG hcalls from QEMU to configure the underlying
Event Queue in the XIVE IC. They will also be used to restore the
configuration of the XIVE EQs and to capture the internal run-time
state of the EQs. Both 'get' and 'set' rely on an OPAL call to access
the EQ toggle bit and EQ index which are updated by the XIVE IC when
event notifications are enqueued in the EQ.
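To see why both the toggle bit and the index belong to the run-time state of an EQ, here is a simplified consumer loop. It is only a sketch: it assumes 32-bit entries whose top bit is a generation bit, a power-of-two queue size, and host-endian entries. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical, simplified view of an event queue as seen by the consumer. */
struct eq {
    uint32_t *page;   /* queue entries */
    uint32_t mask;    /* number of entries - 1 (power of two assumed) */
    uint32_t idx;     /* next entry to read */
    uint32_t toggle;  /* generation value that marks an entry as stale */
};

/* Returns the next event number, or 0 when the queue is empty. */
static uint32_t eq_read(struct eq *q)
{
    uint32_t cur = q->page[q->idx];

    /* An entry whose generation bit equals the toggle has not been rewritten. */
    if ((cur >> 31) == q->toggle)
        return 0;

    q->idx = (q->idx + 1) & q->mask;
    if (q->idx == 0)
        q->toggle ^= 1; /* wrapped around: flip the expected generation */

    return cur & 0x7fffffffu;
}
```

If only the index were restored after migration, the consumer could mistake stale entries for new ones (or the reverse) after the first wrap, which is why the pair has to be captured and restored together.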
The value of the guest physical address of the event queue is saved in
the XIVE internal xive_q structure for later use, namely when
migration needs to mark the EQ pages dirty to capture a consistent
memory state of the VM.
Note that H_INT_SET_QUEUE_CONFIG does not require the extra
OPAL call setting the EQ toggle bit and EQ index to configure the EQ,
but restoring the EQ state will.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This control will be used by the H_INT_SET_SOURCE_CONFIG hcall from
QEMU to configure the target of a source and also to restore the
configuration of a source when migrating the VM.
The XIVE source interrupt structure is extended with the value of the
Effective Interrupt Source Number. The EISN is the interrupt number
pushed in the event queue that the guest OS will use to dispatch
events internally. Caching the EISN value in KVM eases the test when
checking if a reconfiguration is indeed needed.
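The benefit of caching the EISN can be sketched as a plain comparison against the last applied configuration. The structure and helpers below are hypothetical and only illustrate the idea; the fields kept by KVM may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache of the last configuration applied to a source. */
struct src_state {
    uint32_t server;   /* target of the source */
    uint8_t  priority;
    uint32_t eisn;     /* number pushed in the event queue */
};

/* True when the requested configuration differs from the cached one. */
static bool src_needs_reconfig(const struct src_state *cached, uint32_t server,
                               uint8_t priority, uint32_t eisn)
{
    return cached->server != server ||
           cached->priority != priority ||
           cached->eisn != eisn;
}

/* Updates the cache and reports whether the hardware would need touching. */
static bool src_set_config(struct src_state *cached, uint32_t server,
                           uint8_t priority, uint32_t eisn)
{
    if (!src_needs_reconfig(cached, server, priority, eisn))
        return false;
    cached->server = server;
    cached->priority = priority;
    cached->eisn = eisn;
    return true;
}
```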
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The XIVE KVM device maintains a list of interrupt sources for the VM
which are allocated in the pool of generic interrupts (IPIs) of the
main XIVE IC controller. These are used for the CPU IPIs as well as
for virtual device interrupts. The IRQ number space is defined by
QEMU.
The XIVE device reuses the source structures of the XICS-on-XIVE
device for the source blocks (2-level tree) and for the source
interrupts. Under XIVE native, the source interrupt caches mostly
configuration information and is less used than under the XICS-on-XIVE
device in which hcalls are still necessary at run-time.
When a source is initialized in KVM, an IPI interrupt source is simply
allocated at the OPAL level and then MASKED. KVM only needs to know
about its type: LSI or MSI.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The user interface exposes a new capability KVM_CAP_PPC_IRQ_XIVE to
let QEMU connect the vCPU presenters to the XIVE KVM device if
required. The capability is not advertised for now as the full support
for the XIVE native exploitation mode is not yet available. When this
is the case, the capability will be advertised on PowerNV Hypervisors
only. Nested guests (pseries KVM Hypervisor) are not supported.
Internally, the interface to the new KVM device is protected with a
new interrupt mode: KVMPPC_IRQ_XIVE.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This is the basic framework for the new KVM device supporting the XIVE
native exploitation mode. The user interface exposes a new KVM device
to be created by QEMU, only available when running on an L0 hypervisor.
Support for nested guests is not available yet.
The XIVE device reuses the device structure of the XICS-on-XIVE device
as they have a lot in common. That could possibly change in the future
if the need arises.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This merges in the ppc-kvm topic branch from the powerpc tree to get
patches which touch both general powerpc code and KVM code, one of
which is a prerequisite for following patches.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
On POWER9 and later processors where the host can schedule vcpus on a
per thread basis, there is a streamlined entry path used when the guest
is radix. This entry path saves/restores the fp and vr state in
kvmhv_p9_guest_entry() by calling store_[fp/vr]_state() and
load_[fp/vr]_state(). This is the same as the old entry path; however,
the old entry path also saved/restored the VRSAVE register, which isn't
done in the new entry path.
This means that the vrsave register is now volatile across guest exit,
which is an incorrect change in behaviour.
Fix this by saving/restoring the vrsave register in kvmhv_p9_guest_entry().
This restores the old, correct, behaviour.
Fixes: 95a6432ce9 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests")
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
When running on POWER9 with kvm_hv.indep_threads_mode = N and the host
in SMT1 mode, KVM will run guest VCPUs on offline secondary threads.
If those guests are in radix mode, we fail to load the LPID and flush
the TLB if necessary, leading to the guest crashing with an
unsupported MMU fault. This arises from commit 9a4506e11b ("KVM:
PPC: Book3S HV: Make radix handle process scoped LPID flush in C,
with relocation on", 2018-05-17), which didn't consider the case
where indep_threads_mode = N.
For simplicity, this makes the real-mode guest entry path flush the
TLB in the same place for both radix and hash guests, as we did before
9a4506e11b, though the code is now C code rather than assembly code.
We also have the radix TLB flush open-coded rather than calling
radix__local_flush_tlb_lpid_guest(), because the TLB flush can be
called in real mode, and in real mode we don't want to invoke the
tracepoint code.
Fixes: 9a4506e11b ("KVM: PPC: Book3S HV: Make radix handle process scoped LPID flush in C, with relocation on")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This replaces assembler code in book3s_hv_rmhandlers.S that checks
the kvm->arch.need_tlb_flush cpumask and optionally does a TLB flush
with C code in book3s_hv_builtin.c. Note that unlike the radix
version, the hash version doesn't do an explicit ERAT invalidation
because we will invalidate and load up the SLB before entering the
guest, and that will invalidate the ERAT.
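The check being moved to C amounts to a test-and-clear on a per-CPU bit. The following sketch models the cpumask as a single 64-bit word and replaces the real TLB invalidation with a counter, so every name in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the real TLB invalidation: just counts the flushes. */
static unsigned int flush_count;

static void fake_flush_guest_tlb(void)
{
    flush_count++;
}

/*
 * If the bit for this physical CPU is set in the mask, flush and clear it.
 * Assumption: at most 64 CPUs, so the mask fits in one word.
 */
static bool check_need_tlb_flush(uint64_t *need_tlb_flush, unsigned int pcpu)
{
    uint64_t bit;

    assert(pcpu < 64);
    bit = UINT64_C(1) << pcpu;
    if (!(*need_tlb_flush & bit))
        return false;

    fake_flush_guest_tlb();
    *need_tlb_flush &= ~bit; /* clear only after the flush is done */
    return true;
}
```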
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The code in book3s_hv_rmhandlers.S that pushes the XIVE virtual CPU
context to the hardware currently assumes it is being called in real
mode, which is usually true. There is however a path by which it can
be executed in virtual mode, in the case where indep_threads_mode = N.
A virtual CPU executing on an offline secondary thread can take a
hypervisor interrupt in virtual mode and return from the
kvmppc_hv_entry() call after the kvm_secondary_got_guest label.
It is possible for it to be given another vcpu to execute before it
gets to execute the stop instruction. In that case it will call
kvmppc_hv_entry() for the second VCPU in virtual mode, and the XIVE
vCPU push code will be executed in virtual mode. The result in that
case will be a host crash due to an unexpected data storage interrupt
caused by executing the stdcix instruction in virtual mode.
This fixes it by adding a code path for virtual mode, which uses the
virtual TIMA pointer and normal load/store instructions.
[paulus@ozlabs.org - wrote patch description]
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This fixes a bug in the XICS emulation on POWER9 machines which is
triggered by the guest doing a H_IPI with priority = 0 (the highest
priority). What happens is that the notification interrupt arrives
at the destination at priority zero. The loop in scan_interrupts()
sees that a priority 0 interrupt is pending, but because xc->mfrr is
zero, we break out of the loop before taking the notification
interrupt out of the queue and EOI-ing it. (This doesn't happen
when xc->mfrr != 0; in that case we process the priority-0 notification
interrupt on the first iteration of the loop, and then break out of
a subsequent iteration of the loop with hirq == XICS_IPI.)
To fix this, we move the prio >= xc->mfrr check down to near the end
of the loop. However, there are then some other things that need to
be adjusted. Since we are potentially handling the notification
interrupt and also delivering an IPI to the guest in the same loop
iteration, we need to update pending and handle any q->pending_count
value before the xc->mfrr check, rather than at the end of the loop.
Also, we need to update the queue pointers when we have processed and
EOI-ed the notification interrupt, since we may not do it later.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
I made the same typo when trying to grep for uses of smp_wmb and figured
I might as well fix it.
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
We already allocate hardware TCE tables in multiple levels and skip
intermediate levels when we can; now it is the turn of the KVM TCE tables.
Thankfully these are allocated already in 2 levels.
This moves the table's last level allocation from the creating helper to
kvmppc_tce_put() and kvm_spapr_tce_fault(). Since such allocation cannot
be done in real mode, this creates a virtual mode version of
kvmppc_tce_put() which handles allocations.
This adds kvmppc_rm_ioba_validate() to additionally test whether
the subsequent kvmppc_tce_put() needs a page which has not been allocated;
if this is the case, we bail out to virtual mode handlers.
The allocations are protected by a new mutex as kvm->lock is not suitable
for the task because the fault handler is called with the mmap_sem held
but kvmhv_setup_mmu() locks kvm->lock and mmap_sem in the reverse order.
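The split between a real-mode check that cannot allocate and a virtual-mode put that allocates on demand can be sketched in user space as follows. This is an illustration only: the names and return codes are hypothetical, calloc() stands in for page allocation, the mutex is omitted, and skipping the allocation when a zero entry is written is an assumption of this sketch.

```c
#include <stdint.h>
#include <stdlib.h>

#define TCES_PER_PAGE 512u /* assumption: 4K page holding 8-byte entries */

enum { TCE_OK = 0, TCE_TOO_HARD = 1, TCE_PARAM = 2 }; /* hypothetical codes */

/* Hypothetical two-level table: an array of lazily allocated last-level pages. */
struct tce_table {
    uint64_t **pages;
    unsigned long npages;
};

/* Real-mode style check: never allocates, asks the caller to fall back. */
static int tce_rm_validate(const struct tce_table *t, unsigned long idx)
{
    unsigned long pg = idx / TCES_PER_PAGE;

    if (pg >= t->npages)
        return TCE_PARAM;
    return t->pages[pg] ? TCE_OK : TCE_TOO_HARD;
}

/* Virtual-mode style put: allocates the last-level page when first needed. */
static int tce_put(struct tce_table *t, unsigned long idx, uint64_t tce)
{
    unsigned long pg = idx / TCES_PER_PAGE;

    if (pg >= t->npages)
        return TCE_PARAM;
    if (!t->pages[pg]) {
        if (!tce)
            return TCE_OK; /* clearing an entry that was never allocated */
        t->pages[pg] = calloc(TCES_PER_PAGE, sizeof(uint64_t));
        if (!t->pages[pg])
            return TCE_TOO_HARD;
    }
    t->pages[pg][idx % TCES_PER_PAGE] = tce;
    return TCE_OK;
}
```

A caller in the fast path would run tce_rm_validate() first and only continue when it returns TCE_OK; any other result sends the request to the slower path that is allowed to allocate.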
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The kvmppc_tce_to_ua() helper is called from real and virtual modes
and it works fine as long as CONFIG_DEBUG_LOCKDEP is not enabled.
However, if lockdep debugging is on, lockdep will most likely break
in kvm_memslots() because of srcu_dereference_check(), so we need to use
the PPC-specific kvm_memslots_raw(), which uses the realmode-safe
rcu_dereference_raw_notrace().
This creates a realmode copy of kvmppc_tce_to_ua() which replaces
kvm_memslots() with kvm_memslots_raw().
Since kvmppc_rm_tce_to_ua() becomes static and can only be used inside
HV KVM, this moves it earlier under CONFIG_KVM_BOOK3S_HV_POSSIBLE.
This moves truly virtual-mode kvmppc_tce_to_ua() to where it belongs and
drops the prmap parameter which was never used in the virtual mode.
Fixes: d3695aa4f4 ("KVM: PPC: Add support for multiple-TCE hcalls", 2016-02-15)
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
trace_hardirqs_on() sets current->hardirqs_enabled, and from then on
lockdep assumes interrupts are enabled although they remain
disabled until the context switches to the guest. A subsequent
srcu_read_lock() checks the flags in rcu_lock_acquire(), observes
disabled interrupts and prints a warning (see below).
This moves trace_hardirqs_on/off closer to __kvmppc_vcore_entry to
prevent lockdep from being confused.
DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
WARNING: CPU: 16 PID: 8038 at kernel/locking/lockdep.c:4128 check_flags.part.25+0x224/0x280
[...]
NIP [c000000000185b84] check_flags.part.25+0x224/0x280
LR [c000000000185b80] check_flags.part.25+0x220/0x280
Call Trace:
[c000003fec253710] [c000000000185b80] check_flags.part.25+0x220/0x280 (unreliable)
[c000003fec253780] [c000000000187ea4] lock_acquire+0x94/0x260
[c000003fec253840] [c00800001a1e9768] kvmppc_run_core+0xa60/0x1ab0 [kvm_hv]
[c000003fec253a10] [c00800001a1ed944] kvmppc_vcpu_run_hv+0x73c/0xec0 [kvm_hv]
[c000003fec253ae0] [c00800001a1095dc] kvmppc_vcpu_run+0x34/0x48 [kvm]
[c000003fec253b00] [c00800001a1056bc] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
[c000003fec253b90] [c00800001a0f3618] kvm_vcpu_ioctl+0x460/0x850 [kvm]
[c000003fec253d00] [c00000000041c4f4] do_vfs_ioctl+0xe4/0x930
[c000003fec253db0] [c00000000041ce04] ksys_ioctl+0xc4/0x110
[c000003fec253e00] [c00000000041ce78] sys_ioctl+0x28/0x80
[c000003fec253e20] [c00000000000b5a4] system_call+0x5c/0x70
Instruction dump:
419e0034 3d220004 39291730 81290000 2f890000 409e0020 3c82ffc6 3c62ffc5
3884be70 386329c0 4bf6ea71 60000000 <0fe00000> 3c62ffc6 3863be90 4801273d
irq event stamp: 1025
hardirqs last enabled at (1025): [<c00800001a1e9728>] kvmppc_run_core+0xa20/0x1ab0 [kvm_hv]
hardirqs last disabled at (1024): [<c00800001a1e9358>] kvmppc_run_core+0x650/0x1ab0 [kvm_hv]
softirqs last enabled at (0): [<c0000000000f1210>] copy_process.isra.4.part.5+0x5f0/0x1d00
softirqs last disabled at (0): [<0000000000000000>] (null)
---[ end trace 31180adcc848993e ]---
possible reason: unannotated irqs-off.
irq event stamp: 1025
hardirqs last enabled at (1025): [<c00800001a1e9728>] kvmppc_run_core+0xa20/0x1ab0 [kvm_hv]
hardirqs last disabled at (1024): [<c00800001a1e9358>] kvmppc_run_core+0x650/0x1ab0 [kvm_hv]
softirqs last enabled at (0): [<c0000000000f1210>] copy_process.isra.4.part.5+0x5f0/0x1d00
softirqs last disabled at (0): [<0000000000000000>] (null)
Fixes: 8b24e69fc4 ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry", 2017-06-26)
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Implement a real mode handler for the H_CALL H_PAGE_INIT which can be
used to zero or copy a guest page. The page is defined to be 4k and must
be 4k aligned.
The in-kernel real mode handler halves the time to handle this H_CALL
compared to handling it in userspace for a hash guest.
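The semantics described above (zero or copy one 4k page, with 4k alignment required) can be sketched against a flat buffer standing in for guest memory. The flag values and function names below are hypothetical and are not the PAPR definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PG_SIZE 4096u
#define PG_ZERO 0x1u /* hypothetical flag values, not the PAPR ones */
#define PG_COPY 0x2u

/* A page address is valid if it is 4k aligned and lies fully inside memory. */
static bool page_ok(uint64_t addr, size_t mem_size)
{
    return addr % PG_SIZE == 0 && mem_size >= PG_SIZE &&
           addr <= mem_size - PG_SIZE;
}

/* Zero or copy one 4k page inside a flat buffer standing in for guest memory. */
static int page_init(unsigned int flags, uint8_t *mem, size_t mem_size,
                     uint64_t dest, uint64_t src)
{
    if (!page_ok(dest, mem_size))
        return -1;

    if (flags & PG_COPY) {
        if (!page_ok(src, mem_size))
            return -1;
        memmove(mem + dest, mem + src, PG_SIZE); /* safe even if dest == src */
    } else if (flags & PG_ZERO) {
        memset(mem + dest, 0, PG_SIZE);
    }
    return 0;
}
```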
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Implement a virtual mode handler for the H_CALL H_PAGE_INIT which can be
used to zero or copy a guest page. The page is defined to be 4k and must
be 4k aligned.
The in-kernel handler halves the time to handle this H_CALL compared to
handling it in userspace for a radix guest.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This adds a flag so that the DAWR can be enabled on P9 via:
echo Y > /sys/kernel/debug/powerpc/dawr_enable_dangerous
The DAWR was previously force disabled on POWER9 in:
9654153158 powerpc: Disable DAWR in the base POWER9 CPU features
Also see Documentation/powerpc/DAWR-POWER9.txt
This is a dangerous setting, USE AT YOUR OWN RISK.
Some users may not care about a bad user crashing their box
(ie. single user/desktop systems) and really want the DAWR. This
allows them to force enable DAWR.
This flag can also be used to disable DAWR access. Once this is
cleared, all DAWR access should be cleared immediately and your
machine is once again safe from crashing.
Userspace may get confused by toggling this. If DAWR is force
enabled/disabled between getting the number of breakpoints (via
PTRACE_GETHWDBGINFO) and setting the breakpoint, userspace will get an
inconsistent view of what's available. Similarly for guests.
For the DAWR to be enabled in a KVM guest, the DAWR needs to be force
enabled in the host AND the guest. For this reason, this won't work on
POWERVM as it doesn't allow the HCALL to work. Writes of 'Y' to the
dawr_enable_dangerous file will fail if the hypervisor doesn't support
writing the DAWR.
To double check the DAWR is working, run this kernel selftest:
tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
Any errors/failures/skips mean something is wrong.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The support for XIVE native exploitation mode in Linux/KVM needs a
couple more OPAL calls to get and set the state of the XIVE internal
structures being used by a sPAPR guest.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
There is a hardware bug in some POWER9 processors where a treclaim in
fake suspend mode can cause an inconsistency in the XER[SO] bit across
the threads of a core, the workaround being to force the core into SMT4
when doing the treclaim.
The FAKE_SUSPEND bit (bit 10) in the PSSCR is used to control whether a
thread is in fake suspend or real suspend. The important difference here
being that thread reconfiguration is blocked in real suspend but not
fake suspend mode.
When we exit a guest which was in fake suspend mode, we force the core
into SMT4 while we do the treclaim in kvmppc_save_tm_hv().
However on the new exit path introduced with the function
kvmhv_run_single_vcpu() we restore the host PSSCR before calling
kvmppc_save_tm_hv() which means that if we were in fake suspend mode we
put the thread into real suspend mode when we clear the
PSSCR[FAKE_SUSPEND] bit. This means that we block thread reconfiguration
and the thread which is trying to get the core into SMT4 before it can
do the treclaim spins forever since it itself is blocking thread
reconfiguration. The result is that that core is essentially lost.
This results in a trace such as:
[ 93.512904] CPU: 7 PID: 13352 Comm: qemu-system-ppc Not tainted 5.0.0 #4
[ 93.512905] NIP: c000000000098a04 LR: c0000000000cc59c CTR: 0000000000000000
[ 93.512908] REGS: c000003fffd2bd70 TRAP: 0100 Not tainted (5.0.0)
[ 93.512908] MSR: 9000000302883033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[SE]> CR: 22222444 XER: 00000000
[ 93.512914] CFAR: c000000000098a5c IRQMASK: 3
[ 93.512915] PACATMSCRATCH: 0000000000000001
[ 93.512916] GPR00: 0000000000000001 c000003f6cc1b830 c000000001033100 0000000000000004
[ 93.512928] GPR04: 0000000000000004 0000000000000002 0000000000000004 0000000000000007
[ 93.512930] GPR08: 0000000000000000 0000000000000004 0000000000000000 0000000000000004
[ 93.512932] GPR12: c000203fff7fc000 c000003fffff9500 0000000000000000 0000000000000000
[ 93.512935] GPR16: 2000000000300375 000000000000059f 0000000000000000 0000000000000000
[ 93.512951] GPR20: 0000000000000000 0000000000080053 004000000256f41f c000003f6aa88ef0
[ 93.512953] GPR24: c000003f6aa89100 0000000000000010 0000000000000000 0000000000000000
[ 93.512956] GPR28: c000003f9e9a0800 0000000000000000 0000000000000001 c000203fff7fc000
[ 93.512959] NIP [c000000000098a04] pnv_power9_force_smt4_catch+0x1b4/0x2c0
[ 93.512960] LR [c0000000000cc59c] kvmppc_save_tm_hv+0x40/0x88
[ 93.512960] Call Trace:
[ 93.512961] [c000003f6cc1b830] [0000000000080053] 0x80053 (unreliable)
[ 93.512965] [c000003f6cc1b8a0] [c00800001e9cb030] kvmhv_p9_guest_entry+0x508/0x6b0 [kvm_hv]
[ 93.512967] [c000003f6cc1b940] [c00800001e9cba44] kvmhv_run_single_vcpu+0x2dc/0xb90 [kvm_hv]
[ 93.512968] [c000003f6cc1ba10] [c00800001e9cc948] kvmppc_vcpu_run_hv+0x650/0xb90 [kvm_hv]
[ 93.512969] [c000003f6cc1bae0] [c00800001e8f620c] kvmppc_vcpu_run+0x34/0x48 [kvm]
[ 93.512971] [c000003f6cc1bb00] [c00800001e8f2d4c] kvm_arch_vcpu_ioctl_run+0x2f4/0x400 [kvm]
[ 93.512972] [c000003f6cc1bb90] [c00800001e8e3918] kvm_vcpu_ioctl+0x460/0x7d0 [kvm]
[ 93.512974] [c000003f6cc1bd00] [c0000000003ae2c0] do_vfs_ioctl+0xe0/0x8e0
[ 93.512975] [c000003f6cc1bdb0] [c0000000003aeb24] ksys_ioctl+0x64/0xe0
[ 93.512978] [c000003f6cc1be00] [c0000000003aebc8] sys_ioctl+0x28/0x80
[ 93.512981] [c000003f6cc1be20] [c00000000000b3a4] system_call+0x5c/0x70
[ 93.512983] Instruction dump:
[ 93.512986] 419dffbc e98c0000 2e8b0000 38000001 60000000 60000000 60000000 40950068
[ 93.512993] 392bffff 39400000 79290020 39290001 <7d2903a6> 60000000 60000000 7d235214
To fix this we preserve the PSSCR[FAKE_SUSPEND] bit until we call
kvmppc_save_tm_hv() which will mean the core can get into SMT4 and
perform the treclaim. Note kvmppc_save_tm_hv() clears the
PSSCR[FAKE_SUSPEND] bit again so there is no need to explicitly do that.
Fixes: 95a6432ce9 ("KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests")
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
When I updated the spectre_v2 reporting to handle software count cache
flush I got the logic wrong when there's no software count cache
enabled at all.
The result is that on systems with the software count cache flush
disabled we print:
Mitigation: Indirect branch cache disabled, Software count cache flush
Which correctly indicates that the count cache is disabled, but
incorrectly says the software count cache flush is enabled.
The root of the problem is that we are trying to handle all
combinations of options. But we know now that we only expect to see
the software count cache flush enabled if the other options are false.
So split the two cases, which simplifies the logic and fixes the bug.
We were also missing a space before "(hardware accelerated)".
The result is we see one of:
Mitigation: Indirect branch serialisation (kernel only)
Mitigation: Indirect branch cache disabled
Mitigation: Software count cache flush
Mitigation: Software count cache flush (hardware accelerated)
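The split into two cases can be sketched in user space as follows. The enum, the function name and the final "Vulnerable" fallback are assumptions made for this illustration; only the four "Mitigation:" strings come from the list above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical encoding of the count cache flush setting. */
enum cc_flush { CC_FLUSH_NONE, CC_FLUSH_SW, CC_FLUSH_HW };

/*
 * bcs: indirect branch serialisation, ccd: count cache disabled.
 * Returns the length of the report, as snprintf() does.
 */
static int spectre_v2_report(char *buf, size_t len, bool bcs, bool ccd,
                             enum cc_flush flush)
{
    /* Case 1: firmware-provided options; the software flush is not reported. */
    if (bcs || ccd)
        return snprintf(buf, len, "Mitigation: %s%s%s",
                        bcs ? "Indirect branch serialisation (kernel only)" : "",
                        (bcs && ccd) ? ", " : "",
                        ccd ? "Indirect branch cache disabled" : "");

    /* Case 2: software count cache flush, optionally hardware accelerated. */
    if (flush != CC_FLUSH_NONE)
        return snprintf(buf, len, "Mitigation: Software count cache flush%s",
                        flush == CC_FLUSH_HW ? " (hardware accelerated)" : "");

    return snprintf(buf, len, "Vulnerable"); /* assumption: fallback text */
}
```

Because the first case returns early, a stale flush setting can no longer leak into the report when the count cache is disabled, which is the bug described above.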
Fixes: ee13cb249f ("powerpc/64s: Add support for software count cache flush")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Diana Craciun <diana.craciun@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
MAX_PHYSMEM_BITS only needs to be defined if CONFIG_SPARSEMEM is
enabled, and that was the case before commit 4ffe713b75
("powerpc/mm: Increase the max addressable memory to 2PB").
On 32-bit systems, where CONFIG_SPARSEMEM is not enabled, we now
define it as 46. That is larger than the real number of physical
address bits, and breaks calculations in zsmalloc:
mm/zsmalloc.c:130:49: warning: right shift count is negative
MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
^~
...
mm/zsmalloc.c:253:21: error: variably modified 'size_class' at file scope
struct size_class *size_class[ZS_SIZE_CLASSES];
^~~~~~~~~~
Fixes: 4ffe713b75 ("powerpc/mm: Increase the max addressable memory to 2PB")
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Not only the 603 but all 6xx need SPRN_SPRG_PGDIR to be initialised at
startup. This patch moves it from __setup_cpu_603() to start_here()
and __secondary_start(), close to the initialisation of SPRN_THREAD.
Previously, virt addr of PGDIR was retrieved from thread struct.
Now that it is the phys addr which is stored in SPRN_SPRG_PGDIR,
hash_page() shall not convert it to phys anymore.
This patch removes the conversion.
Fixes: 93c4a162b0 ("powerpc/6xx: Store PGDIR physical address in a SPRG")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Jakub Drnec reported:
Setting the realtime clock can sometimes make the monotonic clock go
back by over a hundred years. Decreasing the realtime clock across
the y2k38 threshold is one reliable way to reproduce. Allegedly this
can also happen just by running ntpd, I have not managed to
reproduce that other than booting with rtc at >2038 and then running
ntp. When this happens, anything with timers (e.g. openjdk) breaks
rather badly.
And included a test case (slightly edited for brevity):
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <unistd.h>
long get_time(void) {
    struct timespec tp;
    clock_gettime(CLOCK_MONOTONIC, &tp);
    return tp.tv_sec + tp.tv_nsec / 1000000000;
}
int main(void) {
    long last = get_time();
    while(1) {
        long now = get_time();
        if (now < last) {
            printf("clock went backwards by %ld seconds!\n", last - now);
        }
        last = now;
        sleep(1);
    }
    return 0;
}
Which when run concurrently with:
# date -s 2040-1-1
# date -s 2037-1-1
Will detect the clock going backward.
The root cause is that wtom_clock_sec in struct vdso_data is only a
32-bit signed value, even though we set its value to be equal to
tk->wall_to_monotonic.tv_sec which is 64-bits.
Because the monotonic clock starts at zero when the system boots the
wall_to_monotonic.tv_sec offset is negative for current and future
dates. Currently on a freshly booted system the offset will be in the
vicinity of negative 1.5 billion seconds.
However if the wall clock is set past the Y2038 boundary, the offset
from wall to monotonic becomes less than negative 2^31, and no longer
fits in 32-bits. When that value is assigned to wtom_clock_sec it is
truncated and becomes positive, causing the VDSO assembly code to
calculate CLOCK_MONOTONIC incorrectly.
That causes CLOCK_MONOTONIC to jump ahead by ~4 billion seconds which
it is not meant to do. Worse, if the time is then set back before the
Y2038 boundary CLOCK_MONOTONIC will jump backward.
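The arithmetic can be checked with a small user-space model. The sketch below is not the VDSO code: the function names are hypothetical, and the example assumes the wall clock is set to 2040-01-01 (2208988800 seconds since the epoch) 100 seconds after boot, giving an offset of -2208988700.

```c
#include <stdint.h>

/* Keep only the low 32 bits of the offset, as a signed 32-bit field would. */
static int32_t truncate_wtom(int64_t wtom_sec)
{
    uint32_t low = (uint32_t)wtom_sec;

    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    /* Portable two's complement reinterpretation of the upper half. */
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}

/* CLOCK_MONOTONIC seconds as wall clock seconds plus the (negative) offset. */
static int64_t monotonic_sec(int64_t realtime_sec, int64_t wtom_sec)
{
    return realtime_sec + wtom_sec;
}
```

With the values above, the full 64-bit offset gives a monotonic time of 100 seconds, while the truncated offset becomes +2085978596 and the result is 100 + 2^32 seconds. For an offset around -1.5 billion the truncation is harmless, which is why the problem only appears past the Y2038 boundary.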
We can fix it simply by storing the full 64-bit offset in the
vdso_data, and using that in the VDSO assembly code. We also shuffle
some of the fields in vdso_data to avoid creating a hole.
The original commit that added the CLOCK_MONOTONIC support to the VDSO
did actually use a 64-bit value for wtom_clock_sec, see commit
a7f290dad3 ("[PATCH] powerpc: Merge vdso's and add vdso support to
32 bits kernel") (Nov 2005). However just 3 days later it was
converted to 32-bits in commit 0c37ec2aa8 ("[PATCH] powerpc: vdso
fixes (take #2)"), and the bug has existed since then AFAICS.
Fixes: 0c37ec2aa8 ("[PATCH] powerpc: vdso fixes (take #2)")
Cc: stable@vger.kernel.org # v2.6.15+
Link: http://lkml.kernel.org/r/HaC.ZfES.62bwlnvAvMP.1STMMj@seznam.cz
Reported-by: Jakub Drnec <jaydee@email.cz>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes
the common Kbuild.asm file. Factor out the duplicated include directives
to scripts/Makefile.asm-generic so that no architecture would opt out
of the mandatory-y mechanism.
um is not forced to include mandatory-y since it is a very exceptional
case which does not support UAPI.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The generic-y is redundant under any of the following conditions:
- arch has its own implementation
- the same header is added to generated-y
- the same header is added to mandatory-y
If a redundant generic-y is found, a warning like the following is displayed:
scripts/Makefile.asm-generic:20: redundant generic-y found in arch/arm/include/asm/Kbuild: timex.h
I fixed up arch Kbuild files found by this.
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Merge tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull device-dax updates from Dan Williams:
"New device-dax infrastructure to allow persistent memory and other
"reserved" / performance differentiated memories, to be assigned to
the core-mm as "System RAM".
Some users want to use persistent memory as additional volatile
memory. They are willing to cope with potential performance
differences, for example between DRAM and 3D Xpoint, and want to use
typical Linux memory management apis rather than a userspace memory
allocator layered over an mmap() of a dax file. The administration
model is to decide how much Persistent Memory (pmem) to use as System
RAM, create a device-dax-mode namespace of that size, and then assign
it to the core-mm. The rationale for device-dax is that it is a
generic memory-mapping driver that can be layered over any "special
purpose" memory, not just pmem. On subsequent boots udev rules can be
used to restore the memory assignment.
One implication of using pmem as RAM is that mlock() no longer keeps
data off persistent media. For this reason it is recommended to enable
NVDIMM Security (previously merged for 5.0) to encrypt pmem contents
at rest. We considered making this recommendation an actively enforced
requirement, but in the end decided to leave it as a distribution /
administrator policy to allow for emulation and test environments that
lack security capable NVDIMMs.
Summary:
- Replace the /sys/class/dax device model with /sys/bus/dax, and
include a compat driver so distributions can opt-in to the new ABI.
- Allow for an alternative driver for the device-dax address-range
- Introduce the 'kmem' driver to hotplug / assign a device-dax
address-range to the core-mm.
- Arrange for the device-dax target-node to be onlined so that the
newly added memory range can be uniquely referenced by numa apis"
NOTE! I'm not entirely happy with the whole "PMEM as RAM" model because
we currently have special - and very annoying - rules in the kernel about
accessing PMEM only with the "MC safe" accessors, because machine checks
inside the regular repeat string copy functions can be fatal in some
(not described) circumstances.
And apparently the PMEM modules can cause that a lot more than regular
RAM. The argument is that this happens because PMEM doesn't necessarily
get scrubbed at boot like RAM does, but that is planned to be added for
the user space tooling.
Quoting Dan from another email:
"The exposure can be reduced in the volatile-RAM case by scanning for
and clearing errors before it is onlined as RAM. The userspace tooling
for that can be in place before v5.1-final. There's also runtime
notifications of errors via acpi_nfit_uc_error_notify() from
background scrubbers on the DIMM devices. With that mechanism the
kernel could proactively clear newly discovered poison in the volatile
case, but that would be additional development more suitable for v5.2.
I understand the concern, and the need to highlight this issue by
tapping the brakes on feature development, but I don't see PMEM as RAM
making the situation worse when the exposure is also there via DAX in
the PMEM case. Volatile-RAM is arguably a safer use case since it's
possible to repair pages where the persistent case needs active
application coordination"
* tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
device-dax: "Hotplug" persistent memory for use like normal RAM
mm/resource: Let walk_system_ram_range() search child resources
mm/memory-hotplug: Allow memory resources to be children
mm/resource: Move HMM pr_debug() deeper into resource code
mm/resource: Return real error codes from walk failures
device-dax: Add a 'modalias' attribute to DAX 'bus' devices
device-dax: Add a 'target_node' attribute
device-dax: Auto-bind device after successful new_id
acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node
device-dax: Add /sys/class/dax backwards compatibility
device-dax: Add support for a dax override driver
device-dax: Move resource pinning+mapping into the common driver
device-dax: Introduce bus + driver model
device-dax: Start defining a dax bus model
device-dax: Remove multi-resource infrastructure
device-dax: Kill dax_region base
device-dax: Kill dax_region ida
One fix to prevent runtime allocation of 16GB pages when running in a VM (as
opposed to bare metal), because it doesn't work.
A small fix to our recently added KCOV support to exempt some more code from
being instrumented.
Plus a few minor build fixes, a small dead code removal and a defconfig update.
Thanks to:
Alexey Kardashevskiy, Aneesh Kumar K.V, Christophe Leroy, Jason Yan, Joel
Stanley, Mahesh Salgaonkar, Mathieu Malaterre.
Merge tag 'powerpc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One fix to prevent runtime allocation of 16GB pages when running in a
VM (as opposed to bare metal), because it doesn't work.
A small fix to our recently added KCOV support to exempt some more
code from being instrumented.
Plus a few minor build fixes, a small dead code removal and a
defconfig update.
Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Christophe Leroy,
Jason Yan, Joel Stanley, Mahesh Salgaonkar, Mathieu Malaterre"
* tag 'powerpc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Include <asm/nmi.h> header file to fix a warning
powerpc/powernv: Fix compile without CONFIG_TRACEPOINTS
powerpc/mm: Disable kcov for SLB routines
powerpc: remove dead code in head_fsl_booke.S
powerpc/configs: Sync skiroot defconfig
powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR configuration
for 32-bit guests
s390: interrupt cleanup, introduction of the Guest Information Block,
preparation for processor subfunctions in cpu models
PPC: bug fixes and improvements, especially related to machine checks
and protection keys
x86: many, many cleanups, including removing a bunch of MMU code for
unnecessary optimizations; plus AVIC fixes.
Generic: memcg accounting
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- some cleanups
- direct physical timer assignment
- cache sanitization for 32-bit guests
s390:
- interrupt cleanup
- introduction of the Guest Information Block
- preparation for processor subfunctions in cpu models
PPC:
- bug fixes and improvements, especially related to machine checks
and protection keys
x86:
- many, many cleanups, including removing a bunch of MMU code for
unnecessary optimizations
- AVIC fixes
Generic:
- memcg accounting"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (147 commits)
kvm: vmx: fix formatting of a comment
KVM: doc: Document the life cycle of a VM and its resources
MAINTAINERS: Add KVM selftests to existing KVM entry
Revert "KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()"
KVM: PPC: Book3S: Add count cache flush parameters to kvmppc_get_cpu_char()
KVM: PPC: Fix compilation when KVM is not enabled
KVM: Minor cleanups for kvm_main.c
KVM: s390: add debug logging for cpu model subfunctions
KVM: s390: implement subfunction processor calls
arm64: KVM: Fix architecturally invalid reset value for FPEXC32_EL2
KVM: arm/arm64: Remove unused timer variable
KVM: PPC: Book3S: Improve KVM reference counting
KVM: PPC: Book3S HV: Fix build failure without IOMMU support
Revert "KVM: Eliminate extra function calls in kvm_get_dirty_log_protect()"
x86: kvmguest: use TSC clocksource if invariant TSC is exposed
KVM: Never start grow vCPU halt_poll_ns from value below halt_poll_ns_grow_start
KVM: Expose the initial start value in grow_halt_poll_ns() as a module parameter
KVM: grow_halt_poll_ns() should never shrink vCPU halt_poll_ns
KVM: x86/mmu: Consolidate kvm_mmu_zap_all() and kvm_mmu_zap_mmio_sptes()
KVM: x86/mmu: WARN if zapping a MMIO spte results in zapping children
...
Make sure to include <asm/nmi.h> to provide the following prototype:
hv_nmi_check_nonrecoverable.
Remove the following warning treated as error (W=1):
arch/powerpc/kernel/traps.c:393:6: error: no previous prototype for 'hv_nmi_check_nonrecoverable'
Fixes: ccd477028a ("powerpc/64s: Fix HV NMI vs HV interrupt recoverability test")
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The function returns s64 but the return statement is missing.
This adds the missing return statement.
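The class of bug fixed here can be sketched in plain C. This is only an illustration under stated assumptions: do_call(), traced_call() and trace_depth are hypothetical names, and s64 is a userspace typedef standing in for the kernel type. In standard C, reaching the end of a non-void function and then using its value in the caller is undefined behaviour, which is why the final return matters.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s64;    /* userspace stand-in for the kernel's s64 */

static int trace_depth; /* hypothetical bookkeeping around the call */

/* Hypothetical inner call whose status must reach the caller. */
static s64 do_call(s64 opcode)
{
    return opcode < 0 ? -1 : opcode * 2;
}

/*
 * Wrapper in its fixed form. Without the final "return ret;" the
 * function would fall off its end and the caller would read an
 * undefined value.
 */
static s64 traced_call(s64 opcode)
{
    s64 ret;

    trace_depth++;
    ret = do_call(opcode);
    trace_depth--;
    assert(trace_depth >= 0);

    return ret; /* the kind of statement such a fix adds */
}
```

Compilers typically flag the broken form with -Wreturn-type, so building with warnings enabled tends to catch it early.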
Fixes: 75d9fc7fd9 ("powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Add checks for the return value of memblock_alloc*() functions and call
panic() in case of error. The panic message repeats the one used by the
panicking memblock allocators, with the parameters adjusted to include
only the relevant ones.
The replacement was mostly automated with semantic patches like the one
below with manual massaging of format strings.
@@
expression ptr, size, align;
@@
ptr = memblock_alloc(size, align);
+ if (!ptr)
+ panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);
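As a rough userspace illustration of what a converted call site looks like, the sketch below uses hypothetical stand-ins (mock_memblock_alloc(), mock_panic(), setup_table()) rather than the kernel functions. The mock panic records its message instead of halting so the effect can be observed, and the format string uses %zu / %zx because the mock takes size_t arguments.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; not the kernel API. */
static int mock_alloc_should_fail;  /* set to 1 to simulate exhaustion */
static int mock_panicked;           /* set by mock_panic() */
static char mock_panic_msg[128];

/* Records the message instead of halting the machine. */
static void mock_panic(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(mock_panic_msg, sizeof(mock_panic_msg), fmt, ap);
    va_end(ap);
    mock_panicked = 1;
}

/* Returns zeroed memory, or NULL when failure is simulated. */
static void *mock_memblock_alloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    if (mock_alloc_should_fail)
        return NULL;
    return calloc(1, size); /* alignment is not honoured by this mock */
}

/* A call site after the conversion: check the pointer, then panic. */
static void *setup_table(size_t size, size_t align)
{
    void *ptr = mock_memblock_alloc(size, align);

    if (!ptr)
        mock_panic("%s: Failed to allocate %zu bytes align=0x%zx\n",
                   __func__, size, align);
    return ptr;
}
```

The point of moving the check to the caller is that the message can name the failing function via __func__ and carry only the parameters that call site actually used.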
[anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
[rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
[rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
[akpm@linux-foundation.org: fix xtensa printk warning]
Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Guo Ren <ren_guo@c-sky.com> [c-sky]
Acked-by: Paul Burton <paul.burton@mips.com> [MIPS]
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> [s390]
Reviewed-by: Juergen Gross <jgross@suse.com> [Xen]
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The memblock_alloc_base() function tries to allocate memory up to the
limit specified by its max_addr parameter and panics if the allocation
fails. Replace its usage with memblock_phys_alloc_range() and make the
callers check the return value and panic in case of error.
Link: http://lkml.kernel.org/r/1548057848-15136-10-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com> [c-sky]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Juergen Gross <jgross@suse.com> [Xen]
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make the memblock_phys_alloc() function an inline wrapper for
memblock_phys_alloc_range() and update the memblock_phys_alloc() callers
to check the returned value and panic in case of error.
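The wrapper relationship can be modelled in userspace roughly as follows. Everything here is an assumption made for illustration: phys_t, RANGE_NO_LIMIT, the bump pointer and the mock_* function names are hypothetical and do not reproduce the memblock implementation. The sketch only shows a range-limited allocator that reports failure by returning 0, plus an unconstrained inline wrapper over it.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_t;               /* stand-in for phys_addr_t */
#define RANGE_NO_LIMIT UINT64_MAX      /* hypothetical "no upper bound" */

static phys_t next_free = 0x1000;      /* hypothetical bump pointer */
static const phys_t mem_end = 0x10000; /* hypothetical end of memory */

/*
 * Allocate "size" bytes aligned to "align" inside [start, end).
 * Returns 0 on failure instead of panicking.
 */
static phys_t mock_phys_alloc_range(phys_t size, phys_t align,
                                    phys_t start, phys_t end)
{
    phys_t base;

    assert(align != 0 && (align & (align - 1)) == 0);
    if (end > mem_end)
        end = mem_end;
    base = next_free > start ? next_free : start;
    base = (base + align - 1) & ~(align - 1);
    if (base >= end || size > end - base)
        return 0;
    next_free = base + size;
    return base;
}

/* The unconstrained variant is a thin inline wrapper. */
static inline phys_t mock_phys_alloc(phys_t size, phys_t align)
{
    return mock_phys_alloc_range(size, align, 0, RANGE_NO_LIMIT);
}
```

Because neither function panics in this model, every caller has to test for a 0 return and decide for itself whether the failure is fatal.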
Link: http://lkml.kernel.org/r/1548057848-15136-8-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com> [c-sky]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Juergen Gross <jgross@suse.com> [Xen]
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The memblock_phys_alloc_try_nid() function tries to allocate memory from
the requested node and then falls back to allocation from any node in
the system. The memblock_alloc_base() fallback used by this function
panics if the allocation fails.
Replace the memblock_alloc_base() fallback with the direct call to
memblock_alloc_range_nid() and update the memblock_phys_alloc_try_nid()
callers to check the returned value and panic in case of error.
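The "requested node first, then any node" behaviour described above can be sketched with a toy model. The struct node_pool table, NID_ANY, ALLOC_FAILED and both function names are hypothetical and exist only for this example; the real allocator works on memblock regions rather than per-node byte counters.

```c
#include <assert.h>
#include <stddef.h>

#define NID_ANY      (-1) /* hypothetical marker: no node preference */
#define ALLOC_FAILED (-2) /* hypothetical failure code */

/* Hypothetical per-node free-space table. */
struct node_pool {
    int nid;
    size_t free_bytes;
};

/*
 * Take "size" bytes from the first pool matching "nid" (any pool for
 * NID_ANY). Returns the nid that satisfied the request, or ALLOC_FAILED.
 */
static int alloc_range_nid(struct node_pool *pools, size_t n,
                           size_t size, int nid)
{
    assert(size > 0);
    for (size_t i = 0; i < n; i++) {
        if (nid != NID_ANY && pools[i].nid != nid)
            continue;
        if (pools[i].free_bytes >= size) {
            pools[i].free_bytes -= size;
            return pools[i].nid;
        }
    }
    return ALLOC_FAILED;
}

/*
 * Prefer the requested node, then retry without the node constraint.
 * Failure is reported to the caller instead of panicking here.
 */
static int alloc_try_nid(struct node_pool *pools, size_t n,
                         size_t size, int nid)
{
    int got = alloc_range_nid(pools, n, size, nid);

    if (got == ALLOC_FAILED && nid != NID_ANY)
        got = alloc_range_nid(pools, n, size, NID_ANY);
    return got;
}
```

With the fallback no longer panicking, the decision about whether an allocation failure is fatal moves to each call site.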
Link: http://lkml.kernel.org/r/1548057848-15136-7-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com> [c-sky]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Juergen Gross <jgross@suse.com> [Xen]
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since only the virtual address of the allocated blocks is used, let's use
functions that directly return a virtual address.
Those functions also have the advantage of zeroing the block.
[rppt@linux.ibm.com: powerpc: remove duplicated alloc_stack() function]
Link: http://lkml.kernel.org/r/20190226064032.GA5873@rapoport-lnx
[rppt@linux.ibm.com: updated error message in alloc_stack() to be more verbose]
[rppt@linux.ibm.com: converted several additional call sites]
Link: http://lkml.kernel.org/r/1548057848-15136-3-git-send-email-rppt@linux.ibm.com
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Guo Ren <ren_guo@c-sky.com> [c-sky]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Juergen Gross <jgross@suse.com> [Xen]
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- do not generate unneeded top-level built-in.a
- let git ignore O= directory entirely
- optimize scripts/kallsyms slightly
- exclude DWARF info from *.s regardless of config options
- fix GCC toolchain search path for Clang to prepare ld.lld support
- do not generate modules.order when CONFIG_MODULES is disabled
- simplify single target rules and remove VPATH for external module build
- allow adding optional flags to dpkg-buildpackage when building deb-pkg
- move some compiler option tests from Makefile to Kconfig
- various Makefile cleanups
Merge tag 'kbuild-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- do not generate unneeded top-level built-in.a
- let git ignore O= directory entirely
- optimize scripts/kallsyms slightly
- exclude DWARF info from *.s regardless of config options
- fix GCC toolchain search path for Clang to prepare ld.lld support
- do not generate modules.order when CONFIG_MODULES is disabled
- simplify single target rules and remove VPATH for external module
build
- allow adding optional flags to dpkg-buildpackage when building
deb-pkg
- move some compiler option tests from Makefile to Kconfig
- various Makefile cleanups
* tag 'kbuild-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (40 commits)
kbuild: remove scripts/basic/% build target
kbuild: use -Werror=implicit-... instead of -Werror-implicit-...
kbuild: clean up scripts/gcc-version.sh
kbuild: remove cc-version macro
kbuild: update comment block of scripts/clang-version.sh
kbuild: remove commented-out INITRD_COMPRESS
kbuild: move -gsplit-dwarf, -gdwarf-4 option tests to Kconfig
kbuild: [bin]deb-pkg: add DPKG_FLAGS variable
kbuild: move ".config not found!" message from Kconfig to Makefile
kbuild: invoke syncconfig if include/config/auto.conf.cmd is missing
kbuild: simplify single target rules
kbuild: remove empty rules for makefiles
kbuild: make -r/-R effective in top Makefile for old Make versions
kbuild: move tools_silent to a more relevant place
kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig
kbuild: refactor cc-cross-prefix implementation
kbuild: hardcode genksyms path and remove GENKSYMS variable
scripts/gdb: refactor rules for symlink creation
kbuild: create symlink to vmlinux-gdb.py in scripts_gdb target
scripts/gdb: do not descend into scripts/gdb from scripts
...
- add debugfs support for dumping dma-debug information (Corentin Labbe)
- Kconfig cleanups (Andy Shevchenko and me)
- debugfs cleanups (Greg Kroah-Hartman)
- improve dma_map_resource and use it in the media code
- arch_setup_dma_ops / arch_teardown_dma_ops cleanups
- various small cleanups and improvements for the per-device coherent
allocator
- make the DMA mask an upper bound and don't fail "too large" dma mask
in the remaining two architectures - this will allow big driver
cleanups in the following merge windows
Merge tag 'dma-mapping-5.1' of git://git.infradead.org/users/hch/dma-mapping
Pull DMA mapping updates from Christoph Hellwig:
- add debugfs support for dumping dma-debug information (Corentin
Labbe)
- Kconfig cleanups (Andy Shevchenko and me)
- debugfs cleanups (Greg Kroah-Hartman)
- improve dma_map_resource and use it in the media code
- arch_setup_dma_ops / arch_teardown_dma_ops cleanups
- various small cleanups and improvements for the per-device coherent
allocator
- make the DMA mask an upper bound and don't fail "too large" dma mask
in the remaining two architectures - this will allow big driver
cleanups in the following merge windows
* tag 'dma-mapping-5.1' of git://git.infradead.org/users/hch/dma-mapping: (21 commits)
Documentation/DMA-API-HOWTO: update dma_mask sections
sparc64/pci_sun4v: allow large DMA masks
sparc64/iommu: allow large DMA masks
sparc64: refactor the ali DMA quirk
ccio: allow large DMA masks
dma-mapping: remove the DMA_MEMORY_EXCLUSIVE flag
dma-mapping: remove dma_mark_declared_memory_occupied
dma-mapping: move CONFIG_DMA_CMA to kernel/dma/Kconfig
dma-mapping: improve selection of dma_declare_coherent availability
dma-mapping: remove an incorrect __iommem annotation
of: select OF_RESERVED_MEM automatically
device.h: dma_mem is only needed for HAVE_GENERIC_DMA_COHERENT
mfd/sm501: depend on HAS_DMA
dma-mapping: add a kconfig symbol for arch_teardown_dma_ops availability
dma-mapping: add a kconfig symbol for arch_setup_dma_ops availability
dma-mapping: move debug configuration options to kernel/dma
dma-debug: add dumping facility via debugfs
dma: debug: no need to check return value of debugfs_create functions
videobuf2: replace a layering violation with dma_map_resource
dma-mapping: don't BUG when calling dma_map_resource on RAM
...
Merge more updates from Andrew Morton:
- some of the rest of MM
- various misc things
- dynamic-debug updates
- checkpatch
- some epoll speedups
- autofs
- rapidio
- lib/, lib/lzo/ updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (83 commits)
samples/mic/mpssd/mpssd.h: remove duplicate header
kernel/fork.c: remove duplicated include
include/linux/relay.h: fix percpu annotation in struct rchan
arch/nios2/mm/fault.c: remove duplicate include
unicore32: stop printing the virtual memory layout
MAINTAINERS: fix GTA02 entry and mark as orphan
mm: create the new vm_fault_t type
arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
arch: simplify several early memory allocations
openrisc: simplify pte_alloc_one_kernel()
sh: prefer memblock APIs returning virtual address
microblaze: prefer memblock API returning virtual address
powerpc: prefer memblock APIs returning virtual address
lib/lzo: separate lzo-rle from lzo
lib/lzo: implement run-length encoding
lib/lzo: fast 8-byte copy on arm64
lib/lzo: 64-bit CTZ on arm64
lib/lzo: tidy-up ifdefs
ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
ipc: annotate implicit fall through
...
There are several early memory allocations in arch/ code that use
memblock_phys_alloc() to allocate memory, convert the returned physical
address to the virtual address and then set the allocated memory to
zero.
Exactly the same behaviour can be achieved simply by calling
memblock_alloc(): it allocates the memory in the same way as
memblock_phys_alloc(), then it performs the phys_to_virt() conversion
and clears the allocated memory.
Replace the longer sequence with a simpler call to memblock_alloc().
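The equivalence of the two sequences can be modelled in userspace as below. This is a sketch under stated assumptions, not kernel code: a byte array plays the role of physical memory, a "physical address" is an offset into it with 0 meaning failure, and all mock_* and alloc_old_style() names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: the arena stands in for physical memory. */
static unsigned char arena[4096];
static size_t arena_next = 64; /* offset 0 is reserved to mean failure */

/* Returns an offset ("physical address"), or 0 when nothing fits. */
static size_t mock_phys_alloc(size_t size, size_t align)
{
    size_t start;

    assert(align != 0 && (align & (align - 1)) == 0);
    start = (arena_next + align - 1) & ~(align - 1);
    if (start > sizeof(arena) || size > sizeof(arena) - start)
        return 0;
    arena_next = start + size;
    return start;
}

static void *mock_phys_to_virt(size_t phys)
{
    return &arena[phys];
}

/* Older pattern: three explicit steps at the call site. */
static void *alloc_old_style(size_t size, size_t align)
{
    size_t phys = mock_phys_alloc(size, align);
    void *virt;

    if (!phys)
        return NULL;
    virt = mock_phys_to_virt(phys);
    memset(virt, 0, size);
    return virt;
}

/* Newer pattern: one helper allocates, converts and clears. */
static void *mock_alloc(size_t size, size_t align)
{
    size_t phys = mock_phys_alloc(size, align);

    if (!phys)
        return NULL;
    return memset(mock_phys_to_virt(phys), 0, size);
}
```

In this model a call site shrinks from the body of alloc_old_style() to a single mock_alloc() call, while the returned memory is zeroed and aligned in the same way.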
Link: http://lkml.kernel.org/r/1546248566-14910-6-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "memblock: simplify several early memory allocation", v4.
These patches simplify some of the early memory allocations by replacing
usage of older memblock APIs with newer and shinier ones.
Quite a few places in the arch/ code allocated memory using a memblock
API that returns a physical address of the allocated area, then
converted this physical address to a virtual one and then used memset(0)
to clear the allocated range.
More recent memblock APIs do all the three steps in one call and their
usage simplifies the code.
It's important to note that regardless of API used, the core allocation
is nearly identical for any set of memblock allocators: first it tries
to find free memory with all the constraints specified by the caller
and then falls back to the allocation with some or all constraints
disabled.
The first three patches perform the conversion of call sites that have
exact requirements for the node and the possible memory range.
The fourth patch is a bit one-off as it simplifies openrisc's
implementation of pte_alloc_one_kernel(), and not only the memblock
usage.
The fifth patch takes care of simpler cases when the allocation can be
satisfied with a simple call to memblock_alloc().
The sixth patch removes one-liner wrappers for memblock_alloc on arm and
unicore32, as suggested by Christoph.
This patch (of 6):
There are several places that allocate memory using memblock APIs that
return a physical address, convert the returned address to the virtual
address and frequently also memset(0) the allocated range.
Update these places to use memblock allocators already returning a
virtual address. Use memblock functions that clear the allocated memory
instead of calling memset(0) where appropriate.
The calls to memblock_alloc_base() that were not followed by memset(0)
are replaced with memblock_alloc_try_nid_raw(). Since the latter does
not panic() when the allocation fails, the appropriate panic() calls are
added to the call sites.
Link: http://lkml.kernel.org/r/1546248566-14910-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mark Salter <msalter@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>