Commit Graph

1041 Commits

Author SHA1 Message Date
Ard Biesheuvel
f80fb3a3d5 arm64: add support for kernel ASLR
This adds support for KASLR is implemented, based on entropy provided by
the bootloader in the /chosen/kaslr-seed DT property. Depending on the size
of the address space (VA_BITS) and the page size, the entropy in the
virtual displacement is up to 13 bits (16k/2 levels) and up to 25 bits (all
4 levels), with the sidenote that displacements that result in the kernel
image straddling a 1GB/32MB/512MB alignment boundary (for 4KB/16KB/64KB
granule kernels, respectively) are not allowed, and will be rounded up to
an acceptable value.

If CONFIG_RANDOMIZE_MODULE_REGION_FULL is enabled, the module region is
randomized independently from the core kernel. This makes it less likely
that the location of core kernel data structures can be determined by an
adversary, but causes all function calls from modules into the core kernel
to be resolved via entries in the module PLTs.

If CONFIG_RANDOMIZE_MODULE_REGION_FULL is not enabled, the module region is
randomized by choosing a page aligned 128 MB region inside the interval
[_etext - 128 MB, _stext + 128 MB). This gives between 10 and 14 bits of
entropy (depending on page size), independently of the kernel randomization,
but still guarantees that modules are within the range of relative branch
and jump instructions (with the caveat that, since the module region is
shared with other uses of the vmalloc area, modules may need to be loaded
further away if the module region is exhausted)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24 14:57:27 +00:00
Ard Biesheuvel
1e48ef7fcc arm64: add support for building vmlinux as a relocatable PIE binary
This implements CONFIG_RELOCATABLE, which links the final vmlinux
image with a dynamic relocation section, allowing the early boot code
to perform a relocation to a different virtual address at runtime.

This is a prerequisite for KASLR (CONFIG_RANDOMIZE_BASE).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24 14:57:27 +00:00
Ard Biesheuvel
6c94f27ac8 arm64: switch to relative exception tables
Instead of using absolute addresses for both the exception location
and the fixup, use offsets relative to the exception table entry values.
Not only does this cut the size of the exception table in half, it is
also a prerequisite for KASLR, since absolute exception table entries
are subject to dynamic relocation, which is incompatible with the sorting
of the exception table that occurs at build time.

This patch also introduces the _ASM_EXTABLE preprocessor macro (which
exists on x86 as well) and its _asm_extable assembly counterpart, as
shorthands to emit exception table entries.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24 14:57:26 +00:00
Ard Biesheuvel
4a2e034e5c arm64: make asm/elf.h available to asm files
This reshuffles some code in asm/elf.h and puts a #ifndef __ASSEMBLY__
around its C definitions so that the CPP defines can be used in asm
source files as well.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24 14:57:25 +00:00
Ard Biesheuvel
6ad1fe5d90 arm64: avoid R_AARCH64_ABS64 relocations for Image header fields
Unfortunately, the current way of using the linker to emit build time
constants into the Image header will no longer work once we switch to
the use of PIE executables. The reason is that such constants are emitted
into the binary using R_AARCH64_ABS64 relocations, which are resolved at
runtime, not at build time, and the places targeted by those relocations
will contain zeroes before that.

So refactor the endian swapping linker script constant generation code so
that it emits the upper and lower 32-bit words separately.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24 14:57:25 +00:00
Ard Biesheuvel
fd045f6cd9 arm64: add support for module PLTs
This adds support for emitting PLTs at module load time for relative
branches that are out of range. This is a prerequisite for KASLR, which
may place the kernel and the modules anywhere in the vmalloc area,
making it more likely that branch target offsets exceed the maximum
range of +/- 128 MB.

In this version, I removed the distinction between relocations against
.init executable sections and ordinary executable sections. The reason
is that it is hardly worth the trouble, given that .init.text usually
does not contain that many far branches, and this version now only
reserves PLT entry space for jump and call relocations against undefined
symbols (since symbols defined in the same module can be assumed to be
within +/- 128 MB)

For example, the mac80211.ko module (which is fairly sizable at ~400 KB)
built with -mcmodel=large gives the following relocation counts:

                    relocs    branches   unique     !local
  .text              3925       3347       518        219
  .init.text           11          8         7          1
  .exit.text            4          4         4          1
  .text.unlikely       81         67        36         17

('unique' means branches to unique type/symbol/addend combos, of which
!local is the subset referring to undefined symbols)

IOW, we are only emitting a single PLT entry for the .init sections, and
we are better off just adding it to the core PLT section instead.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-24 14:57:24 +00:00
Ard Biesheuvel
f98deee9a9 arm64: move brk immediate argument definitions to separate header
Instead of reversing the header dependency between asm/bug.h and
asm/debug-monitors.h, split off the brk instruction immediate value
defines into a new header asm/brk-imm.h, and include it from both.

This solves the circular dependency issue that prevents BUG() from
being used in some header files, and keeps the definitions together.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-23 17:43:25 +00:00
Ard Biesheuvel
8439e62a15 arm64: mm: use bit ops rather than arithmetic in pa/va translations
Since PAGE_OFFSET is chosen such that it cuts the kernel VA space right
in half, and since the size of the kernel VA space itself is always a
power of 2, we can treat PAGE_OFFSET as a bitmask and replace the
additions/subtractions with 'or' and 'and-not' operations.

For the comparison against PAGE_OFFSET, a mov/cmp/branch sequence ends
up getting replaced with a single tbz instruction. For the additions and
subtractions, we save a mov instruction since the mask is folded into the
instruction's immediate field.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-23 17:43:25 +00:00
Ard Biesheuvel
a92405f082 arm64: mm: only perform memstart_addr sanity check if DEBUG_VM
Checking whether memstart_addr has been assigned every time it is
referenced adds a branch instruction that may hurt performance if
the reference in question occurs on a hot path. So only perform the
check if CONFIG_DEBUG_VM=y.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: replaced #ifdef with VM_BUG_ON]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-23 17:42:11 +00:00
Ard Biesheuvel
a7f8de168a arm64: allow kernel Image to be loaded anywhere in physical memory
This relaxes the kernel Image placement requirements, so that it
may be placed at any 2 MB aligned offset in physical memory.

This is accomplished by ignoring PHYS_OFFSET when installing
memblocks, and accounting for the apparent virtual offset of
the kernel Image. As a result, virtual address references
below PAGE_OFFSET are correctly mapped onto physical references
into the kernel Image regardless of where it sits in memory.

Special care needs to be taken for dealing with memory limits passed
via mem=, since the generic implementation clips memory top down, which
may clip the kernel image itself if it is loaded high up in memory. To
deal with this case, we simply add back the memory covering the kernel
image, which may result in more memory to be retained than was passed
as a mem= parameter.

Since mem= should not be considered a production feature, a panic notifier
handler is installed that dumps the memory limit at panic time if one was
set.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:53 +00:00
Ard Biesheuvel
a89dea5853 arm64: defer __va translation of initrd_start and initrd_end
Before deferring the assignment of memstart_addr in a subsequent patch, to
the moment where all memory has been discovered and possibly clipped based
on the size of the linear region and the presence of a mem= command line
parameter, we need to ensure that memstart_addr is not used to perform __va
translations before it is assigned.

One such use is in the generic early DT discovery of the initrd location,
which is recorded as a virtual address in the globals initrd_start and
initrd_end. So wire up the generic support to declare the initrd addresses,
and implement it without __va() translations, and perform the translation
after memstart_addr has been assigned.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:49 +00:00
Ard Biesheuvel
f9040773b7 arm64: move kernel image to base of vmalloc area
This moves the module area to right before the vmalloc area, and moves
the kernel image to the base of the vmalloc area. This is an intermediate
step towards implementing KASLR, which allows the kernel image to be
located anywhere in the vmalloc area.

Since other subsystems such as hibernate may still need to refer to the
kernel text or data segments via their linears addresses, both are mapped
in the linear region as well. The linear alias of the text region is
mapped read-only/non-executable to prevent inadvertent modification or
execution.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:44 +00:00
Ard Biesheuvel
a0bf9776cd arm64: kvm: deal with kernel symbols outside of linear mapping
KVM on arm64 uses a fixed offset between the linear mapping at EL1 and
the HYP mapping at EL2. Before we can move the kernel virtual mapping
out of the linear mapping, we have to make sure that references to kernel
symbols that are accessed via the HYP mapping are translated to their
linear equivalent.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:40 +00:00
Ard Biesheuvel
6533945a32 arm64: pgtable: implement static [pte|pmd|pud]_offset variants
The page table accessors pte_offset(), pud_offset() and pmd_offset()
rely on __va translations, so they can only be used after the linear
mapping has been installed. For the early fixmap and kasan init routines,
whose page tables are allocated statically in the kernel image, these
functions will return bogus values. So implement pte_offset_kimg(),
pmd_offset_kimg() and pud_offset_kimg(), which can be used instead
before any page tables have been allocated dynamically.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:31 +00:00
Ard Biesheuvel
ab893fb9f1 arm64: introduce KIMAGE_VADDR as the virtual base of the kernel region
This introduces the preprocessor symbol KIMAGE_VADDR which will serve as
the symbolic virtual base of the kernel region, i.e., the kernel's virtual
offset will be KIMAGE_VADDR + TEXT_OFFSET. For now, we define it as being
equal to PAGE_OFFSET, but in the future, it will be moved below it once
we move the kernel virtual mapping out of the linear mapping.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:27 +00:00
Ard Biesheuvel
324420bf91 arm64: add support for ioremap() block mappings
This wires up the existing generic huge-vmap feature, which allows
ioremap() to use PMD or PUD sized block mappings. It also adds support
to the unmap path for dealing with block mappings, which will allow us
to unmap the __init region using unmap_kernel_range() in a subsequent
patch.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:16 +00:00
Ard Biesheuvel
03336b1df9 arm64: prevent potential circular header dependencies in asm/bug.h
Currently, using BUG_ON() in header files is cumbersome, due to the fact
that asm/bug.h transitively includes a lot of other header files, resulting
in the actual BUG_ON() invocation appearing before its definition in the
preprocessor input. So let's reverse the #include dependency between
asm/bug.h and asm/debug-monitors.h, by moving the definition of BUG_BRK_IMM
from the latter to the former. Also fix up one user of asm/debug-monitors.h
which relied on a transitive include.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 18:16:11 +00:00
Catalin Marinas
e950631e84 arm64: Remove the get_thread_info() function
This function was introduced by previous commits implementing UAO.
However, it can be replaced with task_thread_info() in
uao_thread_switch() or get_fs() in do_page_fault() (the latter being
called only on the current context, so no need for using the saved
pt_regs).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 17:27:05 +00:00
James Morse
7054419600 arm64: kernel: Don't toggle PAN on systems with UAO
If a CPU supports both Privileged Access Never (PAN) and User Access
Override (UAO), we don't need to disable/re-enable PAN round all
copy_to_user() like calls.

UAO alternatives cause these calls to use the 'unprivileged' load/store
instructions, which are overridden to be the privileged kind when
fs==KERNEL_DS.

This patch changes the copy_to_user() calls to have their PAN toggling
depend on a new composite 'feature' ARM64_ALT_PAN_NOT_UAO.

If both features are detected, PAN will be enabled, but the copy_to_user()
alternatives will not be applied. This means PAN will be enabled all the
time for these functions. If only PAN is detected, the toggling will be
enabled as normal.

This will save the time taken to disable/re-enable PAN, and allow us to
catch copy_to_user() accesses that occur with fs==KERNEL_DS.

Futex and swp-emulation code continue to hang their PAN toggling code on
ARM64_HAS_PAN.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 17:27:05 +00:00
James Morse
57f4959bad arm64: kernel: Add support for User Access Override
'User Access Override' is a new ARMv8.2 feature which allows the
unprivileged load and store instructions to be overridden to behave in
the normal way.

This patch converts {get,put}_user() and friends to use ldtr*/sttr*
instructions - so that they can only access EL0 memory, then enables
UAO when fs==KERNEL_DS so that these functions can access kernel memory.

This allows user space's read/write permissions to be checked against the
page tables, instead of testing addr<USER_DS, then using the kernel's
read/write permissions.

Signed-off-by: James Morse <james.morse@arm.com>
[catalin.marinas@arm.com: move uao_thread_switch() above dsb()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 17:27:04 +00:00
James Morse
406e308770 arm64: add ARMv8.2 id_aa64mmfr2 boiler plate
ARMv8.2 adds a new feature register id_aa64mmfr2. This patch adds the
cpu feature boiler plate used by the actual features in later patches.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 12:00:01 +00:00
James Morse
0f54b14e76 arm64: cpufeature: Change read_cpuid() to use sysreg's mrs_s macro
Older assemblers may not have support for newer feature registers. To get
round this, sysreg.h provides a 'mrs_s' macro that takes a register
encoding and generates the raw instruction.

Change read_cpuid() to use mrs_s in all cases so that new registers
don't have to be a special case. Including sysreg.h means we need to move
the include and definition of read_cpuid() after the #ifndef __ASSEMBLY__
to avoid syntax errors in vmlinux.lds.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-18 11:59:54 +00:00
Ard Biesheuvel
7abc7d833c arm64: use local label prefixes for __reg_num symbols
The __reg_num_xNN symbols that are used to implement the msr_s and
mrs_s macros are recorded in the ELF metadata of each object file.
This does not affect the size of the final binary, but it does clutter
the output of tools like readelf, i.e.,

  $ readelf -a vmlinux |grep -c __reg_num_x
  50976

So let's use symbols with the .L prefix, these are strictly local,
and don't end up in the object files.

  $ readelf -a vmlinux |grep -c __reg_num_x
  0

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 18:39:58 +00:00
Will Deacon
afb83cc3f0 arm64: prefetch: add missing #include for spin_lock_prefetch
As of 52e662326e1e ("arm64: prefetch: don't provide spin_lock_prefetch
with LSE"), spin_lock_prefetch is patched at runtime when the LSE atomics
are in use. This relies on the ARM64_LSE_ATOMIC_INSN macro to drive
the alternatives framework, but that macro is only available via
asm/lse.h, which isn't explicitly included in processor.h. Consequently,
drivers can run into build failures such as:

   In file included from include/linux/prefetch.h:14:0,
                    from drivers/net/ethernet/intel/i40e/i40e_txrx.c:27:
   arch/arm64/include/asm/processor.h: In function 'spin_lock_prefetch':
   arch/arm64/include/asm/processor.h:183:15: error: expected string literal before 'ARM64_LSE_ATOMIC_INSN'
     asm volatile(ARM64_LSE_ATOMIC_INSN(

This patch add the missing include and gets things building again.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:12:34 +00:00
Will Deacon
d5370f7548 arm64: prefetch: add alternative pattern for CPUs without a prefetcher
Most CPUs have a hardware prefetcher which generally performs better
without explicit prefetch instructions issued by software, however
some CPUs (e.g. Cavium ThunderX) rely solely on explicit prefetch
instructions.

This patch adds an alternative pattern (ARM64_HAS_NO_HW_PREFETCH) to
allow our library code to make use of explicit prefetch instructions
during things like copy routines only when the CPU does not have the
capability to perform the prefetching itself.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Andrew Pinski <apinski@cavium.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:12:32 +00:00
Will Deacon
cd5e10bdf3 arm64: prefetch: don't provide spin_lock_prefetch with LSE
The LSE atomics rely on us not dirtying data at L1 if we can avoid it,
otherwise many of the potential scalability benefits are lost.

This patch replaces spin_lock_prefetch with a nop when the LSE atomics
are in use, so that users don't shoot themselves in the foot by causing
needless coherence traffic at L1.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Andrew Pinski <apinski@cavium.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:12:32 +00:00
Lorenzo Pieralisi
5e89c55e4e arm64: kernel: implement ACPI parking protocol
The SBBR and ACPI specifications allow ACPI based systems that do not
implement PSCI (eg systems with no EL3) to boot through the ACPI parking
protocol specification[1].

This patch implements the ACPI parking protocol CPU operations, and adds
code that eases parsing the parking protocol data structures to the
ARM64 SMP initializion carried out at the same time as cpus enumeration.

To wake-up the CPUs from the parked state, this patch implements a
wakeup IPI for ARM64 (ie arch_send_wakeup_ipi_mask()) that mirrors the
ARM one, so that a specific IPI is sent for wake-up purpose in order
to distinguish it from other IPI sources.

Given the current ACPI MADT parsing API, the patch implements a glue
layer that helps passing MADT GICC data structure from SMP initialization
code to the parking protocol implementation somewhat overriding the CPU
operations interfaces. This to avoid creating a completely trasparent
DT/ACPI CPU operations layer that would require creating opaque
structure handling for CPUs data (DT represents CPU through DT nodes, ACPI
through static MADT table entries), which seems overkill given that ACPI
on ARM64 mandates only two booting protocols (PSCI and parking protocol),
so there is no need for further protocol additions.

Based on the original work by Mark Salter <msalter@redhat.com>

[1] https://acpica.org/sites/acpica/files/MP%20Startup%20for%20ARM%20platforms.docx

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Loc Ho <lho@apm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Al Stone <ahs3@redhat.com>
[catalin.marinas@arm.com: Added WARN_ONCE(!acpi_parking_protocol_valid() on the IPI]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:12:32 +00:00
Mark Rutland
068a17a580 arm64: mm: create new fine-grained mappings at boot
At boot we may change the granularity of the tables mapping the kernel
(by splitting or making sections). This may happen when we create the
linear mapping (in __map_memblock), or at any point we try to apply
fine-grained permissions to the kernel (e.g. fixup_executable,
mark_rodata_ro, fixup_init).

Changing the active page tables in this manner may result in multiple
entries for the same address being allocated into TLBs, risking problems
such as TLB conflict aborts or issues derived from the amalgamation of
TLB entries. Generally, a break-before-make (BBM) approach is necessary
to avoid conflicts, but we cannot do this for the kernel tables as it
risks unmapping text or data being used to do so.

Instead, we can create a new set of tables from scratch in the safety of
the existing mappings, and subsequently migrate over to these using the
new cpu_replace_ttbr1 helper, which avoids the two sets of tables being
active simultaneously.

To avoid issues when we later modify permissions of the page tables
(e.g. in fixup_init), we must create the page tables at a granularity
such that later modification does not result in splitting of tables.

This patch applies this strategy, creating a new set of fine-grained
page tables from scratch, and safely migrating to them. The existing
fixmap and kasan shadow page tables are reused in the new fine-grained
tables.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:46 +00:00
Mark Rutland
961faac114 arm64: mm: add functions to walk tables in fixmap
As a preparatory step to allow us to allocate early page tables from
unmapped memory using memblock_alloc, add new p??_{set,clear}_fixmap*
functions which can be used to walk page tables outside of the linear
mapping by using fixmap slots.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:46 +00:00
Mark Rutland
1e531cce68 arm64: mm: add __{pud,pgd}_populate
We currently have __pmd_populate for creating a pmd table entry given
the physical address of a pte, but don't have equivalents for the pud or
pgd levels of table.

To enable us to manipulate tables which are mapped outside of the linear
mapping (where we have a PA, but not a linear map VA), it is useful to
have these functions.

This patch adds __{pud,pgd}_populate. As these should not be called when
the kernel uses folded {pmd,pud}s, in these cases they expand to
BUILD_BUG(). So long as the appropriate checks are made on the {pud,pgd}
entry prior to attempting population, these should be optimized out at
compile time.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:46 +00:00
Mark Rutland
dca56dca71 arm64: mm: add functions to walk page tables by PA
To allow us to walk tables allocated into the fixmap, we need to acquire
the physical address of a page, rather than the virtual address in the
linear map.

This patch adds new p??_page_paddr and p??_offset_phys functions to
acquire the physical address of a next-level table, and changes
p??_offset* into macros which simply convert this to a linear map VA.
This renders p??_page_vaddr unused, and hence they are removed.

At the pgd level, a new pgd_offset_raw function is added to find the
relevant PGD entry given the base of a PGD and a virtual address.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:45 +00:00
Mark Rutland
053520f7d3 arm64: mm: move pte_* macros
For pmd, pud, and pgd levels of table, functions including p?d_index and
p?d_offset are defined after the p?d_page_vaddr function for the
immediately higher level of table.

The pte functions however are defined much earlier, even though several
rely on the later definition of pmd_page_vaddr. While this isn't
currently a problem as these are macros, it prevents the logical
grouping of later C functions (which cannot rely on prototypes for
functions not yet defined).

Move these definitions after pmd_page_vaddr, for consistency with the
placement of these functions for other levels of table.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:45 +00:00
Mark Rutland
50e1881ddd arm64: mm: add code to safely replace TTBR1_EL1
If page tables are modified without suitable TLB maintenance, the ARM
architecture permits multiple TLB entries to be allocated for the same
VA. When this occurs, it is permitted that TLB conflict aborts are
raised in response to synchronous data/instruction accesses, and/or and
amalgamation of the TLB entries may be used as a result of a TLB lookup.

The presence of conflicting TLB entries may result in a variety of
behaviours detrimental to the system (e.g. erroneous physical addresses
may be used by I-cache fetches and/or page table walks). Some of these
cases may result in unexpected changes of hardware state, and/or result
in the (asynchronous) delivery of SError.

To avoid these issues, we must avoid situations where conflicting
entries may be allocated into TLBs. For user and module mappings we can
follow a strict break-before-make approach, but this cannot work for
modifications to the swapper page tables that cover the kernel text and
data.

Instead, this patch adds code which is intended to be executed from the
idmap, which can safely unmap the swapper page tables as it only
requires the idmap to be active. This enables us to uninstall the active
TTBR1_EL1 entry, invalidate TLBs, then install a new TTBR1_EL1 entry
without potentially unmapping code or data required for the sequence.
This avoids the risk of conflict, but requires that updates are staged
in a copy of the swapper page tables prior to being installed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:45 +00:00
Mark Rutland
609116d202 arm64: add function to install the idmap
In some cases (e.g. when making invasive changes to the kernel page
tables) we will need to execute code from the idmap.

Add a new helper which may be used to install the idmap, complementing
the existing cpu_uninstall_idmap.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:45 +00:00
Mark Rutland
9e8e865bbe arm64: unify idmap removal
We currently open-code the removal of the idmap and restoration of the
current task's MMU state in a few places.

Before introducing yet more copies of this sequence, unify these to call
a new helper, cpu_uninstall_idmap.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:44 +00:00
Mark Rutland
5227cfa71f arm64: mm: place empty_zero_page in bss
Currently the zero page is set up in paging_init, and thus we cannot use
the zero page earlier. We use the zero page as a reserved TTBR value
from which no TLB entries may be allocated (e.g. when uninstalling the
idmap). To enable such usage earlier (as may be required for invasive
changes to the kernel page tables), and to minimise the time that the
idmap is active, we need to be able to use the zero page before
paging_init.

This patch follows the example set by x86, by allocating the zero page
at compile time, in .bss. This means that the zero page itself is
available immediately upon entry to start_kernel (as we zero .bss before
this), and also means that the zero page takes up no space in the raw
Image binary. The associated struct page is allocated in bootmem_init,
and remains unavailable until this time.

Outside of arch code, the only users of empty_zero_page assume that the
empty_zero_page symbol refers to the zeroed memory itself, and that
ZERO_PAGE(x) must be used to acquire the associated struct page,
following the example of x86. This patch also brings arm64 inline with
these assumptions.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Jeremy Linton <jeremy.linton@arm.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-02-16 15:10:44 +00:00
Linus Torvalds
8ab54ed641 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irqchip fixes from Thomas Gleixner:
 "Another set of ARM SoC related irqchip fixes:
   - Plug a memory leak in gicv3-its
   - Limit features to the root gic interrupt controller
   - Add a missing barrier in the gic-v3 IAR access
   - Another compile test fix for sun4i"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3: Make sure read from ICC_IAR1_EL1 is visible on redestributor
  irqchip/gic: Only set the EOImodeNS bit for the root controller
  irqchip/gic: Only populate set_affinity for the root controller
  irqchip/gicv3-its: Fix memory leak in its_free_tables()
  irqchip/sun4i: Fix compilation outside of arch/arm
2016-02-14 11:49:30 -08:00
Tirumalesh Chalamarla
1a1ebd5fb1 irqchip/gic-v3: Make sure read from ICC_IAR1_EL1 is visible on redestributor
The ARM GICv3 specification mentions the need for dsb after a read
from the ICC_IAR1_EL1 register:

	4.1.1 Physical CPU Interface:
	The effects of reading ICC_IAR0_EL1 and ICC_IAR1_EL1
	on the state of a returned INTID are not guaranteed
	to be visible until after the execution of a DSB.

Not having this could result in missed interrupts, so let's add the
required barrier.

[Marc: fixed commit message]

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Tirumalesh Chalamarla <tchalamarla@caviumnetworks.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-02-11 10:20:02 +00:00
Paolo Bonzini
afc6074381 KVM/ARM fixes for v4.5-rc2
A few random fixes, mostly coming from the PMU work by Shannon:
 
 - fix for injecting faults coming from the guest's userspace
 - cleanup for our CPTR_EL2 accessors (reserved bits)
 - fix for a bug impacting perf (user/kernel discrimination)
 - fix for a 32bit sysreg handling bug
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWqehPAAoJECPQ0LrRPXpDn6gP/2lrJ9lV5I3MxLzUytmRY8EM
 Xl8WnNEQJ0e7oEdb1l6k4DR8D/HefzXpp/YWHY1WdDZSej0b2egro1xsFWdgaOr9
 NVGJnoQBlCFqSIf2szml4ftpHXZZ/kMF/EvhtzEL6cpUdqeA/tkS6HoCMQknhCbx
 3zOYnNKCGQUkFhTKJUSXB6NcZ/950uqkQxAdCPNUTGg1YzkNfbcgTewqKsmb25Cv
 /sOUFmrq2AlnWkdH+QWP0BtNFUX9saOSXvxrABT6nfiXSpUeF6Rprcgi9gdoNhAD
 mfE5IFw0dOEo2XThZTchKu3FBSMAkDadvC9yWFr88dr62E6EKFPzY3vHLCA8QoT/
 zk5beGSjyWGe7FZZJ4CKdO4EWBZr/WSlSVzOfG4ZBVPUoh2AZcUEhzzrzTezzocO
 71/5ZVpQ6O8+Pxwyy85Vd2drf7OZLagGNydNx46RHXrRxl+q0c5vFTVh4Txbd4YU
 XNsd+kA62/OYyPHbtVzTzAPPKG7aM8hLzdy8dkTgvuDzWHmxFWhD/HgiMHfFrQqs
 WCafvBhTc4375dvwYOupxaU2ncHKvt/zQJtBOw6bEwAIUa5c1IkIUr0i8XgRq6lr
 x/YvhFIwiVyXVnrDt3ZSIx79Oajf541uJg7vLFyPBQkcnQlJ6T7oy7qJlqhM0567
 Sr6G0/YXa1ccIfmKyeh4
 =36kx
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM fixes for v4.5-rc2

A few random fixes, mostly coming from the PMU work by Shannon:

- fix for injecting faults coming from the guest's userspace
- cleanup for our CPTR_EL2 accessors (reserved bits)
- fix for a bug impacting perf (user/kernel discrimination)
- fix for a 32bit sysreg handling bug
2016-02-08 16:20:51 +01:00
Mark Brown
3e275c64a9 arm64: asm: Explicitly include linux/personality.h in asm/page.h
asm/page.h uses READ_IMPLIES_EXEC from linux/personality.h but does not
explicitly include it causing build failures in -next where whatever was
causing it to be implicitly included has changed to remove that
inclusion.  Add an explicit inclusion to fix this.

Signed-off-by: Mark Brown <broonie@kernel.org>
[will: moved #include inside #ifndef __ASSEMBLY__ block]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-03 19:22:02 +00:00
James Morse
811d61e384 arm64: futex.h: Add missing PAN toggling
futex.h's futex_atomic_cmpxchg_inatomic() does not use the
__futex_atomic_op() macro and needs its own PAN toggling. This was missed
when the feature was implemented.

Fixes: 338d4f49d6 ("arm64: kernel: Add support for Privileged Access Never")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-02-02 15:59:21 +00:00
Catalin Marinas
ac15bd63bb arm64: Honour !PTE_WRITE in set_pte_at() for kernel mappings
Currently, set_pte_at() only checks the software PTE_WRITE bit for user
mappings when it sets or clears the hardware PTE_RDONLY accordingly. The
kernel ptes are written directly without any modification, relying
solely on the protection bits in macros like PAGE_KERNEL. However,
modifying kernel pte attributes via pte_wrprotect() would be ignored by
set_pte_at(). Since pte_wrprotect() does not set PTE_RDONLY (it only
clears PTE_WRITE), the new permission is not taken into account.

This patch changes set_pte_at() to adjust the read-only permission for
kernel ptes as well. As a side effect, existing PROT_* definitions used
for kernel ioremap*() need to include PTE_DIRTY | PTE_WRITE.

(additionally, white space fix for PTE_KERNEL_ROX)

Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-01-25 11:09:06 +00:00
Shannon Zhao
9586a2ea68 arm64: KVM: Fix wrong use of the CPSR MODE mask for 32bit guests
The values of CPSR MODE mask are different between aarch32 and aarch64.
It should use the right one according to the execution state.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-01-24 21:55:59 +00:00
Dave Martin
a7e0ac295d arm64: KVM: Obey RES0/1 reserved bits when setting CPTR_EL2
Some bits in CPTR are defined as RES1 in the architecture.  Setting
these bits to zero may unintentionally enable future architecture
extensions, allowing guests to use them without supervision by the host.

This would be bad: for forwards compatibility, this patch makes
sure the affected bits are always written with 1, not 0.

This patch only addresses CPTR_EL2.  Initialisation of other system
registers may still need review.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-01-24 21:55:58 +00:00
Christoph Hellwig
e1c7e32453 dma-mapping: always provide the dma_map_ops based implementation
Move the generic implementation to <linux/dma-mapping.h> now that all
architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
that everyone supports them.

[valentinrothberg@gmail.com: remove leftovers in Kconfig]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-20 17:09:18 -08:00
Linus Torvalds
a200dcb346 virtio: barrier rework+fixes
This adds a new kind of barrier, and reworks virtio and xen
 to use it.
 Plus some fixes here and there.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWlU2kAAoJECgfDbjSjVRpZ6IH/Ra19ecG8sCQo9zskr4zo22Z
 DZXC3u0sJDBYjjBAiw3IY1FKh7wx2Fr1RhUOj1bteBgcFCMCV1zInP5ITiCyzd1H
 YYh1w9C2tZaj2T4t9L4hIrAdtIF8fGS+oI2IojXPjOuDLEt6pfFBEjHp/sfl3UJq
 ZmZvw4OXviSNej7jBw8Xni3Uv18yfmLGXvMdkvMSPC1/XL29voGDqTVwhqJwxLVz
 k/ZLcKFOzIs9N7Nja0Jl1EiZtC2Y9cpItqweicNAzszlpkSL44vQxmCSefB+WyQ4
 gt0O3+AxYkLfrxzCBhUA4IpRex3/XPW1b+1e/V1XjfR2n/FlyLe+AIa8uPJElFc=
 =ukaV
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio barrier rework+fixes from Michael Tsirkin:
 "This adds a new kind of barrier, and reworks virtio and xen to use it.

  Plus some fixes here and there"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits)
  checkpatch: add virt barriers
  checkpatch: check for __smp outside barrier.h
  checkpatch.pl: add missing memory barriers
  virtio: make find_vqs() checkpatch.pl-friendly
  virtio_balloon: fix race between migration and ballooning
  virtio_balloon: fix race by fill and leak
  s390: more efficient smp barriers
  s390: use generic memory barriers
  xen/events: use virt_xxx barriers
  xen/io: use virt_xxx barriers
  xenbus: use virt_xxx barriers
  virtio_ring: use virt_store_mb
  sh: move xchg_cmpxchg to a header by itself
  sh: support 1 and 2 byte xchg
  virtio_ring: update weak barriers to use virt_xxx
  Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb"
  asm-generic: implement virt_xxx memory barriers
  x86: define __smp_xxx
  xtensa: define __smp_xxx
  tile: define __smp_xxx
  ...
2016-01-18 16:44:24 -08:00
Dan Williams
ba049e93ae kvm: rename pfn_t to kvm_pfn_t
To date, we have implemented two I/O usage models for persistent memory,
PMEM (a persistent "ram disk") and DAX (mmap persistent memory into
userspace).  This series adds a third, DAX-GUP, that allows DAX mappings
to be the target of direct-i/o.  It allows userspace to coordinate
DMA/RDMA from/to persistent memory.

The implementation leverages the ZONE_DEVICE mm-zone that went into
4.3-rc1 (also discussed at kernel summit) to flag pages that are owned
and dynamically mapped by a device driver.  The pmem driver, after
mapping a persistent memory range into the system memmap via
devm_memremap_pages(), arranges for DAX to distinguish pfn-only versus
page-backed pmem-pfns via flags in the new pfn_t type.

The DAX code, upon seeing a PFN_DEV+PFN_MAP flagged pfn, flags the
resulting pte(s) inserted into the process page tables with a new
_PAGE_DEVMAP flag.  Later, when get_user_pages() is walking ptes it keys
off _PAGE_DEVMAP to pin the device hosting the page range active.
Finally, get_page() and put_page() are modified to take references
against the device driver established page mapping.

Finally, this need for "struct page" for persistent memory requires
memory capacity to store the memmap array.  Given the memmap array for a
large pool of persistent may exhaust available DRAM introduce a
mechanism to allocate the memmap from persistent memory.  The new
"struct vmem_altmap *" parameter to devm_memremap_pages() enables
arch_add_memory() to use reserved pmem capacity rather than the page
allocator.

This patch (of 18):

The core has developed a need for a "pfn_t" type [1].  Move the existing
pfn_t in KVM to kvm_pfn_t [2].

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002218.html

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Minchan Kim
05ee26d9e7 arch/arm64/include/asm/pgtable.h: add pmd_mkclean for THP
MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent overwrite
of the contents since MADV_FREE syscall is called for THP page.

This patch adds pmd_mkclean for THP page MADV_FREE support.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Shaohua Li <shli@kernel.org>
Cc: <yalin.wang2010@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Gang <gang.chen.5i5j@gmail.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jason Evans <je@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mika Penttil <mika.penttila@nextfour.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rik van Riel <riel@redhat.com>
Cc: Roland Dreier <roland@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Shaohua Li <shli@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Kirill A. Shutemov
b7ed934a7c arm64, thp: remove infrastructure for handling splitting PMDs
With new refcounting we don't need to mark PMDs splitting.  Let's drop
code to handle this.

pmdp_splitting_flush() is not needed too: on splitting PMD we will do
pmdp_clear_flush() + set_pte_at().  pmdp_clear_flush() will do IPI as
needed for fast_gup.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Linus Torvalds
1baa5efbeb * s390: Support for runtime instrumentation within guests,
support of 248 VCPUs.
 
 * ARM: rewrite of the arm64 world switch in C, support for
 16-bit VM identifiers.  Performance counter virtualization
 missed the boat.
 
 * x86: Support for more Hyper-V features (synthetic interrupt
 controller), MMU cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJWlSKwAAoJEL/70l94x66DY0UIAK5vp4zfQoQOJC4KP4Xgxwdu
 kpnK2Boz3/74o1b0y5+eJZoUZCsXCVLtmP5uhmMxUYWDgByFG2X8ZDhPFwB5FYLT
 2dN+Lr4tsolgIfRdHZtrT6Svp9SDL039bWTdscnbR6l37/j9FRWvpKdhI3orloFD
 /i4CSW2dVIq1/9Xctwu/rtcOEesEx4Cad+6YV3/530eVAXFzE908nXfmqJNZTocY
 YCGcmrMVCOu0ng5QM4xSzmmYjKMLUcRs+QzZWkVBzdJtTgwZUr09yj7I2dZ1yj/i
 cxYrJy6shSwE74XkXsmvG+au3C5u3vX4tnXjBFErnPJ99oqzHatVnFWNRhj4dLQ=
 =PIj1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "PPC changes will come next week.

   - s390: Support for runtime instrumentation within guests, support of
     248 VCPUs.

   - ARM: rewrite of the arm64 world switch in C, support for 16-bit VM
     identifiers.  Performance counter virtualization missed the boat.

   - x86: Support for more Hyper-V features (synthetic interrupt
     controller), MMU cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (115 commits)
  kvm: x86: Fix vmwrite to SECONDARY_VM_EXEC_CONTROL
  kvm/x86: Hyper-V SynIC timers tracepoints
  kvm/x86: Hyper-V SynIC tracepoints
  kvm/x86: Update SynIC timers on guest entry only
  kvm/x86: Skip SynIC vector check for QEMU side
  kvm/x86: Hyper-V fix SynIC timer disabling condition
  kvm/x86: Reorg stimer_expiration() to better control timer restart
  kvm/x86: Hyper-V unify stimer_start() and stimer_restart()
  kvm/x86: Drop stimer_stop() function
  kvm/x86: Hyper-V timers fix incorrect logical operation
  KVM: move architecture-dependent requests to arch/
  KVM: renumber vcpu->request bits
  KVM: document which architecture uses each request bit
  KVM: Remove unused KVM_REQ_KICK to save a bit in vcpu->requests
  kvm: x86: Check kvm_write_guest return value in kvm_write_wall_clock
  KVM: s390: implement the RI support of guest
  kvm/s390: drop unpaired smp_mb
  kvm: x86: fix comment about {mmu,nested_mmu}.gva_to_gpa
  KVM: x86: MMU: Use clear_page() instead of init_shadow_page_table()
  arm/arm64: KVM: Detect vGIC presence at runtime
  ...
2016-01-12 13:22:12 -08:00