Commit Graph

1939 Commits

Author SHA1 Message Date
Jonathan Austin
6a8146f420 ARM: 8609/1: V7M: Add support for the Cortex-M7 processor
Cortex-M7 is a new member of the V7M processor family that adds, among
other things, caches over the features available in Cortex-M4.

This patch adds support for recognising the processor at boot time, and
make use of recently introduced cache functions.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06 15:51:08 +01:00
Jonathan Austin
c3a6bcbe6a ARM: 8608/1: V7M: Indirect proc_info construction for V7M CPUs
This patch copies the method used for V7A/R CPUs to specify differing
processor info for different cores.

This patch differentiates Cortex-M3 and Cortex-M4 and leaves a fallback case
for any other V7M processors.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06 15:51:08 +01:00
Jonathan Austin
bc0ee9d24a ARM: 8607/1: V7M: Wire up caches for V7M processors with cache support.
This patch does the plumbing required to invoke the V7M cache code added
in earlier patches in this series, although there is no users for that
yet.

In order to honour the I/D cache disable config options, this patch changes
the mechanism by which the CCR is set on boot, to be more like V7A/R.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06 15:51:08 +01:00
Vladimir Murzin
9a1af5f220 ARM: 8606/1: V7M: introduce cache operations
This commit implements the cache operation for V7M.

It is based on V7 counterpart and differs as follows:
- cache operations are memory mapped
- only Thumb instruction set is supported
- we don't handle user access faults

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Joachim Eastwood <manabian@gmail.com>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-09-06 15:51:07 +01:00
Masahiro Yamada
dd34b11566 ARM: uniphier: remove SoC-specific SMP code
The UniPhier architecture (32bit) switched over to PSCI.  Remove
the SoC-specific SMP operations.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2016-08-29 01:57:14 +09:00
Vladimir Murzin
f271b779f4 ARM: 8599/1: mm: pull asm/memory.h explicitly
Commit d781145549 (""ARM: 8512/1: proc-v7.S: Adjust stack address when
XIP_KERNEL"") introduced a macro which lives under asm/memory.h.
Unfortunately, for MMU-less systems (like R-class) it leads to build failure:

arch/arm/mm/proc-v7.S: Assembler messages:
arch/arm/mm/proc-v7.S:538: Error: unrecognised relocation suffix
make[1]: *** [arch/arm/mm/proc-v7.o] Error 1
make: *** [arch/arm/mm] Error 2

since it is implicitly pulled via asm/pgtable.h for MMU capable systems only.

To fix it include asm/memory.h explicitly.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-23 10:07:50 +01:00
Kees Cook
7619751f8c ARM: 8595/2: apply more __ro_after_init
Guided by grsecurity's analogous __read_only markings in arch/arm,
this applies several uses of __ro_after_init to structures that are
only updated during __init.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12 16:47:06 +01:00
Andrey Smirnov
55604b7ab1 ARM: 8593/1: cache-l2x0.c: Do not clear bit 23 in prefetch control register
As per L2C-310 TRM[1]:

"... You can control this feature using bits 30,27 and 23 of the
Prefetch Control Register. Bit 23 and 27 are only used if you set bit 30
HIGH..."

which means there is no need to clear bit 23 if bit 30 is being cleared.

[1] http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0246e/CJAJACBJ.html

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12 16:47:04 +01:00
Andrey Smirnov
fc1473103c ARM: 8592/1: cache-l2x0.c: Replace magic numbers
Replace magic numbers used for L310 Prefetch Control Register

Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12 16:47:03 +01:00
Fabio Estevam
bf31c5e07d ARM: 8587/1: dma-mapping: Use %zu for printing a size_t variable
According to Documentation/printk-formats.txt when printing
a size_t variable we should use %zu or %zx format specifiers.

As we are printing a memory size value, we should better use %zu
in this case.

Reported-by: Frank Mori Hess <fmh6jj@gmail.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-12 16:47:01 +01:00
Ard Biesheuvel
61444cde91 ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocations
The late_alloc() PTE allocation function used by create_mapping_late()
does not call pgtable_page_ctor() on PTE pages it allocates, leaving
the per-page spinlock uninitialized.

Since generic page table manipulation code may assume that translation
table pages that are not owned by init_mm are covered by fully
constructed struct pages, the following crash may occur with the new
UEFI memory attributes table code.

  efi: memattr: Processing EFI Memory Attributes table:
  efi: memattr:  0x0000ffa16000-0x0000ffa82fff [Runtime Code       |RUN|  |  |XP|  |  |  |   |  |  |  |  ]
  Unable to handle kernel NULL pointer dereference at virtual address 00000010
  pgd = c0204000
  [00000010] *pgd=00000000
  Internal error: Oops: 5 [#1] SMP ARM
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc4-00063-g3882aa7b340b #361
  Hardware name: Generic DT based system
  task: ed858000 ti: ed842000 task.ti: ed842000
  PC is at __lock_acquire+0xa0/0x19a8
  ...
  [<c038c830>] (__lock_acquire) from [<c038e4f8>] (lock_acquire+0x6c/0x88)
  [<c038e4f8>] (lock_acquire) from [<c0c06134>] (_raw_spin_lock+0x2c/0x3c)
  [<c0c06134>] (_raw_spin_lock) from [<c0410384>] (apply_to_page_range+0xe8/0x238)
  [<c0410384>] (apply_to_page_range) from [<c1205f34>] (efi_set_mapping_permissions+0x54/0x5c)
  [<c1205f34>] (efi_set_mapping_permissions) from [<c1247474>] (efi_memattr_apply_permissions+0x2b8/0x378)
  [<c1247474>] (efi_memattr_apply_permissions) from [<c1248258>] (arm_enable_runtime_services+0x1f0/0x22c)
  [<c1248258>] (arm_enable_runtime_services) from [<c0301f0c>] (do_one_initcall+0x44/0x174)
  [<c0301f0c>] (do_one_initcall) from [<c1200d10>] (kernel_init_freeable+0x90/0x1e8)
  [<c1200d10>] (kernel_init_freeable) from [<c0bff690>] (kernel_init+0x8/0x114)
  [<c0bff690>] (kernel_init) from [<c0307ed0>] (ret_from_fork+0x14/0x24)

The crash is due to the fact that the UEFI page tables are not owned by
init_mm, but are not covered by fully constructed struct pages.

Given that the UEFI subsystem is currently the only user of
create_mapping_late(), add an unconditional call to pgtable_page_ctor() to
late_alloc().

Fixes: 9fc68b717c ("ARM/efi: Apply strict permissions for UEFI Runtime Services regions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-09 22:57:41 +01:00
Nicolas Pitre
b9a019899f ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limit
To limit the amount of mapped low memory, we determine a physical address
boundary based on the start of the vmalloc area using __pa().
Strictly speaking, the vmalloc area location is arbitrary and does not
necessarily corresponds to a valid physical address. For example, if

	PAGE_OFFSET = 0x80000000
	PHYS_OFFSET = 0x90000000
	vmalloc_min = 0xf0000000

then __pa(vmalloc_min) overflows and returns a wrapped 0 when phys_addr_t
is a 32-bit type. Then the code that follows determines that the entire
physical memory is above that boundary and no low memory gets mapped at
all:

|[...]
|Machine model: Freescale i.MX51 NA04 Board
|Ignoring RAM at 0x90000000-0xb0000000 (!CONFIG_HIGHMEM)
|Consider using a HIGHMEM enabled kernel.

To avoid this problem let's make vmalloc_limit a 64-bit value all the
time and determine that boundary explicitly without using __pa().

Reported-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-08-09 22:57:40 +01:00
Krzysztof Kozlowski
00085f1efa dma-mapping: use unsigned long for dma_attrs
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Linus Torvalds
a6408f6cb6 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull smp hotplug updates from Thomas Gleixner:
 "This is the next part of the hotplug rework.

   - Convert all notifiers with a priority assigned

   - Convert all CPU_STARTING/DYING notifiers

     The final removal of the STARTING/DYING infrastructure will happen
     when the merge window closes.

  Another 700 hundred line of unpenetrable maze gone :)"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  timers/core: Correct callback order during CPU hot plug
  leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
  powerpc/numa: Convert to hotplug state machine
  arm/perf: Fix hotplug state machine conversion
  irqchip/armada: Avoid unused function warnings
  ARC/time: Convert to hotplug state machine
  clocksource/atlas7: Convert to hotplug state machine
  clocksource/armada-370-xp: Convert to hotplug state machine
  clocksource/exynos_mct: Convert to hotplug state machine
  clocksource/arm_global_timer: Convert to hotplug state machine
  rcu: Convert rcutree to hotplug state machine
  KVM/arm/arm64/vgic-new: Convert to hotplug state machine
  smp/cfd: Convert core to hotplug state machine
  x86/x2apic: Convert to CPU hotplug state machine
  profile: Convert to hotplug state machine
  timers/core: Convert to hotplug state machine
  hrtimer: Convert to hotplug state machine
  x86/tboot: Convert to hotplug state machine
  arm64/armv8 deprecated: Convert to hotplug state machine
  hwtracing/coresight-etm4x: Convert to hotplug state machine
  ...
2016-07-29 13:55:30 -07:00
Linus Torvalds
b5f00d18cc Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Included in this update are:

   - Patches from Gregory Clement to fix the coherent DMA cases in our
     dma-mapping code.

   - A number of CPU errata updates and fixes.

   - ARM cpuidle improvements from Jisheng Zhang.

   - Fix from Kees for the location of _etext.

   - Cleanups from Masahiro Yamada to avoid duplicated messages during
     the kernel build, and remove CONFIG_ARCH_HAS_BARRIERS.

   - Remove a udelay loop limitation, allowing for faster CPUs to
     calibrate the delay correctly.

   - Cleanup some left-overs from the SW PAN implementation.

   - Ensure that a modified address limit is not visible to exception
     handlers"

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (21 commits)
  ARM: 8586/1: cpuidle: make arm_cpuidle_suspend() a bit more efficient
  ARM: 8585/1: cpuidle: fix !cpuidle_ops[cpu].init case during init
  ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used
  ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent
  ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421
  ARM: 8559/1: errata: Workaround erratum A12 821420
  ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423
  ARM: save and reset the address limit when entering an exception
  ARM: 8577/1: Fix Cortex-A15 798181 errata initialization
  ARM: 8584/1: floppy: avoid gcc-6 warning
  ARM: 8583/1: mm: fix location of _etext
  ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERS
  ARM: 8306/1: loop_udelay: remove bogomips value limitation
  ARM: 8581/1: add missing <asm/prom.h> to arch/arm/kernel/devtree.c
  ARM: 8576/1: avoid duplicating "Kernel: arch/arm/boot/*Image is ready"
  ARM: 8556/1: on a generic DT system: do not touch l2x0
  ARM: uaccess: remove put_user() code duplication
  ARM: 8580/1: Remove orphaned __addr_ok() definition
  ARM: get rid of horrible *(unsigned int *)(regs + 1)
  ARM: introduce svc_pt_regs structure
  ...
2016-07-29 13:03:49 -07:00
Kirill A. Shutemov
dcddffd41d mm: do not pass mm_struct into handle_mm_fault
We always have vma->vm_mm around.

Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Michal Hocko
397b080bb7 arm: get rid of superfluous __GFP_REPEAT
__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.

PGALLOC_GFP uses __GFP_REPEAT but none of the allocation which uses this
flag is for more than order-2.  This means that this flag has never been
actually useful here because it has always been used only for
PAGE_ALLOC_COSTLY requests.

Link: http://lkml.kernel.org/r/1464599699-30131-5-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Richard Cochran
9eeb226477 arm/l2c: Convert to hotplug state machine
Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs.

Signed-off-by: Richard Cochran <rcochran@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Brad Mouring <brad.mouring@ni.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153336.801270887@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-15 10:40:28 +02:00
Gregory CLEMENT
565068221b ARM: 8561/4: dma-mapping: Fix the coherent case when iommu is used
When doing dma allocation with IOMMU the __iommu_alloc_atomic() was
used even when the system was coherent. However, this function
allocates from a non-cacheable pool, which is fine when the device is
not cache coherent but won't work as expected in the device is cache
coherent. Indeed, the CPU and device must access the memory using the
same cacheability attributes.

Moreover when the devices are coherent, the mmap call must not change
the pg_prot flags in the vma struct. The arm_coherent_iommu_mmap_attrs
has been updated in the same way that it was done for the arm_dma_mmap
in commit 55af8a9164 ("ARM: 8387/1: arm/mm/dma-mapping.c: Add
arm_coherent_dma_mmap").

Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14 16:25:31 +01:00
Gregory CLEMENT
f127089650 ARM: 8561/3: dma-mapping: Don't use outer_flush_range when the L2C is coherent
When a L2 cache controller is used in a system that provides hardware
coherency, the entire outer cache operations are useless, and can be
skipped.  Moreover, on some systems, it is harmful as it causes
deadlocks between the Marvell coherency mechanism, the Marvell PCIe
controller and the Cortex-A9.

In the current kernel implementation, the outer cache flush range
operation is triggered by the dma_alloc function.
This operation can be take place during runtime and in some
circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x
SoCs.

This patch extends the __dma_clear_buffer() function to receive a
boolean argument related to the coherency of the system. The same
things is done for the calling functions.

Reported-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14 16:25:30 +01:00
Doug Anderson
9f6f93543d ARM: 8560/1: errata: Workaround errata A12 825619 / A17 852421
The workaround for both errata is to set bit 24 in the diagnostic
register.  There are no known end-user bugs solved by fixing this
errata, but the fix is trivial and it seems sane to apply it.

The arguments for why this needs to be in the kernel are similar to the
arugments made in the patch "Workaround errata A12 818325/852422 A17
852423".

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14 15:32:31 +01:00
Doug Anderson
416bcf2159 ARM: 8559/1: errata: Workaround erratum A12 821420
This erratum has a very simple workaround (set a bit in a register), so
let's apply it.  Apparently the workaround's downside is a very slight
power impact.

Note that applying this errata fixes deadlocks that are easy to
reproduce with real world applications.

The arguments for why this needs to be in the kernel are similar to the
arugments made in the patch "Workaround errata A12 818325/852422 A17
852423".

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14 15:32:30 +01:00
Doug Anderson
62c0f4a534 ARM: 8558/1: errata: Workaround errata A12 818325/852422 A17 852423
There are several similar errata on Cortex A12 and A17 that all have the same workaround: setting bit[12] of the Feature Register.
Technically the list of errata are:

- A12 818325: Execution of an UNPREDICTABLE STR or STM instruction
  might deadlock.  Fixed in r0p1.
- A12 852422: Execution of a sequence of instructions might lead to
  either a data corruption or a CPU deadlock.  Not fixed in any A12s
  yet.
- A17 852423: Execution of a sequence of instructions might lead to
  either a data corruption or a CPU deadlock.  Not fixed in any A17s
  yet.

Since A12 got renamed to A17 it seems likely that there won't be any
future Cortex-A12 cores, so we'll enable for all Cortex-A12.

For Cortex-A17 I believe that all known revisions are affected and that all knows revisions means <= r1p2.  Presumably if a new A17 was
released it would have this problem fixed.

Note that in <https://patchwork.kernel.org/patch/4735341/> folks
previously expressed opposition to this change because:
A) It was thought to only apply to r0p0 and there were no known r0p0
   boards supported in mainline.
B) It was argued that such a workaround beloned in firmware.

Now that this same fix solves other errata on real boards (like
rk3288) point A) is addressed.

Point B) is impossible to address on boards like rk3288.  On rk3288
the firmware doesn't stay resident in RAM and isn't involved at all in
the suspend/resume process nor in the SMP bringup process.  That means
that the most the firmware could do would be to set the bit on "core
0" and this bit would be lost at suspend/resume time.  It is true that
we could write a "generic" solution that saved the boot-time "core 0"
value of this register and applied it at SMP bringup / resume time.
However, since this register (described as the "Feature Register" in
errata) appears to be undocumented (as far as I can tell) and is only
modified for these errata, that "generic" solution seems questionably
cleaner.  The generic solution also won't fix existing users that
haven't happened to do a FW update.

Note that in ARM64 presumably PSCI will be universal and fixes like
this will end up in ATF.  Hopefully we are nearing the end of this
style of errata workaround.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Huang Tao <huangtao@rock-chips.com>
Signed-off-by: Kever Yang <kever.yang@rock-chips.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-14 15:32:30 +01:00
Masahiro Yamada
520319de0c ARM: 8582/1: remove unused CONFIG_ARCH_HAS_BARRIERS
Since commit 2b749cb3a5 ("ARM: realview: remove private barrier
implementation"), this config is not used by any platform.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-07-02 11:01:08 +01:00
Zhaoxiu Zeng
fff7fb0b2d lib/GCD.c: use binary GCD algorithm instead of Euclidean
The binary GCD algorithm is based on the following facts:
	1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2)
	2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
	3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)

Even on x86 machines with reasonable division hardware, the binary
algorithm runs about 25% faster (80% the execution time) than the
division-based Euclidian algorithm.

On platforms like Alpha and ARMv6 where division is a function call to
emulation code, it's even more significant.

There are two variants of the code here, depending on whether a fast
__ffs (find least significant set bit) instruction is available.  This
allows the unpredictable branches in the bit-at-a-time shifting loop to
be eliminated.

If fast __ffs is not available, the "even/odd" GCD variant is used.

I use the following code to benchmark:

	#include <stdio.h>
	#include <stdlib.h>
	#include <stdint.h>
	#include <string.h>
	#include <time.h>
	#include <unistd.h>

	#define swap(a, b) \
		do { \
			a ^= b; \
			b ^= a; \
			a ^= b; \
		} while (0)

	unsigned long gcd0(unsigned long a, unsigned long b)
	{
		unsigned long r;

		if (a < b) {
			swap(a, b);
		}

		if (b == 0)
			return a;

		while ((r = a % b) != 0) {
			a = b;
			b = r;
		}

		return b;
	}

	unsigned long gcd1(unsigned long a, unsigned long b)
	{
		unsigned long r = a | b;

		if (!a || !b)
			return r;

		b >>= __builtin_ctzl(b);

		for (;;) {
			a >>= __builtin_ctzl(a);
			if (a == b)
				return a << __builtin_ctzl(r);

			if (a < b)
				swap(a, b);
			a -= b;
		}
	}

	unsigned long gcd2(unsigned long a, unsigned long b)
	{
		unsigned long r = a | b;

		if (!a || !b)
			return r;

		r &= -r;

		while (!(b & r))
			b >>= 1;

		for (;;) {
			while (!(a & r))
				a >>= 1;
			if (a == b)
				return a;

			if (a < b)
				swap(a, b);
			a -= b;
			a >>= 1;
			if (a & r)
				a += b;
			a >>= 1;
		}
	}

	unsigned long gcd3(unsigned long a, unsigned long b)
	{
		unsigned long r = a | b;

		if (!a || !b)
			return r;

		b >>= __builtin_ctzl(b);
		if (b == 1)
			return r & -r;

		for (;;) {
			a >>= __builtin_ctzl(a);
			if (a == 1)
				return r & -r;
			if (a == b)
				return a << __builtin_ctzl(r);

			if (a < b)
				swap(a, b);
			a -= b;
		}
	}

	unsigned long gcd4(unsigned long a, unsigned long b)
	{
		unsigned long r = a | b;

		if (!a || !b)
			return r;

		r &= -r;

		while (!(b & r))
			b >>= 1;
		if (b == r)
			return r;

		for (;;) {
			while (!(a & r))
				a >>= 1;
			if (a == r)
				return r;
			if (a == b)
				return a;

			if (a < b)
				swap(a, b);
			a -= b;
			a >>= 1;
			if (a & r)
				a += b;
			a >>= 1;
		}
	}

	static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
		gcd0, gcd1, gcd2, gcd3, gcd4,
	};

	#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))

	#if defined(__x86_64__)

	#define rdtscll(val) do { \
		unsigned long __a,__d; \
		__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
		(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
	} while(0)

	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
								unsigned long a, unsigned long b, unsigned long *res)
	{
		unsigned long long start, end;
		unsigned long long ret;
		unsigned long gcd_res;

		rdtscll(start);
		gcd_res = gcd(a, b);
		rdtscll(end);

		if (end >= start)
			ret = end - start;
		else
			ret = ~0ULL - start + 1 + end;

		*res = gcd_res;
		return ret;
	}

	#else

	static inline struct timespec read_time(void)
	{
		struct timespec time;
		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
		return time;
	}

	static inline unsigned long long diff_time(struct timespec start, struct timespec end)
	{
		struct timespec temp;

		if ((end.tv_nsec - start.tv_nsec) < 0) {
			temp.tv_sec = end.tv_sec - start.tv_sec - 1;
			temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
		} else {
			temp.tv_sec = end.tv_sec - start.tv_sec;
			temp.tv_nsec = end.tv_nsec - start.tv_nsec;
		}

		return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
	}

	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
								unsigned long a, unsigned long b, unsigned long *res)
	{
		struct timespec start, end;
		unsigned long gcd_res;

		start = read_time();
		gcd_res = gcd(a, b);
		end = read_time();

		*res = gcd_res;
		return diff_time(start, end);
	}

	#endif

	static inline unsigned long get_rand()
	{
		if (sizeof(long) == 8)
			return (unsigned long)rand() << 32 | rand();
		else
			return rand();
	}

	int main(int argc, char **argv)
	{
		unsigned int seed = time(0);
		int loops = 100;
		int repeats = 1000;
		unsigned long (*res)[TEST_ENTRIES];
		unsigned long long elapsed[TEST_ENTRIES];
		int i, j, k;

		for (;;) {
			int opt = getopt(argc, argv, "n:r:s:");
			/* End condition always first */
			if (opt == -1)
				break;

			switch (opt) {
			case 'n':
				loops = atoi(optarg);
				break;
			case 'r':
				repeats = atoi(optarg);
				break;
			case 's':
				seed = strtoul(optarg, NULL, 10);
				break;
			default:
				/* You won't actually get here. */
				break;
			}
		}

		res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
		memset(elapsed, 0, sizeof(elapsed));

		srand(seed);
		for (j = 0; j < loops; j++) {
			unsigned long a = get_rand();
			/* Do we have args? */
			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
			unsigned long long min_elapsed[TEST_ENTRIES];
			for (k = 0; k < repeats; k++) {
				for (i = 0; i < TEST_ENTRIES; i++) {
					unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
					if (k == 0 || min_elapsed[i] > tmp)
						min_elapsed[i] = tmp;
				}
			}
			for (i = 0; i < TEST_ENTRIES; i++)
				elapsed[i] += min_elapsed[i];
		}

		for (i = 0; i < TEST_ENTRIES; i++)
			printf("gcd%d: elapsed %llu\n", i, elapsed[i]);

		k = 0;
		srand(seed);
		for (j = 0; j < loops; j++) {
			unsigned long a = get_rand();
			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
			for (i = 1; i < TEST_ENTRIES; i++) {
				if (res[j][i] != res[j][0])
					break;
			}
			if (i < TEST_ENTRIES) {
				if (k == 0) {
					k = 1;
					fprintf(stderr, "Error:\n");
				}
				fprintf(stderr, "gcd(%lu, %lu): ", a, b);
				for (i = 0; i < TEST_ENTRIES; i++)
					fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
			}
		}

		if (k == 0)
			fprintf(stderr, "PASS\n");

		free(res);

		return 0;
	}

Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got:

  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 10174
  gcd1: elapsed 2120
  gcd2: elapsed 2902
  gcd3: elapsed 2039
  gcd4: elapsed 2812
  PASS
  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 9309
  gcd1: elapsed 2280
  gcd2: elapsed 2822
  gcd3: elapsed 2217
  gcd4: elapsed 2710
  PASS
  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 9589
  gcd1: elapsed 2098
  gcd2: elapsed 2815
  gcd3: elapsed 2030
  gcd4: elapsed 2718
  PASS
  zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
  gcd0: elapsed 9914
  gcd1: elapsed 2309
  gcd2: elapsed 2779
  gcd3: elapsed 2228
  gcd4: elapsed 2709
  PASS

[akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
Signed-off-by: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
Signed-off-by: George Spelvin <linux@horizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Linus Torvalds
a1c28b75a9 Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Changes included in this pull request:

   - revert pxa2xx-flash back to using ioremap_cached() and switch
     memremap() to use arch_memremap_wb()

   - remove pci=firmware command line argument handling

   - remove unnecessary arm_dma_set_mask() implementation, the generic
     implementation will do for ARM

   - removal of the ARM kallsyms "hack" to work around mode switching
     veneers and vectors located below PAGE_OFFSET

   - tidy up build system output a little

   - add L2 cache power management DT bindings

   - remove duplicated local_irq_disable() in reboot paths

   - handle AMBA primecell devices better at registration time with PM
     domains (needed for Samsung SoCs)

   - ARM specific preparation to support Keystone II kexec"

* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8567/1: cache-uniphier: activate ways for secondary CPUs
  ARM: 8570/2: Documentation: devicetree: Add PL310 PM bindings
  ARM: 8569/1: pl2x0: Add OF control of cache power management
  ARM: 8568/1: reboot: remove duplicated local_irq_disable()
  ARM: 8566/1: drivers: amba: properly handle devices with power domains
  ARM: provide arm_has_idmap_alias() helper
  ARM: kexec: remove 512MB restriction on kexec crashdump
  ARM: provide improved virt_to_idmap() functionality
  ARM: kexec: fix crashkernel= handling
  ARM: 8557/1: specify install, zinstall, and uinstall as PHONY targets
  ARM: 8562/1: suppress "include/generated/mach-types.h is up to date."
  ARM: 8553/1: kallsyms: remove --page-offset command line option
  ARM: 8552/1: kallsyms: remove special lower address limit for CONFIG_ARM
  ARM: 8555/1: kallsyms: ignore ARM mode switching veneers
  ARM: 8548/1: dma-mapping: remove arm_dma_set_mask()
  ARM: 8554/1: kernel: pci: remove pci=firmware command line parameter handling
  ARM: memremap: implement arch_memremap_wb()
  memremap: add arch specific hook for MEMREMAP_WB mappings
  mtd: pxa2xx-flash: switch back from memremap to ioremap_cached
  ARM: reintroduce ioremap_cached() for creating cached I/O mappings
2016-05-20 10:01:38 -07:00
Russell King
5632a9fbcd Merge branches 'amba', 'devel-stable', 'kexec-for-next' and 'misc' into for-linus 2016-05-19 10:31:35 +01:00
Robin Murphy
53c92d7933 iommu: of: enforce const-ness of struct iommu_ops
As a set of driver-provided callbacks and static data, there is no
compelling reason for struct iommu_ops to be mutable in core code, so
enforce const-ness throughout.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-05-09 15:33:29 +02:00
Linus Torvalds
9125aeb3e2 Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "These are a number of updates to fix a few problems found in the ARM
  nommu code over the last couple of years, caused mostly by changes on
  the mmu side"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8573/1: domain: move {set,get}_domain under config guard
  ARM: 8572/1: nommu: change memory reserve for the vectors
  ARM: 8571/1: nommu: fix PMSAv7 setup
2016-05-07 08:27:35 -07:00
Masahiro Yamada
6427a840ff ARM: 8567/1: cache-uniphier: activate ways for secondary CPUs
This outer cache allows to control active ways independently for
each CPU, but currently nothing is done for secondary CPUs.  In
other words, all the ways are locked for secondary CPUs by default.
This commit fixes it to fully bring out the performance of this
outer cache.

There would be two possible ways to achieve this:

[1] Each CPU initializes active ways for itself.  This can be done
    via the SSCLPDAWCR register.  This is a banked register, so each
    CPU sees a different instance of the register for its own.

[2] The master CPU initializes active ways for all the CPUs.  This
    is available via SSCDAWCARMR(N) registers, where all instances
    of SSCLPDAWCR are mirrored.  They are mapped at the address
    SSCDAWCARMR + 4 * N, where N is the CPU number.

The outer cache frame work does not support a per-CPU init callback.
So this commit adopts [2]; the master CPU iterates over possible CPUs
setting up SSCDAWCARMR(N) registers.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-05 19:03:39 +01:00
Jean-Philippe Brucker
5b526bd925 ARM: 8572/1: nommu: change memory reserve for the vectors
Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs in an
additional page above the base vector one. This change wasn't taken into
account by the nommu memreserve.
This patch ensures that the kernel won't overwrite any vector stub on
nommu.

[changed the MPU side too]

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-05 19:03:02 +01:00
Jean-Philippe Brucker
695665b0c5 ARM: 8571/1: nommu: fix PMSAv7 setup
Commit 1c2f87c (ARM: 8025/1: Get rid of meminfo) broke the support for
MPU on ARMv7-R. This patch adapts the code inside CONFIG_ARM_MPU to use
memblocks appropriately.

MPU initialisation only uses the first memory region, and removes all
subsequent ones. Because looping over all regions that need removal is
inefficient, and memblock_remove already handles memory ranges, we can
flatten the 'for_each_memblock' part.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-05 19:03:01 +01:00
Brad Mouring
204932dfc8 ARM: 8569/1: pl2x0: Add OF control of cache power management
Add ability to override power management bits of 310 controllers
(dynamic clock gating and standby mode) through OF entries. As the
saved register is only applied when working on a supported controller,
it is safe to save the settings.

In order to maintain existing behavior, if the settings are not found
in the DT, the corresponding feature will be enabled.

Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-05 19:02:10 +01:00
Russell King
981b6714db ARM: provide improved virt_to_idmap() functionality
For kexec, we need more functionality from the IDMAP system.  We need to
be able to convert physical addresses to their identity mappped versions
as well as virtual addresses.

Convert the existing arch_virt_to_idmap() to deal with physical
addresses instead.

Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-05-03 11:13:54 +01:00
Linus Torvalds
f862d66a1a Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Three further fixes for ARM.

  Alexandre Courbot was having problems with DMA allocations with the
  GFP flags affecting where the tracking data was being allocated from.
  Vladimir Murzin noticed that the CPU feature code was not entirely
  correct, which can cause some features to be misreported"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8564/1: fix cpu feature extracting helper
  ARM: 8563/1: fix demoting HWCAP_SWP
  ARM: 8551/2: DMA: Fix kzalloc flags in __dma_alloc
2016-04-21 08:45:02 -07:00
Russell King
e31db4c756 Merge tag 'arm-memremap-for-v4.7' of git://git.linaro.org/people/ard.biesheuvel/linux-arm into devel-stable
This series wires up the generic memremap() function for ARM in a way
that allows it to be used as intended, i.e., without regard for whether
the region being mapped is covered by a struct page and/or the linear
mapping (lowmem)
2016-04-20 09:09:07 +01:00
Alexandre Courbot
9c18fcf7ae ARM: 8551/2: DMA: Fix kzalloc flags in __dma_alloc
Commit 19e6e5e539 ("ARM: 8547/1: dma-mapping: store buffer
information") allocates a structure meant for internal buffer management
with the GFP flags of the buffer itself. This can trigger the following
safeguard in the slab/slub allocator:

	if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
		pr_emerg("gfp: %un", flags & GFP_SLAB_BUG_MASK);
		BUG();
	}

Fix this by filtering the flags that make the slab allocator unhappy.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-04-15 09:44:02 +01:00
Linus Torvalds
08b15d1386 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "A couple of small fixes, and wiring up the new syscalls which appeared
  during the merge window"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8550/1: protect idiv patching against undefined gcc behavior
  ARM: wire up preadv2 and pwritev2 syscalls
  ARM: SMP enable of cache maintanence broadcast
2016-04-10 17:48:17 -07:00
Alexandre Courbot
b67dd2e9bd ARM: 8548/1: dma-mapping: remove arm_dma_set_mask()
arm_dma_set_mask() implements exactly the same behavior as the fallback
that dma_set_mask() takes if the set_dma_mask op is not set. Remove it
and use that fallback instead like what is already done for
dma_get_mask().

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-04-07 21:57:15 +01:00
Kirill A. Shutemov
09cbfeaf1a mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized.  And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE.  And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special.  They are
not.

The changes are pretty straight-forward:

 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

 - page_cache_get() -> get_page();

 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below.  For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach.  I'll
fix them manually in a separate patch.  Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-04 10:41:08 -07:00
Ard Biesheuvel
9ab9e4fce4 ARM: memremap: implement arch_memremap_wb()
The generic memremap() falls back to using ioremap_cache() to create
MEMREMAP_WB mappings if the requested region is not already covered
by the linear mapping, unless the architecture provides an implementation
of arch_memremap_wb().

Since ioremap_cache() is not appropriate on ARM to map memory with the
same attributes used for the linear mapping, implement arch_memremap_wb()
which does exactly that. Also, relax the WARN() check to allow MT_MEMORY_RW
mappings of pfn_valid() pages.

Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2016-04-04 10:26:42 +02:00
Ard Biesheuvel
20c5ea4fc1 ARM: reintroduce ioremap_cached() for creating cached I/O mappings
The original ARM-only ioremap flavor 'ioremap_cached' has been renamed
to 'ioremap_cache' to align with other architectures, and subsequently
abused in generic code to map things like firmware tables in memory.
For that reason, there is currently an effort underway to deprecate
ioremap_cache, whose semantics are poorly defined, and which is typed
with an __iomem annotation that is inappropriate for mappings of ordinary
memory.

However, original users of ioremap_cached() used it in a context where
the I/O connotation is appropriate, and replacing those instances with
memremap() does not make sense. So let's revive ioremap_cached(), so
that we can change back those original users before we drop ioremap_cache
entirely in favor of memremap.

Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2016-04-04 10:26:40 +02:00
Russell King
0fc03d4c87 ARM: SMP enable of cache maintanence broadcast
Masahiro Yamada reports that we can fail to set the FW bit in the
auxiliary control register, which enables broadcasting the cache
maintanence operations.  This occurs because we only check that the
SMP/nAMP bit is set, rather than checking whether all the bits we
want to be set are set.

Rearrange the code to ensure that all desired bits are set, and only
update the register if we discover some required bits are not set.

Tested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2016-04-01 23:27:47 +01:00
Linus Torvalds
de06dbfa78 Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Another mixture of changes this time around:

   - Split XIP linker file from main linker file to make it more
     maintainable, and various XIP fixes, and clean up a resulting
     macro.

   - Decompressor cleanups from Masahiro Yamada

   - Avoid printing an error for a missing L2 cache

   - Remove some duplicated symbols in System.map, and move
     vectors/stubs back into kernel VMA

   - Various low priority fixes from Arnd

   - Updates to allow bus match functions to return negative errno
     values, touching some drivers and the driver core.  Greg has acked
     these changes.

   - Virtualisation platform udpates form Jean-Philippe Brucker.

   - Security enhancements from Kees Cook

   - Rework some Kconfig dependencies and move PSCI idle management code
     out of arch/arm into drivers/firmware/psci.c

   - ARM DMA mapping updates, touching media, acked by Mauro.

   - Fix places in ARM code which should be using virt_to_idmap() so
     that Keystone2 can work.

   - Fix Marvell Tauros2 to work again with non-DT boots.

   - Provide a delay timer for ARM Orion platforms"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (45 commits)
  ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0
  ARM: 8547/1: dma-mapping: store buffer information
  ARM: 8543/1: decompressor: rename suffix_y to compress-y
  ARM: 8542/1: decompressor: merge piggy.*.S and simplify Makefile
  ARM: 8541/1: decompressor: drop redundant FORCE in Makefile
  ARM: 8540/1: decompressor: use clean-files instead of extra-y to clean files
  ARM: 8539/1: decompressor: drop more unneeded assignments to "targets"
  ARM: 8538/1: decompressor: drop unneeded assignments to "targets"
  ARM: 8532/1: uncompress: mark putc as inline
  ARM: 8531/1: turn init_new_context into an inline function
  ARM: 8530/1: remove VIRT_TO_BUS
  ARM: 8537/1: drop unused DEBUG_RODATA from XIP_KERNEL
  ARM: 8536/1: mm: hide __start_rodata_section_aligned for non-debug builds
  ARM: 8535/1: mm: DEBUG_RODATA makes no sense with XIP_KERNEL
  ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
  ARM: make the physical-relative calculation more obvious
  ARM: 8512/1: proc-v7.S: Adjust stack address when XIP_KERNEL
  ARM: 8411/1: Add default SPARSEMEM settings
  ARM: 8503/1: clk_register_clkdev: remove format string interface
  ARM: 8529/1: remove 'i' and 'zi' targets
  ...
2016-03-19 16:31:54 -07:00
Jan Kara
0e8fb9312f mm: remove VM_FAULT_MINOR
The define has a comment from Nick Piggin from 2007:

 /* For backwards compat. Remove me quickly. */

I guess 9 years should not be too hurried sense of 'quickly' even for
kernel measures.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Kirill A. Shutemov
3ed3a4f0dd mm: cleanup *pte_alloc* interfaces
There are few things about *pte_alloc*() helpers worth cleaning up:

 - 'vma' argument is unused, let's drop it;

 - most __pte_alloc() callers do speculative check for pmd_none(),
   before taking ptl: let's introduce pte_alloc() macro which does
   the check.

   The only direct user of __pte_alloc left is userfaultfd, which has
   different expectation about atomicity wrt pmd.

 - pte_alloc_map() and pte_alloc_map_lock() are redefined using
   pte_alloc().

[sudeep.holla@arm.com: fix build for arm64 hugetlbpage]
[sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-17 15:09:34 -07:00
Linus Torvalds
dd273a8071 Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Just two ARM fixes this time: one to fix the hyp-stub for older ARM
  CPUs, and another to fix the set_memory_xx() permission functions to
  deal with zero sizes correctly"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8544/1: set_memory_xx fixes
  ARM: 8534/1: virt: fix hyp-stub build for pre-ARMv7 CPUs
2016-03-06 13:51:27 -08:00
Russell King
1b3bf84797 Merge branches 'amba', 'fixes', 'misc' and 'tauros2' into for-next 2016-03-04 23:36:02 +00:00
Rabin Vincent
b426867612 ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0
Given a device which uses arm_coherent_dma_ops and on which
dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA
API with gfp=0 results in memory corruption and a memory leak.

 p = dma_alloc_coherent(dev, sz, &dma, 0);
 if (p)
 	dma_free_coherent(dev, sz, p, dma);

The memory leak is because the alloc allocates using
__alloc_simple_buffer() but the free attempts
dma_release_from_contiguous() which does not do free anything since the
page is not in the CMA area.

The memory corruption is because the free calls __dma_remap() on a page
which is backed by only first level page tables.  The
apply_to_page_range() + __dma_update_pte() loop ends up interpreting the
section mapping as an addresses to a second level page table and writing
the new PTE to memory which is not used by page tables.

We don't have access to the GFP flags used for allocation in the free
function.  Fix this by adding allocator backends and using this
information in the free function so that we always use the correct
release routine.

Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA")
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-03-04 23:35:17 +00:00
Rabin Vincent
19e6e5e539 ARM: 8547/1: dma-mapping: store buffer information
Keep a list of allocated DMA buffers so that we can store metadata in
alloc() which we later need in free().

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-03-04 23:35:17 +00:00
Mika Penttilä
f474c8c857 ARM: 8544/1: set_memory_xx fixes
Allow zero size updates. This makes set_memory_xx() consistent with x86, s390 and arm64 and makes apply_to_page_range() not to BUG() when loading modules.

Signed-off-by: Mika Penttilä mika.penttila@nextfour.com
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-03-04 23:32:45 +00:00
Daniel Cashman
5ef11c35ce mm: ASLR: use get_random_long()
Replace calls to get_random_int() followed by a cast to (unsigned long)
with calls to get_random_long().  Also address shifting bug which, in
case of x86 removed entropy mask for mmap_rnd_bits values > 31 bits.

Signed-off-by: Daniel Cashman <dcashman@android.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Kralevich <nnk@google.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-27 10:28:52 -08:00
Arnd Bergmann
ac96680d22 ARM: 8535/1: mm: DEBUG_RODATA makes no sense with XIP_KERNEL
When CONFIG_DEBUG_ALIGN_RODATA is set, we get a link error:

arch/arm/mm/built-in.o:(.data+0x4bc): undefined reference to `__start_rodata_section_aligned'

However, this combination is useless, as XIP_KERNEL implies that all the
RODATA is already marked readonly, so both CONFIG_DEBUG_RODATA and
CONFIG_DEBUG_ALIGN_RODATA (which depends on the other) are not
needed with XIP_KERNEL, and this patches enforces that using a Kconfig
dependency.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 25362dc496 ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-22 11:39:42 +00:00
Russell King
8ff97fa313 ARM: make the physical-relative calculation more obvious
The physical-relative calculation between the XIP text and data sections
introduced by the previous patch was far from obvious. Let's simplify it
by turning it into a macro which takes the two (virtual) addresses.

This allows us to arrange the calculation in a more obvious manner - we
can make it two sub-expressions which calculate the physical address for
each symbol, and then takes the difference of those physical addresses.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-17 00:28:39 +00:00
Nicolas Pitre
d781145549 ARM: 8512/1: proc-v7.S: Adjust stack address when XIP_KERNEL
When XIP_KERNEL is enabled, the virt to phys address translation for RAM
is not the same as the virt to phys address translation for .text.
The only way to know where physical RAM is located is to use
PLAT_PHYS_OFFSET.
The MACRO will be useful for other places where there is a similar problem.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-16 17:17:49 +00:00
Kees Cook
64ac2e74f0 ARM: 8502/1: mm: mark section-aligned portion of rodata NX
When rodata is large enough that it crosses a section boundary after the
kernel text, mark the rest NX. This is as close to full NX of rodata as
we can get without splitting page tables or doing section alignment via
CONFIG_DEBUG_ALIGN_RODATA.

When the config is:

 CONFIG_DEBUG_RODATA=y
 # CONFIG_DEBUG_ALIGN_RODATA is not set

Before:

---[ Kernel Mapping ]---
0x80000000-0x80100000           1M     RW NX SHD
0x80100000-0x80a00000           9M     ro x  SHD
0x80a00000-0xa0000000         502M     RW NX SHD

After:

---[ Kernel Mapping ]---
0x80000000-0x80100000           1M     RW NX SHD
0x80100000-0x80700000           6M     ro x  SHD
0x80700000-0x80a00000           3M     ro NX SHD
0x80a00000-0xa0000000         502M     RW NX SHD

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-11 15:44:10 +00:00
Chris Brandt
02afa9a87b ARM: 8518/1: Use correct symbols for XIP_KERNEL
For an XIP build, _etext does not represent the end of the
binary image that needs to stay mapped into the MODULES_VADDR area.
Years ago, data came before text in the memory map. However,
now that the order is text/init/data, an XIP_KERNEL needs to map
up to the data location in order to keep from cutting off
parts of the kernel that are needed.
We only map up to the beginning of data because data has already been
copied, so there's no reason to keep it around anymore.
A new symbol is created to make it clear what it is we are referring
to.

This fixes the bug where you might lose the end of your kernel area
after page table setup is complete.

Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-11 15:43:14 +00:00
Doug Anderson
14d3ae2efe ARM: 8507/1: dma-mapping: Use DMA_ATTR_ALLOC_SINGLE_PAGES hint to optimize alloc
If we know that TLB efficiency will not be an issue when memory is
accessed then it's not terribly important to allocate big chunks of
memory.  The whole point of allocating the big chunks was that it would
make TLB usage efficient.

As Marek Szyprowski indicated:
    Please note that mapping memory with larger pages significantly
    improves performance, especially when IOMMU has a little TLB
    cache. This can be easily observed when multimedia devices do
    processing of RGB data with 90/270 degree rotation
Image rotation is distinctly an operation that needs to bounce around
through memory, so it makes sense that TLB efficiency is important
there.

Video decoding, on the other hand, is a fairly sequential operation.
During video decoding it's not expected that we'll be jumping all over
memory.  Decoding video is also pretty heavy and the TLB misses aren't a
huge deal.  Presumably most HW video acceleration users of dma-mapping
will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES.

Allocating big chunks of memory is quite expensive, especially if we're
doing it repeadly and memory is full.  In one (out of tree) usage model
it is common that arm_iommu_alloc_attrs() is called 16 times in a row,
each one trying to allocate 4 MB of memory.  This is called whenever the
system encounters a new video, which could easily happen while the
memory system is stressed out.  In fact, on certain social media
websites that auto-play video and have infinite scrolling, it's quite
common to see not just one of these 16x4MB allocations but 2 or 3 right
after another.  Asking the system even to do a small amount of extra
work to give us big chunks in this case is just not a good use of time.

Allocating big chunks of memory is also expensive indirectly.  Even if
we ask the system not to do ANY extra work to allocate _our_ memory,
we're still potentially eating up all big chunks in the system.
Presumably there are other users in the system that aren't quite as
flexible and that actually need these big chunks.  By eating all the big
chunks we're causing extra work for the rest of the system.  We also may
start making other memory allocations fail.  While the system may be
robust to such failures (as is the case with dwc2 USB trying to allocate
buffers for Ethernet data and with WiFi trying to allocate buffers for
WiFi data), it is yet another big performance hit.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-11 15:33:38 +00:00
Doug Anderson
33298ef6d8 ARM: 8505/1: dma-mapping: Optimize allocation
The __iommu_alloc_buffer() is expected to be called to allocate pretty
sizeable buffers.  Upon simple tests of video I saw it trying to
allocate 4,194,304 bytes.  The function tries to allocate large chunks
in order to optimize IOMMU TLB usage.

The current function is very, very slow.

One problem is the way it keeps trying and trying to allocate big
chunks.  Imagine a very fragmented memory that has 4M free but no
contiguous pages at all.  Further imagine allocating 4M (1024 pages).
We'll do the following memory allocations:
- For page 1:
  - Try to allocate order 10 (no retry)
  - Try to allocate order 9 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- For page 2:
  - Try to allocate order 9 (no retry)
  - Try to allocate order 8 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- ...
- ...

Total number of calls to alloc() calls for this case is:
  sum(int(math.log(i, 2)) + 1 for i in range(1, 1025))
  => 9228

The above is obviously worse case, but given how slow alloc can be we
really want to try to avoid even somewhat bad cases.  I timed the old
code with a device under memory pressure and it wasn't hard to see it
take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing
was done on kernel 3.14, so possibly mainline would behave
differently).

A second problem is that allocating big chunks under memory pressure
when we don't need them is just not a great idea anyway unless we really
need them.  We can make due pretty well with smaller chunks so it's
probably wise to leave bigger chunks for other users once memory
pressure is on.

Let's adjust the allocation like this:

1. If a big chunk fails, stop trying to hard and bump down to lower
   order allocations.
2. Don't try useless orders.  The whole point of big chunks is to
   optimize the TLB and it can really only make use of 2M, 1M, 64K and
   4K sizes.

We'll still tend to eat up a bunch of big chunks, but that might be the
right answer for some users.  A future patch could possibly add a new
DMA_ATTR that would let the caller decide that TLB optimization isn't
important and that we should use smaller chunks.  Presumably this would
be a sane strategy for some callers.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-11 15:33:37 +00:00
Kees Cook
25362dc496 ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA
The use of CONFIG_DEBUG_RODATA is generally seen as an essential part of
kernel self-protection:
http://www.openwall.com/lists/kernel-hardening/2015/11/30/13
Additionally, its name has grown to mean things beyond just rodata. To
get ARM closer to this, we ought to rearrange the names of the configs
that control how the kernel protects its memory. What was called
CONFIG_ARM_KERNMEM_PERMS is realy doing the work that other architectures
call CONFIG_DEBUG_RODATA.

This redefines CONFIG_DEBUG_RODATA to actually do the bulk of the
ROing (and NXing). In the place of the old CONFIG_DEBUG_RODATA, use
CONFIG_DEBUG_ALIGN_RODATA, since that's what the option does: adds
section alignment for making rodata explicitly NX, as arm does not split
the page tables like arm64 does without _ALIGN_RODATA.

Also adds human readable names to the sections so I could more easily
debug my typos, and makes CONFIG_DEBUG_RODATA default "y" for CPU_V7.

Results in /sys/kernel/debug/kernel_page_tables for each config state:

 # CONFIG_DEBUG_RODATA is not set
 # CONFIG_DEBUG_ALIGN_RODATA is not set

---[ Kernel Mapping ]---
0x80000000-0x80900000           9M     RW x  SHD
0x80900000-0xa0000000         503M     RW NX SHD

 CONFIG_DEBUG_RODATA=y
 CONFIG_DEBUG_ALIGN_RODATA=y

---[ Kernel Mapping ]---
0x80000000-0x80100000           1M     RW NX SHD
0x80100000-0x80700000           6M     ro x  SHD
0x80700000-0x80a00000           3M     ro NX SHD
0x80a00000-0xa0000000         502M     RW NX SHD

 CONFIG_DEBUG_RODATA=y
 # CONFIG_DEBUG_ALIGN_RODATA is not set

---[ Kernel Mapping ]---
0x80000000-0x80100000           1M     RW NX SHD
0x80100000-0x80a00000           9M     ro x  SHD
0x80a00000-0xa0000000         502M     RW NX SHD

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-08 15:56:45 +00:00
Russell King
2841029393 ARM: make virt_to_idmap() return unsigned long
Make virt_to_idmap() return an unsigned long rather than phys_addr_t.

Returning phys_addr_t here makes no sense, because the definition of
virt_to_idmap() is that it shall return a physical address which maps
identically with the virtual address.  Since virtual addresses are
limited to 32-bit, identity mapped physical addresses are as well.

Almost all users already had an implicit narrowing cast to unsigned long
so let's make this official and part of this interface.

Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-02-08 15:47:28 +00:00
Tetsuo Handa
1d5cfdb076 tree wide: use kvfree() than conditional kfree()/vfree()
There are many locations that do

  if (memory_was_allocated_by_vmalloc)
    vfree(ptr);
  else
    kfree(ptr);

but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
using is_vmalloc_addr().  Unless callers have special reasons, we can
replace this branch with kvfree().  Please check and reply if you found
problems.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Boris Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-22 17:02:18 -08:00
Linus Torvalds
6b5a12dbca ARM: SoC multiplatform code changes for v4.5
This branch is the culmination of 5 years of effort to bring the ARMv6
 and ARMv7 platforms together such that they can all be enabled and
 boot the same kernel. It has been a tremendous amount of cleanup and
 refactoring by a huge number of people, and creation of several new
 (and major) subsystems to better abstract out all the platform details
 in an appropriate manner.
 
 The bulk of this branch is a large patchset from Arnd that brings several
 of the more minor and older platforms we have closer to multiplatform
 support.  Among these are MMP, S3C64xx, Orion5x, mv78xx0 and realview
 Much of this is moving around header files from old mach directories,
 but there are also some cleanup patches of debug_ll (lowlevel debug
 per-platform options) and other parts.
 
 Linus Walleij also has some patchs to clean up the older ARM Realview
 platforms by finally introducing DT support, and Rob Herring has some
 for ARM Versatile which is now DT-only. Both of these platforms are
 now multiplatform.
 
 Finally, a couple of patches from Russell for Dove PMU, and a fix from
 Valentin Rothberg for Exynos ADC, which were rebased on top of the
 series to avoid conflicts.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIUAwUAVqAGcmCrR//JCVInAQLDog/4x9F0PHGmZhexGfFOpi2Od63Jjx55izRU
 zRXqRjjFjambOrZuOx8lEGDy/qzqKbsDU8D1P4IUugkDr2bLSXv+NTLZL1kNBIdm
 YOlJhw/BmzLYqauOHmBzGhtv1FDUk3rqbgTsP5tTWj5LpSkwjmqui3HBZpi+f3Rr
 YOn+NeQSARiw+51D0b106a9RFshQXRGgn5m3xFjLWhJqshb2z2Ew5cogX/zdwrrM
 ss1BFomxsvgk6S+snN6v7cEX2iXe3r89qNR5jEW5BgNpQGFsAUeXPr9zzH07L/Qq
 O7XLw9jt5MX/X5372zVHPb57WoflLbF9cFaaDUZV3eTqt3lC67BTxOtYIdC2i90k
 E5GYlsy88CRwT2EO+ok/6UTryph+hVv7JqHfbKfnISrbraMCK36DtDTpBIpZ9uYF
 rRB7ncJZUWBcyoe+qvitSl+2KV54iB1ez2RXsketxM98dDZsfB2M2ImFou1F/Pgg
 ALvpifPubi/uDe7xNUsSuaT6/3jAomBuNsxnkYJ3NeiH/+duZbOYGkzK/LlcjZyc
 UrA0IpLfwIFsBNzwfpZPZ1lkEu8Y1YZZ+Hv9k65q1wMuBDgrFI5zUeYrPZi4pN9T
 Yo1xP9FstVLDouJrpGZo12VIIxR1UBeGqfRI/BZ58LEF3PRq/g2OVFsdQia5gZKr
 ddiJKSL1Vw==
 =z1AW
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC multiplatform code updates from Arnd Bergmann:
 "This branch is the culmination of 5 years of effort to bring the ARMv6
  and ARMv7 platforms together such that they can all be enabled and
  boot the same kernel.  It has been a tremendous amount of cleanup and
  refactoring by a huge number of people, and creation of several new
  (and major) subsystems to better abstract out all the platform details
  in an appropriate manner.

  The bulk of this branch is a large patchset from Arnd that brings
  several of the more minor and older platforms we have closer to
  multiplatform support.  Among these are MMP, S3C64xx, Orion5x, mv78xx0
  and realview Much of this is moving around header files from old mach
  directories, but there are also some cleanup patches of debug_ll
  (lowlevel debug per-platform options) and other parts.

  Linus Walleij also has some patchs to clean up the older ARM Realview
  platforms by finally introducing DT support, and Rob Herring has some
  for ARM Versatile which is now DT-only.  Both of these platforms are
  now multiplatform.

  Finally, a couple of patches from Russell for Dove PMU, and a fix from
  Valentin Rothberg for Exynos ADC, which were rebased on top of the
  series to avoid conflicts"

* tag 'armsoc-multiplatform' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (75 commits)
  ARM: realview: don't select SMP_ON_UP for UP builds
  ARM: s3c: simplify s3c_irqwake_{e,}intallow definition
  ARM: s3c64xx: fix pm-debug compilation
  iio: exynos-adc: fix irqf_oneshot.cocci warnings
  ARM: realview: build realview-dt SMP support only when used
  ARM: realview: select apropriate targets
  ARM: realview: clean up header files
  ARM: realview: make all header files local
  ARM: no longer make CPU targets visible separately
  ARM: integrator: use explicit core module options
  ARM: realview: enable multiplatform
  ARM: make default platform work for NOMMU
  ARM: debug-ll: move DEBUG_LL_UART_EFM32 to correct Kconfig location
  ARM: defconfig: use correct debug_ll settings
  ARM: versatile: convert to multi-platform
  ARM: versatile: merge mach code into a single file
  ARM: versatile: switch to DT only booting and remove legacy code
  ARM: versatile: add DT based PCI detection
  ARM: pxa: mark ezx structures as __maybe_unused
  ARM: pxa: mark raumfeld init functions as __maybe_unused
  ...
2016-01-20 18:03:56 -08:00
Kirill A. Shutemov
e1534ae950 mm: differentiate page_mapped() from page_mapcount() for compound pages
Let's define page_mapped() to be true for compound pages if any
sub-pages of the compound page is mapped (with PMD or PTE).

On other hand page_mapcount() return mapcount for this particular small
page.

This will make cases like page_get_anon_vma() behave correctly once we
allow huge pages to be mapped with PTE.

Most users outside core-mm should use page_mapcount() instead of
page_mapped().

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Kirill A. Shutemov
0ebd744615 arm, thp: remove infrastructure for handling splitting PMDs
With new refcounting we don't need to mark PMDs splitting.  Let's drop
code to handle this.

pmdp_splitting_flush() is not needed too: on splitting PMD we will do
pmdp_clear_flush() + set_pte_at().  pmdp_clear_flush() will do IPI as
needed for fast_gup.

[arnd@arndb.de: fix unterminated ifdef in header file]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-15 17:56:32 -08:00
Daniel Cashman
e0c25d958f arm: mm: support ARCH_MMAP_RND_BITS
arm: arch_mmap_rnd() uses a hard-code value of 8 to generate the random
offset for the mmap base address.  This value represents a compromise
between increased ASLR effectiveness and avoiding address-space
fragmentation.  Replace it with a Kconfig option, which is sensibly
bounded, so that platform developers may choose where to place this
compromise.  Keep 8 as the minimum acceptable value.

[arnd@arndb.de: ARM: avoid ARCH_MMAP_RND_BITS for NOMMU]
Signed-off-by: Daniel Cashman <dcashman@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-14 16:00:49 -08:00
Russell King
6660800fb7 Merge branch 'devel-stable' into for-linus 2016-01-12 13:41:03 +00:00
Russell King
598bcc6ea6 Merge branches 'misc' and 'misc-rc6' into for-linus 2016-01-05 11:07:28 +00:00
Jungseung Lee
ad84f56bf6 ARM: 8494/1: mm: Enable PXN when running non-LPAE kernel on LPAE processor
The VMSA field of MMFR0 (bottom 4 bits) is incremented for each
added feature.  PXN is supported if the value is >= 4 and LPAE
is supported if it is >= 5.

In case a kernel with CONFIG_ARM_LPAE disabled is used on a
processor that supports LPAE, we can still use PXN in short
descriptors.  So check for >= 4 not == 4.

Signed-off-by: Jungseung Lee <js07.lee@samsung.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-01-04 11:26:40 +00:00
Linus Walleij
36f46d6d5c ARM: 8482/1: l2x0: make it possible to disable outer sync from DT
According to commit 2503a5ecd8
"ARM: 6201/1: RealView: Do not use outer_sync() on ARM11MPCore
boards with L220" Some PB11MPCore RealView core tiles have broken
outer_sync.

We got rid of the custom barriers from the machine by disabling
outer sync, but that was just for the boardfile case. We have
to be able to do the same in the device tree case.

Since __l2c_init() is cloning and copying the L2C vtable,
we pass an argument to this function to optionally numb
the outer sync operation if desired, before initializing
the cache.

After this we can set up the cache correctly on the RealView
PB11MPCore. This was tested on a PB11MPCore known to have the
issue. Before this, spurious crashes would occur if we try to
set up the cache properly, after this it boots rock solid.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-22 12:15:53 +00:00
Arnd Bergmann
86fff036fe Multiplatform support for the RealView
- Tested on the ARM PB11MPCore
 - Tested with boardfile boot
 - Tested with DeviceTree boot
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWdAsTAAoJEEEQszewGV1zMtsP/1So1gjBSf8NfLbRCndf4l3/
 jrviXeGEyGAzvyXvH+KB/24R10NpJMC5zEB8LiUP1Ig/cmNA+XVIJ9IkE8QsjsDx
 dA3aqYHTsXr8w8l6hzVX95GvAsxvg7hQ73rMsSg1x6r6pNeg859M4ZJis5gfB7Pf
 w76Qazl4jJUE79t768mxOKf46Af6D3y/Oa8dpsNW7w3ehzeEODI4N5BvUyZ50vKT
 j8ZdBs++S2ucW1s1zUTeo0wDjaOBK8gNmYKzDtwScCNs5K6l4GcggO0pkrX7MmN/
 quNHcvULU0qnzETMjss6wb4xCAenZ98hqHTnBL2NjHd9A1tvTEGBzRfFx4jb5EI6
 MspsOfxCLpj59QV7qwQJoYbFdzmQo6YgPn+lZoHA5wzxNqt2g0xiBscPYCVO30WT
 mYEF27dUDYzq8EH/BkQ/0dsPcWSNZVWpTYj/14+L766maFsPHJYAVnRk1iRNK9Gq
 uxdu9ZH8VusqFLFkikbNkj9xeFc9M0GN2F5AY0+Li24uzqjClessdXdfiOqQAWZ5
 px0ET819Zz+HG0F8+YNKiLL3Ybnhz8EB4NrncXwSJPrvOGt+xATqSUycqrxwwMsu
 a/kNRaZMRZ7jKdZPPHPESPXdJZKs0vGDThrEFMqKVTFtXfoGNQKdh/Ltixy+rPCH
 CX8QrAZPi7CqRUEMXmFp
 =tfk1
 -----END PGP SIGNATURE-----

Merge tag 'realview-multiplatform-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/multiplatform

Pull "Multiplatform support for the RealView" from Linus Walleij:

Here is the result of my application of the second part of Arnds
patchset, actually enabling multiplatform and getting the RealView
off the ground as a multiplatform target.

It is dependent on an outstanding patch to the irqchips tree bumping
the number of GICs to 2 for the RealView platform. I cannot say I will
be sleepless if these go in side by side: each branch will compile but
will not boot until both trees have been pulled hurting bisectability a
bit.

- Tested on the ARM PB11MPCore
- Tested with boardfile boot
- Tested with DeviceTree boot

* tag 'realview-multiplatform-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator:
  ARM: realview: select apropriate targets
  ARM: realview: clean up header files
  ARM: realview: make all header files local
  ARM: no longer make CPU targets visible separately
  ARM: integrator: use explicit core module options
  ARM: realview: enable multiplatform
2015-12-18 16:42:47 +01:00
Arnd Bergmann
17d44d7d87 ARM: no longer make CPU targets visible separately
Now that realview and integrator always select the correct CPU
type themselves based on the core tiles, there is no need to
still have them user-visible in arch/arm/mm/Kconfig. The
ARM925T symbol has been selected by the only user for many
years, so that can be removed along with the realview and
integrator specific ones.

This also solves randconfig build problems on realview.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-12-18 14:09:21 +01:00
Linus Torvalds
d7637d01be Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:

 - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
   These are tagged for -stable.  The scatterlist impact is potentially
  corrupted dma addresses on HIGHMEM enabled platforms.

 - A minor locking fix for the NFIT hot-add implementation that is new
   in 4.4-rc.  This would only trigger in the case a hot-add raced
   driver removal.

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dma-debug: Fix dma_debug_entry offset calculation
  Revert "scatterlist: use sg_phys()"
  nfit: acpi_nfit_notify(): Do not leave device locked
2015-12-17 11:20:13 -08:00
Nicolas Pitre
b563d06451 ARM: 8453/2: proc-v7.S: don't locate temporary stack space in .text section
The proc-v7.S code uses a small temporary stack to preserve register
content in its setup code. This stack is located in the .text section
which is normally meant to be read-only.

Move that temporary stack to the .bss section and get its address in
a position independent way, similarly to what we do in other parts
of the kernel.

While at it, one comments was updated to reflect reality, and the list
of saved registers in the proc-v7.S case is updated to match the comment
next to it for coherency.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-17 10:29:01 +00:00
Arnd Bergmann
22ba14f41c The board and infrastructure changes for RealView
multiplatform and extended DT support.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWb9LqAAoJEEEQszewGV1z1I4QAIgylTG1hftD5xCtaKsgJv5X
 pcp+eVVCeYjeO3AbknrXzBlty0u3/rDOR6n9aUn7ci63qErGi2GeoZ5glo6y3aOU
 qKo2/M0LOoP3y6SGxMjaPTTpStjKsaj2XLjHLNrSHAKXsvoFB69vnAlQh2jU+ohX
 JEl9wvkugMWicGaiooWQfG7OAf2Gb6AFAKQUfYNVNNXBTD13oQcFRgLwBKDkNlc9
 7N6yzPQDNQiauytr7Ji/49fbkiOLFSB5yllhecb37F/b56XprGvsXJTRwsQhPwbj
 ig28qu/g7LhfnkZUOTwhWH6WdyFarMlpA8oHHKrZBeGySgvdjXBVYH4IQAfhT2N9
 WSh/he+w8T2+oMVj97gLSxKiP4ugUSsBR6daq5X8NESobxFVYzmFIYQxKxOwFif4
 JDWvOKaQsRhx42iAq3CIMkh7yQsBC4tJjtyvVwHuGeC2zRlyODbBCnN4OGKDdYpJ
 +VY4kr48Yld5IxYm6J6fXOEWTcFw2n/hvEVsik5mmgw1rYivJsCcvcVxv04xZYCl
 6d13eCaSh2TfeKYs/Qf2yy3hmps4dWFcd18/xWRyFdnkCs5qGRbtgLiAQtUnQAGm
 bpaI/rYeRnnF80af2dhpZT2i0zBL9uuur7FYZDB9xsQDOkqnCYxKvSW1CfNg5Gtc
 senI3dczeTC3NtwRp4eC
 =Zhx3
 -----END PGP SIGNATURE-----

Merge tag 'realview-base-armsoc-1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator into next/multiplatform

Merge "Realview multiplatform support" from Linus Walleij:

The board and infrastructure changes for RealView
multiplatform and extended DT support.

* tag 'realview-base-armsoc-1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator:
  ARM: realview: add an DT SMP boot method
  ARM: realview: select SP810 and ICST for the DT variant
  soc: versatile: add support for the PB11MPCore
  clk: versatile-icst: add device tree support
  clk: versatile-icst: refactor to allocate regmap separately
  clk: versatile-icst: convert to use regmap
  ARM: realview: remove private barrier implementation
  ARM: no longer force unbuffered DMA for realview
  clk/realview: stop using machine headers
  ARM: realview: don't map undefined PCI registers
  ARM: realview: remove sparsemem hack

Conflicts:
	drivers/clk/versatile/Kconfig
2015-12-16 00:56:18 +01:00
Dan Williams
3e6110fd54 Revert "scatterlist: use sg_phys()"
commit db0fa0cb01 "scatterlist: use sg_phys()" did replacements of
the form:

    phys_addr_t phys = page_to_phys(sg_page(s));
    phys_addr_t phys = sg_phys(s) & PAGE_MASK;

However, this breaks platforms where sizeof(phys_addr_t) >
sizeof(unsigned long).  Revert for 4.3 and 4.4 to make room for a
combined helper in 4.5.

Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: db0fa0cb01 ("scatterlist: use sg_phys()")
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-12-15 12:54:06 -08:00
Anson Huang
fa0708b320 ARM: 8471/1: need to save/restore arm register(r11) when it is corrupted
In cpu_v7_do_suspend routine, r11 is used while it is NOT
saved/restored, different compiler may have different usage
of ARM general registers, so it may cause issues during
calling cpu_v7_do_suspend.

We meet kernel fault occurs when using GCC 4.8.3, r11 contains
valid value before calling into cpu_v7_do_suspend, but when returned
from this routine, r11 is corrupted and lead to kernel fault.
Doing save/restore for those corrupted registers is a must in
assemble code.

Signed-off-by: Anson Huang <Anson.Huang@freescale.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: <stable@vger.kernel.org> # v3.3+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-15 11:51:41 +00:00
Arnd Bergmann
38541bf485 ARM: no longer force unbuffered DMA for realview
Commit 42c4dafe80 ("ARM: 6202/1: Do not ARM_DMA_MEM_BUFFERABLE
on RealView boards with L210/L220") changed the generic setting for
ARM_DMA_MEM_BUFFERABLE to be disabled on any Realview kernel that includes
support for any of the ARM11 variations. Doing this was required to
allow doing DMA without a lockup in the l2x0 cache controller on the
Realview platform.

Unfortunately, in a kernel that also contains support for any ARMv7
based machine, the same change makes it impossible to do DMA on ARMv7,
which gets in the way of enabling multiplatform support on Realview.

As confirmed by Catalin Marinas and Linus Walleij, the current
code for Realview that we have in the kernel does not actually
perform any DMA, and this is unlikely to change in the future.
Therefore we can revert 42c4dafe80 without introducing regressions,
but we must never start using DMA on this platform in the future.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2015-12-15 09:41:33 +01:00
Ard Biesheuvel
09414d00a1 ARM: only consider memblocks with NOMAP cleared for linear mapping
Take the new memblock attribute MEMBLOCK_NOMAP into account when
deciding whether a certain region is or should be covered by the
kernel direct mapping.

Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-12-13 19:18:29 +01:00
Ard Biesheuvel
c7936206b9 ARM: implement create_mapping_late() for EFI use
This implements create_mapping_late(), which we will use to populate
the UEFI Runtime Services page tables.

Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-12-13 19:18:29 +01:00
Ard Biesheuvel
b430e55b23 ARM: add support for non-global kernel mappings
Add support to the kernel translation table population routines for
creating non-global mappings. This will be used by the UEFI runtime
services, which will use temporary mappings in the userland range.

Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-12-13 19:18:29 +01:00
Ard Biesheuvel
f579b2b104 ARM: factor out allocation routine from __create_mapping()
To allow __create_mapping() to be used for populating UEFI Runtime
Services page tables, factor out the allocation routine 'early_alloc'
and pass it down as a function pointer into alloc_init_[pud|pmd|pte].
This way, new users of __create_mapping() can supply another allocation
function.

Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-12-13 19:18:29 +01:00
Ard Biesheuvel
1bdb2d4ee0 ARM: split off core mapping logic from create_mapping
In order to be able to reuse the core mapping logic of create_mapping
for mapping the UEFI Runtime Services into a private set of page tables,
split it off from create_mapping() into a separate function
__create_mapping which we will wire up in a subsequent patch.

Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-12-13 19:18:28 +01:00
Ard Biesheuvel
2937367b8a ARM: add support for generic early_ioremap/early_memremap
This enables the generic early_ioremap implementation for ARM.

It uses the fixmap region reserved for kmap. Since early_ioremap
is only supported before paging_init(), and kmap is only supported
afterwards, this is guaranteed not to cause any clashes.

Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
2015-12-13 19:18:28 +01:00
Laura Abbott
08925c2f12 ARM: 8464/1: Update all mm structures with section adjustments
Currently, when updating section permissions to mark areas RO
or NX, the only mm updated is current->mm. This is working off
the assumption that there are no additional mm structures at
the time. This may not always hold true. (Example: calling
modprobe early will trigger a fork/exec). Ensure all mm structres
get updated with the new section information.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-04 19:20:34 +00:00
Masahiro Yamada
89e69fbfd1 ARM: 8462/1: cache-uniphier: use common API to find the next level cache
The function uniphier_cache_get_next_level_node() does the same thing
as of_find_next_cache_node().  Drop the former and stick to the common
API.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-03 00:03:09 +00:00
Will Deacon
40ee068ec0 ARM: 8465/1: mm: keep reserved ASIDs in sync with mm after multiple rollovers
Under some unusual context-switching patterns, it is possible to end up
with multiple threads from the same mm running concurrently with
different ASIDs:

1. CPU x schedules task t with mm p containing ASID a and generation g
   This task doesn't block and the CPU doesn't context switch.
   So:
     * per_cpu(active_asid, x) = {g,a}
     * p->context.id = {g,a}

2. Some other CPU generates an ASID rollover. The global generation is
   now (g + 1). CPU x is still running t, with no context switch and
   so per_cpu(reserved_asid, x) = {g,a}

3. CPU y schedules task t', which shares mm p with t. The generation
   mismatches, so we take the slowpath and hit the reserved ASID from
   CPU x. p is then updated so that p->context.id = {g + 1,a}

4. CPU y schedules some other task u, which has an mm != p.

5. Some other CPU generates *another* CPU rollover. The global
   generation is now (g + 2). CPU x is still running t, with no context
   switch and so per_cpu(reserved_asid, x) = {g,a}.

6. CPU y once again schedules task t', but now *fails* to hit the
   reserved ASID from CPU x because of the generation mismatch. This
   results in a new ASID being allocated, despite the fact that t is
   still running on CPU x with the same mm.

Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised
between the two threads.

This patch fixes the problem by updating all of the matching reserved
ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps
the reserved ASIDs in-sync with the mm and avoids the problem.

Cc: <stable@vger.kernel.org>
Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-12-02 23:57:54 +00:00
Arnd Bergmann
0f67b87609 ARM: mohawk: allow building with MMU disabled
It is in principle possible to build an MMP kernel for
the mohawk CPU with the MMU code disabled, except for one
simple build error:

proc-mohawk.S:345: Error: invalid operands (*UND* and *UND* sections) for `|'
proc-mohawk.S:345: Error: invalid operands (*ABS* and *UND* sections) for `|'
proc-mohawk.S:345: Error: invalid operands (*UND* and *UND* sections) for `|'
proc-mohawk.S:345: Error: invalid operands (*UND* and *UND* sections) for `|'
proc-mohawk.S:345: Error: undefined symbol L_PTE_USER used as an immediate value

This patch changes the proc-mohawk code to do the same as the
other CPUs and not try to actually do anything for the
cpu_mohawk_set_pte_ext function, which won't be used anyway.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-12-01 21:44:25 +01:00
Arnd Bergmann
d33c43ac18 ARM: make xscale iwmmxt code multiplatform aware
In a multiplatform configuration, we may end up building a kernel for
both Marvell PJ1 and an ARMv4 CPU implementation. In that case, the
xscale-cp0 code is built with gcc -march=armv4{,t}, which results in a
build error from the coprocessor instructions.

Since we know this code will only have to run on an actual xscale
processor, we can simply build the entire file for ARMv5TE.

Related to this, we need to handle the iWMMXT initialization sequence
differently during boot, to ensure we don't try to touch xscale
specific registers on other CPUs from the xscale_cp0_init initcall.
cpu_is_xscale() used to be hardcoded to '1' in any configuration that
enables any XScale-compatible core, but this breaks once we can have a
combined kernel with MMP1 and something else.

In this patch, I replace the existing cpu_is_xscale() macro with a new
cpu_is_xscale_family() macro that evaluates true for xscale, xsc3 and
mohawk, which makes the behavior more deterministic.

The two existing users of cpu_is_xscale() are modified accordingly,
but slightly change behavior for kernels that enable CPU_MOHAWK without
also enabling CPU_XSCALE or CPU_XSC3. Previously, these would leave leave
PMD_BIT4 in the page tables untouched, now they clear it as we've always
done for kernels that enable both MOHAWK and the support for the older
CPU types.

Since the previous behavior was inconsistent, I assume it was
unintentional.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-12-01 21:44:24 +01:00
Russell King
1d93ba2aaa ARM: l2c: tauros2: use descriptive definitions for register bits
Use descriptive definitions for the Tauros2 register bits, and while
we're here, clean up the "Tauros2: %s line fill burt8." message.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-11-26 22:12:26 +00:00
Russell King
172f3fcb17 ARM: l2c: tauros2: fix OF-enabled non-DT boot
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-11-26 22:12:02 +00:00
Ezequiel Garcia
a4124e7296 ARM: 8451/1: v7-M: Set an early stack for __v7m_setup
On ARM v7-M, when PROCINFO_INITFUNC (__v7m_setup) is called,
a stack is needed before calling the supervisor call (SVC),
which is used by the supervisor call to save the context.

Currently, __v7m_setup() prepares a temporary stack in the .text.init
section, which is is broken if the kernel is executing directly from
read-only memory.

In particular, this is the case for LPC43xx, which allows
to execute the kernel in-place from a serial flash through its SPIFI
controller.

This commit fixes the issue by seting an early stack to its usual location.

Also, __v7m_setup() is currently saving and restoring the previous
stack. That was bogus, because there's no stack previously set,
so this commit removes it.

Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-11-16 18:34:38 +00:00
Linus Walleij
b522842c43 ARM: 8448/1: add some L220 DT settings
The RealView ARM11MPCore enables parity, eventmon and shared
override in the cache controller through its current boardfile,
but the code and DT bindings for the ARM L220 is currently
lacking the ability to set this up from DT. Add the required
bool parameters for parity and shared override, but keep
eventmon out of it: this should be enabled by the event
monitor code.

Cc: devicetree@vger.kernel.org
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-11-16 18:02:46 +00:00
Linus Torvalds
56e0464980 ARM: SoC platform updates for v4.4
New and/or improved SoC support for this release:
 
  - Marvell Berlin:
    * Enable standard DT-based cpufreq
    * Add CPU hotplug support
  - Freescale:
    * Ethernet init for i.MX7D
    * Suspend/resume support for i.MX6UL
  - Allwinner:
    * Support for R8 chipset (used on NTC's $9 C.H.I.P board)
  - Mediatek:
    * SMP support for some platforms
  - Uniphier:
    * L2 support
    * Cleaned up SMP support, etc.
 
 + A handful of other patches around above functionality, and a few other
 smaller changes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWQCmaAAoJEIwa5zzehBx3F/0QAKIYmvmJM3sUanNEEwhRilx+
 3xhSgld7e25suLGwrNapTkd8VzVB4b8GnJhNShNk+l5WqfqICHCB4Aru2NmJHY8V
 yPj5vBrgJTVMnIiH7CRDPz9IlwAkWM4MmWi4PgFuhrk1T/0wPKPNMc40OWOloTeD
 gA5YmbbX1hNOqKoI/z+DK7CEdp1lHrEjeYIbnQ0SldFzkY9NKhrI784gtcz3si6E
 19pFQ9LA7EtEv7aRcFOA0sazeooa2wiJ9P9L31Mn5APZBJj5H8HjyKdvOmJ8neQn
 +b77Tya11Q70U57uDq69l0rl58fpy650uTwYaLNGWmUdTgOiGMWN05lvIVNrQd/R
 gP+VEQDGsTH6kqOCy9gyLCmn9q9I7l0t9lwcu5TP52Xy9vqVq9rb388MkcPcsWk8
 cYPvD8RcSaywZUV3YJgbYozBfVuf5rLVus6D54pMXe3N12KGaNBt8kk4co4jBwvh
 b//1urA82cdlEAZ/kiqHXjRMq/ht+dxtb6sSVOJ9frxPLuc7g1z4ORC+Z0PTS5WC
 zB0hMzPnTwXeqHcYpV4wP/vGtgZGpLevBkK7pKVdqKZykV8BS4FiT4HFp6Rghxs3
 dxAz7JjQUle6KOX7YfuHdpLnyZFWQvLYyTn946xGKpw2QH/iWLwECpet2I87QzVs
 QkEkGygPhK4QcGccxz9N
 =h5xk
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC platform updates from Olof Johansson:
 "New and/or improved SoC support for this release:

  Marvell Berlin:
     - Enable standard DT-based cpufreq
     - Add CPU hotplug support

  Freescale:
     - Ethernet init for i.MX7D
     - Suspend/resume support for i.MX6UL

  Allwinner:
     - Support for R8 chipset (used on NTC's $9 C.H.I.P board)

  Mediatek:
     - SMP support for some platforms

  Uniphier:
     - L2 support
     - Cleaned up SMP support, etc.

  plus a handful of other patches around above functionality, and a few
  other smaller changes"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (42 commits)
  ARM: uniphier: rework SMP operations to use trampoline code
  ARM: uniphier: add outer cache support
  Documentation: EXYNOS: Update bootloader interface on exynos542x
  ARM: mvebu: add broken-idle option
  ARM: orion5x: use mac_pton() helper
  ARM: at91: pm: at91_pm_suspend_in_sram() must be 8-byte aligned
  ARM: sunxi: Add R8 support
  ARM: digicolor: select pinctrl/gpio driver
  arm: berlin: add CPU hotplug support
  arm: berlin: use non-self-cleared reset register to reset cpu
  ARM: mediatek: add smp bringup code
  ARM: mediatek: enable gpt6 on boot up to make arch timer working
  soc: mediatek: Fix random hang up issue while kernel init
  soc: ti: qmss: make acc queue support optional in the driver
  soc: ti: add firmware file name as part of the driver
  Documentation: dt: soc: Add description for knav qmss driver
  ARM: S3C64XX: Use PWM lookup table for mach-smartq
  ARM: S3C64XX: Use PWM lookup table for mach-hmt
  ARM: S3C64XX: Use PWM lookup table for mach-crag6410
  ARM: S3C64XX: Use PWM lookup table for smdk6410
  ...
2015-11-10 14:56:23 -08:00
Nicolas Pitre
77c5b5da02 kmap_atomic_to_page() has no users, remove it
Removal started in commit 5bbeed12bd ("sparc32: drop unused
kmap_atomic_to_page").  Let's do it across the whole tree.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-09 15:11:24 -08:00
Mel Gorman
d0164adc89 mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-06 17:50:42 -08:00
Andrew Morton
0ab32b6f1b uaccess: reimplement probe_kernel_address() using probe_kernel_read()
probe_kernel_address() is basically the same as the (later added)
probe_kernel_read().

The return value on EFAULT is a bit different: probe_kernel_address()
returns number-of-bytes-not-copied whereas probe_kernel_read() returns
-EFAULT.  All callers have been checked, none cared.

probe_kernel_read() can be overridden by the architecture whereas
probe_kernel_address() cannot.  parisc, blackfin and um do this, to insert
additional checking.  Hence this patch possibly fixes obscure bugs,
although there are only two probe_kernel_address() callsites outside
arch/.

My first attempt involved removing probe_kernel_address() entirely and
converting all callsites to use probe_kernel_read() directly, but that got
tiresome.

This patch shrinks mm/slab_common.o by 218 bytes.  For a single
probe_kernel_address() callsite.

Cc: Steven Miao <realmz6@gmail.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-05 19:34:48 -08:00
Russell King
116ef0fcc9 Merge branches 'fixes' and 'misc' into for-next 2015-10-29 15:21:30 +00:00
Masahiro Yamada
e7ecbc057b ARM: uniphier: add outer cache support
This commit adds support for UniPhier outer cache controller.
All the UniPhier SoCs are equipped with the L2 cache, while the L3
cache is currently only integrated on PH1-Pro5 SoC.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
2015-10-27 09:20:50 +09:00
Lucas Stach
9254970cbb ARM: 8447/1: catch pending imprecise abort on unmask
Install a non-faulting handler just before unmasking imprecise aborts
and switch back to the regular one after unmasking is done.

This catches any pending imprecise abort that the firmware/bootloader
may have left behind that would normally crash the kernel at that point.
As there are apparently a lot of bootlaoders out there that do such a
thing it makes sense to handle it in the common startup code.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-10-19 17:08:33 +01:00