This patch adds a step in the init sequence, in order to recreate
the kernel code/data page table mappings prior to full paging
initialization. This is necessary on LPAE systems that run out of
a physical address space outside the 4G limit. On these systems,
this implementation provides a machine descriptor hook that allows
the PHYS_OFFSET to be overridden in a machine specific fashion.
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: R Sricharan <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Commit 9e9a367c29 {ARM: Section based HYP idmap} moved
the address conversion inside identity_mapping_add() without
respective print which carries useful idmap information.
Move the print as well inside identity_mapping_add() to
fix the same.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
On some PAE systems (e.g. TI Keystone), memory is above the
32-bit addressable limit, and the interconnect provides an
aliased view of parts of physical memory in the 32-bit addressable
space. This alias is strictly for boot time usage, and is not
otherwise usable because of coherency limitations. On such systems,
the idmap mechanism needs to take this aliased mapping into account.
This patch introduces virt_to_idmap() and a arch function pointer which
can be populated by platform which needs it. Also populate necessary
idmap spots with now available virt_to_idmap(). Avoided #ifdef approach
to be compatible with multi-platform builds.
Most architecture won't touch it and in that case virt_to_idmap()
fall-back to existing virt_to_phys() macro.
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
All arches do essentially the same thing now for
early_init_dt_setup_initrd_arch, so it can now be removed.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
In order to unify the initrd scanning for DT across architectures, make
arm set initrd_start and initrd_end instead of the physical addresses.
This is aligned with all other architectures.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Grant Likely <grant.likely@linaro.org>
... otherwise it is impossible for the low level iommu driver to
figure out which pte flags should be used.
In __map_sg_chunk we can derive the flags from dma_data_direction.
In __iommu_create_mapping we should treat the memory like
DMA_BIDIRECTIONAL and pass both IOMMU_READ and IOMMU_WRITE to
iommu_map.
__iommu_create_mapping is used during dma_alloc_coherent (via
arm_iommu_alloc_attrs). AFAIK dma_alloc_coherent is responsible for
allocation _and_ mapping. I think this implies that access to the
mapped pages should be allowed.
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.
Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kernel faults are expected to handle OOM conditions gracefully (gup,
uaccess etc.), so they should never invoke the OOM killer. Reserve this
for faults triggered in user context when it is the only option.
Most architectures already do this, fix up the remaining few.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently hugepage migration works well only for pmd-based hugepages
(mainly due to lack of testing,) so we had better not enable migration of
other levels of hugepages until we are ready for it.
Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
page table walk and check pud/pmd_huge() there, so they are safe. But the
other users (softoffline and memory hotremove) don't do this, so without
this patch they can try to migrate unexpected types of hugepages.
To prevent this, we introduce hugepage_migration_support() as an
architecture dependent check of whether hugepage are implemented on a pmd
basis or not. And on some architecture multiple sizes of hugepages are
available, so hugepage_migration_support() also checks hugepage size.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding the
entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJSL12LAAoJEEFnBt12D9kB64gP/RBipnYbo3RPanHg+lE/J1V7
KSVFNGKWJHxTg47VVC1YJGIG21jqxAilpdS2MQL5FP7iyd+IzvtHpQiJgp+2G+pq
di06yrdyrYErxRgZgGQi8IpR538ZzOEVLCKJGdb09YelkRzPT5au7CC1MAsX3qco
yba7PHk0/Nc4hZE4aGbgR1DlRmn86ob7mM0KFE/LORaSN2BueMgWcwKhQXYNGyoh
assX4yNhAbUG6Bgw7paBLDGqHh8c5Ei5AppU8yPb+N094jgYHBJryUoDlzzUHD23
qqiEqHhUKT0TpgHNs8KH0WZFugcmjKvYEbzdzadBxqfXnJN4fKSEcdfF3iz4T14j
U6EZks89GoHwA523OghUZkKNOqlsUdWfdKz+8/grQqKisYwDcf3fCxEYk/4weDCQ
b6fFlOv6+AI3btjXp6F511ZKxyT4ZZzkHjp/ZSrhBygyamNZfax0ma0j+ZS9AZql
kPxQS0nOve6NKaP7vXxMmW5sGMnL19ER/Hm31wthGcWI43GVebUdklnzfGaEeSjs
pmP8oiCNemceqVpiPKxcOxiguf/eyIjP1SFXbguASygUmQeTDbbJ8n1FYznCitue
xJgWttKWsEf/aMR3eJtQ3aBmHR3rijAV4E28Wlq8XMkocwvpQm2zMocS2Z5BJ80S
hi1kQVy8+RxNX96tOSp1
=GSWl
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree core updates from Grant Likely:
"Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding
the entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either"
Tim Bird questions whether the boot time cost of the random feeding may
be noticeable. And "add_device_randomness()" is definitely not some
speed deamon of a function.
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of/platform: add error reporting to of_amba_device_create()
irq/of: Fix comment typo for irq_of_parse_and_map
of: Feed entire flattened device tree into the random pool
of/fdt: Clean up casting in unflattening path
of/fdt: Remove duplicate memory clearing on FDT unflattening
gpio: implement gpio-ranges binding document fix
of: call __of_parse_phandle_with_args from of_parse_phandle
of: introduce of_parse_phandle_with_fixed_args
of: move of_parse_phandle()
of: move documentation of of_parse_phandle_with_args
of: Fix missing memory initialization on FDT unflattening
of: consolidate definition of early_init_dt_alloc_memory_arch()
of: Make of_get_phy_mode() return int i.s.o. const int
include: dt-binding: input: create a DT header defining key codes.
of/platform: Staticize of_platform_device_create_pdata()
of: Specify initrd location using 64-bit
dt: Typo fix
OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled
These are changes that arrived a little late before the merge window,
or had dependencies on previous branches.
Highlights:
- ux500: misc. cleanup, fixup I2C devices
- exynos: DT updates for RTC; PM updates
- at91: DT updates for NAND; new platforms added to generic defconfig
- sunxi: DT updates: cubieboard2, pinctrl driver, gated clocks
- highbank: LPAE fixes, select necessary ARM errata
- omap: PM fixes and improvements; OMAP5 mailbox support
- omap: basic support for new DRA7xx SoCs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJSLkf1AAoJEFk3GJrT+8ZlF7oP/AyxrdRFyC1YmuOqzFH0/JTQ
EVBmMBiH+f1IKBT6YRkWCzX4JI5oOi+2DhrM6d/UPfbpr6pwd8dptuPiyLuBBUEm
byNbiJEYHidm23oFpKM+89tTHXbBrrz8XQN2xLwYhNr24QkVAsLTxyOjVA7KJM59
tk1tPQzO1ORyiFd485eQa3V4z98JgcE3QFNthbS7Y72wEXBzMZQDc9nFaoIJ5mHW
nzJSZyV24ibeEJeM2nsc7a3OvCyUfAQaO5Cio2UvdkGzZcmtxjxc1LjHa4VjIL6h
hwz+gqIOfl3hXotbjJxTp9+Ezt4TGU5bB3NUweE1btHE/KIEu0bx4hSsOz/kooA9
2JL8BCCTx+KiGiNHmNCcT679n9q11iOwqOWvxxhcJFkiV/6+mkjwTD9TNwR1q+RG
+LtOZr9tMcu2v/DbAivDYKiROmNCZhxpn35DoUKpBy73SOvJOiTLtSYitVN/tyM3
nWLEP5aTf3NwrWr8nFFws6ycwhgTCX0ITbdFD/fMlLMamHYPkckJ/0NXXOxfGiLk
kCMbdrCX4YTbCftmAQhrbdaPJVnE/SZI3CTJfutj8eX6NC2fm/U7Hcf5PI+W0Igd
moN/PaUULpVZI5hUrADyU1HCQnA97pv0biYVwzW5pBIt2u9tzUritabuERxPt9fa
SdHj0+u+xq9d3y35Oq46
=NIZZ
-----END PGP SIGNATURE-----
Merge tag 'late-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC late changes from Kevin Hilman:
"These are changes that arrived a little late before the merge window,
or had dependencies on previous branches.
Highlights:
- ux500: misc. cleanup, fixup I2C devices
- exynos: DT updates for RTC; PM updates
- at91: DT updates for NAND; new platforms added to generic defconfig
- sunxi: DT updates: cubieboard2, pinctrl driver, gated clocks
- highbank: LPAE fixes, select necessary ARM errata
- omap: PM fixes and improvements; OMAP5 mailbox support
- omap: basic support for new DRA7xx SoCs"
* tag 'late-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (60 commits)
ARM: dts: vexpress: Add CCI node to TC2 device-tree
ARM: EXYNOS: Skip C1 cpuidle state for exynos5440
ARM: EXYNOS: always enable PM domains support for EXYNOS4X12
ARM: highbank: clean-up some unused includes
ARM: sun7i: Enable the A20 clocks in the DTSI
ARM: sun6i: Enable clock support in the DTSI
ARM: sun5i: dt: Use the A10s gates in the DTSI
ARM: at91: at91_dt_defconfig: enable rm9200 support
ARM: dts: add ADC device tree node for exynos5420/5250
ARM: dts: Add RTC DT node to Exynos5420 SoC
ARM: dts: Update the "status" property of RTC DT node for Exynos5250 SoC
ARM: dts: Fix the RTC DT node name for Exynos5250
irqchip: mmp: avoid to include irqs head file
ARM: mmp: avoid to include head file in mach-mmp
irqchip: mmp: support irqchip
irqchip: move mmp irq driver
ARM: OMAP: AM33xx: clock: Add RNG clock data
ARM: OMAP: TI81XX: add always-on powerdomain for TI81XX
ARM: OMAP4: clock: Lock PLLs in the right sequence
ARM: OMAP: AM33XX: hwmod: Add hwmod data for debugSS
...
Pull DMA mapping update from Marek Szyprowski:
"This contains an addition of Device Tree support for reserved memory
regions (Contiguous Memory Allocator is one of the drivers for it) and
changes required by the KVM extensions for PowerPC architectue"
* 'for-v3.12' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: init: add support for reserved memory defined by device tree
drivers: of: add initialization code for dma reserved memory
drivers: of: add function to scan fdt nodes given by path
drivers: dma-contiguous: clean source code and prepare for device tree
Pull ARM updates from Russell King:
"This set includes adding support for Neon acceleration of RAID6 XOR
code from Ard Biesheuvel, cache flushing and barrier updates from Will
Deacon, and a cleanup to the ARM debug code which reduces the amount
of code by about 500 lines.
A few other cleanups, such as constifying the machine descriptors
which already shouldn't be written to, cleaning up the printing of the
L2 cache size"
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (55 commits)
ARM: 7826/1: debug: support debug ll on hisilicon soc
ARM: 7830/1: delay: don't bother reporting bogomips in /proc/cpuinfo
ARM: 7829/1: Add ".text.unlikely" and ".text.hot" to arm unwind tables
ARM: 7828/1: ARMv7-M: implement restart routine common to all v7-M machines
ARM: 7827/1: highbank: fix debug uart virtual address for LPAE
ARM: 7823/1: errata: workaround Cortex-A15 erratum 773022
ARM: 7806/1: allow DEBUG_UNCOMPRESS for Tegra
ARM: 7793/1: debug: use generic option for ep93xx PL10x debug port
ARM: debug: move SPEAr debug to generic PL01x code
ARM: debug: move davinci debug to generic 8250 code
ARM: debug: move keystone debug to generic 8250 code
ARM: debug: remove DEBUG_ROCKCHIP_UART
ARM: debug: provide generic option choices for 8250 and PL01x ports
ARM: debug: move PL01X debug include into arch/arm/include/debug/
ARM: debug: provide PL01x debug uart phys/virt address configuration options
ARM: debug: add support for word accesses to debug/8250.S
ARM: debug: move 8250 debug include into arch/arm/include/debug/
ARM: debug: provide 8250 debug uart phys/virt address configuration options
ARM: debug: provide 8250 debug uart register shift configuration option
ARM: debug: provide 8250 debug uart flow control configuration option
...
Pull KVM updates from Gleb Natapov:
"The highlights of the release are nested EPT and pv-ticketlocks
support (hypervisor part, guest part, which is most of the code, goes
through tip tree). Apart of that there are many fixes for all arches"
Fix up semantic conflicts as discussed in the pull request thread..
* 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (88 commits)
ARM: KVM: Add newlines to panic strings
ARM: KVM: Work around older compiler bug
ARM: KVM: Simplify tracepoint text
ARM: KVM: Fix kvm_set_pte assignment
ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
ARM: KVM: vgic: fix GICD_ICFGRn access
ARM: KVM: vgic: simplify vgic_get_target_reg
KVM: MMU: remove unused parameter
KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate()
KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls
KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX
KVM: x86: update masterclock when kvmclock_offset is calculated (v2)
KVM: PPC: Book3S: Fix compile error in XICS emulation
KVM: PPC: Book3S PR: return appropriate error when allocation fails
arch: powerpc: kvm: add signed type cast for comparation
KVM: x86: add comments where MMIO does not return to the emulator
KVM: vmx: count exits to userspace during invalid guest emulation
KVM: rename __kvm_io_bus_sort_cmp to kvm_io_bus_cmp
kvm: optimize away THP checks in kvm_is_mmio_pfn()
...
On Cortex-A15 CPUs up to and including r0p4, in certain rare sequences
of code, the loop buffer may deliver incorrect instructions. This
workaround disables the loop buffer to avoid the erratum.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
- A couple of fixes to enable LPAE.
- pl08x driver fixes to make it build with ARCH_DMA_ADDR_T_64BIT.
- Avoid L2 related smc calls on Midway.
- Add selecting of necesssary ARM errata.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJSHOrXAAoJEMhvYp4jgsXiPk0IAJfoHwuX1T2JoCNmOwfY4iHQ
8Fc0H4tEklnalNXy2gXWBFup87Ysp6CYf4hvkp1pbAs4LklPbq5XVL0BazfN6JPT
goiX3QZZo/+VBQZC1j+O6VA3wUE5VZjgDKh327n+t5N22jcfk0XgdwOmxFUw5MJp
H23kUB8g/dKtQhI8+GHFBAfFyiCVU8nnZpRx4tBA34cWbKkZlOXAn7Elv6Phcekb
TAEW43kejNowUK01rhLwPAGahaJ2GYI6Ocusw8AsKg+GB/8m3mLcbxGdHnBV0vH4
HLapmhVmCaybSyqXKLyu5CvlBSL1dF0wwxQ1YZrB5mWD9gmhuAOuxo9M/opc9Sc=
=nsop
-----END PGP SIGNATURE-----
Merge tag 'highbank-for-3.12' of git://sources.calxeda.com/kernel/linux into late/all
From Rob Herring:
Updates for Highbank for 3.12:
- A couple of fixes to enable LPAE.
- pl08x driver fixes to make it build with ARCH_DMA_ADDR_T_64BIT.
- Avoid L2 related smc calls on Midway.
- Add selecting of necesssary ARM errata.
* tag 'highbank-for-3.12' of git://sources.calxeda.com/kernel/linux:
ARM: highbank: clean-up some unused includes
ARM: highbank: avoid L2 cache smc calls when PL310 is not present
ARM: move outer_cache declaration out of ifdef
ARM: highbank: select ARCH_DMA_ADDR_T_64BIT for LPAE
DMA: fix printk warning in AMBA PL08x DMA driver
DMA: fix AMBA PL08x compilation issue with 64bit DMA address type
ARM: highbank: select required errata work-arounds
ARM: highbank: select ARCH_HAS_HOLES_MEMORYMODEL
ARM: highbank: enable DMA zone for LPAE
ARM: use phys_addr_t for DMA zone sizes
Signed-off-by: Olof Johansson <olof@lixom.net>
Enable reserved memory initialization from device tree.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Tomasz Figa <t.figa@samsung.com>
[ this is a follow-up to this discussion:
http://archive.arm.linux.org.uk/lurker/message/20130730.230827.a1ceb12a.en.html ]
This patchset renames all uses of "bcm," name bindings to
"brcm," as they were done prior to knowing that brcm had
already been standardized as Broadcom vendor prefix
(in Documentation/devicetree/bindings/vendor-prefixes.txt).
This will not cause any churn on devices because none of
these bindings have made it into production yet.
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Christian Daudt <csd@broadcom.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Currently we have the following output from cache-l2x0:
l2x0: 16 ways, CACHE_ID 0x410000c7, AUX_CTRL 0x32070000, Cache size: 1048576 B
Using kB for the cache size can improve readability a bit:
l2x0: 16 ways, CACHE_ID 0x410000c7, AUX_CTRL 0x32070000, Cache size: 1024 kB
While at it use pr_info.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit f6f91b0d9f ("ARM: allow kuser helpers to be removed from the
vector page") introduced some help text for the CONFIG_KUSER_HELPERS
option which is rather contradictory.
Let's fix that, and improve it a little.
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add support for suspend/resume operations. The implemented procedures
are identical to the ones for ARM926.
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
In order to specify a DMA zone size of 4GB on LPAE systems, the sizes need
to be 64-bit. So make machine_desc.dma_zone_size and arm_dma_zone_size be
phys_addr_t instead of unsigned long.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
writel_relaxed and spin_unlock are both store operations, so we only
need to enforce store ordering in the dsb.
Signed-off-by: Will Deacon <will.deacon@arm.com>
System-wide barriers aren't required for situations where we only need
to make visibility and ordering guarantees in the inner-shareable domain
(i.e. we are not dealing with devices or potentially incoherent CPUs).
This patch changes the v7 TLB operations, coherent_user_range and
dcache_clean_area functions to user inner-shareable barriers. For cache
maintenance, only the store access type is required to ensure completion.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Inner-shareable TLB invalidation is typically more expensive than local
(non-shareable) invalidation, so performing the broadcasting for
local_flush_tlb_* operations is a waste of cycles and needlessly
clobbers entries in the TLBs of other CPUs.
This patch introduces __flush_tlb_* versions for many of the TLB
invalidation functions, which only respect inner-shareable variants of
the invalidation instructions when presented with the TLB_V7_UIS_FULL
flag. The local version is also inlined to prevent SMP_ON_UP kernels
from missing flushes, where the __flush variant would be called with
the UP flags.
This gains us around 0.5% in hackbench scores for a dual-core A15, but I
would expect this to improve as more cores (and clusters) are added to
the equation.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The kernel TLB range invalidation functions already contain dsb
instructions before and after the maintenance, so there is no need to
introduce additional barriers.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
If kuser helpers are not provided by the kernel, disable user access to
the vectors page. With the kuser helpers gone, there is no reason for
this page to be visible to userspace.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a kernel configuration option to allow the kernel user helpers
to be removed from the vector page, thereby preventing their use with
ROP (return orientated programming) attacks. This option is only
visible for CPU architectures which natively support all the operations
which kernel user helpers would normally provide, and must be enabled
with caution.
Cc: <stable@vger.kernel.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the machine vector stubs into the page above the vector page,
which we can prevent from being visible to userspace. Also move
the reset stub, and place the swi vector at a location that the
'ldr' can get to it.
This hides pointers into the kernel which could give valuable
information to attackers, and reduces the number of exploitable
instructions at a fixed address.
Cc: <stable@vger.kernel.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
General forms of huge_pte_alloc, huge_pte_offset and follow_huge_pmd
are now available in mm/hugetlb.c.
This patch removes the ARM copies of these functions and activates
the general ones by enabling:
CONFIG_ARCH_WANT_GENERAL_HUGETLB
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
struct machine_desc records are defined everywhere as a 'const'
structure, but unfortuantely it loses its const-ness through the use of
linker magic - the symbols which surround the section are not declared
const so it becomes possible not to use 'const' for pointers to these
const structures.
Let's fix this oversight - all pointers to these structures should be
marked const too.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)) causes the following undefined instruction error on a mx53 (Cortex-A8):
Internal error: Oops - undefined instruction: 0 [#1] SMP ARM
CPU: 0 PID: 275 Comm: modprobe Not tainted 3.11.0-rc2-next-20130722-00009-g9b0f371 #881
task: df46cc00 ti: df48e000 task.ti: df48e000
PC is at check_and_switch_context+0x17c/0x4d0
LR is at check_and_switch_context+0xdc/0x4d0
This problem happens because check_and_switch_context() calls dummy_flush_tlb_a15_erratum() without checking if we are really running on a Cortex-A15 or not.
To avoid this issue, only call dummy_flush_tlb_a15_erratum() inside
check_and_switch_context() if erratum_a15_798181() returns true, which means that we are really running on a Cortex-A15.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On some PAE architectures, the entire range of physical memory could reside
outside the 32-bit limit. These systems need the ability to specify the
initrd location using 64-bit numbers.
This patch globally modifies the early_init_dt_setup_initrd_arch() function to
use 64-bit numbers instead of the current unsigned long.
There has been quite a bit of debate about whether to use u64 or phys_addr_t.
It was concluded to stick to u64 to be consistent with rest of the device
tree code. As summarized by Geert, "The address to load the initrd is decided
by the bootloader/user and set at that point later in time. The dtb should not
be tied to the kernel you are booting"
More details on the discussion can be found here:
https://lkml.org/lkml/2013/6/20/690https://lkml.org/lkml/2012/9/13/544
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
When map_lowmem() runs, and processes a memory bank whose start or end
is not section-aligned, memory must be allocated to store the 2nd-level
page tables. Those allocations are made by calling memblock_alloc().
At this point, the only memory that is free *and* mapped is memory which
has already been mapped by map_lowmem() itself. For this reason, we must
calculate the first point at which map_lowmem() will need to allocate
memory, and set the memblock allocation limit to a lower address, so that
memblock_alloc() is guaranteed to return memory that is already mapped.
This patch enhances sanity_check_meminfo() to calculate that memory
address, and pass it to memblock_set_current_limit(), rather than just
assuming the limit is arm_lowmem_limit.
The algorithm applied is:
* Default memblock_limit to arm_lowmem_limit in the absence of any other
limit; arm_lowmem_limit is the highest memory that is mapped by
map_lowmem().
* While walking the list of memblocks, if the start of a block is not
aligned, 2nd-level page tables will need to be allocated to map the
first few pages of the block. Hence, the memblock_limit must be before
the start of the block.
* Similarly, if the end of any block is not aligned, 2nd-level page
tables will need to be allocated to map the last few pages of the
block. Hence, the memblock_limit must point at the end of the block,
rounded down to section-alignment.
* The memory blocks are assumed to be sorted in address order, so the
first unaligned block start or end is used to set the limit.
With this algorithm, the start or end of almost any bank can be non-
section-aligned. The only exception is that the start of bank 0 must
be section-aligned, since otherwise memory would need to be allocated
when mapping the start of bank 0, which occurs before any free memory
is mapped.
[swarren, wrote commit description, rewrote calculation of memblock_limit]
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit ae8a8b9553 ("ARM: 7691/1: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE
and use ALT_SMP instead") added early function returns for page table
cache flushing operations on ARMv7 SMP CPUs.
Unfortunately, when targetting Thumb-2, these `mov pc, lr' sequences
assemble to 2 bytes which can lead to corruption of the instruction
stream after code patching.
This patch fixes the alternates to use wide (32-bit) instructions for
Thumb-2, therefore ensuring that the patching code works correctly.
Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
and are flagged as __cpuinit -- so if we remove the __cpuinit from
the arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
related content into no-ops as early as possible, since that will get
rid of these warnings. In any case, they are temporary and harmless.
This removes all the ARM uses of the __cpuinit macros from C code,
and all __CPUINIT from assembly code. It also had two ".previous"
section statements that were paired off against __CPUINIT
(aka .section ".cpuinit.text") that also get removed here.
[1] https://lkml.org/lkml/2013/5/20/589
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Pull ARM fixes from Russell King:
"A few fixes for ARM, mostly just one liners with the exception of the
missing section specification. We decided not to rely on .previous to
fix this but to explicitly state the section we want the code to be
in."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7778/1: smp_twd: twd_update_frequency need be run on all online CPUs
ARM: 7782/1: Kconfig: Let ARM_ERRATA_364296 not depend on CONFIG_SMP
ARM: mm: fix boot on SA1110 Assabet
ARM: 7781/1: mmu: Add debug_ll_io_init() mappings to early mappings
ARM: 7780/1: add missing linker section markup to head-common.S
Since all architectures have been converted to use vm_unmapped_area(),
there is no remaining use for the free_area_cache.
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 83db0384 (mm/ARM: use common help functions to free reserved
pages) broke booting on the Assabet by trying to convert a PFN to
a virtual address using the __va() macro. This macro takes the
physical address, not a PFN. Fix this.
Cc: <stable@vger.kernel.org> # 3.10
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Failure to add the mapping created in debug_ll_io_init() can lead
to the BUG_ON() triggering in lib/ioremap.c:27 if the static
virtual address decided for the debug_ll mapping overlaps with
another mapping that is created later. This happens because the
generic ioremap code has no idea there is a mapping there and it
tries to place a mapping in the same location and blows up when
it sees that there is a pte already present.
kernel BUG at lib/ioremap.c:27!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc2-00042-g2af0c67-dirty #316
task: ef088000 ti: ef082000 task.ti: ef082000
PC is at ioremap_page_range+0x16c/0x198
LR is at ioremap_page_range+0xf0/0x198
pc : [<c04cb874>] lr : [<c04cb7f8>] psr: 20000113
sp : ef083e78 ip : af140000 fp : ef083ebc
r10: ef7fc100 r9 : ef7fc104 r8 : 000af174
r7 : 00000647 r6 : beffffff r5 : f004c000 r4 : f0040000
r3 : af173417 r2 : 16440653 r1 : af173e07 r0 : ef7fc8fc
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
Control: 10c5787d Table: 8020406a DAC: 00000015
Process swapper/0 (pid: 1, stack limit = 0xef082238)
Stack: (0xef083e78 to 0xef084000)
3e60: 00040000 ef083eec
3e80: bf134000 f004bfff c0207c00 f004c000 c02fc120 f000c000 c15e7800 00040000
3ea0: ef083eec 00000647 c098ba9c c0953544 ef083edc ef083ec0 c021b82c c04cb714
3ec0: c09cdc50 00000040 ef0f1e00 ef1003c0 ef083f14 ef083ee0 c09535bc c021b7bc
3ee0: c0953544 c04d0c6c c094e2cc c1600be4 c07440c4 c09a6888 00000002 c0a15f00
3f00: ef082000 00000000 ef083f54 ef083f18 c0208728 c0953550 00000002 c1600bfc
3f20: c08e3fac c0839918 ef083f54 c1600b80 c09a6888 c0a15f00 0000008b c094e2cc
3f40: c098ba9c c098bab8 ef083f94 ef083f58 c094ea0c c020865c 00000002 00000002
3f60: c094e2cc 00000000 c025b674 00000000 c06ff860 00000000 00000000 00000000
3f80: 00000000 00000000 ef083fac ef083f98 c06ff878 c094e910 00000000 00000000
3fa0: 00000000 ef083fb0 c020efe8 c06ff86c 00000000 00000000 00000000 00000000
3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 c0595108
[<c04cb874>] (ioremap_page_range+0x16c/0x198) from [<c021b82c>] (__alloc_remap_buffer.isra.18+0x7c/0xc4)
[<c021b82c>] (__alloc_remap_buffer.isra.18+0x7c/0xc4) from [<c09535bc>] (atomic_pool_init+0x78/0x128)
[<c09535bc>] (atomic_pool_init+0x78/0x128) from [<c0208728>] (do_one_initcall+0xd8/0x198)
[<c0208728>] (do_one_initcall+0xd8/0x198) from [<c094ea0c>] (kernel_init_freeable+0x108/0x1d0)
[<c094ea0c>] (kernel_init_freeable+0x108/0x1d0) from [<c06ff878>] (kernel_init+0x18/0xf4)
[<c06ff878>] (kernel_init+0x18/0xf4) from [<c020efe8>] (ret_from_fork+0x14/0x20)
Code: e50b0040 ebf54b2f e51b0040 eaffffee (e7f001f2)
Fix it by telling generic layers about the static mapping via
iotable_init(). This also has the nice side effect of letting
you see the mapping in procfs' vmallocinfo file.
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull ARM DMA mapping updates from Marek Szyprowski:
"This contains important bugfixes and an update for IOMMU integration
support for ARM architecture"
* 'for-v3.11' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: dma: Drop __GFP_COMP for iommu dma memory allocations
ARM: DMA-mapping: mark all !DMA_TO_DEVICE pages in unmapping as clean
ARM: dma-mapping: NULLify dev->archdata.mapping pointer on detach
ARM: dma-mapping: convert DMA direction into IOMMU protection attributes
ARM: dma-mapping: Get pages if the cpu_addr is out of atomic_pool
Prepare for removing num_physpages and simplify mem_init().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it. With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().
With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Address more review comments from last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
pattern '0'. This could be used to get rid of poison_init_mem()
on ARM64.
2) A previous patch has disabled memory poison for initmem on s390
by mistake, so restore to the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:
arch/arm/mm/init.c: In function 'mem_init':
arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
^
In file included from include/linux/mman.h:4:0,
from arch/arm/mm/init.c:15:
include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
In file included from arch/mips/include/asm/page.h:49:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:4,
from include/linux/mm.h:8,
from mm/page_alloc.c:18:
arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
mm/page_alloc.c: In function 'free_area_init_nodes':
mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
Also address some minor code review comments.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARM updates from Russell King:
"This contains the usual updates from other people (listed below) and
the usual random muddle of miscellaneous ARM updates which cover some
low priority bug fixes and performance improvements.
I've started to put the pull request wording into the merge commits,
which are:
- NoMMU stuff:
This includes the following series sent earlier to the list:
- nommu-fixes
- R7 Support
- MPU support
I've left out the ARCH_MULTIPLATFORM/!MMU stuff that Arnd and I
were discussing today until we've reached a conclusion/that's had
some more review.
This is rebased (and re-tested) on your devel-stable branch because
otherwise there were going to be conflicts with Uwe's V7M work now
that you've merged that. I've included the fix for limiting MPU to
CPU_V7.
- Huge page support
These changes bring both HugeTLB support and Transparent HugePage
(THP) support to ARM. Only long descriptors (LPAE) are supported
in this series.
The code has been tested on an Arndale board (Exynos 5250).
- LPAE updates
Please pull these miscellaneous LPAE fixes I've been collecting for
a while now for 3.11. They've been tested and reviewed by quite a
few people, and most of the patches are pretty trivial. -- Will Deacon.
- arch_timer cleanups
Please pull these arch_timer cleanups I've been holding onto for a
while. They're the same as my last posting, but have been rebased
to v3.10-rc3.
- mpidr linearisation (multiprocessor id register - identifies which
CPU number we are in the system)
This patch series that implements MPIDR linearization through a
simple hashing algorithm and updates current cpu_{suspend}/{resume}
code to use the newly created hash structures to retrieve context
pointers. It represents a stepping stone for the implementation of
power management code on forthcoming multi-cluster ARM systems.
It has been tested on TC2 (dual cluster A15xA7 system), iMX6q,
OMAP4 and Tegra, with processors hitting low-power states requiring
warm-boot resume through the cpu_resume code path"
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (77 commits)
ARM: 7775/1: mm: Remove do_sect_fault from LPAE code
ARM: 7777/1: Avoid extra calls to the C compiler
ARM: 7774/1: Fix dtb dependency to use order-only prerequisites
ARM: 7770/1: remove residual ARMv2 support from decompressor
ARM: 7769/1: Cortex-A15: fix erratum 798181 implementation
ARM: 7768/1: prevent risks of out-of-bound access in ASID allocator
ARM: 7767/1: let the ASID allocator handle suspended animation
ARM: 7766/1: versatile: don't mark pen as __INIT
ARM: 7765/1: perf: Record the user-mode PC in the call chain.
ARM: 7735/2: Preserve the user r/w register TPIDRURW on context switch and fork
ARM: kernel: implement stack pointer save array through MPIDR hashing
ARM: kernel: build MPIDR hash function data structure
ARM: mpu: Ensure that MPU depends on CPU_V7
ARM: mpu: protect the vectors page with an MPU region
ARM: mpu: Allow enabling of the MPU via kconfig
ARM: 7758/1: introduce config HAS_BANDGAP
ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting
ARM: 7751/1: zImage: don't overwrite ourself with a page table
ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock
ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace
...
These changes are all to SoC-specific code, a total of 33 branches on
17 platforms were pulled into this. Like last time, Renesas sh-mobile
is now the platform with the most changes, followed by OMAP and EXYNOS.
Two new platforms, TI Keystone and Rockchips RK3xxx are added in
this branch, both containing almost no platform specific code at all,
since they are using generic subsystem interfaces for clocks, pinctrl,
interrupts etc. The device drivers are getting merged through the
respective subsystem maintainer trees.
One more SoC (u300) is now multiplatform capable and several others
(shmobile, exynos, msm, integrator, kirkwood, clps711x) are moving
towards that goal with this series but need more work.
Also noteworthy is the work on PCI here, which is traditionally part of
the SoC specific code. With the changes done by Thomas Petazzoni, we can
now more easily have PCI host controller drivers as loadable modules and
keep them separate from the platform code in drivers/pci/host. This has
already led to the discovery that three platforms (exynos, spear and imx)
are actually using an identical PCIe host controller and will be able
to share a driver once support for spear and imx is added.
Conflicts:
* asm/glue-proc.h has one CPU type getting added that conflicts
with another addition in 3.10-rc7
* Simple context changes in arch/arm/Makefile and arch/arm/Kconfig
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUdLnpmCrR//JCVInAQLoFRAAyatR+MhVFwc91cO7yDw/mz81RO1V9jEd
QMufoWi0BRfBsubqxnGlb510EEMTz7gxdrlYPILYNr8TqR+lNGhjKt2FQAjN3q2O
IBvu4x8C+xcxnMNbkCnTQRxP/ziK6yCI6e7enQhwuMuJwvsnJtGbsqKi5ODMw6x0
o5EQmIdj5NhhSJqJZPCmWsKbx100TH1UwaEnhNl0DSaFj51n3bVRrK6Nxce10GWZ
HsS1/a63lq/YZLkwfUEvgin/PU9Jx5jMmqhlp3bZjG+f1ItdzJF+9IgS248vCIi2
ystzWCH88Kh69UFcYFfCjeZe8H45XcP+Zykd8WC0DvF/a7Hwk5KTKE/ciT6RPRxb
rkWW5EwjqZL9w9cU3rUHWtSVenayQMMEmCfksadr1AExyCrhPqfs9RINyBs2lK5a
q2bdSFbXZsNzSyL+3yQAfChvRo1/2FdlFVQy+oVUCActV7L77Y7y6jl+b2qzFsSu
xMKwvC/1vDXTvOnGk6A/qJu7yrHpqJrvw1eI+wnMswNBl7lCTgyyHnr5y8S092jI
KU4hmSxsYP+y13HmKy4ewPy9DYJYBTSdReKfEFo79Dx8eqySAWjHFL/OPRqhCUYS
kBq0eZpVZO7tJnHRaRz8n93wIYzb1UOhhgVwxdjPZF9L4d/jzh1BCv0OBWv8IXCu
uWLAi92lL24=
=0r9S
-----END PGP SIGNATURE-----
Merge tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC specific changes from Arnd Bergmann:
"These changes are all to SoC-specific code, a total of 33 branches on
17 platforms were pulled into this. Like last time, Renesas sh-mobile
is now the platform with the most changes, followed by OMAP and
EXYNOS.
Two new platforms, TI Keystone and Rockchips RK3xxx are added in this
branch, both containing almost no platform specific code at all, since
they are using generic subsystem interfaces for clocks, pinctrl,
interrupts etc. The device drivers are getting merged through the
respective subsystem maintainer trees.
One more SoC (u300) is now multiplatform capable and several others
(shmobile, exynos, msm, integrator, kirkwood, clps711x) are moving
towards that goal with this series but need more work.
Also noteworthy is the work on PCI here, which is traditionally part
of the SoC specific code. With the changes done by Thomas Petazzoni,
we can now more easily have PCI host controller drivers as loadable
modules and keep them separate from the platform code in
drivers/pci/host. This has already led to the discovery that three
platforms (exynos, spear and imx) are actually using an identical PCIe
host controller and will be able to share a driver once support for
spear and imx is added."
* tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (480 commits)
ARM: integrator: let pciv3 use mem/premem from device tree
ARM: integrator: set local side PCI addresses right
ARM: dts: Add pcie controller node for exynos5440-ssdk5440
ARM: dts: Add pcie controller node for Samsung EXYNOS5440 SoC
ARM: EXYNOS: Enable PCIe support for Exynos5440
pci: Add PCIe driver for Samsung Exynos
ARM: OMAP5: voltagedomain data: remove temporary OMAP4 voltage data
ARM: keystone: Move CPU bringup code to dedicated asm file
ARM: multiplatform: always pick one CPU type
ARM: imx: select syscon for IMX6SL
ARM: keystone: select ARM_ERRATA_798181 only for SMP
ARM: imx: Synertronixx scb9328 needs to select SOC_IMX1
ARM: OMAP2+: AM43x: resolve SMP related build error
dmaengine: edma: enable build for AM33XX
ARM: edma: Add EDMA crossbar event mux support
ARM: edma: Add DT and runtime PM support to the private EDMA API
dmaengine: edma: Add TI EDMA device tree binding
arm: add basic support for Rockchip RK3066a boards
arm: add debug uarts for rockchip rk29xx and rk3xxx series
arm: Add basic clocks for Rockchip rk3066a SoCs
...
This contains cleanups as preparation for other branches adding new
features, we pulled 16 branches for 9 platforms into this one.
Most notable here is the removal of support for ATAGS based OMAP4
systems. Since all OMAP4 machines are fully functional with DT based
booting in 3.10, we can remove a lot of code here.
Also noteworthy is Maxime Ripard's cleanup of the machine descriptors,
which means we need no machine descriptors in a lot more cases and
can boot additional machines by just having the respective device
drivers enabled.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUdLnoGCrR//JCVInAQL5bw/+OZeE60sBJxgDAf9XEYls9t5cnLY963uE
izgsyLKwwAi21Xbg/0vgGZLbpdfyd2IJa+bWXhxVTLFI43Hb0D2x1hYyMzy/fFWj
gmqQ4dLYawYj+1sOirTPWDquR31mavofmMF2HVk23S6NWmNIjPk1+7Wgd46Y4vNX
7T6j4cg9HPrxQ37a6ucOuEX6+rqmFe2Q+v7qcsXkwxkVxgIC8V4MgHJmt8gGMRvB
HHrY1kPBHlMJm07fLngilAfpa8G91fmgKxSfugeClyKotj7lHxno/lh/+oxMGvaJ
J9memdfbYISLSvDLeH6Rib/zaC7VnSij9QtZmFtToiJ6qVVZiLFd2dpP+ccpaMGb
YEvm58ayajAvb0wZMueoeKs9yW0UWCdXdkzKbhuWrwmPDjKSb9f9t1u3ons8vl+2
dOwlTex9/ijsxu1qTHMm4/EVg+NR/AwVVwiBRG9sYnfxSHkgXxPW5TF7T4d1H71v
WZkXWsJKIUDQgk+2nnE4J9TvFlPyaV09yFYyiY/+DWAs9DUus8cf37nt2Wz6H1l6
THQcQDcZsPLZsSIEdzUrchdLKDZHzhkb3i8pae7TC6CySiOzj/yX2zbh9Ot538WG
2C7qzAVtyMVuAdFh0caIu0iVXqjsnJLAZGImLZQySR00aq34uXh7MsHepnhAI10q
EQ1vILi4mU4=
=XDJs
-----END PGP SIGNATURE-----
Merge tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC cleanups from Arnd Bergmann:
"This contains cleanups as preparation for other branches adding new
features, we pulled 16 branches for 9 platforms into this one.
Most notable here is the removal of support for ATAGS based OMAP4
systems. Since all OMAP4 machines are fully functional with DT based
booting in 3.10, we can remove a lot of code here.
Also noteworthy is Maxime Ripard's cleanup of the machine descriptors,
which means we need no machine descriptors in a lot more cases and can
boot additional machines by just having the respective device drivers
enabled."
* tag 'cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (76 commits)
ARM: picoxcell: remove .nr_irqs reference
ARM: s5p64x0: avoid build warning for uncompress.h
ARM: SAMSUNG: Remove unused plat/regs-watchdog.h header
ARM: SAMSUNG: Remove legacy watchdog reset code
ARM: SAMSUNG: Let platforms use the new watchdog reset driver
ARM: SAMSUNG: Add watchdog reset driver
ARM: SAMSUNG: Use local definitions of watchdog registers
watchdog: s3c2410_wdt: Use local register definitions
ARM: S5P64X0: Use common uncompress.h part for plat-samsung
ARM: SAMSUNG: Consolidate uncompress subroutine
ARM: at91: drop rm9200dk board support
ARM: dts: msm: Fix merge resolution
ARM: OMAP1: Remove dma.h
ARM: OMAP1: Remove legacy irda.h and irda setup from board files
ARM: OMAP1: Remove duplicated DMA channel definitions
ARM: OMAP1: Remove McBSP DMA channel definitions
ARM: OMAP2+: Remove dma.h
ARM: OMAP2+: hwmod: Remove remaining DMA channel definitions
ARM: OMAP2+: Remove duplicated DMA channel definitions
ARM: OMAP2+: Remove AES crypto device DMA channel definitions
...
We want to use CMA for allocating hash page table and real mode area for
PPC64. Hence move DMA contiguous related changes into a seperate config
so that ppc64 can enable CMA without requiring DMA contiguous.
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[removed defconfig changes]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
For LPAE, do_sect_fault used to be invoked as the second level access
flag handler. When transparent huge pages were introduced for LPAE,
do_page_fault was used instead.
Unfortunately, do_sect_fault remains defined but not used for LPAE code
resulting in a compile warning.
This patch surrounds do_sect_fault with #ifndef CONFIG_ARM_LPAE to fix
this warning.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
__iommu_alloc_buffer wants to split pages after allocation in order to
reduce the memory footprint. This does not work well with __GFP_COMP
pages, so drop this flag before allocation
One failure example is snd_malloc_dev_pages call dma_alloc_coherent with
__GFP_COMP.
Signed-off-by: Richard Zhao <rizhao@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
It is common for one sg to include many pages, so mark all these
pages as clean to avoid unnecessary flushing on them in
set_pte_at() or update_mmu_cache().
The patch might improve loading performance of applciation code a bit.
On the below test code to read file(~1GByte size) from usb mass storage
disk to buffer created with mmap(PROT_READ | PROT_EXEC) on
Pandaboard, average ~1% improvement can be observed with the patch on
10 times test.
unsigned int sum = 0;
static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
return (tv2->tv_sec - tv1->tv_sec) * 1000000 +
(tv2->tv_usec - tv1->tv_usec);
}
int main(int argc, char *argv[])
{
char *mbuffer;
int fd;
int i;
unsigned long page_size, size;
struct stat stat;
struct timeval t1, t2;
page_size = getpagesize();
fd = open(argv[1], O_RDONLY);
assert(fd >= 0);
fstat(fd, &stat);
size = stat.st_size;
printf("%s: file %s, file size %lu, page size %lu\n", argv[0],
read_filename, size, page_size);
gettimeofday(&t1, NULL);
mbuffer = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
for (i = 0 ; i < size ; i += page_size)
sum += mbuffer[i];
munmap(mbuffer, page_size);
gettimeofday(&t2, NULL);
printf("\tread mmaped time: %luus\n", tv_diff(&t1, &t2));
close(fd);
}
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
The current code only clobbers a local variable, so the device is left
with a stale mapping pointer.
Cc: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
IOMMU mappings take a prot parameter, identifying the protection bits
to enforce on the newly created mapping (READ or WRITE). The ARM
dma-mapping framework currently just passes 0 as the prot argument,
resulting in faulting mappings.
This patch infers the protection attributes based on the direction of
the DMA transfer.
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
In __iommu_get_pages(), the cpu_addr is checked wheather in
atomic_pool range or not. So if the cpu_addr is in atomic_pool
range, it does not need to check twice.
Signed-off-by: YoungJun Cho <yj44.cho@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Looking into the active_asids array is not enough, as we also need
to look into the reserved_asids array (they both represent processes
that are currently running).
Also, not holding the ASID allocator lock is racy, as another CPU
could schedule that process and trigger a rollover, making the erratum
workaround miss an IPI.
Exposing this outside of context.c is a little ugly on the side, so
let's define a new entry point that the erratum workaround can call
to obtain the cpumask.
Cc: <stable@vger.kernel.org> # 3.9
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On a CPU that never ran anything, both the active and reserved ASID
fields are set to zero. In this case the ASID_TO_IDX() macro will
return -1, which is not a very useful value to index a bitmap.
Instead of trying to offset the ASID so that ASID #1 is actually
bit 0 in the asid_map bitmap, just always ignore bit 0 and start
the search from bit 1. This makes the code a bit more readable,
and without risk of OoB access.
Cc: <stable@vger.kernel.org> # 3.9
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When a CPU is running a process, the ASID for that process is
held in a per-CPU variable (the "active ASIDs" array). When
the ASID allocator handles a rollover, it copies the active
ASIDs into a "reserved ASIDs" array to ensure that a process
currently running on another CPU will continue to run unaffected.
The active array is zero-ed to indicate that a rollover occurred.
Because of this mechanism, a reserved ASID is only remembered for
a single rollover. A subsequent rollover will completely refill
the reserved ASIDs array.
In a severely oversubscribed environment where a CPU can be
prevented from running for extended periods of time (think virtual
machines), the above has a horrible side effect:
[P{a} denotes process P running with ASID a]
CPU-0 CPU-1
A{x} [active = <x 0>]
[suspended] runs B{y} [active = <x y>]
[rollover:
active = <0 0>
reserved = <x y>]
runs B{y} [active = <0 y>
reserved = <x y>]
[rollover:
active = <0 0>
reserved = <0 y>]
runs C{x} [active = <0 x>]
[resumes]
runs A{x}
At that stage, both A and C have the same ASID, with deadly
consequences.
The fix is to preserve reserved ASIDs across rollovers if
the CPU doesn't have an active ASID when the rollover occurs.
Cc: <stable@vger.kernel.org> # 3.9
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Carinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This commit fixes the regression on Armada 370 (the kernal hang during
boot) introduced by the commit: "ARM: 7691/1: mm: kill unused
TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead".
When coming out of either a Wait for Interrupt (WFI) or a Wait for
Event (WFE) IDLE states, a specific timing sensitivity exists between
the retiring WFI/WFE instructions and the newly issued subsequent
instructions. This sensitivity can result in a CPU hang scenario. The
workaround is to insert either a Data Synchronization Barrier (DSB) or
Data Memory Barrier (DMB) command immediately after the WFI/WFE
instruction.
This commit was based on the work of Lior Amsalem, but heavily
modified to apply the errata fix dynamically according to the
processor type thanks to the suggestions of Russell King and Nicolas
Pitre.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 1bc3974 (ARM: 7755/1: handle user space mapped pages in
flush_kernel_dcache_page) moved the implementation of
flush_kernel_dcache_page() into mm/flush.c but did not implement it
on noMMU ARM.
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Cc: <stable@vger.kernel.org> # 3.2+: 1bc3974: ARM: 7755/1
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As it was already suggested by Russell King and Arnd Bergmann:
https://lkml.org/lkml/2013/5/16/133
moxart and gemini seem to be the only platforms using CPU_FA526,
and instead of pointing arm_pm_idle to an empty function from
platform code, it makes sense to remove WFI code from the processor
specific idle function.
Applies to arm-soc/for-next (and 3.10-rc1).
Changes since v1:
1. remove WFI but make sure cpu_fa526_do_idle do not fall through
to cpu_fa526_dcache_clean_area
Note: moxart boots and prints to UART without this patch, but input is broken.
Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Conflicts:
arch/arm/kernel/smp.c
Please pull these miscellaneous LPAE fixes I've been collecting for a while
now for 3.11. They've been tested and reviewed by quite a few people, and most
of the patches are pretty trivial. -- Will Deacon.
These changes bring both HugeTLB support and Transparent HugePage
(THP) support to ARM. Only long descriptors (LPAE) are supported
in this series.
The code has been tested on an Arndale board (Exynos 5250).
Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that
the pages it needs to handle are kernel mapped only. However, for
example when doing direct I/O, pages with user space mappings may
occur.
Thus, continue to do lazy flushing if there are no user space
mappings. Otherwise, flush the kernel cache lines directly.
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org> # 3.2+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This commit fixes the ID and mask for the PJ4B which was too
restrictive and didn't match the CPU of the Armada 370 SoC.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This bug was introduced in commit e651eab0.
Some v4/v5 platforms failed to boot due to this.
Signed-off-by: Po-Yu Chuang <ratbert.chuang@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On Cortex-A9 before version r1p0, the LoUIS bit field of the CLIDR
register returns zero when it should return one. This leads to cache
maintenance operations which rely on this value to not function as
intended, causing data corruption.
The workaround for this errata is to detect affected CPUs and correct
the LoUIS value read.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Much like with the MMU, MPU initialisation is performed in two stages; the
first in the pre-C world and the 'real' initialisation during arch setup.
This patch wires in previously added MPU initialisation functions so that
the whole of memory is mapped with the appropriate region properties for
'normal' RAM (the appropriate properties depend on whether the system is
SMP).
Stub initialisation functions are added for the case that there MPU support
is not configured in to the kernel.
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
CC: Hyok S. Choi <hyok.choi@samsung.com>
This patch adds new functions for probing and initialising the ARMv7
PMSA-compliant MPU.
These use the pre-defined and reserved MPU_PROBE_REGION for establishing
properties of the MPU, which is necessary because certain probe operations
require modifying region properties and reading back the results.
This patch also introduces a minimal sanity_check_meminfo_mpu function, that
ensures that the memory set-up passed to the kernel can be used in conjunction
with the MPU. The base address of a region must be aligned to the region size,
otherwise behavior is unpredictable and region sizes can only be specified as a
power-of-two. To simplify the satisfaction of these requirements this
implementation currently enforces that all memory is contiguous from
PHYS_OFFSET, merging banks that are contiguous but passed in separately.
The functions are added in this patch but wired in to the boot process later
in the series.
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
CC: Hyok S. Choi <hyok.choi@samsung.com>
This patch adds processor info for ARM Ltd. Cortex-R7.
The R7 has many similarities to the A9 and though the ACTLR layout is not
identical, the bits associated with cache operations broadcasting and SMP
modes are the same for A9, A5 and R7 (Though in the A-class processors the
same bits toggle TLB-ops broadcasting as well as cache-ops)
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Stephen Boyd <sboyd@codeaurora.org>
Currently CPU_V7 selects CPU_CP15_MMU, however in the case of a V7 CPU
implementing the PMSA, such as the Cortex-R7, the CP15_MMU operations are
not available. Selecting CPU_CP15_MPU is appropriate in this case.
This patch makes CPU_CP15_MMU dependent on the use of the MMU, selecting
CPU_CP15_MPU for v7 processors when !MMU is chosen.
Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
The ARM CPU suspend code can be selected even for a !CONFIG_MMU
configuration. The resulting kernel will not compile and, even if it did,
would access undefined co-processor registers when executing.
This patch fixes the v6 and v7 CPU suspend code for the nommu case.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Tested-by: Jonathan Austin <jonathan.austin@arm.com>
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> (commit_signer:1/3=33%)
CC: Santosh Shilimkar <santosh.shilimkar@ti.com> (commit_signer:1/3=33%)
CC: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Currently flush_dcache_page() thinks pages as non-mapped if
mapping_mapped(mapping) return false. This approach is very
coase:
- mmap on part of file may cause all pages backed on
the file being thought as mmaped
- file-backed pages aren't mapped into user space actually
if the memory mmaped on the file isn't accessed
This patch uses page_mapped() to decide if the page has been
mapped.
From the attached test code, I find there is much performance
improvement(>25%) when accessing page caches via read under this
situations, so memcpy benefits a lot from not flushing cache
under this situation.
No. read time without the patch No. read time with the patch
================================================================
No. 0, time 22615636 us No. 0, time 22014717 us
No. 1, time 4387851 us No. 1, time 3113184 us
No. 2, time 4276535 us No. 2, time 3005244 us
No. 3, time 4259821 us No. 3, time 3001565 us
No. 4, time 4263811 us No. 4, time 3002748 us
No. 5, time 4258486 us No. 5, time 3004104 us
No. 6, time 4253009 us No. 6, time 3002188 us
No. 7, time 4262809 us No. 7, time 2998196 us
No. 8, time 4264525 us No. 8, time 3007255 us
No. 9, time 4267795 us No. 9, time 3005094 us
1), No.0. is to read the file from storage device, and others are
to read the file from page caches basically.
2), file size is 512M, and is on ext4 over usb mass storage.
3), the test is done on Pandaboard.
unsigned int sum = 0;
unsigned long sum_val = 0;
static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
return (tv2->tv_sec - tv1->tv_sec) * 1000000 +
(tv2->tv_usec - tv1->tv_usec);
}
int main(int argc, char *argv[])
{
char *mbuf, fbuf;
int fd;
int i;
unsigned long page_size, size;
struct stat stat;
struct timeval t1, t2;
unsigned char *rbuf = malloc(32 * page_size);
if (!rbuf) {
printf(" %sn", "malloc failed");
exit(-1);
}
page_size = getpagesize();
fd = open(argv[1], O_RDWR);
assert(fd >= 0);
fstat(fd, &stat);
size = stat.st_size;
printf("%s: file %s, size %lu, page size %lun",
argv[0],
argv[1], size, page_size);
gettimeofday(&t1, NULL);
mbuf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if (!mbuf) {
printf(" %sn", "mmap failed");
exit(-1);
}
for (i = 0 ; i < size ; i += (page_size * 32)) {
int rcnt;
lseek(fd, i, SEEK_SET);
rcnt = read(fd, rbuf, page_size * 32);
if (rcnt != page_size * 32) {
printf("%s: read faildn", __func__);
exit(-1);
}
}
free(rbuf);
munmap(mbuf, size);
gettimeofday(&t2, NULL);
printf("tread mmaped time: %luusn", tv_diff(&t1, &t2));
close(fd);
}
Cc: Michel Lespinasse <walken@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The patch adds support for THP (transparent huge pages) to LPAE
systems. When this feature is enabled, the kernel tries to map
anonymous pages as 2MB sections where possible.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@linaro.org: symbolic constants used, value of
PMD_SECT_SPLITTING adjusted, tlbflush.h included in pgtable.h,
added PROT_NONE support.]
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
This patch adds support for hugetlbfs based on the x86 implementation.
It allows mapping of 2MB sections (see Documentation/vm/hugetlbpage.txt
for usage). The 64K pages configuration is not supported (section size
is 512MB in this case).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@linaro.org: symbolic constants replace numbers in places.
Split up into multiple files, to simplify future non-LPAE support,
removed huge_pmd_share code, as this is very rarely executed,
Added PROT_NONE support].
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
On ARM we use the __flush_dcache_page function to flush the dcache
of pages when needed; usually when the PG_dcache_clean bit is unset
and we are setting a PTE.
A HugeTLB page is represented as a compound page consisting of an
array of pages. Thus to flush the dcache of a HugeTLB page, one must
flush more than a single page.
This patch modifies __flush_dcache_page such that all constituent
pages of a HugeTLB page are flushed.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
This patch cleans up the highmem sanity check code by simplifying the range
checks with a pre-calculated size_limit. This patch should otherwise have no
functional impact on behavior.
This patch also removes a redundant (bank->start < vmalloc_limit) check, since
this is already covered by the !highmem condition.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
On Keystone platforms, physical memory is entirely outside the 32-bit
addressible range. Therefore, the (bank->start > ULONG_MAX) check below marks
the entire system memory as highmem, and this causes unpleasentness all over.
This patch eliminates the extra bank start check (against ULONG_MAX) by
checking bank->start against the physical address corresponding to vmalloc_min
instead.
In the process, this patch also cleans up parts of the highmem sanity check
code by removing what has now become a redundant check for banks that entirely
overlap with the vmalloc range.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch modifies the highmem sanity checking code to use physical addresses
instead. This change eliminates the wrap-around problems associated with the
original virtual address based checks, and this simplifies the code a bit.
The one constraint imposed here is that low physical memory must be mapped in
a monotonically increasing fashion if there are multiple banks of memory,
i.e., x < y must => pa(x) < pa(y).
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch redefines the early boot time use of the R4 register to steal a few
low order bits (ARCH_PGD_SHIFT bits) on LPAE systems. This allows for up to
38-bit physical addresses.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch moves the TTBR1 offset calculation and the T1SZ calculation out
of the TTB setup assembly code. This should not affect functionality in
any way, but improves code readability as well as readability of subsequent
patches in this series.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch adds TTBR accessor macros, and modifies cpu_get_pgd() and
the LPAE version of cpu_set_reserved_ttbr0() to use these instead.
In the process, we also fix these functions to correctly handle cases
where the physical address lies beyond the 4G limit of 32-bit addressing.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch modifies the switch_mm() processor functions to use phys_addr_t.
On LPAE systems, we now honor the upper 32-bits of the physical address that
is being passed in, and program these into TTBR as expected.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
[will: fixed up conflict in 3-level switch_mm with big-endian changes]
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch fixes the initrd setup code to use phys_addr_t instead of assuming
32-bit addressing. Without this we cannot boot on systems where initrd is
located above the 4G physical address limit.
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The free_memmap() was mistakenly using unsigned long type to represent
physical addresses. This breaks on PAE systems where memory could be placed
above the 32-bit addressible limit.
This patch fixes this function to properly use phys_addr_t instead.
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
unsigned long when passing in the phys argument.
This is an extension to commit 97092e0c56 (ARM:
pgtable: use phys_addr_t for physical addresses), which applied similar changes
elsewhere in the ARM memory management code.
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
More and more sub-architectures are using only the debug_ll_io_init
function as the map_io function. Make the core code call this function
if no function is specified in the machine description to remove some
boilerplate code.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
It is common for one sg to include many pages, so mark all these
pages as clean to avoid unnecessary flushing on them in
set_pte_at() or update_mmu_cache().
The patch might improve loading performance of applciation code a bit.
On the below test code to read file(~1GByte size) from usb mass storage
disk to buffer created with mmap(PROT_READ | PROT_EXEC) on
Pandaboard, average ~1% improvement can be observed with the patch on
10 times test.
unsigned int sum = 0;
static unsigned long tv_diff(struct timeval *tv1, struct timeval *tv2)
{
return (tv2->tv_sec - tv1->tv_sec) * 1000000 + (tv2->tv_usec - tv1->tv_usec);
}
int main(int argc, char *argv[])
{
char *mbuffer;
int fd;
int i;
unsigned long page_size, size;
struct stat stat;
struct timeval t1, t2;
page_size = getpagesize();
fd = open(argv[1], O_RDONLY);
assert(fd >= 0);
fstat(fd, &stat);
size = stat.st_size;
printf("%s: file %s, file size %lu, page size %lun", argv[0],
read_filename, size, page_size);
gettimeofday(&t1, NULL);
mbuffer = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
for (i = 0 ; i < size ; i += page_size)
sum += mbuffer[i];
munmap(mbuffer, page_size);
gettimeofday(&t2, NULL);
printf("tread mmaped time: %luusn", tv_diff(&t1, &t2));
close(fd);
}
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Several of the ioremap functions use unsigned long in places
resulting in truncation if physical addresses greater than
4G are passed in. Change the types of the functions and the
callers accordingly.
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull ARM-v7M support from Uwe Kleine-König:
"All but the last patch were in next since next-20130418 without issues.
The last patch fixes a problem in combination with
8164f7a (ARM: 7680/1: Detect support for SDIV/UDIV from ISAR0 register)
which triggers a WARN_ON without an implemented read_cpuid_ext.
The branch merges fine into v3.10-rc1 and I'd be happy if you pulled it
for 3.11-rc1. The only missing piece to be able to run a Cortex-M3 is
the irqchip driver that will go in via Thomas Gleixner and platform
specific stuff."
On v7-M the extended cpuid registers are not available from CP15 but they
are memory mapped in the System Control Space.
There isn't an equivalent available for CPUID_{CACHETYPE,TCM,TLBTYPE,MPIDR}.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>