asm-generic offers an atomic-add based rwsem implementation, which
can avoid the need for heavier, spinlock-based synchronisation on the
fast path.
This patch makes use of the optimised implementation for arm64 CPUs.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This enables support for the generic CPU feature modalias implementation that
wires up optional CPU features to udev based module autoprobing.
A file <asm/cpufeature.h> is provided that maps CPU feature numbers to
elf_hwcap bits, which is the standard way on arm64 to advertise optional CPU
features both internally and to user space.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: removed unnecessary "!!"]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add support for unwinding using the dwarf information in compat
mode. Using the correct user stack pointer allows perf to record
the frames correctly in the native and compat modes.
Note that although the dwarf frame unwinding works ok using
libunwind in native mode (on ARMv7 & ARMv8), some changes are
required to the libunwind code for the compat mode. Those changes
are posted separately on the libunwind mailing list.
Tested on ARMv8 platform with v8 and compat v7 binaries, the latter
are statically built.
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch implements the functions required for the perf registers API,
allowing the perf tool to interface kernel register dumps with libunwind
in order to provide userspace backtracing.
Compat mode is also supported.
Only the general purpose user space registers are exported, i.e.:
PERF_REG_ARM_X0,
...
PERF_REG_ARM_X28,
PERF_REG_ARM_FP,
PERF_REG_ARM_LR,
PERF_REG_ARM_SP,
PERF_REG_ARM_PC
and not the PERF_REG_ARM_V* registers.
Signed-off-by: Jean Pihet <jean.pihet@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ARMv8 supports a range of physical address bit sizes. The PARange bits
from ID_AA64MMFR0_EL1 register are read during boot-time and the
intermediate physical address size bits are written in the translation
control registers (TCR_EL1 and VTCR_EL2).
There is no change in the VA bits and levels of translation.
Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com>
Reviewed-by: Will Deacon <Will.deacon@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Special pte mappings are not intended to be executable and do not even
have an associated struct page. This patch ensures that we do not call
__sync_icache_dcache() on such ptes.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Steve Capper <Steve.Capper@arm.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Cc: <stable@vger.kernel.org>
pgprot_{dmacoherent,writecombine,noncached} don't need to generate
executable mappings with side-effects like __sync_icache_dcache() being
called when the mapping is in user space.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Tested-by: Bharat Bhushan <Bharat.Bhushan@freescale.com>
Cc: <stable@vger.kernel.org>
Commit 8adbf57fc4 ("irqchip: gic: use dmb ishst instead of dsb when
raising a softirq") added an explicit dmb(...) call to the GIC driver.
This patch adds a simple dmb() macro to arm64, which expands to a DMB SY
instruction.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add basic CPU topology support to arm64, based on the existing pre-v8
code and some work done by Mark Hambleton. This patch does not
implement any topology discovery support since that should be based on
information from firmware, it merely implements the scaffolding for
integration of topology support in the architecture.
No locking of the topology data is done since it is only modified during
CPU bringup with external serialisation from the SMP code.
The goal is to separate the architecture hookup for providing topology
information from the DT parsing in order to ease review and avoid
blocking the architecture code (which will be built on by other work)
with the DT code review by providing something simple and basic.
Following patches will implement support for interpreting topology
information from MPIDR and for parsing the DT topology bindings for ARM,
similar patches will be needed for ACPI.
Signed-off-by: Mark Brown <broonie@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
[catalin.marinas@arm.com: removed CONFIG_CPU_TOPOLOGY, always on if SMP]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This adds support for advertising the presence of ARMv8 Crypto
Extensions in the Aarch32 execution state to 32-bit ELF binaries
running in 32-bit compat mode under the arm64 kernel.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add support for the ELF auxv entry AT_HWCAP2 when running 32-bit
ELF binaries in compat mode.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
psci_init() is written to return err code if something goes wrong. However,
the single user, setup_arch(), doesn't care about it. Moreover, every error
path is supplied with a clear message which is enough for pleasant debugging.
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
This patch adds support for DMA API cache maintenance on SoCs without
hardware device cache coherency.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Over the past couple of years, the generic mmu_gather gained range
tracking - 597e1c3580 (mm/mmu_gather: enable tlb flush range in generic
mmu_gather), 2b047252d0 (Fix TLB gather virtual address range
invalidation corner cases) - and tlb_fast_mode() has been removed -
29eb77825c (arch, mm: Remove tlb_fast_mode()).
The new mmu_gather structure is now suitable for arm64 and this patch
converts the arch asm/tlb.h to the generic code. One functional
difference is the shift_arg_pages() case where previously the code was
flushing the full mm (no tlb_start_vma call) but now it flushes the
range given to tlb_gather_mmu() (possibly slightly more efficient
previously).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
The patch moves the PCI I/O space (currently at 64K) before the
earlyprintk mapping and extends it to 16MB.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
typecast instruction_pointer macro to unsigned long to
resolve following compiler warnings like
warning: format '%lx' expects argument of type 'long unsigned int',
but argument 2 has type 'u64' [-Wformat]
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add KGDB debug support for kernel debugging.
With this patch, basic KGDB debugging is possible.GDB register
layout is updated and GDB tool can establish connection with
target and can set/clear breakpoints.
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add macros to enable and disable to manage PSTATE.D
for debugging. The macros local_dbg_save and local_dbg_restore
are moved to irqflags.h file
KGDB boot tests fail because of PSTATE.D is masked.
unmask it for debugging support
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
cbnz/tbnz don't update the condition flags, so remove the "cc" clobbers
from inline asm blocks that only use these instructions to implement
conditional branches.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Linux requires a number of atomic operations to provide full barrier
semantics, that is no memory accesses after the operation can be
observed before any accesses up to and including the operation in
program order.
On arm64, these operations have been incorrectly implemented as follows:
// A, B, C are independent memory locations
<Access [A]>
// atomic_op (B)
1: ldaxr x0, [B] // Exclusive load with acquire
<op(B)>
stlxr w1, x0, [B] // Exclusive store with release
cbnz w1, 1b
<Access [C]>
The assumption here being that two half barriers are equivalent to a
full barrier, so the only permitted ordering would be A -> B -> C
(where B is the atomic operation involving both a load and a store).
Unfortunately, this is not the case by the letter of the architecture
and, in fact, the accesses to A and C are permitted to pass their
nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs
or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the
store-release on B). This is a clear violation of the full barrier
requirement.
The simple way to fix this is to implement the same algorithm as ARMv7
using explicit barriers:
<Access [A]>
// atomic_op (B)
dmb ish // Full barrier
1: ldxr x0, [B] // Exclusive load
<op(B)>
stxr w1, x0, [B] // Exclusive store
cbnz w1, 1b
dmb ish // Full barrier
<Access [C]>
but this has the undesirable effect of introducing *two* full barrier
instructions. A better approach is actually the following, non-intuitive
sequence:
<Access [A]>
// atomic_op (B)
1: ldxr x0, [B] // Exclusive load
<op(B)>
stlxr w1, x0, [B] // Exclusive store with release
cbnz w1, 1b
dmb ish // Full barrier
<Access [C]>
The simple observations here are:
- The dmb ensures that no subsequent accesses (e.g. the access to C)
can enter or pass the atomic sequence.
- The dmb also ensures that no prior accesses (e.g. the access to A)
can pass the atomic sequence.
- Therefore, no prior access can pass a subsequent access, or
vice-versa (i.e. A is strictly ordered before C).
- The stlxr ensures that no prior access can pass the store component
of the atomic operation.
The only tricky part remaining is the ordering between the ldxr and the
access to A, since the absence of the first dmb means that we're now
permitting re-ordering between the ldxr and any prior accesses.
From an (arbitrary) observer's point of view, there are two scenarios:
1. We have observed the ldxr. This means that if we perform a store to
[B], the ldxr will still return older data. If we can observe the
ldxr, then we can potentially observe the permitted re-ordering
with the access to A, which is clearly an issue when compared to
the dmb variant of the code. Thankfully, the exclusive monitor will
save us here since it will be cleared as a result of the store and
the ldxr will retry. Notice that any use of a later memory
observation to imply observation of the ldxr will also imply
observation of the access to A, since the stlxr/dmb ensure strict
ordering.
2. We have not observed the ldxr. This means we can perform a store
and influence the later ldxr. However, that doesn't actually tell
us anything about the access to [A], so we've not lost anything
here either when compared to the dmb variant.
This patch implements this solution for our barriered atomic operations,
ensuring that we satisfy the full barrier requirements where they are
needed.
Cc: <stable@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The dsb instruction takes an option specifying both the target access
types and shareability domain.
This patch allows such an option to be passed to the dsb macro,
resulting in potentially more efficient code. Currently the option is
ignored until all callers are updated (unlike ARM, the option is
mandated by the assembler).
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Somehow SERROR has acquired an additional 'R' in a couple of headers.
This patch removes them before they spread further. As neither instance
is in use yet, no other sites need to be fixed up.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add DSB after icache flush to complete the cache maintenance operation.
The function __flush_icache_all() is used only for user space mappings
and an ISB is not required because of an exception return before executing
user instructions. An exception return would behave like an ISB.
Signed-off-by: Vinayak Kale <vkale@apm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- Introduction of PTE_WRITE to distinguish between writable but clean
and truly read-only pages
- FIQs enabling/disabling clean-up (they aren't used on arm64)
- CPU resume fix for the per-cpu offset restoring
- Code comment typos
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iQIcBAABAgAGBQJS6+SLAAoJEGvWsS0AyF7xk60P/RBEAXqBdlNkTPQ9eIBs8P9b
txv7T6k6MnQt6vK+3Kepgnoadeaw4WRCqKPduA90v7tnrw8fVLmuAM29o0uITkDH
4Cz69/8A9B2Y+ouiODS6MrEvlz99BfbSZdUZd89eNSO7tTzv1YckpepNQCL3Wf+S
1nXa4dZrciCR2vMmkF01jvc3yTjCJ0W7IduUhFQevpMfEzq2bLhvS5eGQd618pBc
Ih44IYO6RfBeYB3CxWiSJeHgSXzlSijUeoPQbiDEhFsKXO7SaZN5oQ5Ra8v+YAMr
fhcblX9rxH2Z1AfYizPssQ0VEnm3cN4nKurr7CA5R+LeHuR38lD9WWABhkYgQMRB
V43/3wvozRyWhLBkS9V0NjObfpPhDuLR6pdbWRQnCO97mH/0HVd2ql49gvsOcdgb
qa2XTF+iOJOQ3325SGdruTkbzkccFoKniiiOsndhx48VGLhHD9U0sCYLfb5X6NJC
ZFI3qQx+3+q32jkDoShqQZ/4fnb8TqzH/KqNtZdZPJ7dlkw/2B8wcnAIoF+BfgbH
2TRfbndSrNGLJSc4bgFQ3e7SCHSdm/PUR/l4c+xN9Ia4tF9614ndfPAaSKaMaKca
O6IwX8+IBo94JDhtPsypplYllBlBeXzMjZFO6b60sBF9UaxgdS1ZQ3NP9pumGN80
2czoY2kwV68DScgxBFLx
=tWas
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pyll ARM64 patches from Catalin Marinas:
- Build fix with DMA_CMA enabled
- Introduction of PTE_WRITE to distinguish between writable but clean
and truly read-only pages
- FIQs enabling/disabling clean-up (they aren't used on arm64)
- CPU resume fix for the per-cpu offset restoring
- Code comment typos
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Introduce PTE_WRITE
arm64: mm: Remove PTE_BIT_FUNC macro
arm64: FIQs are unused
arm64: mm: fix the function name in comment of cpu_do_switch_mm
arm64: fix build error if DMA_CMA is enabled
arm64: kernel: fix per-cpu offset restore on resume
arm64: mm: fix the function name in comment of __flush_dcache_area
arm64: mm: use ubfm for dcache_line_size
We have the following means for encoding writable or dirty ptes:
PTE_DIRTY PTE_RDONLY
!pte_dirty && !pte_write 0 1
!pte_dirty && pte_write 0 1
pte_dirty && !pte_write 1 1
pte_dirty && pte_write 1 0
So we can't distinguish between writable clean ptes and read only
ptes. This can cause problems with ptes being incorrectly flagged as
read only when they are writable but not dirty.
This patch introduces a new software bit PTE_WRITE which allows us to
correctly identify writable ptes. PTE_RDONLY is now only clear for
valid ptes where a page is both writable and dirty.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Expand out the pte manipulation functions. This makes our life easier
when using things like tags and cscope.
Signed-off-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
arm64/include/asm/dma-contiguous.h is trying to include
<asm-genric/dma-contiguous.h> which does not exist, and thus failing
build for arm64 if we enable CONFIG_DMA_CMA. This patch fixes build
error by removing unwanted header inclusion from arm64's dma-contiguous.h.
Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Signed-off-by: Somraj Mani <somraj.mani@samsung.com>
Acked-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull networking updates from David Miller:
1) BPF debugger and asm tool by Daniel Borkmann.
2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.
3) Correct reciprocal_divide and update users, from Hannes Frederic
Sowa and Daniel Borkmann.
4) Currently we only have a "set" operation for the hw timestamp socket
ioctl, add a "get" operation to match. From Ben Hutchings.
5) Add better trace events for debugging driver datapath problems, also
from Ben Hutchings.
6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we
have a small send and a previous packet is already in the qdisc or
device queue, defer until TX completion or we get more data.
7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.
8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
Borkmann.
9) Share IP header compression code between Bluetooth and IEEE802154
layers, from Jukka Rissanen.
10) Fix ipv6 router reachability probing, from Jiri Benc.
11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.
12) Support tunneling in GRO layer, from Jerry Chu.
13) Allow bonding to be configured fully using netlink, from Scott
Feldman.
14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
already get the TCI. From Atzm Watanabe.
15) New "Heavy Hitter" qdisc, from Terry Lam.
16) Significantly improve the IPSEC support in pktgen, from Fan Du.
17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom
Herbert.
18) Add Proportional Integral Enhanced packet scheduler, from Vijay
Subramanian.
19) Allow openvswitch to mmap'd netlink, from Thomas Graf.
20) Key TCP metrics blobs also by source address, not just destination
address. From Christoph Paasch.
21) Support 10G in generic phylib. From Andy Fleming.
22) Try to short-circuit GRO flow compares using device provided RX
hash, if provided. From Tom Herbert.
The wireless and netfilter folks have been busy little bees too.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
net/cxgb4: Fix referencing freed adapter
ipv6: reallocate addrconf router for ipv6 address when lo device up
fib_frontend: fix possible NULL pointer dereference
rtnetlink: remove IFLA_BOND_SLAVE definition
rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
qlcnic: update version to 5.3.55
qlcnic: Enhance logic to calculate msix vectors.
qlcnic: Refactor interrupt coalescing code for all adapters.
qlcnic: Update poll controller code path
qlcnic: Interrupt code cleanup
qlcnic: Enhance Tx timeout debugging.
qlcnic: Use bool for rx_mac_learn.
bonding: fix u64 division
rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
sfc: Use the correct maximum TX DMA ring size for SFC9100
Add Shradha Shah as the sfc driver maintainer.
net/vxlan: Share RX skb de-marking and checksum checks with ovs
tulip: cleanup by using ARRAY_SIZE()
ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
net/cxgb4: Don't retrieve stats during recovery
...
Nothing major here, just bugfixes all over the place. The most
interesting part is the ARM guys' virtualized interrupt controller
overhaul, which lets userspace get/set the state and thus enables
migration of ARM VMs.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJS3TVKAAoJEBvWZb6bTYbyIFgP/2cmt4ifCuFMaZv4+G1S8jZU
uC9ZB/+7vzht/p6zAy+4BxurKbHmSBFkC1OKcxYuy7yB4CQkHabzj4V2vRtqFdwH
5lExP9qh3kqaVLuhnvxLTmkktR3EW4PFy6OI53l5kRNktOXSuZ0aN6K3V7tCg/X0
iL7ASo4bJKlxeWcDpmuVrNgAajmZVfXrjKY7robgBQno+yIsgKhRZRBQHjozA6B8
FpCo/k48RZd/EzIbV/PDDRI4hmmry/lgrO9SKjzq56wSqff2bd/k/KYze4dbAPfd
Ps60enPTuHmeEjjb4MMMU4EKHVdTQFUMx/xZCmT4xzoh8s4of6RHphXbfE0SUznQ
dTveyEQAR7E3JNS0k1+3WEX5fWlFesp0hO2NeE0wzUq4TAr9ztgVO9NQ6Si15e7Z
2HysO0T5Ojtt0lY08/PvS6i48eCAuuBomrejJS8hLW4SUZ5adn+yW4Qo7Fp9JeBR
l9a3LsVT8BZMtUWrUuFcVhlM4MbzElUPjDbgWhR8UYU/kpfVZOQu8qWgGKR4UWXy
X7/t9l/tjR99CmfMJBAOzJid+ScSpAfg77BdaKiQrVfVIJmsjEjlO8vUMyj5b1HF
hPX5wNyJjHAOfridLeHSs4Rdm4a8sk8Az5d4h76pLVz8M4jyTi2v0rO3N4/dU/pu
x7N8KR5hAj+mLBoM9/Al
=8sYU
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"First round of KVM updates for 3.14; PPC parts will come next week.
Nothing major here, just bugfixes all over the place. The most
interesting part is the ARM guys' virtualized interrupt controller
overhaul, which lets userspace get/set the state and thus enables
migration of ARM VMs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits)
kvm: make KVM_MMU_AUDIT help text more readable
KVM: s390: Fix memory access error detection
KVM: nVMX: Update guest activity state field on L2 exits
KVM: nVMX: Fix nested_run_pending on activity state HLT
KVM: nVMX: Clean up handling of VMX-related MSRs
KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject
KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit
KVM: nVMX: Leave VMX mode on clearing of feature control MSR
KVM: VMX: Fix DR6 update on #DB exception
KVM: SVM: Fix reading of DR6
KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS
add support for Hyper-V reference time counter
KVM: remove useless write to vcpu->hv_clock.tsc_timestamp
KVM: x86: fix tsc catchup issue with tsc scaling
KVM: x86: limit PIT timer frequency
KVM: x86: handle invalid root_hpa everywhere
kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub
kvm: vfio: silence GCC warning
KVM: ARM: Remove duplicate include
arm/arm64: KVM: relax the requirements of VMA alignment for THP
...
Interface)
- Jump label support
- CMA can now be enabled on arm64
- HWCAP bits for crypto and CRC32 extensions
- Optimised percpu using tpidr_el1 register
- Code cleanup
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iQIcBAABAgAGBQJS3WLBAAoJEGvWsS0AyF7xxx8P+wUYzu04rKtz4BOj5IHl8TxB
xsRw8ce9MdxIVgdtCjDzmdpkd0s8ZPTEweJnVGYYlB9O9Pmz0VSX4Z1y6W6k0P1f
GCKDMa+hn2uQYnw3bS022Zji6OjUfad9XUfe3f61YdA7GrSdjTVMapXuloASRcfl
0XkfpXwbfLPGpuNp4q/QaA9K/y93T/gc6O/ctJh3OUJDOWJXZGsUTRIKXTF9GrWn
/gPEK9MiatAPpcS7iO283a3vllDalNoEGpt+a4cYCc8il2kCWUpX6W2c9m3Ua26k
mAvkoUErfb3cW/PzqDZzr8M3XbnXb8Je99HBbcjQluL6zyw+0hdUHJpCFOamsz5m
pEpT1e0Hvxb6yNbjqyituiYFPwOUZHP/HeZpH4l1njhN7sIyZP5cUEV4f53VN4lB
KL3HSGzUTaNzT5UpD35CA4vXRwRKrV8YsAVhB0p53KgkUreKA6wbJHSXHorjBZaE
uuP7kqOMGQ494+f6h+yvZqwIcObQGaHYNQJLY3Yhzg3WAs59s/bX/s4yWhgkte0U
yfxKpxTSiLjv5LmZrVQer04DIf9duNkEpI/DAKUbXagHJ7RCHjOneg9F5ZvJ598o
umCo+ok9hV+vLUhagh4t5guSk2ehW7qoZOG44XkYcCLXTIlMV1AQA6oJr804DUm2
71UbGFi01OY0Jtp8prO3
=TPaC
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull ARM64 updates from Catalin Marinas:
- CPU suspend support on top of PSCI (firmware Power State Coordination
Interface)
- jump label support
- CMA can now be enabled on arm64
- HWCAP bits for crypto and CRC32 extensions
- optimised percpu using tpidr_el1 register
- code cleanup
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits)
arm64: fix typo in entry.S
arm64: kernel: restore HW breakpoint registers in cpu_suspend
jump_label: use defined macros instead of hard-coding for better readability
arm64, jump label: optimize jump label implementation
arm64, jump label: detect %c support for ARM64
arm64: introduce aarch64_insn_gen_{nop|branch_imm}() helper functions
arm64: move encode_insn_immediate() from module.c to insn.c
arm64: introduce interfaces to hotpatch kernel and module code
arm64: introduce basic aarch64 instruction decoding helpers
arm64: dts: Reduce size of virtio block device for foundation model
arm64: Remove unused __data_loc variable
arm64: Enable CMA
arm64: Warn on NULL device structure for dma APIs
arm64: Add hwcaps for crypto and CRC32 extensions.
arm64: drop redundant macros from read_cpuid()
arm64: Remove outdated comment
arm64: cmpxchg: update macros to prevent warnings
arm64: support single-step and breakpoint handler hooks
ARM64: fix framepointer check in unwind_frame
ARM64: check stack pointer in get_wchan
...
Pull core locking changes from Ingo Molnar:
- futex performance increases: larger hashes, smarter wakeups
- mutex debugging improvements
- lots of SMP ordering documentation updates
- introduce the smp_load_acquire(), smp_store_release() primitives.
(There are WIP patches that make use of them - not yet merged)
- lockdep micro-optimizations
- lockdep improvement: better cover IRQ contexts
- liblockdep at last. We'll continue to monitor how useful this is
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
futexes: Fix futex_hashsize initialization
arch: Re-sort some Kbuild files to hopefully help avoid some conflicts
futexes: Avoid taking the hb->lock if there's nothing to wake up
futexes: Document multiprocessor ordering guarantees
futexes: Increase hash table size for better performance
futexes: Clean up various details
arch: Introduce smp_load_acquire(), smp_store_release()
arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h
locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
mutexes: Give more informative mutex warning in the !lock->owner case
powerpc: Full barrier for smp_mb__after_unlock_lock()
rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods
Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK
locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
Documentation/memory-barriers.txt: Document ACCESS_ONCE()
Documentation/memory-barriers.txt: Prohibit speculative writes
Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt
Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency"
...
Conflicts:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
net/ipv4/tcp_metrics.c
Overlapping changes between the "don't create two tcp metrics objects
with the same key" race fix in net and the addition of the destination
address in the lookup key in net-next.
Minor overlapping changes in bnx2x driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 2f7dc60275.
The above commit breaks the mapping type for Device memory because
pgprot_default already contains a Normal memory type. pgprot_default is
also not initialised early enough for earlyprintk resulting in an
inconsistent memory mapping with 64K PAGE_SIZE configuration.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJS0miqAAoJEHm+PkMAQRiGbfgIAJSWEfo8ludknhPcHJabBtxu
75SQAKJlL3sBVnxEc58Rtt8gsKYQIrm4IY5Slunklsn04RxuDUIQMgFoAYR5gQwz
+Myqkw/HOqDe5VStGxtLYpWnfglxVwGDCd7ISfL9AOVy5adMWBxh4Tv+qqQc7aIZ
eF7dy+DD+C6Q3Z5OoV8s0FZDxse29vOf17Nki7+7t8WMqyegYwjoOqNeqocGKsPi
eHLrJgTl4T6jB4l9LKKC154DSKjKOTSwZMWgwK8mToyNLT/ufCiKgXloIjEvZZcY
VVKUtncdHiTf+iqVojgpGBzOEeB5DM83iiapFeDiJg8C9yBzvT8lBtA9aPb5Wgw=
=lEeV
-----END PGP SIGNATURE-----
Merge tag 'v3.13-rc8' into core/locking
Refresh the tree with the latest fixes, before applying new changes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads. Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.
This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation. The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specifed store against any prior reads or
writes. These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.
In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability. It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits. It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.
[Changelog by PaulMck]
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Introduce aarch64_insn_gen_{nop|branch_imm}() helper functions, which
will be used to implement jump label on ARM64.
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Function encode_insn_immediate() will be used by other instruction
manipulate related functions, so move it into insn.c and rename it
as aarch64_insn_encode_immediate().
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Introduce three interfaces to patch kernel and module code:
aarch64_insn_patch_text_nosync():
patch code without synchronization, it's caller's responsibility
to synchronize all CPUs if needed.
aarch64_insn_patch_text_sync():
patch code and always synchronize with stop_machine()
aarch64_insn_patch_text():
patch code and synchronize with stop_machine() if needed
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conflicts:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
net/ipv6/ip6_tunnel.c
net/ipv6/ip6_vti.c
ipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.
qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.
Signed-off-by: David S. Miller <davem@davemloft.net>
Current max VCPUs per-Guest is set to 4 which is preventing
us from creating a Guest (or VM) with 8 VCPUs on Host (e.g.
X-Gene Storm SOC) with 8 Host CPUs.
The correct value of max VCPUs per-Guest should be same as
the max CPUs supported by GICv2 which is 8 but, increasing
value of max VCPUs per-Guest can make things slower hence
we add Kconfig option to let KVM users select appropriate
max VCPUs per-Guest.
Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
- Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use
scratch pages.
- Fix block API header for ARM32 and ARM64 to have proper layout
- On ARM when mapping guests, stick on PTE_SPECIAL
- When using SWIOTLB under ARM, don't call swiotlb functions twice
- When unmapping guests memory and if we fail, don't return pages which
failed to be unmapped.
- Grant driver was using the wrong address on ARM.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQEcBAABAgAGBQJSsdFEAAoJEFjIrFwIi8fJDQwIAL1ygSTwSXdH6TlqtD9GVdsE
G6kiCM7G6VXrKMf8zBtgbGpcl6FT0zOIz4cRcXbyDniuHTjdWuH9dlmZOzFMAirE
uMWwOB1EfmRBEJRsd2pW0Gj0O6VABWh8BHklFCeWUvk/Stlw9uXqIwf7Pjcj6wPT
XW+ZywqsAve4MM60Rz/nMsakLcTK4i5SCRgPPFgAnPKUod3f/QbEHwci/lpinJFv
AuQp2JytCsDc2nehEi1kMwEx7LLBlUcjXTqPG5lhQnXrFleDtMdCJd9dGjeze7Qu
F5sftfdlp18ojQwegv1PGiVI4jV8rIq29ybaef/y9DLd3nC3rmi8B8/m9RG2qyI=
=dUsw
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull Xen bugfixes from Konrad Rzeszutek Wilk:
- Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use
scratch pages.
- Fix block API header for ARM32 and ARM64 to have proper layout
- On ARM when mapping guests, stick on PTE_SPECIAL
- When using SWIOTLB under ARM, don't call swiotlb functions twice
- When unmapping guests memory and if we fail, don't return pages which
failed to be unmapped.
- Grant driver was using the wrong address on ARM.
* tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/balloon: Seperate the auto-translate logic properly (v2)
xen/block: Correctly define structures in public headers on ARM32 and ARM64
arm: xen: foreign mapping PTEs are special.
xen/arm64: do not call the swiotlb functions twice
xen: privcmd: do not return pages which we have failed to unmap
XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn
arm64 bit targets need the features CMA provides. Add the appropriate
hooks, header files, and Kconfig to allow this to happen.
Cc: Will Deacon <will.deacon@arm.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
asm/cputype.h contains a bunch of #defines for CPU id registers
that essentially map to themselves. Remove the #defines and pass
the tokens directly to the inline asm() that reads the registers.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Make sure the value we are going to return is referenced in order to
avoid warnings from newer GCCs such as:
arch/arm64/include/asm/cmpxchg.h:162:3: warning: value computed is not used [-Wunused-value]
((__typeof__(*(ptr)))__cmpxchg_mb((ptr), \
^
net/netfilter/nf_conntrack_core.c:674:2: note: in expansion of macro ‘cmpxchg’
cmpxchg(&nf_conntrack_hash_rnd, 0, rand);
[Modified to use the current underlying implementation as current
mainline for both cmpxchg() and cmpxchg_local() does -- broonie]
Signed-off-by: Mark Hambleton <mahamble@broadcom.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>