Merge tag 'tags/platform-drivers-x86-surface-aggregator-v5.13-1' into psy-next
Signed tag for the immutable platform-surface-aggregator-registry
branch for merging into other sub-systems.
Note this is based on v5.12-rc2.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
The INTEL_FAM6 list has become a mess again. Try and bring some sanity
back into it.
Where previously we had one microarch per year and a number of SKUs
within that, this no longer seems to be the case. We now get different
uarch names that share a 'core' design.
Add the core name starting at Skylake and reorder to keep the cores
in chronological order. Furthermore, Intel marketed the names {Amber,
Coffee, Whiskey} Lake, but those are in fact steppings of Kaby Lake;
add comments for them.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YE+HhS8i0gshHD3W@hirez.programming.kicks-ass.net
Allow for a randomized stack offset on a per-syscall basis, with roughly
5 bits of entropy. (And include the AAPCS rationale, thanks to Mark
Rutland.)
In order to avoid unconditional stack canaries on syscall entry (due to
the use of alloca()), also disable stack protector to avoid triggering
needless checks and slowing down the entry path. As there is no general
way to control stack protector coverage with a function attribute[1],
this must be disabled at the compilation unit level. This isn't a problem
here, though, since stack protector was not triggered before: examining
the resulting syscall.o, there are no changes in canary coverage (none
before, none now).
[1] a working __attribute__((no_stack_protector)) has been added to GCC
and Clang but has not been released in any version yet:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=346b302d09c1e6db56d9fe69048acb32fbb97845
https://reviews.llvm.org/rG4fbf84c1732fca596ad1d6e96015e19760eb8a9b
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210401232347.2791257-6-keescook@chromium.org
For validating the stack offset behavior, report the offset from a given
process's first seen stack address. Add a script to calculate the results
to the LKDTM kselftests.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210401232347.2791257-7-keescook@chromium.org
Allow for a randomized stack offset on a per-syscall basis, with roughly
5-6 bits of entropy, depending on compiler and word size. Since the
method of offsetting uses macros, this cannot live in the common entry
code (the stack offset needs to be retained for the life of the syscall,
which means it needs to happen at the actual entry point).
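As a sketch of where the macros end up, the x86 entry point brackets
the dispatch roughly like this (signatures and helpers approximate the
v5.12-era arch/x86/entry/common.c; treat this as illustrative, not the
verbatim patch):

  __visible noinstr void do_syscall_64(unsigned long nr, struct pt_regs *regs)
  {
          /* Offset this syscall's stack usage by the per-CPU value. */
          add_random_kstack_offset();
          nr = syscall_enter_from_user_mode(regs, nr);
          if (likely(nr < NR_syscalls))
                  regs->ax = sys_call_table[nr](regs);
          syscall_exit_to_user_mode(regs);
  }

  /* ...and on the exit path (arch hook), a fresh offset is chosen for
   * a future syscall, e.g.: choose_random_kstack_offset(rdtsc() & 0xFF); */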
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210401232347.2791257-5-keescook@chromium.org
This provides the ability for architectures to enable kernel stack base
address offset randomization. This feature is controlled by the boot
param "randomize_kstack_offset=on/off", with its default value set by
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.
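A minimal sketch of how the boot-time switch is plumbed (the static key
and early_param() wiring follow common kernel practice; the exact names
here are assumptions, not a quote of the patch):

  DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
                             randomize_kstack_offset);

  static int __init early_randomize_kstack_offset(char *buf)
  {
          bool on;
          int ret = kstrtobool(buf, &on);

          if (ret)
                  return ret;

          if (on)
                  static_branch_enable(&randomize_kstack_offset);
          else
                  static_branch_disable(&randomize_kstack_offset);
          return 0;
  }
  early_param("randomize_kstack_offset", early_randomize_kstack_offset);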
This feature is based on the original idea from the last public release
of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt
All the credit for the original idea goes to the PaX team. Note that
the design and implementation of this upstream randomize_kstack_offset
feature differs greatly from the RANDKSTACK feature (see below).
Reasoning for the feature:
This feature aims to make the various stack-based attacks that rely
on deterministic stack structure harder. We have had many such attacks
in the past (just to name a few):
https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf
https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf
https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html
As Linux kernel stack protections have been constantly improving
(vmap-based stack allocation with guard pages, removal of thread_info,
STACKLEAK), attackers have had to find new ways for their exploits
to work. They have done so, continuing to rely on the kernel's stack
determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT
were not relevant. For example, the following recent attacks would have
been hampered if the stack offset was non-deterministic between syscalls:
https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf
(page 70: targeting the pt_regs copy with linear stack overflow)
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
(leaked stack address from one syscall as a target during next syscall)
The main idea is that since the stack offset is randomized on each system
call, it is harder for an attack to reliably land in any particular place
on the thread stack, even with address exposures, as the stack base will
change on the next syscall. Also, since randomization is performed after
placing pt_regs, the ptrace-based approach[1] to discover the randomized
offset during a long-running syscall should not be possible.
Design description:
During most of the kernel's execution, it runs on the "thread stack",
which is pretty deterministic in its structure: it is fixed in size,
and on every entry from userspace to kernel on a syscall the thread
stack starts construction from an address fetched from the per-cpu
cpu_current_top_of_stack variable. The first element to be pushed to the
thread stack is the pt_regs struct that stores all required CPU registers
and syscall parameters. Finally the specific syscall function is called,
with the stack being used as the kernel executes the resulting request.
The goal of the randomize_kstack_offset feature is to add a random offset
after the pt_regs has been pushed to the stack and before the rest of the
thread stack is used during the syscall processing, and to change it every
time a process issues a syscall. The source of randomness is currently
architecture-defined (but x86 is using the low byte of rdtsc()). Future
improvements for different entropy sources are possible, but out of scope
for this patch. Furthermore, to add more unpredictability, new offsets
are chosen at the end of syscalls (the timing of which should be less
easy to measure from userspace than at syscall entry time), and stored
in a per-CPU variable, so that the life of the value does not stay
explicitly tied to a single task.
As suggested by Andy Lutomirski, the offset is added using alloca()
and an empty asm() statement with an output constraint, since it avoids
changes to assembly syscall entry code, to the unwinder, and provides
correct stack alignment as defined by the compiler.
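Put together, the per-syscall hunk reduces to something like the
following (a close paraphrase of the helper this series adds; the 0x3FF
mask bounds the raw offset before compiler stack alignment):

  /* Bound the raw per-CPU value to at most 10 bits before alignment. */
  #define KSTACK_OFFSET_MAX(x)    ((x) & 0x3FF)

  #define add_random_kstack_offset() do {                                 \
          if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT, \
                                  &randomize_kstack_offset)) {            \
                  u32 offset = raw_cpu_read(kstack_offset);               \
                  u8 *ptr = __builtin_alloca(KSTACK_OFFSET_MAX(offset));  \
                  /* Keep the allocation alive after ptr loses scope. */  \
                  asm volatile("" :: "r"(ptr) : "memory");                \
          }                                                               \
  } while (0)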
In order to make this available by default with zero performance impact
for those that don't want it, it is boot-time selectable with static
branches. This way, if the overhead is not wanted, it can just be
left turned off with no performance impact.
The generated assembly for x86_64 with GCC looks like this:
...
ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax
# 12380 <kstack_offset>
ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax
ffffffff81003983: 48 83 c0 0f add $0xf,%rax
ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax
ffffffff8100398c: 48 29 c4 sub %rax,%rsp
ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax
ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax
...
As a result of the above stack alignment, this patch introduces about
5 bits of randomness after pt_regs is spilled to the thread stack on
x86_64, and 6 bits on x86_32 (since it has 1 fewer bit required for
stack alignment). The amount of entropy could be adjusted based on how
much of the stack space we wish to trade for security.
My measure of syscall performance overhead (on x86_64):
lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null
randomize_kstack_offset=y Simple syscall: 0.7082 microseconds
randomize_kstack_offset=n Simple syscall: 0.7016 microseconds
So, roughly 0.9% overhead growth for a no-op syscall, which is very
manageable. And for people that don't want this, it's off by default.
There are two gotchas with using the alloca() trick. First,
compilers that have Stack Clash protection (-fstack-clash-protection)
enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to
any dynamic stack allocations. While the randomization offset is
always less than a page, the resulting assembly would still contain
(unreachable!) probing routines, bloating the output. To
avoid this, -fno-stack-clash-protection is unconditionally added to
the kernel Makefile since this is the only dynamic stack allocation in
the kernel (now that VLAs have been removed) and it is provably safe
from Stack Clash style attacks.
The second gotcha with alloca() is a negative interaction with
-fstack-protector*, in that it sees the alloca() as an array allocation,
which triggers the unconditional addition of the stack canary function
pre/post-amble which slows down syscalls regardless of the static
branch. In order to avoid adding this unneeded check and its associated
performance impact, architectures need to carefully remove uses of
-fstack-protector-strong (or -fstack-protector) in the compilation units
that use the add_random_kstack_offset() macro and to audit the resulting stack
mitigation coverage (to make sure no desired coverage disappears). No
change is visible for this on x86 because the stack protector is already
unconditionally disabled for the compilation unit, but the change is
required on arm64. There is, unfortunately, no attribute that can be
used to disable stack protector for specific functions.
Comparison to PaX RANDKSTACK feature:
The RANDKSTACK feature randomizes the location of the stack start
(cpu_current_top_of_stack), i.e. including the location of pt_regs
structure itself on the stack. Initially this patch followed the same
approach, but during the recent discussions[2], it has been determined
to be of little value since, if ptrace functionality is available for
an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write
different offsets in the pt_regs struct, observe the cache behavior of
the pt_regs accesses, and figure out the random stack offset. Another
difference is that the random offset is stored in a per-cpu variable,
rather than having it be per-thread. As a result, the two features
differ a fair bit in their implementation details and results, though
obviously the intent is similar.
[1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/
[2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/
[3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html
Co-developed-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
The state of CONFIG_INIT_ON_ALLOC_DEFAULT_ON (and ...ON_FREE...) did not
change the assembly ordering of the static branches: they were always out
of line. Use the new jump_label macros to check the CONFIG settings to
default to the "expected" state, which slightly optimizes the resulting
assembly code.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20210401232347.2791257-3-keescook@chromium.org
As shown in the comment in jump_label.h, choosing the initial state of
static branches changes the assembly layout. If the condition is expected
to be likely it's inline, and if unlikely it is out of line via a jump.
A few places in the kernel use (or could be using) a CONFIG to choose the
default state, which would give a small performance benefit to their
compile-time declared default. Provide the infrastructure to do this.
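A sketch of the resulting usage, assuming the DEFINE_STATIC_KEY_MAYBE()
and static_branch_maybe() helper names this series introduces:

  #include <linux/jump_label.h>

  /* Initial key state (and thus inline vs out-of-line layout) follows
   * the CONFIG at build time: */
  DEFINE_STATIC_KEY_MAYBE(CONFIG_INIT_ON_ALLOC_DEFAULT_ON, init_on_alloc);

  static inline bool want_init_on_alloc(gfp_t flags)
  {
          /* The branch hint matches the compile-time declared default. */
          if (static_branch_maybe(CONFIG_INIT_ON_ALLOC_DEFAULT_ON,
                                  &init_on_alloc))
                  return true;
          return flags & __GFP_ZERO;
  }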
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210401232347.2791257-2-keescook@chromium.org
Right now, if a call to kvm_tdp_mmu_zap_sp returns false, the caller
will skip the TLB flush, which is wrong. There are two ways to fix
it:
- since kvm_tdp_mmu_zap_sp will not yield and therefore will not flush
the TLB itself, we could change the call to kvm_tdp_mmu_zap_sp to
use "flush |= ..."
- or we can chain the flush argument through kvm_tdp_mmu_zap_sp down
to __kvm_tdp_mmu_zap_gfn_range. Note that kvm_tdp_mmu_zap_sp will
neither yield nor flush, so flush would never go from true to
false.
This patch does the former to simplify application to stable kernels,
and to make it clearer that kvm_tdp_mmu_zap_sp will not flush.
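Concretely, the first option amounts to accumulating the result at the
call site instead of overwriting it, roughly (a sketch of the affected
NX-recovery hunk, with the surrounding loop abbreviated):

  if (is_tdp_mmu_page(sp))
          flush |= kvm_tdp_mmu_zap_sp(kvm, sp);  /* was: flush = ... */
  else
          kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);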
Cc: seanjc@google.com
Fixes: 048f49809c ("KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping")
Cc: <stable@vger.kernel.org> # 5.10.x: 048f49809c: KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
Cc: <stable@vger.kernel.org> # 5.10.x: 33a3164161: KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
Cc: <stable@vger.kernel.org>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The ALSA control interface allows users to add arbitrary control elements
(called "user controls" or "user elements"), and its resource usage is
limited just by the max number of control sets (currently 32). This
limit, however, is quite loose: each allocation of control set may
have 1028 elements, and each element may have up to 512 bytes (ILP32) or
1024 bytes (LP64) of value data. Moreover, each control set may contain
the enum strings and TLV data, which can be up to 64kB and 128kB,
respectively. In total, the memory consumption may go over 38MB --
that's quite large, and we'd rather reduce the size.
OTOH, there have been other requests even to increase the max number
of user elements; e.g. the ALSA firewire stack requires more user
controls, hence we want to raise the bar, too.
For satisfying both requirements, this patch changes the management of
user controls: instead of setting the upper limit of the number of
user controls, we check the actual memory allocation size and set the
upper limit of the total allocation in bytes. As long as the memory
consumption stays below the limit, more user controls are allowed than
the current limit of 32. At the same time, we set a default limit (8MB)
that is lower than the current theoretical maximum, in order to lower
the risk of DoS.
As a compromise for lowering the default limit, the actual memory limit
is now defined as a module option, 'max_user_ctl_alloc_size', so that
users can increase or decrease the limit if really needed.
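A sketch of the shape of this change: the module option plus a
simplified accounting check at element allocation time (the helper and
per-card field names here are assumptions, not the literal patch):

  static int max_user_ctl_alloc_size = 8 * 1024 * 1024;  /* 8MB default */
  module_param(max_user_ctl_alloc_size, int, 0444);
  MODULE_PARM_DESC(max_user_ctl_alloc_size,
                   "Max allocation size for user controls");

  /* Called with the accounted size of a new user control set: */
  static int snd_ctl_check_user_alloc(struct snd_card *card, size_t size)
  {
          /* card->user_ctl_alloc_size: assumed per-card running total */
          if (card->user_ctl_alloc_size + size > max_user_ctl_alloc_size)
                  return -ENOMEM;
          card->user_ctl_alloc_size += size;
          return 0;
  }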
Link: https://lore.kernel.org/r/s5htur3zl5e.wl-tiwai@suse.de
Co-developed-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Tested-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20210408103149.40357-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Add calls to disable the clock and unmap the timer base address in case
of any failures.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210322121844.2271041-1-dinguyen@kernel.org
CMTOUT_IE is only supported for older SoCs. Newer SoCs shall not set
this bit. So, add a version check.
Reported-by: Phong Hoang <phong.hoang.wz@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210309094448.31823-1-wsa+renesas@sang-engineering.com
Fix trivial typo, rename local variable from 'overflw' to 'overflow' in
pistachio_clocksource_read_cycles().
Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210305090315.384547-1-drew@beagleboard.org
In case of error, the function device_node_to_regmap() returns
ERR_PTR() and never returns NULL. The NULL test in the return
value check should be replaced with IS_ERR().
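The shape of the fix (a sketch; the surrounding function is
abbreviated):

  struct regmap *map = device_node_to_regmap(dev->of_node->parent);

  if (IS_ERR(map))                /* was: if (!map), which can never fire */
          return PTR_ERR(map);    /* propagate the encoded errno */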
Fixes: ca7b72b5a5 ("clocksource: Add driver for the Ingenic JZ47xx OST")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210308123031.2285083-1-weiyongjun1@huawei.com
To avoid spurious timer interrupts when KTIME_MAX is used, we need to
configure set_state_oneshot_stopped(). Although implementing this is
optional, it still affects things like power management for the extra
timer interrupt.
For more information, please see commit 8fff52fd50 ("clockevents:
Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state") and commit cf8c5009ee
("clockevents/drivers/arm_arch_timer: Implement
->set_state_oneshot_stopped()").
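Wiring this up is a callback assignment in the clock_event_device; many
drivers simply reuse their shutdown handler (a sketch; the stop routine
is driver-specific and hypothetical here):

  static int dmtimer_clockevent_shutdown(struct clock_event_device *evt)
  {
          /* Driver-specific: stop the hardware timer so no stray tick
           * fires while the clockevent sits programmed at KTIME_MAX. */
          dmtimer_hw_stop(to_dmtimer(evt));       /* hypothetical helpers */
          return 0;
  }

  static struct clock_event_device dmtimer_clkevt = {
          .set_state_shutdown        = dmtimer_clockevent_shutdown,
          /* Optional, but avoids spurious interrupts with KTIME_MAX: */
          .set_state_oneshot_stopped = dmtimer_clockevent_shutdown,
  };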
Fixes: 52762fbd1c ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210304072135.52712-4-tony@atomide.com
When the timer is configured in posted mode, we need to check the
write-posted status register (TWPS) before writing to the register.
Since commit 52762fbd1c ("clocksource/drivers/timer-ti-dm: Add
clockevent and clocksource support"), we have been checking TWPS only
after the write.
For example, in the TRM for am571x the following is documented in chapter
"22.2.4.13.1.1 Write Posting Synchronization Mode":
"For each register, a status bit is provided in the timer write-posted
status (TWPS) register. In this mode, it is mandatory that software check
this status bit before any write access. If a write is attempted to a
register with a previous access pending, the previous access is discarded
without notice."
The regression happened when I updated the code to use standard read/write
accessors for the driver instead of using __omap_dm_timer_load_start().
We have __omap_dm_timer_load_start() check the TWPS status correctly using
__omap_dm_timer_write().
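The required pattern, per the TRM quote above, is to poll the
register's pending bit in TWPS before every write, roughly as below
(register and helper names follow the TI dmtimer convention but are
illustrative):

  static void dmtimer_write_posted(void __iomem *base, u32 reg, u32 val)
  {
          u32 pend = dmtimer_reg_to_twps_bit(reg);        /* hypothetical */

          /* Wait until any previous post to this register has retired. */
          while (readl_relaxed(base + OMAP_TIMER_WRITE_PEND_REG) & pend)
                  cpu_relax();

          writel_relaxed(val, base + reg);
  }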
Fixes: 52762fbd1c ("clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210304072135.52712-2-tony@atomide.com
The OST in the JZ4760B SoC works exactly the same as in the JZ4770. But
since the JZ4760B is older, its Device Tree string does not fall back to
the JZ4770 one; so add support for the JZ4760B compatible string here.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210308212302.10288-3-paul@crapouillou.net
Add compatible strings to support the system timer, clocksource, OST,
watchdog and PWM blocks of the JZ4760 and JZ4760B SoCs.
For newer SoCs that behave like the JZ4760 or JZ4760B, the compatible
string now requires a fallback compatible string corresponding to one
of these two SoCs.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210308212302.10288-1-paul@crapouillou.net
This currently supports:
* SMP (via spin-tables)
* AIC IRQs
* Serial (with earlycon)
* Framebuffer
A number of properties are dynamic, and based on system firmware
decisions that vary from version to version. These are expected
to be filled in by the loader.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
Apple SoCs run firmware that sets up a simplefb-compatible framebuffer
for us. Add a compatible for it, and two missing supported formats.
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This adds a Kconfig option to toggle support for Apple ARM SoCs.
At this time this targets the M1 and later "Apple Silicon" Mac SoCs.
Signed-off-by: Hector Martin <marcan@marcan.st>
This is the root interrupt controller used on Apple ARM SoCs such as the
M1. This irqchip driver performs multiple functions:
* Handles both IRQs and FIQs
* Drives the AIC peripheral itself (which handles IRQs)
* Dispatches FIQs to downstream hard-wired clients (currently the ARM
timer).
* Implements a virtual IPI multiplexer to funnel multiple Linux IPIs
into a single hardware IPI
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
AIC is the Apple Interrupt Controller found on Apple ARM SoCs, such as
the M1.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
These definitions are in arm-gic-v3.h for historical reasons which no
longer apply. Move them to sysreg.h so the AIC driver can use them, as
it needs to peek into vGIC registers to deal with the GIC maintenance
interrupt.
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This implements the 'nonposted-mmio' boolean property. Placing this
property in a bus marks all direct child devices as requiring
non-posted MMIO mappings. If no such property is found, the default
is posted MMIO.
of_mmio_is_nonposted() performs this check to determine if a given
device has requested non-posted MMIO.
of_address_to_resource() uses this to set the IORESOURCE_MEM_NONPOSTED
flag on resources that require non-posted MMIO.
of_iomap() and of_io_request_and_map() then use this flag to pick the
correct ioremap() variant.
This mechanism is currently restricted to builds that support Apple ARM
platforms, as an optimization.
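Putting the pieces together, the lookup-to-mapping flow described above
looks roughly like this (example_map() is illustrative; of_iomap()
embodies the same logic):

  void __iomem *example_map(struct device_node *np)
  {
          struct resource res;

          if (of_address_to_resource(np, 0, &res))
                  return NULL;

          /* Flag set when a parent bus carries 'nonposted-mmio': */
          if (res.flags & IORESOURCE_MEM_NONPOSTED)
                  return ioremap_np(res.start, resource_size(&res));

          return ioremap(res.start, resource_size(&res));
  }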
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
Now that we have ioremap_np(), we can make pci_remap_cfgspace() default
to it, falling back to ioremap() on platforms where it is not available.
Remove the arm64 implementation, since that is now redundant. Future
cleanups should be able to do the same for other arches, and eventually
make the generic pci_remap_cfgspace() unconditional.
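The resulting generic helper is essentially a try-then-fallback (a
sketch of the behavior this log describes, not the verbatim
definition):

  void __iomem *pci_remap_cfgspace(phys_addr_t offset, size_t size)
  {
          /* Prefer a strictly non-posted config-space mapping... */
          void __iomem *ret = ioremap_np(offset, size);

          /* ...but fall back where ioremap_np() is unimplemented (NULL). */
          if (!ret)
                  ret = ioremap(offset, size);
          return ret;
  }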
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This is used on Apple ARM platforms, which require most MMIO
(except PCI devices) to be mapped as nGnRnE.
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This documents the newly introduced ioremap_np() along with all the
other common ioremap() variants, and some higher-level abstractions
available.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This adds more detailed descriptions of the various read/write
primitives available for use with I/O memory/ports.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hector Martin <marcan@marcan.st>
ARM64 currently defaults to posted MMIO (nGnRE), but some devices
require the use of non-posted MMIO (nGnRnE). Introduce a new ioremap()
variant to handle this case. ioremap_np() returns NULL on arches that
do not implement this variant.
sparc64 is the only architecture that needs to be touched directly,
because it includes neither of the generic io.h or iomap.h headers.
This adds the IORESOURCE_MEM_NONPOSTED flag, which maps to this
variant and marks a given resource as requiring non-posted mappings.
This is implemented in the resource system because it is a SoC-level
requirement, so existing drivers do not need special-case code to pick
this ioremap variant.
Then this is implemented in devres by introducing devm_ioremap_np(),
and making devm_ioremap_resource() automatically select this variant
when the resource has the IORESOURCE_MEM_NONPOSTED flag set.
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This allows the devicetree to correctly represent the available set of
timers, which varies from device to device, without the need for fake
dummy interrupts for unavailable slots.
Also add the hyp-virt timer/PPI, which is not currently used, but worth
representing.
Reviewed-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
Not all platforms provide the same set of timers/interrupts, and Linux
only needs one (plus kvm/guest ones); some platforms are working around
this by using dummy fake interrupts. Implementing interrupt-names allows
the devicetree to specify an arbitrary set of available interrupts, so
the timer code can pick the right one.
This also adds the hyp-virt timer/interrupt, which was previously not
expressed in the fixed 4-interrupt form.
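With interrupt-names in place, the timer code can request the interrupt
it needs by name, falling back to the fixed layout for old devicetrees.
A sketch (the function name is illustrative; of_irq_get_byname()
returns <= 0 when the name is absent):

  static int arch_timer_get_virt_irq(struct device_node *np)
  {
          int irq = of_irq_get_byname(np, "virt");

          if (irq <= 0)
                  irq = of_irq_get(np, 2);  /* fixed slot 2: virt timer */
          return irq;
  }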
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
The implementor will be used to condition the FIQ support quirk.
The specific CPU types are not used at the moment, but let's add them
for documentation purposes.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This introduces bindings for all three 2020 Apple M1 devices:
* apple,j274 - Mac mini (M1, 2020)
* apple,j293 - MacBook Pro (13-inch, M1, 2020)
* apple,j313 - MacBook Air (M1, 2020)
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
This is different from the legacy AAPL prefix used on PPC, but
consensus is that we prefer `apple` for these new platforms.
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Hector Martin <marcan@marcan.st>
For an unknown reason, old commit d27dfd388715 ("Import pre2.0.8")
changed 'ptrdiff_t' from 'int' to 'long'.
GCC really expects it to be 'int', and this leads to the following
warning when building KFENCE:
CC mm/kfence/report.o
In file included from ./include/linux/printk.h:7,
from ./include/linux/kernel.h:16,
from mm/kfence/report.c:10:
mm/kfence/report.c: In function 'kfence_report_error':
./include/linux/kern_levels.h:5:18: warning: format '%td' expects argument of type 'ptrdiff_t', but argument 6 has type 'long int' [-Wformat=]
5 | #define KERN_SOH "\001" /* ASCII Start Of Header */
| ^~~~~~
./include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH'
11 | #define KERN_ERR KERN_SOH "3" /* error conditions */
| ^~~~~~~~
./include/linux/printk.h:343:9: note: in expansion of macro 'KERN_ERR'
343 | printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
| ^~~~~~~~
mm/kfence/report.c:213:3: note: in expansion of macro 'pr_err'
213 | pr_err("Out-of-bounds %s at 0x%p (%luB %s of kfence-#%td):\n",
| ^~~~~~
<asm-generic/uapi/posix-types.h> defines it as 'int', and
defines 'size_t' and 'ssize_t' exactly as powerpc does, so
remove the powerpc-specific definitions and fall back on the
generic ones.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e43d133bf52fa19e577f64f3a3a38cedc570377d.1617616601.git.christophe.leroy@csgroup.eu
When neither CONFIG_PCI nor CONFIG_IBMVIO is set/enabled, iommu.c has a
build error. The fault injection code is not useful in that kernel config,
so make the FAIL_IOMMU option depend on PCI || IBMVIO.
Prevents this build error (warning escalated to error):
../arch/powerpc/kernel/iommu.c:178:30: error: 'fail_iommu_bus_notifier' defined but not used [-Werror=unused-variable]
178 | static struct notifier_block fail_iommu_bus_notifier = {
Fixes: d6b9a81b2a ("powerpc: IOMMU fault injection")
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210404192623.10697-1-rdunlap@infradead.org
Eliminate the following coccicheck warning:
./arch/powerpc/platforms/pseries/lpar.c:1633:2-3: Unneeded semicolon
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1617672785-81372-1-git-send-email-yang.lee@linux.alibaba.com
There is no need for this to be in asm; use the new interrupt entry wrapper.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210406025508.821718-1-npiggin@gmail.com
The power PMU group constraints include a check for EBB events to make
sure all events in a group agree on EBB. This will prevent scheduling
EBB and non-EBB events together. But in the existing check, the
settings for the constraint mask and value are interchanged. This
patch fixes that.
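An illustration of the interchange (the CNST_EBB_* names follow the
power8/ISA207 constraint code; treat this as a sketch rather than the
verbatim diff):

  /* Before: each macro was OR'd into the wrong field. */
  mask  |= CNST_EBB_VAL(ebb);
  value |= CNST_EBB_MASK;

  /* After: mask bits into the mask, value bits into the value. */
  mask  |= CNST_EBB_MASK;
  value |= CNST_EBB_VAL(ebb);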
Before the patch, the PMU selftest "cpu_event_pinned_vs_ebb_test" fails
with the following in the dmesg log. This happens because an EBB event
gets enabled along with a non-EBB cpu event.
[35600.453346] cpu_event_pinne[41326]: illegal instruction (4)
at 10004a18 nip 10004a18 lr 100049f8 code 1 in
cpu_event_pinned_vs_ebb_test[10000000+10000]
Test results after the patch:
$ ./pmu/ebb/cpu_event_pinned_vs_ebb_test
test: cpu_event_pinned_vs_ebb
tags: git_version:v5.12-rc5-93-gf28c3125acd3-dirty
Binding to cpu 8
EBB Handler is at 0x100050c8
read error on event 0x7fffe6bd4040!
PM_RUN_INST_CMPL: result 9872 running/enabled 37930432
success: cpu_event_pinned_vs_ebb
This bug was hidden by other logic until commit 1908dc9117 (perf:
Tweak perf_event_attr::exclusive semantics).
Fixes: 4df4899911 ("powerpc/perf: Add power8 EBB support")
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
[mpe: Mention commit 1908dc9117]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1617725761-1464-1-git-send-email-atrajeev@linux.vnet.ibm.com
The suggested alternative for getting cache-inhibited memory with 'mem='
and /dev/mem is pretty hacky. Also, PAPR guests do not allow system
memory to be mapped cache-inhibited, so despite /dev/mem being available
this will not work, which can cause confusion. Instead, recommend using
the memtrace buffers. memtrace is only available on powernv so there
will not be any chance of trying to do this in a guest.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210225032108.1458352-2-jniethe5@gmail.com
Allow the memory removed from the linear mapping for use as trace
buffers to be mmapped. This is a useful way of providing
cache-inhibited memory for the alignment_handler selftest.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
[mpe: make memtrace_mmap() static as noticed by lkp@intel.com]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210225032108.1458352-1-jniethe5@gmail.com
As best as I can tell the ".machine" directive in trampoline_64.S is no
longer, or never was, necessary.
It was added in commit 0d97631392 ("powerpc: Add purgatory for
kexec_file_load() implementation."), which created the file based on
the kexec-tools purgatory. It may be/have-been necessary in the
kexec-tools version, but we have a completely different build system,
and we already pass the desired CPU flags, eg:
gcc ... -m64 -Wl,-a64 -mabi=elfv2 -Wa,-maltivec -Wa,-mpower4 -Wa,-many
... arch/powerpc/purgatory/trampoline_64.S
So drop the ".machine" directive and rely on the assembler flags.
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/r/20210315034159.315675-1-mpe@ellerman.id.au