Since commit ac10be5cdb ("arm64: Use common
of_kexec_alloc_and_setup_fdt()"), smatch reports the following warning:
arch/arm64/kernel/machine_kexec_file.c:152 load_other_segments()
warn: missing error code 'ret'
Return code is not set to an error code in load_other_segments() when
of_kexec_alloc_and_setup_fdt() call returns a NULL dtb. This results
in status success (return code set to 0) being returned from
load_other_segments().
Set return code to -EINVAL if of_kexec_alloc_and_setup_fdt() returns
NULL dtb.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: ac10be5cdb ("arm64: Use common of_kexec_alloc_and_setup_fdt()")
Link: https://lore.kernel.org/r/20211210010121.101823-1-nramas@linux.microsoft.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Ganapatrao reported that the kvm_pgtable->mmu pointer is more or
less hardcoded to the main S2 mmu structure, while the nested
code needs it to point to other instances (as we have one instance
per nested context).
Rework the initialisation of the kvm_pgtable structure so that
this assumtion doesn't hold true anymore. This requires some
minor changes to the order in which things are initialised
(the mmu->arch pointer being the critical one).
Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211129200150.351436-5-maz@kernel.org
Use the interconnects property to hook up the MMC and BPMP to the memory
controller. This is needed to set the correct bus-level DMA mask, which
is a prerequisite for adding IOMMU support.
Signed-off-by: Thierry Reding <treding@nvidia.com>
This adds the memory controller and the embedded external memory
controller found on the Tegra234 SoC.
Signed-off-by: Thierry Reding <treding@nvidia.com>
DMA operations for the Tegra194 Video Image Compositor (VIC) are
coherent and so populate the 'dma-coherent' property.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Make the order of the clocks and clock-names properties match the order
in the device tree bindings. This isn't strictly necessary from a point
of view of the operating system because matching will be done based on
the clock-names, but it makes it easier to validate the device trees
against the DT schema.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add missing interrupts, clocks, clock-names, reset and reset-names
properties for the TSEC blocks found on Tegra210.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The XUSB pad controller handles the various PLL power supplies, so
remove any references to them from the PCIe and XUSB controller device
tree nodes.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The XUSB pad controller handles the various PLL power supplies, so
remove any references to them from the XUSB controller device tree
node.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The XUSB pad controller handles the various PLL power supplies, so
remove any references to them from the PCIe and XUSB controller device
tree nodes.
Signed-off-by: Thierry Reding <treding@nvidia.com>
GPIO hog nodes must have a "hog-" prefix or "-hog" suffix according to
the DT schema. Rename all such nodes to allow validation to pass.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Remove the unsupported "regulator-disable-ramp-delay" properties which
ended up in various DTS files for some reason.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The num-viewport property is never used and can be dropped, whereas the
"iommus" property is not needed since we use "iommu-map-mask" and
"iommu-map" already.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The HSP instances on Tegra194 are not fully compatible with the version
found on Tegra186, so drop the fallback compatible string from the list.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra194 pinmux DT bindings do not define the nvidia,lpdr property,
so drop them from the device trees that have listed them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The standard "jedec," vendor prefix should be used for SPI NOR flash
chips. This allows the right DT schema to be picked for validation.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra186 CCPLEX cluster register region is 4 MiB is length, not 4
MiB - 1. This was likely presumed to be the "limit" rather than length.
Fix it up.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The I2C controller found on Tegra186 is not fully compatible with the
Tegra210 version, so drop the fallback compatible string from the list.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Child nodes of the TI INA3221 power monitor device tree node should be
called input@* according to the DT schema.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The DT schema requires that nodes representing thermal zones include a
"-thermal" suffix in their name.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Make the order of the clocks and clock-names properties match the order
in the device tree bindings. This isn't strictly necessary from a point
of view of the operating system because matching will be done based on
the clock-names, but it makes it easier to validate the device trees
against the DT schema.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The CML1 and PLL_E clocks are never explicitly used by the AHCI
controller found on Tegra132, so drop them from the corresponding device
tree node.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The I2C controller found on Tegra124 is not fully compatible with the
Tegra114 version, so drop the fallback compatible string from the list.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add peripheral OPP tables on Tegra132 and wire them up to ACTMON and the
EMC. While at it, add the missing "#interconnect-cells" properties to
the memory controller and external memory controller nodes. Also set the
"#reset-cells" property for the memory controller because it exports the
hotflush reset controls.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The TKE (time-keeping engine) found on Tegra132 is not backwards
compatible with the version found on Tegra20, so update the compatible
string list accordingly.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra PMC device tree bindings don't support the "#wake-cells" and
"nvidia,reset-gpio" properties, so remove them.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The AS3722 pinmux device tree node doesn't have a "reg" property and
therefore must not have a unit-address, so drop it.
While at it, add missing unit-addresses for the charger and smart
battery IC's on the ChromeOS embedded controller's I2C tunnel bus.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The native timers IP block found on NVIDIA Tegra SoCs implements a
watchdog timer that can be used to recover from system hangs. Add the
device tree node on Tegra186.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Regulators defined at the top level in device tree are no longer part of
a simple bus and therefore don't have a reg property. Nodes without a
reg property shouldn't have a unit-address either, so drop the unit
address from the node names. To ensure nodes aren't duplicated (in which
case they would end up merged in the final DTB), append the name of the
regulator to the node name.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Clocks defined at the top level in device tree are no longer part of a
simple bus and therefore don't have a reg property. Nodes without a reg
property shouldn't have a unit-address either, so drop the unit address
from the node names. To ensure nodes aren't duplicated (in which case
they would end up merged in the final DTB), append the name of the clock
to the node name.
Signed-off-by: Thierry Reding <treding@nvidia.com>
The display controllers are attached to a separate ARM SMMU instance
that is dedicated to servicing isochronous memory clients. Add this ISO
instance of the ARM SMMU to device tree.
Please note that the display controllers are not hooked up to this SMMU
yet, because we are still missing a means to transition framebuffers
used by the bootloader to the kernel.
This based upon an initial patch by Thierry Reding <treding@nvidia.com>.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Populate the device-tree nodes for NVENC and NVJPG Host1x engines on
Tegra186 and Tegra194.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add support to enumerate SD in UHS mode on Tegra194. Add required
device-tree properties in SDMMC1 and SDMMC3 instances to enable dynamic
pad voltage switching and enumerate SD card in UHS-I modes.
Signed-off-by: Prathamesh Shete <pshete@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Jetson AGX Orin Developer Kit is a continuation of the Jetson
Developer Kit line using the new NVIDIA Tegra234 (Orin) SoC.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The NVIDIA Tegra234 SoC has 3 clusters of 4 Cortex-A78AE CPU cores each,
for a total of 12 CPUs. Each CPU has 64 KiB instruction and data caches
with each cluster having an additional 256 KiB unified L2 cache and a 2
MiB L3 cache.
Signed-off-by: Thierry Reding <treding@nvidia.com>
These two controllers expose general purpose I/O pins that can be used
to control or monitor a variety of signals.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add a device for TCU (Tegra Combined UART) used for serial console.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add missing properties to the eMMC controller, as required to use it on
actual hardware.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
On final Tegra234 systems, shared memory for communication with BPMP is
located at offset 0x70000 in SYSRAM.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The json-schema bindings for SRAM expect the nodes to be called "sram"
rather than "sysram" or "shmem". Furthermore, place the brackets around
the SYSRAM references such that a two-element array is created rather
than a two-element array nested in a single-element array. This is not
relevant for device tree itself, but allows the nodes to be properly
validated against json-schema bindings.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Add vdd core regulator (1.1 V).
This patch add regulator support for gpu.
The H/W manual mentions nothing about a gpu regulator. So using vdd
core regulator for gpu.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20211208104026.421-4-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
* kvm-arm64/pkvm-hyp-sharing:
: .
: Series from Quentin Perret, implementing HYP page share/unshare:
:
: This series implements an unshare hypercall at EL2 in nVHE
: protected mode, and makes use of it to unmmap guest-specific
: data-structures from EL2 stage-1 during guest tear-down.
: Crucially, the implementation of the share and unshare
: routines use page refcounts in the host kernel to avoid
: accidentally unmapping data-structures that overlap a common
: page.
: [...]
: .
KVM: arm64: pkvm: Unshare guest structs during teardown
KVM: arm64: Expose unshare hypercall to the host
KVM: arm64: Implement do_unshare() helper for unsharing memory
KVM: arm64: Implement __pkvm_host_share_hyp() using do_share()
KVM: arm64: Implement do_share() helper for sharing memory
KVM: arm64: Introduce wrappers for host and hyp spin lock accessors
KVM: arm64: Extend pkvm_page_state enumeration to handle absent pages
KVM: arm64: pkvm: Refcount the pages shared with EL2
KVM: arm64: Introduce kvm_share_hyp()
KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2
KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-table
KVM: arm64: Fixup hyp stage-1 refcount
KVM: arm64: Refcount hyp stage-1 pgtable pages
KVM: arm64: Provide {get,put}_page() stubs for early hyp allocator
Signed-off-by: Marc Zyngier <maz@kernel.org>
Make use of the newly introduced unshare hypercall during guest teardown
to unmap guest-related data structures from the hyp stage-1.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-15-qperret@google.com
Introduce an unshare hypercall which can be used to unmap memory from
the hypervisor stage-1 in nVHE protected mode. This will be useful to
update the EL2 ownership state of pages during guest teardown, and
avoids keeping dangling mappings to unreferenced portions of memory.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-14-qperret@google.com
Tearing down a previously shared memory region results in the borrower
losing access to the underlying pages and returning them to the "owned"
state in the owner.
Implement a do_unshare() helper, along the same lines as do_share(), to
provide this functionality for the host-to-hyp case.
Reviewed-by: Andrew Walbran <qwandor@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-13-qperret@google.com
__pkvm_host_share_hyp() shares memory between the host and the
hypervisor so implement it as an invocation of the new do_share()
mechanism.
Note that double-sharing is no longer permitted (as this allows us to
reduce the number of page-table walks significantly), but is thankfully
no longer relied upon by the host.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-12-qperret@google.com
By default, protected KVM isolates memory pages so that they are
accessible only to their owner: be it the host kernel, the hypervisor
at EL2 or (in future) the guest. Establishing shared-memory regions
between these components therefore involves a transition for each page
so that the owner can share memory with a borrower under a certain set
of permissions.
Introduce a do_share() helper for safely sharing a memory region between
two components. Currently, only host-to-hyp sharing is implemented, but
the code is easily extended to handle other combinations and the
permission checks for each component are reusable.
Reviewed-by: Andrew Walbran <qwandor@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-11-qperret@google.com
In preparation for adding additional locked sections for manipulating
page-tables at EL2, introduce some simple wrappers around the host and
hypervisor locks so that it's a bit easier to read and bit more difficult
to take the wrong lock (or even take them in the wrong order).
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-10-qperret@google.com
Explicitly name the combination of SW0 | SW1 as reserved in the pte and
introduce a new PKVM_NOPAGE meta-state which, although not directly
stored in the software bits of the pte, can be used to represent an
entry for which there is no underlying page. This is distinct from an
invalid pte, as stage-2 identity mappings for the host are created
lazily and so an invalid pte there is the same as a valid mapping for
the purposes of ownership information.
This state will be used for permission checking during page transitions
in later patches.
Reviewed-by: Andrew Walbran <qwandor@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-9-qperret@google.com
In order to simplify the page tracking infrastructure at EL2 in nVHE
protected mode, move the responsibility of refcounting pages that are
shared multiple times on the host. In order to do so, let's create a
red-black tree tracking all the PFNs that have been shared, along with
a refcount.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-8-qperret@google.com
The create_hyp_mappings() function can currently be called at any point
in time. However, its behaviour in protected mode changes widely
depending on when it is being called. Prior to KVM init, it is used to
create the temporary page-table used to bring-up the hypervisor, and
later on it is transparently turned into a 'share' hypercall when the
kernel has lost control over the hypervisor stage-1. In order to prepare
the ground for also unsharing pages with the hypervisor during guest
teardown, introduce a kvm_share_hyp() function to make it clear in which
places a share hypercall should be expected, as we will soon need a
matching unshare hypercall in all those places.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-7-qperret@google.com
Implement kvm_pgtable_hyp_unmap() which can be used to remove hypervisor
stage-1 mappings at EL2.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-6-qperret@google.com
kvm_pgtable_hyp_unmap() relies on the ->page_count() function callback
being provided by the memory-management operations for the page-table.
Wire up this callback for the hypervisor stage-1 page-table.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-5-qperret@google.com
In nVHE-protected mode, the hyp stage-1 page-table refcount is broken
due to the lack of refcount support in the early allocator. Fix-up the
refcount in the finalize walker, once the 'hyp_vmemmap' is up and running.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-4-qperret@google.com
To prepare the ground for allowing hyp stage-1 mappings to be removed at
run-time, update the KVM page-table code to maintain a correct refcount
using the ->{get,put}_page() function callbacks.
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-3-qperret@google.com
In nVHE protected mode, the EL2 code uses a temporary allocator during
boot while re-creating its stage-1 page-table. Unfortunately, the
hyp_vmmemap is not ready to use at this stage, so refcounting pages
is not possible. That is not currently a problem because hyp stage-1
mappings are never removed, which implies refcounting of page-table
pages is unnecessary.
In preparation for allowing hypervisor stage-1 mappings to be removed,
provide stub implementations for {get,put}_page() in the early allocator.
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211215161232.1480836-2-qperret@google.com
* kvm-arm64/vgic-fixes-5.17:
: .
: A few vgic fixes:
: - Harden vgic-v3 error handling paths against signed vs unsigned
: comparison that will happen once the xarray-based vcpus are in
: - Demote userspace-triggered console output to kvm_debug()
: .
KVM: arm64: vgic: Demote userspace-triggered console prints to kvm_debug()
KVM: arm64: vgic-v3: Fix vcpu index comparison
Signed-off-by: Marc Zyngier <maz@kernel.org>
Running the KVM selftests results in these messages being dumped
in the kernel console:
[ 188.051073] kvm [469]: VGIC redist and dist frames overlap
[ 188.056820] kvm [469]: VGIC redist and dist frames overlap
[ 188.076199] kvm [469]: VGIC redist and dist frames overlap
Being amle to trigger this from userspace is definitely not on,
so demote these warnings to kvm_debug().
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211216104507.1482017-1-maz@kernel.org
When handling an error at the point where we try and register
all the redistributors, we unregister all the previously
registered frames by counting down from the failing index.
However, the way the code is written relies on that index
being a signed value. Which won't be true once we switch to
an xarray-based vcpu set.
Since this code is pretty awkward the first place, and that the
failure mode is hard to spot, rewrite this loop to iterate
over the vcpus upwards rather than downwards.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211216104526.1482124-1-maz@kernel.org
Eqos ethernet support five queues on hardware, enable these queues and
configure the priority of each queue. Uses Strict Priority as scheduling
algorithms to ensure that the TSN function works.
The priority of each queue is a bitmask value that maps VLAN tag
priority to the queue. Since the hardware only supports five queues,
this patch maps priority 0-4 to queues one by one, and priority 5-7 to
queue 4.
The total fifo size of 5 queues is 8192 bytes, if enable 5 queues with
store-and-forward mode, it's not enough for large packets, which would
trigger fifo overflow frequently. This patch set DMA to thresh mode to
enable all 5 queues.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reviewed-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add overlays for various serdes protocols on LS1028A QDS board using
different PHY cards. These should be applied at boot, based on serdes
configuration. If no overlay is applied, only the RGMII interface on
the QDS is available in Linux.
Building device tree fragments requires passing the "-@" argument to
dtc, which increases the base dtb size and might cause some platforms to
fail to store the new binary. To avoid that, it would be nice to only
pass "-@" for the platforms where fragments will be used, aka
LS1028A-QDS. One approach suggested by Rob Herring is used here:
https://lore.kernel.org/patchwork/patch/821645/
Also moved the enet* override nodes in dts file to be in alphabetic order.
Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Jason Liu <jason.hui.liu@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The i2c rtc is on i2c2 bus not i2c1 bus, so fix it in dts.
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Li Yang <leoyang.lil@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Enable pwm0 on ls1028a-rdb board which uses flextimer1.
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add pwm nodes using flextimer controller.
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add flextimer2 based ftm_alarm1 node and enable it to be the default rtc
wakeup source for rdb and qds boards instead of the original flextimer1
which is used by PWM. The ftm_alarm0 node hence is disabled by default.
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add PCIe EP nodes for ls1028a to support EP mode.
Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add interrupt line for RTC node on lx2162a-qds
Signed-off-by: Biwen Li <biwen.li@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The default NXP SDHC adapter cards for LX2162AQDS are SD 2.0/3.0
adapter card for eSDHC1, and eMMC 5.1 adapter card for eSDHC2.
Add speed modes properties supported by the two adapters in device
tree node.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Enable USB3 HW LPM feature for lx2160a.
Signed-off-by: Ran Wang <ran.wang_1@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The two external MDIO buses used to communicate with phy devices that
are external to SOC are muxed in LX2160AQDS board. These buses can be
routed to any one of the eight IO slots on LX2160AQDS board depending on
value in fpga register 0x54. Additionally the external MDIO1 is used to
communicate to the onboard RGMII phy devices. The mdio1 is controlled
by bits 4-7 of fpga register and mdio2 is controlled by bits 4-7 of fpga
register.
Signed-off-by: Pankaj Bansal <pankaj.bansal@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Disabled by default in SoC dtsi and enables in board dts files.
Signed-off-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The base i.MX8MM dtsi changes the audio PLL2 rate, which gets in the
way if it should be used for anything else than audio. As this PLL doesn't
seem to be used by any upstream supported board, just remove the rate
configuration to allow boards to set it up as they wish.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The slew rate and drive-strength of the i2c1 pads were much too
high. Bring them down to avoid signal quality issues.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Add the missing reset-gpios property to allow Linux to fully reset
the network PHY and fix the pinmux to add the neccessary pull-ups
for the PHY strap configuration.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The change adds a description of a SM8450 cpufreq-epss controller and
references to it from CPU nodes.
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211215043440.605624-11-vkoul@kernel.org
Enable the UFS and phy node and add the regulators used by them.
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211215043440.605624-9-vkoul@kernel.org
Add DTS for Qualcomm QRD platform which uses SM8450 SoC and mark the
reserved nodes.
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211215043440.605624-6-vkoul@kernel.org
Add the apps smmu node as found in the SM8450 SoC
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Acked-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211215043440.605624-5-vkoul@kernel.org
Add the reserved memory nodes for SM8450. This is based on the downstream
documentation.
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211215043440.605624-4-vkoul@kernel.org
This add based DTSI for SM8450 SoC and includes base description of
CPUs, GCC, RPMHCC, UART, interuupt-controller which helps to boot to
shell with console on boards with this SoC
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211215043440.605624-2-vkoul@kernel.org
There must be three parameters in gpio-ranges property. Fixes this not
very helpful error message:
OF: /soc/pinctrl@1000000: (null) = 3 found 3
Fixes: 1e8277854b ("arm64: dts: Add ipq6018 SoC and CP01 board support")
Cc: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/8a744cfd96aff5754bfdcf7298d208ddca5b319a.1638862030.git.baruch@tkos.co.il
Use correct compatible according to dt-binding.
Fixes + few other lines of `make qcom/sdm845-oneplus-fajita.dtb`:
arch/arm64/boot/dts/qcom/sdm845-oneplus-fajita.dt.yaml: qfprom@784000: compatible: ['qcom,qfprom'] is too short
From schema: Documentation/devicetree/bindings/nvmem/qcom,qfprom.yaml
Signed-off-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211213190228.106924-1-david@ixit.cz
Downstream defines four ADC channels related to thermal sensors external
to the PM8998 and two channels for internal voltage measurements.
Add these to the upstream SDM845 MTP, describe the thermal monitor
channels and add thermal_zones for these.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20211005032531.2251928-5-bjorn.andersson@linaro.org
Add a node for the ADC Thermal Monitor found in the PM8998 PMIC. This is
used to connect thermal zones with ADC channels.
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20211005032531.2251928-4-bjorn.andersson@linaro.org
Property #stream-id-cells is legacy leftover and isn't currently
documented nor used.
Signed-off-by: David Heidelberg <david@ixit.cz>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211208184707.100716-1-david@ixit.cz
This reverts commit ed9500c1df.
The clock-frequency property was meant to aid platforms with broken firmwares
that don't set up the timer properly on their own. Don't include it where it is
not the case.
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211202004328.459899-1-konrad.dybcio@somainline.org
Currently Soundcard has 1 rx device for headset and SoundWire Speaker Playback.
This setup has issues, ex if we try to play on headset the audio stream is
also sent to SoundWire Speakers and we will hear sound in both headsets and speakers.
Make a separate device for Speakers and Headset so that the streams are
different and handled properly.
Fixes: 45021d35fc ("arm64: dts: qcom: c630: Enable audio support")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211209175342.20386-2-srinivas.kandagatla@linaro.org
Add a node describing the USB Type C connector, in order to utilize the
Chromium OS USB Type-C driver that enumerates Type-C ports and connected
cables/peripherals and makes them visible to userspace.
Cc: Alexandru M Stan <amstan@chromium.org>
Cc: Benson Leung <bleung@chromium.org>
Signed-off-by: Prashant Malani <pmalani@chromium.org>
Reviewed-by: Alexandru M Stan <amstan@chromium.org>
Reviewed-by: Benson Leung <bleung@chromium.org>
Link: https://lore.kernel.org/r/20211209195112.366176-1-pmalani@chromium.org
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Add basic chip support for Mediatek mt7986, include
basic uart nodes, rng node and watchdog node.
Add cpu node, timer node, gic node, psci and reserved-memory node
for ARM Trusted Firmware.
Signed-off-by: Sam Shih <sam.shih@mediatek.com>
Link: https://lore.kernel.org/r/20211122123222.8016-3-sam.shih@mediatek.com
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
* kvm-arm64/pkvm-cleanups-5.17:
: .
: pKVM cleanups from Quentin Perret:
:
: This series is a collection of various fixes and cleanups for KVM/arm64
: when running in nVHE protected mode. The first two patches are real
: fixes/improvements, the following two are minor cleanups, and the last
: two help satisfy my paranoia so they're certainly optional.
: .
KVM: arm64: pkvm: Make kvm_host_owns_hyp_mappings() robust to VHE
KVM: arm64: pkvm: Stub io map functions
KVM: arm64: Make __io_map_base static
KVM: arm64: Make the hyp memory pool static
KVM: arm64: pkvm: Disable GICv2 support
KVM: arm64: pkvm: Fix hyp_pool max order
Signed-off-by: Marc Zyngier <maz@kernel.org>
The kvm_host_owns_hyp_mappings() function should return true if and only
if the host kernel is responsible for creating the hypervisor stage-1
mappings. That is only possible in standard non-VHE mode, or during boot
in protected nVHE mode. But either way, none of this makes sense in VHE,
so make sure to catch this case as well, hence making the function
return sensible values in any context (VHE or not).
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211208152300.2478542-7-qperret@google.com
Now that GICv2 is disabled in nVHE protected mode there should be no
other reason for the host to use create_hyp_io_mappings() or
kvm_phys_addr_ioremap(). Add sanity checks to make sure that assumption
remains true looking forward.
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211208152300.2478542-6-qperret@google.com
The __io_map_base variable is used at EL2 to track the end of the
hypervisor's "private" VA range in nVHE protected mode. However it
doesn't need to be used outside of mm.c, so let's make it static to keep
all the hyp VA allocation logic in one place.
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211208152300.2478542-5-qperret@google.com
The hyp memory pool struct is sized to fit exactly the needs of the
hypervisor stage-1 page-table allocator, so it is important it is not
used for anything else. As it is currently used only from setup.c,
reduce its visibility by marking it static.
Signed-off-by: Quentin Perret <qperret@google.com>
Reviewed-by: Andrew Walbran <qwandor@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211208152300.2478542-4-qperret@google.com
GICv2 requires having device mappings in guests and the hypervisor,
which is incompatible with the current pKVM EL2 page ownership model
which only covers memory. While it would be desirable to support pKVM
with GICv2, this will require a lot more work, so let's make the
current assumption clear until then.
Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211208152300.2478542-3-qperret@google.com
The EL2 page allocator in protected mode maintains a per-pool max order
value to optimize allocations when the memory region it covers is small.
However, the max order value is currently under-estimated whenever the
number of pages in the region is a power of two. Fix the estimation.
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211208152300.2478542-2-qperret@google.com
We decided to keep SoC nodes sorted by address for sanity; fix a couple
that slipped into the wrong place.
Reviewed-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Hector Martin <marcan@marcan.st>
We now know that this frequency comes from the external reference
oscillator and is used for various SoC blocks, and isn't just a random
24MHz clock, so let's call it something more appropriate.
Reviewed-by: Mark Kettenis <kettenis@openbsd.org>
Reviewed-by: Sven Peter <sven@svenpeter.dev>
Signed-off-by: Hector Martin <marcan@marcan.st>
The __dma_inv_area() and __dma_clean_area() aliases make cache.S harder
to navigate, but don't gain us anything in practice.
For clarity, let's remove them along with their redundant comments. The
only users are __dma_map_area() and __dma_unmap_area(), which need to be
position independent, and can call __pi_dcache_inval_poc() and
__pi_dcache_clean_poc() directly.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20211206124715.4101571-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The msm-id and board-id can be used to select the correct dtb when
multiple are provided to the bootloader.
Multiple DTBs can be provided on sdm845 devices using boot image header
v1 by appending them all to the kernel image before creating the boot
image. The bootloader then selects them like this:
Best match DTB tags 321/00000008/0x00000000/20001/20014/20115/20018/0/(offset)0x79998E27/(size)0x000173CD
Using pmic info 0x20014/0x20115/0x20018/0x0 for device 0x20014/0x20115/0x20018/0x0
Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211209225938.2427342-1-caleb.connolly@linaro.org
This patch enables KCSAN for arm64, with updates to build rules
to not use KCSAN for several incompatible compilation units.
Recent GCC version(at least GCC10) made outline-atomics as the
default option(unlike Clang), which will cause linker errors
for kernel/kcsan/core.o. Disables the out-of-line atomics by
no-outline-atomics to fix the linker errors.
Meanwhile, as Mark said[1], some latent issues are needed to be
fixed which isn't just a KCSAN problem, we make the KCSAN depends
on EXPERT for now.
Tested selftest and kcsan_test(built with GCC11 and Clang 13),
and all passed.
[1] https://lkml.kernel.org/r/YadiUPpJ0gADbiHQ@FVFF77S0Q05N
Acked-by: Marco Elver <elver@google.com> # kernel/kcsan
Tested-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20211211131734.126874-1-wangkefeng.wang@huawei.com
[catalin.marinas@arm.com: added comment to justify EXPERT]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Add comments to help people figure out when fpsimd_bind_state_to_cpu() and
fpsimd_update_current_state() are used.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211207163250.1373542-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for adding SME support update the bulk of the implementation
for the vector length configuration prctl() calls to be independent of
vector type.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211210184133.320748-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The vector length configuration for SME is very similar to that for SVE
so in order to allow reuse refactor the SVE configuration so that it takes
the vector type from the struct ctl_table. Since there's no dedicated space
for this we repurpose the extra1 field to store the vector type, this is
otherwise unused for integer sysctls.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211210184133.320748-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* for-next/perf-cpu:
arm64: perf: Support new DT compatibles
arm64: perf: Simplify registration boilerplate
arm64: perf: Support Denver and Carmel PMUs
Now we have a macro for BTI C that looks like a regular instruction change
all the users of the current BTI_C macro to just emit a BTI C directly and
remove the macro.
This does mean that we now unconditionally BTI annotate all assembly
functions, meaning that they are worse in this respect than code generated
by the compiler. The overhead should be minimal for implementations with a
reasonable HINT implementation.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20211214152714.2380849-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we only override the SYM_FUNC macros when we need to insert
BTI C into them, do this unconditionally to make it more likely that we'll
notice bugs in our override.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20211214152714.2380849-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
BTI is only available from v8.5 so we need to encode it using HINT in
generic code and for older toolchains. Add an assembler macro based on
one written by Mark Rutland which lets us use the mnemonic and update
the existing users.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20211214152714.2380849-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
With the trend for per-core events moving to userspace JSON, registering
names for PMUv3 implementations is increasingly a pure boilerplate
exercise. Let's wrap things a step further so we can generate the basic
PMUv3 init function with a macro invocation, and reduce further new
addition to just 2 lines each.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/b79477ea3b97f685d00511d4ecd2f686184dca34.1639490264.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Add support for the NVIDIA Denver and Carmel PMUs using the generic
PMUv3 event map for now.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
[ rm: reorder entries alphabetically ]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/5f0f69d47acca78a9e479501aa4d8b429e23cf11.1639490264.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The FEAT_LSE atomic instructions include LD* instructions which return
the original value of a memory location can be used to directly
implement FETCH opertations. Each RETURN op is implemented as a copy of
the corresponding FETCH op with a trailing instruction to generate the
new value of the memory location. We only directly implement
*_fetch_add*(), for which we have a trailing `add` instruction.
As the compiler has no visibility of the `add`, this leads to less than
optimal code generation when consuming the result.
For example, the compiler cannot constant-fold the addition into later
operations, and currently GCC 11.1.0 will compile:
return __lse_atomic_sub_return(1, v) == 0;
As:
mov w1, #0xffffffff
ldaddal w1, w2, [x0]
add w1, w1, w2
cmp w1, #0x0
cset w0, eq // eq = none
ret
This patch improves this by replacing the `add` with C addition after
the inline assembly block, e.g.
ret += i;
This allows the compiler to manipulate `i`. This permits the compiler to
merge the `add` and `cmp` for the above, e.g.
mov w1, #0xffffffff
ldaddal w1, w1, [x0]
cmp w1, #0x1
cset w0, eq // eq = none
ret
With this change the assembly for each RETURN op is identical to the
corresponding FETCH op (including barriers and clobbers) so I've removed
the inline assembly and rewritten each RETURN op in terms of the
corresponding FETCH op, e.g.
| static inline void __lse_atomic_add_return(int i, atomic_t *v)
| {
| return __lse_atomic_fetch_add(i, v) + i
| }
The new construction does not adversely affect the common case, and
before and after this patch GCC 11.1.0 can compile:
__lse_atomic_add_return(i, v)
As:
ldaddal w0, w2, [x1]
add w0, w0, w2
... while having the freedom to do better elsewhere.
This is intended as an optimization and cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We have overly conservative assembly constraints for the basic FEAT_LSE
atomic instructions, and using more accurate and permissive constraints
will allow for better code generation.
The FEAT_LSE basic atomic instructions have come in two forms:
LD{op}{order}{size} <Rs>, <Rt>, [<Rn>]
ST{op}{order}{size} <Rs>, [<Rn>]
The ST* forms are aliases of the LD* forms where:
ST{op}{order}{size} <Rs>, [<Rn>]
Is:
LD{op}{order}{size} <Rs>, XZR, [<Rn>]
For either form, both <Rs> and <Rn> are read but not written back to,
and <Rt> is written with the original value of the memory location.
Where (<Rt> == <Rs>) or (<Rt> == <Rn>), <Rt> is written *after* the
other register value(s) are consumed. There are no UNPREDICTABLE or
CONSTRAINED UNPREDICTABLE behaviours when any pair of <Rs>, <Rt>, or
<Rn> are the same register.
Our current inline assembly always uses <Rs> == <Rt>, treating this
register as both an input and an output (using a '+r' constraint). This
forces the compiler to do some unnecessary register shuffling and/or
redundant value generation.
For example, the compiler cannot reuse the <Rs> value, and currently GCC
11.1.0 will compile:
__lse_atomic_add(1, a);
__lse_atomic_add(1, b);
__lse_atomic_add(1, c);
As:
mov w3, #0x1
mov w4, w3
stadd w4, [x0]
mov w0, w3
stadd w0, [x1]
stadd w3, [x2]
We can improve this with more accurate constraints, separating <Rs> and
<Rt>, where <Rs> is an input-only register ('r'), and <Rt> is an
output-only value ('=r'). As <Rt> is written back after <Rs> is
consumed, it does not need to be earlyclobber ('=&r'), leaving the
compiler free to use the same register for both <Rs> and <Rt> where this
is desirable.
At the same time, the redundant 'r' constraint for `v` is removed, as
the `+Q` constraint is sufficient.
With this change, the above example becomes:
mov w3, #0x1
stadd w3, [x0]
stadd w3, [x1]
stadd w3, [x2]
I've made this change for the non-value-returning and FETCH ops. The
RETURN ops have a multi-instruction sequence for which we cannot use the
same constraints, and a subsequent patch will rewrite hte RETURN ops in
terms of the FETCH ops, relying on the ability for the compiler to reuse
the <Rs> value.
This is intended as an optimization.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The FEAT_LSE atomic instructions include atomic bit-clear instructions
(`ldclr*` and `stclr*`) which can be used to directly implement ANDNOT
operations. Each AND op is implemented as a copy of the corresponding
ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to the
`i` argument.
As the compiler has no visibility of the `mvn`, this leads to less than
optimal code generation when generating `i` into a register. For
example, __lse_atomic_fetch_and(0xf, v) can be compiled to:
mov w1, #0xf
mvn w1, w1
ldclral w1, w1, [x2]
This patch improves this by replacing the `mvn` with NOT in C before the
inline assembly block, e.g.
i = ~i;
This allows the compiler to generate `i` into a register more optimally,
e.g.
mov w1, #0xfffffff0
ldclral w1, w1, [x2]
With this change the assembly for each AND op is identical to the
corresponding ANDNOT op (including barriers and clobbers), so I've
removed the inline assembly and rewritten each AND op in terms of the
corresponding ANDNOT op, e.g.
| static inline void __lse_atomic_and(int i, atomic_t *v)
| {
| return __lse_atomic_andnot(~i, v);
| }
This is intended as an optimization and cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The FEAT_LSE atomic instructions include atomic ADD instructions
(`stadd*` and `ldadd*`), but do not include atomic SUB instructions, so
we must build all of the SUB operations using the ADD instructions. We
open-code these today, with each SUB op implemented as a copy of the
corresponding ADD op with a leading `neg` instruction in the inline
assembly to negate the `i` argument.
As the compiler has no visibility of the `neg`, this leads to less than
optimal code generation when generating `i` into a register. For
example, __les_atomic_fetch_sub(1, v) can be compiled to:
mov w1, #0x1
neg w1, w1
ldaddal w1, w1, [x2]
This patch improves this by replacing the `neg` with negation in C
before the inline assembly block, e.g.
i = -i;
This allows the compiler to generate `i` into a register more optimally,
e.g.
mov w1, #0xffffffff
ldaddal w1, w1, [x2]
With this change the assembly for each SUB op is identical to the
corresponding ADD op (including barriers and clobbers), so I've removed
the inline assembly and rewritten each SUB op in terms of the
corresponding ADD op, e.g.
| static inline void __lse_atomic_sub(int i, atomic_t *v)
| {
| __lse_atomic_add(-i, v);
| }
For clarity I've moved the definition of each SUB op immediately after
the corresponding ADD op, and used a single macro to create the RETURN
forms of both ops.
This is intended as an optimization and cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The code for the atomic ops is formatted inconsistently, and while this
is not a functional problem it is rather distracting when working on
them.
Some have ops have consistent indentation, e.g.
| #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \
| static inline int __lse_atomic_add_return##name(int i, atomic_t *v) \
| { \
| u32 tmp; \
| \
| asm volatile( \
| __LSE_PREAMBLE \
| " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \
| " add %w[i], %w[i], %w[tmp]" \
| : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \
| : "r" (v) \
| : cl); \
| \
| return i; \
| }
While others have negative indentation for some lines, and/or have
misaligned trailing backslashes, e.g.
| static inline void __lse_atomic_##op(int i, atomic_t *v) \
| { \
| asm volatile( \
| __LSE_PREAMBLE \
| " " #asm_op " %w[i], %[v]\n" \
| : [i] "+r" (i), [v] "+Q" (v->counter) \
| : "r" (v)); \
| }
This patch makes the indentation consistent and also aligns the trailing
backslashes. This makes the code easier to read for those (like myself)
who are easily distracted by these inconsistencies.
This is intended as a cleanup.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211210151410.2782645-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Use the EOR3 instruction to implement xor_blocks() if the instruction is
available, which is the case if the CPU implements the SHA-3 extension.
This is about 20% faster on Apple M1 when using the 5-way version.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20211213140252.2856053-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Arm PMUs can support direct userspace access of counters which allows for
low overhead (i.e. no syscall) self-monitoring of tasks. The same feature
exists on x86 called 'rdpmc'. Unlike x86, userspace access will only be
enabled for thread bound events. This could be extended if needed, but
simplifies the implementation and reduces the chances for any
information leaks (which the x86 implementation suffers from).
PMU EL0 access will be enabled when an event with userspace access is
part of the thread's context. This includes when the event is not
scheduled on the PMU. There's some additional overhead clearing
dirty counters when access is enabled in order to prevent leaking
disabled counter data from other tasks.
Unlike x86, enabling of userspace access must be requested with a new
attr bit: config1:1. If the user requests userspace access with 64-bit
counters, then the event open will fail if the h/w doesn't support
64-bit counters. Chaining is not supported with userspace access. The
modes for config1 are as follows:
config1 = 0 : user access disabled and always 32-bit
config1 = 1 : user access disabled and always 64-bit (using chaining if needed)
config1 = 2 : user access enabled and always 32-bit
config1 = 3 : user access enabled and always 64-bit
Based on work by Raphael Gault <raphael.gault@arm.com>, but has been
completely re-written.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211208201124.310740-5-robh@kernel.org
[will: Made armv8pmu_proc_user_access_handler() static]
Signed-off-by: Will Deacon <will@kernel.org>
Like x86, some users may want to disable userspace PMU counter
altogether. Add a sysctl 'perf_user_access' file to control userspace
counter access. The default is '0' which is disabled. Writing '1'
enables access.
Note that x86 supports globally enabling user access by writing '2' to
/sys/bus/event_source/devices/cpu/rdpmc. As there's not existing
userspace support to worry about, this shouldn't be necessary for Arm.
It could be added later if the need arises.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-perf-users@vger.kernel.org
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211208201124.310740-4-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Setup a thermal zone driven by SoC temperature sensor.
Create passive trip points and bind them to CPUFreq cooling
device that supports power extension.
Based on the work done by Dien Pham <dien.pham.ry@renesas.com>
and others for r8a77990 SoC.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://lore.kernel.org/r/20211208142729.2456-3-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Provide the two MIPI DSI encoders on the V3U and connect them to the DU
accordingly.
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Link: https://lore.kernel.org/r/20211130164311.2909616-2-kieran.bingham+renesas@ideasonboard.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>