Booting AArch64 Linux
=====================

Author: Will Deacon <will.deacon@arm.com>
Date  : 07 September 2012

This document is based on the ARM booting document by Russell King and
is relevant to all public releases of the AArch64 Linux kernel.

The AArch64 exception model is made up of a number of exception levels
(EL0 - EL3), with EL0 and EL1 having a secure and a non-secure
counterpart. EL2 is the hypervisor level and exists only in non-secure
mode. EL3 is the highest priority level and exists only in secure mode.

For the purposes of this document, we will use the term 'boot loader'
simply to define all software that executes on the CPU(s) before control
is passed to the Linux kernel. This may include secure monitor and
hypervisor code, or it may just be a handful of instructions for
preparing a minimal boot environment.

Essentially, the boot loader should provide (as a minimum) the
following:

1. Set up and initialise the RAM
2. Set up the device tree
3. Decompress the kernel image
4. Call the kernel image


1. Set up and initialise RAM
----------------------------

Requirement: MANDATORY

The boot loader is expected to find and initialise all RAM that the
kernel will use for volatile data storage in the system. It performs
this in a machine-dependent manner. (It may use internal algorithms
to automatically locate and size all RAM, or it may use knowledge of
the RAM in the machine, or any other method the boot loader designer
sees fit.)


2. Set up the device tree
-------------------------

Requirement: MANDATORY

The device tree blob (dtb) must be placed on an 8-byte boundary and must
not exceed 2 megabytes in size. Since the dtb will be mapped cacheable
using blocks of up to 2 megabytes in size, it must not be placed within
any 2M region which must be mapped with any specific attributes.

NOTE: versions prior to v4.2 also require that the DTB be placed within
the 512 MB region starting at text_offset bytes below the kernel Image.
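
The alignment and size limits above can be checked before the blob is
handed to the kernel. The following is a minimal sketch, not code from
any particular boot loader: dtb_placement_ok() and the DTB_* constants
are hypothetical names, and rejecting a zero-sized blob is an assumption
of this sketch. The rule about 2M regions that need specific attributes
depends on the platform memory map and is not covered here.

```c
#include <stdbool.h>
#include <stdint.h>

#define DTB_ALIGN     8u                      /* required alignment, bytes */
#define DTB_MAX_SIZE  (2ull * 1024 * 1024)    /* 2 megabytes */

/*
 * Return true if a dtb placed at physical address `addr` and occupying
 * `size` bytes meets the alignment and size limits described above.
 * Hypothetical helper: name and signature are illustrative only.
 */
static bool dtb_placement_ok(uint64_t addr, uint64_t size)
{
	if (addr % DTB_ALIGN != 0)
		return false;          /* not on an 8-byte boundary */
	if (size == 0 || size > DTB_MAX_SIZE)
		return false;          /* empty (assumed invalid) or too large */
	return true;
}
```

For instance, a 64 KB blob at 0x88000000 passes, while the same blob at
0x88000004 is rejected because of its alignment.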


3. Decompress the kernel image
------------------------------

Requirement: OPTIONAL

The AArch64 kernel does not currently provide a decompressor and
therefore requires decompression (gzip etc.) to be performed by the boot
loader if a compressed Image target (e.g. Image.gz) is used. For
boot loaders that do not implement this requirement, the uncompressed
Image target is available instead.
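
A boot loader that accepts both targets needs some way to tell them
apart. One possible heuristic, sketched below, is to look for the gzip
member header, which begins with the bytes 0x1f 0x8b. The function name
image_is_gzip() is hypothetical, and other compression formats are not
considered.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the buffer starts with the two gzip identification
 * bytes (ID1 = 0x1f, ID2 = 0x8b).  This is only a heuristic: it says
 * nothing about whether the rest of the stream is valid.
 */
static bool image_is_gzip(const uint8_t *buf, size_t len)
{
	return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}
```

If the check succeeds, the boot loader would decompress the payload
first and then treat the result as an uncompressed Image.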


4. Call the kernel image
------------------------

Requirement: MANDATORY

The decompressed kernel image contains a 64-byte header as follows:

  u32 code0;                    /* Executable code */
  u32 code1;                    /* Executable code */
  u64 text_offset;              /* Image load offset, little endian */
  u64 image_size;               /* Effective Image size, little endian */
  u64 flags;                    /* kernel flags, little endian */
  u64 res2      = 0;            /* reserved */
  u64 res3      = 0;            /* reserved */
  u64 res4      = 0;            /* reserved */
  u32 magic     = 0x644d5241;   /* Magic number, little endian, "ARM\x64" */
  u32 res5;                     /* reserved (used for PE COFF offset) */


Header notes:

- As of v3.17, all fields are little endian unless stated otherwise.

- code0/code1 are responsible for branching to stext.

- When booting through EFI, code0/code1 are initially skipped.
  res5 is an offset to the PE header and the PE header has the EFI
  entry point (efi_stub_entry). When the stub has done its work, it
  jumps to code0 to resume the normal boot process.

- Prior to v3.17, the endianness of text_offset was not specified. In
  these cases image_size is zero and text_offset is 0x80000 in the
  endianness of the kernel. Where image_size is non-zero, image_size is
  little-endian and must be respected. Where image_size is zero,
  text_offset can be assumed to be 0x80000.

- The flags field (introduced in v3.17) is a little-endian 64-bit field
  composed as follows:
  Bit 0:       Kernel endianness. 1 if BE, 0 if LE.
  Bits 1-2:    Kernel page size.
                 0 - Unspecified.
                 1 - 4K
                 2 - 16K
                 3 - 64K
  Bits 3-63:   Reserved.

- When image_size is zero, a boot loader should attempt to keep as much
  memory as possible free for use by the kernel immediately after the
  end of the kernel image. The amount of space required will vary
  depending on selected features, and is effectively unbounded.
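
The header layout and notes above can be turned into a small parser.
The following is a sketch under stated assumptions rather than a
reference implementation: struct arm64_image_info, arm64_parse_header()
and the macro names are hypothetical. Fields are read byte by byte so
that the result does not depend on the endianness of the CPU running
the boot loader, and the text_offset field is ignored when image_size is
zero, as the notes allow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARM64_HDR_SIZE          64u
#define ARM64_HDR_MAGIC         0x644d5241u   /* "ARM\x64" */
#define ARM64_DEFAULT_TEXT_OFF  0x80000ull

/* Hypothetical summary of the fields a boot loader cares about. */
struct arm64_image_info {
	uint64_t text_offset;     /* effective load offset */
	uint64_t image_size;      /* 0 if not specified (pre-v3.17) */
	bool     big_endian;      /* flags bit 0 */
	unsigned page_size_code;  /* flags bits 1-2: 0, 1=4K, 2=16K, 3=64K */
};

/* Read a little-endian 64-bit value regardless of host endianness. */
static uint64_t get_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Read a little-endian 32-bit value regardless of host endianness. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Field offsets follow from the struct layout: 0, 4, 8, 16, 24, ..., 56, 60. */
static bool arm64_parse_header(const uint8_t *buf, size_t len,
			       struct arm64_image_info *out)
{
	uint64_t flags;

	if (len < ARM64_HDR_SIZE || get_le32(buf + 56) != ARM64_HDR_MAGIC)
		return false;

	out->image_size = get_le64(buf + 16);
	/* With image_size == 0 the text_offset endianness is unspecified. */
	out->text_offset = out->image_size ? get_le64(buf + 8)
					   : ARM64_DEFAULT_TEXT_OFF;

	flags = get_le64(buf + 24);
	out->big_endian = flags & 1;
	out->page_size_code = (unsigned)((flags >> 1) & 3);
	return true;
}
```

A caller would pass the first 64 bytes of the decompressed Image and use
the returned text_offset and image_size when choosing a load address.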

The Image must be placed text_offset bytes from a 2 MB aligned base
address near the start of usable system RAM and called there. Memory
below that base address is currently unusable by Linux, and therefore it
is strongly recommended that this location is the start of system RAM.
The region between the 2 MB aligned base address and the start of the
image has no special significance to the kernel, and may be used for
other purposes.
At least image_size bytes from the start of the image must be free for
use by the kernel.
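
The placement rule amounts to a little address arithmetic. As a sketch,
assuming a single contiguous bank of RAM and ignoring any regions the
platform has reserved, a boot loader might compute the load address as
follows. arm64_pick_load_addr() and its parameters are hypothetical
names, and the caller is assumed to pass a RAM range that does not wrap
around the top of the address space.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_2M  (2ull * 1024 * 1024)

/* Round `addr` up to the next 2 MB boundary. */
static uint64_t align_up_2m(uint64_t addr)
{
	return (addr + SZ_2M - 1) & ~(SZ_2M - 1);
}

/*
 * Choose a load address for an Image inside [ram_start, ram_start +
 * ram_size).  The base is the first 2 MB aligned address in RAM and the
 * Image goes text_offset bytes above it.  Returns false if image_size
 * bytes would not fit between the load address and the end of RAM.
 */
static bool arm64_pick_load_addr(uint64_t ram_start, uint64_t ram_size,
				 uint64_t text_offset, uint64_t image_size,
				 uint64_t *load_addr)
{
	uint64_t ram_end = ram_start + ram_size;
	uint64_t base = align_up_2m(ram_start);
	uint64_t addr = base + text_offset;

	if (addr < base || addr > ram_end)
		return false;          /* overflow, or offset beyond RAM */
	if (image_size > ram_end - addr)
		return false;          /* not enough room for the kernel */

	*load_addr = addr;
	return true;
}
```

With RAM starting at 0x80000000 and a text_offset of 0x80000, this
yields a load address of 0x80080000.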

Any memory described to the kernel (even that below the start of the
image) which is not marked as reserved from the kernel (e.g., with a
memreserve region in the device tree) will be considered as available to
the kernel.

Before jumping into the kernel, the following conditions must be met:

- Quiesce all DMA capable devices so that memory does not get
  corrupted by bogus network packets or disk data. This will save
  you many hours of debugging.

- Primary CPU general-purpose register settings
  x0 = physical address of device tree blob (dtb) in system RAM.
  x1 = 0 (reserved for future use)
  x2 = 0 (reserved for future use)
  x3 = 0 (reserved for future use)

- CPU mode
  All forms of interrupts must be masked in PSTATE.DAIF (Debug, SError,
  IRQ and FIQ).
  The CPU must be in either EL2 (RECOMMENDED in order to have access to
  the virtualisation extensions) or non-secure EL1.

- Caches, MMUs
  The MMU must be off.
  Instruction cache may be on or off.
  The address range corresponding to the loaded kernel image must be
  cleaned to the PoC. In the presence of a system cache or other
  coherent masters with caches enabled, this will typically require
  cache maintenance by VA rather than set/way operations.
  System caches which respect the architected cache maintenance by VA
  operations must be configured and may be enabled.
  System caches which do not respect architected cache maintenance by VA
  operations (not recommended) must be configured and disabled.

- Architected timers
  CNTFRQ must be programmed with the timer frequency and CNTVOFF must
  be programmed with a consistent value on all CPUs. If entering the
  kernel at EL1, CNTHCTL_EL2 must have EL1PCTEN (bit 0) set where
  available.

- Coherency
  All CPUs to be booted by the kernel must be part of the same coherency
  domain on entry to the kernel. This may require IMPLEMENTATION DEFINED
  initialisation to enable the receiving of maintenance operations on
  each CPU.

- System registers
  All writable architected system registers at the exception level where
  the kernel image will be entered must be initialised by software at a
  higher exception level to prevent execution in an UNKNOWN state.

  For systems with a GICv3 interrupt controller to be used in v3 mode:
  - If EL3 is present:
    ICC_SRE_EL3.Enable (bit 3) must be initialised to 0b1.
    ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b1.
  - If the kernel is entered at EL1:
    ICC_SRE_EL2.Enable (bit 3) must be initialised to 0b1.
    ICC_SRE_EL2.SRE (bit 0) must be initialised to 0b1.
  - The DT or ACPI tables must describe a GICv3 interrupt controller.

  For systems with a GICv3 interrupt controller to be used in
  compatibility (v2) mode:
  - If EL3 is present:
    ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b0.
  - If the kernel is entered at EL1:
    ICC_SRE_EL2.SRE (bit 0) must be initialised to 0b0.
  - The DT or ACPI tables must describe a GICv2 interrupt controller.
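
The GICv3 bit requirements above reduce to setting or clearing two bits
in the relevant ICC_SRE_ELx register. The sketch below only computes the
value; actually reading and writing the system register needs
architecture-specific instructions and is left out. icc_sre_for_mode()
and the macro names are hypothetical. In compatibility mode the text
above only constrains the SRE bit, so this sketch leaves every other
bit, including Enable, as it found it.

```c
#include <stdbool.h>
#include <stdint.h>

#define ICC_SRE_SRE     (1u << 0)   /* SRE, bit 0 */
#define ICC_SRE_ENABLE  (1u << 3)   /* Enable, bit 3 */

/*
 * Given the current value of ICC_SRE_EL3 or ICC_SRE_EL2, return the
 * value that satisfies the requirements above for v3 mode
 * (gic_v3_mode == true) or compatibility (v2) mode (false).
 */
static uint32_t icc_sre_for_mode(uint32_t old, bool gic_v3_mode)
{
	if (gic_v3_mode)
		return old | ICC_SRE_ENABLE | ICC_SRE_SRE;
	return old & ~ICC_SRE_SRE;
}
```

Starting from a register value of zero, v3 mode yields 0x9 (bits 3 and 0
set).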

The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level.
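
Some of these per-CPU conditions are easy to express as a check over a
snapshot of each CPU's state. The following is an illustrative sketch
only: struct cpu_entry_state and both functions are hypothetical, the
snapshot is assumed to have been gathered by platform code, and
conditions such as cache maintenance, timer programming and security
state are not modelled. The DAIF bit positions used are those of the
DAIF system register (D = bit 9, A = bit 8, I = bit 7, F = bit 6).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DAIF_MASK_ALL  0x3c0ull   /* D, A, I and F all masked */

/* Hypothetical snapshot of the state a CPU would have at kernel entry. */
struct cpu_entry_state {
	uint64_t daif;         /* value of the DAIF system register */
	unsigned current_el;   /* exception level the kernel is entered at */
	bool     mmu_on;       /* whether the MMU is enabled at that level */
};

/* Check the CPU mode and MMU conditions for a single CPU. */
static bool cpu_entry_state_ok(const struct cpu_entry_state *s)
{
	if ((s->daif & DAIF_MASK_ALL) != DAIF_MASK_ALL)
		return false;      /* some interrupt class is unmasked */
	if (s->current_el != 1 && s->current_el != 2)
		return false;      /* only EL2 or EL1 are acceptable */
	return !s->mmu_on;         /* the MMU must be off */
}

/* Check every CPU, and that all of them use the same exception level. */
static bool all_cpus_entry_ok(const struct cpu_entry_state *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!cpu_entry_state_ok(&cpus[i]))
			return false;
		if (cpus[i].current_el != cpus[0].current_el)
			return false;
	}
	return true;
}
```

A check of this kind is most useful as a debugging aid during boot
loader bring-up.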

The boot loader is expected to enter the kernel on each CPU in the
following manner:

- The primary CPU must jump directly to the first instruction of the
  kernel image. The device tree blob passed by this CPU must contain
  an 'enable-method' property for each cpu node. The supported
  enable-methods are described below.

  It is expected that the boot loader will generate these device tree
  properties and insert them into the blob prior to kernel entry.

- CPUs with a "spin-table" enable-method must have a 'cpu-release-addr'
  property in their cpu node. This property identifies a
  naturally-aligned 64-bit zero-initialised memory location.

  These CPUs should spin outside of the kernel in a reserved area of
  memory (communicated to the kernel by a /memreserve/ region in the
  device tree) polling their cpu-release-addr location, which must be
  contained in the reserved region. A wfe instruction may be inserted
  to reduce the overhead of the busy-loop and a sev will be issued by
  the primary CPU. When a read of the location pointed to by the
  cpu-release-addr returns a non-zero value, the CPU must jump to this
  value. The value will be written as a single 64-bit little-endian
  value, so CPUs must convert the read value to their native endianness
  before jumping to it.

- CPUs with a "psci" enable-method should remain outside of
  the kernel (i.e. outside of the regions of memory described to the
  kernel in the memory node, or in a reserved area of memory described
  to the kernel by a /memreserve/ region in the device tree). The
  kernel will issue CPU_ON calls as described in ARM document number ARM
  DEN 0022A ("Power State Coordination Interface System Software on ARM
  processors") to bring CPUs into the kernel.

  The device tree should contain a 'psci' node, as described in
  Documentation/devicetree/bindings/arm/psci.txt.

- Secondary CPU general-purpose register settings
  x0 = 0 (reserved for future use)
  x1 = 0 (reserved for future use)
  x2 = 0 (reserved for future use)
  x3 = 0 (reserved for future use)
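
The "spin-table" protocol described above can be sketched in C as
follows. This is a simplified illustration, not boot loader source:
spin_table_wait() and le64_to_native() are hypothetical names, the wfe
instruction is only indicated by a comment because it cannot be written
in portable C, and the final jump to the returned address (with x0-x3
cleared) is left to the caller.

```c
#include <stdint.h>

/* Convert a value stored as little-endian into the CPU's native order. */
static uint64_t le64_to_native(uint64_t v)
{
	const union { uint16_t u; uint8_t b[2]; } probe = { .u = 1 };

	if (probe.b[0] == 1)
		return v;              /* running little-endian: no change */

	/* Running big-endian: reverse the eight bytes. */
	v = ((v & 0x00ff00ff00ff00ffull) << 8) |
	    ((v >> 8) & 0x00ff00ff00ff00ffull);
	v = ((v & 0x0000ffff0000ffffull) << 16) |
	    ((v >> 16) & 0x0000ffff0000ffffull);
	return (v << 32) | (v >> 32);
}

/*
 * Poll the cpu-release-addr location until it holds a non-zero value,
 * then return that value in native endianness.  Each poll is a single
 * 64-bit load, matching the single 64-bit store made by the primary CPU.
 */
static uint64_t spin_table_wait(const volatile uint64_t *release_addr)
{
	uint64_t raw;

	while ((raw = *release_addr) == 0) {
		/* A real implementation would execute "wfe" here. */
	}
	return le64_to_native(raw);
}
```

A secondary CPU would call spin_table_wait() with the address taken from
its 'cpu-release-addr' property and then branch to the result.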