License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 14:07:57 +00:00
|
|
|
/* SPDX-License-Identifier: GPL-2.0 */
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
|
|
|
* linux/boot/head.S
|
|
|
|
*
|
|
|
|
* Copyright (C) 1991, 1992, 1993 Linus Torvalds
|
|
|
|
*/
|
|
|
|
|
|
|
|
/*
|
|
|
|
* head.S contains the 32-bit startup code.
|
|
|
|
*
|
|
|
|
* NOTE!!! Startup happens at absolute address 0x00001000, which is also where
|
|
|
|
* the page directory will exist. The startup code will be overwritten by
|
|
|
|
* the page directory. [According to comments etc elsewhere on a compressed
|
|
|
|
* kernel it will end up at 0x1000 + 1Mb I hope so as I assume this. - AC]
|
|
|
|
*
|
|
|
|
* Page 0 is deliberately kept safe, since System Management Mode code in
|
|
|
|
* laptops may need to access the BIOS data stored there. This is also
|
|
|
|
* useful for future device drivers that either access the BIOS via VM86
|
|
|
|
* mode.
|
|
|
|
*/
|
|
|
|
|
|
|
|
/*
|
2005-06-25 21:58:59 +00:00
|
|
|
* High loaded stuff by Hans Lermen & Werner Almesberger, Feb. 1996
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2009-05-08 22:59:13 +00:00
|
|
|
.code32
|
|
|
|
.text
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2009-09-16 20:44:27 +00:00
|
|
|
#include <linux/init.h>
|
2005-04-16 22:20:36 +00:00
|
|
|
#include <linux/linkage.h>
|
|
|
|
#include <asm/segment.h>
|
2008-04-08 10:54:30 +00:00
|
|
|
#include <asm/boot.h>
|
2007-05-02 17:27:07 +00:00
|
|
|
#include <asm/msr.h>
|
2008-05-12 13:43:39 +00:00
|
|
|
#include <asm/processor-flags.h>
|
2007-10-26 17:29:04 +00:00
|
|
|
#include <asm/asm-offsets.h>
|
2015-02-19 07:34:58 +00:00
|
|
|
#include <asm/bootparam.h>
|
2020-09-07 13:15:14 +00:00
|
|
|
#include <asm/desc_defs.h>
|
2021-03-10 08:43:22 +00:00
|
|
|
#include <asm/trapnr.h>
|
2018-03-12 10:02:44 +00:00
|
|
|
#include "pgtable.h"
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2022-09-15 11:10:47 +00:00
|
|
|
/*
|
|
|
|
* Fix alignment at 16 bytes. Following CONFIG_FUNCTION_ALIGNMENT will result
|
|
|
|
* in assembly errors due to trying to move .org backward due to the excessive
|
|
|
|
* alignment.
|
|
|
|
*/
|
|
|
|
#undef __ALIGN
|
|
|
|
#define __ALIGN .balign 16, 0x90
|
|
|
|
|
x86/build: Build compressed x86 kernels as PIE
The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
relocation to get the symbol address in PIC. When the compressed x86
kernel isn't built as PIC, the linker optimizes R_386_GOT32X relocations
to their fixed symbol addresses. However, when the compressed x86
kernel is loaded at a different address, it leads to the following
load failure:
Failed to allocate space for phdrs
during the decompression stage.
If the compressed x86 kernel is relocatable at run-time, it should be
compiled with -fPIE, instead of -fPIC, if possible and should be built as
Position Independent Executable (PIE) so that linker won't optimize
R_386_GOT32X relocation to its fixed symbol address.
Older linkers generate R_386_32 relocations against locally defined
symbols, _bss, _ebss, _got and _egot, in PIE. It isn't wrong, just less
optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
R_386_32 relocations when relocating the kernel. To generate
R_386_RELATIVE relocations, we mark _bss, _ebss, _got and _egot as
hidden in both 32-bit and 64-bit x86 kernels.
To build a 64-bit compressed x86 kernel as PIE, we need to disable the
relocation overflow check to avoid relocation overflow errors. We do
this with a new linker command-line option, -z noreloc-overflow, which
got added recently:
commit 4c10bbaa0912742322f10d9d5bb630ba4e15dfa7
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Mar 15 11:07:06 2016 -0700
Add -z noreloc-overflow option to x86-64 ld
Add -z noreloc-overflow command-line option to the x86-64 ELF linker to
disable relocation overflow check. This can be used to avoid relocation
overflow check if there will be no dynamic relocation overflow at
run-time.
The 64-bit compressed x86 kernel is built as PIE only if the linker supports
-z noreloc-overflow. So far 64-bit relocatable compressed x86 kernel
boots fine even when it is built as a normal executable.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Edited the changelog and comments. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-17 03:04:35 +00:00
|
|
|
/*
|
|
|
|
* Locally defined symbols should be marked hidden:
|
|
|
|
*/
|
|
|
|
.hidden _bss
|
|
|
|
.hidden _ebss
|
x86/boot: Correct relocation destination on old linkers
For the 32-bit kernel, as described in
6d92bc9d483a ("x86/build: Build compressed x86 kernels as PIE"),
pre-2.26 binutils generates R_386_32 relocations in PIE mode. Since the
startup code does not perform relocation, any reloc entry with R_386_32
will remain as 0 in the executing code.
Commit
974f221c84b0 ("x86/boot: Move compressed kernel to the end of the
decompression buffer")
added a new symbol _end but did not mark it hidden, which doesn't give
the correct offset on older linkers. This causes the compressed kernel
to be copied beyond the end of the decompression buffer, rather than
flush against it. This region of memory may be reserved or already
allocated for other purposes by the bootloader.
Mark _end as hidden to fix. This changes the relocation from R_386_32 to
R_386_RELATIVE even on the pre-2.26 binutils.
For 64-bit, this is not strictly necessary, as the 64-bit kernel is only
built as PIE if the linker supports -z noreloc-overflow, which implies
binutils-2.27+, but for consistency, mark _end as hidden here too.
The below illustrates the before/after impact of the patch using
binutils-2.25 and gcc-4.6.4 (locally compiled from source) and QEMU.
Disassembly before patch:
48: 8b 86 60 02 00 00 mov 0x260(%esi),%eax
4e: 2d 00 00 00 00 sub $0x0,%eax
4f: R_386_32 _end
Disassembly after patch:
48: 8b 86 60 02 00 00 mov 0x260(%esi),%eax
4e: 2d 00 f0 76 00 sub $0x76f000,%eax
4f: R_386_RELATIVE *ABS*
Dump from extract_kernel before patch:
early console in extract_kernel
input_data: 0x0207c098 <--- this is at output + init_size
input_len: 0x0074fef1
output: 0x01000000
output_len: 0x00fa63d0
kernel_total_size: 0x0107c000
needed_size: 0x0107c000
Dump from extract_kernel after patch:
early console in extract_kernel
input_data: 0x0190d098 <--- this is at output + init_size - _end
input_len: 0x0074fef1
output: 0x01000000
output_len: 0x00fa63d0
kernel_total_size: 0x0107c000
needed_size: 0x0107c000
Fixes: 974f221c84b0 ("x86/boot: Move compressed kernel to the end of the decompression buffer")
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200207214926.3564079-1-nivedita@alum.mit.edu
2020-02-07 21:49:26 +00:00
|
|
|
.hidden _end
|
x86/build: Build compressed x86 kernels as PIE
The 32-bit x86 assembler in binutils 2.26 will generate R_386_GOT32X
relocation to get the symbol address in PIC. When the compressed x86
kernel isn't built as PIC, the linker optimizes R_386_GOT32X relocations
to their fixed symbol addresses. However, when the compressed x86
kernel is loaded at a different address, it leads to the following
load failure:
Failed to allocate space for phdrs
during the decompression stage.
If the compressed x86 kernel is relocatable at run-time, it should be
compiled with -fPIE, instead of -fPIC, if possible and should be built as
Position Independent Executable (PIE) so that linker won't optimize
R_386_GOT32X relocation to its fixed symbol address.
Older linkers generate R_386_32 relocations against locally defined
symbols, _bss, _ebss, _got and _egot, in PIE. It isn't wrong, just less
optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
R_386_32 relocations when relocating the kernel. To generate
R_386_RELATIVE relocations, we mark _bss, _ebss, _got and _egot as
hidden in both 32-bit and 64-bit x86 kernels.
To build a 64-bit compressed x86 kernel as PIE, we need to disable the
relocation overflow check to avoid relocation overflow errors. We do
this with a new linker command-line option, -z noreloc-overflow, which
got added recently:
commit 4c10bbaa0912742322f10d9d5bb630ba4e15dfa7
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Mar 15 11:07:06 2016 -0700
Add -z noreloc-overflow option to x86-64 ld
Add -z noreloc-overflow command-line option to the x86-64 ELF linker to
disable relocation overflow check. This can be used to avoid relocation
overflow check if there will be no dynamic relocation overflow at
run-time.
The 64-bit compressed x86 kernel is built as PIE only if the linker supports
-z noreloc-overflow. So far 64-bit relocatable compressed x86 kernel
boots fine even when it is built as a normal executable.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
[ Edited the changelog and comments. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-17 03:04:35 +00:00
|
|
|
|
2009-09-16 20:44:27 +00:00
|
|
|
__HEAD
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* This macro gives the relative virtual address of X, i.e. the offset of X
|
|
|
|
* from startup_32. This is the same as the link-time virtual address of X,
|
|
|
|
* since startup_32 is at 0, but defining it this way tells the
|
|
|
|
* assembler/linker that we do not want the actual run-time address of X. This
|
|
|
|
* prevents the linker from trying to create unwanted run-time relocation
|
|
|
|
* entries for the reference when the compressed kernel is linked as PIE.
|
|
|
|
*
|
|
|
|
* A reference X(%reg) will result in the link-time VA of X being stored with
|
|
|
|
* the instruction, and a run-time R_X86_64_RELATIVE relocation entry that
|
|
|
|
* adds the 64-bit base address where the kernel is loaded.
|
|
|
|
*
|
|
|
|
* Replacing it with (X-startup_32)(%reg) results in the offset being stored,
|
|
|
|
* and no run-time relocation.
|
|
|
|
*
|
|
|
|
* The macro should be used as a displacement with a base register containing
|
|
|
|
* the run-time address of startup_32 [i.e. rva(X)(%reg)], or as an immediate
|
|
|
|
* [$ rva(X)].
|
|
|
|
*
|
|
|
|
* This macro can only be used from within the .head.text section, since the
|
|
|
|
* expression requires startup_32 to be in the same section as the code being
|
|
|
|
* assembled.
|
|
|
|
*/
|
|
|
|
#define rva(X) ((X) - startup_32)
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
.code32
|
2019-10-11 11:51:04 +00:00
|
|
|
SYM_FUNC_START(startup_32)
|
2013-01-24 20:20:07 +00:00
|
|
|
/*
|
|
|
|
* 32bit entry is 0 and it is ABI so immutable!
|
|
|
|
* If we come here directly from a bootloader,
|
|
|
|
* kernel(text+data+bss+brk) ramdisk, zero_page, command line
|
|
|
|
* all need to be under the 4G limit.
|
|
|
|
*/
|
2005-04-16 22:20:36 +00:00
|
|
|
cld
|
|
|
|
cli
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
/*
|
|
|
|
* Calculate the delta between where we were compiled to run
|
2007-05-02 17:27:07 +00:00
|
|
|
* at and where we were actually loaded at. This can only be done
|
|
|
|
* with a short local call on x86. Nothing else will tell us what
|
|
|
|
* address we are running at. The reserved chunk of the real-mode
|
2007-07-11 19:18:33 +00:00
|
|
|
* data at 0x1e4 (defined as a scratch field) are used as the stack
|
|
|
|
* for this calculation. Only 4 bytes are needed.
|
2007-05-02 17:27:07 +00:00
|
|
|
*/
|
2009-05-06 06:24:50 +00:00
|
|
|
leal (BP_scratch+4)(%esi), %esp
|
2007-05-02 17:27:07 +00:00
|
|
|
call 1f
|
|
|
|
1: popl %ebp
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
subl $ rva(1b), %ebp
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2020-02-02 17:13:48 +00:00
|
|
|
/* Load new GDT with the 64bit segments using 32bit descriptor */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(gdt)(%ebp), %eax
|
2020-02-02 17:13:53 +00:00
|
|
|
movl %eax, 2(%eax)
|
|
|
|
lgdt (%eax)
|
2020-02-02 17:13:48 +00:00
|
|
|
|
|
|
|
/* Load segment registers with our descriptors */
|
|
|
|
movl $__BOOT_DS, %eax
|
|
|
|
movl %eax, %ds
|
|
|
|
movl %eax, %es
|
|
|
|
movl %eax, %fs
|
|
|
|
movl %eax, %gs
|
|
|
|
movl %eax, %ss
|
|
|
|
|
2021-03-10 08:43:20 +00:00
|
|
|
/* Setup a stack and load CS from current GDT */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(boot_stack_end)(%ebp), %esp
|
2007-05-02 17:27:08 +00:00
|
|
|
|
2021-03-10 08:43:20 +00:00
|
|
|
pushl $__KERNEL32_CS
|
|
|
|
leal rva(1f)(%ebp), %eax
|
|
|
|
pushl %eax
|
|
|
|
lretl
|
|
|
|
1:
|
|
|
|
|
2021-03-10 08:43:21 +00:00
|
|
|
/* Setup Exception handling for SEV-ES */
|
2022-11-22 16:10:11 +00:00
|
|
|
#ifdef CONFIG_AMD_MEM_ENCRYPT
|
2021-03-10 08:43:21 +00:00
|
|
|
call startup32_load_idt
|
2022-11-22 16:10:11 +00:00
|
|
|
#endif
|
2021-03-10 08:43:21 +00:00
|
|
|
|
2021-03-10 08:43:20 +00:00
|
|
|
/* Make sure cpu supports long mode. */
|
2007-05-02 17:27:08 +00:00
|
|
|
call verify_cpu
|
|
|
|
testl %eax, %eax
|
2019-09-06 07:55:50 +00:00
|
|
|
jnz .Lno_longmode
|
2007-05-02 17:27:08 +00:00
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
/*
|
|
|
|
* Compute the delta between where we were compiled to run at
|
2007-05-02 17:27:07 +00:00
|
|
|
* and where the code will actually run at.
|
2009-05-08 22:59:13 +00:00
|
|
|
*
|
|
|
|
* %ebp contains the address we are loaded at by the boot loader and %ebx
|
2007-05-02 17:27:07 +00:00
|
|
|
* contains the address where we should move the kernel image temporarily
|
|
|
|
* for safe in-place decompression.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#ifdef CONFIG_RELOCATABLE
|
|
|
|
movl %ebp, %ebx
|
2020-03-08 08:08:47 +00:00
|
|
|
|
|
|
|
#ifdef CONFIG_EFI_STUB
|
|
|
|
/*
|
|
|
|
* If we were loaded via the EFI LoadImage service, startup_32 will be at an
|
|
|
|
* offset to the start of the space allocated for the image. efi_pe_entry will
|
|
|
|
* set up image_offset to tell us where the image actually starts, so that we
|
|
|
|
* can use the full available buffer.
|
|
|
|
* image_offset = startup_32 - image_base
|
|
|
|
* Otherwise image_offset will be zero and has no effect on the calculations.
|
|
|
|
*/
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
subl rva(image_offset)(%ebp), %ebx
|
2020-03-08 08:08:47 +00:00
|
|
|
#endif
|
|
|
|
|
2009-05-11 22:56:08 +00:00
|
|
|
movl BP_kernel_alignment(%esi), %eax
|
|
|
|
decl %eax
|
|
|
|
addl %eax, %ebx
|
|
|
|
notl %eax
|
|
|
|
andl %eax, %ebx
|
2013-10-11 00:18:14 +00:00
|
|
|
cmpl $LOAD_PHYSICAL_ADDR, %ebx
|
2020-03-08 08:08:44 +00:00
|
|
|
jae 1f
|
2007-05-02 17:27:07 +00:00
|
|
|
#endif
|
2013-10-11 00:18:14 +00:00
|
|
|
movl $LOAD_PHYSICAL_ADDR, %ebx
|
|
|
|
1:
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2009-05-09 00:42:16 +00:00
|
|
|
/* Target address to relocate to for decompression */
|
2020-03-08 08:08:47 +00:00
|
|
|
addl BP_init_size(%esi), %ebx
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
subl $ rva(_end), %ebx
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/*
|
2007-05-02 17:27:07 +00:00
|
|
|
* Prepare for entering 64 bit mode
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2007-05-02 17:27:07 +00:00
|
|
|
|
|
|
|
/* Enable PAE mode */
|
2014-02-24 13:37:29 +00:00
|
|
|
movl %cr4, %eax
|
|
|
|
orl $X86_CR4_PAE, %eax
|
2007-05-02 17:27:07 +00:00
|
|
|
movl %eax, %cr4
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Build early 4G boot pagetable
|
|
|
|
*/
|
2017-10-20 14:30:54 +00:00
|
|
|
/*
|
|
|
|
* If SEV is active then set the encryption mask in the page tables.
|
2022-11-22 16:10:15 +00:00
|
|
|
* This will ensure that when the kernel is copied and decompressed
|
2017-10-20 14:30:54 +00:00
|
|
|
* it will be done so encrypted.
|
|
|
|
*/
|
|
|
|
xorl %edx, %edx
|
2021-03-12 12:38:23 +00:00
|
|
|
#ifdef CONFIG_AMD_MEM_ENCRYPT
|
2022-11-22 16:10:15 +00:00
|
|
|
call get_sev_encryption_bit
|
|
|
|
xorl %edx, %edx
|
2017-10-20 14:30:54 +00:00
|
|
|
testl %eax, %eax
|
|
|
|
jz 1f
|
|
|
|
subl $32, %eax /* Encryption bit is always above bit 31 */
|
|
|
|
bts %eax, %edx /* Set encryption mask for page tables */
|
2021-03-12 12:38:23 +00:00
|
|
|
/*
|
2022-02-09 18:10:01 +00:00
|
|
|
* Set MSR_AMD64_SEV_ENABLED_BIT in sev_status so that
|
|
|
|
* startup32_check_sev_cbit() will do a check. sev_enable() will
|
|
|
|
* initialize sev_status with all the bits reported by
|
|
|
|
* MSR_AMD_SEV_STATUS later, but only MSR_AMD64_SEV_ENABLED_BIT
|
|
|
|
* needs to be set for now.
|
2021-03-12 12:38:23 +00:00
|
|
|
*/
|
|
|
|
movl $1, rva(sev_status)(%ebp)
|
2017-10-20 14:30:54 +00:00
|
|
|
1:
|
2021-03-12 12:38:23 +00:00
|
|
|
#endif
|
2017-10-20 14:30:54 +00:00
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
/* Initialize Page tables to 0 */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(pgtable)(%ebx), %edi
|
2007-05-02 17:27:07 +00:00
|
|
|
xorl %eax, %eax
|
x86/KASLR: Build identity mappings on demand
Currently KASLR only supports relocation in a small physical range (from
16M to 1G), due to using the initial kernel page table identity mapping.
To support ranges above this, we need to have an identity mapping for the
desired memory range before we can decompress (and later run) the kernel.
32-bit kernels already have the needed identity mapping. This patch adds
identity mappings for the needed memory ranges on 64-bit kernels. This
happens in two possible boot paths:
If loaded via startup_32(), we need to set up the needed identity map.
If loaded from a 64-bit bootloader, the bootloader will have already
set up an identity mapping, and we'll start via the compressed kernel's
startup_64(). In this case, the bootloader's page tables need to be
avoided while selecting the new uncompressed kernel location. If not,
the decompressor could overwrite them during decompression.
To accomplish this, we could walk the pagetable and find every page
that is used, and add them to mem_avoid, but this needs extra code and
will require increasing the size of the mem_avoid array.
Instead, we can create a new set of page tables for our own identity
mapping instead. The pages for the new page table will come from the
_pagetable section of the compressed kernel, which means they are
already contained by in mem_avoid array. To do this, we reuse the code
from the uncompressed kernel's identity mapping routines.
The _pgtable will be shared by both the 32-bit and 64-bit paths to reduce
init_size, as now the compressed kernel's _rodata to _end will contribute
to init_size.
To handle the possible mappings, we need to increase the existing page
table buffer size:
When booting via startup_64(), we need to cover the old VO, params,
cmdline and uncompressed kernel. In an extreme case we could have them
all beyond the 512G boundary, which needs (2+2)*4 pages with 2M mappings.
And we'll need 2 for first 2M for VGA RAM. One more is needed for level4.
This gets us to 19 pages total.
When booting via startup_32(), KASLR could move the uncompressed kernel
above 4G, so we need to create extra identity mappings, which should only
need (2+2) pages at most when it is beyond the 512G boundary. So 19
pages is sufficient for this case as well.
The resulting BOOT_*PGT_SIZE defines use the "_SIZE" suffix on their
names to maintain logical consistency with the existing BOOT_HEAP_SIZE
and BOOT_STACK_SIZE defines.
This patch is based on earlier patches from Yinghai Lu and Baoquan He.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: lasse.collin@tukaani.org
Link: http://lkml.kernel.org/r/1462572095-11754-4-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-05-06 22:01:35 +00:00
|
|
|
movl $(BOOT_INIT_PGT_SIZE/4), %ecx
|
2007-05-02 17:27:07 +00:00
|
|
|
rep stosl
|
|
|
|
|
|
|
|
/* Build Level 4 */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(pgtable + 0)(%ebx), %edi
|
2007-05-02 17:27:07 +00:00
|
|
|
leal 0x1007 (%edi), %eax
|
|
|
|
movl %eax, 0(%edi)
|
2017-10-20 14:30:54 +00:00
|
|
|
addl %edx, 4(%edi)
|
2007-05-02 17:27:07 +00:00
|
|
|
|
|
|
|
/* Build Level 3 */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(pgtable + 0x1000)(%ebx), %edi
|
2007-05-02 17:27:07 +00:00
|
|
|
leal 0x1007(%edi), %eax
|
|
|
|
movl $4, %ecx
|
|
|
|
1: movl %eax, 0x00(%edi)
|
2017-10-20 14:30:54 +00:00
|
|
|
addl %edx, 0x04(%edi)
|
2007-05-02 17:27:07 +00:00
|
|
|
addl $0x00001000, %eax
|
|
|
|
addl $8, %edi
|
|
|
|
decl %ecx
|
|
|
|
jnz 1b
|
|
|
|
|
|
|
|
/* Build Level 2 */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(pgtable + 0x2000)(%ebx), %edi
|
2007-05-02 17:27:07 +00:00
|
|
|
movl $0x00000183, %eax
|
|
|
|
movl $2048, %ecx
|
|
|
|
1: movl %eax, 0(%edi)
|
2017-10-20 14:30:54 +00:00
|
|
|
addl %edx, 4(%edi)
|
2007-05-02 17:27:07 +00:00
|
|
|
addl $0x00200000, %eax
|
|
|
|
addl $8, %edi
|
|
|
|
decl %ecx
|
|
|
|
jnz 1b
|
|
|
|
|
|
|
|
/* Enable the boot page tables */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(pgtable)(%ebx), %eax
|
2007-05-02 17:27:07 +00:00
|
|
|
movl %eax, %cr3
|
|
|
|
|
|
|
|
/* Enable Long mode in EFER (Extended Feature Enable Register) */
|
|
|
|
movl $MSR_EFER, %ecx
|
|
|
|
rdmsr
|
|
|
|
btsl $_EFER_LME, %eax
|
|
|
|
wrmsr
|
|
|
|
|
2013-01-24 20:20:01 +00:00
|
|
|
/* After gdt is loaded */
|
|
|
|
xorl %eax, %eax
|
|
|
|
lldt %ax
|
2015-04-01 14:50:58 +00:00
|
|
|
movl $__BOOT_TSS, %eax
|
2013-01-24 20:20:01 +00:00
|
|
|
ltr %ax
|
|
|
|
|
2022-11-22 16:10:13 +00:00
|
|
|
#ifdef CONFIG_AMD_MEM_ENCRYPT
|
|
|
|
/* Check if the C-bit position is correct when SEV is active */
|
|
|
|
call startup32_check_sev_cbit
|
|
|
|
#endif
|
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
/*
|
|
|
|
* Setup for the jump to 64bit mode
|
2007-05-02 17:27:07 +00:00
|
|
|
*
|
2021-03-21 21:28:53 +00:00
|
|
|
* When the jump is performed we will be in long mode but
|
2007-05-02 17:27:07 +00:00
|
|
|
* in 32bit compatibility mode with EFER.LME = 1, CS.L = 0, CS.D = 1
|
|
|
|
* (and in turn EFER.LMA = 1). To jump into 64bit mode we use
|
|
|
|
* the new gdt/idt that has __KERNEL_CS with CS.L = 1.
|
|
|
|
* We place all of the values on our mini stack so lret can
|
|
|
|
* used to perform that far jump.
|
|
|
|
*/
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leal rva(startup_64)(%ebp), %eax
|
x86/efi: Firmware agnostic handover entry points
The EFI handover code only works if the "bitness" of the firmware and
the kernel match, i.e. 64-bit firmware and 64-bit kernel - it is not
possible to mix the two. This goes against the tradition that a 32-bit
kernel can be loaded on a 64-bit BIOS platform without having to do
anything special in the boot loader. Linux distributions, for one thing,
regularly run only 32-bit kernels on their live media.
Despite having only one 'handover_offset' field in the kernel header,
EFI boot loaders use two separate entry points to enter the kernel based
on the architecture the boot loader was compiled for,
(1) 32-bit loader: handover_offset
(2) 64-bit loader: handover_offset + 512
Since we already have two entry points, we can leverage them to infer
the bitness of the firmware we're running on, without requiring any boot
loader modifications, by making (1) and (2) valid entry points for both
CONFIG_X86_32 and CONFIG_X86_64 kernels.
To be clear, a 32-bit boot loader will always use (1) and a 64-bit boot
loader will always use (2). It's just that, if a single kernel image
supports (1) and (2) that image can be used with both 32-bit and 64-bit
boot loaders, and hence both 32-bit and 64-bit EFI.
(1) and (2) must be 512 bytes apart at all times, but that is already
part of the boot ABI and we could never change that delta without
breaking existing boot loaders anyhow.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-10 15:54:31 +00:00
|
|
|
#ifdef CONFIG_EFI_MIXED
|
2022-11-22 16:10:03 +00:00
|
|
|
cmpb $1, rva(efi_is64)(%ebp)
|
|
|
|
je 1f
|
|
|
|
leal rva(startup_64_mixed_mode)(%ebp), %eax
|
x86/efi: Firmware agnostic handover entry points
The EFI handover code only works if the "bitness" of the firmware and
the kernel match, i.e. 64-bit firmware and 64-bit kernel - it is not
possible to mix the two. This goes against the tradition that a 32-bit
kernel can be loaded on a 64-bit BIOS platform without having to do
anything special in the boot loader. Linux distributions, for one thing,
regularly run only 32-bit kernels on their live media.
Despite having only one 'handover_offset' field in the kernel header,
EFI boot loaders use two separate entry points to enter the kernel based
on the architecture the boot loader was compiled for,
(1) 32-bit loader: handover_offset
(2) 64-bit loader: handover_offset + 512
Since we already have two entry points, we can leverage them to infer
the bitness of the firmware we're running on, without requiring any boot
loader modifications, by making (1) and (2) valid entry points for both
CONFIG_X86_32 and CONFIG_X86_64 kernels.
To be clear, a 32-bit boot loader will always use (1) and a 64-bit boot
loader will always use (2). It's just that, if a single kernel image
supports (1) and (2) that image can be used with both 32-bit and 64-bit
boot loaders, and hence both 32-bit and 64-bit EFI.
(1) and (2) must be 512 bytes apart at all times, but that is already
part of the boot ABI and we could never change that delta without
breaking existing boot loaders anyhow.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-10 15:54:31 +00:00
|
|
|
1:
|
|
|
|
#endif
|
2021-03-12 12:38:23 +00:00
|
|
|
|
2020-06-17 13:19:57 +00:00
|
|
|
pushl $__KERNEL_CS
|
2007-05-02 17:27:07 +00:00
|
|
|
pushl %eax
|
|
|
|
|
|
|
|
/* Enter paged protected Mode, activating Long Mode */
|
2022-04-05 23:29:31 +00:00
|
|
|
movl $CR0_STATE, %eax
|
2007-05-02 17:27:07 +00:00
|
|
|
movl %eax, %cr0
|
|
|
|
|
|
|
|
/* Jump from 32bit compatibility mode into 64bit mode. */
|
|
|
|
lret
|
2019-10-11 11:51:04 +00:00
|
|
|
SYM_FUNC_END(startup_32)
|
2007-05-02 17:27:07 +00:00
|
|
|
|
x86/efi: Make the deprecated EFI handover protocol optional
The EFI handover protocol permits a bootloader to invoke the kernel as a
EFI PE/COFF application, while passing a bootparams struct as a third
argument to the entrypoint function call.
This has no basis in the UEFI specification, and there are better ways
to pass additional data to a UEFI application (UEFI configuration
tables, UEFI variables, UEFI protocols) than going around the
StartImage() boot service and jumping to a fixed offset in the loaded
image, just to call a different function that takes a third parameter.
The reason for handling struct bootparams in the bootloader was that the
EFI stub could only load initrd images from the EFI system partition,
and so passing it via struct bootparams was needed for loaders like
GRUB, which pass the initrd in memory, and may load it from anywhere,
including from the network. Another motivation was EFI mixed mode, which
could not use the initrd loader in the EFI stub at all due to 32/64 bit
incompatibilities (which will be fixed shortly [0]), and could not
invoke the ordinary PE/COFF entry point either, for the same reasons.
Given that loaders such as GRUB already carried the bootparams handling
in order to implement non-EFI boot, retaining that code and just passing
bootparams to the EFI stub was a reasonable choice (although defining an
alternate entrypoint could have been avoided.) However, the GRUB side
changes never made it upstream, and are only shipped by some of the
distros in their downstream versions.
In the meantime, EFI support has been added to other Linux architecture
ports, as well as to U-boot and systemd, including arch-agnostic methods
for passing initrd images in memory [1], and for doing mixed mode boot
[2], none of them requiring anything like the EFI handover protocol. So
given that only out-of-tree distro GRUB relies on this, let's permit it
to be omitted from the build, in preparation for retiring it completely
at a later date. (Note that systemd-boot does have an implementation as
well, but only uses it as a fallback for booting images that do not
implement the LoadFile2 based initrd loading method, i.e., v5.8 or older)
[0] https://lore.kernel.org/all/20220927085842.2860715-1-ardb@kernel.org/
[1] ec93fc371f01 ("efi/libstub: Add support for loading the initrd from a device path")
[2] 97aa276579b2 ("efi/x86: Add true mixed mode entry point into .compat section")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-18-ardb@kernel.org
2022-11-22 16:10:17 +00:00
|
|
|
#if IS_ENABLED(CONFIG_EFI_MIXED) && IS_ENABLED(CONFIG_EFI_HANDOVER_PROTOCOL)
|
x86/efi: Firmware agnostic handover entry points
The EFI handover code only works if the "bitness" of the firmware and
the kernel match, i.e. 64-bit firmware and 64-bit kernel - it is not
possible to mix the two. This goes against the tradition that a 32-bit
kernel can be loaded on a 64-bit BIOS platform without having to do
anything special in the boot loader. Linux distributions, for one thing,
regularly run only 32-bit kernels on their live media.
Despite having only one 'handover_offset' field in the kernel header,
EFI boot loaders use two separate entry points to enter the kernel based
on the architecture the boot loader was compiled for,
(1) 32-bit loader: handover_offset
(2) 64-bit loader: handover_offset + 512
Since we already have two entry points, we can leverage them to infer
the bitness of the firmware we're running on, without requiring any boot
loader modifications, by making (1) and (2) valid entry points for both
CONFIG_X86_32 and CONFIG_X86_64 kernels.
To be clear, a 32-bit boot loader will always use (1) and a 64-bit boot
loader will always use (2). It's just that, if a single kernel image
supports (1) and (2) that image can be used with both 32-bit and 64-bit
boot loaders, and hence both 32-bit and 64-bit EFI.
(1) and (2) must be 512 bytes apart at all times, but that is already
part of the boot ABI and we could never change that delta without
breaking existing boot loaders anyhow.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-10 15:54:31 +00:00
|
|
|
.org 0x190
|
2019-10-11 11:51:04 +00:00
|
|
|
SYM_FUNC_START(efi32_stub_entry)
|
x86/efi: Firmware agnostic handover entry points
The EFI handover code only works if the "bitness" of the firmware and
the kernel match, i.e. 64-bit firmware and 64-bit kernel - it is not
possible to mix the two. This goes against the tradition that a 32-bit
kernel can be loaded on a 64-bit BIOS platform without having to do
anything special in the boot loader. Linux distributions, for one thing,
regularly run only 32-bit kernels on their live media.
Despite having only one 'handover_offset' field in the kernel header,
EFI boot loaders use two separate entry points to enter the kernel based
on the architecture the boot loader was compiled for,
(1) 32-bit loader: handover_offset
(2) 64-bit loader: handover_offset + 512
Since we already have two entry points, we can leverage them to infer
the bitness of the firmware we're running on, without requiring any boot
loader modifications, by making (1) and (2) valid entry points for both
CONFIG_X86_32 and CONFIG_X86_64 kernels.
To be clear, a 32-bit boot loader will always use (1) and a 64-bit boot
loader will always use (2). It's just that, if a single kernel image
supports (1) and (2) that image can be used with both 32-bit and 64-bit
boot loaders, and hence both 32-bit and 64-bit EFI.
(1) and (2) must be 512 bytes apart at all times, but that is already
part of the boot ABI and we could never change that delta without
breaking existing boot loaders anyhow.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-10 15:54:31 +00:00
|
|
|
add $0x4, %esp /* Discard return address */
|
2020-01-03 11:39:34 +00:00
|
|
|
popl %ecx
|
|
|
|
popl %edx
|
|
|
|
popl %esi
|
2022-11-22 16:10:02 +00:00
|
|
|
jmp efi32_entry
|
|
|
|
SYM_FUNC_END(efi32_stub_entry)
|
x86/efi: Firmware agnostic handover entry points
The EFI handover code only works if the "bitness" of the firmware and
the kernel match, i.e. 64-bit firmware and 64-bit kernel - it is not
possible to mix the two. This goes against the tradition that a 32-bit
kernel can be loaded on a 64-bit BIOS platform without having to do
anything special in the boot loader. Linux distributions, for one thing,
regularly run only 32-bit kernels on their live media.
Despite having only one 'handover_offset' field in the kernel header,
EFI boot loaders use two separate entry points to enter the kernel based
on the architecture the boot loader was compiled for,
(1) 32-bit loader: handover_offset
(2) 64-bit loader: handover_offset + 512
Since we already have two entry points, we can leverage them to infer
the bitness of the firmware we're running on, without requiring any boot
loader modifications, by making (1) and (2) valid entry points for both
CONFIG_X86_32 and CONFIG_X86_64 kernels.
To be clear, a 32-bit boot loader will always use (1) and a 64-bit boot
loader will always use (2). It's just that, if a single kernel image
supports (1) and (2) that image can be used with both 32-bit and 64-bit
boot loaders, and hence both 32-bit and 64-bit EFI.
(1) and (2) must be 512 bytes apart at all times, but that is already
part of the boot ABI and we could never change that delta without
breaking existing boot loaders anyhow.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-10 15:54:31 +00:00
|
|
|
#endif
|
|
|
|
|
2007-05-02 17:27:07 +00:00
|
|
|
.code64
|
2007-05-02 17:27:08 +00:00
|
|
|
.org 0x200
|
2019-10-11 11:51:02 +00:00
|
|
|
SYM_CODE_START(startup_64)
|
2009-05-08 22:59:13 +00:00
|
|
|
/*
|
2013-01-24 20:20:07 +00:00
|
|
|
* 64bit entry is 0x200 and it is ABI so immutable!
|
2009-05-08 22:59:13 +00:00
|
|
|
* We come here either from startup_32 or directly from a
|
2013-01-24 20:20:07 +00:00
|
|
|
* 64bit bootloader.
|
|
|
|
* If we come here from a bootloader, kernel(text+data+bss+brk),
|
|
|
|
* ramdisk, zero_page, command line could be above 4G.
|
|
|
|
* We depend on an identity mapped page table being provided
|
|
|
|
* that maps our entire kernel(text+data+bss+brk), zero page
|
|
|
|
* and command line.
|
2007-05-02 17:27:07 +00:00
|
|
|
*/
|
|
|
|
|
2020-02-02 17:13:50 +00:00
|
|
|
cld
|
|
|
|
cli
|
|
|
|
|
2007-05-02 17:27:07 +00:00
|
|
|
/* Setup data segments. */
|
|
|
|
xorl %eax, %eax
|
|
|
|
movl %eax, %ds
|
|
|
|
movl %eax, %es
|
|
|
|
movl %eax, %ss
|
2007-08-10 20:31:05 +00:00
|
|
|
movl %eax, %fs
|
|
|
|
movl %eax, %gs
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
/*
|
|
|
|
* Compute the decompressed kernel start address. It is where
|
2007-05-02 17:27:07 +00:00
|
|
|
* we were loaded at aligned to a 2M boundary. %rbp contains the
|
|
|
|
* decompressed kernel start address.
|
|
|
|
*
|
|
|
|
* If it is a relocatable kernel then decompress and run the kernel
|
|
|
|
* from load address aligned to 2MB addr, otherwise decompress and
|
2009-05-11 21:41:55 +00:00
|
|
|
* run the kernel from LOAD_PHYSICAL_ADDR
|
2009-05-09 00:42:16 +00:00
|
|
|
*
|
|
|
|
* We cannot rely on the calculation done in 32-bit mode, since we
|
|
|
|
* may have been invoked via the 64-bit entry point.
|
2007-05-02 17:27:07 +00:00
|
|
|
*/
|
|
|
|
|
|
|
|
/* Start with the delta to where the kernel will run at. */
|
|
|
|
#ifdef CONFIG_RELOCATABLE
|
|
|
|
leaq startup_32(%rip) /* - $startup_32 */, %rbp
|
2020-03-08 08:08:47 +00:00
|
|
|
|
|
|
|
#ifdef CONFIG_EFI_STUB
|
|
|
|
/*
|
|
|
|
* If we were loaded via the EFI LoadImage service, startup_32 will be at an
|
|
|
|
* offset to the start of the space allocated for the image. efi_pe_entry will
|
|
|
|
* set up image_offset to tell us where the image actually starts, so that we
|
|
|
|
* can use the full available buffer.
|
|
|
|
* image_offset = startup_32 - image_base
|
|
|
|
* Otherwise image_offset will be zero and has no effect on the calculations.
|
|
|
|
*/
|
|
|
|
movl image_offset(%rip), %eax
|
|
|
|
subq %rax, %rbp
|
|
|
|
#endif
|
|
|
|
|
2009-05-11 22:56:08 +00:00
|
|
|
movl BP_kernel_alignment(%rsi), %eax
|
|
|
|
decl %eax
|
|
|
|
addq %rax, %rbp
|
|
|
|
notq %rax
|
|
|
|
andq %rax, %rbp
|
2013-10-11 00:18:14 +00:00
|
|
|
cmpq $LOAD_PHYSICAL_ADDR, %rbp
|
2020-03-08 08:08:44 +00:00
|
|
|
jae 1f
|
2007-05-02 17:27:07 +00:00
|
|
|
#endif
|
2013-10-11 00:18:14 +00:00
|
|
|
movq $LOAD_PHYSICAL_ADDR, %rbp
|
|
|
|
1:
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2009-05-09 00:42:16 +00:00
|
|
|
/* Target address to relocate to for decompression */
|
x86/boot: Move compressed kernel to the end of the decompression buffer
This change makes later calculations about where the kernel is located
easier to reason about. To better understand this change, we must first
clarify what 'VO' and 'ZO' are. These values were introduced in commits
by hpa:
77d1a4999502 ("x86, boot: make symbols from the main vmlinux available")
37ba7ab5e33c ("x86, boot: make kernel_alignment adjustable; new bzImage fields")
Specifically:
All names prefixed with 'VO_':
- relate to the uncompressed kernel image
- the size of the VO image is: VO__end-VO__text ("VO_INIT_SIZE" define)
All names prefixed with 'ZO_':
- relate to the bootable compressed kernel image (boot/compressed/vmlinux),
which is composed of the following memory areas:
- head text
- compressed kernel (VO image and relocs table)
- decompressor code
- the size of the ZO image is: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below)
The 'INIT_SIZE' value is used to find the larger of the two image sizes:
#define ZO_INIT_SIZE (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
#define VO_INIT_SIZE (VO__end - VO__text)
#if ZO_INIT_SIZE > VO_INIT_SIZE
# define INIT_SIZE ZO_INIT_SIZE
#else
# define INIT_SIZE VO_INIT_SIZE
#endif
The current code uses extract_offset to decide where to position the
copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE
currently includes the extract_offset.)
Why does z_extract_offset exist? It's needed because we are trying to minimize
the amount of RAM used for the whole act of creating an uncompressed, executable,
properly relocation-linked kernel image in system memory. We do this so that
kernels can be booted on even very small systems.
To achieve the goal of minimal memory consumption we have implemented an in-place
decompression strategy: instead of cleanly separating the VO and ZO images and
also allocating some memory for the decompression code's runtime needs, we instead
create this elaborate layout of memory buffers where the output (decompressed)
stream, as it progresses, overlaps with and destroys the input (compressed)
stream. This can only be done safely if the ZO image is placed to the end of the
VO range, plus a certain amount of safety distance to make sure that when the last
bytes of the VO range are decompressed, the compressed stream pointer is safely
beyond the end of the VO range.
z_extract_offset is calculated in arch/x86/boot/compressed/mkpiggy.c during
the build process, at a point when we know the exact compressed and
uncompressed size of the kernel images and can calculate this safe minimum
offset value. (Note that the mkpiggy.c calculation is not perfect, because
we don't know the decompressor used at that stage, so the z_extract_offset
calculation is necessarily imprecise and is mostly based on gzip internals -
we'll improve that in the next patch.)
When INIT_SIZE is bigger than VO_INIT_SIZE (uncommon but possible),
the copied ZO occupies the memory from extract_offset to the end of
decompression buffer. It overlaps with the soon-to-be-uncompressed kernel
like this:
|-----compressed kernel image------|
V V
0 extract_offset +INIT_SIZE
|-----------|---------------|-------------------------|--------|
| | | |
VO__text startup_32 of ZO VO__end ZO__end
^ ^
|-------uncompressed kernel image---------|
When INIT_SIZE is equal to VO_INIT_SIZE (likely) there's still space
left from end of ZO to the end of decompressing buffer, like below.
|-compressed kernel image-|
V V
0 extract_offset +INIT_SIZE
|-----------|---------------|-------------------------|--------|
| | | |
VO__text startup_32 of ZO ZO__end VO__end
^ ^
|------------uncompressed kernel image-------------|
To simplify calculations and avoid special cases, it is cleaner to
always place the compressed kernel image in memory so that ZO__end
is at the end of the decompression buffer, instead of placing t at
the start of extract_offset as is currently done.
This patch adds BP_init_size (which is the INIT_SIZE as passed in from
the boot_params) into asm-offsets.c to make it visible to the assembly
code.
Then when moving the ZO, it calculates the starting position of
the copied ZO (via BP_init_size and the ZO run size) so that the VO__end
will be at the end of the decompression buffer. To make the position
calculation safe, the end of ZO is page aligned (and a comment is added
to the existing VO alignment for good measure).
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
[ Rewrote changelog and comments. ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: lasse.collin@tukaani.org
Link: http://lkml.kernel.org/r/1461888548-32439-3-git-send-email-keescook@chromium.org
[ Rewrote the changelog some more. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29 00:09:04 +00:00
|
|
|
movl BP_init_size(%rsi), %ebx
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
subl $ rva(_end), %ebx
|
x86/boot: Move compressed kernel to the end of the decompression buffer
This change makes later calculations about where the kernel is located
easier to reason about. To better understand this change, we must first
clarify what 'VO' and 'ZO' are. These values were introduced in commits
by hpa:
77d1a4999502 ("x86, boot: make symbols from the main vmlinux available")
37ba7ab5e33c ("x86, boot: make kernel_alignment adjustable; new bzImage fields")
Specifically:
All names prefixed with 'VO_':
- relate to the uncompressed kernel image
- the size of the VO image is: VO__end-VO__text ("VO_INIT_SIZE" define)
All names prefixed with 'ZO_':
- relate to the bootable compressed kernel image (boot/compressed/vmlinux),
which is composed of the following memory areas:
- head text
- compressed kernel (VO image and relocs table)
- decompressor code
- the size of the ZO image is: ZO__end - ZO_startup_32 ("ZO_INIT_SIZE" define, though see below)
The 'INIT_SIZE' value is used to find the larger of the two image sizes:
#define ZO_INIT_SIZE (ZO__end - ZO_startup_32 + ZO_z_extract_offset)
#define VO_INIT_SIZE (VO__end - VO__text)
#if ZO_INIT_SIZE > VO_INIT_SIZE
# define INIT_SIZE ZO_INIT_SIZE
#else
# define INIT_SIZE VO_INIT_SIZE
#endif
The current code uses extract_offset to decide where to position the
copied ZO (i.e. ZO starts at extract_offset). (This is why ZO_INIT_SIZE
currently includes the extract_offset.)
Why does z_extract_offset exist? It's needed because we are trying to minimize
the amount of RAM used for the whole act of creating an uncompressed, executable,
properly relocation-linked kernel image in system memory. We do this so that
kernels can be booted on even very small systems.
To achieve the goal of minimal memory consumption we have implemented an in-place
decompression strategy: instead of cleanly separating the VO and ZO images and
also allocating some memory for the decompression code's runtime needs, we instead
create this elaborate layout of memory buffers where the output (decompressed)
stream, as it progresses, overlaps with and destroys the input (compressed)
stream. This can only be done safely if the ZO image is placed to the end of the
VO range, plus a certain amount of safety distance to make sure that when the last
bytes of the VO range are decompressed, the compressed stream pointer is safely
beyond the end of the VO range.
z_extract_offset is calculated in arch/x86/boot/compressed/mkpiggy.c during
the build process, at a point when we know the exact compressed and
uncompressed size of the kernel images and can calculate this safe minimum
offset value. (Note that the mkpiggy.c calculation is not perfect, because
we don't know the decompressor used at that stage, so the z_extract_offset
calculation is necessarily imprecise and is mostly based on gzip internals -
we'll improve that in the next patch.)
When INIT_SIZE is bigger than VO_INIT_SIZE (uncommon but possible),
the copied ZO occupies the memory from extract_offset to the end of
decompression buffer. It overlaps with the soon-to-be-uncompressed kernel
like this:
|-----compressed kernel image------|
V V
0 extract_offset +INIT_SIZE
|-----------|---------------|-------------------------|--------|
| | | |
VO__text startup_32 of ZO VO__end ZO__end
^ ^
|-------uncompressed kernel image---------|
When INIT_SIZE is equal to VO_INIT_SIZE (likely) there's still space
left from end of ZO to the end of decompressing buffer, like below.
|-compressed kernel image-|
V V
0 extract_offset +INIT_SIZE
|-----------|---------------|-------------------------|--------|
| | | |
VO__text startup_32 of ZO ZO__end VO__end
^ ^
|------------uncompressed kernel image-------------|
To simplify calculations and avoid special cases, it is cleaner to
always place the compressed kernel image in memory so that ZO__end
is at the end of the decompression buffer, instead of placing t at
the start of extract_offset as is currently done.
This patch adds BP_init_size (which is the INIT_SIZE as passed in from
the boot_params) into asm-offsets.c to make it visible to the assembly
code.
Then when moving the ZO, it calculates the starting position of
the copied ZO (via BP_init_size and the ZO run size) so that the VO__end
will be at the end of the decompression buffer. To make the position
calculation safe, the end of ZO is page aligned (and a comment is added
to the existing VO alignment for good measure).
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
[ Rewrote changelog and comments. ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: lasse.collin@tukaani.org
Link: http://lkml.kernel.org/r/1461888548-32439-3-git-send-email-keescook@chromium.org
[ Rewrote the changelog some more. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-29 00:09:04 +00:00
|
|
|
addq %rbp, %rbx
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2009-05-08 23:27:41 +00:00
|
|
|
/* Set up the stack */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leaq rva(boot_stack_end)(%rbx), %rsp
|
2009-05-08 23:27:41 +00:00
|
|
|
|
2017-06-06 11:31:25 +00:00
|
|
|
/*
|
|
|
|
* At this point we are in long mode with 4-level paging enabled,
|
2018-03-12 10:02:46 +00:00
|
|
|
* but we might want to enable 5-level paging or vice versa.
|
2017-06-06 11:31:25 +00:00
|
|
|
*
|
2018-03-12 10:02:46 +00:00
|
|
|
* The problem is that we cannot do it directly. Setting or clearing
|
|
|
|
* CR4.LA57 in long mode would trigger #GP. So we need to switch off
|
|
|
|
* long mode and paging first.
|
|
|
|
*
|
|
|
|
* We also need a trampoline in lower memory to switch over from
|
|
|
|
* 4- to 5-level paging for cases when the bootloader puts the kernel
|
|
|
|
* above 4G, but didn't enable 5-level paging for us.
|
|
|
|
*
|
|
|
|
* The same trampoline can be used to switch from 5- to 4-level paging
|
|
|
|
* mode, like when starting 4-level paging kernel via kexec() when
|
|
|
|
* original kernel worked in 5-level paging mode.
|
|
|
|
*
|
|
|
|
* For the trampoline, we need the top page table to reside in lower
|
|
|
|
* memory as we don't have a way to load 64-bit values into CR3 in
|
|
|
|
* 32-bit mode.
|
|
|
|
*
|
|
|
|
* We go though the trampoline even if we don't have to: if we're
|
|
|
|
* already in a desired paging mode. This way the trampoline code gets
|
|
|
|
* tested on every boot.
|
2018-02-09 14:22:26 +00:00
|
|
|
*/
|
|
|
|
|
2018-03-12 10:02:43 +00:00
|
|
|
/* Make sure we have GDT with 32-bit code segment */
|
2020-02-02 17:13:53 +00:00
|
|
|
leaq gdt64(%rip), %rax
|
|
|
|
addq %rax, 2(%rax)
|
|
|
|
lgdt (%rax)
|
2018-03-12 10:02:43 +00:00
|
|
|
|
2020-04-28 15:16:22 +00:00
|
|
|
/* Reload CS so IRET returns to a CS actually in the GDT */
|
|
|
|
pushq $__KERNEL_CS
|
|
|
|
leaq .Lon_kernel_cs(%rip), %rax
|
|
|
|
pushq %rax
|
|
|
|
lretq
|
|
|
|
|
|
|
|
.Lon_kernel_cs:
|
|
|
|
|
2020-09-07 13:15:14 +00:00
|
|
|
pushq %rsi
|
|
|
|
call load_stage1_idt
|
|
|
|
popq %rsi
|
|
|
|
|
2022-02-09 18:10:01 +00:00
|
|
|
#ifdef CONFIG_AMD_MEM_ENCRYPT
|
|
|
|
/*
|
|
|
|
* Now that the stage1 interrupt handlers are set up, #VC exceptions from
|
|
|
|
* CPUID instructions can be properly handled for SEV-ES guests.
|
|
|
|
*
|
|
|
|
* For SEV-SNP, the CPUID table also needs to be set up in advance of any
|
|
|
|
* CPUID instructions being issued, so go ahead and do that now via
|
|
|
|
* sev_enable(), which will also handle the rest of the SEV-related
|
|
|
|
* detection/setup to ensure that has been done in advance of any dependent
|
|
|
|
* code.
|
|
|
|
*/
|
|
|
|
pushq %rsi
|
|
|
|
movq %rsi, %rdi /* real mode address */
|
|
|
|
call sev_enable
|
|
|
|
popq %rsi
|
|
|
|
#endif
|
|
|
|
|
2018-02-09 14:22:26 +00:00
|
|
|
/*
|
|
|
|
* paging_prepare() sets up the trampoline and checks if we need to
|
|
|
|
* enable 5-level paging.
|
2017-06-06 11:31:25 +00:00
|
|
|
*
|
2019-02-06 15:29:08 +00:00
|
|
|
* paging_prepare() returns a two-quadword structure which lands
|
|
|
|
* into RDX:RAX:
|
|
|
|
* - Address of the trampoline is returned in RAX.
|
|
|
|
* - Non zero RDX means trampoline needs to enable 5-level
|
|
|
|
* paging.
|
2017-06-06 11:31:25 +00:00
|
|
|
*
|
2018-02-09 14:22:26 +00:00
|
|
|
* RSI holds real mode data and needs to be preserved across
|
|
|
|
* this function call.
|
2017-06-06 11:31:25 +00:00
|
|
|
*/
|
2018-02-09 14:22:26 +00:00
|
|
|
pushq %rsi
|
2018-05-18 10:35:25 +00:00
|
|
|
movq %rsi, %rdi /* real mode address */
|
2018-02-09 14:22:26 +00:00
|
|
|
call paging_prepare
|
|
|
|
popq %rsi
|
|
|
|
|
|
|
|
/* Save the trampoline address in RCX */
|
|
|
|
movq %rax, %rcx
|
|
|
|
|
2018-03-12 10:02:46 +00:00
|
|
|
/*
|
|
|
|
* Load the address of trampoline_return() into RDI.
|
|
|
|
* It will be used by the trampoline to return to the main code.
|
|
|
|
*/
|
|
|
|
leaq trampoline_return(%rip), %rdi
|
2017-06-06 11:31:25 +00:00
|
|
|
|
|
|
|
/* Switch to compatibility mode (CS.L = 0 CS.D = 1) via far return */
|
|
|
|
pushq $__KERNEL32_CS
|
2018-03-12 10:02:46 +00:00
|
|
|
leaq TRAMPOLINE_32BIT_CODE_OFFSET(%rax), %rax
|
2017-06-06 11:31:25 +00:00
|
|
|
pushq %rax
|
|
|
|
lretq
|
2018-03-12 10:02:46 +00:00
|
|
|
trampoline_return:
|
2018-03-12 10:02:44 +00:00
|
|
|
/* Restore the stack, the 32-bit trampoline uses its own stack */
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leaq rva(boot_stack_end)(%rbx), %rsp
|
2017-06-06 11:31:25 +00:00
|
|
|
|
2018-02-26 18:04:49 +00:00
|
|
|
/*
|
|
|
|
* cleanup_trampoline() would restore trampoline memory.
|
|
|
|
*
|
2018-05-16 08:01:29 +00:00
|
|
|
* RDI is address of the page table to use instead of page table
|
|
|
|
* in trampoline memory (if required).
|
|
|
|
*
|
2018-02-26 18:04:49 +00:00
|
|
|
* RSI holds real mode data and needs to be preserved across
|
|
|
|
* this function call.
|
|
|
|
*/
|
|
|
|
pushq %rsi
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leaq rva(top_pgtable)(%rbx), %rdi
|
2018-02-26 18:04:49 +00:00
|
|
|
call cleanup_trampoline
|
|
|
|
popq %rsi
|
|
|
|
|
2009-05-08 23:27:41 +00:00
|
|
|
/* Zero EFLAGS */
|
|
|
|
pushq $0
|
|
|
|
popfq
|
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
/*
|
|
|
|
* Copy the compressed kernel to the end of our buffer
|
2007-05-02 17:27:07 +00:00
|
|
|
* where decompression in place becomes safe.
|
|
|
|
*/
|
2009-05-08 23:45:15 +00:00
|
|
|
pushq %rsi
|
|
|
|
leaq (_bss-8)(%rip), %rsi
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leaq rva(_bss-8)(%rbx), %rdi
|
|
|
|
movl $(_bss - startup_32), %ecx
|
|
|
|
shrl $3, %ecx
|
2009-05-08 23:45:15 +00:00
|
|
|
std
|
|
|
|
rep movsq
|
|
|
|
cld
|
|
|
|
popq %rsi
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2020-02-02 17:13:49 +00:00
|
|
|
/*
|
|
|
|
* The GDT may get overwritten either during the copy we just did or
|
|
|
|
* during extract_kernel below. To avoid any issues, repoint the GDTR
|
|
|
|
* to the new copy of the GDT.
|
|
|
|
*/
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leaq rva(gdt64)(%rbx), %rax
|
|
|
|
leaq rva(gdt)(%rbx), %rdx
|
2020-02-26 23:00:31 +00:00
|
|
|
movq %rdx, 2(%rax)
|
2020-02-02 17:13:49 +00:00
|
|
|
lgdt (%rax)
|
|
|
|
|
2007-05-02 17:27:07 +00:00
|
|
|
/*
|
|
|
|
* Jump to the relocated address.
|
|
|
|
*/
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leaq rva(.Lrelocated)(%rbx), %rax
|
2007-05-02 17:27:07 +00:00
|
|
|
jmp *%rax
|
2019-10-11 11:51:02 +00:00
|
|
|
SYM_CODE_END(startup_64)
|
2007-05-02 17:27:07 +00:00
|
|
|
|
x86/efi: Firmware agnostic handover entry points
The EFI handover code only works if the "bitness" of the firmware and
the kernel match, i.e. 64-bit firmware and 64-bit kernel - it is not
possible to mix the two. This goes against the tradition that a 32-bit
kernel can be loaded on a 64-bit BIOS platform without having to do
anything special in the boot loader. Linux distributions, for one thing,
regularly run only 32-bit kernels on their live media.
Despite having only one 'handover_offset' field in the kernel header,
EFI boot loaders use two separate entry points to enter the kernel based
on the architecture the boot loader was compiled for,
(1) 32-bit loader: handover_offset
(2) 64-bit loader: handover_offset + 512
Since we already have two entry points, we can leverage them to infer
the bitness of the firmware we're running on, without requiring any boot
loader modifications, by making (1) and (2) valid entry points for both
CONFIG_X86_32 and CONFIG_X86_64 kernels.
To be clear, a 32-bit boot loader will always use (1) and a 64-bit boot
loader will always use (2). It's just that, if a single kernel image
supports (1) and (2) that image can be used with both 32-bit and 64-bit
boot loaders, and hence both 32-bit and 64-bit EFI.
(1) and (2) must be 512 bytes apart at all times, but that is already
part of the boot ABI and we could never change that delta without
breaking existing boot loaders anyhow.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-10 15:54:31 +00:00
|
|
|
#ifdef CONFIG_EFI_STUB
|
x86/efi: Make the deprecated EFI handover protocol optional
The EFI handover protocol permits a bootloader to invoke the kernel as a
EFI PE/COFF application, while passing a bootparams struct as a third
argument to the entrypoint function call.
This has no basis in the UEFI specification, and there are better ways
to pass additional data to a UEFI application (UEFI configuration
tables, UEFI variables, UEFI protocols) than going around the
StartImage() boot service and jumping to a fixed offset in the loaded
image, just to call a different function that takes a third parameter.
The reason for handling struct bootparams in the bootloader was that the
EFI stub could only load initrd images from the EFI system partition,
and so passing it via struct bootparams was needed for loaders like
GRUB, which pass the initrd in memory, and may load it from anywhere,
including from the network. Another motivation was EFI mixed mode, which
could not use the initrd loader in the EFI stub at all due to 32/64 bit
incompatibilities (which will be fixed shortly [0]), and could not
invoke the ordinary PE/COFF entry point either, for the same reasons.
Given that loaders such as GRUB already carried the bootparams handling
in order to implement non-EFI boot, retaining that code and just passing
bootparams to the EFI stub was a reasonable choice (although defining an
alternate entrypoint could have been avoided.) However, the GRUB side
changes never made it upstream, and are only shipped by some of the
distros in their downstream versions.
In the meantime, EFI support has been added to other Linux architecture
ports, as well as to U-boot and systemd, including arch-agnostic methods
for passing initrd images in memory [1], and for doing mixed mode boot
[2], none of them requiring anything like the EFI handover protocol. So
given that only out-of-tree distro GRUB relies on this, let's permit it
to be omitted from the build, in preparation for retiring it completely
at a later date. (Note that systemd-boot does have an implementation as
well, but only uses it as a fallback for booting images that do not
implement the LoadFile2 based initrd loading method, i.e., v5.8 or older)
[0] https://lore.kernel.org/all/20220927085842.2860715-1-ardb@kernel.org/
[1] ec93fc371f01 ("efi/libstub: Add support for loading the initrd from a device path")
[2] 97aa276579b2 ("efi/x86: Add true mixed mode entry point into .compat section")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-18-ardb@kernel.org
2022-11-22 16:10:17 +00:00
|
|
|
#ifdef CONFIG_EFI_HANDOVER_PROTOCOL
|
2019-12-24 15:10:17 +00:00
|
|
|
.org 0x390
|
x86/efi: Make the deprecated EFI handover protocol optional
The EFI handover protocol permits a bootloader to invoke the kernel as a
EFI PE/COFF application, while passing a bootparams struct as a third
argument to the entrypoint function call.
This has no basis in the UEFI specification, and there are better ways
to pass additional data to a UEFI application (UEFI configuration
tables, UEFI variables, UEFI protocols) than going around the
StartImage() boot service and jumping to a fixed offset in the loaded
image, just to call a different function that takes a third parameter.
The reason for handling struct bootparams in the bootloader was that the
EFI stub could only load initrd images from the EFI system partition,
and so passing it via struct bootparams was needed for loaders like
GRUB, which pass the initrd in memory, and may load it from anywhere,
including from the network. Another motivation was EFI mixed mode, which
could not use the initrd loader in the EFI stub at all due to 32/64 bit
incompatibilities (which will be fixed shortly [0]), and could not
invoke the ordinary PE/COFF entry point either, for the same reasons.
Given that loaders such as GRUB already carried the bootparams handling
in order to implement non-EFI boot, retaining that code and just passing
bootparams to the EFI stub was a reasonable choice (although defining an
alternate entrypoint could have been avoided.) However, the GRUB side
changes never made it upstream, and are only shipped by some of the
distros in their downstream versions.
In the meantime, EFI support has been added to other Linux architecture
ports, as well as to U-boot and systemd, including arch-agnostic methods
for passing initrd images in memory [1], and for doing mixed mode boot
[2], none of them requiring anything like the EFI handover protocol. So
given that only out-of-tree distro GRUB relies on this, let's permit it
to be omitted from the build, in preparation for retiring it completely
at a later date. (Note that systemd-boot does have an implementation as
well, but only uses it as a fallback for booting images that do not
implement the LoadFile2 based initrd loading method, i.e., v5.8 or older)
[0] https://lore.kernel.org/all/20220927085842.2860715-1-ardb@kernel.org/
[1] ec93fc371f01 ("efi/libstub: Add support for loading the initrd from a device path")
[2] 97aa276579b2 ("efi/x86: Add true mixed mode entry point into .compat section")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-18-ardb@kernel.org
2022-11-22 16:10:17 +00:00
|
|
|
#endif
|
2019-12-24 15:10:17 +00:00
|
|
|
SYM_FUNC_START(efi64_stub_entry)
|
2019-12-24 15:10:13 +00:00
|
|
|
and $~0xf, %rsp /* realign the stack */
|
2020-03-08 08:08:43 +00:00
|
|
|
movq %rdx, %rbx /* save boot_params pointer */
|
2017-08-24 07:33:27 +00:00
|
|
|
call efi_main
|
2020-03-08 08:08:43 +00:00
|
|
|
movq %rbx,%rsi
|
x86/boot: Remove run-time relocations from .head.text code
The assembly code in head_{32,64}.S, while meant to be
position-independent, generates run-time relocations because it uses
instructions such as:
leal gdt(%edx), %eax
which make the assembler and linker think that the code is using %edx as
an index into gdt, and hence gdt needs to be relocated to its run-time
address.
On 32-bit, with lld Dmitry Golovin reports that this results in a
link-time error with default options (i.e. unless -z notext is
explicitly passed):
LD arch/x86/boot/compressed/vmlinux
ld.lld: error: can't create dynamic relocation R_386_32 against local
symbol in readonly segment; recompile object files with -fPIC or pass
'-Wl,-z,notext' to allow text relocations in the output
With the BFD linker, this generates a warning during the build, if
--warn-shared-textrel is enabled, which at least Gentoo enables by
default:
LD arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object
On 64-bit, it is not possible to link the kernel as -pie with lld, and
it is only possible with a BFD linker that supports -z noreloc-overflow,
i.e. versions >2.26. This is because these instructions cannot really be
relocated: the displacement field is only 32-bits wide, and thus cannot
be relocated for a 64-bit load address. The -z noreloc-overflow option
simply overrides the linker error, and results in R_X86_64_RELATIVE
relocations that apply a 64-bit relocation to a 32-bit field anyway.
This happens to work because nothing will process these run-time
relocations.
Start fixing this by removing relocations from .head.text:
- On 32-bit, use a base register that holds the address of the GOT and
reference symbol addresses using @GOTOFF, i.e.
leal gdt@GOTOFF(%edx), %eax
- On 64-bit, most of the code can (and already does) use %rip-relative
addressing, however the .code32 bits can't, and the 64-bit code also
needs to reference symbol addresses as they will be after moving the
compressed kernel to the end of the decompression buffer.
For these cases, reference the symbols as an offset to startup_32 to
avoid creating relocations, i.e.:
leal (gdt-startup_32)(%bp), %eax
This only works in .head.text as the subtraction cannot be represented
as a PC-relative relocation unless startup_32 is in the same section
as the code. Move efi32_pe_entry into .head.text so that it can use
the same method to avoid relocations.
Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org
2020-07-31 23:07:49 +00:00
|
|
|
leaq rva(startup_64)(%rax), %rax
|
2017-08-24 07:33:27 +00:00
|
|
|
jmp *%rax
|
2019-10-11 11:51:04 +00:00
|
|
|
SYM_FUNC_END(efi64_stub_entry)
|
2022-02-16 16:22:28 +00:00
|
|
|
SYM_FUNC_ALIAS(efi_stub_entry, efi64_stub_entry)
|
x86/efi: Firmware agnostic handover entry points
The EFI handover code only works if the "bitness" of the firmware and
the kernel match, i.e. 64-bit firmware and 64-bit kernel - it is not
possible to mix the two. This goes against the tradition that a 32-bit
kernel can be loaded on a 64-bit BIOS platform without having to do
anything special in the boot loader. Linux distributions, for one thing,
regularly run only 32-bit kernels on their live media.
Despite having only one 'handover_offset' field in the kernel header,
EFI boot loaders use two separate entry points to enter the kernel based
on the architecture the boot loader was compiled for,
(1) 32-bit loader: handover_offset
(2) 64-bit loader: handover_offset + 512
Since we already have two entry points, we can leverage them to infer
the bitness of the firmware we're running on, without requiring any boot
loader modifications, by making (1) and (2) valid entry points for both
CONFIG_X86_32 and CONFIG_X86_64 kernels.
To be clear, a 32-bit boot loader will always use (1) and a 64-bit boot
loader will always use (2). It's just that, if a single kernel image
supports (1) and (2) that image can be used with both 32-bit and 64-bit
boot loaders, and hence both 32-bit and 64-bit EFI.
(1) and (2) must be 512 bytes apart at all times, but that is already
part of the boot ABI and we could never change that delta without
breaking existing boot loaders anyhow.
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2014-01-10 15:54:31 +00:00
|
|
|
#endif
|
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
.text
|
2019-10-11 11:50:47 +00:00
|
|
|
SYM_FUNC_START_LOCAL_NOALIGN(.Lrelocated)
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
2009-05-08 23:27:41 +00:00
|
|
|
* Clear BSS (stack is currently empty)
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2009-05-08 23:45:15 +00:00
|
|
|
xorl %eax, %eax
|
|
|
|
leaq _bss(%rip), %rdi
|
|
|
|
leaq _ebss(%rip), %rcx
|
2007-05-02 17:27:07 +00:00
|
|
|
subq %rdi, %rcx
|
2009-05-08 23:45:15 +00:00
|
|
|
shrq $3, %rcx
|
|
|
|
rep stosq
|
2007-05-02 17:27:07 +00:00
|
|
|
|
2020-09-07 13:15:14 +00:00
|
|
|
pushq %rsi
|
|
|
|
call load_stage2_idt
|
2020-10-16 20:04:01 +00:00
|
|
|
|
|
|
|
/* Pass boot_params to initialize_identity_maps() */
|
|
|
|
movq (%rsp), %rdi
|
2020-09-07 13:15:17 +00:00
|
|
|
call initialize_identity_maps
|
2020-09-07 13:15:14 +00:00
|
|
|
popq %rsi
|
|
|
|
|
2005-04-16 22:20:36 +00:00
|
|
|
/*
|
2016-04-18 16:42:13 +00:00
|
|
|
* Do the extraction, and jump to the new kernel..
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2009-05-09 00:42:16 +00:00
|
|
|
pushq %rsi /* Save the real mode argument */
|
|
|
|
movq %rsi, %rdi /* real mode address */
|
|
|
|
leaq boot_heap(%rip), %rsi /* malloc area for uncompression */
|
|
|
|
leaq input_data(%rip), %rdx /* input_data */
|
2020-07-31 23:07:50 +00:00
|
|
|
movl input_len(%rip), %ecx /* input_len */
|
2009-05-09 00:42:16 +00:00
|
|
|
movq %rbp, %r8 /* output target address */
|
2020-07-31 23:07:50 +00:00
|
|
|
movl output_len(%rip), %r9d /* decompressed length, end of relocs */
|
x86/boot: Robustify calling startup_{32,64}() from the decompressor code
After commit ce697ccee1a8 ("kbuild: remove head-y syntax"), I
started digging whether x86 is ready for removing this old cruft.
Removing its objects from the list makes the kernel unbootable.
This applies only to bzImage, vmlinux still works correctly.
The reason is that with no strict object order determined by the
linker arguments, not the linker script, startup_64 can be placed
not right at the beginning of the kernel.
Here's vmlinux.map's beginning before removing:
ffffffff81000000 vmlinux.o:(.head.text)
ffffffff81000000 startup_64
ffffffff81000070 secondary_startup_64
ffffffff81000075 secondary_startup_64_no_verify
ffffffff81000160 verify_cpu
and after:
ffffffff81000000 vmlinux.o:(.head.text)
ffffffff81000000 pvh_start_xen
ffffffff81000080 startup_64
ffffffff810000f0 secondary_startup_64
ffffffff810000f5 secondary_startup_64_no_verify
Not a problem itself, but the self-extractor code has the address of
that function hardcoded the beginning, not looking onto the ELF
header, which always contains the address of startup_{32,64}().
So, instead of doing an "act of blind faith", just take the address
from the ELF header and extract a relative offset to the entry
point. The decompressor function already returns a pointer to the
beginning of the kernel to the Asm code, which then jumps to it,
so add that offset to the return value.
This doesn't change anything for now, but allows to resign from the
"head object list" for x86 and makes sure valid Kbuild or any other
improvements won't break anything here in general.
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Jiri Slaby <jirislaby@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230109170403.4117105-2-alexandr.lobakin@intel.com
2023-01-09 17:04:02 +00:00
|
|
|
call extract_kernel /* returns kernel entry point in %rax */
|
2007-05-02 17:27:07 +00:00
|
|
|
popq %rsi
|
2005-04-16 22:20:36 +00:00
|
|
|
|
|
|
|
/*
|
2007-05-02 17:27:07 +00:00
|
|
|
* Jump to the decompressed kernel.
|
2005-04-16 22:20:36 +00:00
|
|
|
*/
|
2013-10-11 00:18:14 +00:00
|
|
|
jmp *%rax
|
2019-10-11 11:50:47 +00:00
|
|
|
SYM_FUNC_END(.Lrelocated)
|
2005-04-16 22:20:36 +00:00
|
|
|
|
2013-01-24 20:20:00 +00:00
|
|
|
.code32
|
2018-03-12 10:02:46 +00:00
|
|
|
/*
|
|
|
|
* This is the 32-bit trampoline that will be copied over to low memory.
|
|
|
|
*
|
|
|
|
* RDI contains the return address (might be above 4G).
|
|
|
|
* ECX contains the base address of the trampoline memory.
|
2019-02-06 15:29:08 +00:00
|
|
|
* Non zero RDX means trampoline needs to enable 5-level paging.
|
2018-03-12 10:02:46 +00:00
|
|
|
*/
|
2019-10-11 11:51:02 +00:00
|
|
|
SYM_CODE_START(trampoline_32bit_src)
|
2018-02-26 18:04:50 +00:00
|
|
|
/* Set up data and stack segments */
|
2017-06-06 11:31:25 +00:00
|
|
|
movl $__KERNEL_DS, %eax
|
|
|
|
movl %eax, %ds
|
|
|
|
movl %eax, %ss
|
|
|
|
|
2018-03-12 10:02:44 +00:00
|
|
|
/* Set up new stack */
|
|
|
|
leal TRAMPOLINE_32BIT_STACK_END(%ecx), %esp
|
|
|
|
|
2017-06-06 11:31:25 +00:00
|
|
|
/* Disable paging */
|
|
|
|
movl %cr0, %eax
|
|
|
|
btrl $X86_CR0_PG_BIT, %eax
|
|
|
|
movl %eax, %cr0
|
|
|
|
|
2018-03-12 10:02:45 +00:00
|
|
|
/* Check what paging mode we want to be in after the trampoline */
|
2020-10-29 16:02:58 +00:00
|
|
|
testl %edx, %edx
|
2018-03-12 10:02:45 +00:00
|
|
|
jz 1f
|
2017-06-06 11:31:25 +00:00
|
|
|
|
2018-03-12 10:02:45 +00:00
|
|
|
/* We want 5-level paging: don't touch CR3 if it already points to 5-level page tables */
|
2017-06-06 11:31:25 +00:00
|
|
|
movl %cr4, %eax
|
2018-03-12 10:02:45 +00:00
|
|
|
testl $X86_CR4_LA57, %eax
|
|
|
|
jnz 3f
|
|
|
|
jmp 2f
|
|
|
|
1:
|
|
|
|
/* We want 4-level paging: don't touch CR3 if it already points to 4-level page tables */
|
|
|
|
movl %cr4, %eax
|
|
|
|
testl $X86_CR4_LA57, %eax
|
|
|
|
jz 3f
|
|
|
|
2:
|
|
|
|
/* Point CR3 to the trampoline's new top level page table */
|
|
|
|
leal TRAMPOLINE_32BIT_PGTABLE_OFFSET(%ecx), %eax
|
|
|
|
movl %eax, %cr3
|
|
|
|
3:
|
2019-01-04 05:44:11 +00:00
|
|
|
/* Set EFER.LME=1 as a precaution in case hypervsior pulls the rug */
|
|
|
|
pushl %ecx
|
2019-02-06 11:52:53 +00:00
|
|
|
pushl %edx
|
2019-01-04 05:44:11 +00:00
|
|
|
movl $MSR_EFER, %ecx
|
|
|
|
rdmsr
|
|
|
|
btsl $_EFER_LME, %eax
|
x86/boot: Avoid #VE during boot for TDX platforms
There are a few MSRs and control register bits that the kernel
normally needs to modify during boot. But, TDX disallows
modification of these registers to help provide consistent security
guarantees. Fortunately, TDX ensures that these are all in the correct
state before the kernel loads, which means the kernel does not need to
modify them.
The conditions to avoid are:
* Any writes to the EFER MSR
* Clearing CR4.MCE
This theoretically makes the guest boot more fragile. If, for instance,
EFER was set up incorrectly and a WRMSR was performed, it will trigger
early exception panic or a triple fault, if it's before early
exceptions are set up. However, this is likely to trip up the guest
BIOS long before control reaches the kernel. In any case, these kinds
of problems are unlikely to occur in production environments, and
developers have good debug tools to fix them quickly.
Change the common boot code to work on TDX and non-TDX systems.
This should have no functional effect on non-TDX systems.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20220405232939.73860-24-kirill.shutemov@linux.intel.com
2022-04-05 23:29:32 +00:00
|
|
|
/* Avoid writing EFER if no change was made (for TDX guest) */
|
|
|
|
jc 1f
|
2019-01-04 05:44:11 +00:00
|
|
|
wrmsr
|
x86/boot: Avoid #VE during boot for TDX platforms
There are a few MSRs and control register bits that the kernel
normally needs to modify during boot. But, TDX disallows
modification of these registers to help provide consistent security
guarantees. Fortunately, TDX ensures that these are all in the correct
state before the kernel loads, which means the kernel does not need to
modify them.
The conditions to avoid are:
* Any writes to the EFER MSR
* Clearing CR4.MCE
This theoretically makes the guest boot more fragile. If, for instance,
EFER was set up incorrectly and a WRMSR was performed, it will trigger
early exception panic or a triple fault, if it's before early
exceptions are set up. However, this is likely to trip up the guest
BIOS long before control reaches the kernel. In any case, these kinds
of problems are unlikely to occur in production environments, and
developers have good debug tools to fix them quickly.
Change the common boot code to work on TDX and non-TDX systems.
This should have no functional effect on non-TDX systems.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20220405232939.73860-24-kirill.shutemov@linux.intel.com
2022-04-05 23:29:32 +00:00
|
|
|
1: popl %edx
|
2019-01-04 05:44:11 +00:00
|
|
|
popl %ecx
|
|
|
|
|
x86/boot: Avoid #VE during boot for TDX platforms
There are a few MSRs and control register bits that the kernel
normally needs to modify during boot. But, TDX disallows
modification of these registers to help provide consistent security
guarantees. Fortunately, TDX ensures that these are all in the correct
state before the kernel loads, which means the kernel does not need to
modify them.
The conditions to avoid are:
* Any writes to the EFER MSR
* Clearing CR4.MCE
This theoretically makes the guest boot more fragile. If, for instance,
EFER was set up incorrectly and a WRMSR was performed, it will trigger
early exception panic or a triple fault, if it's before early
exceptions are set up. However, this is likely to trip up the guest
BIOS long before control reaches the kernel. In any case, these kinds
of problems are unlikely to occur in production environments, and
developers have good debug tools to fix them quickly.
Change the common boot code to work on TDX and non-TDX systems.
This should have no functional effect on non-TDX systems.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20220405232939.73860-24-kirill.shutemov@linux.intel.com
2022-04-05 23:29:32 +00:00
|
|
|
#ifdef CONFIG_X86_MCE
|
|
|
|
/*
|
|
|
|
* Preserve CR4.MCE if the kernel will enable #MC support.
|
|
|
|
* Clearing MCE may fault in some environments (that also force #MC
|
|
|
|
* support). Any machine check that occurs before #MC support is fully
|
|
|
|
* configured will crash the system regardless of the CR4.MCE value set
|
|
|
|
* here.
|
|
|
|
*/
|
|
|
|
movl %cr4, %eax
|
|
|
|
andl $X86_CR4_MCE, %eax
|
|
|
|
#else
|
|
|
|
movl $0, %eax
|
|
|
|
#endif
|
|
|
|
|
2018-03-12 10:02:45 +00:00
|
|
|
/* Enable PAE and LA57 (if required) paging modes */
|
x86/boot: Avoid #VE during boot for TDX platforms
There are a few MSRs and control register bits that the kernel
normally needs to modify during boot. But, TDX disallows
modification of these registers to help provide consistent security
guarantees. Fortunately, TDX ensures that these are all in the correct
state before the kernel loads, which means the kernel does not need to
modify them.
The conditions to avoid are:
* Any writes to the EFER MSR
* Clearing CR4.MCE
This theoretically makes the guest boot more fragile. If, for instance,
EFER was set up incorrectly and a WRMSR was performed, it will trigger
early exception panic or a triple fault, if it's before early
exceptions are set up. However, this is likely to trip up the guest
BIOS long before control reaches the kernel. In any case, these kinds
of problems are unlikely to occur in production environments, and
developers have good debug tools to fix them quickly.
Change the common boot code to work on TDX and non-TDX systems.
This should have no functional effect on non-TDX systems.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20220405232939.73860-24-kirill.shutemov@linux.intel.com
2022-04-05 23:29:32 +00:00
|
|
|
orl $X86_CR4_PAE, %eax
|
2020-10-29 16:02:58 +00:00
|
|
|
testl %edx, %edx
|
2018-03-12 10:02:45 +00:00
|
|
|
jz 1f
|
|
|
|
orl $X86_CR4_LA57, %eax
|
|
|
|
1:
|
2017-06-06 11:31:25 +00:00
|
|
|
movl %eax, %cr4
|
|
|
|
|
2018-03-12 10:02:46 +00:00
|
|
|
/* Calculate address of paging_enabled() once we are executing in the trampoline */
|
2019-09-06 07:55:50 +00:00
|
|
|
leal .Lpaging_enabled - trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_OFFSET(%ecx), %eax
|
2017-06-06 11:31:25 +00:00
|
|
|
|
2018-03-12 10:02:46 +00:00
|
|
|
/* Prepare the stack for far return to Long Mode */
|
2017-06-06 11:31:25 +00:00
|
|
|
pushl $__KERNEL_CS
|
2018-03-12 10:02:46 +00:00
|
|
|
pushl %eax
|
2017-06-06 11:31:25 +00:00
|
|
|
|
2022-04-05 23:29:31 +00:00
|
|
|
/* Enable paging again. */
|
|
|
|
movl %cr0, %eax
|
|
|
|
btsl $X86_CR0_PG_BIT, %eax
|
2017-06-06 11:31:25 +00:00
|
|
|
movl %eax, %cr0
|
|
|
|
|
|
|
|
lret
|
2019-10-11 11:51:02 +00:00
|
|
|
SYM_CODE_END(trampoline_32bit_src)
|
2017-06-06 11:31:25 +00:00
|
|
|
|
2018-03-12 10:02:46 +00:00
|
|
|
.code64
|
2019-10-11 11:50:47 +00:00
|
|
|
SYM_FUNC_START_LOCAL_NOALIGN(.Lpaging_enabled)
|
2018-03-12 10:02:46 +00:00
|
|
|
/* Return from the trampoline */
|
|
|
|
jmp *%rdi
|
2019-10-11 11:50:47 +00:00
|
|
|
SYM_FUNC_END(.Lpaging_enabled)
|
2018-03-12 10:02:46 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* The trampoline code has a size limit.
|
|
|
|
* Make sure we fail to compile if the trampoline code grows
|
|
|
|
* beyond TRAMPOLINE_32BIT_CODE_SIZE bytes.
|
|
|
|
*/
|
|
|
|
.org trampoline_32bit_src + TRAMPOLINE_32BIT_CODE_SIZE
|
|
|
|
|
|
|
|
.code32
|
2019-10-11 11:50:47 +00:00
|
|
|
SYM_FUNC_START_LOCAL_NOALIGN(.Lno_longmode)
|
2018-03-12 10:02:46 +00:00
|
|
|
/* This isn't an x86-64 CPU, so hang intentionally, we cannot continue */
|
2013-01-24 20:20:00 +00:00
|
|
|
1:
|
|
|
|
hlt
|
|
|
|
jmp 1b
|
2019-10-11 11:50:47 +00:00
|
|
|
SYM_FUNC_END(.Lno_longmode)
|
2013-01-24 20:20:00 +00:00
|
|
|
|
2022-11-22 16:10:06 +00:00
|
|
|
.globl verify_cpu
|
2013-01-24 20:20:00 +00:00
|
|
|
#include "../../kernel/verify_cpu.S"
|
|
|
|
|
2007-05-02 17:27:07 +00:00
|
|
|
.data
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_START_LOCAL(gdt64)
|
2020-02-02 17:13:52 +00:00
|
|
|
.word gdt_end - gdt - 1
|
2020-02-02 17:13:53 +00:00
|
|
|
.quad gdt - gdt64
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_END(gdt64)
|
2019-06-27 04:55:25 +00:00
|
|
|
.balign 8
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_START_LOCAL(gdt)
|
2020-02-02 17:13:52 +00:00
|
|
|
.word gdt_end - gdt - 1
|
2020-02-02 17:13:53 +00:00
|
|
|
.long 0
|
2007-05-02 17:27:07 +00:00
|
|
|
.word 0
|
2017-06-06 11:31:25 +00:00
|
|
|
.quad 0x00cf9a000000ffff /* __KERNEL32_CS */
|
2007-05-02 17:27:07 +00:00
|
|
|
.quad 0x00af9a000000ffff /* __KERNEL_CS */
|
|
|
|
.quad 0x00cf92000000ffff /* __KERNEL_DS */
|
2007-08-10 20:31:05 +00:00
|
|
|
.quad 0x0080890000000000 /* TS descriptor */
|
|
|
|
.quad 0x0000000000000000 /* TS continued */
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_END_LABEL(gdt, SYM_L_LOCAL, gdt_end)
|
2008-04-08 10:54:30 +00:00
|
|
|
|
2020-09-07 13:15:14 +00:00
|
|
|
SYM_DATA_START(boot_idt_desc)
|
|
|
|
.word boot_idt_end - boot_idt - 1
|
|
|
|
.quad 0
|
|
|
|
SYM_DATA_END(boot_idt_desc)
|
|
|
|
.balign 8
|
|
|
|
SYM_DATA_START(boot_idt)
|
|
|
|
.rept BOOT_IDT_ENTRIES
|
|
|
|
.quad 0
|
|
|
|
.quad 0
|
|
|
|
.endr
|
|
|
|
SYM_DATA_END_LABEL(boot_idt, SYM_L_GLOBAL, boot_idt_end)
|
|
|
|
|
2009-05-08 22:59:13 +00:00
|
|
|
/*
|
|
|
|
* Stack and heap for uncompression
|
|
|
|
*/
|
|
|
|
.bss
|
|
|
|
.balign 4
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_LOCAL(boot_heap, .fill BOOT_HEAP_SIZE, 1, 0)
|
|
|
|
|
|
|
|
SYM_DATA_START_LOCAL(boot_stack)
|
2008-04-08 10:54:30 +00:00
|
|
|
.fill BOOT_STACK_SIZE, 1, 0
|
2020-06-17 13:19:57 +00:00
|
|
|
.balign 16
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_END_LABEL(boot_stack, SYM_L_LOCAL, boot_stack_end)
|
2009-05-08 23:20:34 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Space for page tables (not in .bss so not zeroed)
|
|
|
|
*/
|
2020-01-09 15:02:17 +00:00
|
|
|
.section ".pgtable","aw",@nobits
|
2009-05-08 23:20:34 +00:00
|
|
|
.balign 4096
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_LOCAL(pgtable, .fill BOOT_PGT_SIZE, 1, 0)
|
2018-05-16 08:01:29 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* The page table is going to be used instead of page table in the trampoline
|
|
|
|
* memory.
|
|
|
|
*/
|
2019-10-11 11:50:52 +00:00
|
|
|
SYM_DATA_LOCAL(top_pgtable, .fill PAGE_SIZE, 1, 0)
|