Commit Graph

782741 Commits

Author SHA1 Message Date
Harald Freudenberger
aa55bf5f02 s390/zcrypt: add ap_adapter_mask sysfs attribute
This patch provides a new sysfs attribute file
/sys/bus/ap/ap_adapter_mask. This read-only attribute
refrects the apm field as it is found in the PQAP(QCI)
crypto info.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:37 +02:00
Harald Freudenberger
a17b92e048 s390/zcrypt: provide apfs failure code on type 86 error reply
The apfs field (AP final status) is set on transport
protocol failures (reply code 0x90) for type 86 replies.
For CCA cprbs this value is copied into the xcrb status
field which gives userspace a hint for the failure reason.
However, for EP11 cprbs there is no such status field
in the xcrb struct. So now regardless of the request
type, if a reply type 86 with transport protocol failure
is seen, the apfs value is printed as part of the debug
message. So the user has a chance to see the apfs value
without using a special build kernel.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:36 +02:00
Harald Freudenberger
ee410de890 s390/zcrypt: zcrypt device driver cleanup
Some cleanup in the s390 zcrypt device driver:
- Removed fragments of pcixx crypto card code. This code
  can't be reached anymore because the hardware detection
  function does not recognize crypto cards < CEX2 since
  commit f56545430736 ("s390/zcrypt: Introduce QACT support
  for AP bus devices.")
- Rename of some files and driver names which where still
  reflecting pcixx support to cex2a/cex2c.
- Removed all the zcrypt version strings in the file headers.
  There is only one place left - the zcrypt.h header file is
  now the only place for zcrypt device driver version info.
- Zcrypt version pump up from 2.2.0 to 2.2.1.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:35 +02:00
Vasily Gorbik
78333d1f90 s390/kasan: add support for mem= kernel parameter
Handle mem= kernel parameter in kasan to limit physical memory.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:35 +02:00
Vasily Gorbik
12e55fa194 s390/kasan: optimize kasan vmemmap allocation
Kasan implementation now supports memory hotplug operations. For that
reason regions of initially standby memory are now skipped from
shadow mapping and are mapped/unmapped dynamically upon bringing
memory online/offline.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:34 +02:00
Vasily Gorbik
296352397d s390/kasan: avoid kasan crash with standby memory defined
Kasan early memory allocator simply chops off memory blocks from the
end of the physical memory. Reuse mem_detect info to identify actual
online memory end rather than using max_physmem_end. This allows to run
the kernel with kasan enabled and standby memory defined.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:33 +02:00
Vasily Gorbik
19733fe872 s390/head: avoid doubling early boot stack size under KASAN
Early boot stack uses predefined 4 pages of memory 0x8000-0xC000. This
stack is used to run not instumented decompressor/facilities
verification C code. It doesn't make sense to double its size when
the kernel is built with KASAN support. BOOT_STACK_ORDER is introduced
to avoid that.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:32 +02:00
Vasily Gorbik
6cad0eb561 s390/mm: improve debugfs ptdump markers walking
This allows to print multiple markers when they happened to have the
same value.

...
0x001bfffff0100000-0x001c000000000000       255M PMD I
---[ Kasan Shadow End ]---
---[ vmemmap Area ]---
0x001c000000000000-0x001c000002000000        32M PMD RW X
...

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:31 +02:00
Vasily Gorbik
e006222b57 s390/mm: optimize debugfs ptdump kasan zero page walking
Kasan zero p4d/pud/pmd/pte are always filled in with corresponding
kasan zero entries. Walking kasan zero page backed area is time
consuming and unnecessary. When kasan zero p4d/pud/pmd is encountered,
it eventually points to the kasan zero page always with the same
attributes and nothing but it, therefore zero p4d/pud/pmd could
be jumped over.

Also adds a space between address range and pages number to separate
them from each other when pages number is huge.

0x0018000000000000-0x0018000010000000       256M PMD RW X
0x0018000010000000-0x001bfffff0000000 1073741312M PTE RO X
0x001bfffff0000000-0x001bfffff0001000         4K PTE RW X

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:30 +02:00
Vasily Gorbik
5dff03813f s390/kasan: add option for 4-level paging support
By default 3-level paging is used when the kernel is compiled with
kasan support. Add 4-level paging option to support systems with more
then 3TB of physical memory and to cover 4-level paging specific code
with kasan as well.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:29 +02:00
Vasily Gorbik
135ff16393 s390/kasan: free early identity mapping structures
Kasan initialization code is changed to populate persistent shadow
first, save allocator position into pgalloc_freeable and proceed with
early identity mapping creation. This way early identity mapping paging
structures could be freed at once after switching to swapper_pg_dir
when early identity mapping is not needed anymore.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:29 +02:00
Vasily Gorbik
5e78596329 s390/kasan: enable stack and global variables access checks
By defining KASAN_SHADOW_OFFSET in Kconfig stack and global variables
memory access check instrumentation is enabled. gcc version 4.9.2 or
newer is also required.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:28 +02:00
Vasily Gorbik
f4f0d32bfb s390/dumpstack: disable __dump_trace kasan instrumentation
Walking async_stack produces false positives. Disable __dump_trace
function instrumentation for now.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:27 +02:00
Vasily Gorbik
ac1256f826 s390/kasan: reipl and kexec support
Some functions from both arch/s390/kernel/ipl.c and
arch/s390/kernel/machine_kexec.c are called without DAT enabled
(or with and without DAT enabled code paths). There is no easy way
to partially disable kasan for those files without a substantial
rework. Disable kasan for both files for now.

To avoid disabling kasan for arch/s390/kernel/diag.c DAT flag is
enabled in diag308 call. pcpu_delegate which disables DAT is marked
with __no_sanitize_address to disable instrumentation for that one
function.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:27 +02:00
Vasily Gorbik
9e8df6daed s390/smp: kasan stack instrumentation support
smp_start_secondary function is called without DAT enabled. To avoid
disabling kasan instrumentation for entire arch/s390/kernel/smp.c
smp_start_secondary has been split in 2 parts. smp_start_secondary has
instrumentation disabled, it does minimal setup and enables DAT. Then
instrumentated __smp_start_secondary is called to do the rest.

__load_psw_mask function instrumentation has been disabled as well
to be able to call it from smp_start_secondary.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:26 +02:00
Vasily Gorbik
dde709d136 compiler: introduce __no_sanitize_address_or_inline
Due to conflict between kasan instrumentation and inlining
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368 functions which are
defined as inline could not be called from functions defined with
__no_sanitize_address.

Introduce __no_sanitize_address_or_inline which would expand to
__no_sanitize_address when the kernel is built with kasan support and
to inline otherwise. This helps to avoid disabling kasan
instrumentation for entire files.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:25 +02:00
Vasily Gorbik
d58106c3ec s390/kasan: use noexec and large pages
To lower memory footprint and speed up kasan initialisation detect
EDAT availability and use large pages if possible. As we know how
much memory is needed for initialisation, another simplistic large
page allocator is introduced to avoid memory fragmentation.

Since facilities list is retrieved anyhow, detect noexec support and
adjust pages attributes. Handle noexec kernel option to avoid inconsistent
kasan shadow memory pages flags.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:24 +02:00
Vasily Gorbik
793213a82d s390/kasan: dynamic shadow mem allocation for modules
Move from modules area entire shadow memory preallocation to dynamic
allocation per module load.

This behaivior has been introduced for x86 with bebf56a1b: "This patch
also forces module_alloc() to return 8*PAGE_SIZE aligned address making
shadow memory handling ( kasan_module_alloc()/kasan_module_free() )
more simple. Such alignment guarantees that each shadow page backing
modules address space correspond to only one module_alloc() allocation"

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:23 +02:00
Vasily Gorbik
0dac8f6bc3 s390/mm: add kasan shadow to the debugfs pgtable dump
This change adds address space markers for kasan shadow memory.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:22 +02:00
Vasily Gorbik
b6cbe3e8bd s390/kasan: avoid user access code instrumentation
Kasan instrumentation adds "store" check for variables marked as
modified by inline assembly. With user pointers containing addresses
from another address space this produces false positives.

static inline unsigned long clear_user_xc(void __user *to, ...)
{
	asm volatile(
	...
	: "+a" (to) ...

User space access functions are wrapped by manually instrumented
functions in kasan common code, which should be sufficient to catch
errors. So, we just disable uaccess.o instrumentation altogether.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:21 +02:00
Vasily Gorbik
7fef92ccad s390/kasan: double the stack size
Kasan stack instrumentation pads stack variables with redzones, which
increases stack frames size significantly. Stack sizes are increased
from 16k to 32k in the code, as well as for the kernel stack overflow
detection option (CHECK_STACK).

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:21 +02:00
Vasily Gorbik
42db5ed860 s390/kasan: add initialization code and enable it
Kasan needs 1/8 of kernel virtual address space to be reserved as the
shadow area. And eventually it requires the shadow memory offset to be
known at compile time (passed to the compiler when full instrumentation
is enabled).  Any value picked as the shadow area offset for 3-level
paging would eat up identity mapping on 4-level paging (with 1PB
shadow area size). So, the kernel sticks to 3-level paging when kasan
is enabled. 3TB border is picked as the shadow offset.  The memory
layout is adjusted so, that physical memory border does not exceed
KASAN_SHADOW_START and vmemmap does not go below KASAN_SHADOW_END.

Due to the fact that on s390 paging is set up very late and to cover
more code with kasan instrumentation, temporary identity mapping and
final shadow memory are set up early. The shadow memory mapping is
later carried over to init_mm.pgd during paging_init.

For the needs of paging structures allocation and shadow memory
population a primitive allocator is used, which simply chops off
memory blocks from the end of the physical memory.

Kasan currenty doesn't track vmemmap and vmalloc areas.

Current memory layout (for 3-level paging, 2GB physical memory).

---[ Identity Mapping ]---
0x0000000000000000-0x0000000000100000
---[ Kernel Image Start ]---
0x0000000000100000-0x0000000002b00000
---[ Kernel Image End ]---
0x0000000002b00000-0x0000000080000000        2G <- physical memory border
0x0000000080000000-0x0000030000000000     3070G PUD I
---[ Kasan Shadow Start ]---
0x0000030000000000-0x0000030010000000      256M PMD RW X  <- shadow for 2G memory
0x0000030010000000-0x0000037ff0000000   523776M PTE RO NX <- kasan zero ro page
0x0000037ff0000000-0x0000038000000000      256M PMD RW X  <- shadow for 2G modules
---[ Kasan Shadow End ]---
0x0000038000000000-0x000003d100000000      324G PUD I
---[ vmemmap Area ]---
0x000003d100000000-0x000003e080000000
---[ vmalloc Area ]---
0x000003e080000000-0x000003ff80000000
---[ Modules Area ]---
0x000003ff80000000-0x0000040000000000        2G

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:20 +02:00
Vasily Gorbik
d0e2eb0a36 s390: add pgd_page primitive
Add pgd_page primitive which is required by kasan common code.
Also fixes typo in p4d_page definition.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:19 +02:00
Vasily Gorbik
34377d3cfb s390: introduce MAX_PTRS_PER_P4D
Kasan common code requires MAX_PTRS_PER_P4D definition, which in case
of s390 is always PTRS_PER_P4D.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:18 +02:00
Vasily Gorbik
fb594ec13e s390/kasan: replace some memory functions
Follow the common kasan approach:

    "KASan replaces memory functions with manually instrumented
    variants.  Original functions declared as weak symbols so strong
    definitions in mm/kasan/kasan.c could replace them. Original
    functions have aliases with '__' prefix in name, so we could call
    non-instrumented variant if needed."

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:18 +02:00
Vasily Gorbik
0a9b40911b s390/kasan: avoid instrumentation of early C code
Instrumented C code cannot run without the kasan shadow area. Exempt
source code files from kasan which are running before / used during
kasan initialization.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:17 +02:00
Vasily Gorbik
3484984585 s390/kasan: avoid vdso instrumentation
vdso is mapped into user space processes, which won't have kasan
shodow mapped.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:16 +02:00
Vasily Gorbik
75f195420a s390/mm: add missing pfn_to_kaddr helper
kasan common code uses pfn_to_kaddr, which is defined by many other
architectures. Adding it as well to avoid a build error.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:15 +02:00
Vasily Gorbik
49698745e5 s390: move ipl block and cmd line handling to early boot phase
To distinguish zfcpdump case and to be able to parse some of the kernel
command line arguments early (e.g. mem=) ipl block retrieval and command
line construction code is moved to the early boot phase.

"memory_end" is set up correctly respecting "mem=" and hsa_size in case
of the zfcpdump.

arch/s390/boot/string.c is introduced to provide string handling and
command line parsing functions to early boot phase code for the compressed
kernel image case.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:14 +02:00
Vasily Gorbik
b09decfd99 s390/sclp: introduce sclp_early_get_hsa_size
Introduce sclp_early_get_hsa_size function to be used during early
memory detection. This function allows to find a memory limit imposed
during zfcpdump.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:13 +02:00
Vasily Gorbik
f01b8bca08 s390/mem_detect: add info source debug print
Print mem_detect info source when memblock=debug is specified.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:13 +02:00
Vasily Gorbik
54c57795e8 s390/mem_detect: replace tprot loop with binary search
In a situation when other memory detection methods are not available
(no SCLP and no z/VM diag260), continuous online memory is assumed.
Replacing tprot loop with faster binary search, as only online memory
end has to be found.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:12 +02:00
Vasily Gorbik
cd45c99561 s390/mem_detect: use SCLP info for continuous memory detection
When neither SCLP storage info, nor z/VM diag260 "storage configuration"
are available assume a continuous online memory of size specified by
SCLP info.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:11 +02:00
Vasily Gorbik
6e98e64329 s390/mem_detect: introduce z/VM specific diag260 call
In the case when z/VM memory is defined with "define storage config"
command, SCLP storage info is not available. Utilize diag260 "storage
configuration" call, to get information about z/VM specific guest memory
definitions with potential memory holes.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:10 +02:00
Vasily Gorbik
fddbaa5c42 s390/mem_detect: introduce SCLP storage info
SCLP storage info allows to detect continuous and non-continuous online
memory under LPAR, z/VM and KVM, when standby memory is defined.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:09 +02:00
Vasily Gorbik
251b72a440 s390: introduce .boot.data section compile time validation
Make sure that .boot.data sections of vmlinux and
arch/s390/compressed/vmlinux match before producing the compressed kernel
image. Symbols presence, order and sizes are cross-checked.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:08 +02:00
Vasily Gorbik
6966d604e2 s390/mem_detect: move tprot loop to early boot phase
Move memory detection to early boot phase. To store online memory
regions "struct mem_detect_info" has been introduced together with
for_each_mem_detect_block iterator. mem_detect_info is later converted
to memblock.

Also introduces sclp_early_get_meminfo function to get maximum physical
memory and maximum increment number.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:08 +02:00
Vasily Gorbik
17aacfbfa1 s390/sclp: move sclp_early_read_info to sclp_early_core.c
To enable early online memory detection sclp_early_read_info has
been moved to sclp_early_core.c. sclp_info_sccb has been made a part
of .boot.data, which allows to reuse it later during early kernel
startup and make sclp_early_read_info call just once.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:07 +02:00
Vasily Gorbik
d1b52a4388 s390: introduce .boot.data section
Introduce .boot.data section which is "shared" between the decompressor
code and the decompressed kernel. The decompressor will store values in
it, and copy over to the decompressed image before starting it. This
method allows to avoid using pre-defined addresses and other hacks to
pass values between those boot phases.

.boot.data section is a part of init data, and will be freed after kernel
initialization is complete.

For uncompressed kernel image, .boot.data section is basically the same
as .init.data

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:06 +02:00
Vasily Gorbik
7516fc11e4 s390/decompressor: clean up and rename compressed/misc.c
Since compressed/misc.c is conditionally compiled move error reporting
code to boot/main.c. With that being done compressed/misc.c has no
"miscellaneous" functions left and is all about plain decompression
now. Rename it accordingly.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:05 +02:00
Vasily Gorbik
15426ca43d s390: rescue initrd as early as possible
To avoid multi-stage initrd rescue operation and to simplify
assumptions during early memory allocations move initrd at some final
safe destination as early as possible. This would also allow us to
drop .bss usage restrictions for some files.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:05 +02:00
Vasily Gorbik
3b076dca14 s390/sclp: simplify early hsa_size detection
Architecture documentation suggests that hsa_size has been available in
the read info since the list-directed ipl dump has been introduced. By
using this value few early sclp calls could be avoided.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:04 +02:00
Vasily Gorbik
a2ac1bb1f3 s390/decompressor: get rid of .bss usage
Using .bss in early code should be avoided. It might overlay initrd
image or not yet be initialized. Clean up the last couple of places in
the decompressor's code where .bss is used and enfore no .bss usage
check on boot/compressed/misc.c. In particular:
- initializing free_mem_ptr and free_mem_end_ptr with values guarantee
that these variables won't end up in the .bss section.
- define STATIC_RW_DATA to go into .data section.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:03 +02:00
Vasily Gorbik
369f91c374 s390/decompressor: rework uncompressed image info collection
The kernel decompressor has to know several bits of information about
uncompressed image. Currently this info is collected by running "nm" on
uncompressed vmlinux + "sed" and producing sizes.h file. This method
worked well, but it has several disadvantages. Obscure symbols name
pattern matching is fragile. Adding new values makes pattern even
longer. Logic is spread across code and make file. Limited ability to
adjust symbols values (currently magic lma value of 0x100000 is always
subtracted). Apart from that same pieces of information (and more)
would be needed for early memory detection and features like KASLR
outside of boot/compressed/ folder where sizes.h is generated.

To overcome limitations new "struct vmlinux_info" has been introduced
to include values needed for the decompressor and the rest of the
boot code. The only static instance of vmlinux_info is produced during
vmlinux link step by filling in struct fields by the linker (like it is
done with input_data in boot/compressed/vmlinux.scr.lds.S). This way
individual values could be adjusted with all the knowledge linker has
and arithmetic it supports. Later .vmlinux.info section (which contains
struct vmlinux_info) is transplanted into the decompressor image and
dropped from uncompressed image altogether.

While doing that replace "compressed/vmlinux.scr.lds.S" linker
script (whose purpose is to rename .data section in piggy.o to
.rodata.compressed) with plain objcopy command. And simplify
decompressor's linker script.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:02 +02:00
Vasily Gorbik
8f75582a2f s390: remove decompressor's head.S
Decompressor's head.S provided "data mover" sole purpose of which has
been to safely move uncompressed kernel at 0x100000 and jump to it.

With current bzImage layout entire decompressor's code guaranteed to be
in a safe location under 0x100000, and hence could not be overwritten
during kernel move. For that reason head.S could be replaced with simple
memmove function. To do so introduce early boot code phase which is
executed from arch/s390/boot/head.S after "verify_facilities" and takes
care of optional kernel image decompression and transition to it.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:21:02 +02:00
Vasily Gorbik
32ce55a659 s390: unify stack size definitions
Remove STACK_ORDER and STACK_SIZE in favour of identical THREAD_SIZE_ORDER
and THREAD_SIZE definitions. THREAD_SIZE and THREAD_SIZE_ORDER naming is
misleading since it is used as general kernel stack size information. But
both those definitions are used in the common code and throughout
architectures specific code, so changing the naming is problematic.

Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:58 +02:00
Martin Schwidefsky
ce3dc44749 s390: add support for virtually mapped kernel stacks
With virtually mapped kernel stacks the kernel stack overflow detection
is now fault based, every stack has a guard page in the vmalloc space.
The panic_stack is renamed to nodat_stack and is used for all function
that need to run without DAT, e.g. memcpy_real or do_start_kdump.

The main effect is a reduction in the kernel image size as with vmap
stacks the old style overflow checking that adds two instructions per
function is not needed anymore. Result from bloat-o-meter:

add/remove: 20/1 grow/shrink: 13/26854 up/down: 2198/-216240 (-214042)

In regard to performance the micro-benchmark for fork has a hit of a
few microseconds, allocating 4 pages in vmalloc space is more expensive
compare to an order-2 page allocation. But with real workload I could
not find a noticeable difference.

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:57 +02:00
Martin Schwidefsky
ff340d2472 s390: add stack switch helper
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:56 +02:00
Martin Schwidefsky
53c99bd665 init: add arch_call_rest_init to allow stack switching
With CONFIG_VMAP_STACK=y the kernel stack of all tasks should be
allocated in the vmalloc space. The initial stack used for all
the early init code is in the init_thread_union. To be able to
switch from this early stack to a properly allocated stack
from vmalloc the architecture needs a switch-over point.

Introduce the arch_call_rest_init() function with a weak definition
in init/main.c with the only purpose to call rest_init() from the
end of start_kernel(). The architecture override can then do the
necessary magic to switch to the new vmalloc'ed stack.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:55 +02:00
Martin Schwidefsky
00e9e6645a s390/pfault: do not use stack buffers for hardware data
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space.
Data structures passed to a hardware or a hypervisor interface that
requires V=R can not be allocated on the stack anymore.

Make the init and fini pfault parameter blocks static variables.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-10-09 11:20:54 +02:00