Commit Graph

967007 Commits

Author SHA1 Message Date
Heiko Carstens
0290c9e328 s390/mm: use invalid asce instead of kernel asce
Create a region 3 page table which contains only invalid entries, and
use that via "s390_invalid_asce" instead of the kernel ASCE whenever
there is either
- no user address space available, e.g. during early startup
- as an intermediate ASCE when address spaces are switched

This makes sure that user space accesses in such situations are
guaranteed to fail.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23 12:01:12 +01:00
Heiko Carstens
87d5986345 s390/mm: remove set_fs / rework address space handling
Remove set_fs support from s390. With doing this rework address space
handling and simplify it. As a result address spaces are now setup
like this:

CPU running in              | %cr1 ASCE | %cr7 ASCE | %cr13 ASCE
----------------------------|-----------|-----------|-----------
user space                  |  user     |  user     |  kernel
kernel, normal execution    |  kernel   |  user     |  kernel
kernel, kvm guest execution |  gmap     |  user     |  kernel

To achieve this the getcpu vdso syscall is removed in order to avoid
secondary address mode and a separate vdso address space in for user
space. The getcpu vdso syscall will be implemented differently with a
subsequent patch.

The kernel accesses user space always via secondary address space.
This happens in different ways:
- with mvcos in home space mode and directly read/write to secondary
  address space
- with mvcs/mvcp in primary space mode and copy from primary space to
  secondary space or vice versa
- with e.g. cs in secondary space mode and access secondary space

Switching translation modes happens with sacf before and after
instructions which access user space, like before.

Lazy handling of control register reloading is removed in the hope to
make everything simpler, but at the cost of making kernel entry and
exit a bit slower. That is: on kernel entry the primary asce is always
changed to contain the kernel asce, and on kernel exit the primary
asce is changed again so it contains the user asce.

In kernel mode there is only one exception to the primary asce: when
kvm guests are executed the primary asce contains the gmap asce (which
describes the guest address space). The primary asce is reset to
kernel asce whenever kvm guest execution is interrupted, so that this
doesn't has to be taken into account for any user space accesses.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23 12:01:12 +01:00
Heiko Carstens
77663819d4 Merge branch 'fixes' into features
* fixes:
  s390: fix fpu restore in entry.S

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23 12:00:42 +01:00
Sven Schnelle
1179f170b6 s390: fix fpu restore in entry.S
We need to disable interrupts in load_fpu_regs(). Otherwise an
interrupt might come in after the registers are loaded, but before
CIF_FPU is cleared in load_fpu_regs(). When the interrupt returns,
CIF_FPU will be cleared and the registers will never be restored.

The entry.S code usually saves the interrupt state in __SF_EMPTY on the
stack when disabling/restoring interrupts. sie64a however saves the pointer
to the sie control block in __SF_SIE_CONTROL, which references the same
location.  This is non-obvious to the reader. To avoid thrashing the sie
control block pointer in load_fpu_regs(), move the __SIE_* offsets eight
bytes after __SF_EMPTY on the stack.

Cc: <stable@vger.kernel.org> # 5.8
Fixes: 0b0ed657fe ("s390: remove critical section cleanup from entry.S")
Reported-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-23 11:52:13 +01:00
Heiko Carstens
334ef6ed06 init/Kconfig: make COMPILE_TEST depend on !S390
While allmodconfig and allyesconfig build for s390 there are also
various bots running compile tests with randconfig, where PCI is
disabled. This reveals that a lot of drivers should actually depend on
HAS_IOMEM.
Adding this to each device driver would be a never ending story,
therefore just disable COMPILE_TEST for s390.

The reasoning is more or less the same as described in
commit bc083a64b6 ("init/Kconfig: make COMPILE_TEST depend on !UML").

Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:12 +01:00
Alexander Gordeev
12bb4c6823 s390/vmem: make variable and function names consistent
Rename some variable and functions to better clarify
what they are and what they do.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:12 +01:00
Alexander Gordeev
af71657c15 s390/vmem: remove redundant check
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:12 +01:00
Julian Wiedmann
074ff04e27 s390/stp: let subsys_system_register() sysfs attributes
Instead of creating the sysfs attributes for the stp root_dev by hand,
pass them to subsys_system_register() as parameter.

This also ensures that the attributes are available when the KOBJ_ADD
event is raised.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
ba1a6be994 s390/decompressor: print cmdline and BEAR on pgm_check
Add kernel command line and last breaking event.
The kernel command line is taken from early_command_line and printed
only if kernel is not running as protected virtualization guest and
if it has been already initialized from the "COMMAND_LINE".

Linux version 5.10.0-rc3-22794-gecaa72788df0-dirty (gor@tuxmaker) #28 SMP PREEMPT Mon Nov 9 17:41:20 CET 2020
Kernel command line: audit_enable=0 audit=0 selinux=0 crashkernel=296M root=/dev/dasda1 dasd=ec5b
memblock=debug die
Kernel fault: interruption code 0005 ilc:2
PSW : 0000000180000000 0000000000012f92 (parse_boot_command_line+0x27a/0x46c)
      R:0 T:0 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 RI:0 EA:3
GPRS: 0000000000000000 00ffffffffffffff 0000000000000000 000000000001a65c
      000000000000bf60 0000000000000000 00000000000003c0 0000000000000000
      0000000000000080 000000000002322d 000000007f29ef20 0000000000efd018
      000000000311c000 0000000000010070 0000000000012f82 000000000000bea8
Call Trace:
(sp:000000000000bea8 [<000000000002016e>] 000000000002016e)
 sp:000000000000bf18 [<0000000000012408>] startup_kernel+0x88/0x2fc
 sp:000000000000bf60 [<00000000000100c4>] startup_normal+0xb0/0xb0
Last Breaking-Event-Address:
 [<00000000000135ba>] strcmp+0x22/0x24

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Acked-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
8977ab65b8 s390/decompressor: add stacktrace support
Decompressor works on a single statically allocated stack. Stacktrace
implementation with -mbackchain just takes few lines of code.

Linux version 5.10.0-rc3-22793-g0f84a355b776-dirty (gor@tuxmaker) #27 SMP PREEMPT Mon Nov 9 17:30:18 CET 2020
Kernel fault: interruption code 0005 ilc:2
PSW : 0000000180000000 0000000000012f92 (parse_boot_command_line+0x27a/0x46c)
      R:0 T:0 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 RI:0 EA:3
GPRS: 0000000000000000 00ffffffffffffff 0000000000000000 000000000001a62c
      000000000000bf60 0000000000000000 00000000000003c0 0000000000000000
      0000000000000080 000000000002322d 000000007f29ef20 0000000000efd018
      000000000311c000 0000000000010070 0000000000012f82 000000000000bea8
Call Trace:
(sp:000000000000bea8 [<000000000002016e>] 000000000002016e)
 sp:000000000000bf18 [<0000000000012408>] startup_kernel+0x88/0x2fc
 sp:000000000000bf60 [<00000000000100c4>] startup_normal+0xb0/0xb0

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
246218962e s390/decompressor: add symbols support
Information printed by print_pgm_check_info() is crucial for
debugging decompressor problems. Printing instruction addresses is
better than nothing, but turns further debugging into tedious job of
figuring out which function those addresses correspond to.

This change adds simplistic symbols resolution support. And adds %pS
format specifier support to decompressor_printk().

Decompressor symbols list is extracted and sorted with
nm -n -S:
...
0000000000010000 0000000000000014 T startup
0000000000010014 00000000000000b0 t startup_normal
0000000000010180 00000000000000b2 t startup_kdump
...

Then functions are filtered and contracted to a form:
"10000 14 startup\0""10014 b0 startup_normal\0""10180 b2 startup_kdump\0"
...
Which makes it trivial to find beginning of an entry and names are 0
terminated, so could be used as is. Symbols are binary-searched.

To get symbols list with final addresses and then get it into the
decompressor's image the same trick as for kallsyms is used.
Decompressor's vmlinux is linked twice.

Symbols are stored in .decompressor.syms section, current size is about
2kb.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
ec55d1e1db s390/decompressor: correct some asm symbols annotations
Use SYM_CODE_* annotations for asm functions, so that function lengths
are recognized correctly.

Also currently the most part of startup is marked as startup_kdump. Move
misplaced startup_kdump where it belongs.

$ nm -n -S arch/s390/boot/compressed/vmlinux
Before:
0000000000010000 T startup
0000000000010010 T startup_kdump
After:
0000000000010000 0000000000000014 T startup
0000000000010014 00000000000000b0 t startup_normal
0000000000010180 00000000000000b2 t startup_kdump

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
9a78c70a1b s390/decompressor: add decompressor_printk
The decompressor does not have any special debug means. Running the
kernel under qemu with gdb is helpful but tedious exercise if done
repeatedly. It is also not applicable to debugging under LPAR and z/VM.

One special thing which stands out is a working sclp_early_printk,
which could be used once the kernel switches to 64-bit addressing mode.

But sclp_early_printk does not provide any string formating capabilities.
Formatting and printing string without printk-alike function is a
not fun. The lack of printk-alike function means people would save up on
testing and introduce more bugs.

So, finally, provide decompressor_printk function, which fits on one
screen and trades features for simplicity.

It only supports "%s", "%x" and "%lx" specifiers and zero padding for
hex values.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
c9343637d6 s390/ftrace: assume -mhotpatch or -mrecord-mcount always available
Currently the kernel minimal compiler requirement is gcc 4.9 or
clang 10.0.1.
* gcc -mhotpatch option is supported since 4.8.
* A combination of -pg -mrecord-mcount -mnop-mcount -mfentry flags is
supported since gcc 9 and since clang 10.

Drop support for old -pg function prologues. Which leaves binary
compatible -mhotpatch / -mnop-mcount -mfentry prologues in a form:
	brcl	0,0
Which are also do not require initial nop optimization / conversion and
presence of _mcount symbol.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:11 +01:00
Vasily Gorbik
73045a08cf s390: unify identity mapping limits handling
Currently we have to consider too many different values which
in the end only affect identity mapping size. These are:
1. max_physmem_end - end of physical memory online or standby.
   Always <= end of the last online memory block (get_mem_detect_end()).
2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the
   kernel is able to support.
3. "mem=" kernel command line option which limits physical memory usage.
4. OLDMEM_BASE which is a kdump memory limit when the kernel is executed as
   crash kernel.
5. "hsa" size which is a memory limit when the kernel is executed during
   zfcp/nvme dump.

Through out kernel startup and run we juggle all those values at once
but that does not bring any amusement, only confusion and complexity.

Unify all those values to a single one we should really care, that is
our identity mapping size.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:10 +01:00
Julian Wiedmann
1e632eaa0f s390/prng: let misc_register() add the prng sysfs attributes
Instead of creating the sysfs attributes for the prng devices by hand,
describe them in .groups and let the misdevice core handle it.

This also ensures that the attributes are available when the KOBJ_ADD
event is raised.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:10 +01:00
Mauro Carvalho Chehab
5ec11d0966 s390/cio: fix kernel-doc markups in cio driver.
Fix typo in the kernel-doc markups
	1. ccw driver -> ccw_driver
	2. ccw_device_id_is_equal() -> ccw_dev_id_is_equal

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[vneethv@linux.ibm.com: slight modification in the changelog]
Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:19:10 +01:00
Heiko Carstens
0cd9b7230c s390: add separate program check exit path
System call and program check handler both use the system call exit
path when returning to previous context. However the program check
handler jumps right to the end of the system call exit path if the
previous context is kernel context.

This lead to the quite odd double disabling of interrupts in the
system call exit path introduced with commit ce9dfafe29 ("s390:
fix system call exit path").

To avoid that have a separate program check handler exit path if the
previous context is kernel context.

Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:17:24 +01:00
Heiko Carstens
6c81603801 Merge branch 'fixes' into features
* fixes:
  s390/cpum_sf.c: fix file permission for cpum_sfb_size
  s390: update defconfigs
  s390: fix system call exit path

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20 19:15:58 +01:00
Sumanth Korikkar
b971cbd03e s390/sclp: provide extended sccb support
As the number of cpus increases, the sccb response can exceed 4k for
read cpu and read scp info sclp commands. Hence, all cpu info entries
cant be embedded within a sccb response

Solution:
To overcome this limitation, extended sccb facility is provided by sclp.

1. Check if the extended sccb facility is installed.
2. If extended sccb is installed, perform the read scp and read cpu
   command considering a max sccb length of three page size. This max
   length is based on factors like max cpus, sccb header.
3. If extended sccb is not installed, perform the read scp and read cpu
   sclp command considering a max sccb length of one page size.

Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-18 12:16:02 +01:00
Sumanth Korikkar
d25d23e134 s390/sclp: avoid copy of sclp_info_sccb
For extended sccb support, sccb size could be up to 3 pages. Hence avoid
copy of sclp_info_sccb.

Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-18 12:16:02 +01:00
Sumanth Korikkar
08ab919d0d s390/sclp: use memblock for early read cpu info
sclp early read cpu info is used to detect the number of configured
cpus, which is utilized by smp_detect_cpus() in early startup.

* For read cpu info, the sccb block should be below 2gb.
* smp_detect_cpus() utilizes read cpu info early, but after memblock
  initialization. Thus use memblock_allow_low() instead.
* Avoid copy of sclp_core_info structure.
* sclp_early_init_core_info(), sclp_early_core_info and
  sclp_early_core_info_valid initdata are no longer required.
* smp_get_core_info() is called only once during early stage.  Hence for
  early sclp_get_core_info(), directly call read cpu command. No need to
  maintain sclp_early_core_info_valid.

Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-18 12:16:02 +01:00
Niklas Schnelle
da78693e6e s390/pci: inform when missing required facilities
when we're missing the necessary machine facilities zPCI can
not function. Until now it would silently fail to be initialized,
add an informational print.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-18 12:16:02 +01:00
Heiko Carstens
ab177c5d00 s390/mm: remove unused clear_user_asce()
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-12 12:11:25 +01:00
Thomas Richter
78d732e1f3 s390/cpum_sf.c: fix file permission for cpum_sfb_size
This file is installed by the s390 CPU Measurement sampling
facility device driver to export supported minimum and
maximum sample buffer sizes.
This file is read by lscpumf tool to display the details
of the device driver capabilities. The lscpumf tool might
be invoked by a non-root user. In this case it does not
print anything because the file contents can not be read.

Fix this by allowing read access for all users. Reading
the file contents is ok, changing the file contents is
left to the root user only.

For further reference and details see:
 [1] https://github.com/ibm-s390-tools/s390-tools/issues/97

Fixes: 69f239ed33 ("s390/cpum_sf: Dynamically extend the sampling buffer if overflows occur")
Cc: <stable@vger.kernel.org> # 3.14
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-12 12:10:36 +01:00
Heiko Carstens
966e7ea434 s390: update defconfigs
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-12 12:10:36 +01:00
Harald Freudenberger
43cb5a7c61 s390/zcrypt/pkey: introduce zcrypt_wait_api_operational() function
The zcrypt api provides a new function to wait until the zcrypt
api is operational:

  int zcrypt_wait_api_operational(void);

The AP bus scan and the binding of ap devices to device drivers is
an asynchronous job. This function waits until these initial jobs
are done and so the zcrypt api should be ready to serve crypto
requests - if there are resources available. The function uses an
internal timeout of 60s. The very first caller will either wait for
ap bus bindings complete or the timeout happens. This state will be
remembered for further callers which will only be blocked until a
decision is made (timeout or bindings complete).

Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:21:00 +01:00
Harald Freudenberger
837cd1059a s390/ap: ap bus userspace notifications for some bus conditions
This patch adds notifications to userspace for two important
conditions of the ap bus:

I) Initial ap bus scan done. This indicates that the initial
   scan of all the ap devices (cards, queues) is complete and
   ap devices have been build up for all the hardware found.
   This condition is signaled with
   1) An ap bus change uevent send to userspace with an environment
      key/value pair "INITSCAN=done":
	# udevadm monitor -k -p
	...
	KERNEL[97.830919] change   /devices/ap (ap)
	ACTION=change
	DEVPATH=/devices/ap
	SUBSYSTEM=ap
	INITSCAN=done
	SEQNUM=10421
   2) A sysfs attribute /sys/bus/ap/scans which shows the
      number of completed ap bus scans done since bus init.
      So a value of 1 or greater signals that the initial
      ap bus scan is complete.
   Note: The initial ap bus scan complete condition is fulfilled
   and will be signaled even if there was no ap resource found.

II) APQN driver bindings complete. This indicates that all
    APQNs have been bound to an zcrypt or alternate device
    driver. Only with the help of an device driver an APQN
    can be used for crypto load. So the binding complete
    condition is the starting point for user space to be
    sure all crypto resources on the ap bus are available
    for use.
    This condition is signaled with
    1) An ap bus change uevent send to userspace with an environment
       key/value pair "BINDINGS=complete":
	 # udevadm monitor -k -p
	 ...
	 KERNEL[97.830975] change   /devices/ap (ap)
	 ACTION=change
	 DEVPATH=/devices/ap
	 SUBSYSTEM=ap
	 BINDINGS=complete
	 SEQNUM=10422
    2) A sysfs attribute /sys/bus/ap/bindings showing
	 "<nr of bound apqns>/<total nr of apqns> (complete)"
       when all available apqns have been bound to device drivers, or
	 "<nr of bound apqns>/<total nr of apqns>"
       when there are some apqns not bound to an device driver.
    Note: The binding complete condition is also fulfilled, when
    there are no apqns available to bind any device driver. In
    this case the binding complete will be signaled AFTER init
    scan is done.
    Note: This condition may arise multiple times when after
    initial scan modifications on the bindings take place. For
    example a manual unbind of an APQN switches the binding
    complete condition off. When at a later time the unbound APQNs
    are bound with an device driver the binding is (again) complete
    resulting in another uevent and marking the bindings sysfs
    attribute with '(complete)'.

There is also a new function to be used within the kernel:

  int ap_wait_init_apqn_bindings_complete(unsigned long timeout)

Interface to wait for the AP bus to have done one initial ap bus
scan and all detected APQNs have been bound to device drivers.
If these both conditions are not fulfilled, this function blocks
on a condition with wait_for_completion_interruptible_timeout().
If these both conditions are fulfilled (before the timeout hits)
the return value is 0. If the timeout (in jiffies) hits instead
-ETIME is returned. On failures negative return values are
returned to the caller. Please note that further unbind/bind
actions after initial binding complete is through do not cause this
function to block again.

Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:21:00 +01:00
Christian Borntraeger
d041315ef7 s390/trng: set quality to 1024
The s390-trng does provide 100% entropy. The quality value is supported
to be between 1 and 1024 and not 1..1000.  Use 1024 to make this driver
the preferred one. If we ever have a better driver that has the same
quality but is faster we can change this again when merging the new
driver. No need to be conservative.

This makes sure that the hw variant is preferred over things like
virtio-rng, where the hypervisor has a potential to be misconfigured
and thus should have a slightly lower confidence.

Cc: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:21:00 +01:00
Daniel Vetter
a67a88b0b8 s390/pci: remove races against pte updates
Way back it was a reasonable assumptions that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carvetouts to
cma regions. This means if we miss the unmap the pfn might contain
pagecache or anon memory (well anything allocated with GFP_MOVEABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
iomem region when CONFIG_IO_STRICT_DEVMEM is set, see
commit 3234ac664a ("/dev/mem: Revoke mappings when a driver claims the
region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea. Fix this.

Since zpci_memcpy_from|toio seems to not do anything nefarious with
locks we just need to open code get_pfn and follow_pfn and make sure
we drop the locks only after we're done. The write function also needs
the copy_from_user move, since we can't take userspace faults while
holding the mmap sem.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:21:00 +01:00
Vasily Gorbik
d7e7fbba67 s390/early: rewrite program parameter setup in C
And move it earlier in the decompressor.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:21:00 +01:00
Vasily Gorbik
0c4ec024a4 s390/kasan: move memory needs estimation into a function
Also correct rounding downs in estimation calculations.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Vasily Gorbik
e385b550fa s390/kasan: make kasan header self-contained
It is relying on _REGION1_SHIFT / _REGION2_SHIFT values which come from
asm/pgtable.h, so include it.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Vasily Gorbik
54b52981bb s390/kasan: remove obvious parameter with the only possible value
Kasan early code is only working on init_mm, remove unneeded pgd
parameter from kasan_copy_shadow and rename it to
kasan_copy_shadow_mapping.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Vasily Gorbik
92bca2fe61 s390/kasan: avoid confusing naming
Kasan has nothing to do with vmemmap, strip vmemmap from function names
to avoid confusing people.

Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Vasily Gorbik
39f2899b98 s390/decompressor: fix build warning
Fixes the following warning with CONFIG_KERNEL_UNCOMPRESSED=y

arch/s390/boot/compressed/decompressor.h:6:46: warning: non-void function
does not return a value [-Wreturn-type]
static inline void *decompress_kernel(void) {}
                                             ^

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Heiko Carstens
90178c1900 s390/mm: let vmalloc area size depend on physical memory size
To make sure that the vmalloc area size is for almost all cases large
enough let it depend on the (potential) physical memory size. There is
still the possibility to override this with the vmalloc kernel command
line parameter.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Heiko Carstens
fc67c880e3 s390/mm: extend default vmalloc area size to 512GB
We've seen several occurences in the past where the default vmalloc
size of 128GB is not sufficient. Therefore extend the default size.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:59 +01:00
Vasily Gorbik
97b142b740 s390: make sure vmemmap is top region table entry aligned
Since commit 29d37e5b82 ("s390/protvirt: add ultravisor initialization")
vmax is adjusted to the ultravisor secure storage limit. This limit is
currently applied when 4-level paging is used. Later vmax is also used
to align vmemmap address to the top region table entry border. When vmax
is set to the ultravisor secure storage limit this is no longer the case.

Instead of changing vmax, make only MODULES_END be affected by the
secure storage limit, so that vmax stays intact for further vmemmap
address alignment.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:58 +01:00
Vasily Gorbik
a3453d923e s390/kasan: remove 3-level paging support
Compiling the kernel with Kasan disables automatic 3-level vs 4-level
kernel space paging selection, because the shadow memory offset has
to be known at compile time and there is no such offset which would be
acceptable for both 3 and 4-level paging. Instead S390_4_LEVEL_PAGING
option was introduced which allowed to pick how many paging levels to
use under Kasan.

With the introduction of protected virtualization, kernel memory layout
may be affected due to ultravisor secure storage limit. This adds
additional complexity into how memory layout would look like in
combination with Kasan predefined shadow memory offsets. To simplify
this make Kasan 4-level paging default and remove Kasan 3-level paging
support.

Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:58 +01:00
Vasily Gorbik
f38b0a7439 s390: remove unused s390_base_ext_handler
s390_base_ext_handler_fn haven't been used since its introduction in
commit ab14de6c37 ("[S390] Convert memory detection into C code.").

s390_base_ext_handler itself is currently falsely storing 16 registers
at __LC_SAVE_AREA_ASYNC rewriting several following lowcore values:
cpu_flags, return_psw, return_mcck_psw, sync_enter_timer and
async_enter_timer.

Besides that s390_base_ext_handler itself is only potentially hiding
EXT interrupts which should not have happen in the first place. Any
piece of code which requires EXT interrupts before fully functional
ext_int_handler is enabled has to do it on its own, like this is done
by sclp_early_cmd() which is doing EXT interrupts handling synchronously
in sclp_early_wait_irq().

With s390_base_ext_handler removed unexpected EXT interrupt leads
to disabled wait with the address 0x1b0 (__LC_EXT_NEW_PSW), which is
currently setup in the decompressor.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:58 +01:00
Vasily Gorbik
85cde0192a s390/udelay: make it work for the early code
Currently udelay relies on working EXT interrupts handler, which is not
the case during early startup. In such cases udelay_simple() has to be
used instead.

To avoid mistakes of calling udelay too early, which could happen from
the common code as well - make udelay work for the early code by
introducing static branch and redirecting all udelay calls to
udelay_simple until EXT interrupts handler is fully initialized and
async stack is allocated.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:58 +01:00
Vasily Gorbik
13b5bd8af4 s390/head: set io/ext handlers to disabled wait
Set io/ext handlers to disabled wait in the initial lowcore, so that they
are effective right from the kernel start, when a boot method used does
not rewrite this part of the lowcore for its own needs (i.e. kexec, z/vm
ipl reader boot, qemu direct boot, load from removable media or server).

When the kernel is loaded by zipl, scsi loader or qemu loader, some or
all of the io/ext/pgm handlers addresses might be rewritten. Rewrite them
to initial values again as early as possible.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:20:58 +01:00
Heiko Carstens
ce9dfafe29 s390: fix system call exit path
The system call exit path is running with interrupts enabled while
checking for TIF/PIF/CIF bits which require special handling. If all
bits have been checked interrupts are disabled and the kernel exits to
user space.
The problem is that after checking all bits and before interrupts are
disabled bits can be set already again, due to interrupt handling.

This means that the kernel can exit to user space with some
TIF/PIF/CIF bits set, which should never happen. E.g. TIF_NEED_RESCHED
might be set, which might lead to additional latencies, since that bit
will only be recognized with next exit to user space.

Fix this by checking the corresponding bits only when interrupts are
disabled.

Fixes: 0b0ed657fe ("s390: remove critical section cleanup from entry.S")
Cc: <stable@vger.kernel.org> # 5.8
Acked-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09 11:16:11 +01:00
Linus Torvalds
f8394f232b Linux 5.10-rc3 2020-11-08 16:10:16 -08:00
Linus Torvalds
15f5d201c1 Driver core documentation fixes for 5.10-rc3
Here are some small Documentation fixes for 5.10-rc3 that were fallout
 from the larger documentation update we did in 5.10-rc2.  Nothing major
 here at all, but all of these have been in linux-next and resolve build
 warnings when building the documentation files.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX6g8vw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykdGQCgjEHwT/N2jltd8yHF1Kca5yR+FJYAoMb3sbJS
 gR1iX48G20OhkrPNSIUw
 =HDyQ
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core documentation fixes from Greg KH:
 "Some small Documentation fixes that were fallout from the larger
  documentation update we did in 5.10-rc2.

  Nothing major here at all, but all of these have been in linux-next
  and resolve build warnings when building the documentation files"

* tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Documentation: remove mic/index from misc-devices/index.rst
  scripts: get_api.pl: Add sub-titles to ABI output
  scripts: get_abi.pl: Don't let ABI files to create subtitles
  docs: leds: index.rst: add a missing file
  docs: ABI: sysfs-class-net: fix a typo
  docs: ABI: sysfs-driver-dma-ioatdma: what starts with /sys
2020-11-08 11:30:25 -08:00
Linus Torvalds
bbc821849e TTY/Serial fixes for 5.10-rc3
Here are a small number of small tty and serial fixes for some reported
 problems for the tty core, vt code, and some serial drivers.
 
 They include fixes for:
 	- a buggy and obsolete vt font ioctl removal
 	- 8250_mtk serial baudrate runtime warnings
 	- imx serial earlycon build configuration fix
 	- txx9 serial driver error path cleanup issues
 	- tty core fix in release_tty that can be triggered by trying to
 	  bind an invalid serial port name to a speakup console device
 
 Almost all of these have been in linux-next without any problems, the
 only one that hasn't, just deletes code :)
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX6g7nQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymmVACeNXgZpIAUJujtM7hQAEpCDYrFZ08An13TN07Y
 wZe5okUITYIXQRZevKyi
 =w0og
 -----END PGP SIGNATURE-----

Merge tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial fixes from Greg KH:
 "Here are a small number of small tty and serial fixes for some
  reported problems for the tty core, vt code, and some serial drivers.

  They include fixes for:

   - a buggy and obsolete vt font ioctl removal

   - 8250_mtk serial baudrate runtime warnings

   - imx serial earlycon build configuration fix

   - txx9 serial driver error path cleanup issues

   - tty core fix in release_tty that can be triggered by trying to bind
     an invalid serial port name to a speakup console device

  Almost all of these have been in linux-next without any problems, the
  only one that hasn't, just deletes code :)"

* tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  vt: Disable KD_FONT_OP_COPY
  tty: fix crash in release_tty if tty->port is not set
  serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init
  tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is enabled
  serial: 8250_mtk: Fix uart_get_baud_rate warning
2020-11-08 11:28:08 -08:00
Linus Torvalds
df53b815c7 USB fixes for 5.10-rc3
Here are some small USB fixes and new device ids for 5.10-rc3
 
 They include:
 	- USB gadget fixes for some reported issues
 	- Fixes for the every-troublesome apple fastcharge driver,
 	  hopefully we finally have it right.
 	- More USB core quirks for odd devices
 	- USB serial driver fixes for some long-standing issues that
 	  were recently found
 	- some new USB serial driver device ids
 
 All have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX6g8RA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynX/wCg2StXM3DVNqnEvOF8OQ23WAgcC0IAoKIYuTNi
 Uh303+JJ6A0cz4uGfhc0
 =OKt5
 -----END PGP SIGNATURE-----

Merge tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes and new device ids:

   - USB gadget fixes for some reported issues

   - Fixes for the ever-troublesome apple fastcharge driver, hopefully
     we finally have it right.

   - More USB core quirks for odd devices

   - USB serial driver fixes for some long-standing issues that were
     recently found

   - some new USB serial driver device ids

  All have been in linux-next with no reported issues"

* tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: apple-mfi-fastcharge: fix reference leak in apple_mfi_fc_set_property
  usb: mtu3: fix panic in mtu3_gadget_stop()
  USB: serial: option: add Telit FN980 composition 0x1055
  USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231
  USB: serial: cyberjack: fix write-URB completion race
  USB: Add NO_LPM quirk for Kingston flash drive
  USB: serial: option: add Quectel EC200T module support
  usb: raw-gadget: fix memory leak in gadget_setup
  usb: dwc2: Avoid leaving the error_debugfs label unused
  usb: dwc3: ep0: Fix delay status handling
  usb: gadget: fsl: fix null pointer checking
  usb: gadget: goku_udc: fix potential crashes in probe
  usb: dwc3: pci: add support for the Intel Alder Lake-S
2020-11-08 11:24:10 -08:00
Eddy Wu
b4e00444ca fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent
current->group_leader->exit_signal may change during copy_process() if
current->real_parent exits.

Move the assignment inside tasklist_lock to avoid the race.

Signed-off-by: Eddy Wu <eddy_wu@trendmicro.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-11-08 11:18:39 -08:00
Daniel Vetter
3c4e0dff20 vt: Disable KD_FONT_OP_COPY
It's buggy:

On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote:
> We recently discovered a slab-out-of-bounds read in fbcon in the latest
> kernel ( v5.10-rc2 for now ).  The root cause of this vulnerability is that
> "fbcon_do_set_font" did not handle "vc->vc_font.data" and
> "vc->vc_font.height" correctly, and the patch
> <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this
> issue.
>
> Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and
> use  KD_FONT_OP_SET again to set a large font.height for tty1. After that,
> we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data
> in "fbcon_do_set_font", while tty1 retains the original larger
> height. Obviously, this will cause an out-of-bounds read, because we can
> access a smaller vc_font.data with a larger vc_font.height.

Further there was only one user ever.
- Android's loadfont, busybox and console-tools only ever use OP_GET
  and OP_SET
- fbset documentation only mentions the kernel cmdline font: option,
  not anything else.
- systemd used OP_COPY before release 232 published in Nov 2016

Now unfortunately the crucial report seems to have gone down with
gmane, and the commit message doesn't say much. But the pull request
hints at OP_COPY being broken

https://github.com/systemd/systemd/pull/3651

So in other words, this never worked, and the only project which
foolishly every tried to use it, realized that rather quickly too.

Instead of trying to fix security issues here on dead code by adding
missing checks, fix the entire thing by removing the functionality.

Note that systemd code using the OP_COPY function ignored the return
value, so it doesn't matter what we're doing here really - just in
case a lone server somewhere happens to be extremely unlucky and
running an affected old version of systemd. The relevant code from
font_copy_to_all_vcs() in systemd was:

	/* copy font from active VT, where the font was uploaded to */
	cfo.op = KD_FONT_OP_COPY;
	cfo.height = vcs.v_active-1; /* tty1 == index 0 */
	(void) ioctl(vcfd, KDFONTOP, &cfo);

Note this just disables the ioctl, garbage collecting the now unused
callbacks is left for -next.

v2: Tetsuo found the old mail, which allowed me to find it on another
archive. Add the link too.

Acked-by: Peilin Ye <yepeilin.cs@gmail.com>
Reported-by: Minh Yuan <yuanmingbuaa@gmail.com>
References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html
References: https://github.com/systemd/systemd/pull/3651
Cc: Greg KH <greg@kroah.com>
Cc: Peilin Ye <yepeilin.cs@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://lore.kernel.org/r/20201108153806.3140315-1-daniel.vetter@ffwll.ch
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-08 19:35:06 +01:00