linux

Author	SHA1	Message	Date
Linus Torvalds	c677124e63	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "These were the main changes in this cycle: - More -rt motivated separation of CONFIG_PREEMPT and CONFIG_PREEMPTION. - Add more low level scheduling topology sanity checks and warnings to filter out nonsensical topologies that break scheduling. - Extend uclamp constraints to influence wakeup CPU placement - Make the RT scheduler more aware of asymmetric topologies and CPU capacities, via uclamp metrics, if CONFIG_UCLAMP_TASK=y - Make idle CPU selection more consistent - Various fixes, smaller cleanups, updates and enhancements - please see the git log for details" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) sched/fair: Define sched_idle_cpu() only for SMP configurations sched/topology: Assert non-NUMA topology masks don't (partially) overlap idle: fix spelling mistake "iterrupts" -> "interrupts" sched/fair: Remove redundant call to cpufreq_update_util() sched/psi: create /proc/pressure and /proc/pressure/{io\|memory\|cpu} only when psi enabled sched/fair: Fix sgc->{min,max}_capacity calculation for SD_OVERLAP sched/fair: calculate delta runnable load only when it's needed sched/cputime: move rq parameter in irqtime_account_process_tick stop_machine: Make stop_cpus() static sched/debug: Reset watchdog on all CPUs while processing sysrq-t sched/core: Fix size of rq::uclamp initialization sched/uclamp: Fix a bug in propagating uclamp value in new cgroups sched/fair: Load balance aggressively for SCHED_IDLE CPUs sched/fair : Improve update_sd_pick_busiest for spare capacity case watchdog: Remove soft_lockup_hrtimer_cnt and related code sched/rt: Make RT capacity-aware sched/fair: Make EAS wakeup placement consider uclamp restrictions sched/fair: Make task_fits_capacity() consider uclamp restrictions sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with() sched/uclamp: Make uclamp util helpers use and return UL values ...	2020-01-28 10:07:09 -08:00
Sven Schnelle	17248ea036	s390: fix __EMIT_BUG() macro Setting a kprobe on getname_flags() failed: $ echo 'p:tmr1 getname_flags +0(%r2):ustring' > kprobe_events -bash: echo: write error: Invalid argument Debugging the kprobes code showed that the address of getname_flags() is contained in the __bug_table. Kprobes doesn't allow to set probes at BUG() locations. $ objdump -j __bug_table -x build/fs/namei.o [..] 0000000000000108 R_390_PC32 .text+0x00000000000075a8 000000000000010c R_390_PC32 .L223+0x0000000000000004 I was expecting getname_flags() to start with a BUG(), but: 7598: e3 20 10 00 00 04 lg %r2,0(%r1) 759e: c0 f4 00 00 00 00 jg 759e <putname+0x7e> 75a0: R_390_PLT32DBL kmem_cache_free+0x2 75a4: a7 f4 00 01 j 75a6 <putname+0x86> 00000000000075a8 <getname_flags>: 75a8: c0 04 00 00 00 00 brcl 0,75a8 <getname_flags> 75ae: eb 6f f0 48 00 24 stmg %r6,%r15,72(%r15) 75b4: b9 04 00 ef lgr %r14,%r15 75b8: e3 f0 ff a8 ff 71 lay %r15,-88(%r15) So the BUG() is actually the last opcode of the previous function. Fix this by switching to using the MONITOR CALL (MC) instruction, and set the entry in __bug_table to the beginning of that MC. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-01-22 13:05:35 +01:00
Vasily Gorbik	45f7a0da60	s390/ftrace: generate traced function stack frame Currently backtrace from ftraced function does not contain ftraced function itself. e.g. for "path_openat": arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15e/0x3d8 ftrace_graph_caller+0x0/0x1c <-- ftrace code do_filp_open+0x7c/0xe8 <-- ftraced function caller do_open_execat+0x76/0x1b8 open_exec+0x52/0x78 load_elf_binary+0x180/0x1160 search_binary_handler+0x8e/0x288 load_script+0x2a8/0x2b8 search_binary_handler+0x8e/0x288 __do_execve_file.isra.39+0x6fa/0xb40 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Ftraced function is expected in the backtrace by ftrace kselftests, which are now failing. It would also be nice to have it for clarity reasons. "ftrace_caller" itself is called without stack frame allocated for it and does not store its caller (ftraced function). Instead it simply allocates a stack frame for "ftrace_trace_function" and sets backchain to point to ftraced function stack frame (which contains ftraced function caller in saved r14). To fix this issue make "ftrace_caller" allocate a stack frame for itself just to store ftraced function for the stack unwinder. As a result backtrace looks like the following: arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15e/0x3d8 ftrace_graph_caller+0x0/0x1c <-- ftrace code path_openat+0x6/0xd60 <-- ftraced function do_filp_open+0x7c/0xe8 <-- ftraced function caller do_open_execat+0x76/0x1b8 open_exec+0x52/0x78 load_elf_binary+0x180/0x1160 search_binary_handler+0x8e/0x288 load_script+0x2a8/0x2b8 search_binary_handler+0x8e/0x288 __do_execve_file.isra.39+0x6fa/0xb40 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Reported-by: Sven Schnelle <sven.schnelle@ibm.com> Tested-by: Sven Schnelle <sven.schnelle@ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-01-22 13:05:35 +01:00
Thomas Richter	ee09c91480	s390/cpum_sf: Use DIV_ROUND_UP Use macro DIV_ROUND_UP() for calculation of number of SDBT SDBT pages required for index pages. This macro is already used throughout the file. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-01-22 13:05:35 +01:00
Thomas Richter	32dab6828c	s390/cpum_sf: Use kzalloc and minor changes Use kzalloc() to allocate auxiliary buffer structure initialized with all zeroes to avoid random value in trace output. Avoid double access to SBD hardware flags. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-01-22 13:05:34 +01:00
Thomas Richter	ee5c4ccfd5	s390/cpum_sf: Convert debug trace to common layout Convert debug traces to print the head/alert/empty marks consistently as decimal numbers. Add some trace statements to enable easier debugging during auxiliary tracing. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-01-22 13:05:34 +01:00
Aleksa Sarai	fddb5d430a	open: introduce openat2(2) syscall /* Background. / For a very long time, extending openat(2) with new features has been incredibly frustrating. This stems from the fact that openat(2) is possibly the most famous counter-example to the mantra "don't silently accept garbage from userspace" -- it doesn't check whether unknown flags are present[1]. This means that (generally) the addition of new flags to openat(2) has been fraught with backwards-compatibility issues (O_TMPFILE has to be defined as __O_TMPFILE\|O_DIRECTORY\|[O_RDWR or O_WRONLY] to ensure old kernels gave errors, since it's insecure to silently ignore the flag[2]). All new security-related flags therefore have a tough road to being added to openat(2). Userspace also has a hard time figuring out whether a particular flag is supported on a particular kernel. While it is now possible with contemporary kernels (thanks to [3]), older kernels will expose unknown flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during openat(2) time matches modern syscall designs and is far more fool-proof. In addition, the newly-added path resolution restriction LOOKUP flags (which we would like to expose to user-space) don't feel related to the pre-existing O_ flag set -- they affect all components of path lookup. We'd therefore like to add a new flag argument. Adding a new syscall allows us to finally fix the flag-ignoring problem, and we can make it extensible enough so that we will hopefully never need an openat3(2). /* Syscall Prototype. / / * open_how is an extensible structure (similar in interface to * clone3(2) or sched_setattr(2)). The size parameter must be set to * sizeof(struct open_how), to allow for future extensions. All future * extensions will be appended to open_how, with their zero value * acting as a no-op default. / struct open_how { / ... / }; int openat2(int dfd, const char pathname, struct open_how how, size_t size); / Description. / The initial version of 'struct open_how' contains the following fields: flags Used to specify openat(2)-style flags. However, any unknown flag bits or otherwise incorrect flag combinations (like O_PATH\|O_RDWR) will result in -EINVAL. In addition, this field is 64-bits wide to allow for more O_ flags than currently permitted with openat(2). mode The file mode for O_CREAT or O_TMPFILE. Must be set to zero if flags does not contain O_CREAT or O_TMPFILE. resolve Restrict path resolution (in contrast to O_ flags they affect all path components). The current set of flags are as follows (at the moment, all of the RESOLVE_ flags are implemented as just passing the corresponding LOOKUP_ flag). RESOLVE_NO_XDEV => LOOKUP_NO_XDEV RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS RESOLVE_BENEATH => LOOKUP_BENEATH RESOLVE_IN_ROOT => LOOKUP_IN_ROOT open_how does not contain an embedded size field, because it is of little benefit (userspace can figure out the kernel open_how size at runtime fairly easily without it). It also only contains u64s (even though ->mode arguably should be a u16) to avoid having padding fields which are never used in the future. Note that as a result of the new how->flags handling, O_PATH\|O_TMPFILE is no longer permitted for openat(2). As far as I can tell, this has always been a bug and appears to not be used by userspace (and I've not seen any problems on my machines by disallowing it). If it turns out this breaks something, we can special-case it and only permit it for openat(2) but not openat2(2). After input from Florian Weimer, the new open_how and flag definitions are inside a separate header from uapi/linux/fcntl.h, to avoid problems that glibc has with importing that header. /* Testing. / In a follow-up patch there are over 200 selftests which ensure that this syscall has the correct semantics and will correctly handle several attack scenarios. In addition, I've written a userspace library[4] which provides convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care must be taken when using RESOLVE_IN_ROOT'd file descriptors with other syscalls). During the development of this patch, I've run numerous verification tests using libpathrs (showing that the API is reasonably usable by userspace). / Future Work. */ Additional RESOLVE_ flags have been suggested during the review period. These can be easily implemented separately (such as blocking auto-mount during resolution). Furthermore, there are some other proposed changes to the openat(2) interface (the most obvious example is magic-link hardening[5]) which would be a good opportunity to add a way for userspace to restrict how O_PATH file descriptors can be re-opened. Another possible avenue of future work would be some kind of CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace which openat2(2) flags and fields are supported by the current kernel (to avoid userspace having to go through several guesses to figure it out). [1]: https://lwn.net/Articles/588444/ [2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com [3]: commit `629e014bb8` ("fs: completely ignore unknown open flags") [4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523 [5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/ [6]: https://youtu.be/ggD-eb3yPVs Suggested-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-01-18 09:19:18 -05:00
Arvind Sankar	c5ff734cf6	arch/s390/setup: Drop dummy_con initialization con_init in tty/vt.c will now set conswitchp to dummy_con if it's unset. Drop it from arch setup code. Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu> Link: https://lore.kernel.org/r/20191218214506.49252-20-nivedita@alum.mit.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-14 15:29:18 +01:00
Sargun Dhillon	9a2cef09c8	arch: wire up pidfd_getfd syscall This wires up the pidfd_getfd syscall for all architectures. Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20200107175927.4558-4-sargun@sargun.me Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>	2020-01-13 21:49:47 +01:00
Philipp Rudo	40260b01d0	s390/setup: Fix secure ipl message The new machine loader on z15 always creates an IPL Report block and thus sets the IPL_PL_FLAG_IPLSR even when secure boot is disabled. This causes the wrong message being printed at boot. Fix this by checking for IPL_PL_FLAG_SIPL instead. Fixes: `9641b8cc73` ("s390/ipl: read IPL report at early boot") Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2020-01-09 16:59:18 +01:00
Ingo Molnar	1e5f8a3085	Linux 5.5-rc3 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl4AEiYeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGR3sH/ixrBBYUVyjRPOxS ce4iVoTqphGSoAzq/3FA1YZZOPQ/Ep0NXL4L2fTGxmoiqIiuy8JPp07/NKbHQjj1 Rt6PGm6cw2pMJHaK9gRdlTH/6OyXkp06OkH1uHqKYrhPnpCWDnj+i2SHAX21Hr1y oBQh4/XKvoCMCV96J2zxRsLvw8OkQFE0ouWWfj6LbpXIsmWZ++s0OuaO1cVdP/oG j+j2Voi3B3vZNQtGgJa5W7YoZN5Qk4ZIj9bMPg7bmKRd3wNB228AiJH2w68JWD/I jCA+JcITilxC9ud96uJ6k7SMS2ufjQlnP0z6Lzd0El1yGtHYRcPOZBgfOoPU2Euf 33WGSyI= =iEwx -----END PGP SIGNATURE----- Merge tag 'v5.5-rc3' into sched/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-25 10:41:37 +01:00
Vasily Gorbik	b4adfe5591	s390/ftrace: save traced function caller A typical backtrace acquired from ftraced function currently looks like the following (e.g. for "path_openat"): arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15a/0x3b8 ftrace_graph_caller+0x0/0x1c 0x3e0007e3c98 <- ftraced function caller (should be do_filp_open+0x7c/0xe8) do_open_execat+0x70/0x1b8 __do_execve_file.isra.0+0x7d8/0x860 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Note random "0x3e0007e3c98" stack value as ftraced function caller. This value causes either imprecise unwinder result or unwinding failure. That "0x3e0007e3c98" comes from r14 of ftraced function stack frame, which it haven't had a chance to initialize since the very first instruction calls ftrace code ("ftrace_caller"). (ftraced function might never save r14 as well). Nevertheless according to s390 ABI any function is called with stack frame allocated for it and r14 contains return address. "ftrace_caller" itself is called with "brasl %r0,ftrace_caller". So, to fix this issue simply always save traced function caller onto ftraced function stack frame. Reported-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-12-18 23:29:26 +01:00
Vasily Gorbik	eef06cbf67	s390/unwind: stop gracefully at user mode pt_regs in irq stack Consider reaching user mode pt_regs at the bottom of irq stack graceful unwinder termination. This is the case when irq/mcck/ext interrupt arrives while in user mode. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-12-18 23:29:26 +01:00
Heiko Carstens	1b68ac8678	s390: remove last diag 0x44 caller diag 0x44 is a voluntary undirected yield of a virtual CPU. This has caused a lot of performance issues in the past. There is only one caller left, and that one is only executed if diag 0x9c (directed yield) is not present. Given that all hypervisors implement diag 0x9c anyway, remove the last diag 0x44 to avoid that more callers will be added. Worst case that could happen now, if diag 0x9c is not present, is that a virtual CPU would loop a bit instead of giving its time slice up. diag 0x44 statistics in debugfs are kept and will always be zero, so that user space can tell that there are no calls. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-12-11 19:53:24 +01:00
Thomas Richter	0539ad0b22	s390/cpum_sf: Avoid SBD overflow condition in irq handler The s390 CPU Measurement sampling facility has an overflow condition which fires when all entries in a SBD are used. The measurement alert interrupt is triggered and reads out all samples in this SDB. It then tests the successor SDB, if this SBD is not full, the interrupt handler does not read any samples at all from this SDB The design waits for the hardware to fill this SBD and then trigger another meassurement alert interrupt. This scheme works nicely until an perf_event_overflow() function call discards the sample due to a too high sampling rate. The interrupt handler has logic to read out a partially filled SDB when the perf event overflow condition in linux common code is met. This causes the CPUM sampling measurement hardware and the PMU device driver to operate on the same SBD's trailer entry. This should not happen. This can be seen here using this trace: cpumsf_pmu_add: tear:0xb5286000 hw_perf_event_update: sdbt 0xb5286000 full 1 over 0 flush_all:0 hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0 above shows 1. interrupt hw_perf_event_update: sdbt 0xb5286008 full 1 over 0 flush_all:0 hw_perf_event_update: sdbt 0xb5286008 full 0 over 0 flush_all:0 above shows 2. interrupt ... this goes on fine until... hw_perf_event_update: sdbt 0xb5286068 full 1 over 0 flush_all:0 perf_push_sample1: overflow one or more samples read from the IRQ handler are rejected by perf_event_overflow() and the IRQ handler advances to the next SDB and modifies the trailer entry of a partially filled SDB. hw_perf_event_update: sdbt 0xb5286070 full 0 over 0 flush_all:1 timestamp: 14:32:52.519953 Next time the IRQ handler is called for this SDB the trailer entry shows an overflow count of 19 missed entries. hw_perf_event_update: sdbt 0xb5286070 full 1 over 19 flush_all:1 timestamp: 14:32:52.970058 Remove access to a follow on SDB when event overflow happened. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-12-11 19:53:24 +01:00
Thomas Richter	39d4a501a9	s390/cpum_sf: Adjust sampling interval to avoid hitting sample limits Function perf_event_ever_overflow() and perf_event_account_interrupt() are called every time samples are processed by the interrupt handler. However function perf_event_account_interrupt() has checks to avoid being flooded with interrupts (more then 1000 samples are received per task_tick). Samples are then dropped and a PERF_RECORD_THROTTLED is added to the perf data. The perf subsystem limit calculation is: maximum sample frequency := 100000 --> 1 samples per 10 us task_tick = 10ms = 10000us --> 1000 samples per task_tick The work flow is measurement_alert() uses SDBT head and each SBDT points to 511 SDB pages, each with 126 sample entries. After processing 8 SBDs and for each valid sample calling: perf_event_overflow() perf_event_account_interrupts() there is a considerable amount of samples being dropped, especially when the sample frequency is very high and near the 100000 limit. To avoid the high amount of samples being dropped near the end of a task_tick time frame, increment the sampling interval in case of dropped events. The CPU Measurement sampling facility on the s390 supports only intervals, specifiing how many CPU cycles have to be executed before a sample is generated. Increase the interval when the samples being generated hit the task_tick limit. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-12-11 19:53:23 +01:00
Thomas Gleixner	fa68645305	sched/rt, s390: Use CONFIG_PREEMPTION CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same functionality which today depends on CONFIG_PREEMPT. Switch the preemption and entry code over to use CONFIG_PREEMPTION. Add PREEMPT_RT output to die(). [bigeasy: +Kconfig, dumpstack.c] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: linux-s390@vger.kernel.org Link: https://lore.kernel.org/r/20191015191821.11479-18-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-12-08 14:37:35 +01:00
Linus Torvalds	01d1dff646	s390 updates for the 5.5 merge window #2 - Make stack unwinder reliable and suitable for livepatching. Add unwinder testing module. - Fixes for CALL_ON_STACK helper used for stack switching. - Fix unwinding from bpf code. - Fix getcpu and remove compat support in vdso code. - Fix address space control registers initialization. - Save KASLR offset for early dumps. - Handle new FILTERED_BY_HYPERVISOR reply code in crypto code. - Minor perf code cleanup and potential memory leak fix. - Add couple of error messages for corner cases during PCI device creation. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl3mUEMACgkQjYWKoQLX FBgHBgf/Ui3sgKGozvIAwy2kQ3oPtCdsmnTKEhLdhYT0cKMWNkA/jc13vn37ZqSk vMhawMjgjHhn4CLSjxKRGCprYViXIgnF2XrCywTDsBoj87QwB6/dME1gXJRW+/Rm OPvO+8D+210Ow0Xip3xXSRIPNFsUINCQeCCEtQCOuhGMdQPC0VIKgYtgvk1TAo1E +DycHbZ0e+uEp6zvVSsoP9wrkXw/L9krTDnjHncQ7FULJAYnBhY+qaeNTek09QAT j3Ywh5/fYR11c62W6fjb1lQHLb75L0aeK7Q5r5WspxG5LwiR2ncYWOQ4BQPZoUXq GjdNvwRmvEkB3IbnpLp/ft7sqsPn2w== =CoqQ -----END PGP SIGNATURE----- Merge tag 's390-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: - Make stack unwinder reliable and suitable for livepatching. Add unwinder testing module. - Fixes for CALL_ON_STACK helper used for stack switching. - Fix unwinding from bpf code. - Fix getcpu and remove compat support in vdso code. - Fix address space control registers initialization. - Save KASLR offset for early dumps. - Handle new FILTERED_BY_HYPERVISOR reply code in crypto code. - Minor perf code cleanup and potential memory leak fix. - Add couple of error messages for corner cases during PCI device creation. * tag 's390-5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (33 commits) s390: remove compat vdso code s390/livepatch: Implement reliable stack tracing for the consistency model s390/unwind: add stack pointer alignment sanity checks s390/unwind: filter out unreliable bogus %r14 s390/unwind: start unwinding from reliable state s390/test_unwind: add program check context tests s390/test_unwind: add irq context tests s390/test_unwind: print verbose unwinding results s390/test_unwind: add CALL_ON_STACK tests s390: fix register clobbering in CALL_ON_STACK s390/test_unwind: require that unwinding ended successfully s390/unwind: add a test for the internal API s390/unwind: always inline get_stack_pointer s390/pci: add error message on device number limit s390/pci: add error message for UID collision s390/cpum_sf: Check for SDBT and SDB consistency s390/cpum_sf: Use TEAR_REG macro consistantly s390/cpum_sf: Remove unnecessary check for pending SDBs s390/cpum_sf: Replace function name in debug statements s390/kaslr: store KASLR offset for early dumps ...	2019-12-03 12:50:00 -08:00
Heiko Carstens	2115fbf721	s390: remove compat vdso code Remove compat vdso code, since there is hardly any compat user space left. Still existing compat user space will have to use system calls instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-12-01 12:48:49 +01:00
Linus Torvalds	b94ae8ad9f	seccomp updates for v5.5 - implement SECCOMP_USER_NOTIF_FLAG_CONTINUE (Christian Brauner) - fixes to selftests (Christian Brauner) - remove secure_computing() argument (Christian Brauner) -----BEGIN PGP SIGNATURE----- Comment: Kees Cook <kees@outflux.net> iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl3dT/kWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJg7eD/9PFh0xAgk7swWIOnkv/Ckj6pqR lcnVaugsap2sp99P+QxVPoqKoBsHF/OZ96OqJcokljdWO77ElBMG4Xxgjho/mPPU Yzhsd9/Q0j4zYIe/Gy+4LxZ+wSudBxv7ls4l86fst1GWg880VkLk32/1N0BUjFAp uyBBaEuDoXcnkru8ojKH1xgp0Cd1KjyO1KEAQdkSt2GROo3nhROh9955Hrrxuanr 0sjWLYe8E8P3hPugRI/3WRZu4VqdIn47pm+/UMPwGpC80kI+mSL1jtidszqC022w u0H5yoedEhZCan7uHWtEY1TXfwgktUKMZOzMP8LSoZ9cNPAFyKXsFqN7Jzf/1Edr 9Zsc+9gc3lfBr6YYBSHUC4XYGzZ2fy0itK/yRTvZdUGO/XETrE61fR/wyVjQttRS OR1tAtmd9/3iZqe1jh1l3Rw4bJh1w/hS768sWpp8qAMunCGF5gQvFdqGFAxjIS5c Ddd0gjxK/NV72+iUzCSL0qUXcYjNYPT4cUapywBuQ4H1i4hl5EM3nGyCbLFbpqkp L2fzeAdRGSZIzZ35emTWhvSLZ36Ty64zEViNbAOP9o/+j6/SR5TjL1aNDkz69Eca GM1XiDeg4AoamtPR38+DzS+EnzBWfOD6ujsKNFgjAJbVIaa414Vql9utrq7fSvf2 OIJjAD8PZKN93t1qaw== =igQG -----END PGP SIGNATURE----- Merge tag 'seccomp-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "Mostly this is implementing the new flag SECCOMP_USER_NOTIF_FLAG_CONTINUE, but there are cleanups as well. - implement SECCOMP_USER_NOTIF_FLAG_CONTINUE (Christian Brauner) - fixes to selftests (Christian Brauner) - remove secure_computing() argument (Christian Brauner)" * tag 'seccomp-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: rework define for SECCOMP_USER_NOTIF_FLAG_CONTINUE seccomp: fix SECCOMP_USER_NOTIF_FLAG_CONTINUE test seccomp: simplify secure_computing() seccomp: test SECCOMP_USER_NOTIF_FLAG_CONTINUE seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE seccomp: avoid overflow in implicit constant conversion	2019-11-30 17:23:16 -08:00
Miroslav Benes	aa137a6d30	s390/livepatch: Implement reliable stack tracing for the consistency model The livepatch consistency model requires reliable stack tracing architecture support in order to work properly. In order to achieve this, two main issues have to be solved. First, reliable and consistent call chain backtracing has to be ensured. Second, the unwinder needs to be able to detect stack corruptions and return errors. The "zSeries ELF Application Binary Interface Supplement" says: "The stack pointer points to the first word of the lowest allocated stack frame. If the "back chain" is implemented this word will point to the previously allocated stack frame (towards higher addresses), except for the first stack frame, which shall have a back chain of zero (NULL). The stack shall grow downwards, in other words towards lower addresses." "back chain" is optional. GCC option -mbackchain enables it. Quoting Martin Schwidefsky [1]: "The compiler is called with the -mbackchain option, all normal C function will store the backchain in the function prologue. All functions written in assembler code should do the same, if you find one that does not we should fix that. The end result is that a task that voluntarily called schedule() should have a proper backchain at all times. Dependent on the use case this may or may not be enough. Asynchronous interrupts may stop the CPU at the beginning of a function, if kernel preemption is enabled we can end up with a broken backchain. The production kernels for IBM Z are all compiled without kernel preemption. So yes, we might get away without the objtool support. On a side-note, we do have a line item to implement the ORC unwinder for the kernel, that includes the objtool support. Once we have that we can drop the -mbackchain option for the kernel build. That gives us a nice little performance benefit. I hope that the change from backchain to the ORC unwinder will not be too hard to implement in the livepatch tools." Since -mbackchain is enabled by default when the kernel is compiled, the call chain backtracing should be currently ensured and objtool should not be necessary for livepatch purposes. Regarding the second issue, stack corruptions and non-reliable states have to be recognized by the unwinder. Mainly it means to detect preemption or page faults, the end of the task stack must be reached, return addresses must be valid text addresses and hacks like function graph tracing and kretprobes must be properly detected. Unwinding a running task's stack is not a problem, because there is a livepatch requirement that every checked task is blocked, except for the current task. Due to that, the implementation can be much simpler compared to the existing non-reliable infrastructure. We can consider a task's kernel/thread stack only and skip the other stacks. [1] 20180912121106.31ffa97c@mschwideX1 [not archived on lore.kernel.org] Link: https://lkml.kernel.org/r/20191106095601.29986-5-mbenes@suse.cz Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:48 +01:00
Miroslav Benes	be2d11b2a1	s390/unwind: add stack pointer alignment sanity checks ABI requires SP to be aligned 8 bytes, report unwinding error otherwise. Link: https://lkml.kernel.org/r/20191106095601.29986-5-mbenes@suse.cz Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:48 +01:00
Vasily Gorbik	bf018ee644	s390/unwind: filter out unreliable bogus %r14 Currently unwinder unconditionally returns %r14 from the first frame pointed by %r15 from pt_regs. A task could be interrupted when a function already allocated this frame (if it needs it) for its callees or to store local variables. In that case this frame would contain random values from stack or values stored there by a callee. As we are only interested in %r14 to get potential return address, skip bogus return addresses which doesn't belong to kernel text. This helps to avoid duplicating filtering logic in unwider users, most of which use unwind_get_return_address() and would choke on bogus 0 address returned by it otherwise. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:48 +01:00
Vasily Gorbik	222ee9087a	s390/unwind: start unwinding from reliable state A comment in arch/s390/include/asm/unwind.h says: > If 'first_frame' is not zero unwind_start skips unwind frames until it > reaches the specified stack pointer. > The end of the unwinding is indicated with unwind_done, this can be true > right after unwind_start, e.g. with first_frame!=0 that can not be found. > unwind_next_frame skips to the next frame. > Once the unwind is completed unwind_error() can be used to check if there > has been a situation where the unwinder could not correctly understand > the tasks call chain. With this change backchain unwinder now comply with behaviour described. As well as matches orc unwinder implementation. Now unwinder starts from reliable state, i.e. __unwind_start own stack frame is taken or stack frame generated by __switch_to (ksp) - both known to be valid. In case of pt_regs %r15 is better match for pt_regs psw, than sometimes random "sp" caller passed. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:48 +01:00
Vasily Gorbik	0610154650	s390/test_unwind: print verbose unwinding results Add stack name, sp and reliable information into test unwinding results. Also consider ip outside of kernel text as failure if the state is reported reliable. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:48 +01:00
Thomas Richter	247f265fa5	s390/cpum_sf: Check for SDBT and SDB consistency Each SBDT is located at a 4KB page and contains 512 entries. Each entry of a SDBT points to a SDB, a 4KB page containing sampled data. The last entry is a link to another SDBT page. When an event is created the function sequence executed is: __hw_perf_event_init() +--> allocate_buffers() +--> realloc_sampling_buffers() +---> alloc_sample_data_block() Both functions realloc_sampling_buffers() and alloc_sample_data_block() allocate pages and the allocation can fail. This is handled correctly and all allocated pages are freed and error -ENOMEM is returned to the top calling function. Finally the event is not created. Once the event has been created, the amount of initially allocated SDBT and SDB can be too low. This is detected during measurement interrupt handling, where the amount of lost samples is calculated. If the number of lost samples is too high considering sampling frequency and already allocated SBDs, the number of SDBs is enlarged during the next execution of cpumsf_pmu_enable(). If more SBDs need to be allocated, functions realloc_sampling_buffers() +---> alloc-sample_data_block() are called to allocate more pages. Page allocation may fail and the returned error is ignored. A SDBT and SDB setup already exists. However the modified SDBTs and SDBs might end up in a situation where the first entry of an SDBT does not point to an SDB, but another SDBT, basicly an SBDT without payload. This can not be handled by the interrupt handler, where an SDBT must have at least one entry pointing to an SBD. Add a check to avoid SDBTs with out payload (SDBs) when enlarging the buffer setup. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Thomas Richter	7dd6b199df	s390/cpum_sf: Use TEAR_REG macro consistantly The macro TEAR_REG() saves the last used SDBT address in the perf_hw_event structure. This is also done by function hw_reset_registers() which is a one-liner and simply uses macro TEAR_REG(). Remove function hw_reset_registers(), which is only used one time and use macro TEAR_REG() instead. This macro is used throughout the code anyway. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Thomas Richter	c17a7c6ee8	s390/cpum_sf: Remove unnecessary check for pending SDBs In interrupt handling the function extend_sampling_buffer() is called after checking for a possibly extension. This check is not necessary as the called function itself performs this check again. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Thomas Richter	532da3de70	s390/cpum_sf: Replace function name in debug statements Replace hard coded function names in debug statements by the "%s ...", __func__ construct suggested by checkpatch.pl script. Use consistent debug print format of the form variable blank value. Also add leading 0x for all hex values. Print allocated page addresses consistantly as hex numbers with leading 0x. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:46 +01:00
Gerald Schaefer	a9f2f6865d	s390/kaslr: store KASLR offset for early dumps The KASLR offset is added to vmcoreinfo in arch_crash_save_vmcoreinfo(), so that it can be found by crash when processing kernel dumps. However, arch_crash_save_vmcoreinfo() is called during a subsys_initcall, so if the kernel crashes before that, we have no vmcoreinfo and no KASLR offset. Fix this by storing the KASLR offset in the lowcore, where the vmcore_info pointer will be stored, and where it can be found by crash. In order to make it distinguishable from a real vmcore_info pointer, mark it as uneven (KASLR offset itself is aligned to THREAD_SIZE). When arch_crash_save_vmcoreinfo() stores the real vmcore_info pointer in the lowcore, it overwrites the KASLR offset. At that point, the KASLR offset is not yet added to vmcoreinfo, so we also need to move the mem_assign_absolute() behind the vmcoreinfo_append_str(). Fixes: `b2d24b97b2` ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Cc: <stable@vger.kernel.org> # v5.2+ Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	e76e69611e	s390/unwind: stop gracefully at task pt_regs Consider reaching task pt_regs graceful unwinder termination. Task pt_regs itself never contains a valid state to which a task might return within the kernel context (user task pt_regs is a special case). Since we already avoid printing user task pt_regs and in most cases we don't even bother filling task pt_regs psw and r15 with something reasonable simply skip task pt_regs altogether. With this change unwind_error() now accurately represent whether unwinder reached task pt_regs successfully or failed along the way. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	cb7948e8c3	s390/head64: correct init_task stack setup Add missing allocation of pt_regs at the bottom of the stack. This makes it consistent with other stack setup cases and also what stack unwinder expects. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	97806dfb6f	s390/unwind: make reuse_sp default when unwinding pt_regs Currently unwinder yields 2 entries when pt_regs are met: sp="address of pt_regs itself" ip=pt_regs->psw sp=pt_regs->gprs[15] ip="r14 from stack frame pointed by pt_regs->gprs[15]" And neither of those 2 states (combination of sp and ip) ever happened. reuse_sp has been introduced by commit `a1d863ac3e` ("s390/unwind: fix mixing regs and sp"). reuse_sp=true makes unwinder keen to produce the following result, when pt_regs are given (as an arg to unwind_start): sp=pt_regs->gprs[15] ip=pt_regs->psw sp=pt_regs->gprs[15] ip="r14 from stack frame pointed by pt_regs->gprs[15]" The first state is an actual state in which a task was when pt_regs were collected. The second state is marked unreliable and is for debugging purposes to cover the case when a task has been interrupted in between stack frame allocation and writing back_chain - in this case r14 might show an actual caller. Make unwinder behaviour enabled via reuse_sp=true default and drop the special case handling. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	67f5593419	s390/unwind: report an error if pt_regs are not on stack If unwinder is looking at pt_regs which is not on stack then something went wrong and an error has to be reported rather than successful unwinding termination. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	7bcaad1f9f	s390: avoid misusing CALL_ON_STACK for task stack setup CALL_ON_STACK is intended to be used for temporary stack switching with potential return to the caller. When CALL_ON_STACK is misused to switch from nodat stack to task stack back_chain information would later lead stack unwinder from task stack into (per cpu) nodat stack which is reused for other purposes. This would yield confusing unwinding result or errors. To avoid that introduce CALL_ON_STACK_NORETURN to be used instead. It makes sure that back_chain is zeroed and unwinder finishes gracefully ending up at task pt_regs. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	103b4cca60	s390/unwind: unify task is current checks Avoid mixture of task == NULL and task == current meaning the same thing and simply always initialize task with current in unwind_start. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Vasily Gorbik	7f28dad395	s390: disable preemption when switching to nodat stack with CALL_ON_STACK Make sure preemption is disabled when temporary switching to nodat stack with CALL_ON_STACK helper, because nodat stack is per cpu. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:45 +01:00
Heiko Carstens	5a5525b048	s390/vdso: fix getcpu getcpu reads the required values for cpu and node with two instructions. This might lead to an inconsistent result if user space gets preempted and migrated to a different CPU between the two instructions. Fix this by using just a single instruction to read both values at once. This is currently rather a theoretical bug, since there is no real NUMA support available (except for NUMA emulation). Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:44 +01:00
Heiko Carstens	a2308c11ec	s390/smp,vdso: fix ASCE handling When a secondary CPU is brought up it must initialize its control registers. CPU A which triggers that a secondary CPU B is brought up stores its control register contents into the lowcore of new CPU B, which then loads these values on startup. This is problematic in various ways: the control register which contains the home space ASCE will correctly contain the kernel ASCE; however control registers for primary and secondary ASCEs are initialized with whatever values were present in CPU A. Typically: - the primary ASCE will contain the user process ASCE of the process that triggered onlining of CPU B. - the secondary ASCE will contain the percpu VDSO ASCE of CPU A. Due to lazy ASCE handling we may also end up with other combinations. When then CPU B switches to a different process (!= idle) it will fixup the primary ASCE. However the problem is that the (wrong) ASCE from CPU A was loaded into control register 1: as soon as an ASCE is attached (aka loaded) a CPU is free to generate TLB entries using that address space. Even though it is very unlikey that CPU B will actually generate such entries, this could result in TLB entries of the address space of the process that ran on CPU A. These entries shouldn't exist at all and could cause problems later on. Furthermore the secondary ASCE of CPU B will not be updated correctly. This means that processes may see wrong results or even crash if they access VDSO data on CPU B. The correct VDSO ASCE will eventually be loaded on return to user space as soon as the kernel executed a call to strnlen_user or an atomic futex operation on CPU B. Fix both issues by intializing the to be loaded control register contents with the correct ASCEs and also enforce (re-)loading of the ASCEs upon first context switch and return to user space. Fixes: `0aaba41b58` ("s390: remove all code using the access register mode") Cc: stable@vger.kernel.org # v4.15+ Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-30 10:52:44 +01:00
Linus Torvalds	77a05940ee	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The biggest changes in this cycle were: - Make kcpustat vtime aware (Frederic Weisbecker) - Rework the CFS load_balance() logic (Vincent Guittot) - Misc cleanups, smaller enhancements, fixes. The load-balancing rework is the most intrusive change: it replaces the old heuristics that have become less meaningful after the introduction of the PELT metrics, with a grounds-up load-balancing algorithm. As such it's not really an iterative series, but replaces the old load-balancing logic with the new one. We hope there are no performance regressions left - but statistically it's highly probable that there is going to be some workload that is hurting from these chnages. If so then we'd prefer to have a look at that workload and fix its scheduling, instead of reverting the changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) rackmeter: Use vtime aware kcpustat accessor leds: Use all-in-one vtime aware kcpustat accessor cpufreq: Use vtime aware kcpustat accessors for user time procfs: Use all-in-one vtime aware kcpustat accessor sched/vtime: Bring up complete kcpustat accessor sched/cputime: Support other fields on kcpustat_field() sched/cpufreq: Move the cfs_rq_util_change() call to cpufreq_update_util() sched/fair: Add comments for group_type and balancing at SD_NUMA level sched/fair: Fix rework of find_idlest_group() sched/uclamp: Fix overzealous type replacement sched/Kconfig: Fix spelling mistake in user-visible help text sched/core: Further clarify sched_class::set_next_task() sched/fair: Use mul_u32_u32() sched/core: Simplify sched_class::pick_next_task() sched/core: Optimize pick_next_task() sched/core: Make pick_next_task_idle() more consistent sched/fair: Better document newidle_balance() leds: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM cpufreq: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM procfs: Use vtime aware kcpustat accessor to fetch CPUTIME_SYSTEM ...	2019-11-26 15:23:14 -08:00
Linus Torvalds	1d87200446	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "The main changes in this cycle were: - Cross-arch changes to move the linker sections for NOTES and EXCEPTION_TABLE into the RO_DATA area, where they belong on most architectures. (Kees Cook) - Switch the x86 linker fill byte from x90 (NOP) to 0xcc (INT3), to trap jumps into the middle of those padding areas instead of sliding execution. (Kees Cook) - A thorough cleanup of symbol definitions within x86 assembler code. The rather randomly named macros got streamlined around a (hopefully) straightforward naming scheme: SYM_START(name, linkage, align...) SYM_END(name, sym_type) SYM_FUNC_START(name) SYM_FUNC_END(name) SYM_CODE_START(name) SYM_CODE_END(name) SYM_DATA_START(name) SYM_DATA_END(name) etc - with about three times of these basic primitives with some label, local symbol or attribute variant, expressed via postfixes. No change in functionality intended. (Jiri Slaby) - Misc other changes, cleanups and smaller fixes" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits) x86/entry/64: Remove pointless jump in paranoid_exit x86/entry/32: Remove unused resume_userspace label x86/build/vdso: Remove meaningless CFLAGS_REMOVE_*.o m68k: Convert missed RODATA to RO_DATA x86/vmlinux: Use INT3 instead of NOP for linker fill bytes x86/mm: Report actual image regions in /proc/iomem x86/mm: Report which part of kernel image is freed x86/mm: Remove redundant address-of operators on addresses xtensa: Move EXCEPTION_TABLE to RO_DATA segment powerpc: Move EXCEPTION_TABLE to RO_DATA segment parisc: Move EXCEPTION_TABLE to RO_DATA segment microblaze: Move EXCEPTION_TABLE to RO_DATA segment ia64: Move EXCEPTION_TABLE to RO_DATA segment h8300: Move EXCEPTION_TABLE to RO_DATA segment c6x: Move EXCEPTION_TABLE to RO_DATA segment arm64: Move EXCEPTION_TABLE to RO_DATA segment alpha: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Move EXCEPTION_TABLE to RO_DATA segment x86/vmlinux: Actually use _etext for the end of the text segment vmlinux.lds.h: Allow EXCEPTION_TABLE to live in RO_DATA ...	2019-11-26 10:42:40 -08:00
Linus Torvalds	ea1f56fa16	s390 updates for the 5.5 merge window - Adjust PMU device drivers registration to avoid WARN_ON and few other perf improvements. - Enhance tracing in vfio-ccw. - Few stack unwinder fixes and improvements, convert get_wchan custom stack unwinding to generic api usage. - Fixes for mm helpers issues uncovered with tests validating architecture page table helpers. - Fix noexec bit handling when hardware doesn't support it. - Fix memleak and unsigned value compared with zero bugs in crypto code. Minor code simplification. - Fix crash during kdump with kasan enabled kernel. - Switch bug and alternatives from asm to asm_inline to improve inlining decisions. - Use 'depends on cc-option' for MARCH and TUNE options in Kconfig, add z13s and z14 ZR1 to TUNE descriptions. - Minor head64.S simplification. - Fix physical to logical CPU map for SMT. - Several cleanups in qdio code. - Other minor cleanups and fixes all over the code. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl3ahGYACgkQjYWKoQLX FBguwAgAig+FNos8zkd7Sr2wg4DPL2IlYERVP40fOLXfGuVUOnMLg8OTO6yDWDpH 5+cKAQS1wWgyvlfjWRUJ6anXLBAsgKRD1nyFIZTpn/wArGk/duCbnl/VFriDgrST 8KTQDJpZ9w9nXtQ7lA2QWaw5U2WG8I2T2JuQJCdLXze7RXi0bDVe8e6131NMaJ42 LLxqOqm8d8XDnd8oDVP04LT5IfhuI2cILoGBP/GyI2fqQk9Ems6M2gxuISq1COmy WORDLfwWyCLeF7gWKKjxf8Vo1HYcyoFvdXnxWiHb0TDZesQZJr/LLELTP03fbCW9 U4jbXncnnPA7kT4tlC95jT5M69yK5w== =+FxG -----END PGP SIGNATURE----- Merge tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Adjust PMU device drivers registration to avoid WARN_ON and few other perf improvements. - Enhance tracing in vfio-ccw. - Few stack unwinder fixes and improvements, convert get_wchan custom stack unwinding to generic api usage. - Fixes for mm helpers issues uncovered with tests validating architecture page table helpers. - Fix noexec bit handling when hardware doesn't support it. - Fix memleak and unsigned value compared with zero bugs in crypto code. Minor code simplification. - Fix crash during kdump with kasan enabled kernel. - Switch bug and alternatives from asm to asm_inline to improve inlining decisions. - Use 'depends on cc-option' for MARCH and TUNE options in Kconfig, add z13s and z14 ZR1 to TUNE descriptions. - Minor head64.S simplification. - Fix physical to logical CPU map for SMT. - Several cleanups in qdio code. - Other minor cleanups and fixes all over the code. * tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits) s390/cpumf: Adjust registration of s390 PMU device drivers s390/smp: fix physical to logical CPU map for SMT s390/early: move access registers setup in C code s390/head64: remove unnecessary vdso_per_cpu_data setup s390/early: move control registers setup in C code s390/kasan: support memcpy_real with TRACE_IRQFLAGS s390/crypto: Fix unsigned variable compared with zero s390/pkey: use memdup_user() to simplify code s390/pkey: fix memory leak within _copy_apqns_from_user() s390/disassembler: don't hide instruction addresses s390/cpum_sf: Assign error value to err variable s390/cpum_sf: Replace function name in debug statements s390/cpum_sf: Use consistant debug print format for sampling s390/unwind: drop unnecessary code around calling ftrace_graph_ret_addr() s390: add error handling to perf_callchain_kernel s390: always inline current_stack_pointer() s390/mm: add mm_pxd_folded() checks to pxd_free() s390/mm: properly clear _PAGE_NOEXEC bit when it is not supported s390/mm: simplify page table helpers for large entries s390/mm: make pmd/pud_bad() report large entries as bad ...	2019-11-25 17:23:53 -08:00
Thomas Richter	6a82e23f45	s390/cpumf: Adjust registration of s390 PMU device drivers Linux-next commit titled "perf/core: Optimize perf_init_event()" changed the semantics of PMU device driver registration. It was done to speed up the lookup/handling of PMU device driver specific events. It also enforces that only one PMU device driver will be registered of type PERF_EVENT_RAW. This change added these line in function perf_pmu_register(): ... + ret = idr_alloc(&pmu_idr, pmu, max, 0, GFP_KERNEL); + if (ret < 0) goto free_pdc; + + WARN_ON(type >= 0 && ret != type); The warn_on generates a message. We have 3 PMU device drivers, each registered as type PERF_TYPE_RAW. The cf_diag device driver (arch/s390/kernel/perf_cpumf_cf_diag.c) always hits the WARN_ON because it is the second PMU device driver (after sampling device driver arch/s390/kernel/perf_cpumf_sf.c) which is registered as type 4 (PERF_TYPE_RAW). So when the sampling device driver is registered, ret has value 4. When cf_diag device driver is registered with type 4, ret has value of 5 and WARN_ON fires. Adjust the PMU device drivers for s390 to support the new semantics required by perf_pmu_register(). Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-20 17:16:01 +01:00
Heiko Carstens	72a81ad9d6	s390/smp: fix physical to logical CPU map for SMT If an SMT capable system is not IPL'ed from the first CPU the setup of the physical to logical CPU mapping is broken: the IPL core gets CPU number 0, but then the next core gets CPU number 1. Correct would be that all SMT threads of CPU 0 get the subsequent logical CPU numbers. This is important since a lot of code (like e.g. the CPU topology code) assumes that CPU maps are setup like this. If the mapping is broken the system will not IPL due to broken topology masks: [ 1.716341] BUG: arch topology broken [ 1.716342] the SMT domain not a subset of the MC domain [ 1.716343] BUG: arch topology broken [ 1.716344] the MC domain not a subset of the BOOK domain This scenario can usually not happen since LPARs are always IPL'ed from CPU 0 and also re-IPL is intiated from CPU 0. However older kernels did initiate re-IPL on an arbitrary CPU. If therefore a re-IPL from an old kernel into a new kernel is initiated this may lead to crash. Fix this by setting up the physical to logical CPU mapping correctly. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-20 12:58:13 +01:00
Vasily Gorbik	c231359421	s390/early: move access registers setup in C code Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-20 12:58:13 +01:00
Vasily Gorbik	b8ce1fa489	s390/head64: remove unnecessary vdso_per_cpu_data setup vdso_per_cpu_data lowcore value is only needed for fully functional exception handlers, which are activated in setup_lowcore_dat_off. The same function does init vdso_per_cpu_data via vdso_alloc_boot_cpu. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-20 12:58:12 +01:00
Vasily Gorbik	c02ee6a16a	s390/early: move control registers setup in C code Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-20 12:58:12 +01:00
Ilya Leoshkevich	544f1d62e3	s390/disassembler: don't hide instruction addresses Due to kptr_restrict, JITted BPF code is now displayed like this: 000000000b6ed1b2: ebdff0800024 stmg %r13,%r15,128(%r15) 000000004cde2ba0: 41d0f040 la %r13,64(%r15) 00000000fbad41b0: a7fbffa0 aghi %r15,-96 Leaking kernel addresses to dmesg is not a concern in this case, because this happens only when JIT debugging is explicitly activated, which only root can do. Use %px in this particular instance, and also to print an instruction address in show_code and PCREL (e.g. brasl) arguments in print_insn. While at present functionally equivalent to %016lx, %px is recommended by Documentation/core-api/printk-formats.rst for such cases. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-12 11:24:10 +01:00
Thomas Richter	72fbcd057f	s390/cpum_sf: Assign error value to err variable When starting the CPU Measurement sampling facility using qsi() function, this function may return an error value. This error value is referenced in the else part of the if statement to dump its value in a debug statement. Right now this value is always zero because it has not been assigned a value. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-12 11:24:10 +01:00
Thomas Richter	c18388340c	s390/cpum_sf: Replace function name in debug statements Replace hard coded function names in debug statements by the "%s ...", __func__ construct suggested by checkpatch.pl script. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-12 11:24:10 +01:00
Thomas Richter	d98b5d0728	s390/cpum_sf: Use consistant debug print format for sampling Use consistant debug print format of the form variable blank value. Also add leading 0x for all hex values. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-12 11:24:10 +01:00
Ingo Molnar	6d5a763c30	Linux 5.4-rc7 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3IqJQeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOiUH+gOEDwid5OODaFAd CggXugdFIlBZefKqGVNW5sjgX8pxFWHXuEMC8iNb6QXtQZdFrI6LFf9hhUDmzQtm 6y1LPxxEiTZjObMEsBNylb7tyzgujFHcAlp0Zro3w/HLCqmYTSP3FF46i2u6KZfL XhkpM4X7R7qxlfpdhlfESv/ElRGocZe6SwXfC7pcPo5flFcmkdu9ijqhNd/6CZ/h Nf9rTsD/wEDVUelFbgVN+LJzlaB0tsyc4Zbof07n8OsFZjhdEOop8gfM/kTBLcyY 6bh66SfDScdsNnC/l8csbPjSZRx+i+nQs67DyhGNnsSAFgHBZdC4Tb/2mDCwhCLR dUvuYZc= =1N6F -----END PGP SIGNATURE----- Merge tag 'v5.4-rc7' into sched/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-11-11 08:34:59 +01:00
Miroslav Benes	c2f2093e14	s390/unwind: drop unnecessary code around calling ftrace_graph_ret_addr() The current code around calling ftrace_graph_ret_addr() is ifdeffed and also tests if ftrace redirection is present on stack. ftrace_graph_ret_addr() however performs the test internally and there is a version for !CONFIG_FUNCTION_GRAPH_TRACER as well. The unnecessary code can thus be dropped. Link: http://lkml.kernel.org/r/20191029143904.24051-2-mbenes@suse.cz Signed-off-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-11-05 12:30:15 +01:00
Kees Cook	c9174047b4	vmlinux.lds.h: Replace RW_DATA_SECTION with RW_DATA Rename RW_DATA_SECTION to RW_DATA. (Calling this a "section" is a lie, since it's multiple sections and section flags cannot be applied to the macro.) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-14-keescook@chromium.org	2019-11-04 15:57:41 +01:00
Kees Cook	93240b3279	vmlinux.lds.h: Replace RO_DATA_SECTION with RO_DATA Finish renaming RO_DATA_SECTION to RO_DATA. (Calling this a "section" is a lie, since it's multiple sections and section flags cannot be applied to the macro.) Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-13-keescook@chromium.org	2019-11-04 15:56:16 +01:00
Kees Cook	eaf937075c	vmlinux.lds.h: Move NOTES into RO_DATA The .notes section should be non-executable read-only data. As such, move it to the RO_DATA macro instead of being per-architecture defined. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-11-keescook@chromium.org	2019-11-04 15:34:41 +01:00
Kees Cook	fbe6a8e618	vmlinux.lds.h: Move Program Header restoration into NOTES macro In preparation for moving NOTES into RO_DATA, make the Program Header assignment restoration be part of the NOTES macro itself. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-10-keescook@chromium.org	2019-11-04 15:34:39 +01:00
Kees Cook	441110a547	vmlinux.lds.h: Provide EMIT_PT_NOTE to indicate export of .notes In preparation for moving NOTES into RO_DATA, provide a mechanism for architectures that want to emit a PT_NOTE Program Header to do so. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # s390 Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-9-keescook@chromium.org	2019-11-04 15:34:38 +01:00
Kees Cook	6434efbd9a	s390: Move RO_DATA into "text" PT_LOAD Program Header In preparation for moving NOTES into RO_DATA, move RO_DATA back into the "text" PT_LOAD Program Header, as done with other architectures. The "data" PT_LOAD now starts with the writable data section. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-alpha@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-ia64@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Cc: Will Deacon <will@kernel.org> Cc: x86-ml <x86@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20191029211351.13243-7-keescook@chromium.org	2019-11-04 15:34:32 +01:00
Heiko Carstens	3d7efa4edd	s390/idle: fix cpu idle time calculation The idle time reported in /proc/stat sometimes incorrectly contains huge values on s390. This is caused by a bug in arch_cpu_idle_time(). The kernel tries to figure out when a different cpu entered idle by accessing its per-cpu data structure. There is an ordering problem: if the remote cpu has an idle_enter value which is not zero, and an idle_exit value which is zero, it is assumed it is idle since "now". The "now" timestamp however is taken before the idle_enter value is read. Which in turn means that "now" can be smaller than idle_enter of the remote cpu. Unconditionally subtracting idle_enter from "now" can thus lead to a negative value (aka large unsigned value). Fix this by moving the get_tod_clock() invocation out of the loop. While at it also make the code a bit more readable. A similar bug also exists for show_idle_time(). Fix this is as well. Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-31 17:26:48 +01:00
Ilya Leoshkevich	a1d863ac3e	s390/unwind: fix mixing regs and sp unwind_for_each_frame stops after the first frame if regs->gprs[15] <= sp. The reason is that in case regs are specified, the first frame should be regs->psw.addr and the second frame should be sp->gprs[8]. However, currently the second frame is regs->gprs[15], which confuses outside_of_stack(). Fix by introducing a flag to distinguish this special case from unwinding the interrupt handler, for which the current behavior is appropriate. Fixes: `78c98f9074` ("s390/unwind: introduce stack unwind API") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-31 17:26:48 +01:00
Ilya Leoshkevich	effb83ccc8	s390: add error handling to perf_callchain_kernel perf_callchain_kernel stops neither when it encounters a garbage address, nor when it runs out of space. Fix both issues using x86 version as an inspiration. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-31 17:20:54 +01:00
Vasily Gorbik	6756dd9b89	s390/process: avoid custom stack unwinding in get_wchan Currently get_wchan uses custom stack unwinding implementation which relies on back_chain presence. Replace it with more abstract stack unwinding api usage. Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-31 17:20:53 +01:00
Vasily Gorbik	d3baaeb5ae	s390: avoid double handling of "noexec" option "noexec" option is already parsed during startup and its value is exposed via noexec_disabled variable. Simply reuse that value during machine facilities detection. Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-31 17:20:52 +01:00
Heiko Carstens	f653e29bc2	s390/time: remove monotonic_clock() Remove unused monotonic_clock() function. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-31 17:20:52 +01:00
Gerald Schaefer	ac49303d9e	s390/kaslr: add support for R_390_GLOB_DAT relocation type Commit "bpf: Process in-kernel BTF" in linux-next introduced an undefined __weak symbol, which results in an R_390_GLOB_DAT relocation type. That is not yet handled by the KASLR relocation code, and the kernel stops with the message "Unknown relocation type". Add code to detect and handle R_390_GLOB_DAT relocation types and undefined symbols. Fixes: `805bc0bc23` ("s390/kernel: build a relocatable kernel") Cc: <stable@vger.kernel.org> # v5.2+ Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-22 17:55:51 +02:00
Christian Brauner	fefad9ef58	seccomp: simplify secure_computing() Afaict, the struct seccomp_data argument to secure_computing() is unused by all current callers. So let's remove it. The argument was added in [1]. It was added because having the arch supply the syscall arguments used to be faster than having it done by secure_computing() (cf. Andy's comment in [2]). This is not true anymore though. /* References */ [1]: `2f275de5d1` ("seccomp: Add a seccomp_data parameter secure_computing()") [2]: https://lore.kernel.org/r/CALCETrU_fs_At-hTpr231kpaAd0z7xJN4ku-DvzhRU6cvcJA_w@mail.gmail.com Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-parisc@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: x86@kernel.org Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/r/20190924064420.6353-1-christian.brauner@ubuntu.com Signed-off-by: Kees Cook <keescook@chromium.org>	2019-10-10 14:55:24 -07:00
Frederic Weisbecker	f83eeb1a01	sched/cputime: Rename vtime_account_system() to vtime_account_kernel() vtime_account_system() decides if we need to account the time to the system (__vtime_account_system()) or to the guest (vtime_account_guest()). So this function is a misnomer as we are on a higher level than "system". All we know when we call that function is that we are accounting kernel cputime. Whether it belongs to guest or system time is a lower level detail. Rename this function to vtime_account_kernel(). This will clarify things and avoid too many underscored vtime_account_system() versions. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Link: https://lkml.kernel.org/r/20191003161745.28464-2-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2019-10-09 12:39:25 +02:00
Thomas Richter	e14e59c125	s390/cpumf: Fix indentation in sampling device driver Fix indentation in the s390 CPU Measuement Facility sampling device dirver. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-01 09:41:36 +02:00
Thomas Richter	932bfc5aae	s390/cpumsf: Check for CPU Measurement sampling s390 IBM z15 introduces a check if the CPU Mesurement Facility sampling is temporarily unavailable. If this is the case return -EBUSY and abort the setup of CPU Measuement facility sampling. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-01 09:41:36 +02:00
Thomas Richter	8513458280	s390/cpumf: Use consistant debug print format Use consistant debug print format of the form variable blank value. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-10-01 09:41:36 +02:00
Linus Torvalds	aefcf2f4b5	Merge branch 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull kernel lockdown mode from James Morris: "This is the latest iteration of the kernel lockdown patchset, from Matthew Garrett, David Howells and others. From the original description: This patchset introduces an optional kernel lockdown feature, intended to strengthen the boundary between UID 0 and the kernel. When enabled, various pieces of kernel functionality are restricted. Applications that rely on low-level access to either hardware or the kernel may cease working as a result - therefore this should not be enabled without appropriate evaluation beforehand. The majority of mainstream distributions have been carrying variants of this patchset for many years now, so there's value in providing a doesn't meet every distribution requirement, but gets us much closer to not requiring external patches. There are two major changes since this was last proposed for mainline: - Separating lockdown from EFI secure boot. Background discussion is covered here: https://lwn.net/Articles/751061/ - Implementation as an LSM, with a default stackable lockdown LSM module. This allows the lockdown feature to be policy-driven, rather than encoding an implicit policy within the mechanism. The new locked_down LSM hook is provided to allow LSMs to make a policy decision around whether kernel functionality that would allow tampering with or examining the runtime state of the kernel should be permitted. The included lockdown LSM provides an implementation with a simple policy intended for general purpose use. This policy provides a coarse level of granularity, controllable via the kernel command line: lockdown={integrity\|confidentiality} Enable the kernel lockdown feature. If set to integrity, kernel features that allow userland to modify the running kernel are disabled. If set to confidentiality, kernel features that allow userland to extract confidential information from the kernel are also disabled. This may also be controlled via /sys/kernel/security/lockdown and overriden by kernel configuration. New or existing LSMs may implement finer-grained controls of the lockdown features. Refer to the lockdown_reason documentation in include/linux/security.h for details. The lockdown feature has had signficant design feedback and review across many subsystems. This code has been in linux-next for some weeks, with a few fixes applied along the way. Stephen Rothwell noted that commit `9d1f8be5cf` ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode") is missing a Signed-off-by from its author. Matthew responded that he is providing this under category (c) of the DCO" * 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits) kexec: Fix file verification on S390 security: constify some arrays in lockdown LSM lockdown: Print current->comm in restriction messages efi: Restrict efivar_ssdt_load when the kernel is locked down tracefs: Restrict tracefs when the kernel is locked down debugfs: Restrict debugfs when the kernel is locked down kexec: Allow kexec_file() with appropriate IMA policy when locked down lockdown: Lock down perf when in confidentiality mode bpf: Restrict bpf when kernel lockdown is in confidentiality mode lockdown: Lock down tracing and perf kprobes when in confidentiality mode lockdown: Lock down /proc/kcore x86/mmiotrace: Lock down the testmmiotrace module lockdown: Lock down module params that specify hardware parameters (eg. ioport) lockdown: Lock down TIOCSSERIAL lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down acpi: Disable ACPI table override if the kernel is locked down acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down ACPI: Limit access to custom_method when the kernel is locked down x86/msr: Restrict MSR access when the kernel is locked down x86: Lock down IO port access when the kernel is locked down ...	2019-09-28 08:14:15 -07:00
Linus Torvalds	f1f2f614d5	Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "The major feature in this time is IMA support for measuring and appraising appended file signatures. In addition are a couple of bug fixes and code cleanup to use struct_size(). In addition to the PE/COFF and IMA xattr signatures, the kexec kernel image may be signed with an appended signature, using the same scripts/sign-file tool that is used to sign kernel modules. Similarly, the initramfs may contain an appended signature. This contained a lot of refactoring of the existing appended signature verification code, so that IMA could retain the existing framework of calculating the file hash once, storing it in the IMA measurement list and extending the TPM, verifying the file's integrity based on a file hash or signature (eg. xattrs), and adding an audit record containing the file hash, all based on policy. (The IMA support for appended signatures patch set was posted and reviewed 11 times.) The support for appended signature paves the way for adding other signature verification methods, such as fs-verity, based on a single system-wide policy. The file hash used for verifying the signature and the signature, itself, can be included in the IMA measurement list" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: ima_api: Use struct_size() in kzalloc() ima: use struct_size() in kzalloc() sefltest/ima: support appended signatures (modsig) ima: Fix use after free in ima_read_modsig() MODSIGN: make new include file self contained ima: fix freeing ongoing ahash_request ima: always return negative code for error ima: Store the measurement again when appraising a modsig ima: Define ima-modsig template ima: Collect modsig ima: Implement support for module-style appended signatures ima: Factor xattr_verify() out of ima_appraise_measurement() ima: Add modsig appraise_type option for module-style appended signatures integrity: Select CONFIG_KEYS instead of depending on it PKCS#7: Introduce pkcs7_get_digest() PKCS#7: Refactor verify_pkcs7_signature() MODSIGN: Export module signature definitions ima: initialize the "template" field with the default template	2019-09-27 19:37:27 -07:00
Vasily Gorbik	f3122a79a1	s390/topology: avoid firing events before kobjs are created arch_update_cpu_topology is first called from: kernel_init_freeable->sched_init_smp->sched_init_domains even before cpus has been registered in: kernel_init_freeable->do_one_initcall->s390_smp_init Do not trigger kobject_uevent change events until cpu devices are actually created. Fixes the following kasan findings: BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb40/0xee0 Read of size 8 at addr 0000000000000020 by task swapper/0/1 BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb36/0xee0 Read of size 8 at addr 0000000000000018 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: ([<0000000143c6db7e>] show_stack+0x14e/0x1a8) [<0000000145956498>] dump_stack+0x1d0/0x218 [<000000014429fb4c>] print_address_description+0x64/0x380 [<000000014429f630>] __kasan_report+0x138/0x168 [<0000000145960b96>] kobject_uevent_env+0xb36/0xee0 [<0000000143c7c47c>] arch_update_cpu_topology+0x104/0x108 [<0000000143df9e22>] sched_init_domains+0x62/0xe8 [<000000014644c94a>] sched_init_smp+0x3a/0xc0 [<0000000146433a20>] kernel_init_freeable+0x558/0x958 [<000000014599002a>] kernel_init+0x22/0x160 [<00000001459a71d4>] ret_from_fork+0x28/0x30 [<00000001459a71dc>] kernel_thread_starter+0x0/0x10 Cc: stable@vger.kernel.org Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-09-23 23:27:52 +02:00
Thomas Richter	2cb549a821	s390/cpum_sf: Support ioctl PERF_EVENT_IOC_PERIOD A perf_event can be set up to deliver overflow notifications via SIGIO signal. The setup of the event is: 1. create event with perf_event_open() 2. assign it a signal for I/O notification with fcntl() 3. Install signal handler and consume samples The initial setup of perf_event_open() determines the period/frequency time span needed to elapse before each signal is delivered to the user process. While the event is active, system call ioctl(.., PERF_EVENT_IOC_PERIOD, value) can be used the change the frequency/period time span of the active event. The remaining signal handler invocations honour the new value. This does not work on s390. In fact the time span does not change regardless of ioctl's third argument 'value'. The call succeeds but the time span does not change. Support this behavior and make it common with other platforms. This is achieved by changing the interval value of the sampling control block accordingly and feed this new value every time the event is enabled using pmu_event_enable(). Before this change the interval value was set only once at pmu_event_add() and never changed. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-09-19 12:56:07 +02:00
Linus Torvalds	d590284419	s390 updates for the 5.4 merge window - Add support for IBM z15 machines. - Add SHA3 and CCA AES cipher key support in zcrypt and pkey refactoring. - Move to arch_stack_walk infrastructure for the stack unwinder. - Various kasan fixes and improvements. - Various command line parsing fixes. - Improve decompressor phase debuggability. - Lift no bss usage restriction for the early code. - Use refcount_t for reference counters for couple of places in mm code. - Logging improvements and return code fix in vfio-ccw code. - Couple of zpci fixes and minor refactoring. - Remove some outdated documentation. - Fix secure boot detection. - Other various minor code clean ups. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl1/pRoACgkQjYWKoQLX FBjxLQf/Y1nlmoc8URLqaqfNTczIvUzdfXuahI7L75RoIIiqHtcHBrVwauSr7Lma XVRzK/+6q0UPISrOIZEEtQKsMMM7rGuUv/+XTyrOB/Tsc31kN2EIRXltfXI/lkb8 BZdgch4Xs2rOD7y6TvqpYJsXYXsnLMWwCk8V+48V/pok4sEgMDgh0bTQRHPHYmZ6 1cv8ZQ0AeuVxC6ChM30LhajGRPkYd8RQ82K7fU7jxT0Tjzu66SyrW3pTwA5empBD RI2yBZJ8EXwJyTCpvN8NKiBgihDs9oUZl61Dyq3j64Mb1OuNUhxXA/8jmtnGn0ok O9vtImCWzExhjSMkvotuhHEC05nEEQ== =LCgE -----END PGP SIGNATURE----- Merge tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for IBM z15 machines. - Add SHA3 and CCA AES cipher key support in zcrypt and pkey refactoring. - Move to arch_stack_walk infrastructure for the stack unwinder. - Various kasan fixes and improvements. - Various command line parsing fixes. - Improve decompressor phase debuggability. - Lift no bss usage restriction for the early code. - Use refcount_t for reference counters for couple of places in mm code. - Logging improvements and return code fix in vfio-ccw code. - Couple of zpci fixes and minor refactoring. - Remove some outdated documentation. - Fix secure boot detection. - Other various minor code clean ups. * tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits) s390: remove pointless drivers-y in drivers/s390/Makefile s390/cpum_sf: Fix line length and format string s390/pci: fix MSI message data s390: add support for IBM z15 machines s390/crypto: Support for SHA3 via CPACF (MSA6) s390/startup: add pgm check info printing s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding vfio-ccw: fix error return code in vfio_ccw_sch_init() s390: vfio-ap: fix warning reset not completed s390/base: remove unused s390_base_mcck_handler s390/sclp: Fix bit checked for has_sipl s390/zcrypt: fix wrong handling of cca cipher keygenflags s390/kasan: add kdump support s390/setup: avoid using strncmp with hardcoded length s390/sclp: avoid using strncmp with hardcoded length s390/module: avoid using strncmp with hardcoded length s390/pci: avoid using strncmp with hardcoded length s390/kaslr: reserve memory for kasan usage s390/mem_detect: provide single get_mem_detect_end s390/cmma: reuse kstrtobool for option value parsing ...	2019-09-17 14:04:43 -07:00
Thomas Richter	03e9e42f79	s390/cpum_sf: Fix line length and format string Rewrite some lines to match line length and replace format string 0x%x to %#x. Add and remove blank line. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-09-16 13:21:51 +02:00
Martin Schwidefsky	a0e2251132	s390: add support for IBM z15 machines Add detection for machine types 0x8562 and 8x8561 and set the ELF platform name to z15. Add the miscellaneous-instruction-extension 3 facility to the list of facilities for z15. And allow to generate code that only runs on a z15 machine. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-09-13 12:19:14 +02:00
Matthew Garrett	45893a0abe	kexec: Fix file verification on S390 I accidentally typoed this #ifdef, so verification would always be disabled. Signed-off-by: Matthew Garrett <mjg59@google.com> Reported-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2019-09-10 13:27:51 +01:00
Vasily Gorbik	512222789c	s390/base: remove unused s390_base_mcck_handler s390_base_mcck_handler was used during system reset if diag308 set was not available. But after commit `d485235b00` ("s390: assume diag308 set always works") is a dead code and could be removed. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-09-03 13:53:56 +02:00
Vasily Gorbik	d0b319843b	s390/setup: avoid using strncmp with hardcoded length Replace strncmp usage in console mode setup code with simple strcmp. Replace strncmp which is used for prefix comparison with str_has_prefix. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-29 15:34:58 +02:00
Vasily Gorbik	54fb07d030	s390/sclp: avoid using strncmp with hardcoded length "earlyprintk" option documentation does not clearly state which platform supports which additional values (e.g. ",keep"). Preserve old option behaviour and reuse str_has_prefix instead of strncmp for prefix testing. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-29 15:34:58 +02:00
Vasily Gorbik	b29cd7c4c4	s390/module: avoid using strncmp with hardcoded length Reuse str_has_prefix instead of strncmp with hardcoded length to make the intent of a comparison more obvious. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-29 15:34:57 +02:00
Vasily Gorbik	3d64436453	s390/vdso: reuse kstrtobool for option value parsing "vdso" option setup already recognises integer and textual values. Yet kstrtobool is a more common way to parse boolean values, reuse it to unify option value parsing behavior and simplify code a bit. While at it, __setup value parsing callbacks are expected to return 1 when an option is recognized, and returning any other value won't trigger any error message currently, so simply return 1. Also don't change default vdso_enabled value of 1 when "vdso" option value is invalid. Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-26 12:51:17 +02:00
Vasily Gorbik	e991e5bb11	s390/stacktrace: use common arch_stack_walk infrastructure Use common arch_stack_walk infrastructure to avoid duplicated code and avoid taking care of the stack storage and filtering. Common code also uses try_get_task_stack/put_task_stack when needed which have been missing in our code, which also solves potential problem for us. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-21 12:58:53 +02:00
Vasily Gorbik	2c7fa8a11c	s390/kasan: avoid report in get_wchan Reading other running task's stack can be a dangerous endeavor. Kasan stack memory access instrumentation includes special prologue and epilogue to mark/remove red zones in shadow memory between stack variables. For that reason there is always a race between a task reading value in other task's stack and that other task returning from a function and entering another one generating different red zones pattern. To avoid kasan reports simply perform uninstrumented memory reads. Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-21 12:58:53 +02:00
Vasily Gorbik	8769f610fe	s390/process: avoid potential reading of freed stack With THREAD_INFO_IN_TASK (which is selected on s390) task's stack usage is refcounted and should always be protected by get/put when touching other task's stack to avoid race conditions with task's destruction code. Fixes: `d5c352cdd0` ("s390: move thread_info into task_struct") Cc: stable@vger.kernel.org # v4.10+ Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-21 12:58:52 +02:00
Vasily Gorbik	2e83e0eb85	s390: clean .bss before running uncompressed kernel Clean uncompressed kernel .bss section in the startup code before the uncompressed kernel is executed. At this point of time initrd and certificates have been already rescued. Uncompressed kernel .bss size is known from vmlinux_info. It is also taken into consideration during uncompressed kernel positioning by kaslr (so it is safe to clean it). With that uncompressed kernel is starting with .bss section zeroed and no .bss section usage restrictions apply. Which makes chkbss checks for uncompressed kernel objects obsolete and they can be removed. early_nobss.c is also not needed anymore. Parts of it which are still relevant are moved to early.c. Kasan initialization code is now called directly from head64 (early.c is instrumented and should not be executed before kasan shadow memory is set up). Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-21 12:58:52 +02:00
Vasily Gorbik	59793c5ab9	s390: move vmalloc option parsing to startup code Few other crucial memory setup options are already handled in the startup code. Those values are needed by kaslr and kasan implementations. "vmalloc" is the last piece required for future improvements such as early decision on kernel page levels depth required for actual memory setup, as well as vmalloc memory area access monitoring in kasan. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-21 12:41:43 +02:00
Jiri Bohac	99d5cadfde	kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE This is a preparatory patch for kexec_file_load() lockdown. A locked down kernel needs to prevent unsigned kernel images from being loaded with kexec_file_load(). Currently, the only way to force the signature verification is compiling with KEXEC_VERIFY_SIG. This prevents loading usigned images even when the kernel is not locked down at runtime. This patch splits KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE. Analogous to the MODULE_SIG and MODULE_SIG_FORCE for modules, KEXEC_SIG turns on the signature verification but allows unsigned images to be loaded. KEXEC_SIG_FORCE disallows images without a valid signature. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Garrett <mjg59@google.com> cc: kexec@lists.infradead.org Signed-off-by: James Morris <jmorris@namei.org>	2019-08-19 21:54:15 -07:00
Heiko Carstens	404861e15b	s390/vdso: map vdso also for statically linked binaries s390 does not map the vdso for statically linked binaries, assuming that this doesn't make sense. See commit `fc5243d98a` ("[S390] arch_setup_additional_pages arguments"). However with glibc commit d665367f596d ("linux: Enable vDSO for static linking as default (BZ#19767)") and commit 5e855c895401 ("s390: Enable VDSO for static linking") the vdso is also used for statically linked binaries - if the kernel would make it available. Therefore map the vdso always, just like all other architectures. Reported-by: Stefan Liebler <stli@linux.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-09 11:03:28 +02:00
Vasily Gorbik	24350fdadb	s390: put _stext and _etext into .text section Perf relies on _etext and _stext symbols being one of 't', 'T', 'v' or 'V'. Put them into .text section to guarantee that. Also moves padding to page boundary inside .text which has an effect that .text section is now padded with nops rather than 0's, which apparently has been the initial intention for specifying 0x0700 fill expression. Reported-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Suggested-by: Andreas Krebbel <krebbel@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-06 13:58:35 +02:00
Vasily Gorbik	b9f23b7376	s390/head64: cleanup unused labels Cleanup labels in head64 some of which are not being used since git recorded history. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-06 13:58:35 +02:00
Vasily Gorbik	fd0c7435d7	s390/unwind: remove stack recursion warning Remove pointless stack recursion on stack type ... warning, which only confuses people. There is no way to make backchain unwinder 100% reliable. When a task is interrupted in-between stack frame allocation and backchain write instructions new stack frame backchain pointer is left uninitialized (there are also sometimes additional instruction in-between stack frame allocation and backchain write instructions due to gcc shrink-wrapping). In attempt to unwind such stack the unwinder would still try to use that invalid backchain value and perform all kind of sanity checks on it to make sure we are not pointed out of stack. In some cases that invalid backchain value would be 0 and we would falsely treat next stackframe as pt_regs and again gprs[15] in those pt_regs might happen to point at some address within the task's stack. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-06 13:58:35 +02:00
Vasily Gorbik	218ddd5acf	s390/setup: adjust start_code of init_mm to _text After some investigation it doesn't look like init_mm fields start_code/end_code are used anywhere besides potentially in dump_mm for debugging purposes. Originally the value of 0 for start_code reflected the presence of lowcore and early boot code. But with kaslr in place start_code/end_code range should not span over unoccupied by the code segment memory. So, adjust init_mm start_code to point at the beginning of the code segment like other architectures do it. Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-06 13:58:34 +02:00
Vasily Gorbik	a287a49e67	s390/protvirt: avoid memory sharing for diag 308 set/store This reverts commit `db9492cef4` ("s390/protvirt: add memory sharing for diag 308 set/store") which due to ultravisor implementation change is not needed after all. Fixes: `db9492cef4` ("s390/protvirt: add memory sharing for diag 308 set/store") Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-08-06 13:58:34 +02:00
Thiago Jung Bauermann	c8424e776b	MODSIGN: Export module signature definitions IMA will use the module_signature format for append signatures, so export the relevant definitions and factor out the code which verifies that the appended signature trailer is valid. Also, create a CONFIG_MODULE_SIG_FORMAT option so that IMA can select it and be able to use mod_check_sig() without having to depend on either CONFIG_MODULE_SIG or CONFIG_MODULES. s390 duplicated the definition of struct module_signature so now they can use the new <linux/module_signature.h> header instead. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Acked-by: Jessica Yu <jeyu@kernel.org> Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>	2019-08-05 18:39:56 -04:00
Vasily Gorbik	1877011a35	s390/kexec: add missing include to machine_kexec_reloc.c Include <asm/kexec.h> into machine_kexec_reloc.c to expose arch_kexec_do_relocs declaration and avoid the following sparse warnings: arch/s390/kernel/machine_kexec_reloc.c:4:5: warning: symbol 'arch_kexec_do_relocs' was not declared. Should it be static? arch/s390/boot/../kernel/machine_kexec_reloc.c:4:5: warning: symbol 'arch_kexec_do_relocs' was not declared. Should it be static? Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-29 18:05:03 +02:00
Vasily Gorbik	06f9895fda	s390/perf: make cf_diag_csd static Since there is really no reason for cf_diag_csd per cpu variable to be globally visible make it static to avoid the following sparse warning: arch/s390/kernel/perf_cpum_cf_diag.c:37:1: warning: symbol 'cf_diag_csd' was not declared. Should it be static? Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2019-07-29 18:05:02 +02:00
Vasily Gorbik	5518aed82d	s390: wire up clone3 system call Tested (64-bit and compat mode) using program from http://lkml.kernel.org/r/20190604212930.jaaztvkent32b7d3@brauner.io with the following: return syscall(__NR_clone, flags, 0, pidfd, 0, 0); changed to: return syscall(__NR_clone, 0, flags, pidfd, 0, 0); due to CLONE_BACKWARDS2. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2019-07-23 10:45:53 +02:00

1 2 3 4 5 ...

2682 Commits