Commit Graph

1975 Commits

Author SHA1 Message Date
Christian Borntraeger
c19805f870 s390/cpumf: Use configuration level indication for sampling data
Newer hardware provides the level of virtualization that a particular
sample belongs to. Use that information and fall back to the old
heuristics if the sample does not contain that information.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:12:18 +01:00
Heiko Carstens
82897ede92 s390: cleanup arch/s390/kernel Makefile
Group all compiler flag modification lines together and sort them
alphabetically. This should hopefully prevent future bugs due to
missing flag modifications.
Also fix indentation at some places.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:26 +01:00
Heiko Carstens
d543a106f9 s390: fix initrd corruptions with gcov/kcov instrumented kernels
The early C code within arch/s390/kernel/early.c saves ipl parameters
before the bss section is cleared. When doing that it jumps to code
that is potentially gcov/kcov instrumented. That code in turn will
corrupt an initrd that potentially may reside in the not yet ready to
be used bss section.

Instead of excluding more and more code from gcov/kcov instrumentation
provide an early memmove function which will be used to save ipl
parameters. The verification if these parameters are actually valid
will be done later.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:20 +01:00
Heiko Carstens
b5cb9bf8dd s390: exclude early C code from gcov profiling
Early C code must be excluded from gcov profiling since it may write
to the bss section before
- a potential initrd that resides there is rescued
- the bss section is initialized (zeroed)

This patch only addresses the problem that early code is instrumented
for profiling, but not the problem that it jumps into other code that
is still instrumented. That problem will be fixed with a follow-on
patch.

Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:10:15 +01:00
Heiko Carstens
7df1160459 s390: remove unused labels from entry.S
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:04:26 +01:00
Viktor Mihajlovski
e32eae10e5 s390/sysinfo: show partition extended name and UUID if available
Extract extended name and UUID from SYSIB 2.2.2 data.
As the code to convert the raw extended name into printable format
can be reused by stsi_2_2_2 we're moving the conversion code into a
separate function convert_ext_name.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 12:29:47 +01:00
Heiko Carstens
8c91058022 s390/numa: establish cpu to node mapping early
Initialize the cpu topology and therefore also the cpu to node mapping
much earlier. Fixes this warning and subsequent crashes when using the
fake numa emulation mode on s390:

WARNING: CPU: 0 PID: 1 at include/linux/cpumask.h:121 select_task_rq+0xe6/0x1a8
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00001-ge9d867a67fd0-dirty #28
task: 00000001dd270008 ti: 00000001eccb4000 task.ti: 00000001eccb4000
Krnl PSW : 0404c00180000000 0000000000176c56 (select_task_rq+0xe6/0x1a8)
           R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Call Trace:
([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
([<000000000015d46c>] create_worker+0x174/0x1c0)
([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
([<0000000000162550>] apply_wqattrs_prepare+0x200/0x2a0)
([<000000000016266a>] apply_workqueue_attrs_locked+0x7a/0xb0)
([<0000000000162af0>] apply_workqueue_attrs+0x50/0x78)
([<000000000016441c>] __alloc_workqueue_key+0x304/0x520)
([<0000000000ee3706>] default_bdi_init+0x3e/0x70)
([<0000000000100270>] do_one_initcall+0x140/0x1d8)
([<0000000000ec9da8>] kernel_init_freeable+0x220/0x2d8)
([<0000000000984a7a>] kernel_init+0x2a/0x150)
([<00000000009913fa>] kernel_thread_starter+0x6/0xc)
([<00000000009913f4>] kernel_thread_starter+0x0/0xc)

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:25 +01:00
Heiko Carstens
30fc4ca2a8 s390/topology: use cpu_topology array instead of per cpu variable
CPU topology information like cpu to node mapping must be setup in
setup_arch already. Topology information is currently made available
with a per cpu variable; this however will not work when the
initialization will be moved to setup_arch, since the generic percpu
setup will be done much later.

Therefore convert back to a cpu_topology array.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:16 +01:00
Heiko Carstens
af51160ebd s390/smp: initialize cpu_present_mask in setup_arch
In order to be able to setup the cpu to node mappings early it is a
prerequisite to know which cpus are present. Therefore cpus must be
detected much earlier than before.

For sclp based cpu detection this requires yet another early sclp
call, since the system is not ready to use the regular interrupt and
memory allocations.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:23:07 +01:00
Heiko Carstens
ebb299a510 s390/topology: always use s390 specific sched_domain_topology_level
The s390 specific sched_domain_topology_level should always be used,
not only if the machine provides topology information. Luckily this
odd behaviour, that was by accident introduced with git commit
d05d15da18 ("s390/topology: delay initialization of topology cpu
masks") has currently no side effect.

Fixes: d05d15da18 ("s390/topology: delay initialization of topology cpumasks")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:59 +01:00
Heiko Carstens
5423145f8c s390/smp: use smp_get_base_cpu() helper function
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:50 +01:00
Martin Schwidefsky
ce4dda3f02 s390: fix machine check panic stack switch
For system damage machine checks or machine checks due to invalid PSW
fields the system will be stopped. In order to get an oops message out
before killing the system the machine check handler branches to
.Lmcck_panic, switches to the panic stack and then does the usual
machine check handling.

The switch to the panic stack is incomplete, the stack pointer in %r15
is replaced, but the pt_regs pointer in %r11 is not. The result is
a program check which will kill the system in a slightly different way.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-07 07:22:13 +01:00
Heiko Carstens
db7ad63624 s390/setup: fix memblock usage
When converting from bootmem to memblock I missed a subtle difference:
the memblock_alloc() functions return uninitialized memory, while the
memblock_virt_alloc() functions return zeroed memory.

This led to quite random early boot crashes.

Therefore use the correct version everywhere now.
Hopefully.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-02 07:36:24 +01:00
Heiko Carstens
9f88eb4df7 s390/kexec: use node 0 when re-adding crash kernel memory
When re-adding crash kernel memory within setup_resources() the
function memblock_add() is used. That function will add memory by
default to node "MAX_NUMNODES" instead of node 0, like the memory
detection code does. In case of !NUMA this will trigger this warning
when the kernel generates the vmemmap:

Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220
CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16
Call Trace:
 [<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8
 [<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50
 [<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0
 [<0000000000840136>] sparse_mem_map_populate+0x46/0x68
 [<0000000000d0c59c>] sparse_init+0x184/0x238
 [<0000000000cf45f6>] paging_init+0xbe/0xf8
 [<0000000000cf1d4a>] setup_arch+0xa02/0xae0
 [<0000000000ced75a>] start_kernel+0x72/0x450
 [<0000000000100020>] _stext+0x20/0x80

If NUMA is selected numa_setup_memory() will fix the node assignments
before the vmemmap will be populated; so this warning will only appear
if NUMA is not selected.

To fix this simply use memblock_add_node() and re-add crash kernel
memory explicitly to node 0.

Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 4e042af463 ("s390/kexec: fix crash on resize of reserved memory")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-02 07:36:17 +01:00
Heiko Carstens
9e427365af s390: convert remaining bootmem allocations to memblock
Get rid of all remaining alloc_bootmem calls and use memblock_alloc
instead everywhere.  This way we get rid of the inconsistent mixture
of alloc_bootmem and memblock_alloc usages.

Two of the alloc_bootmem_low calls within arch/s390/kernel/setup.c are
replaced with memblock_alloc calls that don't enforce that the
allocated memory is below 2GB. This restriction was never necessary.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-29 07:52:55 +01:00
Martin Schwidefsky
61aaef51cc s390: fix kernel oops for CONFIG_MARCH_Z900=y builds
The LAST_BREAK macro in entry.S uses a different instruction sequence
for CONFIG_MARCH_Z900 builds. The branch target offset to skip the
store of the last breaking event address needs to take the different
length of the code block into account.

Fixes: f8fc82b471 ("s390: move sys_call_table and last_break from thread_info to thread_struct")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-25 10:07:55 +01:00
Heiko Carstens
e1231b0e48 s390: add cma support
In order to make the cma infrastructure usable we need to add a small
architecture backend which calls dma_contiguous_reserve.
Otherwise we would end up with the cma allocator enabled, but no pool
where memory can be allocated from.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:23 +01:00
Heiko Carstens
1e16b09666 s390/cpumf: simplify psw generation
Use the psw_bits macro and simplify the code. The generated code is
also better since it doesn't contain any conditional branches anymore.

Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:22 +01:00
Heiko Carstens
3a890380e4 s390/thread_info: get rid of THREAD_ORDER define
We have the s390 specific THREAD_ORDER define and the THREAD_SIZE_ORDER
define which is also used in common code. Both have exactly the same
semantics. Therefore get rid of THREAD_ORDER and always use
THREAD_SIZE_ORDER instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-23 16:02:21 +01:00
Martin Schwidefsky
191ce9d1fd s390/time: fix clocksource steering for negative clock offsets
The TOD clock offset injected by an STP sync check can be negative.
If the resulting total tod_steering_delta gets negative the kernel
will panic.

Change the type of tod_steering_delta to a signed type.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 75c7b6f3f6 ("s390/time: steer clocksource on STP sync events")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 06:56:38 +01:00
Martin Schwidefsky
ef280c859f s390: move sys_call_table and last_break from thread_info to thread_struct
Move the last two architecture specific fields from the thread_info
structure to the thread_struct. All that is left in thread_info is
the flags field.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-15 16:48:20 +01:00
Martin Schwidefsky
90c53e6580 s390: move cputime accounting fields from thread_info to thread_struct
The user_timer and system_timer fields are used for the per-thread
cputime accounting code. The access to these values is simpler if
they are moved to the thread_struct as the task_thread_info(tsk)
indirection is not needed anymore.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:43 +01:00
Martin Schwidefsky
f8fc82b471 s390: move system_call field from thread_info to thread_struct
The system_call field in thread_info structure is used by the signal
code to store the number of the current system call while the debugger
interacts with its inferior. A better location for the system_call
field is with the other debugger related information in the
thread_struct.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:43 +01:00
Heiko Carstens
d5c352cdd0 s390: move thread_info into task_struct
This is the s390 variant of commit 15f4eae70d ("x86: Move
thread_info into task_struct").

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:41 +01:00
Martin Schwidefsky
c360192bf4 s390/preempt: move preempt_count to the lowcore
Convert s390 to use a field in the struct lowcore for the CPU
preemption count. It is a bit cheaper to access a lowcore field
compared to a thread_info variable and it removes the depencency
on a task related structure.

bloat-o-meter on the vmlinux image for the default configuration
(CONFIG_PREEMPT_NONE=y) reports a small reduction in text size:

add/remove: 0/0 grow/shrink: 18/578 up/down: 228/-5448 (-5220)

A larger improvement is achieved with the default configuration
but with CONFIG_PREEMPT=y and CONFIG_DEBUG_PREEMPT=n:

add/remove: 2/6 grow/shrink: 59/4477 up/down: 1618/-228762 (-227144)

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 16:37:40 +01:00
Paul Gortmaker
8ba8b05f57 s390: kernel: make lgr explicitly non-modular
The Makefile currently controlling compilation of this code is obj-y
meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modular infrastructure use, so that
when reading the driver there is no doubt it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We replace module.h with init.h and export.h since the file does
export some symbols.

Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-31 17:55:42 +01:00
Martin Schwidefsky
75c7b6f3f6 s390/time: steer clocksource on STP sync events
On STP sync events the TOD clock will jump in time, either forward or
backward. The TOD clocksource claims to be continuous but in case of
an STP sync with a negative offset it is not.

Subtract the offset injected by the STP sync check from the result of
the TOD clocksource to make it continuous again. Add code to drift the
offset towards zero with a fixed rate, steering 1 second in ~9 hours.

Suggested-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Martin Schwidefsky
2ace06ec0d s390/time: adjust last_update_clock at clock synchronization
The last_update_clock time stamp in the lowcore should be adjusted by
the TOD clock delta that is created by the clock synchronization.
Otherwise the calculation of the steal time will be incorrect.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Martin Schwidefsky
b1c0854d16 s390/time: refactor clock sync
Merge clock_sync_cpu into stp_sync_clock and split out the update
of the global and per-CPU clock fields into clock_sync_global
and clock_sync_local.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-28 10:09:02 +02:00
Heiko Carstens
47ece7fef4 s390/dumpstack: use pr_cont within show_stack and die
Use pr_cont instead of printk calls also within show_stack and
die in order to avoid extra line breaks.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-24 10:26:14 +02:00
Heiko Carstens
dcddba96cd s390/dumpstack: get rid of return_address again
With commit ef6000b4c6 ("Disable the __builtin_return_address()
warning globally after all)" the kernel does not warn at all again if
__builtin_return_address(n) is called with n > 0.

Besides the fact that this was a false warning on s390 anyway, due to
the always present backchain, we can now revert commit 5606330627
("s390/dumpstack: implement and use return_address()") again, to
simplify the code again.

After all I shouldn't have had return_address() implememted at all to
workaround this issue. So get rid of this again.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:33 +02:00
Heiko Carstens
4d062487f3 s390/disassambler: use pr_cont where appropriate
Just like for dumpstack use pr_cont instead of simple printk calls to
fix the output when disassembling a piece of code.

Before:
[    0.840627] Krnl Code: 000000000017d1c6: a77400f7            brc     7,17d3b4
[    0.840630]
                          000000000017d1ca: 92015000            mvi     0(%r5),1
[    0.840634]
                         #000000000017d1ce: a7f40001            brc     15,17d1d0

After:
[    0.831792] Krnl Code: 000000000017d13e: a77400f7            brc     7,17d32c
                          000000000017d142: 92015000            mvi     0(%r5),1
                         #000000000017d146: a7f40001            brc     15,17d148

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:32 +02:00
Heiko Carstens
a790634544 s390/dumpstack: use pr_cont where appropriate
Use pr_cont instead of simple printk calls when lines will be
continued. This fixes the kernel output of various lines printed on
e.g. a warning:

Before:
[    0.840604] Krnl PSW : 0404c00180000000 000000000017d1d2
[    0.840606]  (try_to_wake_up+0x382/0x5e0)

[    0.840610]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0
[    0.840611]  RI:0 EA:3

After:
[    0.831772] Krnl PSW : 0404c00180000000 000000000017d14a (try_to_wake_up+0x382/0x5e0)
[    0.831776]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:31 +02:00
Heiko Carstens
d0208639db s390/dumpstack: restore reliable indicator for call traces
Before merging all different stack tracers the call traces printed had
an indicator if an entry can be considered reliable or not.
Unreliable entries were put in braces, reliable not. Currently all
lines contain these extra braces.

This patch restores the old behaviour by adding an extra "reliable"
parameter to the callback functions. Only show_trace makes currently
use of it.

Before:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0)

After:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756]  [<0000000000161d64>] create_worker+0x174/0x1c0

Fixes: 758d39ebd3 ("s390/dumpstack: merge all four stack tracers")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-10-17 14:44:30 +02:00
Linus Torvalds
84d69848c9 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - EXPORT_SYMBOL for asm source by Al Viro.

   This does bring a regression, because genksyms no longer generates
   checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
   working on a patch to fix this.

   Plus, we are talking about functions like strcpy(), which rarely
   change prototypes.

 - Fixes for PPC fallout of the above by Stephen Rothwell and Nick
   Piggin

 - fixdep speedup by Alexey Dobriyan.

 - preparatory work by Nick Piggin to allow architectures to build with
   -ffunction-sections, -fdata-sections and --gc-sections

 - CONFIG_THIN_ARCHIVES support by Stephen Rothwell

 - fix for filenames with colons in the initramfs source by me.

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
  initramfs: Escape colons in depfile
  ppc: there is no clear_pages to export
  powerpc/64: whitelist unresolved modversions CRCs
  kbuild: -ffunction-sections fix for archs with conflicting sections
  kbuild: add arch specific post-link Makefile
  kbuild: allow archs to select link dead code/data elimination
  kbuild: allow architectures to use thin archives instead of ld -r
  kbuild: Regenerate genksyms lexer
  kbuild: genksyms fix for typeof handling
  fixdep: faster CONFIG_ search
  ia64: move exports to definitions
  sparc32: debride memcpy.S a bit
  [sparc] unify 32bit and 64bit string.h
  sparc: move exports to definitions
  ppc: move exports to definitions
  arm: move exports to definitions
  s390: move exports to definitions
  m68k: move exports to definitions
  alpha: move exports to actual definitions
  x86: move exports to actual definitions
  ...
2016-10-14 14:26:58 -07:00
Alexey Dobriyan
81243eacfa cred: simpler, 1D supplementary groups
Current supplementary groups code can massively overallocate memory and
is implemented in a way so that access to individual gid is done via 2D
array.

If number of gids is <= 32, memory allocation is more or less tolerable
(140/148 bytes).  But if it is not, code allocates full page (!)
regardless and, what's even more fun, doesn't reuse small 32-entry
array.

2D array means dependent shifts, loads and LEAs without possibility to
optimize them (gid is never known at compile time).

All of the above is unnecessary.  Switch to the usual
trailing-zero-len-array scheme.  Memory is allocated with
kmalloc/vmalloc() and only as much as needed.  Accesses become simpler
(LEA 8(gi,idx,4) or even without displacement).

Maximum number of gids is 65536 which translates to 256KB+8 bytes.  I
think kernel can handle such allocation.

On my usual desktop system with whole 9 (nine) aux groups, struct
group_info shrinks from 148 bytes to 44 bytes, yay!

Nice side effects:

 - "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,

 - fix little mess in net/ipv4/ping.c
   should have been using GROUP_AT macro but this point becomes moot,

 - aux group allocation is persistent and should be accounted as such.

Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Chris Metcalf
6727ad9e20 nmi_backtrace: generate one-line reports for idle cpus
When doing an nmi backtrace of many cores, most of which are idle, the
output is a little overwhelming and very uninformative.  Suppress
messages for cpus that are idling when they are interrupted and just
emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

We do this by grouping all the cpuidle code together into a new
.cpuidle.text section, and then checking the address of the interrupted
PC to see if it lies within that section.

This commit suitably tags x86 and tile idle routines, and only adds in
the minimal framework for other architectures.

Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Tested-by: Petr Mladek <pmladek@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Linus Torvalds
6218590bcb KVM updates for v4.9-rc1
All architectures:
   Move `make kvmconfig` stubs from x86;  use 64 bits for debugfs stats.
 
 ARM:
   Important fixes for not using an in-kernel irqchip; handle SError
   exceptions and present them to guests if appropriate; proxying of GICV
   access at EL2 if guest mappings are unsafe; GICv3 on AArch32 on ARMv8;
   preparations for GICv3 save/restore, including ABI docs; cleanups and
   a bit of optimizations.
 
 MIPS:
   A couple of fixes in preparation for supporting MIPS EVA host kernels;
   MIPS SMP host & TLB invalidation fixes.
 
 PPC:
   Fix the bug which caused guests to falsely report lockups; other minor
   fixes; a small optimization.
 
 s390:
   Lazy enablement of runtime instrumentation; up to 255 CPUs for nested
   guests; rework of machine check deliver; cleanups and fixes.
 
 x86:
   IOMMU part of AMD's AVIC for vmexit-less interrupt delivery; Hyper-V
   TSC page; per-vcpu tsc_offset in debugfs; accelerated INS/OUTS in
   nVMX; cleanups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJX9iDrAAoJEED/6hsPKofoOPoIAIUlgojkb9l2l1XVDgsXdgQL
 sRVhYSVv7/c8sk9vFImrD5ElOPZd+CEAIqFOu45+NM3cNi7gxip9yftUVs7wI5aC
 eDZRWm1E4trDZLe54ZM9ThcqZzZZiELVGMfR1+ZndUycybwyWzafpXYsYyaXp3BW
 hyHM3qVkoWO3dxBWFwHIoO/AUJrWYkRHEByKyvlC6KPxSdBPSa5c1AQwMCoE0Mo4
 K/xUj4gBn9eMelNhg4Oqu/uh49/q+dtdoP2C+sVM8bSdquD+PmIeOhPFIcuGbGFI
 B+oRpUhIuntN39gz8wInJ4/GRSeTuR2faNPxMn4E1i1u4LiuJvipcsOjPfe0a18=
 =fZRB
 -----END PGP SIGNATURE-----

Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Radim Krčmář:
 "All architectures:
   - move `make kvmconfig` stubs from x86
   - use 64 bits for debugfs stats

  ARM:
   - Important fixes for not using an in-kernel irqchip
   - handle SError exceptions and present them to guests if appropriate
   - proxying of GICV access at EL2 if guest mappings are unsafe
   - GICv3 on AArch32 on ARMv8
   - preparations for GICv3 save/restore, including ABI docs
   - cleanups and a bit of optimizations

  MIPS:
   - A couple of fixes in preparation for supporting MIPS EVA host
     kernels
   - MIPS SMP host & TLB invalidation fixes

  PPC:
   - Fix the bug which caused guests to falsely report lockups
   - other minor fixes
   - a small optimization

  s390:
   - Lazy enablement of runtime instrumentation
   - up to 255 CPUs for nested guests
   - rework of machine check deliver
   - cleanups and fixes

  x86:
   - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery
   - Hyper-V TSC page
   - per-vcpu tsc_offset in debugfs
   - accelerated INS/OUTS in nVMX
   - cleanups and fixes"

* tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits)
  KVM: MIPS: Drop dubious EntryHi optimisation
  KVM: MIPS: Invalidate TLB by regenerating ASIDs
  KVM: MIPS: Split kernel/user ASID regeneration
  KVM: MIPS: Drop other CPU ASIDs on guest MMU changes
  KVM: arm/arm64: vgic: Don't flush/sync without a working vgic
  KVM: arm64: Require in-kernel irqchip for PMU support
  KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register
  KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL
  KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie
  KVM: PPC: BookE: Fix a sanity check
  KVM: PPC: Book3S HV: Take out virtual core piggybacking code
  KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread
  ARM: gic-v3: Work around definition of gic_write_bpr1
  KVM: nVMX: Fix the NMI IDT-vectoring handling
  KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive
  KVM: nVMX: Fix reload apic access page warning
  kvmconfig: add virtio-gpu to config fragment
  config: move x86 kvm_guest.config to a common location
  arm64: KVM: Remove duplicating init code for setting VMID
  ARM: KVM: Support vgic-v3
  ...
2016-10-06 10:49:01 -07:00
Linus Torvalds
e46cae4418 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "The new features and main improvements in this merge for v4.9

   - Support for the UBSAN sanitizer

   - Set HAVE_EFFICIENT_UNALIGNED_ACCESS, it improves the code in some
     places

   - Improvements for the in-kernel fpu code, in particular the overhead
     for multiple consecutive in kernel fpu users is recuded

   - Add a SIMD implementation for the RAID6 gen and xor operations

   - Add RAID6 recovery based on the XC instruction

   - The PCI DMA flush logic has been improved to increase the speed of
     the map / unmap operations

   - The time synchronization code has seen some updates

  And bug fixes all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
  s390/con3270: fix insufficient space padding
  s390/con3270: fix use of uninitialised data
  MAINTAINERS: update DASD maintainer
  s390/cio: fix accidental interrupt enabling during resume
  s390/dasd: add missing \n to end of dev_err messages
  s390/config: Enable config options for Docker
  s390/dasd: make query host access interruptible
  s390/dasd: fix panic during offline processing
  s390/dasd: fix hanging offline processing
  s390/pci_dma: improve lazy flush for unmap
  s390/pci_dma: split dma_update_trans
  s390/pci_dma: improve map_sg
  s390/pci_dma: simplify dma address calculation
  s390/pci_dma: remove dma address range check
  iommu/s390: simplify registration of I/O address translation parameters
  s390: migrate exception table users off module.h and onto extable.h
  s390: export header for CLP ioctl
  s390/vmur: fix irq pointer dereference in int handler
  s390/dasd: add missing KOBJ_CHANGE event for unformatted devices
  s390: enable UBSAN
  ...
2016-10-04 14:05:52 -07:00
Paul Gortmaker
dcc096c540 s390: migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

The additions of uaccess.h are to deal with implict includes like:

arch/s390/kernel/traps.c: In function 'do_report_trap':
arch/s390/kernel/traps.c:56:4: error: implicit declaration of function 'extable_fixup' [-Werror=implicit-function-declaration]
arch/s390/kernel/traps.c: In function 'illegal_op':
arch/s390/kernel/traps.c:173:3: error: implicit declaration of function 'get_user' [-Werror=implicit-function-declaration]

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:38 +02:00
Christian Borntraeger
c42d8c7dbe s390: enable UBSAN
This enables UBSAN for s390. We have to disable the null sanitizer
as s390 code does access memory via a null pointer (the prefix page).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:23 +02:00
Masahiro Yamada
f296190e41 s390/crashdump: use list_first_entry_or_null
The combo of list_empty() check and return list_first_entry()
can be replaced with list_first_entry_or_null().

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-20 14:26:04 +02:00
Ingo Molnar
d4b80afbba Merge branch 'linus' into x86/asm, to pick up recent fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-15 08:24:53 +02:00
David Hildenbrand
8953fb08ab KVM: s390: write external damage code on machine checks
Let's also write the external damage code already provided by
struct kvm_s390_mchk_info.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:52 +02:00
Martin Schwidefsky
8f149ea6e9 s390/nmi: improve revalidation of fpu / vector registers
The machine check handler will do one of two things if the floating-point
control, a floating point register or a vector register can not be
revalidated:
1) if the PSW indicates user mode the process is terminated
2) if the PSW indicates kernel mode the system is stopped

To unconditionally stop the system for 2) is incorrect.

There are three possible outcomes if the floating-point control, a
floating point register or a vector registers can not be revalidated:
1) The kernel is inside a kernel_fpu_begin/kernel_fpu_end block and
   needs the register. The system is stopped.
2) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_CPU bit
   is not set. The user space process needs the register and is killed.
3) No active kernel_fpu_begin/kernel_fpu_end block and the CIF_FPU bit
   is set. Neither the kernel nor the user space process needs the
   lost register. Just revalidate it and continue.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:03 +02:00
Martin Schwidefsky
7f79695cc1 s390/fpu: improve kernel_fpu_[begin|end]
In case of nested user of the FPU or vector registers in the kernel
the current code uses the mask of the FPU/vector registers of the
previous contexts to decide which registers to save and restore.
E.g. if the previous context used KERNEL_VXR_V0V7 and the next
context wants to use KERNEL_VXR_V24V31 the first 8 vector registers
are stored to the FPU state structure. But this is not necessary
as the next context does not use these registers.

Rework the FPU/vector register save and restore code. The new code
does a few things differently:
1) A lowcore field is used instead of a per-cpu variable.
2) The kernel_fpu_end function now has two parameters just like
   kernel_fpu_begin. The register flags are required by both
   functions to save / restore the minimal register set.
3) The inline functions kernel_fpu_begin/kernel_fpu_end now do the
   update of the register masks. If the user space FPU registers
   have already been stored neither save_fpu_regs nor the
   __kernel_fpu_begin/__kernel_fpu_end functions have to be called
   for the first context. In this case kernel_fpu_begin adds 7
   instructions and kernel_fpu_end adds 4 instructions.
3) The inline assemblies in __kernel_fpu_begin / __kernel_fpu_end
   to save / restore the vector registers are simplified a bit.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:05:01 +02:00
David Hildenbrand
67f03de5f0 s390/time: avoid races when updating tb_update_count
The increment might not be atomic and we're not holding the
timekeeper_lock. Therefore we might lose an update to count, resulting in
VDSO being trapped in a loop. As other archs also simply update the
values and count doesn't seem to have an impact on reloading of these
values in VDSO code, let's just remove the update of tb_update_count.

Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:58 +02:00
David Hildenbrand
0c00b1e00b s390/time: fixup the clock comparator on all cpus
By leaving fixup_cc unset, only the clock comparator of the cpu actually
doing the sync is fixed up until now.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:56 +02:00
David Hildenbrand
ca64f63901 s390/time: cleanup etr leftovers
There are still some etr leftovers and wrong comments, let's clean that up.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:54 +02:00
David Hildenbrand
41ad022039 s390/time: simplify stp time syncs
The way we call do_adjtimex() today is broken. It has 0 effect, as
ADJ_OFFSET_SINGLESHOT (0x0001) in the kernel maps to !ADJ_ADJTIME
(in contrast to user space where it maps to  ADJ_OFFSET_SINGLESHOT |
ADJ_ADJTIME - 0x8001). !ADJ_ADJTIME will silently ignore all adjustments
without STA_PLL being active. We could switch to ADJ_ADJTIME or turn
STA_PLL on, but still we would run into some problems:

- Even when switching to nanoseconds, we lose accuracy.
- Successive calls to do_adjtimex() will simply overwrite any leftovers
  from the previous call (if not fully handled)
- Anything that NTP does using the sysctl heavily interferes with our
  use.
- !ADJ_ADJTIME will silently round stuff > or < than 0.5 seconds

Reusing do_adjtimex() here just feels wrong. The whole STP synchronization
works right now *somehow* only, as do_adjtimex() does nothing and our
TOD clock jumps in time, although it shouldn't. This is especially bad
as the clock could jump backwards in time. We will have to find another
way to fix this up.

As leap seconds are also not properly handled yet, let's just get rid of
all this complex logic altogether and use the correct clock_delta for
fixing up the clock comparator and keeping the sched_clock monotonic.

This change should have 0 effect on the current STP mechanism. Once we
know how to best handle sync events and leap second updates, we'll start
with a fresh implementation.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-29 11:04:52 +02:00