git commit cd2934a3 moved the flush_tlb_range() within
__unmap_hugepage_range() inside the mm->page_table_lock, which
triggered a deadlock in s390 tlb flushing code. __tlb_flush_mm_cond()
also tries to acquire the mm->page_table_lock, but that is not needed
because all callers already have mm->mmap_sem or mm->page_table_lock,
so it can be safely removed to fix the deadlock.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Replace __s390x__ with CONFIG_64BIT in all places that are not exported
to userspace or guarded with #ifdef __KERNEL__.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this compiler error for !CONFIG_SMP:
CC arch/s390/mm/pgtable.o
arch/s390/mm/pgtable.c: In function ‘gmap_flush_tlb’:
arch/s390/mm/pgtable.c:202:3: error: implicit declaration of function ‘__tlb_flush_global’ [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add code that allows KVM to control the virtual memory layout that
is seen by a guest. The guest address space uses a second page table
that shares the last level pte-tables with the process page table.
If a page is unmapped from the process page table it is automatically
unmapped from the guest page table as well.
The guest address space mapping starts out empty, KVM can map any
individual 1MB segments from the process virtual memory to any 1MB
aligned location in the guest virtual memory. If a target segment in
the process virtual memory does not exist or is unmapped while a
guest mapping exists the desired target address is stored as an
invalid segment table entry in the guest page table.
The population of the guest page table is fault driven.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The noexec support on s390 does not rely on a bit in the page table
entry but utilizes the secondary space mode to distinguish between
memory accesses for instructions vs. data. The noexec code relies
on the assumption that the cpu will always use the secondary space
page table for data accesses while it is running in the secondary
space mode. Up to the z9-109 class machines this has been the case.
Unfortunately this is not true anymore with z10 and later machines.
The load-relative-long instructions lrl, lgrl and lgfrl access the
memory operand using the same addressing-space mode that has been
used to fetch the instruction.
This breaks the noexec mode for all user space binaries compiled
with march=z10 or later. The only option is to remove the current
noexec support.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The tlb flushing code uses the mm_users field of the mm_struct to
decide if each page table entry needs to be flushed individually with
IPTE or if a global flush for the mm_struct is sufficient after all page
table updates have been done. The comment for mm_users says "How many
users with user space?" but the /proc code increases mm_users after it
found the process structure by pid without creating a new user process.
Which makes mm_users useless for the decision between the two tlb
flusing methods. The current code can be confused to not flush tlb
entries by a concurrent access to /proc files if e.g. a fork is in
progres. The solution for this problem is to make the tlb flushing
logic independent from the mm_users field.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Makes code futureproof against the impending change to mm->cpu_vm_mask.
It's also a chance to use the new cpumask_ ops which take a pointer
(the older ones are deprecated, but there's no hurry for arch code).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>