Pull x86 asm updates from Ingo Molnar:
"This is another big update. Main changes are:
- lots of x86 system call (and other traps/exceptions) entry code
enhancements. In particular the complex parts of the 64-bit entry
code have been migrated to C code as well, and a number of dusty
corners have been refreshed. (Andy Lutomirski)
- vDSO special mapping robustification and general cleanups (Andy
Lutomirski)
- cpufeature refactoring, cleanups and speedups (Borislav Petkov)
- lots of other changes ..."
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
x86/cpufeature: Enable new AVX-512 features
x86/entry/traps: Show unhandled signal for i386 in do_trap()
x86/entry: Call enter_from_user_mode() with IRQs off
x86/entry/32: Change INT80 to be an interrupt gate
x86/entry: Improve system call entry comments
x86/entry: Remove TIF_SINGLESTEP entry work
x86/entry/32: Add and check a stack canary for the SYSENTER stack
x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup
x86/entry: Only allocate space for tss_struct::SYSENTER_stack if needed
x86/entry: Vastly simplify SYSENTER TF (single-step) handling
x86/entry/traps: Clear DR6 early in do_debug() and improve the comment
x86/entry/traps: Clear TIF_BLOCKSTEP on all debug exceptions
x86/entry/32: Restore FLAGS on SYSEXIT
x86/entry/32: Filter NT and speed up AC filtering in SYSENTER
x86/entry/compat: In SYSENTER, sink AC clearing below the existing FLAGS test
selftests/x86: In syscall_nt, test NT|TF as well
x86/asm-offsets: Remove PARAVIRT_enabled
x86/entry/32: Introduce and use X86_BUG_ESPFIX instead of paravirt_enabled
uprobes: __create_xol_area() must nullify xol_mapping.fault
x86/cpufeature: Create a new synthetic cpu capability for machine check recovery
...
Code which runs outside the kernel's normal mode of operation often does
unusual things which can cause a static analysis tool like objtool to
emit false positive warnings:
- boot image
- vdso image
- relocation
- realmode
- efi
- head
- purgatory
- modpost
Set OBJECT_FILES_NON_STANDARD for their related files and directories,
which will tell objtool to skip checking them. It's ok to skip them
because they don't affect runtime stack traces.
Also skip the following code which does the right thing with respect to
frame pointers, but is too "special" to be validated by a tool:
- entry
- mcount
Also skip the test_nx module because it modifies its exception handling
table at runtime, which objtool can't understand. Fortunately it's
just a test module so it doesn't matter much.
Currently objtool is the only user of OBJECT_FILES_NON_STANDARD, but it
might eventually be useful for other tools.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris J Arges <chris.j.arges@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/366c080e3844e8a5b6a0327dc7e8c2b90ca3baeb.1456719558.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It makes me uncomfortable that even modern systems grant every
process direct read access to the HPET.
While fixing this for real without regressing anything is a mess
(unmapping the HPET is tricky because we don't adequately track
all the mappings), we can do almost as well by tracking which
vclocks have ever been used and only allowing pages associated
with used vclocks to be faulted in.
This will cause rogue programs that try to peek at the HPET to
get SIGBUS instead on most systems.
We can't restrict faults to vclock pages that are associated
with the currently selected vclock due to a race: a process
could start to access the HPET for the first time and race
against a switch away from the HPET as the current clocksource.
We can't segfault the process trying to peek at the HPET in this
case, even though the process isn't going to do anything useful
with the data.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e79d06295625c02512277737ab55085a498ac5d8.1451446564.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were:
- vDSO and asm entry improvements (Andy Lutomirski)
- Xen paravirt entry enhancements (Boris Ostrovsky)
- asm entry labels enhancement (Borislav Petkov)
- and other misc changes (Thomas Gleixner, me)"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vsdo: Fix build on PARAVIRT_CLOCK=y, KVM_GUEST=n
Revert "x86/kvm: On KVM re-enable (e.g. after suspend), update clocks"
x86/entry/64_compat: Make labels local
x86/platform/uv: Include clocksource.h for clocksource_touch_watchdog()
x86/vdso: Enable vdso pvclock access on all vdso variants
x86/vdso: Remove pvclock fixmap machinery
x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap
x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
x86/kvm: On KVM re-enable (e.g. after suspend), update clocks
x86/entry/64: Bypass enter_from_user_mode on non-context-tracking boots
x86/asm: Add asm macros for static keys/jump labels
x86/asm: Error out if asm/jump_label.h is included inappropriately
context_tracking: Switch to new static_branch API
x86/entry, x86/paravirt: Remove the unused usergs_sysret32 PV op
x86/paravirt: Remove the unused irq_enable_sysexit pv op
x86/xen: Avoid fast syscall path for Xen PV guests
Pull x86 mm changes from Ingo Molnar:
"The main changes are: continued PAT work by Toshi Kani, plus a new
boot time warning about insecure RWX kernel mappings, by Stephen
Smalley.
The new CONFIG_DEBUG_WX=y warning is marked default-y if
CONFIG_DEBUG_RODATA=y is already eanbled, as a special exception, as
these bugs are hard to notice and this check already found several
live bugs"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Warn on W^X mappings
x86/mm: Fix no-change case in try_preserve_large_page()
x86/mm: Fix __split_large_page() to handle large PAT bit
x86/mm: Fix try_preserve_large_page() to handle large PAT bit
x86/mm: Fix gup_huge_p?d() to handle large PAT bit
x86/mm: Fix slow_virt_to_phys() to handle large PAT bit
x86/mm: Fix page table dump to show PAT bit
x86/asm: Add pud_pgprot() and pmd_pgprot()
x86/asm: Fix pud/pmd interfaces to handle large PAT bit
x86/asm: Add pud/pmd mask interfaces to handle large PAT bit
x86/asm: Move PUD_PAGE macros to page_types.h
x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSO
The goal is to integrate the SYSENTER and SYSCALL32 entry paths
with the INT80 path. SYSENTER clobbers ESP and EIP. SYSCALL32
clobbers ECX (and, invisibly, R11). SYSRETL (long mode to
compat mode) clobbers ECX and, invisibly, R11. SYSEXIT (which
we only need for native 32-bit) clobbers ECX and EDX.
This means that we'll need to provide ESP to the kernel in a
register (I chose ECX, since it's only needed for SYSENTER) and
we need to provide the args that normally live in ECX and EDX in
memory.
The epilogue needs to restore ECX and EDX, since user code
relies on regs being preserved.
We don't need to do anything special about EIP, since the kernel
already knows where we are. The kernel will eventually need to
know where int $0x80 lands, so add a vdso_image entry for it.
The only user-visible effect of this code is that ptrace-induced
changes to ECX and EDX during fast syscalls will be lost. This
is already the case for the SYSENTER path.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/b860925adbee2d2627a0671fbfe23a7fd04127f8.1444091584.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In case of CONFIG_X86_64, vdso32/vclock_gettime.c fakes a 32-bit
non-PAE kernel configuration by re-defining it to CONFIG_X86_32.
However, it does not re-define CONFIG_PGTABLE_LEVELS leaving it
as 4 levels.
This mismatch leads <asm/pgtable_type.h> to NOT include <asm-generic/
pgtable-nopud.h> and <asm-generic/pgtable-nopmd.h>, which will cause
compile errors when a later patch enhances <asm/pgtable_type.h> to
use PUD_SHIFT and PMD_SHIFT. These -nopud & -nopmd headers define
these SHIFTs for the 32-bit non-PAE kernel.
Fix it by re-defining CONFIG_PGTABLE_LEVELS to 2 levels.
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Robert Elliot <elliott@hpe.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1442514264-12475-2-git-send-email-toshi.kani@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>