Pull syscall updates from Ingo Molnar:
"Improve the security of set_fs(): we now check the address limit on a
number of key platforms (x86, arm, arm64) before returning to
user-space - without adding overhead to the typical system call fast
path"
* 'x86-syscall-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/syscalls: Check address limit on user-mode return
arm/syscalls: Check address limit on user-mode return
x86/syscalls: Check address limit on user-mode return
Pull uacess-unaligned removal from Al Viro:
"That stuff had just one user, and an exotic one, at that - binfmt_flat
on arm and m68k"
* 'work.uaccess-unaligned' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
kill {__,}{get,put}_user_unaligned()
binfmt_flat: flat_{get,put}_addr_from_rp() should be able to fail
Ensure the address limit is a user-mode segment before returning to
user-mode. Otherwise a process can corrupt kernel-mode memory and
elevate privileges [1].
The set_fs function sets the TIF_SETFS flag to force a slow path on
return. In the slow path, the address limit is checked to be USER_DS if
needed.
The TIF_SETFS flag is added to _TIF_WORK_MASK shifting _TIF_SYSCALL_WORK
for arm instruction immediate support. The global work mask is too big
to used on a single instruction so adapt ret_fast_syscall.
[1] https://bugs.chromium.org/p/project-zero/issues/detail?id=990
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Drewry <wad@chromium.org>
Cc: linux-api@vger.kernel.org
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/20170615011203.144108-2-thgarnie@google.com
In commit 76624175dc ("arm64: uaccess: consistently check object sizes"),
the object size checks are moved outside the access_ok() so that bad
destinations are detected before hitting the "memset(dest, 0, size)" in the
copy_from_user() failure path.
This makes the same change for arm, with attention given to possibly
extracting the uaccess routines into a common header file for all
architectures in the future.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Remove the code duplication between put_user() and __put_user(). The
code which selected the implementation based upon the pointer size, and
declared the local variable to hold the value to be put are common to
both implementations.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since commit 8c56cc8be5 ("ARM: 7449/1: use generic strnlen_user and
strncpy_from_user functions"), the definition of __addr_ok() has been
languishing unused; eradicate the sucker.
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The uaccess_with_memcpy() code is currently incompatible with the SW
PAN code: it takes locks within the region that we've changed the DACR,
potentially sleeping as a result. As we do not save and restore the
DACR across co-operative sleep events, can lead to an incorrect DACR
value later in this code path.
Reported-by: Peter Rosin <peda@axentia.se>
Tested-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a software-based implementation of the priviledged no access
support found in ARMv8.1.
Userspace pages are mapped using a different domain number from the
kernel and IO mappings. If we switch the user domain to "no access"
when we enter the kernel, we can prevent the kernel from touching
userspace.
However, the kernel needs to be able to access userspace via the
various user accessor functions. With the wrapping in the previous
patch, we can temporarily enable access when the kernel needs user
access, and re-disable it afterwards.
This allows us to trap non-intended accesses to userspace, eg, caused
by an inadvertent dereference of the LIST_POISON* values, which, with
appropriate user mappings setup, can be made to succeed. This in turn
can allow use-after-free bugs to be further exploited than would
otherwise be possible.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide uaccess_save_and_enable() and uaccess_restore() to permit
control of userspace visibility to the kernel, and hook these into
the appropriate places in the kernel where we need to access
userspace.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The user assembly for byte and word accesses was virtually identical.
Rather than duplicating this, use a macro instead.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This moves all fixup snippets to the .text.fixup section, which is
a special section that gets emitted along with the .text section
for each input object file, i.e., the snippets are kept much closer
to the code they refer to, which helps prevent linker failure on
large kernels.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
While working on arch/arm/include/asm/uaccess.h, I noticed
that some macros within this header are made harder to read because they
violate a coding style rule: space is missing after comma.
Fix it up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio wants to write bitwise types to userspace using put_user.
At the moment this triggers sparse errors, since the value is passed
through an integer.
For example:
__le32 __user *p;
__le32 x;
put_user(x, p);
is safe, but currently triggers a sparse warning.
Fix that up using __force.
Note: this does not suppress any useful sparse checks since caller
assigns x to typeof(*p), which in turn forces all the necessary type
checks.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
e38361d 'ARM: 8091/2: add get_user() support for 8 byte types' commit
broke V7 BE get_user call when target var size is 64 bit, but '*ptr' size
is 32 bit or smaller. e38361d changed type of __r2 from 'register
unsigned long' to 'register typeof(x) __r2 asm("r2")' i.e before the change
even when target variable size was 64 bit, __r2 was still 32 bit.
But after e38361d commit, for target var of 64 bit size, __r2 became 64
bit and now it should occupy 2 registers r2, and r3. The issue in BE case
that r3 register is least significant word of __r2 and r2 register is most
significant word of __r2. But __get_user_4 still copies result into r2 (most
significant word of __r2). Subsequent code copies from __r2 into x, but
for situation described it will pick up only garbage from r3 register.
Special __get_user_64t_(124) functions are introduced. They are similar to
corresponding __get_user_(124) function but result stored in r3 register
(lsw in case of 64 bit __r2 in BE image). Those function are used by
get_user macro in case of BE and target var size is 64bit.
Also changed __get_user_lo8 name into __get_user_32t_8 to get consistent
naming accross all cases.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Recent contributions, including to DRM and binder, introduce 64-bit
values in their interfaces. A common motivation for this is to allow
the same ABI for 32- and 64-bit userspaces (and therefore also a shared
ABI for 32/64 hybrid userspaces). Anyhow, the developers would like to
avoid gotchas like having to use copy_from_user().
This feature is already implemented on x86-32 and the majority of other
32-bit architectures. The current list of get_user_8 hold out
architectures are: arm, avr32, blackfin, m32r, metag, microblaze,
mn10300, sh.
Credit:
My name sits rather uneasily at the top of this patch. The v1 and
v2 versions of the patch were written by Rob Clark and to produce v4
I mostly copied code from Russell King and H. Peter Anvin. However I
have mangled the patch sufficiently that *blame* is rightfully mine
even if credit should more widely shared.
Changelog:
v5: updated to use the ret macro (requested by Russell King)
v4: remove an inlined add on big endian systems (spotted by Russell King),
used __ARMEB__ rather than BIG_ENDIAN (to match rest of file),
cleared r3 on EFAULT during __get_user_8.
v3: fix a couple of checkpatch issues
v2: pass correct size to check_uaccess, and better handling of narrowing
double word read with __get_user_xb() (Russell King's suggestion)
v1: original
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With CONFIG_MMU=y get_fs() returns current_thread_info()->addr_limit
which is initialized as USER_DS (which in turn is defined to TASK_SIZE)
for userspace processes. At least theoretically
current_thread_info()->addr_limit is changable by set_fs() to a
different limit, so checking for KERNEL_DS is more robust.
With !CONFIG_MMU get_fs returns KERNEL_DS. To see what the old variant
did you'd have to find out that USER_DS == KERNEL_DS which isn't needed
any more with the variant this patch introduces. So it's a bit easier to
understand, too.
Also if the limit was changed this limit should be returned, not
TASK_SIZE.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
According to arm procedure call standart r2 register is call-cloberred.
So after the result of x expression was put into r2 any following
function call in p may overwrite r2. To fix this, the result of p
expression must be saved to the temporary variable before the
assigment x expression to __r2.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Now that we select HAVE_EFFICIENT_UNALIGNED_ACCESS for ARMv6+ CPUs,
replace the __LINUX_ARM_ARCH__ check in uaccess.h with the new symbol.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
BTRFS is now relying on those since v3.12-rc1.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On NOMMU ARM, the __addr_ok() and __range_ok() macros do not evaluate
their arguments, which may lead to harmless build warnings in some
code where the variables are not used otherwise. Adding a cast to void
gets rid of the warning and does not make any semantic changes.
Without this patch, building at91x40_defconfig results in:
fs/read_write.c: In function 'rw_copy_check_uvector':
fs/read_write.c:684:9: warning: unused variable 'buf' [-Wunused-variable]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
The user access functions may generate a fault, resulting in invocation
of a handler that may sleep.
This patch annotates the accessors with might_fault() so that we print a
warning if they are invoked from atomic context and help lockdep keep
track of mmap_sem.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The {get,put}_user macros don't perform range checking on the provided
__user address when !CPU_HAS_DOMAINS.
This patch reworks the out-of-line assembly accessors to check the user
address against a specified limit, returning -EFAULT if is is out of
range.
[will: changed get_user register allocation to match put_user]
[rmk: fixed building on older ARM architectures]
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch implements the word-at-a-time interface for ARM using the
same algorithm as x86. We use the fls macro from ARMv5 onwards, where
we have a clz instruction available which saves us a mov instruction
when targetting Thumb-2. For older CPUs, we use the magic 0x0ff0001
constant. Big-endian configurations make use of the implementation from
asm-generic.
With this implemented, we can replace our byte-at-a-time strnlen_user
and strncpy_from_user functions with the optimised generic versions.
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Disintegrate asm/system.h for ARM.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Russell King <linux@arm.linux.org.uk>
cc: linux-arm-kernel@lists.infradead.org
This macro is used to generate unprivileged accesses (LDRT/STRT) to user
space.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch removes the domain switching functionality via the set_fs and
__switch_to functions on cores that have a TLS register.
Currently, the ioremap and vmalloc areas share the same level 1 page
tables and therefore have the same domain (DOMAIN_KERNEL). When the
kernel domain is modified from Client to Manager (via the __set_fs or in
the __switch_to function), the XN (eXecute Never) bit is overridden and
newer CPUs can speculatively prefetch the ioremap'ed memory.
Linux performs the kernel domain switching to allow user-specific
functions (copy_to/from_user, get/put_user etc.) to access kernel
memory. In order for these functions to work with the kernel domain set
to Client, the patch modifies the LDRT/STRT and related instructions to
the LDR/STR ones.
The user pages access rights are also modified for kernel read-only
access rather than read/write so that the copy-on-write mechanism still
works. CPU_USE_DOMAINS gets disabled only if the hardware has a TLS register
(CPU_32v6K is defined) since writing the TLS value to the high vectors page
isn't possible.
The user addresses passed to the kernel are checked by the access_ok()
function so that they do not point to the kernel space.
Tested-by: Anton Vorontsov <cbouatmailru@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
/tmp/ccJ3ssZW.s: Assembler messages:
/tmp/ccJ3ssZW.s:1952: Error: can't resolve `.text' {.text section} - `.LFB1077'
This is caused because:
.section .data
.section .text
.section .text
.previous
does not return us to the .text section, but the .data section; this
makes use of .previous dangerous if the ordering of previous sections
is not known.
Fix up the other users of .previous; .pushsection and .popsection are
a safer pairing to use than .section and .previous.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This allows for optional alternative implementations of __copy_to_user
and __clear_user, with a possible runtime fallback to the standard
version when the alternative provides no gain over that standard
version. This is done by making the standard __copy_to_user into a weak
alias for the symbol __copy_to_user_std. Same thing for __clear_user.
Those two functions are particularly good candidates to have alternative
implementations for, since they rely on the STRT instruction which has
lower performances than STM instructions on some CPU cores such as
the ARM1176 and Marvell Feroceon.
Signed-off-by: Nicolas Pitre <nico@marvell.com>
As suggested by Andrew Morton, remove memzero() - it's not supported
on other architectures so use of it is a potential build breaking bug.
Since the compiler optimizes memset(x,0,n) to __memzero() perfectly
well, we don't miss out on the underlying benefits of memzero().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The post-index immediate value is optional if it is 0 and this patch
removes it. The reason is to allow such instructions to compile to
Thumb-2 where only pre-indexed LDRT/STRT instructions are allowed.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>