Use new return type vm_fault_t for fault handler
in struct vm_operations_struct.
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
- add base infrastructure for socket mediation. ABI bump and
additional checks to ensure only v8 compliant policy uses
socket af mediation.
- improve and cleanup dfa verification
- improve profile attachment logic
- improve overlapping expression handling
- add the xattr matching to the attachment logic
- improve signal mediation handling with stacked labels
- improve handling of no_new_privs in a label stack
+ Cleanups and changes
- use dfa to parse string split
- bounded version of label_parse
- proper line wrap nulldfa.in
- split context out into task and cred naming to better match usage
- simplify code in aafs
+ Bug fixes
- fix display of .ns_name for containers
- fix resource audit messages when auditing peer
- fix logging of the existence test for signals
- fix resource audit messages when auditing peer
- fix display of .ns_name for containers
- fix an error code in verify_table_headers()
- fix memory leak on buffer on error exit path
- fix error returns checks by making size a ssize_t
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJazWpMAAoJEAUvNnAY1cPY2wwP/2ZmzyITY7xW3Cpz8ynKOTyZ
hD2ahIjLWxcQwMZUoHXIa4TTK5EThlhKcTa0+sdMJGsIsRyXLoyBcd/VST0F9ZrA
OWn1uL2ASeNroNw+88P6qU03+cT2eEohM3vvlNy2ud98EBiTyxB6L4VLpy3xDKAd
zblojxqegRO7WRfEFCR2kHmnrL0Z3oxPBahnuVitfrwO76WFUSM9EYm67Xtf4yjJ
qQ7ocGdhxiULNdceoIke11e8iNwiQyY4O+E24qVAJw66arxIByMKo+cLjeTxMbZR
z4/pVd664wiK7mW0In7bJWOfXLJHxHALpuCc82wFgiLPdfSpJzT1nx+Xjaw8DhdZ
FBoHLpHjJT3dTpYoQTjqtNdvHgXryL/OOllm+I8DPMu/nfcp8qsOru5bEXg+j/90
CRo1OqrWZhUkKHnQs12QIJS+Gt7qByQB6tDMDbjkIC71vKUWA4wnp7zLZHYd9T0L
6kZ2aWKiOXM6VRZ5V5HVLhrTajiubyBg3y3Eur4HwuGzquBmxAp1RhS8oiOpgzgW
jVI92/P2XjhnU9E2J5m+mzjh11i+D51homtz1y4vB53Ye/WLy1S0o4StDAiLfgw3
q/581V342vl6X46GlgcS5G7QeIkzFiCUe5H3t2/unCRnI+PxabwRmbaTqWq47xzQ
umwlYfok3ALSzdgnv2sT
=XhxG
-----END PGP SIGNATURE-----
Merge tag 'apparmor-pr-2018-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"Features:
- add base infrastructure for socket mediation. ABI bump and
additional checks to ensure only v8 compliant policy uses socket af
mediation.
- improve and cleanup dfa verification
- improve profile attachment logic
- improve overlapping expression handling
- add the xattr matching to the attachment logic
- improve signal mediation handling with stacked labels
- improve handling of no_new_privs in a label stack
Cleanups and changes:
- use dfa to parse string split
- bounded version of label_parse
- proper line wrap nulldfa.in
- split context out into task and cred naming to better match usage
- simplify code in aafs
Bug fixes:
- fix display of .ns_name for containers
- fix resource audit messages when auditing peer
- fix logging of the existence test for signals
- fix resource audit messages when auditing peer
- fix display of .ns_name for containers
- fix an error code in verify_table_headers()
- fix memory leak on buffer on error exit path
- fix error returns checks by making size a ssize_t"
* tag 'apparmor-pr-2018-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (36 commits)
apparmor: fix memory leak on buffer on error exit path
apparmor: fix dangling symlinks to policy rawdata after replacement
apparmor: Fix an error code in verify_table_headers()
apparmor: fix error returns checks by making size a ssize_t
apparmor: update MAINTAINERS file git and wiki locations
apparmor: remove POLICY_MEDIATES_SAFE
apparmor: add base infastructure for socket mediation
apparmor: improve overlapping domain attachment resolution
apparmor: convert attaching profiles via xattrs to use dfa matching
apparmor: Add support for attaching profiles via xattr, presence and value
apparmor: cleanup: simplify code to get ns symlink name
apparmor: cleanup create_aafs() error path
apparmor: dfa split verification of table headers
apparmor: dfa add support for state differential encoding
apparmor: dfa move character match into a macro
apparmor: update domain transitions that are subsets of confinement at nnp
apparmor: move context.h to cred.h
apparmor: move task related defines and fns to task.X files
apparmor: cleanup, drop unused fn __aa_task_is_confined()
apparmor: cleanup fixup description of aa_replace_profiles
...
There is a permission discrepancy when consulting msq ipc object
metadata between /proc/sysvipc/msg (0444) and the MSG_STAT shmctl
command. The later does permission checks for the object vs S_IRUGO.
As such there can be cases where EACCESS is returned via syscall but the
info is displayed anyways in the procfs files.
While this might have security implications via info leaking (albeit no
writing to the msq metadata), this behavior goes way back and showing
all the objects regardless of the permissions was most likely an
overlook - so we are stuck with it. Furthermore, modifying either the
syscall or the procfs file can cause userspace programs to break (ie
ipcs). Some applications require getting the procfs info (without root
privileges) and can be rather slow in comparison with a syscall -- up to
500x in some reported cases for shm.
This patch introduces a new MSG_STAT_ANY command such that the msq ipc
object permissions are ignored, and only audited instead. In addition,
I've left the lsm security hook checks in place, as if some policy can
block the call, then the user has no other choice than just parsing the
procfs file.
Link: http://lkml.kernel.org/r/20180215162458.10059-4-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Robert Kettler <robert.kettler@outlook.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There is a permission discrepancy when consulting shm ipc object
metadata between /proc/sysvipc/sem (0444) and the SEM_STAT semctl
command. The later does permission checks for the object vs S_IRUGO.
As such there can be cases where EACCESS is returned via syscall but the
info is displayed anyways in the procfs files.
While this might have security implications via info leaking (albeit no
writing to the sma metadata), this behavior goes way back and showing
all the objects regardless of the permissions was most likely an
overlook - so we are stuck with it. Furthermore, modifying either the
syscall or the procfs file can cause userspace programs to break (ie
ipcs). Some applications require getting the procfs info (without root
privileges) and can be rather slow in comparison with a syscall -- up to
500x in some reported cases for shm.
This patch introduces a new SEM_STAT_ANY command such that the sem ipc
object permissions are ignored, and only audited instead. In addition,
I've left the lsm security hook checks in place, as if some policy can
block the call, then the user has no other choice than just parsing the
procfs file.
Link: http://lkml.kernel.org/r/20180215162458.10059-3-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Robert Kettler <robert.kettler@outlook.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "sysvipc: introduce STAT_ANY commands", v2.
The following patches adds the discussed (see [1]) new command for shm
as well as for sems and msq as they are subject to the same
discrepancies for ipc object permission checks between the syscall and
via procfs. These new commands are justified in that (1) we are stuck
with this semantics as changing syscall and procfs can break userland;
and (2) some users can benefit from performance (for large amounts of
shm segments, for example) from not having to parse the procfs
interface.
Once merged, I will submit the necesary manpage updates. But I'm thinking
something like:
: diff --git a/man2/shmctl.2 b/man2/shmctl.2
: index 7bb503999941..bb00bbe21a57 100644
: --- a/man2/shmctl.2
: +++ b/man2/shmctl.2
: @@ -41,6 +41,7 @@
: .\" 2005-04-25, mtk -- noted aberrant Linux behavior w.r.t. new
: .\" attaches to a segment that has already been marked for deletion.
: .\" 2005-08-02, mtk: Added IPC_INFO, SHM_INFO, SHM_STAT descriptions.
: +.\" 2018-02-13, dbueso: Added SHM_STAT_ANY description.
: .\"
: .TH SHMCTL 2 2017-09-15 "Linux" "Linux Programmer's Manual"
: .SH NAME
: @@ -242,6 +243,18 @@ However, the
: argument is not a segment identifier, but instead an index into
: the kernel's internal array that maintains information about
: all shared memory segments on the system.
: +.TP
: +.BR SHM_STAT_ANY " (Linux-specific)"
: +Return a
: +.I shmid_ds
: +structure as for
: +.BR SHM_STAT .
: +However, the
: +.I shm_perm.mode
: +is not checked for read access for
: +.IR shmid ,
: +resembing the behaviour of
: +/proc/sysvipc/shm.
: .PP
: The caller can prevent or allow swapping of a shared
: memory segment with the following \fIcmd\fP values:
: @@ -287,7 +300,7 @@ operation returns the index of the highest used entry in the
: kernel's internal array recording information about all
: shared memory segments.
: (This information can be used with repeated
: -.B SHM_STAT
: +.B SHM_STAT/SHM_STAT_ANY
: operations to obtain information about all shared memory segments
: on the system.)
: A successful
: @@ -328,7 +341,7 @@ isn't accessible.
: \fIshmid\fP is not a valid identifier, or \fIcmd\fP
: is not a valid command.
: Or: for a
: -.B SHM_STAT
: +.B SHM_STAT/SHM_STAT_ANY
: operation, the index value specified in
: .I shmid
: referred to an array slot that is currently unused.
This patch (of 3):
There is a permission discrepancy when consulting shm ipc object metadata
between /proc/sysvipc/shm (0444) and the SHM_STAT shmctl command. The
later does permission checks for the object vs S_IRUGO. As such there can
be cases where EACCESS is returned via syscall but the info is displayed
anyways in the procfs files.
While this might have security implications via info leaking (albeit no
writing to the shm metadata), this behavior goes way back and showing all
the objects regardless of the permissions was most likely an overlook - so
we are stuck with it. Furthermore, modifying either the syscall or the
procfs file can cause userspace programs to break (ie ipcs). Some
applications require getting the procfs info (without root privileges) and
can be rather slow in comparison with a syscall -- up to 500x in some
reported cases.
This patch introduces a new SHM_STAT_ANY command such that the shm ipc
object permissions are ignored, and only audited instead. In addition,
I've left the lsm security hook checks in place, as if some policy can
block the call, then the user has no other choice than just parsing the
procfs file.
[1] https://lkml.org/lkml/2017/12/19/220
Link: http://lkml.kernel.org/r/20180215162458.10059-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Robert Kettler <robert.kettler@outlook.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Tom Zanussi's extended histogram work
This adds the synthetic events to have histograms from multiple event data
Adds triggers "onmatch" and "onmax" to call the synthetic events
Several updates to the histogram code from this
- Allow way to nest ring buffer calls in the same context
- Allow absolute time stamps in ring buffer
- Rewrite of filter code parsing based on Al Viro's suggestions
- Setting of trace_clock to global if TSC is unstable (on boot)
- Better OOM handling when allocating large ring buffers
- Added initcall tracepoints (consolidated initcall_debug code with them)
And other various fixes and clean ups
-----BEGIN PGP SIGNATURE-----
iQHIBAABCgAyFiEEPm6V/WuN2kyArTUe1a05Y9njSUkFAlrLoCAUHHJvc3RlZHRA
Z29vZG1pcy5vcmcACgkQ1a05Y9njSUks/QwAn/ky8WgfjcRdjKmBYuEwDedvm9iI
V9G5kpv5JMw5dLz4l1pS3tA3M9Lyuc5z3Shw92FTy36vdU1wxEjQgHa7viB1xk9x
KsiTyNjTsgrRd7GVHMy/8Be2RRiTRLaXKAsLCoj/c7QWzagV1P8XWlWK5mojYkh/
DrSXyg9Avkp30+sU1bvcLWnmmZUFqMxs+bWipD9uFc98USMMyeP25nrnhrj0gDTg
Q93cjXUuyVRC4lJ2YTW0GCSKhMKEw5f/ltEOT1hwScqYkCJj1EubKqS53R/9h21z
IPUrYcqLnMRu0j2ejR+UAy5Vsy3gJUrPMQb0F6hlu1DwbMd0d/9SGh1c+Sm+zorh
yftWTdCZsYrXkaOuB6V5M30X+KBwbWO0Xc9VCvgJ/IU5vMlgLSt5itTWbT/Fmfhb
ll5/RXP7zhSXRv5sdl/BP3/4dd6F8jpyKyaR2Rk2+XjBOGIq5mvqNGr4Vj9AzxW8
E0nvq7l7e0dbxZNM42gEm3cht1VUg7Zz0Y0+
=91oN
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"New features:
- Tom Zanussi's extended histogram work.
This adds the synthetic events to have histograms from multiple
event data Adds triggers "onmatch" and "onmax" to call the
synthetic events Several updates to the histogram code from this
- Allow way to nest ring buffer calls in the same context
- Allow absolute time stamps in ring buffer
- Rewrite of filter code parsing based on Al Viro's suggestions
- Setting of trace_clock to global if TSC is unstable (on boot)
- Better OOM handling when allocating large ring buffers
- Added initcall tracepoints (consolidated initcall_debug code with
them)
And other various fixes and clean ups"
* tag 'trace-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
init: Have initcall_debug still work without CONFIG_TRACEPOINTS
init, tracing: Have printk come through the trace events for initcall_debug
init, tracing: instrument security and console initcall trace events
init, tracing: Add initcall trace events
tracing: Add rcu dereference annotation for test func that touches filter->prog
tracing: Add rcu dereference annotation for filter->prog
tracing: Fixup logic inversion on setting trace_global_clock defaults
tracing: Hide global trace clock from lockdep
ring-buffer: Add set/clear_current_oom_origin() during allocations
ring-buffer: Check if memory is available before allocation
lockdep: Add print_irqtrace_events() to __warn
vsprintf: Do not preprocess non-dereferenced pointers for bprintf (%px and %pK)
tracing: Uninitialized variable in create_tracing_map_fields()
tracing: Make sure variable string fields are NULL-terminated
tracing: Add action comparisons when testing matching hist triggers
tracing: Don't add flag strings when displaying variable references
tracing: Fix display of hist trigger expressions containing timestamps
ftrace: Drop a VLA in module_exists()
tracing: Mention trace_clock=global when warning about unstable clocks
tracing: Default to using trace_global_clock if sched_clock is unstable
...
Commit 0619f0f5e3 ("selinux: wrap selinuxfs state") triggers a BUG
when SELinux is runtime-disabled (i.e. systemd or equivalent disables
SELinux before initial policy load via /sys/fs/selinux/disable based on
/etc/selinux/config SELINUX=disabled).
This does not manifest if SELinux is disabled via kernel command line
argument or if SELinux is enabled (permissive or enforcing).
Before:
SELinux: Disabled at runtime.
BUG: Dentry 000000006d77e5c7{i=17,n=null} still in use (1) [unmount of selinuxfs selinuxfs]
After:
SELinux: Disabled at runtime.
Fixes: 0619f0f5e3 ("selinux: wrap selinuxfs state")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull integrity updates from James Morris:
"A mixture of bug fixes, code cleanup, and continues to close
IMA-measurement, IMA-appraisal, and IMA-audit gaps.
Also note the addition of a new cred_getsecid LSM hook by Matthew
Garrett:
For IMA purposes, we want to be able to obtain the prepared secid
in the bprm structure before the credentials are committed. Add a
cred_getsecid hook that makes this possible.
which is used by a new CREDS_CHECK target in IMA:
In ima_bprm_check(), check with both the existing process
credentials and the credentials that will be committed when the new
process is started. This will not change behaviour unless the
system policy is extended to include CREDS_CHECK targets -
BPRM_CHECK will continue to check the same credentials that it did
previously"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
ima: Fallback to the builtin hash algorithm
ima: Add smackfs to the default appraise/measure list
evm: check for remount ro in progress before writing
ima: Improvements in ima_appraise_measurement()
ima: Simplify ima_eventsig_init()
integrity: Remove unused macro IMA_ACTION_RULE_FLAGS
ima: drop vla in ima_audit_measurement()
ima: Fix Kconfig to select TPM 2.0 CRB interface
evm: Constify *integrity_status_msg[]
evm: Move evm_hmac and evm_hash from evm_main.c to evm_crypto.c
fuse: define the filesystem as untrusted
ima: fail signature verification based on policy
ima: clear IMA_HASH
ima: re-evaluate files on privileged mounted filesystems
ima: fail file signature verification on non-init mounted filesystems
IMA: Support using new creds in appraisal policy
security: Add a cred_getsecid hook
Pull smack update from James Morris:
"One small change for Automotive Grade Linux"
* 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: Handle CGROUP2 in the same way that CGROUP
Pull general security layer updates from James Morris:
- Convert security hooks from list to hlist, a nice cleanup, saving
about 50% of space, from Sargun Dhillon.
- Only pass the cred, not the secid, to kill_pid_info_as_cred and
security_task_kill (as the secid can be determined from the cred),
from Stephen Smalley.
- Close a potential race in kernel_read_file(), by making the file
unwritable before calling the LSM check (vs after), from Kees Cook.
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: convert security hooks to use hlist
exec: Set file unwritable before LSM check
usb, signal, security: only pass the cred, not the secid, to kill_pid_info_as_cred and security_task_kill
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEcQCq365ubpQNLgrWVeRaWujKfIoFAlrD6XoUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQVeRaWujKfIpy9RAAjwhkNBNJhw1UlGggVvst8lzJBdMp
XxL7cg+1TcZkB12yrghILg+gY4j5PzY4GJo1gvllWIHsT8Ud6cQTI/AzeYR2OfZ3
mHv3gtyzmHsPGBdqhmgC7R10tpyXFXwDc3VLMtuuDiUl/seFEaJWOMYP7zj+tRil
XoOCyoV9bb1wb7vNAzQikK8yhz3fu72Y5QOODLfaYeYojMKs8Q8pMZgi68oVQUXk
SmS2mj0k2P3UqeOSk+8phJQhilm32m0tE0YnLvzAhblJLqeS2DUNnWORP1j4oQ/Q
aOOu4ZQ9PA1N7VAIGceuf2HZHhnrFzWdvggp2bxegcRSIfUZ84FuZbrj60RUz2ja
V6GmKYACnyd28TAWdnzjKEd4dc36LSPxnaj8hcrvyO2V34ozVEsvIEIJREoXRUJS
heJ9HT+VIvmguzRCIPPeC1ZYopIt8M1kTRrszigU80TuZjIP0VJHLGQn/rgRQzuO
cV5gmJ6TSGn1l54H13koBzgUCo0cAub8Nl+288qek+jLWoHnKwzLB+1HCWuyeCHt
2q6wdFfenYH0lXdIzCeC7NNHRKCrPNwkZ/32d4ZQf4cu5tAn8bOk8dSHchoAfZG8
p7N6jPPoxmi2F/GRKrTiUNZvQpyvgX3hjtJS6ljOTSYgRhjeNYeCP8U+BlOpLVQy
U4KzB9wOAngTEpo=
=p2Sh
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux updates from Paul Moore:
"A bigger than usual pull request for SELinux, 13 patches (lucky!)
along with a scary looking diffstat.
Although if you look a bit closer, excluding the usual minor
tweaks/fixes, there are really only two significant changes in this
pull request: the addition of proper SELinux access controls for SCTP
and the encapsulation of a lot of internal SELinux state.
The SCTP changes are the result of a multi-month effort (maybe even a
year or longer?) between the SELinux folks and the SCTP folks to add
proper SELinux controls. A special thanks go to Richard for seeing
this through and keeping the effort moving forward.
The state encapsulation work is a bit of janitorial work that came out
of some early work on SELinux namespacing. The question of namespacing
is still an open one, but I believe there is some real value in the
encapsulation work so we've split that out and are now sending that up
to you"
* tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: wrap AVC state
selinux: wrap selinuxfs state
selinux: fix handling of uninitialized selinux state in get_bools/classes
selinux: Update SELinux SCTP documentation
selinux: Fix ltp test connect-syscall failure
selinux: rename the {is,set}_enforcing() functions
selinux: wrap global selinux state
selinux: fix typo in selinux_netlbl_sctp_sk_clone declaration
selinux: Add SCTP support
sctp: Add LSM hooks
sctp: Add ip option support
security: Add support for SCTP security hooks
netlabel: If PF_INET6, check sk_buff ip header version
Merge updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- the v9fs maintainers have been missing for a long time. I've taken
over v9fs patch slinging.
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (116 commits)
mm,oom_reaper: check for MMF_OOM_SKIP before complaining
mm/ksm: fix interaction with THP
mm/memblock.c: cast constant ULLONG_MAX to phys_addr_t
headers: untangle kmemleak.h from mm.h
include/linux/mmdebug.h: make VM_WARN* non-rvals
mm/page_isolation.c: make start_isolate_page_range() fail if already isolated
mm: change return type to vm_fault_t
mm, oom: remove 3% bonus for CAP_SYS_ADMIN processes
mm, page_alloc: wakeup kcompactd even if kswapd cannot free more memory
kernel/fork.c: detect early free of a live mm
mm: make counting of list_lru_one::nr_items lockless
mm/swap_state.c: make bool enable_vma_readahead and swap_vma_readahead() static
block_invalidatepage(): only release page if the full page was invalidated
mm: kernel-doc: add missing parameter descriptions
mm/swap.c: remove @cold parameter description for release_pages()
mm/nommu: remove description of alloc_vm_area
zram: drop max_zpage_size and use zs_huge_class_size()
zsmalloc: introduce zs_huge_class_size()
mm: fix races between swapoff and flush dcache
fs/direct-io.c: minor cleanups in do_blockdev_direct_IO
...
Trace events have been added around the initcall functions defined in
init/main.c. But console and security have their own initcalls. This adds
the trace events associated for those initcall functions.
Link: http://lkml.kernel.org/r/1521765208.19745.2.camel@polymtl.ca
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Abderrahmane Benbachir <abderrahmane.benbachir@polymtl.ca>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Currently <linux/slab.h> #includes <linux/kmemleak.h> for no obvious
reason. It looks like it's only a convenience, so remove kmemleak.h
from slab.h and add <linux/kmemleak.h> to any users of kmemleak_* that
don't already #include it. Also remove <linux/kmemleak.h> from source
files that do not use it.
This is tested on i386 allmodconfig and x86_64 allmodconfig. It would
be good to run it through the 0day bot for other $ARCHes. I have
neither the horsepower nor the storage space for the other $ARCHes.
Update: This patch has been extensively build-tested by both the 0day
bot & kisskb/ozlabs build farms. Both of them reported 2 build failures
for which patches are included here (in v2).
[ slab.h is the second most used header file after module.h; kernel.h is
right there with slab.h. There could be some minor error in the
counting due to some #includes having comments after them and I didn't
combine all of those. ]
[akpm@linux-foundation.org: security/keys/big_key.c needs vmalloc.h, per sfr]
Link: http://lkml.kernel.org/r/e4309f98-3749-93e1-4bb7-d9501a39d015@infradead.org
Link: http://kisskb.ellerman.id.au/kisskb/head/13396/
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au> [2 build failures]
Reported-by: Fengguang Wu <fengguang.wu@intel.com> [2 build failures]
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull namespace updates from Eric Biederman:
"There was a lot of work this cycle fixing bugs that were discovered
after the merge window and getting everything ready where we can
reasonably support fully unprivileged fuse. The bug fixes you already
have and much of the unprivileged fuse work is coming in via other
trees.
Still left for fully unprivileged fuse is figuring out how to cleanly
handle .set_acl and .get_acl in the legacy case, and properly handling
of evm xattrs on unprivileged mounts.
Included in the tree is a cleanup from Alexely that replaced a linked
list with a statically allocated fix sized array for the pid caches,
which simplifies and speeds things up.
Then there is are some cleanups and fixes for the ipc namespace. The
motivation was that in reviewing other code it was discovered that
access ipc objects from different pid namespaces recorded pids in such
a way that when asked the wrong pids were returned. In the worst case
there has been a measured 30% performance impact for sysvipc
semaphores. Other test cases showed no measurable performance impact.
Manfred Spraul and Davidlohr Bueso who tend to work on sysvipc
performance both gave the nod that this is good enough.
Casey Schaufler and James Morris have given their approval to the LSM
side of the changes.
I simplified the types and the code dealing with sysvipc to pass just
kern_ipc_perm for all three types of ipc. Which reduced the header
dependencies throughout the kernel and simplified the lsm code.
Which let me work on the pid fixes without having to worry about
trivial changes causing complete kernel recompiles"
* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ipc/shm: Fix pid freeing.
ipc/shm: fix up for struct file no longer being available in shm.h
ipc/smack: Tidy up from the change in type of the ipc security hooks
ipc: Directly call the security hook in ipc_ops.associate
ipc/sem: Fix semctl(..., GETPID, ...) between pid namespaces
ipc/msg: Fix msgctl(..., IPC_STAT, ...) between pid namespaces
ipc/shm: Fix shmctl(..., IPC_STAT, ...) between pid namespaces.
ipc/util: Helpers for making the sysvipc operations pid namespace aware
ipc: Move IPCMNI from include/ipc.h into ipc/util.h
msg: Move struct msg_queue into ipc/msg.c
shm: Move struct shmid_kernel into ipc/shm.c
sem: Move struct sem and struct sem_array into ipc/sem.c
msg/security: Pass kern_ipc_perm not msg_queue into the msg_queue security hooks
shm/security: Pass kern_ipc_perm not shmid_kernel into the shm security hooks
sem/security: Pass kern_ipc_perm not sem_array into the sem security hooks
pidns: simpler allocation of pid_* caches
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-03-31
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add raw BPF tracepoint API in order to have a BPF program type that
can access kernel internal arguments of the tracepoints in their
raw form similar to kprobes based BPF programs. This infrastructure
also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which
returns an anon-inode backed fd for the tracepoint object that allows
for automatic detach of the BPF program resp. unregistering of the
tracepoint probe on fd release, from Alexei.
2) Add new BPF cgroup hooks at bind() and connect() entry in order to
allow BPF programs to reject, inspect or modify user space passed
struct sockaddr, and as well a hook at post bind time once the port
has been allocated. They are used in FB's container management engine
for implementing policy, replacing fragile LD_PRELOAD wrapper
intercepting bind() and connect() calls that only works in limited
scenarios like glibc based apps but not for other runtimes in
containerized applications, from Andrey.
3) BPF_F_INGRESS flag support has been added to sockmap programs for
their redirect helper call bringing it in line with cls_bpf based
programs. Support is added for both variants of sockmap programs,
meaning for tx ULP hooks as well as recv skb hooks, from John.
4) Various improvements on BPF side for the nfp driver, besides others
this work adds BPF map update and delete helper call support from
the datapath, JITing of 32 and 64 bit XADD instructions as well as
offload support of bpf_get_prandom_u32() call. Initial implementation
of nfp packet cache has been tackled that optimizes memory access
(see merge commit for further details), from Jakub and Jiong.
5) Removal of struct bpf_verifier_env argument from the print_bpf_insn()
API has been done in order to prepare to use print_bpf_insn() soon
out of perf tool directly. This makes the print_bpf_insn() API more
generic and pushes the env into private data. bpftool is adjusted
as well with the print_bpf_insn() argument removal, from Jiri.
6) Couple of cleanups and prep work for the upcoming BTF (BPF Type
Format). The latter will reuse the current BPF verifier log as
well, thus bpf_verifier_log() is further generalized, from Martin.
7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read
and write support has been added in similar fashion to existing
IPv6 IPV6_TCLASS socket option we already have, from Nikita.
8) Fixes in recent sockmap scatterlist API usage, which did not use
sg_init_table() for initialization thus triggering a BUG_ON() in
scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and
uses a small helper sg_init_marker() to properly handle the affected
cases, from Prashant.
9) Let the BPF core follow IDR code convention and therefore use the
idr_preload() and idr_preload_end() helpers, which would also help
idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory
pressure, from Shaohua.
10) Last but not least, a spelling fix in an error message for the
BPF cookie UID helper under BPF sample code, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently on the error exit path the allocated buffer is not free'd
causing a memory leak. Fix this by kfree'ing it.
Detected by CoverityScan, CID#1466876 ("Resource leaks")
Fixes: 1180b4c757 ("apparmor: fix dangling symlinks to policy rawdata after replacement")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
This changes security_hook_heads to use hlist_heads instead of
the circular doubly-linked list heads. This should cut down
the size of the struct by about half.
In addition, it allows mutation of the hooks at the tail of the
callback list without having to modify the head. The longer-term
purpose of this is to enable making the heads read only.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Reviewed-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
rt_genid_bump_all() consists of ipv4 and ipv6 part.
ipv4 part is incrementing of net::ipv4::rt_genid,
and I see many places, where it's read without rtnl_lock().
ipv6 part calls __fib6_clean_all(), and it's also
called without rtnl_lock() in other places.
So, rtnl_lock() here was used to iterate net_namespace_list only,
and we can remove it.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rtnl_lock() is used everywhere, and contention is very high.
When someone wants to iterate over alive net namespaces,
he/she has no a possibility to do that without exclusive lock.
But the exclusive rtnl_lock() in such places is overkill,
and it just increases the contention. Yes, there is already
for_each_net_rcu() in kernel, but it requires rcu_read_lock(),
and this can't be sleepable. Also, sometimes it may be need
really prevent net_namespace_list growth, so for_each_net_rcu()
is not fit there.
This patch introduces new rw_semaphore, which will be used
instead of rtnl_mutex to protect net_namespace_list. It is
sleepable and allows not-exclusive iterations over net
namespaces list. It allows to stop using rtnl_lock()
in several places (what is made in next patches) and makes
less the time, we keep rtnl_mutex. Here we just add new lock,
while the explanation of we can remove rtnl_lock() there are
in next patches.
Fine grained locks generally are better, then one big lock,
so let's do that with net_namespace_list, while the situation
allows that.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
move COUNT_ARGS() macro from apparmor to generic header and extend it
to count till twelve.
COUNT() was an alternative name for this logic, but it's used for
different purpose in many other places.
Similarly for CONCATENATE() macro.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Rename the variables shp, sma, msq to isp. As that is how the code already
refers to those variables.
Collapse smack_of_shm, smack_of_sem, and smack_of_msq into smack_of_ipc,
as the three functions had become completely identical.
Collapse smack_shm_alloc_security, smack_sem_alloc_security and
smack_msg_queue_alloc_security into smack_ipc_alloc_security as the three
functions had become identical.
Collapse smack_shm_free_security, smack_sem_free_security and
smack_msg_queue_free_security into smack_ipc_free_security as the
three functions had become identical.
Requested-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is required to use SMACK and IMA/EVM together. Add it to the
default nomeasure/noappraise list like other pseudo filesystems.
Signed-off-by: Martin Townsend <mtownsend1973@gmail.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
EVM might update the evm xattr while the VFS performs a remount to
readonly mode. This is not properly checked for, additionally check
the s_readonly_remount superblock flag before writing.
The bug can for example be observed with UBIFS. UBIFS checks the free
space on the device before and after a remount. With EVM enabled the
free space sometimes differs between both checks.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Replace nested ifs in the EVM xattr verification logic with a switch
statement, making the code easier to understand.
Also, add comments to the if statements in the out section and constify the
cause variable.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
The "goto out" statement doesn't have any purpose since there's no cleanup
to be done when returning early, so remove it. This also makes the rc
variable unnecessary so remove it as well.
Also, the xattr_len and fmt variables are redundant so remove them as well.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This macro isn't used anymore since commit 0d73a55208 ("ima: re-introduce
own integrity cache lock"), so remove it.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
In keeping with the directive to get rid of VLAs [1], let's drop the VLA
from ima_audit_measurement(). We need to adjust the return type of
ima_audit_measurement, because now this function can fail if an allocation
fails.
[1]: https://lkml.org/lkml/2018/3/7/621
v2: just use audit_log_format instead of doing a second allocation
v3: ignore failures in ima_audit_measurement()
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
TPM_CRB driver provides TPM CRB 2.0 support. If it is built as a
module, the TPM chip is registered after IMA init. tpm_pcr_read() in
IMA fails and displays the following message even though eventually
there is a TPM chip on the system.
ima: No TPM chip found, activating TPM-bypass! (rc=-19)
Fix IMA Kconfig to select TPM_CRB so TPM_CRB driver is built in the kernel
and initializes before IMA.
Signed-off-by: Jiandi An <anjiandi@codeaurora.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
When policy replacement occurs the symlinks in the profile directory
need to be updated to point to the new rawdata, otherwise once the
old rawdata is removed the symlink becomes broken.
Fix this by dynamically generating the symlink everytime it is read.
These links are used enough that their value needs to be cached and
this way we can avoid needing locking to read and update the link
value.
Fixes: a481f4d917 ("apparmor: add custom apparmorfs that will be used by policy namespace files")
BugLink: http://bugs.launchpad.net/bugs/1755563
Signed-off-by: John Johansen <john.johansen@canonical.com>
We accidentally return a positive EPROTO instead of a negative -EPROTO.
Since 71 is not an error pointer, that means it eventually results in an
Oops in the caller.
Fixes: d901d6a298 ("apparmor: dfa split verification of table headers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Currently variable size is a unsigned size_t, hence comparisons to
see if it is less than zero (for error checking) will always be
false. Fix this by making size a ssize_t
Detected by CoverityScan, CID#1466080 ("Unsigned compared against 0")
Fixes: 8e51f9087f ("apparmor: Add support for attaching profiles via xattr, presence and value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
There is no gain from doing this except for some self-documenting.
Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
These variables are not used where they are was defined. There is no
point in declaring them there as extern. Move and constify them, saving
2 bytes.
Function old new delta
init_desc 273 271 -2
Total: Before=2112094, After=2112092, chg -0.00%
Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This patch addresses the fuse privileged mounted filesystems in
environments which are unwilling to accept the risk of trusting the
signature verification and want to always fail safe, but are for example
using a pre-built kernel.
This patch defines a new builtin policy named "fail_securely", which can
be specified on the boot command line as an argument to "ima_policy=".
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Cc: Alban Crequy <alban@kinvolk.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
The IMA_APPRAISE and IMA_HASH policies overlap. Clear IMA_HASH properly.
Fixes: da1b0029f5 ("ima: support new "hash" and "dont_hash" policy actions")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This patch addresses the fuse privileged mounted filesystems in a "secure"
environment, with a correctly enforced security policy, which is willing
to assume the inherent risk of specific fuse filesystems that are well
defined and properly implemented.
As there is no way for the kernel to detect file changes, the kernel
ignores the cached file integrity results and re-measures, re-appraises,
and re-audits the file.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Cc: Alban Crequy <alban@kinvolk.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
FUSE can be mounted by unprivileged users either today with fusermount
installed with setuid, or soon with the upcoming patches to allow FUSE
mounts in a non-init user namespace.
This patch addresses the new unprivileged non-init mounted filesystems,
which are untrusted, by failing the signature verification.
This patch defines two new flags SB_I_IMA_UNVERIFIABLE_SIGNATURE and
SB_I_UNTRUSTED_MOUNTER.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Dongsu Park <dongsu@kinvolk.io>
Cc: Alban Crequy <alban@kinvolk.io>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
The existing BPRM_CHECK functionality in IMA validates against the
credentials of the existing process, not any new credentials that the
child process may transition to. Add an additional CREDS_CHECK target
and refactor IMA to pass the appropriate creds structure. In
ima_bprm_check(), check with both the existing process credentials and
the credentials that will be committed when the new process is started.
This will not change behaviour unless the system policy is extended to
include CREDS_CHECK targets - BPRM_CHECK will continue to check the same
credentials that it did previously.
After this patch, an IMA policy rule along the lines of:
measure func=CREDS_CHECK subj_type=unconfined_t
will trigger if a process is executed and runs as unconfined_t, ignoring
the context of the parent process. This is in contrast to:
measure func=BPRM_CHECK subj_type=unconfined_t
which will trigger if the process that calls exec() is already executing
in unconfined_t, ignoring the context that the child process executes
into.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Changelog:
- initialize ima_creds_status
For IMA purposes, we want to be able to obtain the prepared secid in the
bprm structure before the credentials are committed. Add a cred_getsecid
hook that makes this possible.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
All of the implementations of security hooks that take msg_queue only
access q_perm the struct kern_ipc_perm member. This means the
dependencies of the msg_queue security hooks can be simplified by
passing the kern_ipc_perm member of msg_queue.
Making this change will allow struct msg_queue to become private to
ipc/msg.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
All of the implementations of security hooks that take shmid_kernel only
access shm_perm the struct kern_ipc_perm member. This means the
dependencies of the shm security hooks can be simplified by passing
the kern_ipc_perm member of shmid_kernel..
Making this change will allow struct shmid_kernel to become private to ipc/shm.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
All of the implementations of security hooks that take sem_array only
access sem_perm the struct kern_ipc_perm member. This means the
dependencies of the sem security hooks can be simplified by passing
the kern_ipc_perm member of sem_array.
Making this change will allow struct sem and struct sem_array
to become private to ipc/sem.c.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlqvCPYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOaAH/171cgZGFEXSONxK
3O1AAv61wN5K/ISMt6mnelWR6fZg195FarOx0Rnq7Ot8OWuVa8CGcyT4vX4Z7nb9
SVMQKNMPCVQE4WCDOv6S0njChmRC0BxBoVJtTN9fhywdYgX1KcaTS/drMRHACF5n
rB9eouMQScfMzKGAW08gp5NvEGJ6W1SLX7La3/u0751dYisdJSP7+vFZNxUrGXEA
yIPOQjFu0Tfo8GXz/BwC678RZVzVLN0sE6+/vM7zNnoDlsRVkdDIVMo3UiVqm/NK
B37/TlZz8CYoapoKnRRB5giXnSPDSXtsikbGy3mcy0u5imGe+ZgdjrdYSaLk31cR
NVZY08k=
=pu3X
-----END PGP SIGNATURE-----
Merge tag 'v4.16-rc6' into next-general
Merge to Linux 4.16-rc6 at the request of Jarkko, for his TPM updates.
Wrap the AVC state within the selinux_state structure and
pass it explicitly to all AVC functions. The AVC private state
is encapsulated in a selinux_avc structure that is referenced
from the selinux_state.
This change should have no effect on SELinux behavior or
APIs (userspace or LSM).
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Move global selinuxfs state to a per-instance structure (selinux_fs_info),
and include a pointer to the selinux_state in this structure.
Pass this selinux_state to all security server operations, thereby
ensuring that each selinuxfs instance presents a view of and acts
as an interface to a particular selinux_state instance.
This change should have no effect on SELinux behavior or APIs
(userspace or LSM). It merely wraps the selinuxfs global state,
links it to a particular selinux_state (currently always the single
global selinux_state) and uses that state for all operations.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
If security_get_bools/classes are called before the selinux state is
initialized (i.e. before first policy load), then they should just
return immediately with no booleans/classes.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The unpack code now makes sure every profile has a dfa so the safe
version of POLICY_MEDIATES is no longer needed.
Signed-off-by: John Johansen <john.johansen@canonical.com>
version 2 - Force an abi break. Network mediation will only be
available in v8 abi complaint policy.
Provide a basic mediation of sockets. This is not a full net mediation
but just whether a spcific family of socket can be used by an
application, along with setting up some basic infrastructure for
network mediation to follow.
the user space rule hav the basic form of
NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ]
[ TYPE | PROTOCOL ]
DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' |
'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' |
'netbeui' | 'security' | 'key' | 'packet' | 'ash' |
'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' |
'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' |
'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' |
'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' |
'vsock' | 'mpls' | 'ib' | 'kcm' ) ','
TYPE = ( 'stream' | 'dgram' | 'seqpacket' | 'rdm' | 'raw' |
'packet' )
PROTOCOL = ( 'tcp' | 'udp' | 'icmp' )
eg.
network,
network inet,
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
commit d178bc3a70 ("user namespace: usb:
make usb urbs user namespace aware (v2)") changed kill_pid_info_as_uid
to kill_pid_info_as_cred, saving and passing a cred structure instead of
uids. Since the secid can be obtained from the cred, drop the secid fields
from the usb_dev_state and async structures, and drop the secid argument to
kill_pid_info_as_cred. Replace the secid argument to security_task_kill
with the cred. Update SELinux, Smack, and AppArmor to use the cred, which
avoids the need for Smack and AppArmor to use a secid at all in this hook.
Further changes to Smack might still be required to take full advantage of
this change, since it should now be possible to perform capability
checking based on the supplied cred. The changes to Smack and AppArmor
have only been compile-tested.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Define a selinux state structure (struct selinux_state) for
global SELinux state and pass it explicitly to all security server
functions. The public portion of the structure contains state
that is used throughout the SELinux code, such as the enforcing mode.
The structure also contains a pointer to a selinux_ss structure whose
definition is private to the security server and contains security
server specific state such as the policy database and SID table.
This change should have no effect on SELinux behavior or APIs
(userspace or LSM). It merely wraps SELinux state and passes it
explicitly as needed.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: minor fixups needed due to collisions with the SCTP patches]
Signed-off-by: Paul Moore <paul@paul-moore.com>
The new file system CGROUP2 isn't actually handled
by smack. This changes makes Smack treat equally
CGROUP and CGROUP2 items.
Signed-off-by: José Bollo <jose.bollo@iot.bzh>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
These pernet_operations only register and unregister nf hooks.
So, they are able to be marked as async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These pernet_operations only register and unregister nf hooks.
So, they are able to be marked as async.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A missing 'struct' keyword caused a build error when CONFIG_NETLABEL
is disabled:
In file included from security/selinux/hooks.c:99:
security/selinux/include/netlabel.h:135:66: error: unknown type name 'sock'
static inline void selinux_netlbl_sctp_sk_clone(struct sock *sk, sock *newsk)
^~~~
security/selinux/hooks.c: In function 'selinux_sctp_sk_clone':
security/selinux/hooks.c:5188:2: error: implicit declaration of function 'selinux_netlbl_sctp_sk_clone'; did you mean 'selinux_netlbl_inet_csk_clone'? [-Werror=implicit-function-declaration]
Fixes: db97c9f9d312 ("selinux: Add SCTP support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The SELinux SCTP implementation is explained in:
Documentation/security/SELinux-sctp.rst
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
security/integrity/digsig.c has build errors on some $ARCH due to a
missing header file, so add it.
security/integrity/digsig.c:146:2: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: linux-integrity@vger.kernel.org
Link: http://kisskb.ellerman.id.au/kisskb/head/13396/
Signed-off-by: James Morris <james.morris@microsoft.com>
The SCTP security hooks are explained in:
Documentation/security/LSM-sctp.rst
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
kmalloc() can't always allocate large enough buffers for big_key to use for
crypto (1MB + some metadata) so we cannot use that to allocate the buffer.
Further, vmalloc'd pages can't be passed to sg_init_one() and the aead
crypto accessors cannot be called progressively and must be passed all the
data in one go (which means we can't pass the data in one block at a time).
Fix this by allocating the buffer pages individually and passing them
through a multientry scatterlist to the crypto layer. This has the bonus
advantage that we don't have to allocate a contiguous series of pages.
We then vmap() the page list and pass that through to the VFS read/write
routines.
This can trigger a warning:
WARNING: CPU: 0 PID: 60912 at mm/page_alloc.c:3883 __alloc_pages_nodemask+0xb7c/0x15f8
([<00000000002acbb6>] __alloc_pages_nodemask+0x1ee/0x15f8)
[<00000000002dd356>] kmalloc_order+0x46/0x90
[<00000000002dd3e0>] kmalloc_order_trace+0x40/0x1f8
[<0000000000326a10>] __kmalloc+0x430/0x4c0
[<00000000004343e4>] big_key_preparse+0x7c/0x210
[<000000000042c040>] key_create_or_update+0x128/0x420
[<000000000042e52c>] SyS_add_key+0x124/0x220
[<00000000007bba2c>] system_call+0xc4/0x2b0
from the keyctl/padd/useradd test of the keyutils testsuite on s390x.
Note that it might be better to shovel data through in page-sized lumps
instead as there's no particular need to use a monolithic buffer unless the
kernel itself wants to access the data.
Fixes: 13100a72f4 ("Security: Keys: Big keys stored encrypted")
Reported-by: Paul Bunyan <pbunyan@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Kirill Marinushkin <k.marinushkin@gmail.com>
Changes since v1:
Added changes in these files:
drivers/infiniband/hw/usnic/usnic_transport.c
drivers/staging/lustre/lnet/lnet/lib-socket.c
drivers/target/iscsi/iscsi_target_login.c
drivers/vhost/net.c
fs/dlm/lowcomms.c
fs/ocfs2/cluster/tcp.c
security/tomoyo/network.c
Before:
All these functions either return a negative error indicator,
or store length of sockaddr into "int *socklen" parameter
and return zero on success.
"int *socklen" parameter is awkward. For example, if caller does not
care, it still needs to provide on-stack storage for the value
it does not need.
None of the many FOO_getname() functions of various protocols
ever used old value of *socklen. They always just overwrite it.
This change drops this parameter, and makes all these functions, on success,
return length of sockaddr. It's always >= 0 and can be differentiated
from an error.
Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.
rpc_sockname() lost "int buflen" parameter, since its only use was
to be passed to kernel_getsockname() as &buflen and subsequently
not used in any way.
Userspace API is not changed.
text data bss dec hex filename
30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
30108109 2633612 873672 33615393 200ee21 vmlinux.o
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-bluetooth@vger.kernel.org
CC: linux-decnet-user@lists.sourceforge.net
CC: linux-wireless@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: linux-sctp@vger.kernel.org
CC: linux-nfs@vger.kernel.org
CC: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Overlapping domain attachments using the current longest left exact
match fail in some simple cases, and with the fix to ensure consistent
behavior by failing unresolvable attachments it becomes important to
do a better job.
eg. under the current match the following are unresolvable where
the alternation is clearly a better match under the most specific
left match rule.
/**
/{bin/,}usr/
Use a counting match that detects when a loop in the state machine is
enter, and return the match count to provide a better specific left
match resolution.
Signed-off-by: John Johansen <john.johansen@canonical.com>
This converts profile attachment based on xattrs to a fixed extended
conditional using dfa matching.
This has a couple of advantages
- pattern matching can be used for the xattr match
- xattrs can be optional for an attachment or marked as required
- the xattr attachment conditional will be able to be combined with
other extended conditionals when the flexible extended conditional
work lands.
The xattr fixed extended conditional is appended to the xmatch
conditional. If an xattr attachment is specified the profile xmatch
will be generated regardless of whether there is a pattern match on
the executable name.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Make it possible to tie Apparmor profiles to the presence of one or more
extended attributes, and optionally their values. An example usecase for
this is to automatically transition to a more privileged Apparmor profile
if an executable has a valid IMA signature, which can then be appraised
by the IMA subsystem.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
separate the different types of verification so they are logically
separate and can be reused separate of each other.
Signed-off-by: John Johansen <john.johansen@canonical.com>
State differential encoding can provide better compression for
apparmor policy, without having significant impact on match time.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Domain transition so far have been largely blocked by no new privs,
unless the transition has been provably a subset of the previous
confinement. There was a couple problems with the previous
implementations,
- transitions that weren't explicitly a stack but resulted in a subset
of confinement were disallowed
- confinement subsets were only calculated from the previous
confinement instead of the confinement being enforced at the time of
no new privs, so transitions would have to get progressively
tighter.
Fix this by detecting and storing a reference to the task's
confinement at the "time" no new privs is set. This reference is then
used to determine whether a transition is a subsystem of the
confinement at the time no new privs was set.
Unfortunately the implementation is less than ideal in that we have to
detect no new privs after the fact when a task attempts a domain
transition. This is adequate for the currently but will not work in a
stacking situation where no new privs could be conceivably be set in
both the "host" and in the container.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Now that file contexts have been moved into file, and task context
fns() and data have been split from the context, only the cred context
remains in context.h so rename to cred.h to better reflect what it
deals with.
Signed-off-by: John Johansen <john.johansen@canonical.com>
now that cred_ctx has been removed we can rename task_ctxs from tctx
without causing confusion.
Signed-off-by: John Johansen <john.johansen@canonical.com>
With the task domain change information now stored in the task->security
context, the cred->security context only stores the label. We can get
rid of the cred_ctx and directly reference the label, removing a layer
of indirection, and unneeded extra allocations.
Signed-off-by: John Johansen <john.johansen@canonical.com>
The task domain change info is task specific and its and abuse of
the cred to store the information in there. Now that a task->security
field exists store it in the proper place.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Allow apparmor to audit the number of a signal that it does not
provide a mapping for and is currently being reported only as
unknown.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Given a label with a profile stack of
A//&B or A//&C ...
A ptrace rule should be able to specify a generic trace pattern with
a rule like
signal send A//&**,
however this is failing because while the correct label match routine
is called, it is being done post label decomposition so it is always
being done against a profile instead of the stacked label.
To fix this refactor the cross check to pass the full peer label in to
the label_match.
Signed-off-by: John Johansen <john.johansen@canonical.com>
These duplicate includes have been found with scripts/checkincludes.pl but
they have been removed manually to avoid removing false positives.
Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
The root view of the label parse should not be exposed to user
control.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
nulldfa.in makes for a very long unwrapped line, which certain tools
do not like. So add line breaks.
Signed-off-by: John Johansen <john.johansen@canonical.com>
some label/context sources might not be guaranteed to be null terminiated
provide a size bounded version of label parse to deal with these.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
The current split scheme is actually wrong in that it splits
///&
where that is invalid and should fail. Use the dfa to do a proper
bounded split without having to worry about getting the string
processing right in code.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
Splitting the management struct from the actual data blob will allow
us in the future to do some sharing and other data reduction
techniques like replacing the the raw data with compressed data.
Prepare for this by separating the management struct from the data
blob.
Signed-off-by: John Johansen <john.johansen@canonical.com>
The existence test is not being properly logged as the signal mapping
maps it to the last entry in the named signal table. This is done
to help catch bugs by making the 0 mapped signal value invalid so
that we can catch the signal value not being filled in.
When fixing the off-by-one comparision logic the reporting of the
existence test was broken, because the logic behind the mapped named
table was hidden. Fix this by adding a define for the name lookup
and using it.
Cc: Stable <stable@vger.kernel.org>
Fixes: f7dc4c9a85 ("apparmor: fix off-by-one comparison on MAXMAPPED_SIG")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Resource auditing is using the peer field which is not available
when the rlim data struct is used, because it is a different element
of the same union. Accessing peer during resource auditing could
cause garbage log entries or even oops the kernel.
Move the rlim data block into the same struct as the peer field
so they can be used together.
CC: <stable@vger.kernel.org>
Fixes: 86b92cb782 ("apparmor: move resource checks to using labels")
Signed-off-by: John Johansen <john.johansen@canonical.com>
The .ns_name should not be virtualized by the current ns view. It
needs to report the ns base name as that is being used during startup
as part of determining apparmor policy namespace support.
BugLink: http://bugs.launchpad.net/bugs/1746463
Fixes: d9f02d9c23 ("apparmor: fix display of ns name")
Cc: Stable <stable@vger.kernel.org>
Reported-by: Serge Hallyn <serge@hallyn.com>
Tested-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJae0mSAAoJEAAOaEEZVoIVs98P+wSbwfgLeyTufmrRYrD9kxfh
EQXfuvnJqPzRHLJIUXfwzTN3IV9RZ1434ci31lZvQE3PKrgb90QuBLiR6OIKULef
UqpYRmjsg7BfFBdAnyUR8xSmmeN94PjXQk7tG+YQn096HJVZ6cG5qCA8RjJ9dFoq
2haDcOfDU+3e8mbtrrF4doP6jGrVwV+okqRsshFBclQv62Kk3m7L5AjQINyZpTM5
ZKX5JIMOAmlJcHsz/2J1qLAIRQKsvEUbRLV43bzp3E03PuVFPhig3dVtpGPUe+Yi
OW0JX49hIoTCrQ4KZk6uweLG7ZpaSoppXggEi2ERNCUkCf3nhejLlScfye+yLx7f
sItgPkOYU0VVF70Y72XH1DbOekZr/XCLZdEEUNCS/P68hnyK0gBNC9zPGetlxMMi
wjjQ9Qe45vD2JFlrvhHrdUdCnxnE05zC9ckBrmM94uRwIfDR0WVgo6pfebfRkAJd
Wp4/PfbaySY7vk4oyaXlNxcDIH2NvWwYkioI/K9rRGbB2KjTdXonQojBy+rT0LeS
f3mufyZYyCxdwu3Wf8WO36H23L+4fseMthKIIPA0aL4wasB9LgD8gDnkyKx28DT4
S32tdK4UALC8SAVsPr+vSaMVzKOZmuNHac+XB2i+5lHl8G/n4M2a+JFTeR4CnKJ/
9LsBEBL5Oj7ZXL7lfFIO
=iEKM
-----END PGP SIGNATURE-----
Merge tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull inode->i_version cleanup from Jeff Layton:
"Goffredo went ahead and sent a patch to rename this function, and
reverse its sense, as we discussed last week.
The patch is very straightforward and I figure it's probably best to
go ahead and merge this to get the API as settled as possible"
* tag 'iversion-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
iversion: Rename make inode_cmp_iversion{+raw} to inode_eq_iversion{+raw}
There are several functions that do find_task_by_vpid() followed by
get_task_struct(). We can use a helper function instead.
Link: http://lkml.kernel.org/r/1509602027-11337-1-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cache objects. This is good, but still leaves a lot of kernel memory
available to be copied to/from userspace in the face of bugs. To further
restrict what memory is available for copying, this creates a way to
whitelist specific areas of a given slab cache object for copying to/from
userspace, allowing much finer granularity of access control. Slab caches
that are never exposed to userspace can declare no whitelist for their
objects, thereby keeping them unavailable to userspace via dynamic copy
operations. (Note, an implicit form of whitelisting is the use of constant
sizes in usercopy operations and get_user()/put_user(); these bypass all
hardened usercopy checks since these sizes cannot change at runtime.)
This new check is WARN-by-default, so any mistakes can be found over the
next several releases without breaking anyone's system.
The series has roughly the following sections:
- remove %p and improve reporting with offset
- prepare infrastructure and whitelist kmalloc
- update VFS subsystem with whitelists
- update SCSI subsystem with whitelists
- update network subsystem with whitelists
- update process memory with whitelists
- update per-architecture thread_struct with whitelists
- update KVM with whitelists and fix ioctl bug
- mark all other allocations as not whitelisted
- update lkdtm for more sensible test overage
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Kees Cook <kees@outflux.net>
iQIcBAABCgAGBQJabvleAAoJEIly9N/cbcAmO1kQAJnjVPutnLSbnUteZxtsv7W4
43Cggvokfxr6l08Yh3hUowNxZVKjhF9uwMVgRRg9Nl5WdYCN+vCQbHz+ZdzGJXKq
cGqdKWgexMKX+aBdNDrK7BphUeD46sH7JWR+a/lDV/BgPxBCm9i5ZZCgXbPP89AZ
NpLBji7gz49wMsnm/x135xtNlZ3dG0oKETzi7MiR+NtKtUGvoIszSKy5JdPZ4m8q
9fnXmHqmwM6uQFuzDJPt1o+D1fusTuYnjI7EgyrJRRhQ+BB3qEFZApXnKNDRS9Dm
uB7jtcwefJCjlZVCf2+PWTOEifH2WFZXLPFlC8f44jK6iRW2Nc+wVRisJ3vSNBG1
gaRUe/FSge68eyfQj5OFiwM/2099MNkKdZ0fSOjEBeubQpiFChjgWgcOXa5Bhlrr
C4CIhFV2qg/tOuHDAF+Q5S96oZkaTy5qcEEwhBSW15ySDUaRWFSrtboNt6ZVOhug
d8JJvDCQWoNu1IQozcbv6xW/Rk7miy8c0INZ4q33YUvIZpH862+vgDWfTJ73Zy9H
jR/8eG6t3kFHKS1vWdKZzOX1bEcnd02CGElFnFYUEewKoV7ZeeLsYX7zodyUAKyi
Yp5CImsDbWWTsptBg6h9nt2TseXTxYCt2bbmpJcqzsqSCUwOQNQ4/YpuzLeG0ihc
JgOmUnQNJWCTwUUw5AS1
=tzmJ
-----END PGP SIGNATURE-----
Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardened usercopy whitelisting from Kees Cook:
"Currently, hardened usercopy performs dynamic bounds checking on slab
cache objects. This is good, but still leaves a lot of kernel memory
available to be copied to/from userspace in the face of bugs.
To further restrict what memory is available for copying, this creates
a way to whitelist specific areas of a given slab cache object for
copying to/from userspace, allowing much finer granularity of access
control.
Slab caches that are never exposed to userspace can declare no
whitelist for their objects, thereby keeping them unavailable to
userspace via dynamic copy operations. (Note, an implicit form of
whitelisting is the use of constant sizes in usercopy operations and
get_user()/put_user(); these bypass all hardened usercopy checks since
these sizes cannot change at runtime.)
This new check is WARN-by-default, so any mistakes can be found over
the next several releases without breaking anyone's system.
The series has roughly the following sections:
- remove %p and improve reporting with offset
- prepare infrastructure and whitelist kmalloc
- update VFS subsystem with whitelists
- update SCSI subsystem with whitelists
- update network subsystem with whitelists
- update process memory with whitelists
- update per-architecture thread_struct with whitelists
- update KVM with whitelists and fix ioctl bug
- mark all other allocations as not whitelisted
- update lkdtm for more sensible test overage"
* tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
lkdtm: Update usercopy tests for whitelisting
usercopy: Restrict non-usercopy caches to size 0
kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
kvm: whitelist struct kvm_vcpu_arch
arm: Implement thread_struct whitelist for hardened usercopy
arm64: Implement thread_struct whitelist for hardened usercopy
x86: Implement thread_struct whitelist for hardened usercopy
fork: Provide usercopy whitelisting for task_struct
fork: Define usercopy region in thread_stack slab caches
fork: Define usercopy region in mm_struct slab caches
net: Restrict unwhitelisted proto caches to size 0
sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
sctp: Define usercopy region in SCTP proto slab cache
caif: Define usercopy region in caif proto slab cache
ip: Define usercopy region in IP proto slab cache
net: Define usercopy region in struct proto slab cache
scsi: Define usercopy region in scsi_sense_cache slab cache
cifs: Define usercopy region in cifs_request slab cache
vxfs: Define usercopy region in vxfs_inode slab cache
ufs: Define usercopy region in ufs_inode_cache slab cache
...
Intermittently security.ima is not being written for new files. This
patch re-initializes the new slab iint->atomic_flags field before
freeing it.
Fixes: commit 0d73a55208 ("ima: re-introduce own integrity cache lock")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android binder
fixes, the usual handful of hyper-v updates, and lots of other smaller
driver updates.
All of these have been in linux-next for a long time, with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWnLuZw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynS4QCcCrPmwfD5PJwaF+q2dPfyKaflkQMAn0x6Wd+u
Gw3Z2scgjETUpwJ9ilnL
=xcQ0
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big pull request for char/misc drivers for 4.16-rc1.
There's a lot of stuff in here. Three new driver subsystems were added
for various types of hardware busses:
- siox
- slimbus
- soundwire
as well as a new vboxguest subsystem for the VirtualBox hypervisor
drivers.
There's also big updates from the FPGA subsystem, lots of Android
binder fixes, the usual handful of hyper-v updates, and lots of other
smaller driver updates.
All of these have been in linux-next for a long time, with no reported
issues"
* tag 'char-misc-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (155 commits)
char: lp: use true or false for boolean values
android: binder: use VM_ALLOC to get vm area
android: binder: Use true and false for boolean values
lkdtm: fix handle_irq_event symbol for INT_HW_IRQ_EN
EISA: Delete error message for a failed memory allocation in eisa_probe()
EISA: Whitespace cleanup
misc: remove AVR32 dependencies
virt: vbox: Add error mapping for VERR_INVALID_NAME and VERR_NO_MORE_FILES
soundwire: Fix a signedness bug
uio_hv_generic: fix new type mismatch warnings
uio_hv_generic: fix type mismatch warnings
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
uio_hv_generic: add rescind support
uio_hv_generic: check that host supports monitor page
uio_hv_generic: create send and receive buffers
uio: document uio_hv_generic regions
doc: fix documentation about uio_hv_generic
vmbus: add monitor_id and subchannel_id to sysfs per channel
vmbus: fix ABI documentation
uio_hv_generic: use ISR callback method
...
The function inode_cmp_iversion{+raw} is counter-intuitive, because it
returns true when the counters are different and false when these are equal.
Rename it to inode_eq_iversion{+raw}, which will returns true when
the counters are equal and false otherwise.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Pull networking updates from David Miller:
1) Significantly shrink the core networking routing structures. Result
of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf
2) Add netdevsim driver for testing various offloads, from Jakub
Kicinski.
3) Support cross-chip FDB operations in DSA, from Vivien Didelot.
4) Add a 2nd listener hash table for TCP, similar to what was done for
UDP. From Martin KaFai Lau.
5) Add eBPF based queue selection to tun, from Jason Wang.
6) Lockless qdisc support, from John Fastabend.
7) SCTP stream interleave support, from Xin Long.
8) Smoother TCP receive autotuning, from Eric Dumazet.
9) Lots of erspan tunneling enhancements, from William Tu.
10) Add true function call support to BPF, from Alexei Starovoitov.
11) Add explicit support for GRO HW offloading, from Michael Chan.
12) Support extack generation in more netlink subsystems. From Alexander
Aring, Quentin Monnet, and Jakub Kicinski.
13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
Russell King.
14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.
15) Many improvements and simplifications to the NFP driver bpf JIT,
from Jakub Kicinski.
16) Support for ipv6 non-equal cost multipath routing, from Ido
Schimmel.
17) Add resource abstration to devlink, from Arkadi Sharshevsky.
18) Packet scheduler classifier shared filter block support, from Jiri
Pirko.
19) Avoid locking in act_csum, from Davide Caratti.
20) devinet_ioctl() simplifications from Al viro.
21) More TCP bpf improvements from Lawrence Brakmo.
22) Add support for onlink ipv6 route flag, similar to ipv4, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
tls: Add support for encryption using async offload accelerator
ip6mr: fix stale iterator
net/sched: kconfig: Remove blank help texts
openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
tcp_nv: fix potential integer overflow in tcpnv_acked
r8169: fix RTL8168EP take too long to complete driver initialization.
qmi_wwan: Add support for Quectel EP06
rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
ipmr: Fix ptrdiff_t print formatting
ibmvnic: Wait for device response when changing MAC
qlcnic: fix deadlock bug
tcp: release sk_frag.page in tcp_disconnect
ipv4: Get the address of interface correctly.
net_sched: gen_estimator: fix lockdep splat
net: macb: Handle HRESP error
net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
ipv6: addrconf: break critical section in addrconf_verify_rtnl()
ipv6: change route cache aging logic
i40e/i40evf: Update DESC_NEEDED value to reflect larger value
bnxt_en: cleanup DIM work on device shutdown
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEcQCq365ubpQNLgrWVeRaWujKfIoFAlpwp5QUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQVeRaWujKfIoWAxAAj4Ne8MkQj7AvwlXN/wc4F1jlLGLq
VfIN1CBPqzjOf8foHY05lYOyqO/npT8cxFtjaI/zBt44Mw9DnCpUG1/XiMZqNK31
Zg+cOKHNyshsP9g8muH4r3NylzXLA0k/K+GvYXqibvSMlkQhygIjMgktWquM/Nkk
rKOwB9T20XhGOuusfNuNMWFef15Jda6BjPlntoGvx/EebSizvy8f7M/AC+BNfcgO
1S26WEEinxc7EaihRqU6epCYZFK10M/WrDq5DGPq5Gw2JLQW4ZzLgjpRr/Yh9gJ5
sBc6Kok2qPeIh206OQeq/KqCAwAKn8+9PDiAoemYIGJ5dD63YjY8RyTlGoMchedp
geEuzz8b84qcrylziUd4TG0TkJ4Rdj14FLRLrNv50iw/+Hl3NzQiJBY+9SKULJAb
d05BHCriomJV/uT9N7OusAE9GTd8jBaz6fL8h0dOSbPrXkzjEXH6G8qsziw26M7s
jdtRoGmsnQ/h5RiaYEoMBC8C8jWn1MozEfW2K+P8Nzgp/JTUodFdzz+ZRSPQZNZ6
4qd8vYaxl7x3UeMBbcPTGeDvFGBOGr98RdWQfjyT6KEF4KLRtIQ36PeQEwXc2Vq5
W9x5STQ7RycyKp69cnz3qoEOdhn7XMzYHPz6jld6b3lcHP+VysDmWA7/Din50RrY
hgUzstNb5Kr3np4=
=ch+z
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20180130' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"A small pull request this time, just three patches, and one of these
is just a comment update (swap the FSF physical address for a URL).
The other two patches are small bug fixes found by szybot/syzkaller;
they individual patch descriptions should tell you all you ever wanted
to know"
* tag 'selinux-pr-20180130' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: skip bounded transition processing if the policy isn't loaded
selinux: ensure the context is NUL terminated in security_context_to_sid_core()
security: replace FSF address with web source in license notices
Pull tpm updates from James Morris:
- reduce polling delays in tpm_tis
- support retrieving TPM 2.0 Event Log through EFI before
ExitBootServices
- replace tpm-rng.c with a hwrng device managed by the driver for each
TPM device
- TPM resource manager synthesizes TPM_RC_COMMAND_CODE response instead
of returning -EINVAL for unknown TPM commands. This makes user space
more sound.
- CLKRUN fixes:
* Keep #CLKRUN disable through the entier TPM command/response flow
* Check whether #CLKRUN is enabled before disabling and enabling it
again because enabling it breaks PS/2 devices on a system where it
is disabled
* 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
tpm: remove unused variables
tpm: remove unused data fields from I2C and OF device ID tables
tpm: only attempt to disable the LPC CLKRUN if is already enabled
tpm: follow coding style for variable declaration in tpm_tis_core_init()
tpm: delete the TPM_TIS_CLK_ENABLE flag
tpm: Update MAINTAINERS for Jason Gunthorpe
tpm: Keep CLKRUN enabled throughout the duration of transmit_cmd()
tpm_tis: Move ilb_base_addr to tpm_tis_data
tpm2-cmd: allow more attempts for selftest execution
tpm: return a TPM_RC_COMMAND_CODE response if command is not implemented
tpm: Move Linux RNG connection to hwrng
tpm: use struct tpm_chip for tpm_chip_find_get()
tpm: parse TPM event logs based on EFI table
efi: call get_event_log before ExitBootServices
tpm: add event log format version
tpm: rename event log provider files
tpm: move tpm_eventlog.h outside of drivers folder
tpm: use tpm_msleep() value as max delay
tpm: reduce tpm polling delay in tpm_tis_core
tpm: move wait_for_tpm_stat() to respective driver files
Pull smack updates from James Morris:
"Two minor fixes"
* 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: Privilege check on key operations
Smack: fix dereferenced before check
Pull integrity updates from James Morris:
"This contains a mixture of bug fixes, code cleanup, and new
functionality. Of note is the integrity cache locking fix, file change
detection, and support for a new EVM portable and immutable signature
type.
The re-introduction of the integrity cache lock (iint) fixes the
problem of attempting to take the i_rwsem shared a second time, when
it was previously taken exclusively. Defining atomic flags resolves
the original iint/i_rwsem circular locking - accessing the file data
vs. modifying the file metadata. Although it fixes the O_DIRECT
problem as well, a subsequent patch is needed to remove the explicit
O_DIRECT prevention.
For performance reasons, detecting when a file has changed and needs
to be re-measured, re-appraised, and/or re-audited, was limited to
after the last writer has closed, and only if the file data has
changed. Detecting file change is based on i_version. For filesystems
that do not support i_version, remote filesystems, or userspace
filesystems, the file was measured, appraised and/or audited once and
never re-evaluated. Now local filesystems, which do not support
i_version or are not mounted with the i_version option, assume the
file has changed and are required to re-evaluate the file. This change
does not address detecting file change on remote or userspace
filesystems.
Unlike file data signatures, which can be included and distributed in
software packages (eg. rpm, deb), the existing EVM signature, which
protects the file metadata, could not be included in software
packages, as it includes file system specific information (eg. i_ino,
possibly the UUID). This pull request defines a new EVM portable and
immutable file metadata signature format, which can be included in
software packages"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
ima/policy: fix parsing of fsuuid
ima: Use i_version only when filesystem supports it
integrity: remove unneeded initializations in integrity_iint_cache entries
ima: log message to module appraisal error
ima: pass filename to ima_rdwr_violation_check()
ima: Fix line continuation format
ima: support new "hash" and "dont_hash" policy actions
ima: re-introduce own integrity cache lock
EVM: Add support for portable signature format
EVM: Allow userland to permit modification of EVM-protected metadata
ima: relax requiring a file signature for new files with zero length
Pull poll annotations from Al Viro:
"This introduces a __bitwise type for POLL### bitmap, and propagates
the annotations through the tree. Most of that stuff is as simple as
'make ->poll() instances return __poll_t and do the same to local
variables used to hold the future return value'.
Some of the obvious brainos found in process are fixed (e.g. POLLIN
misspelled as POLL_IN). At that point the amount of sparse warnings is
low and most of them are for genuine bugs - e.g. ->poll() instance
deciding to return -EINVAL instead of a bitmap. I hadn't touched those
in this series - it's large enough as it is.
Another problem it has caught was eventpoll() ABI mess; select.c and
eventpoll.c assumed that corresponding POLL### and EPOLL### were
equal. That's true for some, but not all of them - EPOLL### are
arch-independent, but POLL### are not.
The last commit in this series separates userland POLL### values from
the (now arch-independent) kernel-side ones, converting between them
in the few places where they are copied to/from userland. AFAICS, this
is the least disruptive fix preserving poll(2) ABI and making epoll()
work on all architectures.
As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
it will trigger only on what would've triggered EPOLLWRBAND on other
architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
at all on sparc. With this patch they should work consistently on all
architectures"
* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
make kernel-side POLL... arch-independent
eventpoll: no need to mask the result of epi_item_poll() again
eventpoll: constify struct epoll_event pointers
debugging printk in sg_poll() uses %x to print POLL... bitmap
annotate poll(2) guts
9p: untangle ->poll() mess
->si_band gets POLL... bitmap stored into a user-visible long field
ring_buffer_poll_wait() return value used as return value of ->poll()
the rest of drivers/*: annotate ->poll() instances
media: annotate ->poll() instances
fs: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
net: annotate ->poll() instances
apparmor: annotate ->poll() instances
tomoyo: annotate ->poll() instances
sound: annotate ->poll() instances
acpi: annotate ->poll() instances
crypto: annotate ->poll() instances
block: annotate ->poll() instances
x86: annotate ->poll() instances
...
Pull RCU updates from Ingo Molnar:
"The main RCU changes in this cycle were:
- Updates to use cond_resched() instead of cond_resched_rcu_qs()
where feasible (currently everywhere except in kernel/rcu and in
kernel/torture.c). Also a couple of fixes to avoid sending IPIs to
offline CPUs.
- Updates to simplify RCU's dyntick-idle handling.
- Updates to remove almost all uses of smp_read_barrier_depends() and
read_barrier_depends().
- Torture-test updates.
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits)
torture: Save a line in stutter_wait(): while -> for
torture: Eliminate torture_runnable and perf_runnable
torture: Make stutter less vulnerable to compilers and races
locking/locktorture: Fix num reader/writer corner cases
locking/locktorture: Fix rwsem reader_delay
torture: Place all torture-test modules in one MAINTAINERS group
rcutorture/kvm-build.sh: Skip build directory check
rcutorture: Simplify functions.sh include path
rcutorture: Simplify logging
rcutorture/kvm-recheck-*: Improve result directory readability check
rcutorture/kvm.sh: Support execution from any directory
rcutorture/kvm.sh: Use consistent help text for --qemu-args
rcutorture/kvm.sh: Remove unused variable, `alldone`
rcutorture: Remove unused script, config2frag.sh
rcutorture/configinit: Fix build directory error message
rcutorture: Preempt RCU-preempt readers more vigorously
torture: Reduce #ifdefs for preempt_schedule()
rcu: Remove have_rcu_nocb_mask from tree_plugin.h
rcu: Add comment giving debug strategy for double call_rcu()
tracing, rcu: Hide trace event rcu_nocb_wake when not used
...
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJabwjlAAoJEAAOaEEZVoIVeEEP/R84kZJjlZV/vNmFFvY46jM+
0hpMHXRNym+nW1Du1CKNkesEUAY8ACAQIyzJh63Q72341QTDdz3+asHwPYRNOqdC
PgryidPieojkNKQg+h7dmoKYlYh1xiCicvn66Q5PFb9B0lH36twekOK4X1qqJj8Z
breRmRoFLka9looMSuYgwbErts023fmASalvGum6T0ZM/7F9hUj4O3OsQtKTLUNM
VQ+gLJTQrUqrgzvWUwq3WTMa9YAaKP4oad8nsglNSpiVLG7WtURr5HokW9hAziqL
k99Y+K2ni1wZJlNGJAyV7PyEG2ieI5Xn+LzM2RM+SndD1QHF2QXACmSTDYfL51k5
G2RsKeTZvQPtX4qx9+vnCp/4oV6JduvCaq2Mt8SQb9nYZxKjs85TNLrARJv+85eQ
zP0OTxlH1Gfu3j36n3cny4XemyMYYF4hCFYfRPqTGst37fgLBtfIfUSQ6jedoCK2
Xcyb6ukGXMh6If/A7DSy91hvSSPrWSH7TPPsbfLy6o+wUOtpAGR4eXVlEuAiXrzc
gnoAz85oIMUQae66LrdrPk1NyE59qOb24g/yU5gyRBSpi2+/aoboNCKaD73tgs/C
XIMwGXLYmqkcud7IBQF0tHHiM+jsEkbSM4LUqRXSnqMdwNnS18Z4Q+JKqpdP0cii
eRdenDvUfu8Gu1Y9vWBv
=iihN
-----END PGP SIGNATURE-----
Merge tag 'iversion-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull inode->i_version rework from Jeff Layton:
"This pile of patches is a rework of the inode->i_version field. We
have traditionally incremented that field on every inode data or
metadata change. Typically this increment needs to be logged on disk
even when nothing else has changed, which is rather expensive.
It turns out though that none of the consumers of that field actually
require this behavior. The only real requirement for all of them is
that it be different iff the inode has changed since the last time the
field was checked.
Given that, we can optimize away most of the i_version increments and
avoid dirtying inode metadata when the only change is to the i_version
and no one is querying it. Queries of the i_version field are rather
rare, so we can help write performance under many common workloads.
This patch series converts existing accesses of the i_version field to
a new API, and then converts all of the in-kernel filesystems to use
it. The last patch in the series then converts the backend
implementation to a scheme that optimizes away a large portion of the
metadata updates when no one is looking at it.
In my own testing this series significantly helps performance with
small I/O sizes. I also got this email for Christmas this year from
the kernel test robot (a 244% r/w bandwidth improvement with XFS over
DAX, with 4k writes):
https://lkml.org/lkml/2017/12/25/8
A few of the earlier patches in this pile are also flowing to you via
other trees (mm, integrity, and nfsd trees in particular)".
* tag 'iversion-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: (22 commits)
fs: handle inode->i_version more efficiently
btrfs: only dirty the inode in btrfs_update_time if something was changed
xfs: avoid setting XFS_ILOG_CORE if i_version doesn't need incrementing
fs: only set S_VERSION when updating times if necessary
IMA: switch IMA over to new i_version API
xfs: convert to new i_version API
ufs: use new i_version API
ocfs2: convert to new i_version API
nfsd: convert to new i_version API
nfs: convert to new i_version API
ext4: convert to new i_version API
ext2: convert to new i_version API
exofs: switch to new i_version API
btrfs: convert to new i_version API
afs: convert to new i_version API
affs: convert to new i_version API
fat: convert to new i_version API
fs: don't take the i_lock in inode_inc_iversion
fs: new API for handling inode->i_version
ntfs: remove i_version handling
...
The switch to uuid_t invereted the logic of verfication that &entry->fsuuid
is zero during parsing of "fsuuid=" rule. Instead of making sure the
&entry->fsuuid field is not attempted to be overwritten, we bail out for
perfectly correct rule.
Fixes: 787d8c530a ("ima/policy: switch to use uuid_t")
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This introduces CONFIG_HARDENED_USERCOPY_FALLBACK to control the
behavior of hardened usercopy whitelist violations. By default, whitelist
violations will continue to WARN() so that any bad or missing usercopy
whitelists can be discovered without being too disruptive.
If this config is disabled at build time or a system is booted with
"slab_common.usercopy_fallback=0", usercopy whitelists will BUG() instead
of WARN(). This is useful for admins that want to use usercopy whitelists
immediately.
Suggested-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Pull x86 pti updates from Thomas Gleixner:
"This contains:
- a PTI bugfix to avoid setting reserved CR3 bits when PCID is
disabled. This seems to cause issues on a virtual machine at least
and is incorrect according to the AMD manual.
- a PTI bugfix which disables the perf BTS facility if PTI is
enabled. The BTS AUX buffer is not globally visible and causes the
CPU to fault when the mapping disappears on switching CR3 to user
space. A full fix which restores BTS on PTI is non trivial and will
be worked on.
- PTI bugfixes for EFI and trusted boot which make sure that the user
space visible page table entries have the NX bit cleared
- removal of dead code in the PTI pagetable setup functions
- add PTI documentation
- add a selftest for vsyscall to verify that the kernel actually
implements what it advertises.
- a sysfs interface to expose vulnerability and mitigation
information so there is a coherent way for users to retrieve the
status.
- the initial spectre_v2 mitigations, aka retpoline:
+ The necessary ASM thunk and compiler support
+ The ASM variants of retpoline and the conversion of affected ASM
code
+ Make LFENCE serializing on AMD so it can be used as speculation
trap
+ The RSB fill after vmexit
- initial objtool support for retpoline
As I said in the status mail this is the most of the set of patches
which should go into 4.15 except two straight forward patches still on
hold:
- the retpoline add on of LFENCE which waits for ACKs
- the RSB fill after context switch
Both should be ready to go early next week and with that we'll have
covered the major holes of spectre_v2 and go back to normality"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
x86,perf: Disable intel_bts when PTI
security/Kconfig: Correct the Documentation reference for PTI
x86/pti: Fix !PCID and sanitize defines
selftests/x86: Add test_vsyscall
x86/retpoline: Fill return stack buffer on vmexit
x86/retpoline/irq32: Convert assembler indirect jumps
x86/retpoline/checksum32: Convert assembler indirect jumps
x86/retpoline/xen: Convert Xen hypercall indirect jumps
x86/retpoline/hyperv: Convert assembler indirect jumps
x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
x86/retpoline/entry: Convert entry assembler indirect jumps
x86/retpoline/crypto: Convert crypto assembler indirect jumps
x86/spectre: Add boot time option to select Spectre v2 mitigation
x86/retpoline: Add initial retpoline support
objtool: Allow alternatives to be ignored
objtool: Detect jumps to retpoline thunks
x86/pti: Make unpoison of pgd for trusted boot work for real
x86/alternatives: Fix optimize_nops() checking
sysfs/cpu: Fix typos in vulnerability documentation
x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
...
The intended behaviour in apparmor profile matching is to flag a
conflict if two profiles match equally well. However, right now a
conflict is generated if another profile has the same match length even
if that profile doesn't actually match. Fix the logic so we only
generate a conflict if the profiles match.
Fixes: 844b8292b6 ("apparmor: ensure that undecidable profile attachments fail")
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Given a label with a profile stack of
A//&B or A//&C ...
A ptrace rule should be able to specify a generic trace pattern with
a rule like
ptrace trace A//&**,
however this is failing because while the correct label match routine
is called, it is being done post label decomposition so it is always
being done against a profile instead of the stacked label.
To fix this refactor the cross check to pass the full peer label in to
the label_match.
Fixes: 290f458a4f ("apparmor: allow ptrace checks to be finer grained than just capability")
Cc: Stable <stable@vger.kernel.org>
Reported-by: Matthew Garrett <mjg59@google.com>
Tested-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Smack: Privilege check on key operations
Operations on key objects are subjected to Smack policy
even if the process is privileged. This is inconsistent
with the general behavior of Smack and may cause issues
with authentication by privileged daemons. This patch
allows processes with CAP_MAC_OVERRIDE to access keys
even if the Smack rules indicate otherwise.
Reported-by: Jose Bollo <jobol@nonadev.net>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Device number (the character device index) is not a stable identifier
for a TPM chip. That is the reason why every call site passes
TPM_ANY_NUM to tpm_chip_find_get().
This commit changes the API in a way that instead a struct tpm_chip
instance is given and NULL means the default chip. In addition, this
commit refines the documentation to be up to date with the
implementation.
Suggested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> (@chip_num -> @chip part)
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@ziepe.ca>
Tested-by: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJaUiTsAAoJEAUvNnAY1cPYA1MP/37LQRImjRGq0wGp9RE1BrCI
VobxrcnzmfMSTlGYGHJOXt81dy6Xr+qcri9Qfr+x4fb79R49NOS+2usQZ3gHu05+
WZOgTOut87uGDzNSWawn9HrsCs89obi+G8jVBt1gCcG4oazp00EHiFQEy6Lkt9D0
5JB4Sx53aTPcqkS/uDrEIDtClnBn/c5EYInKMh4aTaDOAaOAva67n6QELRWyUHoz
y1uf7YQWOoUkuQIgGFhsH/mcvSu71W4ZbIO1CuvsTBcUdXi0q7Hdp23EWdLuYgPe
dPTs1ibhE7gEg0xJ5f2lqF8uL8XYAsZGSWhgPDtcbWq+kpRIGa8TuHhuca84tRMU
6hf3xZwqYFItG9fzLxD6ZJdEmYdHFxlY6KHQ+azTxUdkomJJB72wsRdhB679+Xsa
GMGJ4W4xxUHX+2u5I7p/5FifzmHf2g6YK+v4kkz1U8d9Vgh68b20V+ioC7RgR6k6
qDuVxs0K3g5ikP8WQKzHwKfEp1Z62uHV8HnsmRmuYzoPzbPy3szmgM8tGk+/02Qw
JHEf/umPauG1QjHLMU8HkiB6OP8wFs0Y/mma+Iqy2WrFFPo0oa3A4AW3HKRC5imS
lUryPugYAoioAcu4raYZYKw/fv16YKP0wwbcLKKH2jA6TlUqseJaCW6K1asgMqbh
72UyDLCIjrAmlhpSdLhg
=b5fz
-----END PGP SIGNATURE-----
Merge tag 'apparmor-pr-2018-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor fix from John Johansen:
"This fixes a regression when the kernel feature set is reported as
supporting mount and policy is pinned to a feature set that does not
support mount mediation"
* tag 'apparmor-pr-2018-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: fix regression in mount mediation when feature set is pinned
When the mount code was refactored for Labels it was not correctly
updated to check whether policy supported mediation of the mount
class. This causes a regression when the kernel feature set is
reported as supporting mount and policy is pinned to a feature set
that does not support mount mediation.
BugLink: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882697#41
Fixes: 2ea3ffb778 ("apparmor: add mount mediation")
Reported-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Pull x86 page table isolation fixes from Thomas Gleixner:
"A couple of urgent fixes for PTI:
- Fix a PTE mismatch between user and kernel visible mapping of the
cpu entry area (differs vs. the GLB bit) and causes a TLB mismatch
MCE on older AMD K8 machines
- Fix the misplaced CR3 switch in the SYSCALL compat entry code which
causes access to unmapped kernel memory resulting in double faults.
- Fix the section mismatch of the cpu_tss_rw percpu storage caused by
using a different mechanism for declaration and definition.
- Two fixes for dumpstack which help to decode entry stack issues
better
- Enable PTI by default in Kconfig. We should have done that earlier,
but it slipped through the cracks.
- Exclude AMD from the PTI enforcement. Not necessarily a fix, but if
AMD is so confident that they are not affected, then we should not
burden users with the overhead"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/process: Define cpu_tss_rw in same section as declaration
x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
x86/dumpstack: Print registers for first stack frame
x86/dumpstack: Fix partial register dumps
x86/pti: Make sure the user/kernel PTEs match
x86/cpu, x86/pti: Do not enable PTI on AMD processors
x86/pti: Enable PTI by default
This really want's to be enabled by default. Users who know what they are
doing can disable it either in the config or on the kernel command line.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Pull RCU updates from Paul E. McKenney:
- Updates to use cond_resched() instead of cond_resched_rcu_qs()
where feasible (currently everywhere except in kernel/rcu and
in kernel/torture.c). Also a couple of fixes to avoid sending
IPIs to offline CPUs.
- Updates to simplify RCU's dyntick-idle handling.
- Updates to remove almost all uses of smp_read_barrier_depends()
and read_barrier_depends().
- Miscellaneous fixes.
- Torture-test updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If userspace attempted to set a "security.capability" xattr shorter than
4 bytes (e.g. 'setfattr -n security.capability -v x file'), then
cap_convert_nscap() read past the end of the buffer containing the xattr
value because it accessed the ->magic_etc field without verifying that
the xattr value is long enough to contain that field.
Fix it by validating the xattr value size first.
This bug was found using syzkaller with KASAN. The KASAN report was as
follows (cleaned up slightly):
BUG: KASAN: slab-out-of-bounds in cap_convert_nscap+0x514/0x630 security/commoncap.c:498
Read of size 4 at addr ffff88002d8741c0 by task syz-executor1/2852
CPU: 0 PID: 2852 Comm: syz-executor1 Not tainted 4.15.0-rc6-00200-gcc0aac99d977 #253
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0xe3/0x195 lib/dump_stack.c:53
print_address_description+0x73/0x260 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x235/0x350 mm/kasan/report.c:409
cap_convert_nscap+0x514/0x630 security/commoncap.c:498
setxattr+0x2bd/0x350 fs/xattr.c:446
path_setxattr+0x168/0x1b0 fs/xattr.c:472
SYSC_setxattr fs/xattr.c:487 [inline]
SyS_setxattr+0x36/0x50 fs/xattr.c:483
entry_SYSCALL_64_fastpath+0x18/0x85
Fixes: 8db6c34f1d ("Introduce v3 namespaced file capabilities")
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Pull x86 page table isolation updates from Thomas Gleixner:
"This is the final set of enabling page table isolation on x86:
- Infrastructure patches for handling the extra page tables.
- Patches which map the various bits and pieces which are required to
get in and out of user space into the user space visible page
tables.
- The required changes to have CR3 switching in the entry/exit code.
- Optimizations for the CR3 switching along with documentation how
the ASID/PCID mechanism works.
- Updates to dump pagetables to cover the user space page tables for
W+X scans and extra debugfs files to analyze both the kernel and
the user space visible page tables
The whole functionality is compile time controlled via a config switch
and can be turned on/off on the command line as well"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
x86/ldt: Make the LDT mapping RO
x86/mm/dump_pagetables: Allow dumping current pagetables
x86/mm/dump_pagetables: Check user space page table for WX pages
x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
x86/mm/pti: Add Kconfig
x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
x86/mm: Use INVPCID for __native_flush_tlb_single()
x86/mm: Optimize RESTORE_CR3
x86/mm: Use/Fix PCID to optimize user/kernel switches
x86/mm: Abstract switching CR3
x86/mm: Allow flushing for future ASID switches
x86/pti: Map the vsyscall page if needed
x86/pti: Put the LDT in its own PGD if PTI is on
x86/mm/64: Make a full PGD-entry size hole in the memory map
x86/events/intel/ds: Map debug buffers in cpu_entry_area
x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
x86/mm/pti: Map ESPFIX into user space
x86/mm/pti: Share entry text PMD
x86/entry: Align entry text section to PMD boundary
...
This patch fixes the warning reported by smatch:
security/smack/smack_lsm.c:2872 smack_socket_connect() warn:
variable dereferenced before check 'sock->sk' (see line 2869)
Signed-off-by: Vasyl Gomonovych <gomonovych@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Finally allow CONFIG_PAGE_TABLE_ISOLATION to be enabled.
PARAVIRT generally requires that the kernel not manage its own page tables.
It also means that the hypervisor and kernel must agree wholeheartedly
about what format the page tables are in and what they contain.
PAGE_TABLE_ISOLATION, unfortunately, changes the rules and they
can not be used together.
I've seen conflicting feedback from maintainers lately about whether they
want the Kconfig magic to go first or last in a patch series. It's going
last here because the partially-applied series leads to kernels that can
not boot in a bunch of cases. I did a run through the entire series with
CONFIG_PAGE_TABLE_ISOLATION=y to look for build errors, though.
[ tglx: Removed SMP and !PARAVIRT dependencies as they not longer exist ]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
As done for /proc/kcore in
commit df04abfd18 ("fs/proc/kcore.c: Add bounce buffer for ktext data")
this adds a bounce buffer when reading memory via /dev/mem. This
is needed to allow kernel text memory to be read out when built with
CONFIG_HARDENED_USERCOPY (which refuses to read out kernel text) and
without CONFIG_STRICT_DEVMEM (which would have refused to read any RAM
contents at all).
Since this build configuration isn't common (most systems with
CONFIG_HARDENED_USERCOPY also have CONFIG_STRICT_DEVMEM), this also tries
to inform Kconfig about the recommended settings.
This patch is modified from Brad Spengler/PaX Team's changes to /dev/mem
code in the last public patch of grsecurity/PaX based on my understanding
of the code. Changes or omissions from the original code are mine and
don't reflect the original grsecurity/PaX code.
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Fixes: f5509cc18d ("mm: Hardened usercopy")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
i_version is only supported by a filesystem when the SB_I_VERSION
flag is set. This patch tests for the SB_I_VERSION flag before using
i_version. If we can't use i_version to detect a file change then we
must assume the file has changed in the last_writer path and remeasure
it.
On filesystems without i_version support IMA used to measure a file
only once and didn't detect any changes to a file. With this patch
IMA now works properly on these filesystems.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The init_once routine memsets the whole object to 0, and then
explicitly sets some of the fields to 0 again. Just remove the explicit
initializations.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Simple but useful message log to the user in case of module appraise is
forced and fails due to the lack of file descriptor, that might be
caused by kmod calls to compressed modules.
Signed-off-by: Bruno E. O. Meneguele <brdeoliv@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
ima_rdwr_violation_check() retrieves the full path of a measured file by
calling ima_d_path(). If process_measurement() calls this function, it
reuses the pointer and passes it to the functions to measure/appraise/audit
an accessed file.
After commit bc15ed663e ("ima: fix ima_d_path() possible race with
rename"), ima_d_path() first tries to retrieve the full path by calling
d_absolute_path() and, if there is an error, copies the dentry name to the
buffer passed as argument.
However, ima_rdwr_violation_check() passes to ima_d_path() the pointer of a
local variable. process_measurement() might be reusing the pointer to an
area in the stack which may have been already overwritten after
ima_rdwr_violation_check() returned.
Correct this issue by passing to ima_rdwr_violation_check() the pointer of
a buffer declared in process_measurement().
Fixes: bc15ed663e ("ima: fix ima_d_path() possible race with rename")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Line continuations with excess spacing causes unexpected output.
Based on commit 6f76b6fcaa ("CodingStyle: Document the exception of
not splitting user-visible strings, for grepping") recommendation.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The builtin ima_appraise_tcb policy, which is specified on the boot
command line, can be replaced with a custom policy, normally early in
the boot process. Custom policies can be more restrictive in some ways,
like requiring file signatures, but can be less restrictive in other
ways, like not appraising mutable files. With a less restrictive policy
in place, files in the builtin policy might not be hashed and labeled
with a security.ima hash. On reboot, files which should be labeled in
the ima_appraise_tcb are not labeled, possibly preventing the system
from booting properly.
To resolve this problem, this patch extends the existing IMA policy
actions "measure", "dont_measure", "appraise", "dont_appraise", and
"audit" with "hash" and "dont_hash". The new "hash" action will write
the file hash as security.ima, but without requiring the file to be
appraised as well.
For example, the builtin ima_appraise_tcb policy includes the rule,
"appraise fowner=0". Adding the "hash fowner=0" rule to a custom
policy, will cause the needed file hashes to be calculated and written
as security.ima xattrs.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Three sets of overlapping changes, two in the packet scheduler
and one in the meson-gxl PHY driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
i_version is only supported by a filesystem when the SB_I_VERSION
flag is set. This patch tests for the SB_I_VERSION flag before using
i_version. If we can't use i_version to detect a file change then we
must assume the file has changed in the last_writer path and remeasure
it.
On filesystems without i_version support IMA used to measure a file
only once and didn't detect any changes to a file. With this patch
IMA now works properly on these filesystems.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Before IMA appraisal was introduced, IMA was using own integrity cache
lock along with i_mutex. process_measurement and ima_file_free took
the iint->mutex first and then the i_mutex, while setxattr, chmod and
chown took the locks in reverse order. To resolve the potential deadlock,
i_mutex was moved to protect entire IMA functionality and the redundant
iint->mutex was eliminated.
Solution was based on the assumption that filesystem code does not take
i_mutex further. But when file is opened with O_DIRECT flag, direct-io
implementation takes i_mutex and produces deadlock. Furthermore, certain
other filesystem operations, such as llseek, also take i_mutex.
More recently some filesystems have replaced their filesystem specific
lock with the global i_rwsem to read a file. As a result, when IMA
attempts to calculate the file hash, reading the file attempts to take
the i_rwsem again.
To resolve O_DIRECT related deadlock problem, this patch re-introduces
iint->mutex. But to eliminate the original chmod() related deadlock
problem, this patch eliminates the requirement for chmod hooks to take
the iint->mutex by introducing additional atomic iint->attr_flags to
indicate calling of the hooks. The allowed locking order is to take
the iint->mutex first and then the i_rwsem.
Original flags were cleared in chmod(), setxattr() or removwxattr()
hooks and tested when file was closed or opened again. New atomic flags
are set or cleared in those hooks and tested to clear iint->flags on
close or on open.
Atomic flags are following:
* IMA_CHANGE_ATTR - indicates that chATTR() was called (chmod, chown,
chgrp) and file attributes have changed. On file open, it causes IMA
to clear iint->flags to re-evaluate policy and perform IMA functions
again.
* IMA_CHANGE_XATTR - indicates that setxattr or removexattr was called
and extended attributes have changed. On file open, it causes IMA to
clear iint->flags IMA_DONE_MASK to re-appraise.
* IMA_UPDATE_XATTR - indicates that security.ima needs to be updated.
It is cleared if file policy changes and no update is needed.
* IMA_DIGSIG - indicates that file security.ima has signature and file
security.ima must not update to file has on file close.
* IMA_MUST_MEASURE - indicates the file is in the measurement policy.
Fixes: Commit 6552321831 ("xfs: remove i_iolock and use i_rwsem in
the VFS inode instead")
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The EVM signature includes the inode number and (optionally) the
filesystem UUID, making it impractical to ship EVM signatures in
packages. This patch adds a new portable format intended to allow
distributions to include EVM signatures. It is identical to the existing
format but hardcodes the inode and generation numbers to 0 and does not
include the filesystem UUID even if the kernel is configured to do so.
Removing the inode means that the metadata and signature from one file
could be copied to another file without invalidating it. This is avoided
by ensuring that an IMA xattr is present during EVM validation.
Portable signatures are intended to be immutable - ie, they will never
be transformed into HMACs.
Based on earlier work by Dmitry Kasatkin and Mikhail Kurinnoi.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
Cc: Mikhail Kurinnoi <viewizard@viewizard.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
When EVM is enabled it forbids modification of metadata protected by
EVM unless there is already a valid EVM signature. If any modification
is made, the kernel will then generate a new EVM HMAC. However, this
does not map well on use cases which use only asymmetric EVM signatures,
as in this scenario the kernel is unable to generate new signatures.
This patch extends the /sys/kernel/security/evm interface to allow
userland to request that modification of these xattrs be permitted. This
is only permitted if no keys have already been loaded. In this
configuration, modifying the metadata will invalidate the EVM appraisal
on the file in question. This allows packaging systems to write out new
files, set the relevant extended attributes and then move them into
place.
There's also some refactoring of the use of evm_initialized in order to
avoid heading down codepaths that assume there's a key available.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Custom policies can require file signatures based on LSM labels. These
files are normally created and only afterwards labeled, requiring them
to be signed.
Instead of requiring file signatures based on LSM labels, entire
filesystems could require file signatures. In this case, we need the
ability of writing new files without requiring file signatures.
The definition of a "new" file was originally defined as any file with
a length of zero. Subsequent patches redefined a "new" file to be based
on the FILE_CREATE open flag. By combining the open flag with a file
size of zero, this patch relaxes the file signature requirement.
Fixes: 1ac202e978 ima: accept previously set IMA_NEW_FILE
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
keyctl_restrict_keyring() allows through a NULL restriction when the
"type" is non-NULL, which causes a NULL pointer dereference in
asymmetric_lookup_restriction() when it calls strcmp() on the
restriction string.
But no key types actually use a "NULL restriction" to mean anything, so
update keyctl_restrict_keyring() to reject it with EINVAL.
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 97d3aa0f31 ("KEYS: Add a lookup_restriction function for the asymmetric key type")
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Variable key_ref is being assigned a value that is never read;
key_ref is being re-assigned a few statements later. Hence this
assignment is redundant and can be removed.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
When the request_key() syscall is not passed a destination keyring, it
links the requested key (if constructed) into the "default" request-key
keyring. This should require Write permission to the keyring. However,
there is actually no permission check.
This can be abused to add keys to any keyring to which only Search
permission is granted. This is because Search permission allows joining
the keyring. keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_SESSION_KEYRING)
then will set the default request-key keyring to the session keyring.
Then, request_key() can be used to add keys to the keyring.
Both negatively and positively instantiated keys can be added using this
method. Adding negative keys is trivial. Adding a positive key is a
bit trickier. It requires that either /sbin/request-key positively
instantiates the key, or that another thread adds the key to the process
keyring at just the right time, such that request_key() misses it
initially but then finds it in construct_alloc_key().
Fix this bug by checking for Write permission to the keyring in
construct_get_dest_keyring() when the default keyring is being used.
We don't do the permission check for non-default keyrings because that
was already done by the earlier call to lookup_user_key(). Also,
request_key_and_link() is currently passed a 'struct key *' rather than
a key_ref_t, so the "possessed" bit is unavailable.
We also don't do the permission check for the "requestor keyring", to
continue to support the use case described by commit 8bbf4976b5
("KEYS: Alter use of key instantiation link-to-keyring argument") where
/sbin/request-key recursively calls request_key() to add keys to the
original requestor's destination keyring. (I don't know of any users
who actually do that, though...)
Fixes: 3e30148c3d ("[PATCH] Keys: Make request-key create an authorisation key")
Cc: <stable@vger.kernel.org> # v2.6.13+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
In request_key_and_link(), in the case where the dest_keyring was
explicitly specified, there is no need to get another reference to
dest_keyring before calling key_link(), then drop it afterwards. This
is because by definition, we already have a reference to dest_keyring.
This change is useful because we'll be making
construct_get_dest_keyring() able to return an error code, and we don't
want to have to handle that error here for no reason.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
We can't do anything reasonable in security_bounded_transition() if we
don't have a policy loaded, and in fact we could run into problems
with some of the code inside expecting a policy. Fix these problems
like we do many others in security/selinux/ss/services.c by checking
to see if the policy is loaded (ss_initialized) and returning quickly
if it isn't.
Reported-by: syzbot <syzkaller-bugs@googlegroups.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Small overlapping change conflict ('net' changed a line,
'net-next' added a line right afterwards) in flexcan.c
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that the associative-array library properly heads dependency chains,
the various smp_read_barrier_depends() calls in security/keys/keyring.c
are no longer needed. This commit therefore removes them.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: <keyrings@vger.kernel.org>
Cc: <linux-security-module@vger.kernel.org>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Only IPSEC routes have a non-NULL dst->child pointer. And IPSEC
routes are identified by a non-NULL dst->xfrm pointer.
Signed-off-by: David S. Miller <davem@davemloft.net>
The syzbot/syzkaller automated tests found a problem in
security_context_to_sid_core() during early boot (before we load the
SELinux policy) where we could potentially feed context strings without
NUL terminators into the strcmp() function.
We already guard against this during normal operation (after the SELinux
policy has been loaded) by making a copy of the context strings and
explicitly adding a NUL terminator to the end. The patch extends this
protection to the early boot case (no loaded policy) by moving the context
copy earlier in security_context_to_sid_core().
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Reviewed-By: William Roberts <william.c.roberts@intel.com>
This is a pure automated search-and-replace of the internal kernel
superblock flags.
The s_flags are now called SB_*, with the names and the values for the
moment mirroring the MS_* flags that they're equivalent to.
Note how the MS_xyz flags are the ones passed to the mount system call,
while the SB_xyz flags are what we then use in sb->s_flags.
The script to do this was:
# places to look in; re security/*: it generally should *not* be
# touched (that stuff parses mount(2) arguments directly), but
# there are two places where we really deal with superblock flags.
FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
include/linux/fs.h include/uapi/linux/bfs_fs.h \
security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
# the list of MS_... constants
SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
ACTIVE NOUSER"
SED_PROG=
for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
# we want files that contain at least one of MS_...,
# with fs/namespace.c and fs/pnode.c excluded.
L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
for f in $L; do sed -i $f $SED_PROG; done
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull timer updates from Thomas Gleixner:
- The final conversion of timer wheel timers to timer_setup().
A few manual conversions and a large coccinelle assisted sweep and
the removal of the old initialization mechanisms and the related
code.
- Remove the now unused VSYSCALL update code
- Fix permissions of /proc/timer_list. I still need to get rid of that
file completely
- Rename a misnomed clocksource function and remove a stale declaration
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
m68k/macboing: Fix missed timer callback assignment
treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts
timer: Remove redundant __setup_timer*() macros
timer: Pass function down to initialization routines
timer: Remove unused data arguments from macros
timer: Switch callback prototype to take struct timer_list * argument
timer: Pass timer_list pointer to callbacks unconditionally
Coccinelle: Remove setup_timer.cocci
timer: Remove setup_*timer() interface
timer: Remove init_timer() interface
treewide: setup_timer() -> timer_setup() (2 field)
treewide: setup_timer() -> timer_setup()
treewide: init_timer() -> setup_timer()
treewide: Switch DEFINE_TIMER callbacks to struct timer_list *
s390: cmm: Convert timers to use timer_setup()
lightnvm: Convert timers to use timer_setup()
drivers/net: cris: Convert timers to use timer_setup()
drm/vc4: Convert timers to use timer_setup()
block/laptop_mode: Convert timers to use timer_setup()
net/atm/mpc: Avoid open-coded assignment of timer callback function
...
Pull keys update from James Morris:
"There's nothing too controversial here:
- Doc fix for keyctl_read().
- time_t -> time64_t replacement.
- Set the module licence on things to prevent tainting"
* 'next-keys' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
pkcs7: Set the module licence to prevent tainting
security: keys: Replace time_t with time64_t for struct key_preparsed_payload
security: keys: Replace time_t/timespec with time64_t
KEYS: fix in-kernel documentation for keyctl_read()
This changes all DEFINE_TIMER() callbacks to use a struct timer_list
pointer instead of unsigned long. Since the data argument has already been
removed, none of these callbacks are using their argument currently, so
this renames the argument to "unused".
Done using the following semantic patch:
@match_define_timer@
declarer name DEFINE_TIMER;
identifier _timer, _callback;
@@
DEFINE_TIMER(_timer, _callback);
@change_callback depends on match_define_timer@
identifier match_define_timer._callback;
type _origtype;
identifier _origarg;
@@
void
-_callback(_origtype _origarg)
+_callback(struct timer_list *unused)
{ ... }
Signed-off-by: Kees Cook <keescook@chromium.org>
It used to be that unconfined would never attach. However that is not
the case anymore as some special profiles can be marked as unconfined,
that are not the namespaces unconfined profile, and may have an
attachment.
Fixes: f1bd904175 ("apparmor: add the base fns() for domain labels")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Profiles that have an undecidable overlap in their attachments are
being incorrectly handled. Instead of failing to attach the first one
encountered is being used.
eg.
profile A /** { .. }
profile B /*foo { .. }
have an unresolvable longest left attachment, they both have an exact
match on / and then have an overlapping expression that has no clear
winner.
Currently the winner will be the profile that is loaded first which
can result in non-deterministic behavior. Instead in this situation
the exec should fail.
Fixes: 898127c34e ("AppArmor: functions for domain transitions")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Fixes: d07881d2ed ("apparmor: move new_null_profile to after profile lookup fns()")
Reported-by: Seth Arnold <seth.arnold@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
The boolean variable 'stop' is being set but never read. This
is a redundant variable and can be removed.
Cleans up clang warning: Value stored to 'stop' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Bool initializations should use true and false. Bool tests don't need
comparisons.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: John Johansen <john.johansen@canonical.com>
gcc-4.4 points out suspicious code in compute_mnt_perms, where
the aa_perms structure is only partially initialized before getting
returned:
security/apparmor/mount.c: In function 'compute_mnt_perms':
security/apparmor/mount.c:227: error: 'perms.prompt' is used uninitialized in this function
security/apparmor/mount.c:227: error: 'perms.hide' is used uninitialized in this function
security/apparmor/mount.c:227: error: 'perms.cond' is used uninitialized in this function
security/apparmor/mount.c:227: error: 'perms.complain' is used uninitialized in this function
security/apparmor/mount.c:227: error: 'perms.stop' is used uninitialized in this function
security/apparmor/mount.c:227: error: 'perms.deny' is used uninitialized in this function
Returning or assigning partially initialized structures is a bit tricky,
in particular it is explicitly allowed in c99 to assign a partially
initialized structure to another, as long as only members are read that
have been initialized earlier. Looking at what various compilers do here,
the version that produced the warning copied uninitialized stack data,
while newer versions (and also clang) either set the other members to
zero or don't update the parts of the return buffer that are not modified
in the temporary structure, but they never warn about this.
In case of apparmor, it seems better to be a little safer and always
initialize the aa_perms structure. Most users already do that, this
changes the remaining ones, including the one instance that I got the
warning for.
Fixes: fa488437d0f9 ("apparmor: add mount mediation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Seth Arnold <seth.arnold@canonical.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Trivial fix to spelling mistake in comment and also with text in
audit_resource call.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
A few years ago the FSF moved and "59 Temple Place" is wrong. Having this
still in our source files feels old and unmaintained.
Let's take the license statement serious and not confuse users.
As https://www.gnu.org/licenses/gpl-howto.html suggests, we replace the
postal address with "<http://www.gnu.org/licenses/>" in the security
directory.
Signed-off-by: Martin Kepplinger <martink@posteo.de>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Commit b65a9cfc2c ("Untangling ima mess, part 2: deal with counters")
moved the call of ima_file_check() from may_open() to do_filp_open() at a
point where the file descriptor is already opened.
This breaks the assumption made by IMA that file descriptors being closed
belong to files whose access was granted by ima_file_check(). The
consequence is that security.ima and security.evm are updated with good
values, regardless of the current appraisal status.
For example, if a file does not have security.ima, IMA will create it after
opening the file for writing, even if access is denied. Access to the file
will be allowed afterwards.
Avoid this issue by checking the appraisal status before updating
security.ima.
Cc: stable@vger.kernel.org
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Summary of modules changes for the 4.15 merge window:
- Treewide module_param_call() cleanup, fix up set/get function
prototype mismatches, from Kees Cook
- Minor code cleanups
Signed-off-by: Jessica Yu <jeyu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJaDCyzAAoJEMBFfjjOO8FyaYQP/AwHBy6XmwwVlWDP4BqIF6hL
Vhy3ccVLYEORvePv68tWSRPUz5n6+1Ebqanmwtkw6i8l+KwxY2SfkZql09cARc33
2iBE4bHF98iWQmnJbF6me80fedY9n5bZJNMQKEF9VozJWwTMOTQFTCfmyJRDBmk9
iidQj6M3idbSUOYIJjvc40VGx5NyQWSr+FFfqsz1rU5iLGRGEvA3I2/CDT0oTuV6
D4MmFxzE2Tv/vIMa2GzKJ1LGScuUfSjf93Lq9Kk0cG36qWao8l930CaXyVdE9WJv
bkUzpf3QYv/rDX6QbAGA0cada13zd+dfBr8YhchclEAfJ+GDLjMEDu04NEmI6KUT
5lP0Xw0xYNZQI7bkdxDMhsj5jaz/HJpXCjPCtZBnSEKiL4OPXVMe+pBHoCJ2/yFN
6M716XpWYgUviUOdiE+chczB5p3z4FA6u2ykaM4Tlk0btZuHGxjcSWwvcIdlPmjm
kY4AfDV6K0bfEBVguWPJicvrkx44atqT5nWbbPhDwTSavtsuRJLb3GCsHedx7K8h
ZO47lCQFAWCtrycK1HYw+oupNC3hYWQ0SR42XRdGhL1bq26C+1sei1QhfqSgA9PQ
7CwWH4UTOL9fhtrzSqZngYOh9sjQNFNefqQHcecNzcEjK2vjrgQZvRNWZKHSwaFs
fbGX8juZWP4ypbK+irTB
=c8vb
-----END PGP SIGNATURE-----
Merge tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu:
"Summary of modules changes for the 4.15 merge window:
- treewide module_param_call() cleanup, fix up set/get function
prototype mismatches, from Kees Cook
- minor code cleanups"
* tag 'modules-for-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
module: Do not paper over type mismatches in module_param_call()
treewide: Fix function prototypes for module_param_call()
module: Prepare to convert all module_param_call() prototypes
kernel/module: Delete an error message for a failed memory allocation in add_module_usage()
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEcQCq365ubpQNLgrWVeRaWujKfIoFAloJ+XwUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQVeRaWujKfIqv3w//aNbHxEvf59yf9TjdrmJE6ivFlTAL
RmYCMsFn7uEisolTX1LPnz3cVNqN2/GQ5cnfcnrMiw7d2E/k85jq6Ket6NysX0Wi
LCj6V/JTDeibB41GPDbiC9pSbIER5kUaqXI+X2aR4PgGumxxjdIqmammd1Sf1Hy9
470OJjDnhH68KFLG4bxVPY8Y4j4i/xgIKHU9z9EUA5LErhDAajWADSbvLcTZcp4b
eja/DeSb8zbvHloB/maoqI9mnSKnwuGd91nz6cJBb92Lhy2A6xXNMCIVY/9tsb9n
ZOw8NvFbfpOZ6WUZgwQCjdyn4nV0TuJQPpXNcjVoD9djczTnBq5EXz0FpHcM7d7n
d44DeHOMJmJ2vGIbUz9MpelAxqckhY9wh/XTi1Kszr6qR8kSfzSnDsk1/bOWHWdk
2dKz6MAr4GbN6vgWRTtuuZ7db8TqFa1KdLVicKC0okaYlc0dH5PWC3KahKHbjgGi
5COdBhFexkeL82kJtMrFbusMNetBrYoLO4qOoSThuEOoCEGh0Fgx5zXlLFlEm3uv
hLtEdxT+FLO3jFKCVejLoIHwl/YJ+Pd8C2rAkaXV8AEvs2Cn1gni4lb150nXtq5+
BhkrkjvthYTXuQZH+yMsoTVFVa0QVm2U+QgLmf19MT1EUqc46xIczYUAInV2VxlY
2WQ4d6mcD+PMzoQ=
=VtxX
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux updates from Paul Moore:
"Seven SELinux patches for v4.15, although five of the seven are small
build fixes and cleanups.
Of the remaining two patches, the only one worth really calling out is
Eric's fix for the SELinux filesystem xattr set/remove code; the other
patch simply converts the SELinux hash table implementation to use
kmem_cache.
Eric's setxattr/removexattr tweak converts SELinux back to calling the
commoncap implementations when the xattr is not SELinux related. The
immediate win is to fixup filesystem capabilities in user namespaces,
but it makes things a bit saner overall; more information in the
commit description"
* tag 'selinux-pr-20171113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: remove extraneous initialization of slots_used and max_chain_len
selinux: remove redundant assignment to len
selinux: remove redundant assignment to str
selinux: fix build warning
selinux: fix build warning by removing the unused sid variable
selinux: Perform both commoncap and selinux xattr checks
selinux: Use kmem_cache for hashtab_node
Pull networking updates from David Miller:
"Highlights:
1) Maintain the TCP retransmit queue using an rbtree, with 1GB
windows at 100Gb this really has become necessary. From Eric
Dumazet.
2) Multi-program support for cgroup+bpf, from Alexei Starovoitov.
3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew
Lunn.
4) Add meter action support to openvswitch, from Andy Zhou.
5) Add a data meta pointer for BPF accessible packets, from Daniel
Borkmann.
6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet.
7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli.
8) More work to move the RTNL mutex down, from Florian Westphal.
9) Add 'bpftool' utility, to help with bpf program introspection.
From Jakub Kicinski.
10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper
Dangaard Brouer.
11) Support 'blocks' of transformations in the packet scheduler which
can span multiple network devices, from Jiri Pirko.
12) TC flower offload support in cxgb4, from Kumar Sanghvi.
13) Priority based stream scheduler for SCTP, from Marcelo Ricardo
Leitner.
14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg.
15) Add RED qdisc offloadability, and use it in mlxsw driver. From
Nogah Frankel.
16) eBPF based device controller for cgroup v2, from Roman Gushchin.
17) Add some fundamental tracepoints for TCP, from Song Liu.
18) Remove garbage collection from ipv6 route layer, this is a
significant accomplishment. From Wei Wang.
19) Add multicast route offload support to mlxsw, from Yotam Gigi"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits)
tcp: highest_sack fix
geneve: fix fill_info when link down
bpf: fix lockdep splat
net: cdc_ncm: GetNtbFormat endian fix
openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start
netem: remove unnecessary 64 bit modulus
netem: use 64 bit divide by rate
tcp: Namespace-ify sysctl_tcp_default_congestion_control
net: Protect iterations over net::fib_notifier_ops in fib_seq_sum()
ipv6: set all.accept_dad to 0 by default
uapi: fix linux/tls.h userspace compilation error
usbnet: ipheth: prevent TX queue timeouts when device not ready
vhost_net: conditionally enable tx polling
uapi: fix linux/rxrpc.h userspace compilation errors
net: stmmac: fix LPI transitioning for dwmac4
atm: horizon: Fix irq release error
net-sysfs: trigger netlink notification on ifalias change via sysfs
openvswitch: Using kfree_rcu() to simplify the code
openvswitch: Make local function ovs_nsh_key_attr_size() static
openvswitch: Fix return value check in ovs_meter_cmd_features()
...
The 'struct key_preparsed_payload' will use 'time_t' which we will
try to remove in the kernel, since 'time_t' is not year 2038 safe on
32bits systems.
Thus this patch replaces 'time_t' with 'time64_t' which is year 2038
safe on 32 bits system for 'struct key_preparsed_payload', moreover
we should use the 'TIME64_MAX' macro to initialize the 'time64_t'
type variable.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
The 'struct key' will use 'time_t' which we try to remove in the
kernel, since 'time_t' is not year 2038 safe on 32bit systems.
Also the 'struct keyring_search_context' will use 'timespec' type
to record current time, which is also not year 2038 safe on 32bit
systems.
Thus this patch replaces 'time_t' with 'time64_t' which is year 2038
safe for 'struct key', and replace 'timespec' with 'time64_t' for the
'struct keyring_search_context', since we only look at the the seconds
part of 'timespec' variable. Moreover we also change the codes where
using the 'time_t' and 'timespec', and we can get current time by
ktime_get_real_seconds() instead of current_kernel_time(), and use
'TIME64_MAX' macro to initialize the 'time64_t' type variable.
Especially in proc.c file, we have replaced 'unsigned long' and 'timespec'
type with 'u64' and 'time64_t' type to save the timeout value, which means
user will get one 'u64' type timeout value by issuing proc_keys_show()
function.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.15:
API:
- Disambiguate EBUSY when queueing crypto request by adding ENOSPC.
This change touches code outside the crypto API.
- Reset settings when empty string is written to rng_current.
Algorithms:
- Add OSCCA SM3 secure hash.
Drivers:
- Remove old mv_cesa driver (replaced by marvell/cesa).
- Enable rfc3686/ecb/cfb/ofb AES in crypto4xx.
- Add ccm/gcm AES in crypto4xx.
- Add support for BCM7278 in iproc-rng200.
- Add hash support on Exynos in s5p-sss.
- Fix fallback-induced error in vmx.
- Fix output IV in atmel-aes.
- Fix empty GCM hash in mediatek.
Others:
- Fix DoS potential in lib/mpi.
- Fix potential out-of-order issues with padata"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (162 commits)
lib/mpi: call cond_resched() from mpi_powm() loop
crypto: stm32/hash - Fix return issue on update
crypto: dh - Remove pointless checks for NULL 'p' and 'g'
crypto: qat - Clean up error handling in qat_dh_set_secret()
crypto: dh - Don't permit 'key' or 'g' size longer than 'p'
crypto: dh - Don't permit 'p' to be 0
crypto: dh - Fix double free of ctx->p
hwrng: iproc-rng200 - Add support for BCM7278
dt-bindings: rng: Document BCM7278 RNG200 compatible
crypto: chcr - Replace _manual_ swap with swap macro
crypto: marvell - Add a NULL entry at the end of mv_cesa_plat_id_table[]
hwrng: virtio - Virtio RNG devices need to be re-registered after suspend/resume
crypto: atmel - remove empty functions
crypto: ecdh - remove empty exit()
MAINTAINERS: update maintainer for qat
crypto: caam - remove unused param of ctx_map_to_sec4_sg()
crypto: caam - remove unneeded edesc zeroization
crypto: atmel-aes - Reset the controller before each use
crypto: atmel-aes - properly set IV after {en,de}crypt
hwrng: core - Reset user selected rng by writing "" to rng_current
...
Pull timer updates from Thomas Gleixner:
"Yet another big pile of changes:
- More year 2038 work from Arnd slowly reaching the point where we
need to think about the syscalls themself.
- A new timer function which allows to conditionally (re)arm a timer
only when it's either not running or the new expiry time is sooner
than the armed expiry time. This allows to use a single timer for
multiple timeout requirements w/o caring about the first expiry
time at the call site.
- A new NMI safe accessor to clock real time for the printk timestamp
work. Can be used by tracing, perf as well if required.
- A large number of timer setup conversions from Kees which got
collected here because either maintainers requested so or they
simply got ignored. As Kees pointed out already there are a few
trivial merge conflicts and some redundant commits which was
unavoidable due to the size of this conversion effort.
- Avoid a redundant iteration in the timer wheel softirq processing.
- Provide a mechanism to treat RTC implementations depending on their
hardware properties, i.e. don't inflict the write at the 0.5
seconds boundary which originates from the PC CMOS RTC to all RTCs.
No functional change as drivers need to be updated separately.
- The usual small updates to core code clocksource drivers. Nothing
really exciting"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (111 commits)
timers: Add a function to start/reduce a timer
pstore: Use ktime_get_real_fast_ns() instead of __getnstimeofday()
timer: Prepare to change all DEFINE_TIMER() callbacks
netfilter: ipvs: Convert timers to use timer_setup()
scsi: qla2xxx: Convert timers to use timer_setup()
block/aoe: discover_timer: Convert timers to use timer_setup()
ide: Convert timers to use timer_setup()
drbd: Convert timers to use timer_setup()
mailbox: Convert timers to use timer_setup()
crypto: Convert timers to use timer_setup()
drivers/pcmcia: omap1: Fix error in automated timer conversion
ARM: footbridge: Fix typo in timer conversion
drivers/sgi-xp: Convert timers to use timer_setup()
drivers/pcmcia: Convert timers to use timer_setup()
drivers/memstick: Convert timers to use timer_setup()
drivers/macintosh: Convert timers to use timer_setup()
hwrng/xgene-rng: Convert timers to use timer_setup()
auxdisplay: Convert timers to use timer_setup()
sparc/led: Convert timers to use timer_setup()
mips: ip22/32: Convert timers to use timer_setup()
...
Pull core locking updates from Ingo Molnar:
"The main changes in this cycle are:
- Another attempt at enabling cross-release lockdep dependency
tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time
with better performance and fewer false positives. (Byungchul Park)
- Introduce lockdep_assert_irqs_enabled()/disabled() and convert
open-coded equivalents to lockdep variants. (Frederic Weisbecker)
- Add down_read_killable() and use it in the VFS's iterate_dir()
method. (Kirill Tkhai)
- Convert remaining uses of ACCESS_ONCE() to
READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle
driven. (Mark Rutland, Paul E. McKenney)
- Get rid of lockless_dereference(), by strengthening Alpha atomics,
strengthening READ_ONCE() with smp_read_barrier_depends() and thus
being able to convert users of lockless_dereference() to
READ_ONCE(). (Will Deacon)
- Various micro-optimizations:
- better PV qspinlocks (Waiman Long),
- better x86 barriers (Michael S. Tsirkin)
- better x86 refcounts (Kees Cook)
- ... plus other fixes and enhancements. (Borislav Petkov, Juergen
Gross, Miguel Bernal Marin)"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
rcu: Use lockdep to assert IRQs are disabled/enabled
netpoll: Use lockdep to assert IRQs are disabled/enabled
timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled
sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled
irq_work: Use lockdep to assert IRQs are disabled/enabled
irq/timings: Use lockdep to assert IRQs are disabled/enabled
perf/core: Use lockdep to assert IRQs are disabled/enabled
x86: Use lockdep to assert IRQs are disabled/enabled
smp/core: Use lockdep to assert IRQs are disabled/enabled
timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled
timers/nohz: Use lockdep to assert IRQs are disabled/enabled
workqueue: Use lockdep to assert IRQs are disabled/enabled
irq/softirqs: Use lockdep to assert IRQs are disabled/enabled
locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled()
locking/pvqspinlock: Implement hybrid PV queued/unfair locks
locking/rwlocks: Fix comments
x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized
block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion()
workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes
...
Pull security subsystem integrity updates from James Morris:
"There is a mixture of bug fixes, code cleanup, preparatory code for
new functionality and new functionality.
Commit 26ddabfe96 ("evm: enable EVM when X509 certificate is
loaded") enabled EVM without loading a symmetric key, but was limited
to defining the x509 certificate pathname at build. Included in this
set of patches is the ability of enabling EVM, without loading the EVM
symmetric key, from userspace. New is the ability to prevent the
loading of an EVM symmetric key."
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
ima: Remove redundant conditional operator
ima: Fix bool initialization/comparison
ima: check signature enforcement against cmdline param instead of CONFIG
module: export module signature enforcement status
ima: fix hash algorithm initialization
EVM: Only complain about a missing HMAC key once
EVM: Allow userspace to signal an RSA key has been loaded
EVM: Include security.apparmor in EVM measurements
ima: call ima_file_free() prior to calling fasync
integrity: use kernel_read_file_from_path() to read x509 certs
ima: always measure and audit files in policy
ima: don't remove the securityfs policy file
vfs: fix mounting a filesystem with i_version
Pull general security subsystem updates from James Morris:
"TPM (from Jarkko):
- essential clean up for tpm_crb so that ARM64 and x86 versions do
not distract each other as much as before
- /dev/tpm0 rejects now too short writes (shorter buffer than
specified in the command header
- use DMA-safe buffer in tpm_tis_spi
- otherwise mostly minor fixes.
Smack:
- base support for overlafs
Capabilities:
- BPRM_FCAPS fixes, from Richard Guy Briggs:
The audit subsystem is adding a BPRM_FCAPS record when auditing
setuid application execution (SYSCALL execve). This is not expected
as it was supposed to be limited to when the file system actually
had capabilities in an extended attribute. It lists all
capabilities making the event really ugly to parse what is
happening. The PATH record correctly records the setuid bit and
owner. Suppress the BPRM_FCAPS record on set*id.
TOMOYO:
- Y2038 timestamping fixes"
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (28 commits)
MAINTAINERS: update the IMA, EVM, trusted-keys, encrypted-keys entries
Smack: Base support for overlayfs
MAINTAINERS: remove David Safford as maintainer for encrypted+trusted keys
tomoyo: fix timestamping for y2038
capabilities: audit log other surprising conditions
capabilities: fix logic for effective root or real root
capabilities: invert logic for clarity
capabilities: remove a layer of conditional logic
capabilities: move audit log decision to function
capabilities: use intuitive names for id changes
capabilities: use root_priveleged inline to clarify logic
capabilities: rename has_cap to has_fcap
capabilities: intuitive names for cap gain status
capabilities: factor out cap_bprm_set_creds privileged root
tpm, tpm_tis: use ARRAY_SIZE() to define TPM_HID_USR_IDX
tpm: fix duplicate inline declaration specifier
tpm: fix type of a local variables in tpm_tis_spi.c
tpm: fix type of a local variable in tpm2_map_command()
tpm: fix type of a local variable in tpm2_get_cc_attrs_tbl()
tpm-dev-common: Reject too short writes
...
Simple cases of overlapping changes in the packet scheduler.
Must easier to resolve this time.
Which probably means that I screwed it up somehow.
Signed-off-by: David S. Miller <davem@davemloft.net>
A non-zero value is converted to 1 when assigned to a bool variable, so the
conditional operator in is_ima_appraise_enabled is redundant.
The value of a comparison operator is either 1 or 0 so the conditional
operator in ima_inode_setxattr is redundant as well.
Confirmed that the patch is correct by comparing the object file from
before and after the patch. They are identical.
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Bool initializations should use true and false. Bool tests don't need
comparisons.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
When the user requests MODULE_CHECK policy and its kernel is compiled
with CONFIG_MODULE_SIG_FORCE not set, all modules would not load, just
those loaded in initram time. One option the user would have would be
set a kernel cmdline param (module.sig_enforce) to true, but the IMA
module check code doesn't rely on this value, it checks just
CONFIG_MODULE_SIG_FORCE.
This patch solves this problem checking for the exported value of
module.sig_enforce cmdline param intead of CONFIG_MODULE_SIG_FORCE,
which holds the effective value (CONFIG || param).
Signed-off-by: Bruno E. O. Meneguele <brdeoliv@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The hash_setup function always sets the hash_setup_done flag, even
when the hash algorithm is invalid. This prevents the default hash
algorithm defined as CONFIG_IMA_DEFAULT_HASH from being used.
This patch sets hash_setup_done flag only for valid hash algorithms.
Fixes: e7a2ad7eb6 "ima: enable support for larger default filedata hash
algorithms"
Signed-off-by: Boshi Wang <wangboshi@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
A system can validate EVM digital signatures without requiring an HMAC
key, but every EVM validation will generate a kernel error. Change this
so we only generate an error once.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
EVM will only perform validation once a key has been loaded. This key
may either be a symmetric trusted key (for HMAC validation and creation)
or the public half of an asymmetric key (for digital signature
validation). The /sys/kernel/security/evm interface allows userland to
signal that a symmetric key has been loaded, but does not allow userland
to signal that an asymmetric public key has been loaded.
This patch extends the interface to permit userspace to pass a bitmask
of loaded key types. It also allows userspace to block loading of a
symmetric key in order to avoid a compromised system from being able to
load an additional key type later.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Apparmor will be gaining support for security.apparmor labels, and it
would be helpful to include these in EVM validation now so appropriate
signatures can be generated even before full support is merged.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: John Johansen <John.johansen@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
The CONFIG_IMA_LOAD_X509 and CONFIG_EVM_LOAD_X509 options permit
loading x509 signed certificates onto the trusted keyrings without
verifying the x509 certificate file's signature.
This patch replaces the call to the integrity_read_file() specific
function with the common kernel_read_file_from_path() function.
To avoid verifying the file signature, this patch defines
READING_X509_CERTFICATE.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
All files matching a "measure" rule must be included in the IMA
measurement list, even when the file hash cannot be calculated.
Similarly, all files matching an "audit" rule must be audited, even when
the file hash can not be calculated.
The file data hash field contained in the IMA measurement list template
data will contain 0's instead of the actual file hash digest.
Note:
In general, adding, deleting or in anyway changing which files are
included in the IMA measurement list is not a good idea, as it might
result in not being able to unseal trusted keys sealed to a specific
TPM PCR value. This patch not only adds file measurements that were
not previously measured, but specifies that the file hash value for
these files will be 0's.
As the IMA measurement list ordering is not consistent from one boot
to the next, it is unlikely that anyone is sealing keys based on the
IMA measurement list. Remote attestation servers should be able to
process these new measurement records, but might complain about
these unknown records.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Reviewed-by: Dmitry Kasatkin <dmitry.kasatkin@huawei.com>
The securityfs policy file is removed unless additional rules can be
appended to the IMA policy (CONFIG_IMA_WRITE_POLICY), regardless as
to whether the policy is configured so that it can be displayed.
This patch changes this behavior, removing the securityfs policy file,
only if CONFIG_IMA_READ_POLICY is also not enabled.
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
This came in yesterday, and I have verified our regression tests
were missing this and it can cause an oops. Please apply.
There is a an off-by-one comparision on sig against MAXMAPPED_SIG
that can lead to a read outside the sig_map array if sig
is MAXMAPPED_SIG. Fix this.
Verified that the check is an out of bounds case that can cause an oops.
Revised: add comparison fix to second case
Fixes: cd1dbf76b2 ("apparmor: add the ability to mediate signals")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>