Commit Graph

2534 Commits

Author SHA1 Message Date
Roberto Sassu
f7a859ff73 ima: fix race condition on ima_rdwr_violation_check and process_measurement
This patch fixes a race condition between two functions that try to access
the same inode. Since the i_mutex lock is held and released separately
in the two functions, there may be the possibility that a violation is
not correctly detected.

Suppose there are two processes, A (reader) and B (writer), if the
following sequence happens:

A: ima_rdwr_violation_check()
B: ima_rdwr_violation_check()
B: process_measurement()
B: starts writing the inode
A: process_measurement()

the ToMToU violation (a reader may be accessing a content different from
that measured, due to a concurrent modification by a writer) will not be
detected. To avoid this issue, the violation check and the measurement
must be done atomically.

This patch fixes the problem by moving the violation check inside
process_measurement() when the i_mutex lock is held. Differently from
the old code, the violation check is executed also for the MMAP_CHECK
hook (other than for FILE_CHECK). This allows to detect ToMToU violations
that are possible because shared libraries can be opened for writing
while they are in use (according to the output of 'man mmap', the mmap()
flag MAP_DENYWRITE is ignored).

Changes in v5 (Roberto Sassu):
* get iint if action is not zero
* exit process_measurement() after the violation check if action is zero
* reverse order process_measurement() exit cleanup (Mimi)

Changes in v4 (Dmitry Kasatkin):
* iint allocation is done before calling ima_rdrw_violation_check()
  (Suggested-by Mimi)
* do not check for violations if the policy does not contain 'measure'
  rules (done by Roberto Sassu)

Changes in v3 (Dmitry Kasatkin):
* no violation checking for MMAP_CHECK function in this patch
* remove use of filename from violation
* removes checking if ima is enabled from ima_rdrw_violation_check
* slight style change

Suggested-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-18 10:03:55 -04:00
James Morris
6f98e89288 Merge branch 'smack-for-3.18' of git://git.gitorious.org/smack-next/kernel into next 2014-09-18 23:52:46 +10:00
Roberto Sassu
a756024efe ima: added ima_policy_flag variable
This patch introduces the new variable 'ima_policy_flag', whose bits
are set depending on the action of the current policy rules. Only the
flags IMA_MEASURE, IMA_APPRAISE and IMA_AUDIT are set.

The new variable will be used to improve performance by skipping the
unnecessary execution of IMA code if the policy does not contain rules
with the above actions.

Changes in v6 (Roberto Sassu)
* do not check 'ima_initialized' before calling ima_update_policy_flag()
  in ima_update_policy() (suggested by Dmitry)
* calling ima_update_policy_flag() moved to init_ima to co-locate with
  ima_initialized (Dmitry)
* add/revise comments (Mimi)

Changes in v5 (Roberto Sassu)
* reset IMA_APPRAISE flag in 'ima_policy_flag' if 'ima_appraise' is set
  to zero (reported by Dmitry)
* update 'ima_policy_flag' only if IMA initialization is successful
  (suggested by Mimi and Dmitry)
* check 'ima_policy_flag' instead of 'ima_initialized'
  (suggested by Mimi and Dmitry)

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:39:36 -04:00
Roberto Sassu
be39ffc2fe ima: return an error code from ima_add_boot_aggregate()
This patch modifies ima_add_boot_aggregate() to return an error code.
This way we can determine if all the initialization procedures have
been executed successfully.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:15:42 -04:00
Dmitry Kasatkin
2faa6ef3b2 ima: provide 'ima_appraise=log' kernel option
The kernel boot parameter "ima_appraise" currently defines 'off',
'enforce' and 'fix' modes.  When designing a policy and labeling
the system, access to files are either blocked in the default
'enforce' mode or automatically fixed in the 'fix' mode.  It is
beneficial to be able to run the system in a logging only mode,
without fixing it, in order to properly analyze the system. This
patch adds a 'log' mode to run the system in a permissive mode and
log the appraisal results.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:14:23 -04:00
Dmitry Kasatkin
31b70f6632 ima: move keyring initialization to ima_init()
ima_init() is used as a single place for all initializations.
Experimental keyring patches used the 'late_initcall' which was
co-located with the late_initcall(init_ima). When the late_initcall
for the keyring initialization was abandoned, initialization moved
to init_ima, though it would be more logical to move it to ima_init,
where the rest of the initialization is done. This patch moves the
keyring initialization to ima_init() as a preparatory step for
loading the keys which will be added to ima_init() in following
patches.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-17 16:10:59 -04:00
David Howells
0c903ab64f KEYS: Make the key matching functions return bool
Make the key matching functions pointed to by key_match_data::cmp return bool
rather than int.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:08 +01:00
David Howells
c06cfb08b8 KEYS: Remove key_type::match in favour of overriding default by match_preparse
A previous patch added a ->match_preparse() method to the key type.  This is
allowed to override the function called by the iteration algorithm.
Therefore, we can just set a default that simply checks for an exact match of
the key description with the original criterion data and allow match_preparse
to override it as needed.

The key_type::match op is then redundant and can be removed, as can the
user_match() function.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:06 +01:00
David Howells
614d8c3901 KEYS: Remove key_type::def_lookup_type
Remove key_type::def_lookup_type as it's no longer used.  The information now
defaults to KEYRING_SEARCH_LOOKUP_DIRECT but may be overridden by
type->match_preparse().

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:04 +01:00
David Howells
462919591a KEYS: Preparse match data
Preparse the match data.  This provides several advantages:

 (1) The preparser can reject invalid criteria up front.

 (2) The preparser can convert the criteria to binary data if necessary (the
     asymmetric key type really wants to do binary comparison of the key IDs).

 (3) The preparser can set the type of search to be performed.  This means
     that it's not then a one-off setting in the key type.

 (4) The preparser can set an appropriate comparator function.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:36:02 +01:00
David Howells
1c9c115ccc Keyrings fixes for next
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVBhgQROxKuMESys7AQJv8w/+O61++U9OT5rL/X2pam3h4+tQmM7+2rU6
 rml9ZADkqKJVU1h3tT/WaFcmwR6J56buIWDZ7M7fPPRyvQPhfeBEHcakey+4iCd3
 SyFmP9ElkRvdOyoXRvImNIawsmnyme4CWSIQihSteyzyuV76odOI5HiJxMj6x8RV
 T4bjTsOMYugleWmuJzYPSniAs0my7Gxii7GK4WnLd+Jhxs8xJxiB3DNT4Kecxjuc
 MTldJIx32+Kj2JKp4ygbHM/WOI7jbdpseCedxbL4DNepgdXvVvV42pwPSCc2ydO/
 2b9kF+YnhsGbOv3Qd6GPPE5x0cMLKcmGf3DvJQcfCFSLjqpginGr0V/ylkVSf79S
 mQJLc2R7Z0ST7eFCyvClAeM9ZI6y+APWqo2wvVRNMww3NQhHyj43P2HPXiQEAryg
 UO6ZQehGHgxcx8uN/nB1gi4+S7+sZS6SzwvrTu04h5lYfCRrKSZGWDGpG/n7qtwo
 EKcWiAjupYbJtlDryL5Nz1xxkD6rpLKWbVfkO5FbJ8yAamz5XDl7FiQyJTu32LGj
 jbVma9sEIkJ5IB4mCAdF2tHi4o7QvQQbwWAPLu0B4hxp4JRslV5XhVVAH3CrvIFK
 l8A5nit0NwHms7d8+VLAyieknUForFHlmRheYXCSMvJDMfFToZrCNmTNHxZtyuuj
 voCrpkQYxGU=
 =9AZW
 -----END PGP SIGNATURE-----

Merge tag 'keys-next-fixes-20140916' into keys-next

Merge in keyrings fixes for next:

 (1) Insert some missing 'static' annotations.

Signed-off-by: David Howells <dhowells@redhat.com>
2014-09-16 17:32:55 +01:00
David Howells
68c45c7fea Keyrings fixes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVBhlXROxKuMESys7AQIQghAAmxHRD9kD7AJrlcL187gTXl5dRyxQNhX0
 myT9A1/01jHJh6mPPKS5jt5ooenwfcbvBkdUARLNLq6j7urIzFk+UtSmfcN1OMXw
 3y15nuJKbSCZRzPMR94Z0Ik23YED9dmc4ubxW7E+psoWWIZUvFt4GaGw7ei77O39
 8Pbt/n7nIx2+s8aiXte2Om/zqzyMEmcLd3Mzlqxe4GX1jo/ThNQZ348EL+ECxJEl
 3OKe7i7oRfa48+SybmhW5Bx/2f/aQMcIQX04+akOFMIC505rUg6CTzphcrTyV5/s
 6FdQexRquf8/Ei/6DMAYPhnumRfWJ5x9txiNSEY4i11AIjo6Bt65vaLPuNniYRNI
 b6Wn8SSE8Ucrq5RrmNlSmoJCs7r1NE+JdOaEPO0MDVkOouaja8daISmveV2AfZfF
 bITOQgEw3QRpyL2FYdwa39/NXCONBILfL5HvNyXEfPEHBhI8igTgEyXYRwNHV9jT
 dsVFTc9ZIrjksLt3CDh4Z8xdyZbyojYdRCfH/wna9aAkZpwwGfSYCcR3dE6SK26y
 IjkqEVJoCFHEnvLUQkAQrc/2qWX2D1qHrcjVLwzwbM5G66YIPeLcJlZK2FbiY0Ay
 Yc/kUJY0hU5W+TfFb1hhjO5G2DTTw8Ou6MGcxSTE3HwzqICjDhE7BwFZdVikHRP0
 xMtjfJnMwuM=
 =v+dv
 -----END PGP SIGNATURE-----

Merge tag 'keys-fixes-20140916' into keys-next

Merge in keyrings fixes, at least some of which later patches depend on:

 (1) Reinstate the production of EPERM for key types beginning with '.' in
     requests from userspace.

 (2) Tidy up the cleanup of PKCS#7 message signed information blocks and fix a
     bug this made more obvious.

Signed-off-by: David Howells <dhowells@redhat.coM>
2014-09-16 17:32:16 +01:00
David Howells
54e2c2c1a9 KEYS: Reinstate EPERM for a key type name beginning with a '.'
Reinstate the generation of EPERM for a key type name beginning with a '.' in
a userspace call.  Types whose name begins with a '.' are internal only.

The test was removed by:

	commit a4e3b8d79a
	Author: Mimi Zohar <zohar@linux.vnet.ibm.com>
	Date:   Thu May 22 14:02:23 2014 -0400
	Subject: KEYS: special dot prefixed keyring name bug fix

I think we want to keep the restriction on type name so that userspace can't
add keys of a special internal type.

Note that removal of the test causes several of the tests in the keyutils
testsuite to fail.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-16 17:29:03 +01:00
David Howells
8da79b6439 KEYS: Fix missing statics
Fix missing statics (found by checker).

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2014-09-16 17:07:07 +01:00
Paul Moore
cbe0d6e879 selinux: make the netif cache namespace aware
While SELinux largely ignores namespaces, for good reason, there are
some places where it needs to at least be aware of namespaces in order
to function correctly.  Network namespaces are one example.  Basic
awareness of network namespaces are necessary in order to match a
network interface's index number to an actual network device.

This patch corrects a problem with network interfaces added to a
non-init namespace, and can be reproduced with the following commands:

 [NOTE: the NetLabel configuration is here only to active the dynamic
        networking controls ]

 # netlabelctl unlbl add default address:0.0.0.0/0 \
   label:system_u:object_r:unlabeled_t:s0
 # netlabelctl unlbl add default address:::/0 \
   label:system_u:object_r:unlabeled_t:s0
 # netlabelctl cipsov4 add pass doi:100 tags:1
 # netlabelctl map add domain:lspp_test_netlabel_t \
   protocol:cipsov4,100

 # ip link add type veth
 # ip netns add myns
 # ip link set veth1 netns myns
 # ip a add dev veth0 10.250.13.100/24
 # ip netns exec myns ip a add dev veth1 10.250.13.101/24
 # ip l set veth0 up
 # ip netns exec myns ip l set veth1 up

 # ping -c 1 10.250.13.101
 # ip netns exec myns ping -c 1 10.250.13.100

Reported-by: Jiri Jaburek <jjaburek@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-10 17:09:57 -04:00
Jeff Layton
e0b93eddfe security: make security_file_set_fowner, f_setown and __f_setown void return
security_file_set_fowner always returns 0, so make it f_setown and
__f_setown void return functions and fix up the error handling in the
callers.

Cc: linux-security-module@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2014-09-09 16:01:36 -04:00
Dmitry Kasatkin
a2d61ed525 integrity: make integrity files as 'integrity' module
The kernel print macros use the KBUILD_MODNAME, which is initialized
to the module name. The current integrity/Makefile makes every file
as its own module, so pr_xxx messages are prefixed with the file name
instead of the module.  Similar to the evm/Makefile and ima/Makefile,
this patch fixes the integrity/Makefile to use the single name
'integrity'.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:58 -04:00
Dmitry Kasatkin
7ef84e65ec integrity: base integrity subsystem kconfig options on integrity
The integrity subsystem has lots of options and takes more than
half of the security menu.  This patch consolidates the options
under "integrity", which are hidden if not enabled.  This change
does not affect existing configurations.  Re-configuration is not
needed.

Changes v4:
- no need to change "integrity subsystem" to menuconfig as
options are hidden, when not enabled. (Mimi)
- add INTEGRITY Kconfig help description

Changes v3:
- dependency to INTEGRITY removed when behind 'if INTEGRITY'

Changes v2:
- previous patch moved integrity out of the 'security' menu.
  This version keeps integrity as a security option (Mimi).

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:56 -04:00
Dmitry Kasatkin
1ae8f41c23 integrity: move asymmetric keys config option
For better visual appearance it is better to co-locate
asymmetric key options together with signature support.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:55 -04:00
Dmitry Kasatkin
b4148db517 ima: initialize only required template
IMA uses only one template. This patch initializes only required
template to avoid unnecessary memory allocations.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Reviewed-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:54 -04:00
Dmitry Kasatkin
17f4bad3ab ima: remove usage of filename parameter
In all cases except ima_bprm_check() the filename was not defined
and ima_d_path() was used to find the full path.  Unfortunately,
the bprm filename is a relative pathname (eg. ./<dir>/filename).

ima_bprm_check() selects between bprm->interp and bprm->filename.
The following dump demonstrates the differences between using
filename and interp.

bprm->filename
 filename: ./foo.sh, pathname: /root/bin/foo.sh
 filename: ./foo.sh, pathname: /bin/dash

bprm->interp
 filename: ./foo.sh, pathname: /root/bin/foo.sh
 filename: /bin/sh, pathname: /bin/dash

In both cases the pathnames are currently the same.  This patch
removes usage of filename and interp in favor of d_absolute_path.

Changes v3:
- 11 extra bytes for "deleted" not needed (Mimi)
- purpose "replace relative bprm filename with full pathname" (Mimi)

Changes v2:
- use d_absolute_path() instead of d_path to work in chroot environments.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:52 -04:00
Dmitry Kasatkin
86f2bc0249 ima: remove unnecessary appraisal test
ima_get_action() sets the "action" flags based on policy.
Before collecting, measuring, appraising, or auditing the
file, the "action" flag is updated based on the cached
iint->flags.

This patch removes the subsequent unnecessary appraisal
test in ima_appraise_measurement().

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:51 -04:00
Dmitry Kasatkin
e4a9c51965 ima: add missing '__init' keywords
Add missing keywords to the function definition to cleanup
to discard initialization code.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Reviewed-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:50 -04:00
Dmitry Kasatkin
3a8a2eadc4 ima: remove unnecessary extra variable
'function' variable value can be changed instead of
allocating extra '_func' variable.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:48 -04:00
Dmitry Kasatkin
f68c05f4d2 ima: simplify conditional statement to improve performance
Precede bit testing before string comparison makes code
faster. Also refactor statement as a single line pointer
assignment. Logic is following: we set 'xattr_ptr' to read
xattr value when we will do appraisal or in any case when
measurement template is other than 'ima'.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:47 -04:00
Dmitry Kasatkin
65d98f3be2 integrity: remove declaration of non-existing functions
Commit f381c27 "integrity: move ima inode integrity data management"
(re)moved few functions but left their declarations in header files.
This patch removes them and also removes duplicated declaration of
integrity_iint_find().

Commit c7de7ad "ima: remove unused cleanup functions".  This patch
removes these definitions as well.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:46 -04:00
Dmitry Kasatkin
d9a2e5d788 integrity: prevent flooding with 'Request for unknown key'
If file has IMA signature, IMA in enforce mode, but key is missing
then file access is blocked and single error message is printed.

If IMA appraisal is enabled in fix mode, then system runs as usual
but might produce tons of 'Request for unknown key' messages.

This patch switches 'pr_warn' to 'pr_err_ratelimited'.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-09 10:28:44 -04:00
Dmitry Kasatkin
3034a14682 ima: pass 'opened' flag to identify newly created files
Empty files and missing xattrs do not guarantee that a file was
just created.  This patch passes FILE_CREATED flag to IMA to
reliably identify new files.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.14+
2014-09-09 10:28:43 -04:00
Dmitry Kasatkin
3dcbad52cf evm: properly handle INTEGRITY_NOXATTRS EVM status
Unless an LSM labels a file during d_instantiate(), newly created
files are not labeled with an initial security.evm xattr, until
the file closes.  EVM, before allowing a protected, security xattr
to be written, verifies the existing 'security.evm' value is good.
For newly created files without a security.evm label, this
verification prevents writing any protected, security xattrs,
until the file closes.

Following is the example when this happens:
fd = open("foo", O_CREAT | O_WRONLY, 0644);
setxattr("foo", "security.SMACK64", value, sizeof(value), 0);
close(fd);

While INTEGRITY_NOXATTRS status is handled in other places, such
as evm_inode_setattr(), it does not handle it in all cases in
evm_protect_xattr().  By limiting the use of INTEGRITY_NOXATTRS to
newly created files, we can now allow setting "protected" xattrs.

Changelog:
- limit the use of INTEGRITY_NOXATTRS to IMA identified new files

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.14+
2014-09-09 10:26:10 -04:00
Masanari Iida
da3dae54e4 Documentation: Docbook: Fix generated DocBook/kernel-api.xml
This patch fix spelling typo found in DocBook/kernel-api.xml.
It is because the file is generated from the source comments,
I have to fix the comments in source codes.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-09-09 10:34:56 +02:00
Jiri Pirko
25db6bea1f selinux: register nf hooks with single nf_register_hooks call
Push ipv4 and ipv6 nf hooks into single array and register/unregister
them via single call.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-08 20:42:47 -04:00
Dmitry Kasatkin
b151d6b00b ima: provide flag to identify new empty files
On ima_file_free(), newly created empty files are not labeled with
an initial security.ima value, because the iversion did not change.
Commit dff6efc "fs: fix iversion handling" introduced a change in
iversion behavior.  To verify this change use the shell command:

  $ (exec >foo)
  $ getfattr -h -e hex -d -m security foo

This patch defines the IMA_NEW_FILE flag.  The flag is initially
set, when IMA detects that a new file is created, and subsequently
checked on the ima_file_free() hook to set the initial security.ima
value.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.14+
2014-09-08 17:38:57 -04:00
Dmitry Kasatkin
1f1009791b evm: prevent passing integrity check if xattr read fails
This patch fixes a bug, where evm_verify_hmac() returns INTEGRITY_PASS
if inode->i_op->getxattr() returns an error in evm_find_protected_xattrs.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
2014-09-08 17:36:10 -04:00
Paul Moore
a7a91a1928 selinux: fix a problem with IPv6 traffic denials in selinux_ip_postroute()
A previous commit c0828e5048 ("selinux:
process labeled IPsec TCP SYN-ACK packets properly in
selinux_ip_postroute()") mistakenly left out a 'break' from a switch
statement which caused problems with IPv6 traffic.

Thanks to Florian Westphal for reporting and debugging the issue.

Reported-by: Florian Westphal <fwestpha@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-09-03 10:51:59 -04:00
Steve Dickson
738c5d190f KEYS: Increase root_maxkeys and root_maxbytes sizes
Now that NFS client uses the kernel key ring facility to store the NFSv4
id/gid mappings, the defaults for root_maxkeys and root_maxbytes need to be
substantially increased.

These values have been soak tested:

	https://bugzilla.redhat.com/show_bug.cgi?id=1033708#c73

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-09-03 10:27:12 +10:00
Dmitry Kasatkin
e7d021e283 evm: fix checkpatch warnings
This patch fixes checkpatch 'return' warnings introduced with commit
9819cf2 "checkpatch: warn on unnecessary void function return statements".

Use scripts/checkpatch.pl --file security/integrity/evm/evm_main.c
to produce the warnings.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-02 17:03:37 -04:00
Dmitry Kasatkin
27cd1fc3ae ima: fix fallback to use new_sync_read()
3.16 commit aad4f8bb42
'switch simple generic_file_aio_read() users to ->read_iter()'
replaced ->aio_read with ->read_iter in most of the file systems
and introduced new_sync_read() as a replacement for do_sync_read().

Most of file systems set '->read' and ima_kernel_read is not affected.
When ->read is not set, this patch adopts fallback call changes from the
vfs_read.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>  3.16+
2014-09-02 17:03:36 -04:00
Dmitry Kasatkin
23c19e2ca7 ima: prevent buffer overflow in ima_alloc_tfm()
This patch fixes the case where the file's signature/hash xattr contains
an invalid hash algorithm.  Although we can not verify the xattr, we still
need to measure the file.  Use the default IMA hash algorithm.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-09-02 17:03:36 -04:00
Mimi Zohar
9a8d289fbc ima: fix ima_alloc_atfm()
The patch 3bcced39ea: "ima: use ahash API for file hash
calculation" from Feb 26, 2014, leads to the following static checker
warning:

security/integrity/ima/ima_crypto.c:204 ima_alloc_atfm()
         error: buffer overflow 'hash_algo_name' 17 <= 17

Unlike shash tfm memory, which is allocated on initialization, the
ahash tfm memory allocation is deferred until needed.

This patch fixes the case where ima_ahash_tfm has not yet been
allocated and the file's signature/hash xattr contains an invalid hash
algorithm.  Although we can not verify the xattr, we still need to
measure the file.  Use the default IMA hash algorithm.

Changelog:
- set valid algo before testing tfm - based on Dmitry's comment

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
2014-09-02 17:03:35 -04:00
Lukasz Pawelczyk
21c7eae21a Make Smack operate on smack_known struct where it still used char*
Smack used to use a mix of smack_known struct and char* throughout its
APIs and implementation. This patch unifies the behaviour and makes it
store and operate exclusively on smack_known struct pointers when managing
labels.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>

Conflicts:
	security/smack/smack_access.c
	security/smack/smack_lsm.c
2014-08-29 10:10:55 -07:00
Lukasz Pawelczyk
d01757904d Fix a bidirectional UDS connect check typo
The 54e70ec5eb commit introduced a
bidirectional check that should have checked for mutual WRITE access
between two labels. Due to a typo the second check was incorrect.

Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
2014-08-29 10:10:47 -07:00
Lukasz Pawelczyk
e95ef49b7f Small fixes in comments describing function parameters
Signed-off-by: Lukasz Pawelczyk <l.pawelczyk@samsung.com>
2014-08-29 10:10:36 -07:00
Casey Schaufler
d166c8024d Smack: Bring-up access mode
People keep asking me for permissive mode, and I keep saying "no".

Permissive mode is wrong for more reasons than I can enumerate,
but the compelling one is that it's once on, never off.

Nonetheless, there is an argument to be made for running a
process with lots of permissions, logging which are required,
and then locking the process down. There wasn't a way to do
that with Smack, but this provides it.

The notion is that you start out by giving the process an
appropriate Smack label, such as "ATBirds". You create rules
with a wide range of access and the "b" mode. On Tizen it
might be:

	ATBirds	System	rwxalb
	ATBirds	User	rwxalb
	ATBirds	_	rwxalb
	User	ATBirds	wb
	System	ATBirds	wb

Accesses that fail will generate audit records. Accesses
that succeed because of rules marked with a "b" generate
log messages identifying the rule, the program and as much
object information as is convenient.

When the system is properly configured and the programs
brought in line with the labeling scheme the "b" mode can
be removed from the rules. When the system is ready for
production the facility can be configured out.

This provides the developer the convenience of permissive
mode without creating a system that looks like it is
enforcing a policy while it is not.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-28 13:11:56 -07:00
Stephen Smalley
7b0d0b40cd selinux: Permit bounded transitions under NO_NEW_PRIVS or NOSUID.
If the callee SID is bounded by the caller SID, then allowing
the transition to occur poses no risk of privilege escalation and we can
therefore safely allow the transition to occur.  Add this exemption
for both the case where a transition was explicitly requested by the
application and the case where an automatic transition is defined in
policy.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-08-28 11:37:12 -04:00
Jani Nikula
6a4c264313 module: rename KERNEL_PARAM_FL_NOARG to avoid confusion
Make it clear this is about kernel_param_ops, not kernel_param (which
will soon have a flags field of its own). No functional changes.

Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Jon Mason <jon.mason@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2014-08-27 21:54:07 +09:30
Tetsuo Handa
8fe7a268b1 tomoyo: Fix pathname calculation breakage.
Commit 7177a9c4b5 ("fs: call rename2 if exists") changed
"struct inode_operations"->rename == NULL if
"struct inode_operations"->rename2 != NULL .

TOMOYO needs to check for both ->rename and ->rename2 , or
a system on (e.g.) ext4 filesystem won't boot.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
2014-08-26 21:52:09 -05:00
Marcin Niesluchowski
d83d2c2646 Smack: Fix setting label on successful file open
While opening with CAP_MAC_OVERRIDE file label is not set.
Other calls may access it after CAP_MAC_OVERRIDE is dropped from process.

Signed-off-by: Marcin Niesluchowski <m.niesluchow@samsung.com>
2014-08-25 14:27:28 -07:00
Linus Torvalds
96784de59f Merge branch 'stable-3.17' of git://git.infradead.org/users/pcmoore/selinux
Pull SElinux fixes from Paul Moore:
 "Two small patches to fix a couple of build warnings in SELinux and
  NetLabel.  The patches are obvious enough that I don't think any
  additional explanation is necessary, but it basically boils down to
  the usual: I was stupid, and these patches fix some of the stupid.

  Both patches were posted earlier this week to the SELinux list, and
  that is where they sat as I didn't think there were noteworthy enough
  to go upstream at this point in time, but DaveM would rather see them
  upstream now so who am I to argue.  As the patches are both very
  small"

* 'stable-3.17' of git://git.infradead.org/users/pcmoore/selinux:
  selinux: remove unused variabled in the netport, netnode, and netif caches
  netlabel: fix the netlbl_catmap_setlong() dummy function
2014-08-09 15:09:52 -07:00
Konstantin Khlebnikov
da1b63566c Smack: remove unneeded NULL-termination from securtity label
Values of extended attributes are stored as binary blobs. NULL-termination
of them isn't required. It just wastes disk space and confuses command-line
tools like getfattr because they have to print that zero byte at the end.

This patch removes terminating zero byte from initial security label in
smack_inode_init_security and cuts it out in function smack_inode_getsecurity
which is used by syscall getxattr. This change seems completely safe, because
function smk_parse_smack ignores everything after first zero byte.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
2014-08-08 14:51:19 -07:00
Konstantin Khlebnikov
b862e561ba Smack: handle zero-length security labels without panic
Zero-length security labels are invalid but kernel should handle them.

This patch fixes kernel panic after setting zero-length security labels:
# attr -S -s "SMACK64" -V "" file

And after writing zero-length string into smackfs files syslog and onlycp:
# python -c 'import os; os.write(1, "")' > /smack/syslog

The problem is caused by brain-damaged logic in function smk_parse_smack()
which takes pointer to buffer and its length but if length below or equal zero
it thinks that the buffer is zero-terminated. Unfortunately callers of this
function are widely used and proper fix requires serious refactoring.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
2014-08-08 14:51:07 -07:00
Konstantin Khlebnikov
fd5c9d230d Smack: fix behavior of smack_inode_listsecurity
Security operation ->inode_listsecurity is used for generating list of
available extended attributes for syscall listxattr. Currently it's used
only in nfs4 or if filesystem doesn't provide i_op->listxattr.

The list is the set of NULL-terminated names, one after the other.
This method must include zero byte at the and into result.

Also this function must return length even if string does not fit into
output buffer or it is NULL, see similar method in selinux and man listxattr.

Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com>
2014-08-08 14:50:19 -07:00
Paul Moore
942ba36465 selinux: remove unused variabled in the netport, netnode, and netif caches
This patch removes the unused return code variable in the netport,
netnode, and netif initialization functions.

Reported-by: fengguang.wu@intel.com
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-08-07 20:55:30 -04:00
Linus Torvalds
bb2cbf5e93 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "In this release:

   - PKCS#7 parser for the key management subsystem from David Howells
   - appoint Kees Cook as seccomp maintainer
   - bugfixes and general maintenance across the subsystem"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits)
  X.509: Need to export x509_request_asymmetric_key()
  netlabel: shorter names for the NetLabel catmap funcs/structs
  netlabel: fix the catmap walking functions
  netlabel: fix the horribly broken catmap functions
  netlabel: fix a problem when setting bits below the previously lowest bit
  PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1
  tpm: simplify code by using %*phN specifier
  tpm: Provide a generic means to override the chip returned timeouts
  tpm: missing tpm_chip_put in tpm_get_random()
  tpm: Properly clean sysfs entries in error path
  tpm: Add missing tpm_do_selftest to ST33 I2C driver
  PKCS#7: Use x509_request_asymmetric_key()
  Revert "selinux: fix the default socket labeling in sock_graft()"
  X.509: x509_request_asymmetric_keys() doesn't need string length arguments
  PKCS#7: fix sparse non static symbol warning
  KEYS: revert encrypted key change
  ima: add support for measuring and appraising firmware
  firmware_class: perform new LSM checks
  security: introduce kernel_fw_from_file hook
  PKCS#7: Missing inclusion of linux/err.h
  ...
2014-08-06 08:06:39 -07:00
Linus Torvalds
e7fda6c4c3 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and time updates from Thomas Gleixner:
 "A rather large update of timers, timekeeping & co

   - Core timekeeping code is year-2038 safe now for 32bit machines.
     Now we just need to fix all in kernel users and the gazillion of
     user space interfaces which rely on timespec/timeval :)

   - Better cache layout for the timekeeping internal data structures.

   - Proper nanosecond based interfaces for in kernel users.

   - Tree wide cleanup of code which wants nanoseconds but does hoops
     and loops to convert back and forth from timespecs.  Some of it
     definitely belongs into the ugly code museum.

   - Consolidation of the timekeeping interface zoo.

   - A fast NMI safe accessor to clock monotonic for tracing.  This is a
     long standing request to support correlated user/kernel space
     traces.  With proper NTP frequency correction it's also suitable
     for correlation of traces accross separate machines.

   - Checkpoint/restart support for timerfd.

   - A few NOHZ[_FULL] improvements in the [hr]timer code.

   - Code move from kernel to kernel/time of all time* related code.

   - New clocksource/event drivers from the ARM universe.  I'm really
     impressed that despite an architected timer in the newer chips SoC
     manufacturers insist on inventing new and differently broken SoC
     specific timers.

[ Ed. "Impressed"? I don't think that word means what you think it means ]

   - Another round of code move from arch to drivers.  Looks like most
     of the legacy mess in ARM regarding timers is sorted out except for
     a few obnoxious strongholds.

   - The usual updates and fixlets all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
  timekeeping: Fixup typo in update_vsyscall_old definition
  clocksource: document some basic timekeeping concepts
  timekeeping: Use cached ntp_tick_length when accumulating error
  timekeeping: Rework frequency adjustments to work better w/ nohz
  timekeeping: Minor fixup for timespec64->timespec assignment
  ftrace: Provide trace clocks monotonic
  timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
  seqcount: Add raw_write_seqcount_latch()
  seqcount: Provide raw_read_seqcount()
  timekeeping: Use tk_read_base as argument for timekeeping_get_ns()
  timekeeping: Create struct tk_read_base and use it in struct timekeeper
  timekeeping: Restructure the timekeeper some more
  clocksource: Get rid of cycle_last
  clocksource: Move cycle_last validation to core code
  clocksource: Make delta calculation a function
  wireless: ath9k: Get rid of timespec conversions
  drm: vmwgfx: Use nsec based interfaces
  drm: i915: Use nsec based interfaces
  timekeeping: Provide ktime_get_raw()
  hangcheck-timer: Use ktime_get_ns()
  ...
2014-08-05 17:46:42 -07:00
Paul Moore
aa9e0de81b Linux 3.16
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJT3rbVAAoJEHm+PkMAQRiGBc0H/0PcAqZ66KqBrjCaC7tlR9ZJ
 Oyv4usrPpVmJaCaYiNwc4KnkJXDfc/foEtZq32vYSb4d8xaOLta3DrT8YJTS7B7T
 Afdg8FbVdSjBD0S8It35XidmZlOaVrgGJGpDIRBRrqDwPPgbWpTeUR73bfkwoA/R
 ziW+78s0mquo9hN9Bdu3apr7XxVmzeIUx6lJxKPCoXNEGTsSC7ibCzZRzZDMpag/
 D1JrQbE0XevgEu5fWrJkcqKceUzi3I1wuKZvBIJm2aX5XDsKpYNfQL6ViJDW56dK
 LhrB8vex8gkQYSCVPyUKx4BjkdPourSICSKq+h0SwhOCpHVHPmG8XM3J4/U4a7U=
 =yoNZ
 -----END PGP SIGNATURE-----

Merge tag 'v3.16' into next

Linux 3.16
2014-08-05 15:44:22 -04:00
Linus Torvalds
98959948a7 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - Move the nohz kick code out of the scheduler tick to a dedicated IPI,
   from Frederic Weisbecker.

  This necessiated quite some background infrastructure rework,
  including:

   * Clean up some irq-work internals
   * Implement remote irq-work
   * Implement nohz kick on top of remote irq-work
   * Move full dynticks timer enqueue notification to new kick
   * Move multi-task notification to new kick
   * Remove unecessary barriers on multi-task notification

 - Remove proliferation of wait_on_bit() action functions and allow
   wait_on_bit_action() functions to support a timeout.  (Neil Brown)

 - Another round of sched/numa improvements, cleanups and fixes.  (Rik
   van Riel)

 - Implement fast idling of CPUs when the system is partially loaded,
   for better scalability.  (Tim Chen)

 - Restructure and fix the CPU hotplug handling code that may leave
   cfs_rq and rt_rq's throttled when tasks are migrated away from a dead
   cpu.  (Kirill Tkhai)

 - Robustify the sched topology setup code.  (Peterz Zijlstra)

 - Improve sched_feat() handling wrt.  static_keys (Jason Baron)

 - Misc fixes.

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
  sched/fair: Fix 'make xmldocs' warning caused by missing description
  sched: Use macro for magic number of -1 for setparam
  sched: Robustify topology setup
  sched: Fix sched_setparam() policy == -1 logic
  sched: Allow wait_on_bit_action() functions to support a timeout
  sched: Remove proliferation of wait_on_bit() action functions
  sched/numa: Revert "Use effective_load() to balance NUMA loads"
  sched: Fix static_key race with sched_feat()
  sched: Remove extra static_key*() function indirection
  sched/rt: Fix replenish_dl_entity() comments to match the current upstream code
  sched: Transform resched_task() into resched_curr()
  sched/deadline: Kill task_struct->pi_top_task
  sched: Rework check_for_tasks()
  sched/rt: Enqueue just unthrottled rt_rq back on the stack in __disable_runtime()
  sched/fair: Disable runtime_enabled on dying rq
  sched/numa: Change scan period code to match intent
  sched/numa: Rework best node setting in task_numa_migrate()
  sched/numa: Examine a task move when examining a task swap
  sched/numa: Simplify task_numa_compare()
  sched/numa: Use effective_load() to balance NUMA loads
  ...
2014-08-04 16:23:30 -07:00
James Morris
103ae675b1 Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2014-08-02 22:58:02 +10:00
Paul Moore
4fbe63d1c7 netlabel: shorter names for the NetLabel catmap funcs/structs
Historically the NetLabel LSM secattr catmap functions and data
structures have had very long names which makes a mess of the NetLabel
code and anyone who uses NetLabel.  This patch renames the catmap
functions and structures from "*_secattr_catmap_*" to just "*_catmap_*"
which improves things greatly.

There are no substantial code or logic changes in this patch.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:37 -04:00
Paul Moore
4b8feff251 netlabel: fix the horribly broken catmap functions
The NetLabel secattr catmap functions, and the SELinux import/export
glue routines, were broken in many horrible ways and the SELinux glue
code fiddled with the NetLabel catmap structures in ways that we
probably shouldn't allow.  At some point this "worked", but that was
likely due to a bit of dumb luck and sub-par testing (both inflicted
by yours truly).  This patch corrects these problems by basically
gutting the code in favor of something less obtuse and restoring the
NetLabel abstractions in the SELinux catmap glue code.

Everything is working now, and if it decides to break itself in the
future this code will be much easier to debug than the code it
replaces.

One noteworthy side effect of the changes is that it is no longer
necessary to allocate a NetLabel catmap before calling one of the
NetLabel APIs to set a bit in the catmap.  NetLabel will automatically
allocate the catmap nodes when needed, resulting in less allocations
when the lowest bit is greater than 255 and less code in the LSMs.

Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:17 -04:00
Paul Moore
41c3bd2039 netlabel: fix a problem when setting bits below the previously lowest bit
The NetLabel category (catmap) functions have a problem in that they
assume categories will be set in an increasing manner, e.g. the next
category set will always be larger than the last.  Unfortunately, this
is not a valid assumption and could result in problems when attempting
to set categories less than the startbit in the lowest catmap node.
In some cases kernel panics and other nasties can result.

This patch corrects the problem by checking for this and allocating a
new catmap node instance and placing it at the front of the list.

Cc: stable@vger.kernel.org
Reported-by: Christian Evans <frodox@zoho.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Casey Schaufler <casey@schaufler-ca.com>
2014-08-01 11:17:03 -04:00
James Morris
167225b775 Merge branch 'stable-3.16' of git://git.infradead.org/users/pcmoore/selinux into next 2014-07-30 01:31:46 +10:00
Paul Moore
2873ead7e4 Revert "selinux: fix the default socket labeling in sock_graft()"
This reverts commit 4da6daf4d3.

Unfortunately, the commit in question caused problems with Bluetooth
devices, specifically it caused them to get caught in the newly
created BUG_ON() check.  The AF_ALG problem still exists, but will be
addressed in a future patch.

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-07-28 10:46:07 -04:00
Mimi Zohar
b64cc5fb85 KEYS: revert encrypted key change
Commit fc7c70e "KEYS: struct key_preparsed_payload should have two
payload pointers" erroneously modified encrypted-keys.  This patch
reverts the change to that file.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-28 12:36:17 +01:00
Mimi Zohar
5a9196d715 ima: add support for measuring and appraising firmware
The "security: introduce kernel_fw_from_file hook" patch defined a
new security hook to evaluate any loaded firmware that wasn't built
into the kernel.

This patch defines ima_fw_from_file(), which is called from the new
security hook, to measure and/or appraise the loaded firmware's
integrity.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2014-07-25 11:47:46 -07:00
Kees Cook
13752fe2d7 security: introduce kernel_fw_from_file hook
In order to validate the contents of firmware being loaded, there must be
a hook to evaluate any loaded firmware that wasn't built into the kernel
itself. Without this, there is a risk that a root user could load malicious
firmware designed to mount an attack against kernel memory (e.g. via DMA).

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
2014-07-25 11:47:45 -07:00
Eric Paris
7d8b6c6375 CAPABILITIES: remove undefined caps from all processes
This is effectively a revert of 7b9a7ec565
plus fixing it a different way...

We found, when trying to run an application from an application which
had dropped privs that the kernel does security checks on undefined
capability bits.  This was ESPECIALLY difficult to debug as those
undefined bits are hidden from /proc/$PID/status.

Consider a root application which drops all capabilities from ALL 4
capability sets.  We assume, since the application is going to set
eff/perm/inh from an array that it will clear not only the defined caps
less than CAP_LAST_CAP, but also the higher 28ish bits which are
undefined future capabilities.

The BSET gets cleared differently.  Instead it is cleared one bit at a
time.  The problem here is that in security/commoncap.c::cap_task_prctl()
we actually check the validity of a capability being read.  So any task
which attempts to 'read all things set in bset' followed by 'unset all
things set in bset' will not even attempt to unset the undefined bits
higher than CAP_LAST_CAP.

So the 'parent' will look something like:
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	ffffffc000000000

All of this 'should' be fine.  Given that these are undefined bits that
aren't supposed to have anything to do with permissions.  But they do...

So lets now consider a task which cleared the eff/perm/inh completely
and cleared all of the valid caps in the bset (but not the invalid caps
it couldn't read out of the kernel).  We know that this is exactly what
the libcap-ng library does and what the go capabilities library does.
They both leave you in that above situation if you try to clear all of
you capapabilities from all 4 sets.  If that root task calls execve()
the child task will pick up all caps not blocked by the bset.  The bset
however does not block bits higher than CAP_LAST_CAP.  So now the child
task has bits in eff which are not in the parent.  These are
'meaningless' undefined bits, but still bits which the parent doesn't
have.

The problem is now in cred_cap_issubset() (or any operation which does a
subset test) as the child, while a subset for valid cap bits, is not a
subset for invalid cap bits!  So now we set durring commit creds that
the child is not dumpable.  Given it is 'more priv' than its parent.  It
also means the parent cannot ptrace the child and other stupidity.

The solution here:
1) stop hiding capability bits in status
	This makes debugging easier!

2) stop giving any task undefined capability bits.  it's simple, it you
don't put those invalid bits in CAP_FULL_SET you won't get them in init
and you won't get them in any other task either.
	This fixes the cap_issubset() tests and resulting fallout (which
	made the init task in a docker container untraceable among other
	things)

3) mask out undefined bits when sys_capset() is called as it might use
~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
	This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.

4) mask out undefined bit when we read a file capability off of disk as
again likely all bits are set in the xattr for forward/backward
compatibility.
	This lets 'setcap all+pe /bin/bash; /bin/bash' run

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Steve Grubb <sgrubb@redhat.com>
Cc: Dan Walsh <dwalsh@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-07-24 21:53:47 +10:00
James Morris
4ca332e11d Merge tag 'keys-next-20140722' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2014-07-24 21:36:19 +10:00
Tetsuo Handa
6d6f332842 commoncap: don't alloc the credential unless needed in cap_task_prctl
In function cap_task_prctl(), we would allocate a credential
unconditionally and then check if we support the requested function.
If not we would release this credential with abort_creds() by using
RCU method. But on some archs such as powerpc, the sys_prctl is heavily
used to get/set the floating point exception mode. So the unnecessary
allocating/releasing of credential not only introduce runtime overhead
but also do cause OOM due to the RCU implementation.

This patch removes abort_creds() from cap_task_prctl() by calling
prepare_creds() only when we need to modify it.

Reported-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-07-24 21:12:30 +10:00
David Howells
633706a2ee Merge branch 'keys-fixes' into keys-next
Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-22 21:55:45 +01:00
David Howells
64724cfc6e Merge remote-tracking branch 'integrity/next-with-keys' into keys-next
Signed-off-by: David Howells <dhowells@redhat.com>
2014-07-22 21:54:43 +01:00
David Howells
f1dcde91a3 KEYS: request_key_auth: Provide key preparsing
Provide key preparsing for the request_key_auth key type so that we can make
preparsing mandatory.  This does nothing as this type can only be set up
internally to the kernel.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22 21:46:55 +01:00
David Howells
5d19e20b53 KEYS: keyring: Provide key preparsing
Provide key preparsing in the keyring so that we can make preparsing
mandatory.  For keyrings, however, only an empty payload is permitted.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22 21:46:51 +01:00
David Howells
002edaf76f KEYS: big_key: Use key preparsing
Make use of key preparsing in the big key type so that quota size determination
can take place prior to keyring locking when a key is being added.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
2014-07-22 21:46:47 +01:00
David Howells
f9167789df KEYS: user: Use key preparsing
Make use of key preparsing in user-defined and logon keys so that quota size
determination can take place prior to keyring locking when a key is being
added.

Also the idmapper key types need to change to match as they use the
user-defined key type routines.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
2014-07-22 21:46:17 +01:00
David Howells
4d8c0250b8 KEYS: Call ->free_preparse() even after ->preparse() returns an error
Call the ->free_preparse() key type op even after ->preparse() returns an
error as it does cleaning up type stuff.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22 21:46:12 +01:00
David Howells
7dfa0ca6a9 KEYS: Allow expiry time to be set when preparsing a key
Allow a key type's preparsing routine to set the expiry time for a key.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22 21:46:08 +01:00
David Howells
fc7c70e0b6 KEYS: struct key_preparsed_payload should have two payload pointers
struct key_preparsed_payload should have two payload pointers to correspond
with those in struct key.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-22 21:46:02 +01:00
James Morris
fd33c43677 Merge tag 'seccomp-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux into next 2014-07-19 17:40:49 +10:00
James Morris
2ccf4661f3 Merge branch 'next' of git://git.infradead.org/users/pcmoore/selinux into next 2014-07-19 17:39:19 +10:00
Kees Cook
1d4457f999 sched: move no_new_privs into new atomic flags
Since seccomp transitions between threads requires updates to the
no_new_privs flag to be atomic, the flag must be part of an atomic flag
set. This moves the nnp flag into a separate task field, and introduces
accessors.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18 12:13:38 -07:00
David Howells
6a09d17bb6 KEYS: Provide a generic instantiation function
Provide a generic instantiation function for key types that use the preparse
hook.  This makes it easier to prereserve key quota before keyrings get locked
to retain the new key.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2014-07-18 18:56:34 +01:00
David Howells
0c7774abb4 KEYS: Allow special keys (eg. DNS results) to be invalidated by CAP_SYS_ADMIN
Special kernel keys, such as those used to hold DNS results for AFS, CIFS and
NFS and those used to hold idmapper results for NFS, used to be
'invalidateable' with key_revoke().  However, since the default permissions for
keys were reduced:

	Commit: 96b5c8fea6
	KEYS: Reduce initial permissions on keys

it has become impossible to do this.

Add a key flag (KEY_FLAG_ROOT_CAN_INVAL) that will permit a key to be
invalidated by root.  This should not be used for system keyrings as the
garbage collector will try and remove any invalidate key.  For system keyrings,
KEY_FLAG_ROOT_CAN_CLEAR can be used instead.

After this, from userspace, keyctl_invalidate() and "keyctl invalidate" can be
used by any possessor of CAP_SYS_ADMIN (typically root) to invalidate DNS and
idmapper keys.  Invalidated keys are immediately garbage collected and will be
immediately rerequested if needed again.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Steve Dickson <steved@redhat.com>
2014-07-17 20:45:08 +01:00
Mimi Zohar
7d2ce2320e ima: define '.ima' as a builtin 'trusted' keyring
Require all keys added to the IMA keyring be signed by an
existing trusted key on the system trusted keyring.

Changelog v6:
- remove ifdef CONFIG_IMA_TRUSTED_KEYRING in C code - Dmitry
- update Kconfig dependency and help
- select KEYS_DEBUG_PROC_KEYS - Dmitry

Changelog v5:
- Move integrity_init_keyring() to init_ima() - Dmitry
- reset keyring[id] on failure - Dmitry

Changelog v1:
- don't link IMA trusted keyring to user keyring

Changelog:
- define stub integrity_init_keyring() function (reported-by Fengguang Wu)
- differentiate between regular and trusted keyring names.
- replace printk with pr_info (D. Kasatkin)
- only make the IMA keyring a trusted keyring (reported-by D. Kastatkin)
- define stub integrity_init_keyring() definition based on
  CONFIG_INTEGRITY_SIGNATURE, not CONFIG_INTEGRITY_ASYMMETRIC_KEYS.
  (reported-by Jim Davis)

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Acked-by: David Howells <dhowells@redhat.com>
2014-07-17 09:35:17 -04:00
Mimi Zohar
a4e3b8d79a KEYS: special dot prefixed keyring name bug fix
Dot prefixed keyring names are supposed to be reserved for the
kernel, but add_key() calls key_get_type_from_user(), which
incorrectly verifies the 'type' field, not the 'description' field.
This patch verifies the 'description' field isn't dot prefixed,
when creating a new keyring, and removes the dot prefix test in
key_get_type_from_user().

Changelog v6:
- whitespace and other cleanup

Changelog v5:
- Only prevent userspace from creating a dot prefixed keyring, not
  regular keys  - Dmitry

Reported-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
2014-07-17 09:35:14 -04:00
Dmitry Kasatkin
32c2e6752f ima: provide double buffering for hash calculation
The asynchronous hash API allows initiating a hash calculation and
then performing other tasks, while waiting for the hash calculation
to complete.

This patch introduces usage of double buffering for simultaneous
hashing and reading of the next chunk of data from storage.

Changes in v3:
- better comments

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:11 -04:00
Dmitry Kasatkin
6edf7a8926 ima: introduce multi-page collect buffers
Use of multiple-page collect buffers reduces:
1) the number of block IO requests
2) the number of asynchronous hash update requests

Second is important for HW accelerated hashing, because significant
amount of time is spent for preparation of hash update operation,
which includes configuring acceleration HW, DMA engine, etc...
Thus, HW accelerators are more efficient when working on large
chunks of data.

This patch introduces usage of multi-page collect buffers. Buffer size
can be specified using 'ahash_bufsize' module parameter. Default buffer
size is 4096 bytes.

Changes in v3:
- kernel parameter replaced with module parameter

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:11 -04:00
Dmitry Kasatkin
3bcced39ea ima: use ahash API for file hash calculation
Async hash API allows the use of HW acceleration for hash calculation.
It may give significant performance gain and/or reduce power consumption,
which might be very beneficial for battery powered devices.

This patch introduces hash calculation using ahash API. ahash performance
depends on the data size and the particular HW. Depending on the specific
system, shash performance may be better.

This patch defines 'ahash_minsize' module parameter, which is used to
define the minimal file size to use with ahash.  If this minimum file size
is not set or the file is smaller than defined by the parameter, shash will
be used.

Changes in v3:
- kernel parameter replaced with module parameter
- pr_crit replaced with pr_crit_ratelimited
- more comment changes - Mimi

Changes in v2:
- ima_ahash_size became as ima_ahash
- ahash pre-allocation moved out from __init code to be able to use
  ahash crypto modules. Ahash allocated once on the first use.
- hash calculation falls back to shash if ahash allocation/calculation fails
- complex initialization separated from variable declaration
- improved comments

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:10 -04:00
Richard Guy Briggs
7e9001f663 audit: fix dangling keywords in integrity ima message output
Replace spaces in op keyword labels in log output since userspace audit tools
can't parse orphaned keywords.

Reported-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:10 -04:00
Dmitry Kasatkin
209b43ca64 ima: delay template descriptor lookup until use
process_measurement() always calls ima_template_desc_current(),
including when an IMA policy has not been defined.

This patch delays template descriptor lookup until action is
determined.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:09 -04:00
Dmitry Kasatkin
2c50b96482 ima: remove unnecessary i_mutex locking from ima_rdwr_violation_check()
Before 2.6.39 inode->i_readcount was maintained by IMA. It was not atomic
and protected using spinlock. For 2.6.39, i_readcount was converted to
atomic and maintaining was moved VFS layer. Spinlock for some unclear
reason was replaced by i_mutex.

After analyzing the code, we came to conclusion that i_mutex locking is
unnecessary, especially when an IMA policy has not been defined.

This patch removes i_mutex locking from ima_rdwr_violation_check().

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-07-17 09:35:09 -04:00
Thomas Gleixner
afdb094380 Linux 3.16-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTwvRvAAoJEHm+PkMAQRiG8CoIAJucWkj+MJFFoDXjR9hfI8U7
 /WeQLJP0GpWGMXd2KznX9epCuw5rsuaPAxCy1HFDNOa7OtNYacWrsIhByxOIDLwL
 YjDB9+fpMMPFWsr+LPJa8Ombh/TveCS77w6Pt5VMZFwvIKujiNK/C3MdxjReH5Gr
 iTGm8x7nEs2D6L2+5sQVlhXot/97phxIlBSP6wPXEiaztNZ9/JZi905Xpgq+WU16
 ZOA8MlJj1TQD4xcWyUcsQ5REwIOdQ6xxPF00wv/12RFela+Puy4JLAilnV6Mc12U
 fwYOZKbUNBS8rjfDDdyX3sljV1L5iFFqKkW3WFdnv/z8ZaZSo5NupWuavDnifKw=
 =6Q8o
 -----END PGP SIGNATURE-----

Merge tag 'v3.16-rc5' into timers/core

Reason: Bring in upstream modifications, so the pending changes which
depend on them can be queued.
2014-07-16 21:57:38 +02:00
James Morris
b6b8a371f5 Merge branch 'stable-3.16' of git://git.infradead.org/users/pcmoore/selinux into next 2014-07-17 03:05:51 +10:00
NeilBrown
743162013d sched: Remove proliferation of wait_on_bit() action functions
The current "wait_on_bit" interface requires an 'action'
function to be provided which does the actual waiting.
There are over 20 such functions, many of them identical.
Most cases can be satisfied by one of just two functions, one
which uses io_schedule() and one which just uses schedule().

So:
 Rename wait_on_bit and        wait_on_bit_lock to
        wait_on_bit_action and wait_on_bit_lock_action
 to make it explicit that they need an action function.

 Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
 which are *not* given an action function but implicitly use
 a standard one.
 The decision to error-out if a signal is pending is now made
 based on the 'mode' argument rather than being encoded in the action
 function.

 All instances of the old wait_on_bit and wait_on_bit_lock which
 can use the new version have been changed accordingly and their
 action functions have been discarded.
 wait_on_bit{_lock} does not return any specific error code in the
 event of a signal so the caller must check for non-zero and
 interpolate their own error code as appropriate.

The wait_on_bit() call in __fscache_wait_on_invalidate() was
ambiguous as it specified TASK_UNINTERRUPTIBLE but used
fscache_wait_bit_interruptible as an action function.
David Howells confirms this should be uniformly
"uninterruptible"

The main remaining user of wait_on_bit{,_lock}_action is NFS
which needs to use a freezer-aware schedule() call.

A comment in fs/gfs2/glock.c notes that having multiple 'action'
functions is useful as they display differently in the 'wchan'
field of 'ps'. (and /proc/$PID/wchan).
As the new bit_wait{,_io} functions are tagged "__sched", they
will not show up at all, but something higher in the stack.  So
the distinction will still be visible, only with different
function names (gds2_glock_wait versus gfs2_glock_dq_wait in the
gfs2/glock.c case).

Since first version of this patch (against 3.15) two new action
functions appeared, on in NFS and one in CIFS.  CIFS also now
uses an action function that makes the same freezer aware
schedule call as NFS.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: David Howells <dhowells@redhat.com> (fscache, keys)
Acked-by: Steven Whitehouse <swhiteho@redhat.com> (gfs2)
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-07-16 15:10:39 +02:00
Tejun Heo
5577964e64 cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes
Currently, cgroup_subsys->base_cftypes is used for both the unified
default hierarchy and legacy ones and subsystems can mark each file
with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear
only on one of them.  This is quite hairy and error-prone.  Also, we
may end up exposing interface files to the default hierarchy without
thinking it through.

cgroup_subsys will grow two separate cftype arrays and apply each only
on the hierarchies of the matching type.  This will allow organizing
cftypes in a lot clearer way and encourage subsystems to scrutinize
the interface which is being exposed in the new default hierarchy.

In preparation, this patch renames cgroup_subsys->base_cftypes to
cgroup_subsys->legacy_cftypes.  This patch is pure rename.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2014-07-15 11:05:09 -04:00
Paul Moore
4da6daf4d3 selinux: fix the default socket labeling in sock_graft()
The sock_graft() hook has special handling for AF_INET, AF_INET, and
AF_UNIX sockets as those address families have special hooks which
label the sock before it is attached its associated socket.
Unfortunately, the sock_graft() hook was missing a default approach
to labeling sockets which meant that any other address family which
made use of connections or the accept() syscall would find the
returned socket to be in an "unlabeled" state.  This was recently
demonstrated by the kcrypto/AF_ALG subsystem and the newly released
cryptsetup package (cryptsetup v1.6.5 and later).

This patch preserves the special handling in selinux_sock_graft(),
but adds a default behavior - setting the sock's label equal to the
associated socket - which resolves the problem with AF_ALG and
presumably any other address family which makes use of accept().

Cc: stable@vger.kernel.org
Signed-off-by: Paul Moore <pmoore@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
2014-07-10 10:17:48 -04:00
Paul Moore
615e51fdda selinux: reduce the number of calls to synchronize_net() when flushing caches
When flushing the AVC, such as during a policy load, the various
network caches are also flushed, with each making a call to
synchronize_net() which has shown to be expensive in some cases.
This patch consolidates the network cache flushes into a single AVC
callback which only calls synchronize_net() once for each AVC cache
flush.

Reported-by: Jaejyn Shin <flagon22bass@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-26 14:33:56 -04:00
Waiman Long
f31e799459 selinux: no recursive read_lock of policy_rwlock in security_genfs_sid()
With the introduction of fair queued rwlock, recursive read_lock()
may hang the offending process if there is a write_lock() somewhere
in between.

With recursive read_lock checking enabled, the following error was
reported:

=============================================
[ INFO: possible recursive locking detected ]
3.16.0-rc1 #2 Tainted: G            E
---------------------------------------------
load_policy/708 is trying to acquire lock:
 (policy_rwlock){.+.+..}, at: [<ffffffff8125b32a>]
security_genfs_sid+0x3a/0x170

but task is already holding lock:
 (policy_rwlock){.+.+..}, at: [<ffffffff8125b48c>]
security_fs_use+0x2c/0x110

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(policy_rwlock);
  lock(policy_rwlock);

This patch fixes the occurrence of recursive read_lock() of
policy_rwlock by adding a helper function __security_genfs_sid()
which requires caller to take the lock before calling it. The
security_fs_use() was then modified to call the new helper function.

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-23 16:52:55 -04:00
Namhyung Kim
6e51f9cbfa selinux: fix a possible memory leak in cond_read_node()
The cond_read_node() should free the given node on error path as it's
not linked to p->cond_list yet.  This is done via cond_node_destroy()
but it's not called when next_entry() fails before the expr loop.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-19 14:56:59 -04:00
Namhyung Kim
f004afe60d selinux: simple cleanup for cond_read_node()
The node->cur_state and len can be read in a single call of next_entry().
And setting len before reading is a dead write so can be eliminated.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
(Minor tweak to the length parameter in the call to next_entry())
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-19 14:53:15 -04:00
Gideon Israel Dsouza
4bb9398300 security: Used macros from compiler.h instead of __attribute__((...))
To increase compiler portability there is <linux/compiler.h> which
provides convenience macros for various gcc constructs.  Eg: __packed
for __attribute__((packed)).

This patch is part of a large task I've taken to clean the gcc
specific attributes and use the the macros instead.

Signed-off-by: Gideon Israel Dsouza <gidisrael@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-18 16:59:34 -04:00
Namhyung Kim
4b6f405f72 selinux: introduce str_read() helper
There're some code duplication for reading a string value during
policydb_read().  Add str_read() helper to fix it.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-18 15:55:58 -04:00
Himangi Saraogi
5c7001b84b SELinux: use ARRAY_SIZE
ARRAY_SIZE is more concise to use when the size of an array is divided
by the size of its type or the size of its first element.

The Coccinelle semantic patch that makes this change is as follows:

// <smpl>
@@
type T;
T[] E;
@@

- (sizeof(E)/sizeof(E[...]))
+ ARRAY_SIZE(E)
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-17 17:36:02 -04:00
Paul Moore
170b5910d9 Linux 3.15
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTlKlSAAoJEHm+PkMAQRiGF5YH/Ak8QPJLI0mRvSDI9PuaVVfv
 XPnXrOIigcj/X0CL4WbzqWW2zQFAK6YtDDoNlzuRaRveFVx0TNp4YAI/YHCD4IVZ
 FzRXBEWpeGvTh5Ev0DYDyHMXyMqz5fJeIM3rCmOCUOlHJQxneIHGKIXwsMKpAPkX
 BxVdfEqt/Yq/CB5RFbvAyfFlXlnl8A5QDELEkaCk+1s3iXXST5qUKl8XwdgCvfKF
 Qq3yau12+Lmw89Klh74U9Hj5s/LXz5L8Gg66XM30hTsaK5/0AQ7oGx3kJvO3Uurs
 CO3vrt7EXJjeq2YLken4Wa7UpPoHOK33KY2E/ByhkJn4xTFwJmo68miLH0Edy9U=
 =4IWq
 -----END PGP SIGNATURE-----

Merge tag 'v3.15' into next

Linux 3.15
2014-06-17 17:30:23 -04:00
Linus Torvalds
aa569fa0ea Merge branch 'serge-next-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security
Pull more security layer updates from Serge Hallyn:
 "A few more commits had previously failed to make it through
  security-next into linux-next but this week made it into linux-next.
  At least commit "ima: introduce ima_kernel_read()" was deemed critical
  by Mimi to make this merge window.

  This is a temporary tree just for this request.  Mimi has pointed me
  to some previous threads about keeping maintainer trees at the
  previous release, which I'll certainly do for anything long-term,
  after talking with James"

* 'serge-next-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security:
  ima: introduce ima_kernel_read()
  evm: prohibit userspace writing 'security.evm' HMAC value
  ima: check inode integrity cache in violation check
  ima: prevent unnecessary policy checking
  evm: provide option to protect additional SMACK xattrs
  evm: replace HMAC version with attribute mask
  ima: prevent new digsig xattr from being replaced
2014-06-13 07:39:39 -07:00
Dmitry Kasatkin
0430e49b6e ima: introduce ima_kernel_read()
Commit 8aac62706 "move exit_task_namespaces() outside of exit_notify"
introduced the kernel opps since the kernel v3.10, which happens when
Apparmor and IMA-appraisal are enabled at the same time.

----------------------------------------------------------------------
[  106.750167] BUG: unable to handle kernel NULL pointer dereference at
0000000000000018
[  106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[  106.750241] PGD 0
[  106.750254] Oops: 0000 [#1] SMP
[  106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm
bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp
kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul
ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel
snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi
snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw
snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp
parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci
pps_core
[  106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15
[  106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08
09/19/2012
[  106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti:
ffff880400fca000
[  106.750704] RIP: 0010:[<ffffffff811ec7da>]  [<ffffffff811ec7da>]
our_mnt+0x1a/0x30
[  106.750725] RSP: 0018:ffff880400fcba60  EFLAGS: 00010286
[  106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX:
ffff8800d51523e7
[  106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI:
ffff880402d20020
[  106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09:
0000000000000001
[  106.750817] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff8800d5152300
[  106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15:
ffff8800d51523e7
[  106.750871] FS:  0000000000000000(0000) GS:ffff88040d200000(0000)
knlGS:0000000000000000
[  106.750910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4:
00000000001407e0
[  106.750962] Stack:
[  106.750981]  ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18
0000000000000000
[  106.751037]  ffff8800de804920 ffffffff8101b9b9 0001800000000000
0000000000000100
[  106.751093]  0000010000000000 0000000000000002 000000000000000e
ffff8803eb8df500
[  106.751149] Call Trace:
[  106.751172]  [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430
[  106.751199]  [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10
[  106.751225]  [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170
[  106.751250]  [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80
[  106.751276]  [<ffffffff8134aa73>] aa_file_perm+0x33/0x40
[  106.751301]  [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0
[  106.751327]  [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20
[  106.751355]  [<ffffffff8130c853>] security_file_permission+0x23/0xa0
[  106.751382]  [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0
[  106.751407]  [<ffffffff811c789d>] vfs_read+0x6d/0x170
[  106.751432]  [<ffffffff811cda31>] kernel_read+0x41/0x60
[  106.751457]  [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280
[  106.751483]  [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280
[  106.751509]  [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160
[  106.751536]  [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10
[  106.751562]  [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0
[  106.751587]  [<ffffffff81352824>] ima_update_xattr+0x34/0x60
[  106.751612]  [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0
[  106.751637]  [<ffffffff811c9635>] __fput+0xd5/0x300
[  106.751662]  [<ffffffff811c98ae>] ____fput+0xe/0x10
[  106.751687]  [<ffffffff81086774>] task_work_run+0xc4/0xe0
[  106.751712]  [<ffffffff81066fad>] do_exit+0x2bd/0xa90
[  106.751738]  [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b
[  106.751763]  [<ffffffff8106780c>] do_group_exit+0x4c/0xc0
[  106.751788]  [<ffffffff81067894>] SyS_exit_group+0x14/0x20
[  106.751814]  [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f
[  106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3
0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89
e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00
[  106.752185] RIP  [<ffffffff811ec7da>] our_mnt+0x1a/0x30
[  106.752214]  RSP <ffff880400fcba60>
[  106.752236] CR2: 0000000000000018
[  106.752258] ---[ end trace 3c520748b4732721 ]---
----------------------------------------------------------------------

The reason for the oops is that IMA-appraisal uses "kernel_read()" when
file is closed. kernel_read() honors LSM security hook which calls
Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty'
commit changed the order of cleanup code so that nsproxy->mnt_ns was
not already available for Apparmor.

Discussion about the issue with Al Viro and Eric W. Biederman suggested
that kernel_read() is too high-level for IMA. Another issue, except
security checking, that was identified is mandatory locking. kernel_read
honors it as well and it might prevent IMA from calculating necessary hash.
It was suggested to use simplified version of the function without security
and locking checks.

This patch introduces special version ima_kernel_read(), which skips security
and mandatory locking checking. It prevents the kernel oops to happen.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
2014-06-12 17:58:08 -04:00
Mimi Zohar
2fb1c9a4f2 evm: prohibit userspace writing 'security.evm' HMAC value
Calculating the 'security.evm' HMAC value requires access to the
EVM encrypted key.  Only the kernel should have access to it.  This
patch prevents userspace tools(eg. setfattr, cp --preserve=xattr)
from setting/modifying the 'security.evm' HMAC value directly.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
2014-06-12 17:58:07 -04:00
Dmitry Kasatkin
14503eb994 ima: check inode integrity cache in violation check
When IMA did not support ima-appraisal, existance of the S_IMA flag
clearly indicated that the file was measured. With IMA appraisal S_IMA
flag indicates that file was measured and/or appraised. Because of
this, when measurement is not enabled by the policy, violations are
still reported.

To differentiate between measurement and appraisal policies this
patch checks the inode integrity cache flags.  The IMA_MEASURED
flag indicates whether the file was actually measured, while the
IMA_MEASURE flag indicates whether the file should be measured.
Unfortunately, the IMA_MEASURED flag is reset to indicate the file
needs to be re-measured.  Thus, this patch checks the IMA_MEASURE
flag.

This patch limits the false positive violation reports, but does
not fix it entirely.  The IMA_MEASURE/IMA_MEASURED flags are
indications that, at some point in time, the file opened for read
was in policy, but might not be in policy now (eg. different uid).
Other changes would be needed to further limit false positive
violation reports.

Changelog:
- expanded patch description based on conversation with Roberto (Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:07 -04:00
Dmitry Kasatkin
b882fae2d3 ima: prevent unnecessary policy checking
ima_rdwr_violation_check is called for every file openning.
The function checks the policy even when violation condition
is not met. It causes unnecessary policy checking.

This patch does policy checking only if violation condition is met.

Changelog:
- check writecount is greater than zero (Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:06 -04:00
Dmitry Kasatkin
3e38df56e6 evm: provide option to protect additional SMACK xattrs
Newer versions of SMACK introduced following security xattrs:
SMACK64EXEC, SMACK64TRANSMUTE and SMACK64MMAP.

To protect these xattrs, this patch includes them in the HMAC
calculation.  However, for backwards compatibility with existing
labeled filesystems, including these xattrs needs to be
configurable.

Changelog:
- Add SMACK dependency on new option (Mimi)

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:06 -04:00
Dmitry Kasatkin
d3b3367948 evm: replace HMAC version with attribute mask
Using HMAC version limits the posibility to arbitrarily add new
attributes such as SMACK64EXEC to the hmac calculation.

This patch replaces hmac version with attribute mask.
Desired attributes can be enabled with configuration parameter.
It allows to build kernels which works with previously labeled
filesystems.

Currently supported attribute is 'fsuuid' which is equivalent of
the former version 2.

Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:06 -04:00
Mimi Zohar
060bdebfb0 ima: prevent new digsig xattr from being replaced
Even though a new xattr will only be appraised on the next access,
set the DIGSIG flag to prevent a signature from being replaced with
a hash on file close.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12 17:58:05 -04:00
Linus Torvalds
f9da455b93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.

 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
    Benniston.

 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
    Mork.

 4) BPF now has a "random" opcode, from Chema Gonzalez.

 5) Add more BPF documentation and improve test framework, from Daniel
    Borkmann.

 6) Support TCP fastopen over ipv6, from Daniel Lee.

 7) Add software TSO helper functions and use them to support software
    TSO in mvneta and mv643xx_eth drivers.  From Ezequiel Garcia.

 8) Support software TSO in fec driver too, from Nimrod Andy.

 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.

10) Handle broadcasts more gracefully over macvlan when there are large
    numbers of interfaces configured, from Herbert Xu.

11) Allow more control over fwmark used for non-socket based responses,
    from Lorenzo Colitti.

12) Do TCP congestion window limiting based upon measurements, from Neal
    Cardwell.

13) Support busy polling in SCTP, from Neal Horman.

14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.

15) Bridge promisc mode handling improvements from Vlad Yasevich.

16) Don't use inetpeer entries to implement ID generation any more, it
    performs poorly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
  rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
  tcp: fixing TLP's FIN recovery
  net: fec: Add software TSO support
  net: fec: Add Scatter/gather support
  net: fec: Increase buffer descriptor entry number
  net: fec: Factorize feature setting
  net: fec: Enable IP header hardware checksum
  net: fec: Factorize the .xmit transmit function
  bridge: fix compile error when compiling without IPv6 support
  bridge: fix smatch warning / potential null pointer dereference
  via-rhine: fix full-duplex with autoneg disable
  bnx2x: Enlarge the dorq threshold for VFs
  bnx2x: Check for UNDI in uncommon branch
  bnx2x: Fix 1G-baseT link
  bnx2x: Fix link for KR with swapped polarity lane
  sctp: Fix sk_ack_backlog wrap-around problem
  net/core: Add VF link state control policy
  net/fsl: xgmac_mdio is dependent on OF_MDIO
  net/fsl: Make xgmac_mdio read error message useful
  net_sched: drr: warn when qdisc is not work conserving
  ...
2014-06-12 14:27:40 -07:00
Thomas Gleixner
77f4fa089c tomoyo: Use sensible time interface
There is no point in calling gettimeofday if only the seconds part of
the timespec is used. Use get_seconds() instead. It's not only the
proper interface it's also faster.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kentaro Takeda <takedakn@nttdata.co.jp>
Cc: linux-security-module@vger.kernel.org
Link: http://lkml.kernel.org/r/20140611234607.775273584@linutronix.de
2014-06-12 16:18:45 +02:00
Linus Torvalds
fad0701eaa Merge branch 'serge-next-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security
Pull security layer updates from Serge Hallyn:
 "This is a merge of James Morris' security-next tree from 3.14 to
  yesterday's master, plus four patches from Paul Moore which are in
  linux-next, plus one patch from Mimi"

* 'serge-next-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux-security:
  ima: audit log files opened with O_DIRECT flag
  selinux: conditionally reschedule in hashtab_insert while loading selinux policy
  selinux: conditionally reschedule in mls_convert_context while loading selinux policy
  selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES
  selinux:  Report permissive mode in avc: denied messages.
  Warning in scanf string typing
  Smack: Label cgroup files for systemd
  Smack: Verify read access on file open - v3
  security: Convert use of typedef ctl_table to struct ctl_table
  Smack: bidirectional UDS connect check
  Smack: Correctly remove SMACK64TRANSMUTE attribute
  SMACK: Fix handling value==NULL in post setxattr
  bugfix patch for SMACK
  Smack: adds smackfs/ptrace interface
  Smack: unify all ptrace accesses in the smack
  Smack: fix the subject/object order in smack_ptrace_traceme()
  Minor improvement of 'smack_sb_kern_mount'
  smack: fix key permission verification
  KEYS: Move the flags representing required permission to linux/key.h
2014-06-10 10:05:36 -07:00
Linus Torvalds
14208b0ec5 Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "A lot of activities on cgroup side.  Heavy restructuring including
  locking simplification took place to improve the code base and enable
  implementation of the unified hierarchy, which currently exists behind
  a __DEVEL__ mount option.  The core support is mostly complete but
  individual controllers need further work.  To explain the design and
  rationales of the the unified hierarchy

        Documentation/cgroups/unified-hierarchy.txt

  is added.

  Another notable change is css (cgroup_subsys_state - what each
  controller uses to identify and interact with a cgroup) iteration
  update.  This is part of continuing updates on css object lifetime and
  visibility.  cgroup started with reference count draining on removal
  way back and is now reaching a point where csses behave and are
  iterated like normal refcnted objects albeit with some complexities to
  allow distinguishing the state where they're being deleted.  The css
  iteration update isn't taken advantage of yet but is planned to be
  used to simplify memcg significantly"

* 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (77 commits)
  cgroup: disallow disabled controllers on the default hierarchy
  cgroup: don't destroy the default root
  cgroup: disallow debug controller on the default hierarchy
  cgroup: clean up MAINTAINERS entries
  cgroup: implement css_tryget()
  device_cgroup: use css_has_online_children() instead of has_children()
  cgroup: convert cgroup_has_live_children() into css_has_online_children()
  cgroup: use CSS_ONLINE instead of CGRP_DEAD
  cgroup: iterate cgroup_subsys_states directly
  cgroup: introduce CSS_RELEASED and reduce css iteration fallback window
  cgroup: move cgroup->serial_nr into cgroup_subsys_state
  cgroup: link all cgroup_subsys_states in their sibling lists
  cgroup: move cgroup->sibling and ->children into cgroup_subsys_state
  cgroup: remove cgroup->parent
  device_cgroup: remove direct access to cgroup->children
  memcg: update memcg_has_children() to use css_next_child()
  memcg: remove tasks/children test from mem_cgroup_force_empty()
  cgroup: remove css_parent()
  cgroup: skip refcnting on normal root csses and cgrp_dfl_root self css
  cgroup: use cgroup->self.refcnt for cgroup refcnting
  ...
2014-06-09 15:03:33 -07:00
Mimi Zohar
f9b2a735bd ima: audit log files opened with O_DIRECT flag
Files are measured or appraised based on the IMA policy.  When a
file, in policy, is opened with the O_DIRECT flag, a deadlock
occurs.

The first attempt at resolving this lockdep temporarily removed the
O_DIRECT flag and restored it, after calculating the hash.  The
second attempt introduced the O_DIRECT_HAVELOCK flag. Based on this
flag, do_blockdev_direct_IO() would skip taking the i_mutex a second
time.  The third attempt, by Dmitry Kasatkin, resolves the i_mutex
locking issue, by re-introducing the IMA mutex, but uncovered
another problem.  Reading a file with O_DIRECT flag set, writes
directly to userspace pages.  A second patch allocates a user-space
like memory.  This works for all IMA hooks, except ima_file_free(),
which is called on __fput() to recalculate the file hash.

Until this last issue is addressed, do not 'collect' the
measurement for measuring, appraising, or auditing files opened
with the O_DIRECT flag set.  Based on policy, permit or deny file
access.  This patch defines a new IMA policy rule option named
'permit_directio'.  Policy rules could be defined, based on LSM
or other criteria, to permit specific applications to open files
with the O_DIRECT flag set.

Changelog v1:
- permit or deny file access based IMA policy rules

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Dmitry Kasatkin <d.kasatkin@samsung.com>
Cc: <stable@vger.kernel.org>
2014-06-03 14:21:50 -05:00
Dave Jones
ed1c96429a selinux: conditionally reschedule in hashtab_insert while loading selinux policy
After silencing the sleeping warning in mls_convert_context() I started
seeing similar traces from hashtab_insert. Do a cond_resched there too.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03 14:21:50 -05:00
Dave Jones
9a591f39a9 selinux: conditionally reschedule in mls_convert_context while loading selinux policy
On a slow machine (with debugging enabled), upgrading selinux policy may take
a considerable amount of time. Long enough that the softlockup detector
gets triggered.

The backtrace looks like this..

 > BUG: soft lockup - CPU#2 stuck for 23s! [load_policy:19045]
 > Call Trace:
 >  [<ffffffff81221ddf>] symcmp+0xf/0x20
 >  [<ffffffff81221c27>] hashtab_search+0x47/0x80
 >  [<ffffffff8122e96c>] mls_convert_context+0xdc/0x1c0
 >  [<ffffffff812294e8>] convert_context+0x378/0x460
 >  [<ffffffff81229170>] ? security_context_to_sid_core+0x240/0x240
 >  [<ffffffff812221b5>] sidtab_map+0x45/0x80
 >  [<ffffffff8122bb9f>] security_load_policy+0x3ff/0x580
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8109c82d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
 >  [<ffffffff81279a2e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 >  [<ffffffff810d28a8>] ? rcu_irq_exit+0x68/0xb0
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8121e947>] sel_write_load+0xa7/0x770
 >  [<ffffffff81139633>] ? vfs_write+0x1c3/0x200
 >  [<ffffffff81210e8e>] ? security_file_permission+0x1e/0xa0
 >  [<ffffffff8113952b>] vfs_write+0xbb/0x200
 >  [<ffffffff811581c7>] ? fget_light+0x397/0x4b0
 >  [<ffffffff81139c27>] SyS_write+0x47/0xa0
 >  [<ffffffff8153bde4>] tracesys+0xdd/0xe2

Stephen Smalley suggested:

 > Maybe put a cond_resched() within the ebitmap_for_each_positive_bit()
 > loop in mls_convert_context()?

That seems to do the trick. Tested by downgrading and re-upgrading selinux-policy-targeted.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03 14:21:49 -05:00
Paul Moore
5b589d44fa selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES
We presently prevent processes from using setexecon() to set the
security label of exec()'d processes when NO_NEW_PRIVS is enabled by
returning an error; however, we silently ignore setexeccon() when
exec()'ing from a nosuid mounted filesystem.  This patch makes things
a bit more consistent by returning an error in the setexeccon()/nosuid
case.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2014-06-03 14:21:48 -05:00
Stephen Smalley
ca7786a2f9 selinux: Report permissive mode in avc: denied messages.
We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-06-03 14:21:48 -05:00
David S. Miller
54e5c4def0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_alb.c
	drivers/net/ethernet/altera/altera_msgdma.c
	drivers/net/ethernet/altera/altera_sgdma.c
	net/ipv6/xfrm6_output.c

Several cases of overlapping changes.

The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.

In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.

Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24 00:32:30 -04:00
James Morris
2fd4e6698f Merge branch 'smack-for-3.16' of git://git.gitorious.org/smack-next/kernel into next 2014-05-20 14:50:09 +10:00
Tejun Heo
7a3bb24f7c device_cgroup: use css_has_online_children() instead of has_children()
devcgroup_update_access() wants to know whether there are child
cgroups which are online and visible to userland and has_children()
may return false positive.  Replace it with css_has_online_children().

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-05-16 13:22:52 -04:00
Tejun Heo
5877019d97 device_cgroup: remove direct access to cgroup->children
Currently, devcg::has_children() directly tests cgroup->children for
list emptiness.  The field is not a published field and scheduled to
go away.  In addition, the test isn't strictly correct as devcg should
only care about children which are visible to userland.

This patch converts has_children() to use css_next_child() instead.
The subtle incorrectness is noted and will be dealt with later.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
2014-05-16 13:22:48 -04:00
Tejun Heo
5c9d535b89 cgroup: remove css_parent()
cgroup in general is moving towards using cgroup_subsys_state as the
fundamental structural component and css_parent() was introduced to
convert from using cgroup->parent to css->parent.  It was quite some
time ago and we're moving forward with making css more prominent.

This patch drops the trivial wrapper css_parent() and let the users
dereference css->parent.  While at it, explicitly mark fields of css
which are public and immutable.

v2: New usage from device_cgroup.c converted.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Johannes Weiner <hannes@cmpxchg.org>
2014-05-16 13:22:48 -04:00
Dave Jones
47dd0b76ac selinux: conditionally reschedule in hashtab_insert while loading selinux policy
After silencing the sleeping warning in mls_convert_context() I started
seeing similar traces from hashtab_insert. Do a cond_resched there too.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-05-15 17:07:55 -04:00
Dave Jones
612c353178 selinux: conditionally reschedule in mls_convert_context while loading selinux policy
On a slow machine (with debugging enabled), upgrading selinux policy may take
a considerable amount of time. Long enough that the softlockup detector
gets triggered.

The backtrace looks like this..

 > BUG: soft lockup - CPU#2 stuck for 23s! [load_policy:19045]
 > Call Trace:
 >  [<ffffffff81221ddf>] symcmp+0xf/0x20
 >  [<ffffffff81221c27>] hashtab_search+0x47/0x80
 >  [<ffffffff8122e96c>] mls_convert_context+0xdc/0x1c0
 >  [<ffffffff812294e8>] convert_context+0x378/0x460
 >  [<ffffffff81229170>] ? security_context_to_sid_core+0x240/0x240
 >  [<ffffffff812221b5>] sidtab_map+0x45/0x80
 >  [<ffffffff8122bb9f>] security_load_policy+0x3ff/0x580
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810786dd>] ? sched_clock_local+0x1d/0x80
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff8103096a>] ? __change_page_attr_set_clr+0x82a/0xa50
 >  [<ffffffff810788a8>] ? sched_clock_cpu+0xa8/0x100
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8109c82d>] ? trace_hardirqs_on_caller+0xfd/0x1c0
 >  [<ffffffff81279a2e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 >  [<ffffffff810d28a8>] ? rcu_irq_exit+0x68/0xb0
 >  [<ffffffff81534ddc>] ? retint_restore_args+0xe/0xe
 >  [<ffffffff8121e947>] sel_write_load+0xa7/0x770
 >  [<ffffffff81139633>] ? vfs_write+0x1c3/0x200
 >  [<ffffffff81210e8e>] ? security_file_permission+0x1e/0xa0
 >  [<ffffffff8113952b>] vfs_write+0xbb/0x200
 >  [<ffffffff811581c7>] ? fget_light+0x397/0x4b0
 >  [<ffffffff81139c27>] SyS_write+0x47/0xa0
 >  [<ffffffff8153bde4>] tracesys+0xdd/0xe2

Stephen Smalley suggested:

 > Maybe put a cond_resched() within the ebitmap_for_each_positive_bit()
 > loop in mls_convert_context()?

That seems to do the trick. Tested by downgrading and re-upgrading selinux-policy-targeted.

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-05-15 17:06:14 -04:00
Paul Moore
4f189988a0 selinux: reject setexeccon() on MNT_NOSUID applications with -EACCES
We presently prevent processes from using setexecon() to set the
security label of exec()'d processes when NO_NEW_PRIVS is enabled by
returning an error; however, we silently ignore setexeccon() when
exec()'ing from a nosuid mounted filesystem.  This patch makes things
a bit more consistent by returning an error in the setexeccon()/nosuid
case.

Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2014-05-15 11:16:06 -04:00
Tejun Heo
451af504df cgroup: replace cftype->write_string() with cftype->write()
Convert all cftype->write_string() users to the new cftype->write()
which maps directly to kernfs write operation and has full access to
kernfs and cgroup contexts.  The conversions are mostly mechanical.

* @css and @cft are accessed using of_css() and of_cft() accessors
  respectively instead of being specified as arguments.

* Should return @nbytes on success instead of 0.

* @buf is not trimmed automatically.  Trim if necessary.  Note that
  blkcg and netprio don't need this as the parsers already handle
  whitespaces.

cftype->write_string() has no user left after the conversions and
removed.

While at it, remove unnecessary local variable @p in
cgroup_subtree_control_write() and stale comment about
CGROUP_LOCAL_BUFFER_SIZE in cgroup_freezer.c.

This patch doesn't introduce any visible behavior changes.

v2: netprio was missing from conversion.  Converted.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
2014-05-13 12:16:21 -04:00
Linus Torvalds
26a41cd1ee Merge branch 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "During recent restructuring, device_cgroup unified config input check
  and enforcement logic; unfortunately, it turned out to share too much.
  Aristeu's patches fix the breakage and marked for -stable backport.

  The other two patches are fallouts from kernfs conversion.  The blkcg
  change is temporary and will go away once kernfs internal locking gets
  simplified (patches pending)"

* 'for-3.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  blkcg: use trylock on blkcg_pol_mutex in blkcg_reset_stats()
  device_cgroup: check if exception removal is allowed
  device_cgroup: fix the comment format for recently added functions
  device_cgroup: rework device access check and exception checking
  cgroup: fix the retry path of cgroup_mount()
2014-05-13 11:22:57 +09:00
David S. Miller
5f013c9bc7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/altera/altera_sgdma.c
	net/netlink/af_netlink.c
	net/sched/cls_api.c
	net/sched/sch_api.c

The netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces.  These were simple transformations from
netlink_capable to netlink_ns_capable.

The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-12 13:19:14 -04:00
Linus Torvalds
8169d3005e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "dcache fixes + kvfree() (uninlined, exported by mm/util.c) + posix_acl
  bugfix from hch"

The dcache fixes are for a subtle LRU list corruption bug reported by
Miklos Szeredi, where people inside IBM saw list corruptions with the
LTP/host01 test.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nick kvfree() from apparmor
  posix_acl: handle NULL ACL in posix_acl_equiv_mode
  dcache: don't need rcu in shrink_dentry_list()
  more graceful recovery in umount_collect()
  don't remove from shrink list in select_collect()
  dentry_kill(): don't try to remove from shrink list
  expand the call of dentry_lru_del() in dentry_kill()
  new helper: dentry_free()
  fold try_prune_one_dentry()
  fold d_kill() and d_free()
  fix races between __d_instantiate() and checks of dentry flags
2014-05-06 12:22:20 -07:00
Toralf Förster
ec554fa75e Warning in scanf string typing
This fixes a warning about the mismatch of types between
the declared unsigned and integer.

Signed-off-by: Toralf Förster <toralf.foerster@gmx.de>
2014-05-06 11:32:53 -07:00
Al Viro
39f1f78d53 nick kvfree() from apparmor
too many places open-code it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-06 14:02:53 -04:00
Aristeu Rozanski
d2c2b11cfa device_cgroup: check if exception removal is allowed
[PATCH v3 1/2] device_cgroup: check if exception removal is allowed

When the device cgroup hierarchy was introduced in
	bd2953ebbb - devcg: propagate local changes down the hierarchy

a specific case was overlooked. Consider the hierarchy bellow:

	A	default policy: ALLOW, exceptions will deny access
	 \
	  B	default policy: ALLOW, exceptions will deny access

There's no need to verify when an new exception is added to B because
in this case exceptions will deny access to further devices, which is
always fine. Hierarchy in device cgroup only makes sure B won't have
more access than A.

But when an exception is removed (by writing devices.allow), it isn't
checked if the user is in fact removing an inherited exception from A,
thus giving more access to B.

Example:

	# echo 'a' >A/devices.allow
	# echo 'c 1:3 rw' >A/devices.deny
	# echo $$ >A/B/tasks
	# echo >/dev/null
	-bash: /dev/null: Operation not permitted
	# echo 'c 1:3 w' >A/B/devices.allow
	# echo >/dev/null
	#

This shouldn't be allowed and this patch fixes it by making sure to never allow
exceptions in this case to be removed if the exception is partially or fully
present on the parent.

v3: missing '*' in function description
v2: improved log message and formatting fixes

Cc: cgroups@vger.kernel.org
Cc: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-05-05 11:20:12 -04:00
Aristeu Rozanski
f5f3cf6f7e device_cgroup: fix the comment format for recently added functions
Moving more extensive explanations to the end of the comment.

Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-05-04 15:21:09 -04:00
Stephen Smalley
626b9740fa selinux: Report permissive mode in avc: denied messages.
We cannot presently tell from an avc: denied message whether access was in
fact denied or was allowed due to global or per-domain permissive mode.
Add a permissive= field to the avc message to reflect this information.

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
2014-05-01 14:56:14 -04:00
Casey Schaufler
36ea735b52 Smack: Label cgroup files for systemd
The cgroup filesystem isn't ready for an LSM to
properly use extented attributes. This patch makes
files created in the cgroup filesystem usable by
a system running Smack and systemd.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-04-30 10:49:33 -07:00
Casey Schaufler
a6834c0b91 Smack: Verify read access on file open - v3
Smack believes that many of the operatons that can
be performed on an open file descriptor are read operations.
The fstat and lseek system calls are examples.
An implication of this is that files shouldn't be open
if the task doesn't have read access even if it has
write access and the file is being opened write only.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-04-23 08:52:39 -07:00
Richard Guy Briggs
3a101b8de0 audit: add netlink audit protocol bind to check capabilities on multicast join
Register a netlink per-protocol bind fuction for audit to check userspace
process capabilities before allowing a multicast group connection.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-22 21:42:27 -04:00
Jeff Layton
0d3f7a2dd2 locks: rename file-private locks to "open file description locks"
File-private locks have been merged into Linux for v3.15, and *now*
people are commenting that the name and macro definitions for the new
file-private locks suck.

...and I can't even disagree. The names and command macros do suck.

We're going to have to live with these for a long time, so it's
important that we be happy with the names before we're stuck with them.
The consensus on the lists so far is that they should be rechristened as
"open file description locks".

The name isn't a big deal for the kernel, but the command macros are not
visually distinct enough from the traditional POSIX lock macros. The
glibc and documentation folks are recommending that we change them to
look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
programmer will typo one of the commands wrong, and also makes it easier
to spot this difference when reading code.

This patch makes the following changes that I think are necessary before
v3.15 ships:

1) rename the command macros to their new names. These end up in the uapi
   headers and so are part of the external-facing API. It turns out that
   glibc doesn't actually use the fcntl.h uapi header, but it's hard to
   be sure that something else won't. Changing it now is safest.

2) make the the /proc/locks output display these as type "OFDLCK"

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Carlos O'Donell <carlos@redhat.com>
Cc: Stefan Metzmacher <metze@samba.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Frank Filz <ffilzlnx@mindspring.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
2014-04-22 08:23:58 -04:00
Aristeu Rozanski
79d719749d device_cgroup: rework device access check and exception checking
Whenever a device file is opened and checked against current device
cgroup rules, it uses the same function (may_access()) as when a new
exception rule is added by writing devices.{allow,deny}. And in both
cases, the algorithm is the same, doesn't matter the behavior.

First problem is having device access to be considered the same as rule
checking. Consider the following structure:

	A	(default behavior: allow, exceptions disallow access)
	 \
	  B	(default behavior: allow, exceptions disallow access)

A new exception is added to B by writing devices.deny:

	c 12:34 rw

When checking if that exception is allowed in may_access():

	if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW) {
		if (behavior == DEVCG_DEFAULT_ALLOW) {
			/* the exception will deny access to certain devices */
			return true;

Which is ok, since B is not getting more privileges than A, it doesn't
matter and the rule is accepted

Now, consider it's a device file open check and the process belongs to
cgroup B. The access will be generated as:

	behavior: allow
	exception: c 12:34 rw

The very same chunk of code will allow it, even if there's an explicit
exception telling to do otherwise.

A simple test case:

	# mkdir new_group
	# cd new_group
	# echo $$ >tasks
	# echo "c 1:3 w" >devices.deny
	# echo >/dev/null
	# echo $?
	0

This is a serious bug and was introduced on

	c39a2a3018 devcg: prepare may_access() for hierarchy support

To solve this problem, the device file open function was split from the
new exception check.

Second problem is how exceptions are processed by may_access(). The
first part of the said function tries to match fully with an existing
exception:

	list_for_each_entry_rcu(ex, &dev_cgroup->exceptions, list) {
		if ((refex->type & DEV_BLOCK) && !(ex->type & DEV_BLOCK))
			continue;
		if ((refex->type & DEV_CHAR) && !(ex->type & DEV_CHAR))
			continue;
		if (ex->major != ~0 && ex->major != refex->major)
			continue;
		if (ex->minor != ~0 && ex->minor != refex->minor)
			continue;
		if (refex->access & (~ex->access))
			continue;
		match = true;
		break;
	}

That means the new exception should be contained into an existing one to
be considered a match:

	New exception		Existing	match?	notes
	b 12:34 rwm		b 12:34 rwm	yes
	b 12:34 r		b *:34 rw	yes
	b 12:34 rw		b 12:34 w	no	extra "r"
	b *:34 rw		b 12:34 rw	no	too broad "*"
	b *:34 rw		b *:34 rwm	yes

Which is fine in some cases. Consider:

	A	(default behavior: deny, exceptions allow access)
	 \
	  B	(default behavior: deny, exceptions allow access)

In this case the full match makes sense, the new exception cannot add
more access than the parent allows

But this doesn't always work, consider:

	A	(default behavior: allow, exceptions disallow access)
	 \
	  B	(default behavior: deny, exceptions allow access)

In this case, a new exception in B shouldn't match any of the exceptions
in A, after all you can't allow something that was forbidden by A. But
consider this scenario:

	New exception	Existing in A	match?	outcome
	b 12:34 rw	b 12:34 r	no	exception is accepted

Because the new exception has "w" as extra, it doesn't match, so it'll
be added to B's exception list.

The same problem can happen during a file access check. Consider a
cgroup with allow as default behavior:

	Access		Exception	match?
	b 12:34 rw	b 12:34 r	no

In this case, the access didn't match any of the exceptions in the
cgroup, which is required since exceptions will disallow access.

To solve this problem, two new functions were created to match an
exception either fully or partially. In the example above, a partial
check will be performed and it'll produce a match since at least
"b 12:34 r" from "b 12:34 rw" access matches.

Cc: cgroups@vger.kernel.org
Cc: Tejun Heo <tj@kernel.org>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-04-21 18:21:42 -04:00
Joe Perches
fab71a90ed security: Convert use of typedef ctl_table to struct ctl_table
This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2014-04-15 13:39:58 +10:00
James Morris
b13cebe707 Merge tag 'keys-20140314' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into next 2014-04-14 11:42:49 +10:00
James Morris
ecd740c6f2 Merge commit 'v3.14' into next 2014-04-14 11:23:14 +10:00
Linus Torvalds
5166701b36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencie between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe->buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
2014-04-12 14:49:50 -07:00
Linus Torvalds
0b747172dc Merge git://git.infradead.org/users/eparis/audit
Pull audit updates from Eric Paris.

* git://git.infradead.org/users/eparis/audit: (28 commits)
  AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
  audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
  audit: do not cast audit_rule_data pointers pointlesly
  AUDIT: Allow login in non-init namespaces
  audit: define audit_is_compat in kernel internal header
  kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
  sched: declare pid_alive as inline
  audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
  syscall_get_arch: remove useless function arguments
  audit: remove stray newline from audit_log_execve_info() audit_panic() call
  audit: remove stray newlines from audit_log_lost messages
  audit: include subject in login records
  audit: remove superfluous new- prefix in AUDIT_LOGIN messages
  audit: allow user processes to log from another PID namespace
  audit: anchor all pid references in the initial pid namespace
  audit: convert PPIDs to the inital PID namespace.
  pid: get pid_t ppid of task in init_pid_ns
  audit: rename the misleading audit_get_context() to audit_take_context()
  audit: Add generic compat syscall support
  audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
  ...
2014-04-12 12:38:53 -07:00
Casey Schaufler
54e70ec5eb Smack: bidirectional UDS connect check
Smack IPC policy requires that the sender have write access
to the receiver. UDS streams don't do per-packet checks. The
only check is done at connect time. The existing code checks
if the connecting process can write to the other, but not the
other way around. This change adds a check that the other end
can write to the connecting process.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schuafler <casey@schaufler-ca.com>
2014-04-11 14:35:28 -07:00
Casey Schaufler
f59bdfba3e Smack: Correctly remove SMACK64TRANSMUTE attribute
Sam Henderson points out that removing the SMACK64TRANSMUTE
attribute from a directory does not result in the directory
transmuting. This is because the inode flag indicating that
the directory is transmuting isn't cleared. The fix is a tad
less than trivial because smk_task and smk_mmap should have
been broken out, too.

Targeted for git://git.gitorious.org/smack-next/kernel.git

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2014-04-11 14:35:19 -07:00
José Bollo
9598f4c9e7 SMACK: Fix handling value==NULL in post setxattr
The function `smack_inode_post_setxattr` is called each
time that a setxattr is done, for any value of name.
The kernel allow to put value==NULL when size==0
to set an empty attribute value. The systematic
call to smk_import_entry was causing the dereference
of a NULL pointer hence a KERNEL PANIC!

The problem can be produced easily by issuing the
command `setfattr -n user.data file` under bash prompt
when SMACK is active.

Moving the call to smk_import_entry as proposed by this
patch is correcting the behaviour because the function
smack_inode_post_setxattr is called for the SMACK's
attributes only if the function smack_inode_setxattr validated
the value and its size (what will not be the case when size==0).

It also has a benefical effect to not fill the smack hash
with garbage values coming from any extended attribute
write.

Change-Id: Iaf0039c2be9bccb6cee11c24a3b44d209101fe47
Signed-off-by: José Bollo <jose.bollo@open.eurogiciel.org>
2014-04-11 14:35:05 -07:00