Commit Graph

561 Commits

Author SHA1 Message Date
Dan Carpenter
3ad49d37cf smackfs: Prevent underflow in smk_set_cipso()
There is a upper bound to "catlen" but no lower bound to prevent
negatives.  I don't see that this necessarily causes a problem but we
may as well be safe.

Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-08-07 14:09:23 -07:00
Tóth János
c47b658400 security: smack: smackfs: fix typo (lables->labels)
Fix a spelling error in smakcfs.

Signed-off-by: Tóth János <gomba007@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-08-07 14:09:23 -07:00
Roberto Sassu
baed456a6a smack: Set the SMACK64TRANSMUTE xattr in smack_inode_init_security()
With the newly added ability of LSMs to supply multiple xattrs, set
SMACK64TRASMUTE in smack_inode_init_security(), instead of d_instantiate().
Do it by incrementing SMACK_INODE_INIT_XATTRS to 2 and by calling
lsm_get_xattr_slot() a second time, if the transmuting conditions are met.

The LSM infrastructure passes all xattrs provided by LSMs to the
filesystems through the initxattrs() callback, so that filesystems can
store xattrs in the disk.

After the change, the SMK_INODE_TRANSMUTE inode flag is always set by
d_instantiate() after fetching SMACK64TRANSMUTE from the disk. Before it
was done by smack_inode_post_setxattr() as result of the __vfs_setxattr()
call.

Removing __vfs_setxattr() also prevents invalidating the EVM HMAC, by
adding a new xattr without checking and updating the existing HMAC.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 13:59:38 -04:00
Roberto Sassu
6bcdfd2cac security: Allow all LSMs to provide xattrs for inode_init_security hook
Currently, the LSM infrastructure supports only one LSM providing an xattr
and EVM calculating the HMAC on that xattr, plus other inode metadata.

Allow all LSMs to provide one or multiple xattrs, by extending the security
blob reservation mechanism. Introduce the new lbs_xattr_count field of the
lsm_blob_sizes structure, so that each LSM can specify how many xattrs it
needs, and the LSM infrastructure knows how many xattr slots it should
allocate.

Modify the inode_init_security hook definition, by passing the full
xattr array allocated in security_inode_init_security(), and the current
number of xattr slots in that array filled by LSMs. The first parameter
would allow EVM to access and calculate the HMAC on xattrs supplied by
other LSMs, the second to not leave gaps in the xattr array, when an LSM
requested but did not provide xattrs (e.g. if it is not initialized).

Introduce lsm_get_xattr_slot(), which LSMs can call as many times as the
number specified in the lbs_xattr_count field of the lsm_blob_sizes
structure. During each call, lsm_get_xattr_slot() increments the number of
filled xattrs, so that at the next invocation it returns the next xattr
slot to fill.

Cleanup security_inode_init_security(). Unify the !initxattrs and
initxattrs case by simply not allocating the new_xattrs array in the
former. Update the documentation to reflect the changes, and fix the
description of the xattr name, as it is not allocated anymore.

Adapt both SELinux and Smack to use the new definition of the
inode_init_security hook, and to call lsm_get_xattr_slot() to obtain and
fill the reserved slots in the xattr array.

Move the xattr->name assignment after the xattr->value one, so that it is
done only in case of successful memory allocation.

Finally, change the default return value of the inode_init_security hook
from zero to -EOPNOTSUPP, so that BPF LSM correctly follows the hook
conventions.

Reported-by: Nicolas Bouchinet <nicolas.bouchinet@clip-os.org>
Link: https://lore.kernel.org/linux-integrity/Y1FTSIo+1x+4X0LS@archlinux/
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: minor comment and variable tweaks, approved by RS]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-07-10 13:59:37 -04:00
Roberto Sassu
2c085f3a8f smack: Record transmuting in smk_transmuted
smack_dentry_create_files_as() determines whether transmuting should occur
based on the label of the parent directory the new inode will be added to,
and not the label of the directory where it is created.

This helps for example to do transmuting on overlayfs, since the latter
first creates the inode in the working directory, and then moves it to the
correct destination.

However, despite smack_dentry_create_files_as() provides the correct label,
smack_inode_init_security() does not know from passed information whether
or not transmuting occurred. Without this information,
smack_inode_init_security() cannot set SMK_INODE_CHANGED in smk_flags,
which will result in the SMACK64TRANSMUTE xattr not being set in
smack_d_instantiate().

Thus, add the smk_transmuted field to the task_smack structure, and set it
in smack_dentry_create_files_as() to smk_task if transmuting occurred. If
smk_task is equal to smk_transmuted in smack_inode_init_security(), act as
if transmuting was successful but without taking the label from the parent
directory (the inode label was already set correctly from the current
credentials in smack_inode_alloc_security()).

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-05-11 10:05:39 -07:00
Roberto Sassu
3a3d8fce31 smack: Retrieve transmuting information in smack_inode_getsecurity()
Enhance smack_inode_getsecurity() to retrieve the value for
SMACK64TRANSMUTE from the inode security blob, similarly to SMACK64.

This helps to display accurate values in the situation where the security
labels come from mount options and not from xattrs.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-05-11 10:05:38 -07:00
Linus Torvalds
dc7e22a368 Smack updates for v6.4
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmRGv+4XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBE4QxAAkHiCueaplFsGvYhtx6aeajNC
 0ScA84efBMhQJ/jP4FsTh893bGUkbDv+dyasAVOoAdfFPfgpecEOELzhOaaXv5l2
 8pZ1CtTPXU9h5Csg7D6idII/EyzBUkKDCLbrZexT6A6ZEl0xTqY6Pz6/3uee/W4J
 Z/84U1lX/GgI/SzV6JFcO0XYDj2yp7cfdwIzPUHRky5HgPgLm3roB+eZQwONHfYl
 qYX5xAYCxMx6Uqx3kFb+wgXEJ71lFQGBd7zAZsinGqlrH0vIA63BqpxcHPhYTJNl
 9Y/t6Mb9ds2C1CCGhQTPn/m4hcqYcA5oLuhGWNhOeXMX450XBQ4v7nRw45Dkb1Sa
 IPwJTPfuH2I5r5dOW8cGVCrDp5OT+XQJ5OrsIBtdrPxPGX8x6XyaC4DLG3mympC6
 UfBxdP60Jtm/PRuLCX3tX92zzXhFuqt63Gw87b6htlgEPpirJlhZaEiCYKGlshS1
 b6+kMn1snCxqbBvE/jI3FKHp/C8F/lKNnuVRid9J6HkoyABubWMZ3UIAY+SkVw6b
 9BuF8dn+S/HOqPiijDDnwjnnhHFJQg3F8XRCmNP9MsDqfajcwWHs9ik0NLSMfD50
 CXpp3WIZDVGllmNSeYgkkZKuYV+yNbydLU+DaMfWEkOS7euRoaDozShVJdBTRfnV
 7PYZ3V4KhWkNCWXWfbw=
 =Ynnl
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.4' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "There are two changes, one small and one more substantial:

   - Remove of an unnecessary cast

   - The mount option processing introduced with the mount rework makes
     copies of mount option values. There is no good reason to make
     copies of Smack labels, as they are maintained on a list and never
     removed.

     The code now uses pointers to entries on the list, reducing
     processing time and memory use"

* tag 'Smack-for-6.4' of https://github.com/cschaufler/smack-next:
  Smack: Improve mount process memory use
  smack_lsm: remove unnecessary type casting
2023-04-24 11:37:24 -07:00
Casey Schaufler
de93e515db Smack: Improve mount process memory use
The existing mount processing code in Smack makes many unnecessary
copies of Smack labels. Because Smack labels never go away once
imported it is safe to use pointers to them rather than copies.
Replace the use of copies of label names to pointers to the global
label list entries.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-04-05 08:46:14 -07:00
Paul Moore
f22f9aaf6c selinux: remove the runtime disable functionality
After working with the larger SELinux-based distros for several
years, we're finally at a place where we can disable the SELinux
runtime disable functionality.  The existing kernel deprecation
notice explains the functionality and why we want to remove it:

  The selinuxfs "disable" node allows SELinux to be disabled at
  runtime prior to a policy being loaded into the kernel.  If
  disabled via this mechanism, SELinux will remain disabled until
  the system is rebooted.

  The preferred method of disabling SELinux is via the "selinux=0"
  boot parameter, but the selinuxfs "disable" node was created to
  make it easier for systems with primitive bootloaders that did not
  allow for easy modification of the kernel command line.
  Unfortunately, allowing for SELinux to be disabled at runtime makes
  it difficult to secure the kernel's LSM hooks using the
  "__ro_after_init" feature.

It is that last sentence, mentioning the '__ro_after_init' hardening,
which is the real motivation for this change, and if you look at the
diffstat you'll see that the impact of this patch reaches across all
the different LSMs, helping prevent tampering at the LSM hook level.

From a SELinux perspective, it is important to note that if you
continue to disable SELinux via "/etc/selinux/config" it may appear
that SELinux is disabled, but it is simply in an uninitialized state.
If you load a policy with `load_policy -i`, you will see SELinux
come alive just as if you had loaded the policy during early-boot.

It is also worth noting that the "/sys/fs/selinux/disable" file is
always writable now, regardless of the Kconfig settings, but writing
to the file has no effect on the system, other than to display an
error on the console if a non-zero/true value is written.

Finally, in the several years where we have been working on
deprecating this functionality, there has only been one instance of
someone mentioning any user visible breakage.  In this particular
case it was an individual's kernel test system, and the workaround
documented in the deprecation notice ("selinux=0" on the kernel
command line) resolved the issue without problem.

Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-03-20 12:34:23 -04:00
XU pengfei
502a29b04d smack_lsm: remove unnecessary type casting
Remove unnecessary type casting.
The type of inode variable is struct inode *, so no type casting required.

Signed-off-by: XU pengfei <xupengfei@nfschina.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-03-08 09:35:20 -08:00
Linus Torvalds
77bc1bb184 One fix for resetting CIPSO labeling.
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmP1G00XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFKjxAApRk/kVuNJArDxf+0gruF/5R+
 Ey+8olXAnGELoS2YZeg3pfGyB+BkQjqscAcJTj9zQ8IF7k8CCTP1V9TpcOdV3etU
 8LCyvUMf+dquLcEbEwgWp76iXmKM0dsrizzv0jAz+J9dvmWa7moVZ8mnDIlDMrju
 UfxH6+dHAzqk6HtXqbkMNnQM3+bABaBXP62ba9je8qwdLEoAXSS+MpDMnkPTnVgd
 EtL+W5RiR/AczG4DMpqFJN8HwF0jVloEGp7DLUyAICyU0DLVk9ZObkKO8Tg8FT4e
 krMT6csUXdfrrkIbXEAL7CogQE5chpez2d8IVn3QfipPePxspAPTmfqVX8CclL3o
 Nwudb+h45ZwZ4aM3FE8Na6ZcyNF9bzytmk3SkzcHjgixYDAjRZymVZvMzL2dIFWt
 PlJhMeTWlG9BRVc011K4GEFsNAqwuOhjS/7hIDKXsvXskms+OpNudVlBpyp0wf71
 5lH1595hbZKIndqGcxKtXCMxjbiIn1ZfsbyqMT8onPLRMLlllweGwISyJIK+nMRo
 2nVU9rEWgy8XAiw0sTenbpNvuO4EcDu9pVkj/jA6llgeb4DqLeGWjIWRUY/W30Pp
 +029mpvYM9ZzFFM/U+hI4C+upVq5FqUlcOH63BbbtgCSPigjrvuObbfTa1rxVLMd
 6qQRFQCvJJIgG/QnqD0=
 =QkGV
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.3' of https://github.com/cschaufler/smack-next

Pull smack update from Casey Schaufler:
 "One fix for resetting CIPSO labeling"

* tag 'Smack-for-6.3' of https://github.com/cschaufler/smack-next:
  smackfs: Added check catlen
2023-02-22 12:52:59 -08:00
Denis Arefev
ccfd889acb smackfs: Added check catlen
If the catlen is 0, the memory for the netlbl_lsm_catmap
  structure must be allocated anyway, otherwise the check of
  such rules is not completed correctly.

Signed-off-by: Denis Arefev <arefev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2023-02-21 11:22:02 -08:00
Christian Brauner
700b794052
fs: port acl to mnt_idmap
Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-01-19 09:24:28 +01:00
Christian Brauner
39f60c1cce
fs: port xattr to mnt_idmap
Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-01-19 09:24:28 +01:00
Christian Brauner
4609e1f18e
fs: port ->permission() to pass mnt_idmap
Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2023-01-19 09:24:28 +01:00
Linus Torvalds
c76ff350bd lsm/stable-6.2 PR 20221212
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmOXmxkUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMPXg//cxfYC8lRtVpuGNCZWDietSiHzpzu
 +qFntaTplvybJMQX0HfgNee5cTBZM+W5mp1BHRcZInvV5LRhyrVtgsxDBifutE4x
 LyUJAw5SkiPdRC+XLDIRLKiZCobFBLVs2zO+qibIqsyR60pFjU6WXBLbJfidXBFR
 yWudDbLU0YhQJCHdNHNqnHCgqrEculxn6q3QPvm/DX0xzBwkFHSSYBkGNvHW2ZTA
 lKNreEOwEk5DTLIKjP4bJ72ixp0xbshw5CXuxtwB/12/4h8QbWbJVQLlIeZrTLmp
 zQXQLJ3pCqKJ2OUCgMDK+wmkvLezd80BV3Due7KX0pT0YRDygoh5QEpZ5/8k8eG7
 prxToh2gJWk2htfJF6kgMpAh9Jqewcke4BysbYVM/427OPZYwQqLDZDGOzbtT6pl
 FYF+adN9wwkAErnHnPlzYipUEpBWurbjtsV8KFWNERoZ4YmzfSPEisRqGIHDGRws
 bTyq/7qs5FXkb1zULELj8V+S2ULsmxPqsxJ63p9di54Uo9lHK0I+0IUtajGDdfze
 psAasa9DD/oH2PAbSmpQ5Xo9XyfHRXsVuz1twEmEA14ML0m4wHbNWVHaK0aaXVdG
 kJKSDSjMsiV+GiwNo7ISJ4pVdUpnMI/iZSghFfV28cJslNhJDeaREHaE/Wtn1/xF
 /bCVmEfS16UoJsQ=
 =klFk
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Improve the error handling in the device cgroup such that memory
   allocation failures when updating the access policy do not
   potentially alter the policy.

 - Some minor fixes to reiserfs to ensure that it properly releases
   LSM-related xattr values.

 - Update the security_socket_getpeersec_stream() LSM hook to take
   sockptr_t values.

   Previously the net/BPF folks updated the getsockopt code in the
   network stack to leverage the sockptr_t type to make it easier to
   pass both kernel and __user pointers, but unfortunately when they did
   so they didn't convert the LSM hook.

   While there was/is no immediate risk by not converting the LSM hook,
   it seems like this is a mistake waiting to happen so this patch
   proactively does the LSM hook conversion.

 - Convert vfs_getxattr_alloc() to return an int instead of a ssize_t
   and cleanup the callers. Internally the function was never going to
   return anything larger than an int and the callers were doing some
   very odd things casting the return value; this patch fixes all that
   and helps bring a bit of sanity to vfs_getxattr_alloc() and its
   callers.

 - More verbose, and helpful, LSM debug output when the system is booted
   with "lsm.debug" on the command line. There are examples in the
   commit description, but the quick summary is that this patch provides
   better information about which LSMs are enabled and the ordering in
   which they are processed.

 - General comment and kernel-doc fixes and cleanups.

* tag 'lsm-pr-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lsm: Fix description of fs_context_parse_param
  lsm: Add/fix return values in lsm_hooks.h and fix formatting
  lsm: Clarify documentation of vm_enough_memory hook
  reiserfs: Add missing calls to reiserfs_security_free()
  lsm,fs: fix vfs_getxattr_alloc() return type and caller error paths
  device_cgroup: Roll back to original exceptions after copy failure
  LSM: Better reporting of actual LSMs at boot
  lsm: make security_socket_getpeersec_stream() sockptr_t safe
  audit: Fix some kernel-doc warnings
  lsm: remove obsoleted comments for security hooks
  fs: edit a comment made in bad taste
2022-12-13 09:47:48 -08:00
Paul Moore
b10b9c342f lsm: make security_socket_getpeersec_stream() sockptr_t safe
Commit 4ff09db1b7 ("bpf: net: Change sk_getsockopt() to take the
sockptr_t argument") made it possible to call sk_getsockopt()
with both user and kernel address space buffers through the use of
the sockptr_t type.  Unfortunately at the time of conversion the
security_socket_getpeersec_stream() LSM hook was written to only
accept userspace buffers, and in a desire to avoid having to change
the LSM hook the commit author simply passed the sockptr_t's
userspace buffer pointer.  Since the only sk_getsockopt() callers
at the time of conversion which used kernel sockptr_t buffers did
not allow SO_PEERSEC, and hence the
security_socket_getpeersec_stream() hook, this was acceptable but
also very fragile as future changes presented the possibility of
silently passing kernel space pointers to the LSM hook.

There are several ways to protect against this, including careful
code review of future commits, but since relying on code review to
catch bugs is a recipe for disaster and the upstream eBPF maintainer
is "strongly against defensive programming", this patch updates the
LSM hook, and all of the implementations to support sockptr_t and
safely handle both user and kernel space buffers.

Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-11-04 23:25:30 -04:00
Christian Brauner
44faac01cd
smack: implement get, set and remove acl hook
The current way of setting and getting posix acls through the generic
xattr interface is error prone and type unsafe. The vfs needs to
interpret and fixup posix acls before storing or reporting it to
userspace. Various hacks exist to make this work. The code is hard to
understand and difficult to maintain in it's current form. Instead of
making this work by hacking posix acls through xattr handlers we are
building a dedicated posix acl api around the get and set inode
operations. This removes a lot of hackiness and makes the codepaths
easier to maintain. A lot of background can be found in [1].

So far posix acls were passed as a void blob to the security and
integrity modules. Some of them like evm then proceed to interpret the
void pointer and convert it into the kernel internal struct posix acl
representation to perform their integrity checking magic. This is
obviously pretty problematic as that requires knowledge that only the
vfs is guaranteed to have and has lead to various bugs. Add a proper
security hook for setting posix acls and pass down the posix acls in
their appropriate vfs format instead of hacking it through a void
pointer stored in the uapi format.

I spent considerate time in the security module infrastructure and
audited all codepaths. Smack has no restrictions based on the posix
acl values passed through it. The capability hook doesn't need to be
called either because it only has restrictions on security.* xattrs. So
these all becomes very simple hooks for smack.

Link: https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org [1]
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-10-20 10:13:29 +02:00
Linus Torvalds
4c0ed7d8d6 whack-a-mole: constifying struct path *
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYzxmRQAKCRBZ7Krx/gZQ
 6+/kAQD2xyf+i4zOYVBr1NB3qBbhVS1zrni1NbC/kT3dJPgTvwEA7z7eqwnrN4zg
 scKFP8a3yPoaQBfs4do5PolhuSr2ngA=
 =NBI+
 -----END PGP SIGNATURE-----

Merge tag 'pull-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs constification updates from Al Viro:
 "whack-a-mole: constifying struct path *"

* tag 'pull-path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  ecryptfs: constify path
  spufs: constify path
  nd_jump_link(): constify path
  audit_init_parent(): constify path
  __io_setxattr(): constify path
  do_proc_readlink(): constify path
  overlayfs: constify path
  fs/notify: constify path
  may_linkat(): constify path
  do_sys_name_to_handle(): constify path
  ->getprocattr(): attribute name is const char *, TYVM...
2022-10-06 17:31:02 -07:00
Linus Torvalds
74a0f84590 Smack updates for v6.1
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmMzNOsXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBHSyg//XprfrAxU5Mk13fEKv1+L2TQ5
 07510lqIevJObY9WwhzPwYW/3KZwlXDc8pcYnJZt5o6zV9YXipB4kRtdDVdew5k7
 l+WJzwx+6uQjoHk6GrY7d50PhNFOpe+QPP68zs2iBJMairqpHEhEPbX81b2fhD2v
 7VnWWGhKMS+iYR9SEGldA8NNnPpzz4+1xs7OlT6CEM3pnZFANlR1RCSsr1DvYFvZ
 mJEXVWZNGQsLrwKLLesYGBzRRJeZtU47VMROyOqiXgSh+D2p9Z4ajVzdROSVNENY
 8e2CRp2al9Ij0arUBaq1JaAIrvoO2P0YiOSa5wPU2yghj3McvAkIphQ8+c1PxkzM
 r8Qk3hyZfjDMbh3jBFEugXt+UaQCCqELWnCrxoWZVflUdi5YXT1/7STifsQ1DhOw
 okppOmAXsQ7rsr3+GW0249i7ySzvXCI/xtXfpvnT4aw0rjBML0uN7GeoEzPr84Pw
 2vPM0lhULLifvfoaUwkySYVt0VHS2LVk1xaNFVikM80rkFagAjqU4ouzZw0JCa2U
 VA45/h5/kWt+57uj8hdmaPZtfkw7saSl51kozwISltJS7ga6X6lCm1VwWZC6bjJF
 QGUXWZlMC1hgwYK4DmMvjr9wWIwkxmEcVWSBMmsHiacr1Rl5N0Lnq0Rp8xD15u/R
 TIdvYo9hHV6biX9+pkU=
 =rKZK
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.1' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Two minor code clean-ups: one removes constants left over from the old
  mount API, while the other gets rid of an unneeded variable.

  The other change fixes a flaw in handling IPv6 labeling"

* tag 'Smack-for-6.1' of https://github.com/cschaufler/smack-next:
  smack: cleanup obsolete mount option flags
  smack: lsm: remove the unneeded result variable
  SMACK: Add sk_clone_security LSM hook
2022-10-03 17:38:09 -07:00
Xiu Jianfeng
cc71271f5b smack: cleanup obsolete mount option flags
These mount option flags are obsolete since commit 12085b14a4 ("smack:
switch to private smack_mnt_opts"), remove them.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2022-09-27 10:33:03 -07:00
Xu Panda
d3f84f5c96 smack: lsm: remove the unneeded result variable
Return the value smk_ptrace_rule_check() directly instead of storing it
in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2022-09-27 10:33:03 -07:00
Lontke Michael
4ca165fc6c SMACK: Add sk_clone_security LSM hook
Using smk_of_current() during sk_alloc_security hook leads in
rare cases to a faulty initialization of the security context
of the created socket.

By adding the LSM hook sk_clone_security to SMACK this initialization
fault is corrected by copying the security context of the old socket
pointer to the newly cloned one.

Co-authored-by: Martin Ostertag: <martin.ostertag@elektrobit.com>
Signed-off-by: Lontke Michael <michael.lontke@elektrobit.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2022-09-27 10:33:03 -07:00
Al Viro
c8e477c649 ->getprocattr(): attribute name is const char *, TYVM...
cast of ->d_name.name to char * is completely wrong - nothing is
allowed to modify its contents.

Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-09-01 17:34:39 -04:00
Casey Schaufler
dd93734022 Smack: Provide read control for io_uring_cmd
Limit io_uring "cmd" options to files for which the caller has
Smack read access. There may be cases where the cmd option may
be closer to a write access than a read, but there is no way
to make that determination.

Cc: stable@vger.kernel.org
Fixes: ee692a21e9 ("fs,io_uring: add infrastructure for uring-cmd")
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-26 14:56:35 -04:00
Xiu Jianfeng
aa16fb4b9e smack: Remove the redundant lsm_inode_alloc
It's not possible for inode->i_security to be NULL here because every
inode will call inode_init_always and then lsm_inode_alloc to alloc
memory for inode->security, this is what LSM infrastructure management
do, so remove this redundant code.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2022-08-01 11:26:09 -07:00
GONG, Ruiqi
63c3b5d2ca smack: Replace kzalloc + strncpy with kstrndup
Simplify the code by using kstrndup instead of kzalloc and strncpy in
smk_parse_smack(), which meanwhile remove strncpy as [1] suggests.

[1]: https://github.com/KSPP/linux/issues/90

Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2022-08-01 11:26:09 -07:00
Linus Torvalds
cbd76edeab Cleanups (and one fix) around struct mount handling.
The fix is usermode_driver.c one - once you've done kern_mount(), you
 must kern_unmount(); simple mntput() will end up with a leak.  Several
 failure exits in there messed up that way...  In practice you won't
 hit those particular failure exits without fault injection, though.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYpvrWQAKCRBZ7Krx/gZQ
 6z29AP9EZVSyIvnwXleehpa2mEZhsp+KAKgV/ENaKHMn7jiH0wD/bfgnhxIDNuc5
 108E2R5RWEYTynW5k7nnP5PsTsMq5Qc=
 =b3Wc
 -----END PGP SIGNATURE-----

Merge tag 'pull-18-rc1-work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull mount handling updates from Al Viro:
 "Cleanups (and one fix) around struct mount handling.

  The fix is usermode_driver.c one - once you've done kern_mount(), you
  must kern_unmount(); simple mntput() will end up with a leak. Several
  failure exits in there messed up that way... In practice you won't hit
  those particular failure exits without fault injection, though"

* tag 'pull-18-rc1-work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  move mount-related externs from fs.h to mount.h
  blob_to_mnt(): kern_unmount() is needed to undo kern_mount()
  m->mnt_root->d_inode->i_sb is a weird way to spell m->mnt_sb...
  linux/mount.h: trim includes
  uninline may_mount() and don't opencode it in fspick(2)/fsopen(2)
2022-06-04 19:00:05 -07:00
Michal Orzel
eaff451d4b smack: Remove redundant assignments
Get rid of redundant assignments which end up in values not being
read either because they are overwritten or the function ends.

Reported by clang-tidy [deadcode.DeadStores]

Signed-off-by: Michal Orzel <michalorzel.eng@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2022-05-23 11:12:08 -07:00
Al Viro
70f8d9c575 move mount-related externs from fs.h to mount.h
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-05-19 23:25:48 -04:00
Casey Schaufler
a5cd1ab7ab Fix incorrect type in assignment of ipv6 port for audit
Remove inappropriate use of ntohs() and assign the
port value directly.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2022-02-28 15:45:32 -08:00
Paul Moore
6326948f94 lsm: security_task_getsecid_subj() -> security_current_getsecid_subj()
The security_task_getsecid_subj() LSM hook invites misuse by allowing
callers to specify a task even though the hook is only safe when the
current task is referenced.  Fix this by removing the task_struct
argument to the hook, requiring LSM implementations to use the
current task.  While we are changing the hook declaration we also
rename the function to security_current_getsecid_subj() in an effort
to reinforce that the hook captures the subjective credentials of the
current task and not an arbitrary task on the system.

Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-11-22 17:52:47 -05:00
Linus Torvalds
cdab10bf32 selinux/stable-5.16 PR 20211101
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmGANbAUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNaMBAAg+9gZr0F7xiafu8JFZqZfx/AQdJ2
 G2cn3le+/tXGZmF8m/+82lOaR6LeQLatgSDJNSkXWkKr0nRwseQJDbtRfvYJdn0t
 Ax05/Fmz6OGxQ2wgRYgaFiSrKpE5p3NhDtiLFVdkCJaQNe/8DZOc7NhBl6EjZf3x
 ubhl2hUiJ4AmiXGwcYhr4uKgP4nhW8OM1/OkskVi+bBMmLA8KTY9kslmIDP5E3BW
 29W4qhqeLNQupY5dGMEMVcyxY9ZUWpO39q4uOaQVZrUGE7xABkj/jhnxT5gFTSlI
 pu8VhsYXm9KuRVveIsv0L5SZfadwoM9YAl7ki1wD3W5rHqOAte3rBTm6VmNlQwfU
 MqxP65Jiyxudxet5Be3/dCRH/+MDQuwBxivgmZXbeVxor2SeznVb0GDaEUC5FSHu
 CJIgWtQzsPJMxgAEGXN4F3QGP0htTTJni56GUPOsrf4TIBW02TT+oLTLFRIokQQL
 INNOfwVSRXElnCsvxsHR4oB+JZ9pJyBaAmeupcQ6jmcKiWlbLj4s+W0U0pM5h91v
 hmMpz7KMxrX6gVL4gB2Jj4aN3r5YRbq26NBu6D+wdwwBTeTTocaHSpAqkv4buClf
 uNk3cG8Hkp8TTg9cM8jYgpxMyzKH/AI/Uw3VhEa1xCiq2Ck3DgfnZvnvcRRaZevU
 FPgmwgqePJXGi60=
 =sb8J
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:

 - Add LSM/SELinux/Smack controls and auditing for io-uring.

   As usual, the individual commit descriptions have more detail, but we
   were basically missing two things which we're adding here:

      + establishment of a proper audit context so that auditing of
        io-uring ops works similarly to how it does for syscalls (with
        some io-uring additions because io-uring ops are *not* syscalls)

      + additional LSM hooks to enable access control points for some of
        the more unusual io-uring features, e.g. credential overrides.

   The additional audit callouts and LSM hooks were done in conjunction
   with the io-uring folks, based on conversations and RFC patches
   earlier in the year.

 - Fixup the binder credential handling so that the proper credentials
   are used in the LSM hooks; the commit description and the code
   comment which is removed in these patches are helpful to understand
   the background and why this is the proper fix.

 - Enable SELinux genfscon policy support for securityfs, allowing
   improved SELinux filesystem labeling for other subsystems which make
   use of securityfs, e.g. IMA.

* tag 'selinux-pr-20211101' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  security: Return xattr name from security_dentry_init_security()
  selinux: fix a sock regression in selinux_ip_postroute_compat()
  binder: use cred instead of task for getsecid
  binder: use cred instead of task for selinux checks
  binder: use euid from cred instead of using task
  LSM: Avoid warnings about potentially unused hook variables
  selinux: fix all of the W=1 build warnings
  selinux: make better use of the nf_hook_state passed to the NF hooks
  selinux: fix race condition when computing ocontext SIDs
  selinux: remove unneeded ipv6 hook wrappers
  selinux: remove the SELinux lockdown implementation
  selinux: enable genfscon labeling for securityfs
  Smack: Brutalist io_uring support
  selinux: add support for the io_uring access controls
  lsm,io_uring: add LSM hooks to io_uring
  io_uring: convert io_uring to the secure anon inode interface
  fs: add anon_inode_getfile_secure() similar to anon_inode_getfd_secure()
  audit: add filtering for io_uring records
  audit,io_uring,io-wq: add some basic audit support to io_uring
  audit: prepare audit_context for use in calling contexts beyond syscalls
2021-11-01 21:06:18 -07:00
Linus Torvalds
6f2b76a4a3 Smack changes for 5.16
Multiple corrections to smackfs.
 W=1 fixes
 Fix for overlayfs.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmGALSwXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBECcRAAuFN3uiptwLHC0P/Yy5VtANF4
 kvRV7egXeIp8tYl6zVb+VIa3AgQMTB0BfdpSVMq8YyTtp5olczlM2b00zN4cVZq0
 VO7XlTUfoljdsi4IysaA5rZZtjT5entoNl6eiCrMVwpvC4ZrlTqebOJteXDAnEQP
 zI+smzc5mUFn1FUkGaQ+ciLHv2wyT39Wmk78BDKI/UAl09xV6Kxfci15Q9UPpZ94
 CnYMfrjsinUCzQ+gbj9FIe5vXvcGwVoO7jJUNda5kuCSM3N4TTxD/fDkVxWEwOD2
 eNjemub5RkxVWSlzRtVQgEFsssRsd6VYEKop34jyojDOGDj+JVQ03ntnFUfWScUz
 8BkwyYsE8I+f878f0wAaz+8xrefjnwnRUHFzkF5hd6wLFCvQlCKULL/naRtig7pJ
 x1V6Q+AC/Qyu0rNrSH5UCDgsvDQ3YzKocWhnvgCqJa8bd/QlfMKu8sxIwetNlctz
 +TG+GwBLKaVmdiwWoI/CF3PkM4xYo4DtJtDnvlzAiEjGEYosEXilgDBq6IAD8vLa
 cuSXtWCIpBk5VKkglvAvsIbXxnWa0W45j7PXyf8b7YXRWF511I8zBHjpDx6XP/Ko
 FywGEaRDeNO3KZJxw9e39FUdyl1MT+s+gN3sERomUTig9RaPhp87pC/kWMWxVj+Y
 fU/iIgrRTqa2spQgYNg=
 =lPZg
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.16' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Multiple corrections to smackfs:

   - a change for overlayfs support that corrects the initial attributes
     on created files

   - code clean-up for netlabel processing

   - several fixes in smackfs for a variety of reasons

   - Errors reported by W=1 have been addressed

  All told, nothing challenging"

* tag 'Smack-for-5.16' of https://github.com/cschaufler/smack-next:
  smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
  smackfs: use __GFP_NOFAIL for smk_cipso_doi()
  Smack: fix W=1 build warnings
  smack: remove duplicated hook function
  Smack:- Use overlay inode label in smack_inode_copy_up()
  smack: Guard smack_ipv6_lock definition within a SMACK_IPV6_PORT_LABELING block
  smackfs: Fix use-after-free in netlbl_catmap_walk()
2021-11-01 17:34:02 -07:00
Tetsuo Handa
0934ad42bb smackfs: use netlbl_cfg_cipsov4_del() for deleting cipso_v4_doi
syzbot is reporting UAF at cipso_v4_doi_search() [1], for smk_cipso_doi()
is calling kfree() without removing from the cipso_v4_doi_list list after
netlbl_cfg_cipsov4_map_add() returned an error. We need to use
netlbl_cfg_cipsov4_del() in order to remove from the list and wait for
RCU grace period before kfree().

Link: https://syzkaller.appspot.com/bug?extid=93dba5b91f0fed312cbd [1]
Reported-by: syzbot <syzbot+93dba5b91f0fed312cbd@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 6c2e8ac095 ("netlabel: Update kernel configuration API")
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-10-22 08:46:53 -07:00
Tetsuo Handa
f91488ee15 smackfs: use __GFP_NOFAIL for smk_cipso_doi()
syzbot is reporting kernel panic at smk_cipso_doi() due to memory
allocation fault injection [1]. The reason for need to use panic() was
not explained. But since no fix was proposed for 18 months, for now
let's use __GFP_NOFAIL for utilizing syzbot resource on other bugs.

Link: https://syzkaller.appspot.com/bug?extid=89731ccb6fec15ce1c22 [1]
Reported-by: syzbot <syzbot+89731ccb6fec15ce1c22@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-10-22 08:46:44 -07:00
Casey Schaufler
b57d02091b Smack: fix W=1 build warnings
A couple of functions had malformed comment blocks.
Namespace parameters were added without updating the
comment blocks. These are all repaired in the Smack code,
so "% make W=1 security/smack" is warning free.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-10-13 14:56:43 -07:00
Florian Westphal
f8de49ef92 smack: remove duplicated hook function
ipv4 and ipv6 hook functions are identical, remove one.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-10-12 08:23:52 -07:00
Vishal Goel
387ef96446 Smack:- Use overlay inode label in smack_inode_copy_up()
Currently in "smack_inode_copy_up()" function, process label is
changed with the label on parent inode. Due to which,
process is assigned directory label and whatever file or directory
created by the process are also getting directory label
which is wrong label.

Changes has been done to use label of overlay inode instead
of parent inode.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-09-28 13:29:40 -07:00
Sebastian Andrzej Siewior
222a96b31c smack: Guard smack_ipv6_lock definition within a SMACK_IPV6_PORT_LABELING block
The mutex smack_ipv6_lock is only used with the SMACK_IPV6_PORT_LABELING
block but its definition is outside of the block. This leads to a
defined-but-not-used warning on PREEMPT_RT.

Moving smack_ipv6_lock down to the block where it is used where it used
raises the question why is smk_ipv6_port_list read if nothing is added
to it.
Turns out, only smk_ipv6_port_check() is using it outside of an ifdef
SMACK_IPV6_PORT_LABELING block. However two of three caller invoke
smk_ipv6_port_check() from a ifdef block and only one is using
__is_defined() macro which requires the function and smk_ipv6_port_list
to be around.

Put the lock and list inside an ifdef SMACK_IPV6_PORT_LABELING block to
avoid the warning regarding unused mutex. Extend the ifdef-block to also
cover smk_ipv6_port_check(). Make smack_socket_connect() use ifdef
instead of __is_defined() to avoid complains about missing function.

Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-09-24 13:14:49 -07:00
Paul Moore
a3727a8bac selinux,smack: fix subjective/objective credential use mixups
Jann Horn reported a problem with commit eb1231f73c ("selinux:
clarify task subjective and objective credentials") where some LSM
hooks were attempting to access the subjective credentials of a task
other than the current task.  Generally speaking, it is not safe to
access another task's subjective credentials and doing so can cause
a number of problems.

Further, while looking into the problem, I realized that Smack was
suffering from a similar problem brought about by a similar commit
1fb057dcde ("smack: differentiate between subjective and objective
task credentials").

This patch addresses this problem by restoring the use of the task's
objective credentials in those cases where the task is other than the
current executing task.  Not only does this resolve the problem
reported by Jann, it is arguably the correct thing to do in these
cases.

Cc: stable@vger.kernel.org
Fixes: eb1231f73c ("selinux: clarify task subjective and objective credentials")
Fixes: 1fb057dcde ("smack: differentiate between subjective and objective task credentials")
Reported-by: Jann Horn <jannh@google.com>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-09-23 12:30:59 -04:00
Casey Schaufler
d9d8c93938 Smack: Brutalist io_uring support
Add Smack privilege checks for io_uring. Use CAP_MAC_OVERRIDE
for the override_creds case and CAP_MAC_ADMIN for creating a
polling thread. These choices are based on conjecture regarding
the intent of the surrounding code.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: make the smack_uring_* funcs static, remove debug code]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-09-19 22:40:51 -04:00
Pawan Gupta
0817534ff9 smackfs: Fix use-after-free in netlbl_catmap_walk()
Syzkaller reported use-after-free bug as described in [1]. The bug is
triggered when smk_set_cipso() tries to free stale category bitmaps
while there are concurrent reader(s) using the same bitmaps.

Wait for RCU grace period to finish before freeing the category bitmaps
in smk_set_cipso(). This makes sure that there are no more readers using
the stale bitmaps and freeing them should be safe.

[1] https://lore.kernel.org/netdev/000000000000a814c505ca657a4e@google.com/

Reported-by: syzbot+3f91de0b813cc3d19a80@syzkaller.appspotmail.com
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-09-15 16:42:25 -07:00
Austin Kim
bfc3cac0c7 smack: mark 'smack_enabled' global variable as __initdata
Mark 'smack_enabled' as __initdata
since it is only used during initialization code.

Signed-off-by: Austin Kim <austin.kim@lge.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-07-20 10:34:59 -07:00
Tianjia Zhang
6d14f5c702 Smack: Fix wrong semantics in smk_access_entry()
In the smk_access_entry() function, if no matching rule is found
in the rust_list, a negative error code will be used to perform bit
operations with the MAY_ enumeration value. This is semantically
wrong. This patch fixes this issue.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-07-20 09:17:36 -07:00
ChenXiaoSong
fe6bde732b Smack: fix doc warning
Fix gcc W=1 warning:

security/smack/smack_access.c:342: warning: Function parameter or member 'ad' not described in 'smack_log'
security/smack/smack_access.c:403: warning: Function parameter or member 'skp' not described in 'smk_insert_entry'
security/smack/smack_access.c:487: warning: Function parameter or member 'level' not described in 'smk_netlbl_mls'
security/smack/smack_access.c:487: warning: Function parameter or member 'len' not described in 'smk_netlbl_mls'

Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-06-08 10:23:08 -07:00
Jens Axboe
0169d8f33a Revert "Smack: Handle io_uring kernel thread privileges"
This reverts commit 942cb357ae.

The io_uring PF_IO_WORKER threads no longer have PF_KTHREAD set, so no
need to special case them for credential checks.

Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-05-18 10:36:48 -07:00
Tetsuo Handa
49ec114a6e smackfs: restrict bytes count in smk_set_cipso()
Oops, I failed to update subject line.

From 07571157c91b98ce1a4aa70967531e64b78e8346 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Mon, 12 Apr 2021 22:25:06 +0900
Subject: [PATCH] smackfs: restrict bytes count in smk_set_cipso()

Commit 7ef4c19d24 ("smackfs: restrict bytes count in smackfs write
functions") missed that count > SMK_CIPSOMAX check applies to only
format == SMK_FIXED24_FMT case.

Reported-by: syzbot <syzbot+77c53db50c9fff774e8e@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-05-10 13:55:08 -07:00
Xiong Zhenwu
2e08fb550a security/smack/: fix misspellings using codespell tool
A typo is found out by codespell tool in 383th line of smackfs.c:

$ codespell ./security/smack/
./smackfs.c:383: numer  ==> number

Fix a typo found by codespell.

Signed-off-by: Xiong Zhenwu <xiong.zhenwu@zte.com.cn>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-05-10 13:54:58 -07:00
Linus Torvalds
17ae69aba8 Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com>
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAmCInP4ACgkQrZhLv9lQ
 BTza0g//dTeb9woC9H7qlEhK4l9yk62lTss60Q8X7m7ZSNfdL4tiEbi64SgK+iOW
 OOegbrOEb8Kzh4KJJYmVlVZ5YUWyH4szgmee1wnylBdsWiWaPLPF3Cflz77apy6T
 TiiBsJd7rRE29FKheaMt34B41BMh8QHESN+DzjzJWsFoi/uNxjgSs2W16XuSupKu
 bpRmB1pYNXMlrkzz7taL05jndZYE5arVriqlxgAsuLOFOp/ER7zecrjImdCM/4kL
 W6ej0R1fz2Geh6CsLBJVE+bKWSQ82q5a4xZEkSYuQHXgZV5eywE5UKu8ssQcRgQA
 VmGUY5k73rfY9Ofupf2gCaf/JSJNXKO/8Xjg0zAdklKtmgFjtna5Tyg9I90j7zn+
 5swSpKuRpilN8MQH+6GWAnfqQlNoviTOpFeq3LwBtNVVOh08cOg6lko/bmebBC+R
 TeQPACKS0Q0gCDPm9RYoU1pMUuYgfOwVfVRZK1prgi2Co7ZBUMOvYbNoKYoPIydr
 ENBYljlU1OYwbzgR2nE+24fvhU8xdNOVG1xXYPAEHShu+p7dLIWRLhl8UCtRQpSR
 1ofeVaJjgjrp29O+1OIQjB2kwCaRdfv/Gq1mztE/VlMU/r++E62OEzcH0aS+mnrg
 yzfyUdI8IFv1q6FGT9yNSifWUWxQPmOKuC8kXsKYfqfJsFwKmHM=
 =uCN4
 -----END PGP SIGNATURE-----

Merge tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull Landlock LSM from James Morris:
 "Add Landlock, a new LSM from Mickaël Salaün.

  Briefly, Landlock provides for unprivileged application sandboxing.

  From Mickaël's cover letter:
    "The goal of Landlock is to enable to restrict ambient rights (e.g.
     global filesystem access) for a set of processes. Because Landlock
     is a stackable LSM [1], it makes possible to create safe security
     sandboxes as new security layers in addition to the existing
     system-wide access-controls. This kind of sandbox is expected to
     help mitigate the security impact of bugs or unexpected/malicious
     behaviors in user-space applications. Landlock empowers any
     process, including unprivileged ones, to securely restrict
     themselves.

     Landlock is inspired by seccomp-bpf but instead of filtering
     syscalls and their raw arguments, a Landlock rule can restrict the
     use of kernel objects like file hierarchies, according to the
     kernel semantic. Landlock also takes inspiration from other OS
     sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD
     Pledge/Unveil.

     In this current form, Landlock misses some access-control features.
     This enables to minimize this patch series and ease review. This
     series still addresses multiple use cases, especially with the
     combined use of seccomp-bpf: applications with built-in sandboxing,
     init systems, security sandbox tools and security-oriented APIs [2]"

  The cover letter and v34 posting is here:

      https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/

  See also:

      https://landlock.io/

  This code has had extensive design discussion and review over several
  years"

Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1]
Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2]

* tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  landlock: Enable user space to infer supported features
  landlock: Add user and kernel documentation
  samples/landlock: Add a sandbox manager example
  selftests/landlock: Add user space tests
  landlock: Add syscall implementations
  arch: Wire up Landlock syscalls
  fs,security: Add sb_delete hook
  landlock: Support filesystem access-control
  LSM: Infrastructure management of the superblock
  landlock: Add ptrace restrictions
  landlock: Set up the security framework and manage credentials
  landlock: Add ruleset and domain management
  landlock: Add object management
2021-05-01 18:50:44 -07:00
Casey Schaufler
1aea780837 LSM: Infrastructure management of the superblock
Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.

Cc: John Johansen <john.johansen@canonical.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-6-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:10 -07:00
Paul Moore
1fb057dcde smack: differentiate between subjective and objective task credentials
With the split of the security_task_getsecid() into subjective and
objective variants it's time to update Smack to ensure it is using
the correct task creds.

Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22 15:24:14 -04:00
Paul Moore
4ebd7651bf lsm: separate security_task_getsecid() into subjective and objective variants
Of the three LSMs that implement the security_task_getsecid() LSM
hook, all three LSMs provide the task's objective security
credentials.  This turns out to be unfortunate as most of the hook's
callers seem to expect the task's subjective credentials, although
a small handful of callers do correctly expect the objective
credentials.

This patch is the first step towards fixing the problem: it splits
the existing security_task_getsecid() hook into two variants, one
for the subjective creds, one for the objective creds.

  void security_task_getsecid_subj(struct task_struct *p,
				   u32 *secid);
  void security_task_getsecid_obj(struct task_struct *p,
				  u32 *secid);

While this patch does fix all of the callers to use the correct
variant, in order to keep this patch focused on the callers and to
ease review, the LSMs continue to use the same implementation for
both hooks.  The net effect is that this patch should not change
the behavior of the kernel in any way, it will be up to the latter
LSM specific patches in this series to change the hook
implementations and return the correct credentials.

Acked-by: Mimi Zohar <zohar@linux.ibm.com> (IMA)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22 15:23:32 -04:00
Linus Torvalds
7d6beb71da idmapped-mounts-v5.12
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc
 ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f
 4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg=
 =yPaw
 -----END PGP SIGNATURE-----

Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull idmapped mounts from Christian Brauner:
 "This introduces idmapped mounts which has been in the making for some
  time. Simply put, different mounts can expose the same file or
  directory with different ownership. This initial implementation comes
  with ports for fat, ext4 and with Christoph's port for xfs with more
  filesystems being actively worked on by independent people and
  maintainers.

  Idmapping mounts handle a wide range of long standing use-cases. Here
  are just a few:

   - Idmapped mounts make it possible to easily share files between
     multiple users or multiple machines especially in complex
     scenarios. For example, idmapped mounts will be used in the
     implementation of portable home directories in
     systemd-homed.service(8) where they allow users to move their home
     directory to an external storage device and use it on multiple
     computers where they are assigned different uids and gids. This
     effectively makes it possible to assign random uids and gids at
     login time.

   - It is possible to share files from the host with unprivileged
     containers without having to change ownership permanently through
     chown(2).

   - It is possible to idmap a container's rootfs and without having to
     mangle every file. For example, Chromebooks use it to share the
     user's Download folder with their unprivileged containers in their
     Linux subsystem.

   - It is possible to share files between containers with
     non-overlapping idmappings.

   - Filesystem that lack a proper concept of ownership such as fat can
     use idmapped mounts to implement discretionary access (DAC)
     permission checking.

   - They allow users to efficiently changing ownership on a per-mount
     basis without having to (recursively) chown(2) all files. In
     contrast to chown (2) changing ownership of large sets of files is
     instantenous with idmapped mounts. This is especially useful when
     ownership of a whole root filesystem of a virtual machine or
     container is changed. With idmapped mounts a single syscall
     mount_setattr syscall will be sufficient to change the ownership of
     all files.

   - Idmapped mounts always take the current ownership into account as
     idmappings specify what a given uid or gid is supposed to be mapped
     to. This contrasts with the chown(2) syscall which cannot by itself
     take the current ownership of the files it changes into account. It
     simply changes the ownership to the specified uid and gid. This is
     especially problematic when recursively chown(2)ing a large set of
     files which is commong with the aforementioned portable home
     directory and container and vm scenario.

   - Idmapped mounts allow to change ownership locally, restricting it
     to specific mounts, and temporarily as the ownership changes only
     apply as long as the mount exists.

  Several userspace projects have either already put up patches and
  pull-requests for this feature or will do so should you decide to pull
  this:

   - systemd: In a wide variety of scenarios but especially right away
     in their implementation of portable home directories.

         https://systemd.io/HOME_DIRECTORY/

   - container runtimes: containerd, runC, LXD:To share data between
     host and unprivileged containers, unprivileged and privileged
     containers, etc. The pull request for idmapped mounts support in
     containerd, the default Kubernetes runtime is already up for quite
     a while now: https://github.com/containerd/containerd/pull/4734

   - The virtio-fs developers and several users have expressed interest
     in using this feature with virtual machines once virtio-fs is
     ported.

   - ChromeOS: Sharing host-directories with unprivileged containers.

  I've tightly synced with all those projects and all of those listed
  here have also expressed their need/desire for this feature on the
  mailing list. For more info on how people use this there's a bunch of
  talks about this too. Here's just two recent ones:

      https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
      https://fosdem.org/2021/schedule/event/containers_idmap/

  This comes with an extensive xfstests suite covering both ext4 and
  xfs:

      https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts

  It covers truncation, creation, opening, xattrs, vfscaps, setid
  execution, setgid inheritance and more both with idmapped and
  non-idmapped mounts. It already helped to discover an unrelated xfs
  setgid inheritance bug which has since been fixed in mainline. It will
  be sent for inclusion with the xfstests project should you decide to
  merge this.

  In order to support per-mount idmappings vfsmounts are marked with
  user namespaces. The idmapping of the user namespace will be used to
  map the ids of vfs objects when they are accessed through that mount.
  By default all vfsmounts are marked with the initial user namespace.
  The initial user namespace is used to indicate that a mount is not
  idmapped. All operations behave as before and this is verified in the
  testsuite.

  Based on prior discussions we want to attach the whole user namespace
  and not just a dedicated idmapping struct. This allows us to reuse all
  the helpers that already exist for dealing with idmappings instead of
  introducing a whole new range of helpers. In addition, if we decide in
  the future that we are confident enough to enable unprivileged users
  to setup idmapped mounts the permission checking can take into account
  whether the caller is privileged in the user namespace the mount is
  currently marked with.

  The user namespace the mount will be marked with can be specified by
  passing a file descriptor refering to the user namespace as an
  argument to the new mount_setattr() syscall together with the new
  MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
  of extensibility.

  The following conditions must be met in order to create an idmapped
  mount:

   - The caller must currently have the CAP_SYS_ADMIN capability in the
     user namespace the underlying filesystem has been mounted in.

   - The underlying filesystem must support idmapped mounts.

   - The mount must not already be idmapped. This also implies that the
     idmapping of a mount cannot be altered once it has been idmapped.

   - The mount must be a detached/anonymous mount, i.e. it must have
     been created by calling open_tree() with the OPEN_TREE_CLONE flag
     and it must not already have been visible in the filesystem.

  The last two points guarantee easier semantics for userspace and the
  kernel and make the implementation significantly simpler.

  By default vfsmounts are marked with the initial user namespace and no
  behavioral or performance changes are observed.

  The manpage with a detailed description can be found here:

      1d7b902e28

  In order to support idmapped mounts, filesystems need to be changed
  and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The
  patches to convert individual filesystem are not very large or
  complicated overall as can be seen from the included fat, ext4, and
  xfs ports. Patches for other filesystems are actively worked on and
  will be sent out separately. The xfstestsuite can be used to verify
  that port has been done correctly.

  The mount_setattr() syscall is motivated independent of the idmapped
  mounts patches and it's been around since July 2019. One of the most
  valuable features of the new mount api is the ability to perform
  mounts based on file descriptors only.

  Together with the lookup restrictions available in the openat2()
  RESOLVE_* flag namespace which we added in v5.6 this is the first time
  we are close to hardened and race-free (e.g. symlinks) mounting and
  path resolution.

  While userspace has started porting to the new mount api to mount
  proper filesystems and create new bind-mounts it is currently not
  possible to change mount options of an already existing bind mount in
  the new mount api since the mount_setattr() syscall is missing.

  With the addition of the mount_setattr() syscall we remove this last
  restriction and userspace can now fully port to the new mount api,
  covering every use-case the old mount api could. We also add the
  crucial ability to recursively change mount options for a whole mount
  tree, both removing and adding mount options at the same time. This
  syscall has been requested multiple times by various people and
  projects.

  There is a simple tool available at

      https://github.com/brauner/mount-idmapped

  that allows to create idmapped mounts so people can play with this
  patch series. I'll add support for the regular mount binary should you
  decide to pull this in the following weeks:

  Here's an example to a simple idmapped mount of another user's home
  directory:

	u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt

	u1001@f2-vm:/$ ls -al /home/ubuntu/
	total 28
	drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
	drwxr-xr-x 4 root   root   4096 Oct 28 04:00 ..
	-rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
	-rw-r--r-- 1 ubuntu ubuntu  220 Feb 25  2020 .bash_logout
	-rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25  2020 .bashrc
	-rw-r--r-- 1 ubuntu ubuntu  807 Feb 25  2020 .profile
	-rw-r--r-- 1 ubuntu ubuntu    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ ls -al /mnt/
	total 28
	drwxr-xr-x  2 u1001 u1001 4096 Oct 28 22:07 .
	drwxr-xr-x 29 root  root  4096 Oct 28 22:01 ..
	-rw-------  1 u1001 u1001 3154 Oct 28 22:12 .bash_history
	-rw-r--r--  1 u1001 u1001  220 Feb 25  2020 .bash_logout
	-rw-r--r--  1 u1001 u1001 3771 Feb 25  2020 .bashrc
	-rw-r--r--  1 u1001 u1001  807 Feb 25  2020 .profile
	-rw-r--r--  1 u1001 u1001    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw-------  1 u1001 u1001 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ touch /mnt/my-file

	u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file

	u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file

	u1001@f2-vm:/$ ls -al /mnt/my-file
	-rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file

	u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
	-rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file

	u1001@f2-vm:/$ getfacl /mnt/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: mnt/my-file
	# owner: u1001
	# group: u1001
	user::rw-
	user:u1001:rwx
	group::rw-
	mask::rwx
	other::r--

	u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: home/ubuntu/my-file
	# owner: ubuntu
	# group: ubuntu
	user::rw-
	user:ubuntu:rwx
	group::rw-
	mask::rwx
	other::r--"

* tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits)
  xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl
  xfs: support idmapped mounts
  ext4: support idmapped mounts
  fat: handle idmapped mounts
  tests: add mount_setattr() selftests
  fs: introduce MOUNT_ATTR_IDMAP
  fs: add mount_setattr()
  fs: add attr_flags_to_mnt_flags helper
  fs: split out functions to hold writers
  namespace: only take read lock in do_reconfigure_mnt()
  mount: make {lock,unlock}_mount_hash() static
  namespace: take lock_mount_hash() directly when changing flags
  nfs: do not export idmapped mounts
  overlayfs: do not mount on top of idmapped mounts
  ecryptfs: do not mount on top of idmapped mounts
  ima: handle idmapped mounts
  apparmor: handle idmapped mounts
  fs: make helpers idmap mount aware
  exec: handle idmapped mounts
  would_dump: handle idmapped mounts
  ...
2021-02-23 13:39:45 -08:00
Sabyrzhan Tasbolatov
7ef4c19d24 smackfs: restrict bytes count in smackfs write functions
syzbot found WARNINGs in several smackfs write operations where
bytes count is passed to memdup_user_nul which exceeds
GFP MAX_ORDER. Check count size if bigger than PAGE_SIZE.

Per smackfs doc, smk_write_net4addr accepts any label or -CIPSO,
smk_write_net6addr accepts any label or -DELETE. I couldn't find
any general rule for other label lengths except SMK_LABELLEN,
SMK_LONGLABEL, SMK_CIPSOMAX which are documented.

Let's constrain, in general, smackfs label lengths for PAGE_SIZE.
Although fuzzer crashes write to smackfs/netlabel on 0x400000 length.

Here is a quick way to reproduce the WARNING:
python -c "print('A' * 0x400000)" > /sys/fs/smackfs/netlabel

Reported-by: syzbot+a71a442385a0b2815497@syzkaller.appspotmail.com
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-02-02 17:14:02 -08:00
Christian Brauner
71bc356f93
commoncap: handle idmapped mounts
When interacting with user namespace and non-user namespace aware
filesystem capabilities the vfs will perform various security checks to
determine whether or not the filesystem capabilities can be used by the
caller, whether they need to be removed and so on. The main
infrastructure for this resides in the capability codepaths but they are
called through the LSM security infrastructure even though they are not
technically an LSM or optional. This extends the existing security hooks
security_inode_removexattr(), security_inode_killpriv(),
security_inode_getsecurity() to pass down the mount's user namespace and
makes them aware of idmapped mounts.

In order to actually get filesystem capabilities from disk the
capability infrastructure exposes the get_vfs_caps_from_disk() helper.
For user namespace aware filesystem capabilities a root uid is stored
alongside the capabilities.

In order to determine whether the caller can make use of the filesystem
capability or whether it needs to be ignored it is translated according
to the superblock's user namespace. If it can be translated to uid 0
according to that id mapping the caller can use the filesystem
capabilities stored on disk. If we are accessing the inode that holds
the filesystem capabilities through an idmapped mount we map the root
uid according to the mount's user namespace. Afterwards the checks are
identical to non-idmapped mounts: reading filesystem caps from disk
enforces that the root uid associated with the filesystem capability
must have a mapping in the superblock's user namespace and that the
caller is either in the same user namespace or is a descendant of the
superblock's user namespace. For filesystems that are mountable inside
user namespace the caller can just mount the filesystem and won't
usually need to idmap it. If they do want to idmap it they can create an
idmapped mount and mark it with a user namespace they created and which
is thus a descendant of s_user_ns. For filesystems that are not
mountable inside user namespaces the descendant rule is trivially true
because the s_user_ns will be the initial user namespace.

If the initial user namespace is passed nothing changes so non-idmapped
mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-11-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:17 +01:00
Tycho Andersen
c7c7a1a18a
xattr: handle idmapped mounts
When interacting with extended attributes the vfs verifies that the
caller is privileged over the inode with which the extended attribute is
associated. For posix access and posix default extended attributes a uid
or gid can be stored on-disk. Let the functions handle posix extended
attributes on idmapped mounts. If the inode is accessed through an
idmapped mount we need to map it according to the mount's user
namespace. Afterwards the checks are identical to non-idmapped mounts.
This has no effect for e.g. security xattrs since they don't store uids
or gids and don't perform permission checks on them like posix acls do.

Link: https://lore.kernel.org/r/20210121131959.646623-10-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:17 +01:00
Linus Torvalds
2f2fce3d53 Provide a fix for the incorrect handling of privilege
in the face of io_uring's use of kernel threads.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl/ig/gXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBGqvRAArz1+Ki7ipJVOusXAMtBLimUc
 4rQnk6vCLZllojY5K0swfN7Y8zDMwrOiGLO9zXFeWFT3lc9vf0HVDDAerIZ+go8n
 g6gII+OWPM2E0ywBgzXgzt5eSOkJ7npcn5zYAHz5sZGlYjWfVu7IAtR6GAcN0aeC
 C75NTJSTbx2uyxUL/5rRJBAiNKHRbmzsNX103SKqAEuhEmHwb1yXE2i8xrxKe/un
 vQv7CJQYZRf6tFEpC8zYPDZb1/EXifxQkzxGhmJCJcl+ZTWW8RplNk8ij/BIyb/V
 Bupj+cF/waq6lXKGQB/lxrUiOYrUzk09RBaawH5ZTVnQ+kBgXPJjc3gjrOdjuVji
 DVl4oImgOGrTpq36PdldWIlB1tY1hrGgP4qjaRn0/6zJocLmuu3fVlVbHb/xMgMT
 Nq5/Ny1ymfH3w/hsg9Usze8nAL8Sj3pt6j0hdmgqS31Fth7aR16eLxZY5rK0jDWu
 d+eEobZx/ww0jJB8s+cVMGxhFNlq1Qmu0+nCPGHKtfnBwAWdMfu2M25lg3s2CBb+
 8+1edxG9Vfcq7laNUUrV9z8RAJpog+22DjEZpq2pCKIHgGqEYsvrxFcdLX/ymlTi
 RDtIi0+SF2Vt8B6M+iZ8gbTAGzcGZi1uHXZCha95A1nwLq6X7ZwzIiIzL0CicvZJ
 QPG3O4zvys3LthxZFoA=
 =zXSS
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.11-io_uring-fix' of git://github.com/cschaufler/smack-next

Pull smack fix from Casey Schaufler:
 "Provide a fix for the incorrect handling of privilege in the face of
  io_uring's use of kernel threads. That invalidated an long standing
  assumption regarding the privilege of kernel threads.

  The fix is simple and safe. It was provided by Jens Axboe and has been
  tested"

* tag 'Smack-for-5.11-io_uring-fix' of git://github.com/cschaufler/smack-next:
  Smack: Handle io_uring kernel thread privileges
2020-12-24 14:08:43 -08:00
Casey Schaufler
942cb357ae Smack: Handle io_uring kernel thread privileges
Smack assumes that kernel threads are privileged for smackfs
operations. This was necessary because the credential of the
kernel thread was not related to a user operation. With io_uring
the credential does reflect a user's rights and can be used.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-12-22 15:34:24 -08:00
Linus Torvalds
8bda68d68b Smack updates for Linux 5.11. One minor code clean-up and a
set of corrections in function header comments.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl/ZAUgXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFgFhAAgxDoN1EV1LMbt5YHHWVCRkuN
 a9anGIoKQz++B+XMXJ4p5aouJvm5Bg7XATgHagUIK8oT3i4EW/8t1mD2tSw1kvrn
 WARe0T+DrH+4NL1S53hcsfNVYdfUL4oSaQP5S/D7OezAILs8VDz108a97AWmrTnA
 0Ed3q+C8YzqQfPP3KjcNhXYOBpx2EwSwmWNeyAL9QU4lGYUbeO9gN7TVXQnRK4OH
 /cjhPRhe9D7Abr2QtHuVjzjFvpLx6hlkNOCI9o1kI9FM7JlC2m2nAEUerm1/DRxC
 z7vB0CP1nc8Q6MlbeI+fM3CQvWaDXV6/fKcHfUQ5sqIL2utyZzKZwg1xxfDvFYZ8
 WmF3v0XKUwW30rBa7Zbq+IBPSZ7ITWnIrIig2HwhdGxEBa6GHcy6MLMrOVKmQPsx
 2cNiFd8+FizP82tSzHpfb8hS2834Qz1RS/z7rpXFhkq71EqgLTl9Taghtv6gOYiX
 WGRiz4zOCqtOIzdQWfz+npLwfmr7ILfw1B5ftjMJv5U6Fq0JPowGswl999OYMvUr
 Ct/iyAcT64M5TTac8T7qFjzmFpjjGF8Ev3NPPnqYLTBFYA13/3Wrjdj8VDM92RXW
 HaqVv3bMaYwOzREI1OjiDyfJq+IwfUQNLaABSW+3lwuw3BTAabttkheMLnYAAY7O
 KXNhqO97Vp8Q83JBbtk=
 =LBD6
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.11' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "There are no functional changes. Just one minor code clean-up and a
  set of corrections in function header comments"

* tag 'Smack-for-5.11' of git://github.com/cschaufler/smack-next:
  security/smack: remove unused varible 'rc'
  Smack: fix kernel-doc interface on functions
2020-12-16 11:11:58 -08:00
Florian Westphal
41dd9596d6 security: add const qualifier to struct sock in various places
A followup change to tcp_request_sock_op would have to drop the 'const'
qualifier from the 'route_req' function as the
'security_inet_conn_request' call is moved there - and that function
expects a 'struct sock *'.

However, it turns out its also possible to add a const qualifier to
security_inet_conn_request instead.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-03 12:56:03 -08:00
Alex Shi
9b0072e2b2 security/smack: remove unused varible 'rc'
This varible isn't used and can be removed to avoid a gcc warning:
security/smack/smack_lsm.c:3873:6: warning: variable ‘rc’ set but not
used [-Wunused-but-set-variable]

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-11-16 17:26:31 -08:00
Alex Shi
7da31b858e Smack: fix kernel-doc interface on functions
The are some kernel-doc interface issues:
security/smack/smackfs.c:1950: warning: Function parameter or member
'list' not described in 'smk_parse_label_list'
security/smack/smackfs.c:1950: warning: Excess function parameter
'private' description in 'smk_parse_label_list'
security/smack/smackfs.c:1979: warning: Function parameter or member
'list' not described in 'smk_destroy_label_list'
security/smack/smackfs.c:1979: warning: Excess function parameter 'head'
description in 'smk_destroy_label_list'
security/smack/smackfs.c:2141: warning: Function parameter or member
'count' not described in 'smk_read_logging'
security/smack/smackfs.c:2141: warning: Excess function parameter 'cn'
description in 'smk_read_logging'
security/smack/smackfs.c:2278: warning: Function parameter or member
'format' not described in 'smk_user_access'

Correct them in this patch.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-11-13 11:50:44 -08:00
Linus Torvalds
99a6740f88 Smack LSM changes for Linux 5.10
Two kernel test robot suggested clean-ups.
 Teach Smack to use the IPv4 netlabel cache.
 This results in a 12-14% improvement on TCP benchmarks.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl+EkO0XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFmWBAAqJzkYd1Pd3g0vewZXGe5mRxs
 Qqt6i3HMJ7yAPIJN3445DyHUB6Sad2jmGXGullROfUtJYFkTbxqIj0XT+RUrJFap
 qf54Sgzt/ucNTvZ45s5mV6iYdL2ol7NSB9kQOMVMV0BZf2GXknYv0/rP23wI/upo
 Ylk41UYZNE5RDwoH90sy9aUUJ6VHhMYJOj33aUXz6rwaO1Ck6xvu1tCDudJpERkU
 kUmgpYzkNVGmg+GQIOT/xr7C7LRFxfnbSX0UL5TYPEKC23tYfkSBoClgCpNSSYSh
 3+7+YNRjMjxuKYJEPTUliDeDMQLxQB/3tLMD2c92nADtTJyRkTZelSgFhDFiOW0L
 ln5otQ0aGuDvkmLmBLk/jcq3eKctztqJIQdxQG9K3kJdohz1t/onJJt3cwuWZIo5
 T8dKz/mhga11u0ii6Zk+ecDdpagcxrQ8FDKcjI1tgXtTKAEvjfhVwAbj/9X55BGY
 T3M52b6RuE/ZRPD/BKXMaAUTNI0jwYbJyZYm+5GY/fk6lg0CRNaPBTVr0hj1Ia4P
 7akSD1gYUSTblFvjGa+hBsnmUDHo7Htl+kXJ3v6U8bSPFgFsO7hLbZJR6nRbxDvV
 k/KejR7RsORkbTFczWGngTKGI2UUNgMJf3Km9WU5bLqU3PpM32Hlal3CLMm3dQTB
 /3Gfl6aCy7Ct72tNilw=
 =EgBn
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.10' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Two minor fixes and one performance enhancement to Smack. The
  performance improvement is significant and the new code is more like
  its counterpart in SELinux.

   - Two kernel test robot suggested clean-ups.

   - Teach Smack to use the IPv4 netlabel cache. This results in a
     12-14% improvement on TCP benchmarks"

* tag 'Smack-for-5.10' of git://github.com/cschaufler/smack-next:
  Smack: Remove unnecessary variable initialization
  Smack: Fix build when NETWORK_SECMARK is not set
  Smack: Use the netlabel cache
  Smack: Set socket labels only once
  Smack: Consolidate uses of secmark into a function
2020-10-13 16:18:51 -07:00
Casey Schaufler
edd615371b Smack: Remove unnecessary variable initialization
The initialization of rc in smack_from_netlbl() is pointless.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-10-05 14:20:51 -07:00
Casey Schaufler
bf0afe673b Smack: Fix build when NETWORK_SECMARK is not set
Use proper conditional compilation for the secmark field in
the network skb.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-22 14:59:31 -07:00
Casey Schaufler
322dd63c7f Smack: Use the netlabel cache
Utilize the Netlabel cache mechanism for incoming packet matching.
Refactor the initialization of secattr structures, as it was being
done in two places.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:31 -07:00
Casey Schaufler
a2af031885 Smack: Set socket labels only once
Refactor the IP send checks so that the netlabel value
is set only when necessary, not on every send. Some functions
get renamed as the changes made the old name misleading.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:30 -07:00
Casey Schaufler
36be81293d Smack: Consolidate uses of secmark into a function
Add a function smack_from_skb() that returns the Smack label
identified by a network secmark. Replace the explicit uses of
the secmark with this function.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:30 -07:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Dan Carpenter
42a2df3e82 Smack: prevent underflow in smk_set_cipso()
We have an upper bound on "maplevel" but forgot to check for negative
values.

Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-27 13:35:12 -07:00
Dan Carpenter
a6bd4f6d9b Smack: fix another vsscanf out of bounds
This is similar to commit 84e99e58e8 ("Smack: slab-out-of-bounds in
vsscanf") where we added a bounds check on "rule".

Reported-by: syzbot+a22c6092d003d6fe1122@syzkaller.appspotmail.com
Fixes: f7112e6c9a ("Smack: allow for significantly longer Smack labels v4")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-27 13:35:03 -07:00
Eric Biggers
beb4ee6770 Smack: fix use-after-free in smk_write_relabel_self()
smk_write_relabel_self() frees memory from the task's credentials with
no locking, which can easily cause a use-after-free because multiple
tasks can share the same credentials structure.

Fix this by using prepare_creds() and commit_creds() to correctly modify
the task's credentials.

Reproducer for "BUG: KASAN: use-after-free in smk_write_relabel_self":

	#include <fcntl.h>
	#include <pthread.h>
	#include <unistd.h>

	static void *thrproc(void *arg)
	{
		int fd = open("/sys/fs/smackfs/relabel-self", O_WRONLY);
		for (;;) write(fd, "foo", 3);
	}

	int main()
	{
		pthread_t t;
		pthread_create(&t, NULL, thrproc, NULL);
		thrproc(NULL);
	}

Reported-by: syzbot+e6416dabb497a650da40@syzkaller.appspotmail.com
Fixes: 38416e5393 ("Smack: limited capability for changing process label")
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-14 11:19:58 -07:00
Linus Torvalds
6c32978414 Notifications over pipes + Keyring notifications
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl7U/i8ACgkQ+7dXa6fL
 C2u2eg/+Oy6ybq0hPovYVkFI9WIG7ZCz7w9Q6BEnfYMqqn3dnfJxKQ3l4pnQEOWw
 f4QfvpvevsYfMtOJkYcG6s66rQgbFdqc5TEyBBy0QNp3acRolN7IXkcopvv9xOpQ
 JxedpbFG1PTFLWjvBpyjlrUPouwLzq2FXAf1Ox0ZIMw6165mYOMWoli1VL8dh0A0
 Ai7JUB0WrvTNbrwhV413obIzXT/rPCdcrgbQcgrrLPex8lQ47ZAE9bq6k4q5HiwK
 KRzEqkQgnzId6cCNTFBfkTWsx89zZunz7jkfM5yx30MvdAtPSxvvpfIPdZRZkXsP
 E2K9Fk1/6OQZTC0Op3Pi/bt+hVG/mD1p0sQUDgo2MO3qlSS+5mMkR8h3mJEgwK12
 72P4YfOJkuAy2z3v4lL0GYdUDAZY6i6G8TMxERKu/a9O3VjTWICDOyBUS6F8YEAK
 C7HlbZxAEOKTVK0BTDTeEUBwSeDrBbvH6MnRlZCG5g1Fos2aWP0udhjiX8IfZLO7
 GN6nWBvK1fYzfsUczdhgnoCzQs3suoDo04HnsTPGJ8De52T4x2RsjV+gPx0nrNAq
 eWChl1JvMWsY2B3GLnl9XQz4NNN+EreKEkk+PULDGllrArrPsp5Vnhb9FJO1PVCU
 hMDJHohPiXnKbc8f4Bd78OhIvnuoGfJPdM5MtNe2flUKy2a2ops=
 =YTGf
 -----END PGP SIGNATURE-----

Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull notification queue from David Howells:
 "This adds a general notification queue concept and adds an event
  source for keys/keyrings, such as linking and unlinking keys and
  changing their attributes.

  Thanks to Debarshi Ray, we do have a pull request to use this to fix a
  problem with gnome-online-accounts - as mentioned last time:

     https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

  Without this, g-o-a has to constantly poll a keyring-based kerberos
  cache to find out if kinit has changed anything.

  [ There are other notification pending: mount/sb fsinfo notifications
    for libmount that Karel Zak and Ian Kent have been working on, and
    Christian Brauner would like to use them in lxc, but let's see how
    this one works first ]

  LSM hooks are included:

   - A set of hooks are provided that allow an LSM to rule on whether or
     not a watch may be set. Each of these hooks takes a different
     "watched object" parameter, so they're not really shareable. The
     LSM should use current's credentials. [Wanted by SELinux & Smack]

   - A hook is provided to allow an LSM to rule on whether or not a
     particular message may be posted to a particular queue. This is
     given the credentials from the event generator (which may be the
     system) and the watch setter. [Wanted by Smack]

  I've provided SELinux and Smack with implementations of some of these
  hooks.

  WHY
  ===

  Key/keyring notifications are desirable because if you have your
  kerberos tickets in a file/directory, your Gnome desktop will monitor
  that using something like fanotify and tell you if your credentials
  cache changes.

  However, we also have the ability to cache your kerberos tickets in
  the session, user or persistent keyring so that it isn't left around
  on disk across a reboot or logout. Keyrings, however, cannot currently
  be monitored asynchronously, so the desktop has to poll for it - not
  so good on a laptop. This facility will allow the desktop to avoid the
  need to poll.

  DESIGN DECISIONS
  ================

   - The notification queue is built on top of a standard pipe. Messages
     are effectively spliced in. The pipe is opened with a special flag:

        pipe2(fds, O_NOTIFICATION_PIPE);

     The special flag has the same value as O_EXCL (which doesn't seem
     like it will ever be applicable in this context)[?]. It is given up
     front to make it a lot easier to prohibit splice&co from accessing
     the pipe.

     [?] Should this be done some other way?  I'd rather not use up a new
         O_* flag if I can avoid it - should I add a pipe3() system call
         instead?

     The pipe is then configured::

        ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
        ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

     Messages are then read out of the pipe using read().

   - It should be possible to allow write() to insert data into the
     notification pipes too, but this is currently disabled as the
     kernel has to be able to insert messages into the pipe *without*
     holding pipe->mutex and the code to make this work needs careful
     auditing.

   - sendfile(), splice() and vmsplice() are disabled on notification
     pipes because of the pipe->mutex issue and also because they
     sometimes want to revert what they just did - but one or more
     notification messages might've been interleaved in the ring.

   - The kernel inserts messages with the wait queue spinlock held. This
     means that pipe_read() and pipe_write() have to take the spinlock
     to update the queue pointers.

   - Records in the buffer are binary, typed and have a length so that
     they can be of varying size.

     This allows multiple heterogeneous sources to share a common
     buffer; there are 16 million types available, of which I've used
     just a few, so there is scope for others to be used. Tags may be
     specified when a watchpoint is created to help distinguish the
     sources.

   - Records are filterable as types have up to 256 subtypes that can be
     individually filtered. Other filtration is also available.

   - Notification pipes don't interfere with each other; each may be
     bound to a different set of watches. Any particular notification
     will be copied to all the queues that are currently watching for it
     - and only those that are watching for it.

   - When recording a notification, the kernel will not sleep, but will
     rather mark a queue as having lost a message if there's
     insufficient space. read() will fabricate a loss notification
     message at an appropriate point later.

   - The notification pipe is created and then watchpoints are attached
     to it, using one of:

        keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
        watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
        watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

     where in both cases, fd indicates the queue and the number after is
     a tag between 0 and 255.

   - Watches are removed if either the notification pipe is destroyed or
     the watched object is destroyed. In the latter case, a message will
     be generated indicating the enforced watch removal.

  Things I want to avoid:

   - Introducing features that make the core VFS dependent on the
     network stack or networking namespaces (ie. usage of netlink).

   - Dumping all this stuff into dmesg and having a daemon that sits
     there parsing the output and distributing it as this then puts the
     responsibility for security into userspace and makes handling
     namespaces tricky. Further, dmesg might not exist or might be
     inaccessible inside a container.

   - Letting users see events they shouldn't be able to see.

  TESTING AND MANPAGES
  ====================

   - The keyutils tree has a pipe-watch branch that has keyctl commands
     for making use of notifications. Proposed manual pages can also be
     found on this branch, though a couple of them really need to go to
     the main manpages repository instead.

     If the kernel supports the watching of keys, then running "make
     test" on that branch will cause the testing infrastructure to spawn
     a monitoring process on the side that monitors a notifications pipe
     for all the key/keyring changes induced by the tests and they'll
     all be checked off to make sure they happened.

        https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

   - A test program is provided (samples/watch_queue/watch_test) that
     can be used to monitor for keyrings, mount and superblock events.
     Information on the notifications is simply logged to stdout"

* tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  smack: Implement the watch_key and post_notification hooks
  selinux: Implement the watch_key security hook
  keys: Make the KEY_NEED_* perms an enum rather than a mask
  pipe: Add notification lossage handling
  pipe: Allow buffers to be marked read-whole-or-error for notifications
  Add sample notification program
  watch_queue: Add a key/keyring notification facility
  security: Add hooks to rule on setting a watch
  pipe: Add general notification queue support
  pipe: Add O_NOTIFICATION_PIPE
  security: Add a hook for the point of notification insertion
  uapi: General notification queue definitions
2020-06-13 09:56:21 -07:00
Linus Torvalds
15a2bc4dbb Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve updates from Eric Biederman:
 "Last cycle for the Nth time I ran into bugs and quality of
  implementation issues related to exec that could not be easily be
  fixed because of the way exec is implemented. So I have been digging
  into exec and cleanup up what I can.

  I don't think I have exec sorted out enough to fix the issues I
  started with but I have made some headway this cycle with 4 sets of
  changes.

   - promised cleanups after introducing exec_update_mutex

   - trivial cleanups for exec

   - control flow simplifications

   - remove the recomputation of bprm->cred

  The net result is code that is a bit easier to understand and work
  with and a decrease in the number of lines of code (if you don't count
  the added tests)"

* 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (24 commits)
  exec: Compute file based creds only once
  exec: Add a per bprm->file version of per_clear
  binfmt_elf_fdpic: fix execfd build regression
  selftests/exec: Add binfmt_script regression test
  exec: Remove recursion from search_binary_handler
  exec: Generic execfd support
  exec/binfmt_script: Don't modify bprm->buf and then return -ENOEXEC
  exec: Move the call of prepare_binprm into search_binary_handler
  exec: Allow load_misc_binary to call prepare_binprm unconditionally
  exec: Convert security_bprm_set_creds into security_bprm_repopulate_creds
  exec: Factor security_bprm_creds_for_exec out of security_bprm_set_creds
  exec: Teach prepare_exec_creds how exec treats uids & gids
  exec: Set the point of no return sooner
  exec: Move handling of the point of no return to the top level
  exec: Run sync_mm_rss before taking exec_update_mutex
  exec: Fix spelling of search_binary_handler in a comment
  exec: Move the comment from above de_thread to above unshare_sighand
  exec: Rename flush_old_exec begin_new_exec
  exec: Move most of setup_new_exec into flush_old_exec
  exec: In setup_new_exec cache current in the local variable me
  ...
2020-06-04 14:07:08 -07:00
Eric W. Biederman
b8bff59926 exec: Factor security_bprm_creds_for_exec out of security_bprm_set_creds
Today security_bprm_set_creds has several implementations:
apparmor_bprm_set_creds, cap_bprm_set_creds, selinux_bprm_set_creds,
smack_bprm_set_creds, and tomoyo_bprm_set_creds.

Except for cap_bprm_set_creds they all test bprm->called_set_creds and
return immediately if it is true.  The function cap_bprm_set_creds
ignores bprm->calld_sed_creds entirely.

Create a new LSM hook security_bprm_creds_for_exec that is called just
before prepare_binprm in __do_execve_file, resulting in a LSM hook
that is called exactly once for the entire of exec.  Modify the bits
of security_bprm_set_creds that only want to be called once per exec
into security_bprm_creds_for_exec, leaving only cap_bprm_set_creds
behind.

Remove bprm->called_set_creds all of it's former users have been moved
to security_bprm_creds_for_exec.

Add or upate comments a appropriate to bring them up to date and
to reflect this change.

Link: https://lkml.kernel.org/r/87v9kszrzh.fsf_-_@x220.int.ebiederm.org
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com> # For the LSM and Smack bits
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-05-20 14:45:31 -05:00
David Howells
a8478a6029 smack: Implement the watch_key and post_notification hooks
Implement the watch_key security hook in Smack to make sure that a key
grants the caller Read permission in order to set a watch on a key.

Also implement the post_notification security hook to make sure that the
notification source is granted Write permission by the watch queue.

For the moment, the watch_devices security hook is left unimplemented as
it's not obvious what the object should be since the queue is global and
didn't previously exist.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
2020-05-19 15:47:38 +01:00
David Howells
8c0637e950 keys: Make the KEY_NEED_* perms an enum rather than a mask
Since the meaning of combining the KEY_NEED_* constants is undefined, make
it so that you can't do that by turning them into an enum.

The enum is also given some extra values to represent special
circumstances, such as:

 (1) The '0' value is reserved and causes a warning to trap the parameter
     being unset.

 (2) The key is to be unlinked and we require no permissions on it, only
     the keyring, (this replaces the KEY_LOOKUP_FOR_UNLINK flag).

 (3) An override due to CAP_SYS_ADMIN.

 (4) An override due to an instantiation token being present.

 (5) The permissions check is being deferred to later key_permission()
     calls.

The extra values give the opportunity for LSMs to audit these situations.

[Note: This really needs overhauling so that lookup_user_key() tells
 key_task_permission() and the LSM what operation is being done and leaves
 it to those functions to decide how to map that onto the available
 permits.  However, I don't really want to make these change in the middle
 of the notifications patchset.]

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: Paul Moore <paul@paul-moore.com>
cc: Stephen Smalley <stephen.smalley.work@gmail.com>
cc: Casey Schaufler <casey@schaufler-ca.com>
cc: keyrings@vger.kernel.org
cc: selinux@vger.kernel.org
2020-05-19 15:42:22 +01:00
YueHaibing
ef26650a20 Smack: Remove unused inline function smk_ad_setfield_u_fs_path_mnt
commit a269434d2f ("LSM: separate LSM_AUDIT_DATA_DENTRY from LSM_AUDIT_DATA_PATH")
left behind this, remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-05-11 10:25:37 -07:00
Casey Schaufler
4ca7528706 Smack:- Remove redundant inode_smack cache
The inode_smack cache is no longer used.
Remove it.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-05-06 14:46:26 -07:00
Casey Schaufler
921bb1cbb3 Smack:- Remove mutex lock "smk_lock" from inode_smack
"smk_lock" mutex is used during inode instantiation in
smack_d_instantiate()function. It has been used to avoid
simultaneous access on same inode security structure.
Since smack related initialization is done only once i.e during
inode creation. If the inode has already been instantiated then
smack_d_instantiate() function just returns without doing
anything.

So it means mutex lock is required only during inode creation.
But since 2 processes can't create same inodes or files
simultaneously. Also linking or some other file operation can't
be done simultaneously when the file is getting created since
file lookup will fail before dentry inode linkup which is done
after smack initialization.
So no mutex lock is required in inode_smack structure.

It will save memory as well as improve some performance.
If 40000 inodes are created in system, it will save 1.5 MB on
32-bit systems & 2.8 MB on 64-bit systems.

Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-05-06 14:46:26 -07:00
Casey Schaufler
84e99e58e8 Smack: slab-out-of-bounds in vsscanf
Add barrier to soob. Return -EOVERFLOW if the buffer
is exceeded.

Suggested-by: Hillf Danton <hdanton@sina.com>
Reported-by: syzbot+bfdd4a2f07be52351350@syzkaller.appspotmail.com
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-05-06 14:46:26 -07:00
Maninder Singh
092c94aed3 smack: remove redundant structure variable from header.
commit afb1cbe374 ("LSM: Infrastructure management
of the inode security") removed usage of smk_rcu,
thus removing it from structure.

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-05-06 14:46:26 -07:00
Arnd Bergmann
00720f0e7f smack: avoid unused 'sip' variable warning
The mix of IS_ENABLED() and #ifdef checks has left a combination
that causes a warning about an unused variable:

security/smack/smack_lsm.c: In function 'smack_socket_connect':
security/smack/smack_lsm.c:2838:24: error: unused variable 'sip' [-Werror=unused-variable]
 2838 |   struct sockaddr_in6 *sip = (struct sockaddr_in6 *)sap;

Change the code to use C-style checks consistently so the compiler
can handle it correctly.

Fixes: 87fbfffcc8 ("broken ping to ipv6 linklocal addresses on debian buster")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-05-06 14:46:26 -07:00
Linus Torvalds
c9d35ee049 Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs file system parameter updates from Al Viro:
 "Saner fs_parser.c guts and data structures. The system-wide registry
  of syntax types (string/enum/int32/oct32/.../etc.) is gone and so is
  the horror switch() in fs_parse() that would have to grow another case
  every time something got added to that system-wide registry.

  New syntax types can be added by filesystems easily now, and their
  namespace is that of functions - not of system-wide enum members. IOW,
  they can be shared or kept private and if some turn out to be widely
  useful, we can make them common library helpers, etc., without having
  to do anything whatsoever to fs_parse() itself.

  And we already get that kind of requests - the thing that finally
  pushed me into doing that was "oh, and let's add one for timeouts -
  things like 15s or 2h". If some filesystem really wants that, let them
  do it. Without somebody having to play gatekeeper for the variants
  blessed by direct support in fs_parse(), TYVM.

  Quite a bit of boilerplate is gone. And IMO the data structures make a
  lot more sense now. -200LoC, while we are at it"

* 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (25 commits)
  tmpfs: switch to use of invalfc()
  cgroup1: switch to use of errorfc() et.al.
  procfs: switch to use of invalfc()
  hugetlbfs: switch to use of invalfc()
  cramfs: switch to use of errofc() et.al.
  gfs2: switch to use of errorfc() et.al.
  fuse: switch to use errorfc() et.al.
  ceph: use errorfc() and friends instead of spelling the prefix out
  prefix-handling analogues of errorf() and friends
  turn fs_param_is_... into functions
  fs_parse: handle optional arguments sanely
  fs_parse: fold fs_parameter_desc/fs_parameter_spec
  fs_parser: remove fs_parameter_description name field
  add prefix to fs_context->log
  ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log
  new primitive: __fs_parse()
  switch rbd and libceph to p_log-based primitives
  struct p_log, variants of warnf() et.al. taking that one instead
  teach logfc() to handle prefices, give it saner calling conventions
  get rid of cg_invalf()
  ...
2020-02-08 13:26:41 -08:00
Al Viro
d7167b1499 fs_parse: fold fs_parameter_desc/fs_parameter_spec
The former contains nothing but a pointer to an array of the latter...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07 14:48:37 -05:00
Eric Sandeen
96cafb9ccb fs_parser: remove fs_parameter_description name field
Unused now.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-02-07 14:48:36 -05:00
Casey Schaufler
87fbfffcc8
broken ping to ipv6 linklocal addresses on debian buster
I am seeing ping failures to IPv6 linklocal addresses with Debian
buster. Easiest example to reproduce is:

$ ping -c1 -w1 ff02::1%eth1
connect: Invalid argument

$ ping -c1 -w1 ff02::1%eth1
PING ff02::01%eth1(ff02::1%eth1) 56 data bytes
64 bytes from fe80::e0:f9ff:fe0c:37%eth1: icmp_seq=1 ttl=64 time=0.059 ms

git bisect traced the failure to
commit b9ef5513c9 ("smack: Check address length before reading address family")

Arguably ping is being stupid since the buster version is not setting
the address family properly (ping on stretch for example does):

$ strace -e connect ping6 -c1 -w1 ff02::1%eth1
connect(5, {sa_family=AF_UNSPEC,
sa_data="\4\1\0\0\0\0\377\2\0\0\0\0\0\0\0\0\0\0\0\0\0\1\3\0\0\0"}, 28)
= -1 EINVAL (Invalid argument)

but the command works fine on kernels prior to this commit, so this is
breakage which goes against the Linux paradigm of "don't break userspace"

Cc: stable@vger.kernel.org
Reported-by: David Ahern <dsahern@gmail.com>
Suggested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>

 security/smack/smack_lsm.c | 41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)
2020-02-05 14:16:27 -08:00
David Howells
d055b4fb4d pipe: Reduce #inclusion of pipe_fs_i.h
Remove some #inclusions of linux/pipe_fs_i.h that don't seem to be
necessary any more.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-10-23 17:02:34 +01:00
Linus Torvalds
e94f8ccde4 I have four patches for v5.4. Nothing is major. All but one are in
response to mechanically detected potential issues. The remaining
 patch cleans up kernel-doc notations.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl2JI5wXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBEOJQ/5AXdQTd09LMp9jB54u9Usdm71
 +kyJ/KudEja8/pCDDNboiXSfoagRqJ8AbuBAbGLtWLXc3smUcL1mncdfJDJAk88J
 mbIB+qWMls5fC25udD+B2bF2py+eyVJ7dsnvHZg1mS5KUxYBMWVEqgX9zW0EFgNH
 xd2/nB314GhULrfqagxxCd/HpbZ3GV1sM+BkfRPx2zm3gJ8xAuXm1xMMgchP9WqH
 MFJDqk8r1wXCog8OkjQjAYR8zGRJTrP9W6UY9p1L6rp9rtfyPObBuAMLKv3WlXx8
 Jz7idqSDNa49V7W3UrWcjXCunbjyPR7HszuuxhTC+EmB1MRU4IdX9I6ZdAaTuxEM
 jFNwSSjIWRgXkJfLxrDX1ukFPU0JCd8ms7Lzw5YHq2TWt/V/7h4jyUCN8o9BN80r
 7WzqdzT4v+Exc6TpqlpkHiQjJFL4ByEzNt3xNVZ3UFIyxnogVi45kL/78PsqDk/j
 XWqM9bED8dBjM/K3EGqzj0mPCtILLnTm9ZyDvFF75jabf4rk0E354yGcuamoF+eM
 UTT+3NTPQB/kI5i9av8ibGezInVVRQeHuI1/qIaD/Hsr8K7VJbqlB1k/rUxUZaSy
 6g9e0mU2GLgM+eW0EKW0GWpV6/STqzskxu2TW46tobpOykwH9dNKJHhJzx7nEWJi
 +5kMcGIvFCha6922/sM=
 =QV1S
 -----END PGP SIGNATURE-----

Merge tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Four patches for v5.4. Nothing is major.

  All but one are in response to mechanically detected potential issues.
  The remaining patch cleans up kernel-doc notations"

* tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-next:
  smack: use GFP_NOFS while holding inode_smack::smk_lock
  security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb()
  smack: fix some kernel-doc notations
  Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
2019-09-23 14:25:45 -07:00
Eric Biggers
e5bfad3d7a
smack: use GFP_NOFS while holding inode_smack::smk_lock
inode_smack::smk_lock is taken during smack_d_instantiate(), which is
called during a filesystem transaction when creating a file on ext4.
Therefore to avoid a deadlock, all code that takes this lock must use
GFP_NOFS, to prevent memory reclaim from waiting for the filesystem
transaction to complete.

Reported-by: syzbot+0eefc1e06a77d327a056@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-09-04 09:37:07 -07:00
Jia-Ju Bai
3f4287e7d9
security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb()
In smack_socket_sock_rcv_skb(), there is an if statement
on line 3920 to check whether skb is NULL:
    if (skb && skb->secmark != 0)

This check indicates skb can be NULL in some cases.

But on lines 3931 and 3932, skb is used:
    ad.a.u.net->netif = skb->skb_iif;
    ipv6_skb_to_auditdata(skb, &ad.a, NULL);

Thus, possible null-pointer dereferences may occur when skb is NULL.

To fix these possible bugs, an if statement is added to check skb.

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-09-04 09:37:07 -07:00
luanshi
a1a07f2234
smack: fix some kernel-doc notations
Fix/add kernel-doc notation and fix typos in security/smack/.

Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-09-04 09:37:07 -07:00
Jann Horn
3675f052b4
Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
There is a logic bug in the current smack_bprm_set_creds():
If LSM_UNSAFE_PTRACE is set, but the ptrace state is deemed to be
acceptable (e.g. because the ptracer detached in the meantime), the other
->unsafe flags aren't checked. As far as I can tell, this means that
something like the following could work (but I haven't tested it):

 - task A: create task B with fork()
 - task B: set NO_NEW_PRIVS
 - task B: install a seccomp filter that makes open() return 0 under some
   conditions
 - task B: replace fd 0 with a malicious library
 - task A: attach to task B with PTRACE_ATTACH
 - task B: execve() a file with an SMACK64EXEC extended attribute
 - task A: while task B is still in the middle of execve(), exit (which
   destroys the ptrace relationship)

Make sure that if any flags other than LSM_UNSAFE_PTRACE are set in
bprm->unsafe, we reject the execve().

Cc: stable@vger.kernel.org
Fixes: 5663884caa ("Smack: unify all ptrace accesses in the smack")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2019-09-04 09:36:57 -07:00
Linus Torvalds
933a90bf4f Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs mount updates from Al Viro:
 "The first part of mount updates.

  Convert filesystems to use the new mount API"

* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  mnt_init(): call shmem_init() unconditionally
  constify ksys_mount() string arguments
  don't bother with registering rootfs
  init_rootfs(): don't bother with init_ramfs_fs()
  vfs: Convert smackfs to use the new mount API
  vfs: Convert selinuxfs to use the new mount API
  vfs: Convert securityfs to use the new mount API
  vfs: Convert apparmorfs to use the new mount API
  vfs: Convert openpromfs to use the new mount API
  vfs: Convert xenfs to use the new mount API
  vfs: Convert gadgetfs to use the new mount API
  vfs: Convert oprofilefs to use the new mount API
  vfs: Convert ibmasmfs to use the new mount API
  vfs: Convert qib_fs/ipathfs to use the new mount API
  vfs: Convert efivarfs to use the new mount API
  vfs: Convert configfs to use the new mount API
  vfs: Convert binfmt_misc to use the new mount API
  convenience helper: get_tree_single()
  convenience helper get_tree_nodev()
  vfs: Kill sget_userns()
  ...
2019-07-19 10:42:02 -07:00
Linus Torvalds
028db3e290 Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs"
This reverts merge 0f75ef6a9c (and thus
effectively commits

   7a1ade8475 ("keys: Provide KEYCTL_GRANT_PERMISSION")
   2e12256b9a ("keys: Replace uid/gid/perm permissions checking with an ACL")

that the merge brought in).

It turns out that it breaks booting with an encrypted volume, and Eric
biggers reports that it also breaks the fscrypt tests [1] and loading of
in-kernel X.509 certificates [2].

The root cause of all the breakage is likely the same, but David Howells
is off email so rather than try to work it out it's getting reverted in
order to not impact the rest of the merge window.

 [1] https://lore.kernel.org/lkml/20190710011559.GA7973@sol.localdomain/
 [2] https://lore.kernel.org/lkml/20190710013225.GB7973@sol.localdomain/

Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/
Reported-by: Eric Biggers <ebiggers@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-10 18:43:43 -07:00
Linus Torvalds
0f75ef6a9c Keyrings ACL
-----BEGIN PGP SIGNATURE-----
 
 iQIVAwUAXRyyVvu3V2unywtrAQL3xQ//eifjlELkRAPm2EReWwwahdM+9QL/0bAy
 e8eAzP9EaphQGUhpIzM9Y7Cx+a8XW2xACljY8hEFGyxXhDMoLa35oSoJOeay6vQt
 QcgWnDYsET8Z7HOsFCP3ZQqlbbqfsB6CbIKtZoEkZ8ib7eXpYcy1qTydu7wqrl4A
 AaJalAhlUKKUx9hkGGJTh2xvgmxgSJkxx3cNEWJQ2uGgY/ustBpqqT4iwFDsgA/q
 fcYTQFfNQBsC8/SmvQgxJSc+reUdQdp0z1vd8qjpSdFFcTq1qOtK0qDdz1Bbyl24
 hAxvNM1KKav83C8aF7oHhEwLrkD+XiYKixdEiCJJp+A2i+vy2v8JnfgtFTpTgLNK
 5xu2VmaiWmee9SLCiDIBKE4Ghtkr8DQ/5cKFCwthT8GXgQUtdsdwAaT3bWdCNfRm
 DqgU/AyyXhoHXrUM25tPeF3hZuDn2yy6b1TbKA9GCpu5TtznZIHju40Px/XMIpQH
 8d6s/pg+u/SnkhjYWaTvTcvsQ2FB/vZY/UzAVyosnoMBkVfL4UtAHGbb8FBVj1nf
 Dv5VjSjl4vFjgOr3jygEAeD2cJ7L6jyKbtC/jo4dnOmPrSRShIjvfSU04L3z7FZS
 XFjMmGb2Jj8a7vAGFmsJdwmIXZ1uoTwX56DbpNL88eCgZWFPGKU7TisdIWAmJj8U
 N9wholjHJgw=
 =E3bF
 -----END PGP SIGNATURE-----

Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull keyring ACL support from David Howells:
 "This changes the permissions model used by keys and keyrings to be
  based on an internal ACL by the following means:

   - Replace the permissions mask internally with an ACL that contains a
     list of ACEs, each with a specific subject with a permissions mask.
     Potted default ACLs are available for new keys and keyrings.

     ACE subjects can be macroised to indicate the UID and GID specified
     on the key (which remain). Future commits will be able to add
     additional subject types, such as specific UIDs or domain
     tags/namespaces.

     Also split a number of permissions to give finer control. Examples
     include splitting the revocation permit from the change-attributes
     permit, thereby allowing someone to be granted permission to revoke
     a key without allowing them to change the owner; also the ability
     to join a keyring is split from the ability to link to it, thereby
     stopping a process accessing a keyring by joining it and thus
     acquiring use of possessor permits.

   - Provide a keyctl to allow the granting or denial of one or more
     permits to a specific subject. Direct access to the ACL is not
     granted, and the ACL cannot be viewed"

* tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  keys: Provide KEYCTL_GRANT_PERMISSION
  keys: Replace uid/gid/perm permissions checking with an ACL
2019-07-08 19:56:57 -07:00
David Howells
5afdd0f1e6 vfs: Convert smackfs to use the new mount API
Convert the smackfs filesystem to the new internal mount API as the old
one will be obsoleted and removed.  This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Casey Schaufler <casey@schaufler-ca.com>
cc: linux-security-module@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-04 22:01:59 -04:00
David Howells
2e12256b9a keys: Replace uid/gid/perm permissions checking with an ACL
Replace the uid/gid/perm permissions checking on a key with an ACL to allow
the SETATTR and SEARCH permissions to be split.  This will also allow a
greater range of subjects to represented.

============
WHY DO THIS?
============

The problem is that SETATTR and SEARCH cover a slew of actions, not all of
which should be grouped together.

For SETATTR, this includes actions that are about controlling access to a
key:

 (1) Changing a key's ownership.

 (2) Changing a key's security information.

 (3) Setting a keyring's restriction.

And actions that are about managing a key's lifetime:

 (4) Setting an expiry time.

 (5) Revoking a key.

and (proposed) managing a key as part of a cache:

 (6) Invalidating a key.

Managing a key's lifetime doesn't really have anything to do with
controlling access to that key.

Expiry time is awkward since it's more about the lifetime of the content
and so, in some ways goes better with WRITE permission.  It can, however,
be set unconditionally by a process with an appropriate authorisation token
for instantiating a key, and can also be set by the key type driver when a
key is instantiated, so lumping it with the access-controlling actions is
probably okay.

As for SEARCH permission, that currently covers:

 (1) Finding keys in a keyring tree during a search.

 (2) Permitting keyrings to be joined.

 (3) Invalidation.

But these don't really belong together either, since these actions really
need to be controlled separately.

Finally, there are number of special cases to do with granting the
administrator special rights to invalidate or clear keys that I would like
to handle with the ACL rather than key flags and special checks.


===============
WHAT IS CHANGED
===============

The SETATTR permission is split to create two new permissions:

 (1) SET_SECURITY - which allows the key's owner, group and ACL to be
     changed and a restriction to be placed on a keyring.

 (2) REVOKE - which allows a key to be revoked.

The SEARCH permission is split to create:

 (1) SEARCH - which allows a keyring to be search and a key to be found.

 (2) JOIN - which allows a keyring to be joined as a session keyring.

 (3) INVAL - which allows a key to be invalidated.

The WRITE permission is also split to create:

 (1) WRITE - which allows a key's content to be altered and links to be
     added, removed and replaced in a keyring.

 (2) CLEAR - which allows a keyring to be cleared completely.  This is
     split out to make it possible to give just this to an administrator.

 (3) REVOKE - see above.


Keys acquire ACLs which consist of a series of ACEs, and all that apply are
unioned together.  An ACE specifies a subject, such as:

 (*) Possessor - permitted to anyone who 'possesses' a key
 (*) Owner - permitted to the key owner
 (*) Group - permitted to the key group
 (*) Everyone - permitted to everyone

Note that 'Other' has been replaced with 'Everyone' on the assumption that
you wouldn't grant a permit to 'Other' that you wouldn't also grant to
everyone else.

Further subjects may be made available by later patches.

The ACE also specifies a permissions mask.  The set of permissions is now:

	VIEW		Can view the key metadata
	READ		Can read the key content
	WRITE		Can update/modify the key content
	SEARCH		Can find the key by searching/requesting
	LINK		Can make a link to the key
	SET_SECURITY	Can change owner, ACL, expiry
	INVAL		Can invalidate
	REVOKE		Can revoke
	JOIN		Can join this keyring
	CLEAR		Can clear this keyring


The KEYCTL_SETPERM function is then deprecated.

The KEYCTL_SET_TIMEOUT function then is permitted if SET_SECURITY is set,
or if the caller has a valid instantiation auth token.

The KEYCTL_INVALIDATE function then requires INVAL.

The KEYCTL_REVOKE function then requires REVOKE.

The KEYCTL_JOIN_SESSION_KEYRING function then requires JOIN to join an
existing keyring.

The JOIN permission is enabled by default for session keyrings and manually
created keyrings only.


======================
BACKWARD COMPATIBILITY
======================

To maintain backward compatibility, KEYCTL_SETPERM will translate the
permissions mask it is given into a new ACL for a key - unless
KEYCTL_SET_ACL has been called on that key, in which case an error will be
returned.

It will convert possessor, owner, group and other permissions into separate
ACEs, if each portion of the mask is non-zero.

SETATTR permission turns on all of INVAL, REVOKE and SET_SECURITY.  WRITE
permission turns on WRITE, REVOKE and, if a keyring, CLEAR.  JOIN is turned
on if a keyring is being altered.

The KEYCTL_DESCRIBE function translates the ACL back into a permissions
mask to return depending on possessor, owner, group and everyone ACEs.

It will make the following mappings:

 (1) INVAL, JOIN -> SEARCH

 (2) SET_SECURITY -> SETATTR

 (3) REVOKE -> WRITE if SETATTR isn't already set

 (4) CLEAR -> WRITE

Note that the value subsequently returned by KEYCTL_DESCRIBE may not match
the value set with KEYCTL_SETATTR.


=======
TESTING
=======

This passes the keyutils testsuite for all but a couple of tests:

 (1) tests/keyctl/dh_compute/badargs: The first wrong-key-type test now
     returns EOPNOTSUPP rather than ENOKEY as READ permission isn't removed
     if the type doesn't have ->read().  You still can't actually read the
     key.

 (2) tests/keyctl/permitting/valid: The view-other-permissions test doesn't
     work as Other has been replaced with Everyone in the ACL.

Signed-off-by: David Howells <dhowells@redhat.com>
2019-06-27 23:03:07 +01:00
Thomas Gleixner
d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00