Commit Graph

36 Commits

Author SHA1 Message Date
Linus Torvalds
f1ef09fde1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
 "There is a lot here. A lot of these changes result in subtle user
  visible differences in kernel behavior. I don't expect anything will
  care but I will revert/fix things immediately if any regressions show
  up.

  From Seth Forshee there is a continuation of the work to make the vfs
  ready for unpriviled mounts. We had thought the previous changes
  prevented the creation of files outside of s_user_ns of a filesystem,
  but it turns we missed the O_CREAT path. Ooops.

  Pavel Tikhomirov and Oleg Nesterov worked together to fix a long
  standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only
  children that are forked after the prctl are considered and not
  children forked before the prctl. The only known user of this prctl
  systemd forks all children after the prctl. So no userspace
  regressions will occur. Holding earlier forked children to the same
  rules as later forked children creates a semantic that is sane enough
  to allow checkpoing of processes that use this feature.

  There is a long delayed change by Nikolay Borisov to limit inotify
  instances inside a user namespace.

  Michael Kerrisk extends the API for files used to maniuplate
  namespaces with two new trivial ioctls to allow discovery of the
  hierachy and properties of namespaces.

  Konstantin Khlebnikov with the help of Al Viro adds code that when a
  network namespace exits purges it's sysctl entries from the dcache. As
  in some circumstances this could use a lot of memory.

  Vivek Goyal fixed a bug with stacked filesystems where the permissions
  on the wrong inode were being checked.

  I continue previous work on ptracing across exec. Allowing a file to
  be setuid across exec while being ptraced if the tracer has enough
  credentials in the user namespace, and if the process has CAP_SETUID
  in it's own namespace. Proc files for setuid or otherwise undumpable
  executables are now owned by the root in the user namespace of their
  mm. Allowing debugging of setuid applications in containers to work
  better.

  A bug I introduced with permission checking and automount is now
  fixed. The big change is to mark the mounts that the kernel initiates
  as a result of an automount. This allows the permission checks in sget
  to be safely suppressed for this kind of mount. As the permission
  check happened when the original filesystem was mounted.

  Finally a special case in the mount namespace is removed preventing
  unbounded chains in the mount hash table, and making the semantics
  simpler which benefits CRIU.

  The vfs fix along with related work in ima and evm I believe makes us
  ready to finish developing and merge fully unprivileged mounts of the
  fuse filesystem. The cleanups of the mount namespace makes discussing
  how to fix the worst case complexity of umount. The stacked filesystem
  fixes pave the way for adding multiple mappings for the filesystem
  uids so that efficient and safer containers can be implemented"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc/sysctl: Don't grab i_lock under sysctl_lock.
  vfs: Use upper filesystem inode in bprm_fill_uid()
  proc/sysctl: prune stale dentries during unregistering
  mnt: Tuck mounts under others instead of creating shadow/side mounts.
  prctl: propagate has_child_subreaper flag to every descendant
  introduce the walk_process_tree() helper
  nsfs: Add an ioctl() to return owner UID of a userns
  fs: Better permission checking for submounts
  exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
  vfs: open() with O_CREAT should not create inodes with unknown ids
  nsfs: Add an ioctl() to return the namespace type
  proc: Better ownership of files for non-dumpable tasks in user namespaces
  exec: Remove LSM_UNSAFE_PTRACE_CAP
  exec: Test the ptracer's saved cred to see if the tracee can gain caps
  exec: Don't reset euid and egid when the tracee has CAP_SETUID
  inotify: Convert to using per-namespace limits
2017-02-23 20:33:51 -08:00
Eric W. Biederman
9227dd2a84 exec: Remove LSM_UNSAFE_PTRACE_CAP
With previous changes every location that tests for
LSM_UNSAFE_PTRACE_CAP also tests for LSM_UNSAFE_PTRACE making the
LSM_UNSAFE_PTRACE_CAP redundant, so remove it.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2017-01-24 12:03:08 +13:00
John Johansen
aa9a39ad8f apparmor: convert change_profile to use fqname later to give better control
Moving the use of fqname to later allows learning profiles to be based
on the fqname request instead of just the hname. It also allows cleaning
up some of the name parsing and lookup by allowing the use of
the fqlookupn_profile() lib fn.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:49 -08:00
John Johansen
ef88a7ac55 apparmor: change aad apparmor_audit_data macro to a fn macro
The aad macro can replace aad strings when it is not intended to. Switch
to a fn macro so it is only applied when intended.

Also at the same time cleanup audit_data initialization by putting
common boiler plate behind a macro, and dropping the gfp_t parameter
which will become useless.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:47 -08:00
John Johansen
47f6e5cc73 apparmor: change op from int to const char *
Having ops be an integer that is an index into an op name table is
awkward and brittle. Every op change requires an edit for both the
op constant and a string in the table. Instead switch to using const
strings directly, eliminating the need for the table that needs to
be kept in sync.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:46 -08:00
John Johansen
55a26ebf63 apparmor: rename context abreviation cxt to the more standard ctx
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:45 -08:00
John Johansen
181f7c9776 apparmor: name null-XXX profiles after the executable
When possible its better to name a learning profile after the missing
profile in question. This allows for both more informative names and
for profile reuse.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 01:18:30 -08:00
John Johansen
98849dff90 apparmor: rename namespace to ns to improve code line lengths
Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:16 -08:00
John Johansen
cff281f686 apparmor: split apparmor policy namespaces code into its own file
Policy namespaces will be diverging from profile management and
expanding so put it in its own file.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-01-16 00:42:15 -08:00
John Johansen
3d40658c97 apparmor: fix change_hat not finding hat after policy replacement
After a policy replacement, the task cred may be out of date and need
to be updated. However change_hat is using the stale profiles from
the out of date cred resulting in either: a stale profile being applied
or, incorrect failure when searching for a hat profile as it has been
migrated to the new parent profile.

Fixes: 01e2b670aa (failure to find hat)
Fixes: 898127c34e (stale policy being applied)
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000287
Cc: stable@vger.kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2016-11-21 18:01:28 +11:00
John Johansen
f7da2de011 apparmor: ensure the target profile name is always audited
The target profile name was not being correctly audited in a few
cases because the target variable was not being set and gotos
passed the code to set it at apply:

Since it is always based on new_profile just drop the target var
and conditionally report based on new_profile.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
John Johansen
9049a79221 apparmor: exec should not be returning ENOENT when it denies
The current behavior is confusing as it causes exec failures to report
the executable is missing instead of identifying that apparmor
caused the failure.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2016-07-12 08:43:10 -07:00
Casey Schaufler
b1d9e6b064 LSM: Switch to lists of hooks
Instead of using a vector of security operations
with explicit, special case stacking of the capability
and yama hooks use lists of hooks with capability and
yama hooks included as appropriate.

The security_operations structure is no longer required.
Instead, there is a union of the function pointers that
allows all the hooks lists to use a common mechanism for
list management while retaining typing. Each module
supplies an array describing the hooks it provides instead
of a sparsely populated security_operations structure.
The description includes the element that gets put on
the hook list, avoiding the issues surrounding individual
element allocation.

The method for registering security modules is changed to
reflect the information available. The method for removing
a module, currently only used by SELinux, has also changed.
It should be generic now, however if there are potential
race conditions based on ordering of hook removal that needs
to be addressed by the calling module.

The security hooks are called from the lists and the first
failure is returned.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Acked-by:  Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2015-05-12 15:00:41 +10:00
Kees Cook
1d4457f999 sched: move no_new_privs into new atomic flags
Since seccomp transitions between threads requires updates to the
no_new_privs flag to be atomic, the flag must be part of an atomic flag
set. This moves the nnp flag into a separate task field, and introduces
accessors.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
2014-07-18 12:13:38 -07:00
Oleg Nesterov
51775fe736 apparmor: remove the "task" arg from may_change_ptraced_domain()
Unless task == current ptrace_parent(task) is not safe even under
rcu_read_lock() and most of the current users are not right.

So may_change_ptraced_domain(task) looks wrong as well. However it
is always called with task == current so the code is actually fine.
Remove this argument to make this fact clear.

Note: perhaps we should simply kill ptrace_parent(), it buys almost
nothing. And it is obviously racy, perhaps this should be fixed.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-10-29 21:34:18 -07:00
John Johansen
dd0c6e86f6 apparmor: fix capability to not use the current task, during reporting
Mediation is based off of the cred but auditing includes the current
task which may not be related to the actual request.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-10-29 21:33:37 -07:00
John Johansen
038165070a apparmor: allow setting any profile into the unconfined state
Allow emulating the default profile behavior from boot, by allowing
loading of a profile in the unconfined state into a new NS.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:07 -07:00
John Johansen
fa2ac468db apparmor: update how unconfined is handled
ns->unconfined is being used read side without locking, nor rcu but is
being updated when a namespace is removed. This works for the root ns
which is never removed but has a race window and can cause failures when
children namespaces are removed.

Also ns and ns->unconfined have a circular refcounting dependency that
is problematic and must be broken. Currently this is done incorrectly
when the namespace is destroyed.

Fix this by forward referencing unconfined via the replacedby infrastructure
instead of directly updating the ns->unconfined pointer.

Remove the circular refcount dependency by making the ns and its unconfined
profile share the same refcount.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
77b071b340 apparmor: change how profile replacement update is done
remove the use of replaced by chaining and move to profile invalidation
and lookup to handle task replacement.

Replacement chaining can result in large chains of profiles being pinned
in memory when one profile in the chain is use. With implicit labeling
this will be even more of a problem, so move to a direct lookup method.

Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
01e2b670aa apparmor: convert profile lists to RCU based locking
Signed-off-by: John Johansen <john.johansen@canonical.com>
2013-08-14 11:42:06 -07:00
John Johansen
214beacaa7 apparmor: localize getting the security context to a few macros
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
2013-04-28 00:39:35 -07:00
John Johansen
7a2871b566 apparmor: use common fn to clear task_context for domain transitions
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:36:20 -07:00
John Johansen
3cfcc19e0b apparmor: add utility function to get an arbitrary tasks profile.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:53 -07:00
John Johansen
04266236b1 apparmor: Remove -W1 warnings
Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:18 -07:00
John Johansen
17322cc3f9 apparmor: fix auditing of domain transition failures due to incomplete policy
When policy specifies a transition to a profile that is not currently
loaded, it result in exec being denied.  However the failure is not being
audited correctly because the audit code is treating this as an allowed
permission and thus not reporting it.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-By: Steve Beattie <sbeattie@ubuntu.com>
2013-04-28 00:35:04 -07:00
Al Viro
496ad9aa8e new helper: file_inode(file)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-02-22 23:31:31 -05:00
Eric W. Biederman
2db8145293 userns: Convert apparmor to use kuid and kgid where appropriate
Cc: John Johansen <john.johansen@canonical.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-09-21 03:13:21 -07:00
John Johansen
c29bceb396 Fix execve behavior apparmor for PR_{GET,SET}_NO_NEW_PRIVS
Add support for AppArmor to explicitly fail requested domain transitions
if NO_NEW_PRIVS is set and the task is not unconfined.

Transitions from unconfined are still allowed because this always results
in a reduction of privileges.

Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>

v18: new acked-by, new description
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-04-14 11:13:18 +10:00
Andy Lutomirski
259e5e6c75 Add PR_{GET,SET}_NO_NEW_PRIVS to prevent execve from granting privs
With this change, calling
  prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)
disables privilege granting operations at execve-time.  For example, a
process will not be able to execute a setuid binary to change their uid
or gid if this bit is set.  The same is true for file capabilities.

Additionally, LSM_UNSAFE_NO_NEW_PRIVS is defined to ensure that
LSMs respect the requested behavior.

To determine if the NO_NEW_PRIVS bit is set, a task may call
  prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
It returns 1 if set and 0 if it is not set. If any of the arguments are
non-zero, it will return -1 and set errno to -EINVAL.
(PR_SET_NO_NEW_PRIVS behaves similarly.)

This functionality is desired for the proposed seccomp filter patch
series.  By using PR_SET_NO_NEW_PRIVS, it allows a task to modify the
system call behavior for itself and its child tasks without being
able to impact the behavior of a more privileged task.

Another potential use is making certain privileged operations
unprivileged.  For example, chroot may be considered "safe" if it cannot
affect privileged tasks.

Note, this patch causes execve to fail when PR_SET_NO_NEW_PRIVS is
set and AppArmor is in use.  It is fixed in a subsequent patch.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Will Drewry <wad@chromium.org>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>

v18: updated change desc
v17: using new define values as per 3.4
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-04-14 11:13:18 +10:00
John Johansen
0421ea91dd apparmor: Fix change_onexec when called from a confined task
Fix failure in aa_change_onexec api when the request is made from a confined
task.  This failure was caused by two problems

 The AA_MAY_ONEXEC perm was not being mapped correctly for this case.

 The executable name was being checked as second time instead of using the
 requested onexec profile name, which may not be the same as the exec
 profile name. This mistake can not be exploited to grant extra permission
 because of the above flaw where the ONEXEC permission was not being mapped
 so it will not be granted.

BugLink: http://bugs.launchpad.net/bugs/963756

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
2012-03-28 01:00:05 +11:00
John Johansen
57fa1e1809 AppArmor: Move path failure information into aa_get_name and rename
Move the path name lookup failure messages into the main path name lookup
routine, as the information is useful in more than just aa_path_perm.

Also rename aa_get_name to aa_path_name as it is not getting a reference
counted object with a corresponding put fn.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Kees Cook <kees@ubuntu.com>
2012-03-14 06:15:25 -07:00
Linus Torvalds
95b6886526 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (54 commits)
  tpm_nsc: Fix bug when loading multiple TPM drivers
  tpm: Move tpm_tis_reenable_interrupts out of CONFIG_PNP block
  tpm: Fix compilation warning when CONFIG_PNP is not defined
  TOMOYO: Update kernel-doc.
  tpm: Fix a typo
  tpm_tis: Probing function for Intel iTPM bug
  tpm_tis: Fix the probing for interrupts
  tpm_tis: Delay ACPI S3 suspend while the TPM is busy
  tpm_tis: Re-enable interrupts upon (S3) resume
  tpm: Fix display of data in pubek sysfs entry
  tpm_tis: Add timeouts sysfs entry
  tpm: Adjust interface timeouts if they are too small
  tpm: Use interface timeouts returned from the TPM
  tpm_tis: Introduce durations sysfs entry
  tpm: Adjust the durations if they are too small
  tpm: Use durations returned from TPM
  TOMOYO: Enable conditional ACL.
  TOMOYO: Allow using argv[]/envp[] of execve() as conditions.
  TOMOYO: Allow using executable's realpath and symlink's target as conditions.
  TOMOYO: Allow using owner/group etc. of file objects as conditions.
  ...

Fix up trivial conflict in security/tomoyo/realpath.c
2011-07-27 19:26:38 -07:00
John Johansen
04fdc099f9 AppArmor: Fix reference to rcu protected pointer outside of rcu_read_lock
The pointer returned from tracehook_tracer_task() is only valid inside
the rcu_read_lock.  However the tracer pointer obtained is being passed
to aa_may_ptrace outside of the rcu_read_lock critical section.

Mover the aa_may_ptrace test into the rcu_read_lock critical section, to
fix this.

Kernels affected: 2.6.36 - 3.0

Reported-by: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org
Signed-off-by: John Johansen <john.johansen@canonical.com>
2011-06-29 02:02:03 +01:00
Tejun Heo
06d984737b ptrace: s/tracehook_tracer_task()/ptrace_parent()/
tracehook.h is on the way out.  Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:29 +02:00
James Morris
77c80e6b2f AppArmor: fix build warnings for non-const use of get_task_cred
Fix build warnings for non-const use of get_task_cred.

Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:49:00 +10:00
John Johansen
898127c34e AppArmor: functions for domain transitions
AppArmor routines for controling domain transitions, which can occur at
exec or through self directed change_profile/change_hat calls.

Unconfined tasks are checked at exec against the profiles in the confining
profile namespace to determine if a profile should be attached to the task.

Confined tasks execs are controlled by the profile which provides rules
determining which execs are allowed and if so which profiles should be
transitioned to.

Self directed domain transitions allow a task to request transition
to a given profile.  If the transition is allowed then the profile will
be applied, either immeditately or at exec time depending on the request.
Immeditate self directed transitions have several security limitations
but have uses in setting up stub transition profiles and other limited
cases.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-08-02 15:35:14 +10:00