linux/fs/proc
Eric W. Biederman 22d917d80e userns: Rework the user_namespace adding uid/gid mapping support
- Convert the old uid mapping functions into compatibility wrappers
- Add a uid/gid mapping layer from user space uid and gids to kernel
  internal uids and gids that is extent based for simplicty and speed.
  * Working with number space after mapping uids/gids into their kernel
    internal version adds only mapping complexity over what we have today,
    leaving the kernel code easy to understand and test.
- Add proc files /proc/self/uid_map /proc/self/gid_map
  These files display the mapping and allow a mapping to be added
  if a mapping does not exist.
- Allow entering the user namespace without a uid or gid mapping.
  Since we are starting with an existing user our uids and gids
  still have global mappings so are still valid and useful they just don't
  have local mappings.  The requirement for things to work are global uid
  and gid so it is odd but perfectly fine not to have a local uid
  and gid mapping.
  Not requiring global uid and gid mappings greatly simplifies
  the logic of setting up the uid and gid mappings by allowing
  the mappings to be set after the namespace is created which makes the
  slight weirdness worth it.
- Make the mappings in the initial user namespace to the global
  uid/gid space explicit.  Today it is an identity mapping
  but in the future we may want to twist this for debugging, similar
  to what we do with jiffies.
- Document the memory ordering requirements of setting the uid and
  gid mappings.  We only allow the mappings to be set once
  and there are no pointers involved so the requirments are
  trivial but a little atypical.

Performance:

In this scheme for the permission checks the performance is expected to
stay the same as the actuall machine instructions should remain the same.

The worst case I could think of is ls -l on a large directory where
all of the stat results need to be translated with from kuids and
kgids to uids and gids.  So I benchmarked that case on my laptop
with a dual core hyperthread Intel i5-2520M cpu with 3M of cpu cache.

My benchmark consisted of going to single user mode where nothing else
was running. On an ext4 filesystem opening 1,000,000 files and looping
through all of the files 1000 times and calling fstat on the
individuals files.  This was to ensure I was benchmarking stat times
where the inodes were in the kernels cache, but the inode values were
not in the processors cache.  My results:

v3.4-rc1:         ~= 156ns (unmodified v3.4-rc1 with user namespace support disabled)
v3.4-rc1-userns-: ~= 155ns (v3.4-rc1 with my user namespace patches and user namespace support disabled)
v3.4-rc1-userns+: ~= 164ns (v3.4-rc1 with my user namespace patches and user namespace support enabled)

All of the configurations ran in roughly 120ns when I performed tests
that ran in the cpu cache.

So in summary the performance impact is:
1ns improvement in the worst case with user namespace support compiled out.
8ns aka 5% slowdown in the worst case with user namespace support compiled in.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2012-04-26 02:01:39 -07:00
..
array.c procfs: fix /proc/statm 2012-03-28 17:14:35 -07:00
base.c userns: Rework the user_namespace adding uid/gid mapping support 2012-04-26 02:01:39 -07:00
cmdline.c
consoles.c console: rename acquire/release_console_sem() to console_lock/unlock() 2011-01-26 10:50:06 +10:00
cpuinfo.c
devices.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
generic.c switch procfs to umode_t use 2012-01-03 22:54:56 -05:00
inode.c Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
internal.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl 2012-03-23 18:08:58 -07:00
interrupts.c
Kconfig kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT 2011-01-20 17:02:05 -08:00
kcore.c fs/proc/kcore.c: make get_sparsemem_vmemmap_info() static 2012-03-23 16:58:42 -07:00
kmsg.c procfs: Use generic_file_llseek in /proc/kmsg 2010-04-09 16:35:41 +02:00
loadavg.c sched, timers: cleanup avenrun users 2009-05-15 15:32:45 +02:00
Makefile ns: proc files for namespace naming policy. 2011-05-10 14:31:44 -07:00
meminfo.c fs/proc/meminfo.c: fix compilation error 2011-12-09 07:50:28 -08:00
mmu.c
namespaces.c fs/proc/namespaces.c: prevent crash when ns_entries[] is empty 2012-03-28 17:14:37 -07:00
nommu.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
page.c pagemap: export KPF_THP 2012-03-21 17:54:57 -07:00
proc_devtree.c of/flattree: Drop an uninteresting message to pr_debug level 2011-03-02 13:45:18 -07:00
proc_net.c switch procfs to umode_t use 2012-01-03 22:54:56 -05:00
proc_sysctl.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl 2012-03-23 18:08:58 -07:00
proc_tty.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
root.c procfs: add hidepid= and gid= mount options 2012-01-10 16:30:54 -08:00
softirqs.c proc: use seq_puts()/seq_putc() where possible 2011-01-13 08:03:16 -08:00
stat.c procfs: add num_to_str() to speed up /proc/stat 2012-03-23 16:58:42 -07:00
task_mmu.c pagemap: remove remaining unneeded spin_lock() 2012-03-29 14:06:43 -07:00
task_nommu.c procfs: mark thread stack correctly in proc/<pid>/maps 2012-03-21 17:54:58 -07:00
uptime.c Merge branch 'sched/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into cputime-tip 2011-12-19 19:23:15 +01:00
version.c
vmcore.c fadump: Introduce cleanup routine to invalidate /proc/vmcore. 2012-02-23 10:50:02 +11:00