linux/fs
Eric W. Biederman da362b09e4 umount: Do not allow unmounting rootfs.
Andrew Vagin <avagin@parallels.com> writes:

> #define _GNU_SOURCE
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sched.h>
> #include <unistd.h>
> #include <sys/mount.h>
>
> int main(int argc, char **argv)
> {
> 	int fd;
>
> 	fd = open("/proc/self/ns/mnt", O_RDONLY);
> 	if (fd < 0)
> 	   return 1;
> 	   while (1) {
> 	   	 if (umount2("/", MNT_DETACH) ||
> 		        setns(fd, CLONE_NEWNS))
> 					break;
> 					}
>
> 					return 0;
> }
>
> root@ubuntu:/home/avagin# gcc -Wall nsenter.c -o nsenter
> root@ubuntu:/home/avagin# strace ./nsenter
> execve("./nsenter", ["./nsenter"], [/* 22 vars */]) = 0
> ...
> open("/proc/self/ns/mnt", O_RDONLY)     = 3
> umount("/", MNT_DETACH)                 = 0
> setns(3, 131072)                        = 0
> umount("/", MNT_DETACH
>
causes:

> [  260.548301] ------------[ cut here ]------------
> [  260.550941] kernel BUG at /build/buildd/linux-3.13.0/fs/pnode.c:372!
> [  260.552068] invalid opcode: 0000 [#1] SMP
> [  260.552068] Modules linked in: xt_CHECKSUM iptable_mangle xt_tcpudp xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison iptable_filter ip_tables x_tables crct10dif_pclmul crc32_pclmul ghash_clmulni_intel binfmt_misc nfsd auth_rpcgss nfs_acl aesni_intel nfs lockd aes_x86_64 sunrpc fscache lrw gf128mul glue_helper ablk_helper cryptd serio_raw ppdev parport_pc lp parport btrfs xor raid6_pq libcrc32c psmouse floppy
> [  260.552068] CPU: 0 PID: 1723 Comm: nsenter Not tainted 3.13.0-30-generic #55-Ubuntu
> [  260.552068] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [  260.552068] task: ffff8800376097f0 ti: ffff880074824000 task.ti: ffff880074824000
> [  260.552068] RIP: 0010:[<ffffffff811e9483>]  [<ffffffff811e9483>] propagate_umount+0x123/0x130
> [  260.552068] RSP: 0018:ffff880074825e98  EFLAGS: 00010246
> [  260.552068] RAX: ffff88007c741140 RBX: 0000000000000002 RCX: ffff88007c741190
> [  260.552068] RDX: ffff88007c741190 RSI: ffff880074825ec0 RDI: ffff880074825ec0
> [  260.552068] RBP: ffff880074825eb0 R08: 00000000000172e0 R09: ffff88007fc172e0
> [  260.552068] R10: ffffffff811cc642 R11: ffffea0001d59000 R12: ffff88007c741140
> [  260.552068] R13: ffff88007c741140 R14: ffff88007c741140 R15: 0000000000000000
> [  260.552068] FS:  00007fd5c7e41740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
> [  260.552068] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  260.552068] CR2: 00007fd5c7968050 CR3: 0000000070124000 CR4: 00000000000406f0
> [  260.552068] Stack:
> [  260.552068]  0000000000000002 0000000000000002 ffff88007c631000 ffff880074825ed8
> [  260.552068]  ffffffff811dcfac ffff88007c741140 0000000000000002 ffff88007c741160
> [  260.552068]  ffff880074825f38 ffffffff811dd12b ffffffff811cc642 0000000075640000
> [  260.552068] Call Trace:
> [  260.552068]  [<ffffffff811dcfac>] umount_tree+0x20c/0x260
> [  260.552068]  [<ffffffff811dd12b>] do_umount+0x12b/0x300
> [  260.552068]  [<ffffffff811cc642>] ? final_putname+0x22/0x50
> [  260.552068]  [<ffffffff811cc849>] ? putname+0x29/0x40
> [  260.552068]  [<ffffffff811dd88c>] SyS_umount+0xdc/0x100
> [  260.552068]  [<ffffffff8172aeff>] tracesys+0xe1/0xe6
> [  260.552068] Code: 89 50 08 48 8b 50 08 48 89 02 49 89 45 08 e9 72 ff ff ff 0f 1f 44 00 00 4c 89 e6 4c 89 e7 e8 f5 f6 ff ff 48 89 c3 e9 39 ff ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 66 66 66 66 90 55 b8 01
> [  260.552068] RIP  [<ffffffff811e9483>] propagate_umount+0x123/0x130
> [  260.552068]  RSP <ffff880074825e98>
> [  260.611451] ---[ end trace 11c33d85f1d4c652 ]--

Which in practice is totally uninteresting.  Only the global root user can
do it, and it is just a stupid thing to do.

However that is no excuse to allow a silly way to oops the kernel.

We can avoid this silly problem by setting MNT_LOCKED on the rootfs
mount point and thus avoid needing any special cases in the unmount
code.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2014-12-02 10:46:49 -06:00
..
9p 9p: switch to %p[dD] 2014-10-09 02:39:04 -04:00
adfs adfs: add __printf verification, fix format/argument mismatches 2014-08-08 15:57:24 -07:00
affs fs/affs: remove redundant sys_tz declarations 2014-10-14 02:18:22 +02:00
afs Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-10-13 16:23:15 +02:00
autofs4 autofs4: d_manage() should return -EISDIR when appropriate in rcu-walk mode. 2014-10-14 02:18:16 +02:00
befs fs/befs/btree.c: remove typedef befs_btree_node 2014-10-14 02:18:20 +02:00
bfs fs/bfs: use bfs prefix for dump_imap 2014-08-08 15:57:24 -07:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2014-11-23 11:16:36 -08:00
cachefiles FS-Cache fixes 2014-10-14 08:40:15 +02:00
ceph ceph: fix flush tid comparision 2014-11-13 22:19:05 +03:00
cifs [CIFS] Remove obsolete comment 2014-10-17 17:17:12 -05:00
coda fs/coda: use linux/uaccess.h 2014-08-08 15:57:20 -07:00
configfs
cramfs fs/cramfs/inode.c: use linux/uaccess.h 2014-08-08 15:57:25 -07:00
debugfs fs: debugfs: remove trailing whitespace 2014-07-09 16:58:21 -07:00
devpts
dlm dlm: fix missing endian conversion of rcom_status flags 2014-10-14 15:11:48 -05:00
ecryptfs fs: limit filesystem stacking depth 2014-10-24 00:14:39 +02:00
efivarfs
efs fs/efs/namei.c: return is not a function 2014-08-08 15:57:18 -07:00
exofs Boaz Harrosh - Fix broken email address 2014-10-19 20:22:32 +03:00
exportfs
ext2 percpu_counter: add @gfp to percpu_counter_init() 2014-09-08 09:51:29 +09:00
ext3 ext3: Don't check quota format when there are no quota files 2014-10-22 09:02:48 +02:00
ext4 ext4: make ext4_ext_convert_to_initialized() return proper number of blocks 2014-10-30 10:53:17 -04:00
f2fs f2fs: support volatile operations for transient data 2014-10-07 11:54:41 -07:00
fat fat: remove redundant sys_tz declaration 2014-10-14 02:18:20 +02:00
freevxfs
fscache fs/fscache/object-list.c: use __seq_open_private() 2014-10-13 17:52:21 +01:00
fuse vfs: Make d_invalidate return void 2014-10-09 02:38:57 -04:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-10-13 11:28:42 +02:00
hfs fs/hfs/hfs_fs.h: remove redundant sys_tz declaration 2014-10-14 02:18:20 +02:00
hfsplus Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-06-12 10:30:18 -07:00
hostfs hostfs: support rename flags 2014-08-07 14:40:09 -04:00
hpfs fs/hpfs/dnode.c: fix suspect code indent 2014-08-08 15:57:22 -07:00
hppfs
hugetlbfs
isofs isofs: avoid unused function warning 2014-11-19 13:09:37 -05:00
jbd fs, jbd: use a more generic hash function 2014-10-22 10:02:04 +02:00
jbd2 jbd2: use a better hash function for the revoke table 2014-10-30 10:53:17 -04:00
jffs2 [jffs2] kill wbuf_queued/wbuf_dwork_lock 2014-10-09 02:39:01 -04:00
jfs Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-10-13 16:23:15 +02:00
kernfs vfs: Remove unnecessary calls of check_submounts_and_drop 2014-10-09 02:38:56 -04:00
lockd File locking related changes for v3.18 (pile #1) 2014-10-11 13:21:34 -04:00
logfs fs/logfs/readwrite.c: kernel-doc warning fixes 2014-08-06 18:01:12 -07:00
minix minix zmap block counts calculation fix 2014-08-08 15:57:20 -07:00
ncpfs fs/ncpfs/dir.c: remove redundant sys_tz declaration 2014-10-14 02:18:16 +02:00
nfs NFS: Don't try to reclaim delegation open state if recovery failed 2014-11-12 17:19:04 -05:00
nfs_common lockd: move lockd's grace period handling into its own module 2014-09-17 16:33:11 -04:00
nfsd nfsd4: fix crash on unknown operation number 2014-10-23 13:39:51 -04:00
nilfs2 nilfs2: improve the performance of fdatasync() 2014-10-14 02:18:20 +02:00
nls
notify fanotify: fix notification of groups with inode & mount marks 2014-11-13 16:17:06 -08:00
ntfs NTFS: Bump version to 2.1.31. 2014-10-16 12:53:35 +01:00
ocfs2 fix breakage in o2net_send_tcp_msg() 2014-11-05 15:21:18 -05:00
omfs FS/OMFS: block number sanity check during fill_super operation 2014-10-14 02:18:22 +02:00
openpromfs
overlayfs ovl: ovl_dir_fsync() cleanup 2014-11-20 16:40:02 +01:00
proc mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared 2014-10-14 02:18:28 +02:00
pstore pstore: Fix duplicate {console,ftrace}-efi entries 2014-10-15 13:51:33 -07:00
qnx4
qnx6 fs/qnx6: update debugging to current functions 2014-08-08 15:57:26 -07:00
quota quota: Properly return errors from dquot_writeback_dquots() 2014-10-22 09:08:03 +02:00
ramfs fs/ramfs/file-nommu.c: replace count*size kzalloc by kcalloc 2014-08-08 15:57:18 -07:00
reiserfs fs/reiserfs/journal.c: fix sparse context imbalance warning 2014-10-14 02:18:20 +02:00
romfs fs/romfs/super.c: add blank line after declarations 2014-08-08 15:57:25 -07:00
squashfs fs/squashfs/super.c: logging cleanup 2014-08-06 18:01:13 -07:00
sysfs
sysv
ubifs UBIFS: Fix trivial typo in power_cut_emulated() 2014-09-30 09:29:44 +03:00
udf udf: Fix loading of special inodes 2014-10-09 13:06:14 +02:00
ufs fs/ufs/balloc.c: remove unused variable 2014-10-14 02:18:20 +02:00
xfs xfs: track bulkstat progress by agino 2014-11-07 08:33:52 +11:00
aio.c percpu_ref: add PERCPU_REF_INIT_* flags 2014-09-24 13:31:50 -04:00
anon_inodes.c
attr.c
bad_inode.c bad_inode: add ->rename2() 2014-08-07 14:40:09 -04:00
binfmt_aout.c handle suicide on late failure exits in execve() in search_binary_handler() 2014-10-09 02:39:00 -04:00
binfmt_elf_fdpic.c handle suicide on late failure exits in execve() in search_binary_handler() 2014-10-09 02:39:00 -04:00
binfmt_elf.c handle suicide on late failure exits in execve() in search_binary_handler() 2014-10-09 02:39:00 -04:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c binfmt_misc: work around gcc-4.9 warning 2014-10-14 02:18:16 +02:00
binfmt_script.c
binfmt_som.c
block_dev.c Return short read or 0 at end of a raw device, not EIO 2014-10-31 06:33:26 -04:00
buffer.c fs: clarify rate limit suppressed buffer I/O errors 2014-10-21 13:55:11 -06:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c Bluetooth: Move HCI socket definitions into its own header file 2014-07-11 13:53:04 +03:00
compat.c vfs: move getname() from callers to do_mount() 2014-10-09 02:39:16 -04:00
coredump.c coredump: add %i/%I in core_pattern to report the tid of the crashed thread 2014-10-14 02:18:21 +02:00
dcache.c vfs: fix reference leak in d_prune_aliases() 2014-11-19 13:07:20 -05:00
dcookies.c
direct-io.c fuse: honour max_read and max_write in direct_io mode 2014-09-26 21:16:51 -04:00
drop_caches.c
eventfd.c
eventpoll.c eventpoll: fix uninitialized variable in epoll_ctl 2014-09-10 15:42:12 -07:00
exec.c handle suicide on late failure exits in execve() in search_binary_handler() 2014-10-09 02:39:00 -04:00
fcntl.c security: make security_file_set_fowner, f_setown and __f_setown void return 2014-09-09 16:01:36 -04:00
fhandle.c
file_table.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-10-13 11:28:42 +02:00
file.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-10-13 15:44:12 +02:00
filesystems.c
fs_pin.c make fs/{namespace,super}.c forget about acct.h 2014-08-07 14:40:09 -04:00
fs_struct.c
fs-writeback.c sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
inode.c mm: allow drivers to prevent new writable mappings 2014-08-08 15:57:31 -07:00
internal.h vfs: export __inode_permission() to modules 2014-10-24 00:14:35 +02:00
ioctl.c
Kconfig overlay filesystem 2014-10-24 00:14:38 +02:00
Kconfig.binfmt
libfs.c locks: plumb a "priv" pointer into the setlease routines 2014-10-07 14:06:12 -04:00
locks.c locks: flock_make_lock should return a struct file_lock (or PTR_ERR) 2014-10-07 14:06:13 -04:00
Makefile ovl: rename filesystem type to "overlay" 2014-11-20 16:39:59 +01:00
mbcache.c fs/mbcache: replace __builtin_log2() with ilog2() 2014-06-25 22:08:29 -04:00
mount.h vfs: Add a function to lazily unmount all mounts from any dentry. 2014-10-09 02:38:55 -04:00
mpage.c vfs: guard end of device for mpage interface 2014-10-09 22:25:53 -04:00
namei.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-11-02 10:28:43 -08:00
namespace.c umount: Do not allow unmounting rootfs. 2014-12-02 10:46:49 -06:00
no-block.c
open.c vfs: add i_op->dentry_open() 2014-10-24 00:14:35 +02:00
pipe.c
pnode.c get rid of propagate_umount() mistakenly treating slaves as busy. 2014-08-30 18:31:41 -04:00
pnode.h
posix_acl.c
proc_namespace.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
read_write.c cachefiles_write_page(): switch to __kernel_write() 2014-10-09 02:39:05 -04:00
readdir.c
select.c
seq_file.c fs/seq_file: fallback to vmalloc allocation 2014-07-03 09:21:54 -07:00
signalfd.c
splice.c vfs: export do_splice_direct() to modules 2014-10-24 00:14:35 +02:00
stack.c fs: fix comment for 'CONFIG_LBADF' 2014-08-26 09:35:56 +02:00
stat.c
statfs.c
super.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-10-13 11:28:42 +02:00
sync.c Export sync_filesystem() for modular ->remount_fs() use 2014-09-05 08:16:21 -07:00
timerfd.c timerfd: Remove an always true check 2014-08-27 11:17:48 +02:00
utimes.c
xattr.c vfs: Deduplicate code shared by xattr system calls operating on paths 2014-10-12 17:09:10 -04:00