linux/fs
Vladimir Sementsov-Ogievskiy 88e4607034 coredump: require O_WRONLY instead of O_RDWR
The motivation for this patch has been to enable using a stricter
apparmor profile to prevent programs from reading any coredump in the
system.

However, this became something else. The following details are based on
Christian's and Linus' archeology into the history of the number "2" in
the coredump handling code.

To make sure we're not accidentally introducing some subtle behavioral
change into the coredump code we set out on a voyage into the depths of
history.git to figure out why this was O_RDWR in the first place.

Coredump handling was introduced over 30 years ago in commit
ddc733f452e0 ("[PATCH] Linux-0.97 (August 1, 1992)").
The original code used O_WRONLY:

    open_namei("core",O_CREAT | O_WRONLY | O_TRUNC,0600,&inode,NULL)

However, this changed in 1993 and starting with commit
9cb9f18b5d26 ("[PATCH] Linux-0.99.10 (June 7, 1993)") the coredump code
suddenly used the constant "2":

    open_namei("core",O_CREAT | 2 | O_TRUNC,0600,&inode,NULL)

This was curious as in the same commit the kernel switched from
constants to proper defines in other places such as KERNEL_DS and
USER_DS and O_RDWR did already exist.

So why was "2" used? It turns out that open_namei() - an early version
of what later turned into filp_open() - didn't accept O_RDWR.

A semantic quirk of the open() uapi is the definition of the O_RDONLY
flag. It would seem natural to define:

    #define O_RDWR (O_RDONLY | O_WRONLY)

but that isn't possible because:

    #define O_RDONLY 0

This makes O_RDONLY effectively meaningless when passed to the kernel.
In other words, there has never been a way - until O_PATH at least - to
open a file without any permission; O_RDONLY was always implied on the
uapi side while the kernel does in fact allow opening files without
permissions.
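
As a minimal sketch of the quirk described above, the following compile-time checks assume the Linux encoding of the access modes (O_RDONLY is 0, O_WRONLY is 1, O_RDWR is 2); POSIX does not guarantee these values. The helper sketch_access_mode() is a hypothetical name used only for illustration.

```c
#include <assert.h>  /* static_assert (C11) */
#include <fcntl.h>   /* O_RDONLY, O_WRONLY, O_RDWR, O_ACCMODE */

/* Assumption: Linux encodes the access mode as a small integer,
 * not as two independent "read" and "write" bits. */
static_assert(O_RDONLY == 0, "O_RDONLY carries no bit at all");
static_assert(O_WRONLY == 1, "O_WRONLY is the value 1");
static_assert(O_RDWR == 2, "O_RDWR is the value 2");

/* OR-ing O_RDONLY into a flag word changes nothing, so the
 * "natural" definition of O_RDWR cannot work. */
static_assert((O_RDONLY | O_WRONLY) == O_WRONLY, "O_RDONLY is a no-op");
static_assert((O_RDONLY | O_WRONLY) != O_RDWR, "not O_RDWR");

/* Hypothetical helper: extract the access mode from an open() flag word. */
static inline int sketch_access_mode(int flags)
{
        return flags & O_ACCMODE;
}
```

Because O_RDONLY contributes no bits, sketch_access_mode(O_CREAT | O_TRUNC) and sketch_access_mode(O_CREAT | O_RDONLY | O_TRUNC) return the same value.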

The trouble comes when trying to map the uapi flags onto the
corresponding file mode flags FMODE_{READ,WRITE}. This mapping still
happens today and is still causing issues (we ran into this during
additions for openat2(), for example).

So the special value "3" was used to indicate that the file was opened
for special access:

    f->f_flags = flag = flags;
    f->f_mode = (flag+1) & O_ACCMODE;
    if (f->f_mode)
            flag++;

This allowed the file mode to be set to FMODE_READ | FMODE_WRITE, mapping
the O_{RDONLY,WRONLY,RDWR} flags onto the FMODE_{READ,WRITE} flags. The
special access then required read-write permissions and 0 was used to
access symlinks.
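
The arithmetic of the snippet above can be checked in isolation. This is a sketch only: the SKETCH_FMODE_* constants and sketch_open_flags_to_fmode() are illustrative stand-ins, not kernel definitions, and the Linux values of the O_* access modes are assumed.

```c
#include <assert.h>  /* static_assert (C11) */
#include <fcntl.h>   /* O_RDONLY, O_WRONLY, O_RDWR, O_ACCMODE */

/* Assumption: Linux access-mode encoding. */
static_assert(O_RDONLY == 0 && O_WRONLY == 1 && O_RDWR == 2 &&
              O_ACCMODE == 3, "Linux access-mode encoding assumed");

/* Illustrative stand-ins for the kernel's FMODE_READ/FMODE_WRITE bits. */
#define SKETCH_FMODE_READ  1
#define SKETCH_FMODE_WRITE 2

/* Hypothetical helper mirroring "f->f_mode = (flag+1) & O_ACCMODE".
 * Adding 1 turns the 0/1/2 access mode into a read/write bitmask;
 * the special value 3 wraps around to 0 (no mode bits). */
static inline int sketch_open_flags_to_fmode(int flags)
{
        return (flags + 1) & O_ACCMODE;
}
```

With these definitions, O_RDONLY maps to the read bit, O_WRONLY to the write bit, O_RDWR to both, and the special value 3 to no mode bits at all.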

But back when ddc733f452e0 ("[PATCH] Linux-0.97 (August 1, 1992)") added
coredump handling open_namei() took the FMODE_{READ,WRITE} flags as an
argument. So the coredump handling introduced in
ddc733f452e0 ("[PATCH] Linux-0.97 (August 1, 1992)") was buggy because
O_WRONLY shouldn't have been passed. Since O_WRONLY is 1 but
open_namei() took FMODE_{READ,WRITE}, it was passed FMODE_READ by
accident.

So 9cb9f18b5d26 ("[PATCH] Linux-0.99.10 (June 7, 1993)") was a bugfix
for this and the 2 didn't really mean O_RDWR, it meant FMODE_WRITE which
was correct.

The clue is that FMODE_{READ,WRITE} didn't exist yet and thus a raw "2"
value was passed.

Fast forward over 5 years to around 2.2.4pre4 (February 16, 1999), when
this code was changed to:

    -       dentry = open_namei(corefile,O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);
    ...
    +       file = filp_open(corefile,O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);

At this point the raw "2" should have become O_WRONLY again as
filp_open() didn't take FMODE_{READ,WRITE} but O_{RDONLY,WRONLY,RDWR}.
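
To illustrate why the same raw "2" means different things to the two interfaces, here is a small sketch. It assumes, as described above, that the early open_namei() treated its argument as a read/write bitmask while filp_open() treats it as O_* flags. All sketch_* names and SKETCH_* constants are hypothetical and the Linux values of the O_* access modes are assumed.

```c
#include <assert.h>  /* static_assert (C11) */
#include <fcntl.h>   /* O_WRONLY, O_RDWR, O_ACCMODE */

/* Assumption: Linux access-mode encoding. */
static_assert(O_WRONLY == 1 && O_RDWR == 2 && O_ACCMODE == 3,
              "Linux access-mode encoding assumed");

/* Illustrative stand-ins for the read/write mode bits. */
#define SKETCH_MODE_READ  1
#define SKETCH_MODE_WRITE 2

/* Early open_namei() as described in the text: the argument already
 * is a read/write bitmask, so it is used as-is. */
static inline int sketch_mode_from_mode_bits(int arg)
{
        return arg & (SKETCH_MODE_READ | SKETCH_MODE_WRITE);
}

/* filp_open()-style interface: the argument is an O_* flag word that
 * must first be converted into a read/write bitmask. */
static inline int sketch_mode_from_open_flags(int arg)
{
        return (arg + 1) & O_ACCMODE;
}
```

Under the first interpretation a raw 2 requests write access only and O_WRONLY accidentally requests read access; under the second, a raw 2 is O_RDWR and requests both read and write access.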

Another 17 years later, the code was changed again cementing the mistake
and making it almost impossible to detect when commit
378c6520e7 ("fs/coredump: prevent fsuid=0 dumps into user-controlled directories")
replaced the raw "2" with O_RDWR.

And now, here we are with this patch that sent us on a quest to answer
the big questions in life such as "Why are coredump files opened with
O_RDWR?" and "Is it safe to just use O_WRONLY?".

So with this commit we're reintroducing O_WRONLY and bringing this
code back to its original state when it was first introduced in commit
ddc733f452e0 ("[PATCH] Linux-0.97 (August 1, 1992)") over 30 years ago.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230420120409.602576-1-vsementsov@yandex-team.ru>
[brauner@kernel.org: completely rewritten commit message]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-05-17 09:13:44 +02:00
..
9p Including fixes from netfilter. 2023-05-05 19:12:01 -07:00
adfs fs: port ->setattr() to pass mnt_idmap 2023-01-19 09:24:02 +01:00
affs for-6.3/dio-2023-02-16 2023-02-20 14:10:36 -08:00
afs Five hotfixes. Three are cc:stable, two pertain to post-6.3 merge window 2023-05-06 11:25:03 -07:00
autofs fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
befs
bfs fs: port inode_init_owner() to mnt_idmap 2023-01-19 09:24:28 +01:00
btrfs for-6.4-rc1-tag 2023-05-12 17:10:32 -05:00
cachefiles fs/cachefiles: simplify one-level sysctl registration for cachefiles_sysctls 2023-04-13 11:49:35 -07:00
ceph A few filesystem improvements, with a rather nasty use-after-free fix 2023-05-04 14:48:02 -07:00
cifs cifs: release leases for deferred close handles when freezing 2023-05-10 17:48:30 -05:00
coda sysctl-6.4-rc1 2023-04-27 16:52:33 -07:00
configfs fs: consolidate duplicate dt_type helpers 2023-04-03 09:23:54 +02:00
cramfs fs/cramfs/inode.c: initialize file_ra_state 2023-03-02 21:54:23 -08:00
crypto fscrypt: optimize fscrypt_initialize() 2023-04-06 11:16:39 -07:00
debugfs ARM: SoC drivers for 6.3 2023-02-27 10:04:49 -08:00
devpts devpts: simplify two-level sysctl registration for pty_kern_table 2023-03-13 12:36:34 +01:00
dlm Networking changes for 6.4. 2023-04-26 16:07:23 -07:00
ecryptfs fs: drop unused posix acl handlers 2023-03-06 09:57:12 +01:00
efivarfs A healthy mix of EFI contributions this time: 2023-02-23 14:41:48 -08:00
efs
erofs Changes since last update: 2023-04-24 14:25:39 -07:00
exfat Description for this pull request: 2023-03-01 08:42:27 -08:00
exportfs fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
ext2 \n 2023-04-26 09:07:46 -07:00
ext4 ext4: bail out of ext4_xattr_ibody_get() fails for any reason 2023-05-13 18:05:05 -04:00
f2fs f2fs update for 6.4-rc1 2023-04-26 09:42:10 -07:00
fat There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
freevxfs There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
fscache fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work() 2023-01-30 12:51:54 +00:00
fuse Driver core changes for 6.4-rc1 2023-04-27 11:53:57 -07:00
gfs2 gfs2: Don't deref jdesc in evict 2023-05-10 17:15:18 +02:00
hfs There is no particular theme here - mainly quick hits all over the tree. 2023-02-23 17:55:40 -08:00
hfsplus fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() 2023-04-12 11:29:32 +02:00
hostfs um: hostfs: define our own API boundary 2023-04-20 23:04:40 +02:00
hpfs fs: port ->rename() to pass mnt_idmap 2023-01-19 09:24:26 +01:00
hugetlbfs mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() 2023-04-21 14:52:05 -07:00
iomap New code for 6.4: 2023-04-29 10:35:48 -07:00
isofs - hfs and hfsplus kmap API modernization from Fabio Francesco 2022-10-12 11:00:22 -07:00
jbd2 jdb2: Don't refuse invalidation of already invalidated buffers 2023-04-14 19:38:50 -04:00
jffs2 jffs2: reduce stack usage in jffs2_build_xattr_subsystem() 2023-05-15 12:43:15 +02:00
jfs write_one_page series 2023-04-24 19:20:27 -07:00
kernfs Driver core changes for 6.4-rc1 2023-04-27 11:53:57 -07:00
ksmbd 9 smb3 client fixes, mostly DFS or reconnect related 2023-05-07 10:46:21 -07:00
lockd NFSD 6.4 Release Notes 2023-04-29 11:04:14 -07:00
minix Merge branch 'work.minix' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2023-02-24 19:01:15 -08:00
netfs - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
nfs nfs: fix another case of NULL/IS_ERR confusion wrt folio pointers 2023-05-09 10:22:13 -07:00
nfs_common NFSv4.2: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:52 -07:00
nfsd NFSD 6.4 Release Notes 2023-04-29 11:04:14 -07:00
nilfs2 nilfs2: do not write dirty data after degenerating to read-only 2023-05-06 10:10:07 -07:00
nls
notify inotify: Avoid reporting event with invalid wd 2023-04-25 12:36:55 +02:00
ntfs ntfs: simplfy one-level sysctl registration for ntfs_sysctls 2023-04-13 11:49:35 -07:00
ntfs3 driver ntfs3 for linux 6.4 2023-04-29 10:52:37 -07:00
ocfs2 Mainly singleton patches all over the place. Series of note are: 2023-04-27 19:57:00 -07:00
omfs fs: port inode_init_owner() to mnt_idmap 2023-01-19 09:24:28 +01:00
openpromfs
orangefs - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
overlayfs fs: fix incorrect fmode_t casts 2023-05-15 13:14:01 +02:00
proc sysctl: remove register_sysctl_paths() 2023-05-02 19:24:16 -07:00
pstore pstore update for v6.4-rc1 2023-04-27 17:03:40 -07:00
qnx4 qnx4: credit contributors in CREDITS 2023-03-14 12:56:30 -06:00
qnx6 qnx6: credit contributor and mark filesystem orphan 2023-03-14 12:56:30 -06:00
quota quota: mark PRINT_QUOTA_WARNING as BROKEN 2023-04-14 13:06:50 +02:00
ramfs mm, treewide: redefine MAX_ORDER sanely 2023-04-05 19:42:46 -07:00
reiserfs \n 2023-04-26 09:07:46 -07:00
romfs mm/nommu: factor out check for NOMMU shared mappings into is_nommu_shared_mapping() 2023-01-18 17:12:56 -08:00
smbfs_common SMB3.1.1: correct definition for app_instance_id create contexts 2023-05-02 09:23:23 -05:00
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-03 17:52:25 -08:00
sysfs
sysv sysv: switch to put_and_unmap_page() 2023-03-12 20:03:41 -04:00
tracefs fs: port ->mkdir() to pass mnt_idmap 2023-01-19 09:24:26 +01:00
ubifs ubifs: Fix memleak when insert_old_idx() failed 2023-04-23 23:36:38 +02:00
udf udf: use wrapper i_blocksize() in udf_discard_prealloc() 2023-03-13 11:16:16 +01:00
ufs ufs: don't flush page immediately for DIRSYNC directories 2023-03-28 16:20:14 -07:00
unicode unicode: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:54 -07:00
vboxsf fs: port ->rename() to pass mnt_idmap 2023-01-19 09:24:26 +01:00
verity fsverity: reject FS_IOC_ENABLE_VERITY on mode 3 fds 2023-04-11 19:23:23 -07:00
xfs xfs: bug fixes for 6.4-rc2 2023-05-11 16:51:11 -05:00
zonefs zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space 2023-03-30 20:56:02 +09:00
aio.c Merge branch 'mm-hotfixes-stable' into mm-stable 2023-02-10 15:34:48 -08:00
anon_inodes.c
attr.c nfs: use vfs setgid helper 2023-03-30 08:51:48 +02:00
bad_inode.c fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
binfmt_elf_fdpic.c ELF: fix all "Elf" typos 2023-04-08 13:45:37 -07:00
binfmt_elf_test.c
binfmt_elf.c Mainly singleton patches all over the place. Series of note are: 2023-04-27 19:57:00 -07:00
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2022-12-02 13:57:04 -08:00
binfmt_script.c
buffer.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
char_dev.c vfs: Replace all non-returning strlcpy with strscpy 2023-05-15 09:42:01 +02:00
compat_binfmt_elf.c
coredump.c coredump: require O_WRONLY instead of O_RDWR 2023-05-17 09:13:44 +02:00
d_path.c
dax.c fsdax: force clear dirty mark if CoW 2023-04-05 18:06:23 -07:00
dcache.c tmpfile API change 2022-10-10 19:45:17 -07:00
direct-io.c __blockdev_direct_IO(): get rid of submit_io callback 2023-03-05 20:27:41 -05:00
drop_caches.c
eventfd.c fs: use correct __poll_t type 2023-05-15 09:33:09 +02:00
eventpoll.c fs: use correct __poll_t type 2023-05-15 09:33:09 +02:00
exec.c tracing updates for 6.4: 2023-04-28 15:57:53 -07:00
fcntl.c fs.idmapped.v6.3 2023-02-20 11:53:11 -08:00
fhandle.c
file_table.c filelock: move file locking definitions to separate header file 2023-01-11 06:52:32 -05:00
file.c fs: prevent out-of-bounds array speculation when closing a file descriptor 2023-03-09 22:46:21 -05:00
filesystems.c
fs_context.c
fs_parser.c ext4: journal_path mount options should follow links 2022-12-01 10:46:54 -05:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c for-6.4/block-2023-05-06 2023-05-06 08:28:58 -07:00
fsopen.c
init.c fs: port ->permission() to pass mnt_idmap 2023-01-19 09:24:28 +01:00
inode.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
internal.h five ksmbd server fixes, and new lock_rename_child VFS helper routine 2023-04-29 11:10:39 -07:00
ioctl.c fs: port inode_owner_or_capable() to mnt_idmap 2023-01-19 09:24:29 +01:00
Kconfig mm/hugetlb_vmemmap: rename ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP 2023-04-18 16:30:09 -07:00
Kconfig.binfmt Xtensa updates for v6.1 2022-10-10 14:21:11 -07:00
kernel_read_file.c
libfs.c fs: consolidate duplicate dt_type helpers 2023-04-03 09:23:54 +02:00
locks.c filelocks: use mount idmapping for setlease permission check 2023-03-09 22:36:12 +01:00
Makefile This pull request contains the following changes for UML: 2023-05-03 19:02:03 -07:00
mbcache.c ext4: fix deadlock due to mbcache entry corruption 2022-12-08 21:49:25 -05:00
mnt_idmapping.c fs: move mnt_idmap 2023-01-19 09:24:30 +01:00
mount.h
mpage.c mpage: use folios in bio end_io handler 2023-04-18 16:30:02 -07:00
namei.c five ksmbd server fixes, and new lock_rename_child VFS helper routine 2023-04-29 11:10:39 -07:00
namespace.c fget() to fdget() conversions 2023-04-24 19:14:20 -07:00
no-block.c
nsfs.c kill the last remaining user of proc_ns_fget() 2023-04-20 22:55:35 -04:00
open.c fs: fix incorrect fmode_t casts 2023-05-15 13:14:01 +02:00
pipe.c pipe: check for IOCB_NOWAIT alongside O_NONBLOCK 2023-05-12 17:17:27 +02:00
pnode.c pnode: pass mountpoint directly 2023-04-06 14:53:38 +02:00
pnode.h
posix_acl.c acl: don't depend on IOP_XATTR 2023-03-06 09:59:20 +01:00
proc_namespace.c
read_write.c iov_iter: add iter_iov_addr() and iter_iov_len() helpers 2023-03-30 08:12:29 -06:00
readdir.c
remap_range.c fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap 2023-01-19 09:24:29 +01:00
select.c
seq_file.c use less confusing names for iov_iter direction initializers 2022-11-25 13:01:55 -05:00
signalfd.c
splice.c pipe-nonblock-2023-05-06 2023-05-06 08:15:20 -07:00
stack.c
stat.c fs.idmapped.v6.3 2023-02-20 11:53:11 -08:00
statfs.c
super.c vfs: Replace all non-returning strlcpy with strscpy 2023-05-15 09:42:01 +02:00
sync.c
sysctls.c
timerfd.c
userfaultfd.c - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of 2023-04-27 19:42:02 -07:00
utimes.c fs.idmapped.v6.3 2023-02-20 11:53:11 -08:00
xattr.c acl: don't depend on IOP_XATTR 2023-03-06 09:59:20 +01:00