linux/fs
Linus Torvalds 1c24a18639 fs: fd tables have to be multiples of BITS_PER_LONG
This has always been the rule: fdtables have several bitmaps in them,
and as a result they have to be sized properly for bitmaps.  We walk
those bitmaps in chunks of 'unsigned long' in serveral cases, but even
when we don't, we use the regular kernel bitops that are defined to work
on arrays of 'unsigned long', not on some byte array.

Now, the distinction between arrays of bytes and 'unsigned long'
normally only really ends up being noticeable on big-endian systems, but
Fedor Pchelkin and Alexey Khoroshilov reported that copy_fd_bitmaps()
could be called with an argument that wasn't even a multiple of
BITS_PER_BYTE.  And then it fails to do the proper copy even on
little-endian machines.

The bug wasn't in copy_fd_bitmap(), but in sane_fdtable_size(), which
didn't actually sanitize the fdtable size sufficiently, and never made
sure it had the proper BITS_PER_LONG alignment.

That's partly because the alignment historically came not from having to
explicitly align things, but simply from previous fdtable sizes, and
from count_open_files(), which counts the file descriptors by walking
them one 'unsigned long' word at a time and thus naturally ends up doing
sizing in the proper 'chunks of unsigned long'.

But with the introduction of close_range(), we now have an external
source of "this is how many files we want to have", and so
sane_fdtable_size() needs to do a better job.

This also adds that explicit alignment to alloc_fdtable(), although
there it is mainly just for documentation at a source code level.  The
arithmetic we do there to pick a reasonable fdtable size already aligns
the result sufficiently.

In fact,clang notices that the added ALIGN() in that function doesn't
actually do anything, and does not generate any extra code for it.

It turns out that gcc ends up confusing itself by combining a previous
constant-sized shift operation with the variable-sized shift operations
in roundup_pow_of_two().  And probably due to that doesn't notice that
the ALIGN() is a no-op.  But that's a (tiny) gcc misfeature that doesn't
matter.  Having the explicit alignment makes sense, and would actually
matter on a 128-bit architecture if we ever go there.

This also adds big comments above both functions about how fdtable sizes
have to have that BITS_PER_LONG alignment.

Fixes: 60997c3d45 ("close_range: add CLOSE_RANGE_UNSHARE")
Reported-by: Fedor Pchelkin <aissur0002@gmail.com>
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Link: https://lore.kernel.org/all/20220326114009.1690-1-aissur0002@gmail.com/
Tested-and-acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-29 15:06:39 -07:00
..
9p Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
adfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
affs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
afs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
autofs
befs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
bfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
btrfs for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
cachefiles for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
ceph The highlights are: 2022-03-24 18:32:48 -07:00
cifs flexible-array transformations for 5.18-rc1 2022-03-24 11:39:32 -07:00
coda Folio changes for 5.18 2022-03-22 17:03:12 -07:00
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-02-22 18:30:28 +01:00
cramfs
crypto fscrypt updates for 5.18 2022-03-22 09:50:16 -07:00
debugfs debugfs: Document that debugfs_create functions need not be error checked 2022-02-25 11:56:13 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-01-24 14:17:02 +01:00
dlm driver core changes for 5.17-rc1 2022-01-12 11:11:34 -08:00
ecryptfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
efivarfs
efs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
erofs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
exfat Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
exportfs
ext2 \n 2022-03-25 17:38:15 -07:00
ext4 for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
f2fs for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
fat Merge branch 'akpm' (patches from Andrew) 2022-03-24 14:14:07 -07:00
freevxfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
fscache fscache: Convert fscache_set_page_dirty() to fscache_dirty_folio() 2022-03-15 08:34:36 -04:00
fuse Add support for Intel CET-IBT, available since Tigerlake (11th gen), which is a 2022-03-27 10:17:23 -07:00
gfs2 for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
hfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hfsplus Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hostfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hpfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hugetlbfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
iomap Fix buffered write page prefaulting 2022-03-26 12:41:52 -07:00
isofs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
jbd2 Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
jffs2 fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
jfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
kernfs kernfs: fix typos in comments 2022-03-18 13:38:03 +01:00
ksmbd flexible-array transformations for 5.18-rc1 2022-03-24 11:39:32 -07:00
lockd NFSD: Move svc_serv_ops::svo_function into struct svc_serv 2022-02-28 10:26:40 -05:00
minix Merge branch 'akpm' (patches from Andrew) 2022-03-24 14:14:07 -07:00
netfs netfs: Make ops->init_rreq() optional 2022-01-21 21:36:28 +00:00
nfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
nfs_common
nfsd Folio changes for 5.18 2022-03-22 17:03:12 -07:00
nilfs2 Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
nls
notify fsnotify: remove redundant parameter judgment 2022-03-14 09:05:25 +01:00
ntfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
ntfs3 Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
ocfs2 Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
omfs fs: Convert __set_page_dirty_buffers to block_dirty_folio 2022-03-16 13:37:04 -04:00
openpromfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
orangefs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
overlayfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
proc ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
pstore pstore: Don't use semaphores in always-atomic-context code 2022-03-15 11:08:23 -07:00
qnx4 fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
qnx6 fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
quota quota: make dquot_quota_sync return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
ramfs Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
reiserfs \n 2022-03-25 17:38:15 -07:00
romfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
smbfs_common smb3: add new defines from protocol specification 2022-01-18 16:50:47 -06:00
squashfs Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
sysfs kernfs: move struct kernfs_root out of the public view. 2022-02-23 15:46:34 +01:00
sysv Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
tracefs tracefs: Set the group ownership in apply_options() not parse_options() 2022-02-25 21:05:04 -05:00
ubifs Driver core changes for 5.18-rc1 2022-03-28 12:41:28 -07:00
udf \n 2022-03-25 17:38:15 -07:00
ufs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
unicode Fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the 2022-02-01 11:13:24 -08:00
vboxsf Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
verity
xfs Add support for Intel CET-IBT, available since Tigerlake (11th gen), which is a 2022-03-27 10:17:23 -07:00
zonefs for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
aio.c for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
anon_inodes.c
attr.c fs: handle circular mappings correctly 2021-11-17 09:26:09 +01:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c coredump: Snapshot the vmas in do_coredump 2022-03-08 12:55:29 -06:00
binfmt_elf_test.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
binfmt_elf.c execve updates for v5.18-rc1 2022-03-21 19:16:02 -07:00
binfmt_flat.c coredump: Don't compile flat_core_dump when coredumps are disabled 2022-03-09 10:37:07 -06:00
binfmt_misc.c Fix regression due to "fs: move binfmt_misc sysctl to its own file" 2022-02-09 09:50:02 -08:00
binfmt_script.c
buffer.c for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
char_dev.c
compat_binfmt_elf.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
coredump.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
d_path.c d_path: fix Kernel doc validator complaining 2021-11-06 13:30:32 -07:00
dax.c dax for 5.18 2022-03-24 18:12:09 -07:00
dcache.c mm: dcache: use kmem_cache_alloc_lru() to allocate dentry 2022-03-22 15:57:03 -07:00
direct-io.c block: remove the per-bio/request write hint 2022-03-07 12:45:57 -07:00
drop_caches.c
eventfd.c
eventpoll.c eventpoll: simplify sysctl declaration with register_sysctl() 2022-01-22 08:33:35 +02:00
exec.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
fcntl.c fs: remove fs.f_write_hint 2022-03-08 17:55:03 -07:00
fhandle.c
file_table.c fs/file_table: fix adding missing kmemleak_not_leak() 2022-02-17 10:23:19 -08:00
file.c fs: fd tables have to be multiples of BITS_PER_LONG 2022-03-29 15:06:39 -07:00
filesystems.c
fs_context.c vfs: fs_context: fix up param length parsing in legacy_parse_param 2022-01-18 09:23:19 +02:00
fs_parser.c fs_parse: allow parameter value to be empty 2021-12-09 14:09:36 -05:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
fsopen.c
init.c
inode.c fs: introduce alloc_inode_sb() to allocate filesystems specific inode 2022-03-22 15:57:03 -07:00
internal.h for-5.18-tag 2022-03-22 10:51:40 -07:00
io_uring.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
io-wq.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
io-wq.h Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
ioctl.c fs: allow cross-vfsmount reflink/dedupe 2022-03-14 13:13:53 +01:00
Kconfig Folio changes for 5.18 2022-03-22 17:03:12 -07:00
Kconfig.binfmt execve updates for v5.18-rc1 2022-03-21 19:16:02 -07:00
kernel_read_file.c
libfs.c fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio 2022-03-16 13:37:05 -04:00
locks.c fs: move locking sysctls where they are used 2022-01-22 08:33:36 +02:00
Makefile Fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the 2022-02-01 11:13:24 -08:00
mbcache.c
mount.h
mpage.c for-5.18/alloc-cleanups-2022-03-25 2022-03-26 11:59:30 -07:00
namei.c \n 2022-01-28 17:51:31 +02:00
namespace.c fs.rt.v5.18 2022-03-24 10:06:43 -07:00
no-block.c
nsfs.c
open.c fs: remove fs.f_write_hint 2022-03-08 17:55:03 -07:00
pipe.c fs/pipe.c: local vars have to match types of proper pipe_inode_info fields 2022-03-23 19:00:34 -07:00
pnode.c
pnode.h
posix_acl.c fs: support mapped mounts of mapped filesystems 2021-12-05 10:28:57 +01:00
proc_namespace.c fs: add is_idmapped_mnt() helper 2021-12-03 18:44:06 +01:00
read_write.c fs: export variant of generic_write_checks without iov_iter 2022-03-14 13:13:50 +01:00
readdir.c
remap_range.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-11 09:03:05 -08:00
seq_file.c seq_file: move seq_escape() to a header 2021-11-09 10:02:52 -08:00
signalfd.c Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
splice.c mm: Convert remove_mapping() to take a folio 2022-03-21 12:59:01 -04:00
stack.c
stat.c io-uring: Make statx API stable 2022-03-10 09:33:55 -07:00
statfs.c
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-01-30 08:59:47 -08:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
sysctls.c fs: move namespace sysctls and declare fs base directory 2022-01-22 08:33:36 +02:00
timerfd.c
userfaultfd.c userfaultfd: provide unmasked address on page-fault 2022-03-22 15:57:08 -07:00
utimes.c
xattr.c