linux/fs
Davidlohr Bueso 002b343669 fs/epoll: loosen irq safety in ep_scan_ready_list()
Patch series "fs/epoll: loosen irq safety when possible".

Both patches replace saving+restoring interrupts when taking the ep->lock
(now the waitqueue lock), with just disabling local irqs.  This shows
immediate performance benefits in patch 1 for an epoll workload running on
Xen.  The main concern we need to have with this sort of changes in epoll
is the ep_poll_callback() which is passed to the wait queue wakeup and is
done very often under irq context, this patch does not touch this call.

Patches have been tested pretty heavily with the customer workload,
microbenchmarks, ltp testcases and two high level workloads that use epoll
under the hood: nginx and libevent benchmarks.

This patch (of 2):

Saving and restoring interrupts in ep_scan_ready_list() is an
overkill as it is never called with interrupts disabled. Loosen
this to simply disabling local irqs such that archs where managing
irqs is expensive or virtual environments. This patch yields
some throughput improvements on a workload that is epoll intensive
running on a single Xen DomU.

1 Job	 7500	-->    8800 enq/s  (+17%)
2 Jobs	14000   -->   15200 enq/s  (+8%)
3 Jobs	20500	-->   22300 enq/s  (+8%)
4 Jobs	25000   -->   28000 enq/s  (+8-12)%

On bare metal:

For a 2-socket 40-core (ht) IvyBridge on a few workloads, unfortunately I
don't have a xen environment and the results for Xen I do have (which
numbers are in patch 1) I don't have the actual workload, so cannot
compare them directly.

1) Different configurations were used for a epoll_wait (pipes io)
   microbench (http://linux-scalability.org/epoll/epoll-test.c) and shows
   around a 7-10% improvement in overall total number of times the
   epoll_wait() loops when using both regular and nested epolls, so very
   raw numbers, but measurable nonetheless.

# threads	vanilla		dirty
     1		1677717		1805587
     2		1660510		1854064
     4		1610184		1805484
     8		1577696		1751222
     16		1568837		1725299
     32		1291532		1378463
     64		 752584		 787368

   Note that stddev is pretty small.

2) Another pipe test, which shows no real measurable improvement.
   (http://www.xmailserver.org/linux-patches/pipetest.c)

Link: http://lkml.kernel.org/r/20180720172956.2883-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:47 -07:00
..
9p Pull request for inclusion in 4.19, take two 2018-08-17 17:27:58 -07:00
adfs adfs: don't put inodes into icache 2018-08-03 16:03:33 -04:00
affs affs: fix potential memory leak when parsing option 'prefix' 2018-05-28 12:36:41 +02:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-08-15 15:04:25 -07:00
autofs autofs: fix autofs_sbi() does not check super block type 2018-08-22 10:52:43 -07:00
befs fix a series of Documentation/ broken file name references 2018-06-15 18:10:01 -03:00
bfs
btrfs btrfs: readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
cachefiles cachefiles: Wait rather than BUG'ing on "Unexpected object collision" 2018-07-25 14:49:00 +01:00
ceph The main things are support for cephx v2 authentication protocol and 2018-08-20 18:26:55 -07:00
cifs Merge branch 'linus/master' into rdma.git for-next 2018-08-16 14:21:29 -06:00
coda vfs: change inode times to use struct timespec64 2018-06-05 16:57:31 -07:00
configfs configfs: fix registered group removal 2018-07-17 06:14:07 -07:00
cramfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
crypto f2fs-for-4.18-rc1 2018-06-11 10:16:13 -07:00
debugfs Revert "debugfs: inode: debugfs_create_dir uses mode permission from parent" 2018-06-12 20:52:16 -07:00
devpts
dlm treewide: Use array_size() in vmalloc() 2018-06-12 16:19:22 -07:00
ecryptfs Merge branch 'fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into aio-base 2018-05-26 09:16:25 +02:00
efivarfs efivars: Call guid_parse() against guid_t type of variable 2018-07-22 14:13:44 +02:00
efs
exofs exofs: use bio_clone_fast in _write_mirror 2018-07-24 14:43:20 -06:00
exportfs
ext2 Merge branch 'akpm' (patches from Andrew) 2018-08-17 16:49:31 -07:00
ext4 ext4: readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
f2fs mpage: mpage_readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
fat fat: fix memory allocation failure handling of match_strdup() 2018-07-21 12:50:46 -07:00
freevxfs
fscache fscache: Fix reference overput in fscache_attach_object() error handling 2018-07-25 14:49:00 +01:00
fuse Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:25:58 -07:00
gfs2 gfs2 4.19 merge 2018-08-15 22:40:03 -07:00
hfs new helper: inode_fake_hash() 2018-08-03 16:03:32 -04:00
hfsplus vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
hostfs vfs: discard ATTR_ATTR_FLAG 2018-08-17 16:20:28 -07:00
hpfs fs/hpfs: extend gmt_to_local() conversion to 64-bit times 2018-08-17 16:20:27 -07:00
hugetlbfs mm: zero out the vma in vma_init() 2018-08-22 10:52:44 -07:00
isofs
jbd2 jbd2: replace current_kernel_time64 with ktime equivalent 2018-07-29 15:51:47 -04:00
jffs2 jffs2: use unsigned 32-bit timstamps consistently 2018-07-18 16:44:01 +02:00
jfs Just one jfs patch for 4.19 2018-08-15 22:47:23 -07:00
kernfs Driver core patches for 4.19-rc1 2018-08-18 11:44:53 -07:00
lockd
minix
nfs Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:25:58 -07:00
nfs_common
nfsd IMA: don't propagate opened through the entire thing 2018-07-12 10:04:19 -04:00
nilfs2
nls
notify Merge branch 'akpm' (patches from Andrew) 2018-08-17 16:49:31 -07:00
ntfs ntfs: mft: remove VLA usage 2018-08-17 16:20:27 -07:00
ocfs2 ocfs2: make several functions and variables static (and some const) 2018-08-17 16:20:28 -07:00
omfs
openpromfs
orangefs orangefs: remove redundant pointer orangefs_inode 2018-08-14 12:07:14 -04:00
overlayfs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
proc proc/kcore: add vmcoreinfo note to /proc/kcore 2018-08-22 10:52:46 -07:00
pstore pstore: add zstd compression support 2018-08-03 18:12:18 -07:00
qnx4
qnx6
quota quota: Cleanup list iteration in dqcache_shrink_scan() 2018-06-20 11:04:26 +02:00
ramfs
reiserfs reiserfs: fix buffer overflow with long warning messages 2018-07-14 11:11:10 -07:00
romfs
squashfs Squashfs: Compute expected length from inode size rather than block length 2018-08-02 09:34:02 -07:00
sysfs Driver core patches for 4.19-rc1 2018-08-18 11:44:53 -07:00
sysv
tracefs tracefs: Annotate tracefs_ops with __ro_after_init 2018-07-31 11:32:44 -04:00
ubifs vfs/y2038: inode timestamps conversion to timespec64 2018-06-15 07:31:07 +09:00
udf \n 2018-08-17 09:38:39 -07:00
ufs fs/ufs: use ktime_get_real_seconds for sb and cg timestamps 2018-08-17 16:20:27 -07:00
xfs dax: remove VM_MIXEDMAP for fsdax and device dax 2018-08-17 16:20:27 -07:00
aio.c Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:56:23 -07:00
anon_inodes.c anon_inode_getfile(): switch to alloc_file_pseudo() 2018-07-12 10:04:27 -04:00
attr.c fs: Fix attr.c kernel-doc 2018-07-03 16:44:45 -04:00
bad_inode.c get rid of 'opened' argument of ->atomic_open() - part 3 2018-07-12 10:04:20 -04:00
binfmt_aout.c
binfmt_elf_fdpic.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
binfmt_elf.c Here are the main MIPS changes for 4.19. 2018-08-13 19:24:32 -07:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c turn filp_clone_open() into inline wrapper for dentry_open() 2018-07-10 23:29:03 -04:00
binfmt_script.c
block_dev.c for-4.19/block-20180812 2018-08-14 10:23:25 -07:00
buffer.c fs, mm: account buffer_head to kmemcg 2018-08-17 16:20:30 -07:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c media: dvb/audio.h: get rid of unused APIs 2018-07-30 16:21:49 -04:00
compat.c ncpfs: remove compat functionality 2018-06-05 19:23:26 +02:00
coredump.c
d_path.c
dax.c dax: dax_layout_busy_page() warn on !exceptional 2018-07-29 16:59:16 -04:00
dcache.c fs/dcache.c: fix kmemcheck splat at take_dentry_name_snapshot() 2018-08-17 16:20:28 -07:00
dcookies.c
direct-io.c
drop_caches.c
eventfd.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
eventpoll.c fs/epoll: loosen irq safety in ep_scan_ready_list() 2018-08-22 10:52:47 -07:00
exec.c mm: fix vma_is_anonymous() false-positives 2018-07-26 19:38:03 -07:00
fcntl.c mm: restructure memfd code 2018-06-07 17:34:35 -07:00
fhandle.c
file_table.c make alloc_file() static 2018-07-12 10:04:29 -04:00
file.c
filesystems.c
fs_pin.c
fs_struct.c
fs-writeback.c
inode.c Merge branch 'work.mkdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:25:58 -07:00
internal.h Merge branch 'iomap-4.19-merge' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux 2018-08-13 22:29:03 -07:00
ioctl.c
iomap.c Changes for 4.19: 2018-08-14 08:56:02 -07:00
Kconfig autofs: remove left-over autofs4 stubs 2018-06-11 08:22:34 -07:00
Kconfig.binfmt kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
libfs.c
locks.c File locking fixes and enhancements for v4.19 2018-08-13 21:56:50 -07:00
Makefile autofs: remove left-over autofs4 stubs 2018-06-11 08:22:34 -07:00
mbcache.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
mount.h
mpage.c mpage: mpage_readpages() should submit IO as read-ahead 2018-08-17 16:20:29 -07:00
namei.c Merge branches 'work.misc' and 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 21:28:25 -07:00
namespace.c fix __legitimize_mnt()/mntput() race 2018-08-09 17:51:32 -04:00
no-block.c
nsfs.c
open.c ->atomic_open(): return 0 in all success cases 2018-07-12 10:04:21 -04:00
pipe.c Merge branch 'work.open3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 19:58:36 -07:00
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c treewide: kmalloc() -> kmalloc_array() 2018-06-12 16:19:22 -07:00
readdir.c
select.c Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL 2018-06-28 10:40:47 -07:00
seq_file.c fs/seq_file.c: simplify seq_file iteration code and interface 2018-08-17 16:20:28 -07:00
signalfd.c Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-16 16:21:50 +09:00
splice.c Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-06-16 16:21:50 +09:00
stack.c
stat.c
statfs.c kernel: add kcompat_sys_{f,}statfs64() 2018-07-12 14:49:48 +01:00
super.c mm: add SHRINK_EMPTY shrinker methods return value 2018-08-17 16:20:31 -07:00
sync.c
timerfd.c Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-08-13 20:56:23 -07:00
userfaultfd.c userfaultfd: use fault_wqh lock 2018-08-22 10:52:47 -07:00
utimes.c
xattr.c vfs: delete unnecessary assignment in vfs_listxattr 2018-05-29 13:22:41 -04:00