linux/fs
Eric Paris 90586523eb fsnotify: unified filesystem notification backend
fsnotify is a backend for filesystem notification.  fsnotify does
not provide any userspace interface but does provide the basis
needed for other notification schemes such as dnotify.  fsnotify
can be extended to be the backend for inotify or the upcoming
fanotify.  fsnotify provides a mechanism for "groups" to register for
some set of filesystem events and to then deliver those events to
those groups for processing.
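
As a rough illustration of the "groups" idea described above, here is a minimal
userspace sketch, not the kernel implementation.  Every name in it
(notify_group, notify_dispatch, count_event, the EV_* bits) is hypothetical and
chosen only for this example.  Each group carries a mask of the events it
registered for, and an event is handed only to groups whose mask includes it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits; these are NOT the kernel's FS_* values. */
#define EV_ACCESS 0x1u
#define EV_MODIFY 0x2u
#define EV_DELETE 0x4u

/* Hypothetical stand-in for a notification group. */
struct notify_group {
    uint32_t mask;          /* events this group registered for */
    void (*handle_event)(struct notify_group *g, uint32_t event);
    unsigned int delivered; /* how many events reached this group */
};

/* Example handler: just counts the events it receives. */
void count_event(struct notify_group *g, uint32_t event)
{
    (void)event;
    g->delivered++;
}

/*
 * Deliver one event to every group whose mask includes it.
 * Returns the number of groups that were notified.
 */
size_t notify_dispatch(struct notify_group *groups, size_t ngroups,
                       uint32_t event)
{
    size_t hits = 0;

    assert(groups != NULL || ngroups == 0);
    for (size_t i = 0; i < ngroups; i++) {
        if (groups[i].mask & event) {
            groups[i].handle_event(&groups[i], event);
            hits++;
        }
    }
    return hits;
}
```

A caller would fill an array of notify_group entries, each with its own mask and
handler, and call notify_dispatch() once per event.  Groups that did not
register for that event are never invoked.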

fsnotify has a number of benefits, the first being that it actually shrinks
the inode.  Before fsnotify, to support both dnotify and inotify, an inode had

        unsigned long           i_dnotify_mask; /* Directory notify events */
        struct dnotify_struct   *i_dnotify; /* for directory notifications */
        struct list_head        inotify_watches; /* watches on this inode */
        struct mutex            inotify_mutex;  /* protects the watches list */

But with fsnotify this same functionality (and more) is done with just

        __u32                   i_fsnotify_mask; /* all events for this inode */
        struct hlist_head       i_fsnotify_mark_entries; /* marks on this inode */

That's right, inotify, dnotify, and fanotify all in 64 bits.  We used that
much space just in inotify_watches alone, before this patch set.
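
The size comparison can be checked with a small userspace sketch.  The types
below are stand-ins, not the kernel definitions: list_head and hlist_head follow
their well-known pointer layouts, while mutex_stub is an arbitrary placeholder
because the real struct mutex size depends on the kernel configuration.  On a
32-bit ABI the two new fields take 4 + 4 bytes (the 64 bits mentioned above).
On a typical 64-bit ABI they pad out to 16 bytes, which is still the size of a
single list_head.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel types; the layouts are assumptions. */
struct list_head  { struct list_head *next, *prev; };
struct hlist_head { void *first; };
struct mutex_stub { long opaque[4]; };  /* placeholder, real size varies */
struct dnotify_struct;                  /* only used through a pointer */

/* Notification fields an inode carried before fsnotify. */
struct old_notify_fields {
    unsigned long          i_dnotify_mask;
    struct dnotify_struct *i_dnotify;
    struct list_head       inotify_watches;
    struct mutex_stub      inotify_mutex;
};

/* Notification fields an inode carries with fsnotify. */
struct new_notify_fields {
    uint32_t          i_fsnotify_mask;
    struct hlist_head i_fsnotify_mark_entries;
};

/* Compile-time check that the new layout is not larger than the old one. */
static_assert(sizeof(struct new_notify_fields) <=
              sizeof(struct old_notify_fields),
              "fsnotify fields should not be larger than the old fields");

size_t old_notify_size(void)   { return sizeof(struct old_notify_fields); }
size_t new_notify_size(void)   { return sizeof(struct new_notify_fields); }
size_t watches_list_size(void) { return sizeof(struct list_head); }
```

Printing the three sizes on a given machine shows how much of the old footprint
came from inotify_watches alone.  The exact numbers depend on the ABI and on the
placeholder chosen for the mutex.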

fsnotify object lifetime and locking is MUCH better than what we have today.
inotify locking is incredibly complex.  See 8f7b0ba1c8 as an example of
what's been busted since inception.  inotify needs to know internal semantics
of superblock destruction and unmounting to function.  The inode pinning and
vfs contortions are horrible.

No fsnotify implementers do allocation under locks.  This means things like
f04b30de3 which (due to an overabundance of caution) changes GFP_KERNEL to
GFP_NOFS can be reverted.  There are no longer any allocation rules when using
or implementing your own fsnotify listener.

fsnotify paves the way for fanotify.  In brief, fanotify is a notification
mechanism that delivers the listener both an 'event' and an open file descriptor
to the object in question.  This means that fanotify is pathname agnostic.
Some on lkml may not care for the original companies or users that pushed for
TALPA, but fanotify was designed with flexibility and input for other users in
mind.  The readahead group expressed interest in fanotify as it could be used
to profile disk access on boot without breaking the audit system.  The desktop
search groups have also expressed interest in fanotify as it solves a number
of the race conditions and problems present with managing inotify when more
than a limited number of specific files are of interest.  fanotify can provide
for a userspace access control system which makes it a clean interface for AV
vendors to hook without trying to do binary patching on the syscall table,
LSM, and everywhere else they do their things today.  With this patch series
fanotify can be implemented in less than 1200 lines of easy-to-review code,
almost all of which is the socket-based user interface.

This patch series builds fsnotify to the point that it can implement
dnotify and inotify_user.  Patches exist and will be sent soon after
acceptance to finish the in-kernel inotify conversion (audit) and implement
fanotify.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
2009-06-11 14:57:52 -04:00
9p Fix a leak in failure exit in 9p ->get_sb() 2009-05-09 10:49:40 -04:00
adfs fs/adfs: return f_fsid for statfs(2) 2009-04-02 19:05:08 -07:00
affs Fix races around the access to ->s_options 2009-05-09 10:51:34 -04:00
afs Fix races around the access to ->s_options 2009-05-09 10:51:34 -04:00
autofs Fix autofs_expire() 2009-04-20 23:01:15 -04:00
autofs4 autofs4: remove hashed check in validate_wait() 2009-06-09 16:59:03 -07:00
befs befs: fix build on parisc 2009-04-08 10:21:43 -07:00
bfs fs/Kconfig: move bfs out 2009-01-22 13:15:57 +03:00
btrfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable 2009-06-05 11:54:28 -07:00
cachefiles CacheFiles: Fixup renamed filenames in comments in internal.h 2009-05-27 10:20:13 -07:00
cifs cifs: remove never-used in6_addr option 2009-06-10 18:34:35 +00:00
coda splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
configfs configfs: Fix Trivial Warning in fs/configfs/symlink.c 2009-04-21 12:59:21 -07:00
cramfs fs/cramfs: return f_fsid for statfs(2) 2009-04-02 19:05:08 -07:00
debugfs debugfs: function to know if debugfs is initialized 2009-03-23 16:25:46 +01:00
devpts devpts: unregister the file system on error 2009-06-11 08:51:06 -07:00
dlm dlm: fix length calculation in compat code 2009-03-11 12:23:59 -05:00
ecryptfs Convert obvious places to deactivate_locked_super() 2009-05-09 10:49:40 -04:00
efs fs/efs: return f_fsid for statfs(2) 2009-04-02 19:05:09 -07:00
exofs block: add rq->resid_len 2009-05-11 09:50:53 +02:00
exportfs
ext2 ext2: Fix memory leak in ext2_fill_super() in case of a failed mount 2009-05-17 23:52:51 -04:00
ext3 Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
ext4 Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
fat vfat: Note the NLS requirement 2009-04-17 09:32:11 -07:00
freevxfs fs/Kconfig: move vxfs out 2009-01-22 13:15:58 +03:00
fscache FS-Cache: Fixup renamed filenames in comments in internal.h 2009-05-27 10:20:13 -07:00
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse 2009-05-13 16:32:57 -07:00
gfs2 Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
hfs hfs: fix memory leak when unmounting 2009-04-13 15:04:29 -07:00
hfsplus Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
hostfs constify dentry_operations: misc filesystems 2009-03-27 14:44:00 -04:00
hpfs Fix races around the access to ->s_options 2009-05-09 10:51:34 -04:00
hppfs hppfs: hppfs_read_file() may return -ERROR 2009-04-02 19:04:53 -07:00
hugetlbfs Merge branch 'master' into next 2009-05-22 18:40:59 +10:00
isofs fs/isofs: return f_fsid for statfs(2) 2009-04-02 19:05:09 -07:00
jbd jbd: fix race in buffer processing in commit code 2009-06-09 16:59:03 -07:00
jbd2 jbd2: Fix minor typos in comments in fs/jbd2/journal.c 2009-06-09 00:06:20 -04:00
jffs2 jffs2: Fix corruption when flash erase/write failure 2009-05-29 10:44:46 +01:00
jfs New helper - current_umask() 2009-03-31 23:00:26 -04:00
lockd lockd: fix list corruption on lockd restart 2009-05-06 17:19:36 -04:00
minix fs/minix: return f_fsid for statfs(2) 2009-04-02 19:05:09 -07:00
ncpfs ncpfs: use memdup_user() 2009-04-20 23:02:51 -04:00
nfs NFSv4: Fix the case where NFSv4 renewal fails 2009-05-26 14:51:00 -04:00
nfs_common
nfsd Merge branch 'master' into next 2009-06-09 09:27:53 +10:00
nilfs2 Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
nls
notify fsnotify: unified filesystem notification backend 2009-06-11 14:57:52 -04:00
ntfs block: Do away with the notion of hardsect_size 2009-05-22 23:22:54 +02:00
ocfs2 block: Do away with the notion of hardsect_size 2009-05-22 23:22:54 +02:00
omfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-04-02 21:09:10 -07:00
openpromfs zero i_uid/i_gid on inode allocation 2009-01-05 11:54:28 -05:00
partitions block: Export I/O topology for block devices and partitions 2009-05-22 23:22:55 +02:00
proc Merge branch 'next' into for-linus 2009-06-11 11:03:14 +10:00
qnx4 fs/qnx4: return f_fsid for statfs(2) 2009-04-02 19:05:10 -07:00
quota quota: remove obsolete comments in fs/quota/Makefile 2009-04-27 16:49:52 +02:00
ramfs ramfs: fix double freeing s_fs_info on failed mount 2009-04-07 07:39:59 -07:00
reiserfs reiserfs: fixup perms when xattrs are disabled 2009-05-17 11:45:45 -07:00
romfs ROMFS: romfs_dev_read() error ignored 2009-05-09 10:49:41 -04:00
smbfs constify dentry_operations: misc filesystems 2009-03-27 14:44:00 -04:00
squashfs Squashfs: cody tidying, remove commented out line in Makefile 2009-05-13 03:25:20 +01:00
sysfs sysfs: file.c: use create_singlethread_workqueue() 2009-05-28 14:24:07 -07:00
sysv fs/sysv: return f_fsid for statfs(2) 2009-04-02 19:05:10 -07:00
ubifs Convert obvious places to deactivate_locked_super() 2009-05-09 10:49:40 -04:00
udf block: Do away with the notion of hardsect_size 2009-05-22 23:22:54 +02:00
ufs switch ufs directories to ufs_sync_file() 2009-05-09 10:49:42 -04:00
xfs Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
aio.c aio: lookup_ioctx can return the wrong value when looking up a bogus context 2009-03-19 15:57:18 -07:00
anon_inodes.c constify dentry_operations: rest 2009-03-27 14:44:03 -04:00
attr.c vfs: Use lowercase names of quota functions 2009-03-26 02:18:35 +01:00
bad_inode.c
binfmt_aout.c sanitize ifdefs in binfmt_aout 2009-01-03 11:45:54 -08:00
binfmt_elf_fdpic.c ptrace: s/parent/real_parent/ in binfmt_elf_fdpic.c 2009-05-02 15:36:10 -07:00
binfmt_elf.c Trim includes in binfmt_elf 2009-03-31 23:00:27 -04:00
binfmt_em86.c
binfmt_flat.c flat: fix data sections alignment 2009-05-29 08:40:02 -07:00
binfmt_misc.c fs/binfmt_misc.c: add terminating newline to /proc/sys/fs/binfmt_misc/status 2009-01-06 15:59:19 -08:00
binfmt_script.c
binfmt_som.c Don't crap into descriptor table in binfmt_som 2009-03-31 23:00:28 -04:00
bio-integrity.c block: add private bio_set for bio integrity allocations 2009-03-24 12:35:17 +01:00
bio.c Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
block_dev.c Revert "block: implement blkdev_readpages" 2009-06-04 22:34:44 +02:00
buffer.c Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
char_dev.c fs: fix name overwrite in __register_chrdev_region() 2009-01-06 15:59:13 -08:00
compat_binfmt_elf.c
compat_ioctl.c fs/compat_ioctl: fix build when !BLOCK 2009-04-20 23:01:16 -04:00
compat.c CRED: Rename cred_exec_mutex to reflect that it's a guard against ptrace 2009-05-11 08:15:36 +10:00
dcache.c fs: dcache fix LRU ordering 2009-05-09 10:49:40 -04:00
dcookies.c [CVE-2009-0029] System call wrapper special cases 2009-01-14 14:15:18 +01:00
direct-io.c block: Do away with the notion of hardsect_size 2009-05-22 23:22:54 +02:00
drop_caches.c vfs: skip I_CLEAR state inodes 2009-04-02 19:04:48 -07:00
eventfd.c epoll keyed wakeups: make eventfd use keyed wakeups 2009-04-01 08:59:20 -07:00
eventpoll.c epoll: fix size check in epoll_create() 2009-05-12 14:11:35 -07:00
exec.c Merge branch 'master' into next 2009-05-22 18:40:59 +10:00
fcntl.c dup2: Fix return value with oldfd == newfd and invalid fd 2009-05-11 12:18:06 -07:00
fifo.c
file_table.c trivial: remove unused variable 'path' in alloc_file() 2009-03-30 15:22:03 +02:00
file.c
filesystems.c fs: Mark get_filesystem_list() as __init function. 2009-04-20 23:02:52 -04:00
fs_struct.c Get rid of indirect include of fs_struct.h 2009-03-31 23:00:27 -04:00
fs-writeback.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-04-03 15:24:35 -07:00
generic_acl.c New helper - current_umask() 2009-03-31 23:00:26 -04:00
inode.c integrity: fix IMA inode leak 2009-06-06 14:33:41 -07:00
internal.h New locking/refcounting for fs_struct 2009-03-31 23:00:26 -04:00
ioctl.c fiemap: fix problem with setting FIEMAP_EXTENT_LAST 2009-05-06 16:36:09 -07:00
ioprio.c [CVE-2009-0029] System call wrappers part 28 2009-01-14 14:15:30 +01:00
Kconfig nilfs2: update makefile and Kconfig 2009-04-07 08:31:16 -07:00
Kconfig.binfmt CORE_DUMP_DEFAULT_ELF_HEADERS depends on ELF_CORE 2009-01-09 16:54:41 -08:00
libfs.c Convert obvious places to deactivate_locked_super() 2009-05-09 10:49:40 -04:00
locks.c [CVE-2009-0029] System call wrappers part 16 2009-01-14 14:15:25 +01:00
Makefile nilfs2: update makefile and Kconfig 2009-04-07 08:31:16 -07:00
mbcache.c
mpage.c ext4: Properly initialize the buffer_head state 2009-05-13 15:13:42 -04:00
namei.c Merge branch 'master' into next 2009-05-22 18:40:59 +10:00
namespace.c Fix races around the access to ->s_options 2009-05-09 10:51:34 -04:00
nfsctl.c [CVE-2009-0029] System call wrappers part 27 2009-01-14 14:15:29 +01:00
no-block.c
open.c Switch open_exec() and sys_uselib() to do_open_filp() 2009-05-09 10:49:42 -04:00
pipe.c splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
pnode.c
pnode.h
posix_acl.c
read_write.c splice: implement default splice_read method 2009-05-11 14:13:10 +02:00
read_write.h
readdir.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
select.c [CVE-2009-0029] System call wrappers part 32 2009-01-14 14:15:31 +01:00
seq_file.c cpumask: fix seq_bitmap_*() functions. 2009-03-30 22:05:11 +10:30
signalfd.c [CVE-2009-0029] System call wrappers part 31 2009-01-14 14:15:31 +01:00
splice.c splice: fix kmaps in default_file_splice_write() 2009-05-19 11:37:46 +02:00
stack.c
stat.c kill vfs_stat_fd / vfs_lstat_fd 2009-04-20 23:02:52 -04:00
super.c NULL noise in fs/super.c:kill_bdev_super() 2009-05-09 10:49:41 -04:00
sync.c Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-quota-2.6 2009-03-27 14:48:34 -07:00
timerfd.c timerfd: add flags check 2009-02-18 15:37:53 -08:00
utimes.c [CVE-2009-0029] System call wrappers part 30 2009-01-14 14:15:30 +01:00
xattr_acl.c
xattr.c xattr: use memdup_user() 2009-04-20 23:02:50 -04:00