linux/fs
Aneesh Kumar K.V a05b0855fd hugetlbfs: avoid taking i_mutex from hugetlbfs_read()
Taking i_mutex in hugetlbfs_read() can result in a deadlock with mmap, as
explained below:

 Thread A:
  read() on hugetlbfs
   hugetlbfs_read() called
    i_mutex grabbed
     hugetlbfs_read_actor() called
      __copy_to_user() called
       page fault is triggered
 Thread B, sharing address space with A:
  mmap() the same file
   task_B->mm->mmap_sem is grabbed
    hugetlbfs_file_mmap() is called
     attempt to grab ->i_mutex and block waiting for A to give it up
 Thread A:
  page fault handler blocks on attempt to grab task_A->mm->mmap_sem,
 which happens to be the same thing as task_B->mm->mmap_sem.  Blocks waiting
 for B to give it up.
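
The interleaving above is a lock-order inversion between i_mutex and
mmap_sem.  It can be replayed in a small user-space model.  This is only a
sketch, not kernel code: the names model_lock, model_trylock() and
replay_read_vs_mmap() are made up for illustration, each lock merely records
its owner, and mmap_sem is treated as a plain exclusive lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a lock just records which thread owns it. */
enum owner { NOBODY, THREAD_A, THREAD_B };

struct model_lock {
    enum owner held_by;
};

/* Take the lock if it is free; report whether the caller now holds it. */
static bool model_trylock(struct model_lock *l, enum owner who)
{
    if (l->held_by != NOBODY)
        return false;                  /* a real lock would block here */
    l->held_by = who;
    return true;
}

/*
 * Replay the trace.  Returns true when A and B each end up waiting on a
 * lock the other one holds, i.e. when the interleaving deadlocks.
 */
static bool replay_read_vs_mmap(bool read_takes_i_mutex)
{
    struct model_lock i_mutex = { NOBODY };
    struct model_lock mmap_sem = { NOBODY };   /* A and B share one mm */
    bool got, a_blocked, b_blocked;

    /* Thread A: read() grabs i_mutex before copying to user space. */
    if (read_takes_i_mutex)
        model_trylock(&i_mutex, THREAD_A);

    /* Thread B: mmap() grabs mmap_sem, then wants i_mutex. */
    got = model_trylock(&mmap_sem, THREAD_B);
    assert(got);                       /* nobody held mmap_sem yet */
    (void)got;
    b_blocked = !model_trylock(&i_mutex, THREAD_B);

    /* Thread A: the page fault during the copy wants mmap_sem. */
    a_blocked = !model_trylock(&mmap_sem, THREAD_A);

    return a_blocked && b_blocked;
}
```

In this model replay_read_vs_mmap(true) reports the cycle, while
replay_read_vs_mmap(false) does not: B gets i_mutex, so it can finish and
release mmap_sem for A's page fault.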

AFAIU the i_mutex locking was added to hugetlbfs_read() as per
http://lkml.indiana.edu/hypermail/linux/kernel/0707.2/3066.html to take
care of the race between truncate and read.  This patch stops taking
i_mutex in hugetlbfs_read() and handles that race by looking at
page->mapping under lock_page() (find_lock_page()) to ensure that the
inode didn't get truncated in the range during a parallel read.
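
The "recheck page->mapping under the page lock" idea can be sketched in
user space as follows.  This is a single-threaded model under stated
assumptions, not the kernel's find_lock_page(): every model_* name is
hypothetical, the page lock is a plain flag, and the parallel truncate is
simulated by calling model_truncate() between the lookup and the lock.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_PAGES 4               /* hypothetical tiny page cache */

struct model_mapping;

struct model_page {
    int locked;                        /* stands in for the page lock */
    struct model_mapping *mapping;     /* NULL once truncated away */
};

struct model_mapping {
    struct model_page *slot[MODEL_NR_PAGES];
};

/* Lockless lookup: the page may be truncated right after this returns. */
static struct model_page *model_lookup(struct model_mapping *m, size_t idx)
{
    return idx < MODEL_NR_PAGES ? m->slot[idx] : NULL;
}

/* Truncate detaches the page from the file. */
static void model_truncate(struct model_mapping *m, size_t idx)
{
    struct model_page *p = model_lookup(m, idx);

    if (p) {
        p->mapping = NULL;
        m->slot[idx] = NULL;
    }
}

/* Lock the page, then recheck ownership; returns 1 if still valid. */
static int model_lock_and_check(struct model_mapping *m, struct model_page *p)
{
    assert(!p->locked);                /* single-threaded model */
    p->locked = 1;
    if (p->mapping != m) {             /* lost the race with truncate */
        p->locked = 0;
        return 0;
    }
    return 1;                          /* caller holds the page lock */
}

/* Returns 1 if the reader may copy from the page, 0 if it must not. */
static int model_read_one_page(int truncate_in_window)
{
    struct model_mapping m = { { NULL } };
    struct model_page pg = { 0, &m };
    struct model_page *p;
    int ok;

    m.slot[1] = &pg;
    p = model_lookup(&m, 1);
    if (truncate_in_window)
        model_truncate(&m, 1);         /* parallel truncate hits here */
    ok = p && model_lock_and_check(&m, p);
    if (ok)
        p->locked = 0;                 /* done copying; unlock */
    return ok;
}
```

In this model model_read_one_page(0) returns 1 and model_read_one_page(1)
returns 0: the stale pointer from the lookup is rejected once the mapping
check runs under the lock, without the reader ever taking i_mutex.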

Ideally we would extend the patch to make sure we don't increase i_size in
mmap.  But that would break userspace, because applications would then have
to use truncate(2) to increase i_size in hugetlbfs.

Based on the original patch from Hillf Danton.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@kernel.org>		[everything after 2007 :)]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-21 17:54:58 -07:00
..
9p Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs 2012-01-10 15:09:01 -08:00
adfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
affs affs: propagate umode_t 2012-01-03 22:55:04 -05:00
afs Merge branch 'kmap_atomic' of git://github.com/congwang/linux 2012-03-21 09:40:26 -07:00
autofs4 autofs: work around unhappy compat problem on x86-64 2012-02-25 12:10:27 -08:00
befs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
bfs switch ->create() to umode_t 2012-01-03 22:54:53 -05:00
btrfs Merge branch 'kmap_atomic' of git://github.com/congwang/linux 2012-03-21 09:40:26 -07:00
cachefiles fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
ceph Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2012-02-02 15:47:33 -08:00
cifs CIFS: Do not kmalloc under the flocks spinlock 2012-03-06 21:50:15 -06:00
coda coda: switch coda_cnode_make() to sane API as well, clean coda_lookup() 2012-01-10 11:13:16 -05:00
configfs configfs: convert to umode_t 2012-01-03 22:54:57 -05:00
cramfs cramfs: Fix typo in inode.c 2012-02-21 11:40:35 +01:00
debugfs Merge 3.3-rc2 into the driver-core-next branch. 2012-02-02 11:24:44 -08:00
devpts tty: rework pty count limiting 2012-01-24 14:01:01 -08:00
dlm dlm: Do not allocate a fd for peeloff 2012-03-08 13:52:09 -08:00
ecryptfs ecryptfs: fix printk format warning for size_t 2012-02-28 16:55:30 -08:00
efs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
exofs exofs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:22 +08:00
exportfs
ext2 ext2: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:22 +08:00
ext3 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2012-01-09 12:51:21 -08:00
ext4 Merge branch 'for_linus' into for_linus_merged 2012-01-10 11:54:07 -05:00
fat Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb 2012-01-09 12:09:47 -08:00
freevxfs fs: propagate umode_t, misc bits 2012-01-03 22:55:10 -05:00
fscache
fuse fuse: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:22 +08:00
gfs2 gfs2: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:23 +08:00
hfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hfsplus hfsplus: creation of hidden dir on mount can fail 2012-01-10 17:48:52 -05:00
hostfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
hpfs switch ->mknod() to umode_t 2012-01-03 22:54:54 -05:00
hppfs vfs: for usbfs, etc. internal vfsmounts ->mnt_sb->s_root == ->mnt_root 2012-01-03 22:52:41 -05:00
hugetlbfs hugetlbfs: avoid taking i_mutex from hugetlbfs_read() 2012-03-21 17:54:58 -07:00
isofs isofs: inode leak on mount failure 2012-01-09 10:48:11 -05:00
jbd Power management updates for 3.4 2012-03-21 10:15:51 -07:00
jbd2 Power management updates for 3.4 2012-03-21 10:15:51 -07:00
jffs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-03-20 21:12:50 -07:00
jfs Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
lockd module_param: make bool parameters really bool (drivers & misc) 2012-01-13 09:32:20 +10:30
logfs logfs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:24 +08:00
minix minix: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:24 +08:00
ncpfs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
nfs nfs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:24 +08:00
nfs_common
nfsd Merge branch 'for-3.3' of git://linux-nfs.org/~bfields/linux 2012-01-14 12:26:41 -08:00
nilfs2 nilfs2: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:24 +08:00
nls NLS: raname "maxlen" to "maxout" in UTF conversion routines 2011-11-26 19:58:47 -08:00
notify fsnotify: don't BUG in fsnotify_destroy_mark() 2012-01-14 18:01:42 -08:00
ntfs Merge branch 'kmap_atomic' of git://github.com/congwang/linux 2012-03-21 09:40:26 -07:00
ocfs2 ocfs2: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:25 +08:00
omfs omfs: propagate umode_t 2012-01-03 22:55:01 -05:00
openpromfs vfs: fix the stupidity with i_dentry in inode destructors 2012-01-03 22:52:40 -05:00
proc procfs: mark thread stack correctly in proc/<pid>/maps 2012-03-21 17:54:58 -07:00
pstore pstore: gracefully handle NULL pstore_info functions 2011-11-18 13:49:00 -08:00
qnx4 qnx4: don't leak ->BitMap on late failure exits 2012-01-19 13:54:36 -05:00
quota quota: Fix deadlock with suspend and quotas 2012-02-13 20:45:39 -05:00
ramfs pohmelfs: propagate umode_t 2012-01-03 22:55:07 -05:00
reiserfs Merge branch 'kmap_atomic' of git://github.com/congwang/linux 2012-03-21 09:40:26 -07:00
romfs MTD pull for 3.3 2012-01-10 13:45:22 -08:00
squashfs squashfs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:25 +08:00
sysfs Revert "sysfs: Kill nlink counting." 2012-03-08 13:03:10 -08:00
sysv vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
ubifs ubifs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:26 +08:00
udf udf: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:26 +08:00
ufs vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
xfs xfs: make inode quota check more general 2012-02-21 10:12:43 -06:00
aio.c fs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:21 +08:00
anon_inodes.c
attr.c switch is_sxid() to umode_t 2012-01-03 22:55:11 -05:00
bad_inode.c switch ->mknod() to umode_t 2012-01-03 22:54:54 -05:00
binfmt_aout.c aout: move setup_arg_pages() prior to reading/mapping the binary 2012-03-05 13:51:32 -08:00
binfmt_elf_fdpic.c
binfmt_elf.c regset: Prevent null pointer reference on readonly regsets 2012-03-02 11:38:15 -08:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
binfmt_script.c
binfmt_som.c
bio-integrity.c fs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:21 +08:00
bio.c bio: don't overflow in bio_get_nr_vecs() 2012-02-08 22:07:18 +01:00
block_dev.c block: Fix NULL pointer dereference in sd_revalidate_disk 2012-03-02 10:38:33 +01:00
buffer.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
char_dev.c char_dev.c: fix up some whitespace errors 2011-12-13 11:18:17 -08:00
compat_binfmt_elf.c
compat_ioctl.c ppp: Replace uses of <linux/if_ppp.h> with <linux/ppp-ioctl.h> 2012-03-04 20:41:38 -05:00
compat.c vfs: fix compat_sys_stat() handling of overflows in st_nlink 2012-02-13 20:45:39 -05:00
dcache.c Merge branch 'dcache-word-accesses' 2012-03-19 16:37:28 -07:00
dcookies.c
direct-io.c Restore direct_io / truncate locking API 2012-02-23 15:56:21 -08:00
drop_caches.c
eventfd.c
eventpoll.c Don't limit non-nested epoll paths 2012-03-18 12:25:04 -07:00
exec.c Merge branch 'kmap_atomic' of git://github.com/congwang/linux 2012-03-21 09:40:26 -07:00
fcntl.c
fhandle.c vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb 2012-01-06 23:16:53 -05:00
fifo.c
file_table.c vfs: prevent remount read-only if pending removes 2012-01-06 23:20:13 -05:00
file.c
filesystems.c vfs: convert fs_supers to hlist 2012-01-03 22:52:39 -05:00
fs_struct.c
fs-writeback.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-03-20 21:12:50 -07:00
generic_acl.c
inode.c restore smp_mb() in unlock_new_inode() 2012-03-10 17:07:28 -05:00
internal.h vfs: protect remounting superblock read-only 2012-01-06 23:20:12 -05:00
ioctl.c vfs: fix up ENOIOCTLCMD error handling 2012-01-05 15:40:12 -08:00
ioprio.c block: strip out locking optimization in put_io_context() 2012-02-07 07:51:30 +01:00
Kconfig vfs: use 'unsigned long' accesses for dcache name comparison and hashing 2012-03-08 18:08:44 -08:00
Kconfig.binfmt fs: binfmt_elf: create Kconfig variable for PIE randomization 2012-01-10 16:30:51 -08:00
libfs.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
locks.c vfs: fix handling of lock allocation failure in lease-break case 2011-12-26 10:25:26 -08:00
Makefile Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z 2012-01-06 23:15:54 -05:00
mbcache.c
mount.h vfs: keep list of mounts for each superblock 2012-01-06 23:20:12 -05:00
mpage.c fs: remove unneeded plug in mpage_readpages() 2012-01-12 09:19:54 +01:00
namei.c fs/namei.c: fix warnings on 32-bit 2012-03-21 17:54:54 -07:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2012-01-08 13:21:22 -08:00
no-block.c
open.c switch security_path_chmod() to struct path * 2012-01-06 23:16:53 -05:00
pipe.c fs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:21 +08:00
pnode.c vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
pnode.h vfs: switch pnode.h macros to struct mount * 2012-01-03 22:57:11 -05:00
posix_acl.c vfs: pass all mask flags check_acl and posix_acl_permission 2011-10-28 14:58:54 +02:00
proc_namespace.c vfs: switch ->show_options() to struct dentry * 2012-01-06 23:19:54 -05:00
read_write.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
read_write.h
readdir.c
select.c sys_poll: fix incorrect type for 'timeout' parameter 2012-02-21 17:24:20 -08:00
seq_file.c seq_file: fix mishandling of consecutive pread() invocations. 2012-03-21 17:54:54 -07:00
signalfd.c epoll: ep_unregister_pollwait() can use the freed pwq->whead 2012-02-24 11:42:50 -08:00
splice.c fs: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:21 +08:00
stack.c filesystems: add set_nlink() 2011-11-02 12:53:43 +01:00
stat.c readlinkat: ensure we return ENOENT for the empty pathname for normal lookups 2011-11-02 12:53:42 +01:00
statfs.c vfs: new helper - vfs_ustat() 2012-01-03 22:53:07 -05:00
super.c vfs: Provide function to get superblock and wait for it to thaw 2012-02-13 20:45:38 -05:00
sync.c fs: move code out of buffer.c 2012-01-03 22:54:07 -05:00
timerfd.c
utimes.c
xattr_acl.c
xattr.c vfs: mnt_drop_write_file() 2012-01-03 22:52:40 -05:00