linux

Author	SHA1	Message	Date
Sunil Mushran	9af0b38ff3	ocfs2/net: Use wait_event() in o2net_send_message_vec() Replace wait_event_interruptible() with wait_event() in o2net_send_message_vec(). This is because this function is called by the dlm that expects signals to be blocked. Fixes oss bugzilla#1126 http://oss.oracle.com/bugzilla/show_bug.cgi?id=1126 Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-06-15 14:50:14 -07:00
Tao Ma	6b791bcc8b	ocfs2: Adjust rightmost path in ocfs2_add_branch. In ocfs2_add_branch, we use the rightmost rec of the leaf extent block to generate the e_cpos for the newly added branch. In most cases this is OK, but if the parent extent block's rightmost rec covers more clusters than the leaf does, it will cause a kernel panic if we insert some clusters in it. The message is something like: (7445,1):ocfs2_insert_at_leaf:3775 ERROR: bug expression: le16_to_cpu(el->l_next_free_rec) >= le16_to_cpu(el->l_count) (7445,1):ocfs2_insert_at_leaf:3775 ERROR: inode 66053, depth 0, count 28, next free 28, rec.cpos 270, rec.clusters 1, insert.cpos 275, insert.clusters 1 [<fa7ad565>] ? ocfs2_do_insert_extent+0xb58/0xda0 [ocfs2] [<fa7b08f2>] ? ocfs2_insert_extent+0x5bd/0x6ba [ocfs2] [<fa7b1b8b>] ? ocfs2_add_clusters_in_btree+0x37f/0x564 [ocfs2] ... The panic can be easily reproduced by the following small test case (with bs=512, cs=4K; all error handling is removed so that it is clear enough to read). int main(int argc, char **argv) { int fd, i; char buf[5] = "test"; fd = open(argv[1], O_RDWR|O_CREAT); for (i = 0; i < 30; i++) { lseek(fd, 40960 * i, SEEK_SET); write(fd, buf, 5); } ftruncate(fd, 1146880); lseek(fd, 1126400, SEEK_SET); write(fd, buf, 5); close(fd); return 0; } The reason for the panic is that the 30 writes and the ftruncate make the file's extent list look like: Tree Depth: 1 Count: 19 Next Free Rec: 1 ## Offset Clusters Block# 0 0 280 86183 SubAlloc Bit: 7 SubAlloc Slot: 0 Blknum: 86183 Next Leaf: 0 CRC32: 00000000 ECC: 0000 Tree Depth: 0 Count: 28 Next Free Rec: 28 ## Offset Clusters Block# Flags 0 0 1 143368 0x0 1 10 1 143376 0x0 ... 26 260 1 143576 0x0 27 270 1 143584 0x0 Now another write at 1126400 (cluster 275), which falls in the gap between 271 and 280, will trigger ocfs2_add_branch, but the result after the function looks like: Tree Depth: 1 Count: 19 Next Free Rec: 2 ## Offset Clusters Block# 0 0 280 86183 1 271 0 143592 So the extent records intersect, which makes the following operation bug out. This patch just tries to remove the gap before we add the new branch, so that the root (branch) rightmost rec will cover the same right position. So in the above case, before adding the branch the tree will be changed to Tree Depth: 1 Count: 19 Next Free Rec: 1 ## Offset Clusters Block# 0 0 271 86183 SubAlloc Bit: 7 SubAlloc Slot: 0 Blknum: 86183 Next Leaf: 0 CRC32: 00000000 ECC: 0000 Tree Depth: 0 Count: 28 Next Free Rec: 28 ## Offset Clusters Block# Flags 0 0 1 143368 0x0 1 10 1 143376 0x0 ... 26 260 1 143576 0x0 27 270 1 143584 0x0 And after the branch is added, the tree looks like Tree Depth: 1 Count: 19 Next Free Rec: 2 ## Offset Clusters Block# 0 0 271 86183 1 271 0 143592 Signed-off-by: Tao Ma <tao.ma@oracle.com> Acked-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-06-15 14:49:43 -07:00
Linus Torvalds	9c7cb99a82	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: (22 commits) nilfs2: support contiguous lookup of blocks nilfs2: add sync_page method to page caches of meta data nilfs2: use device's backing_dev_info for btree node caches nilfs2: return EBUSY against delete request on snapshot nilfs2: modify list of unsupported features in caveats nilfs2: enable sync_page method nilfs2: set bio unplug flag for the last bio in segment nilfs2: allow future expansion of metadata read out via get info ioctl NILFS2: Pagecache usage optimization on NILFS2 nilfs2: remove nilfs_btree_operations from btree mapping nilfs2: remove nilfs_direct_operations from direct mapping nilfs2: remove bmap pointer operations nilfs2: remove useless b_low and b_high fields from nilfs_bmap struct nilfs2: remove pointless NULL check of bpop_commit_alloc_ptr function nilfs2: move get block functions in bmap.c into btree codes nilfs2: remove nilfs_bmap_delete_block nilfs2: remove nilfs_bmap_put_block nilfs2: remove header file for segment list operations nilfs2: eliminate removal list of segments nilfs2: add sufile function that can modify multiple segment usages ...	2009-06-15 09:13:49 -07:00
Alexey Zaytsev	df59c0ad05	[SCSI] compat: don't perform unneeded copy in sg_io code The members from 'status' in struct sg_io_hdr to the last are used to transfer information from kernel to user space. The values that user space sets are just ignored. Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-06-15 10:09:30 -05:00
Mike Frysinger	0a8eba9b7f	ramfs: ignore unknown mount options On systems where CONFIG_SHMEM is disabled, mounting tmpfs filesystems can fail when tmpfs options are used. This is because tmpfs creates a small wrapper around ramfs which rejects unknown options, and ramfs itself only supports a tiny subset of what tmpfs supports. This makes it pretty hard to use the same userspace systems across different configuration systems. As such, ramfs should ignore the tmpfs options when tmpfs is merely a wrapper around ramfs. This used to work before commit `c3b1b1cbf0` as previously, ramfs would ignore all options. But now, we get: ramfs: bad mount option: size=10M mount: mounting mdev on /dev failed: Invalid argument Another option might be to restore the previous behavior, where ramfs simply ignored all unknown mount options ... which is what Hugh prefers. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Acked-by: Matt Mackall <mpm@selenic.com> Acked-by: Wu Fengguang <fengguang.wu@intel.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-14 17:58:25 -07:00
Linus Torvalds	489f7ab6c1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (31 commits) trivial: remove the trivial patch monkey's name from SubmittingPatches trivial: Fix a typo in comment of addrconf_dad_start() trivial: usb: fix missing space typo in doc trivial: pci hotplug: adding __init/__exit macros to sgi_hotplug trivial: Remove the hyphen from git commands trivial: fix ETIMEOUT -> ETIMEDOUT typos trivial: Kconfig: .ko is normally not included in module names trivial: SubmittingPatches: fix typo trivial: Documentation/dell_rbu.txt: fix typos trivial: Fix Pavel's address in MAINTAINERS trivial: ftrace:fix description of trace directory trivial: unnecessary (void*) cast removal in sound/oss/msnd.c trivial: input/misc: Fix typo in Kconfig trivial: fix grammo in bus_for_each_dev() kerneldoc trivial: rbtree.txt: fix rb_entry() parameters in sample code trivial: spelling fix in ppc code comments trivial: fix typo in bio_alloc kernel doc trivial: Documentation/rbtree.txt: cleanup kerneldoc of rbtree.txt trivial: Miscellaneous documentation typo fixes trivial: fix typo milisecond/millisecond for documentation and source comments. ...	2009-06-14 13:46:25 -07:00
Felix Blyakher	fd40261354	Merge branch 'master' of git://oss.sgi.com/xfs/xfs into for-linus	2009-06-12 21:28:59 -05:00
Christoph Hellwig	e83f1eb6bf	xfs: fix small mismerge in xfs_vn_mknod Indentation got messed up when merging the current_umask changes with the generic ACL support. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-06-12 21:15:31 -05:00
Christoph Hellwig	493b87e5ed	xfs: fix warnings with CONFIG_XFS_QUOTA disabled Fix warnings about uninitialized dquot variables by making sure xfs_qm_vop_dqalloc touches them even when quotas are disabled. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-06-12 21:15:12 -05:00
Linus Torvalds	f3ad116588	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs: configfs: Rework configfs_depend_item() locking and make lockdep happy configfs: Silence lockdep on mkdir() and rmdir()	2009-06-12 18:21:19 -07:00
Linus Torvalds	c53567ad45	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: use more NOFS allocation dlm: connect to nodes earlier dlm: fix use count with multiple joins dlm: Make name input parameter of {,dlm_}new_lockspace() const	2009-06-12 13:17:12 -07:00
Linus Torvalds	c9b8af00ff	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (154 commits) [SCSI] osd: Remove out-of-tree left overs [SCSI] libosd: Use REQ_QUIET requests. [SCSI] osduld: use filp_open() when looking up an osd-device [SCSI] libosd: Define an osd_dev wrapper to retrieve the request_queue [SCSI] libosd: osd_req_{read,write} takes a length parameter [SCSI] libosd: Let _osd_req_finalize_data_integrity receive number of out_bytes [SCSI] libosd: osd_req_{read,write}_kern new API [SCSI] libosd: Better printout of OSD target system information [SCSI] libosd: OSD2r05: Attribute definitions [SCSI] libosd: OSD2r05: Additional command enums [SCSI] mpt fusion: fix up doc book comments [SCSI] mpt fusion: Added support for Broadcast primitives Event handling [SCSI] mpt fusion: Queue full event handling [SCSI] mpt fusion: RAID device handling and Dual port Raid support is added [SCSI] mpt fusion: Put IOC into ready state if it not already in ready state [SCSI] mpt fusion: Code Cleanup patch [SCSI] mpt fusion: Rescan SAS topology added [SCSI] mpt fusion: SAS topology scan changes, expander events [SCSI] mpt fusion: Firmware event implementation using seperate WorkQueue [SCSI] mpt fusion: rewrite of ioctl_cmds internal generated function ...	2009-06-12 09:50:42 -07:00
Linus Torvalds	6cb8a91174	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: GFS2: Remove lock_kernel from gfs2_put_super() GFS2: Add tracepoints	2009-06-12 09:43:44 -07:00
Linus Torvalds	7f3591cfac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest: (31 commits) lguest: add support for indirect ring entries lguest: suppress notifications in example Launcher lguest: try to batch interrupts on network receive lguest: avoid sending interrupts to Guest when no activity occurs. lguest: implement deferred interrupts in example Launcher lguest: remove obsolete LHREQ_BREAK call lguest: have example Launcher service all devices in separate threads lguest: use eventfds for device notification eventfd: export eventfd_signal and eventfd_fget for lguest lguest: allow any process to send interrupts lguest: PAE fixes lguest: PAE support lguest: Add support for kvm_hypercall4() lguest: replace hypercall name LHCALL_SET_PMD with LHCALL_SET_PGD lguest: use native_set_* macros, which properly handle 64-bit entries when PAE is activated lguest: map switcher with executable page table entries lguest: fix writev returning short on console output lguest: clean up length-used value in example launcher lguest: Segment selectors are 16-bit long. Fix lg_cpu.ss1 definition. lguest: beyond ARRAY_SIZE of cpu->arch.gdt ...	2009-06-12 09:32:26 -07:00
Linus Torvalds	c34752bc8b	Merge branch 'cuse' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'cuse' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: CUSE: implement CUSE - Character device in Userspace fuse: export symbols to be used by CUSE fuse: update fuse_conn_init() and separate out fuse_conn_kill() fuse: don't use inode in fuse_file_poll fuse: don't use inode in fuse_do_ioctl() helper fuse: don't use inode in fuse_sync_release() fuse: create fuse_do_open() helper for CUSE fuse: clean up args in fuse_finish_open() and fuse_release_fill() fuse: don't use inode in helpers called by fuse_direct_io() fuse: add members to struct fuse_file fuse: prepare fuse_direct_io() for CUSE fuse: clean up fuse_write_fill() fuse: use struct path in release structure fuse: misc cleanups	2009-06-12 09:31:20 -07:00
Linus Torvalds	d614aec475	Merge branch 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 * 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (29 commits) ide: re-implement ide_pci_init_one() on top of ide_pci_init_two() ide: unexport ide_find_dma_mode() ide: fix PowerMac bootup oops ide: skip probe if there are no devices on the port (v2) sl82c105: add printk() logging facility ide-tape: fix proc warning ide: add IDE_DFLAG_NIEN_QUIRK device flag ide: respect quirk_drives[] list on all controllers hpt366: enable all quirks for devices on quirk_drives[] list hpt366: sync quirk_drives[] list with pdc202xx_{new,old}.c ide: remove superfluous SELECT_MASK() call from do_rw_taskfile() ide: remove superfluous SELECT_MASK() call from ide_driveid_update() icside: remove superfluous ->maskproc method ide-tape: fix IDE_AFLAG_* atomic accesses ide-tape: change IDE_AFLAG_IGNORE_DSC non-atomically pdc202xx_old: kill resetproc() method pdc202xx_old: don't call pdc202xx_reset() on IRQ timeout pdc202xx_old: use ide_dma_test_irq() ide: preserve Host Protected Area by default (v2) ide-gd: implement block device ->set_capacity method (v2) ...	2009-06-12 09:29:42 -07:00
Nikanth Karthikesan	76d93ff344	trivial: fix typo in bio_alloc kernel doc Fix typo in bio_alloc kernel doc. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:47 +02:00
Thadeu Lima de Souza Cascardo	ab2274af05	trivial: fix typo compatiable/compatiability has extra 'a'. Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:46 +02:00
Wolfram Sang	2eadfc0ed6	trivial: fs/inode: Fix typo in file_update_time nanodoc The advertised flag for not updating the time was wrong. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:45 +02:00
Nikanth Karthikesan	ff677f8d10	trivial: fix comment typo in fs/compat.c Fix a typo in fs/compat.c Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:44 +02:00
Ali Gholami Rudi	88164ff4fc	trivial: ext2: fix a typo in comment in ext2.h Signed-off-by: Ali Gholami Rudi <ali@rudi.ir> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:44 +02:00
Felix Blyakher	7747a0b0af	xfs: fix freeing memory in xfs_getbmap() Regression from commit `28e211700a`. Need to free temporary buffer allocated in xfs_getbmap(). Signed-off-by: Felix Blyakher <felixb@sgi.com> Signed-off-by: Hedi Berriche <hedi@sgi.com> Reported-by: Justin Piszcz <jpiszcz@lucidpixels.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de>	2009-06-12 10:26:52 -05:00
James Bottomley	82681a318f	[SCSI] Merge branch 'linus' Conflicts: drivers/message/fusion/mptsas.c fixed up conflict between req->data_len accessors and mptsas driver updates. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2009-06-12 10:02:03 -05:00
Rusty Russell	5718607bb6	eventfd: export eventfd_signal and eventfd_fget for lguest lguest wants to attach eventfds to guest notifications, and lguest is usually a module. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> To: Davide Libenzi <davidel@xmailserver.org>	2009-06-12 22:27:09 +09:30
Steven Whitehouse	3ea400581f	GFS2: Remove lock_kernel from gfs2_put_super() It is not required here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>	2009-06-12 13:40:47 +01:00
Steven Whitehouse	63997775b7	GFS2: Add tracepoints This patch adds the ability to trace various aspects of the GFS2 filesystem. The trace points are divided into three groups, glocks, logging and bmap. These points have been chosen because they allow inspection of the major internal functions of GFS2 and they are also generic enough that they are unlikely to need any major changes as the filesystem evolves. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-06-12 08:49:20 +01:00
Ryusuke Konishi	aa7dfb8954	nilfs2: get rid of bd_mount_sem use from nilfs This will remove every bd_mount_sem use in nilfs. The intended exclusion control was replaced by the previous patch ("nilfs2: correct exclusion control in nilfs_remount function") for nilfs_remount(), and this patch replaces the remaining uses with a new mutex that it adds to the nilfs object. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:18 -04:00
Ryusuke Konishi	e59399d010	nilfs2: correct exclusion control in nilfs_remount function nilfs_remount() changes mount state of a superblock instance. Even though nilfs accesses other superblock instances during mount or remount, the mount state was not properly protected in nilfs_remount(). Moreover, nilfs_remount() has a lock order reversal problem; nilfs_get_sb() holds: 1. bdev->bd_mount_sem 2. sb->s_umount (sget acquires) and nilfs_remount() holds: 1. sb->s_umount (locked by the caller in vfs) 2. bdev->bd_mount_sem To avoid these problems, this patch divides a semaphore protecting super block instances from nilfs->ns_sem, and applies it to the mount state protection in nilfs_remount(). With this change, bd_mount_sem use is removed from nilfs_remount() and the lock order reversal will be resolved. And the new rw-semaphore, nilfs->ns_super_sem will properly protect the mount state except the modification from nilfs_error function. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:18 -04:00
Ryusuke Konishi	6dd4740662	nilfs2: simplify remaining sget() use This simplifies the test function passed on the remaining sget() callsite in nilfs. Instead of checking mount type (i.e. ro-mount/rw-mount/snapshot mount) in the test function passed to sget(), this patch first looks up the nilfs_sb_info struct which the given mount type matches, and then acquires the super block instance holding the nilfs_sb_info. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:18 -04:00
Ryusuke Konishi	3f82ff5516	nilfs2: get rid of sget use for checking if current mount is present This stops using sget() for checking if an r/w-mount or an r/o-mount exists on the device. This elimination uses a back pointer to the current mount added to nilfs object. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:17 -04:00
Ryusuke Konishi	33c8e57c86	nilfs2: get rid of sget use for acquiring nilfs object This will change the way to obtain the nilfs object in the nilfs_get_sb() function. Previously, a preliminary sget() call was performed, and the nilfs object was acquired from a super block instance found by the sget() call. This patch, instead, introduces a new dedicated function find_or_create_nilfs(); as the name implies, the function finds an existing nilfs object from a global list or creates a new one if no object is found on the device. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:17 -04:00
Ryusuke Konishi	81fc20bd0e	nilfs2: remove meaningless EBUSY case from nilfs_get_sb function The following EBUSY case in nilfs_get_sb() is meaningless. Indeed, this error code is never returned to the caller. if (!s->s_root) { ... } else if (!(s->s_flags & MS_RDONLY)) { err = -EBUSY; } This simply removes the else case. Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:17 -04:00
Christoph Hellwig	0c95ee190e	remove the call to ->write_super in __sync_filesystem Now that all filesystems provide ->sync_fs methods we can change __sync_filesystem to only call ->sync_fs. This gives us a clear separation between periodic writeouts which are driven by ->write_super and data integrity syncs that go through ->sync_fs. (modulo file_fsync which is also going away) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:17 -04:00
Christoph Hellwig	d731e06323	nilfs2: call nilfs2_write_super from nilfs2_sync_fs The call to ->write_super from __sync_filesystem will go away, so make sure nilfs2 performs the same actions from inside ->sync_fs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:17 -04:00
Christoph Hellwig	d579ed00aa	jffs2: call jffs2_write_super from jffs2_sync_fs The call to ->write_super from __sync_filesystem will go away, so make sure jffs2 performs the same actions from inside ->sync_fs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:16 -04:00
Christoph Hellwig	8c8006564a	ufs: add ->sync_fs Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super on top of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:16 -04:00
Christoph Hellwig	ad43ffdeea	sysv: add ->sync_fs Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super on top of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:16 -04:00
Christoph Hellwig	7fbc6df0e7	hfsplus: add ->sync_fs Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super on top of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:16 -04:00
Christoph Hellwig	58bc5bbb87	hfs: add ->sync_fs Add a ->sync_fs method for data integrity syncs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:15 -04:00
Christoph Hellwig	f83d6d46e7	fat: add ->sync_fs Add a ->sync_fs method for data integrity syncs. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:15 -04:00
Christoph Hellwig	40f31dd47e	ext2: add ->sync_fs Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super on top of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:15 -04:00
Christoph Hellwig	80e09fb942	exofs: add ->sync_fs Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super on top of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:15 -04:00
Christoph Hellwig	561e47ce72	bfs: add ->sync_fs Add a ->sync_fs method for data integrity syncs, and reimplement ->write_super on top of it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:14 -04:00
Christoph Hellwig	e28964365f	affs: add ->sync_fs Add a ->sync_fs method for data integrity syncs. Factor out common code between affs_put_super, affs_write_super and the new affs_sync_fs into a helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:14 -04:00
Al Viro	c475879556	sanitize ->fsync() for affs unfortunately, for affs (especially for affs directories) we have no real way to keep track of metadata ownership. So we have to do more or less what file_fsync() does, but we do not need to call write_super() there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:14 -04:00
Al Viro	4427f0c36e	repair bfs_write_inode(), switch bfs to simple_fsync() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:14 -04:00
Al Viro	224c886643	Fix adfs GET_FRAG_ID() on big-endian Missing conversion to host-endian before doing shifts Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:14 -04:00
Al Viro	ffdc9064f8	repair adfs ->write_inode(), switch to simple_fsync() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:13 -04:00
Al Viro	bea6b64c27	switch omfs to simple_fsync() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:13 -04:00
Al Viro	90de066443	switch udf to simple_fsync() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:13 -04:00
Al Viro	a932801543	switch ufs to simple_fsync() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:13 -04:00
Al Viro	05459ca81a	repair sysv_write_inode(), switch sysv to simple_fsync() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:12 -04:00
Al Viro	0d7916d7e9	switch minix to simple_fsync() * get minix_write_inode() to honour the second argument * now we can use simple_fsync() for minixfs Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:12 -04:00
Al Viro	e1740a462e	switch ext2 to simple_fsync() kill ext2_sync_file() (along with ext2/fsync.c), get rid of ext2_update_inode() - it's an alias of ext2_write_inode(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:12 -04:00
Al Viro	b522412aea	Sanitize ->fsync() for FAT * mark directory data blocks as assoc. metadata * add new inode to deal with FAT, mark FAT blocks as assoc. metadata of that * now ->fsync() is trivial both for files and directories Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:12 -04:00
Al Viro	964f536966	fs/qnx4: sanitize includes fs-internal parts of qnx4_fs.h taken to fs/qnx4/qnx4.h, includes adjusted, qnx4_fs.h doesn't need unifdef anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:12 -04:00
Al Viro	79d2576758	Sanitize qnx4 fsync handling * have directory operations use mark_buffer_dirty_inode(), so that sync_mapping_buffers() would get those. * make qnx4_write_inode() honour its last argument. * get rid of insane copies of very ancient "walk the indirect blocks" in qnx4/fsync - they never matched the actual fs layout and, fortunately, never'd been called. Again, all this junk is not needed; ->fsync() should just do sync_mapping_buffers + sync_inode (and if we implement block allocation for qnx4, we'll need to use mark_buffer_dirty_inode() for extent blocks) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:11 -04:00
Al Viro	d5aacad548	New helper - simple_fsync() writes associated buffers, then does sync_inode() to write the inode itself (and to make it clean). Depends on ->write_inode() honouring the second argument. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:11 -04:00
Alessio Igor Bogani	337eb00a2c	Push BKL down into ->remount_fs() [xfs, btrfs, capifs, shmem don't need BKL, exempt] Signed-off-by: Alessio Igor Bogani <abogani@texware.it> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:11 -04:00
Nick Piggin	4195f73d13	fs: block_dump missing dentry locking I think the block_dump output in __mark_inode_dirty is missing dentry locking. Surely the i_dentry list can change any time, so we may not even get a dentry there. If we do get one by chance, then it would appear to be able to go away or get renamed at any time... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:10 -04:00
Nick Piggin	545b9fd3d7	fs: remove incorrect I_NEW warnings Some filesystems can call in to sync an inode that is still in the I_NEW state (eg. ext family, when mounted with -osync). This is OK because the filesystem has sole access to the new inode, so it can modify i_state without races (because no other thread should be modifying it, by definition of I_NEW). Ie. a false positive, so remove the warnings. The races are described here `7ef0d7377c`, which is also where the warnings were introduced. Reported-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:10 -04:00
Christoph Hellwig	f95022161d	xfs: remove ->write_super and stop maintaining ->s_dirt The write_super method is (1) used for writing back the superblock periodically from pdflush and (2) called just before ->sync_fs for data integrity syncs. We don't need (1) because we have our own periodic writeout through xfssyncd, and we don't need (2) because xfs_fs_sync_fs performs a proper synchronous superblock writeout after all other data and metadata has been written out. Also remove ->s_dirt tracking as it's only used to decide when to call ->write_super. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:10 -04:00
Jens Axboe	13205fb926	ntfs: remove old debug check for dirty data in ntfs_put_super() This should not trigger anymore, so kill it. Acked-by: Anton Altaparmakov <aia21@cam.ac.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:10 -04:00
Theodore Ts'o	9fd5746fd3	fs: Remove i_cindex from struct inode The only user of the i_cindex element in the inode structure is the firewire drivers. This is part of an attempt to slim down the inode structure to save memory: since a typical Linux system will have hundreds of thousands if not millions of inodes cached, a reduction in the size of the inode has high leverage. The firewire driver does not need i_cindex in any fast path, so it's simple enough to calculate when it is needed, instead of wasting space in the inode structure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: krh@redhat.com Cc: stefanr@s5r6.in-berlin.de Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:09 -04:00
Christoph Hellwig	ebc1ac1645	->write_super lock_super pushdown Push down lock_super into ->write_super instances and remove it from the caller. The following filesystems don't need ->s_lock in ->write_super and are skipped: * bfs, nilfs2 - no other uses of s_lock and have internal locks in ->write_super * ext2 - uses BKL in ext2_write_super and has internal calls without s_lock * reiserfs - no other uses of s_lock as has reiserfs_write_lock (BKL) in ->write_super * xfs - no other uses of s_lock and uses internal lock (buffer lock on superblock buffer) to serialize ->write_super. Also xfs_fs_write_super is superfluous and will go away in the next merge window Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:09 -04:00
Christoph Hellwig	01ba687577	jffs2: move jffs2_write_super to super.c jffs2_write_super is only called from super.c and doesn't use any functionality from fs.c. So move it over to super.c and make it static there. [should go in through the vfs tree as it is a requirement for the next patch] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:09 -04:00
Al Viro	4aa98cf768	Push BKL down into do_remount_sb() [folded fix from Jiri Slaby] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:08 -04:00
Al Viro	7f78d4cd4c	Push BKL down beyond VFS-only parts of do_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:08 -04:00
Al Viro	6fac98dd21	Push BKL into do_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:08 -04:00
Al Viro	bbd6851a32	Push lock_super() into the ->remount_fs() of filesystems that care about it Note that since we can't run into contention between remount_fs and write_super (due to exclusion on s_umount), we have to care only about filesystems that touch lock_super() on their own. Out of those ext3, ext4, hpfs, sysv and ufs do need it; fat doesn't since its ->remount_fs() only accesses assign-once data (basically, it's "we have no atime on directories and only have atime on files for vfat; force nodiratime and possibly noatime into *flags"). [folded a build fix from hch] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:08 -04:00
Christoph Hellwig	6cfd014842	push BKL down into ->put_super Move BKL into ->put_super from the only caller. A couple of filesystems had trivial enough ->put_super (only kfree and NULLing of s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs, hugetlbfs, omfs, qnx4, shmem, all others got the full treatment. Most of them probably don't need it, but I'd rather sort that out individually. Preferably after all the other BKL pushdowns in that area. [AV: original used to move lock_super() down as well; these changes are removed since we don't do lock_super() at all in generic_shutdown_super() now] [AV: fuse, btrfs and xfs are known to need no damn BKL, exempt] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:07 -04:00
Al Viro	a9e220f832	No need to do lock_super() for exclusion in generic_shutdown_super() We can't run into contention on it. All other callers of lock_super() either hold s_umount (and we have it exclusive) or hold an active reference to superblock in question, which prevents the call of generic_shutdown_super() while the reference is held. So we can replace lock_super(s) with get_fs_excl() in generic_shutdown_super() (and corresponding change for unlock_super(), of course). Since ext4 expects s_lock held for its put_super, take lock_super() into it. The rest of filesystems do not care at all. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:07 -04:00
Al Viro	62c6943b4b	Trim a bit of crap from fs.h do_remount_sb() is fs/internal.h fodder, fsync_no_super() is long gone. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:07 -04:00
Al Viro	443b94baaa	Make sure that all callers of remount hold s_umount exclusive Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:07 -04:00
Christoph Hellwig	5af7926ff3	enforce ->sync_fs is only called for rw superblock Make sure a superblock really is writeable by checking MS_RDONLY under s_umount. sync_filesystems needed some re-arrangement for that, but all but one sync_filesystem caller had the correct locking already so that we could add that check there. cachefiles grew s_umount locking. I've also added a WARN_ON to sync_filesystem to assert this for future callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:06 -04:00
Christoph Hellwig	e500475338	cleanup sync_supers Merge the write_super helper into sync_super and move the check for ->write_super earlier so that we can avoid grabbing a reference to a superblock that doesn't have it. While we're at it also add a little comment documenting sync_supers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:06 -04:00
Alexey Dobriyan	f3da392e9f	dcache: extrace and use d_unlinked() d_unlinked() will be used in middle-term to ban checkpointing when opened but unlinked file is detected, and in long term, to detect such situation and special case on it. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:06 -04:00
Christoph Hellwig	8c85e12512	remove ->write_super call in generic_shutdown_super We just did a full fs writeout using sync_filesystem before, and if that's not enough for the filesystem it can perform its own writeout in ->put_super, which many filesystems already do. Move a call to foofs_write_super into every foofs_put_super for now to guarantee identical behaviour until it's cleaned up by the individual filesystem maintainers. Exceptions: - affs already has identical copy & pasted code at the beginning of affs_put_super so no need to do it twice. - xfs does the right thing without it and I have changes pending for the xfs tree touching this area so I don't really need conflicts here. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:06 -04:00
Christoph Hellwig	517bfae283	qnx4: remove ->write_super Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:05 -04:00
Christoph Hellwig	94cb993f2e	ocfs2: remove ->write_super and stop maintaining ->s_dirt Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:05 -04:00
Christoph Hellwig	b7d245de25	gfs2: remove ->write_super and stop maintaining ->s_dirt Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:05 -04:00
Christoph Hellwig	ca41f7b918	ext3: remove ->write_super and stop maintaining ->s_dirt Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:05 -04:00
Christoph Hellwig	59d697b702	btrfs: remove ->write_super and stop maintaining ->s_dirt Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:05 -04:00
Jan Kara	c3f8a40c1c	quota: Introduce writeout_quota_sb() (version 4) Introduce this function which just writes all the quota structures but avoids all the syncing and cache pruning work to expose quota structures to userspace. Use this function from __sync_filesystem when wait == 0. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:04 -04:00
Christoph Hellwig	850b201b08	quota: cleanup dquota sync functions (version 4) Currently the VFS calls vfs_dq_sync to sync out disk quotas for a given superblock. This is a small wrapper around sync_dquots which for the case of a non-NULL superblock is a small wrapper around quota_sync_sb. Just make quota_sync_sb global (rename it to sync_quota_sb) and call it directly. Also call it directly for those cases in quota.c that have a superblock and leave sync_dquots purely an iterator over sync_quota_sb and remove its superblock argument. To make this nicer move the check for the lack of a quota_sync method from the callers into sync_quota_sb. [folded build fix from Alexander Beregalov <a.beregalov@gmail.com>] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:04 -04:00
Jan Kara	60b0680fa2	vfs: Rename fsync_super() to sync_filesystem() (version 4) Rename the function so that it better describe what it really does. Also remove the unnecessary include of buffer_head.h. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:04 -04:00
Jan Kara	c15c54f5f0	vfs: Move syncing code from super.c to sync.c (version 4) Move sync_filesystems(), __fsync_super(), fsync_super() from super.c to sync.c where it fits better. [build fixes folded] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:04 -04:00
Jan Kara	5cee5815d1	vfs: Make sys_sync() use fsync_super() (version 4) It is unnecessarily fragile to have two places (fsync_super() and do_sync()) doing data integrity sync of the filesystem. Alter __fsync_super() to accommodate needs of both callers and use it. So after this patch __fsync_super() is the only place where we gather all the calls needed to properly send all data on a filesystem to disk. Nice bonus is that we get a complete livelock avoidance and write_supers() is now only used for periodic writeback of superblocks. sync_blockdevs() introduced a couple of patches ago is gone now. [build fixes folded] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:03 -04:00
Jan Kara	429479f031	vfs: Make __fsync_super() a static function (version 4) __fsync_super() does the same thing as fsync_super(). So change the only caller to use fsync_super() and make __fsync_super() static. This removes unnecessarily duplicated call to sync_blockdev() and prepares ground for the changes to __fsync_super() in the following patches. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:03 -04:00
Jan Kara	bfe881255c	vfs: Call ->sync_fs() even if s_dirt is 0 (version 4) sync_filesystems() has a condition that if wait == 0 and s_dirt == 0, then ->sync_fs() isn't called. This does not really make much sense since s_dirt is generally used by a filesystem to mean that ->write_super() needs to be called. But ->sync_fs() does different things. I even suspect that some filesystems (btrfs?) set s_dirt just to fool this logic. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:03 -04:00
Jan Kara	5a3e5cb8e0	vfs: Fix sys_sync() and fsync_super() reliability (version 4) So far, do_sync() called: sync_inodes(0); sync_supers(); sync_filesystems(0); sync_filesystems(1); sync_inodes(1); This ordering makes it kind of hard for filesystems as sync_inodes(0) need not submit all the IO (for example it skips inodes with I_SYNC set) so e.g. forcing transaction to disk in ->sync_fs() is not really enough. Therefore sys_sync has not been completely reliable on some filesystems (ext3, ext4, reiserfs, ocfs2 and others are hit by this) when racing e.g. with background writeback. A similar problem hits also other filesystems (e.g. ext2) because of write_supers() being called before the sync_inodes(1). Change the ordering of calls in do_sync() - this requires a new function sync_blockdevs() to preserve the property that block devices are always synced after write_super() / sync_fs() call. The same issue is fixed in __fsync_super() function used on umount / remount read-only. [AV: build fixes] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:03 -04:00
Christoph Hellwig	876a9f76ab	remove s_async_list Remove the unused s_async_list in the superblock, a leftover of the broken async inode deletion code that leaked into mainline. Having this in the middle of the sync/unmount path is not helpful for the following cleanups. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:02 -04:00
npiggin@suse.de	864d7c4c06	fs: move mark_files_ro into file_table.c This function walks the s_files lock, and operates primarily on the files in a superblock, so it better belongs here (eg. see also fs_may_remount_ro). [AV: ... and it shouldn't be static after that move] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:02 -04:00
npiggin@suse.de	96029c4e09	fs: introduce mnt_clone_write This patch speeds up lmbench lat_mmap test by about another 2% after the first patch. Before: avg = 462.286 std = 5.46106 After: avg = 453.12 std = 9.58257 (50 runs of each, stddev gives a reasonable confidence) It does this by introducing mnt_clone_write, which avoids some heavyweight operations of mnt_want_write if called on a vfsmount which we know already has a write count; and mnt_want_write_file, which can call mnt_clone_write if the file is open for write. After these two patches, mnt_want_write and mnt_drop_write go from 7% on the profile down to 1.3% (including mnt_clone_write). [AV: mnt_want_write_file() should take file alone and derive mnt from it; not only all callers have that form, but that's the only mnt about which we know that it's already held for write if file is opened for write] Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:02 -04:00
npiggin@suse.de	d3ef3d7351	fs: mnt_want_write speedup This patch speeds up lmbench lat_mmap test by about 8%. lat_mmap is set up basically to mmap a 64MB file on tmpfs, fault in its pages, then unmap it. A microbenchmark yes, but it exercises some important paths in the mm. Before: avg = 501.9 std = 14.7773 After: avg = 462.286 std = 5.46106 (50 runs of each, stddev gives a reasonable confidence, but there is quite a bit of variation there still) It does this by removing the complex per-cpu locking and counter-cache and replaces it with a percpu counter in struct vfsmount. This makes the code much simpler, and avoids spinlocks (although the msync is still pretty costly, unfortunately). It results in about 900 bytes smaller code too. It does increase the size of a vfsmount, however. It should also give a speedup on large systems if CPUs are frequently operating on different mounts (because the existing scheme has to operate on an atomic in the struct vfsmount when switching between mounts). But I'm most interested in the single threaded path performance for the moment. [AV: minor cleanup] Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:02 -04:00
Al Viro	3174c21b74	Move junk from proc_fs.h to fs/proc/internal.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00
Al Viro	1c755af4df	switch lookup_mnt() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00
Al Viro	79ed022619	switch follow_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00
Al Viro	9393bd07cf	switch follow_down() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00
Al Viro	589ff870ed	Switch collect_mounts() to struct path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-06-11 21:36:01 -04:00

14335 Commits