linux

Author	SHA1	Message	Date
Linus Torvalds	c9059598ea	Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits) block: add request clone interface (v2) floppy: fix hibernation ramdisk: remove long-deprecated "ramdisk=" boot-time parameter fs/bio.c: add missing __user annotation block: prevent possible io_context->refcount overflow Add serial number support for virtio_blk, V4a block: Add missing bounce_pfn stacking and fix comments Revert "block: Fix bounce limit setting in DM" cciss: decode unit attention in SCSI error handling code cciss: Remove no longer needed sendcmd reject processing code cciss: change SCSI error handling routines to work with interrupts enabled. cciss: separate error processing and command retrying code in sendcmd_withirq_core() cciss: factor out fix target status processing code from sendcmd functions cciss: simplify interface of sendcmd() and sendcmd_withirq() cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code cciss: Use schedule_timeout_uninterruptible in SCSI error handling code block: needs to set the residual length of a bidi request Revert "block: implement blkdev_readpages" block: Fix bounce limit setting in DM Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt ... Manually fix conflicts with tracing updates in: block/blk-sysfs.c drivers/ide/ide-atapi.c drivers/ide/ide-cd.c drivers/ide/ide-floppy.c drivers/ide/ide-tape.c include/trace/events/block.h kernel/trace/blktrace.c	2009-06-11 11:10:35 -07:00
Steven Whitehouse	003dec8913	GFS2: Merge gfs2_get_sb into gfs2_get_sb_meta These don't need to be separate functions. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-06-10 10:31:45 +01:00
Steven Whitehouse	40bc9a27e0	GFS2: Fix cache coherency between truncate and O_DIRECT read If a page was partially zeroed as the result of a truncate, then it was not being correctly marked dirty. This resulted in the deleted data reappearing if the file was read back via direct I/O. Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-06-10 09:09:40 +01:00
Steven Whitehouse	f6d03139d7	GFS2: Fix locking issue mounting gfs2meta fs This patch uses sget() to get a reference to the existing gfs2 sb when mouting the gfs2meta filesystem (in fact thats just another mount of the gfs2 filesystem with a different root and this interface is for backward compatibility). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reported-by: Benjamin Marzinski <bmarzins@redhat.com> Tested-by: Benjamin Marzinski <bmarzins@redhat.com> Cc: Christoph Hellwig <hch@infradead.org>	2009-06-05 08:35:15 +01:00
Steven Whitehouse	e09f9446b9	GFS2: Remove unused variable Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-06-03 10:07:44 +01:00
Abhijith Das	a12af1ebe6	GFS2: smbd proccess hangs with flock() call. GFS2 currently does not support mandatory flocks. An flock() call with LOCK_MAND triggers unexpected behavior because gfs2 is not checking for this lock type. This patch corrects that. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-06-02 08:01:12 +01:00
Steven Whitehouse	f6eb53498e	GFS2: Remove args subdir from gfs2 sysfs files Since we can cat /proc/mounts there is no need to have this subdirectory in the gfs2 sysfs files. In fact this does not reflect the full range of possible mount argumenmts, where as /proc/mounts does. There was only one userland user of this set of sysfs files and it will function perfectly well without these files being present (in fact that subcommand of gfs2_tool is obsolete anyway). The tune/* subdirectory is also considered mostly obsolete, but there are a few uses of this until mount arguments can be added for the last few functions for which there are no equivalents currently. However the tune/* directory is still in my sights and new code should avoid using it. Only the gfs2_quota and gfs2_tool programs are know to use tune/* at the moment. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-26 15:50:25 +01:00
Steven Whitehouse	e1b28aab58	GFS2: Remove lockstruct subdir from gfs2 sysfs files The lockstruct sub directory contained two entries, both of which are duplicated elsewhere in the gfs2 sysfs files as well as being available via /proc/mounts. There is no userland program using either of them, so this patch removes them. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-26 15:41:27 +01:00
Martin K. Petersen	e1defc4ff0	block: Do away with the notion of hardsect_size Until now we have had a 1:1 mapping between storage device physical block size and the logical block sized used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512-bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-22 23:22:54 +02:00
Steven Whitehouse	87ec217411	GFS2: Move gfs2_unlink_ok into ops_inode.c Another function which is only called from one ops_inode.c so we move it and make it static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-22 10:54:50 +01:00
Steven Whitehouse	536baf02f6	GFS2: Move gfs2_readlinki into ops_inode.c Move gfs2_readlinki into ops_inode.c and make it static Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-22 10:48:59 +01:00
Steven Whitehouse	2286dbfad1	GFS2: Move gfs2_rmdiri into ops_inode.c Move gfs2_rmdiri() into ops_inode.c and make it static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-22 10:45:09 +01:00
Steven Whitehouse	9e6e0a128b	GFS2: Merge mount.c and ops_super.c into super.c mount.c only contained a single function, so is not really worth retaining on its own. All of the super related code is now either in super.c or ops_fstype.c Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-22 10:36:01 +01:00
Steven Whitehouse	b1e71b0622	GFS2: Clean up some file names This patch renames the ops_*.c files which have no counterpart without the ops_ prefix in order to shorten the name and make it more readable. In addition, ops_address.h (which was very small) is moved into inode.h and inode.h is cleaned up by adding extern where required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-22 10:01:55 +01:00
Steven Whitehouse	1ce97e564b	GFS2: Be more aggressive in reclaiming unlinked inodes This patch increases the frequency with which gfs2 looks for unlinked, but still allocated inodes. Its the equivalent operation to ext3's orphan list, but done with bitmaps in the resource groups. This also fixes a bug where a field in the rgrp was too small. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-21 15:18:19 +01:00
Steven Whitehouse	60a0b8f936	GFS2: Add a rgrp bitmap full flag During block allocation, it is useful to know if sections of disk are full on a finer grained basis than a single resource group. This can make a performance difference when resource groups have larger numbers of bitmap blocks, since we no longer have to search them all block by block in each individual bitmap. The full flag is set on a per-bitmap basis when it has been searched and found to have no free space. It is then skipped in subsequent searches until the flag is reset. The resetting occurs if we have to drop the glock on the resource group for any reason, or if we deallocate some blocks within that resource group and thus free up some space. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-21 12:23:12 +01:00
Steven Whitehouse	0901097834	GFS2: Improve resource group error handling This patch improves the error handling in the case where we discover that the summary information in the resource group doesn't match the bitmap information while in the process of allocating blocks. Originally this resulted in a kernel bug, but this patch changes that so that we return -EIO and print some messages explaining what went wrong, and how to fix it. We also remember locally not to try and allocate from the same rgrp again, so that a subsequent allocation in a different rgrp should succeed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-20 10:48:47 +01:00
Steven Whitehouse	ef9e8b14a5	GFS2: Don't warn when delete inode fails on ro filesystem If the filesystem is read-only, then we expect that delete inode will fail, so there is no need to warn about it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-19 14:25:16 +01:00
Steven Whitehouse	fe64d517df	GFS2: Umount recovery race fix This patch fixes a race condition where we can receive recovery requests part way through processing a umount. This was causing problems since the recovery thread had already gone away. Looking in more detail at the recovery code, it was really trying to implement a slight variation on a work queue, and that happens to align nicely with the recently introduced slow-work subsystem. As a result I've updated the code to use slow-work, rather than its own home grown variety of work queue. When using the wait_on_bit() function, I noticed that the wait function that was supplied as an argument was appearing in the WCHAN field, so I've updated the function names in order to produce more meaningful output. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-19 10:01:18 +01:00
Steven Whitehouse	9582d41135	GFS2: Remove a couple of unused sysfs entries These two tunables are pointless and would never need to be changed anyway. There is also a race between them and umount as the deamons which they refer to might have gone away. The easiest way to fix the race is to remove the interface. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-13 15:06:25 +01:00
Steven Whitehouse	48c2b61361	GFS2: Add commit= mount option It has always been possible to adjust the gfs2 log commit interval, but only from the sysfs interface. This adds a mount option, commit=<nn>, which will be familar to ext3 users. The sysfs interface continues to be available as well, although this might be removed in the future. Also this patch cleans up some duplicated structures in the GFS2 sysfs code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-13 14:49:48 +01:00
Steven Whitehouse	a1c0643ff9	GFS2: Move journal live test at transaction start There seems little point grabbing the transaction glock only to have to release it again if the journal isn't live. This moves the test earlier to avoid grabbing the lock when we don't need it in the first place. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-13 10:56:52 +01:00
Abhijith Das	7537d81aa7	GFS2: Fix timestamps on write This patch copies the timestamps from the vfs inode into gfs2 and syncs it to the disk inode during writes. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-12 16:14:05 +01:00
Steven Whitehouse	48bf2b1711	GFS2: Something nonlinear this way comes! For some reason GFS2 has been missing support for non-linear mappings. This patch fixes that, and also avoids taking any locks for mmap in the O_NOATIME case. In fact we don't actually need to take the lock here at all - just doing file_accessed() would be enough, but we have to take the lock eventually and this helps it hit disk (and thus be seen by other nodes) faster. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-11 12:36:44 +01:00
Steven Whitehouse	4a0f9a321a	GFS2: Optimise writepage for metadata This adds a GFS2 specific writepage for metadata, rather than continuing to use the VFS function. As a result we now tag all our metadata I/O with the correct flag so that blktraces will now be less confusing. Also, the generic function was checking for a number of corner cases which cannot happen on the metadata address spaces so that this should be faster too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-11 12:36:43 +01:00
Steven Whitehouse	c969f58ca4	GFS2: Update the rw flags After Jens recent updates: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=a1f242524c3c1f5d40f1c9c343427e34d1aadd6e et al. this is a patch to bring gfs2 uptodate with the core code. Also I've managed to squash another call to ll_rw_block() along the way. There is still one part of the GFS2 I/O paths which are not correctly annotated and that is due to the sharing of the writeback code between the data and metadata address spaces. I would like to change that too, but this patch is still worth doing on its own, I think. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-11 12:36:41 +01:00
Linus Torvalds	93b49d45eb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (22 commits) Fix the race between capifs remount and node creation Fix races around the access to ->s_options switch ufs directories to ufs_sync_file() Switch open_exec() and sys_uselib() to do_open_filp() Make open_exec() and sys_uselib() use may_open(), instead of duplicating its parts Reduce path_lookup() abuses Make checkpatch.pl shut up on fs/inode.c NULL noise in fs/super.c:kill_bdev_super() romfs: cleanup romfs_fs.h ROMFS: romfs_dev_read() error ignored fs: dcache fix LRU ordering ocfs2: Use nd_set_link(). Fix deadlock in ipathfs ->get_sb() Fix a leak in failure exit in 9p ->get_sb() Convert obvious places to deactivate_locked_super() New helper: deactivate_locked_super() reiserfs: remove privroot hiding in lookup reiserfs: dont associate security.* with xattr files reiserfs: fixup xattr_root caching Always lookup priv_root on reiserfs mount and keep it ...	2009-05-10 10:49:08 -07:00
Al Viro	e24977d45f	Reduce path_lookup() abuses ... use kern_path() where possible [folded a fix from rdd] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-05-09 10:49:42 -04:00
Steven Whitehouse	0c7a531a20	GFS2: Fix glock ref counting bug Depending on the ordering of events as we go around the glock shrinker loop, it is possible to drop the ref count of a glock incorrectly. It doesn't happen very often. This patch corrects the got_ref variable, fixing the problem. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-05-09 15:15:17 +01:00
Steven Whitehouse	d9ba7615bf	GFS2: Ensure that the inode goal block settings are updated GFS2 has a goal block associated with each inode indicating the search start position for future block allocations (in fact there are two, but thats for backward compatibility with GFS1 as they are set to identical locations in GFS2). In some circumstances, depending on the ordering of updates to the inode it was possible for the goal block settings to not be updated on disk. This patch ensures that the goal block will always get updated, thus reducing the potential for searching the same (already allocated) blocks again when looking for free space during block allocation. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-23 10:07:37 +01:00
Steven Whitehouse	d8bd504ab8	GFS2: Fix bug in block allocation The new bitfit algorithm was counting from the wrong end of 64 bit words in the bitfield. This fixes it by using __ffs64 instead of fls64 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-23 10:07:16 +01:00
Steven Whitehouse	e56985da45	GFS2: Fix page_mkwrite() return code This allows for the possibility of returning VM_FAULT_OOM as well as VM_FAULT_SIGBUS. This ensures that the correct action is taken. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-20 16:02:02 +01:00
Steven Whitehouse	52fcd11c09	GFS2: Clear dirty bit at end of inode glock sync The dirty bit can get set during the inode glock sync. Its too complicated to change that at the moment, so this is the quick fix - to clear the bit again at the end of the function. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-20 09:05:21 +01:00
Linus Torvalds	a2c252ebde	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: Use DEFINE_SPINLOCK GFS2: cleanup file_operations mess GFS2: Move umount flush rwsem GFS2: Fix symlink creation race GFS2: Make quotad's waiting interruptible	2009-04-15 09:04:12 -07:00
Nikanth Karthikesan	b1fffc9ca6	gfs2: Remove code handling bio_alloc failure with __GFP_WAIT Remove code handling bio_alloc failure with __GFP_WAIT. GFP_NOFS implies __GFP_WAIT. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-15 12:10:13 +02:00
Xu Gang	1328df7252	GFS2: Use DEFINE_SPINLOCK SPIN_LOCK_UNLOCKED is deprecated, use DEFINE_SPINLOCK instead. (as suggested in Documentation/spinlocks.txt) Signed-off-by: Xu Gang <xug@cn.fujitsu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-15 10:18:07 +01:00
Christoph Hellwig	10d2198805	GFS2: cleanup file_operations mess Remove the weird pointer to file_operations mess and replace it with straight-forward defining of the lockinginstance names to the _nolock variants. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-15 10:17:18 +01:00
Steven Whitehouse	a228df6339	GFS2: Move umount flush rwsem The rwsem, used only on umount, is in the wrong place in glock.c. This patch moves it up a bit so that it does not get called under a spinlock. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-15 10:16:13 +01:00
Steven Whitehouse	5cf32524de	GFS2: Fix symlink creation race In certain cases symlinks can appear to have zero size if a lookup on the inode occurs within a certain (very short) time after the symlink has been created. The symlink is correctly created on disk but appears to have zero size when stat()ed. This patch closes the race and prevents incorrect sizes appearing. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-15 10:15:38 +01:00
Steven Whitehouse	7fa5d20d1a	GFS2: Make quotad's waiting interruptible So we don't count its D state in the loadavg. Reported-by: Nathan Straz <nstraz@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-04-15 10:15:08 +01:00
Linus Torvalds	8fe74cf053	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: Remove two unneeded exports and make two symbols static in fs/mpage.c Cleanup after commit `585d3bc06f` Trim includes of fdtable.h Don't crap into descriptor table in binfmt_som Trim includes in binfmt_elf Don't mess with descriptor table in load_elf_binary() Get rid of indirect include of fs_struct.h New helper - current_umask() check_unsafe_exec() doesn't care about signal handlers sharing New locking/refcounting for fs_struct Take fs_struct handling to new file (fs/fs_struct.c) Get rid of bumping fs_struct refcount in pivot_root(2) Kill unsharing fs_struct in __set_personality()	2009-04-02 21:09:10 -07:00
Nick Piggin	c2ec175c39	mm: page_mkwrite change prototype to match fault Change the page_mkwrite prototype to take a struct vm_fault, and return VM_FAULT_xxx flags. There should be no functional change. This makes it possible to return much more detailed error information to the VM (and also can provide more information eg. virtual_address to the driver, which might be important in some special cases). This is required for a subsequent fix. And will also make it easier to merge page_mkwrite() with fault() in future. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Felix Blyakher <felixb@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-01 08:59:14 -07:00
Al Viro	ce3b0f8d5c	New helper - current_umask() current->fs->umask is what most of fs_struct users are doing. Put that into a helper function. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-31 23:00:26 -04:00
Linus Torvalds	3ae5080f4c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (37 commits) fs: avoid I_NEW inodes Merge code for single and multiple-instance mounts Remove get_init_pts_sb() Move common mknod_ptmx() calls into caller Parse mount options just once and copy them to super block Unroll essentials of do_remount_sb() into devpts vfs: simple_set_mnt() should return void fs: move bdev code out of buffer.c constify dentry_operations: rest constify dentry_operations: configfs constify dentry_operations: sysfs constify dentry_operations: JFS constify dentry_operations: OCFS2 constify dentry_operations: GFS2 constify dentry_operations: FAT constify dentry_operations: FUSE constify dentry_operations: procfs constify dentry_operations: ecryptfs constify dentry_operations: CIFS constify dentry_operations: AFS ...	2009-03-27 16:23:12 -07:00
Al Viro	92cecbbfa3	constify dentry_operations: GFS2 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-03-27 14:44:02 -04:00
Steven Whitehouse	df3647b245	GFS2: Fix freeze issue This removes some old code that was causing issues during filesystem freeze. Reported-by: Andrew Price <andy@andrewprice.me.uk> Tested-by: Andrew Price <andy@andrewprice.me.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-03-24 11:31:30 +00:00
Steven Whitehouse	9c538837d8	Fix a minor bug in the previous patch The logic requires that we mark the glock dirty in page_mkwrite otherwise we might not flush correctly in the case that no allocation was required in the process of dirying the page. Also we need to set the shared write flag early for the same reason. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-03-24 11:21:27 +00:00
Steven Whitehouse	6bac243f07	GFS2: Clean up of glops.c This cleans up a number of bits of code mostly based in glops.c. A couple of simple functions have been merged into the callers to make it more obvious what is going on, the mysterious raising of i_writecount around the truncate_inode_pages() call has been removed. The meta_go_* operations have been renamed rgrp_go_* since that is the only lock type that they are used with. The unused argument of gfs2_read_sb has been removed. Also a bug has been fixed where a check for the rindex inode was in the wrong callback. More comments are added, and the debugging code is improved too. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-03-24 11:21:27 +00:00
Benjamin Marzinski	02ffad08e8	GFS2: Fix locking bug in failed shared to exclusive conversion After calling out to the dlm, GFS2 sets the new state of a glock to gl_target in gdlm_ast(). However, gl_target is not always the lock state that was requested. If a conversion from shared to exclusive fails, finish_xmote() will call do_xmote() with LM_ST_UNLOCKED, instead of gl->gl_target, so that it can reacquire the lock in exlusive the next time around. In this case, setting the lock to gl_target in gdlm_ast() will make GFS2 think that it has the glock in exclusive mode, when really, it doesn't have the glock locked at all. This patch adds a new field to the gfs2_glock structure, gl_req, to track the mode that was requested. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-03-24 11:21:26 +00:00
Hisashi Hifumi	229615def3	GFS2: Pagecache usage optimization on GFS2 I introduced "is_partially_uptodate" aops for GFS2. A page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate on pagesize != blocksize environment. This aops checks that all buffers which correspond to a part of a file that we want to read are uptodate. If so, we do not have to issue actual read IO to HDD even if a page is not uptodate because the portion we want to read are uptodate. "block_is_partially_uptodate" function is already used by ext2/3/4. With the following patch random read/write mixed workloads or random read after random write workloads can be optimized and we can get performance improvement. I did a performance test using the sysbench. #sysbench --num-threads=16 --max-requests=200000 --test=fileio --file-num=1 --file-block-size=8K --file-total-size=2G --file-test-mode=rndrw --file-fsync-freq=0 --file-rw-ratio=1 run -2.6.29-rc6 Test execution summary: total time: 202.6389s total number of events: 200000 total time taken by event execution: 2580.0480 per-request statistics: min: 0.0000s avg: 0.0129s max: 49.5852s approx. 95 percentile: 0.0462s -2.6.29-rc6-patched Test execution summary: total time: 177.8639s total number of events: 200000 total time taken by event execution: 2419.0199 per-request statistics: min: 0.0000s avg: 0.0121s max: 52.4306s approx. 95 percentile: 0.0444s arch: ia64 pagesize: 16k blocksize: 4k Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2009-03-24 11:21:25 +00:00

1 2 3 4 5 ...

774 Commits