Allow a lock type to specifiy whether it makes use of the LVB. The only type
which does this right now is the meta data lock. This should save us some
space on network messages since they won't have to needlessly transmit value
blocks.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
There is extremely little difference between the two now. We can remove the
callback from ocfs2_lock_res_ops as well.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This was always defined to the same function in all locks, so clean things
up by removing and passing ocfs2_unlock_ast() directly to the DLM.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
There is extremely little difference between the two now. We can remove the
callback from ocfs2_lock_res_ops as well.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Use of the refresh mechanism is lock-type wide, so move knowledge of that to
the ocfs2_lock_res_ops structure.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
OCFS2 puts inode meta data in the "lock value block" provided by the DLM.
Typically, i_generation is encoded in the lock name so that a deleted inode
on and a new one in the same block don't share the same lvb.
Unfortunately, that scheme means that the read in ocfs2_read_locked_inode()
is potentially thrown away as soon as the meta data lock is taken - we
cannot encode the lock name without first knowing i_generation, which
requires a disk read.
This patch encodes i_generation in the inode meta data lvb, and removes the
value from the inode meta data lock name. This way, the read can be covered
by a lock, and at the same time we can distinguish between an up to date and
a stale LVB.
This will help cold-cache stat(2) performance in particular.
Since this patch changes the protocol version, we take the opportunity to do
a minor re-organization of two of the LVB fields.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
When i_generation is removed from the lockname, this will help us determine
whether a meta data lvb has information that is in sync with the local
struct inode.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
lvb_version doesn't need to be a whole 32 bits. Make it an 8 bit field to
free up some space. This should be backwards compatible until we use one of
the fields, in which case we'd bump the lvb version anyway.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We can't use LKM_LOCAL for new dentry locks because an unlink and subsequent
re-create of a name/inode pair may result in the lock still being mastered
somewhere in the cluster.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Make use of FS_RENAME_DOES_D_MOVE to avoid a race condition that can occur
during ->rename() if we d_move() outside of the parent directory cluster
locks, and another node discovers the new name (created during the rename)
and unlinks it. d_move() will unconditionally rehash a dentry - which will
leave stale data in the system.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Actually replace the vote calls with the new dentry operations. Make any
necessary adjustments to get the scheme to work.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Replace the dentry vote mechanism with a cluster lock which covers a set
of dentries. This allows us to force d_delete() only on nodes which actually
care about an unlink.
Every node that does a ->lookup() gets a read only lock on the dentry, until
an unlink during which the unlinking node, will request an exclusive lock,
forcing the other nodes who care about that dentry to d_delete() it. The
effect is that we retain a very lightweight ->d_revalidate(), and at the
same time get to make large improvements to the average case performance of
the ocfs2 unlink and rename operations.
This patch adds the higher level API and the dentry manipulation code.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Replace the dentry vote mechanism with a cluster lock which covers a set
of dentries. This allows us to force d_delete() only on nodes which actually
care about an unlink.
Every node that does a ->lookup() gets a read only lock on the dentry, until
an unlink during which the unlinking node, will request an exclusive lock,
forcing the other nodes who care about that dentry to d_delete() it. The
effect is that we retain a very lightweight ->d_revalidate(), and at the
same time get to make large improvements to the average case performance of
the ocfs2 unlink and rename operations.
This patch adds the cluster lock type which OCFS2 can attach to
dentries. A small number of fs/ocfs2/dcache.c functions are stubbed
out so that this change can compile.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
File system lock names are very regular right now, so we really only need to
pass an extra parameter to dlmlock().
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We just need to add a namelen field to the user_lock_res structure, and
update a few debug prints. Instead of updating all debug prints, I took the
opportunity to remove a few that are likely unnecessary these days.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
The OCFS2 DLM uses strlen() to determine lock name length, which excludes
the possibility of putting binary values in the name string. Fix this by
requiring that string length be passed in as a parameter.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
An AST can be delivered via the network after a lock has been removed, so no
need to print an error when we see that.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
The truncate code was never supposed to BUG() on an allocator it doesn't
know about, but rather to ignore it. Right now, this does nothing, but when
we change our allocation paths to use all suballocator files, this will
allow current versions of the fs module to work fine.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Uptodate.c now knows about read-ahead buffers. Use some more aggressive
logic in ocfs2_readdir().
The two functions which currently use directory read-ahead are
ocfs2_find_entry() and ocfs2_readdir().
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We weren't always updating i_mtime on writes, so fix ocfs2_commit_write() to
handle this.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: Zach Brown <zach.brown@oracle.com>
Remove the redundant "i_nlink >= OCFS2_LINK_MAX" check and adds an unlinked
directory check in ocfs2_link().
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
The dir nlink check in ocfs2_mknod() was being done outside of the cluster
lock, which means we could have been checking against a stale version of the
inode. Fix this by doing the check after the cluster lock instead.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Every file should #include the headers containing the prototypes for its
global functions.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Support immutable, and other attributes.
Some renaming and other minor fixes done by myself.
Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Record the most recently used allocation group on the allocation context, so
that subsequent allocations can attempt to optimize for contiguousness.
Local alloc especially should benefit from this as the current chain search
tends to let it spew across the disk.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Try to catch corrupted group descriptors with some stronger checks placed in
a couple of strategic locations. Detect a failed resizefs and refuse to
allocate past what bitmap i_clusters allows.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
We were storing cluster count on the ocfs2_super structure, but never
actually using it so remove that. Also, we don't want to populate the
uptodate cache with the unlocked block read - it is technically safe as is,
but we should change it for correctness.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This patch removes the unused EXPORT_SYMBOL_GPL(dlm_migrate_lockres).
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
If a process requests a lock cancel but the lock has been remotely granted
already then there is no need to send the cancel message.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
This can race with other ast notification, which can cause bad status values
to propagate into the unlock ast.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Properly ignore LVB flags during a PR downconvert. This avoids an illegal
lvb update.
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Get rid of osb->uuid, osb->proc_sub_dir, and osb->osb_id. Those fields were
unused, or could easily be removed. As a result, we also no longer need
MAX_OSB_ID or ocfs2_globals_lock.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
dlm_lockres_master_requery() became global without any external usage.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Give gcc the chance to compile out the debug logging code in ocfs2.
This saves some size at the expense of being able to debug the code.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Same as with already do with the file operations: keep them in .rodata and
prevents people from doing runtime patching.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
locking init cleanups:
- convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
- convert rwlocks in a similar manner
this patch was generated automatically.
Motivation:
- cleanliness
- lockdep needs control of lock initialization, which the open-coded
variants do not give
- it's also useful for -rt and for lock debugging in general
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>