linux

Author	SHA1	Message	Date
Sage Weil	153a10939e	ceph: fix crush device 'out' threshold to 1.0, not 0.1 Fix a typo that made any OSD weighted between 0.1 and 1.0 effectively weighted as 1.0 (fully in). Signed-off-by: Sage Weil <sage@newdream.net>	2010-07-05 09:44:17 -07:00
Sage Weil	443b3760a0	ceph: fix caps usage accounting for import (non-reserved) case We need to increase the total and used counters when allocating a new cap in the non-reserved (cap import) case. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-29 09:31:56 -07:00
Sage Weil	ec97f88ba6	ceph: only release clean, unused caps with mds requests We can drop caps with an mds request. Ensure we only drop unused AND clean caps, since the MDS doesn't support cap writeback in that context, nor do we track it. If caps are dirty, and the MDS needs them back, it will revoke and we will flush in the normal fashion. This fixes a possible loss of metadata. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-29 09:31:55 -07:00
Sage Weil	a1a31e7342	ceph: fix crush CHOOSE_LEAF when type is already a leaf We may not recurse for CHOOSE_LEAF if we start with a leaf node. When that happens, the out2 vector needs to be filled in with the result. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-24 12:58:14 -07:00
Sage Weil	55bda7aacd	ceph: fix crush recursion There was a longstanding problem with recursion through intervening bucket types on complex hierarchies. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-24 12:55:48 -07:00
Yehuda Sadeh	bfaf148eb2	ceph: fix caps debugfs entry The ceph client structure was not set correctly. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-24 09:47:36 -07:00
Sage Weil	17c688c3df	ceph: delay umount until all mds requests drop inode+dentry refs This fixes a race between handle_reply finishing an mds request, signalling completion, and then dropping the request struct and its dentry+inode refs, and pre_umount function waiting for requests to finish before letting the vfs tear down the dcache. If umount was delayed waiting for mds requests, we could race and BUG in shrink_dcache_for_umount_subtree because of a slow dput. This delays umount until the msgr queue flushes, which means handle_reply will exit and will have dropped the ceph_mds_request struct. I'm assuming the VFS has already ensured that its calls have all completed and those request refs have thus been dropped as well (I haven't seen that race, at least). Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-21 16:11:50 -07:00
Sage Weil	d69ed05a80	ceph: handle splice_dentry/d_materialize_unique error in readdir_prepopulate Handle a splice_dentry failure (due to a d_materialize_unique error) without crashing. (Also, report the error code.) Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-21 16:04:10 -07:00
Sage Weil	cebc5be6b6	ceph: fix crush map update decoding If the incremental osdmap has a new crush map, advance the position after decoding so that we can parse the rest of the osdmap properly. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-17 10:22:48 -07:00
Sage Weil	ae32be3134	ceph: fix message memory leak, uninitialized variable We need to properly initialize skip, as not all alloc_msg op instances set it. Also, BUG if someone says skip but also allocates a message. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-13 10:34:36 -07:00
Sage Weil	4a32f93d29	ceph: fix map handler error path Don't leak message if we receive an unexpected message type. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-13 10:34:36 -07:00
Yehuda Sadeh	0cf5537b15	ceph: some endianity fixes Fix some problems that came up with sparse. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-13 10:34:36 -07:00
Sage Weil	2b2300d62e	ceph: try to send partial cap release on cap message on missing inode If we have enough memory to allocate a new cap release message, do so, so that we can send a partial release message immediately. This keeps us from making the MDS wait when the cap release it needs is in a partially full release message. If we fail because of ENOMEM, oh well, they'll just have to wait a bit longer. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-10 13:30:25 -07:00
Sage Weil	3d7ded4d81	ceph: release cap on import if we don't have the inode If we get an IMPORT that gives us a cap, but we don't have the inode, queue a release (and try to send it immediately) so that the MDS doesn't get stuck waiting for us. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-10 13:30:07 -07:00
Sage Weil	9dbd412f56	ceph: fix misleading/incorrect debug message Nothing is released here: the caps message is simply ignored in this case. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-10 13:29:59 -07:00
Jeff Mahoney	00d5643e7c	ceph: fix atomic64_t initialization on ia64 bdi_seq is an atomic_long_t but we're using ATOMIC_INIT, which causes build failures on ia64. This patch fixes it to use ATOMIC_LONG_INIT. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-10 13:29:50 -07:00
Sage Weil	1e5ea23df1	ceph: fix lease revocation when seq doesn't match If the client revokes a lease with a higher seq than what we have, keep the mds's seq, so that it honors our release. Otherwise, we can hang indefinitely. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-04 10:05:40 -07:00
Sage Weil	558d3499bd	ceph: fix f_namelen reported by statfs We were setting f_namelen in kstatfs to PATH_MAX instead of NAME_MAX. That disagrees with ceph_lookup behavior (which checks against NAME_MAX), and also makes the pjd posix test suite spit out ugly errors because it can't clean up its temporary files. Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-01 16:56:03 -07:00
Yehuda Sadeh	205475679a	ceph: fix memory leak in statfs Freeing the statfs request structure when required. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-01 16:56:02 -07:00
Henry C Chang	13a4214cd9	ceph: fix d_subdirs ordering problem We misused list_move_tail() to order the dentry in d_subdirs. This will screw up the d_subdirs order. This bug can be reliably reproduced by: 1. mount ceph fs. 2. on ceph fs, git clone git://ceph.newdream.net/git/ceph.git 3. Run autogen.sh in ceph directory. (Note: Errors only occur at the first time you run autogen.sh.) Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-06-01 16:55:55 -07:00
Linus Torvalds	b612a05537	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: clean up on forwarded aborted mds request ceph: fix leak of osd authorizer ceph: close out mds, osd connections before stopping auth ceph: make lease code DN specific fs/ceph: Use ERR_CAST ceph: renew auth tickets before they expire ceph: do not resend mon requests on auth ticket renewal ceph: removed duplicated #includes ceph: avoid possible null dereference ceph: make mds requests killable, not interruptible sched: add wait_for_completion_killable_timeout	2010-05-30 08:56:39 -07:00
Sage Weil	2a8e5e3637	ceph: clean up on forwarded aborted mds request If an mds request is aborted (timeout, SIGKILL), it is left registered to keep our state in sync with the mds. If we get a forward notification, though, we know the request didn't succeed and we can unregister it safely. We were trying to resend it, but then bailing out (and not unregistering) in __do_request. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:42:05 -07:00
Sage Weil	79494d1b9b	ceph: fix leak of osd authorizer Release the ceph_authorizer when releasing osd state. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:42:04 -07:00
Sage Weil	a922d38fd1	ceph: close out mds, osd connections before stopping auth The auth module (part of the mon_client) is needed to free any ceph_authorizer(s) used by the mds and osd connections. Flush the msgr workqueue before stopping monc to ensure that the destroy_authorizer auth op is available when those connections are closed out. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:42:03 -07:00
Sage Weil	dd1c905736	ceph: make lease code DN specific The lease code includes a mask in the CEPH_LOCK_* namespace, but that namespace is changing, and only one mask (formerly _DN == 1) is used, so hard code for that value for now. If we ever extend this code to handle leases over different data types we can extend it accordingly. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:12:42 -07:00
Julia Lawall	7e34bc524e	fs/ceph: Use ERR_CAST Use ERR_CAST(x) rather than ERR_PTR(PTR_ERR(x)). The former makes more clear what is the purpose of the operation, which otherwise looks like a no-op. In the case of fs/ceph/inode.c, ERR_CAST is not needed, because the type of the returned value is the same as the type of the enclosing function. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ type T; T x; identifier f; @@ T f (...) { <+... - ERR_PTR(PTR_ERR(x)) + x ...+> } @@ expression x; @@ - ERR_PTR(PTR_ERR(x)) + ERR_CAST(x) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:12:41 -07:00
Sage Weil	a41359fa35	ceph: renew auth tickets before they expire We were only requesting renewal after our tickets expire; do so before that. Most of the low-level logic for this was already there; just use it. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:12:39 -07:00
Sage Weil	09c4d6a7d4	ceph: do not resend mon requests on auth ticket renewal We only want to send pending mon requests when we successfully authenticate. If we are already authenticated, like when we renew our ticket, there is no need to resend pending requests. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:12:38 -07:00
Andrea Gelmini	984c76908e	ceph: removed duplicated #includes fs/ceph/auth.c: linux/slab.h is included more than once. fs/ceph/super.h: linux/slab.h is included more than once. Acked-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:12:37 -07:00
Sage Weil	e95e9a7ae4	ceph: avoid possible null dereference ac->ops may be null; use protocol id in error message instead. Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:12:36 -07:00
Sage Weil	aa91647c89	ceph: make mds requests killable, not interruptible The underlying problem is that many mds requests can't be restarted. For example, a restarted create() would return -EEXIST if the original request succeeds. However, we do not want a hung MDS to hang the client too. So, use the _killable wait_for_completion variants to abort on SIGKILL but nothing else. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-29 09:12:35 -07:00
Christoph Hellwig	7ea8085910	drop unused dentry argument to ->fsync Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-27 22:05:02 -04:00
Linus Torvalds	6e188240eb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (59 commits) ceph: reuse mon subscribe message instead of allocated anew ceph: avoid resending queued message to monitor ceph: Storage class should be before const qualifier ceph: all allocation functions should get gfp_mask ceph: specify max_bytes on readdir replies ceph: cleanup pool op strings ceph: Use kzalloc ceph: use common helper for aborted dir request invalidation ceph: cope with out of order (unsafe after safe) mds reply ceph: save peer feature bits in connection structure ceph: resync headers with userland ceph: use ceph. prefix for virtual xattrs ceph: throw out dirty caps metadata, data on session teardown ceph: attempt mds reconnect if mds closes our session ceph: clean up send_mds_reconnect interface ceph: wait for mds OPEN reply to indicate reconnect success ceph: only send cap releases when mds is OPEN|HUNG ceph: dicard cap releases on mds restart ceph: make mon client statfs handling more generic ceph: drop src address(es) from message header [new protocol feature] ...	2010-05-24 07:37:52 -07:00
Sage Weil	240ed68eb5	ceph: reuse mon subscribe message instead of allocated anew Use the same message, allocated during startup. No need to reallocate a new one each time around (and potentially ENOMEM). Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-21 16:26:11 -07:00
Christoph Hellwig	8018ab0574	sanitize vfs_fsync calling conventions Now that the last user passing a NULL file pointer is gone we can remove the redundant dentry argument and associated hacks inside vfs_fsync_range. The next step will be removing the dentry argument from ->fsync, but given the luck with the last round of method prototype changes I'd rather defer this until after the main merge window. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:21 -04:00
Al Viro	3981f2e2a0	ceph: should use deactivate_locked_super() on failure exits Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-21 18:31:13 -04:00
Sage Weil	970690012c	ceph: avoid resending queued message to monitor The auth_reply handler will (re)send any pending requests. For the initial mon authenticate phase, that's correct, but when an auth ticket renewal races with an in-flight request, we may resend a request message that is already in flight. Avoid this by revoking the message before sending it. We should also avoid resending requests at all during ticket renewal; that will come soon. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-21 15:01:22 -07:00
Tobias Klauser	9e32789f63	ceph: Storage class should be before const qualifier The C99 specification states in section 6.11.5: The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-21 15:01:21 -07:00
Yehuda Sadeh	34d23762d9	ceph: all allocation functions should get gfp_mask This is essential, as for the rados block device we'll need to run in different contexts that would need flags that are other than GFP_NOFS. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:42 -07:00
Sage Weil	23804d91f1	ceph: specify max_bytes on readdir replies Specify max bytes in request to bound size of reply. Add associated mount option with default value of 512 KB. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:41 -07:00
Sage Weil	366837706b	ceph: cleanup pool op strings Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:41 -07:00
Julia Lawall	cffe7b6d8c	ceph: Use kzalloc Use kzalloc rather than the combination of kmalloc and memset. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression x,size,flags; statement S; @@ -x = kmalloc(size,flags); +x = kzalloc(size,flags); if (x == NULL) S -memset(x, 0, size); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:40 -07:00
Sage Weil	167c9e352d	ceph: use common helper for aborted dir request invalidation We invalidate I_COMPLETE and dentry leases in two places: on aborted mds request and on request replay. Use common helper to avoid duplicate code. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:40 -07:00
Sage Weil	85792d0dd6	ceph: cope with out of order (unsafe after safe) mds reply Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:39 -07:00
Sage Weil	aba558e28a	ceph: save peer feature bits in connection structure These are used for adjusting behavior, such as conditionally encoding a newer message format. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:38 -07:00
Sage Weil	ca9d93a292	ceph: resync headers with userland Notable changes include pool op defines and types, FLOCK feature bit, and new CMPXATTR osd ops. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:38 -07:00
Sage Weil	1a75627896	ceph: use ceph. prefix for virtual xattrs Drop the 'user.' prefix and use just 'ceph.' for fs virtual xattrs. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:37 -07:00
Sage Weil	6c99f2545d	ceph: throw out dirty caps metadata, data on session teardown The remove_session_caps() helper is called when an MDS closes out our session (either normally, or as a result of a failed reconnect), and when we tear down state for umount. If we remove the last cap, and there are no cap migrations in progress, then there is little hope of us flushing out that data to the mds (without heroic efforts to reconnect and flush). So, to avoid leaving inodes pinned (due to dirty state) and crashing after umount, throw out dirty caps state and unpin the inodes. Print a warning to the console so we know something was lost. NOTE: Although we drop wrbuffer refs, we don't actually mark pages clean; maybe a truncate should be queued? Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:37 -07:00
Sage Weil	7e70f0ed9f	ceph: attempt mds reconnect if mds closes our session Currently, if our session is closed (due to a timeout, or explicit close, or whatever), we just sit there doing nothing unless/until the MDS restarts, at which point we try to reconnect. Change client to attempt an immediate reconnect if our session is closed. Note that currently the MDS doesn't support this, and our attempt will fail. We'll get a session CLOSE, our caps and dirty cap state will be dropped, and the client will be free to attempt to reconnect. That's clearly not as nice as a successful reconnect, but it at least allows us to try to carry on, and in the future the MDS will support a reconnect and we will fare better. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:36 -07:00
Sage Weil	34b6c855fa	ceph: clean up send_mds_reconnect interface Pass a ceph_mds_session, since the caller has it. Remove the dead code for sending empty reconnects. It used to be used when the MDS contacted _us_ to solicit a reconnect, and we could reply saying "go away, I have no session." Now we only send reconnects based on the mds map, and only when we do in fact have an open session. Signed-off-by: Sage Weil <sage@newdream.net>	2010-05-17 15:25:35 -07:00

354 Commits