Commit Graph

63 Commits

Author SHA1 Message Date
Boaz Harrosh
06886a5a3d exofs: Move all operations to an io_engine
In anticipation for multi-device operations, we separate osd operations
into an abstract I/O API. Currently only one device is used but later
when adding more devices, we will drive all devices in parallel according
to a "data_map" that describes how data is arranged on multiple devices.
The file system level operates, like before, as if there is one object
(inode-number) and an i_size. The io engine will split this to the same
object-number but on multiple device.

At first we introduce Mirror (raid 1) layout. But at the final outcome
we intend to fully implement the pNFS-Objects data-map, including
raid 0,4,5,6 over mirrored devices, over multiple device-groups. And
more. See: http://tools.ietf.org/html/draft-ietf-nfsv4-pnfs-obj-12

* Define an io_state based API for accessing osd storage devices
  in an abstract way.
  Usage:
	First a caller allocates an io state with:
		exofs_get_io_state(struct exofs_sb_info *sbi,
				   struct exofs_io_state** ios);

	Then calles one of:
		exofs_sbi_create(struct exofs_io_state *ios);
		exofs_sbi_remove(struct exofs_io_state *ios);
		exofs_sbi_write(struct exofs_io_state *ios);
		exofs_sbi_read(struct exofs_io_state *ios);
		exofs_oi_truncate(struct exofs_i_info *oi, u64 new_len);

	And when done
		exofs_put_io_state(struct exofs_io_state *ios);

* Convert all source files to use this new API
* Convert from bio_alloc to bio_kmalloc
* In io engine we make use of the now fixed osd_req_decode_sense

There are no functional changes or on disk additions after this patch.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-12-10 09:59:22 +02:00
Boaz Harrosh
cae012d853 exofs: statfs blocks is sectors not FS blocks
Even though exofs has a 4k block size, statfs blocks
is in sectors (512 bytes).

Also if target returns 0 for capacity then make it
ULLONG_MAX. df does not like zero-size filesystems

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-12-10 09:59:21 +02:00
Boaz Harrosh
19fe294f2e exofs: Prints on mount and unmout
It is important to print in the logs when a filesystem was
mounted and eventually unmounted.

Print the osd-device's osd_name and pid the FS was
mounted/unmounted on.

TODO: How to also print the namespace path the filesystem was
      mounted on?

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-12-10 09:59:20 +02:00
Boaz Harrosh
1ba50bbe93 exofs: remove BKL from super operations
the two places inside exofs that where taking the BKL were:
exofs_put_super() - .put_super
and
exofs_sync_fs() - which is .sync_fs and is also called from
                  .write_super.

Now exofs_sync_fs() is protected from itself by also taking
the sb_lock.

exofs_put_super() directly calls exofs_sync_fs() so there is no
danger between these two either.

In anyway there is absolutely nothing dangerous been done
inside exofs_sync_fs().

Unless there is some subtle race with the actual lifetime of
the super_block in regard to .put_super and some other parts
of the VFS. Which is highly unlikely.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-09-24 07:47:38 -04:00
Alexey Dobriyan
405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
Boaz Harrosh
baaf94cdc7 exofs: Avoid using file_fsync()
The use of file_fsync() in exofs_file_sync() is not necessary since it
does some extra stuff not used by exofs. Open code just the parts that
are currently needed.

TODO: Farther optimization can be done to sync the sb only on inode
update of new files, Usually the sb update is not needed in exofs.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-06-21 17:53:47 +03:00
Boaz Harrosh
27d2e14919 exofs: Remove IBM copyrights
Boaz,
Congrats on getting all the OSD stuff into 2.6.30!
I just pulled the git, and saw that the IBM copyrights are still there.
Please remove them from all files:
 * Copyright (C) 2005, 2006
 * International Business Machines

IBM has revoked all rights on the code - they gave it to me.

Thanks!
Avishay

Signed-off-by: Avishay Traeger <avishay@gmail.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-06-21 17:53:47 +03:00
Christoph Hellwig
80e09fb942 exofs: add ->sync_fs
Add a ->sync_fs method for data integrity syncs, and reimplement
->write_super ontop of it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:15 -04:00
Christoph Hellwig
ebc1ac1645 ->write_super lock_super pushdown
Push down lock_super into ->write_super instances and remove it from the
caller.

Following filesystem don't need ->s_lock in ->write_super and are skipped:

 * bfs, nilfs2 - no other uses of s_lock and have internal locks in
	->write_super
 * ext2 - uses BKL in ext2_write_super and has internal calls without s_lock
 * reiserfs - no other uses of s_lock as has reiserfs_write_lock (BKL) in
 	->write_super
 * xfs - no other uses of s_lock and uses internal lock (buffer lock on
	superblock buffer) to serialize ->write_super.  Also xfs_fs_write_super
	is superflous and will go away in the next merge window

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:09 -04:00
Christoph Hellwig
6cfd014842 push BKL down into ->put_super
Move BKL into ->put_super from the only caller.  A couple of
filesystems had trivial enough ->put_super (only kfree and NULLing of
s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs,
hugetlbfs, omfs, qnx4, shmem, all others got the full treatment.  Most
of them probably don't need it, but I'd rather sort that out individually.
Preferably after all the other BKL pushdowns in that area.

[AV: original used to move lock_super() down as well; these changes are
removed since we don't do lock_super() at all in generic_shutdown_super()
now]
[AV: fuse, btrfs and xfs are known to need no damn BKL, exempt]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:07 -04:00
Christoph Hellwig
8c85e12512 remove ->write_super call in generic_shutdown_super
We just did a full fs writeout using sync_filesystem before, and if
that's not enough for the filesystem it can perform it's own writeout
in ->put_super, which many filesystems already do.

Move a call to foofs_write_super into every foofs_put_super for now to
guarantee identical behaviour until it's cleaned up by the individual
filesystem maintainers.

Exceptions:

 - affs already has identical copy & pasted code at the beginning of
   affs_put_super so no need to do it twice.
 - xfs does the right thing without it and I have changes pending for
   the xfs tree touching this are so I don't really need conflicts
   here..

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-06-11 21:36:06 -04:00
Boaz Harrosh
8cf74b3936 exofs: export_operations
implement export_operations and set in superblock.
It is now posible to export exofs via nfs

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-03-31 19:44:36 +03:00
Boaz Harrosh
ba9e5e98ca exofs: super_operations and file_system_type
This patch ties all operation vectors into a file system superblock
and registers the exofs file_system_type at module's load time.

* The file system control block (AKA on-disk superblock) resides in
  an object with a special ID (defined in common.h).
  Information included in the file system control block is used to
  fill the in-memory superblock structure at mount time. This object
  is created before the file system is used by mkexofs.c It contains
  information such as:
	- The file system's magic number
	- The next inode number to be allocated

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
2009-03-31 19:44:34 +03:00