Commit Graph

91 Commits

Author SHA1 Message Date
Josef Bacik
02c24a8218 fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers.  Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2.  For correctness sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 20:47:59 -04:00
Artem Bityutskiy
d808efb407 UBIFS: amend debugging inode size check function prototype
Add 'const struct ubifs_info *c' parameter to 'dbg_check_synced_i_size()'
function because we'll need it in the next patch when we switch to debugfs.
So this patch is just a preparation.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-07-04 10:54:27 +03:00
Artem Bityutskiy
2cd0a60cf4 UBIFS: remove strange commentary
Remove the following commentary from 'ubifs_file_mmap()':

/* 'generic_file_mmap()' takes care of NOMMU case */

I do not understand what it means, and I could not find anything relater to
NOMMU in 'generic_file_mmap()'.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-05-13 19:23:55 +03:00
Artem Bityutskiy
3b2f9a019e UBIFS: use ro_mount instead of MS_RDONLY
We have our own flags indicating R/O mode, and c->ro_mode is equivalent
to MS_RDONLY. Let's be consistent and use UBIFS flags everywhere.
This patch is just a minor cleanup.

Additionally, add a comment that we are surprised with VFS behavior -
as a reminder to look at this some day.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-05-13 19:23:55 +03:00
Artem Bityutskiy
b137545c44 UBIFS: introduce a separate structure for budgeting info
This patch separates out all the budgeting-related information
from 'struct ubifs_info' to 'struct ubifs_budg_info'. This way the
code looks a bit cleaner. However, the main driver for this is
that we want to save budgeting information and print it later,
so a separate data structure for this is helpful.

This patch is a preparation for the further debugging output
improvements.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-05-13 19:23:53 +03:00
Artem Bityutskiy
c43615702f UBIFS: fix minor stylistic issues
Fix several minor stylistic issues:
* lines longer than 80 characters
* space before closing parenthesis ')'
* spaces in the indentations

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-05-13 19:23:53 +03:00
Artem Bityutskiy
78530bf7f2 UBIFS: fix oops when R/O file-system is fsync'ed
This patch fixes severe UBIFS bug: UBIFS oopses when we 'fsync()' an
file on R/O-mounter file-system. We (the UBIFS authors) incorrectly
thought that VFS would not propagate 'fsync()' down to the file-system
if it is read-only, but this is not the case.

It is easy to exploit this bug using the following simple perl script:

use strict;
use File::Sync qw(fsync sync);

die "File path is not specified" if not defined $ARGV[0];
my $path = $ARGV[0];

open FILE, "<", "$path" or die "Cannot open $path: $!";
fsync(\*FILE) or die "cannot fsync $path: $!";
close FILE or die "Cannot close $path: $!";

Thanks to Reuben Dowle <Reuben.Dowle@navico.com> for reporting about this
issue.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Reported-by: Reuben Dowle <Reuben.Dowle@navico.com>
Cc: stable@kernel.org
2011-04-13 10:43:32 +03:00
Artem Bityutskiy
6ed09c34b7 UBIFS: fix assertion warning and refine comments
This patch fixes the following UBIFS assertion warning:

UBIFS assert failed in do_readpage at 115 (pid 199)
[<b00321b8>] (unwind_backtrace+0x0/0xdc) from [<af025118>]
(do_readpage+0x108/0x594 [ubifs])
[<af025118>] (do_readpage+0x108/0x594 [ubifs]) from [<af025764>]
(ubifs_write_end+0x1c0/0x2e8 [ubifs])
[<af025764>] (ubifs_write_end+0x1c0/0x2e8 [ubifs]) from
[<b00a0164>] (generic_file_buffered_write+0x18c/0x270)
[<b00a0164>] (generic_file_buffered_write+0x18c/0x270) from
[<b00a08d4>] (__generic_file_aio_write+0x478/0x4c0)
[<b00a08d4>] (__generic_file_aio_write+0x478/0x4c0) from
[<b00a0984>] (generic_file_aio_write+0x68/0xc8)
[<b00a0984>] (generic_file_aio_write+0x68/0xc8) from
[<af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs])
[<af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs]) from
[<b00d104c>] (do_sync_write+0xb0/0x100)
[<b00d104c>] (do_sync_write+0xb0/0x100) from [<b00d1abc>]
(vfs_write+0xac/0x154)
[<b00d1abc>] (vfs_write+0xac/0x154) from [<b00d1c10>]
(sys_write+0x3c/0x68)
[<b00d1c10>] (sys_write+0x3c/0x68) from [<b002d9a0>]
(ret_fast_syscall+0x0/0x2c)

The 'PG_checked' flag is used to indicate that the page does not
supposedly exist on the media (e.g., a hole or a page beyond the
inode size), so it requires slightly bigger budget, because we have
to account the indexing size increase. And this flag basically
tells that the budget for this page has to be "new page budget".
The "new page budget" is slightly bigger than the "existing page
budget".

The 'do_readpage()' function has the following assertion which
sometimes is hit: 'ubifs_assert(!PageChecked(page))'. Obviously,
the meaning of this assertion is: "I should not be asked to read
a page which does not exist on the media".

However, in 'ubifs_write_begin()' we have a small "trick". Notice,
that VFS may write pages which were not read yet, so the page data
were not loaded from the media to the page cache yet. If VFS tells
that it is going to change only some part of the page, we obviously
have to load it from the media. However, if VFS tells that it is
going to change whole page, we do not read it from the media for
optimization purposes.

However, since we do not read it, we do not know if it exists on
the media or not (a hole, etc). So we set the 'PG_checked' flag
to this page to force bigger budget, just in case.

So 'ubifs_write_begin()' sets 'PG_checked'. Then we are in
'ubifs_write_end()'. And VFS tells us: "hey, for some reasons I
changed my mind and did not change whole page". Frankly, I do not
know why this happens, but I hit this somehow on an ARM platform.
And this is extremely rare.

So in this case UBIFS does the following:

1. Cancels allocated budget.
2. Loads the page from the media by calling 'do_readpage()'.
3. Asks VFS to repeat the whole write operation from the very
   beginning (call '->write_begin() again, etc).

And the assertion warning is hit at the step 2 - remember we have
the 'PG_checked' set for this page, and 'do_readpage()' does not
like this. So this patch fixes the problem by adding step 1.5 and
cleaning the 'PG_checked' before calling 'do_readpage()'.

All in all, this patch does not fix any functionality issue, but it
silences UBIFS false positive warning which may happen in very very
rare cases.

And while on it, this patch also improves a commentary which explains
the reasons of setting the 'PG_checked' flag for the page. The old
commentary was a bit difficult to understand.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2011-03-24 16:16:18 +02:00
Artem Bityutskiy
2ef13294d2 UBIFS: introduce new flags for RO mounts
Commit 2fde99cb55 "UBIFS: mark VFS SB RO too"
introduced regression. This commit made UBIFS set the 'MS_RDONLY' flag in the
VFS superblock when it switches to R/O mode due to an error. This was done
to make VFS show the R/O UBIFS flag in /proc/mounts.

However, several places in UBIFS relied on the 'MS_RDONLY' flag and assume this
flag can only change when we re-mount. For example, 'ubifs_put_super()'.

This patch introduces new UBIFS flag - 'c->ro_mount' which changes only when
we re-mount, and preserves the way UBIFS was originally mounted (R/W or R/O).
This allows us to de-initialize UBIFS cleanly in 'ubifs_put_super()'.

This patch also changes all 'ubifs_assert(!c->ro_media)' assertions to
'ubifs_assert(!c->ro_media && !c->ro_mount)', because we never should write
anything if the FS was mounter R/O.

All the places where we test for 'MS_RDONLY' flag in the VFS SB were changed
and now we test the 'c->ro_mount' flag instead, because it preserves the
original UBIFS mount type, unlike the 'MS_RDONLY' flag.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-09-19 21:07:58 +03:00
Artem Bityutskiy
2680d722bf UBIFS: introduce new flag for RO due to errors
The R/O state may have various reasons:

1. The UBI volume is R/O
2. The FS is mounted R/O
3. The FS switched to R/O mode because of an error

However, in UBIFS we have only one variable which represents cases
1 and 3 - 'c->ro_media'. Indeed, we set this to 1 if we switch to
R/O mode due to an error, and then we test it in many places to
make sure that we stop writing as soon as the error happens.

But this is very unclean. One consequence of this, for example, is
that in 'ubifs_remount_fs()' we use 'c->ro_media' to check whether
we are in R/O mode because on an error, and we print a message
in this case. However, if we are in R/O mode because the media
is R/O, our message is bogus.

This patch introduces new flag - 'c->ro_error' which is set when
we switch to R/O mode because of an error. It also changes all
"if (c->ro_media)" checks to "if (c->ro_error)" checks, because
this is what the checks actually mean. We do not need to check
for 'c->ro_media' because if the UBI volume is in R/O mode, we
do not allow R/W mounting, and now writes can happen. This is
guaranteed by VFS. But it is good to double-check this, so this
patch also adds many "ubifs_assert(!c->ro_media)" checks.

In the 'ubifs_remount_fs()' function this patch makes a bit more
changes - it fixes the error messages as well.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2010-09-17 17:08:09 +03:00
Christoph Hellwig
2c27c65ed0 check ATTR_SIZE contraints in inode_change_ok
Make sure we check the truncate constraints early on in ->setattr by adding
those checks to inode_change_ok.  Also clean up and document inode_change_ok
to make this obvious.

As a fallout we don't have to call inode_newsize_ok from simple_setsize and
simplify it down to a truncate_setsize which doesn't return an error.  This
simplifies a lot of setattr implementations and means we use truncate_setsize
almost everywhere.  Get rid of fat_setsize now that it's trivial and mark
ext2_setsize static to make the calling convention obvious.

Keep the inode_newsize_ok in vmtruncate for now as all callers need an
audit for its removal anyway.

Note: setattr code in ecryptfs doesn't call inode_change_ok at all and
needs a deeper audit, but that is left for later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-08-09 16:47:39 -04:00
npiggin@suse.de
15c6fd9786 kill spurious reference to vmtruncate
Lots of filesystems calls vmtruncate despite not implementing the old
->truncate method.  Switch them to use simple_setsize and add some
comments about the truncate code where it seems fitting.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-27 22:15:42 -04:00
Christoph Hellwig
7ea8085910 drop unused dentry argument to ->fsync
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-05-27 22:05:02 -04:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Christoph Hellwig
a9185b41a4 pass writeback_control to ->write_inode
This gives the filesystem more information about the writeback that
is happening.  Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by beeing able to
distinguish between the different callers in more detail.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-05 13:25:52 -05:00
Christoph Hellwig
eaff8079d4 kill I_LOCK
After I_SYNC was split from I_LOCK the leftover is always used together with
I_NEW and thus superflous.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-12-17 11:03:25 -05:00
Christoph Hellwig
774888bcd6 UBIFS: remove manual O_SYNC handling
generic_file_aio_write already calls into ->fsync to handle O_SYNC/O_DSYNC.
Remove the duplicate call to ubifs_sync_wbufs_by_inode which is already
covered by ubifs_fsync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-11-24 08:18:55 +02:00
Alexey Dobriyan
f0f37e2f77 const: mark struct vm_struct_operations
* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code

But leave TTM code alone, something is fishy there with global vm_ops
being used.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-27 11:39:25 -07:00
Artem Bityutskiy
873a64c762 UBIFS: amend commentaries
This patch amends and nicifies commentaries in file.c, as well as
fixes some spelling problems.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-09-10 12:06:47 +03:00
Linus Torvalds
e0724bf6e4 Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
* 'linux-next' of git://git.infradead.org/ubifs-2.6:
  UBIFS: fix recovery bug
  UBIFS: add R/O compatibility
  UBIFS: fix compiler warnings
  UBIFS: fully sort GCed nodes
  UBIFS: fix commentaries
  UBIFS: introduce a helpful variable
  UBIFS: use KERN_CONT
  UBIFS: fix lprops committing bug
  UBIFS: fix bogus assertion
  UBIFS: fix bug where page is marked uptodate when out of space
  UBIFS: amend key_hash return value
  UBIFS: improve find function interface
  UBIFS: list usage cleanup
  UBIFS: fix dbg_chk_lpt_sz()
2009-04-06 15:00:19 -07:00
Nick Piggin
c2ec175c39 mm: page_mkwrite change prototype to match fault
Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags.  There should be no functional change.

This makes it possible to return much more detailed error information to
the VM (and also can provide more information eg.  virtual_address to the
driver, which might be important in some special cases).

This is required for a subsequent fix.  And will also make it easier to
merge page_mkwrite() with fault() in future.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01 08:59:14 -07:00
Artem Bityutskiy
7d4e9ccb43 UBIFS: fix commentaries
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-03-20 19:11:12 +02:00
Adrian Hunter
f55aa59106 UBIFS: fix bug where page is marked uptodate when out of space
UBIFS fast path in write_begin may mark a page up to date
and then discover that there may not be enough space to do
the write, and so fall back to a slow path.  The slow path
tries harder, but may still find no space - leaving the page
marked up to date, when it is not.  This patch ensures that
the page is marked not up to date in that case.

The bug that this patch fixes becomes evident when the write
is into a hole (sparse file) or is at the end of the file
and a subsequent read is off the end of the file.  In both
cases, the file system should return zeros but was instead
returning the page that had not been written because the
file system was out of space.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-03-14 16:46:33 +02:00
Artem Bityutskiy
84abf972cc UBIFS: add re-mount debugging checks
We observe space corrupted accounting when re-mounting. So add some
debbugging checks to catch problems like this.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-01-26 12:54:11 +02:00
Artem Bityutskiy
e8b815663b UBIFS: constify operations
Mark super, file, and inode operation structcutes with 'const'.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-01-18 14:05:08 +02:00
Nick Piggin
54566b2c15 fs: symlink write_begin allocation context fix
With the write_begin/write_end aops, page_symlink was broken because it
could no longer pass a GFP_NOFS type mask into the point where the
allocations happened.  They are done in write_begin, which would always
assume that the filesystem can be entered from reclaim.  This bug could
cause filesystem deadlocks.

The funny thing with having a gfp_t mask there is that it doesn't really
allow the caller to arbitrarily tinker with the context in which it can be
called.  It couldn't ever be GFP_ATOMIC, for example, because it needs to
take the page lock.  The only thing any callers care about is __GFP_FS
anyway, so turn that into a single flag.

Add a new flag for write_begin, AOP_FLAG_NOFS.  Filesystems can now act on
this flag in their write_begin function.  Change __grab_cache_page to
accept a nofs argument as well, to honour that flag (while we're there,
change the name to grab_cache_page_write_begin which is more instructive
and does away with random leading underscores).

This is really a more flexible way to go in the end anyway -- if a
filesystem happens to want any extra allocations aside from the pagecache
ones in ints write_begin function, it may now use GFP_KERNEL (rather than
GFP_NOFS) for common case allocations (eg.  ocfs2_alloc_write_ctxt, for a
random example).

[kosaki.motohiro@jp.fujitsu.com: fix ubifs]
[kosaki.motohiro@jp.fujitsu.com: fix fuse]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: <stable@kernel.org>		[2.6.28.x]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Cleaned up the calling convention: just pass in the AOP flags
  untouched to the grab_cache_page_write_begin() function.  That
  just simplifies everybody, and may even allow future expansion of the
  logic.   - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-04 13:33:20 -08:00
Artem Bityutskiy
f92b982680 UBIFS: fix checkpatch.pl warnings
These are mostly long lines and wrong indentation warning
fixes. But also there are two volatile variables and
checkpatch.pl complains about them:

WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
+       volatile int gc_seq;

WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt
+       volatile int gced_lnum;

Well, we anyway use smp_wmb() for c->gc_seq and c->gced_lnum, so
these 'volatile' modifiers can be just dropped.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-31 14:13:25 +02:00
Artem Bityutskiy
7bbe5b5aa6 UBIFS: use PAGE_CACHE_MASK correctly
It has high bits set, not low bits set as the UBIFS code
assumed.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:19:14 +02:00
Artem Bityutskiy
3477d20465 UBIFS: pre-allocate bulk-read buffer
To avoid memory allocation failure during bulk-read, pre-allocate
a bulk-read buffer, so that if there is only one bulk-reader at
a time, it would just use the pre-allocated buffer and would not
do any memory allocation. However, if there are more than 1 bulk-
reader, then only one reader would use the pre-allocated buffer,
while the other reader would allocate the buffer for itself.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-11-21 18:59:33 +02:00
Artem Bityutskiy
6c0c42cdfd UBIFS: do not allocate too much
Bulk-read allocates 128KiB or more using kmalloc. The allocation
starts failing often when the memory gets fragmented. UBIFS still
works fine in this case, because it falls-back to standard
(non-optimized) read method, though. This patch teaches bulk-read
to allocate exactly the amount of memory it needs, instead of
allocating 128KiB every time.

This patch is also a preparation to the further fix where we'll
have a pre-allocated bulk-read buffer as well. For example, now
the @bu object is prepared in 'ubifs_bulk_read()', so we could
path either pre-allocated or allocated information to
'ubifs_do_bulk_read()' later. Or teaching 'ubifs_do_bulk_read()'
not to allocate 'bu->buf' if it is already there.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-11-21 18:59:25 +02:00
Artem Bityutskiy
39ce81ce71 UBIFS: do not print scary memory allocation warnings
Bulk-read allocates a lot of memory with 'kmalloc()', and when it
is/gets fragmented 'kmalloc()' fails with a scarry warning. But
because bulk-read is just an optimization, UBIFS keeps working fine.
Supress the warning by passing __GFP_NOWARN option to 'kmalloc()'.

This patch also introduces a macro for the magic 128KiB constant.
This is just neater.

Note, this is not really fixes the problem we had, but just hides
the warnings. The further patches fix the problem.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-11-21 18:59:16 +02:00
Harvey Harrison
0ecb9529a4 UBIFS: endian handling fixes and annotations
Noticed by sparse:
fs/ubifs/file.c:75:2: warning: restricted __le64 degrades to integer
fs/ubifs/file.c:629:4: warning: restricted __le64 degrades to integer
fs/ubifs/dir.c:431:3: warning: restricted __le64 degrades to integer

This should be checked to ensure the ubifs_assert is working as
intended, I've done the suggested annotation in this patch.

fs/ubifs/sb.c:298:6: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:298:6:    expected int [signed] [assigned] tmp
fs/ubifs/sb.c:298:6:    got restricted __le64 [usertype] <noident>
fs/ubifs/sb.c:299:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:299:19:    expected restricted __le64 [usertype] atime_sec
fs/ubifs/sb.c:299:19:    got int [signed] [assigned] tmp
fs/ubifs/sb.c:300:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:300:19:    expected restricted __le64 [usertype] ctime_sec
fs/ubifs/sb.c:300:19:    got int [signed] [assigned] tmp
fs/ubifs/sb.c:301:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:301:19:    expected restricted __le64 [usertype] mtime_sec
fs/ubifs/sb.c:301:19:    got int [signed] [assigned] tmp

This looks like a bugfix as your tmp was a u32 so there was truncation in
the atime, mtime, ctime value, probably not intentional, add a tmp_le64
and use it here.

fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:419:9: warning: cast to restricted __le32

Read from the annotated union member instead.

fs/ubifs/recovery.c:175:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:175:13:    expected unsigned int [unsigned] [usertype] save_flags
fs/ubifs/recovery.c:175:13:    got restricted __le32 [usertype] flags
fs/ubifs/recovery.c:186:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:186:13:    expected restricted __le32 [usertype] flags
fs/ubifs/recovery.c:186:13:    got unsigned int [unsigned] [usertype] save_flags

Do byteshifting at compile time of the flag value.  Annotate the saved_flags
as le32.

fs/ubifs/debug.c:368:10: warning: cast to restricted __le32
fs/ubifs/debug.c:368:10: warning: cast from restricted __le64

Should be checked if the truncation was intentional, I've changed the
printk to print the full width.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-11-06 11:06:19 +02:00
Adrian Hunter
5c0013c16b UBIFS: fix bulk-read handling uptodate pages
Bulk-read skips uptodate pages but this was putting its
array index out and causing it to treat subsequent pages
as holes.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
2008-09-30 11:12:59 +03:00
Adrian Hunter
ed382d5898 UBIFS: ensure data read beyond i_size is zeroed out correctly
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
2008-09-30 11:12:57 +03:00
Adrian Hunter
4793e7c5e1 UBIFS: add bulk-read facility
Some flash media are capable of reading sequentially at faster rates.
UBIFS bulk-read facility is designed to take advantage of that, by
reading in one go consecutive data nodes that are also located
consecutively in the same LEB.

Read speed on Arm platform with OneNAND goes from 17 MiB/s to
19 MiB/s.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
2008-09-30 11:12:56 +03:00
Artem Bityutskiy
04da11bfcf UBIFS: fix zero-length truncations
Always allow truncations to zero, even if budgeting thinks there
is no space. UBIFS reserves some space for deletions anyway.

Otherwise, the following happans:
1. create a file, and write as much as possible there, until ENOSPC
2. truncate the file, which fails with ENOSPC, which is not good.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-08-21 16:48:52 +03:00
Zoltan Sogor
22bc7fa8c5 UBIFS: support splice_write
Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-08-13 11:38:43 +03:00
Artem Bityutskiy
dab4b4d2f9 UBIFS: align inode data to eight
UBIFS aligns node lengths to 8, so budgeting has to do the
same. Well, direntry, inode, and page budgets are already
aligned, but not inode data budget (e.g., data in special
devices or symlinks). Do this for inode data as well.
Also, add corresponding debugging checks.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-08-13 11:35:16 +03:00
Artem Bityutskiy
7d32c2bb14 UBIFS: improve debugging
1. Print inode mode in some of debugging messages
2. Add few more useful assertions

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-08-13 11:20:07 +03:00
Al Viro
3f8206d496 [PATCH] get rid of indirect users of namei.h
fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all.
Several places in the tree needed direct include.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:42 -04:00
Artem Bityutskiy
1e51764a3c UBIFS: add new flash file system
This is a new flash file system. See
http://www.linux-mtd.infradead.org/doc/ubifs.html

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
2008-07-15 17:35:15 +03:00