1430 Commits

Author SHA1 Message Date
Jaegeuk Kim
e589c2c477 f2fs: control not to exceed # of cached nat entries
This is to avoid cache entry management overhead including radix tree.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 10:18:08 -07:00
Jaegeuk Kim
29710bcf94 f2fs: fix wrong percentage
This should be 1%, 10MB / 1GB.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 09:45:41 -07:00
Jaegeuk Kim
1e7c48fa9a f2fs: avoid data race between FI_DIRTY_INODE flag and update_inode
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset
at any time like below.

Thread #1                        Thread #2
- lock_page(ipage)
- update i_fields
                                 - update i_size/i_blocks/and so on
                                 - set FI_DIRTY_INODE
- reset FI_DIRTY_INODE
- set_page_dirty(ipage)

In this case, we can lose the latest i_field information.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 09:45:40 -07:00
Jaegeuk Kim
9a449e9c3b f2fs: remove obsolete parameter in f2fs_truncate
We don't need the lock parameter, which is always true.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 09:45:39 -07:00
Jaegeuk Kim
338bbfa086 f2fs: avoid wrong count on dirty inodes
The count should be covered by spin_lock. Otherwise we can see a wrong count
in f2fs_stat.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 09:45:38 -07:00
Jaegeuk Kim
9f7c45ccd6 f2fs: remove deprecated parameter
Remove deprecated parameter.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-07 09:45:37 -07:00
Jaegeuk Kim
b230e6cabf f2fs: handle writepage correctly
Previously, f2fs_write_data_pages() called __f2fs_writepage(), which called
f2fs_write_data_page().
If f2fs_write_data_page() returned AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage()
called mapping_set_error(). But this should not happen every time, since
f2fs_write_data_page() sometimes skips writing pages without any error.
For example, volatile_write() gave EIO all the time, as Shuoran Liu pointed
out.

Reported-by: Shuoran Liu <liushuoran@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:24 -07:00
Jaegeuk Kim
eb4246dc12 f2fs: return error of f2fs_lookup
Now we can report an error to f2fs_lookup given by f2fs_find_entry.

Suggested-by: He YunLei <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:23 -07:00
Yunlong Song
0c9df7fb80 f2fs: return the errno to the caller to avoid using a wrong page
Commit aaf9607516 ("f2fs: check node page
contents all the time") pointed out that "sometimes it was reported that
its contents was missing", so it checks the page's mapping and contents.
When "nid != nid_of_node(page)", ERR_PTR(-EIO) will be returned to the
caller. However, commit e1c51b9f1d ("f2fs:
clean up node page updating flow") moved the "nid != nid_of_node(page)" test
into "f2fs_bug_on(sbi, nid != nid_of_node(page))". With F2FS_CHECK_FS off,
this returns a wrong page to the caller whenever "sometimes it was reported
that its contents was missing" happens.

This patch restores checking node page contents all the time, and returns
the errno so that the caller knows something is wrong and avoids using the
page. This patch also moves f2fs_bug_on to its proper location.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:22 -07:00
Jaegeuk Kim
46ae957f9b f2fs: remove two steps to flush dirty data pages
If there is no cold page, we don't need to do a loop to flush dirty
data pages.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 1.1 GB/s
 After  : 1.2 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 2.2 GB/s
 After  : 2.3 GB/s

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:21 -07:00
Jaegeuk Kim
28ea6162e2 f2fs: do not skip writing data pages
For data pages, let's try to flush as much as possible in background.

On /dev/pmem0,

1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
 Before : 800 MB/s
 After  : 1.1 GB/s

2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
 Before : 1.3 GB/s
 After  : 2.2 GB/s

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:20 -07:00
Jaegeuk Kim
53aa6bbfda f2fs: inject to produce some orphan inodes
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:19 -07:00
Jaegeuk Kim
42d964016e f2fs: propagate error given by f2fs_find_entry
If we get ENOMEM or EIO in f2fs_find_entry, we should stop right away.
Otherwise, for example, we can get duplicate directory entry by ->chash and
->clevel.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:18 -07:00
Jaegeuk Kim
b93f771286 f2fs: remove writepages lock
This patch removes writepages lock.
We can improve multi-threading performance.

tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:17 -07:00
Jaegeuk Kim
69e9e42744 f2fs: set flush_merge by default
This patch sets flush_merge by default.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:16 -07:00
Jaegeuk Kim
0a87f664d1 f2fs: detect congestion of flush command issues
If flush commands do not incur any congestion, we don't need to hand them to
the dispatching queue, which causes unnecessary latency.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:15 -07:00
Jaegeuk Kim
6d94c74ab8 f2fs: add lazytime mount option
This patch adds lazytime support.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:14 -07:00
Jaegeuk Kim
26de9b1171 f2fs: avoid unnecessary updating inode during fsync
If roll-forward recovery can recover i_size, we don't need to update inode's
metadata during fsync.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:13 -07:00
Jaegeuk Kim
ee6d182f2a f2fs: remove syncing inode page in all the cases
This patch reduces calls to the following functions across the whole tree.
- sync_inode_page()
- update_inode_page()
- update_inode()
- f2fs_write_inode()

Instead, checkpoint will flush all the dirty inode metadata before syncing
node pages.
Note that this is doable since we call mark_inode_dirty_sync() for every
inode field change that also needs to update the on-disk inode.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:12 -07:00
Jaegeuk Kim
0f18b462b2 f2fs: flush inode metadata when checkpoint is doing
This patch registers all the inodes which have dirty metadata so that they are
synced during checkpoint.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:11 -07:00
Jaegeuk Kim
205b98221c f2fs: call mark_inode_dirty_sync for i_field changes
This patch calls mark_inode_dirty_sync() for the following on-disk inode
changes.

 -> largest
 -> ctime/mtime/atime
 -> i_current_depth
 -> i_xattr_nid
 -> i_pino
 -> i_advise
 -> i_flags
 -> i_mode

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:11 -07:00
Jaegeuk Kim
a1961246c3 f2fs: introduce f2fs_i_links_write with mark_inode_dirty_sync
This patch introduces f2fs_i_links_write() to call mark_inode_dirty_sync() when
changing inode->i_links.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:10 -07:00
Jaegeuk Kim
8edd03c870 f2fs: introduce f2fs_i_blocks_write with mark_inode_dirty_sync
This patch introduces f2fs_i_blocks_write() to call mark_inode_dirty_sync() when
changing inode->i_blocks.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:09 -07:00
Jaegeuk Kim
fc9581c809 f2fs: introduce f2fs_i_size_write with mark_inode_dirty_sync
This patch introduces f2fs_i_size_write() to call mark_inode_dirty_sync() with
i_size_write().

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:08 -07:00
Jaegeuk Kim
91942321e4 f2fs: use inode pointer for {set, clear}_inode_flag
This patch refactors to use inode pointer for set_inode_flag and
clear_inode_flag.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-06-02 18:05:07 -07:00
Jaegeuk Kim
1c4bf76303 Revert "f2fs: no need inc dirty pages under inode lock"
This reverts commit b951a4ec16.

 Conflicts:
	fs/f2fs/checkpoint.c
2016-06-02 18:05:06 -07:00
Al Viro
5930122683 switch xattr_handler->set() to passing dentry and inode separately
preparation for similar switch in ->setxattr() (see the next commit for
rationale).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-05-27 15:39:43 -04:00
Linus Torvalds
f6c658df63 Merge tag 'for-f2fs-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, as Ted pointed out, fscrypto allows one more key prefix
  given by filesystem to resolve backward compatibility issues.  Other
  than that, we've fixed several error handling cases by introducing
  a fault injection facility.  We've also achieved performance
  improvement in some workloads as well as a bunch of bug fixes.

  Summary:

  Enhancements:
   - fs-specific prefix for fscrypto
   - fault injection facility
   - expose validity bitmaps for user to be aware of fragmentation
   - fallocate/rm/preallocation speed up
   - use percpu counters

  Bug fixes:
   - some inline_dentry/inline_data bugs
   - error handling for atomic/volatile/orphan inodes
   - recover broken superblock"

* tag 'for-f2fs-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (73 commits)
  f2fs: fix to update dirty page count correctly
  f2fs: flush pending bios right away when error occurs
  f2fs: avoid ENOSPC fault in the recovery process
  f2fs: make exit_f2fs_fs more clear
  f2fs: use percpu_counter for total_valid_inode_count
  f2fs: use percpu_counter for alloc_valid_block_count
  f2fs: use percpu_counter for # of dirty pages in inode
  f2fs: use percpu_counter for page counters
  f2fs: use bio count instead of F2FS_WRITEBACK page count
  f2fs: manipulate dirty file inodes when DATA_FLUSH is set
  f2fs: add fault injection to sysfs
  f2fs: no need inc dirty pages under inode lock
  f2fs: fix incorrect error path handling in f2fs_move_rehashed_dirents
  f2fs: fix i_current_depth during inline dentry conversion
  f2fs: correct return value type of f2fs_fill_super
  f2fs: fix deadlock when flush inline data
  f2fs: avoid f2fs_bug_on during recovery
  f2fs: show # of orphan inodes
  f2fs: support in batch fzero in dnode page
  f2fs: support in batch multi blocks preallocation
  ...
2016-05-21 18:25:28 -07:00
Andy Shevchenko
8da4b8c48e lib/uuid.c: move generate_random_uuid() to uuid.c
Let's gather the UUID related functions under one hood.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-20 17:58:30 -07:00
Chao Yu
0f3311a8c2 f2fs: fix to update dirty page count correctly
If we fail to merge inline data into the inode page while flushing an inline
inode, we skip invoking inode_dec_dirty_pages, which makes the dirty page
count incorrect and results in a panic in ->evict_inode. Fix it.

------------[ cut here ]------------
kernel BUG at /home/yuchao/git/devf2fs/inode.c:336!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 3 PID: 10004 Comm: umount Tainted: G           O    4.6.0-rc5+ #17
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: f0c33000 ti: c5212000 task.ti: c5212000
EIP: 0060:[<f89aacb5>] EFLAGS: 00010202 CPU: 3
EIP is at f2fs_evict_inode+0x85/0x490 [f2fs]
EAX: 00000001 EBX: c4529ea0 ECX: 00000001 EDX: 00000000
ESI: c0131000 EDI: f89dd0a0 EBP: c5213e9c ESP: c5213e78
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 80050033 CR2: b75878c0 CR3: 1a36a700 CR4: 000406f0
Stack:
 c4529ea0 c4529ef4 c5213e8c c176d45c c4529ef4 00000000 c4529ea0 c4529fac
 f89dd0a0 c5213eb0 c1204a68 c5213ed8 c452a2b4 c6680930 c5213ec0 c1204b64
 c6680d44 c6680620 c5213eec c120588d ee84b000 ee84b5c0 c5214000 ee84b5e0
Call Trace:
 [<c176d45c>] ? _raw_spin_unlock+0x2c/0x50
 [<c1204a68>] evict+0xa8/0x170
 [<c1204b64>] dispose_list+0x34/0x50
 [<c120588d>] evict_inodes+0x10d/0x130
 [<c11ea941>] generic_shutdown_super+0x41/0xe0
 [<c1185190>] ? unregister_shrinker+0x40/0x50
 [<c1185190>] ? unregister_shrinker+0x40/0x50
 [<c11eac52>] kill_block_super+0x22/0x70
 [<f89af23e>] kill_f2fs_super+0x1e/0x20 [f2fs]
 [<c11eae1d>] deactivate_locked_super+0x3d/0x70
 [<c11eb383>] deactivate_super+0x43/0x60
 [<c1208ec9>] cleanup_mnt+0x39/0x80
 [<c1208f50>] __cleanup_mnt+0x10/0x20
 [<c107d091>] task_work_run+0x71/0x90
 [<c105725a>] exit_to_usermode_loop+0x72/0x9e
 [<c1001c7c>] do_fast_syscall_32+0x19c/0x1c0
 [<c176dd48>] sysenter_past_esp+0x45/0x74
EIP: [<f89aacb5>] f2fs_evict_inode+0x85/0x490 [f2fs] SS:ESP 0068:c5213e78
---[ end trace d30536330b7fdc58 ]---

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-20 14:55:41 -07:00
Jaegeuk Kim
38f91ca8c0 f2fs: flush pending bios right away when error occurs
Given errors, this patch flushes pending bios as soon as possible.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-20 11:46:15 -07:00
Jaegeuk Kim
975756c413 f2fs: avoid ENOSPC fault in the recovery process
This patch avoids impossible error injection, ENOSPC, during recovery process.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-20 11:43:04 -07:00
Tiezhu Yang
b8bef79df7 f2fs: make exit_f2fs_fs more clear
init_f2fs_fs does:
    1) f2fs_build_trace_ios
    2) init_inodecache
    3) create_node_manager_caches
    4) create_segment_manager_caches
    5) create_checkpoint_caches
    6) create_extent_cache
    7) kset_create_and_add
    8) kobject_init_and_add
    9) register_shrinker
    10) register_filesystem
    11) f2fs_create_root_stats
    12) proc_mkdir

exit_f2fs_fs should do cleanup in the reverse order
to make the code clearer.

Signed-off-by: Tiezhu Yang <kernelpatch@126.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:31 -07:00
Jaegeuk Kim
513c5f3735 f2fs: use percpu_counter for total_valid_inode_count
This patch uses percpu_counter to avoid stat_lock.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:30 -07:00
Jaegeuk Kim
41382ec432 f2fs: use percpu_counter for alloc_valid_block_count
This patch uses percpu_counter for sbi->alloc_valid_block_count.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:29 -07:00
Jaegeuk Kim
1beba1b3a9 f2fs: use percpu_counter for # of dirty pages in inode
This patch adds percpu_counter for # of dirty pages in inode.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:28 -07:00
Jaegeuk Kim
523be8a6b3 f2fs: use percpu_counter for page counters
This patch substitutes percpu_counter for atomic_counter when counting
various types of pages.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:27 -07:00
Jaegeuk Kim
f573018491 f2fs: use bio count instead of F2FS_WRITEBACK page count
This can reduce page counting overhead.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-18 13:57:25 -07:00
Linus Torvalds
c2e7b20705 Merge branch 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs cleanups from Al Viro:
 "More cleanups from Christoph"

* 'work.preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  nfsd: use RWF_SYNC
  fs: add RWF_DSYNC aand RWF_SYNC
  ceph: use generic_write_sync
  fs: simplify the generic_write_sync prototype
  fs: add IOCB_SYNC and IOCB_DSYNC
  direct-io: remove the offset argument to dio_complete
  direct-io: eliminate the offset argument to ->direct_IO
  xfs: eliminate the pos variable in xfs_file_dio_aio_write
  filemap: remove the pos argument to generic_file_direct_write
  filemap: remove pos variables in generic_file_read_iter
2016-05-17 15:05:23 -07:00
Al Viro
0e0162bb8c Merge branch 'ovl-fixes' into for-linus
Backmerge to resolve a conflict in ovl_lookup_real();
"ovl_lookup_real(): use lookup_one_len_unlocked()" instead,
but it was too late in the cycle to rebase.
2016-05-17 02:17:59 -04:00
Jaegeuk Kim
10aa97c379 f2fs: manipulate dirty file inodes when DATA_FLUSH is set
It needs to maintain dirty file inodes only if DATA_FLUSH is set.
Otherwise, let's avoid its overhead.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-16 15:32:03 -07:00
Sheng Yong
087968974f f2fs: add fault injection to sysfs
This patch introduces a new struct f2fs_fault_info and a global f2fs_fault
to save fault injection status. Fault injection entries are created in
/sys/fs/f2fs/fault_injection/ during initializing f2fs module.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-16 15:32:02 -07:00
Yunlei He
b951a4ec16 f2fs: no need inc dirty pages under inode lock
There is no need to increment dirty pages under the inode lock.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-16 15:32:01 -07:00
Chao Yu
8975bdf482 f2fs: fix incorrect error path handling in f2fs_move_rehashed_dirents
Fix two bugs in the error path of f2fs_move_rehashed_dirents:
 - release dir's inode page if kmalloc fails
 - recover i_current_depth if the conversion fails

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-16 15:32:01 -07:00
Chao Yu
e4103849ba f2fs: fix i_current_depth during inline dentry conversion
With the steps below, dentry pages become inaccessible later.
This is because we forget to update i_current_depth in the inode during inline
dentry conversion; if we then fail somewhere, i_current_depth is left as 0 in
a non-inline directory. During ->lookup, that current_depth value makes all
dentry pages in the first level invisible. Fix it.

1) mount f2fs with inline_dentry option
2) mkdir dir
3) touch 180 files named [0-179] in dir
4) touch 180 in dir (fail after inline dir conversion)
5) ll dir

ls: cannot access /mnt/f2fs/dir/0: No such file or directory
ls: cannot access /mnt/f2fs/dir/1: No such file or directory
ls: cannot access /mnt/f2fs/dir/2: No such file or directory
ls: cannot access /mnt/f2fs/dir/3: No such file or directory
ls: cannot access /mnt/f2fs/dir/4: No such file or directory

drwxr-xr-x 2 root root 4096  may 13 21:47 ./
drwxr-xr-x 3 root root 4096  may 13 21:46 ../
-????????? ? ?    ?       ?             ? 0
-????????? ? ?    ?       ?             ? 1
-????????? ? ?    ?       ?             ? 10
-????????? ? ?    ?       ?             ? 100
-????????? ? ?    ?       ?             ? 101
-????????? ? ?    ?       ?             ? 102

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-16 15:32:00 -07:00
Sheng Yong
99e3e858a4 f2fs: correct return value type of f2fs_fill_super
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-16 15:31:59 -07:00
Chao Yu
ab47036d8f f2fs: fix deadlock when flush inline data
Below backtrace info was reported by Yunlei He:

Call Trace:
 [<ffffffff817a9395>] schedule+0x35/0x80
 [<ffffffff817abb7d>] rwsem_down_read_failed+0xed/0x130
 [<ffffffff813c12a8>] call_rwsem_down_read_failed+0x18/0x
 [<ffffffff817ab1d0>] down_read+0x20/0x30
 [<ffffffffa02a1a12>] f2fs_evict_inode+0x242/0x3a0 [f2fs]
 [<ffffffff81217057>] evict+0xc7/0x1a0
 [<ffffffff81217cd6>] iput+0x196/0x200
 [<ffffffff812134f9>] __dentry_kill+0x179/0x1e0
 [<ffffffff812136f9>] dput+0x199/0x1f0
 [<ffffffff811fe77b>] __fput+0x18b/0x220
 [<ffffffff811fe84e>] ____fput+0xe/0x10
 [<ffffffff81097427>] task_work_run+0x77/0x90
 [<ffffffff81074d62>] exit_to_usermode_loop+0x73/0xa2
 [<ffffffff81003b7a>] do_syscall_64+0xfa/0x110
 [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25

Call Trace:
 [<ffffffff817a9395>] schedule+0x35/0x80
 [<ffffffff81216dc3>] __wait_on_freeing_inode+0xa3/0xd0
 [<ffffffff810bc300>] ? autoremove_wake_function+0x40/0x4
 [<ffffffff8121771d>] find_inode_fast+0x7d/0xb0
 [<ffffffff8121794a>] ilookup+0x6a/0xd0
 [<ffffffffa02bc740>] sync_node_pages+0x210/0x650 [f2fs]
 [<ffffffff8122e690>] ? do_fsync+0x70/0x70
 [<ffffffffa02b085e>] block_operations+0x9e/0xf0 [f2fs]
 [<ffffffff8137b795>] ? bio_endio+0x55/0x60
 [<ffffffffa02b0942>] write_checkpoint+0x92/0xba0 [f2fs]
 [<ffffffff8117da57>] ? mempool_free_slab+0x17/0x20
 [<ffffffff8117de8b>] ? mempool_free+0x2b/0x80
 [<ffffffff8122e690>] ? do_fsync+0x70/0x70
 [<ffffffffa02a53e3>] f2fs_sync_fs+0x63/0xd0 [f2fs]
 [<ffffffff8129630f>] ? ext4_sync_fs+0xbf/0x190
 [<ffffffff8122e6b0>] sync_fs_one_sb+0x20/0x30
 [<ffffffff812002e9>] iterate_supers+0xb9/0x110
 [<ffffffff8122e7b5>] sys_sync+0x55/0x90
 [<ffffffff81003ae9>] do_syscall_64+0x69/0x110
 [<ffffffff817acf65>] entry_SYSCALL64_slow_path+0x25/0x25

With the following sequence of operations, we set inline_node in the inode
page after the inode was unlinked, resulting in the deadlock described below:
1. open file
2. write file
3. unlink file
4. write file
5. close file

Thread A                                Thread B
 - dput
  - iput_final
   - inode->i_state |= I_FREEING
   - evict
    - f2fs_evict_inode
                                         - f2fs_sync_fs
                                          - write_checkpoint
                                           - block_operations
                                            - f2fs_lock_all (down_write(cp_rwsem))
     - f2fs_lock_op (down_read(cp_rwsem))
                                            - sync_node_pages
                                             - ilookup
                                              - find_inode_fast
                                               - __wait_on_freeing_inode
                                                 (wait on I_FREEING clear)

To fix this, we now set the inline_node flag only for linked inodes.

Reported-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Tested-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: stable@vger.kernel.org # v4.6
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-11 09:56:38 -07:00
Jaegeuk Kim
3b9b10f9ce f2fs: avoid f2fs_bug_on during recovery
We don't need to use f2fs_bug_on() to handle error cases when allocating
a block during recovery.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-11 09:56:37 -07:00
Jaegeuk Kim
652be55162 f2fs: show # of orphan inodes
This adds debug information for # of orphan inodes.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-11 09:56:36 -07:00
Chao Yu
6e9619499f f2fs: support in batch fzero in dnode page
This patch tries to speed up fzero_range by batching space preallocation and
block address removal within one dnode page.

In virtual machine, with zram driver:

dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=4096
time xfs_io -f /mnt/f2fs/file -c "fzero 0 4096M"

Before:
real	0m3.276s
user	0m0.008s
sys	0m3.260s

After:
real	0m1.568s
user	0m0.000s
sys	0m1.564s

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: consider ENOSPC case]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-05-11 09:56:36 -07:00