linux

Author	SHA1	Message	Date
Linus Torvalds	80201fe175	for-5.1/block-20190302 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlx63XIQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpp2vEACfrrQsap7R+Av28mmXpmXi2FPa3g5Tev1t yYjK2qHvhlMZjPTYw3hCmbYdDDczlF7PEgSE2x2DjdcsYapb8Fy1lZ2X16c7ztBR HD/t9b5AVSQsczZzKgv3RqsNtTnjzS5V0A8XH8FAP2QRgiwDMwSN6G0FP0JBLbE/ ZgxQrH1Iy1F33Wz4hI3Z7dEghKPZrH1IlegkZCEu47q9SlWS76qUetSy2GEtchOl 3Lgu54mQZyVdI5/QZf9DyMDLF6dIz3tYU2qhuo01AHjGRCC72v86p8sIiXcUr94Q 8pbegJhJ/g8KBol9Qhv3+pWG/QUAZwi/ZwasTkK+MJ4klRXfOrznxPubW1z6t9Vn QRo39Po5SqqP0QWAscDxCFjESIQlWlKa+LZurJL7DJDCUGrSgzTpnVwFqKwc5zTP HJa5MT2tEeL2TfUYRYCfh0ZV0elINdHA1y1klDBh38drh4EWr2gW8xdseGYXqRjh fLgEpoF7VQ8kTvxKN+E4jZXkcZmoLmefp0ZyAbblS6IawpPVC7kXM9Fdn2OU8f2c fjVjvSiqxfeN6dnpfeLDRbbN9894HwgP/LPropJOQ7KmjCorQq5zMDkAvoh3tElq qwluRqdBJpWT/F05KweY+XVW8OawIycmUWqt6JrVNoIDAK31auHQv47kR0VA4OvE DRVVhYpocw== =VBaU -----END PGP SIGNATURE----- Merge tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block Pull block layer updates from Jens Axboe: "Not a huge amount of changes in this round, the biggest one is that we finally have Mings multi-page bvec support merged. Apart from that, this pull request contains: - Small series that avoids quiescing the queue for sysfs changes that match what we currently have (Aleksei) - Series of bcache fixes (via Coly) - Series of lightnvm fixes (via Mathias) - NVMe pull request from Christoph. Nothing major, just SPDX/license cleanups, RR mp policy (Hannes), and little fixes (Bart, Chaitanya). - BFQ series (Paolo) - Save blk-mq cpu -> hw queue mapping, removing a pointer indirection for the fast path (Jianchao) - fops->iopoll() added for async IO polling, this is a feature that the upcoming io_uring interface will use (Christoph, me) - Partition scan loop fixes (Dongli) - mtip32xx conversion from managed resource API (Christoph) - cdrom registration race fix (Guenter) - MD pull from Song, two minor fixes. - Various documentation fixes (Marcos) - Multi-page bvec feature. This brings a lot of nice improvements with it, like more efficient splitting, larger IOs can be supported without growing the bvec table size, and so on. (Ming) - Various little fixes to core and drivers" * tag 'for-5.1/block-20190302' of git://git.kernel.dk/linux-block: (117 commits) block: fix updating bio's front segment size block: Replace function name in string with __func__ nbd: propagate genlmsg_reply return code floppy: remove set but not used variable 'q' null_blk: fix checking for REQ_FUA block: fix NULL pointer dereference in register_disk fs: fix guard_bio_eod to check for real EOD errors blk-mq: use HCTX_TYPE_DEFAULT but not 0 to index blk_mq_tag_set->map block: optimize bvec iteration in bvec_iter_advance block: introduce mp_bvec_for_each_page() for iterating over page block: optimize blk_bio_segment_split for single-page bvec block: optimize __blk_segment_map_sg() for single-page bvec block: introduce bvec_nth_page() iomap: wire up the iopoll method block: add bio_set_polled() helper block: wire up block device iopoll method fs: add an iopoll method to struct file_operations loop: set GENHD_FL_NO_PART_SCAN after blkdev_reread_part() loop: do not print warn message if partition scan is successful block: bounce: make sure that bvec table is updated ...	2019-03-08 14:12:17 -08:00
Gao Xiang	a112152f6f	staging: erofs: fix mis-acted TAIL merging behavior EROFS has an optimized path called TAIL merging, which is designed to merge multiple reads and the corresponding decompressions into one if these requests read continuous pages almost at the same time. In general, it behaves as follows: ________________________________________________________________ ... \| TAIL . HEAD \| PAGE \| PAGE \| TAIL . HEAD \| ... _____\|_combined page A_\|________\|________\|_combined page B_\|____ 1 ] -> [ 2 ] -> [ 3 If the above three reads are requested in the order 1-2-3, it will generate a large work chain rather than 3 individual work chains to reduce scheduling overhead and boost up sequential read. However, if Read 2 is processed slightly earlier than Read 1, currently it still generates 2 individual work chains (chain 1, 2) but it does in-place decompression for combined page A, moreover, if chain 2 decompresses ahead of chain 1, it will be a race and lead to corrupted decompressed page. This patch fixes it. Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-27 15:41:57 +01:00
Gao Xiang	1e5ceeab69	staging: erofs: fix illegal address access under memory pressure Considering a read request with two decompressed file pages, If a decompression work cannot be started on the previous page due to memory pressure but in-memory LTP map lookup is done, builder->work should be still NULL. Moreover, if the current page also belongs to the same map, it won't try to start the decompression work again and then run into trouble. This patch aims to solve the above issue only with little changes as much as possible in order to make the fix backport easier. kernel message is: <4>[1051408.015930s]SLUB: Unable to allocate memory on node -1, gfp=0x2408040(GFP_NOFS\|__GFP_ZERO) <4>[1051408.015930s] cache: erofs_compress, object size: 144, buffer size: 144, default order: 0, min order: 0 <4>[1051408.015930s] node 0: slabs: 98, objs: 2744, free: 0 * Cannot allocate the decompression work <3>[1051408.015960s]erofs: z_erofs_vle_normalaccess_readpages, readahead error at page 1008 of nid `5391488` * Note that the previous page was failed to read <0>[1051408.015960s]Internal error: Accessing user space memory outside uaccess.h routines: 96000005 [#1] PREEMPT SMP ... <4>[1051408.015991s]Hardware name: kirin710 (DT) ... <4>[1051408.016021s]PC is at z_erofs_vle_work_add_page+0xa0/0x17c <4>[1051408.016021s]LR is at z_erofs_do_read_page+0x12c/0xcf0 ... <4>[1051408.018096s][<ffffff80c6fb0fd4>] z_erofs_vle_work_add_page+0xa0/0x17c <4>[1051408.018096s][<ffffff80c6fb3814>] z_erofs_vle_normalaccess_readpages+0x1a0/0x37c <4>[1051408.018096s][<ffffff80c6d670b8>] read_pages+0x70/0x190 <4>[1051408.018127s][<ffffff80c6d6736c>] __do_page_cache_readahead+0x194/0x1a8 <4>[1051408.018127s][<ffffff80c6d59318>] filemap_fault+0x398/0x684 <4>[1051408.018127s][<ffffff80c6d8a9e0>] __do_fault+0x8c/0x138 <4>[1051408.018127s][<ffffff80c6d8f90c>] handle_pte_fault+0x730/0xb7c <4>[1051408.018127s][<ffffff80c6d8fe04>] __handle_mm_fault+0xac/0xf4 <4>[1051408.018157s][<ffffff80c6d8fec8>] handle_mm_fault+0x7c/0x118 <4>[1051408.018157s][<ffffff80c8c52998>] do_page_fault+0x354/0x474 <4>[1051408.018157s][<ffffff80c8c52af8>] do_translation_fault+0x40/0x48 <4>[1051408.018157s][<ffffff80c6c002f4>] do_mem_abort+0x80/0x100 <4>[1051408.018310s]---[ end trace 9f4009a3283bd78b ]--- Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-27 15:41:57 +01:00
Gao Xiang	af692e117c	staging: erofs: compressed_pages should not be accessed again after freed This patch resolves the following page use-after-free issue, z_erofs_vle_unzip: ... for (i = 0; i < nr_pages; ++i) { ... z_erofs_onlinepage_endio(page); (1) } for (i = 0; i < clusterpages; ++i) { page = compressed_pages[i]; if (page->mapping == mngda) (2) continue; /* recycle all individual staging pages */ (void)z_erofs_gather_if_stagingpage(page_pool, page); (3) WRITE_ONCE(compressed_pages[i], NULL); } ... After (1) is executed, page is freed and could be then reused, if compressed_pages is scanned after that, it could fall info (2) or (3) by mistake and that could finally be in a mess. This patch aims to solve the above issue only with little changes as much as possible in order to make the fix backport easier. Fixes: `3883a79abd` ("staging: erofs: introduce VLE decompression support") Cc: <stable@vger.kernel.org> # 4.19+ Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-27 15:41:57 +01:00
Gao Xiang	bee1568293	staging: erofs: switch to ->iterate_shared() After commit `6192269444` ("introduce a parallel variant of ->iterate()"), readdir can be done without taking exclusive inode lock of course. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-26 11:52:46 +01:00
Gao Xiang	b2bb112db1	staging: erofs: no need to take page lock in readdir VFS will take inode_lock for readdir, therefore no need to take page lock in readdir at all just as the majority of other generic filesystems. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-26 11:52:45 +01:00
Gao Xiang	047d4abc4d	staging: erofs: remove rcu_read_lock() in erofs_try_to_free_cached_page page_private(page) cannot be changed if page lock is taken. Besides, the corresponding workgroup won't be freed if the page is already protected by page lock, therefore no need to take rcu read lock. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-20 11:20:55 +01:00
Gao Xiang	62dc45979f	staging: erofs: fix race of initializing xattrs of a inode at the same time In real scenario, there could be several threads accessing xattrs of the same xattr-uninitialized inode, and init_inode_xattrs() almost at the same time. That's actually an unexpected behavior, this patch closes the race. Fixes: `b17500a0fd` ("staging: erofs: introduce xattr & acl support") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-20 11:20:55 +01:00
Bhanusree Pola	cbebe5d05d	staging: erofs: match alignment with open parentheses Align code with open parantheses to improve the readability. Issue found using checkpatch.pl Signed-off-by: Bhanusree Pola <bhanusreemahesh@gmail.com> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-19 15:18:49 +01:00
Ming Lei	6dc4f100c1	block: allow bio_for_each_segment_all() to iterate over multi-page bvec This patch introduces one extra iterator variable to bio_for_each_segment_all(), then we can allow bio_for_each_segment_all() to iterate over multi-page bvec. Given it is just one mechannical & simple change on all bio_for_each_segment_all() users, this patch does tree-wide change in one single patch, so that we can avoid to use a temporary helper for this conversion. Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-02-15 08:40:11 -07:00
Gao Xiang	419d6efc50	staging: erofs: keep corrupted fs from crashing kernel in erofs_namei() As Al pointed out, " ... and while we are at it, what happens to unsigned int nameoff = le16_to_cpu(de[mid].nameoff); unsigned int matched = min(startprfx, endprfx); struct qstr dname = QSTR_INIT(data + nameoff, unlikely(mid >= ndirents - 1) ? maxsize - nameoff : le16_to_cpu(de[mid + 1].nameoff) - nameoff); /* string comparison without already matched prefix */ int ret = dirnamecmp(name, &dname, &matched); if le16_to_cpu(de[...].nameoff) is not monotonically increasing? I.e. what's to prevent e.g. (unsigned)-1 ending up in dname.len? Corrupted fs image shouldn't oops the kernel.. " Revisit the related lookup flow to address the issue. Fixes: `d72d1ce601` ("staging: erofs: add namei functions") Cc: <stable@vger.kernel.org> # 4.19+ Suggested-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-14 10:47:21 +01:00
Sheng Yong	3b1b5291f7	staging: erofs: fix memleak of inode's shared xattr array If it fails to read a shared xattr page, the inode's shared xattr array is not freed. The next time the inode's xattr is accessed, the previously allocated array is leaked. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Fixes: `b17500a0fd` ("staging: erofs: introduce xattr & acl support") Cc: <stable@vger.kernel.org> # 4.19+ Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-14 10:47:20 +01:00
Chengguang Xu	209312369e	staging: erofs: remove redundant unlikely annotation in unzip_vle.c unlikely has already included in IS_ERR(), so just remove it. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-12 10:45:45 +01:00
Chengguang Xu	7fadcdce5d	staging: erofs: remove redundant likely/unlikely annotation in namei.c unlikely has already included in IS_ERR(), so just remove redundant likely/unlikely annotation. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-12 10:45:45 +01:00
Masahiro Yamada	2fa495892b	staging: prefix header search paths with $(srctree)/ Currently, the Kbuild core manipulates header search paths in a crazy way [1]. To fix this mess, I want all Makefiles to add explicit $(srctree)/ to the search paths in the srctree. Some Makefiles are already written in that way, but not all. The goal of this work is to make the notation consistent, and finally get rid of the gross hacks. Having whitespaces after -I does not matter since commit `48f6e3cf5b` ("kbuild: do not drop -I without parameter"). [1]: https://patchwork.kernel.org/patch/9632347/ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-02-04 12:30:27 +01:00
Gao Xiang	516c115c91	staging: erofs: complete POSIX ACL support Let's add .get_acl() to read the file's acl from its xattrs to make POSIX ACL usable. Here is the on-disk detail, fullname: system.posix_acl_access struct erofs_xattr_entry: .e_name_len = 0 .e_name_index = EROFS_XATTR_INDEX_POSIX_ACL_ACCESS (2) fullname: system.posix_acl_default struct erofs_xattr_entry: .e_name_len = 0 .e_name_index = EROFS_XATTR_INDEX_POSIX_ACL_DEFAULT (3) Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-30 15:38:50 +01:00
Gao Xiang	a24df1f62f	staging: erofs: use xattr_prefix to wrap up Let's use xattr_prefix instead of open code. No logic changes. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-30 15:38:50 +01:00
Chengguang Xu	94832d9399	staging: erofs: fix potential double iput in erofs_read_super() Some error cases like failing from d_make_root() will cause double iput because d_make_root() also does iput in its error path. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-25 09:52:02 +01:00
Gao Xiang	2e1d66379e	staging: erofs: drop the extern prefix for function definitions Fix all `CHECK: extern prototypes should be avoided in .h files' reported by checkpatch.pl. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-18 10:37:16 +01:00
Gao Xiang	d55bc7ba6b	staging: erofs: staticize erofs_shrink_count, erofs_shrink_scan Move erofs_shrinker_info to utils.c and therefore no need to globalize erofs_shrink_count and erofs_shrink_scan. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-18 10:37:16 +01:00
Gao Xiang	4501ca36bc	staging: erofs: move shrink accounting inside the function This patch moves the &erofs_global_shrink_cnt accounting from the caller to erofs_workgroup_get(). It's cleaner and it matches erofs_workgroup_put() better. No behavior change. Reviewed-by: Chao Yu <yuchao0@huawei.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-18 10:37:16 +01:00
Gao Xiang	d60eff4396	staging: erofs: localize erofs_workgroup_get Staticize erofs_workgroup_get since no external user out of utils.c directly calls erofs_workgroup_get. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-18 10:34:01 +01:00
Gao Xiang	61c9314fdd	staging: erofs: sunset erofs_workstation_cleanup_all There is only one user calling erofs_workstation_cleanup_all, and it is no likely that more users will use in that way in the future. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-18 10:34:01 +01:00
Chao Yu	3b423417d0	staging: erofs: clean up erofs_map_blocks_iter This patch cleans up erofs_map_blocks* function and structure family, just simply the code, no logic change. Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-15 15:52:26 +01:00
Gao Xiang	6af7b48305	staging: erofs: move erofs_xattr_handlers to xattr.h Let's move independent xattr-related stuffs to xattr.h. No logic changes. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-15 15:51:57 +01:00
Gao Xiang	609398266c	staging: erofs: remove unneeded inode_operations Currently, EROFS uses generic iops when xattr is off, it seems unnecessary and a lot of extra code is there. Let's follow what other filesystems do instead. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-15 15:51:57 +01:00
Gao Xiang	7077fffcb0	staging: erofs: fix fast symlink w/o xattr when fs xattr is on Currently, this will hit a BUG_ON for these symlinks as follows: - kernel message ------------[ cut here ]------------ kernel BUG at drivers/staging/erofs/xattr.c:59! SMP PTI CPU: 1 PID: 1170 Comm: getllxattr Not tainted 4.20.0-rc6+ #92 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014 RIP: 0010:init_inode_xattrs+0x22b/0x270 Code: 48 0f 45 ea f0 ff 4d 34 74 0d 41 83 4c 24 e0 01 31 c0 e9 00 fe ff ff 48 89 ef e8 e0 31 9e ff eb e9 89 e8 e9 ef fd ff ff 0f 0$ <0f> 0b 48 89 ef e8 fb f6 9c ff 48 8b 45 08 a8 01 75 24 f0 ff 4d 34 RSP: 0018:ffffa03ac026bdf8 EFLAGS: 00010246 ------------[ cut here ]------------ ... Call Trace: erofs_listxattr+0x30/0x2c0 ? selinux_inode_listxattr+0x5a/0x80 ? kmem_cache_alloc+0x33/0x170 ? security_inode_listxattr+0x27/0x40 listxattr+0xaf/0xc0 path_listxattr+0x5a/0xa0 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ... ---[ end trace 3c24b49408dc0c72 ]--- Fix it by checking ->xattr_isize in init_inode_xattrs(), and it also fixes improper return value -ENOTSUPP (it should be -ENODATA if xattr is enabled) for those inodes. Fixes: `b17500a0fd` ("staging: erofs: introduce xattr & acl support") Cc: <stable@vger.kernel.org> # 4.19+ Reported-by: Li Guifu <bluce.liguifu@huawei.com> Tested-by: Li Guifu <bluce.liguifu@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-15 15:51:57 +01:00
Gao Xiang	fdb0536469	staging: erofs: add document This documents key feature, usage, and on-disk design of erofs. Reviewed-by: Chao Yu <yuchao0@huawei.com> Cc: <linux-fsdevel@vger.kernel.org> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-15 15:51:22 +01:00
Sidong Yang	4b03f3f4cc	staging: erofs: Add identifier for function definition arguments Add identifier for function definition arguments in xattr_iter_handlers, this change clears the checkpatch.pl issue and make code more explicit. Signed-off-by: Sidong Yang <realwakka@gmail.com> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-11 10:44:20 +01:00
Jeremy Sowden	e7dfb1cff6	staging: erofs: fixed -Wmissing-prototype warnings by moving prototypes to header file. Moved prototypes for two functions to a header file in order to fix the following warnings: drivers/staging/erofs/unzip_vle.c:577:6: warning: no previous prototype for ‘erofs_workgroup_free_rcu’ [-Wmissing-prototypes] void erofs_workgroup_free_rcu(struct erofs_workgroup grp) ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/staging/erofs/unzip_vle.c:1702:5: warning: no previous prototype for ‘z_erofs_map_blocks_iter’ [-Wmissing-prototypes] int z_erofs_map_blocks_iter(struct inode inode, ^~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-11 10:44:20 +01:00
Jeremy Sowden	0a64d62d53	staging: erofs: fixed -Wmissing-prototype warnings by making functions static. Define three functions which are not used outside their own translation-units as static in order to fix the following warnings: drivers/staging/erofs/utils.c:134:6: warning: no previous prototype for ‘erofs_try_to_release_workgroup’ [-Wmissing-prototypes] bool erofs_try_to_release_workgroup(struct erofs_sb_info sbi, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/staging/erofs/unzip_vle.c:592:6: warning: no previous prototype for ‘z_erofs_vle_work_release’ [-Wmissing-prototypes] void z_erofs_vle_work_release(struct z_erofs_vle_work work) ^~~~~~~~~~~~~~~~~~~~~~~~ drivers/staging/erofs/unzip_vle_lz4.c:16:5: warning: no previous prototype for ‘z_erofs_unzip_lz4’ [-Wmissing-prototypes] int z_erofs_unzip_lz4(void in, void out, size_t inlen, size_t outlen) ^~~~~~~~~~~~~~~~~ Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-11 10:44:20 +01:00
Gao Xiang	c706d4b744	staging: erofs: fix return type of erofs_workgroup_get There exists a return type misuse (`int'->`bool') since all users assume it fails if only return value != 0, let's fix the return type to `int' instead of confusing `bool'. No logic changes. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-01-07 08:56:07 +01:00
Aaron Strahlberger	019ec6c14f	staging: erofs: Fix spelling issue Changed "stoped" to "stopped". Signed-off-by: Aaron Strahlberger <aaron.strahlberger@posteo.de> Signed-off-by: Julius Wiedmann <julius.wiedmann@fau.de> Signed-off-by: Dominik Huber <domi250@gmx.de> Reviewed-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-12 11:30:05 +01:00
Gao Xiang	ccd9c19c7a	staging: erofs: remove __EROFS_BIT It's better to use pre-calculated values to make on-disk definition more straight-forward and human-readable. Since there is the only one user, let's remove __EROFS_BIT entirely. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Gao Xiang <hsiangkao@aol.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-12 11:29:46 +01:00
Gao Xiang	b8e076a6ef	staging: erofs: unzip_vle_lz4.c,utils.c: rectify BUG_ONs remove all redundant BUG_ONs, and turn the rest useful usages to DBG_BUGONs. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-12 10:56:34 +01:00
Gao Xiang	70b17991d8	staging: erofs: unzip_{pagevec.h,vle.c}: rectify BUG_ONs remove all redundant BUG_ONs, and turn the rest useful usages to DBG_BUGONs. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-12 10:56:34 +01:00
Gao Xiang	7146a4f026	staging: erofs: simplify `z_erofs_vle_submit_all' Previously, there are too many hacked stuffs such as `__FSIO_1', `lstgrp_noio', `lstgrp_io' out there in `z_erofs_vle_submit_all'. Revisit the whole process by properly introducing jobqueue to represent each type of queued workgroups, furthermore hide all of crazyness behind independent separated functions. After this patch, 2 independent jobqueues exist if managed cache is enabled, or 1 jobqueue if disabled. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-07 17:10:48 +01:00
Gao Xiang	6afd227ca1	staging: erofs: redefine where `owned_workgrp_t' points By design, workgroups are queued in the form of linked lists. Previously, it points to the next `z_erofs_vle_workgroup', which isn't flexible enough to simplify `z_erofs_vle_submit_all'. Let's fix it by pointing to the next `owned_workgrp_t' and use container_of to get its coresponding `z_erofs_vle_workgroup'. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-07 17:10:48 +01:00
Gao Xiang	92e6efd566	staging: erofs: refine compressed pages preload flow Currently, there are two kinds of compressed pages in erofs: 1) file pages for the in-place decompression and 2) managed pages for cached decompression. Both are all stored in grp->compressed_pages[]. For managed pages, they could already exist or could be preloaded in this round, including the following cases in detail: 1) Already valid (loaded in some previous round); 2) PAGE_UNALLOCATED, should be allocated at the time of submission; 3) Just found in the managed cache, and with an extra page ref. Currently, 1) and 3) can be distinguishable by lock_page and checking its PG_private, which is guaranteed by the reclaim path, but it's better to do a double check by using an extra tag. This patch reworks the preload flow by introducing such the tag by using tagged pointer, too many #ifdefs are removed as well. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-07 17:10:48 +01:00
Gao Xiang	9248fce714	staging: erofs: revisit the page submission flow Previously, the submission flow works with cached compressed pages reclaim path in a tricky way, and it could be buggy if the reclaim path changes later without such tricky restrictions. For example, currently one PagePrivate(page) is evaluated without taking page lock (it only follows a wait_for_page_locked which closes such race) and no handling solves the potential page truncation case. In addition, it's also full of #ifdefs in the function, which is hard to understand and maintain. this patch fixes them all. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-07 17:10:48 +01:00
Gao Xiang	672e547610	staging: erofs: localize UNALLOCATED_CACHED_PAGE placeholder In practice, in order to do cached decompression rather than reuse them for in-place decompression and make full use of pages in page_pool instead of allocating as much as possible, an unallocated placeholder was introduce to mark all in compressed_pages[] and they will be replaced at the time of submission. Previously EROFS_UNALLOCATED_CACHED_PAGE was included in internal.h, which is unnecessary since it's only internally used in decompression subsystem, move it to unzip_vle.c and rename it to PAGE_UNALLOCATED. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-07 17:10:48 +01:00
Gao Xiang	c1448fa880	staging: erofs: introduce MNGD_MAPPING helper This patch introduces MNGD_MAPPING to wrap up sbi->managed_cache->i_mapping, which will be used to solve too many #ifdefs in a single function. No logic changes. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-07 17:10:48 +01:00
Gao Xiang	848bd9acdc	staging: erofs: fix use-after-free of on-stack `z_erofs_vle_unzip_io' The root cause is the race as follows: Thread #0 Thread #1 z_erofs_vle_unzip_kickoff z_erofs_submit_and_unzip struct z_erofs_vle_unzip_io io[] atomic_add_return() wait_event() [end of function] wake_up() Fix it by taking the waitqueue lock between atomic_add_return and wake_up to close such the race. kernel message: Unable to handle kernel paging request at virtual address 97f7052caa1303dc ... Workqueue: kverityd verity_work task: ffffffe32bcb8000 task.stack: ffffffe3298a0000 PC is at __wake_up_common+0x48/0xa8 LR is at __wake_up+0x3c/0x58 ... Call trace: ... [<ffffff94a08ff648>] __wake_up_common+0x48/0xa8 [<ffffff94a08ff8b8>] __wake_up+0x3c/0x58 [<ffffff94a0c11b60>] z_erofs_vle_unzip_kickoff+0x40/0x64 [<ffffff94a0c118e4>] z_erofs_vle_read_endio+0x94/0x134 [<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8 [<ffffff94a1076540>] dec_pending+0x134/0x32c [<ffffff94a1076f28>] clone_endio+0x90/0xf4 [<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8 [<ffffff94a1095024>] verity_work+0x210/0x368 [<ffffff94a08c4150>] process_one_work+0x188/0x4b4 [<ffffff94a08c45bc>] worker_thread+0x140/0x458 [<ffffff94a08cad48>] kthread+0xec/0x108 [<ffffff94a0883ab4>] ret_from_fork+0x10/0x1c Code: d1006273 54000260 f9400804 b9400019 (b85fc081) ---[ end trace be9dde154f677cd1 ]--- Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-07 17:10:48 +01:00
Gao Xiang	3c49898715	staging: erofs: update erofs-utils information in TODO erofs-utils was released by the original author Li Guifu weeks ago in the linux-erofs mailing list [1], update information in TODO and let's wait the original author finish all open source works. [1] https://lists.ozlabs.org/pipermail/linux-erofs/2018-November/001004.html Cc: Li Guifu <bluce.liguifu@huawei.com> Cc: Fang Wei <fangwei1@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-06 16:08:56 +01:00
Gao Xiang	8b987bca2d	staging: erofs: {dir,inode,super}.c: rectify BUG_ONs remove all redundant BUG_ONs, and turn the rest useful usages to DBG_BUGONs. Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-12-06 16:08:56 +01:00
Gao Xiang	f0c519fc26	staging: erofs: rename strange variable names in z_erofs_vle_frontend Previously, 2 members called `initial' and `cachedzone_la' are used for applying caching policy (whether the workgroup is at either end), which are hard to understand, rename them to `backmost' and `headoffset'. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-11-23 10:53:08 +01:00
Gao Xiang	2d9b5dcd99	staging: erofs: decompress asynchronously if PG_readahead page at first For the case of nr_to_read == lookahead_size, it is better to decompress asynchronously as well since no page will be needed immediately. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-11-23 10:53:08 +01:00
Gao Xiang	23edf3abe7	staging: erofs: locked before registering for all new workgroups Let's make sure that the one registering a workgroup will also take the primary work lock at first for two reasons: 1) There's no need to introduce such a race window (and consequently overhead) between registering and locking, other tasks could break in by chance, and the race seems unnecessary (no benefit at all); 2) It's better to take the primary work when a workgroup is registered to apply the cache managed policy, for example, if some other tasks break in, it could turn into the in-place decompression rather than use as the cached decompression. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-11-23 10:53:08 +01:00
Gao Xiang	48d4bf3b05	staging: erofs: separate into init_once / always `z_erofs_vle_workgroup' is heavily generated in the decompression, for example, it resets 32 bytes redundantly for 64-bit platforms even through Z_EROFS_VLE_INLINE_PAGEVECS + Z_EROFS_CLUSTER_MAX_PAGES, default 4, pages are stored in `z_erofs_vle_workgroup'. As an another example, `struct mutex' takes 72 bytes for our kirin 64-bit platforms, it's unnecessary to be reseted at first and be initialized each time. Let's avoid filling all `z_erofs_vle_workgroup' with 0 at first since most fields are reinitialized to meaningful values later, and pagevec is no need to initialized at all. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-11-23 10:53:08 +01:00
Gao Xiang	948bbdb181	staging: erofs: add a full barrier in erofs_workgroup_unfreeze Just like other generic locks, insert a full barrier in case of memory reorder. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-11-23 10:53:08 +01:00

1 2 3

145 Commits