linux

Author	SHA1	Message	Date
Masahiro Yamada	9390dff66a	kbuild: invoke syncconfig if include/config/auto.conf.cmd is missing If include/config/auto.conf.cmd is lost for some reason, it is not self-healing, so the top Makefile fails to run syncconfig. Move include/config/auto.conf.cmd to the target side. I used a pattern rule instead of a normal rule here although it is a bit gross. If the rule were written with a normal rule like this, include/config/auto.conf \ include/config/auto.conf.cmd \ include/config/tristate.conf: $(KCONFIG_CONFIG) $(Q)$(MAKE) -f $(srctree)/Makefile syncconfig ... syncconfig would be executed per target. Using a pattern rule makes sure that syncconfig is executed just once because Make assumes the recipe will create all of the targets. Here is a quote from the GNU Make manual [1]: "Pattern rules may have more than one target. Unlike normal rules, this does not act as many different rules with the same prerequisites and recipe. If a pattern rule has multiple targets, make knows that the rule's recipe is responsible for making all of the targets. The recipe is executed only once to make all the targets. When searching for a pattern rule to match a target, the target patterns of a rule other than the one that matches the target in need of a rule are incidental: make worries only about giving a recipe and prerequisites to the file presently in question. However, when this file's recipe is run, the other targets are marked as having been updated themselves." [1]: https://www.gnu.org/software/make/manual/html_node/Pattern-Intro.html Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-02-27 22:22:12 +09:00
Ming Lei	bbcbbd567c	block: optimize blk_bio_segment_split for single-page bvec Introduce a fast path for single-page bvec IO, so we can avoid calling bvec_split_segs() unnecessarily. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-02-27 06:18:55 -07:00
Ming Lei	48d7727cae	block: optimize __blk_segment_map_sg() for single-page bvec Introduce a fast path for single-page bvec IO so that blk_bvec_map_sg() can be avoided. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-02-27 06:18:54 -07:00
Ming Lei	4d633062c1	block: introduce bvec_nth_page() Single-page bvecs are often seen in small BS workloads, so introduce bvec_nth_page() to avoid calling nth_page() unnecessarily, which does not look cheap. Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2019-02-27 06:18:52 -07:00
David Sterba	7503b83d80	btrfs: move ulist allocation out of transaction in quota enable The allocation happens with GFP_KERNEL after a transaction has been started, this can potentially cause deadlock if reclaim tries to get the memory by flushing filesystem data. The fs_info::qgroup_ulist is not used during transaction start when quotas are not enabled. The status bit BTRFS_FS_QUOTA_ENABLED is set later in btrfs_quota_enable so it's safe to move it before the transaction start. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-02-27 14:10:25 +01:00
Josef Bacik	aea6f028d0	btrfs: save drop_progress if we drop refs at all Previously we only updated the drop_progress key if we were in the DROP_REFERENCE stage of snapshot deletion. This is because the UPDATE_BACKREF stage checks the flags of the blocks it's converting to FULL_BACKREF, so if we go over a block we processed before it doesn't matter, we just don't do anything. The problem is in do_walk_down() we will go ahead and drop the root's reference to any blocks that we know we won't need to walk into. Given subvolume A and snapshot B. The root of B points to all of the nodes that belong to A, so all of those nodes have a refcnt > 1. If B did not modify those blocks it'll hit this condition in do_walk_down if (!wc->update_ref || generation <= root->root_key.offset) goto skip; and in "goto skip" we simply do a btrfs_free_extent() for that bytenr that we point at. Now assume we modified some data in B, and then took a snapshot of B and call it C. C points to all the nodes in B, making every node the root of B points to have a refcnt > 1. This assumes the root level is 2 or higher. We delete snapshot B, which does the above work in do_walk_down, free'ing our ref for nodes we share with A that we didn't modify. Now we hit a node we _did_ modify, thus we own. We need to walk down into this node and we set wc->stage == UPDATE_BACKREF. We walk down to level 0 which we also own because we modified data. We can't walk any further down and thus now need to walk up and start the next part of the deletion. Now walk_up_proc is supposed to put us back into DROP_REFERENCE, but there's an exception to this if (level < wc->shared_level) goto out; we are at level == 0, and our shared_level == 1. We skip out of this one and go up to level 1. Since path->slots[1] < nritems we path->slots[1]++ and break out of walk_up_tree to stop our transaction and loop back around.
Now in btrfs_drop_snapshot we have this snippet if (wc->stage == DROP_REFERENCE) { level = wc->level; btrfs_node_key(path->nodes[level], &root_item->drop_progress, path->slots[level]); root_item->drop_level = level; } our stage == UPDATE_BACKREF still, so we don't update the drop_progress key. This is a problem because we would have done btrfs_free_extent() for the nodes leading up to our current position. If we crash or unmount here and go to remount we'll start over where we were before and try to free our ref for blocks we've already freed, and thus abort() out. Fix this by keeping track of the last place we dropped a reference for our block in do_walk_down. Then if wc->stage == UPDATE_BACKREF we know we'll start over from a place we meant to, and otherwise things continue to work as they did before. I have a complicated reproducer for this problem, without this patch we'll fail to fsck the fs when replaying the log writes log. With this patch we can replay the whole log without any fsck or mount failures. The steps to reproduce this easily are sort of tricky, I had to add a couple of debug patches to the kernel in order to make it easy, basically I just needed to make sure we did actually commit the transaction every time we finished a walk_down_tree/walk_up_tree combo. The reproducer: 1) Creates a base subvolume. 2) Creates 100k files in the subvolume. 3) Snapshots the base subvolume (snap1). 4) Touches files 5000-6000 in snap1. 5) Snapshots snap1 (snap2). 6) Deletes snap1. I do this with dm-log-writes, and then replay to every FUA in the log and fsck the fs. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> [ copy reproducer steps ] Signed-off-by: David Sterba <dsterba@suse.com>	2019-02-27 14:08:47 +01:00
Josef Bacik	78c52d9eb6	btrfs: check for refs on snapshot delete resume There's a bug in snapshot deletion where we won't update the drop_progress key if we're in the UPDATE_BACKREF stage. This is a problem because we could drop refs for blocks we know don't belong to ours. If we crash or umount at the right time we could experience messages such as the following when snapshot deletion resumes BTRFS error (device dm-3): unable to find ref byte nr 66797568 parent 0 root 258 owner 1 offset 0 ------------[ cut here ]------------ WARNING: CPU: 3 PID: 16052 at fs/btrfs/extent-tree.c:7108 __btrfs_free_extent.isra.78+0x62c/0xb30 [btrfs] CPU: 3 PID: 16052 Comm: umount Tainted: G W OE 5.0.0-rc4+ #147 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890FX Deluxe5, BIOS P1.40 05/03/2011 RIP: 0010:__btrfs_free_extent.isra.78+0x62c/0xb30 [btrfs] RSP: 0018:ffffc90005cd7b18 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff88842fade680 RSI: ffff88842fad6b18 RDI: ffff88842fad6b18 RBP: ffffc90005cd7bc8 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000001 R11: ffffffff822696b8 R12: 0000000003fb4000 R13: 0000000000000001 R14: 0000000000000102 R15: ffff88819c9d67e0 FS: 00007f08bb138fc0(0000) GS:ffff88842fac0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f5d861ea0 CR3: 00000003e99fe000 CR4: 00000000000006e0 Call Trace: ? _raw_spin_unlock+0x27/0x40 ? btrfs_merge_delayed_refs+0x356/0x3e0 [btrfs] __btrfs_run_delayed_refs+0x75a/0x13c0 [btrfs] ? join_transaction+0x2b/0x460 [btrfs] btrfs_run_delayed_refs+0xf3/0x1c0 [btrfs] btrfs_commit_transaction+0x52/0xa50 [btrfs] ? 
start_transaction+0xa6/0x510 [btrfs] btrfs_sync_fs+0x79/0x1c0 [btrfs] sync_filesystem+0x70/0x90 generic_shutdown_super+0x27/0x120 kill_anon_super+0x12/0x30 btrfs_kill_super+0x16/0xa0 [btrfs] deactivate_locked_super+0x43/0x70 deactivate_super+0x40/0x60 cleanup_mnt+0x3f/0x80 __cleanup_mnt+0x12/0x20 task_work_run+0x8b/0xc0 exit_to_usermode_loop+0xce/0xd0 do_syscall_64+0x20b/0x210 entry_SYSCALL_64_after_hwframe+0x49/0xbe To fix this simply mark dead roots we read from disk as DEAD and then set the walk_control->restarted flag so we know we have a restarted deletion. From here whenever we try to drop refs for blocks we check to verify our ref is set on them, and if it is not we skip it. Once we find a ref that is set we unset walk_control->restarted since the tree should be in a normal state from then on, and any problems we run into from there are different issues. I tested this with an existing broken fs and my reproducer that creates a broken fs and it fixed both file systems. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-02-27 14:08:47 +01:00
Masahiro Yamada	6b12de69ad	kbuild: simplify single target rules The dependency will be checked anyway after Kbuild descends into a sub-directory. Skip object/source dependency checks in top Makefile. VPATH can be simpler since the top Makefile no longer checks the presence of the source file, which is located in the external module directory. One good thing is, it can compile an object from a generated source file. $ make crypto/rsapubkey.asn1.o ... ASN.1 crypto/rsapubkey.asn1.c CC crypto/rsapubkey.asn1.o Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-02-27 21:43:31 +09:00
Masahiro Yamada	b999923c29	kbuild: remove empty rules for makefiles The previous commit made 'MAKEFLAGS += -rR' effective in the top Makefile regardless of O= option, GNU Make versions. The top Makefile does not need to cancel implicit rules for makefiles. There is still one place where an empty rule is useful. Since -rR is effective only after sub-make, GNU Make would try implicit rules to update the top Makefile. Although it is not a big overhead, cancel it just in case. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-02-27 21:43:31 +09:00
Masahiro Yamada	3812b8c5c5	kbuild: make -r/-R effective in top Makefile for old Make versions Adding -rR to MAKEFLAGS is important because we do not want to be bothered by built-in implicit rules or variables. One problem that used to exist in older GNU Make versions is MAKEFLAGS += -rR ... does not become effective in the current Makefile. When you are building with O= option, it becomes effective in the top Makefile since it recurses via 'sub-make' target. Otherwise, the top Makefile tries implicit rules. That is why we explicitly add empty rules for Makefiles, but we often miss to do that. In fact, adding -d option to older GNU Make versions shows it is trying a bunch of implicit pattern rules. Considering target file `scripts/Makefile.kcov'. Looking for an implicit rule for `scripts/Makefile.kcov'. Trying pattern rule with stem `Makefile.kcov'. Trying implicit prerequisite `scripts/Makefile.kcov.o'. Trying pattern rule with stem `Makefile.kcov'. Trying implicit prerequisite `scripts/Makefile.kcov.c'. Trying pattern rule with stem `Makefile.kcov'. Trying implicit prerequisite `scripts/Makefile.kcov.cc'. Trying pattern rule with stem `Makefile.kcov'. Trying implicit prerequisite `scripts/Makefile.kcov.C'. ... This issue was fixed by GNU Make commit 58dae243526b ("[Savannah #20501] Handle adding -r/-R to MAKEFLAGS in the makefile"). So, it is no longer a problem if you use GNU Make 4.0 or later. However, older versions are still widely used. So, I decided to patch the kernel Makefile to invoke sub-make regardless of O= option. This will allow further cleanups. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-02-27 21:43:31 +09:00
Masahiro Yamada	f47a23ce2b	kbuild: move tools_silent to a more relevant place This would disturb the change to the sub-make part. Move it near the tools/ target. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-02-27 21:43:30 +09:00
Masahiro Yamada	b303c6df80	kbuild: compute false-positive -Wmaybe-uninitialized cases in Kconfig Since -Wmaybe-uninitialized was introduced by GCC 4.7, we have patched various false positives: - commit `e74fc973b6` ("Turn off -Wmaybe-uninitialized when building with -Os") turned off this option for -Os. - commit `815eb71e71` ("Kbuild: disable 'maybe-uninitialized' warning for CONFIG_PROFILE_ALL_BRANCHES") turned off this option for CONFIG_PROFILE_ALL_BRANCHES - commit `a76bcf557e` ("Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"") turned off this option for GCC < 4.9 Arnd provided more explanation in https://lkml.org/lkml/2017/3/14/903 I think this looks better by shifting the logic from Makefile to Kconfig. Link: https://github.com/ClangBuiltLinux/linux/issues/350 Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com>	2019-02-27 21:43:20 +09:00
Masahiro Yamada	bd55f96fa9	kbuild: refactor cc-cross-prefix implementation - $(word 1, <text>) is equivalent to $(firstword <text>) - hardcode "gcc" instead of $(CC) - minimize the shell script part A few more notes in case $(filter-out -%, ...) is not clear. arch/mips/Makefile passes prefixes depending on the configuration. CROSS_COMPILE := $(call cc-cross-prefix, $(tool-archpref)-linux- \ $(tool-archpref)-linux-gnu- $(tool-archpref)-unknown-linux-gnu-) In the Kconfig stage (e.g. when you run 'make defconfig'), neither CONFIG_32BIT nor CONFIG_64BIT is defined. So, $(tool-archpref) is empty. As a result, "-linux -linux-gnu- -unknown-linux-gnu" is passed into cc-cross-prefix. The command 'which' treats arguments starting with a hyphen as command options and emits the following messages: Illegal option -l Illegal option -l Illegal option -u I think it is strange to define CROSS_COMPILE depending on the CONFIG options since you need to feed $(CC) to Kconfig, but it is how MIPS Makefile currently works. Anyway, it would not hurt to filter-out invalid strings beforehand. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-02-27 21:41:27 +09:00
Masahiro Yamada	88110713ca	kbuild: hardcode genksyms path and remove GENKSYMS variable The genksyms source was integrated into the kernel tree in 2003. I do not expect anybody still using the external /sbin/genksyms. Kbuild does not need to provide the ability to override GENKSYMS. Let's remove the GENKSYMS variable, and use the hardcoded path. Since it occurred in the pre-git era, I attached the commit message in case somebody is interested in the historical background. | Author: Kai Germaschewski <kai@tp1.ruhr-uni-bochum.de> | Date: Wed Feb 19 04:17:28 2003 -0600 | | kbuild: [PATCH] put genksyms in scripts dir | | This puts genksyms into scripts/genksyms/. | | genksyms used to be maintained externally, though the only possible user | was the kernel build. Moving it into the kernel sources makes it easier to | keep it uptodate, like for example updating it to generate linker scripts | directly instead of postprocessing the generated header file fragments | with sed, as we do currently. | | Also, genksyms does not handle __typeof__, which needs to be fixed since | some of the exported symbol in the kernel are defined using __typeof__. | | (Rusty Russell/me) Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2019-02-27 21:41:26 +09:00
Masahiro Yamada	b513adf45c	scripts/gdb: refactor rules for symlink creation gdb-scripts is not a real object, but (ab)used like a phony target. Rewrite the code in a more Kbuild-ish way. Add symlinks to extra-y and use if_changed. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>	2019-02-27 21:41:04 +09:00
Masahiro Yamada	8d2e52003a	kbuild: create symlink to vmlinux-gdb.py in scripts_gdb target It is weird to create gdb stuff as a side-effect of vmlinux. Move it to a more relevant place. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>	2019-02-27 21:40:54 +09:00
Masahiro Yamada	1e5ff84ffe	scripts/gdb: do not descend into scripts/gdb from scripts Currently, Kbuild descends from scripts/Makefile to scripts/gdb/Makefile just for creating symbolic links, but it does not need to do it so early. Merge the two descending paths to simplify the code. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>	2019-02-27 21:40:09 +09:00
Masahiro Yamada	01d509a48b	kbuild: remove unimportant comments from ./Kbuild Every time we add/remove a target, we need to touch the header part, including renumbering. This is not so important information. Numbering targets is rather misleading because they are not necessarily generated in this order. For example, 1) and 2) can be executed simultaneously when the -j option is given. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>	2019-02-27 21:40:01 +09:00
Masahiro Yamada	67274c0834	scripts/gdb: delay generation of gdb constants.py scripts/gdb/linux/constants.py is never used in the kernel build process. There is no good reason to create it so early. Get it out of the 'prepare' stage. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>	2019-02-27 21:39:48 +09:00
Martin Schwidefsky	7b660c225f	Further fixes in TIC handling. Merge tag 'vfio-ccw-20190227' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features Pull vfio-ccw from Cornelia Huck with the following changes: - Further fixes in TIC handling.	2019-02-27 13:19:33 +01:00
Christophe Leroy	27da80719e	powerpc/fsl: Fix the flush of branch predictor. The commit identified below adds MC_BTB_FLUSH macro only when CONFIG_PPC_FSL_BOOK3E is defined. This results in the following error on some configs (seen several times with kisskb randconfig_defconfig) arch/powerpc/kernel/exceptions-64e.S:576: Error: Unrecognized opcode: `mc_btb_flush' make[3]: *** [scripts/Makefile.build:367: arch/powerpc/kernel/exceptions-64e.o] Error 1 make[2]: *** [scripts/Makefile.build:492: arch/powerpc/kernel] Error 2 make[1]: *** [Makefile:1043: arch/powerpc] Error 2 make: *** [Makefile:152: sub-make] Error 2 This patch adds a blank definition of MC_BTB_FLUSH for other cases. Fixes: `10c5e83afd` ("powerpc/fsl: Flush the branch predictor at each kernel entry (64bit)") Cc: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Daniel Axtens <dja@axtens.net> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-27 22:52:38 +11:00
Filipe Manana	4ea748e1d2	Btrfs: fix deadlock between clone/dedupe and rename Reflinking (clone/dedupe) and rename are operations that operate on two inodes and therefore need to lock them in the same order to avoid ABBA deadlocks. It happens that Btrfs' reflink implementation always locked them in a different order from VFS's lock_two_nondirectories() helper, which is used by the rename code in VFS, resulting in ABBA type deadlocks. Btrfs' locking order: static void btrfs_double_inode_lock(struct inode *inode1, struct inode *inode2) { if (inode1 < inode2) swap(inode1, inode2); inode_lock_nested(inode1, I_MUTEX_PARENT); inode_lock_nested(inode2, I_MUTEX_CHILD); } VFS's locking order: void lock_two_nondirectories(struct inode *inode1, struct inode *inode2) { if (inode1 > inode2) swap(inode1, inode2); if (inode1 && !S_ISDIR(inode1->i_mode)) inode_lock(inode1); if (inode2 && !S_ISDIR(inode2->i_mode) && inode2 != inode1) inode_lock_nested(inode2, I_MUTEX_NONDIR2); } Fix this by killing the btrfs helper function that does the double inode locking and replace it with VFS's helper lock_two_nondirectories(). Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Fixes: `416161db9b` ("btrfs: offline dedupe") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-02-27 12:24:16 +01:00
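The lock-ordering rule described in the commit above can be illustrated with a small stand-alone sketch. It is not the kernel's code: `struct object` and `order_pair()` are hypothetical names, and the sketch only computes the acquisition order rather than taking real locks. The point it shows is that sorting by address gives every caller the same order no matter how the two arguments are passed, which is what rules out an ABBA deadlock.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an object that carries a lock. */
struct object {
    int id;
};

/*
 * Decide the order in which two objects should be locked.
 * out[0] is locked first, out[1] second (NULL if both arguments
 * refer to the same object, which must be locked only once).
 * Ordering by address means (a, b) and (b, a) yield the same
 * sequence, so two callers can never hold the locks crosswise.
 */
static void order_pair(struct object *a, struct object *b,
                       struct object *out[2])
{
    if ((uintptr_t)a > (uintptr_t)b) {
        struct object *tmp = a;
        a = b;
        b = tmp;
    }
    out[0] = a;
    out[1] = (b != a) ? b : NULL;
}
```

A caller would lock `out[0]`, then `out[1]` if it is non-NULL, and unlock in the reverse order. The two helpers quoted in the commit message each sort by address but in opposite directions, which is exactly the mismatch the fix removes.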
Filipe Manana	8e92821878	Btrfs: fix corruption reading shared and compressed extents after hole punching In the past we had data corruption when reading compressed extents that are shared within the same file and they are consecutive, this got fixed by commit `005efedf2c` ("Btrfs: fix read corruption of compressed and shared extents") and by commit `808f80b467` ("Btrfs: update fix for read corruption of compressed and shared extents"). However there was a case that was missing in those fixes, which is when the shared and compressed extents are referenced with a non-zero offset. The following shell script creates a reproducer for this issue: #!/bin/bash mkfs.btrfs -f /dev/sdc &> /dev/null mount -o compress /dev/sdc /mnt/sdc # Create a file with 3 consecutive compressed extents, each has an # uncompressed size of 128Kb and a compressed size of 4Kb. for ((i = 1; i <= 3; i++)); do head -c 4096 /dev/zero for ((j = 1; j <= 31; j++)); do head -c 4096 /dev/zero | tr '\0' "\377" done done > /mnt/sdc/foobar sync echo "Digest after file creation: $(md5sum /mnt/sdc/foobar)" # Clone the first extent into offsets 128K and 256K. xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar sync echo "Digest after cloning: $(md5sum /mnt/sdc/foobar)" # Punch holes into the regions that are already full of zeroes. xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar sync echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)" echo "Dropping page cache..."
sysctl -q vm.drop_caches=1 echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)" umount /dev/sdc When running the script we get the following output: Digest after file creation: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar linked 131072/131072 bytes at offset 131072 128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec) linked 131072/131072 bytes at offset 262144 128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec) Digest after cloning: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar Digest after hole punching: 5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar Dropping page cache... Digest after hole punching: fba694ae8664ed0c2e9ff8937e7f1484 /mnt/sdc/foobar This happens because after reading all the pages of the extent in the range from 128K to 256K for example, we read the hole at offset 256K and then when reading the page at offset 260K we don't submit the existing bio, which is responsible for filling all the page in the range 128K to 256K only, therefore adding the pages from range 260K to 384K to the existing bio and submitting it after iterating over the entire range. Once the bio completes, the uncompressed data fills only the pages in the range 128K to 256K because there's no more data read from disk, leaving the pages in the range 260K to 384K unfilled. It is just a slightly different variant of what was solved by commit `005efedf2c` ("Btrfs: fix read corruption of compressed and shared extents"). Fix this by forcing a bio submit, during readpages(), whenever we find a compressed extent map for a page that is different from the extent map for the previous page or has a different starting offset (in case it's the same compressed extent), instead of the extent map's original start offset. A test case for fstests follows soon. 
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Fixes: `808f80b467` ("Btrfs: update fix for read corruption of compressed and shared extents") Fixes: `005efedf2c` ("Btrfs: fix read corruption of compressed and shared extents") Cc: stable@vger.kernel.org # 4.3+ Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2019-02-27 12:24:07 +01:00
Jordan Niethe	7b62f9bd22	powerpc/powernv: Make opal log only readable by root Currently the opal log is globally readable. It is kernel policy to limit the visibility of physical addresses / kernel pointers to root. Given this and the fact the opal log may contain this information it would be better to limit the readability to root. Fixes: `bfc36894a4` ("powerpc/powernv: Add OPAL message log interface") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Jordan Niethe <jniethe5@gmail.com> Reviewed-by: Stewart Smith <stewart@linux.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2019-02-27 22:11:31 +11:00
Pablo Neira Ayuso	123f89c8aa	netfilter: nft_set_hash: remove nft_hash_key() hashtable is never used for 2-byte keys, remove nft_hash_key(). Fixes: `e240cd0df4` ("netfilter: nf_tables: place all set backends in one single module") Reported-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 11:08:32 +01:00
Pablo Neira Ayuso	a01cbae57e	netfilter: nft_set_hash: bogus element self comparison from deactivation path Use the element from the loop iteration, not the same element we want to deactivate otherwise this branch always evaluates true. Fixes: `6c03ae210c` ("netfilter: nft_set_hash: add non-resizable hashtable implementation") Reported-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 11:08:31 +01:00
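The self-comparison bug named in the commit above is easy to reproduce in miniature. The sketch below is an assumption-laden illustration, not the nft_set_hash code: `struct elem` and `bucket_deactivate()` are hypothetical, and the key is a fixed 4-byte array. It shows the corrected shape, where the element from the loop iteration is compared against the requested one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hash-bucket element with a fixed-size key. */
struct elem {
    struct elem *next;
    unsigned char key[4];
    int active;
};

/*
 * Walk one bucket and deactivate the active element whose key
 * matches 'wanted'. The comparison must use the loop variable
 * 'he'; comparing wanted->key with itself would be true on the
 * first iteration and deactivate the wrong element.
 */
static struct elem *bucket_deactivate(struct elem *head,
                                      const struct elem *wanted)
{
    for (struct elem *he = head; he != NULL; he = he->next) {
        if (he->active &&
            memcmp(he->key, wanted->key, sizeof(he->key)) == 0) {
            he->active = 0;
            return he;
        }
    }
    return NULL;
}
```

With two elements in a bucket and a lookup for the second key, this version returns the second element and leaves the first one active.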
Pablo Neira Ayuso	3b02b0adc2	netfilter: nft_set_hash: fix lookups with fixed size hash on big endian Call jhash_1word() for the 4-bytes key case from the insertion and deactivation path, otherwise big endian arch set lookups fail. Fixes: `446a8268b7` ("netfilter: nft_set_hash: add lookup variant for fixed size hashtable") Reported-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 11:08:31 +01:00
Li RongQing	35acfbab6e	netfilter: remove unneeded switch fall-through An empty case is fine and does not need a switch fall-through annotation. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 11:03:59 +01:00
Florian Westphal	cc16921351	netfilter: conntrack: avoid same-timeout update No need to dirty a cache line if timeout is unchanged. Also, WARN() is useless here: we crash on 'skb->len' access if skb is NULL. Last, ct->timeout is u32, not 'unsigned long' so adapt the function prototype accordingly. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:58:21 +01:00
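The "no need to dirty a cache line if timeout is unchanged" idea from the commit above amounts to a compare-before-store. A minimal sketch follows, assuming a hypothetical `struct conn` and `conn_refresh()`; the real conntrack code uses kernel accessors that are omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical connection entry; the timeout is 32 bits wide. */
struct conn {
    uint32_t timeout;
};

/*
 * Store the new timeout only when it differs from the current
 * value. Skipping the redundant store avoids marking the cache
 * line as modified. Returns true if a write actually happened.
 */
static bool conn_refresh(struct conn *ct, uint32_t new_timeout)
{
    if (ct->timeout == new_timeout)
        return false;
    ct->timeout = new_timeout;
    return true;
}
```

The parameter is `uint32_t` rather than `unsigned long` to match the field width, mirroring the prototype change the commit message mentions.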
Florian Westphal	d2c5c103b1	netfilter: nat: remove nf_nat_l3proto.h and nf_nat_core.h The l3proto name is gone, its header file is the last trace. While at it, also remove nf_nat_core.h, it's very small and all users include nf_nat.h too. before: text data bss dec hex filename 22948 1612 4136 28696 7018 nf_nat.ko after removal of l3proto register/unregister functions: text data bss dec hex filename 22196 1516 4136 27848 6cc8 nf_nat.ko checkpatch complains about overly long lines, but line breaks do not make things more readable and the line length gets smaller here, not larger. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:54:08 +01:00
Florian Westphal	d6c4c8ffb5	netfilter: nat: remove l3proto struct All l3proto function pointers have been removed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:53:57 +01:00
Florian Westphal	dac3fe7259	netfilter: nat: remove csum_recalc hook We can now use direct calls. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:53:47 +01:00
Florian Westphal	03fe5efc4c	netfilter: nat: remove csum_update hook We can now use direct calls. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:53:35 +01:00
Florian Westphal	2e666b229d	netfilter: nat: remove l3 manip_pkt hook We can now use direct calls. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:53:05 +01:00
Florian Westphal	14cb1a6e29	netfilter: nat: remove nf_nat_l4proto.h After the ipv4/6 nat tracker merge, there are no external callers, so make the last function static and remove the header. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:52:47 +01:00
Florian Westphal	3bf195ae60	netfilter: nat: merge nf_nat_ipv4,6 into nat core before: text data bss dec hex filename 16566 1576 4136 22278 5706 nf_nat.ko 3598 844 0 4442 115a nf_nat_ipv6.ko 3187 844 0 4031 fbf nf_nat_ipv4.ko after: text data bss dec hex filename 22948 1612 4136 28696 7018 nf_nat.ko ... with ipv4/v6 nat now provided directly via nf_nat.ko. Also changes: ret = nf_nat_ipv4_fn(priv, skb, state); if (ret != NF_DROP && ret != NF_STOLEN && into if (ret != NF_ACCEPT) return ret; everywhere. The nat hooks never should return anything other than ACCEPT or DROP (and the latter only in rare error cases). The original code uses multi-line ANDing including assignment-in-if: if (ret != NF_DROP && ret != NF_STOLEN && !(IPCB(skb)->flags & IPSKB_XFRM_TRANSFORMED) && (ct = nf_ct_get(skb, &ctinfo)) != NULL) { I removed this while moving, breaking those in separate conditionals and moving the assignments into extra lines. checkpatch still generates some warnings: 1. Overly long lines (of moved code). Breaking them is even more ugly. so I kept this as-is. 2. use of extern function declarations in a .c file. This is necessary evil, we must call nf_nat_l3proto_register() from the nat core now. All l3proto related functions are removed later in this series, those prototypes are then removed as well. v2: keep empty nf_nat_ipv6_csum_update stub for CONFIG_IPV6=n case. v3: remove IS_ENABLED(NF_NAT_IPV4/6) tests, NF_NAT_IPVx toggles are removed here. v4: also get rid of the assignments in conditionals. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:49:55 +01:00
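The control-flow cleanup described in the commit above (an early `if (ret != NF_ACCEPT) return ret;` plus separate conditionals instead of an assignment inside a multi-line `if`) can be sketched as follows. All names here are hypothetical stand-ins (`V_ACCEPT`, `PKT_TRANSFORMED`, `struct packet`, `nat_hook_sketch()`), chosen only to mirror the shape of the change.

```c
#include <stddef.h>

/* Hypothetical verdicts and flag, standing in for the real ones. */
enum { V_DROP = 0, V_ACCEPT = 1 };
#define PKT_TRANSFORMED 0x1u

struct packet {
    unsigned int flags;
    void *ct; /* connection tracking entry, may be NULL */
};

/*
 * Post-process a packet after the base NAT step returned 'verdict'.
 * Each condition gets its own early return, and the assignment to
 * 'ct' sits on its own line instead of inside the condition.
 * *fixed_up is set to 1 only when the post-processing would run.
 */
static int nat_hook_sketch(const struct packet *pkt, int verdict,
                           int *fixed_up)
{
    void *ct;

    *fixed_up = 0;
    if (verdict != V_ACCEPT)
        return verdict;
    if (pkt->flags & PKT_TRANSFORMED)
        return verdict;
    ct = pkt->ct;
    if (ct == NULL)
        return verdict;

    *fixed_up = 1;
    return verdict;
}
```

Flattening the test this way keeps each exit condition readable on its own, at the cost of a few more lines than the original combined expression.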
Florian Westphal	096d09067a	netfilter: nat: move nlattr parse and xfrm session decode to core None of these functions calls any external functions, moving them allows to avoid both the indirection and a need to export these symbols. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:49:42 +01:00
Florian Westphal	d1aca8ab31	netfilter: nat: merge ipv4 and ipv6 masquerade functionality Before: text data bss dec hex filename 13916 1412 4128 19456 4c00 nf_nat.ko 4510 968 4 5482 156a nf_nat_ipv4.ko 5146 944 8 6098 17d2 nf_nat_ipv6.ko After: text data bss dec hex filename 16566 1576 4136 22278 5706 nf_nat.ko 3187 844 0 4031 fbf nf_nat_ipv4.ko 3598 844 0 4442 115a nf_nat_ipv6.ko ... so no drastic changes in combined size. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:49:24 +01:00
Andy Shevchenko	886ca88be6	ACPI / bus: Respect PRP0001 when retrieving device match data In the PRP0001 case, the compatible string may have additional data affiliated with the device. When we call device_get_match_data() on such device, we will get nothing since currently acpi_device_get_match_data() doesn't respect PRP0001. To fix the above, try acpi_of_match_device() if there is no ACPI table in the driver. Anyway, note that the device is expected to get its own proper ACPI ID. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2019-02-27 10:47:59 +01:00
Florian Westphal	d824548dae	netfilter: ebtables: remove BUGPRINT messages They are however frequently triggered by syzkaller, so remove them. ebtables userspace should never trigger any of these, so there is little value in making them pr_debug (or ratelimited). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:47:57 +01:00
Florian Tham	4283428e49	netfilter: nf_conntrack_amanda: add support for STATE streams The Amanda CONNECT command has been updated to establish an optional fourth connection [0]. Previously, a CONNECT command would look like: CONNECT DATA port0 MESG port1 INDEX port2 nf_conntrack_amanda analyses the CONNECT command string in order to learn the port numbers of the related DATA, MESG and INDEX streams. As of amanda v3.4, the CONNECT command can advertise an additional port: CONNECT DATA port0 MESG port1 INDEX port2 STATE port3 The new STATE stream is not handled, thus the connection on the STATE port cannot be established. The patch adds support for STATE streams to the amanda conntrack helper. I tested with max_expected = 3, leaving the other patch hunks unmodified. Amanda reports "connection refused" and aborts. After I set max_expected to 4, the backup completes successfully. [0] `3b8384fc9f (diff-711e502fc81a65182c0954765b42919eR456)` Signed-off-by: Florian Tham <tham@fidion.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:46:39 +01:00
Pablo Neira Ayuso	b8e2040063	netfilter: nft_compat: use .release_ops and remove list of extension Add .release_ops, that is called in case of error at a later stage in the expression initialization path, ie. .select_ops() has been already set up operations and that needs to be undone. This allows us to unwind .select_ops from the error path, ie. release the dynamic operations for this extension. Moreover, allocate one single operation instead of recycling them, this comes at the cost of consuming a bit more memory per rule, but it simplifies the infrastructure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2019-02-27 10:41:24 +01:00
Ritesh Harjani	e5723f95d6	mmc: core: Fix NULL ptr crash from mmc_should_fail_request In case of CQHCI, mrq->cmd may be NULL for data requests (non DCMD). In such case mmc_should_fail_request is directly dereferencing mrq->cmd while cmd is NULL. Fix this by checking for mrq->cmd pointer. Fixes: `72a5af554d` ("mmc: core: Add support for handling CQE requests") Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2019-02-27 10:00:17 +01:00
Brian Norris	5364a0b4f4	arm64: dts: rockchip: move QCA6174A wakeup pin into its USB node Currently, we don't coordinate BT USB activity with our handling of the BT out-of-band wake pin, and instead just use gpio-keys. That causes problems because we have no way of distinguishing wake activity due to a BT device (e.g., mouse) vs. the BT controller (e.g., re-configuring wake mask before suspend). This can cause spurious wake events just because we, for instance, try to reconfigure the host controller's event mask before suspending. We can avoid these synchronization problems by handling the BT wake pin directly in the btusb driver -- for all activity up until BT controller suspend(), we simply listen to normal USB activity (e.g., to know the difference between device and host activity); once we're really ready to suspend the host controller, there should be no more host activity, and only then do we unmask the GPIO interrupt. This is already supported by btusb; we just need to describe the wake pin in the right node. We list 2 compatible properties, since both PID/VID pairs show up on Scarlet devices, and they're both essentially identical QCA6174A-based modules. Also note that the polarity was wrong before: Qualcomm implemented WAKE as active high, not active low. We only got away with this because gpio-keys always reconfigured us as bi-directional edge-triggered. Finally, we have an external pull-up and a level-shifter on this line (we didn't notice Qualcomm's polarity in the initial design), so we can't do pull-down. Switch to pull-none. Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-02-27 08:50:15 +01:00
Brian Norris	7d19261bc0	dt-bindings: net: btusb: add QCA6174A IDs There are two USB PID/VID variations I've seen for this chip, and I want to utilize the 'interrupts' property defined here already. Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-02-27 08:50:15 +01:00
Brian Norris	4c409af04d	Bluetooth: btusb: add QCA6174A compatible properties We may need to specify a GPIO wake pin for this device, so add a compatible property for it. There are at least two USB PID/VID variations of this chip: one with a Lite-On ID and one with an Atheros ID. Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-02-27 08:50:14 +01:00
Matthias Kaehlcke	6d10cd5cbd	Bluetooth: hci_qca: Use msleep() instead of open coding it Call msleep() in qca_set_baudrate() instead of reimplementing it. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-02-27 08:47:39 +01:00
Matthias Kaehlcke	0ebcddd8e0	Bluetooth: hci_qca: Add delay after power-off pulse During initialization the power-on pulse is currently sent immediately after the prior power-off pulse. With this, initialization often fails at boot time: [ 15.205224] Bluetooth: hci0: setting up wcn3990 [ 17.341062] Bluetooth: hci0: command 0xfc00 tx timeout [ 22.101453] ERROR: Bluetooth initialization failed [ 25.337740] Bluetooth: hci0: Reading QCA version information failed (-110) After a power-off pulse wait 10ms to give the controller time to power off. Remove the previous short settling delay, it isn't needed anymore. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-02-27 08:44:33 +01:00
Matthias Kaehlcke	ad571d725c	Bluetooth: hci_qca: Move boot delay to qca_send_power_pulse() After sending a power on pulse the driver has a delay of 100ms to allow the host controller to boot. Move the delay into qca_send_power_pulse(), since it is directly related with the power-on pulse. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-02-27 08:44:32 +01:00
Matthias Kaehlcke	9836b80208	Bluetooth: hci_qca: Pass boolean 'on/off' to qca_send_power_pulse() There are only two types of power pulses 'on' or 'off', pass a boolean instead of the power pulse 'command'. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Reviewed-by: Balakrishna Godavarthi <bgodavar@codeaurora.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2019-02-27 08:44:32 +01:00
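
Note on the three hci_qca power-pulse commits above (9836b80208, ad571d725c, 0ebcddd8e0): taken together, they describe a helper that takes a boolean 'on/off' and then waits 100ms after a power-on pulse or 10ms after a power-off pulse. A minimal user-space sketch of that behaviour might look like the following. This is not the driver code: `send_power_pulse()`, `power_pulse_delay_ms()` and `fake_msleep()` are hypothetical stand-ins, and only the two delay values are taken from the commit messages.

```c
#include <stdbool.h>
#include <stdio.h>

/* Delay values quoted in the commit messages above. */
enum {
	POWER_OFF_DELAY_MS = 10,   /* give the controller time to power off */
	POWER_ON_DELAY_MS  = 100   /* give the controller time to boot */
};

/* Hypothetical stand-in for msleep(): records the time instead of sleeping. */
static unsigned int total_sleep_ms;

static void fake_msleep(unsigned int ms)
{
	total_sleep_ms += ms;
}

/* Pick the settling delay that follows a pulse of the given kind. */
static unsigned int power_pulse_delay_ms(bool on)
{
	return on ? POWER_ON_DELAY_MS : POWER_OFF_DELAY_MS;
}

/*
 * Hypothetical model of a power-pulse helper that takes a boolean
 * instead of a raw command byte and owns the delay that follows.
 */
static void send_power_pulse(bool on)
{
	printf("power pulse: %s\n", on ? "on" : "off");
	fake_msleep(power_pulse_delay_ms(on));
}
```

Calling `send_power_pulse(false)` followed by `send_power_pulse(true)` models the off-then-on sequence used during initialization, with the delay kept next to the pulse it belongs to.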

826113 Commits