linux

Author	SHA1	Message	Date
Su Yue	d21deec5e7	btrfs: remove unused parameter fs_devices from btrfs_init_workqueues Since commit `ba8a9d0795` ("Btrfs: delete the entire async bio submission framework") removed submit workqueues, the parameter fs_devices is not used anymore. Remove it, no functional changes. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Su Yue <l@damenly.su> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:44 +01:00
Filipe Manana	dfba78dc1c	btrfs: reduce the scope of the tree log mutex during transaction commit In the transaction commit path we are acquiring the tree log mutex too early and we have a stale comment because: 1) It mentions a function named btrfs_commit_tree_roots(), which does not exists anymore, it was the old name of commit_cowonly_roots(), renamed a very long time ago by commit `5d4f98a28c` ("Btrfs: Mixed back reference (FORWARD ROLLING FORMAT CHANGE)")); 2) It mentions that we need to acquire the tree log mutex at that point to ensure we have no running log writers. That is not correct anymore, for many years at least, since we are guaranteed that we do not have any log writers at that point simply because we have set the state of the transaction to TRANS_STATE_COMMIT_DOING and have waited for all writers to complete - meaning no one can log until we change the state of the transaction to TRANS_STATE_UNBLOCKED. Any attempts to join the transaction or start a new one will block until we do that state transition; 3) The comment mentions a "trans mutex" which doesn't exists since 2011, commit `a4abeea41a` ("Btrfs: kill trans_mutex") removed it; 4) The current use of the tree log mutex is to ensure proper serialization of super block writes - if someone started a new transaction and uses it for logging, it will wait for the previous transaction to write its super block before writing the super block when attempting to sync the log. So acquire the tree log mutex only when it's absolutely needed, before setting the transaction state to TRANS_STATE_UNBLOCKED, fix and move the stale comment, add some assertions and new comments where appropriate. Also, this has no effect on concurrency or performance, since the new start of the critical section is still when the transaction is in the state TRANS_STATE_COMMIT_DOING. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:44 +01:00
Anand Jain	849eae5e57	btrfs: consolidate device_list_mutex in prepare_sprout to its parent btrfs_prepare_sprout() splices seed devices into its own struct fs_devices, so that its parent function btrfs_init_new_device() can add the new sprout device to fs_info->fs_devices. Both btrfs_prepare_sprout() and btrfs_init_new_device() need device_list_mutex. But they are holding it separately, thus create a small race window. Close it and hold device_list_mutex across both functions btrfs_init_new_device() and btrfs_prepare_sprout(). Split btrfs_prepare_sprout() into btrfs_init_sprout() and btrfs_setup_sprout(). This split is essential because device_list_mutex must not be held for allocations in btrfs_init_sprout() but must be held for btrfs_setup_sprout(). So now a common device_list_mutex can be used between btrfs_init_new_device() and btrfs_setup_sprout(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:44 +01:00
Anand Jain	fd8808097a	btrfs: switch seeding_dev in init_new_device to bool Declare int seeding_dev as a bool. Also, move its declaration a line below to adjust packing. Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:44 +01:00
Omar Sandoval	b1dea4e732	btrfs: send: remove unused type parameter to iterate_inode_ref_t Again, I don't think this was ever used since iterate_dir_item() is only used for xattrs. No functional change. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:44 +01:00
Omar Sandoval	eab67c0645	btrfs: send: remove unused found_type parameter to lookup_dir_item_inode() As far as I can tell, this was never used. No functional change. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:43 +01:00
Josef Bacik	dc2e724e0f	btrfs: rename btrfs_item_end_nr to btrfs_item_data_end The name btrfs_item_end_nr() is a bit of a misnomer, as it's actually the offset of the end of the data the item points to. In fact all of the helpers that we use btrfs_item_end_nr() use data in their name, like BTRFS_LEAF_DATA_SIZE() and leaf_data(). Rename to btrfs_item_data_end() to make it clear what this helper is giving us. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:43 +01:00
Josef Bacik	5a08663d01	btrfs: remove the btrfs_item_end() helper We're only using btrfs_item_end() from btrfs_item_end_nr(), so this can be collapsed. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:43 +01:00
Josef Bacik	3212fa14e7	btrfs: drop the _nr from the item helpers Now that all call sites are using the slot number to modify item values, rename the SETGET helpers to raw_item_(), and then rework the _nr() helpers to be the btrfs_item_() btrfs_set_item_*() helpers, and then rename all of the callers to the new helpers. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:43 +01:00
Josef Bacik	7479420736	btrfs: introduce item_nr token variant helpers The last remaining place where we have the pattern of item = btrfs_item_nr(slot) <do something with the item> are the token helpers. Handle this by introducing token helpers that will do the btrfs_item_nr() work inside of the helper itself, and then convert all users of the btrfs_item token helpers to the new _nr() variants. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:43 +01:00
Josef Bacik	437bd07e6c	btrfs: make btrfs_file_extent_inline_item_len take a slot Instead of getting the btrfs_item for this, simply pass in the slot of the item and then use the btrfs_item_size_nr() helper inside of btrfs_file_extent_inline_item_len(). Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:43 +01:00
Josef Bacik	c91666b1f6	btrfs: add btrfs_set_item__nr() helpers We have the pattern of item = btrfs_item_nr(slot); btrfs_set_item_(leaf, item); in a bunch of places in our code. Fix this by adding btrfs_set_item__nr() helpers which will do the appropriate work, and replace those calls with btrfs_set_item__nr(leaf, slot); Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:42 +01:00
Josef Bacik	227f3cd0d5	btrfs: use btrfs_item_size_nr/btrfs_item_offset_nr everywhere We have this pattern in a lot of places item = btrfs_item_nr(slot); btrfs_item_size(leaf, item); when we could simply use btrfs_item_size(leaf, slot); Fix all callers of btrfs_item_size() and btrfs_item_offset() to use the _nr variation of the helpers. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:42 +01:00
Filipe Manana	ccae4a19c9	btrfs: remove no longer needed logic for replaying directory deletes Now that we log only dir index keys when logging a directory, we no longer need to deal with dir item keys in the log replay code for replaying directory deletes. This is also true for the case when we replay a log tree created by a kernel that still logs dir items. So remove the remaining code of the replay of directory deletes algorithm that deals with dir item keys. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:42 +01:00
Filipe Manana	339d035424	btrfs: only copy dir index keys when logging a directory Currently, when logging a directory, we copy both dir items and dir index items from the fs/subvolume tree to the log tree. Both items have exactly the same data (same struct btrfs_dir_item), the difference lies in the key values, where a dir index key contains the index number of a directory entry while the dir item key does not, as it's used for doing fast lookups of an entry by name, while the former is used for sorting entries when listing a directory. We can exploit that and log only the dir index items, since they contain all the information needed to correctly add, replace and delete directory entries when replaying a log tree. Logging only the dir index items is also backward and forward compatible: an unpatched kernel (without this change) can correctly replay a log tree generated by a patched kernel (with this patch), and a patched kernel can correctly replay a log tree generated by an unpatched kernel. The backward compatibility is ensured because: 1) For inserting a new dentry: a dentry is only inserted when we find a new dir index key - we can only insert if we know the dir index offset, which is encoded in the dir index key's offset; 2) For deleting dentries: during log replay, before adding or replacing dentries, we first replay dentry deletions. Whenever we find a dir item key or a dir index key in the subvolume/fs tree that is not logged in a range for which the log tree is authoritative, we do the unlink of the dentry, which removes both the existing dir item key and the dir index key. Therefore logging just dir index keys is enough to ensure dentry deletions are correctly replayed; 3) For dentry replacements: they work when we log only dir index keys and this is mostly due to a combination of 1) and 2). If we replace a dentry with name "foobar" to point from inode A to inode B, then we know the dir index key for the new dentry is different from the old one, as it has an index number (key offset) larger than the old one. This results in replaying a deletion, through replay_dir_deletes(), that causes the old dentry to be removed, both the dir item key and the dir index key, as mentioned at 2). Then when processing the new dir index key, we add the new dentry, adding both a new dir item key and a new index key pointing to inode B, as stated in 1). The forward compatibility, the ability for a patched kernel to replay a log created by an older, unpatched kernel, comes from the changes required for making sure we are able to replay a log that only contains dir index keys - we simply ignore every dir item key we find. So modify directory logging to log only dir index items, and modify the log replay process to ignore dir item keys, from log trees created by an unpatched kernel, and process only with dir index keys. This reduces the amount of logged metadata by about half, and therefore the time spent logging or fsyncing large directories (less CPU time and less IO). The following test script was used to measure this change: #!/bin/bash DEV=/dev/nvme0n1 MNT=/mnt/nvme0n1 NUM_NEW_FILES=1000000 NUM_FILE_DELETES=10000 mkfs.btrfs -f $DEV mount -o ssd $DEV $MNT mkdir $MNT/testdir for ((i = 1; i <= $NUM_NEW_FILES; i++)); do echo -n > $MNT/testdir/file_$i done start=$(date +%s%N) xfs_io -c "fsync" $MNT/testdir end=$(date +%s%N) dur=$(( (end - start) / 1000000 )) echo "dir fsync took $dur ms after adding $NUM_NEW_FILES files" # sync to force transaction commit and wipeout the log. sync del_inc=$(( $NUM_NEW_FILES / $NUM_FILE_DELETES )) for ((i = 1; i <= $NUM_NEW_FILES; i += $del_inc)); do rm -f $MNT/testdir/file_$i done start=$(date +%s%N) xfs_io -c "fsync" $MNT/testdir end=$(date +%s%N) dur=$(( (end - start) / 1000000 )) echo "dir fsync took $dur ms after deleting $NUM_FILE_DELETES files" echo umount $MNT The tests were run on a physical machine, with a non-debug kernel (Debian's default kernel config), for different values of $NUM_NEW_FILES and $NUM_FILE_DELETES, and the results were the following: Before patch, NUM_NEW_FILES = 1 000 000, NUM_DELETE_FILES = 10 000 dir fsync took 8412 ms after adding 1000000 files dir fsync took 500 ms after deleting 10000 files After patch, NUM_NEW_FILES = 1 000 000, NUM_DELETE_FILES = 10 000 dir fsync took 4252 ms after adding 1000000 files (-49.5%) dir fsync took 269 ms after deleting 10000 files (-46.2%) Before patch, NUM_NEW_FILES = 100 000, NUM_DELETE_FILES = 1 000 dir fsync took 745 ms after adding 100000 files dir fsync took 59 ms after deleting 1000 files After patch, NUM_NEW_FILES = 100 000, NUM_DELETE_FILES = 1 000 dir fsync took 404 ms after adding 100000 files (-45.8%) dir fsync took 31 ms after deleting 1000 files (-47.5%) Before patch, NUM_NEW_FILES = 10 000, NUM_DELETE_FILES = 1 000 dir fsync took 67 ms after adding 10000 files dir fsync took 9 ms after deleting 1000 files After patch, NUM_NEW_FILES = 10 000, NUM_DELETE_FILES = 1 000 dir fsync took 36 ms after adding 10000 files (-46.3%) dir fsync took 5 ms after deleting 1000 files (-44.4%) Before patch, NUM_NEW_FILES = 1 000, NUM_DELETE_FILES = 100 dir fsync took 9 ms after adding 1000 files dir fsync took 4 ms after deleting 100 files After patch, NUM_NEW_FILES = 1 000, NUM_DELETE_FILES = 100 dir fsync took 7 ms after adding 1000 files (-22.2%) dir fsync took 3 ms after deleting 100 files (-25.0%) Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:42 +01:00
Nikolay Borisov	17130a65f0	btrfs: remove spurious unlock/lock of unused_bgs_lock Since both unused block groups and reclaim bgs lists are protected by unused_bgs_lock then free them in the same critical section without doing an extra unlock/lock pair. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:42 +01:00
Filipe Manana	232796df8c	btrfs: fix deadlock between quota enable and other quota operations When enabling quotas, we attempt to commit a transaction while holding the mutex fs_info->qgroup_ioctl_lock. This can result on a deadlock with other quota operations such as: - qgroup creation and deletion, ioctl BTRFS_IOC_QGROUP_CREATE; - adding and removing qgroup relations, ioctl BTRFS_IOC_QGROUP_ASSIGN. This is because these operations join a transaction and after that they attempt to lock the mutex fs_info->qgroup_ioctl_lock. Acquiring that mutex after joining or starting a transaction is a pattern followed everywhere in qgroups, so the quota enablement operation is the one at fault here, and should not commit a transaction while holding that mutex. Fix this by making the transaction commit while not holding the mutex. We are safe from two concurrent tasks trying to enable quotas because we are serialized by the rw semaphore fs_info->subvol_sem at btrfs_ioctl_quota_ctl(), which is the only call site for enabling quotas. When this deadlock happens, it produces a trace like the following: INFO: task syz-executor:25604 blocked for more than 143 seconds. Not tainted 5.15.0-rc6 #4 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:24800 pid:25604 ppid: 24873 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 btrfs_commit_transaction+0x994/0x2e90 fs/btrfs/transaction.c:2201 btrfs_quota_enable+0x95c/0x1790 fs/btrfs/qgroup.c:1120 btrfs_ioctl_quota_ctl fs/btrfs/ioctl.c:4229 [inline] btrfs_ioctl+0x637e/0x7b70 fs/btrfs/ioctl.c:5010 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f86920b2c4d RSP: 002b:00007f868f61ac58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f86921d90a0 RCX: 00007f86920b2c4d RDX: 0000000020005e40 RSI: 00000000c0109428 RDI: 0000000000000008 RBP: 00007f869212bd80 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f86921d90a0 R13: 00007fff6d233e4f R14: 00007fff6d233ff0 R15: 00007f868f61adc0 INFO: task syz-executor:25628 blocked for more than 143 seconds. Not tainted 5.15.0-rc6 #4 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:29080 pid:25628 ppid: 24873 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425 __mutex_lock_common kernel/locking/mutex.c:669 [inline] __mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729 btrfs_remove_qgroup+0xb7/0x7d0 fs/btrfs/qgroup.c:1548 btrfs_ioctl_qgroup_create fs/btrfs/ioctl.c:4333 [inline] btrfs_ioctl+0x683c/0x7b70 fs/btrfs/ioctl.c:5014 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-by: Hao Sun <sunhao.th@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZQF19bQ1C6=yetF3BvL10OSORpFUcWXTP6HErshDB4dQ@mail.gmail.com/ Fixes: `340f1aa27f` ("btrfs: qgroups: Move transaction management inside btrfs_quota_enable/disable") CC: stable@vger.kernel.org # 4.19 Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:41 +01:00
Filipe Manana	f0bfa76a11	btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range When doing a direct IO write against a file range that either has preallocated extents in that range or has regular extents and the file has the NOCOW attribute set, the write fails with -ENOSPC when all of the following conditions are met: 1) There are no data blocks groups with enough free space matching the size of the write; 2) There's not enough unallocated space for allocating a new data block group; 3) The extents in the target file range are not shared, neither through snapshots nor through reflinks. This is wrong because a NOCOW write can be done in such case, and in fact it's possible to do it using a buffered IO write, since when failing to allocate data space, the buffered IO path checks if a NOCOW write is possible. The failure in direct IO write path comes from the fact that early on, at btrfs_dio_iomap_begin(), we try to allocate data space for the write and if it that fails we return the error and stop - we never check if we can do NOCOW. But later, at btrfs_get_blocks_direct_write(), we check if we can do a NOCOW write into the range, or a subset of the range, and then release the previously reserved data space. Fix this by doing the data reservation only if needed, when we must COW, at btrfs_get_blocks_direct_write() instead of doing it at btrfs_dio_iomap_begin(). This also simplifies a bit the logic and removes the inneficiency of doing unnecessary data reservations. The following example test script reproduces the problem: $ cat dio-nocow-enospc.sh #!/bin/bash DEV=/dev/sdj MNT=/mnt/sdj # Use a small fixed size (1G) filesystem so that it's quick to fill # it up. # Make sure the mixed block groups feature is not enabled because we # later want to not have more space available for allocating data # extents but still have enough metadata space free for the file writes. mkfs.btrfs -f -b $((1024 * 1024 * 1024)) -O ^mixed-bg $DEV mount $DEV $MNT # Create our test file with the NOCOW attribute set. touch $MNT/foobar chattr +C $MNT/foobar # Now fill in all unallocated space with data for our test file. # This will allocate a data block group that will be full and leave # no (or a very small amount of) unallocated space in the device, so # that it will not be possible to allocate a new block group later. echo echo "Creating test file with initial data..." xfs_io -c "pwrite -S 0xab -b 1M 0 900M" $MNT/foobar # Now try a direct IO write against file range [0, 10M[. # This should succeed since this is a NOCOW file and an extent for the # range was previously allocated. echo echo "Trying direct IO write over allocated space..." xfs_io -d -c "pwrite -S 0xcd -b 10M 0 10M" $MNT/foobar umount $MNT When running the test: $ ./dio-nocow-enospc.sh (...) Creating test file with initial data... wrote 943718400/943718400 bytes at offset 0 900 MiB, 900 ops; 0:00:01.43 (625.526 MiB/sec and 625.5265 ops/sec) Trying direct IO write over allocated space... pwrite: No space left on device A test case for fstests will follow, testing both this direct IO write scenario as well as the buffered IO write scenario to make it less likely to get future regressions on the buffered IO case. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2022-01-03 15:09:41 +01:00
Linus Torvalds	c9e6606c7f	Linux 5.16-rc8	2022-01-02 14:23:25 -08:00
Linus Torvalds	24a0b22061	perf tools fixes for v5.16: 5th batch - Fix TUI exit screen refresh race condition in 'perf top'. - Fix parsing of Intel PT VM time correlation arguments. - Honour CPU filtering command line request of a script's switch events in 'perf script'. - Fix printing of switch events in Intel PT python script. - Fix duplicate alias events list printing in 'perf list', noticed on heterogeneous arm64 systems. - Fix return value of ids__new(), users expect NULL for failure, not ERR_PTR(-ENOMEM). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYdHBrgAKCRCyPKLppCJ+ J8C0AQDT9JKQlNPMETD5F2leq0YB5O3wGKwMvgff0hyblArU7QD/Yq7d1XgMkshF lb2dEJfGIyClnrgtTo9nikraESM5JwE= =RTLi -----END PGP SIGNATURE----- Merge tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix TUI exit screen refresh race condition in 'perf top'. - Fix parsing of Intel PT VM time correlation arguments. - Honour CPU filtering command line request of a script's switch events in 'perf script'. - Fix printing of switch events in Intel PT python script. - Fix duplicate alias events list printing in 'perf list', noticed on heterogeneous arm64 systems. - Fix return value of ids__new(), users expect NULL for failure, not ERR_PTR(-ENOMEM). * tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf top: Fix TUI exit screen refresh race condition perf pmu: Fix alias events list perf scripts python: intel-pt-events.py: Fix printing of switch events perf script: Fix CPU filtering of a script's switch events perf intel-pt: Fix parsing of VM time correlation arguments perf expr: Fix return value of ids__new()	2022-01-02 14:09:03 -08:00
Linus Torvalds	859431ac11	Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Better input validation for compat ioctls and a documentation bugfix for 5.16" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: Docs: Fixes link to I2C specification i2c: validate user data in compat ioctl	2022-01-02 10:36:09 -08:00
Linus Torvalds	1286cc4893	- Use the proper CONFIG symbol in a preprocessor check. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHRcK4ACgkQEsHwGGHe VUqPjw//bcEdZomdrSR9fJTWC33sKzwBHySxeBV/pIF3/TEu1r9kW0HWy1hqFnMV PmRNRLKzBY/nLmh9EKyYindj2V/+jNMIRSTlkSQxWRNuqTI/LFnOK4ufknqzF17M 3PNGo7mgVo//bnPW+kedTwtKPBTq4TxHUbwnexOd51qbQM6X1fByR0KeR2sOkyeG DW7cTeH3pDzA5YLb6VkzaGhZzQ59Fmh7sHFOlDq4Lfij2F6Le2Vo2/Q0hGmsUDMT BAzUJN3kgf6FSXxg0LZ2vezYI7t00lEMUMwqD0Jrr0laBBFZg15qlpO5B6OUJWJu WjfCwcmBrfyjafFeinQqnqIMd5KHrjuIqBpW3MHj8sx5/3uGjVJw/fk7hn64m6Vr AG8qjQNs0kdad7RAaeIRpxRyDmOmiCShFNHaruz40ztNcV3IZ3/1YANJBLU/jydy xnBCZoGespPyNqu9MPYKaJo5BftQkx5Kwp2UcnJu6C4/pDlEGRDmIhV0hW8Owu26 Vm/g5v/oFK35Q4bZDec0h5k4VIfN9WO2+Fu7vH/BF0v9hRz4TxGTlI2Ne4dOBbdX VaAyXHTNmJOUvnLMejFJWt52sk4ze2CmQ17GN6Gu4F5cm1wYYrj7vpr9SDGQmRcu +zzNOSNJXVA0UemiEbWLoaOBDxRaOQDf7Fi+rDlo77oPElLdulc= =mJVP -----END PGP SIGNATURE----- Merge tag 'x86_urgent_for_v5.16_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: - Use the proper CONFIG symbol in a preprocessor check. * tag 'x86_urgent_for_v5.16_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Use the proper name CONFIG_FW_LOADER	2022-01-02 09:02:54 -08:00
yaowenbin	64f18d2d04	perf top: Fix TUI exit screen refresh race condition When the following command is executed several times, a coredump file is generated. $ timeout -k 9 5 perf top -e task-clock ***** *** ***** 0.01% [kernel] [k] __do_softirq 0.01% libpthread-2.28.so [.] __pthread_mutex_lock 0.01% [kernel] [k] __ll_sc_atomic64_sub_return double free or corruption (!prev) perf top --sort comm,dso timeout: the monitored command dumped core When we terminate "perf top" using sending signal method, SLsmg_reset_smg() called. SLsmg_reset_smg() resets the SLsmg screen management routines by freeing all memory allocated while it was active. However SLsmg_reinit_smg() maybe be called by another thread. SLsmg_reinit_smg() will free the same memory accessed by SLsmg_reset_smg(), thus it results in a double free. SLsmg_reinit_smg() is called already protected by ui__lock, so we fix the problem by adding pthread_mutex_trylock of ui__lock when calling SLsmg_reset_smg(). Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: wuxu.wu@huawei.com Link: http://lore.kernel.org/lkml/a91e3943-7ddc-f5c0-a7f5-360f073c20e6@huawei.com Signed-off-by: Hewenliang <hewenliang4@huawei.com> Signed-off-by: yaowenbin <yaowenbin1@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-01-02 11:46:44 -03:00
John Garry	e0257a01d6	perf pmu: Fix alias events list Commit `0e0ae87422` ("perf list: Display hybrid PMU events with cpu type") changes the event list for uncore PMUs or arm64 heterogeneous CPU systems, such that duplicate aliases are incorrectly listed per PMU (which they should not be), like: # perf list ... unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] ... Notice how the events are listed twice. The named commit changed how we remove duplicate events, in that events for different PMUs are not treated as duplicates. I suppose this is to handle how "Each hybrid pmu event has been assigned with a pmu name". Fix PMU alias listing by restoring behaviour to remove duplicates for non-hybrid PMUs. Fixes: `0e0ae87422` ("perf list: Display hybrid PMU events with cpu type") Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1640103090-140490-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-01-02 11:29:05 -03:00
Linus Torvalds	278218f677	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "Two small fixups for spaceball joystick driver and appletouch touchpad driver" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: spaceball - fix parsing of movement data packets Input: appletouch - initialize work before device registration	2022-01-01 10:21:49 -08:00
Mel Gorman	8008293888	mm: vmscan: reduce throttling due to a failure to make progress -fix Hugh Dickins reported the following My tmpfs swapping load (tweaked to use huge pages more heavily than in real life) is far from being a realistic load: but it was notably slowed down by your throttling mods in 5.16-rc, and this patch makes it well again - thanks. But: it very quickly hit NULL pointer until I changed that last line to if (first_pgdat) consider_reclaim_throttle(first_pgdat, sc); The likely issue is that huge pages are a major component of the test workload. When this is the case, first_pgdat may never get set if compaction is ready to continue due to this check if (IS_ENABLED(CONFIG_COMPACTION) && sc->order > PAGE_ALLOC_COSTLY_ORDER && compaction_ready(zone, sc)) { sc->compaction_ready = true; continue; } If this was true for every zone in the zonelist, first_pgdat would never get set resulting in a NULL pointer exception. Link: https://lkml.kernel.org/r/20211209095453.GM3366@techsingularity.net Fixes: `1b4e3f26f9` ("mm: vmscan: Reduce throttling due to a failure to make progress") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-12-31 13:12:55 -08:00
Mel Gorman	1b4e3f26f9	mm: vmscan: Reduce throttling due to a failure to make progress Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar problems due to reclaim throttling for excessive lengths of time. In Alexey's case, a memory hog that should go OOM quickly stalls for several minutes before stalling. In Mike and Darrick's cases, a small memcg environment stalled excessively even though the system had enough memory overall. Commit `69392a403f` ("mm/vmscan: throttle reclaim when no progress is being made") introduced the problem although commit `a19594ca4a` ("mm/vmscan: increase the timeout if page reclaim is not making progress") made it worse. Systems at or near an OOM state that cannot be recovered must reach OOM quickly and memcg should kill tasks if a memcg is near OOM. To address this, only stall for the first zone in the zonelist, reduce the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if the scan control nr_reclaimed is 0, kswapd is still active and there were excessive pages pending for writeback. If kswapd has stopped reclaiming due to excessive failures, do not stall at all so that OOM triggers relatively quickly. Similarly, if an LRU is simply congested, only lightly throttle similar to NOPROGRESS. Alexey's original case was the most straight forward for i in {1..3}; do tail /dev/zero; done On vanilla 5.16-rc1, this test stalled heavily, after the patch the test completes in a few seconds similar to 5.15. Alexey's second test case added watching a youtube video while tail runs 10 times. On 5.15, playback only jitters slightly, 5.16-rc1 stalls a lot with lots of frames missing and numerous audio glitches. With this patch applies, the video plays similarly to 5.15. [lkp@intel.com: Fix W=1 build warning] Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net Link: https://lore.kernel.org/r/20211202150614.22440-1-mgorman@techsingularity.net Link: https://linux-regtracking.leemhuis.info/regzbot/regression/20211124011954.7cab9bb4@mail.inbox.lv/ Reported-and-tested-by: Alexey Avramov <hakavlad@inbox.lv> Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Reported-and-tested-by: Darrick J. Wong <djwong@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Hugh Dickins <hughd@google.com> Tracked-by: Thorsten Leemhuis <regressions@leemhuis.info> Fixes: `69392a403f` ("mm/vmscan: throttle reclaim when no progress is being made") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-12-31 11:17:07 -08:00
Linus Torvalds	f87bcc88f3	Merge branch 'akpm' (patches from Andrew) Merge misc mm fixes from Andrew Morton: "2 patches. Subsystems affected by this patch series: mm (userfaultfd and damon)" * akpm: mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()' userfaultfd/selftests: fix hugetlb area allocations	2021-12-31 09:28:48 -08:00
Linus Torvalds	e46227bf38	SCSI fixes on 20211231 Three fixes, all in drivers. The lpfc one doesn't look exploitable, but nasty things could happen in string operations if mybuf ends up with an on stack unterminated string. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYc8YLiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdUhAQCVmqLx GhEK15Y8etJwMoj03I6hO5gChhQz6kk7pxXAVwD/e5LHrVVeq/WxjUnyrC1gx6sm iYHYbZ0UHotwbRpwU9k= =WAIf -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three fixes, all in drivers. The lpfc one doesn't look exploitable, but nasty things could happen in string operations if mybuf ends up with an on stack unterminated string" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: vmw_pvscsi: Set residual data length conditionally scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown() scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()	2021-12-31 09:22:25 -08:00
SeongJae Park	ebb3f994dd	mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()' DAMON debugfs interface increases the reference counts of 'struct pid's for targets from the 'target_ids' file write callback ('dbgfs_target_ids_write()'), but decreases the counts only in DAMON monitoring termination callback ('dbgfs_before_terminate()'). Therefore, when 'target_ids' file is repeatedly written without DAMON monitoring start/termination, the reference count is not decreased and therefore memory for the 'struct pid' cannot be freed. This commit fixes this issue by decreasing the reference counts when 'target_ids' is written. Link: https://lkml.kernel.org/r/20211229124029.23348-1-sj@kernel.org Fixes: `4bc05954d0` ("mm/damon: implement a debugfs-based user space interface") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-12-31 09:20:12 -08:00
Mike Kravetz	f5c7329718	userfaultfd/selftests: fix hugetlb area allocations Currently, userfaultfd selftest for hugetlb as run from run_vmtests.sh or any environment where there are 'just enough' hugetlb pages will always fail with: testing events (fork, remap, remove): ERROR: UFFDIO_COPY error: -12 (errno=12, line=616) The ENOMEM error code implies there are not enough hugetlb pages. However, there are free hugetlb pages but they are all reserved. There is a basic problem with the way the test allocates hugetlb pages which has existed since the test was originally written. Due to the way 'cleanup' was done between different phases of the test, this issue was masked until recently. The issue was uncovered by commit `8ba6e86408` ("userfaultfd/selftests: reinitialize test context in each test"). For the hugetlb test, src and dst areas are allocated as PRIVATE mappings of a hugetlb file. This means that at mmap time, pages are reserved for the src and dst areas. At the start of event testing (and other tests) the src area is populated which results in allocation of huge pages to fill the area and consumption of reserves associated with the area. Then, a child is forked to fault in the dst area. Note that the dst area was allocated in the parent and hence the parent owns the reserves associated with the mapping. The child has normal access to the dst area, but can not use the reserves created/owned by the parent. Thus, if there are no other huge pages available allocation of a page for the dst by the child will fail. Fix by not creating reserves for the dst area. In this way the child can use free (non-reserved) pages. Also, MAP_PRIVATE of a file only makes sense if you are interested in the contents of the file before making a COW copy. The test does not do this. So, just use MAP_ANONYMOUS \| MAP_HUGETLB to create an anonymous hugetlb mapping. There is no need to create a hugetlb file in the non-shared case. Link: https://lkml.kernel.org/r/20211217172919.7861-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-12-31 09:20:12 -08:00
Deep Majumder	c116fe1e18	Docs: Fixes link to I2C specification The link to the I2C specification is broken. Although "https://www.nxp.com" hosts Rev 7 (2021) of this specification, it is behind a login-wall. Thus, an additional link has been added (which doesn't require a login) and the NXP official docs link has been updated. Signed-off-by: Deep Majumder <deep@fastmail.in> [wsa: minor updates to text and commit message] Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-12-31 14:39:28 +01:00
Pavel Skripkin	bb436283e2	i2c: validate user data in compat ioctl Wrong user data may cause warning in i2c_transfer(), ex: zero msgs. Userspace should not be able to trigger warnings, so this patch adds validation checks for user data in compact ioctl to prevent reported warnings Reported-and-tested-by: syzbot+e417648b303855b91d8a@syzkaller.appspotmail.com Fixes: `7d5cb45655` ("i2c compat ioctls: move to ->compat_ioctl()") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>	2021-12-31 14:28:22 +01:00
Leo L. Schwab	bc7ec91718	Input: spaceball - fix parsing of movement data packets The spaceball.c module was not properly parsing the movement reports coming from the device. The code read axis data as signed 16-bit little-endian values starting at offset 2. In fact, axis data in Spaceball movement reports are signed 16-bit big-endian values starting at offset 3. This was determined first by visually inspecting the data packets, and later verified by consulting: http://spacemice.org/pdf/SpaceBall_2003-3003_Protocol.pdf If this ever worked properly, it was in the time before Git... Signed-off-by: Leo L. Schwab <ewhac@ewhac.org> Link: https://lore.kernel.org/r/20211221101630.1146385-1-ewhac@ewhac.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2021-12-30 21:09:29 -08:00
Pavel Skripkin	9f3ccdc3f6	Input: appletouch - initialize work before device registration Syzbot has reported warning in __flush_work(). This warning is caused by work->func == NULL, which means missing work initialization. This may happen, since input_dev->close() calls cancel_work_sync(&dev->work), but dev->work initalization happens _after_ input_register_device() call. So this patch moves dev->work initialization before registering input device Fixes: `5a6eb676d3` ("Input: appletouch - improve powersaving for Geyser3 devices") Reported-and-tested-by: syzbot+b88c5eae27386b252bbd@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20211230141151.17300-1-paskripkin@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>	2021-12-30 21:04:04 -08:00
Linus Torvalds	4f3d93c6ea	drm fixes for 5.16-rc8 nouveau: - fencing regression fix i915: - Fix possible uninitialized variable - Fix composite fence seqno icrement on each fence creation amdgpu: - Fencing fix - XGMI fix - VCN regression fix - IP discovery regression fixes - Fix runpm documentation - Suspend/resume fixes - Yellow Carp display fixes - MCLK power management fix - dma-buf fix -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmHOYw0ACgkQDHTzWXnE hr5uSw/+LUOSTfobUSZxRLwhpD9wIk1i29J6OTxKO8DJHLGW1TlZzOI0QXFp1Ikf oEImQkEr4YVzjmcbgtPSl2v2oI8odrIbvJnps733FereIkfCdiT4Odf+Is6/Gs5m zjLg9EGIJt6TFrgCDuL9yFXWnVELpxmvsKJ+eyUa1NfbT61xSy7TcwRkv5+5gkoJ ZMkuvVo2rgEAKiVA9vlSDjG0r8/ksFhK7hy9w0E5V44xJEmemEPRw9FjOd8Efujc gbSCw5vIBXRPD7kDTwKUw6Y7MKChZ7DFyIF7t0ioez32cCK8MVrmjdD+cHcx77fv EgvqlAbhZAFIo/nb/FGxVHYzlUbxqsZhYsYzX00WROEqgmiLiEirBXM1+6ChqS1C Jicfe+Ko5MXle5MVd9UlgCIdd/St5Bfr77Nejq6U3R697Oskt/1g2nV1adCSTvyv c3Tf9P3C9edzdzT6jnwLCkXCUtyki6w5RBgM4x9R1fP/BFvIOdahhcKilcqli2jx s5HxMIZUYEcR5NNAcpMZFZNnDSGvI5pQWTqD7Gu1lsmyqWyy7GkBDbIjnToDPORn 3Bno2c1OhYanaxDr2pgGKgI1I9mRb0L+jPRRSNgBwgxMrmwixpMJlmCpGbI/AZtD kZK9F8wAHUm/hrWMC7xrGFMHiEEdD4xV3jMz/mAgpGFE8WSZUgA= =WRy6 -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2021-12-31' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "This is a bit bigger than I'd like, however it has two weeks of amdgpu fixes in it, since they missed last week, which was very small. The nouveau regression is probably the biggest fix in here, and it needs to go into 5.15 as well, two i915 fixes, and then a scattering of amdgpu fixes. The biggest fix in there is for a fencing NULL pointer dereference, the rest are pretty minor. For the misc team, I've pulled the two misc fixes manually since I'm not sure what is happening at this time of year! The amdgpu maintainers have the outstanding runpm regression to fix still, they are just working through the last bits of it now. Summary: nouveau: - fencing regression fix i915: - Fix possible uninitialized variable - Fix composite fence seqno icrement on each fence creation amdgpu: - Fencing fix - XGMI fix - VCN regression fix - IP discovery regression fixes - Fix runpm documentation - Suspend/resume fixes - Yellow Carp display fixes - MCLK power management fix - dma-buf fix" * tag 'drm-fixes-2021-12-31' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Changed pipe split policy to allow for multi-display pipe split drm/amd/display: Fix USB4 null pointer dereference in update_psp_stream_config drm/amd/display: Set optimize_pwr_state for DCN31 drm/amd/display: Send s0i2_rdy in stream_count == 0 optimization drm/amd/display: Added power down for DCN10 drm/amd/display: fix B0 TMDS deepcolor no dislay issue drm/amdgpu: no DC support for headless chips drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform drm/amdgpu: always reset the asic in suspend (v2) drm/amd/pm: skip setting gfx cgpg in the s0ix suspend-resume drm/i915: Increment composite fence seqno drm/i915: Fix possible uninitialized variable in parallel extension drm/amdgpu: fix runpm documentation drm/nouveau: wait for the exclusive fence after the shared ones v2 drm/amdgpu: add support for IP discovery gc_info table v2 drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled drm/amd/pm: Fix xgmi link control on aldebaran drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence drm/amdgpu: fix dropped backing store handling in amdgpu_dma_buf_move_notify	2021-12-30 18:25:43 -08:00
Dave Airlie	ce9b333c73	Merge branch 'drm-misc-fixes' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes This merges two fixes that haven't been sent to me yet, but I wanted to get in. One amdgpu fix, but one nouveau regression fixer. Signed-off-by: Dave Airlie <airlied@redhat.com>	2021-12-31 11:40:29 +10:00
Christian Brauner	012e332286	fs/mount_setattr: always cleanup mount_kattr Make sure that finish_mount_kattr() is called after mount_kattr was succesfully built in both the success and failure case to prevent leaking any references we took when we built it. We returned early if path lookup failed thereby risking to leak an additional reference we took when building mount_kattr when an idmapped mount was requested. Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: `9caccd4154` ("fs: introduce MOUNT_ATTR_IDMAP") Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-12-30 15:12:13 -08:00
Linus Torvalds	74c78b4291	Networking fixes for 5.16-rc8, including fixes from.. Santa? Current release - regressions: - xsk: initialise xskb free_list_node, fixup for a -rc7 fix Current release - new code bugs: - mlx5: handful of minor fixes: - use first online CPU instead of hard coded CPU - fix some error handling paths in 'mlx5e_tc_add_fdb_flow()' - fix skb memory leak when TC classifier action offloads are disabled - fix memory leak with rules with internal OvS port Previous releases - regressions: - igc: do not enable crosstimestamping for i225-V models Previous releases - always broken: - udp: use datalen to cap ipv6 udp max gso segments - fix use-after-free in tw_timer_handler due to early free of stats - smc: fix kernel panic caused by race of smc_sock - smc: don't send CDC/LLC message if link not ready, avoid timeouts - sctp: use call_rcu to free endpoint, avoid UAF in sock diag - bridge: mcast: add and enforce query interval minimum - usb: pegasus: do not drop long Ethernet frames - mlx5e: fix ICOSQ recovery flow for XSK - nfc: uapi: use kernel size_t to fix user-space builds Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHN9xUACgkQMUZtbf5S Irt86w/9HC6nHXaEmcBoLhBp7k39Kbs5s/og68+ALgtQt/XRlQsiC5HuYqLQREQ0 kqGEyp0JJyLuAM23CcWM7s8JhEAcmyHiGFhdCtrTwNltFLE0Fvd7XYPtG8VXHtVE bEbMu3cmafKtyn5EueFp+Hfl1yA0u5LrX6lDZfLgEgYDjLVSUJCXg2B+uiTIdhON UuKdXIHrBWX0aZpCHeMZ0/Ksdw9oOq7dqcaKi62yQAWkXpQMAUlFJ9OiQXksdlqY leBao3gA8F9J8KK39GfDNyn1Gt8kbN6d/pwi3+IVM2KTHk1wlyLfelDauTG7iUOl FDLuzrKZtMsyAXa5zxeHvQlV2f7CeXsOmpLhGnO0/FSCIc9WvkBFnuq49ESur0Lq 3tu5vrxoIW0In1DWy2HvWCflV3eYatq9eGzAhymkAiBcKrBhJyEE1IH4hYPzRD4x 3ab8Ma0zKzbRum37izNfW2X9hpJTSmlXdVsSP1L6O6hq1iSZhQnQ0dWP8KXw222u CpaqfepkxQMGj+mQss+nIltw8OQnj84dJOajuH/oo4Le4lUciyPizwAo45Muv7D7 2MDd/GFs3yHT8gglxSEjwNg8HKooI93Zc11uEt0KJDTXMlmnCLasTwkKBh+CD970 +PyKuaNDE1k6rav01bcteOEXFOhnDjvU3Kur1bnzo5OXKZ5cbng= =ucH7 -----END PGP SIGNATURE----- Merge tag 'net-5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from.. Santa? No regressions on our radar at this point. The igc problem fixed here was the last one I was tracking but it was broken in previous releases, anyway. Mostly driver fixes and a couple of largish SMC fixes. Current release - regressions: - xsk: initialise xskb free_list_node, fixup for a -rc7 fix Current release - new code bugs: - mlx5: handful of minor fixes: - use first online CPU instead of hard coded CPU - fix some error handling paths in 'mlx5e_tc_add_fdb_flow()' - fix skb memory leak when TC classifier action offloads are disabled - fix memory leak with rules with internal OvS port Previous releases - regressions: - igc: do not enable crosstimestamping for i225-V models Previous releases - always broken: - udp: use datalen to cap ipv6 udp max gso segments - fix use-after-free in tw_timer_handler due to early free of stats - smc: fix kernel panic caused by race of smc_sock - smc: don't send CDC/LLC message if link not ready, avoid timeouts - sctp: use call_rcu to free endpoint, avoid UAF in sock diag - bridge: mcast: add and enforce query interval minimum - usb: pegasus: do not drop long Ethernet frames - mlx5e: fix ICOSQ recovery flow for XSK - nfc: uapi: use kernel size_t to fix user-space builds" * tag 'net-5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) fsl/fman: Fix missing put_device() call in fman_port_probe selftests: net: using ping6 for IPv6 in udpgro_fwd.sh Documentation: fix outdated interpretation of ip_no_pmtu_disc net/ncsi: check for error return from call to nla_put_u32 net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helper net: fix use-after-free in tw_timer_handler selftests: net: Fix a typo in udpgro_fwd.sh selftests/net: udpgso_bench_tx: fix dst ip argument net: bridge: mcast: add and enforce startup query interval minimum net: bridge: mcast: add and enforce query interval minimum ipv6: raw: check passed optlen before reading xsk: Initialise xskb free_list_node net/mlx5e: Fix wrong features assignment in case of error net/mlx5e: TC, Fix memory leak with rules with internal port ionic: Initialize the 'lif->dbid_inuse' bitmap igc: Fix TX timestamp support for non-MSI-X platforms igc: Do not enable crosstimestamping for i225-V models net/smc: fix kernel panic caused by race of smc_sock net/smc: don't send CDC/LLC message if link not ready NFC: st21nfca: Fix memory leak in device probe and remove ...	2021-12-30 11:12:12 -08:00
Linus Torvalds	9bad743e8d	Char/Misc fixes for 5.16-final Here are two misc driver fixes for 5.16-final: - binder accounting fix to resolve reported problem - nitro_enclaves fix for mmap assert warning output Both of these have been for over a week with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYc3i1w8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynYBgCgvqxG0Ykl3G/RG55U2fSZlWJuLfsAoKZBCt+6 BTSCwLhNQvJ5fI6BHFkK =3nNc -----END PGP SIGNATURE----- Merge tag 'char-misc-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are two misc driver fixes for 5.16-final: - binder accounting fix to resolve reported problem - nitro_enclaves fix for mmap assert warning output Both of these have been for over a week with no reported issues" * tag 'char-misc-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: nitro_enclaves: Use get_user_pages_unlocked() call to handle mmap assert binder: fix async_free_space accounting for empty parcels	2021-12-30 09:52:32 -08:00
Linus Torvalds	2d40060bb5	USB fixes for 5.16-final Here are some small USB driver fixes for 5.16 to resolve some reported problems: - mtu3 driver fixes - typec ucsi driver fix - xhci driver quirk added - usb gadget f_fs fix for reported crash All of these have been in linux-next for a while with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYc3jlA8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymbwgCfbEHPGRtOsbEFFiJugbKhVHCi0w8An0CHzzTB 3nEwm+l4BUkUcvqTxc7A =95Py -----END PGP SIGNATURE----- Merge tag 'usb-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB driver fixes for 5.16 to resolve some reported problems: - mtu3 driver fixes - typec ucsi driver fix - xhci driver quirk added - usb gadget f_fs fix for reported crash All of these have been in linux-next for a while with no reported problems" * tag 'usb-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: typec: ucsi: Only check the contract if there is a connection xhci: Fresco FL1100 controller should not have BROKEN_MSI quirk set. usb: mtu3: set interval of FS intr and isoc endpoint usb: mtu3: fix list_head check warning usb: mtu3: add memory barrier before set GPD's HWO usb: mtu3: fix interval value for intr and isoc usb: gadget: f_fs: Clear ffs_eventfd in ffs_data_clear.	2021-12-30 09:49:54 -08:00
Miaoqian Lin	bf2b09fedc	fsl/fman: Fix missing put_device() call in fman_port_probe The reference taken by 'of_find_device_by_node()' must be released when not needed anymore. Add the corresponding 'put_device()' in the and error handling paths. Fixes: `18a6c85fcc` ("fsl/fman: Add FMan Port Support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-30 13:34:06 +00:00
Jianguo Wu	8b3170e075	selftests: net: using ping6 for IPv6 in udpgro_fwd.sh udpgro_fwd.sh output following message: ping: 2001:db8:1:💯 Address family for hostname not supported Using ping6 when pinging IPv6 addresses. Fixes: `a062260a9d` ("selftests: net: add UDP GRO forwarding self-tests") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-30 13:31:48 +00:00
xu xin	be1c5b5322	Documentation: fix outdated interpretation of ip_no_pmtu_disc The updating way of pmtu has changed, but documentation is still in the old way. So this patch updates the interpretation of ip_no_pmtu_disc and min_pmtu. See commit `28d35bcdd3` ("net: ipv4: don't let PMTU updates increase route MTU") Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-30 13:28:04 +00:00
Dave Airlie	aeeb82fd61	Merge tag 'amd-drm-fixes-5.16-2021-12-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.16-2021-12-29: amdgpu: - Fencing fix - XGMI fix - VCN regression fix - IP discovery regression fixes - Fix runpm documentation - Suspend/resume fixes - Yellow Carp display fixes - MCLK power management fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211229155129.5789-1-alexander.deucher@amd.com	2021-12-30 13:55:48 +10:00
Jakub Kicinski	ccc0c9be75	mlx5-fixes-2021-12-28 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmHMA2sACgkQSD+KveBX +j4EmAgArIcwFde37gGKqwW/alEWohligk7KN5QiJDFZ9HrzpTEQp9vCZ/JV5TeC 1ySaW/34gUrhPPM2brgAY+ZdYeIu1tApmmKTHAbzCFn44viShqxjH8nJUYKZtqeu sAATmR059Ap1Zsb6y74u6jy5qUD2/dkkjDlaNBYoYmkTeKKg+Jkt56tE0lVLAhn2 PMsd8VO459KUor+0HJoXHEzurHRvitLlK5d7QsYPaiKEdCJ/ZE6NNABXVuMZf5KU gHQcmjH1jy2X722bs92u1ykHOPmDCFDB9ltnR2mLfRhtPglbGhAVL1A8paEvRwy+ tVssgIIt6PG74SOV5DuJFOH970CjpA== =00nz -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2021-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-12-28 This series provides bug fixes to mlx5 driver. * tag 'mlx5-fixes-2021-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Fix wrong features assignment in case of error net/mlx5e: TC, Fix memory leak with rules with internal port ==================== Link: https://lore.kernel.org/r/20211229065352.30178-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-29 18:19:01 -08:00
Dave Airlie	05097b19a9	Merge tag 'drm-intel-fixes-2021-12-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.16: - Fix possible uninitialized variable - Fix composite fence seqno icrement on each fence creation Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87h7ark5r5.fsf@intel.com	2021-12-30 12:12:40 +10:00
Jiasheng Jiang	92a34ab169	net/ncsi: check for error return from call to nla_put_u32 As we can see from the comment of the nla_put() that it could return -EMSGSIZE if the tailroom of the skb is insufficient. Therefore, it should be better to check the return value of the nla_put_u32 and return the error code if error accurs. Also, there are many other functions have the same problem, and if this patch is correct, I will commit a new version to fix all. Fixes: `955dc68cb9` ("net/ncsi: Add generic netlink family") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Link: https://lore.kernel.org/r/20211229032118.1706294-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-29 17:53:24 -08:00
Nikolay Aleksandrov	168fed986b	net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helper We need to first check if the context is a vlan one, then we need to check the global bridge multicast vlan snooping flag, and finally the vlan's multicast flag, otherwise we will unnecessarily enable vlan mcast processing (e.g. querier timers). Fixes: `7b54aaaf53` ("net: bridge: multicast: add vlan state initialization and control") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20211228153142.536969-1-nikolay@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-29 17:49:45 -08:00
Muchun Song	e22e45fc9e	net: fix use-after-free in tw_timer_handler A real world panic issue was found as follow in Linux 5.4. BUG: unable to handle page fault for address: ffffde49a863de28 PGD 7e6fe62067 P4D 7e6fe62067 PUD 7e6fe63067 PMD f51e064067 PTE 0 RIP: 0010:tw_timer_handler+0x20/0x40 Call Trace: <IRQ> call_timer_fn+0x2b/0x120 run_timer_softirq+0x1ef/0x450 __do_softirq+0x10d/0x2b8 irq_exit+0xc7/0xd0 smp_apic_timer_interrupt+0x68/0x120 apic_timer_interrupt+0xf/0x20 This issue was also reported since 2017 in the thread [1], unfortunately, the issue was still can be reproduced after fixing DCCP. The ipv4_mib_exit_net is called before tcp_sk_exit_batch when a net namespace is destroyed since tcp_sk_ops is registered befrore ipv4_mib_ops, which means tcp_sk_ops is in the front of ipv4_mib_ops in the list of pernet_list. There will be a use-after-free on net->mib.net_statistics in tw_timer_handler after ipv4_mib_exit_net if there are some inflight time-wait timers. This bug is not introduced by commit `f2bf415cfe` ("mib: add net to NET_ADD_STATS_BH") since the net_statistics is a global variable instead of dynamic allocation and freeing. Actually, commit `61a7e26028` ("mib: put net statistics on struct net") introduces the bug since it put net statistics on struct net and free it when net namespace is destroyed. Moving init_ipv4_mibs() to the front of tcp_init() to fix this bug and replace pr_crit() with panic() since continuing is meaningless when init_ipv4_mibs() fails. [1] https://groups.google.com/g/syzkaller/c/p1tn-_Kc6l4/m/smuL_FMAAgAJ?pli=1 Fixes: `61a7e26028` ("mib: put net statistics on struct net") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Cong Wang <cong.wang@bytedance.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211228104145.9426-1-songmuchun@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-29 17:46:43 -08:00

1 2 3 4 5 ...

1060265 Commits