Due to restriction that cannot handle multiple
buffer descriptor structures, decrease the maximum
read/write size for Windows clients.
And set the maximum fragmented receive size
in consideration of the receive queue size.
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Create a memory region pool because rdma_rw_ctx_init()
uses memory registration if memory registration yields
better performance than using multiple SGE entries.
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Whenever new parameter is added to smb configuration, It is possible
to break the execution of the IPC daemon by mismatch size of
request/response. This patch tries to reserve space in ipc request/response
in advance to prevent that.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
If the client ignores the CreditResponse received from the server and
continues to send the request, ksmbd limits the requests if it exceeds
smb2 max credits.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Moves the credit charge deduction from total_credits under the processing
a request. When repeating smb2 lock request and other command request,
there will be a problem that ->total_credits does not decrease.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Add smb2 max credits parameter to adjust maximum credits value to limit
number of outstanding requests.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
When SMB Direct is used with iWARP, Windows use 5445 port for smb direct
port, 445 port for SMB. This patch check ib_device using ib_client to
know if NICs type is iWARP or Infiniband.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Register ksmbd ib client with ib_register_client() to find the rdma capable
network adapter. If ops.get_netdev(Chelsio NICs) is NULL, ksmbd will find
it using ib_device_get_by_netdev in old way.
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Update the arch_support_sort_key() function in powerpc to enable
presenting local and global variants of sort key 'p_stage_cyc'.
Update the "se_header" strings for these in arch_perf_header_entry()
along with instruction latency.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20211203022038.48240-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sort key 'p_stage_cyc' is used to present the latency cycles spent in
pipeline stages.
perf has local 'p_stage_cyc' sort key to display this info. There is no
global variant available for this sort key. The local variant shows
latency in a single sample, whereas the global value will be useful to
present the total latency (sum of latencies) in the hist entry. It
represents the latency number multiplied by the number of samples.
Add global ('p_stage_cyc') and local variant ('local_p_stage_cyc') for
this sort key. Use 'local_p_stage_cyc' as default option for "mem" sort
mode.
Also add this to the list of dynamic sort keys and made the
"dynamic_headers" and "arch_specific_sort_keys" as static.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20211203022038.48240-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We probably want to remove the indirect block to extents migration
feature after a deprecation window, but until then, let's fix a
potential data loss problem caused by the fact that we put the
tmp_inode on the orphan list. In the unlikely case where we crash and
do a journal recovery, the data blocks belonging to the inode being
migrated are also represented in the tmp_inode on the orphan list ---
and so its data blocks will get marked unallocated, and available for
reuse.
Instead, stop putting the tmp_inode on the oprhan list. So in the
case where we crash while migrating the inode, we'll leak an inode,
which is not a disaster. It will be easily fixed the next time we run
fsck, and it's better than potentially having blocks getting claimed
by two different files, and losing data as a result.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@kernel.org
BUG_ON would be better.
This issue was detected with the help of Coccinelle.
Reported-by: Zeal robot <zealci@zte.com.cn>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Link: https://lore.kernel.org/r/20211228073252.580296-1-xu.xin16@zte.com.cn
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This was obviously supposed to be an ext4 struct, not xfs. GCC
doesn't care either way so it doesn't affect the build or runtime.
Fixes: cebe85d570 ("ext4: switch to the new mount api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20211215114309.GB14552@kili
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When migrating to extents, the temporary inode will have it's own checksum
seed. This means that, when swapping the inodes data, the inode checksums
will be incorrect.
This can be fixed by recalculating the extents checksums again. Or simply
by copying the seed into the temporary inode.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357
Reported-by: Jeroen van Wolffelaar <jeroen@wolffelaar.nl>
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Link: https://lore.kernel.org/r/20211214175058.19511-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Although it is in the loop, offset is reassigned at the beginning of the
while loop. And after the loop, the value will not be used
The clang_analyzer complains as follows:
fs/ext4/dir.c:306:3 warning:
Value stored to 'offset' is never read
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Link: https://lore.kernel.org/r/20211208075307.404703-1-luo.penghao@zte.com.cn
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The if will goto out of the loop, and until the end of the
function execution, o_start will not be used again.
The clang_analyzer complains as follows:
fs/ext4/move_extent.c:635:5 warning:
Value stored to 'o_start' is never read
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Link: https://lore.kernel.org/r/20211208075157.404535-1-luo.penghao@zte.com.cn
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
EXT_FIRST_INDEX(ptr) is ptr+12, which can't possibly be null; gcc-12
warns about this.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20211115172020.57853-1-kilobyte@angband.pl
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The eh assignment in these two places is meaningless, because the
function will goto to merge, which will not use eh.
The clang_analyzer complains as follows:
fs/ext4/extents.c:1988:4 warning:
fs/ext4/extents.c:2016:4 warning:
Value stored to 'eh' is never read
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Link: https://lore.kernel.org/r/20211104064007.2919-1-luo.penghao@zte.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The local variable assignment at the end of the function is meaningless.
The clang_analyzer complains as follows:
fs/ext4/fast_commit.c:779:2 warning:
Value stored to 'dst' is never read
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: luo penghao <luo.penghao@zte.com.cn>
Link: https://lore.kernel.org/r/20211104063406.2747-1-luo.penghao@zte.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The command "make clang-analyzer" detects dead stores in
mpage_process_page() function.
Do not reset io_end_size to 0 in the current paths, as the function
exits on those paths without further using io_end_size.
Signed-off-by: Nghia Le <nghialm78@gmail.com>
Link: https://lore.kernel.org/r/20211025221803.3326-1-nghialm78@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ext4 has an optimization mechanism for batched disacrd (FITRIM) that
should help speed up subsequent calls of FITRIM ioctl by skipping the
groups that were previously trimmed. However because the FITRIM allows
to set the minimum size of an extent to trim, ext4 stores the last
minimum extent size and only avoids trimming the group if it was
previously trimmed with minimum extent size equal to, or smaller than
the current call.
There is currently no way to bypass the optimization without
umount/mount cycle. This becomes a problem when the file system is
live migrated to a different storage, because the optimization will
prevent possibly useful discard calls to the storage.
Fix it by exporting the s_last_trim_minblks via sysfs interface which
will allow us to set the minimum size to the number of blocks larger
than subsequent FITRIM call, effectively bypassing the optimization.
By setting the s_last_trim_minblks to ULONG_MAX the optimization will be
effectively cleared regardless of the previous state, or file system
configuration.
For example:
getconf ULONG_MAX > /sys/fs/ext4/dm-1/last_trim_minblks
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Laurent GUERBY <laurent@guerby.net>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20211103145122.17338-2-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
There is no good reason for the s_last_trim_minblks to be atomic. There is
no data integrity needed and there is no real danger in setting and
reading it in a racy manner. Change it to be unsigned long, the same type
as s_clusters_per_group which is the maximum that's allowed.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Suggested-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20211103145122.17338-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Implement support for FS_IOC_GETFSLABEL and FS_IOC_SETFSLABEL ioctls for
online reading and setting of file system label.
ext4_ioctl_getlabel() is simple, just get the label from the primary
superblock. This might not be the first sb on the file system if
'sb=' mount option is used.
In ext4_ioctl_setlabel() we update what ext4 currently views as a
primary superblock and then proceed to update backup superblocks. There
are two caveats:
- the primary superblock might not be the first superblock and so it
might not be the one used by userspace tools if read directly
off the disk.
- because the primary superblock might not be the first superblock we
potentialy have to update it as part of backup superblock update.
However the first sb location is a bit more complicated than the rest
so we have to account for that.
The superblock modification is created generic enough so the
infrastructure can be used for other potential superblock modification
operations, such as chaning UUID.
Tested with generic/492 with various configurations. I also checked the
behavior with 'sb=' mount options, including very large file systems
with and without sparse_super/sparse_super2.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20211213135618.43303-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Only set EXT4_MOUNT_QUOTA when journalled quota file is specified,
otherwise simply disabling specific quota type (usrjquota=) will also
set the EXT4_MOUNT_QUOTA super block option.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Fixes: e6e268cb68 ("ext4: move quota configuration out of handle_mount_opt()")
Link: https://lore.kernel.org/r/20220104143518.134465-2-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
During ext4 mount api rework the commit e6e268cb68 ("ext4: move quota
configuration out of handle_mount_opt()") introduced a bug where we
would kfree(sbi->s_qf_names[i]) before assigning the new quota name in
ext4_apply_quota_options().
This is wrong because we're using kfree() on rcu prointer that could be
simultaneously accessed from ext4_show_quota_options() during remount.
Fix it by using rcu_replace_pointer() to replace the old qname with the
new one and then kfree_rcu() the old quota name.
Also use get_qf_name() instead of sbi->s_qf_names in strcmp() to silence
the sparse warning.
Fixes: e6e268cb68 ("ext4: move quota configuration out of handle_mount_opt()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220104143518.134465-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
A user reported FITRIM ioctl failing for him on ext4 on some devices
without apparent reason. After some debugging we've found out that
these devices (being LVM volumes) report rather large discard
granularity of 42MB and the filesystem had 1k blocksize and thus group
size of 8MB. Because ext4 FITRIM implementation puts discard
granularity into minlen, ext4_trim_fs() declared the trim request as
invalid. However just silently doing nothing seems to be a more
appropriate reaction to such combination of parameters since user did
not specify anything wrong.
CC: Lukas Czerner <lczerner@redhat.com>
Fixes: 5c2ed62fd4 ("ext4: Adjust minlen with discard_granularity in the FITRIM ioctl")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20211112152202.26614-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Our syzkaller report an use-after-free issue that accessing the freed
buffer_head on the writeback page in __ext4_journalled_writepage(). The
problem is that if there was a truncate racing with the data=journalled
writeback procedure, the writeback length could become zero and
bget_one() refuse to get buffer_head's refcount, then the truncate
procedure release buffer once we drop page lock, finally, the last
ext4_walk_page_buffers() trigger the use-after-free problem.
sync truncate
ext4_sync_file()
file_write_and_wait_range()
ext4_setattr(0)
inode->i_size = 0
ext4_writepage()
len = 0
__ext4_journalled_writepage()
page_bufs = page_buffers(page)
ext4_walk_page_buffers(bget_one) <- does not get refcount
do_invalidatepage()
free_buffer_head()
ext4_walk_page_buffers(page_bufs) <- trigger use-after-free
After commit bdf96838ae ("ext4: fix race between truncate and
__ext4_journalled_writepage()"), we have already handled the racing
case, so the bget_one() and bput_one() are not needed. So this patch
simply remove these hunk, and recheck the i_size to make it safe.
Fixes: bdf96838ae ("ext4: fix race between truncate and __ext4_journalled_writepage()")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211225090937.712867-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
It is not guaranteed that __ext4_get_inode_loc will definitely set
err_blk pointer when it returns EIO. To avoid using uninitialized
variables, let's first set err_blk to 0.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211201163421.2631661-1-harshads@google.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
We found on older kernel (3.10) that in the scenario of insufficient
disk space, system may trigger an ABBA deadlock problem, it seems that
this problem still exists in latest kernel, try to fix it here. The
main process triggered by this problem is that task A occupies the PA
and waits for the jbd2 transaction finish, the jbd2 transaction waits
for the completion of task B's IO (plug_list), but task B waits for
the release of PA by task A to finish discard, which indirectly forms
an ABBA deadlock. The related calltrace is as follows:
Task A
vfs_write
ext4_mb_new_blocks()
ext4_mb_mark_diskspace_used() JBD2
jbd2_journal_get_write_access() -> jbd2_journal_commit_transaction()
->schedule() filemap_fdatawait()
| |
| Task B |
| do_unlinkat() |
| ext4_evict_inode() |
| jbd2_journal_begin_ordered_truncate() |
| filemap_fdatawrite_range() |
| ext4_mb_new_blocks() |
-ext4_mb_discard_group_preallocations() <-----
Here, try to cancel ext4_mb_discard_group_preallocations() internal
retry due to PA busy, and do a limited number of retries inside
ext4_mb_discard_preallocations(), which can circumvent the above
problems, but also has some advantages:
1. Since the PA is in a busy state, if other groups have free PAs,
keeping the current PA may help to reduce fragmentation.
2. Continue to traverse forward instead of waiting for the current
group PA to be released. In most scenarios, the PA discard time
can be reduced.
However, in the case of smaller free space, if only a few groups have
space, then due to multiple traversals of the group, it may increase
CPU overhead. But in contrast, I feel that the overall benefit is
better than the cost.
Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/1637630277-23496-1-git-send-email-brookxu.cn@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
coccicheck complains about the use of snprintf() in sysfs show functions.
Fix the coccicheck warning:
WARNING: use scnprintf or sprintf.
Use sysfs_emit instead of scnprintf or sprintf makes more sense.
Signed-off-by: Qing Wang <wangqing@vivo.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/1634095731-4528-1-git-send-email-wangqing@vivo.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we succeed in enabling some quota type but fail to enable another
one with quota feature, we correctly disable all enabled quota types.
However we forget to reset i_data_sem lockdep class. When the inode gets
freed and reused, it will inherit this lockdep class (i_data_sem is
initialized only when a slab is created) and thus eventually lockdep
barfs about possible deadlocks.
Reported-and-tested-by: syzbot+3b6f9218b1301ddda3e2@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20211007155336.12493-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When we hit an error when enabling quotas and setting inode flags, we do
not properly shutdown quota subsystem despite returning error from
Q_QUOTAON quotactl. This can lead to some odd situations like kernel
using quota file while it is still writeable for userspace. Make sure we
properly cleanup the quota subsystem in case of error.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20211007155336.12493-2-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If use FALLOC_FL_KEEP_SIZE to alloc unwritten range at bottom, the
inode->i_size will not include the unwritten range. When call
ftruncate with fast commit enabled, it will miss to track the
unwritten range.
Change to trace the full range during ftruncate.
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211223032337.5198-3-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
For now ,we use ext4_punch_hole() during fast commit replay delete range
procedure. But it will be affected by inode->i_size, which may not
correct during fast commit replay procedure. The following test will
failed.
-create & write foo (len 1000K)
-falloc FALLOC_FL_ZERO_RANGE foo (range 400K - 600K)
-create & fsync bar
-falloc FALLOC_FL_PUNCH_HOLE foo (range 300K-500K)
-fsync foo
-crash before a full commit
After the fast_commit reply procedure, the range 400K-500K will not be
removed. Because in this case, when calling ext4_punch_hole() the
inode->i_size is 0, and it just retruns with doing nothing.
Change to use ext4_ext_remove_space() instead of ext4_punch_hole()
to remove blocks of inode directly.
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211223032337.5198-2-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
when call falloc with FALLOC_FL_ZERO_RANGE, to set an range to unwritten,
which has been already initialized. If the range is align to blocksize,
fast commit will not track range for this change.
Also track range for unwritten range in ext4_map_blocks().
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20211221022839.374606-1-yinxin.x@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
The MSI entries for multi-MSI are populated en bloc for the MSI descriptor,
but the current code invokes the population inside the per interrupt loop
which triggers a warning in the sysfs code and causes the interrupt
allocation to fail.
Move it outside of the loop so it works correctly for single and multi-MSI.
Fixes: bf5e758f02 ("genirq/msi: Simplify sysfs handling")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/87leznqx2a.ffs@tglx
instead of the compiler driver to link files
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcGJkACgkQEsHwGGHe
VUqZ8w/+JqbI2c3ShKyh2Zz8P9yjZre1yPqocg/DBGJKmWOfRkjSjwzHJoMN45E6
CgjqqcBFv8i1YkUkYKkn+2LHeVZ4o9FJWKXeTvQ7+H4oewiUMDeOS487cxGjKBlf
aR0ER/1MDZLMqLfiER4s/Ek6cPAZ9uwz9zBn6WZAi95fjXkpAnMpN8U5IKI9/icH
ICCeU4EYqhZGVElhpEOb36/zuy6iZZ6laDl79ecdPL5MfMJnzdRaC+rCgcHTGM3p
AmyhMyL1cJxxH1u/GYW7yI3DNHKKHluFO8xxeGagk9yKCbNcA8/XjkW7oy/5T4FS
WWmDuUqae3kcToNxSm+z2u2InY9mzk6jkl7icXu9TbZZbniaX+/FtM7mWxLtQmsD
e0IHH1DtUK/NovKiLzCI89Lrzj4Z5ZAC2q9m3z/MrTUZo8YoRxIYS94Xe4LNY2Z8
itjAmUPO+H3DAOSuHWv/AD/qehM7k05I9sw3yxQusDJH3S4sWQOBcpqKEqpzDliK
9rvg68K73aHM0lrdq2QRu+A5GjsKZ8FbFNG/uZDDU1tghwa6pcQ6BARMR+db7vD9
bncEpuQ0cM3Dxll0GarSL3nNtodZzvUCJ4fhrd/l31rq80Fs0IdB96jrUgLmwVnX
FTns3BQS27bJpDWAwL6O2K46FKeKFgjeuvKD2mfTK2Cyv3xmEB0=
=oxKT
-----END PGP SIGNATURE-----
Merge tag 'x86_vdso_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Borislav Petkov:
"Remove -nostdlib compiler flag now that the vDSO uses the linker
instead of the compiler driver to link files"
* tag 'x86_vdso_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/purgatory: Remove -nostdlib compiler flag
x86/vdso: Remove -nostdlib compiler flag
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcFgkACgkQEsHwGGHe
VUp4VxAAuPLLNi+NKFzO+MZsJfnfQ+3w5zaB/fCEbiBFeDXU1AgMYQe5ClgjNwrG
rtAsugbF+W1MyU2CILqsrMqvuB/H1Ra4o4IeQJzLuMnZJBRpRdG7sH1RarUYdG3W
C4WkJTjmsPW3QFJUPs1EoH18zGIxHg4qrIGv9t6kAjWJi+ZnJrkHLjSrnvYSZMAM
sk0jhe1d0NupLh2gmQiTKAd7ULaWeL0RQLRW7ZhTJpaWAMIkL0z2B4azzQb11dgW
m6zRsPtOa+MRBI01OBrEEwoCexAkK8tuFrqV1QdylKhDUsDBTQCyM8zY3i/Fu40U
OVERtDCR+kcjLJ9cWeJdysNGAT91R2QLPrxv6wFod6jnRdn8yRrOz92cwnpN1/Fg
YZD1b0q9l7ar3Qfy0WgA+SMLU1fln/NMVDsZ8fFm7UFhnBNChjHuQ2DAtOBuobhI
Rdu8gI6hHJztGDoRXKaXSDpJgdVP5e9Uvqckh8abxJjy6mTIZy7GUQJeIs2Xk8MQ
U6m0WRsT3clsMC/eIgqrLYlA1QYh99RzcXmmJcVuPbcMDdMtBDm5gfWrWyI+L94c
vUrgVondrwTP7Rz7E3d3nBsV/Alr3gUMZrgjkLIZLmFVrPSfaZMzc6grtbtPxZf8
bvZWi15V05b8jF3pAwUMrG2pbNQMCdTxkIK6Fk7baefQBvh1dOw=
=nqWC
-----END PGP SIGNATURE-----
Merge tag 'x86_build_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build fix from Borislav Petkov:
"A fix for cross-compiling the compressed stub on arm64 with clang"
* tag 'x86_build_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/compressed: Move CLANG_FLAGS to beginning of KBUILD_CFLAGS
copy_user_enhanced_fast_string()
- Avoid writing MSR_CSTAR on Intel due to TDX guests raising a #VE trap
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcFRcACgkQEsHwGGHe
VUrVYRAAg8hJS/aIMnqr+CDX+iOlx2hxJ2TA2bA45NwWc1A4VTt9kwRB0+NIKkjj
F3uJbidZjxSch9Oza6O5KyjJK8QtOfqxyYcx8TLjSleqJRoJWxl1Ub1/yAfKIX/0
QsqXVc/OuMzgwVGYLUwGSWifJOWMYKy03vSczmXK74zp9vZ56fdot8rOhDm3Xb/R
QSfT5nKlgCvxbvAqgFfbXKoEu/EqT43sTXq4o1C6yDX/G6JOGe6nXZIAvIVm3iKZ
utOqO+tBOmektF/yg3EHZL/7paFgtfETcI1YpmPYqKhG3KvvZgm7yyU6SqrcctSx
vMSPTcgcuZl2I5OF+eesUGfGGhHSfSPBAhkxpCTOb6lHf73PYRC3BnQtlQkQt6g/
UOtm3fQwrVJcKlMu7nem46iDCgbSyvASFa5ZyuOGcrAiFLhJzQNRDlXLpxp/q615
yOYTRgj4YS6vomzc6bL3zNCcF5aJUwAPNVghe3l2zwKXetoOPvtWX8sKlYjiN3GW
DTtEi117IAiWkosDIYY+aFNxLeOqxpNMcOkwd5eHHdpR3rkeFkjOtBctll/eHzPi
NYx++cV5yYW0z4S2uRr6o4k4hdgAQU/p7xhdO28Z+yzWpmXQ//79HhiOf2nNd1iI
dpQAx9roo8vbR3JYLxGYFuJrZsHna+/f6Gqf5teUy7SjVL5M95U=
=zbYM
-----END PGP SIGNATURE-----
Merge tag 'x86_cpu_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpuid updates from Borislav Petkov:
- Enable the short string copies for CPUs which support them, in
copy_user_enhanced_fast_string()
- Avoid writing MSR_CSTAR on Intel due to TDX guests raising a #VE trap
* tag 'x86_cpu_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/lib: Add fast-short-rep-movs check to copy_user_enhanced_fast_string()
x86/cpu: Don't write CSTAR MSR on Intel CPUs
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcEx8ACgkQEsHwGGHe
VUr7sw/9HcBuobgSgF8RJTsuaITMYZEIEGCQX9V8i8QLWAmWfw4gWsun3x/QrF0+
/CmuGpXVckTQgYo8Y/knhPfvVRDWLIsF9p18gV7sPxpQvDZ0ZgPQVSsfXx3cl84y
4D11zdYHDPDnFRY6AX4GCh5Ozz4K2WPwv41iHgi+vGoFjfSMeq1+ilAEXCBY8pdZ
S/Txd7oAm5tCVyH0VUXmNbQKTNjDbFXxr/FnWpMtVgXFiDlzbOK91H998ZJP13WL
1owCxLKW8BgvoRXtUBt8jJqHDqzjwBuxrGBCYwiFpVl0PBkbwLLYxElkef6rtpWf
gHAMtUeqgTOprNpvtK3ynaf9g8cgXd5QyBL5nszaD1WCzpVstg+yZExgKFWtTI6F
U0UJkjmGmVGsD9XkV7nbkcO/IGEgY9cOZ3tHqmyEaHSVahgoZRS8SDQKJA5Ivziu
mUN60dXEdX2xtCwGR3thacblTWHZrSr4k7Cb93thCb+Het67/DtOSkVp9U4Eutqu
O2La5Lw1pwsFaTy7WvBUvcuI+gc/zFrbs92r1gT5cGvSJ8+0Pf1Pr4nuhPL9dkKa
rFQ1ubRHpQQ5u4hVDiEVm+PD7I9hSSPEUWiqZamUEkYGF7XYec60FOTgFERhgfez
HExAhjHlH6lo4lSxwuWMXRcJMI7uehEm0fUduX8USvYUQn66zGM=
=u68B
-----END PGP SIGNATURE-----
Merge tag 'x86_cleanups_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:
"The mandatory set of random minor cleanups all over tip"
* tag 'x86_cleanups_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/events/amd/iommu: Remove redundant assignment to variable shift
x86/boot/string: Add missing function prototypes
x86/fpu: Remove duplicate copy_fpstate_to_sigframe() prototype
x86/uaccess: Move variable into switch case statement
to use it in SEV and TDX guests
- An include fix and reorg to allow for removing set_fs in UML later
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcEl4ACgkQEsHwGGHe
VUqFTQ/+K9Kb6X0+r7wBSRTeAIWaYewmgOdf+7rpFVyFqQtNecKbuSAWGgFnEHc8
8HUB/krNa+odtx7mAy73wNALUaPmR0KUg6O+YKrvT6LHt8DLlGl5u0g/hihzFdAB
PW7auuxqt9TvK1i8PkYAI+W7t93o4mw4LzgDCVvoLPQUutRZEV1gHRht8Tn8SjaN
3EmEiazpFDrXNGWl/3rnS0qIyvtiZu7KNtibE6ljbUgse9cgxOt733mykH6eO9RJ
hXOfewKML72UxmgWig01pElgLaXeYI5rpSoG7usm4FwwYh+tmBIA8S/EoeE24gn0
e82lxwRCcHjqUDRp2//gz16sYhs//K6bcViT/4FtnL33e2CjK2/J4MwHPn9zgimO
VvxSdAes7UFiA/gDIomFt3gJij+hfy4TGKg5d3326Nm9rsQLpxg49WkozYJZ8m/f
75VVlC4BAj9SnYLQYhSm9buF7pIXmfwN3yWkYJsebl18C6/4FXLLomiqOgWpo3mG
D0e+CXhLZsEaU5NTiVuaPySzjtpRUzmfWf3S9GifJZex0rX+et7+mqIuC92aHbtD
Dc+nNFX/D77Fq8Uoe8bIEt8QsnjdACov1TI/S8h2rSjt5R/Lyg73qh0CpN0jtQ+S
9dUooJWwE4RXnuVMpFq/Xea/BYj1lQ72kMeyFiCNc0/hnzYhZNM=
=NBcE
-----END PGP SIGNATURE-----
Merge tag 'x86_misc_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:
"The pile which we cannot find the proper topic for so we stick it in
x86/misc:
- Add support for decoding instructions which do MMIO accesses in
order to use it in SEV and TDX guests
- An include fix and reorg to allow for removing set_fs in UML later"
* tag 'x86_misc_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mtrr: Remove the mtrr_bp_init() stub
x86/sev-es: Use insn_decode_mmio() for MMIO implementation
x86/insn-eval: Introduce insn_decode_mmio()
x86/insn-eval: Introduce insn_get_modrm_reg_ptr()
x86/insn-eval: Handle insn_get_opcode() failure
for-5.16-fixes contains two subtle race conditions which were introduced by
scheduler side code cleanups. The branch didn't get pushed out, so merge
into for-5.17.
pagetable to prevent any stale entries' presence
- Flush global mappings from the TLB, in addition to the CR3-write,
after switching off of the trampoline_pgd during boot to clear the
identity mappings
- Prevent instrumentation issues resulting from the above changes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcEGIACgkQEsHwGGHe
VUp14RAAgo6BbW9J82Pyl55egIhcQDdGsa16Gdm9S/AFIIW/NhwYo9ydrgtzr/70
3XKpJYX7nH7PUKYRmoca/m3NnzUU+wnjSGS1XMyB3bJvn2/8S1qeuwBty2VP2dYM
iS2eGRLjVjbMWwQUSK7tPJa5wi11zUqLIyCe3t0YiWso6TK7xKaVJTQ3/19Xc+/a
zVQ5VpmzglUTxA6xGCvTDn5IUViUb8QmIuw7Ty6QtQEoI6T3qQvPkdJNXOxDcHNy
9gDGf4O+5YlPCxYsNEkWDDa02zSZ2aWFSq76b98VyMiOK0xts+ktnAwq6oes+as9
ZLIipOu5aIkj8te7he0FelyvPhZAVzrFvvmMf1U+EV3PqbyVkabhk5SBeP5v8CZy
bM4eYNuJ2FLvFpUCC9zQ/MNVQ6ZtxN15rrrsTqk46KLPBHmHp/Aj9W/DP4zpCcNg
Wwh4xbnGNIN8jZBiBJG6R6q7oM/lZt/loEicxm2QFZHtAIYMsiUmE99HnIREjUHd
+0mwo2rHniie9zh6GoybX8OcbZCLYGdfe3iPvlO9fQpyDTn8IUIlnruDlUiTBMDM
fX4J2dynh7xXRH1WW+MwxDv4n400+C08SG9zTD0qPCbGhYwNscMlZhA2JN6mlPep
spuRPOzzwUUxqjXkDloeDDJNUQ8r032OB2LMhWSbLApJrJM9/QA=
=cM+z
-----END PGP SIGNATURE-----
Merge tag 'x86_mm_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Borislav Petkov:
- Flush *all* mappings from the TLB after switching to the trampoline
pagetable to prevent any stale entries' presence
- Flush global mappings from the TLB, in addition to the CR3-write,
after switching off of the trampoline_pgd during boot to clear the
identity mappings
- Prevent instrumentation issues resulting from the above changes
* tag 'x86_mm_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Prevent early boot triple-faults with instrumentation
x86/mm: Include spinlock_t definition in pgtable.
x86/mm: Flush global TLB when switching to trampoline page-table
x86/mm/64: Flush global TLB on boot and AP bringup
x86/realmode: Add comment for Global bit usage in trampoline_pgd
x86/mm: Add missing <asm/cpufeatures.h> dependency to <asm/page_64.h>
from poison memory and error injection into SGX pages
- A bunch of changes to the SGX selftests to simplify and allow of SGX
features testing without the need of a whole SGX software stack
- Add a sysfs attribute which is supposed to show the amount of SGX
memory in a NUMA node, similar to what /proc/meminfo is to normal
memory
- The usual bunch of fixes and cleanups too
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcDQMACgkQEsHwGGHe
VUq42xAAjWM0AFpIxgUBpbE0swV3ZMulnndl3/vA5XN+9Yn7Q52+AFyPRE0s7Zam
Ap+cInh2Il7d/sv54rZ4x/j7+TH4i7s8fWPVU/XiPALQuOuw0/B1wJJ+jmMiPFiU
3jr7DkUPyWjWTHduMY/tk+xMOpkx1XsxJheYnKvsKVW+fjJ0vPuftAZtfu2z2VOh
3JLcp5cAXPxW0UK9gdoF5bCBQhBu0NRguTbhHhbByAixQO2GyVSKLSRovUdj0a+y
QRrQ6hgcvpTOsVHJoWJ7yIX4SBzQTe9Bg6dT9DghOxE4Sc2GH89hu7wRztGawBJO
nLyzWgiW9ttjQutDpBvZANNVcFAPAdtDWczrzZpREbrGKkzT+kOBnIIL1LWITWOy
2YWTO3ytW0KNIK85GzMjSVOKRMgaHJeBaGuYZ7Z0kb3GuUPJ9zRlaRxNapKQFuzA
0PGoA4IDT+2Afy7VYBBNUA2d/WverFQuXKusSxK6b5zJ173o5/DXL2q0d3gn/j8Z
hhxJUJyVOsfRXSG4NKrj4se4FiA0n/RL4oyUZR9iJ8kWzzZTd0eZTAn468bpGIp5
yiOlPOLgsmu0xzVmAtG1+4d2+S2x+Ec5YE0sP1V/JLNciYk3Ebp7UyfnS3tn33Xc
cpdWjELvD1LJVpMEURnbjRrwU6OiiAekYJCP/9lmK9zfOGpwRHc=
=vFTM
-----END PGP SIGNATURE-----
Merge tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Borislav Petkov:
- Add support for handling hw errors in SGX pages: poisoning,
recovering from poison memory and error injection into SGX pages
- A bunch of changes to the SGX selftests to simplify and allow of SGX
features testing without the need of a whole SGX software stack
- Add a sysfs attribute which is supposed to show the amount of SGX
memory in a NUMA node, similar to what /proc/meminfo is to normal
memory
- The usual bunch of fixes and cleanups too
* tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/sgx: Fix NULL pointer dereference on non-SGX systems
selftests/sgx: Fix corrupted cpuid macro invocation
x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node
x86/sgx: Fix minor documentation issues
selftests/sgx: Add test for multiple TCS entry
selftests/sgx: Enable multiple thread support
selftests/sgx: Add page permission and exception test
selftests/sgx: Rename test properties in preparation for more enclave tests
selftests/sgx: Provide per-op parameter structs for the test enclave
selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed
selftests/sgx: Move setup_test_encl() to each TEST_F()
selftests/sgx: Encpsulate the test enclave creation
selftests/sgx: Dump segments and /proc/self/maps only on failure
selftests/sgx: Create a heap for the test enclave
selftests/sgx: Make data measurement for an enclave segment optional
selftests/sgx: Assign source for each segment
selftests/sgx: Fix a benign linker warning
x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
x86/sgx: Add hook to error injection address validation
x86/sgx: Hook arch_memory_failure() into mainline code
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcC4cACgkQEsHwGGHe
VUoKAw/5AYM6etR5PHHOkNHycaT7fzZYVhvHbW9FWdRh7rlLlQ2A/khe//hkidOY
auXcc1wy0UfJN/MvoKlZyXpbTzj3HdnuNTaNf24akhKAM9HpC5oYzAxGaZtF9OvD
NpkYC1sFWxF1StvRK5iS2Ns80eL0SU1o7VOeKR6BaabBBiuE9Nq9AAuVddNIqWkW
k5wwjkttBea4wSMUt3kZbac5mncwV5o2vF23db6j6vvsG6lVo+xT/zcKzQk0z/bu
NZsdAqKZwzubV+3zksxiQjsPpy4kS7OGVlPCiNGX1zCRZtOX6wpNUzOcj6S4TDju
FIlnD2zqK9mylBomBDq0n/GWJavMNttojO6u9iwJm3v6bteXb5dT3Cu8OCp6J7B+
/q4t79hwleELVQ5z60Wz8R9oPi2jR/YKjSVW1w5X1yxxJPUIDU905ZF7r50BxeA+
QQK0kn+EkJvOahfwK9RwhlaexSilYB9TBum5E70yzBefp3mbMGwwu/60UkjydrEP
SoD/qlY5vatLjuYMLx/Pe93aE0nK0HOfbZKpgvnxzdm9pxQs+GogXGW5CmutfpHo
9Fv8wH0bAlFnltmylXg+zdJ0hggRlQS97bM/FKJXq4qhj3NgEQ1uPhfGSe/CmnTe
t6TB2dXPPN9+2hnbnZDNKRygVPUnRklnnVweSP7ij2Csksiqj7A=
=eG29
-----END PGP SIGNATURE-----
Merge tag 'x86_cache_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control fixlet from Borislav Petkov:
"A minor code cleanup removing a redundant assignment"
* tag 'x86_cache_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Remove redundant assignment to variable chunks