mirror of
https://github.com/torvalds/linux.git
synced 2024-11-22 20:22:09 +00:00
549c729771
983015 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Christian Brauner
|
549c729771
|
fs: make helpers idmap mount aware
Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches. As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods. Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
1ab29965b3
|
exec: handle idmapped mounts
When executing a setuid binary the kernel will verify in bprm_fill_uid() that the inode has a mapping in the caller's user namespace before setting the callers uid and gid. Let bprm_fill_uid() handle idmapped mounts. If the inode is accessed through an idmapped mount it is mapped according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-24-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
435ac6214e
|
would_dump: handle idmapped mounts
When determining whether or not to create a coredump the vfs will verify that the caller is privileged over the inode. Make the would_dump() helper handle idmapped mounts by passing down the mount's user namespace of the exec file. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-23-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
0f5d220b42
|
ioctl: handle idmapped mounts
Enable generic ioctls to handle idmapped mounts by passing down the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-22-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
b816dd5dde
|
init: handle idmapped mounts
Enable the init helpers to handle idmapped mounts by passing down the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-21-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
9eccd12ce7
|
fcntl: handle idmapped mounts
Enable the setfl() helper to handle idmapped mounts by passing down the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-20-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
d06c26f196
|
utimes: handle idmapped mounts
Enable the vfs_utimes() helper to handle idmapped mounts by passing down the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-19-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
7c02cf73d0
|
af_unix: handle idmapped mounts
When binding a non-abstract AF_UNIX socket it will gain a representation in the filesystem. Enable the socket infrastructure to handle idmapped mounts by passing down the user namespace of the mount the socket will be created from. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-18-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
b8b546a061
|
open: handle idmapped mounts
For core file operations such as changing directories or chrooting, determining file access, changing mode or ownership the vfs will verify that the caller is privileged over the inode. Extend the various helpers to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace. Afterwards the permissions checks are identical to non-idmapped mounts. When changing file ownership we need to map the uid and gid from the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-17-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: James Morris <jamorris@linux.microsoft.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
643fe55a06
|
open: handle idmapped mounts in do_truncate()
When truncating files the vfs will verify that the caller is privileged over the inode. Extend it to handle idmapped mounts. If the inode is accessed through an idmapped mount it is mapped according to the mount's user namespace. Afterwards the permissions checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-16-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
6521f89170
|
namei: prepare for idmapped mounts
The various vfs_*() helpers are called by filesystems or by the vfs itself to perform core operations such as create, link, mkdir, mknod, rename, rmdir, tmpfile and unlink. Enable them to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace and pass it down. Afterwards the checks and operations are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-15-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
9fe6145097
|
namei: introduce struct renamedata
In order to handle idmapped mounts we will extend the vfs rename helper to take two new arguments in follow up patches. Since this operations already takes a bunch of arguments add a simple struct renamedata and make the current helper use it before we extend it. Link: https://lore.kernel.org/r/20210121131959.646623-14-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
ba73d98745
|
namei: handle idmapped mounts in may_*() helpers
The may_follow_link(), may_linkat(), may_lookup(), may_open(), may_o_create(), may_create_in_sticky(), may_delete(), and may_create() helpers determine whether the caller is privileged enough to perform the associated operations. Let them handle idmapped mounts by mapping the inode or fsids according to the mount's user namespace. Afterwards the checks are identical to non-idmapped inodes. The patch takes care to retrieve the mount's user namespace right before performing permission checks and passing it down into the fileystem so the user namespace can't change in between by someone idmapping a mount that is currently not idmapped. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-13-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
0d56a4518d
|
stat: handle idmapped mounts
The generic_fillattr() helper fills in the basic attributes associated with an inode. Enable it to handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace before we store the uid and gid. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-12-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
71bc356f93
|
commoncap: handle idmapped mounts
When interacting with user namespace and non-user namespace aware filesystem capabilities the vfs will perform various security checks to determine whether or not the filesystem capabilities can be used by the caller, whether they need to be removed and so on. The main infrastructure for this resides in the capability codepaths but they are called through the LSM security infrastructure even though they are not technically an LSM or optional. This extends the existing security hooks security_inode_removexattr(), security_inode_killpriv(), security_inode_getsecurity() to pass down the mount's user namespace and makes them aware of idmapped mounts. In order to actually get filesystem capabilities from disk the capability infrastructure exposes the get_vfs_caps_from_disk() helper. For user namespace aware filesystem capabilities a root uid is stored alongside the capabilities. In order to determine whether the caller can make use of the filesystem capability or whether it needs to be ignored it is translated according to the superblock's user namespace. If it can be translated to uid 0 according to that id mapping the caller can use the filesystem capabilities stored on disk. If we are accessing the inode that holds the filesystem capabilities through an idmapped mount we map the root uid according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts: reading filesystem caps from disk enforces that the root uid associated with the filesystem capability must have a mapping in the superblock's user namespace and that the caller is either in the same user namespace or is a descendant of the superblock's user namespace. For filesystems that are mountable inside user namespace the caller can just mount the filesystem and won't usually need to idmap it. If they do want to idmap it they can create an idmapped mount and mark it with a user namespace they created and which is thus a descendant of s_user_ns. For filesystems that are not mountable inside user namespaces the descendant rule is trivially true because the s_user_ns will be the initial user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-11-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Tycho Andersen
|
c7c7a1a18a
|
xattr: handle idmapped mounts
When interacting with extended attributes the vfs verifies that the caller is privileged over the inode with which the extended attribute is associated. For posix access and posix default extended attributes a uid or gid can be stored on-disk. Let the functions handle posix extended attributes on idmapped mounts. If the inode is accessed through an idmapped mount we need to map it according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. This has no effect for e.g. security xattrs since they don't store uids or gids and don't perform permission checks on them like posix acls do. Link: https://lore.kernel.org/r/20210121131959.646623-10-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Tycho Andersen <tycho@tycho.pizza> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
e65ce2a50c
|
acl: handle idmapped mounts
The posix acl permission checking helpers determine whether a caller is privileged over an inode according to the acls associated with the inode. Add helpers that make it possible to handle acls on idmapped mounts. The vfs and the filesystems targeted by this first iteration make use of posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to translate basic posix access and default permissions such as the ACL_USER and ACL_GROUP type according to the initial user namespace (or the superblock's user namespace) to and from the caller's current user namespace. Adapt these two helpers to handle idmapped mounts whereby we either map from or into the mount's user namespace depending on in which direction we're translating. Similarly, cap_convert_nscap() is used by the vfs to translate user namespace and non-user namespace aware filesystem capabilities from the superblock's user namespace to the caller's user namespace. Enable it to handle idmapped mounts by accounting for the mount's user namespace. In addition the fileystems targeted in the first iteration of this patch series make use of the posix_acl_chmod() and, posix_acl_update_mode() helpers. Both helpers perform permission checks on the target inode. Let them handle idmapped mounts. These two helpers are called when posix acls are set by the respective filesystems to handle this case we extend the ->set() method to take an additional user namespace argument to pass the mount's user namespace down. Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
2f221d6f7b
|
attr: handle idmapped mounts
When file attributes are changed most filesystems rely on the setattr_prepare(), setattr_copy(), and notify_change() helpers for initialization and permission checking. Let them handle idmapped mounts. If the inode is accessed through an idmapped mount map it into the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Helpers that perform checks on the ia_uid and ia_gid fields in struct iattr assume that ia_uid and ia_gid are intended values and have already been mapped correctly at the userspace-kernelspace boundary as we already do today. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
21cb47be6f
|
inode: make init and permission helpers idmapped mount aware
The inode_owner_or_capable() helper determines whether the caller is the owner of the inode or is capable with respect to that inode. Allow it to handle idmapped mounts. If the inode is accessed through an idmapped mount it according to the mount's user namespace. Afterwards the checks are identical to non-idmapped mounts. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Similarly, allow the inode_init_owner() helper to handle idmapped mounts. It initializes a new inode on idmapped mounts by mapping the fsuid and fsgid of the caller from the mount's user namespace. If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-7-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
47291baa8d
|
namei: make permission helpers idmapped mount aware
The two helpers inode_permission() and generic_permission() are used by the vfs to perform basic permission checking by verifying that the caller is privileged over an inode. In order to handle idmapped mounts we extend the two helpers with an additional user namespace argument. On idmapped mounts the two helpers will make sure to map the inode according to the mount's user namespace and then peform identical permission checks to inode_permission() and generic_permission(). If the initial user namespace is passed nothing changes so non-idmapped mounts will see identical behavior as before. Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
0558c1bf5a
|
capability: handle idmapped mounts
In order to determine whether a caller holds privilege over a given inode the capability framework exposes the two helpers privileged_wrt_inode_uidgid() and capable_wrt_inode_uidgid(). The former verifies that the inode has a mapping in the caller's user namespace and the latter additionally verifies that the caller has the requested capability in their current user namespace. If the inode is accessed through an idmapped mount map it into the mount's user namespace. Afterwards the checks are identical to non-idmapped inodes. If the initial user namespace is passed all operations are a nop so non-idmapped mounts will not see a change in behavior. Link: https://lore.kernel.org/r/20210121131959.646623-5-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
02f92b3868
|
fs: add file and path permissions helpers
Add two simple helpers to check permissions on a file and path respectively and convert over some callers. It simplifies quite a few codepaths and also reduces the churn in later patches quite a bit. Christoph also correctly points out that this makes codepaths (e.g. ioctls) way easier to follow that would otherwise have to do more complex argument passing than necessary. Link: https://lore.kernel.org/r/20210121131959.646623-4-christian.brauner@ubuntu.com Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
e6c9a71451
|
fs: add id translation helpers
Add simple helpers to make it easy to map kuids into and from idmapped mounts. We provide simple wrappers that filesystems can use to e.g. initialize inodes similar to i_{uid,gid}_read() and i_{uid,gid}_write(). Accessing an inode through an idmapped mount maps the i_uid and i_gid of the inode to the mount's user namespace. If the fsids are used to initialize inodes they are unmapped according to the mount's user namespace. Passing the initial user namespace to these helpers makes them a nop and so any non-idmapped paths will not be impacted. Link: https://lore.kernel.org/r/20210121131959.646623-3-christian.brauner@ubuntu.com Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
a6435940b6
|
mount: attach mappings to mounts
In order to support per-mount idmappings vfsmounts are marked with user namespaces. The idmapping of the user namespace will be used to map the ids of vfs objects when they are accessed through that mount. By default all vfsmounts are marked with the initial user namespace. The initial user namespace is used to indicate that a mount is not idmapped. All operations behave as before. Based on prior discussions we want to attach the whole user namespace and not just a dedicated idmapping struct. This allows us to reuse all the helpers that already exist for dealing with idmappings instead of introducing a whole new range of helpers. In addition, if we decide in the future that we are confident enough to enable unprivileged users to setup idmapped mounts the permission checking can take into account whether the caller is privileged in the user namespace the mount is currently marked with. Later patches enforce that once a mount has been idmapped it can't be remapped. This keeps permission checking and life-cycle management simple. Users wanting to change the idmapped can always create a new detached mount with a different idmapping. Add a new mnt_userns member to vfsmount and two simple helpers to retrieve the mnt_userns from vfsmounts and files. The idea to attach user namespaces to vfsmounts has been floated around in various forms at Linux Plumbers in ~2018 with the original idea tracing back to a discussion in 2017 at a conference in St. Petersburg between Christoph, Tycho, and myself. Link: https://lore.kernel.org/r/20210121131959.646623-2-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Linus Torvalds
|
19c329f680 | Linux 5.11-rc4 | ||
Linus Torvalds
|
e2da783614 |
perf tools fixes for 5.11:
- Fix 'CPU too large' error in Intel PT.
- Correct event attribute sizes in 'perf inject'.
- Sync build_bug.h and kvm.h kernel copies.
- Fix bpf.h header include directive in 5sec.c 'perf trace' bpf example.
- libbpf tests fixes.
- Fix shadow stat 'perf test' for non-bash shells.
- Take cgroups into account for shadow stats in 'perf stat'.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test results:
The first ones are container based builds of tools/perf with and without libelf
support. Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
$ grep "model name" -m1 /proc/cpuinfo
model name: AMD Ryzen 9 3900X 12-Core Processor
# export PERF_TARBALL=http://192.168.86.5/perf/perf-5.11.0-rc3.tar.xz
# dm
1 66.93 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
2 68.65 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
3 73.00 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
4 79.04 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
5 79.71 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
6 82.51 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
7 103.45 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
8 113.86 alpine:3.11 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0)
9 109.31 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
10 113.90 alpine:edge : Ok gcc (Alpine 10.2.0) 10.2.0, Alpine clang version 10.0.1
11 66.76 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1), clang version 3.8.0 (tags/RELEASE_380/final)
12 83.71 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1), clang version 10.0.0
13 80.70 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 9.3.1 20200518 (ALT Sisyphus 9.3.1-alt1), clang version 10.0.1
14 62.75 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
15 97.65 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
16 21.18 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
17 21.07 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
18 25.83 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
19 30.65 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
20 93.44 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), clang version 10.0.1 (Red Hat 10.0.1-1.module_el8.3.0+467+cb298d5b)
21 60.64 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 10.2.1 20201217 releases/gcc-10.2.0-643-g7cbb07d2fc, clang version 10.0.1
22 74.57 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
23 75.40 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
24 72.75 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)
25 72.36 debian:experimental : Ok gcc (Debian 10.2.1-6) 10.2.1 20210110, Debian clang version 11.0.1-2
26 32.35 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
27 28.65 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 10.2.1-3) 10.2.1 20201224
28 13.79 debian:experimental-x-mipsel : FAIL mipsel-linux-gnu-gcc (Debian 10.2.1-3) 10.2.1 20201224
CC /tmp/build/perf/util/map.o
util/map.c: In function 'map__new':
util/map.c:109:5: error: '%s' directive output may be truncated writing between 1 and 2147483645 bytes into a region of size 4096 [-Werror=format-truncation=]
109 | "%s/platforms/%s/arch-%s/usr/lib/%s",
| ^~
In file included from /usr/mipsel-linux-gnu/include/stdio.h:867,
from util/symbol.h:11,
from util/map.c:2:
/usr/mipsel-linux-gnu/include/bits/stdio2.h:67:10: note: '__builtin___snprintf_chk' output 32 or more bytes (assuming 4294967321) into a destination of size 4096
67 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68 | __bos (__s), __fmt, __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29 29.14 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
30 30.66 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final)
31 66.33 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
32 77.51 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
33 25.23 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
34 79.68 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
35 93.09 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
36 94.12 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
37 101.97 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
38 107.51 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
39 111.24 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc30)
40 25.85 fedora:30-x-ARC-uClibc : Ok arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
41 110.61 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 9.0.1 (Fedora 9.0.1-4.fc31)
42 93.78 fedora:32 : Ok gcc (GCC) 10.2.1 20201016 (Red Hat 10.2.1-6), clang version 10.0.1 (Fedora 10.0.1-3.fc32)
43 91.51 fedora:33 : Ok gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9), clang version 11.0.0 (Fedora 11.0.0-2.fc33)
44 92.75 fedora:34 : Ok gcc (GCC) 11.0.0 20210113 (Red Hat 11.0.0-0), clang version 11.0.1 (Fedora 11.0.1-4.fc34)
45 92.33 fedora:rawhide : Ok gcc (GCC) 11.0.0 20210109 (Red Hat 11.0.0-0), clang version 11.0.1 (Fedora 11.0.1-4.fc34)
46 33.58 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0
47 66.03 mageia:5 : Ok gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
48 84.73 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
49 98.35 manjaro:latest : Ok gcc (GCC) 10.2.0, clang version 10.0.1
50 223.15 openmandriva:cooker : Ok gcc (GCC) 10.2.0 20200723 (OpenMandriva), OpenMandriva 11.0.0-1 clang version 11.0.0 (/builddir/build/BUILD/llvm-project-llvmorg-11.0.0/clang 63e22714ac938c6b537bd958f70680d3331a2030)
51 117.30 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407], clang version 5.0.1 (tags/RELEASE_501/final 312548)
52 124.82 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0, clang version 7.0.1 (tags/RELEASE_701/final 349238)
53 113.33 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0, clang version 9.0.1
54 106.17 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553)
55 108.15 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c], clang version 10.0.1
56 25.57 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
57 30.86 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44.0.3)
58 91.75 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.1), clang version 10.0.1 (Red Hat 10.0.1-1.0.1.module+el8.3.0+7827+89335dbf)
59 27.64 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
60 29.65 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
61 75.65 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
62 25.57 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
63 25.52 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
64 25.01 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
65 25.51 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
66 25.70 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
67 24.95 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
68 87.96 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
69 27.40 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
70 27.14 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
71 22.68 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
72 26.52 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
73 28.97 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
74 28.54 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
75 163.57 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
76 24.07 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
77 26.77 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
78 24.00 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
79 69.36 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, clang version 8.0.1-3build1 (tags/RELEASE_801/final)
80 27.07 ubuntu:19.10-x-alpha : Ok alpha-linux-gnu-gcc (Ubuntu 9.2.1-9ubuntu1) 9.2.1 20191008
81 24.29 ubuntu:19.10-x-hppa : Ok hppa-linux-gnu-gcc (Ubuntu 9.2.1-9ubuntu1) 9.2.1 20191008
82 74.99 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, clang version 10.0.0-4ubuntu1
83 30.49 ubuntu:20.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0
84 73.54 ubuntu:20.10 : Ok gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0, Ubuntu clang version 11.0.0-2
$
# uname -a
Linux quaco 5.10.7-100.fc32.x86_64 #1 SMP Tue Jan 12 20:25:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
# git log --oneline -1
|
||
Linus Torvalds
|
a1339d6355 |
powerpc fixes for 5.11 #4
One fix for a lack of alignment in our linker script, that can lead to crashes depending on configuration etc. One fix for the 32-bit VDSO after the C VDSO conversion. Thanks to: Andreas Schwab, Ariel Marcovitch, Christophe Leroy. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmAEDicTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgAMOEAC2jmfH0UwkbDWsKtolps7Gy4hzr1HH PzauZgWHA+eq+j6I7oDsACsVOsVnUGCBsEkOfFYTIlBroVgnTdXlRU3WSsisnTfW sjaQguv3nP01P82CicIVCJJJJFpJENuXcs4Dr02OYP9VMFytWiAr6RvxxCOqozVo dcCg7/04za+v5mR3KRdw2Jf5mlox5kN7wFCFMLlSzadAdUneP+Qt583shEx0KejH IXQOXTp191Q0luFh/2TLz+gzai/A2v16Bk/Q7h3VQ/EQ3V0jpEil6bQXX2UI6on8 dRngTQ4j7gZ5b7QcpqvO2t2otWthGO0YQ/rfI3p1XdpWZNQKFA2I3cXblSqFEhp1 /qI2K5zUiLbRSW4NrgxZ6zIt0PYuxYnrIt7Wwj7nV+79RP+9o9t1VcvUMAaSL5C+ DfQq8GJdsUUUifFzNzq9EeuL2T0RHFooK0xNd00hc43NJjmnhni3TY20UI4r7b8k PmeKJg94Pc4a6PmtGUsOgG53CGENVDTDPCSY7e9XSIAMT0jV0Cbo4+0uwk4s/J/b 1oEROtc8TTq6I47ARc6GZgQ9Wui4C/34uxIuhF7uTTGrWYlMgFcMOkRGUt8CuBrD DLhjA37uqgf+bK2g2heCOQXIjh9JCGc3V7BEB0d545xxv0vjpIPmk9mXwGyxth0N /lCUHrl64VtI/g== =jpfR -----END PGP SIGNATURE----- Merge tag 'powerpc-5.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One fix for a lack of alignment in our linker script, that can lead to crashes depending on configuration etc. One fix for the 32-bit VDSO after the C VDSO conversion. Thanks to Andreas Schwab, Ariel Marcovitch, and Christophe Leroy" * tag 'powerpc-5.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/vdso: Fix clock_gettime_fallback for vdso32 powerpc: Fix alignment bug within the init sections |
||
Linus Torvalds
|
a527a2b32d |
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs fixes from Al Viro: "Several assorted fixes. I still think that audit ->d_name race is better fixed this way for the benefit of backports, with any possibly fancier variants done on top of it" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: dump_common_audit_data(): fix racy accesses to ->d_name iov_iter: fix the uaccess area in copy_compat_iovec_from_user umount(2): move the flag validity checks first |
||
Linus Torvalds
|
feb889fb40 |
mm: don't put pinned pages into the swap cache
So technically there is nothing wrong with adding a pinned page to the
swap cache, but the pinning obviously means that the page can't actually
be free'd right now anyway, so it's a bit pointless.
However, the real problem is not with it being a bit pointless: the real
issue is that after we've added it to the swap cache, we'll try to unmap
the page. That will succeed, because the code in mm/rmap.c doesn't know
or care about pinned pages.
Even the unmapping isn't fatal per se, since the page will stay around
in memory due to the pinning, and we do hold the connection to it using
the swap cache. But when we then touch it next and take a page fault,
the logic in do_swap_page() will map it back into the process as a
possibly read-only page, and we'll then break the page association on
the next COW fault.
Honestly, this issue could have been fixed in any of those other places:
(a) we could refuse to unmap a pinned page (which makes conceptual
sense), or (b) we could make sure to re-map a pinned page writably in
do_swap_page(), or (c) we could just make do_wp_page() not COW the
pinned page (which was what we historically did before that "mm:
do_wp_page() simplification" commit).
But while all of them are equally valid models for breaking this chain,
not putting pinned pages into the swap cache in the first place is the
simplest one by far.
It's also the safest one: the reason why do_wp_page() was changed in the
first place was that getting the "can I re-use this page" wrong is so
fraught with errors. If you do it wrong, you end up with an incorrectly
shared page.
As a result, using "page_maybe_dma_pinned()" in either do_wp_page() or
do_swap_page() would be a serious bug since it is only a (very good)
heuristic. Re-using the page requires a hard black-and-white rule with
no room for ambiguity.
In contrast, saying "this page is very likely dma pinned, so let's not
add it to the swap cache and try to unmap it" is an obviously safe thing
to do, and if the heuristic might very rarely be a false positive, no
harm is done.
Fixes:
|
||
Linus Torvalds
|
0da0a8a0a0 |
SCSI fixes on 20210116
Nine minor fixes, 7 in drivers and 2 in the core SCSI disk driver (sd) which should be harmless involving removing an unused variable and quietening a spurious warning. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYANKJiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVSyAP9a4xdK 9A4seh2/LW3GwPsoQUJQINe4yok/lGVXSQk3XQD/QkkYbzeqVGN4BHK1LQX2R9Sw wy+E4ENjdOY+9p+KMxo= =m0jh -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Nine minor fixes, seven in drivers and two in the core SCSI disk driver (sd) which should be harmless involving removing an unused variable and quietening a spurious warning" Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Remove obsolete variable in sd_remove() scsi: sd: Suppress spurious errors when WRITE SAME is being disabled scsi: scsi_debug: Fix memleak in scsi_debug_init() scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> "compatibility" scsi: qedi: Correct max length of CHAP secret scsi: ufs: Correct the LUN used in eh_device_reset_handler() callback scsi: ufs: Relocate flush of exceptional event scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL scsi: ufs: Fix possible power drain during system suspend |
||
Al Viro
|
d36a1dd9f7 |
dump_common_audit_data(): fix racy accesses to ->d_name
We are not guaranteed the locking environment that would prevent dentry getting renamed right under us. And it's possible for old long name to be freed after rename, leading to UAF here. Cc: stable@kernel.org # v2.6.2+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
Linus Torvalds
|
54c6247d06 |
block-5.11-2021-01-16
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmADKbkQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpgrWD/9gmAaxW3cF5K9UPcfT9VRz50PRfa5cGNwu eQ3ETIkqQtmizibj9BXBYwPWn6FK6gSeI8LjP0SXOUxkG1O23Hwwb6ieQ2tRg+CX Wmb+S7EUwpkHBWWmulfsX9PXQsyG2D4oz8Obxwp0jvI3XzuEeMQ5gYTpO+0YTxi1 VgdeOZfV5Xben0kvpwlUfowdOD6CK8LSPJ4KjuynkvgtfKuipsCmJqMqRZqFMu3f F4p+ngVYuqEcKddh+h5pFifUy2bo076eYe4kE6vGlmdySjaVyHiLJcBmkMDxb/cp daWTaBGCsvZfP0uwLgATD0P0UqJn44Vx15FlN50uSlrfaEiglO6aBuWwbQpuPpq/ FF/hhUscs9cmZhYGO8TNnOZHfzNmYmPx9dH/D+Fo6mqCaGvsOyGGUeKuG86hfLlM AEVGFiVdlQNdLq7f2HHFKfWL1XvHbvGrgQ5d4dnJrD9a/1TgXTiluleC1+vvLfoP cRTATNtfMTcc88QF+L1BxtHXkUsf6dyqPt1AalmKqwHDv9d1C/B0aZt6a/JFTXp4 utsxDi+WOb4HVT4/ojl35HML6CWATqVgHH1rSG2Q8UPWvLX+O8QalRIBhtNaRt9k hr3joKlmeyCjJdrgE+07xlwfiqTmmhmTkGjJw39KyQTiOyyEPey+HWwJEhO1QaFR hgoQvLA4mw== =CYyj -----END PGP SIGNATURE----- Merge tag 'block-5.11-2021-01-16' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Just an nvme pull request via Christoph: - don't initialize hwmon for discover controllers (Sagi Grimberg) - fix iov_iter handling in nvme-tcp (Sagi Grimberg) - fix a preempt warning in nvme-tcp (Sagi Grimberg) - fix a possible NULL pointer dereference in nvme (Israel Rukshin)" * tag 'block-5.11-2021-01-16' of git://git.kernel.dk/linux-block: nvme: don't intialize hwmon for discovery controllers nvme-tcp: fix possible data corruption with bio merges nvme-tcp: Fix warning with CONFIG_DEBUG_PREEMPT nvmet-rdma: Fix NULL deref when setting pi_enable and traddr INADDR_ANY |
||
Linus Torvalds
|
11c0239ae2 |
io_uring-5.11-2021-01-16
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmADKaIQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpnjVD/402IBhG40EoLMdcztTbOF7S5JZLbECbWam HPN99D0YyjRQNwsd7CPzeO+7W4zO4P0T/NcnXLF1TmPnkRynDNsT4oygcLrJoH8n xZoAXi6ykM/4eE+4NSaNfWF/f3DtGepsIHMpSNNyzE+qxz4I36/EPTu7M13ga1A4 H6pAbvU1DrDZ+z62rhWcb+4ouUV7QDS02UTNo9cW4ueeMNLyW/Zxp9lBzNuAkbzF gRlsxkhXLgcqtA36EdRWKs9fOwcdTyiMGzS5DRRmmE/O/a3//yl0ENFTPmWd6+jf daZcAYW6BdB+0va1sYGbECl9zOkMlhcm/hNFnl3ZmFIW5a4QiRy2WLoWuiNwVSYM nGCz0l5P2pP32tt0fQlSQh8KdI+py+c7MhJ1IaRhJi8X4t1MHoOR3vlilw0UvrdQ zpm8H+hMmL3g9/6pZLwxBvd31N1Y/A8ghh6qTdQ2EwdPyjImR3Xv/wYUqdbfvBo5 VcB6ccoFwuP5KzQcyI2j2WRcx7AtHZuiVGvoqkA8QL9Onq3GWoY8tnkf/sP8RxlQ wL91wVkT46UMkCLeY7QCGdx4nnmSahCEa8SQEYxibnh++9jh+c6Y5FtZc3YNjnED GRFT9vJmqNmrvKCLo78iTf/F8vJllGXQRKDeMuU9DVj3ylhZOBSEhpmMIZwdXNQa CqMCqia4Xg== =XL14 -----END PGP SIGNATURE----- Merge tag 'io_uring-5.11-2021-01-16' of git://git.kernel.dk/linux-block Pull io_uring fixes from Jens Axboe: "We still have a pending fix for a cancelation issue, but it's still being investigated. In the meantime: - Dead mm handling fix (Pavel) - SQPOLL setup error handling (Pavel) - Flush timeout sequence fix (Marcelo) - Missing finish_wait() for one exit case" * tag 'io_uring-5.11-2021-01-16' of git://git.kernel.dk/linux-block: io_uring: ensure finish_wait() is always called in __io_uring_task_cancel() io_uring: flush timeouts that should already have expired io_uring: do sqo disable on install_fd error io_uring: fix null-deref in io_disable_sqo_submit io_uring: don't take files/mm for a dead task io_uring: drop mm and files after task_work_run |
||
Linus Torvalds
|
acda701bf1 |
RISC-V Fixes for 5.11-rc4
There are a few more fixes than a normal rc4, largely due to the bubble introduced by the holiday break: * A fix to return -ENOSYS for syscall number -1, which previously returned an uninitialized value. * A fix to time_init() to ensure of_clk_init() has been called, without which clock drivers may not be initialized. * A fix to the sifive,uart0 driver to properly display the baud rate. A fix to initialize MPIE that allows interrupts to be processed during system calls. * A fix to avoid erronously begin tracing IRQs when interrupts are disabled, which at least triggers suprious lockdep failures. * A workaround for a warning related to calling smp_processor_id() while preemptible. The warning itself is suprious on currently availiable systems. * A fix to properly include the generic time VDSO calls. A fix to our kasan address mapping. A fix to the HiFive Unleashed device tree, which allows the Ethernet PHY to be properly initialized by Linux (as opposed to relying on the bootloader). * A defconfig update to include SiFive's GPIO driver, which is present on the HiFive Unleashed and necessary to initialize the PHY. * A fix to avoid allocating memory while initializing reserved memory. * A fix to avoid allocating the last 4K of memory, as pointers there alias with syscall errors. There are also two cleanups that should have no functional effect but do fix build warnings: * A cleanup to drop a duplicated definition of PAGE_KERNEL_EXEC. * A cleanup to properly declare the asm register SP shim. * A cleanup to the rv32 memory size Kconfig entry, to reflect the actual size of memory availiable. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmACfN8THHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYiWkQD/40g0dX0dX7fkGHk/zbwlkJffA/VCF7 +dtqfJSjpRkfPCBEOCJxfCYdHM0adrH2FACwN2JhQmOzOpbccO6Xhr/9jgqms43x WB5J0WNKrIT2ZGpB6Hp5htD2lo2OuCLNtdnMK0I3NJMBhJHUKb4HGMwFXKBg+RK1 LtOQMKEJSgkwT6Mhe5K8+tz5VLu6XQUJMCAzxoeFoTDi513HDDcAoOcvfKt0VujD 1g2HDtIhYlIYUiGliBcFh88slycWh8Im41Hi2Dh5Nfe9PaiexskpL1+g+CCnZUpa yIi72AOirjNxxvfPfmAcTXRx4lHB/YQnB9EUnI6W16D7XniXedC7StR+zhzRjkuy lePMKJE6x8mTMF5zv2GLP1FalBeJ2SFwLtboT7zfRlhI+IucMqCaSsE06RHg5/52 YL9ScVi0xS423f3ctGNUrlSKQKdV1s5D7pm6Qf2WZlqUvwKTspOvkzDn8hNpINCe 0D3EvPDWVoRSWx4u0Ao/Ig2+xwumtun1IPmF7WHqW+aSxVTs0g0XiEiA8ubaU9bD JpqfCL9ftzD5S85igHp8+a3zuFmK+sgR/7JCdp7TqBoHmLoQjg8c5L56m1cZy8Mn Wu6E1X0n0IyDDjDqWbGTkfkKC6CvbUy6wRW3LdEnhufon3ldiPMTB66gLs4qfcqF Yq5WkrJyGK98Ww== =guLu -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "There are a few more fixes than a normal rc4, largely due to the bubble introduced by the holiday break: - return -ENOSYS for syscall number -1, which previously returned an uninitialized value. - ensure of_clk_init() has been called in time_init(), without which clock drivers may not be initialized. - fix sifive,uart0 driver to properly display the baud rate. A fix to initialize MPIE that allows interrupts to be processed during system calls. - avoid erronously begin tracing IRQs when interrupts are disabled, which at least triggers suprious lockdep failures. - workaround for a warning related to calling smp_processor_id() while preemptible. The warning itself is suprious on currently availiable systems. - properly include the generic time VDSO calls. A fix to our kasan address mapping. A fix to the HiFive Unleashed device tree, which allows the Ethernet PHY to be properly initialized by Linux (as opposed to relying on the bootloader). - defconfig update to include SiFive's GPIO driver, which is present on the HiFive Unleashed and necessary to initialize the PHY. - avoid allocating memory while initializing reserved memory. - avoid allocating the last 4K of memory, as pointers there alias with syscall errors. There are also two cleanups that should have no functional effect but do fix build warnings: - drop a duplicated definition of PAGE_KERNEL_EXEC. - properly declare the asm register SP shim. - cleanup the rv32 memory size Kconfig entry, to reflect the actual size of memory availiable" * tag 'riscv-for-linus-5.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Fix maximum allowed phsyical memory for RV32 RISC-V: Set current memblock limit RISC-V: Do not allocate memblock while iterating reserved memblocks riscv: stacktrace: Move register keyword to beginning of declaration riscv: defconfig: enable gpio support for HiFive Unleashed dts: phy: add GPIO number and active state used for phy reset dts: phy: fix missing mdio device and probe failure of vsc8541-01 device riscv: Fix KASAN memory mapping. riscv: Fixup CONFIG_GENERIC_TIME_VSYSCALL riscv: cacheinfo: Fix using smp_processor_id() in preemptible riscv: Trace irq on only interrupt is enabled riscv: Drop a duplicated PAGE_KERNEL_EXEC riscv: Enable interrupts during syscalls with M-Mode riscv: Fix sifive serial driver riscv: Fix kernel time_init() riscv: return -ENOSYS for syscall -1 |
||
Linus Torvalds
|
9348b73c2e |
mm: don't play games with pinned pages in clear_page_refs
Turning a pinned page read-only breaks the pinning after COW. Don't do it. The whole "track page soft dirty" state doesn't work with pinned pages anyway, since the page might be dirtied by the pinning entity without ever being noticed in the page tables. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
29a951dfb3 |
mm: fix clear_refs_write locking
Turning page table entries read-only requires the mmap_sem held for writing. So stop doing the odd games with turning things from read locks to write locks and back. Just get the write lock. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Atish Patra
|
e557793799
|
RISC-V: Fix maximum allowed phsyical memory for RV32
Linux kernel can only map 1GB of address space for RV32 as the page offset is set to 0xC0000000. The current description in the Kconfig is confusing as it indicates that RV32 can support 2GB of physical memory. That is simply not true for current kernel. In future, a 2GB split support can be added to allow 2GB physical address space. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> |
||
Atish Patra
|
abb8e86b26
|
RISC-V: Set current memblock limit
Currently, linux kernel can not use last 4k bytes of addressable space because IS_ERR_VALUE macro treats those as an error. This will be an issue for RV32 as any memblock allocator potentially allocate chunk of memory from the end of DRAM (2GB) leading bad address error even though the address was technically valid. Fix this issue by limiting the memblock if available memory spans the entire address space. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> |
||
Atish Patra
|
797f0375dd
|
RISC-V: Do not allocate memblock while iterating reserved memblocks
Currently, resource tree allocates memory blocks while iterating on the
list. It leads to following kernel warning because memblock allocation
also invokes memory block reservation API.
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/resource.c:795
__insert_resource+0x8e/0xd0
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted
5.10.0-00022-ge20097fb37e2-dirty #549
[ 0.000000] epc: c00125c2 ra : c001262c sp : c1c01f50
[ 0.000000] gp : c1d456e0 tp : c1c0a980 t0 : ffffcf20
[ 0.000000] t1 : 00000000 t2 : 00000000 s0 : c1c01f60
[ 0.000000] s1 : ffffcf00 a0 : ffffff00 a1 : c1c0c0c4
[ 0.000000] a2 : 80c12b15 a3 : 80402000 a4 : 80402000
[ 0.000000] a5 : c1c0c0c4 a6 : 80c12b15 a7 : f5faf600
[ 0.000000] s2 : c1c0c0c4 s3 : c1c0e000 s4 : c1009a80
[ 0.000000] s5 : c1c0c000 s6 : c1d48000 s7 : c1613b4c
[ 0.000000] s8 : 00000fff s9 : 80000200 s10: c1613b40
[ 0.000000] s11: 00000000 t3 : c1d4a000 t4 : ffffffff
This is also unnecessary as we can pre-compute the total memblocks required
for each memory region and allocate it before the loop. It save precious
boot time not going through memblock allocation code every time.
Fixes:
|
||
Christoph Hellwig
|
a959a9782f |
iov_iter: fix the uaccess area in copy_compat_iovec_from_user
sizeof needs to be called on the compat pointer, not the native one.
Fixes:
|
||
Linus Torvalds
|
1d94330a43 |
- Fix DM-raid's raid1 discard limits so discards work.
- Select missing Kconfig dependencies for DM integrity and zoned targets. - 4 fixes for DM crypt target's support to optionally bypass kcryptd workqueues. - Fix DM snapshot merge supports missing data flushes before committing metadata. - Fix DM integrity data device flushing when external metadata is used. - Fix DM integrity's maximum number of supported constructor arguments that user can request when creating an integrity device. - Eliminate DM core ioctl logging noise when an ioctl is issued without required CAP_SYS_RAWIO permission. -----BEGIN PGP SIGNATURE----- iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmACJwsTHHNuaXR6ZXJA cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWr93B/sFpKbCjRxZhGdOz8u2UIV9jGoSzAaZ 3qdSTruCp1HMTsz0J/+bFBTMQKKap+8u8kzhCOLAqtZgEz13P0D3NyDUGiCxj07O vPZJQ3COrClknq5byKws/aOKcnbr25odk6Zq+gFmk47pdNFBMkzNMf0IbdVJsj19 d0tnTRkIiycC8hnXOr/tHufnmERrw9nZpq3cLEHT6vaXc4FxcYXbo4UEMIwKjgcB AnsRrqpqKyayG+G/SEs01JM9eYznWRPqf3L03wTWfieAsFjDoIp/ek8iXWjEwLtZ BnhxFwodWvycFb9ITaxlFgfyiBSgU1f3tsIYjU4MHCSFI6u5b83y57Dl =FnV7 -----END PGP SIGNATURE----- Merge tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM-raid's raid1 discard limits so discards work. - Select missing Kconfig dependencies for DM integrity and zoned targets. - Four fixes for DM crypt target's support to optionally bypass kcryptd workqueues. - Fix DM snapshot merge supports missing data flushes before committing metadata. - Fix DM integrity data device flushing when external metadata is used. - Fix DM integrity's maximum number of supported constructor arguments that user can request when creating an integrity device. - Eliminate DM core ioctl logging noise when an ioctl is issued without required CAP_SYS_RAWIO permission. * tag 'for-5.11/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm crypt: defer decryption to a tasklet if interrupts disabled dm integrity: fix the maximum number of arguments dm crypt: do not call bio_endio() from the dm-crypt tasklet dm integrity: fix flush with external metadata device dm: eliminate potential source of excessive kernel log noise dm snapshot: flush merged data before committing metadata dm crypt: use GFP_ATOMIC when allocating crypto requests from softirq dm crypt: do not wait for backlogged crypto request completion in softirq dm zoned: select CONFIG_CRC32 dm integrity: select CRYPTO_SKCIPHER dm raid: fix discard limits for raid1 |
||
Linus Torvalds
|
b45e2da6e4 |
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton: "10 patches. Subsystems affected by this patch series: MAINTAINERS and mm (slub, pagealloc, memcg, kasan, vmalloc, migration, hugetlb, memory-failure, and process_vm_access)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/process_vm_access.c: include compat.h mm,hwpoison: fix printing of page flags MAINTAINERS: add Vlastimil as slab allocators maintainer mm/hugetlb: fix potential missing huge page size info mm: migrate: initialize err in do_migrate_pages mm/vmalloc.c: fix potential memory leak arm/kasan: fix the array size of kasan_early_shadow_pte[] mm/memcontrol: fix warning in mem_cgroup_page_lruvec() mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint mm, slub: consider rest of partial list if acquire_slab() fails |
||
Linus Torvalds
|
8cbe71e7e0 |
RDMA 5.11 first RC pull request
Several bug fixes. - Fix a ucma memory leak introduced in v5.9 while fixing the Syzkaller bugs - Don't fail when the xarray wraps for user verbs objects - User triggerable oops regression from the umem page size rework - Error unwind bugs in usnic, ocrdma, mlx5 and cma -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmAAkKEACgkQOG33FX4g mxrsdw//VojAS7oIPqpVlmWY3p8EAs/SsZi4ToDQKi+e92m5WvIT0Q72tcbKZF4K MFX23Eopi26hjXYEvF0+GGh4EIy4A80DJdftgEaB9ecq+wh7XSd4Ug3ZGy9O8H1F RIj9HVK/F5VKY6WhXHI89uBgLY/ACt14oxXAXxQZzvy4lADuGdTR7KAjkgY/L/Kx NmTQUMcfiTibnB8FADOCst31VLz4u+Uyk+WJa2m+KVQYTxrNgNKA9Ba7SC41W/5J +x8/l020vfJHpzeN52diTTvA66D6IcPRazGb22RzDy6uzkSbFFoHLJxTq//2Ez4A rSeTuUPjLPxZ0LqEyi2sc6RkMEz4milwgXbLBHq+ni3M08bs77bV5hwA2MkJ7+jP Gv8QdN9v47odk+tNAiP2SjC/w7TpEvM+37NIz0G3vm7uVFnwWBfLGIN+Fcrun0uS Ap9+V5L/lQSUprsWnroojZ23O7NyIv4a+Ig2OUzT+B+9UkYtAjmjnPRq+7Ikcv1k ia7vtqQaZwQ6z9hnqSYUxGYInMTUxRQzzW7DSDIpSlhKYSUYfmwUuttgpbmTdNTJ Kt7okIMusTPHZDPYestCX+iVpHIiKO0lyTdFA0z5UY9shGJlWuYr9tUt9bFHmH9/ 4/v14Vn8CCNol4JABRPi01j27VGKkhs19eLg2MzD5f/m8ZWSOqM= =8eDD -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "A fairly modest set of bug fixes, nothing abnormal from the merge window The ucma patch is a bit on the larger side, but given the regression was recently added I've opted to forward it to the rc stream. - Fix a ucma memory leak introduced in v5.9 while fixing the Syzkaller bugs - Don't fail when the xarray wraps for user verbs objects - User triggerable oops regression from the umem page size rework - Error unwind bugs in usnic, ocrdma, mlx5 and cma" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/cma: Fix error flow in default_roce_mode_store RDMA/mlx5: Fix wrong free of blue flame register on error IB/mlx5: Fix error unwinding when set_has_smi_cap fails RDMA/umem: Avoid undefined behavior of rounddown_pow_of_two() RDMA/ocrdma: Fix use after free in ocrdma_dealloc_ucontext_pd() RDMA/usnic: Fix memleak in find_free_vf_and_create_qp_grp RDMA/restrack: Don't treat as an error allocation ID wrapping RDMA/ucma: Do not miss ctx destruction steps in some cases |
||
Jens Axboe
|
a8d13dbccb |
io_uring: ensure finish_wait() is always called in __io_uring_task_cancel()
If we enter with requests pending and performm cancelations, we'll have a different inflight count before and after calling prepare_to_wait(). This causes the loop to restart. If we actually ended up canceling everything, or everything completed in-between, then we'll break out of the loop without calling finish_wait() on the waitqueue. This can trigger a warning on exit_signals(), as we leave the task state in TASK_UNINTERRUPTIBLE. Put a finish_wait() after the loop to catch that case. Cc: stable@vger.kernel.org # 5.9+ Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Linus Torvalds
|
0bc9bc1d8b |
A number of bug fixes for ext4:
* For the new fast_commit feature * Fix some error handling codepaths in whiteout handling and mountpoint sampling * Fix how we write ext4_error information so it goes through the journal when journalling is active, to avoid races that can lead to lost error information, superblock checksum failures, or DIF/DIX features. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmAB8eMACgkQ8vlZVpUN gaMUxAf+MW22dceTto2RO0ox9OEBNoZDFiVnlEuUaIOxkqOlovIWaqX7wwuF/121 +FaNeDVzqNSS/QjQSB5lHF5OfHCD2u1Ef/bGzCm9cQyeN2/n0sCsStfPCcyLHy/0 4R8PsjF0xhhbCETLcAc0U/YBFEoqSn1i7DG5nnpx63Wt1S/SSMmTAXzafWbzisEZ XNsz3CEPCDDSmSzOt3qMMHxkSoOZhYcLe7fCoKkhZ2pvTyrQsHrne6NNLtxc+sDL AcKkaI0EWFiFRhebowQO/5ouq6nnGKLCsukuZN9//Br8ht5gNcFpuKNVFl+LOiM6 ud4H3qcRokcdPPAn3uwI0AJKFXqLvg== =Dgdj -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "A number of bug fixes for ext4: - Fix for the new fast_commit feature - Fix some error handling codepaths in whiteout handling and mountpoint sampling - Fix how we write ext4_error information so it goes through the journal when journalling is active, to avoid races that can lead to lost error information, superblock checksum failures, or DIF/DIX features" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: remove expensive flush on fast commit ext4: fix bug for rename with RENAME_WHITEOUT ext4: fix wrong list_splice in ext4_fc_cleanup ext4: use IS_ERR instead of IS_ERR_OR_NULL and set inode null when IS_ERR ext4: don't leak old mountpoint samples ext4: drop ext4_handle_dirty_super() ext4: fix superblock checksum failure when setting password salt ext4: use sbi instead of EXT4_SB(sb) in ext4_update_super() ext4: save error info to sb through journal if available ext4: protect superblock modifications with a buffer lock ext4: drop sync argument of ext4_commit_super() ext4: combine ext4_handle_error() and save_error_info() |
||
Linus Torvalds
|
7cd3c41261 |
2 small cifs fixes for stable (including an important handle leak fix) and 3 small cleanup patches
-----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmABLeEACgkQiiy9cAdy T1G5VAv/ZqpBJp6A/SOFGKlSBVTUqf++AWo/DsTZQLx938j6URChazImFb3eJ3FU WvqeUW9kza1OICVg0Wf3II2RCMDwJBqxfSCsj5qK53g9By4X1tsiatQZd9rWfBSc AW8nw5hzjrZ2e3//4gieSXZOHfT6n80hteAlIBNWvjN7w6R1DvZ7cIMkS6+M4EEn RBfcQlUTAihxFC8/RUqXMK8+jIXvt3hiRdI4gYV5GuAD8C7CvLKZJc0n6WZ0hPR5 IPlrY5j2J8yB+R+FkJ1Qjpzz7KrNCpkOyB0DPoHHlqRAT46oM8yfGGf2wlVr6v3A dG0DPQ5JAqB3vK9/3frJencXYRDCgLc4ZVwdFGrvv/wNZd29KBKiigfWIrxjnM13 tcN9Noy5mX791YvdYKpvijTSJqWslB04gLy2HuEkOAyOKo2sqiEtemB4bRPeJr1E cmheItvkOhNY13KZKyrTgnUy61H9eLpjhS+CJZl3dArhaPNtI+4OGuAv+BIQQNNf FbNvJWmQ =7ueC -----END PGP SIGNATURE----- Merge tag '5.11-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Two small cifs fixes for stable (including an important handle leak fix) and three small cleanup patches" * tag '5.11-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6: cifs: style: replace one-element array with flexible-array cifs: connect: style: Simplify bool comparison fs: cifs: remove unneeded variable in smb3_fs_context_dup cifs: fix interrupted close commands cifs: check pointer before freeing |
||
Linus Torvalds
|
82821be8a2 |
arm64 fixes:
- Set the minimum GCC version to 5.1 for arm64 due to earlier compiler bugs. - Make atomic helpers __always_inline to avoid a section mismatch when compiling with clang. - Fix the CMA and crashkernel reservations to use ZONE_DMA (remove the arm64_dma32_phys_limit variable, no longer needed with a dynamic ZONE_DMA sizing in 5.11). - Remove redundant IRQ flag tracing that was leaving lockdep inconsistent with the hardware state. - Revert perf events based hard lockup detector that was causing smp_processor_id() to be called in preemptible context. - Some trivial cleanups - spelling fix, renaming S_FRAME_SIZE to PT_REGS_SIZE, function prototypes added. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmAB2zoACgkQa9axLQDI XvE2vA//Vjh7bKMlNtocP7oA0/FVA7i9tKgNxDYmxjYvl6qDg26V7aDMelIi9H6l 14k+Wbf2Eqkav3+aAGEwdXuaoYPGrZIfVkPf+BbuviluoGsjwaGak0pc29ofDjE4 zaznZNXwO+joEqrtEZeTQO8fSagAupbgqf3ls/pRjBGVL6XEajhPqA+ccgQB71DI O+L0Z1tzQDurABp4mwHGFRbOTMdN59OhxfyHijO3yu+RGLQcO3C29AwuMkzJuYuA Fjng+VNS4mrKT8bsP0fpJa/oNekJ4jc/2OQaLxN+re8J+o6/EG6QGKdUL4VlEllk eK0chdC/ZD1e6R9MSV0ZL1diYedi0vn+F9mGxwNiSKtWzw0KqPEy7liP0KWQGVyF NALShoGkKMglYj2fmOrZhs7E4vAQGPlk7hROssDTM1RSu/7JpBwEJRb9QsOM4p4T HgcCoF4smnTCmbyVkcMYgZxMrJ5YOjchTu8uvUwHy6D//ZMQmDE2m3u9Svziu+y3 Nk8VpIp0HNDZyA7ZTeOrmo2jSEOoK3tDVKiqorPSmZd5mp35BMCr1q/Rcu6uaywr 4ym/CfmvQIqObzQOYbBze6QZs4DLqERP1p0WgEnWBE8W2rP4UcGaNdegbEsMkrfq CiCGcNyfpiJNZSc8VF1/1OFY/yABcWem1pVM7F254zA5G2wZX1U= =W9th -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Set the minimum GCC version to 5.1 for arm64 due to earlier compiler bugs. - Make atomic helpers __always_inline to avoid a section mismatch when compiling with clang. - Fix the CMA and crashkernel reservations to use ZONE_DMA (remove the arm64_dma32_phys_limit variable, no longer needed with a dynamic ZONE_DMA sizing in 5.11). - Remove redundant IRQ flag tracing that was leaving lockdep inconsistent with the hardware state. - Revert perf events based hard lockup detector that was causing smp_processor_id() to be called in preemptible context. - Some trivial cleanups - spelling fix, renaming S_FRAME_SIZE to PT_REGS_SIZE, function prototypes added. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: selftests: Fix spelling of 'Mismatch' arm64: syscall: include prototype for EL0 SVC functions compiler.h: Raise minimum version of GCC to 5.1 for arm64 arm64: make atomic helpers __always_inline arm64: rename S_FRAME_SIZE to PT_REGS_SIZE Revert "arm64: Enable perf events based hard lockup detector" arm64: entry: remove redundant IRQ flag tracing arm64: Remove arm64_dma32_phys_limit and its uses |
||
Linus Torvalds
|
f288c89562 |
- fix coredumps on 64bit kernels
- fix for alignment bugs preventing booting - fix checking for failed irq_alloc_desc calls -----BEGIN PGP SIGNATURE----- iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmABo3MaHHRzYm9nZW5k QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHDNrRAAgDNLFY5SYFsVUTJm/Lad LfKWUmeMTeFGpmE6a4gdNt1HTyO/aPcpQOmlm5sle2R3UmoWfs0pCsN67m7EQITQ z3ZKx4bUS/VsMUqJQW9/eBgsAvQFhZH1gPtATigdkpF9l/IDKaQ4oQCauU80bLbg fpoX8wMS7UVJskHMUXw/Y67JWjeivhOet6Dg1/3zM55RF25d+O9HM0mNRTfGKDdI GBnPTbLj8f74E/mq9khdfVI9Au7SUc22UtDxrHdJzegBa9aULen69I+43LBWK7QK fqbxxD3YmTpEzgTs7fbIKd76j4qQ5b6gLPeOmDa62GJu1TCNi5rWXQ2bN3RDd8zP tGQOWMPgqq3UnT//YhNdw7M2+a3j1naXRWMFo0dDENkxWFg285bj8hjmcBqPhSVP dz9ycRdS38bp4jD0vnnkHaMSVryS9Q2TJ29kKI5CT85DcWyrShh+gjtH9DFsDLNa o04ED+f4RqMMA49+eBHSp0extZ/958OpvWG6lq5k7wgDPGVFLg1fvwFhLD17G3SO DTfj2vlL64zucDQDFle08LyNeTM5xGy0EL+vvEWCqxY5Q48XtfIxjg6vQKTxFVba BqwENIf0fgMBtG5YENbpkwUd9ZRpv+au0Br6+nlwYdbodK4Da/RpfgrAY/7ofMuX mS6chzvsewxKBjAxKpWWSBc= =Vkor -----END PGP SIGNATURE----- Merge tag 'mips_fixes_5.11.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fix coredumps on 64bit kernels - fix for alignment bugs preventing booting - fix checking for failed irq_alloc_desc calls * tag 'mips_fixes_5.11.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: OCTEON: fix unreachable code in octeon_irq_init_ciu MIPS: relocatable: fix possible boot hangup with KASLR enabled MIPS: Fix malformed NT_FILE and NT_SIGINFO in 32bit coredumps MIPS: boot: Fix unaligned access with CONFIG_MIPS_RAW_APPENDED_DTB |
||
Al Grant
|
648b054a46 |
perf inject: Correct event attribute sizes
When 'perf inject' reads a perf.data file from an older version of perf, it writes event attributes into the output with the original size field, but lays them out as if they had the size currently used. Readers see a corrupt file. Update the size field to match the layout. Signed-off-by: Al Grant <al.grant@foss.arm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20201124195818.30603-1-al.grant@arm.com Signed-off-by: Denis Nikitin <denik@chromium.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
Adrian Hunter
|
5501e9229a |
perf intel-pt: Fix 'CPU too large' error
In some cases, the number of cpus (nr_cpus_online) is confused with the maximum cpu number (nr_cpus_avail), which results in the error in the example below: Example on system with 8 cpus: Before: # echo 0 > /sys/devices/system/cpu/cpu2/online # ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.147 MB perf.data ] # ./perf script --itrace=e Requested CPU 7 too large. Consider raising MAX_NR_CPUS 0x25908 [0x8]: failed to process type: 68 [Invalid argument] After: # ./perf script --itrace=e # Fixes: |