linux

Author	SHA1	Message	Date
Ronnie Sahlberg	e77fe73c7e	cifs: we can not use small padding iovs together with encryption We can not append small padding buffers as separate iovs when encryption is used. For this case we must flatten the request into a single buffer containing both the data from all the iovs as well as the padding bytes. This is at least needed for 4.20 as well due to compounding changes. CC: Stable <stable@vger.kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-31 00:58:52 -06:00
Steve French	14e92c5dc7	cifs: Minor Kconfig clarification Clarify the use of the CONFIG_DFS_UPCALL for DNS name resolution when server ip addresses change (e.g. on long running mounts) Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:13:11 -06:00
Paulo Alcantara	28eb24ff75	cifs: Always resolve hostname before reconnecting In case a hostname resolves to a different IP address (e.g. long running mounts), make sure to resolve it every time prior to calling generic_ip_connect() in reconnect. Suggested-by: Steve French <stfrench@microsoft.com> Signed-off-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:13:11 -06:00
Paulo Alcantara	0874401549	cifs: Add support for failover in cifs_reconnect_tcon() After a successful failover, the cifs_reconnect_tcon() function will make sure to reconnect every tcon to new target server. Same as previous commit but for SMB1 codepath. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:13:11 -06:00
Paulo Alcantara	a3a53b7603	cifs: Add support for failover in smb2_reconnect() After a successful failover in cifs_reconnect(), the smb2_reconnect() function will make sure to reconnect every tcon to new target server. For SMB2+. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:13:11 -06:00
Paulo Alcantara	2332440714	cifs: Only free DFS target list if we actually got one Fix potential NULL ptr deref when DFS target list is empty. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:13:11 -06:00
Paulo Alcantara	e511d31753	cifs: start DFS cache refresher in cifs_mount() Start the DFS cache refresh worker per volume during cifs mount. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:13:11 -06:00
YueHaibing	2f0a617448	cifs: Use GFP_ATOMIC when a lock is held in cifs_mount() A spin lock is held before kstrndup, it may sleep with holding the spinlock, so we should use GFP_ATOMIC instead. Fixes: e58c31d5e387 ("cifs: Add support for failover in cifs_reconnect()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Paulo Alcantara <palcantara@suse.de>	2018-12-28 10:13:11 -06:00
Paulo Alcantara	93d5cb517d	cifs: Add support for failover in cifs_reconnect() After failing to reconnect to original target, it will retry any target available from DFS cache. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:13:11 -06:00
Paulo Alcantara	4a367dc044	cifs: Add support for failover in cifs_mount() This patch adds support for failover when failing to connect in cifs_mount(). Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:10:29 -06:00
YueHaibing	5a650501eb	cifs: remove set but not used variable 'sep' Fixes gcc '-Wunused-but-set-variable' warning: fs/cifs/cifs_dfs_ref.c: In function 'cifs_dfs_do_automount': fs/cifs/cifs_dfs_ref.c:309:7: warning: variable 'sep' set but not used [-Wunused-but-set-variable] It never used since introdution in commit 0f56b277073c ("cifs: Make use of DFS cache to get new DFS referrals") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Paulo Alcantara <palcantara@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:09:46 -06:00
Paulo Alcantara	1c780228e9	cifs: Make use of DFS cache to get new DFS referrals This patch will make use of DFS cache routines where appropriate and do not always request a new referral from server. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:09:46 -06:00
Steve French	e8bcdfdbf9	cifs: minor updates to documentation Update cifs "TODO" file. Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:09:46 -06:00
Joe Perches	0544b324e6	cifs: check kzalloc return kzalloc can return NULL so an additional check is needed. While there is a check for ret_buf there is no check for the allocation of ret_buf->crfid.fid - this check is thus added. Both call-sites of tconInfoAlloc() check for NULL return of tconInfoAlloc() so returning NULL on failure of kzalloc() here seems appropriate. As the kzalloc() is the only thing here that can fail it is moved to the beginning so as not to initialize other resources on failure of kzalloc. Fixes: `3d4ef9a153` ("smb3: fix redundant opens on root") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:09:46 -06:00
YueHaibing	29cbfa1b2b	cifs: remove set but not used variable 'server' Fixes gcc '-Wunused-but-set-variable' warning: fs/cifs/smb2pdu.c: In function 'smb311_posix_mkdir': fs/cifs/smb2pdu.c:2040:26: warning: variable 'server' set but not used [-Wunused-but-set-variable] fs/cifs/smb2pdu.c: In function 'build_qfs_info_req': fs/cifs/smb2pdu.c:4067:26: warning: variable 'server' set but not used [-Wunused-but-set-variable] The first 'server' never used since commit `bea851b8ba` ("smb3: Fix mode on mkdir on smb311 mounts") And the second not used since commit `1fc6ad2f10` ("cifs: remove header_preamble_size where it is always 0") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:09:46 -06:00
Dan Carpenter	34bca9bbe7	cifs: Use kzfree() to free password We should zero out the password before we free it. Fixes: 3d6cacbb5310 ("cifs: Add DFS cache routines") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Paulo Alcantara <palcantara@suse.de>	2018-12-28 10:09:46 -06:00
Wei Yongjun	3e80be0158	cifs: Fix to use kmem_cache_free() instead of kfree() memory allocated by kmem_cache_alloc() in alloc_cache_entry() should be freed using kmem_cache_free(), not kfree(). Fixes: 34a44fb160f9 ("cifs: Add DFS cache routines") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com>	2018-12-28 10:09:46 -06:00
Stephen Rothwell	54e4f73cbe	cifs: update for current_kernel_time64() removal Fixes cifs build failure after merge of the y2038 tree After merging the y2038 tree, today's linux-next build (x86_64 allmodconfig) failed like this: fs/cifs/dfs_cache.c: In function 'cache_entry_expired': fs/cifs/dfs_cache.c:106:7: error: implicit declaration of function 'current_kernel_time64'; did you mean 'core_kernel_text'? [-Werror=implicit-function-declaration] ts = current_kernel_time64(); ^~~~~~~~~~~~~~~~~~~~~ core_kernel_text fs/cifs/dfs_cache.c:106:5: error: incompatible types when assigning to type 'struct timespec64' from type 'int' ts = current_kernel_time64(); ^ fs/cifs/dfs_cache.c: In function 'get_expire_time': fs/cifs/dfs_cache.c:342:24: error: incompatible type for argument 1 of 'timespec64_add' return timespec64_add(current_kernel_time64(), ts); ^~~~~~~~~~~~~~~~~~~~~~~ In file included from include/linux/restart_block.h:10, from include/linux/thread_info.h:13, from arch/x86/include/asm/preempt.h:7, from include/linux/preempt.h:78, from include/linux/rcupdate.h:40, from fs/cifs/dfs_cache.c:8: include/linux/time64.h:66:66: note: expected 'struct timespec64' but argument is of type 'int' static inline struct timespec64 timespec64_add(struct timespec64 lhs, ~~~~~~~~~~~~~~~~~~^~~ fs/cifs/dfs_cache.c:343:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ Caused by: commit ccea641b6742 ("timekeeping: remove obsolete time accessors") interacting with: commit 34a44fb160f9 ("cifs: Add DFS cache routines") from the cifs tree. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Paulo Alcantara <palcantara@suse.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:09:46 -06:00
Paulo Alcantara	54be1f6c1c	cifs: Add DFS cache routines * Add new dfs_cache.[ch] files * Add new /proc/fs/cifs/dfscache file - dump current cache when read - clear current cache when writing "0" to it * Add delayed_work to periodically refresh cache entries The new interface will be used for caching DFS referrals, as well as supporting client target failover. The DFS cache is a hashtable that maps UNC paths to cache entries. A cache entry contains: - the UNC path it is mapped on - how much the the UNC path the entry consumes - flags - a Time-To-Live after which the entry expires - a list of possible targets (linked lists of UNC paths) - a "hint target" pointing the last known working target or the first target if none were tried. This hint lets cifs.ko remember and try working targets first. * Looking for an entry in the cache is done with dfs_cache_find() - if no valid entries are found, a DFS query is made, stored in the cache and returned - the full target list can be copied and returned to avoid race conditions and looped on with the help with the dfs_cache_tgt_iterator * Updating the target hint to the next target is done with dfs_cache_update_tgthint() These functions have a dfs_cache_noreq_XXX() version that doesn't fetches referrals if no entries are found. These versions don't require the tcp/ses/tcon/cifs_sb parameters as a result. Expired entries cannot be used and since they have a pretty short TTL [1] in order for them to be useful for failover the DFS cache adds a delayed work called periodically to keep them fresh. Since we might not have available connections to issue the referral request when refreshing we need to store volume_info structs with credentials and other needed info to be able to connect to the right server. 1: Windows defaults: 5mn for domain-based referrals, 30mn for regular links Signed-off-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-28 10:05:58 -06:00
Paulo Alcantara	e7b602f437	cifs: Save TTL value when parsing DFS referrals This will be needed by DFS cache. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 23:49:00 -06:00
Aurelien Aptel	5fc7fcd054	cifs: auto disable 'serverino' in dfs mounts Different servers have different set of file ids. After failover, unique IDs will be different so we can't validate them. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Paulo Alcantara <palcantara@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 23:05:11 -06:00
Paulo Alcantara	d9345e0ae7	cifs: Make devname param optional in cifs_compose_mount_options() If we only want to get the mount options strings, do not return the devname. For DFS failover, we'll be passing the DFS full path down to cifs_mount() rather than the devname. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 23:05:08 -06:00
Paulo Alcantara	c34fea5a63	cifs: Skip any trailing backslashes from UNC When extracting hostname from UNC, check for leading backslashes before trying to remove them. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 23:05:05 -06:00
Paulo Alcantara	56c762eb9b	cifs: Refactor out cifs_mount() * Split and refactor the very large function cifs_mount() in multiple functions: - tcp, ses and tcon setup to mount_get_conns() - tcp, ses and tcon cleanup in mount_put_conns() - tcon tlink setup to mount_setup_tlink() - remote path checking to is_path_remote() * Implement 2 version of cifs_mount() for DFS-enabled builds and non-DFS-enabled builds (CONFIG_CIFS_DFS_UPCALL). In preparation for DFS failover support. Signed-off-by: Paulo Alcantara <palcantara@suse.de> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 23:00:38 -06:00
Georgy A Bystrenin	9a596f5b39	CIFS: Fix error mapping for SMB2_LOCK command which caused OFD lock problem While resolving a bug with locks on samba shares found a strange behavior. When a file locked by one node and we trying to lock it from another node it fail with errno 5 (EIO) but in that case errno must be set to (EACCES \| EAGAIN). This isn't happening when we try to lock file second time on same node. In this case it returns EACCES as expected. Also this issue not reproduces when we use SMB1 protocol (vers=1.0 in mount options). Further investigation showed that the mapping from status_to_posix_error is different for SMB1 and SMB2+ implementations. For SMB1 mapping is [NT_STATUS_LOCK_NOT_GRANTED to ERRlock] (See fs/cifs/netmisc.c line 66) but for SMB2+ mapping is [STATUS_LOCK_NOT_GRANTED to -EIO] (see fs/cifs/smb2maperror.c line 383) Quick changes in SMB2+ mapping from EIO to EACCES has fixed issue. BUG: https://bugzilla.kernel.org/show_bug.cgi?id=201971 Signed-off-by: Georgy A Bystrenin <gkot@altlinux.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:42:56 -06:00
Long Li	54e94ff94e	CIFS: return correct errors when pinning memory failed for direct I/O When pinning memory failed, we should return the correct error code and rewind the SMB credits. Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Cc: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:42:48 -06:00
Long Li	b6bc8a7b99	CIFS: use the correct length when pinning memory for direct I/O for write The current code attempts to pin memory using the largest possible wsize based on the currect SMB credits. This doesn't cause kernel oops but this is not optimal as we may pin more pages then actually needed. Fix this by only pinning what are needed for doing this write I/O. Signed-off-by: Long Li <longli@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>	2018-12-23 22:42:19 -06:00
Ronnie Sahlberg	59a63e479c	cifs: check ntwrk_buf_start for NULL before dereferencing it RHBZ: 1021460 There is an issue where when multiple threads open/close the same directory ntwrk_buf_start might end up being NULL, causing the call to smbCalcSize later to oops with a NULL deref. The real bug is why this happens and why this can become NULL for an open cfile, which should not be allowed. This patch tries to avoid a oops until the time when we fix the underlying issue. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:41:31 -06:00
Ronnie Sahlberg	52baa51d30	cifs: remove coverity warning in calc_lanman_hash password_with_pad is a fixed size buffer of 16 bytes, it contains a password string, to be padded with \0 if shorter than 16 bytes but is just truncated if longer. It is not, and we do not depend on it to be, nul terminated. As such, do not use strncpy() to populate this buffer since the str* prefix suggests that this is a string, which it is not, and it also confuses coverity causing a false warning. Detected by CoverityScan CID#113743 ("Buffer not null terminated") Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:41:26 -06:00
YueHaibing	0f57451eeb	cifs: remove set but not used variable 'smb_buf' Fixes gcc '-Wunused-but-set-variable' warning: fs/cifs/sess.c: In function '_sess_auth_rawntlmssp_assemble_req': fs/cifs/sess.c:1157:18: warning: variable 'smb_buf' set but not used [-Wunused-but-set-variable] It never used since commit `cc87c47d9d` ("cifs: Separate rawntlmssp auth from CIFS_SessSetup()") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:41:20 -06:00
Gustavo A. R. Silva	07fa6010ff	cifs: suppress some implicit-fallthrough warnings To avoid the warning: warning: this statement may fall through [-Wimplicit-fallthrough=] Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:41:11 -06:00
Ronnie Sahlberg	f9793b6fcc	cifs: change smb2_query_eas to use the compound query-info helper Reducing the number of network roundtrips improves the performance of query xattrs Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:40:17 -06:00
Kenneth D'souza	4a3b38aec5	Add vers=3.0.2 as a valid option for SMBv3.0.2 Technically 3.02 is not the dialect name although that is more familiar to many, so we should also accept the official dialect name (3.0.2 vs. 3.02) in vers= Signed-off-by: Kenneth D'souza <kdsouza@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:39:29 -06:00
Ronnie Sahlberg	07d3b2e426	cifs: create a helper function for compound query_info and convert statfs to use it. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:38:17 -06:00
Steve French	97aa495a89	cifs: address trivial coverity warning This is not actually a bug but as Coverity points out we shouldn't be doing an "\|=" on a value which hasn't been set (although technically it was memset to zero so isn't a bug) and so might as well change "\|=" to "=" in this line Detected by CoverityScan, CID#728535 ("Unitialized scalar variable") Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-12-23 22:38:14 -06:00
Steve French	f5942db5ef	cifs: smb2 commands can not be negative, remove confusing check As Coverity points out le16_to_cpu(midEntry->Command) can not be less than zero. Detected by CoverityScan, CID#1438650 ("Macro compares unsigned to 0") Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>	2018-12-23 22:37:23 -06:00
Ronnie Sahlberg	0967e54579	cifs: use a compound for setting an xattr Improve performance by reducing number of network round trips for set xattr. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:36:24 -06:00
Colin Ian King	5890255b83	cifs: clean up indentation, replace spaces with tab Trivial fix to clean up indentation, replace spaces with tab Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2018-12-23 22:36:09 -06:00
Linus Torvalds	8fe28cb58b	Linux 4.20	2018-12-23 15:55:59 -08:00
Linus Torvalds	3c730b1041	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A couple of fixes - no common topic ;-)" [ The aio spectre patch also came in from Jens, so now we have that doubly fixed .. ] * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: proc/sysctl: don't return ENOMEM on lookup when a table is unregistering aio: fix spectre gadget in lookup_ioctx	2018-12-23 10:40:41 -08:00
Linus Torvalds	9105b8aa50	SCSI fixes on 20181221 This is two simple target fixes and one discard related I/O starvation problem in sd. The discard problem occurs because the discard page doesn't have a mempool backing so if the allocation fails due to memory pressure, we then lose the forward progress we require if the writeout is on the same device. The fix is to back it with a mempool. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXB2mCiYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSJmAP9E8ItG tSgUyIfRRcn/ZxYdfOg1EWxGgDq17Fq2TgQU3gEAolSLwol7eKl1hQnDpOKPVMmC //j4JyKpCl3EEvNs6DQ= =3Hmt -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is two simple target fixes and one discard related I/O starvation problem in sd. The discard problem occurs because the discard page doesn't have a mempool backing so if the allocation fails due to memory pressure, we then lose the forward progress we require if the writeout is on the same device. The fix is to back it with a mempool" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: use mempool for discard special page scsi: target: iscsi: cxgbit: add missing spin_lock_init() scsi: target: iscsi: cxgbit: fix csk leak	2018-12-22 15:03:00 -08:00
Linus Torvalds	1104bd96eb	A cleanup for userspace in compiler_types.h - don't pollute userspace with macro definitions From Xiaozhou Liu -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAlwdVEQACgkQGXyLc2ht IW0RSxAAo3yZBrkMxN6FIrBeEfENFs8TL3iDq5GoCPShJNWHRpRbEBhi06/k6dnA ePFmgXL/FEio+f47aUj/pEh2NQv5QcwkLRpizREmGtHjVBngJNARFyHxveZyqE52 ArySpu5/WPswQdu73cQLAwqtk505Gi8jNLRKVqr4CiBJZB/WO7rsINWDOhUulpwG 9b8Kmct4al/3mhDOhnn1ppgAIauzj2xoyXxYMLZx95h7oycfssUvbNfJtALnxCJs eIWAxGebr3ni85q9J69gMfIOwiSn6HtaLAuv8Q7AOKuCBd1+/ymX79gCwH68dQVl tdDhIE7vEAWZHbVHy7fdnNIUbPfAMk/QonLStbdd2nYVeblD/luSe91NShCo7Jg5 ZVJHdA+eD9IjypGz4mMzjOlvhCWZBGtOdnby4tD6YxV+S9fDQPvE+9Ws1JaAIzpH kpnj1tmi5YwqN6T5pLWQwVs/HCuoCXI89pv0tSQwip2/txxorAIhhJvzo94lLdv/ nOABNb6/eszVj7IGOxKLXW/djFluwt/0SzlaD3A8pIjQWNXolfpmBu/9xMcJVZuj 070Vfn60bjH9q2qitBvlYLCX4eXaGBcfybRi7oe5WnOlKVCl1idQSPcn+2dhiaOO JpInO/XLQUrieHT4f/AW6prWJ4AiQJd1lpKw76acOjcm/iXMJGM= =e9gb -----END PGP SIGNATURE----- Merge tag 'compiler-attributes-for-linus-v4.20' of https://github.com/ojeda/linux Pull compiler_types.h fix from Miguel Ojeda: "A cleanup for userspace in compiler_types.h: don't pollute userspace with macro definitions (Xiaozhou Liu) This is harmless for the kernel, but v4.19 was released with a few macros exposed to userspace as the patch explains; which this removes, so it could happen that we break something for someone (although leaving inline redefined is probably worse)" * tag 'compiler-attributes-for-linus-v4.20' of https://github.com/ojeda/linux: include/linux/compiler_types.h: don't pollute userspace with macro definitions	2018-12-22 14:29:21 -08:00
Linus Torvalds	38c0ecf608	Fix bug in auxdisplay. - charlcd: fix x/y command parsing From Mans Rullgard -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPjU5OPd5QIZ9jqqOGXyLc2htIW0FAlwdT8QACgkQGXyLc2ht IW3xxQ/9EEw7lrrACBaTzj9DaEkln1Dz1ObEkSZdZJ+YxkPaagKAZZvaDmigYrkL SRSAxNr11mRVLOLjPK4/D1JpK07Xb7Ahu0lgcfInqxEv+lgkqVNdxCxomCqNowRM FbCd1xKH2VMWo1z6Bi/tvnJtD/tYnJzzBPPPNMRtd5Cmtt6OF5LVjE68pGcgHELq HD6uygy3WmGIzHxhV3O5PSM7FhD0mp9HgOzheBXqJY0ku5y/fj5eyp7QaucqSd2E oTr8X43GdGhAbntMbQ7Go0c9u4r6TJKkmqw9ylvXICDyNRzP3fLOipqRDh9U9slv ApHkZeM7v/2vzYdf1VJNep5Cb3EoeAE93CGnmVN+Jux006/6BQSTZ6k4z6XfXTpA wguhLgnRJ0EUx6REhC1ciPdmB9sJnkg6AUO8O4KQFU8X5MDT2pV+055ZopKkGTBB RKYvn3ymmY0FcOXPcn0lu48aCCkkyGZ7ANxTfZDpBt5UnMEWWHiYaElyet2OssG9 RhkxS6ogzhlEesEGF5c9ZkRujHorBnvpwPACocK+kwoEMBaRQ6SDWQ+gZS6YGceS 21MqY3WqMXZxT1PzJ1r9Z+FVB6gPPLGCiDEJg4QOrBQ3r6cq5JQK5LVNnD3b9Uqz /wWJj0KT1oOyMkdxzxxiGtcutDTzRluBOmRyQ0A6z0xTabgD0r8= =BHr1 -----END PGP SIGNATURE----- Merge tag 'auxdisplay-for-linus-v4.20' of https://github.com/ojeda/linux Pull auxdisplay fix from Miguel Ojeda: "charlcd: fix x/y command parsing (Mans Rullgard)" * tag 'auxdisplay-for-linus-v4.20' of https://github.com/ojeda/linux: auxdisplay: charlcd: fix x/y command parsing	2018-12-22 14:25:23 -08:00
Christian Brauner	94f82008ce	Revert "vfs: Allow userns root to call mknod on owned filesystems." This reverts commit `55956b59df`. commit `55956b59df` ("vfs: Allow userns root to call mknod on owned filesystems.") enabled mknod() in user namespaces for userns root if CAP_MKNOD is available. However, these device nodes are useless since any filesystem mounted from a non-initial user namespace will set the SB_I_NODEV flag on the filesystem. Now, when a device node s created in a non-initial user namespace a call to open() on said device node will fail due to: bool may_open_dev(const struct path *path) { return !(path->mnt->mnt_flags & MNT_NODEV) && !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV); } The problem with this is that as of the aforementioned commit mknod() creates partially functional device nodes in non-initial user namespaces. In particular, it has the consequence that as of the aforementioned commit open() will be more privileged with respect to device nodes than mknod(). Before it was the other way around. Specifically, if mknod() succeeded then it was transparent for any userspace application that a fatal error must have occured when open() failed. All of this breaks multiple userspace workloads and a widespread assumption about how to handle mknod(). Basically, all container runtimes and systemd live by the slogan "ask for forgiveness not permission" when running user namespace workloads. For mknod() the assumption is that if the syscall succeeds the device nodes are useable irrespective of whether it succeeds in a non-initial user namespace or not. This logic was chosen explicitly to allow for the glorious day when mknod() will actually be able to create fully functional device nodes in user namespaces. A specific problem people are already running into when running 4.18 rc kernels are failing systemd services. For any distro that is run in a container systemd services started with the PrivateDevices= property set will fail to start since the device nodes in question cannot be opened (cf. the arguments in [1]). Full disclosure, Seth made the very sound argument that it is already possible to end up with partially functional device nodes. Any filesystem mounted with MS_NODEV set will allow mknod() to succeed but will not allow open() to succeed. The difference to the case here is that the MS_NODEV case is transparent to userspace since it is an explicitly set mount option while the SB_I_NODEV case is an implicit property enforced by the kernel and hence opaque to userspace. [1]: https://github.com/systemd/systemd/pull/9483 Signed-off-by: Christian Brauner <christian@brauner.io> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Seth Forshee <seth.forshee@canonical.com> Cc: Serge Hallyn <serge@hallyn.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-22 14:18:34 -08:00
Christoph Hellwig	0cd60eb1a7	dma-mapping: fix flags in dma_alloc_wc We really need the writecombine flag in dma_alloc_wc, fix a stupid oversight. Fixes: `7ed1d91a9e` ("dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-22 08:46:27 -08:00
Linus Torvalds	23203e3f34	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "4 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, page_alloc: fix has_unmovable_pages for HugePages fork,memcg: fix crash in free_thread_stack on memcg charge fail mm: thp: fix flags for pmd migration when split mm, memory_hotplug: initialize struct pages for the full memory section	2018-12-21 14:59:00 -08:00
Oscar Salvador	17e2e7d7e1	mm, page_alloc: fix has_unmovable_pages for HugePages While playing with gigantic hugepages and memory_hotplug, I triggered the following #PF when "cat memoryX/removable": BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 #PF error: [normal kernel read fault] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 1481 Comm: cat Tainted: G E 4.20.0-rc6-mm1-1-default+ #18 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:has_unmovable_pages+0x154/0x210 Call Trace: is_mem_section_removable+0x7d/0x100 removable_show+0x90/0xb0 dev_attr_show+0x1c/0x50 sysfs_kf_seq_show+0xca/0x1b0 seq_read+0x133/0x380 __vfs_read+0x26/0x180 vfs_read+0x89/0x140 ksys_read+0x42/0x90 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The reason is we do not pass the Head to page_hstate(), and so, the call to compound_order() in page_hstate() returns 0, so we end up checking all hstates's size to match PAGE_SIZE. Obviously, we do not find any hstate matching that size, and we return NULL. Then, we dereference that NULL pointer in hugepage_migration_supported() and we got the #PF from above. Fix that by getting the head page before calling page_hstate(). Also, since gigantic pages span several pageblocks, re-adjust the logic for skipping pages. While are it, we can also get rid of the round_up(). [osalvador@suse.de: remove round_up(), adjust skip pages logic per Michal] Link: http://lkml.kernel.org/r/20181221062809.31771-1-osalvador@suse.de Link: http://lkml.kernel.org/r/20181217225113.17864-1-osalvador@suse.de Signed-off-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-21 14:51:18 -08:00
Rik van Riel	5eed6f1dff	fork,memcg: fix crash in free_thread_stack on memcg charge fail Commit `9b6f7e163c` ("mm: rework memcg kernel stack accounting") will result in fork failing if allocating a kernel stack for a task in dup_task_struct exceeds the kernel memory allowance for that cgroup. Unfortunately, it also results in a crash. This is due to the code jumping to free_stack and calling free_thread_stack when the memcg kernel stack charge fails, but without tsk->stack pointing at the freshly allocated stack. This in turn results in the vfree_atomic in free_thread_stack oopsing with a backtrace like this: #5 [ffffc900244efc88] die at ffffffff8101f0ab #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86 #7 [ffffc900244efce0] general_protection at ffffffff818ff082 [exception RIP: llist_add_batch+7] RIP: ffffffff8150d487 RSP: ffffc900244efd98 RFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88085ef55980 RCX: 0000000000000000 RDX: ffff88085ef55980 RSI: 343834343531203a RDI: 343834343531203a RBP: ffffc900244efd98 R8: 0000000000000001 R9: ffff8808578c3600 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88029f6c21c0 R13: 0000000000000286 R14: ffff880147759b00 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37 #10 [ffffc900244efe98] _do_fork at ffffffff810884e0 #11 [ffffc900244eff10] sys_vfork at ffffffff810887ff #12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43 RIP: 000000000049b948 RSP: 00007ffcdb307830 RFLAGS: 00000246 RAX: ffffffffffffffda RBX: 0000000000896030 RCX: 000000000049b948 RDX: 0000000000000000 RSI: 00007ffcdb307790 RDI: 00000000005d7421 RBP: 000000000067370f R8: 00007ffcdb3077b0 R9: 000000000001ed00 R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000040 R13: 000000000000000f R14: 0000000000000000 R15: 000000000088d018 ORIG_RAX: 000000000000003a CS: 0033 SS: 002b The simplest fix is to assign tsk->stack right where it is allocated. Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com Fixes: `9b6f7e163c` ("mm: rework memcg kernel stack accounting") Signed-off-by: Rik van Riel <riel@surriel.com> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-21 14:51:18 -08:00
Peter Xu	2e83ee1d86	mm: thp: fix flags for pmd migration when split When splitting a huge migrating PMD, we'll transfer all the existing PMD bits and apply them again onto the small PTEs. However we are fetching the bits unconditionally via pmd_soft_dirty(), pmd_write() or pmd_yound() while actually they don't make sense at all when it's a migration entry. Fix them up. Since at it, drop the ifdef together as not needed. Note that if my understanding is correct about the problem then if without the patch there is chance to lose some of the dirty bits in the migrating pmd pages (on x86_64 we're fetching bit 11 which is part of swap offset instead of bit 2) and it could potentially corrupt the memory of an userspace program which depends on the dirty bit. Link: http://lkml.kernel.org/r/20181213051510.20306-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-21 14:51:18 -08:00
Mikhail Zaslonko	2830bf6f05	mm, memory_hotplug: initialize struct pages for the full memory section If memory end is not aligned with the sparse memory section boundary, the mapping of such a section is only partly initialized. This may lead to VM_BUG_ON due to uninitialized struct page access from is_mem_section_removable() or test_pages_in_a_zone() function triggered by memory_hotplug sysfs handlers: Here are the the panic examples: CONFIG_DEBUG_VM=y CONFIG_DEBUG_VM_PGFLAGS=y kernel parameter mem=2050M -------------------------- page:000003d082008000 is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) Call Trace: ( test_pages_in_a_zone+0xde/0x160) show_valid_zones+0x5c/0x190 dev_attr_show+0x34/0x70 sysfs_kf_seq_show+0xc8/0x148 seq_read+0x204/0x480 __vfs_read+0x32/0x178 vfs_read+0x82/0x138 ksys_read+0x5a/0xb0 system_call+0xdc/0x2d8 Last Breaking-Event-Address: test_pages_in_a_zone+0xde/0x160 Kernel panic - not syncing: Fatal exception: panic_on_oops kernel parameter mem=3075M -------------------------- page:000003d08300c000 is uninitialized and poisoned page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) Call Trace: ( is_mem_section_removable+0xb4/0x190) show_mem_removable+0x9a/0xd8 dev_attr_show+0x34/0x70 sysfs_kf_seq_show+0xc8/0x148 seq_read+0x204/0x480 __vfs_read+0x32/0x178 vfs_read+0x82/0x138 ksys_read+0x5a/0xb0 system_call+0xdc/0x2d8 Last Breaking-Event-Address: is_mem_section_removable+0xb4/0x190 Kernel panic - not syncing: Fatal exception: panic_on_oops Fix the problem by initializing the last memory section of each zone in memmap_init_zone() till the very end, even if it goes beyond the zone end. Michal said: : This has alwways been problem AFAIU. It just went unnoticed because we : have zeroed memmaps during allocation before `f7f99100d8` ("mm: stop : zeroing memory during allocation in vmemmap") and so the above test : would simply skip these ranges as belonging to zone 0 or provided a : garbage. : : So I guess we do care for post `f7f99100d8` kernels mostly and : therefore Fixes: `f7f99100d8` ("mm: stop zeroing memory during : allocation in vmemmap") Link: http://lkml.kernel.org/r/20181212172712.34019-2-zaslonko@linux.ibm.com Fixes: `f7f99100d8` ("mm: stop zeroing memory during allocation in vmemmap") Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Suggested-by: Michal Hocko <mhocko@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-12-21 14:51:18 -08:00

1 2 3 4 5 ...

798761 Commits