linux

Author	SHA1	Message	Date
Ben Dooks	787dbea11a	profile: setup_profiling_timer() is moslty not implemented The setup_profiling_timer() is mostly un-implemented by many architectures. In many places it isn't guarded by CONFIG_PROFILE which is needed for it to be used. Make it a weak symbol in kernel/profile.c and remove the 'return -EINVAL' implementations from the kenrel. There are a couple of architectures which do return 0 from the setup_profiling_timer() function but they don't seem to do anything else with it. To keep the /proc compatibility for now, leave these for a future update or removal. On ARM, this fixes the following sparse warning: arch/arm/kernel/smp.c:793:5: warning: symbol 'setup_profiling_timer' was not declared. Should it be static? Link: https://lkml.kernel.org/r/20220721195509.418205-1-ben-linux@fluff.org Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Christophe JAILLET	45ee6d1e93	ocfs2: fix a typo in a comment s/heartbaet/heartbeat Link: https://lkml.kernel.org/r/4d4a6786e8ad522bfad6d2401b7f6634f8af0e5d.1658436259.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Christophe JAILLET	702f3cf374	ocfs2: use the bitmap API to simplify code Use bitmap_zero() instead of hand-writing it. It is less verbose. While at it, add an explicit #include <linux/bitmap.h>. Link: https://lkml.kernel.org/r/86d2a027c319db12055c98f00c65f7d01e703722.1658436259.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:36 -07:00
Christophe JAILLET	97d3b2676f	ocfs2: remove some useless functions Patch series "ocfs2: A few clean_ups", v2. __ocfs2_node_map_set_bit() and __ocfs2_node_map_clear_bit() are just wrapper around set_bit() and clear_bit(). The leading __ also makes think that these functions are non-atomic just like __set_bit() and __clear_bit(). So, just remove these wrappers and call set_bit() and clear_bit() directly. Link: https://lkml.kernel.org/r/cover.1658436259.git.christophe.jaillet@wanadoo.fr Link: https://lkml.kernel.org/r/bd1429c84ec7d174c96dbb67a2b42b1b456d9394.1658436259.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Slark Xiao	cf069c3b47	lib/mpi: fix typo 'the the' in comment Replace 'the the' with 'the' in the comment. Link: https://lkml.kernel.org/r/20220722101922.81126-1-slark_xiao@163.com Signed-off-by: Slark Xiao <slark_xiao@163.com> Cc: Hongbo Li <herberthbli@tencent.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Alexey Dobriyan	ed8fb78d7e	proc: add some (hopefully) insightful comments * /proc/${pid}/net status * removing PDE vs last close stuff (again!) * random small stuff Link: https://lkml.kernel.org/r/YtwrM6sDC0OQ53YB@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Xiu Jianfeng	fa7d574ba4	bdi: remove enum wb_congested_state enum wb_congested_state and the member 'congested' in bdi_writeback are useless since commit `a88f2096d5` ("remove congestion tracking framework"), so remove it. Link: https://lkml.kernel.org/r/20220719083349.87547-1-xiujianfeng@huawei.com Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Ben Dooks	591c32bddb	kernel/hung_task: fix address space of proc_dohung_task_timeout_secs The proc_dohung_task_timeout_secs() function is incorrectly marked as having a __user buffer as argument 3. However this is not the case and it is casing multiple sparse warnings. Fix the following warnings by removing __user from the argument: kernel/hung_task.c:237:52: warning: incorrect type in argument 3 (different address spaces) kernel/hung_task.c:237:52: expected void * kernel/hung_task.c:237:52: got void [noderef] __user buffer kernel/hung_task.c:287:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces)) kernel/hung_task.c:287:35: expected int ( [usertype] proc_handler )( ... ) kernel/hung_task.c:287:35: got int ( * )( ... ) kernel/hung_task.c:295:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces)) kernel/hung_task.c:295:35: expected int ( [usertype] proc_handler )( ... ) kernel/hung_task.c:295:35: got int ( )( ... ) Link: https://lkml.kernel.org/r/20220714074744.189017-1-ben.dooks@sifive.com Signed-off-by: Ben Dooks <ben.dooks@sifive.com> Cc: <Conor.Dooley@microchip.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:35 -07:00
Jiangshan Yi	a10c9ede99	lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t() Fix the following coccicheck warning: lib/lzo/lzo1x_compress.c:54: WARNING opportunity for min(). lib/lzo/lzo1x_compress.c:329: WARNING opportunity for min(). min() and min_t() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Link: https://lkml.kernel.org/r/20220714015441.1313036-1-13667453960@163.com Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Tested-by: Dave Rodgman <dave.rodgman@arm.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Phillip Lougher	b09a7a036d	squashfs: support reading fragments in readahead call Add a function which can be used to read fragments in the readahead call. This function is necessary because filesystems built with the -tailends (or -always-use-fragments) option may have fragments present which cannot be currently handled. Link: https://lkml.kernel.org/r/20220617083810.337573-5-hsinyi@chromium.org Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Cc: Hou Tao <houtao1@huawei.com> Cc: kernel test robot <lkp@intel.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miao Xie <miaoxie@huawei.com> Cc: Xiongwei Song <Xiongwei.Song@windriver.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Hsin-Yi Wang	8fc78b6fe2	squashfs: implement readahead Implement readahead callback for squashfs. It will read datablocks which cover pages in readahead request. For a few cases it will not mark page as uptodate, including: - file end is 0. - zero filled blocks. - current batch of pages isn't in the same datablock. - decompressor error. Otherwise pages will be marked as uptodate. The unhandled pages will be updated by readpage later. Link: https://lkml.kernel.org/r/20220617083810.337573-4-hsinyi@chromium.org Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Suggested-by: Matthew Wilcox <willy@infradead.org> Reported-by: Matthew Wilcox <willy@infradead.org> Reported-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: Xiongwei Song <Xiongwei.Song@windriver.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Cc: Hou Tao <houtao1@huawei.com> Cc: kernel test robot <lkp@intel.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Phillip Lougher	db98b43086	squashfs: always build "file direct" version of page actor Squashfs_readahead uses the "file direct" version of the page actor, and so build it unconditionally. Link: https://lkml.kernel.org/r/20220617083810.337573-3-hsinyi@chromium.org Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Hou Tao <houtao1@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miao Xie <miaoxie@huawei.com> Cc: Xiongwei Song <Xiongwei.Song@windriver.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Zheng Liang <zhengliang6@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Hsin-Yi Wang	0c12185728	Revert "squashfs: provide backing_dev_info in order to disable read-ahead" Patch series "Implement readahead for squashfs", v7. Commit 9eec1d897139("squashfs: provide backing_dev_info in order to disable read-ahead") mitigates the performance drop issue for squashfs by closing readahead for it. This series implements readahead callback for squashfs. This patch (of 4): This reverts `9eec1d8971` ("squashfs: provide backing_dev_info in order to disable read-ahead"). Revert closing the readahead to squashfs since the readahead callback for squashfs is implemented. Link: https://lkml.kernel.org/r/20220617083810.337573-1-hsinyi@chromium.org Link: https://lkml.kernel.org/r/20220617083810.337573-2-hsinyi@chromium.org Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Suggested-by: Xiongwei Song <Xiongwei.Song@windriver.com> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Zheng Liang <zhengliang6@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Hou Tao <houtao1@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Jiangshan Yi	233eb8d689	fs/ocfs2: Fix spelling typo in comment Fix spelling typo in comment. Link: https://lkml.kernel.org/r/20220715060035.632903-1-13667453960@163.com Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn> Reported-by: k2ci <kernel-bot@kylinos.cn> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:43 -07:00
Souptick Joarder (HPE)	1298f83b54	ia64: old_rr4 added under CONFIG_HUGETLB_PAGE kernel test robot throws below warning -> arch/ia64/include/asm/mmu_context.h: In function 'reload_context': arch/ia64/include/asm/mmu_context.h:127:48: warning: variable 'old_rr4' set but not used [-Wunused-but-set-variable] 127 \| unsigned long rr0, rr1, rr2, rr3, rr4, old_rr4; Add it under CONFIG_HUGETLB_PAGE Link: https://lkml.kernel.org/r/20220626022114.4020-1-jrdr.linux@gmail.com Signed-off-by: Souptick Joarder (HPE) <jrdr.linux@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:43 -07:00
Alexey Dobriyan	3adb2d8723	proc: fix test for "vsyscall=xonly" boot option Booting with vsyscall=xonly results in the following vsyscall VMA: ffffffffff600000-ffffffffff601000 --xp ... [vsyscall] Test does read from fixed vsyscall address to determine if kernel supports vsyscall page but it doesn't work because, well, vsyscall page is execute only. Fix test by trying to execute from the first byte of the page which contains gettimeofday() stub. This should work because vsyscall entry points have stable addresses by design. Alexey, avoiding parsing .config, /proc/config.gz and /proc/cmdline at all costs. Link: https://lkml.kernel.org/r/Ys2KgeiEMboU8Ytu@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: <dylanbhatch@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:43 -07:00
Zhihao Cheng	d919a1e79b	proc: fix a dentry lock race between release_task and lookup Commit `7bc3e6e55a` ("proc: Use a list of inodes to flush from proc") moved proc_flush_task() behind __exit_signal(). Then, process systemd can take long period high cpu usage during releasing task in following concurrent processes: systemd ps kernel_waitid stat(/proc/tgid) do_wait filename_lookup wait_consider_task lookup_fast release_task __exit_signal __unhash_process detach_pid __change_pid // remove task->pid_links d_revalidate -> pid_revalidate // 0 d_invalidate(/proc/tgid) shrink_dcache_parent(/proc/tgid) d_walk(/proc/tgid) spin_lock_nested(/proc/tgid/fd) // iterating opened fd proc_flush_pid \| d_invalidate (/proc/tgid/fd) \| shrink_dcache_parent(/proc/tgid/fd) \| shrink_dentry_list(subdirs) ↓ shrink_lock_dentry(/proc/tgid/fd) --> race on dentry lock Function d_invalidate() will remove dentry from hash firstly, but why does proc_flush_pid() process dentry '/proc/tgid/fd' before dentry '/proc/tgid'? That's because proc_pid_make_inode() adds proc inode in reverse order by invoking hlist_add_head_rcu(). But proc should not add any inodes under '/proc/tgid' except '/proc/tgid/task/pid', fix it by adding inode into 'pid->inodes' only if the inode is /proc/tgid or /proc/tgid/task/pid. Performance regression: Create 200 tasks, each task open one file for 50,000 times. Kill all tasks when opened files exceed 10,000,000 (cat /proc/sys/fs/file-nr). Before fix: $ time killall -wq aa real 4m40.946s # During this period, we can see 'ps' and 'systemd' taking high cpu usage. After fix: $ time killall -wq aa real 1m20.732s # During this period, we can see 'systemd' taking high cpu usage. Link: https://lkml.kernel.org/r/20220713130029.4133533-1-chengzhihao1@huawei.com Fixes: `7bc3e6e55a` ("proc: Use a list of inodes to flush from proc") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216054 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Suggested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Baoquan He <bhe@redhat.com> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:43 -07:00
Ian Kent	7ffe4e90a0	autofs: remove unused ino field inode Remove the unused inode field of the autofs dentry info structure. Link: https://lkml.kernel.org/r/165724460393.30914.6511330213821246793.stgit@donald.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:42 -07:00
Ian Kent	ba97a0a3a3	autofs: add comment about autofs_mountpoint_changed() The function autofs_mountpoint_changed() is unusual, add a comment about two cases for which it is needed. Link: https://lkml.kernel.org/r/165724459804.30914.10974834416046555127.stgit@donald.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:42 -07:00
Ian Kent	a4a8730387	autofs: use dentry info count instead of simple_empty() The dentry info. field count is used to check if a dentry is in use during expire. But, to be used for this the count field must account for the presence of child dentries in a directory dentry. Therefore it can also be used to check for an empty directory dentry which can be done without having to to take an additional lock or account for the presence of a readdir cursor dentry as is done by simple_empty(). Link: https://lkml.kernel.org/r/165724459238.30914.1504611159945950108.stgit@donald.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:42 -07:00
Ian Kent	9ccbac76e7	autofs: make dentry info count consistent If an autofs dentry is a mount root directory there's no ->mkdir() call to set its count to one. To make the dentry info count consistent for all autofs dentries set count to one when the dentry info struct is allocated. Link: https://lkml.kernel.org/r/165724458671.30914.2902424437132835325.stgit@donald.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:42 -07:00
Ian Kent	f71381fcdc	autofs: use inode permission method for write access Patch series "autofs: misc patches". This series contains several patches that resulted mostly from comments made by Al Viro (quite a long time ago now). This patch (of 5): Eliminate some code duplication from mkdir/rmdir/symlink/unlink methods by using the inode operation .permission(). Link: https://lkml.kernel.org/r/165724445154.30914.10970894936827635879.stgit@donald.themaw.net Link: https://lkml.kernel.org/r/165724458096.30914.13499431569758625806.stgit@donald.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:42 -07:00
Mark-PK Tsai	55656016da	lib: devres: use numa aware allocation Allocate device resource from local node memory when the numa locality of the device is specified. Link: https://lkml.kernel.org/r/20220708131952.14500-1-mark-pk.tsai@mediatek.com Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: YJ Chiang <yj.chiang@mediatek.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:41 -07:00
Tetsuo Handa	bd27acaac2	lib/smp_processor_id: fix imbalanced instrumentation_end() call Currently instrumentation_end() won't be called if printk_ratelimit() returned false. Link: https://lkml.kernel.org/r/a636d8e0-ad32-5888-acac-671f7f553bb3@I-love.SAKURA.ne.jp Fixes: `126f21f0e8` ("lib/smp_processor_id: Move it into noinstr section") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:41 -07:00
Sander Vanheule	953257a925	cpumask: update cpumask_next_wrap() signature The extern specifier is not needed for this declaration, so drop it. The function also depends only on the input parameters, and has no side effects, so it can be marked __pure like other functions in cpumask.h. Link: https://lkml.kernel.org/r/72ab755695b74bb5fbaa756ae4c0edd708d172f1.1656777646.git.sander@svanheule.net Signed-off-by: Sander Vanheule <sander@svanheule.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:41 -07:00
Sander Vanheule	c41e8866c2	lib/test: introduce cpumask KUnit test suite Add a basic suite of tests for cpumask, providing some tests for empty and completely filled cpumasks. Link: https://lkml.kernel.org/r/c96980ec35c3bd23f17c3374bf42c22971545e85.1656777646.git.sander@svanheule.net Signed-off-by: Sander Vanheule <sander@svanheule.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Suggested-by: Yury Norov <yury.norov@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:41 -07:00
Sander Vanheule	b81dce77ce	cpumask: Fix invalid uniprocessor mask assumption On uniprocessor builds, any CPU mask is assumed to contain exactly one CPU (cpu0). This assumption ignores the existence of empty masks, resulting in incorrect behaviour. cpumask_first_zero(), cpumask_next_zero(), and for_each_cpu_not() don't provide behaviour matching the assumption that a UP mask is always "1", and instead provide behaviour matching the empty mask. Drop the incorrectly optimised code and use the generic implementations in all cases. Link: https://lkml.kernel.org/r/86bf3f005abba2d92120ddd0809235cab4f759a6.1656777646.git.sander@svanheule.net Signed-off-by: Sander Vanheule <sander@svanheule.net> Suggested-by: Yury Norov <yury.norov@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:41 -07:00
Sander Vanheule	4f09903078	cpumask: add UP optimised for_each_*_cpu versions On uniprocessor builds, the following loops will always run over a mask that contains one enabled CPU (cpu0): - for_each_possible_cpu - for_each_online_cpu - for_each_present_cpu Provide uniprocessor-specific macros for these loops, that always run exactly once. Link: https://lkml.kernel.org/r/3a92869b902a075b97be5d1452c9c6badbbff0df.1656777646.git.sander@svanheule.net Signed-off-by: Sander Vanheule <sander@svanheule.net> Acked-by: Yury Norov <yury.norov@gmail.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:41 -07:00
Sander Vanheule	adbcaef840	x86/cacheinfo: move shared cache map definitions Patch series "cpumask: Fix invalid uniprocessor assumptions", v4. On uniprocessor builds, it is currently assumed that any cpumask will contain the single CPU: cpu0. This assumption is used to provide optimised implementations. The current assumption also appears to be wrong, by ignoring the fact that users can provide empty cpumasks. This can result in bugs as explained in [1] - for_each_cpu() will run one iteration of the loop even when passed an empty cpumask. This series introduces some basic tests, and updates the optimisations for uniprocessor builds. The x86 patch was written after the kernel test robot [2] ran into a failed build. I have tried to list the files potentially affected by the changes to cpumask.h, in an attempt to find any other cases that fail on !SMP. I've gone through some of the files manually, and ran a few cross builds, but nothing else popped up. I (build) checked about half of the potientally affected files, but I do not have the resources to do them all. I hope we can fix other issues if/when they pop up later. [1] https://lore.kernel.org/all/20220530082552.46113-1-sander@svanheule.net/ [2] https://lore.kernel.org/all/202206060858.wA0FOzRy-lkp@intel.com/ This patch (of 5): The maps to keep track of shared caches between CPUs on SMP systems are declared in asm/smp.h, among them specifically cpu_llc_shared_map. These maps are externally defined in cpu/smpboot.c. The latter is only compiled on CONFIG_SMP=y, which means the declared extern symbols from asm/smp.h do not have a corresponding definition on uniprocessor builds. The inline cpu_llc_shared_mask() function from asm/smp.h refers to the map declaration mentioned above. This function is referenced in cacheinfo.c inside for_each_cpu() loop macros, to provide cpumask for the loop. On uniprocessor builds, the symbol for the cpu_llc_shared_map does not exist. However, the current implementation of for_each_cpu() also (wrongly) ignores the provided mask. By sheer luck, the compiler thus optimises out this unused reference to cpu_llc_shared_map, and the linker therefore does not require the cpu_llc_shared_mask to actually exist on uniprocessor builds. Only on SMP bulids does smpboot.o exist to provide the required symbols. To no longer rely on compiler optimisations for successful uniprocessor builds, move the definitions of cpu_llc_shared_map and cpu_l2c_shared_map from smpboot.c to cacheinfo.c. Link: https://lkml.kernel.org/r/cover.1656777646.git.sander@svanheule.net Link: https://lkml.kernel.org/r/e8167ddb570f56744a3dc12c2149a660a324d969.1656777646.git.sander@svanheule.net Signed-off-by: Sander Vanheule <sander@svanheule.net> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Marco Elver <elver@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:40 -07:00
Nikolay Borisov	8b5db66798	scripts/bloat-o-meter: add -p argument When doing cross platform development on a machine sometimes it might be useful to invoke bloat-o-meter for files which haven't been build with the native toolchain. In cases when the host nm doesn't support the target one then a toolchain-specific nm could be used. Add this ability by adding the -p allowing invocations as: ./scripts/bloat-o-meter -p riscv64-unknown-linux-gnu- file1.o file2.o Link: https://lkml.kernel.org/r/20220701113513.1938008-2-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:40 -07:00
Nikolay Borisov	b62eb2731e	scripts/bloat-o-meter: switch argument parsing to using argparse This will facilitate further extension to the arguments the script takes. As an added benefit it also produces saner usage output, where mutual exclusivity of the c\|d\|t parameters is clearly visible: ./scripts/bloat-o-meter -h usage: bloat-o-meter [-h] [-c \| -d \| -t] file1 file2 Simple script used to compare the symbol sizes of 2 object files positional arguments: file1 First file to compare file2 Second file to compare optional arguments: -h, --help show this help message and exit -c categorize output based on symbol type -d Show delta of Data Section -t Show delta of text Section Link: https://lkml.kernel.org/r/20220701113513.1938008-1-nborisov@suse.com Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:40 -07:00
Benjamin Segall	a16ceb1396	epoll: autoremove wakers even more aggressively If a process is killed or otherwise exits while having active network connections and many threads waiting on epoll_wait, the threads will all be woken immediately, but not removed from ep->wq. Then when network traffic scans ep->wq in wake_up, every wakeup attempt will fail, and will not remove the entries from the list. This means that the cost of the wakeup attempt is far higher than usual, does not decrease, and this also competes with the dying threads trying to actually make progress and remove themselves from the wq. Handle this by removing visited epoll wq entries unconditionally, rather than only when the wakeup succeeds - the structure of ep_poll means that the only potential loss is the timed_out->eavail heuristic, which now can race and result in a redundant ep_send_events attempt. (But only when incoming data and a timeout actually race, not on every timeout) Shakeel added: : We are seeing this issue in production with real workloads and it has : caused hard lockups. Particularly network heavy workloads with a lot : of threads in epoll_wait() can easily trigger this issue if they get : killed (oom-killed in our case). Link: https://lkml.kernel.org/r/xm26fsjotqda.fsf@google.com Signed-off-by: Ben Segall <bsegall@google.com> Tested-by: Shakeel Butt <shakeelb@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Roman Penyaev <rpenyaev@suse.de> Cc: Jason Baron <jbaron@akamai.com> Cc: Khazhismel Kumykov <khazhy@google.com> Cc: Heiher <r@hev.cc> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:40 -07:00
Yu Zhe	2c795fb03f	ipc/mqueue: remove unnecessary (void) conversion Remove unnecessary void type casting. Link: https://lkml.kernel.org/r/20220628021251.17197-1-yuzhe@nfschina.com Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:40 -07:00
Tao Liu	46d36b1be1	kdump: round up the total memory size to 128M for crashkernel reservation The total memory size we get in kernel is usually slightly less than the actual memory size because BIOS/firmware will reserve some memory region. So it won't export all memory as usable. E.g, on my x86_64 kvm guest with 1G memory, the total_mem value shows: UEFI boot with ovmf: 0x3faef000 Legacy boot kvm guest: 0x3ff7ec00 When specifying crashkernel=1G-2G:128M, if we have a 1G memory machine, we get total size 1023M from firmware. Then it will not fall into 1G-2G, thus no memory reserved. User will never know this, it is hard to let user know the exact total value in kernel. One way is to use dmi/smbios to get physical memory size, but it's not reliable as well. According to Prarit hardware vendors sometimes screw this up. Thus round up total size to 128M to work around this problem. This patch is a resend of [1] and rebased onto v5.19-rc2, and the original credit goes to Dave Young. [1]: http://lists.infradead.org/pipermail/kexec/2018-April/020568.html Link: https://lkml.kernel.org/r/20220627074440.187222-1-ltao@redhat.com Signed-off-by: Tao Liu <ltao@redhat.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:40 -07:00
Alexey Dobriyan	376b0c2661	proc: delete unused <linux/uaccess.h> includes Those aren't necessary after seq files won. Link: https://lkml.kernel.org/r/YqnA3mS7KBt8Z4If@localhost.localdomain Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:39 -07:00
Stephen Brennan	5fd8fea935	vmcoreinfo: include kallsyms symbols The internal kallsyms tables contain information which could be quite useful to a debugging tool in the absence of other debuginfo. If kallsyms is enabled, then a debugging tool could parse it and use it as a fallback symbol table. Combined with BTF data, live & post-mortem debuggers can support basic operations without needing a large DWARF debuginfo file available. As many as five symbols are necessary to properly parse kallsyms names and addresses. Add these to the vmcoreinfo note. CONFIG_KALLSYMS_ABSOLUTE_PERCPU does impact the computation of symbol addresses. However, a debugger can infer this configuration value by comparing the address of _stext in the vmcoreinfo with the address computed via kallsyms. So there's no need to include information about this config value in the vmcoreinfo note. To verify that we're still well below the maximum of 4096 bytes, I created a script[1] to compute a rough upper bound on the possible size of vmcoreinfo. On v5.18-rc7, the script reports 3106 bytes, and with this patch, the maximum become 3370 bytes. [1]: https://github.com/brenns10/kernel_stuff/blob/master/vmcoreinfosize/ Link: https://lkml.kernel.org/r/20220517000508.777145-3-stephen.s.brennan@oracle.com Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Bixuan Cui <cuibixuan@huawei.com> Cc: Dave Young <dyoung@redhat.com> Cc: David Vernet <void@manifault.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:39 -07:00
Stephen Brennan	71f8c15565	kallsyms: move declarations to internal header Patch series "Expose kallsyms data in vmcoreinfo note". The kernel can be configured to contain a lot of introspection or debugging information built-in, such as ORC for unwinding stack traces, BTF for type information, and of course kallsyms. Debuggers could use this information to navigate a core dump or live system, but they need to be able to find it. This patch series adds the necessary symbols into vmcoreinfo, which would allow a debugger to find and interpret the kallsyms table. Using the kallsyms data, the debugger can then lookup any symbol, allowing it to find ORC, BTF, or any other useful data. This would allow a live kernel, or core dump, to be debugged without any DWARF debuginfo. This is useful for many cases: the debuginfo may not have been generated, or you may not want to deploy the large files everywhere you need them. I've demonstrated a proof of concept for this at LSF/MM+BPF during a lighting talk. Using a work-in-progress branch of the drgn debugger, and an extended set of BTF generated by a patched version of dwarves, I've been able to open a core dump without any DWARF info and do basic tasks such as enumerating slab caches, block devices, tasks, and doing backtraces. I hope this series can be a first step toward a new possibility of "DWARFless debugging". Related discussion around the BTF side of this: https://lore.kernel.org/bpf/586a6288-704a-f7a7-b256-e18a675927df@oracle.com/T/#u Some work-in-progress branches using this feature: https://github.com/brenns10/dwarves/tree/remove_percpu_restriction_1 https://github.com/brenns10/drgn/tree/kallsyms_plus_btf This patch (of 2): To include kallsyms data in the vmcoreinfo note, we must make the symbol declarations visible outside of kallsyms.c. Move these to a new internal header file. Link: https://lkml.kernel.org/r/20220517000508.777145-1-stephen.s.brennan@oracle.com Link: https://lkml.kernel.org/r/20220517000508.777145-2-stephen.s.brennan@oracle.com Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Dave Young <dyoung@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Bixuan Cui <cuibixuan@huawei.com> Cc: David Vernet <void@manifault.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:39 -07:00
Colin Ian King	4a70ce5f93	lib/ts_bm.c: remove redundant store to variable consumed after addition There is no need to store the result of the addition back to variable consumed after the addition. The store is redundant, replace += with just + Cleans up clang scan build warning: lib/ts_bm.c:83:11: warning: Although the value stored to 'consumed' is used in the enclosing expression, the value is never actually read from 'consumed' [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220704215325.600993-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:39 -07:00
wuchi	6d529ea80b	lib/scatterlist: use matched parameter type when calling __sg_free_table() commit `4635873c56` ("scsi: lib/sg_pool.c: improve APIs for allocating sg pool") changeed @(bool)skip_first_chunk of __sg_free_table() to @(unsigned int)nents_first_chunk, so use unsigend int type instead of bool type (false -> 0) when calling the function in sg_free_append_table() and sg_free_table(). Link: https://lkml.kernel.org/r/20220629030241.84559-1-wuchi.zero@gmail.com Signed-off-by: wuchi <wuchi.zero@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Cc: Maor Gottlieb <maorg@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:39 -07:00
Tiezhu Yang	2d8867f3e0	lib: make LZ4_decompress_safe_forceExtDict() static LZ4_decompress_safe_forceExtDict() is only used in lib/lz4/lz4_decompress.c, make it static to fix the build warning about "no previous prototype" [1]. [1] https://lore.kernel.org/lkml/202206260948.akgsho1q-lkp@intel.com/ Link: https://lkml.kernel.org/r/1656298965-8698-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:39 -07:00
wuchi	cda83bb8a6	lib/radix-tree: remove unused argument of insert_entries insert_entries() doesn't use the 'bool replace' argument, and the function is only used locally, remove the argument. The historical context of the unused argument is as follow: 2: commit <3a08cd52c37c79> (radix tree: Remove multiorder support) Remove the code related to macro CONFIG_RADIX_TREE_MULTIORDER to convert to the xArray. Without the macro, there is no need to retain the argument. 1: commit <175542f575723e> (radix-tree: add radix_tree_join) Add insert_entries(..., bool replace) function, depending on the macro CONFIG_RADIX_TREE_MULTIORDER definition, the implementation is different. Notice that the implementation without the macro doesn't use the argument. [Matthew Wilcox: add historical context for argument] Link: https://lkml.kernel.org/r/20220625135324.72574-1-wuchi.zero@gmail.com Signed-off-by: wuchi <wuchi.zero@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:38 -07:00
Dan Carpenter	045ed31e23	kfifo: fix kfifo_to_user() return type The kfifo_to_user() macro is supposed to return zero for success or negative error codes. Unfortunately, there is a signedness bug so it returns unsigned int. This only affects callers which try to save the result in ssize_t and as far as I can see the only place which does that is line6_hwdep_read(). TL;DR: s/_uint/_int/. Link: https://lkml.kernel.org/r/YrVL3OJVLlNhIMFs@kili Fixes: `144ecf310e` ("kfifo: fix kfifo_alloc() to return a signed int value") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Stefani Seibold <stefani@seibold.net> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:38 -07:00
Uros Bizjak	43c249ea0b	compiler-gcc.h: remove ancient workaround for gcc PR 58670 The workaround for 'asm goto' miscompilation introduces a compiler barrier quirk that inhibits many useful compiler optimizations. For example, __try_cmpxchg_user compiles to: 11375: 41 8b 4d 00 mov 0x0(%r13),%ecx 11379: 41 8b 02 mov (%r10),%eax 1137c: f0 0f b1 0a lock cmpxchg %ecx,(%rdx) 11380: 0f 94 c2 sete %dl 11383: 84 d2 test %dl,%dl 11385: 75 c4 jne 1134b <...> 11387: 41 89 02 mov %eax,(%r10) where the barrier inhibits flags propagation from asm when compiled with gcc-12. When the mentioned quirk is removed, the following code is generated: 11553: 41 8b 4d 00 mov 0x0(%r13),%ecx 11557: 41 8b 02 mov (%r10),%eax 1155a: f0 0f b1 0a lock cmpxchg %ecx,(%rdx) 1155e: 74 c9 je 11529 <...> 11560: 41 89 02 mov %eax,(%r10) The refered compiler bug: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 was fixed for gcc-4.8.2. Current minimum required version of GCC is version 5.1 which has the above 'asm goto' miscompilation fixed, so remove the workaround. Link: https://lkml.kernel.org/r/20220624141412.72274-1-ubizjak@gmail.com Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:38 -07:00
wuchi	86e5908ec2	lib/error-inject: traverse list with mutex Traversing list without mutex in get_injectable_error_type will race with the following code: list_del_init(&ent->list) kfree(ent) in module_unload_ei_list. So fix that. Link: https://lkml.kernel.org/r/20220620100244.82896-1-wuchi.zero@gmail.com Signed-off-by: wuchi <wuchi.zero@gmail.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:38 -07:00
Vlastimil Babka	f9987921cb	lib/stackdepot: replace CONFIG_STACK_HASH_ORDER with automatic sizing As Linus explained [1], setting the stackdepot hash table size as a config option is suboptimal, especially as stackdepot becomes a dependency of less "expert" subsystems than initially (e.g. DRM, networking, SLUB_DEBUG): : (a) it introduces a new compile-time question that isn't sane to ask : a regular user, but is now exposed to regular users. : (b) this by default uses 1MB of memory for a feature that didn't in : the past, so now if you have small machines you need to make sure you : make a special kernel config for them. Ideally we would employ rhashtable for fully automatic resizing, which should be feasible for many of the new users, but problematic for the original users with restricted context that call __stack_depot_save() with can_alloc == false, i.e. KASAN. However we can easily remove the config option and scale the hash table automatically with system memory. The STACK_HASH_MASK constant becomes stack_hash_mask variable and is used only in one mask operation, so the overhead should be negligible to none. For early allocation we can employ the existing alloc_large_system_hash() function and perform similar scaling for the late allocation. The existing limits of the config option (between 4k and 1M buckets) are preserved, and scaling factor is set to one bucket per 16kB memory so on 64bit the max 1M buckets (8MB memory) is achieved with 16GB system, while a 1GB system will use 512kB. Because KASAN is reported to need the maximum number of buckets even with smaller amounts of memory [2], set it as such when kasan_enabled(). If needed, the automatic scaling could be complemented with a boot-time kernel parameter, but it feels pointless to add it without a specific use case. [1] https://lore.kernel.org/all/CAHk-=wjC5nS+fnf6EzRD9yQRJApAhxx7gRB87ZV+pAWo9oVrTg@mail.gmail.com/ [2] https://lore.kernel.org/all/CACT4Y+Y4GZfXOru2z5tFPzFdaSUd+GFc6KVL=bsa0+1m197cQQ@mail.gmail.com/ Link: https://lkml.kernel.org/r/20220620150249.16814-1-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:38 -07:00
wuchi	62df90b53e	net, lib/once: remove {net_}get_random_once_wait macro DO_ONCE(func, ...) will call func with spinlock which acquired by spin_lock_irqsave in __do_once_start. But the get_random_once_wait will sleep in get_random_bytes_wait -> wait_for_random_bytes. Fortunately, there is no place to use {net_}get_random_once_wait, so we could remove them simply. Link: https://lkml.kernel.org/r/20220619074641.40916-1-wuchi.zero@gmail.com Signed-off-by: wuchi <wuchi.zero@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:37 -07:00
wuchi	5a66fce95b	lib/lru_cache: fix error free handing in lc_create When kmem_cache_alloc in function lc_create returns null, we will free the memory already allocated. The loop of kmem_cache_free is wrong, especially: i = 0 ==> do wrong loop i > 0 ==> do not free element[0] Link: https://lkml.kernel.org/r/20220618082521.7082-1-wuchi.zero@gmail.com Signed-off-by: wuchi <wuchi.zero@gmail.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Cc: Christoph Bhmwalder <christoph.boehmwalder@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:37 -07:00
Dan Moulding	5a704629f2	init: add "hostname" kernel parameter The gethostname system call returns the hostname for the current machine. However, the kernel has no mechanism to initially set the current machine's name in such a way as to guarantee that the first userspace process to call gethostname will receive a meaningful result. It relies on some unspecified userspace process to first call sethostname before gethostname can produce a meaningful name. Traditionally the machine's hostname is set from userspace by the init system. The init system, in turn, often relies on a configuration file (say, /etc/hostname) to provide the value that it will supply in the call to sethostname. Consequently, the file system containing /etc/hostname usually must be available before the hostname will be set. There may, however, be earlier userspace processes that could call gethostname before the file system containing /etc/hostname is mounted. Such a process will get some other, likely meaningless, name from gethostname (such as "(none)", "localhost", or "darkstar"). A real-world example where this can happen, and lead to undesirable results, is with mdadm. When assembling arrays, mdadm distinguishes between "local" arrays and "foreign" arrays. A local array is one that properly belongs to the current machine, and a foreign array is one that is (possibly temporarily) attached to the current machine, but properly belongs to some other machine. To determine if an array is local or foreign, mdadm may compare the "homehost" recorded on the array with the current hostname. If mdadm is run before the root file system is mounted, perhaps because the root file system itself resides on an md-raid array, then /etc/hostname isn't yet available and the init system will not yet have called sethostname, causing mdadm to incorrectly conclude that all of the local arrays are foreign. Solving this problem could be delegated to the init system. It could be left up to the init system (including any init system that starts within an initramfs, if one is in use) to ensure that sethostname is called before any other userspace process could possibly call gethostname. However, it may not always be obvious which processes could call gethostname (for example, udev itself might not call gethostname, but it could via udev rules invoke processes that do). Additionally, the init system has to ensure that the hostname configuration value is stored in some place where it will be readily accessible during early boot. Unfortunately, every init system will attempt to (or has already attempted to) solve this problem in a different, possibly incorrect, way. This makes getting consistently working configurations harder for users. I believe it is better for the kernel to provide the means by which the hostname may be set early, rather than making this a problem for the init system to solve. The option to set the hostname during early startup, via a kernel parameter, provides a simple, reliable way to solve this problem. It also could make system configuration easier for some embedded systems. [dmoulding@me.com: v2] Link: https://lkml.kernel.org/r/20220506060310.7495-2-dmoulding@me.com Link: https://lkml.kernel.org/r/20220505180651.22849-2-dmoulding@me.com Signed-off-by: Dan Moulding <dmoulding@me.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-17 17:31:37 -07:00
akpm	ee56c3e8ee	Merge branch 'master' into mm-nonmm-stable	2022-06-27 10:31:44 -07:00
Linus Torvalds	03c765b0e3	Linux 5.19-rc4	2022-06-26 14:22:10 -07:00

1 2 3 4 5 ...

1105841 Commits