Commit Graph

37201 Commits

Author SHA1 Message Date
Linus Torvalds
43175623dd More tracing updates for 5.15:
- Add migrate-disable counter to tracing header
 
  - Fix error handling in event probes
 
  - Fix missed unlock in osnoise in error path
 
  - Fix merge issue with tools/bootconfig
 
  - Clean up bootconfig data when init memory is removed
 
  - Fix bootconfig to loop only on subkeys
 
  - Have kernel command lines override bootconfig options
 
  - Increase field counts for synthetic events
 
  - Have histograms dynamic allocate event elements to save space
 
  - Fixes in testing and documentation
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYToFZBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtg5AP44U3Dn1m1lQo3y1DJ9kUP3HsAsDofS
 Cv7ZM9tLV2p4MQEA9KJc3/B/5BZEK1kso3uLeLT+WxJOC4YStXY19WwmjAI=
 =Wuo+
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:

 - Add migrate-disable counter to tracing header

 - Fix error handling in event probes

 - Fix missed unlock in osnoise in error path

 - Fix merge issue with tools/bootconfig

 - Clean up bootconfig data when init memory is removed

 - Fix bootconfig to loop only on subkeys

 - Have kernel command lines override bootconfig options

 - Increase field counts for synthetic events

 - Have histograms dynamic allocate event elements to save space

 - Fixes in testing and documentation

* tag 'trace-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/boot: Fix to loop on only subkeys
  selftests/ftrace: Exclude "(fault)" in testing add/remove eprobe events
  tracing: Dynamically allocate the per-elt hist_elt_data array
  tracing: synth events: increase max fields count
  tools/bootconfig: Show whole test command for each test case
  bootconfig: Fix missing return check of xbc_node_compose_key function
  tools/bootconfig: Fix tracing_on option checking in ftrace2bconf.sh
  docs: bootconfig: Add how to use bootconfig for kernel parameters
  init/bootconfig: Reorder init parameter from bootconfig and cmdline
  init: bootconfig: Remove all bootconfig data when the init memory is removed
  tracing/osnoise: Fix missed cpus_read_unlock() in start_per_cpu_kthreads()
  tracing: Fix some alloc_event_probe() error handling bugs
  tracing: Add migrate-disabled counter to tracing output.
2021-09-09 13:11:15 -07:00
Linus Torvalds
a3fa7a101d Merge branches 'akpm' and 'akpm-hotfixes' (patches from Andrew)
Merge yet more updates and hotfixes from Andrew Morton:
 "Post-linux-next material, based upon latest upstream to catch the
  now-merged dependencies:

   - 10 patches.

     Subsystems affected by this patch series: mm (vmstat and migration)
     and compat.

  And bunch of hotfixes, mostly cc:stable:

   - 8 patches.

     Subsystems affected by this patch series: mm (hmm, hugetlb, vmscan,
     pagealloc, pagemap, kmemleak, mempolicy, and memblock)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  arch: remove compat_alloc_user_space
  compat: remove some compat entry points
  mm: simplify compat numa syscalls
  mm: simplify compat_sys_move_pages
  kexec: avoid compat_alloc_user_space
  kexec: move locking into do_kexec_load
  mm: migrate: change to use bool type for 'page_was_mapped'
  mm: migrate: fix the incorrect function name in comments
  mm: migrate: introduce a local variable to get the number of pages
  mm/vmstat: protect per cpu variables with preempt disable on RT

* emailed hotfixes from Andrew Morton <akpm@linux-foundation.org>:
  nds32/setup: remove unused memblock_region variable in setup_memory()
  mm/mempolicy: fix a race between offset_il_node and mpol_rebind_task
  mm/kmemleak: allow __GFP_NOLOCKDEP passed to kmemleak's gfp
  mmap_lock: change trace and locking order
  mm/page_alloc.c: avoid accessing uninitialized pcp page migratetype
  mm,vmscan: fix divide by zero in get_scan_count
  mm/hugetlb: initialize hugetlb_usage in mm_init
  mm/hmm: bypass devmap pte when all pfn requested flags are fulfilled
2021-09-08 18:52:05 -07:00
Liu Zixian
13db8c5047 mm/hugetlb: initialize hugetlb_usage in mm_init
After fork, the child process will get incorrect (2x) hugetlb_usage.  If
a process uses 5 2MB hugetlb pages in an anonymous mapping,

	HugetlbPages:	   10240 kB

and then forks, the child will show,

	HugetlbPages:	   20480 kB

The reason for double the amount is because hugetlb_usage will be copied
from the parent and then increased when we copy page tables from parent
to child.  Child will have 2x actual usage.

Fix this by adding hugetlb_count_init in mm_init.

Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com
Fixes: 5d317b2b65 ("mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status")
Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 18:45:53 -07:00
Arnd Bergmann
a7a08b275a arch: remove compat_alloc_user_space
All users of compat_alloc_user_space() and copy_in_user() have been
removed from the kernel, only a few functions in sparc remain that can be
changed to calling arch_copy_in_user() instead.

Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 15:32:35 -07:00
Arnd Bergmann
59ab844eed compat: remove some compat entry points
These are all handled correctly when calling the native system call entry
point, so remove the special cases.

Link: https://lkml.kernel.org/r/20210727144859.4150043-6-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 15:32:35 -07:00
Arnd Bergmann
5d700a0fd7 kexec: avoid compat_alloc_user_space
kimage_alloc_init() expects a __user pointer, so compat_sys_kexec_load()
uses compat_alloc_user_space() to convert the layout and put it back onto
the user space caller stack.

Moving the user space access into the syscall handler directly actually
makes the code simpler, as the conversion for compat mode can now be done
on kernel memory.

Link: https://lkml.kernel.org/r/20210727144859.4150043-3-arnd@kernel.org
Link: https://lore.kernel.org/lkml/YPbtsU4GX6PL7%2F42@infradead.org/
Link: https://lore.kernel.org/lkml/m1y2cbzmnw.fsf@fess.ebiederm.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Co-developed-by: Eric Biederman <ebiederm@xmission.com>
Co-developed-by: Christoph Hellwig <hch@infradead.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 15:32:34 -07:00
Arnd Bergmann
4b692e8616 kexec: move locking into do_kexec_load
Patch series "compat: remove compat_alloc_user_space", v5.

Going through compat_alloc_user_space() to convert indirect system call
arguments tends to add complexity compared to handling the native and
compat logic in the same code.

This patch (of 6):

The locking is the same between the native and compat version of
sys_kexec_load(), so it can be done in the common implementation to reduce
duplication.

Link: https://lkml.kernel.org/r/20210727144859.4150043-1-arnd@kernel.org
Link: https://lkml.kernel.org/r/20210727144859.4150043-2-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Co-developed-by: Eric Biederman <ebiederm@xmission.com>
Co-developed-by: Christoph Hellwig <hch@infradead.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 15:32:34 -07:00
Linus Torvalds
2d338201d5 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "147 patches, based on 7d2a07b769.

  Subsystems affected by this patch series: mm (memory-hotplug, rmap,
  ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
  alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
  checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
  selftests, ipc, and scripts"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
  scripts: check_extable: fix typo in user error message
  mm/workingset: correct kernel-doc notations
  ipc: replace costly bailout check in sysvipc_find_ipc()
  selftests/memfd: remove unused variable
  Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
  configs: remove the obsolete CONFIG_INPUT_POLLDEV
  prctl: allow to setup brk for et_dyn executables
  pid: cleanup the stale comment mentioning pidmap_init().
  kernel/fork.c: unexport get_{mm,task}_exe_file
  coredump: fix memleak in dump_vma_snapshot()
  fs/coredump.c: log if a core dump is aborted due to changed file permissions
  nilfs2: use refcount_dec_and_lock() to fix potential UAF
  nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
  nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
  nilfs2: fix NULL pointer in nilfs_##name##_attr_release
  nilfs2: fix memory leak in nilfs_sysfs_create_device_group
  trap: cleanup trap_init()
  init: move usermodehelper_enable() to populate_rootfs()
  ...
2021-09-08 12:55:35 -07:00
Masami Hiramatsu
cfd799837d tracing/boot: Fix to loop on only subkeys
Since the commit e5efaeb8a8 ("bootconfig: Support mixing
a value and subkeys under a key") allows to co-exist a value
node and key nodes under a node, xbc_node_for_each_child()
is not only returning key node but also a value node.
In the boot-time tracing using xbc_node_for_each_child() to
iterate the events, groups and instances, but those must be
key nodes. Thus it must use xbc_node_for_each_subkey().

Link: https://lkml.kernel.org/r/163112988361.74896.2267026262061819145.stgit@devnote2

Fixes: e5efaeb8a8 ("bootconfig: Support mixing a value and subkeys under a key")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-08 15:44:32 -04:00
Tom Zanussi
c910db943d tracing: Dynamically allocate the per-elt hist_elt_data array
Setting the hist_elt_data.field_var_str[] array unconditionally to a
size of SYNTH_FIELD_MAX elements wastes space unnecessarily.  The
actual number of elements needed can be calculated at run-time
instead.

In most cases, this will save a lot of space since it's a per-elt
array which isn't normally close to being full.  It also allows us to
increase SYNTH_FIELD_MAX without worrying about even more wastage when
we do that.

Link: https://lkml.kernel.org/r/d52ae0ad5e1b59af7c4f54faf3fc098461fd82b3.camel@kernel.org

Signed-off-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-08 15:29:16 -04:00
Artem Bityutskiy
0be083cee4 tracing: synth events: increase max fields count
Sometimes it is useful to construct larger synthetic trace events. Increase
'SYNTH_FIELDS_MAX' (maximum number of fields in a synthetic event) from 32 to
64.

Link: https://lkml.kernel.org/r/20210901135513.3087062-1-dedekind1@gmail.com

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-08 15:29:16 -04:00
Qiang.Zhang
4b6b08f2e4 tracing/osnoise: Fix missed cpus_read_unlock() in start_per_cpu_kthreads()
When start_kthread() return error, the cpus_read_unlock() need
to be called.

Link: https://lkml.kernel.org/r/20210831022919.27630-1-qiang.zhang@windriver.com

Cc: <stable@vger.kernel.org>
Fixes: c8895e271f ("trace/osnoise: Support hotplug operations")
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Qiang.Zhang <qiang.zhang@windriver.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-08 15:10:24 -04:00
Cyrill Gorcunov
e1fbbd0731 prctl: allow to setup brk for et_dyn executables
Keno Fischer reported that when a binray loaded via ld-linux-x the
prctl(PR_SET_MM_MAP) doesn't allow to setup brk value because it lays
before mm:end_data.

For example a test program shows

 | # ~/t
 |
 | start_code      401000
 | end_code        401a15
 | start_stack     7ffce4577dd0
 | start_data	   403e10
 | end_data        40408c
 | start_brk	   b5b000
 | sbrk(0)         b5b000

and when executed via ld-linux

 | # /lib64/ld-linux-x86-64.so.2 ~/t
 |
 | start_code      7fc25b0a4000
 | end_code        7fc25b0c4524
 | start_stack     7fffcc6b2400
 | start_data	   7fc25b0ce4c0
 | end_data        7fc25b0cff98
 | start_brk	   55555710c000
 | sbrk(0)         55555710c000

This of course prevent criu from restoring such programs.  Looking into
how kernel operates with brk/start_brk inside brk() syscall I don't see
any problem if we allow to setup brk/start_brk without checking for
end_data.  Even if someone pass some weird address here on a purpose then
the worst possible result will be an unexpected unmapping of existing vma
(own vma, since prctl works with the callers memory) but test for
RLIMIT_DATA is still valid and a user won't be able to gain more memory in
case of expanding VMAs via new values shipped with prctl call.

Link: https://lkml.kernel.org/r/20210121221207.GB2174@grain
Fixes: bbdc6076d2 ("binfmt_elf: move brk out of mmap when doing direct loader exec")
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reported-by: Keno Fischer <keno@juliacomputing.com>
Acked-by: Andrey Vagin <avagin@gmail.com>
Tested-by: Andrey Vagin <avagin@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:28 -07:00
Christoph Hellwig
05da8113c9 kernel/fork.c: unexport get_{mm,task}_exe_file
Only used by core code and the tomoyo which can't be a module either.

Link: https://lkml.kernel.org/r/20210820095430.445242-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:28 -07:00
Nicholas Piggin
1e1c15839d fs/epoll: use a per-cpu counter for user's watches count
This counter tracks the number of watches a user has, to compare against
the 'max_user_watches' limit. This causes a scalability bottleneck on
SPECjbb2015 on large systems as there is only one user. Changing to a
per-cpu counter increases throughput of the benchmark by about 30% on a
16-socket, > 1000 thread system.

[rdunlap@infradead.org: fix build errors in kernel/user.c when CONFIG_EPOLL=n]
[npiggin@gmail.com: move ifdefs into wrapper functions, slightly improve panic message]
  Link: https://lkml.kernel.org/r/1628051945.fens3r99ox.astroid@bobo.none
[akpm@linux-foundation.org: tweak user_epoll_alloc(), per Guenter]
  Link: https://lkml.kernel.org/r/20210804191421.GA1900577@roeck-us.net

Link: https://lkml.kernel.org/r/20210802032013.2751916-1-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reported-by: Anton Blanchard <anton@ozlabs.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:27 -07:00
Pavel Skripkin
2d186afd04 profiling: fix shift-out-of-bounds bugs
Syzbot reported shift-out-of-bounds bug in profile_init().
The problem was in incorrect prof_shift. Since prof_shift value comes from
userspace we need to clamp this value into [0, BITS_PER_LONG -1]
boundaries.

Second possible shiht-out-of-bounds was found by Tetsuo:
sample_step local variable in read_profile() had "unsigned int" type,
but prof_shift allows to make a BITS_PER_LONG shift. So, to prevent
possible shiht-out-of-bounds sample_step type was changed to
"unsigned long".

Also, "unsigned short int" will be sufficient for storing
[0, BITS_PER_LONG] value, that's why there is no need for
"unsigned long" prof_shift.

Link: https://lkml.kernel.org/r/20210813140022.5011-1-paskripkin@gmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-and-tested-by: syzbot+e68c89a9510c159d9684@syzkaller.appspotmail.com
Suggested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:26 -07:00
Yang Yang
3c91dda97e kernel/acct.c: use dedicated helper to access rlimit values
Use rlimit() helper instead of manually writing whole chain from
task to rlimit value. See patch "posix-cpu-timers: Use dedicated
helper to access rlimit values".

Link: https://lkml.kernel.org/r/20210728030822.524789-1-yang.yang29@zte.com.cn
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: sh_def@163.com <sh_def@163.com>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 11:50:26 -07:00
Dan Carpenter
5615e088b4 tracing: Fix some alloc_event_probe() error handling bugs
There are two bugs in this code.  First, if the kzalloc() fails it leads
to a NULL dereference of "ep" on the next line.  Second, if the
alloc_event_probe() function returns an error then it leads to an
error pointer dereference in the caller.

Link: https://lkml.kernel.org/r/20210824115150.GI31143@kili

Fixes: 7491e2c442 ("tracing: Add a probe that attaches to trace events")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-07 20:58:23 -04:00
Linus Torvalds
996fe06160 kgdb patches for 5.15
Changes for kgdb/kdb this cycle are dominated by a change from
 Sumit that removes as small (256K) private heap from kdb. This is
 change I've hoped for ever since I discovered how few users of this
 heap remained in the kernel, so many thanks to Sumit for hunting
 these down. Other change is an incremental step towards SPDX headers.
 
 Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAmE2OuUACgkQfOMlXTn3
 iKEcTQ/7BiMPnSzPeh7srNIG1a7/ig8CUtu+qMsrU50BUvCBzhas3+fGbzlZHV4p
 GwCGTuRDWhVJjntpCkLYroSSF0h/AhN68pxqUx6EJUWEYMGuuMBuVtAdXYMSYbse
 MW+Jw6Tec2fWPyw6lyWd0vwOKLzhf/pC8axbmGESzgdmZrsWeA4XEeF5XRHsHyJf
 pL4GvBXN/mzH2TSQNQmSgda5VLnK3m6B66tlT17mywG8QTKwID8I4LXs9doNwqeW
 6fkwJswAsCa+nO1GjMsRcA8HoUd7mtqLCYbjZzFVQdFJ6kF9eBKSPvgbaqr6oNvv
 t6VBfm908+fM4/4yEQSuwVPT32IPe7I/Bk+aUqnSpPiXU8v75u0lIfB929O/DS07
 X+NjqpXN7v6mEpLAFgWhqPyzTODw40OhpOi1uf9jzPB/OSNm9y1zs21FJQSx6KGa
 AahMHrRoeZwlxNdD/2RBf9pQCSBWF5R1DyIye2iQX+e/jrvTXHMCK33tpcbcE9Cl
 bMVqnnJ4Pnmul6hAAf9WlJduIrACV1Oq1iQrtXWE3wdFZdQuvqKvwHUK/Zko+O0f
 cdW6neyW7Slo+8q2qXuEE0Bp+Ae61oZTrGIdfj29ZdXwp0+sBzcA5CFO7MKPJllV
 yEq41Bxc3cgJPcsFMZWSaHcEAI5fq1iBmquVdCA0/vYbP9dsO1U=
 =iMIF
 -----END PGP SIGNATURE-----

Merge tag 'kgdb-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux

Pull kgdb updates from Daniel Thompson:
 "Changes for kgdb/kdb this cycle are dominated by a change from Sumit
  that removes as small (256K) private heap from kdb. This is change
  I've hoped for ever since I discovered how few users of this heap
  remained in the kernel, so many thanks to Sumit for hunting these
  down.

  The other change is an incremental step towards SPDX headers"

* tag 'kgdb-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kernel: debug: Convert to SPDX identifier
  kdb: Rename members of struct kdbtab_t
  kdb: Simplify kdb_defcmd macro logic
  kdb: Get rid of redundant kdb_register_flags()
  kdb: Rename struct defcmd_set to struct kdb_macro
  kdb: Get rid of custom debug heap allocator
2021-09-07 12:08:04 -07:00
Cai Huoqing
f8416aa291 kernel: debug: Convert to SPDX identifier
use SPDX-License-Identifier instead of a verbose license text

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210906112302.937-1-caihuoqing@baidu.com
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2021-09-06 14:31:11 +01:00
Linus Torvalds
58ca241587 Tracing updates for 5.15:
- Simplifying the Kconfig use of FTRACE and TRACE_IRQFLAGS_SUPPORT
 
  - bootconfig now can start histograms
 
  - bootconfig supports group/all enabling
 
  - histograms now can put values in linear size buckets
 
  - execnames can be passed to synthetic events
 
  - Introduction of "event probes" that attach to other events and
    can retrieve data from pointers of fields, or record fields
    as different types (a pointer to a string as a string instead
    of just a hex number)
 
  - Various fixes and clean ups
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYTJDixQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnPLAP9XviWrZD27uFj6LU/Vp2umbq8la1aC
 oW8o9itUGpLoHQD+OtsMpQXsWrxoNw/JD1OWCH4J0YN+TnZAUUG2E9e0twA=
 =OZXG
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - simplify the Kconfig use of FTRACE and TRACE_IRQFLAGS_SUPPORT

 - bootconfig can now start histograms

 - bootconfig supports group/all enabling

 - histograms now can put values in linear size buckets

 - execnames can be passed to synthetic events

 - introduce "event probes" that attach to other events and can retrieve
   data from pointers of fields, or record fields as different types (a
   pointer to a string as a string instead of just a hex number)

 - various fixes and clean ups

* tag 'trace-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (35 commits)
  tracing/doc: Fix table format in histogram code
  selftests/ftrace: Add selftest for testing duplicate eprobes and kprobes
  selftests/ftrace: Add selftest for testing eprobe events on synthetic events
  selftests/ftrace: Add test case to test adding and removing of event probe
  selftests/ftrace: Fix requirement check of README file
  selftests/ftrace: Add clear_dynamic_events() to test cases
  tracing: Add a probe that attaches to trace events
  tracing/probes: Reject events which have the same name of existing one
  tracing/probes: Have process_fetch_insn() take a void * instead of pt_regs
  tracing/probe: Change traceprobe_set_print_fmt() to take a type
  tracing/probes: Use struct_size() instead of defining custom macros
  tracing/probes: Allow for dot delimiter as well as slash for system names
  tracing/probe: Have traceprobe_parse_probe_arg() take a const arg
  tracing: Have dynamic events have a ref counter
  tracing: Add DYNAMIC flag for dynamic events
  tracing: Replace deprecated CPU-hotplug functions.
  MAINTAINERS: Add an entry for os noise/latency
  tracepoint: Fix kerneldoc comments
  bootconfig/tracing/ktest: Update ktest example for boot-time tracing
  tools/bootconfig: Use per-group/all enable option in ftrace2bconf script
  ...
2021-09-05 11:50:41 -07:00
Linus Torvalds
49624efa65 Merge tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux
Pull MAP_DENYWRITE removal from David Hildenbrand:
 "Remove all in-tree usage of MAP_DENYWRITE from the kernel and remove
  VM_DENYWRITE.

  There are some (minor) user-visible changes:

   - We no longer deny write access to shared libaries loaded via legacy
     uselib(); this behavior matches modern user space e.g. dlopen().

   - We no longer deny write access to the elf interpreter after exec
     completed, treating it just like shared libraries (which it often
     is).

   - We always deny write access to the file linked via /proc/pid/exe:
     sys_prctl(PR_SET_MM_MAP/EXE_FILE) will fail if write access to the
     file cannot be denied, and write access to the file will remain
     denied until the link is effectivel gone (exec, termination,
     sys_prctl(PR_SET_MM_MAP/EXE_FILE)) -- just as if exec'ing the file.

  Cross-compiled for a bunch of architectures (alpha, microblaze, i386,
  s390x, ...) and verified via ltp that especially the relevant tests
  (i.e., creat07 and execve04) continue working as expected"

* tag 'denywrite-for-5.15' of git://github.com/davidhildenbrand/linux:
  fs: update documentation of get_write_access() and friends
  mm: ignore MAP_DENYWRITE in ksys_mmap_pgoff()
  mm: remove VM_DENYWRITE
  binfmt: remove in-tree usage of MAP_DENYWRITE
  kernel/fork: always deny write access to current MM exe_file
  kernel/fork: factor out replacing the current MM exe_file
  binfmt: don't use MAP_DENYWRITE when loading shared libraries via uselib()
2021-09-04 11:35:47 -07:00
Thomas Gleixner
54357f0c91 tracing: Add migrate-disabled counter to tracing output.
migrate_disable() forbids task migration to another CPU. It is available
since v5.11 and has already users such as highmem or BPF. It is useful
to observe this task state in tracing which already has other states
like the preemption counter.

Instead of adding the migrate disable counter as a new entry to struct
trace_entry, which would extend the whole struct by four bytes, it is
squashed into the preempt-disable counter. The lower four bits represent
the preemption counter, the upper four bits represent the migrate
disable counter. Both counter shouldn't exceed 15 but if they do, there
is a safety net which caps the value at 15.

Add the migrate-disable counter to the trace entry so it shows up in the
trace. Due to the users mentioned above, it is already possible to
observe it:

|  bash-1108    [000] ...21    73.950578: rss_stat: mm_id=2213312838 curr=0 type=MM_ANONPAGES size=8192B
|  bash-1108    [000] d..31    73.951222: irq_disable: caller=flush_tlb_mm_range+0x115/0x130 parent=ptep_clear_flush+0x42/0x50
|  bash-1108    [000] d..31    73.951222: tlb_flush: pages:1 reason:local mm shootdown (3)

The last value is the migrate-disable counter.

Things that popped up:
- trace_print_lat_context() does not print the migrate counter. Not sure
  if it should. It is used in "verbose" mode and uses 8 digits and I'm
  not sure ther is something processing the value.

- trace_define_common_fields() now defines a different variable. This
  probably breaks things. No ide what to do in order to preserve the old
  behaviour. Since this is used as a filter it should be split somehow
  to be able to match both nibbles here.

Link: https://lkml.kernel.org/r/20210810132625.ylssabmsrkygokuv@linutronix.de

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[bigeasy: patch description.]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ SDR: Removed change to common_preempt_count field name ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-03 19:42:35 -04:00
Linus Torvalds
b250e6d141 Kbuild updates for v5.15
- Add -s option (strict mode) to merge_config.sh to make it fail when
    any symbol is redefined.
 
  - Show a warning if a different compiler is used for building external
    modules.
 
  - Infer --target from ARCH for CC=clang to let you cross-compile the
    kernel without CROSS_COMPILE.
 
  - Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
 
  - Add <linux/stdarg.h> to the kernel source instead of borrowing
    <stdarg.h> from the compiler.
 
  - Add Nick Desaulniers as a Kbuild reviewer.
 
  - Drop stale cc-option tests.
 
  - Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
    to handle symbols in inline assembly.
 
  - Show a warning if 'FORCE' is missing for if_changed rules.
 
  - Various cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmExXHoVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGAZwP/iHdEZzuQ4cz2uXUaV0fevj9jjPU
 zJ8wrrNabAiT6f5x861DsARQSR4OSt3zN0tyBNgZwUdotbe7ED5GegrgIUBMWlML
 QskhTEIZj7TexAX/20vx671gtzI3JzFg4c9BuriXCFRBvychSevdJPr65gMDOesL
 vOJnXe+SGXG2+fPWi/PxrcOItNRcveqo2GiWHT3g0Cv/DJUulu81gEkz3hrufnMR
 cjMeSkV0nJJcvI755OQBOUnEuigW64k4m2WxHPG24tU8cQOCqV6lqwOfNQBAn4+F
 OoaCMyPQT9gvGYwGExQMCXGg0wbUt1qnxzOVoA2qFCwbo+MFhqjBvPXab6VJm7CE
 mY3RrTtvxSqBdHI6EGcYeLjhycK9b+LLoJ1qc3S9FK8It6NoFFp4XV0R6ItPBls7
 mWi9VSpyI6k0AwLq+bGXEHvaX/bnnf/vfqn8H+w6mRZdXjFV8EB2DiOSRX/OqjVG
 RnvTtXzWWThLyXvWR3Jox4+7X6728oL7akLemoeZI6oTbJDm7dQgwpz5HbSyHXLh
 d+gUF3Y/6lqxT5N9GSVDxpD1bEMh2I7nGQ4M7WGbGas/3yUemF8wbBqGQo4a+YeD
 d9vGAUxDp2PQTtL2sjFo5Gd4PZEM9g7vwWzRvHe0o5NxKEXcBg25b8cD1hxrN9Y4
 Y1AAnc0kLO+My3PC
 =lw3M
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Add -s option (strict mode) to merge_config.sh to make it fail when
   any symbol is redefined.

 - Show a warning if a different compiler is used for building external
   modules.

 - Infer --target from ARCH for CC=clang to let you cross-compile the
   kernel without CROSS_COMPILE.

 - Make the integrated assembler default (LLVM_IAS=1) for CC=clang.

 - Add <linux/stdarg.h> to the kernel source instead of borrowing
   <stdarg.h> from the compiler.

 - Add Nick Desaulniers as a Kbuild reviewer.

 - Drop stale cc-option tests.

 - Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
   to handle symbols in inline assembly.

 - Show a warning if 'FORCE' is missing for if_changed rules.

 - Various cleanups

* tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
  kbuild: redo fake deps at include/ksym/*.h
  kbuild: clean up objtool_args slightly
  modpost: get the *.mod file path more simply
  checkkconfigsymbols.py: Fix the '--ignore' option
  kbuild: merge vmlinux_link() between ARCH=um and other architectures
  kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh
  kbuild: merge vmlinux_link() between the ordinary link and Clang LTO
  kbuild: remove stale *.symversions
  kbuild: remove unused quiet_cmd_update_lto_symversions
  gen_compile_commands: extract compiler command from a series of commands
  x86: remove cc-option-yn test for -mtune=
  arc: replace cc-option-yn uses with cc-option
  s390: replace cc-option-yn uses with cc-option
  ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild
  sparc: move the install rule to arch/sparc/Makefile
  security: remove unneeded subdir-$(CONFIG_...)
  kbuild: sh: remove unused install script
  kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y
  kbuild: Switch to 'f' variants of integrated assembler flag
  kbuild: Shuffle blank line to improve comment meaning
  ...
2021-09-03 15:33:47 -07:00
Linus Torvalds
7cca308cfd powerpc updates for 5.15
- Convert pseries & powernv to use MSI IRQ domains.
 
  - Rework the pseries CPU numbering so that CPUs that are removed, and later re-added, are
    given a CPU number on the same node as previously, when possible.
 
  - Add support for a new more flexible device-tree format for specifying NUMA distances.
 
  - Convert powerpc to GENERIC_PTDUMP.
 
  - Retire sbc8548 and sbc8641d board support.
 
  - Various other small features and fixes.
 
 Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard, Cédric Le Goater,
 Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas, Fangrui Song, Finn Thain, Gautham R.
 Shenoy, Hari Bathini, Joel Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo
 Bras, Lukas Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan Chancellor,
 Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R. Sampat, Randy Dunlap, Sebastian
 Andrzej Siewior, Srikar Dronamraju, Wan Jiabing, Xiongwei Song, Zheng Yongjun.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmEyHTYTHG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDo3D/9aXMVP2wsEMNB0XhTiJ1UUdi311Uq9
 PvkAaGZH14ZqZLVigeiD3gt6YzTH0cEuGj6qgwsJrPDjF8FESnMbBsprMLr5/qE1
 itWRGMAMCFaeTcB9ogYVJkzwg6RN2ZgIqoq4NVswNSXoAQGWb+1bvXq3RnXXNuGR
 TQmLL02poNC6nX0YbRaQoT1Xx4nfUTiKHhU+Aok9uOCMJIyYZVATR6Qafb7/j7tO
 UvjwOHztbu84lcJOGmSnw4LcmwNORLuP9IwR0r+O1M3ijEZqDo9TPkvtSz8HZwjU
 mxdJwhrUmN0euMcghuiFxW+1XG2eM49ugsdJugiezG2RaIijbIp0nAIvdeaKAgT1
 OSSwvWCQ0fkTPyLXE+O6tVqMhlUMdqQlRcyNwmN9svIip9VnwGNq3vA4ePlJm6Fi
 i0i/tLqVNlJwFokZ7blW5g8SRgGRuFfXd5XUYLFvy5Teez+/7b1mW95gPQZSJ8kV
 Tbx2e0nHAPX4hCAxJ1AB3/zTlnjY+4+WJ9bD5XdgXkeVE8PPh1BEkulhMi1R1OMj
 57D1W6OgsBu/Pze78wjAvwO8+NAb1T/2mv2Bd/LY6Q+7hNDqOOhuajyBTxbH41FG
 sqx5bKjKOwgTybfV9A0Eo0e4FQBX07yXltBFHaPlyA4sOsIhM59+PxNrEwN1eZrQ
 LVVsdBXg8pHxrw==
 =EbN0
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Convert pseries & powernv to use MSI IRQ domains.

 - Rework the pseries CPU numbering so that CPUs that are removed, and
   later re-added, are given a CPU number on the same node as
   previously, when possible.

 - Add support for a new more flexible device-tree format for specifying
   NUMA distances.

 - Convert powerpc to GENERIC_PTDUMP.

 - Retire sbc8548 and sbc8641d board support.

 - Various other small features and fixes.

Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Anton Blanchard,
Cédric Le Goater, Christophe Leroy, Emmanuel Gil Peyrot, Fabiano Rosas,
Fangrui Song, Finn Thain, Gautham R.  Shenoy, Hari Bathini, Joel
Stanley, Jordan Niethe, Kajol Jain, Laurent Dufour, Leonardo Bras, Lukas
Bulwahn, Marc Zyngier, Masahiro Yamada, Michal Suchanek, Nathan
Chancellor, Nicholas Piggin, Parth Shah, Paul Gortmaker, Pratik R.
Sampat, Randy Dunlap, Sebastian Andrzej Siewior, Srikar Dronamraju, Wan
Jiabing, Xiongwei Song, and Zheng Yongjun.

* tag 'powerpc-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (154 commits)
  powerpc/bug: Cast to unsigned long before passing to inline asm
  powerpc/ptdump: Fix generic ptdump for 64-bit
  KVM: PPC: Fix clearing never mapped TCEs in realmode
  powerpc/pseries/iommu: Rename "direct window" to "dma window"
  powerpc/pseries/iommu: Make use of DDW for indirect mapping
  powerpc/pseries/iommu: Find existing DDW with given property name
  powerpc/pseries/iommu: Update remove_dma_window() to accept property name
  powerpc/pseries/iommu: Reorganize iommu_table_setparms*() with new helper
  powerpc/pseries/iommu: Add ddw_property_create() and refactor enable_ddw()
  powerpc/pseries/iommu: Allow DDW windows starting at 0x00
  powerpc/pseries/iommu: Add ddw_list_new_entry() helper
  powerpc/pseries/iommu: Add iommu_pseries_alloc_table() helper
  powerpc/kernel/iommu: Add new iommu_table_in_use() helper
  powerpc/pseries/iommu: Replace hard-coded page shift
  powerpc/numa: Update cpu_cpu_map on CPU online/offline
  powerpc/numa: Print debug statements only when required
  powerpc/numa: convert printk to pr_xxx
  powerpc/numa: Drop dbg in favour of pr_debug
  powerpc/smp: Enable CACHE domain for shared processor
  powerpc/smp: Update cpu_core_map on all PowerPc systems
  ...
2021-09-03 11:22:50 -07:00
Linus Torvalds
50ddcdb263 Livepatching changes for 5.15
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmEwmF0ACgkQUqAMR0iA
 lPI/0xAAkQX+LQMIkQ7++Gff4qqdDofMa7+jHlpYoPxF9siGqcr2Y85hfV9BC1d/
 wVGHE0sK8cq+xGfIqTARm4AKrktIwgAWuPVWeHtiVFP/H+9NO6CMq5lrA02hAiJg
 efnUtggRhUbTbURWqrvJK4MEoNwmKI+PuwwHYrU8cfvbi13W0P5Z2s02nztLljvc
 BdoeEubA4siJxFNvJ35dekh/wmz/5aMUC2lkUJgBrhO1u9ve6TCuMXGz3GG3j3RO
 bYepk/qCV7J2PD1U56wh93IDq0MNt26jDln63/rDi3r0JW8L/dbiS8oCllw5Se5o
 ZgjXWHaZk88konFVlY2LlYhmQTPA6XPKOH5NGNkeprnqYlT3zf2njHkzP/qmxg9/
 uJ1D59hXnCRcKkIF5OSM4OHPWtHtocZ34N46mFvuJlk1X3xuaj9Fu28pcf+tYt0+
 0YcMbrthSJ2jzhkvTNX5MUIh5d2NSryV4rU1Lp+s/sBmL8C8J5t04jO77+mIMNsV
 KIkWUuto3bAAPDOb4UfD0FwYMfof4toBXis9qmB9SvwYAooH9rwIlXIZ8pZq632v
 eDGCMk7sdeNLX0CQ6Uz2hcO80S8N0pPl2mw+1sXfNNL9HiquVj0ypeKbVtahnei8
 TtUfNN0vz1hyIGrDvNIsVl2bN/twWgsl+U0nlh+c4vtqt/1JA/0=
 =/yd8
 -----END PGP SIGNATURE-----

Merge tag 'livepatching-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching update from Petr Mladek.

* tag 'livepatching-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  livepatch: Replace deprecated CPU-hotplug functions.
2021-09-03 10:57:25 -07:00
Linus Torvalds
3de18c865f Merge branch 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb updates from Konrad Rzeszutek Wilk:
 "A new feature called restricted DMA pools. It allows SWIOTLB to
  utilize per-device (or per-platform) allocated memory pools instead of
  using the global one.

  The first big user of this is ARM Confidential Computing where the
  memory for DMA operations can be set per platform"

* 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: (23 commits)
  swiotlb: use depends on for DMA_RESTRICTED_POOL
  of: restricted dma: Don't fail device probe on rmem init failure
  of: Move of_dma_set_restricted_buffer() into device.c
  powerpc/svm: Don't issue ultracalls if !mem_encrypt_active()
  s390/pv: fix the forcing of the swiotlb
  swiotlb: Free tbl memory in swiotlb_exit()
  swiotlb: Emit diagnostic in swiotlb_exit()
  swiotlb: Convert io_default_tlb_mem to static allocation
  of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS
  swiotlb: add overflow checks to swiotlb_bounce
  swiotlb: fix implicit debugfs declarations
  of: Add plumbing for restricted DMA pool
  dt-bindings: of: Add restricted DMA pool
  swiotlb: Add restricted DMA pool initialization
  swiotlb: Add restricted DMA alloc/free support
  swiotlb: Refactor swiotlb_tbl_unmap_single
  swiotlb: Move alloc_size to swiotlb_find_slots
  swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
  swiotlb: Update is_swiotlb_active to add a struct device argument
  swiotlb: Update is_swiotlb_buffer to add a struct device argument
  ...
2021-09-03 10:34:44 -07:00
Linus Torvalds
14726903c8 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "173 patches.

  Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
  pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
  bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
  hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
  oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
  mm/madvise: add MADV_WILLNEED to process_madvise()
  mm/vmstat: remove unneeded return value
  mm/vmstat: simplify the array size calculation
  mm/vmstat: correct some wrong comments
  mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
  selftests: vm: add COW time test for KSM pages
  selftests: vm: add KSM merging time test
  mm: KSM: fix data type
  selftests: vm: add KSM merging across nodes test
  selftests: vm: add KSM zero page merging test
  selftests: vm: add KSM unmerge test
  selftests: vm: add KSM merge test
  mm/migrate: correct kernel-doc notation
  mm: wire up syscall process_mrelease
  mm: introduce process_mrelease system call
  memblock: make memblock_find_in_range method private
  mm/mempolicy.c: use in_task() in mempolicy_slab_node()
  mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
  mm/mempolicy: advertise new MPOL_PREFERRED_MANY
  mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
  ...
2021-09-03 10:08:28 -07:00
Suren Baghdasaryan
dce4910396 mm: wire up syscall process_mrelease
Split off from prev patch in the series that implements the syscall.

Link: https://lkml.kernel.org/r/20210809185259.405936-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:17 -07:00
Charan Teja Reddy
65d759c8f9 mm: compaction: support triggering of proactive compaction by user
The proactive compaction[1] gets triggered for every 500msec and run
compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9) pages
based on the value set to sysctl.compaction_proactiveness.  Triggering the
compaction for every 500msec in search of COMPACTION_HPAGE_ORDER pages is
not needed for all applications, especially on the embedded system
usecases which may have few MB's of RAM.  Enabling the proactive
compaction in its state will endup in running almost always on such
systems.

Other side, proactive compaction can still be very much useful for getting
a set of higher order pages in some controllable manner(controlled by
using the sysctl.compaction_proactiveness).  So, on systems where enabling
the proactive compaction always may proove not required, can trigger the
same from user space on write to its sysctl interface.  As an example, say
app launcher decide to launch the memory heavy application which can be
launched fast if it gets more higher order pages thus launcher can prepare
the system in advance by triggering the proactive compaction from
userspace.

This triggering of proactive compaction is done on a write to
sysctl.compaction_proactiveness by user.

[1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a

[akpm@linux-foundation.org: tweak vm.rst, per Mike]

Link: https://lkml.kernel.org/r/1627653207-12317-1-git-send-email-charante@codeaurora.org
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Nitin Gupta <nigupta@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:17 -07:00
Vasily Averin
c509723ec2 memcg: enable accounting for posix_timers_cache slab
A program may create multiple interval timers using timer_create().  For
each timer the kernel preallocates a "queued real-time signal",
Consequently, the number of timers is limited by the RLIMIT_SIGPENDING
resource limit.  The allocated object is quite small, ~250 bytes, but even
the default signal limits allow to consume up to 100 megabytes per user.

It makes sense to account for them to limit the host's memory consumption
from inside the memcg-limited container.

Link: https://lkml.kernel.org/r/57795560-025c-267c-6b1a-dea852d95530@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Yutian Yang <nglaive@gmail.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:13 -07:00
Vasily Averin
5f58c39819 memcg: enable accounting for signals
When a user send a signal to any another processes it forces the kernel to
allocate memory for 'struct sigqueue' objects.  The number of signals is
limited by RLIMIT_SIGPENDING resource limit, but even the default settings
allow each user to consume up to several megabytes of memory.

It makes sense to account for these allocations to restrict the host's
memory consumption from inside the memcg-limited container.

Link: https://lkml.kernel.org/r/e34e958c-e785-712e-a62a-2c7b66c646c7@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Yutian Yang <nglaive@gmail.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:12 -07:00
Vasily Averin
30acd0bdfb memcg: enable accounting for new namesapces and struct nsproxy
Container admin can create new namespaces and force kernel to allocate up
to several pages of memory for the namespaces and its associated
structures.

Net and uts namespaces have enabled accounting for such allocations.  It
makes sense to account for rest ones to restrict the host's memory
consumption from inside the memcg-limited container.

Link: https://lkml.kernel.org/r/5525bcbf-533e-da27-79b7-158686c64e13@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Yutian Yang <nglaive@gmail.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:12 -07:00
Vasily Averin
fab827dbee memcg: enable accounting for pids in nested pid namespaces
Commit 5d097056c9 ("kmemcg: account certain kmem allocations to memcg")
enabled memcg accounting for pids allocated from init_pid_ns.pid_cachep,
but forgot to adjust the setting for nested pid namespaces.  As a result,
pid memory is not accounted exactly where it is really needed, inside
memcg-limited containers with their own pid namespaces.

Pid was one the first kernel objects enabled for memcg accounting.
init_pid_ns.pid_cachep marked by SLAB_ACCOUNT and we can expect that any
new pids in the system are memcg-accounted.

Though recently I've noticed that it is wrong.  nested pid namespaces
creates own slab caches for pid objects, nested pids have increased size
because contain id both for all parent and for own pid namespaces.  The
problem is that these slab caches are _NOT_ marked by SLAB_ACCOUNT, as a
result any pids allocated in nested pid namespaces are not
memcg-accounted.

Pid struct in nested pid namespace consumes up to 500 bytes memory, 100000
such objects gives us up to ~50Mb unaccounted memory, this allow container
to exceed assigned memcg limits.

Link: https://lkml.kernel.org/r/8b6de616-fd1a-02c6-cbdb-976ecdcfa604@virtuozzo.com
Fixes: 5d097056c9 ("kmemcg: account certain kmem allocations to memcg")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:12 -07:00
David Hildenbrand
8d0920bde5 mm: remove VM_DENYWRITE
All in-tree users of MAP_DENYWRITE are gone. MAP_DENYWRITE cannot be
set from user space, so all users are gone; let's remove it.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2021-09-03 18:42:01 +02:00
David Hildenbrand
fe69d560b5 kernel/fork: always deny write access to current MM exe_file
We want to remove VM_DENYWRITE only currently only used when mapping the
executable during exec. During exec, we already deny_write_access() the
executable, however, after exec completes the VMAs mapped
with VM_DENYWRITE effectively keeps write access denied via
deny_write_access().

Let's deny write access when setting or replacing the MM exe_file. With
this change, we can remove VM_DENYWRITE for mapping executables.

Make set_mm_exe_file() return an error in case deny_write_access()
fails; note that this should never happen, because exec code does a
deny_write_access() early and keeps write access denied when calling
set_mm_exe_file. However, it makes the code easier to read and makes
set_mm_exe_file() and replace_mm_exe_file() look more similar.

This represents a minor user space visible change:
sys_prctl(PR_SET_MM_MAP/EXE_FILE) can now fail if the file is already
opened writable. Also, after sys_prctl(PR_SET_MM_MAP/EXE_FILE) the file
cannot be opened writable. Note that we can already fail with -EACCES if
the file doesn't have execute permissions.

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2021-09-03 18:42:01 +02:00
David Hildenbrand
35d7bdc860 kernel/fork: factor out replacing the current MM exe_file
Let's factor the main logic out into replace_mm_exe_file(), such that
all mm->exe_file logic is contained in kernel/fork.c.

While at it, perform some simple cleanups that are possible now that
we're simplifying the individual functions.

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2021-09-03 18:42:01 +02:00
Linus Torvalds
aa829778b1 LKMM updates:
- Update documentation and code example
 
 KCSAN updates:
 
  - Introduce CONFIG_KCSAN_STRICT (which RCU uses)
  - Optimize use of get_ctx() by kcsan_found_watchpoint()
  - Rework atomic.h into permissive.h
  - Add the ability to ignore writes that change only one bit of a given data-racy variable.
  - Improve comments
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmEvHeQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1g2pw/+NBZ4TGHc7pcxVbKQUdiOZ+BJEQG0OfUZ
 bDQbHDsEjjXXav9udGABJb9n/zbh+08Z+ME9H4RAIG7pLUMcITOVSpCdF7x8USFI
 T5Z5IkN0gKUXMyNGzxA28Q0nTMl7F2ndlwygrKUeVXEMjnX6QtkILaXWiFesJSkR
 bBNDUAYfk5+9kMbyx2RQ9VlNpHh0J6sYIwbHsbPOakw2dJaHLhusEUVEWIIGG7uy
 aCyCDY6kT5GWXIFDZIUSHs2wYH3calNJ77nD0JWpD/6XWCbiS87F0zKtW9y3kfA4
 0Lk6mJJDNSqUjbF0Y2sBYkuaisJI5lLxK1mfYET3AN7E4ZfDY9BJFXfMEPtMaKbD
 pW9p3CAiiYsJzg9976tHq4m7iO75nV2I29ZAv913oNSdOkxwH3ahE+6TFkAgNXnT
 57j0QFUS8MxpxxqBFcY45ukmDjzeQ0SKIVpWQnfCQ8mlfvTJhBrllXIdhEenVtZK
 b83k3Mer/wYmfpLpQVsUYck/Yk0QV1doywOv/BVbwhxbMeEI27BvoGpGZTVVPEZF
 VhhdC4sbySG1Eva0wiJ2abyC9u5INlxOxRuSjSGqNJhTXwDWTw4gfWmoowm1sCVz
 uoJoR2bzZzDGda6C5g8OvXATeQR/DtjjLPI7V9MEhLfVFukHKQDaNYXkAxZb7cyn
 IepaMtiJ0RA=
 =99fO
 -----END PGP SIGNATURE-----

Merge tag 'locking-debug-2021-09-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull memory model updates from Ingo Molnar:
 "LKMM updates:

   - Update documentation and code example

  KCSAN updates:

   - Introduce CONFIG_KCSAN_STRICT (which RCU uses)

   - Optimize use of get_ctx() by kcsan_found_watchpoint()

   - Rework atomic.h into permissive.h

   - Add the ability to ignore writes that change only one bit of a
     given data-racy variable.

   - Improve comments"

* tag 'locking-debug-2021-09-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/memory-model: Document data_race(READ_ONCE())
  tools/memory-model: Heuristics using data_race() must handle all values
  tools/memory-model: Add example for heuristic lockless reads
  tools/memory-model: Make read_foo_diagnostic() more clearly diagnostic
  kcsan: Make strict mode imply interruptible watchers
  kcsan: permissive: Ignore data-racy 1-bit value changes
  kcsan: Print if strict or non-strict during init
  kcsan: Rework atomic.h into permissive.h
  kcsan: Reduce get_ctx() uses in kcsan_found_watchpoint()
  kcsan: Introduce CONFIG_KCSAN_STRICT
  kcsan: Remove CONFIG_KCSAN_DEBUG
  kcsan: Improve some Kconfig comments
2021-09-02 13:00:15 -07:00
Linus Torvalds
4a3bb4200a dma-mapping updates for Linux 5.15
- fix debugfs initialization order (Anthony Iliopoulos)
  - use memory_intersects() directly (Kefeng Wang)
  - allow to return specific errors from ->map_sg
    (Logan Gunthorpe, Martin Oliveira)
  - turn the dma_map_sg return value into an unsigned int (me)
  - provide a common global coherent pool іmplementation (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmEvY+8LHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPaehAAsgnBzzzoLHO83pgs0KL92c+0DiMNHYmaMCJOvZXk
 x2Irv+O74WikRJc4S7uQ26p2spjmUxjmiOjld+8+NN0liD4QO9BQ/SZpIp8emuKS
 /yPG6Xh86xSl/OrPL1y7kGeHkRi5sm3mRhcTdILFQFPLcSReupe++GRfnvrpbOPk
 tj3pBGXluD6iJH12BBt00ushUVzZ0F2xaF6xUDAs94RSZ3tlqsfx6c928Y1KxSZh
 f89q/KuaokyogFG7Ujj/nYgIUETaIs2W6UmxBfRzdEMJFSffwomUMbw+M+qGJ7/d
 2UjamFYRX16FReE8WNsndbX1E6k5JBW12E1qwV3dUwatlNLWEaRq3PNiWkF7zcFH
 LDkpDYN6s5bIDPTfDp21XfPygoH8KQhnD9lVf0aB7n04uu8VJrGB9+10PpkCJVXD
 0b2dcuSwCO7hAfTfNGVV8f3EI/1XPflr1hJvMgcVtY53CR96ldp+4QaElzWLXumN
 MyptirmrVITNVyVwGzhGAblXBLWdarXD0EXudyiaF4Xbrj3AkIOSUCghEwKLpjQf
 UwMFFwSE8yGxKTRK4HfU5gMzy6G751fU7TUe5lmxZLovDflQoSXMWgHE8e7r0Qel
 o5v6lmUzoWz2fAISf3xjauo2ncgmfWMwYM6C7OJy5nG73QXLQId9J+ReXbmrgrrN
 DgI=
 =spje
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - fix debugfs initialization order (Anthony Iliopoulos)

 - use memory_intersects() directly (Kefeng Wang)

 - allow to return specific errors from ->map_sg (Logan Gunthorpe,
   Martin Oliveira)

 - turn the dma_map_sg return value into an unsigned int (me)

 - provide a common global coherent pool іmplementation (me)

* tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping: (31 commits)
  hexagon: use the generic global coherent pool
  dma-mapping: make the global coherent pool conditional
  dma-mapping: add a dma_init_global_coherent helper
  dma-mapping: simplify dma_init_coherent_memory
  dma-mapping: allow using the global coherent pool for !ARM
  ARM/nommu: use the generic dma-direct code for non-coherent devices
  dma-direct: add support for dma_coherent_default_memory
  dma-mapping: return an unsigned int from dma_map_sg{,_attrs}
  dma-mapping: disallow .map_sg operations from returning zero on error
  dma-mapping: return error code from dma_dummy_map_sg()
  x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR
  x86/amd_gart: return error code from gart_map_sg()
  xen: swiotlb: return error code from xen_swiotlb_map_sg()
  parisc: return error code from .map_sg() ops
  sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
  sparc/iommu: return error codes from .map_sg() ops
  s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR
  s390/pci: return error code from s390_dma_map_sg()
  powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR
  powerpc/iommu: return error code from .map_sg() ops
  ...
2021-09-02 10:32:06 -07:00
Linus Torvalds
df43d90382 printk changes for 5.15
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmEt+hwACgkQUqAMR0iA
 lPLppBAAiyrUNVmqqtdww+IJajEs1uD/4FqPsysHRwroHBFymJeQG1XCwUpDZ7jj
 6gXT0chxyjQE18gT/W9nf+PSmA9XvIVA1WSR+WCECTNW3YoZXqtgwiHfgnitXYku
 HlmoZLthYeuoXWw2wn+hVLfTRh6VcPHYEaC21jXrs6B1pOXHbvjJ5eTLHlX9oCfL
 UKSK+jFTHAJcn/GskRzviBe0Hpe8fqnkRol2XX13ltxqtQ73MjaGNu7imEH6/Pa7
 /MHXWtuWJtOvuYz17aztQP4Qwh1xy+kakMy3aHucdlxRBTP4PTzzTuQI3L/RYi6l
 +ttD7OHdRwqFAauBLY3bq3uJjYb5v/64ofd8DNnT2CJvtznY8wrPbTdFoSdPcL2Q
 69/opRWHcUwbU/Gt4WLtyQf3Mk0vepgMbbVg1B5SSy55atRZaXMrA2QJ/JeawZTB
 KK6D/mE7ccze/YFzsySunCUVKCm0veoNxEAcakCCZKXSbsvd1MYcIRC0e+2cv6e5
 2NEH7gL4dD+5tqu5nzvIuKDn3NrDQpbi28iUBoFbkxRgcVyvHJ9AGSa62wtb5h3D
 OgkqQMdVKBbjYNeUodPlQPzmXZDasytavyd0/BC/KENOcBvU/8gW++2UZTfsh/1A
 dLjgwFBdyJncQcCS9Abn20/EKntbIMEX8NLa97XWkA3fuzMKtak=
 =yEVq
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Optionally, provide an index of possible printk messages via
   <debugfs>/printk/index/. It can be used when monitoring important
   kernel messages on a farm of various hosts. The monitor has to be
   updated when some messages has changed or are not longer available by
   a newly deployed kernel.

 - Add printk.console_no_auto_verbose boot parameter. It allows to
   generate crash dump even with slow consoles in a reasonable time
   frame.

 - Remove printk_safe buffers. The messages are always stored directly
   to the main logbuffer, even in NMI or recursive context. Also it
   allows to serialize syslog operations by a mutex instead of a spin
   lock.

 - Misc clean up and build fixes.

* tag 'printk-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk/index: Fix -Wunused-function warning
  lib/nmi_backtrace: Serialize even messages about idle CPUs
  printk: Add printk.console_no_auto_verbose boot parameter
  printk: Remove console_silent()
  lib/test_scanf: Handle n_bits == 0 in random tests
  printk: syslog: close window between wait and read
  printk: convert @syslog_lock to mutex
  printk: remove NMI tracking
  printk: remove safe buffers
  printk: track/limit recursion
  lib/nmi_backtrace: explicitly serialize banner and regs
  printk: Move the printk() kerneldoc comment to its new home
  printk/index: Fix warning about missing prototypes
  MIPS/asm/printk: Fix build failure caused by printk
  printk: index: Add indexing support to dev_printk
  printk: Userspace format indexing support
  printk: Rework parse_prefix into printk_parse_prefix
  printk: Straighten out log_flags into printk_info_flags
  string_helpers: Escape double quotes in escape_special
  printk/console: Check consistent sequence number when handling race in console_unlock()
2021-09-01 18:41:13 -07:00
Linus Torvalds
bcfeebbff3 Merge branch 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull exit cleanups from Eric Biederman:
 "In preparation of doing something about PTRACE_EVENT_EXIT I have
  started cleaning up various pieces of code related to do_exit. Most of
  that code I did not manage to get tested and reviewed before the merge
  window opened but a handful of very useful cleanups are ready to be
  merged.

  The first change is simply the removal of the bdflush system call. The
  code has now been disabled long enough that even the oldest userspace
  working userspace setups anyone can find to test are fine with the
  bdflush system call being removed.

  Changing m68k fsp040_die to use force_sigsegv(SIGSEGV) instead of
  calling do_exit directly is interesting only in that it is nearly the
  most difficult of the incorrect uses of do_exit to remove.

  The change to the seccomp code to simply send a signal instead of
  calling do_coredump directly is a very nice little cleanup made
  possible by realizing the existing signal sending helpers were missing
  a little bit of functionality that is easy to provide"

* 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  signal/seccomp: Dump core when there is only one live thread
  signal/seccomp: Refactor seccomp signal and coredump generation
  signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  exit/bdflush: Remove the deprecated bdflush system call
2021-09-01 14:52:05 -07:00
Linus Torvalds
48983701a1 Merge branch 'siginfo-si_trapno-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo si_trapno updates from Eric Biederman:
 "The full set of si_trapno changes was not appropriate as a fix for the
  newly added SIGTRAP TRAP_PERF, and so I postponed the rest of the
  related cleanups.

  This is the rest of the cleanups for si_trapno that reduces it from
  being a really weird arch special case that is expect to be always
  present (but isn't) on the architectures that support it to being yet
  another field in the _sigfault union of struct siginfo.

  The changes have been reviewed and marinated in linux-next. With the
  removal of this awkward special case new code (like SIGTRAP TRAP_PERF)
  that works across architectures should be easier to write and
  maintain"

* 'siginfo-si_trapno-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  signal: Rename SIL_PERF_EVENT SIL_FAULT_PERF_EVENT for consistency
  signal: Verify the alignment and size of siginfo_t
  signal: Remove the generic __ARCH_SI_TRAPNO support
  signal/alpha: si_trapno is only used with SIGFPE and SIGTRAP TRAP_UNK
  signal/sparc: si_trapno is only used with SIGILL ILL_ILLTRP
  arm64: Add compile-time asserts for siginfo_t offsets
  arm: Add compile-time asserts for siginfo_t offsets
  sparc64: Add compile-time asserts for siginfo_t offsets
2021-09-01 14:42:36 -07:00
Linus Torvalds
9e9fb7655e Core:
- Enable memcg accounting for various networking objects.
 
 BPF:
 
  - Introduce bpf timers.
 
  - Add perf link and opaque bpf_cookie which the program can read
    out again, to be used in libbpf-based USDT library.
 
  - Add bpf_task_pt_regs() helper to access user space pt_regs
    in kprobes, to help user space stack unwinding.
 
  - Add support for UNIX sockets for BPF sockmap.
 
  - Extend BPF iterator support for UNIX domain sockets.
 
  - Allow BPF TCP congestion control progs and bpf iterators to call
    bpf_setsockopt(), e.g. to switch to another congestion control
    algorithm.
 
 Protocols:
 
  - Support IOAM Pre-allocated Trace with IPv6.
 
  - Support Management Component Transport Protocol.
 
  - bridge: multicast: add vlan support.
 
  - netfilter: add hooks for the SRv6 lightweight tunnel driver.
 
  - tcp:
     - enable mid-stream window clamping (by user space or BPF)
     - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD
     - more accurate DSACK processing for RACK-TLP
 
  - mptcp:
     - add full mesh path manager option
     - add partial support for MP_FAIL
     - improve use of backup subflows
     - optimize option processing
 
  - af_unix: add OOB notification support.
 
  - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by
          the router.
 
  - mac80211: Target Wake Time support in AP mode.
 
  - can: j1939: extend UAPI to notify about RX status.
 
 Driver APIs:
 
  - Add page frag support in page pool API.
 
  - Many improvements to the DSA (distributed switch) APIs.
 
  - ethtool: extend IRQ coalesce uAPI with timer reset modes.
 
  - devlink: control which auxiliary devices are created.
 
  - Support CAN PHYs via the generic PHY subsystem.
 
  - Proper cross-chip support for tag_8021q.
 
  - Allow TX forwarding for the software bridge data path to be
    offloaded to capable devices.
 
 Drivers:
 
  - veth: more flexible channels number configuration.
 
  - openvswitch: introduce per-cpu upcall dispatch.
 
  - Add internet mix (IMIX) mode to pktgen.
 
  - Transparently handle XDP operations in the bonding driver.
 
  - Add LiteETH network driver.
 
  - Renesas (ravb):
    - support Gigabit Ethernet IP
 
  - NXP Ethernet switch (sja1105)
    - fast aging support
    - support for "H" switch topologies
    - traffic termination for ports under VLAN-aware bridge
 
  - Intel 1G Ethernet
     - support getcrosststamp() with PCIe PTM (Precision Time
       Measurement) for better time sync
     - support Credit-Based Shaper (CBS) offload, enabling HW traffic
       prioritization and bandwidth reservation
 
  - Broadcom Ethernet (bnxt)
     - support pulse-per-second output
     - support larger Rx rings
 
  - Mellanox Ethernet (mlx5)
     - support ethtool RSS contexts and MQPRIO channel mode
     - support LAG offload with bridging
     - support devlink rate limit API
     - support packet sampling on tunnels
 
  - Huawei Ethernet (hns3):
     - basic devlink support
     - add extended IRQ coalescing support
     - report extended link state
 
  - Netronome Ethernet (nfp):
     - add conntrack offload support
 
  - Broadcom WiFi (brcmfmac):
     - add WPA3 Personal with FT to supported cipher suites
     - support 43752 SDIO device
 
  - Intel WiFi (iwlwifi):
     - support scanning hidden 6GHz networks
     - support for a new hardware family (Bz)
 
  - Xen pv driver:
     - harden netfront against malicious backends
 
  - Qualcomm mobile
     - ipa: refactor power management and enable automatic suspend
     - mhi: move MBIM to WWAN subsystem interfaces
 
 Refactor:
 
  - Ambient BPF run context and cgroup storage cleanup.
 
  - Compat rework for ndo_ioctl.
 
 Old code removal:
 
  - prism54 remove the obsoleted driver, deprecated by the p54 driver.
 
  - wan: remove sbni/granch driver.
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmEukBYACgkQMUZtbf5S
 IrsyHA//TO8dw18NYts4n9LmlJT2naJ7yBUUSSXK/M+DtW0MQ9nnHhqzPm5uJdRl
 IgQTNJrW3dYzRwgqaWZqEwO1t5/FI+f87ND1Nsekg7x9tF66a6ov5WxU26TwwSba
 U+si/inQ/4chuQ+LxMQobqCDxaLE46I2dIoRl+YfndJ24DRzYSwAEYIPPbSdfyU+
 +/l+3s4GaxO4k/hLciPAiOniyxLoUNiGUTNh+2yqRBXelSRJRKVnl+V22ANFrxRW
 nTEiplfVKhlPU1e4iLuRtaxDDiePHhw9I3j/lMHhfeFU2P/gKJIvz4QpGV0CAZg2
 1VvDU32WEx1GQLXJbKm0KwoNRUq1QSjOyyFti+BO7ugGaYAR4gKhShOqlSYLzUtB
 tbtzQhSNLWOGqgmSJOztZb5kFDm2EdRSll5/lP2uyFlPkIsIp0QbscJVzNTnS74b
 Xz15ZOw41Z4TfWPEMWgfrx6Zkm7pPWkly+7WfUkPcHa1gftNz6tzXXxSXcXIBPdi
 yQ5JCzzxrM5573YHuk5YedwZpn6PiAt4A/muFGk9C6aXP60TQAOS/ppaUzZdnk4D
 NfOk9mj06WEULjYjPcKEuT3GGWE6kmjb8Pu0QZWKOchv7vr6oZly1EkVZqYlXELP
 AfhcrFeuufie8mqm0jdb4LnYaAnqyLzlb1J4Zxh9F+/IX7G3yoc=
 =JDGD
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - Enable memcg accounting for various networking objects.

  BPF:

   - Introduce bpf timers.

   - Add perf link and opaque bpf_cookie which the program can read out
     again, to be used in libbpf-based USDT library.

   - Add bpf_task_pt_regs() helper to access user space pt_regs in
     kprobes, to help user space stack unwinding.

   - Add support for UNIX sockets for BPF sockmap.

   - Extend BPF iterator support for UNIX domain sockets.

   - Allow BPF TCP congestion control progs and bpf iterators to call
     bpf_setsockopt(), e.g. to switch to another congestion control
     algorithm.

  Protocols:

   - Support IOAM Pre-allocated Trace with IPv6.

   - Support Management Component Transport Protocol.

   - bridge: multicast: add vlan support.

   - netfilter: add hooks for the SRv6 lightweight tunnel driver.

   - tcp:
       - enable mid-stream window clamping (by user space or BPF)
       - allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD
       - more accurate DSACK processing for RACK-TLP

   - mptcp:
       - add full mesh path manager option
       - add partial support for MP_FAIL
       - improve use of backup subflows
       - optimize option processing

   - af_unix: add OOB notification support.

   - ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the
     router.

   - mac80211: Target Wake Time support in AP mode.

   - can: j1939: extend UAPI to notify about RX status.

  Driver APIs:

   - Add page frag support in page pool API.

   - Many improvements to the DSA (distributed switch) APIs.

   - ethtool: extend IRQ coalesce uAPI with timer reset modes.

   - devlink: control which auxiliary devices are created.

   - Support CAN PHYs via the generic PHY subsystem.

   - Proper cross-chip support for tag_8021q.

   - Allow TX forwarding for the software bridge data path to be
     offloaded to capable devices.

  Drivers:

   - veth: more flexible channels number configuration.

   - openvswitch: introduce per-cpu upcall dispatch.

   - Add internet mix (IMIX) mode to pktgen.

   - Transparently handle XDP operations in the bonding driver.

   - Add LiteETH network driver.

   - Renesas (ravb):
       - support Gigabit Ethernet IP

   - NXP Ethernet switch (sja1105):
       - fast aging support
       - support for "H" switch topologies
       - traffic termination for ports under VLAN-aware bridge

   - Intel 1G Ethernet
       - support getcrosststamp() with PCIe PTM (Precision Time
         Measurement) for better time sync
       - support Credit-Based Shaper (CBS) offload, enabling HW traffic
         prioritization and bandwidth reservation

   - Broadcom Ethernet (bnxt)
       - support pulse-per-second output
       - support larger Rx rings

   - Mellanox Ethernet (mlx5)
       - support ethtool RSS contexts and MQPRIO channel mode
       - support LAG offload with bridging
       - support devlink rate limit API
       - support packet sampling on tunnels

   - Huawei Ethernet (hns3):
       - basic devlink support
       - add extended IRQ coalescing support
       - report extended link state

   - Netronome Ethernet (nfp):
       - add conntrack offload support

   - Broadcom WiFi (brcmfmac):
       - add WPA3 Personal with FT to supported cipher suites
       - support 43752 SDIO device

   - Intel WiFi (iwlwifi):
       - support scanning hidden 6GHz networks
       - support for a new hardware family (Bz)

   - Xen pv driver:
       - harden netfront against malicious backends

   - Qualcomm mobile
       - ipa: refactor power management and enable automatic suspend
       - mhi: move MBIM to WWAN subsystem interfaces

  Refactor:

   - Ambient BPF run context and cgroup storage cleanup.

   - Compat rework for ndo_ioctl.

  Old code removal:

   - prism54 remove the obsoleted driver, deprecated by the p54 driver.

   - wan: remove sbni/granch driver"

* tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1715 commits)
  net: Add depends on OF_NET for LiteX's LiteETH
  ipv6: seg6: remove duplicated include
  net: hns3: remove unnecessary spaces
  net: hns3: add some required spaces
  net: hns3: clean up a type mismatch warning
  net: hns3: refine function hns3_set_default_feature()
  ipv6: remove duplicated 'net/lwtunnel.h' include
  net: w5100: check return value after calling platform_get_resource()
  net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx()
  net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource()
  net: mdio-ipq4019: Make use of devm_platform_ioremap_resource()
  fou: remove sparse errors
  ipv4: fix endianness issue in inet_rtm_getroute_build_skb()
  octeontx2-af: Set proper errorcode for IPv4 checksum errors
  octeontx2-af: Fix static code analyzer reported issues
  octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfg
  octeontx2-af: Fix loop in free and unmap counter
  af_unix: fix potential NULL deref in unix_dgram_connect()
  dpaa2-eth: Replace strlcpy with strscpy
  octeontx2-af: Use NDC TX for transmit packet data
  ...
2021-08-31 16:43:06 -07:00
Linus Torvalds
86ac54e79f Merge branch 'for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
 "There is a long-standing subtle destroy_workqueue() bug where a
  workqueue can be destroyed while internal work items used for flushing
  are still in flight. Lai fixed it by assigning a flush color to the
  internal work items so that they are correctly waited for during
  destruction.

  Other than that, all are minor cleanups"

* 'for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Remove unused WORK_NO_COLOR
  workqueue: Assign a color to barrier work items
  workqueue: Mark barrier work with WORK_STRUCT_INACTIVE
  workqueue: Change the code of calculating work_flags in insert_wq_barrier()
  workqueue: Change arguement of pwq_dec_nr_in_flight()
  workqueue: Rename "delayed" (delayed by active management) to "inactive"
  workqueue: Replace deprecated CPU-hotplug functions.
  workqueue: Replace deprecated ida_simple_*() with ida_alloc()/ida_free()
  workqueue: Fix typo in comments
  workqueue: Fix possible memory leaks in wq_numa_init()
2021-08-31 15:53:20 -07:00
Linus Torvalds
69dc8010b8 Merge branch 'for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
 "Two cpuset behavior changes:

   - cpuset on cgroup2 is changed to enable memory migration based on
     nodemask by default.

   - A notification is generated when cpuset partition state changes.

  All other patches are minor fixes and cleanups"

* 'for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Avoid compiler warnings with no subsystems
  cgroup/cpuset: Avoid memory migration when nodemasks match
  cgroup/cpuset: Enable memory migration for cpuset v2
  cgroup/cpuset: Enable event notification when partition state changes
  cgroup: cgroup-v1: clean up kernel-doc notation
  cgroup: Replace deprecated CPU-hotplug functions.
  cgroup/cpuset: Fix violation of cpuset locking rule
  cgroup/cpuset: Fix a partition bug with hotplug
  cgroup/cpuset: Miscellaneous code cleanup
  cgroup: remove cgroup_mount from comments
2021-08-31 15:49:04 -07:00
Linus Torvalds
5cbba60596 Power management updates for 5.15-rc1
- Address 3 PCI device power management issues (Rafael Wysocki).
 
  - Add Power Limit4 support for Alder Lake to the Intel RAPL power
    capping driver (Sumeet Pawnikar).
 
  - Add HWP guaranteed performance change notification support to
    the intel_pstate driver (Srinivas Pandruvada).
 
  - Replace deprecated CPU-hotplug functions in code related to power
    management (Sebastian Andrzej Siewior).
 
  - Update CPU PM notifiers to use raw spinlocks (Valentin Schneider).
 
  - Add support for 'required-opps' DT property to the generic power
    domains (genpd) framework and use this property for I2C on ARM64
    sc7180 (Rajendra Nayak).
 
  - Fix Kconfig issue related to genpd (Geert Uytterhoeven).
 
  - Increase energy calculation precision in the Energy Model (Lukasz
    Luba).
 
  - Fix kobject deletion in the exit code of the schedutil cpufreq
    governor (Kevin Hao).
 
  - Unmark some functions as kernel-doc in the PM core to avoid
    false-positive documentation build warnings (Randy Dunlap).
 
  - Check RTC features instead of ops in suspend_test Alexandre
    Belloni).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmEtIvYSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxnQ8QAK2QCyfFAdrPE3zn+XdTcVC8rpYhL0d2
 3YpbS2obZ4lOHsrNy7SrrU5NdgvCFPHl14Py5rxu4QZW8EF5km6SqJYj9MhrDsEm
 wJ/ZM3Zbaj98MAls7pulHxabMo8hvGfMSmcNOFfgh7yqCm5eu8gNnt/LbLaB9IvI
 v46ZGNvnpbtkWgk4Eaq+v3Y5/+4CoUfuAFn7K21SHNcgzDbhE3Ii6cam/rwPeCdn
 a+LcDLzeDQpnnYKukx7Zyk7Z+FblXWHWZ/ClLcR8JgYYSrbrXxSLtRdM5EtLje2J
 ZnqFKqmlMcq2OhTjBSoHSJjiOkxyz5eWWrEf7d1/oH19WSW+rv5VfZUXAkYtfGKo
 OJChuIomrLNqJhAwTnxG3fnVh2NMhLwDVEjkFgvej4shCZXhoVgWu39mWnbyTxV3
 57adiuVUv6AboH6Lyvi/yzV7sWmZbL/U/YVvt+Z9SEh0syy6YBb87bW3Mr1qnLeG
 QF4AmF553qzHwd9uxUoFfiUoBxH1GvNmOWtx6lYKhy0lm3AwvW1XHKqsDG4KbVJg
 zzok7J2yv1/1rd8dJ+8tCwyyC3WJjNjtq/ULOltugwQpqVK4PCwWAUmjJaTP3Gpp
 ZH/bGuLSQWxxiinwKCkFSxwbewJu+AvtoXYKBfk/Gf5O6GyUyryBbBMO9dsqP27A
 Zg3P4+ejS7RN
 =sx6G
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These address some PCI device power management issues, add new
  hardware support to the RAPL power capping driver, add HWP guaranteed
  performance change notification support to the intel_pstate driver,
  replace deprecated CPU-hotplug functions in a few places, update CPU
  PM notifiers to use raw spinlocks, update the PM domains framework
  (new DT property support, Kconfig fix), do a couple of cleanups in
  code related to system sleep, and improve the energy model and the
  schedutil cpufreq governor.

  Specifics:

   - Address 3 PCI device power management issues (Rafael Wysocki).

   - Add Power Limit4 support for Alder Lake to the Intel RAPL power
     capping driver (Sumeet Pawnikar).

   - Add HWP guaranteed performance change notification support to the
     intel_pstate driver (Srinivas Pandruvada).

   - Replace deprecated CPU-hotplug functions in code related to power
     management (Sebastian Andrzej Siewior).

   - Update CPU PM notifiers to use raw spinlocks (Valentin Schneider).

   - Add support for 'required-opps' DT property to the generic power
     domains (genpd) framework and use this property for I2C on ARM64
     sc7180 (Rajendra Nayak).

   - Fix Kconfig issue related to genpd (Geert Uytterhoeven).

   - Increase energy calculation precision in the Energy Model (Lukasz
     Luba).

   - Fix kobject deletion in the exit code of the schedutil cpufreq
     governor (Kevin Hao).

   - Unmark some functions as kernel-doc in the PM core to avoid
     false-positive documentation build warnings (Randy Dunlap).

   - Check RTC features instead of ops in suspend_test Alexandre
     Belloni)"

* tag 'pm-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: domains: Fix domain attach for CONFIG_PM_OPP=n
  powercap: Add Power Limit4 support for Alder Lake SoC
  cpufreq: intel_pstate: Process HWP Guaranteed change notification
  thermal: intel: Allow processing of HWP interrupt
  notifier: Remove atomic_notifier_call_chain_robust()
  PM: cpu: Make notifier chain use a raw_spinlock_t
  PM: sleep: unmark 'state' functions as kernel-doc
  arm64: dts: sc7180: Add required-opps for i2c
  PM: domains: Add support for 'required-opps' to set default perf state
  opp: Don't print an error if required-opps is missing
  cpufreq: schedutil: Use kobject release() method to free sugov_tunables
  PM: EM: Increase energy calculation precision
  PM: sleep: check RTC features instead of ops in suspend_test
  PM: sleep: s2idle: Replace deprecated CPU-hotplug functions
  cpufreq: Replace deprecated CPU-hotplug functions
  powercap: intel_rapl: Replace deprecated CPU-hotplug functions
  PCI: PM: Enable PME if it can be signaled from D3cold
  PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently
  PCI: Use pci_update_current_state() in pci_enable_device_flags()
2021-08-31 13:21:58 -07:00
Linus Torvalds
8e0cd9525c Merge tag 'audit-pr-20210830' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
 "Two patches in the audit pull request for v5.15; one is trivial
  ("header protection") but the second is a real patch that fixes a
  refcounting problem.

  The refcount fix normally would have been sent up during the -rcX
  cycle, but since we merged it less than a week before v5.14 proper I
  felt it was better to wait for the merge window to open; the patch is
  marked with the usual -stable markings"

* tag 'audit-pr-20210830' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: move put_tree() to avoid trim_trees refcount underflow and UAF
  audit: add header protection to kernel/audit.h
2021-08-31 12:55:50 -07:00
Linus Torvalds
e55f0c439a kernel.sys.v5.15
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYS3lRQAKCRCRxhvAZXjc
 oqH9AP999iWN7nOOr4QpnQZVMEbwYlZksdjJso0i2Nd87rNMWQEAgKYsnm00dlLm
 uV/X21a9W6RJYgOGP4+BY4DAyVezpgk=
 =vSMk
 -----END PGP SIGNATURE-----

Merge tag 'kernel.sys.v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull set_user()  update from Christian Brauner:
 "This contains a single fix to set_user() which aligns permission
  checks with the corresponding fork() codepath. No one involved in this
  could come up with a reason for the difference.

  A capable caller can already circumvent the check when they fork where
  the permission checks are already for the relevant capabilities in
  addition to also allowing to exceed nproc when it is the init user.

  So apply the same logic to set_user()"

* tag 'kernel.sys.v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds
2021-08-31 12:16:05 -07:00
Claire Chang
f3c4b1341e swiotlb: use depends on for DMA_RESTRICTED_POOL
Use depends on instead of select for DMA_RESTRICTED_POOL; otherwise it
will make SWIOTLB user configurable and cause compile errors for some
arch (e.g. mips).

Fixes: 0b84e4f8b7 ("swiotlb: Add restricted DMA pool initialization")
Acked-by: Will Deacon <will@kernel.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Claire Chang <tientzu@chromium.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2021-08-31 14:48:14 -04:00
Linus Torvalds
8bda955776 New features:
- Support for server-side disconnect injection via debugfs
 - Protocol definitions for new RPC_AUTH_TLS authentication flavor
 
 Performance improvements:
 - Reduce page allocator traffic in the NFSD splice read actor
 - Reduce CPU utilization in svcrdma's Send completion handler
 
 Notable bug fixes:
 - Stabilize lockd operation when re-exporting NFS mounts
 - Fix the use of %.*s in NFSD tracepoints
 - Fix /proc/sys/fs/nfs/nsm_use_hostnames
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmEqq0AACgkQM2qzM29m
 f5dYig/5AaPN2BWYf4D1VkrAS3+zGS+3IN23WVgpbA54jgfjPEH+Aa00YhEQQa0j
 Y5u/jE5g/tWvenDefq5BmvdRfZMWCVc2JkngctOSflhaREUWK+HgCkH+5DQs6zUM
 rbX7qy0v6wJnEMSlwCKJ2AuZbYw7Bsg2nvOgEbb718/ent3umeoXEK09x3HTWLEp
 eVcMU5uicB5wRRPpROYG792oWzUScQ8kyiRCKJfQDoR7bINhBeVHObAIFMBo1UaH
 x9CMX4RlPYGmoMYUc+AqcOM7hizucHpXqM1r3oVjQ7FyI+pmDLuLL/3OTjtRUX7+
 nYLqNW/PijH9PjFe4BPjGHAUQfKiTIXANAe8VdjQj70D40jYkP+jQ9SPdV+pEgi4
 U4azfK3S+85/bRYYq/1alcLiP1+6dgcL++rVvnKESTH9NRgNoEw2WZHeKxXiYaxU
 p7oOC4XdnYDwcz/3QVWa0sK2kA5IJHzOsCQR7OilD09NAJ+AbJTAp0H3xFXTllzb
 AV2CAEBVZlP+pZYOehuVnKpZPa7YAWx92wRK2anbRUMZN3lF1wWBEOTd6KweIpTx
 l2GJSf3GWBqL1x9PjSet/cBusxYjTA+S1hE7KMrsNPhzbvpIgAZEtSqOfn9apDCV
 uAFIN2DSiHm3Tv0aFSJWo+CMyKkyktuiS8JFKaFdzCp9NtsBM2M=
 =TGkK
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "New features:

   - Support for server-side disconnect injection via debugfs

   - Protocol definitions for new RPC_AUTH_TLS authentication flavor

  Performance improvements:

   - Reduce page allocator traffic in the NFSD splice read actor

   - Reduce CPU utilization in svcrdma's Send completion handler

  Notable bug fixes:

   - Stabilize lockd operation when re-exporting NFS mounts

   - Fix the use of %.*s in NFSD tracepoints

   - Fix /proc/sys/fs/nfs/nsm_use_hostnames"

* tag 'nfsd-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (31 commits)
  nfsd: fix crash on LOCKT on reexported NFSv3
  nfs: don't allow reexport reclaims
  lockd: don't attempt blocking locks on nfs reexports
  nfs: don't atempt blocking locks on nfs reexports
  Keep read and write fds with each nlm_file
  lockd: update nlm_lookup_file reexport comment
  nlm: minor refactoring
  nlm: minor nlm_lookup_file argument change
  lockd: lockd server-side shouldn't set fl_ops
  SUNRPC: Add documentation for the fail_sunrpc/ directory
  SUNRPC: Server-side disconnect injection
  SUNRPC: Move client-side disconnect injection
  SUNRPC: Add a /sys/kernel/debug/fail_sunrpc/ directory
  svcrdma: xpt_bc_xprt is already clear in __svc_rdma_free()
  nfsd4: Fix forced-expiry locking
  rpc: fix gss_svc_init cleanup on failure
  SUNRPC: Add RPC_AUTH_TLS protocol numbers
  lockd: change the proc_handler for nsm_use_hostnames
  sysctl: introduce new proc handler proc_dobool
  SUNRPC: Fix a NULL pointer deref in trace_svc_stats_latency()
  ...
2021-08-31 10:57:06 -07:00