mirror of
https://github.com/torvalds/linux.git
synced 2024-11-23 04:31:50 +00:00
3fe617ccaf
2274 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Linus Torvalds
|
57fa2369ab |
CFI on arm64 series for v5.13-rc1
- Clean up list_sort prototypes (Sami Tolvanen) - Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmCHCR8ACgkQiXL039xt wCZyFQ//fnUZaXR2K354zDyW6CJljMf+d94RF6rH+J6eMTH2/HXa5v0iJokwABLf ussP6qF4k5wtmI22Gm9A5Zc3e4iiry5pC0jOdk0mk4gzWwFN9MdgNxJZIGA3xqhS bsBK4AGrVKjtZl48G1/ZxJuNDeJhVp6GNK2n6/Gl4rZF6R7D/Upz0XelyJRdDpcM HIGma7jZl6xfGU0mdWCzpOGK1zdMca1WVs7A4YuurSbLn5PZJrcNVWLouDqt/Si2 AduSri1gyPClicgvqWjMOzhUpuw/nJtBLRl1x1EsWk/KSZ1/uNVjlewfzdN4fZrr zbtFr2gLubYLK6JOX7/LqoHlOTgE3tYLL+WIVN75DsOGZBKgHhmebTmWLyqzV0SL oqcyM5d3ucC6msdtAK5Fv4MSp8rpjqlK1Ha4SGRT6kC2wut7AhZ3KD7eyRIz8mV9 Sa9mhignGFJnTEUp+LSbYdrAudgSKxB40WyXPmswAXX4VJFRD4ONrrcAON/SzkUT Hw/JdFRCKkJjgwNQjIQoZcUNMTbFz2PlNIEnjJWm38YImQKQlCb2mXaZKCwBkf45 aheCZk17eKoxTCXFMd+KxlyNEtS2yBfq/PpZgvw7GW/pfFbWUg1+2O41LnihIe5v zu0hN1wNCQqgfxiMZqX1OTb9C/2vybzGsXILt+9nppjZ8EBU7iU= =wU6U -----END PGP SIGNATURE----- Merge tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull CFI on arm64 support from Kees Cook: "This builds on last cycle's LTO work, and allows the arm64 kernels to be built with Clang's Control Flow Integrity feature. This feature has happily lived in Android kernels for almost 3 years[1], so I'm excited to have it ready for upstream. The wide diffstat is mainly due to the treewide fixing of mismatched list_sort prototypes. Other things in core kernel are to address various CFI corner cases. The largest code portion is the CFI runtime implementation itself (which will be shared by all architectures implementing support for CFI). The arm64 pieces are Acked by arm64 maintainers rather than coming through the arm64 tree since carrying this tree over there was going to be awkward. CFI support for x86 is still under development, but is pretty close. There are a handful of corner cases on x86 that need some improvements to Clang and objtool, but otherwise works well. Summary: - Clean up list_sort prototypes (Sami Tolvanen) - Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)" * tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: arm64: allow CONFIG_CFI_CLANG to be selected KVM: arm64: Disable CFI for nVHE arm64: ftrace: use function_nocfi for ftrace_call arm64: add __nocfi to __apply_alternatives arm64: add __nocfi to functions that jump to a physical address arm64: use function_nocfi with __pa_symbol arm64: implement function_nocfi psci: use function_nocfi for cpu_resume lkdtm: use function_nocfi treewide: Change list_sort to use const pointers bpf: disable CFI in dispatcher functions kallsyms: strip ThinLTO hashes from static functions kthread: use WARN_ON_FUNCTION_MISMATCH workqueue: use WARN_ON_FUNCTION_MISMATCH module: ensure __cfi_check alignment mm: add generic function_nocfi macro cfi: add __cficanonical add support for Clang CFI |
||
Linus Torvalds
|
7e4910b9ac |
seccomp updates for v5.13-rc1
- Fix "cacheable" typo in comments (Cui GaoSheng) - Fix CONFIG for /proc/$pid/status Seccomp_filters (Kenta.Tada@sony.com) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmCHBe0ACgkQiXL039xt wCadDA//cy6LXlzJ78tRy1Zj4/iRlvfLGQ6rNuhoWkm9nuLOTJlzmb9lPxLFo1lo N4FDuXE0daPvmgy/XVu9wBKDZsgTlegzikGARfQmeHJ7Wj1H8ibz1OJPd1o60p4Y pfeImxefoNKxx7IxnNFDMLHgVi+CtnOZklwlj+bobIWjzclNB2EacumnyJlPuboW 4ZHBSkG1roLkBB4Q10fI7OHV8lSuQp/IyrAypLybydJ0xiZgvGD3NPOA4N8KH8nR A0kbA953Rld/PFzw5inRqepyPZKtT07LJyfl1ff60OtKOHkVBXPv6pYrdgWs0A9y XZxdHjVV/MHLvcK9dBoZZGi0/907fvcEgtMacaRekevD5sqiqtNOH5B5rQsMwtXs s/Kvg1KgmVJBQwFcMRuAfXqnnPy2672XvDU5/uptVbhpOIcIVeHtGvygPkiobuuO V1sE+huGCw+xnfRIOOmytRTpkHMlIS9ev1ApfXtuUtbXbM0W1G6H7adc0KE4bApm D/fpv97myH42r/UghOL5EHVaLcnw8embVr/ij4WpMiC1TrhWy0XU27oJisG6xRw6 A2Q4ybO3VM85LgteeQg10BZFmnuwfHMRJPBL4TOhNSs5GBx5EmkEFByozvMst5xR W/GIDn7g7jy1H0wuQOQ7NCgU5+RDDslCOjCIJdSipwpsTc65QCQ= =m4xO -----END PGP SIGNATURE----- Merge tag 'seccomp-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: - Fix "cacheable" typo in comments (Cui GaoSheng) - Fix CONFIG for /proc/$pid/status Seccomp_filters (Kenta.Tada@sony.com) * tag 'seccomp-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Fix "cacheable" typo in comments seccomp: Fix CONFIG tests for Seccomp_filters |
||
Alexey Dobriyan
|
0e0345b77a |
kbuild: redo fake deps at include/config/*.h
Make include/config/foo/bar.h fake deps files generation simpler. * delete .h suffix those aren't header files, shorten filenames, * delete tolower() Linux filesystems can deal with both upper and lowercase filenames very well, * put everything in 1 directory Presumably 'mkdir -p' split is from dark times when filesystems handled huge directories badly, disks were round adding to seek times. x86_64 allmodconfig lists 12364 files in include/config. ../obj/include/config/ ├── 104_QUAD_8 ├── 60XX_WDT ├── 64BIT ... ├── ZSWAP_DEFAULT_ON ├── ZSWAP_ZPOOL_DEFAULT └── ZSWAP_ZPOOL_DEFAULT_ZBUD 0 directories, 12364 files Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Yonghong Song
|
1fdd7433a9 |
kbuild: add an elfnote for whether vmlinux is built with lto
Currently, clang LTO built vmlinux won't work with pahole. LTO introduced cross-cu dwarf tag references and broke current pahole model which handles one cu as a time. The solution is to merge all cu's as one pahole cu as in [1]. We would like to do this merging only if cross-cu dwarf references happens. The LTO build mode is a pretty good indication for that. In earlier version of this patch ([2]), clang flag -grecord-gcc-switches is proposed to add to compilation flags so pahole could detect "-flto" and then merging cu's. This will increate the binary size of 1% without LTO though. Arnaldo suggested to use a note to indicate the vmlinux is built with LTO. Such a cheap way to get whether the vmlinux is built with LTO or not helps pahole but is also useful for tracing as LTO may inline/delete/demote global functions, promote static functions, etc. So this patch added an elfnote with a new type LINUX_ELFNOTE_LTO_INFO. The owner of the note is "Linux". With gcc 8.4.1 and clang trunk, without LTO, I got $ readelf -n vmlinux Displaying notes found in: .notes Owner Data size Description ... Linux 0x00000004 func description data: 00 00 00 00 ... With "readelf -x ".notes" vmlinux", I can verify the above "func" with type code 0x101. With clang thin-LTO, I got the same as above except the following: description data: 01 00 00 00 which indicates the vmlinux is built with LTO. [1] https://lore.kernel.org/bpf/20210325065316.3121287-1-yhs@fb.com/ [2] https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@fb.com/ Suggested-by: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> Signed-off-by: Yonghong Song <yhs@fb.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v12.0.0-rc4 (x86-64) Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Piotr Gorski
|
c3d7ef377e |
kbuild: add support for zstd compressed modules
kmod 28 supports modules compressed in zstd format so let's add this possibility to kernel. Signed-off-by: Piotr Gorski <lucjan.lucjanov@gmail.com> Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Masahiro Yamada
|
d4bbe94209 |
kbuild: remove CONFIG_MODULE_COMPRESS
CONFIG_MODULE_COMPRESS is only used to activate the choice for module compression algorithm. It will be simpler to make the choice always visible, and add CONFIG_MODULE_COMPRESS_NONE in the choice. This is more consistent with the "Kernel compression mode" and "Built-in initramfs compression mode" choices. CONFIG_KERNEL_UNCOMPRESSED and CONFIG_INITRAMFS_COMPRESSION_NONE are available to choose no compression. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> |
||
Masahiro Yamada
|
ba64beb174 |
kbuild: check the minimum assembler version in Kconfig
Documentation/process/changes.rst defines the minimum assembler version
(binutils version), but we have never checked it in the build time.
Kbuild never invokes 'as' directly because all assembly files in the
kernel tree are *.S, hence must be preprocessed. I do not expect
raw assembly source files (*.s) would be added to the kernel tree.
Therefore, we always use $(CC) as the assembler driver, and commit
|
||
Masahiro Yamada
|
6dd85ff178 |
kconfig: change "modules" from sub-option to first-level attribute
Now "modules" is the only member of the "option" property. Remove "option", and move "modules" to the top level property. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Masahiro Yamada
|
f8f0d06438 |
kconfig: do not use allnoconfig_y option
allnoconfig_y is an ugly hack that sets a symbol to 'y' by allnoconfig. allnoconfig does not mean a minimal set of CONFIG options because a bunch of prompts are hidden by 'if EMBEDDED' or 'if EXPERT', but I do not like to hack Kconfig this way. Use the pre-existing feature, KCONFIG_ALLCONFIG, to provide a one liner config fragment. CONFIG_EMBEDDED=y is still forced when allnoconfig is invoked as a part of tinyconfig. No change in the .config file produced by 'make tinyconfig'. The output of 'make allnoconfig' will be changed; we will get CONFIG_EMBEDDED=n because allnoconfig literally sets all symbols to n. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Masahiro Yamada
|
b75b0a819a |
kconfig: change defconfig_list option to environment variable
"defconfig_list" is a weird option that defines a static symbol that declares the list of base config files in case the .config does not exist yet. This is quite different from other normal symbols; we just abused the "string" type and the "default" properties to list out the input files. They must be fixed values since these are searched for and loaded in the parse stage. It is an ugly hack, and should not exist in the first place. Providing this feature as an environment variable is a saner approach. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Nayna Jain
|
0165f4ca22 |
ima: enable signing of modules with build time generated key
The kernel build process currently only signs kernel modules when MODULE_SIG is enabled. Also, sign the kernel modules at build time when IMA_APPRAISE_MODSIG is enabled. Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Acked-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> |
||
Sami Tolvanen
|
cf68fffb66 |
add support for Clang CFI
This change adds support for Clang’s forward-edge Control Flow Integrity (CFI) checking. With CONFIG_CFI_CLANG, the compiler injects a runtime check before each indirect function call to ensure the target is a valid function with the correct static type. This restricts possible call targets and makes it more difficult for an attacker to exploit bugs that allow the modification of stored function pointers. For more details, see: https://clang.llvm.org/docs/ControlFlowIntegrity.html Clang requires CONFIG_LTO_CLANG to be enabled with CFI to gain visibility to possible call targets. Kernel modules are supported with Clang’s cross-DSO CFI mode, which allows checking between independently compiled components. With CFI enabled, the compiler injects a __cfi_check() function into the kernel and each module for validating local call targets. For cross-module calls that cannot be validated locally, the compiler calls the global __cfi_slowpath_diag() function, which determines the target module and calls the correct __cfi_check() function. This patch includes a slowpath implementation that uses __module_address() to resolve call targets, and with CONFIG_CFI_CLANG_SHADOW enabled, a shadow map that speeds up module look-ups by ~3x. Clang implements indirect call checking using jump tables and offers two methods of generating them. With canonical jump tables, the compiler renames each address-taken function to <function>.cfi and points the original symbol to a jump table entry, which passes __cfi_check() validation. This isn’t compatible with stand-alone assembly code, which the compiler doesn’t instrument, and would result in indirect calls to assembly code to fail. Therefore, we default to using non-canonical jump tables instead, where the compiler generates a local jump table entry <function>.cfi_jt for each address-taken function, and replaces all references to the function with the address of the jump table entry. Note that because non-canonical jump table addresses are local to each component, they break cross-module function address equality. Specifically, the address of a global function will be different in each module, as it's replaced with the address of a local jump table entry. If this address is passed to a different module, it won’t match the address of the same function taken there. This may break code that relies on comparing addresses passed from other components. CFI checking can be disabled in a function with the __nocfi attribute. Additionally, CFI can be disabled for an entire compilation unit by filtering out CC_FLAGS_CFI. By default, CFI failures result in a kernel panic to stop a potential exploit. CONFIG_CFI_PERMISSIVE enables a permissive mode, where the kernel prints out a rate-limited warning instead, and allows execution to continue. This option is helpful for locating type mismatches, but should only be enabled during development. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-2-samitolvanen@google.com |
||
Kees Cook
|
39218ff4c6 |
stack: Optionally randomize kernel stack offset each syscall
This provides the ability for architectures to enable kernel stack base address offset randomization. This feature is controlled by the boot param "randomize_kstack_offset=on/off", with its default value set by CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT. This feature is based on the original idea from the last public release of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt All the credit for the original idea goes to the PaX team. Note that the design and implementation of this upstream randomize_kstack_offset feature differs greatly from the RANDKSTACK feature (see below). Reasoning for the feature: This feature aims to make harder the various stack-based attacks that rely on deterministic stack structure. We have had many such attacks in past (just to name few): https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html As Linux kernel stack protections have been constantly improving (vmap-based stack allocation with guard pages, removal of thread_info, STACKLEAK), attackers have had to find new ways for their exploits to work. They have done so, continuing to rely on the kernel's stack determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT were not relevant. For example, the following recent attacks would have been hampered if the stack offset was non-deterministic between syscalls: https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf (page 70: targeting the pt_regs copy with linear stack overflow) https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html (leaked stack address from one syscall as a target during next syscall) The main idea is that since the stack offset is randomized on each system call, it is harder for an attack to reliably land in any particular place on the thread stack, even with address exposures, as the stack base will change on the next syscall. Also, since randomization is performed after placing pt_regs, the ptrace-based approach[1] to discover the randomized offset during a long-running syscall should not be possible. Design description: During most of the kernel's execution, it runs on the "thread stack", which is pretty deterministic in its structure: it is fixed in size, and on every entry from userspace to kernel on a syscall the thread stack starts construction from an address fetched from the per-cpu cpu_current_top_of_stack variable. The first element to be pushed to the thread stack is the pt_regs struct that stores all required CPU registers and syscall parameters. Finally the specific syscall function is called, with the stack being used as the kernel executes the resulting request. The goal of randomize_kstack_offset feature is to add a random offset after the pt_regs has been pushed to the stack and before the rest of the thread stack is used during the syscall processing, and to change it every time a process issues a syscall. The source of randomness is currently architecture-defined (but x86 is using the low byte of rdtsc()). Future improvements for different entropy sources is possible, but out of scope for this patch. Further more, to add more unpredictability, new offsets are chosen at the end of syscalls (the timing of which should be less easy to measure from userspace than at syscall entry time), and stored in a per-CPU variable, so that the life of the value does not stay explicitly tied to a single task. As suggested by Andy Lutomirski, the offset is added using alloca() and an empty asm() statement with an output constraint, since it avoids changes to assembly syscall entry code, to the unwinder, and provides correct stack alignment as defined by the compiler. In order to make this available by default with zero performance impact for those that don't want it, it is boot-time selectable with static branches. This way, if the overhead is not wanted, it can just be left turned off with no performance impact. The generated assembly for x86_64 with GCC looks like this: ... ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax # 12380 <kstack_offset> ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax ffffffff81003983: 48 83 c0 0f add $0xf,%rax ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax ffffffff8100398c: 48 29 c4 sub %rax,%rsp ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax ... As a result of the above stack alignment, this patch introduces about 5 bits of randomness after pt_regs is spilled to the thread stack on x86_64, and 6 bits on x86_32 (since its has 1 fewer bit required for stack alignment). The amount of entropy could be adjusted based on how much of the stack space we wish to trade for security. My measure of syscall performance overhead (on x86_64): lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null randomize_kstack_offset=y Simple syscall: 0.7082 microseconds randomize_kstack_offset=n Simple syscall: 0.7016 microseconds So, roughly 0.9% overhead growth for a no-op syscall, which is very manageable. And for people that don't want this, it's off by default. There are two gotchas with using the alloca() trick. First, compilers that have Stack Clash protection (-fstack-clash-protection) enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to any dynamic stack allocations. While the randomization offset is always less than a page, the resulting assembly would still contain (unreachable!) probing routines, bloating the resulting assembly. To avoid this, -fno-stack-clash-protection is unconditionally added to the kernel Makefile since this is the only dynamic stack allocation in the kernel (now that VLAs have been removed) and it is provably safe from Stack Clash style attacks. The second gotcha with alloca() is a negative interaction with -fstack-protector*, in that it sees the alloca() as an array allocation, which triggers the unconditional addition of the stack canary function pre/post-amble which slows down syscalls regardless of the static branch. In order to avoid adding this unneeded check and its associated performance impact, architectures need to carefully remove uses of -fstack-protector-strong (or -fstack-protector) in the compilation units that use the add_random_kstack() macro and to audit the resulting stack mitigation coverage (to make sure no desired coverage disappears). No change is visible for this on x86 because the stack protector is already unconditionally disabled for the compilation unit, but the change is required on arm64. There is, unfortunately, no attribute that can be used to disable stack protector for specific functions. Comparison to PaX RANDKSTACK feature: The RANDKSTACK feature randomizes the location of the stack start (cpu_current_top_of_stack), i.e. including the location of pt_regs structure itself on the stack. Initially this patch followed the same approach, but during the recent discussions[2], it has been determined to be of a little value since, if ptrace functionality is available for an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write different offsets in the pt_regs struct, observe the cache behavior of the pt_regs accesses, and figure out the random stack offset. Another difference is that the random offset is stored in a per-cpu variable, rather than having it be per-thread. As a result, these implementations differ a fair bit in their implementation details and results, though obviously the intent is similar. [1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/ [2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/ [3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html Co-developed-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org |
||
Vipin Sharma
|
a72232eabd |
cgroup: Add misc cgroup controller
The Miscellaneous cgroup provides the resource limiting and tracking mechanism for the scalar resources which cannot be abstracted like the other cgroup resources. Controller is enabled by the CONFIG_CGROUP_MISC config option. A resource can be added to the controller via enum misc_res_type{} in the include/linux/misc_cgroup.h file and the corresponding name via misc_res_name[] in the kernel/cgroup/misc.c file. Provider of the resource must set its capacity prior to using the resource by calling misc_cg_set_capacity(). Once a capacity is set then the resource usage can be updated using charge and uncharge APIs. All of the APIs to interact with misc controller are in include/linux/misc_cgroup.h. Miscellaneous controller provides 3 interface files. If two misc resources (res_a and res_b) are registered then: misc.capacity A read-only flat-keyed file shown only in the root cgroup. It shows miscellaneous scalar resources available on the platform along with their quantities:: $ cat misc.capacity res_a 50 res_b 10 misc.current A read-only flat-keyed file shown in the non-root cgroups. It shows the current usage of the resources in the cgroup and its children:: $ cat misc.current res_a 3 res_b 0 misc.max A read-write flat-keyed file shown in the non root cgroups. Allowed maximum usage of the resources in the cgroup and its children.:: $ cat misc.max res_a max res_b 4 Limit can be set by:: # echo res_a 1 > misc.max Limit can be set to max by:: # echo res_a max > misc.max Limits can be set more than the capacity value in the misc.capacity file. Signed-off-by: Vipin Sharma <vipinsh@google.com> Reviewed-by: David Rientjes <rientjes@google.com> Signed-off-by: Tejun Heo <tj@kernel.org> |
||
Kenta.Tada@sony.com
|
64bdc02440 |
seccomp: Fix CONFIG tests for Seccomp_filters
Strictly speaking, seccomp filters are only used
when CONFIG_SECCOMP_FILTER.
This patch fixes the condition to enable "Seccomp_filters"
in /proc/$pid/status.
Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
Fixes:
|
||
David S. Miller
|
efd13b71a3 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Cao jin
|
2b7d2fe76f |
bootconfig: Update prototype of setup_boot_config()
Parameter "cmdline" has no use, drop it. Link: https://lkml.kernel.org/r/20210311085213.27680-1-jojing64@gmail.com Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Cao jin <jojing64@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
Linus Torvalds
|
50eb842fe5 |
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton: "28 patches. Subsystems affected by this series: mm (memblock, pagealloc, hugetlb, highmem, kfence, oom-kill, madvise, kasan, userfaultfd, memcg, and zram), core-kernel, kconfig, fork, binfmt, MAINTAINERS, kbuild, and ia64" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (28 commits) zram: fix broken page writeback zram: fix return value on writeback_store mm/memcg: set memcg when splitting page mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls mm/userfaultfd: fix memory corruption due to writeprotect kasan: fix KASAN_STACK dependency for HW_TAGS kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC mm/madvise: replace ptrace attach requirement for process_madvise include/linux/sched/mm.h: use rcu_dereference in in_vfork() kfence: fix reports if constant function prefixes exist kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations kfence: fix printk format for ptrdiff_t linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP* MAINTAINERS: exclude uapi directories in API/ABI section binfmt_misc: fix possible deadlock in bm_register_write mm/highmem.c: fix zero_user_segments() with start > end hugetlb: do early cow when page pinned on src mm mm: use is_cow_mapping() across tree where proper ... |
||
Masahiro Yamada
|
ea29b20a82 |
init/Kconfig: make COMPILE_TEST depend on HAS_IOMEM
I read the commit log of the following two: - |
||
Masahiro Yamada
|
ce6ed1c4c9 |
kbuild: rebuild GCC plugins when the compiler is upgraded
Linus reported a build error due to the GCC plugin incompatibility
when the compiler is upgraded. [1]
GCC plugins are tied to a particular GCC version. So, they must be
rebuilt when the compiler is upgraded.
This seems to be a long-standing flaw since the initial support of
GCC plugins.
Extend commit
|
||
David S. Miller
|
c1acda9807 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-09 The following pull-request contains BPF updates for your *net-next* tree. We've added 90 non-merge commits during the last 17 day(s) which contain a total of 114 files changed, 5158 insertions(+), 1288 deletions(-). The main changes are: 1) Faster bpf_redirect_map(), from Björn. 2) skmsg cleanup, from Cong. 3) Support for floating point types in BTF, from Ilya. 4) Documentation for sys_bpf commands, from Joe. 5) Support for sk_lookup in bpf_prog_test_run, form Lorenz. 6) Enable task local storage for tracing programs, from Song. 7) bpf_for_each_map_elem() helper, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Masahiro Yamada
|
a6aaeb8411 |
kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO
Commit |
||
Cong Wang
|
887596095e |
bpf: Clean up sockmap related Kconfigs
As suggested by John, clean up sockmap related Kconfigs: Reduce the scope of CONFIG_BPF_STREAM_PARSER down to TCP stream parser, to reflect its name. Make the rest sockmap code simply depend on CONFIG_BPF_SYSCALL and CONFIG_INET, the latter is still needed at this point because of TCP/UDP proto update. And leave CONFIG_NET_SOCK_MSG untouched, as it is used by non-sockmap cases. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Lorenz Bauer <lmb@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210223184934.6054-2-xiyou.wangcong@gmail.com |
||
Linus Torvalds
|
8b83369ddc |
RISC-V Patches for the 5.12 Merge Window
I have a handful of new RISC-V related patches for this merge window: * A check to ensure drivers are properly using uaccess. This isn't manifesting with any of the drivers I'm currently using, but may catch errors in new drivers. * Some preliminary support for the FU740, along with the HiFive Unleashed it will appear on. * NUMA support for RISC-V, which involves making the arm64 code generic. * Support for kasan on the vmalloc region. * A handful of new drivers for the Kendryte K210, along with the DT plumbing required to boot on a handful of K210-based boards. * Support for allocating ASIDs. * Preliminary support for kernels larger than 128MiB. * Various other improvements to our KASAN support, including the utilization of huge pages when allocating the KASAN regions. We may have already found a bug with the KASAN_VMALLOC code, but it's passing my tests. There's a fix in the works, but that will probably miss the merge window. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmA4hXATHHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYifryD/0SfXGOfj93Cxq7I7AYhhzCN7lJ5jvv iEQScTlPqU9nfvYodo4EDq0fp+5LIPpTL/XBHtqVjzv0FqRNa28Ea0K7kO8HuXc4 BaUd0m/DqyB4Gfgm4qjc5bDneQ1ZYxVXprYERWNQ5Fj+tdWhaQGOW64N/TVodjjj NgJtTqbIAcjJqjUtttM8TZN5U1TgwLo+KCqw3iYW12lV1YKBBuvrwvSdD6jnFdIQ AzG/wRGZhxLoFxgBB/NEsZxDoSd6ztiwxLhS9lX4okZVsryyIdOE70Q/MflfiTlU xE+AdxQXTMUiiqYSmHeDD6PDb57GT/K3hnjI1yP+lIZpbInsi29JKow1qjyYjfHl 9cSSKYCIXHL7jKU6pgt34G1O5N5+fgqHQhNbfKvlrQ2UPlfs/tWdKHpFIP/z9Jlr 0vCAou7NSEB9zZGqzO63uBLXoN8yfL8FT3uRnnRvoRpfpex5dQX2QqPLQ7327D7N GUG31nd1PHTJPdxJ1cI4SO24PqPpWDWY9uaea+0jv7ivGClVadZPco/S3ZKloguT lazYUvyA4oRrSAyln785Rd8vg4CinqTxMtIyZbRMbNkgzVQARi9a8rjvu4n9qms2 2wlXDFi8nR8B4ih5n79dSiiLM9ay9GJDxMcf9VxIxSAYZV2fJALnpK6gV2fzRBUe +k/uv8BIsFmlwQ== =CutX -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "A handful of new RISC-V related patches for this merge window: - A check to ensure drivers are properly using uaccess. This isn't manifesting with any of the drivers I'm currently using, but may catch errors in new drivers. - Some preliminary support for the FU740, along with the HiFive Unleashed it will appear on. - NUMA support for RISC-V, which involves making the arm64 code generic. - Support for kasan on the vmalloc region. - A handful of new drivers for the Kendryte K210, along with the DT plumbing required to boot on a handful of K210-based boards. - Support for allocating ASIDs. - Preliminary support for kernels larger than 128MiB. - Various other improvements to our KASAN support, including the utilization of huge pages when allocating the KASAN regions. We may have already found a bug with the KASAN_VMALLOC code, but it's passing my tests. There's a fix in the works, but that will probably miss the merge window. * tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits) riscv: Improve kasan population by using hugepages when possible riscv: Improve kasan population function riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization riscv: Improve kasan definitions riscv: Get rid of MAX_EARLY_MAPPING_SIZE soc: canaan: Sort the Makefile alphabetically riscv: Disable KSAN_SANITIZE for vDSO riscv: Remove unnecessary declaration riscv: Add Canaan Kendryte K210 SD card defconfig riscv: Update Canaan Kendryte K210 defconfig riscv: Add Kendryte KD233 board device tree riscv: Add SiPeed MAIXDUINO board device tree riscv: Add SiPeed MAIX GO board device tree riscv: Add SiPeed MAIX DOCK board device tree riscv: Add SiPeed MAIX BiT board device tree riscv: Update Canaan Kendryte K210 device tree dt-bindings: add resets property to dw-apb-timer dt-bindings: fix sifive gpio properties dt-bindings: update sifive uart compatible string dt-bindings: update sifive clint compatible string ... |
||
Florian Fainelli
|
dd23e8098f |
initramfs: panic with memory information
On systems with large amounts of reserved memory we may fail to successfully complete unpack_to_rootfs() and be left with: Kernel panic - not syncing: write error this is not too helpful to understand what happened, so let's wrap the panic() calls with a surrounding show_mem() such that we have a chance of understanding the memory conditions leading to these allocation failures. [akpm@linux-foundation.org: replace macro with C function] Link: https://lkml.kernel.org/r/20210114231517.1854379-1-f.fainelli@gmail.com Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Barret Rhoden <brho@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Sumit Garg
|
d54ce6158e |
kgdb: fix to kill breakpoints on initmem after boot
Currently breakpoints in kernel .init.text section are not handled correctly while allowing to remove them even after corresponding pages have been freed. Fix it via killing .init.text section breakpoints just prior to initmem pages being freed. Doug: "HW breakpoints aren't handled by this patch but it's probably not such a big deal". Link: https://lkml.kernel.org/r/20210224081652.587785-1-sumit.garg@linaro.org Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Suggested-by: Doug Anderson <dianders@chromium.org> Acked-by: Doug Anderson <dianders@chromium.org> Acked-by: Daniel Thompson <daniel.thompson@linaro.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Bhaskar Chowdhury
|
f9c8bc4604 |
init/Kconfig: fix a typo in CC_VERSION_TEXT help text
s/compier/compiler/ Link: https://lkml.kernel.org/r/20210224223325.29099-1-unixbhaskar@gmail.com Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Masahiro Yamada
|
073a9ecb3a |
init/version.c: remove Version_<LINUX_VERSION_CODE> symbol
This code hunk creates a Version_<LINUX_VERSION_CODE> symbol if CONFIG_KALLSYMS is disabled. For example, building the kernel v5.10 for allnoconfig creates the following symbol: $ nm vmlinux | grep Version_ c116b028 B Version_330240 There is no in-tree user of this symbol. Commit |
||
Vijayanand Jitta
|
e1fdc40334 |
lib: stackdepot: add support to disable stack depot
Add a kernel parameter stack_depot_disable to disable stack depot. So that stack hash table doesn't consume any memory when stack depot is disabled. The use case is CONFIG_PAGE_OWNER without page_owner=on. Without this patch, stackdepot will consume the memory for the hashtable. By default, it's 8M which is never trivial. With this option, in CONFIG_PAGE_OWNER configured system, page_owner=off, stack_depot_disable in kernel command line, we could save the wasted memory for the hashtable. [akpm@linux-foundation.org: fix CONFIG_STACKDEPOT=n build] Link: https://lkml.kernel.org/r/1611749198-24316-2-git-send-email-vjitta@codeaurora.org Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org> Signed-off-by: Vijayanand Jitta <vjitta@codeaurora.org> Cc: Alexander Potapenko <glider@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Yogesh Lal <ylal@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Alexander Potapenko
|
0ce20dd840 |
mm: add Kernel Electric-Fence infrastructure
Patch series "KFENCE: A low-overhead sampling-based memory safety error detector", v7. This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a low-overhead sampling-based memory safety error detector of heap use-after-free, invalid-free, and out-of-bounds access errors. This series enables KFENCE for the x86 and arm64 architectures, and adds KFENCE hooks to the SLAB and SLUB allocators. KFENCE is designed to be enabled in production kernels, and has near zero performance overhead. Compared to KASAN, KFENCE trades performance for precision. The main motivation behind KFENCE's design, is that with enough total uptime KFENCE will detect bugs in code paths not typically exercised by non-production test workloads. One way to quickly achieve a large enough total uptime is when the tool is deployed across a large fleet of machines. KFENCE objects each reside on a dedicated page, at either the left or right page boundaries. The pages to the left and right of the object page are "guard pages", whose attributes are changed to a protected state, and cause page faults on any attempted access to them. Such page faults are then intercepted by KFENCE, which handles the fault gracefully by reporting a memory access error. Guarded allocations are set up based on a sample interval (can be set via kfence.sample_interval). After expiration of the sample interval, the next allocation through the main allocator (SLAB or SLUB) returns a guarded allocation from the KFENCE object pool. At this point, the timer is reset, and the next allocation is set up after the expiration of the interval. To enable/disable a KFENCE allocation through the main allocator's fast-path without overhead, KFENCE relies on static branches via the static keys infrastructure. The static branch is toggled to redirect the allocation to KFENCE. The KFENCE memory pool is of fixed size, and if the pool is exhausted no further KFENCE allocations occur. The default config is conservative with only 255 objects, resulting in a pool size of 2 MiB (with 4 KiB pages). We have verified by running synthetic benchmarks (sysbench I/O, hackbench) and production server-workload benchmarks that a kernel with KFENCE (using sample intervals 100-500ms) is performance-neutral compared to a non-KFENCE baseline kernel. KFENCE is inspired by GWP-ASan [1], a userspace tool with similar properties. The name "KFENCE" is a homage to the Electric Fence Malloc Debugger [2]. For more details, see Documentation/dev-tools/kfence.rst added in the series -- also viewable here: https://raw.githubusercontent.com/google/kasan/kfence/Documentation/dev-tools/kfence.rst [1] http://llvm.org/docs/GwpAsan.html [2] https://linux.die.net/man/3/efence This patch (of 9): This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a low-overhead sampling-based memory safety error detector of heap use-after-free, invalid-free, and out-of-bounds access errors. KFENCE is designed to be enabled in production kernels, and has near zero performance overhead. Compared to KASAN, KFENCE trades performance for precision. The main motivation behind KFENCE's design, is that with enough total uptime KFENCE will detect bugs in code paths not typically exercised by non-production test workloads. One way to quickly achieve a large enough total uptime is when the tool is deployed across a large fleet of machines. KFENCE objects each reside on a dedicated page, at either the left or right page boundaries. The pages to the left and right of the object page are "guard pages", whose attributes are changed to a protected state, and cause page faults on any attempted access to them. Such page faults are then intercepted by KFENCE, which handles the fault gracefully by reporting a memory access error. To detect out-of-bounds writes to memory within the object's page itself, KFENCE also uses pattern-based redzones. The following figure illustrates the page layout: ---+-----------+-----------+-----------+-----------+-----------+--- | xxxxxxxxx | O : | xxxxxxxxx | : O | xxxxxxxxx | | xxxxxxxxx | B : | xxxxxxxxx | : B | xxxxxxxxx | | x GUARD x | J : RED- | x GUARD x | RED- : J | x GUARD x | | xxxxxxxxx | E : ZONE | xxxxxxxxx | ZONE : E | xxxxxxxxx | | xxxxxxxxx | C : | xxxxxxxxx | : C | xxxxxxxxx | | xxxxxxxxx | T : | xxxxxxxxx | : T | xxxxxxxxx | ---+-----------+-----------+-----------+-----------+-----------+--- Guarded allocations are set up based on a sample interval (can be set via kfence.sample_interval). After expiration of the sample interval, a guarded allocation from the KFENCE object pool is returned to the main allocator (SLAB or SLUB). At this point, the timer is reset, and the next allocation is set up after the expiration of the interval. To enable/disable a KFENCE allocation through the main allocator's fast-path without overhead, KFENCE relies on static branches via the static keys infrastructure. The static branch is toggled to redirect the allocation to KFENCE. To date, we have verified by running synthetic benchmarks (sysbench I/O, hackbench) that a kernel compiled with KFENCE is performance-neutral compared to the non-KFENCE baseline. For more details, see Documentation/dev-tools/kfence.rst (added later in the series). [elver@google.com: fix parameter description for kfence_object_start()] Link: https://lkml.kernel.org/r/20201106092149.GA2851373@elver.google.com [elver@google.com: avoid stalling work queue task without allocations] Link: https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com Link: https://lkml.kernel.org/r/20201110135320.3309507-1-elver@google.com [elver@google.com: fix potential deadlock due to wake_up()] Link: https://lkml.kernel.org/r/000000000000c0645805b7f982e4@google.com Link: https://lkml.kernel.org/r/20210104130749.1768991-1-elver@google.com [elver@google.com: add option to use KFENCE without static keys] Link: https://lkml.kernel.org/r/20210111091544.3287013-1-elver@google.com [elver@google.com: add missing copyright and description headers] Link: https://lkml.kernel.org/r/20210118092159.145934-1-elver@google.com Link: https://lkml.kernel.org/r/20201103175841.3495947-2-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: SeongJae Park <sjpark@amazon.de> Co-developed-by: Marco Elver <elver@google.com> Reviewed-by: Jann Horn <jannh@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christopher Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hillf Danton <hdanton@sina.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Joern Engel <joern@purestorage.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
6fbd6cf85a |
Kbuild updates for v5.12
- Fix false-positive build warnings for ARCH=ia64 builds - Optimize dictionary size for module compression with xz - Check the compiler and linker versions in Kconfig - Fix misuse of extra-y - Support DWARF v5 debug info - Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x exceeded the limit - Add generic syscall{tbl,hdr}.sh for cleanups across arches - Minor cleanups of genksyms - Minor cleanups of Kconfig -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmA3zhgVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsG0C4P/A5hUNFdkYI+EffAWZiHn69t0S8j M1GQkZildKu/yOfm6hp3mNwgHmYgw0aAuch1htkJuv+5rXRtoK77yw0xKbUqNHyO VqkJWQPVUXJbWIDiu332NaETHbFTWCnPZKGmzcbVOBHbYsXUJPp17gROQ9ke0fQN Ae6OV5WINhoS8UnjESWb3qOO87MdQTZ+9mP+NMnVh4kV1SUeMAXLFwFll66KZTkj GXB330N3p9L0wQVljhXpQ/YPOd76wJNPhJWJ9+hKLFbWsedovzlHb+duprh1z1xe 7LLaq9dEbXxe1Uz0qmK76lupXxilYMyUupTW9HIYtIsY8br8DIoBOG0bn46LVnuL /m+UQNfUFCYYePT7iZQNNc1DISQJrxme3bjq0PJzZTDukNnHJVahnj9x4RoNaF8j Dc+JME0r2i8Ccp28vgmaRgzvSsb8Xtw5icwRdwzIpyt1ubs/+tkd/GSaGzQo30Q8 m8y1WOjovHNX7OGnOaOWBGoQAX/2k/VHeAediMsPqWUoOxwsLHYxG/4KtgwbJ5vc gu/Fyk1GRDklZPpLdYFVvz8TGnqSDogJgF+7WolJ6YvPGAUIDAfd5Ky2sWayddlm wchc3sKDVyh3lov23h0WQVTvLO9xl+NZ6THxoAGdYeQ0DUu5OxwH8qje/UpWuo1a DchhNN+g5pa6n56Z =sLxb -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Fix false-positive build warnings for ARCH=ia64 builds - Optimize dictionary size for module compression with xz - Check the compiler and linker versions in Kconfig - Fix misuse of extra-y - Support DWARF v5 debug info - Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x exceeded the limit - Add generic syscall{tbl,hdr}.sh for cleanups across arches - Minor cleanups of genksyms - Minor cleanups of Kconfig * tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (38 commits) initramfs: Remove redundant dependency of RD_ZSTD on BLK_DEV_INITRD kbuild: remove deprecated 'always' and 'hostprogs-y/m' kbuild: parse C= and M= before changing the working directory kbuild: reuse this-makefile to define abs_srctree kconfig: unify rule of config, menuconfig, nconfig, gconfig, xconfig kconfig: omit --oldaskconfig option for 'make config' kconfig: fix 'invalid option' for help option kconfig: remove dead code in conf_askvalue() kconfig: clean up nested if-conditionals in check_conf() kconfig: Remove duplicate call to sym_get_string_value() Makefile: Remove # characters from compiler string Makefile: reuse CC_VERSION_TEXT kbuild: check the minimum linker version in Kconfig kbuild: remove ld-version macro scripts: add generic syscallhdr.sh scripts: add generic syscalltbl.sh arch: syscalls: remove $(srctree)/ prefix from syscall tables arch: syscalls: add missing FORCE and fix 'targets' to make if_changed work gen_compile_commands: prune some directories kbuild: simplify access to the kernel's version ... |
||
Linus Torvalds
|
4c48faba5b |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: "A few small subsystems and some of MM. 172 patches. Subsystems affected by this patch series: hexagon, scripts, ntfs, ocfs2, vfs, and mm (slab-generic, slab, slub, debug, pagecache, swap, memcg, pagemap, mprotect, mremap, page-reporting, vmalloc, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, mempolicy, oom-kill, hugetlbfs, and migration)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (172 commits) mm/migrate: remove unneeded semicolons hugetlbfs: remove unneeded return value of hugetlb_vmtruncate() hugetlbfs: fix some comment typos hugetlbfs: correct some obsolete comments about inode i_mutex hugetlbfs: make hugepage size conversion more readable hugetlbfs: remove meaningless variable avoid_reserve hugetlbfs: correct obsolete function name in hugetlbfs_read_iter() hugetlbfs: use helper macro default_hstate in init_hugetlbfs_fs hugetlbfs: remove useless BUG_ON(!inode) in hugetlbfs_setattr() hugetlbfs: remove special hugetlbfs_set_page_dirty() mm/hugetlb: change hugetlb_reserve_pages() to type bool mm, oom: fix a comment in dump_task() mm/mempolicy: use helper range_in_vma() in queue_pages_test_walk() numa balancing: migrate on fault among multiple bound nodes mm, compaction: make fast_isolate_freepages() stay within zone mm/compaction: fix misbehaviors of fast_find_migrateblock() mm/compaction: correct deferral logic for proactive compaction mm/compaction: remove duplicated VM_BUG_ON_PAGE !PageLocked mm/compaction: remove rcu_read_lock during page compaction z3fold: simplify the zhdr initialization code in init_z3fold_page() ... |
||
Vlastimil Babka
|
fe2cce15d6 |
mm, slub: remove slub_memcg_sysfs boot param and CONFIG_SLUB_MEMCG_SYSFS_ON
The boot param and config determine the value of memcg_sysfs_enabled,
which is unused since commit
|
||
Linus Torvalds
|
c4fbde84fe |
Simple Firmware Interface (SFI) support removal for v5.12-rc1
Drop support for depercated platforms using SFI, drop the entire support for SFI that has been long deprecated too and make some janitorial changes on top of that (Andy Shevchenko). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmA2ZukSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxKcAP/RAkbRVFndhQIZYTCu74O64v86FjTBcS 3vvcKevVkBJiPJL1l10Yo3UMEYAbJIRZY00jkUjX7pq4eurELu6LwdMtJlHwh0p5 ZP5QeSdq1xN+9UGwBGXlnka2ypmD8fjbQyxHKErYgvmOl4ltFm40PyUC9GCVFLnW /1o83t/dcmTtaOGPYWTW3HuCsbYqANG/x8PYAFeAk5dBxoSaNV69gAEuCYr1JC5N Nie4x2m2I5v9egJFhy6rmRrpHPBvocCho+FipJFagSKWHPCI2rBSKESVOj23zWt2 eIWhK5T/ZR3OqQb9tZN6uAPJmBAerc3l7ZHZ1oFBP68MjUJJJhduQ+hNxljOyLLw CVx0UhuancIWZdyJon5f7E9S9STZLIZ/3usx3K+7AZK+PSmH8d/UEIeXfkC0FcAr eO3gwalB9KuhhXbVvihW79RkfkV5pTaMvVS7l1BffN4WE1dB9PKtJ8/MKFbGaTUF 4Rev6BdAEDqJrw6OIARvNcI6TAEhbKe5yIghzhQWn+fZ7oEm6f6fvFObBzD0KvQP 4RwYJhXU0gtK5yo/Ib1sUqjVQn8Jgqb7Xq46WZsP07Yc6O2Ws/86qCpX1GSCv5FU 1CZEJLGLGTbjDYOyMaUDfO/tI5kXG11e0Ss7Q+snWH4Iyhg0aNEYChKjOAFIxIxg JJYOH8O5p2IP =jlPz -----END PGP SIGNATURE----- Merge tag 'sfi-removal-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull Simple Firmware Interface (SFI) support removal from Rafael Wysocki: "Drop support for depercated platforms using SFI, drop the entire support for SFI that has been long deprecated too and make some janitorial changes on top of that (Andy Shevchenko)" * tag 'sfi-removal-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/platform/intel-mid: Update Copyright year and drop file names x86/platform/intel-mid: Remove unused header inclusion in intel-mid.h x86/platform/intel-mid: Drop unused __intel_mid_cpu_chip and Co. x86/platform/intel-mid: Get rid of intel_scu_ipc_legacy.h x86/PCI: Describe @reg for type1_access_ok() x86/PCI: Get rid of custom x86 model comparison sfi: Remove framework for deprecated firmware cpufreq: sfi-cpufreq: Remove driver for deprecated firmware media: atomisp: Remove unused header mfd: intel_msic: Remove driver for deprecated platform x86/apb_timer: Remove driver for deprecated platform x86/platform/intel-mid: Remove unused leftovers (vRTC) x86/platform/intel-mid: Remove unused leftovers (msic) x86/platform/intel-mid: Remove unused leftovers (msic_thermal) x86/platform/intel-mid: Remove unused leftovers (msic_power_btn) x86/platform/intel-mid: Remove unused leftovers (msic_gpio) x86/platform/intel-mid: Remove unused leftovers (msic_battery) x86/platform/intel-mid: Remove unused leftovers (msic_ocd) x86/platform/intel-mid: Remove unused leftovers (msic_audio) platform/x86: intel_scu_wdt: Drop mistakenly added const |
||
Linus Torvalds
|
a555bdd0c5 |
Kbuild: enable TRIM_UNUSED_KSYMS again, with some guarding
In commit
|
||
Linus Torvalds
|
5cf0fd591f |
Kbuild: disable TRIM_UNUSED_KSYMS option
The removal of EXPORT_UNUSED_SYMBOL() in commit |
||
Linus Torvalds
|
21a6ab2131 |
Modules updates for v5.12
Summary of modules changes for the 5.12 merge window: - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These export types were introduced between 2006 - 2008. All the of the unused symbols have been long removed and gpl future symbols were converted to gpl quite a long time ago, and I don't believe these export types have been used ever since. So, I think it should be safe to retire those export types now. (Christoph Hellwig) - Refactor and clean up some aged code cruft in the module loader (Christoph Hellwig) - Build {,module_}kallsyms_on_each_symbol only when livepatching is enabled, as it is the only caller (Christoph Hellwig) - Unexport find_module() and module_mutex and fix the last module callers to not rely on these anymore. Make module_mutex internal to the module loader. (Christoph Hellwig) - Harden ELF checks on module load and validate ELF structures before checking the module signature (Frank van der Linden) - Fix undefined symbol warning for clang (Fangrui Song) - Fix smatch warning (Dan Carpenter) Signed-off-by: Jessica Yu <jeyu@kernel.org> -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEVrp26glSWYuDNrCUwEV+OM47wXIFAmA0/KMQHGpleXVAa2Vy bmVsLm9yZwAKCRDARX44zjvBcu0uD/4nmRp18EKAtdUZivsZHat0aEWGlkmrVueY 5huYw6iwM8b/wIAl3xwLki1Iv0/l0a83WXZhLG4ekl0/Nj8kgllA+jtBrZWpoLMH CZusN5dS9YwwyD2vu3ak83ARcehcDEPeA9thvc3uRFGis6Hi4bt1rkzGdrzsgqR4 tybfN4qaQx4ZAKFxA8bnS58l7QTFwUzTxJfM6WWzl1Q+mLZDr/WP+loJ/f1/oFFg ufN31KrqqFpdQY5UKq5P4H8FVq/eXE1Mwl8vo3HsnAj598fznyPUmA3D/j+N4GuR sTGBVZ9CSehUj7uZRs+Qgg6Bd+y3o44N29BrdZWA6K3ieTeQQpA+VgPUNrDBjGhP J/9Y4ms4PnuNEWWRaa73m9qsVqAsjh9+T2xp9PYn9uWLCM8BvQFtWcY7tw4/nB0/ INmyiP/tIRpwWkkBl47u1TPR09FzBBGDZjBiSn3lm3VX+zCYtHoma5jWyejG11cf ybDrTsci9ANyHNP2zFQsUOQJkph78PIal0i3k4ODqGJvaC0iEIH3Xjv+0dmE14rq kGRrG/HN6HhMZPjashudVUktyTZ63+PJpfFlQbcUzdvjQQIkzW0vrCHMWx9vD1xl Na7vZLl4Nb03WSJp6saY6j2YSRKL0poGETzGqrsUAHEhpEOPHduaiCVlAr/EmeMk p6SrWv8+UQ== =T29Q -----END PGP SIGNATURE----- Merge tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull module updates from Jessica Yu: - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These export types were introduced between 2006 - 2008. All the of the unused symbols have been long removed and gpl future symbols were converted to gpl quite a long time ago, and I don't believe these export types have been used ever since. So, I think it should be safe to retire those export types now (Christoph Hellwig) - Refactor and clean up some aged code cruft in the module loader (Christoph Hellwig) - Build {,module_}kallsyms_on_each_symbol only when livepatching is enabled, as it is the only caller (Christoph Hellwig) - Unexport find_module() and module_mutex and fix the last module callers to not rely on these anymore. Make module_mutex internal to the module loader (Christoph Hellwig) - Harden ELF checks on module load and validate ELF structures before checking the module signature (Frank van der Linden) - Fix undefined symbol warning for clang (Fangrui Song) - Fix smatch warning (Dan Carpenter) * tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: module: potential uninitialized return in module_kallsyms_on_each_symbol() module: remove EXPORT_UNUSED_SYMBOL* module: remove EXPORT_SYMBOL_GPL_FUTURE module: move struct symsearch to module.c module: pass struct find_symbol_args to find_symbol module: merge each_symbol_section into find_symbol module: remove each_symbol_in_section module: mark module_mutex static kallsyms: only build {,module_}kallsyms_on_each_symbol when required kallsyms: refactor {,module_}kallsyms_on_each_symbol module: use RCU to synchronize find_module module: unexport find_module and module_mutex drm: remove drm_fb_helper_modinit powerpc/powernv: remove get_cxl_module module: harden ELF info handling module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols |
||
Linus Torvalds
|
79db4d2293 |
clang-lto series for v5.12-rc1
- Clang LTO build infrastructure and arm64-specific enablement (Sami Tolvanen)
- Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmA0OEYACgkQiXL039xt
wCYGJw/8CcyvQUGmXYEZVDLMahKz93RYijiGuSTVnhl0pNAyfOojaZ8Z//eD1VNA
s82azW1XybbA6RnPGD7YQzYz27cSF2qUFDmplwVfE4mwBnPXzRxtVBDLSxksP1HS
77sCOu91QhbovPCWET4dSHLJB3DVc78FiW4lVlRgrglyAz+dut1iXYar5e7VNoS0
S4MwnqwteHC6YXP619rubhpdDoj7njuw1uxRIaodt9S/zRSpl5MCUgHmzQusgezs
yWDdPHPWHnF7xxKgwSvE7AKZPdOnIxKxRi6Yd6vUIyrYB3qLZkFe75nUsgMroAhs
/Bgrn69U2McMiJsOdh0ERzP2VNYfvMacBQ308nb45j83Bgv5l6uj8QOZU4ZogmXV
PsDzsfUe9GsxgYexfozGX61rpd6JinzQKVyoDW3oTT54fbBxO3uDqT8kOBw72dPT
9nkOxTzyb+UO0dpb/MhXLGkGcv8+lTA5ffVIKUx5UxKngRbukc3dxwVJgO4HmucK
bwVQGD83D+/if5/JL9WtQRjDwFEn+IFmdv+3cAXkRo4IIS18LPZB1MJncTeWr8Z9
HlkuDXlJOncUWCABGd1IKu1j0S2HpXV4qhqQXJ6PdfOvUPEaD9qgqEAjD5FxxyXF
wpaV2MWya5i1FGwD5UKhi8hVnAFJyF0/w+enjiPwlmIbjdyEVXE=
=6peY
-----END PGP SIGNATURE-----
Merge tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull clang LTO updates from Kees Cook:
"Clang Link Time Optimization.
This is built on the work done preparing for LTO by arm64 folks,
tracing folks, etc. This includes the core changes as well as the
remaining pieces for arm64 (LTO has been the default build method on
Android for about 3 years now, as it is the prerequisite for the
Control Flow Integrity protections).
While x86 LTO enablement is done, it depends on some pending objtool
clean-ups. It's possible that I'll send a "part 2" pull request for
LTO that includes x86 support.
For merge log posterity, and as detailed in commit
|
||
Linus Torvalds
|
4b5f9254e4 |
kconfig for kcmp syscall
drm userspaces uses this, systemd uses this, makes sense to pull it out from the checkpoint-restore bundle. Kees reviewed this from security pov and is happy with the final version. LWN coverage: https://lwn.net/Articles/845448/ -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAmAzaXIACgkQTA9ye/CY qnH5FQ//eL/7a/PDICuCRIN2p2aQwHoe9d12q+01RolAgce6F9mR9SFiKGSCR+t7 daw4G/BaGxSYzvz1IqWbXDMhN87jAXV/IGs9k4OkSIcbnDmMY78EKMZB1c1t7AZo zmeAuQvmTAcBogTwC6IE9N1JwhH3fmudq4p8zZ4zLojJNSPjrwCvF/xQI/Yaw52V CTfni8mrjYJ+pZ1qn9XP3IceAFEEI27ubZj2DJU+5xpRJAdIAobo0XbVOf8XQ0uc /BRLyXFS66EDsY1wWHT6y6UXDNZgbLic0olC1aielaBJh+Wq6bQHgephxpasU5y7 cZX7XTX2N1q8j8NmgzWLYRgERqtXv0CPHKdimTs8SaUcPDGhxcnwPR6hmdQEC+i6 IjntWMERjfuyD+s6qVuc7s8WS7+Ry9OxgdVskHASqGpBvsSliXN1o02Am6WUuGsB HZxTjCe967FyL4LGU0YjobMTUUSWfYQkOBKABlvYUySNZ0ZHnSygHIWiWjC6b89A KmXiHJoocNfDlKwX6bf3OWQ+dGGFu2wo5wYzldIiqYJVidp50xdOosdRE1R6WwuG IOLCdNKdqDgtig+90/fFZ06liXZvqUdDafWgUs/g6lLquFrcq5aAIiSdR6PcPKB0 MwfWcCglLtYrxgDHvNaqnW18yRQq2TGbe+A65aXzLPp45pKP8Hk= =uiSj -----END PGP SIGNATURE----- Merge tag 'topic/kcmp-kconfig-2021-02-22' of git://anongit.freedesktop.org/drm/drm Pull kcmp kconfig update from Daniel Vetter: "Make the kcmp syscall available independently of checkpoint/restore. drm userspaces uses this, systemd uses this, so makes sense to pull it out from the checkpoint-restore bundle. Kees reviewed this from security pov and is happy with the final version" Link: https://lwn.net/Articles/845448/ * tag 'topic/kcmp-kconfig-2021-02-22' of git://anongit.freedesktop.org/drm/drm: kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE |
||
Masahiro Yamada
|
02aff85922 |
kbuild: check the minimum linker version in Kconfig
Unify the two scripts/ld-version.sh and scripts/lld-version.sh, and check the minimum linker version like scripts/cc-version.sh did. I tested this script for some corner cases reported in the past: - GNU ld version 2.25-15.fc23 as reported by commit |
||
Linus Torvalds
|
657bd90c93 |
Scheduler updates for v5.12:
[ NOTE: unfortunately this tree had to be freshly rebased today, it's a same-content tree of 82891be90f3c (-next published) merged with v5.11. The main reason for the rebase was an authorship misattribution problem with a new commit, which we noticed in the last minute, and which we didn't want to be merged upstream. The offending commit was deep in the tree, and dependent commits had to be rebased as well. ] - Core scheduler updates: - Add CONFIG_PREEMPT_DYNAMIC: this in its current form adds the preempt=none/voluntary/full boot options (default: full), to allow distros to build a PREEMPT kernel but fall back to close to PREEMPT_VOLUNTARY (or PREEMPT_NONE) runtime scheduling behavior via a boot time selection. There's also the /debug/sched_debug switch to do this runtime. This feature is implemented via runtime patching (a new variant of static calls). The scope of the runtime patching can be best reviewed by looking at the sched_dynamic_update() function in kernel/sched/core.c. ( Note that the dynamic none/voluntary mode isn't 100% identical, for example preempt-RCU is available in all cases, plus the preempt count is maintained in all models, which has runtime overhead even with the code patching. ) The PREEMPT_VOLUNTARY/PREEMPT_NONE models, used by the vast majority of distributions, are supposed to be unaffected. - Fix ignored rescheduling after rcu_eqs_enter(). This is a bug that was found via rcutorture triggering a hang. The bug is that rcu_idle_enter() may wake up a NOCB kthread, but this happens after the last generic need_resched() check. Some cpuidle drivers fix it by chance but many others don't. In true 2020 fashion the original bug fix has grown into a 5-patch scheduler/RCU fix series plus another 16 RCU patches to address the underlying issue of missed preemption events. These are the initial fixes that should fix current incarnations of the bug. - Clean up rbtree usage in the scheduler, by providing & using the following consistent set of rbtree APIs: partial-order; less() based: - rb_add(): add a new entry to the rbtree - rb_add_cached(): like rb_add(), but for a rb_root_cached total-order; cmp() based: - rb_find(): find an entry in an rbtree - rb_find_add(): find an entry, and add if not found - rb_find_first(): find the first (leftmost) matching entry - rb_next_match(): continue from rb_find_first() - rb_for_each(): iterate a sub-tree using the previous two - Improve the SMP/NUMA load-balancer: scan for an idle sibling in a single pass. This is a 4-commit series where each commit improves one aspect of the idle sibling scan logic. - Improve the cpufreq cooling driver by getting the effective CPU utilization metrics from the scheduler - Improve the fair scheduler's active load-balancing logic by reducing the number of active LB attempts & lengthen the load-balancing interval. This improves stress-ng mmapfork performance. - Fix CFS's estimated utilization (util_est) calculation bug that can result in too high utilization values - Misc updates & fixes: - Fix the HRTICK reprogramming & optimization feature - Fix SCHED_SOFTIRQ raising race & warning in the CPU offlining code - Reduce dl_add_task_root_domain() overhead - Fix uprobes refcount bug - Process pending softirqs in flush_smp_call_function_from_idle() - Clean up task priority related defines, remove *USER_*PRIO and USER_PRIO() - Simplify the sched_init_numa() deduplication sort - Documentation updates - Fix EAS bug in update_misfit_status(), which degraded the quality of energy-balancing - Smaller cleanups Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmAtHBsRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1itgg/+NGed12pgPjYBzesdou60Lvx7LZLGjfOt M1F1EnmQGn/hEH2fCY6ZoqIZQTVltm7GIcBNabzYTzlaHZsdtyuDUJBZyj19vTlk zekcj7WVt+qvfjChaNwEJhQ9nnOM/eohMgEOHMAAJd9zlnQvve7NOLQ56UDM+kn/ 9taFJ5ZPvb4avP6C5p3KivvKex6Bjof/Tl0m3utpNyPpI/qK3FyGxwdgCxU0yepT ABWQX5ZQCufFvo1bgnBPfqyzab4MqhoM3bNKBsLQfuAlssG1xRv4KQOev4dRwrt9 pXJikV5C9yez5d2lGe5p0ltH5IZS/l9x2yI/ZQj3OUDTFyV1ic6WfFAqJgDzVF8E i/vvA4NPQiI241Bkps+ErcCw4aVOgiY6TWli74cHjLUIX0+As6aHrFWXGSxUmiHB WR+B8KmdfzRTTlhOxMA+cvlpZcKCfxWkJJmXzr/lDZzIuKPqM3QCE2wD9sixkfVo JNICT0IvZghWOdbMEfZba8Psh/e2LVI9RzdpEiuYJz1ZrVlt1hO0M6jBxY0hMz9n k54z81xODw0a8P2FHMtpmB1vhAeqCmvwA6DO8z0Oxs0DFi+KM2bLf2efHsCKafI+ Bm5v9YFaOk/55R76hJVh+aYLlyFgFkKd+P/niJTPDnxOk3SqJuXvTrql1HeGHkNr kYgQa23dsZk= =pyaG -----END PGP SIGNATURE----- Merge tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core scheduler updates: - Add CONFIG_PREEMPT_DYNAMIC: this in its current form adds the preempt=none/voluntary/full boot options (default: full), to allow distros to build a PREEMPT kernel but fall back to close to PREEMPT_VOLUNTARY (or PREEMPT_NONE) runtime scheduling behavior via a boot time selection. There's also the /debug/sched_debug switch to do this runtime. This feature is implemented via runtime patching (a new variant of static calls). The scope of the runtime patching can be best reviewed by looking at the sched_dynamic_update() function in kernel/sched/core.c. ( Note that the dynamic none/voluntary mode isn't 100% identical, for example preempt-RCU is available in all cases, plus the preempt count is maintained in all models, which has runtime overhead even with the code patching. ) The PREEMPT_VOLUNTARY/PREEMPT_NONE models, used by the vast majority of distributions, are supposed to be unaffected. - Fix ignored rescheduling after rcu_eqs_enter(). This is a bug that was found via rcutorture triggering a hang. The bug is that rcu_idle_enter() may wake up a NOCB kthread, but this happens after the last generic need_resched() check. Some cpuidle drivers fix it by chance but many others don't. In true 2020 fashion the original bug fix has grown into a 5-patch scheduler/RCU fix series plus another 16 RCU patches to address the underlying issue of missed preemption events. These are the initial fixes that should fix current incarnations of the bug. - Clean up rbtree usage in the scheduler, by providing & using the following consistent set of rbtree APIs: partial-order; less() based: - rb_add(): add a new entry to the rbtree - rb_add_cached(): like rb_add(), but for a rb_root_cached total-order; cmp() based: - rb_find(): find an entry in an rbtree - rb_find_add(): find an entry, and add if not found - rb_find_first(): find the first (leftmost) matching entry - rb_next_match(): continue from rb_find_first() - rb_for_each(): iterate a sub-tree using the previous two - Improve the SMP/NUMA load-balancer: scan for an idle sibling in a single pass. This is a 4-commit series where each commit improves one aspect of the idle sibling scan logic. - Improve the cpufreq cooling driver by getting the effective CPU utilization metrics from the scheduler - Improve the fair scheduler's active load-balancing logic by reducing the number of active LB attempts & lengthen the load-balancing interval. This improves stress-ng mmapfork performance. - Fix CFS's estimated utilization (util_est) calculation bug that can result in too high utilization values Misc updates & fixes: - Fix the HRTICK reprogramming & optimization feature - Fix SCHED_SOFTIRQ raising race & warning in the CPU offlining code - Reduce dl_add_task_root_domain() overhead - Fix uprobes refcount bug - Process pending softirqs in flush_smp_call_function_from_idle() - Clean up task priority related defines, remove *USER_*PRIO and USER_PRIO() - Simplify the sched_init_numa() deduplication sort - Documentation updates - Fix EAS bug in update_misfit_status(), which degraded the quality of energy-balancing - Smaller cleanups" * tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) sched,x86: Allow !PREEMPT_DYNAMIC entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point entry: Explicitly flush pending rcuog wakeup before last rescheduling point rcu/nocb: Trigger self-IPI on late deferred wake up before user resume rcu/nocb: Perform deferred wake up before last idle's need_resched() check rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers sched/features: Distinguish between NORMAL and DEADLINE hrtick sched/features: Fix hrtick reprogramming sched/deadline: Reduce rq lock contention in dl_add_task_root_domain() uprobes: (Re)add missing get_uprobe() in __find_uprobe() smp: Process pending softirqs in flush_smp_call_function_from_idle() sched: Harden PREEMPT_DYNAMIC static_call: Allow module use without exposing static_call_key sched: Add /debug/sched_preempt preempt/dynamic: Support dynamic preempt with preempt= boot option preempt/dynamic: Provide irqentry_exit_cond_resched() static call preempt/dynamic: Provide preempt_schedule[_notrace]() static calls preempt/dynamic: Provide cond_resched() and might_resched() static calls preempt: Introduce CONFIG_PREEMPT_DYNAMIC static_call: Provide DEFINE_STATIC_CALL_RET0() ... |
||
Linus Torvalds
|
24880bef41 |
Remove oprofile and dcookies support
The "oprofile" user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. The dcookies stuff is only used by the oprofile code. Now that oprofile's support is getting removed from the kernel, there is no need for dcookies as well. Remove kernel's old oprofile and dcookies support. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJgJMEVAAoJENK5HDyugRIcL8YP/jkmXH5CZT80ntcqrJGWKcG7 lWbach7uNeQteht7B1ZPKvojxizTkmfrN2sClX0B2hbGkc5TiWUQ2ZSnvnfWDZ8+ z2qQcEB11G/ReL2vvRk1fJlWdAOyUfrPee/44AkemnLRv+Niw/8PqnGd87yDQGsK qy5E1XXfbjUq6Y/uMiLOX3+21I6w6o2Q6I3NNXC93s0wS3awqnft8n0XBC7iAPBj eowRJxpdRU2Vcuj8UOzzOI7gQlwdjwYImyLPbRy/V8NawC8a+FHrPrf5/GCYlVzl 7TGFBsDQSmzvrBChUfoGz1Rq/VZ1a357p5rhRqemfUrdkjW+vyzelnD8I1W/hb2o SmBXoPoyl3+UkFHNyJI0mI7obaV+2PzyXMV0JIQUj+IiX/mfeFv0nF4XfZD2IkRt 6xhaYj775Zrx32iBdGZIvvLg5Gh9ZkZmR5vJ7Fi/EIZFe6Z+bZnPKUROnAgS/o0z +UkSygOhgo/1XbqrzZVk1iweWeu+EUMbY4YQv2qVnFhpvsq4ieThcUGQpWcxGjjH WP8O0n1yq1slsnpUtxhiTsm46ENajx9zZp6Iv6Ws+NM0RUqjND8BdF1co9WGD3LS cnZMFBs4Bg/V1HICL/D4s6L7t1ofrEXIgJH1y3iF0HeECq03mU4CgA/qly9Aebqg UxPF3oNlVOPlds9FzsU2 =I2Ac -----END PGP SIGNATURE----- Merge tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux Pull oprofile and dcookies removal from Viresh Kumar: "Remove oprofile and dcookies support The 'oprofile' user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. The dcookies stuff is only used by the oprofile code. Now that oprofile's support is getting removed from the kernel, there is no need for dcookies as well. Remove kernel's old oprofile and dcookies support" * tag 'oprofile-removal-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/linux: fs: Remove dcookies support drivers: Remove CONFIG_OPROFILE support arch: xtensa: Remove CONFIG_OPROFILE support arch: x86: Remove CONFIG_OPROFILE support arch: sparc: Remove CONFIG_OPROFILE support arch: sh: Remove CONFIG_OPROFILE support arch: s390: Remove CONFIG_OPROFILE support arch: powerpc: Remove oprofile arch: powerpc: Stop building and using oprofile arch: parisc: Remove CONFIG_OPROFILE support arch: mips: Remove CONFIG_OPROFILE support arch: microblaze: Remove CONFIG_OPROFILE support arch: ia64: Remove rest of perfmon support arch: ia64: Remove CONFIG_OPROFILE support arch: hexagon: Don't select HAVE_OPROFILE arch: arc: Remove CONFIG_OPROFILE support arch: arm: Remove CONFIG_OPROFILE support arch: alpha: Remove CONFIG_OPROFILE support |
||
Kefeng Wang
|
c72160fe05
|
initramfs: Provide a common initrd reserve function
Some architectures(eg, ARM and riscv) have similar logic to check and reserve the memory of initrd, let's provide a common function reserve_initrd_mem() to reduce duplicated code. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> |
||
Ingo Molnar
|
ed3cd45f8c |
Linux 5.11
-----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmAppPgeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGeXYH/imZPBd4A1jIMehN 5HV2A53Z+MXmmaMuGj9X1KV6vsf55/xB+IhOoFdtRAIsO8c2yYSCO8i4+4R0XfYA +/YFJeq672rojQnmh6XbpR8dugaAV7CUHy6n7KDsyvtT6EOCpwFSwkOb4X3tBRX6 TlYgm2d/xgV/wRHSgLVugK0MdFCLMAnyb7mkPfar9QrMgG1BiDKLq07xmwnS23On TkqpJ9yZ/rJpUrrUqQYPShSO/FmA+fSfWs0CDv7EIrJ40LUScD6PZxSHWTIHtjLk E4jFda6wuqLRVWsBwaBzUIdD0zk7X5quHRzEpbC5ga16SK6yrWvE5YJJXCguIEuZ f3FMRYs= =CAjn -----END PGP SIGNATURE----- Merge tag 'v5.11' into sched/core, to pick up fixes & refresh the branch Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Chris Wilson
|
bfe3911a91 |
kcmp: Support selection of SYS_kcmp without CHECKPOINT_RESTORE
Userspace has discovered the functionality offered by SYS_kcmp and has started to depend upon it. In particular, Mesa uses SYS_kcmp for os_same_file_description() in order to identify when two fd (e.g. device or dmabuf) point to the same struct file. Since they depend on it for core functionality, lift SYS_kcmp out of the non-default CONFIG_CHECKPOINT_RESTORE into the selectable syscall category. Rasmus Villemoes also pointed out that systemd uses SYS_kcmp to deduplicate the per-service file descriptor store. Note that some distributions such as Ubuntu are already enabling CHECKPOINT_RESTORE in their configs and so, by extension, SYS_kcmp. References: https://gitlab.freedesktop.org/drm/intel/-/issues/3046 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: stable@vger.kernel.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> # DRM depends on kcmp Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> # systemd uses kcmp Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210205220012.1983-1-chris@chris-wilson.co.uk |
||
Masahiro Yamada
|
aec6c60a01 |
kbuild: check the minimum compiler version in Kconfig
Paul Gortmaker reported a regression in the GCC version check. [1]
If you use GCC 4.8, the build breaks before showing the error message
"error Sorry, your version of GCC is too old - please use 4.9 or newer."
I do not want to apply his fix-up since it implies we would not be able
to remove any cc-option test. Anyway, I admit checking the GCC version
in <linux/compiler-gcc.h> is too late.
Almost at the same time, Linus also suggested to move the compiler
version error to Kconfig time. [2]
I unified the two similar scripts, gcc-version.sh and clang-version.sh
into cc-version.sh. The old scripts invoked the compiler multiple times
(3 times for gcc-version.sh, 4 times for clang-version.sh). I refactored
the code so the new one invokes the compiler just once, and also tried
my best to use shell-builtin commands where possible.
The new script runs faster.
$ time ./scripts/clang-version.sh clang
120000
real 0m0.029s
user 0m0.012s
sys 0m0.021s
$ time ./scripts/cc-version.sh clang
Clang 120000
real 0m0.009s
user 0m0.006s
sys 0m0.004s
cc-version.sh also shows an error message if the compiler is too old:
$ make defconfig CC=clang-9
*** Default configuration is based on 'x86_64_defconfig'
***
*** Compiler is too old.
*** Your Clang version: 9.0.1
*** Minimum Clang version: 10.0.1
***
scripts/Kconfig.include:46: Sorry, this compiler is not supported.
make[1]: *** [scripts/kconfig/Makefile:81: defconfig] Error 1
make: *** [Makefile:602: defconfig] Error 2
The new script takes care of ICC because we have <linux/compiler-intel.h>
although I am not sure if building the kernel with ICC is well-supported.
[1]: https://lore.kernel.org/r/20210110190807.134996-1-paul.gortmaker@windriver.com
[2]: https://lore.kernel.org/r/CAHk-=wh-+TMHPTFo1qs-MYyK7tZh-OQovA=pP3=e06aCVp6_kA@mail.gmail.com
Fixes:
|
||
Andy Shevchenko
|
4590d98f5a |
sfi: Remove framework for deprecated firmware
SFI-based platforms are gone. So does this framework. This removes mention of SFI through the drivers and other code as well. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
||
Christoph Hellwig
|
367948220f |
module: remove EXPORT_UNUSED_SYMBOL*
EXPORT_UNUSED_SYMBOL* is not actually used anywhere. Remove the unused functionality as we generally just remove unused code anyway. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org> |
||
Johannes Berg
|
55b6f763d8 |
init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov
On ARCH=um, loading a module doesn't result in its constructors getting called, which breaks module gcov since the debugfs files are never registered. On the other hand, in-kernel constructors have already been called by the dynamic linker, so we can't call them again. Get out of this conundrum by allowing CONFIG_CONSTRUCTORS to be selected, but avoiding the in-kernel constructor calls. Also remove the "if !UML" from GCOV selecting CONSTRUCTORS now, since we really do want CONSTRUCTORS, just not kernel binary ones. Link: https://lkml.kernel.org/r/20210120172041.c246a2cac2fb.I1358f584b76f1898373adfed77f4462c8705b736@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jessica Yu <jeyu@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Steven Rostedt (VMware)
|
7e0a922046 |
fgraph: Initialize tracing_graph_pause at task creation
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
CPU 1 CPU 2
----- -----
do_idle()
cpu_suspend()
pause_graph_tracing()
task_struct->tracing_graph_pause++ (0 -> 1)
start_graph_tracing()
for_each_online_cpu(cpu) {
ftrace_graph_init_idle_task(cpu)
task-struct->tracing_graph_pause = 0 (1 -> 0)
unpause_graph_tracing()
task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
Cc: stable@vger.kernel.org
Fixes:
|
||
Viresh Kumar
|
f8408264c7 |
drivers: Remove CONFIG_OPROFILE support
The "oprofile" user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. Remove kernel's old oprofile support. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Robert Richter <rric@kernel.org> Acked-by: Paul E. McKenney <paulmck@kernel.org> #RCU Acked-by: William Cohen <wcohen@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> |
||
Yue Hu
|
432900f816 |
init/Kconfig: Correct thermal pressure help text
We're using arch_scale_thermal_pressure() to retrieve per CPU thermal pressure. Signed-off-by: Yue Hu <huyue2@yulong.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20210127054451.1240-1-zbestahu@gmail.com |
||
Sami Tolvanen
|
fbe078d397 |
kbuild: lto: add a default list of used symbols
With CONFIG_LTO_CLANG, LLVM bitcode has not yet been compiled into a binary when the .mod files are generated, which means they don't yet contain references to certain symbols that will be present in the final binaries. This includes intrinsic functions, such as memcpy, memmove, and memset [1], and stack protector symbols [2]. This change adds a default symbol list to use with CONFIG_TRIM_UNUSED_KSYMS when Clang's LTO is used. [1] https://llvm.org/docs/LangRef.html#standard-c-c-library-intrinsics [2] https://llvm.org/docs/LangRef.html#llvm-stackprotector-intrinsic Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201211184633.3213045-7-samitolvanen@google.com |
||
Petr Mladek
|
a91bd6223e |
Revert "init/console: Use ttynull as a fallback when there is no console"
This reverts commit |
||
Linus Torvalds
|
36bbbd0e23 |
Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU fix from Paul McKenney:
"This is a fix for a regression in the v5.10 merge window, but it was
reported quite late in the v5.10 process, plus generating and testing
the fix took some time.
The regression is due to commit
|
||
Andrey Konovalov
|
d73b49365e |
kasan, arm64: only use kasan_depth for software modes
This is a preparatory commit for the upcoming addition of a new hardware tag-based (MTE-based) KASAN mode. Hardware tag-based KASAN won't use kasan_depth. Only define and use it when one of the software KASAN modes are enabled. No functional changes for software modes. Link: https://lkml.kernel.org/r/e16f15aeda90bc7fb4dfc2e243a14b74cc5c8219.1606161801.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Alexander Potapenko <glider@google.com> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Marco Elver <elver@google.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
ac7ac4618c |
for-5.11/block-2020-12-14
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/Xec8QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpoLbEACzXypgZWwMdfgRckA/Vt333rXHtbhUV+hK 2XP+P81iRvr9Esi31UPbRp82vrgcDO0cpI1QmQojS5U5TIQP88BfXptfRZZu48eb wT5RDDNQ34HItqAh/yEuYsv9yUKcxeIrB99tBVvM+4UmQg9zTdIW3mg6PvCBdbhV N38jI0tCF/PJatjfRuphT/nXonQLPWBlVDmZk06KZQFOwQe9ep1vUi1+nbiRPuo3 geFBpTh1Kp6Vl1B3n4RpECs6Y7I0RRuJdaH2sDizICla1/BW91F9fQwHimNnUxUq e1Q1kMuh6ftcQGkYlHSYcPhuv6CvorldTZCO5arPxWpcwvxriTSMRPWAgUr5pEiF fhiGhqeDu9e6vl9vS31wUD1B30hy+jFz9wyjRrDwJ3cPHH1JVBjTzvdX+cIh/1ku IbIwUMteUtvUrzqAv/DzbGhedp7xWtOFaVo8j0QFYh9zkjd6b8yDOF/yztwX2gjY Xt1cd+KpDSiN449ZRaoMI0sCJAxqzhMa6nsWlb0L7KuNyWKAbvKQBm9Rb47FLV9A Vx70KC+zkFoyw23capvIahmQazerriUJ5PGe0lVm6ROgmIFdCpXTPDjnrvq/6RZ/ GEpD7gTW9atGJ7EuEE8686sAfKD5kneChWLX5EHXf0d0AG5Mr2lKsluiGp5LpPJg Q1Xqs6xwww== =zo4w -----END PGP SIGNATURE----- Merge tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block Pull block updates from Jens Axboe: "Another series of killing more code than what is being added, again thanks to Christoph's relentless cleanups and tech debt tackling. This contains: - blk-iocost improvements (Baolin Wang) - part0 iostat fix (Jeffle Xu) - Disable iopoll for split bios (Jeffle Xu) - block tracepoint cleanups (Christoph Hellwig) - Merging of struct block_device and hd_struct (Christoph Hellwig) - Rework/cleanup of how block device sizes are updated (Christoph Hellwig) - Simplification of gendisk lookup and removal of block device aliasing (Christoph Hellwig) - Block device ioctl cleanups (Christoph Hellwig) - Removal of bdget()/blkdev_get() as exported API (Christoph Hellwig) - Disk change rework, avoid ->revalidate_disk() (Christoph Hellwig) - sbitmap improvements (Pavel Begunkov) - Hybrid polling fix (Pavel Begunkov) - bvec iteration improvements (Pavel Begunkov) - Zone revalidation fixes (Damien Le Moal) - blk-throttle limit fix (Yu Kuai) - Various little fixes" * tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block: (126 commits) blk-mq: fix msec comment from micro to milli seconds blk-mq: update arg in comment of blk_mq_map_queue blk-mq: add helper allocating tagset->tags Revert "block: Fix a lockdep complaint triggered by request queue flushing" nvme-loop: use blk_mq_hctx_set_fq_lock_class to set loop's lock class blk-mq: add new API of blk_mq_hctx_set_fq_lock_class block: disable iopoll for split bio block: Improve blk_revalidate_disk_zones() checks sbitmap: simplify wrap check sbitmap: replace CAS with atomic and sbitmap: remove swap_lock sbitmap: optimise sbitmap_deferred_clear() blk-mq: skip hybrid polling if iopoll doesn't spin blk-iocost: Factor out the base vrate change into a separate function blk-iocost: Factor out the active iocgs' state check into a separate function blk-iocost: Move the usage ratio calculation to the correct place blk-iocost: Remove unnecessary advance declaration blk-iocost: Fix some typos in comments blktrace: fix up a kerneldoc comment block: remove the request_queue to argument request based tracepoints ... |
||
Linus Torvalds
|
d3eb52113d |
printk changes for 5.11
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl/XdCkACgkQUqAMR0iA lPJ/tA/8DcvxH5VHi7DGdgsBbp1vyryUhVyT7jlTSSnOlsUzfgkmB9fOh+lYtfFJ HPvb1j5PCxGjkLsQ2U7jRFfzKhXKdPvww4NlD2aDN4QDZmEvhHVlsWvpk6DqcOzZ 6Gdn0kRFdt83kJdR60Hxhk8fATKF691qNWJ2gQzamef40XvlMdMLSgL9e77vSx62 I7OUoWKDfjVoCNZg3UWglEvqQRVGPgmXtBQ2S7FRHgkHcmeyRaw5ge0CAV63Xghc tVPXMWzizSHLo+rvktYZDyR3UPskRG8YP8O3kcVNuGZ1OIZNenlJ5ldzn9ATcolH njG7TlIygQRKLzMsD2WNJlVY/AaVGooQsNt5xn68g4F10hd5qvHHqcF1hik69PwX QmQBkg/aEaIdr19uCYLJLSh5dv7hfDpMMwW+IFbW1TuiDZJITGrLa8akIygsFf0D H2QZ1Cuw0xz62Yhx6NSdu5MCLPsdV/sbYg7wAjAXYEDUk4MxPrk0YxXWCFWqpmnw nCjGUCOcr9On4JQ8oHYgoC/g7FPzPKiX8Y84X5YLAjHNrvJdwdS2TypXlLYCJbm0 5eooOnxthysBatWWdIdx5Yu9zIDWkb+tn2zhxm5nLAOZFAjRlFaXSuvbtQFzh1lh FZqGPJY2JJnSXov5uc1Hbf1L/eG89liqkvWi1iRD9oXNgXGZHTM= =1gUa -----END PGP SIGNATURE----- Merge tag 'printk-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Finally allow parallel writes and reads into/from the lockless ringbuffer. But it is not a complete solution. Readers are still serialized against each other. And nested writes are still prevented by printk_safe per-CPU buffers. - Use ttynull as the ultimate fallback for /dev/console. - Officially allow disabling console output by using console="" or console=null - A few code cleanups * tag 'printk-for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: remove logbuf_lock writer-protection of ringbuffer printk: inline log_output(),log_store() in vprintk_store() printk: remove obsolete dead assignment printk/console: Allow to disable console output by using console="" or console=null init/console: Use ttynull as a fallback when there is no console printk: ringbuffer: Reference text_data_ring directly in callees. |
||
Linus Torvalds
|
d01e7f10da |
Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull exec-update-lock update from Eric Biederman: "The key point of this is to transform exec_update_mutex into a rw_semaphore so readers can be separated from writers. This makes it easier to understand what the holders of the lock are doing, and makes it harder to contend or deadlock on the lock. The real deadlock fix wound up in perf_event_open" * 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: exec: Transform exec_update_mutex into a rw_semaphore |
||
Linus Torvalds
|
ac73e3dc8a |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: - a few random little subsystems - almost all of the MM patches which are staged ahead of linux-next material. I'll trickle to post-linux-next work in as the dependents get merged up. Subsystems affected by this patch series: kthread, kbuild, ide, ntfs, ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc, uaccess, zram, and cleanups). * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits) mm: cleanup kstrto*() usage mm: fix fall-through warnings for Clang mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at mm: shmem: convert shmem_enabled_show to use sysfs_emit_at mm:backing-dev: use sysfs_emit in macro defining functions mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening mm: use sysfs_emit for struct kobject * uses mm: fix kernel-doc markups zram: break the strict dependency from lzo zram: add stat to gather incompressible pages since zram set up zram: support page writeback mm/process_vm_access: remove redundant initialization of iov_r mm/zsmalloc.c: rework the list_add code in insert_zspage() mm/zswap: move to use crypto_acomp API for hardware acceleration mm/zswap: fix passing zero to 'PTR_ERR' warning mm/zswap: make struct kernel_param_ops definitions const userfaultfd/selftests: hint the test runner on required privilege userfaultfd/selftests: fix retval check for userfaultfd_open() userfaultfd/selftests: always dump something in modes userfaultfd: selftests: make __{s,u}64 format specifiers portable ... |
||
Vlastimil Babka
|
04013513cc |
mm, page_alloc: do not rely on the order of page_poison and init_on_alloc/free parameters
Patch series "cleanup page poisoning", v3. I have identified a number of issues and opportunities for cleanup with CONFIG_PAGE_POISON and friends: - interaction with init_on_alloc and init_on_free parameters depends on the order of parameters (Patch 1) - the boot time enabling uses static key, but inefficienty (Patch 2) - sanity checking is incompatible with hibernation (Patch 3) - CONFIG_PAGE_POISONING_NO_SANITY can be removed now that we have init_on_free (Patch 4) - CONFIG_PAGE_POISONING_ZERO can be most likely removed now that we have init_on_free (Patch 5) This patch (of 5): Enabling page_poison=1 together with init_on_alloc=1 or init_on_free=1 produces a warning in dmesg that page_poison takes precedence. However, as these warnings are printed in early_param handlers for init_on_alloc/free, they are not printed if page_poison is enabled later on the command line (handlers are called in the order of their parameters), or when init_on_alloc/free is always enabled by the respective config option - before the page_poison early param handler is called, it is not considered to be enabled. This is inconsistent. We can remove the dependency on order by making the init_on_* parameters only set a boolean variable, and postponing the evaluation after all early params have been processed. Introduce a new init_mem_debugging_and_hardening() function for that, and move the related debug_pagealloc processing there as well. As a result init_mem_debugging_and_hardening() knows always accurately if init_on_* and/or page_poison options were enabled. Thus we can also optimize want_init_on_alloc() and want_init_on_free(). We don't need to check page_poisoning_enabled() there, we can instead not enable the init_on_* static keys at all, if page poisoning is enabled. This results in a simpler and more effective code. Link: https://lkml.kernel.org/r/20201113104033.22907-1-vbabka@suse.cz Link: https://lkml.kernel.org/r/20201113104033.22907-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mateusz Nosek <mateusznosek0@gmail.com> Cc: Laura Abbott <labbott@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Lin Feng
|
ba8f3587f5 |
init/main: fix broken buffer_init when DEFERRED_STRUCT_PAGE_INIT set
In the booting phase if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set, we have following callchain: start_kernel ... mm_init mem_init memblock_free_all reset_all_zones_managed_pages free_low_memory_core_early ... buffer_init nr_free_buffer_pages zone->managed_pages ... rest_init kernel_init kernel_init_freeable page_alloc_init_late kthread_run(deferred_init_memmap, NODE_DATA(nid), "pgdatinit%d", nid); wait_for_completion(&pgdat_init_all_done_comp); ... files_maxfiles_init It's clear that buffer_init depends on zone->managed_pages, but it's reset in reset_all_zones_managed_pages after that pages are readded into zone->managed_pages, but when buffer_init runs this process is half done and most of them will finally be added till deferred_init_memmap done. In large memory couting of nr_free_buffer_pages drifts too much, also drifting from kernels to kernels on same hardware. Fix is simple, it delays buffer_init run till deferred_init_memmap all done. But as corrected by this patch, max_buffer_heads becomes very large, the value is roughly as many as 4 times of totalram_pages, formula: max_buffer_heads = nrpages * (10%) * (PAGE_SIZE / sizeof(struct buffer_head)); Say in a 64GB memory box we have 16777216 pages, then max_buffer_heads turns out to be roughly 67,108,864. In common cases, should a buffer_head be mapped to one page/block(4KB)? So max_buffer_heads never exceeds totalram_pages. IMO it's likely to make buffer_heads_over_limit bool value alwasy false, then make codes 'if (buffer_heads_over_limit)' test in vmscan unnecessary. So this patch will change the original behavior related to buffer_heads_over_limit in vmscan since we used a half done value of zone->managed_pages before, or should we use a smaller factor(<10%) in previous formula. akpm: I think this is OK - the max_buffer_heads code is only needed on highmem machines, to prevent ZONE_NORMAL from being consumed by large amounts of buffer_heads attached to highmem pagecache. This problem will not occur on 64-bit machines, so this feature's non-functionality on such machines is a feature, not a bug. Link: https://lkml.kernel.org/r/20201123110500.103523-1-linf@wangsu.com Signed-off-by: Lin Feng <linf@wangsu.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Zhenhua Huang
|
7fb7ab6d61 |
mm: fix page_owner initializing issue for arm32
Page owner of pages used by page owner itself used is missing on arm32 targets. The reason is dummy_handle and failure_handle is not initialized correctly. Buddy allocator is used to initialize these two handles. However, buddy allocator is not ready when page owner calls it. This change fixed that by initializing page owner after buddy initialization. The working flow before and after this change are: original logic: 1. allocated memory for page_ext(using memblock). 2. invoke the init callback of page_ext_ops like page_owner(using buddy allocator). 3. initialize buddy. after this change: 1. allocated memory for page_ext(using memblock). 2. initialize buddy. 3. invoke the init callback of page_ext_ops like page_owner(using buddy allocator). with the change, failure/dummy_handle can get its correct value and page owner output for example has the one for page owner itself: Page allocated via order 2, mask 0x6202c0(GFP_USER|__GFP_NOWARN), pid 1006, ts 67278156558 ns PFN 543776 type Unmovable Block 531 type Unmovable Flags 0x0() init_page_owner+0x28/0x2f8 invoke_init_callbacks_flatmem+0x24/0x34 start_kernel+0x33c/0x5d8 Link: https://lkml.kernel.org/r/1603104925-5888-1-git-send-email-zhenhuah@codeaurora.org Signed-off-by: Zhenhua Huang <zhenhuah@codeaurora.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
f9b4240b07 |
fixes-v5.11
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCX9daOgAKCRCRxhvAZXjc
ohPkAQChXUB2BAjtIzXlCkZoDBbzHHblm5DZ37oy/4xYFmAcEwEA5sw6dQqyGHnF
GEP9def51HvXLpBV2BzNUGggo1SoGgQ=
=w/cO
-----END PGP SIGNATURE-----
Merge tag 'fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull misc fixes from Christian Brauner:
"This contains several fixes which felt worth being combined into a
single branch:
- Use put_nsproxy() instead of open-coding it switch_task_namespaces()
- Kirill's work to unify lifecycle management for all namespaces. The
lifetime counters are used identically for all namespaces types.
Namespaces may of course have additional unrelated counters and
these are not altered. This work allows us to unify the type of the
counters and reduces maintenance cost by moving the counter in one
place and indicating that basic lifetime management is identical
for all namespaces.
- Peilin's fix adding three byte padding to Dmitry's
PTRACE_GET_SYSCALL_INFO uapi struct to prevent an info leak.
- Two smal patches to convert from the /* fall through */ comment
annotation to the fallthrough keyword annotation which I had taken
into my branch and into -next before
|
||
Linus Torvalds
|
586592478b |
- Add support for the hugetlb_cma command line option to allocate gigantic
hugepages using CMA: - Add arch_get_random_long() support. - Add ap bus userspace notifications. - Increase default size of vmalloc area to 512GB and otherwise let it increase dynamically by the size of physical memory. This should fix all occurrences where the vmalloc area was not large enough. - Completely get rid of set_fs() (aka select SET_FS) and rework address space handling while doing that; making address space handling much more simple. - Reimplement getcpu vdso syscall in C. - Add support for extended SCLP responses (> 4k). This allows e.g. to handle also potential large system configurations. - Simplify KASAN by removing 3-level page table support and only supporting 4-levels from now on. - Improve debug-ability of the kernel decompressor code, which now prints also stack traces and symbols in case of problems to the console. - Remove more power management leftovers. - Other various fixes and improvements all over the place. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAl/XQAIACgkQIg7DeRsp bsIdYA//TCtSTrka/yW03b4b0FuLtKNpKB5zQgaqtEurbgbZhXdZ7/L3N+KavPQH njmKAARxebRIJB0DoZ9w9XpSb+mI3Q5y8GMi5xvUzjtJj/c6ahi3cEXIpuDR0PBv bf4UYSUpvndOwVFVOEZLeaJwKciCYvdoOwjBCmoKz9orthNVdVh5vztVRE2dMkNl y9C/Pb3w4ZMYxrbETuYnxqzueCxUhVOJmwodkGdP6bxBeemOwKn2TLVZQCbGGe7y BZpG+xsTaLZV1dZUZuDSOzVi1CTzJBGaJuYy5ewddWfxi7+mxqwEg/4s6nGKAciX Fa3T6aqLpUmDDN842Ql9TZHrwR+GYrlAp3XaQETOusUuEQLvP1dKRj/RXiDXN3MZ L+Mfa56dbs9GkVaNN/N+L7Y4z/6tZ2caX4X2S22Cp/QzvRTrG4jXVTn0r4WIcY/2 vn7fEy71LJ97CLQTDryyfJx7YNMdyIlUZY5ICAk1bt8nz1lB/IoZy0YoCBvPxIzb cEKcFTOdOtZR4WY3F8+kU0Nv1HQ8yPBzMaAqSNERvNQhMvoCChxntmyYxuVgH5iB SACADqEJKQ3hb4nMnxkeTrmmrhH4e0kdF9lAEytX+VYbjAq/6MY+qYo+QHDYkFWh BndxI54d6IiktDcKuBcpKJM7S/7N2t+EsLTS6Dhux7dbDZ2+Upw= =UR7j -----END PGP SIGNATURE----- Merge tag 's390-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Add support for the hugetlb_cma command line option to allocate gigantic hugepages using CMA - Add arch_get_random_long() support. - Add ap bus userspace notifications. - Increase default size of vmalloc area to 512GB and otherwise let it increase dynamically by the size of physical memory. This should fix all occurrences where the vmalloc area was not large enough. - Completely get rid of set_fs() (aka select SET_FS) and rework address space handling while doing that; making address space handling much more simple. - Reimplement getcpu vdso syscall in C. - Add support for extended SCLP responses (> 4k). This allows e.g. to handle also potential large system configurations. - Simplify KASAN by removing 3-level page table support and only supporting 4-levels from now on. - Improve debug-ability of the kernel decompressor code, which now prints also stack traces and symbols in case of problems to the console. - Remove more power management leftovers. - Other various fixes and improvements all over the place. * tag 's390-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (62 commits) s390/mm: add support to allocate gigantic hugepages using CMA s390/crypto: add arch_get_random_long() support s390/smp: perform initial CPU reset also for SMT siblings s390/mm: use invalid asce for user space when switching to init_mm s390/idle: fix accounting with machine checks s390/idle: add missing mt_cycles calculation s390/boot: add build-id to decompressor s390/kexec_file: fix diag308 subcode when loading crash kernel s390/cio: fix use-after-free in ccw_device_destroy_console s390/cio: remove pm support from ccw bus driver s390/cio: remove pm support from css-bus driver s390/cio: remove pm support from IO subchannel drivers s390/cio: remove pm support from chsc subchannel driver s390/vmur: remove unused pm related functions s390/tape: remove unsupported PM functions s390/cio: remove pm support from eadm-sch drivers s390: remove pm support from console drivers s390/dasd: remove unused pm related functions s390/zfcp: remove pm support from zfcp driver s390/ap: let bus_register() add the AP bus sysfs attributes ... |
||
Uladzislau Rezki (Sony)
|
1b04fa9900 |
rcu-tasks: Move RCU-tasks initialization to before early_initcall()
PowerPC testing encountered boot failures due to RCU Tasks not being
fully initialized until core_initcall() time. This commit therefore
initializes RCU Tasks (along with Rude RCU and RCU Tasks Trace) just
before early_initcall() time, thus allowing waiting on RCU Tasks grace
periods from early_initcall() handlers.
Link: https://lore.kernel.org/rcu/87eekfh80a.fsf@dja-thinkpad.axtens.net/
Fixes:
|
||
Petr Mladek
|
5f3b8d3986 | Merge branch 'for-5.11-null-console' into for-linus | ||
Arnd Bergmann
|
55d5b7dd64 |
initramfs: fix clang build failure
There is only one function in init/initramfs.c that is in the .text section, and it is marked __weak. When building with clang-12 and the integrated assembler, this leads to a bug with recordmcount: ./scripts/recordmcount "init/initramfs.o" Cannot find symbol for section 2: .text. init/initramfs.o: failed I'm not quite sure what exactly goes wrong, but I notice that this function is only ever called from an __init function, and normally inlined. Marking it __init as well is clearly correct and it leads to recordmcount no longer complaining. Link: https://lkml.kernel.org/r/20201204165742.3815221-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Barret Rhoden <brho@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Eric W. Biederman
|
f7cfd871ae |
exec: Transform exec_update_mutex into a rw_semaphore
Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex. The problematic lock ordering found by lockdep
was:
perf_event_open (exec_update_mutex -> ovl_i_mutex)
chown (ovl_i_mutex -> sb_writes)
sendfile (sb_writes -> p->lock)
by reading from a proc file and writing to overlayfs
proc_pid_syscall (p->lock -> exec_update_mutex)
While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same. They are all readers. The only writer is
exec.
There is no reason for readers to block on each other. So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.
Cc: Jann Horn <jannh@google.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christopher Yeoh <cyeoh@au1.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Fixes:
|
||
Linus Torvalds
|
e6585a4939 |
Kbuild fixes for v5.10 (2nd)
- Move -Wcast-align to W=3, which tends to be false-positive and there is no tree-wide solution. - Pass -fmacro-prefix-map to KBUILD_CPPFLAGS because it is a preprocessor option and makes sense for .S files as well. - Disable -gdwarf-2 for Clang's integrated assembler to avoid warnings. - Disable --orphan-handling=warn for LLD 10.0.1 to avoid warnings. - Fix undesirable line breaks in *.mod files. -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl/MzyMVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGKJ8P/2kLq296XAPjqC90/LWMja8dsXO/ Wgaq8zC819x0JFuGdBKlwlFe3AvFYRtts9V5+mzjxvsOjH/6+xzyrXjRPCwZYqlj XKC3ZwuS2SGDPFCriI1edwTUp5tyDnG/VBjqbf3ybQnz0LAShidXBD9IlM/XX9Rz BlWqd7Uib50Pq8AfM2JVokrSmkkvhqxocIsmjTa0wvRjRAw7+aVkGNCWXqnTho7y YuHmTWbmUQIROF3Bzs1fkGp+qaQofPRfA1tTwaTVvgmt8rEqyzXi11y6kj56INfg /pq4O1KrplKtJFdrcjj4/eptqHG3I+Jq56qCHVescF6+bH6cc6BUL8qDdAzFZQai e/pWCzREqFDKchEmT2d0Uzik8Zfxi5Cw68Otpzb4LqTUUxXSoRx1R9Of/Ei5QZum 6b6s9Q41UwH983UQCOOSGjXGZYP6fZG1a0XejbduYo7TL4KEECAO/FlLBWGttYH3 0i3aKz3aDKb/fo7hDbbqg+o6F0mShEraqxMmWgIvgGt+k76j0O0wS2KryqpTd7Vv xg72suGM7f9QBA50lZ0r32fm86XnlqwQAm9ZMaSXR1Ii7j4F9UNRmR/FUYq7dPwa COkuHr+9LqzV/tkluWi2rjLIGPaCuEVeSCcQ/wIDdp2iOyb54CbozwK0Yi2dxxus jVFKwSaMUDHrkSj6 =/ysh -----END PGP SIGNATURE----- Merge tag 'kbuild-fixes-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Move -Wcast-align to W=3, which tends to be false-positive and there is no tree-wide solution. - Pass -fmacro-prefix-map to KBUILD_CPPFLAGS because it is a preprocessor option and makes sense for .S files as well. - Disable -gdwarf-2 for Clang's integrated assembler to avoid warnings. - Disable --orphan-handling=warn for LLD 10.0.1 to avoid warnings. - Fix undesirable line breaks in *.mod files. * tag 'kbuild-fixes-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: avoid split lines in .mod files kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1 kbuild: Hoist '--orphan-handling' into Kconfig Kbuild: do not emit debug info for assembly with LLVM_IAS=1 kbuild: use -fmacro-prefix-map for .S sources Makefile.extrawarn: move -Wcast-align to W=3 |
||
Linus Torvalds
|
8a02ec8f35 |
Have bootconfig size and checksum be little endian
In case the bootconfig is created on one kind of endian machine, and then read on the other kind of endian kernel, the size and checksum will be incorrect. Instead, have both the size and checksum always be little endian and have the tool and the kernel convert it from little endian to or from the host endian. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCX8brThQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qiBMAQDe1vsp/SyHO9H5pnsepdmk4fERn0bC Q0qtCoYp1xUKOQEAjnOJKdCE1O6n24u+b+3jw3BHswQLyUKOFaPcIM7jSgM= =Z6kA -----END PGP SIGNATURE----- Merge tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull bootconfig fixes from Steven Rostedt: "Have bootconfig size and checksum be little endian In case the bootconfig is created on one kind of endian machine, and then read on the other kind of endian kernel, the size and checksum will be incorrect. Instead, have both the size and checksum always be little endian and have the tool and the kernel convert it from little endian to or from the host endian" * tag 'trace-v5.10-rc6-bootconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: docs: bootconfig: Add the endianness of fields tools/bootconfig: Store size and checksum in footer as le32 bootconfig: Load size and checksum in the footer as le32 |
||
Christoph Hellwig
|
0d02129e76 |
block: merge struct block_device and struct hd_struct
Instead of having two structures that represent each block device with different life time rules, merge them into a single one. This also greatly simplifies the reference counting rules, as we can use the inode reference count as the main reference count for the new struct block_device, with the device model reference front ending it for device model interaction. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
41e5c81984 |
block: remove the partno field from struct hd_struct
Just use the bd_partno field in struct block_device everywhere. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
231926dbf0 |
block: move the partition_meta_info to struct block_device
Move the partition_meta_info to struct block_device in preparation for killing struct hd_struct. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
013b0e96ae |
init: cleanup match_dev_by_uuid and match_dev_by_label
Avoid a totally pointless goto label, and use the same style of comparism for both helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
e036bb8e0c |
init: refactor devt_from_partuuid
The code in devt_from_partuuid is very convoluted. Refactor a bit by sanitizing the goto and variable name usage. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Christoph Hellwig
|
c2637e80a0 |
init: refactor name_to_dev_t
Split each case into a self-contained helper, and move the block dependent code entirely under the pre-existing #ifdef CONFIG_BLOCK. This allows to remove the blk_lookup_devt stub in genhd.h. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Nathan Chancellor
|
d5750cd3c5 |
kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1
ld.lld 10.0.1 spews a bunch of various warnings about .rela sections, along with a few others. Newer versions of ld.lld do not have these warnings. As a result, do not add '--orphan-handling=warn' to LDFLAGS_vmlinux if ld.lld's version is not new enough. Link: https://github.com/ClangBuiltLinux/linux/issues/1187 Link: https://github.com/ClangBuiltLinux/linux/issues/1193 Reported-by: Arvind Sankar <nivedita@alum.mit.edu> Reported-by: kernelci.org bot <bot@kernelci.org> Reported-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Nathan Chancellor
|
59612b24f7 |
kbuild: Hoist '--orphan-handling' into Kconfig
Currently, '--orphan-handling=warn' is spread out across four different architectures in their respective Makefiles, which makes it a little unruly to deal with in case it needs to be disabled for a specific linker version (in this case, ld.lld 10.0.1). To make it easier to control this, hoist this warning into Kconfig and the main Makefile so that disabling it is simpler, as the warning will only be enabled in a couple places (main Makefile and a couple of compressed boot folders that blow away LDFLAGS_vmlinx) and making it conditional is easier due to Kconfig syntax. One small additional benefit of this is saving a call to ld-option on incremental builds because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN. To keep the list of supported architectures the same, introduce CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to gain this automatically after all of the sections are specified and size asserted. A special thanks to Kees Cook for the help text on this config. Link: https://github.com/ClangBuiltLinux/linux/issues/1187 Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Masami Hiramatsu
|
24aed09451 |
bootconfig: Load size and checksum in the footer as le32
Load the size and the checksum fields in the footer as le32 instead of u32. This will allow us to apply bootconfig to the cross build initrd without caring the endianness. Link: https://lkml.kernel.org/r/160583934457.547349.10504070298990791074.stgit@devnote2 Reported-by: Steven Rostedt <rostedt@goodmis.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
Linus Torvalds
|
43d6ecd97c |
Urgent printk fix for 5.10
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl/A+VEACgkQUqAMR0iA lPIwGw/9F/E2ZdX+Vgi3ZiR/5GdfVZeIW+QwhKXBQc8Jr9+p2JJ+UOPeeazKQA5l bFt6GR67yjqtFS5gO76EPCQ6/Uu3cPA+A3HQRQZuE6p0zM+mrMXc/upLMy5DKi4Z f4zkW8dYWSBpAWPvM9bb0gIKO9wVV6Aj1IyyZLfEghX/KrJPx0zutioO4ScYxhA9 YVITmnUQ6YzHEVE8CwWGV4lArC50ILGdIqNlZrkjuG3CuGTdyB2OY60P8XCy8bzn W3WgRGI/bvfHwCPh8oYKm/5nM9JAVdhbEpoFQj8cMPKoH5DeSGNWfYXkali2gqhL 1Y2SntTcR7zclMcN0/gIn9ViVsma/eayAyawSYgQjmAdl6H/vv9B7x9ZswmK/b38 JzOzHwP+H3lXVg2yN4EbH3uDMTMjqflYuC7QiZ/HNa43KURXhoritw2hBRczhazp mdyRQf4iv8NoYSthggD6LolCs+ay5NZpCeB3YXgnlpxiYFGCE+ykSz41AGdTyYTl jOWVtK1VawFD0/FgpgF8XK7/gOXWeYb+4WeBYgGKgCJdneiB5eJt8eWT7zmpAPpG FECexdAd4TAjD+EEbidiFWpMjJcY2TnOJp76O3/Wlo1QLbEgRHOklM/Rrq0zGg5b vm3w0kobGZfpIJuzSOAHyErX0jGEVTq6yUi381jSQpf4bTttIpc= =7IkU -----END PGP SIGNATURE----- Merge tag 'printk-for-5.10-rc6-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fixes from Petr Mladek: - do not lose trailing newline in pr_cont() calls - two trivial fixes for a dead store and a config description * tag 'printk-for-5.10-rc6-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: finalize records with trailing newlines printk: remove unneeded dead-store assignment init/Kconfig: Fix CPU number in LOG_CPU_MAX_BUF_SHIFT description |
||
Heiko Carstens
|
334ef6ed06 |
init/Kconfig: make COMPILE_TEST depend on !S390
While allmodconfig and allyesconfig build for s390 there are also
various bots running compile tests with randconfig, where PCI is
disabled. This reveals that a lot of drivers should actually depend on
HAS_IOMEM.
Adding this to each device driver would be a never ending story,
therefore just disable COMPILE_TEST for s390.
The reasoning is more or less the same as described in
commit
|
||
Petr Mladek
|
757055ae8d |
init/console: Use ttynull as a fallback when there is no console
stdin, stdout, and stderr standard I/O stream are created for the init
process. They are not available when there is no console registered
for /dev/console. It might lead to a crash when the init process
tries to use them, see the commit
|
||
Masami Hiramatsu
|
50b8a74285 |
bootconfig: Extend the magic check range to the preceding 3 bytes
Since Grub may align the size of initrd to 4 if user pass
initrd from cpio, we have to check the preceding 3 bytes as well.
Link: https://lkml.kernel.org/r/160520205132.303174.4876760192433315429.stgit@devnote2
Cc: stable@vger.kernel.org
Fixes:
|
||
Paul Menzel
|
0f7636e165 |
init/Kconfig: Fix CPU number in LOG_CPU_MAX_BUF_SHIFT description
Currently, LOG_BUF_SHIFT defaults to 17, which is 2 ^ 17 bytes = 128 KB,
and LOG_CPU_MAX_BUF_SHIFT defaults to 12, which is 2 ^ 12 bytes = 4 KB.
Half of 128 KB is 64 KB, so more than 16 CPUs are required for the value
to be used, as then the sum of contributions is greater than 64 KB for
the first time. My guess is, that the description was written with the
configuration values used in the SUSE in mind.
Fixes:
|
||
Linus Torvalds
|
7cf726a594 |
linux-kselftest-kunit-5.10-rc1
This Kunit update for Linux 5.10-rc1 consists of: - add Kunit to kernel_init() and remove KUnit from init calls entirely. This addresses the concern Kunit would not work correctly during late init phase. - add a linker section where KUnit can put references to its test suites. This patch is the first step in transitioning to dispatching all KUnit tests from a centralized executor rather than having each as its own separate late_initcall. - add a centralized executor to dispatch tests rather than relying on late_initcall to schedule each test suite separately. Centralized execution is for built-in tests only; modules will execute tests when loaded. - convert bitfield test to use KUnit framework - Documentation updates for naming guidelines and how kunit_test_suite() works. - add test plan to KUnit TAP format -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl+Mr68ACgkQCwJExA0N Qxy7HxAAuToPP6uUHwTC3KzVVE4hjP9a3t4hiD7kP/gI0umN+2nrccm6Vx6E+r9t Jkjiv9Yxj3riOkE5jJ8KriAx228mwz3N1yBEDfpp+8iCWOK3iOuFKKTTWOoZY4hf Enlf7n4Yp2TOEmIH0xwh/H67zl0+3FwT3fGWC6DDPXHuw+X+mGphCl9XPB70rZcT q/s0dwx1CmWBm30MgFXN+SZ7CgLP13lRAvkVO4t56/O1SkTbpCe7U1zqT2p5UoOY x7qvzs3pdCaWbpCsAqFWr46iECDHuVQjIgLuddOF/OgWVcCZlv7T7ESd7IDPHUPx DD3zYG0ODV0jKZHmpwkSojSbu3z6v5FnfhLpAcaHoEMBeRu5UIar7EjPHwqrqiU7 JqE7dBECmcD308sr9u0w44DK15nmsD3+njrBQ/AJmsWdg0wtnMvA01nAHKObbk0n 33aIu4Iny1dH35/rt9dV2DKT09f5r0ANCjoJMX8gu/li66FHGfULOaqr6KLLqi5X VPgHCKzyT9nD+Bc2LYzRWmhhAj+5Iwyglgpe9ZiOlPQ5i+hLvfPPAZxVYSbVA1Sk aVZi+ibKUqHSBfXcaLf/OKX7Csf4zni3F+WfFT5ZIC4Y6iEF+0tvS2HW2/pcUAN/ OSPYYmyqhwYIl8tvbQENgBsyU/K1rECxJpqWAznJLRCebkY5a/s= =0Sco -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-kunit-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull more Kunit updates from Shuah Khan: - add Kunit to kernel_init() and remove KUnit from init calls entirely. This addresses the concern that Kunit would not work correctly during late init phase. - add a linker section where KUnit can put references to its test suites. This is the first step in transitioning to dispatching all KUnit tests from a centralized executor rather than having each as its own separate late_initcall. - add a centralized executor to dispatch tests rather than relying on late_initcall to schedule each test suite separately. Centralized execution is for built-in tests only; modules will execute tests when loaded. - convert bitfield test to use KUnit framework - Documentation updates for naming guidelines and how kunit_test_suite() works. - add test plan to KUnit TAP format * tag 'linux-kselftest-kunit-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: lib: kunit: Fix compilation test when using TEST_BIT_FIELD_COMPILE lib: kunit: add bitfield test conversion to KUnit Documentation: kunit: add a brief blurb about kunit_test_suite kunit: test: add test plan to KUnit TAP format init: main: add KUnit to kernel init kunit: test: create a single centralized executor for all tests vmlinux.lds.h: add linker section for KUnit test suites Documentation: kunit: Add naming guidelines |
||
Linus Torvalds
|
9ff9b0d392 |
networking changes for the 5.10 merge window
Add redirect_neigh() BPF packet redirect helper, allowing to limit stack traversal in common container configs and improving TCP back-pressure. Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain. Expand netlink policy support and improve policy export to user space. (Ge)netlink core performs request validation according to declared policies. Expand the expressiveness of those policies (min/max length and bitmasks). Allow dumping policies for particular commands. This is used for feature discovery by user space (instead of kernel version parsing or trial and error). Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge. Allow more than 255 IPv4 multicast interfaces. Add support for Type of Service (ToS) reflection in SYN/SYN-ACK packets of TCPv6. In Multi-patch TCP (MPTCP) support concurrent transmission of data on multiple subflows in a load balancing scenario. Enhance advertising addresses via the RM_ADDR/ADD_ADDR options. Support SMC-Dv2 version of SMC, which enables multi-subnet deployments. Allow more calls to same peer in RxRPC. Support two new Controller Area Network (CAN) protocols - CAN-FD and ISO 15765-2:2016. Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit kernel problem. Add TC actions for implementing MPLS L2 VPNs. Improve nexthop code - e.g. handle various corner cases when nexthop objects are removed from groups better, skip unnecessary notifications and make it easier to offload nexthops into HW by converting to a blocking notifier. Support adding and consuming TCP header options by BPF programs, opening the doors for easy experimental and deployment-specific TCP option use. Reorganize TCP congestion control (CC) initialization to simplify life of TCP CC implemented in BPF. Add support for shipping BPF programs with the kernel and loading them early on boot via the User Mode Driver mechanism, hence reusing all the user space infra we have. Support sleepable BPF programs, initially targeting LSM and tracing. Add bpf_d_path() helper for returning full path for given 'struct path'. Make bpf_tail_call compatible with bpf-to-bpf calls. Allow BPF programs to call map_update_elem on sockmaps. Add BPF Type Format (BTF) support for type and enum discovery, as well as support for using BTF within the kernel itself (current use is for pretty printing structures). Support listing and getting information about bpf_links via the bpf syscall. Enhance kernel interfaces around NIC firmware update. Allow specifying overwrite mask to control if settings etc. are reset during update; report expected max time operation may take to users; support firmware activation without machine reboot incl. limits of how much impact reset may have (e.g. dropping link or not). Extend ethtool configuration interface to report IEEE-standard counters, to limit the need for per-vendor logic in user space. Adopt or extend devlink use for debug, monitoring, fw update in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx, dpaa2-eth). In mlxsw expose critical and emergency SFP module temperature alarms. Refactor port buffer handling to make the defaults more suitable and support setting these values explicitly via the DCBNL interface. Add XDP support for Intel's igb driver. Support offloading TC flower classification and filtering rules to mscc_ocelot switches. Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as fixed interval period pulse generator and one-step timestamping in dpaa-eth. Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3) offload. Add Lynx PHY/PCS MDIO module, and convert various drivers which have this HW to use it. Convert mvpp2 to split PCS. Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as 7-port Mediatek MT7531 IP. Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver, and wcn3680 support in wcn36xx. Improve performance for packets which don't require much offloads on recent Mellanox NICs by 20% by making multiple packets share a descriptor entry. Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto subtree to drivers/net. Move MDIO drivers out of the phy directory. Clean up a lot of W=1 warnings, reportedly the actively developed subsections of networking drivers should now build W=1 warning free. Make sure drivers don't use in_interrupt() to dynamically adapt their code. Convert tasklets to use new tasklet_setup API (sadly this conversion is not yet complete). Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+ItRwACgkQMUZtbf5S IrtTMg//UxpdR/MirT1DatBU0K/UGAZY82hV7F/UC8tPgjfHZeHvWlDFxfi3YP81 PtPKbhRZ7DhwBXefUp6nY3UdvjftrJK2lJm8prJUPSsZRye8Wlcb7y65q7/P2y2U Efucyopg6RUrmrM0DUsIGYGJgylQLHnMYUl/keCsD4t5Bp4ksyi9R2t5eitGoWzh r3QGdbSa0AuWx4iu0i+tqp6Tj0ekMBMXLVb35dtU1t0joj2KTNEnSgABN3prOa8E iWYf2erOau68Ogp3yU3miCy0ZU4p/7qGHTtzbcp677692P/ekak6+zmfHLT9/Pjy 2Stq2z6GoKuVxdktr91D9pA3jxG4LxSJmr0TImcGnXbvkMP3Ez3g9RrpV5fn8j6F mZCH8TKZAoD5aJrAJAMkhZmLYE1pvDa7KolSk8WogXrbCnTEb5Nv8FHTS1Qnk3yl wSKXuvutFVNLMEHCnWQLtODbTST9DI/aOi6EctPpuOA/ZyL1v3pl+gfp37S+LUTe owMnT/7TdvKaTD0+gIyU53M6rAWTtr5YyRQorX9awIu/4Ha0F0gYD7BJZQUGtegp HzKt59NiSrFdbSH7UdyemdBF4LuCgIhS7rgfeoUXMXmuPHq7eHXyHZt5dzPPa/xP 81P0MAvdpFVwg8ij2yp2sHS7sISIRKq17fd1tIewUabxQbjXqPc= =bc1U -----END PGP SIGNATURE----- Merge tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: - Add redirect_neigh() BPF packet redirect helper, allowing to limit stack traversal in common container configs and improving TCP back-pressure. Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain. - Expand netlink policy support and improve policy export to user space. (Ge)netlink core performs request validation according to declared policies. Expand the expressiveness of those policies (min/max length and bitmasks). Allow dumping policies for particular commands. This is used for feature discovery by user space (instead of kernel version parsing or trial and error). - Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge. - Allow more than 255 IPv4 multicast interfaces. - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK packets of TCPv6. - In Multi-patch TCP (MPTCP) support concurrent transmission of data on multiple subflows in a load balancing scenario. Enhance advertising addresses via the RM_ADDR/ADD_ADDR options. - Support SMC-Dv2 version of SMC, which enables multi-subnet deployments. - Allow more calls to same peer in RxRPC. - Support two new Controller Area Network (CAN) protocols - CAN-FD and ISO 15765-2:2016. - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit kernel problem. - Add TC actions for implementing MPLS L2 VPNs. - Improve nexthop code - e.g. handle various corner cases when nexthop objects are removed from groups better, skip unnecessary notifications and make it easier to offload nexthops into HW by converting to a blocking notifier. - Support adding and consuming TCP header options by BPF programs, opening the doors for easy experimental and deployment-specific TCP option use. - Reorganize TCP congestion control (CC) initialization to simplify life of TCP CC implemented in BPF. - Add support for shipping BPF programs with the kernel and loading them early on boot via the User Mode Driver mechanism, hence reusing all the user space infra we have. - Support sleepable BPF programs, initially targeting LSM and tracing. - Add bpf_d_path() helper for returning full path for given 'struct path'. - Make bpf_tail_call compatible with bpf-to-bpf calls. - Allow BPF programs to call map_update_elem on sockmaps. - Add BPF Type Format (BTF) support for type and enum discovery, as well as support for using BTF within the kernel itself (current use is for pretty printing structures). - Support listing and getting information about bpf_links via the bpf syscall. - Enhance kernel interfaces around NIC firmware update. Allow specifying overwrite mask to control if settings etc. are reset during update; report expected max time operation may take to users; support firmware activation without machine reboot incl. limits of how much impact reset may have (e.g. dropping link or not). - Extend ethtool configuration interface to report IEEE-standard counters, to limit the need for per-vendor logic in user space. - Adopt or extend devlink use for debug, monitoring, fw update in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx, dpaa2-eth). - In mlxsw expose critical and emergency SFP module temperature alarms. Refactor port buffer handling to make the defaults more suitable and support setting these values explicitly via the DCBNL interface. - Add XDP support for Intel's igb driver. - Support offloading TC flower classification and filtering rules to mscc_ocelot switches. - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as fixed interval period pulse generator and one-step timestamping in dpaa-eth. - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3) offload. - Add Lynx PHY/PCS MDIO module, and convert various drivers which have this HW to use it. Convert mvpp2 to split PCS. - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as 7-port Mediatek MT7531 IP. - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver, and wcn3680 support in wcn36xx. - Improve performance for packets which don't require much offloads on recent Mellanox NICs by 20% by making multiple packets share a descriptor entry. - Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto subtree to drivers/net. Move MDIO drivers out of the phy directory. - Clean up a lot of W=1 warnings, reportedly the actively developed subsections of networking drivers should now build W=1 warning free. - Make sure drivers don't use in_interrupt() to dynamically adapt their code. Convert tasklets to use new tasklet_setup API (sadly this conversion is not yet complete). * tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits) Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" net, sockmap: Don't call bpf_prog_put() on NULL pointer bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo bpf, sockmap: Add locking annotations to iterator netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements net: fix pos incrementment in ipv6_route_seq_next net/smc: fix invalid return code in smcd_new_buf_create() net/smc: fix valid DMBE buffer sizes net/smc: fix use-after-free of delayed events bpfilter: Fix build error with CONFIG_BPFILTER_UMH cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info bpf: Fix register equivalence tracking. rxrpc: Fix loss of final ack on shutdown rxrpc: Fix bundle counting for exclusive connections netfilter: restore NF_INET_NUMHOOKS ibmveth: Identify ingress large send packets. ibmveth: Switch order of ibmveth_helper calls. cxgb4: handle 4-tuple PEDIT to NAT mode translation selftests: Add VRF route leaking tests ... |
||
Linus Torvalds
|
bbf6259903 |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina: "The latest advances in computer science from the trivial queue" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: xtensa: fix Kconfig typo spelling.txt: Remove some duplicate entries mtd: rawnand: oxnas: cleanup/simplify code selftests: vm: add fragment CONFIG_GUP_BENCHMARK perf: Fix opt help text for --no-bpf-event HID: logitech-dj: Fix spelling in comment bootconfig: Fix kernel message mentioning CONFIG_BOOT_CONFIG MAINTAINERS: rectify MMP SUPPORT after moving cputype.h scif: Fix spelling of EACCES printk: fix global comment lib/bitmap.c: fix spello fs: Fix missing 'bit' in comment |
||
Linus Torvalds
|
d594d8f411 |
printk changes for 5.10
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl+EN+oACgkQUqAMR0iA lPK/gA//WXBjC4FSPNr0j7kPFKQhADS3cUcp+GfuI4rYkYcJHV0yJn1kvctg1rUC Je+Hc+Hy5Nk93lwejj5BvQoc31zOeoPDyMje5zi5te4H2NQkaoGXHOMvUnaLcNeo g+HJvx+NU9MDjuc5amtK8YD69jzErD+eqrHpQOg4UToMXXcBXLafTThIi9vT1fzP 9uwWBRlpdQyY7tYbbwFiDuu33PyoWlc6Ksp8qKdLBLz2AmGd1Rvaq+ePsq8b9tHJ pfv1agW0GTpzoN2pm5gFXOoYniHB/ooB1L0QLq7ylaociEyb8WbTtkn4v++EjxW8 aGsO1WdO0MQeIWDxXQR5DYD3s+Me2DMhFPDqUc2/s0q2SGWUPFcsmCsvMAOx/clA HDfTWkyzB4FarZOTv0gZ7jYNOVukFzUQ1IBTtWpJifC9fT0xrRkKmKE1UgmWv0ei Hx5VFQyQGsDh3sUcRLhW91p4sqJCs7l01zw1A/0rb7a+QTHAqZRtbz5hyTjlViiT 57XiyXynXW8N4Q5U6uAxCbkFFi+nP/XVQ5ggZ/QLn/4hfWWUcu0vt2bOGkRwryAT zYmDqViraEVWKIom74UzZ0nrIBtdhvtbFQIYuyiCQKpKMwytWXUQbUASZL2mfBZi h5eJx7etV6f5to5mNRsj8bbN5buX9UheEd0QFD9NJdS6aadqTac= =9vEl -----END PGP SIGNATURE----- Merge tag 'printk-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: "The big new thing is the fully lockless ringbuffer implementation, including the support for continuous lines. It will allow to store and read messages in any situation wihtout the risk of deadlocks and without the need of temporary per-CPU buffers. The access is still serialized by logbuf_lock. It synchronizes few more operations, for example, temporary buffer for formatting the message, syslog and kmsg_dump operations. The lock removal is being discussed and should be ready for the next release. The continuous lines are handled exactly the same way as before to avoid regressions in user space. It means that they are appended to the last message when the caller is the same. Only the last message can be extended. The data ring includes plain text of the messages. Except for an integer at the beginning of each message that points back to the descriptor ring with other metadata. The dictionary has to stay. journalctl uses it to filter the log. It allows to show messages related to a given device. The dictionary values are stored in the descriptor ring with the other metadata. This is the first part of the printk rework as discussed at Plumbers 2019, see https://lore.kernel.org/r/87k1acz5rx.fsf@linutronix.de. The next big step will be handling consoles by kthreads during the normal system operation. It will require special handling of situations when the kthreads could not get scheduled, for example, early boot, suspend, panic. Other changes: - Add John Ogness as a reviewer for printk subsystem. He is author of the rework and is familiar with the code and history. - Fix locking in serial8250_do_startup() to prevent lockdep report. - Few code cleanups" * tag 'printk-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (27 commits) printk: Use fallthrough pseudo-keyword printk: reduce setup_text_buf size to LOG_LINE_MAX printk: avoid and/or handle record truncation printk: remove dict ring printk: move dictionary keys to dev_printk_info printk: move printk_info into separate array printk: reimplement log_cont using record extension printk: ringbuffer: add finalization/extension support printk: ringbuffer: change representation of states printk: ringbuffer: clear initial reserved fields printk: ringbuffer: add BLK_DATALESS() macro printk: ringbuffer: relocate get_data() printk: ringbuffer: avoid memcpy() on state_var printk: ringbuffer: fix setting state in desc_read() kernel.h: Move oops_in_progress to printk.h scripts/gdb: update for lockless printk ringbuffer scripts/gdb: add utils.read_ulong() docs: vmcoreinfo: add lockless printk ringbuffer vmcoreinfo printk: reduce LOG_BUF_SHIFT range for H8300 printk: ringbuffer: support dataless records ... |
||
Linus Torvalds
|
6ad4bf6ea1 |
io_uring-5.10-2020-10-12
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+EXPEQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpiR4EAC3trm1ojXVF7y9/XRhcPpb4Pror+ZA1coO gyoy+zUuCEl9WCzzHWqXULMYMP0YzNJnJs0oLQPA1s0sx1H4uDMl/UXg0OXZisYG Y59Kca3c1DHFwj9KPQXfGmCEjc/rbDWK5TqRc2iZMz+6E5Mt71UFZHtenwgV1zD8 hTmZZkzLCu2ePfOvrPONgL5tDqPWGVyn61phoC7cSzMF66juXGYuvQGktzi/m6q+ jAxUnhKvKTlLB9wsq3s5X/20/QD56Yuba9U+YxeeNDBE8MDWQOsjz0mZCV1fn4p3 h/6762aRaWaXH7EwMtsHFUWy7arJZg/YoFYNYLv4Ksyy3y4sMABZCy3A+JyzrgQ0 hMu7vjsP+k22X1WH8nyejBfWNEmxu6dpgckKrgF0dhJcXk/acWA3XaDWZ80UwfQy isKRAP1rA0MJKHDMIwCzSQJDPvtUAkPptbNZJcUSU78o+pPoCaQ93V++LbdgGtKn iGJJqX05dVbcsDx5X7fluphjkUTC4yFr7ZgLgbOIedXajWRD8iOkO2xxCHk6SKFl iv9entvRcX9k3SHK9uffIUkRBUujMU0+HCIQFCO1qGmkCaS5nSrovZl4HoL7L/Dj +T8+v7kyJeklLXgJBaE7jk01O4HwZKjwPWMbCjvL9NKk8j7c1soYnRu5uNvi85Mu /9wn671s+w== =udgj -----END PGP SIGNATURE----- Merge tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-block Pull io_uring updates from Jens Axboe: - Add blkcg accounting for io-wq offload (Dennis) - A use-after-free fix for io-wq (Hillf) - Cancelation fixes and improvements - Use proper files_struct references for offload - Cleanup of io_uring_get_socket() since that can now go into our own header - SQPOLL fixes and cleanups, and support for sharing the thread - Improvement to how page accounting is done for registered buffers and huge pages, accounting the real pinned state - Series cleaning up the xarray code (Willy) - Various cleanups, refactoring, and improvements (Pavel) - Use raw spinlock for io-wq (Sebastian) - Add support for ring restrictions (Stefano) * tag 'io_uring-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (62 commits) io_uring: keep a pointer ref_node in file_data io_uring: refactor *files_register()'s error paths io_uring: clean file_data access in files_register io_uring: don't delay io_init_req() error check io_uring: clean leftovers after splitting issue io_uring: remove timeout.list after hrtimer cancel io_uring: use a separate struct for timeout_remove io_uring: improve submit_state.ios_left accounting io_uring: simplify io_file_get() io_uring: kill extra check in fixed io_file_get() io_uring: clean up ->files grabbing io_uring: don't io_prep_async_work() linked reqs io_uring: Convert advanced XArray uses to the normal API io_uring: Fix XArray usage in io_uring_add_task_file io_uring: Fix use of XArray in __io_uring_files_cancel io_uring: fix break condition for __io_uring_register() waiting io_uring: no need to call xa_destroy() on empty xarray io_uring: batch account ->req_issue and task struct references io_uring: kill callback_head argument for io_req_task_work_add() io_uring: move req preps out of io_issue_sqe() ... |
||
Linus Torvalds
|
50d228345a |
As hoped, things calmed down for docs this cycle; fewer changes and almost
no conflicts at all. This pull includes: - A reworked and expanded user-mode Linux document - Some simplifications and improvements for submitting-patches.rst - An emergency fix for (some) problems with Sphinx 3.x - Some welcome automarkup improvements to automatically generate cross-references to struct definitions and other documents - The usual collection of translation updates, typo fixes, etc. -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl+ErNYPHGNvcmJldEBs d24ubmV0AAoJEBdDWhNsDH5Y284H/3bv9fahbg16AJcKYqJXFHGpDs3CsASPnJqQ 9HQoV5tg6Qd4kI3oFb+30l8SK73Wr2t685/DhOPDRR/vN3B5M1vOQvPRL/dEqiwi aUEhtMbnC/trSbteXsjGDWT+1EnI/+R3NFV++WiRp1XxE4DRXL3xySTeviR0IW+V rQxU7VCcVp0bklVH+gqjrsvqU5iZeckyZB6evc8X92ThhzjNprR5KVxxgl1wxcu/ dPYizHoKYVoLVNw50rwPGt2hmq9RpyDM6Xh9UhLHcA57ENyzr8NNTJBOT0tVMTWV smU01X/ECoy54kj1w8AKP+f7F0G7DUU+Jz68X0X/kYPq520dUs4= =Ovox -----END PGP SIGNATURE----- Merge tag 'docs-5.10' of git://git.lwn.net/linux Pull documentation updates from Jonathan Corbet: "As hoped, things calmed down for docs this cycle; fewer changes and almost no conflicts at all. This includes: - A reworked and expanded user-mode Linux document - Some simplifications and improvements for submitting-patches.rst - An emergency fix for (some) problems with Sphinx 3.x - Some welcome automarkup improvements to automatically generate cross-references to struct definitions and other documents - The usual collection of translation updates, typo fixes, etc" * tag 'docs-5.10' of git://git.lwn.net/linux: (81 commits) gpiolib: Update indentation in driver.rst for code excerpts Documentation/admin-guide: tainted-kernels: Fix typo occured Documentation: better locations for sysfs-pci, sysfs-tagging docs: programming-languages: refresh blurb on clang support Documentation: kvm: fix a typo Documentation: Chinese translation of Documentation/arm64/amu.rst doc: zh_CN: index files in arm64 subdirectory mailmap: add entry for <mstarovoitov@marvell.com> doc: seq_file: clarify role of *pos in ->next() docs: trace: ring-buffer-design.rst: use the new SPDX tag Documentation: kernel-parameters: clarify "module." parameters Fix references to nommu-mmap.rst docs: rewrite admin-guide/sysctl/abi.rst docs: fb: Remove vesafb scrollback boot option docs: fb: Remove sstfb scrollback boot option docs: fb: Remove matroxfb scrollback boot option docs: fb: Remove framebuffer scrollback boot option docs: replace the old User Mode Linux HowTo with a new one Documentation/admin-guide: blockdev/ramdisk: remove use of "rdev" Documentation/admin-guide: README & svga: remove use of "rdev" ... |
||
Petr Mladek
|
70333f4ff9 | Merge branch 'printk-rework' into for-linus | ||
Brendan Higgins
|
8c0d884986 |
init: main: add KUnit to kernel init
Although we have not seen any actual examples where KUnit doesn't work because it runs in the late init phase of the kernel, it has been a concern for some time that this could potentially be an issue in the future. So, remove KUnit from init calls entirely, instead call directly from kernel_init() so that KUnit runs after late init. Co-developed-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Brendan Higgins <brendanhiggins@google.com> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> |
||
Jens Axboe
|
0f2122045b |
io_uring: don't rely on weak ->files references
Grab actual references to the files_struct. To avoid circular references issues due to this, we add a per-task note that keeps track of what io_uring contexts a task has used. When the tasks execs or exits its assigned files, we cancel requests based on this tracking. With that, we can grab proper references to the files table, and no longer need to rely on stashing away ring_fd and ring_file to check if the ring_fd may have been closed. Cc: stable@vger.kernel.org # v5.5+ Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Stephen Kitt
|
dd19d2938f |
Fix references to nommu-mmap.rst
nommu-mmap.rst was moved to Documentation/admin-guide/mm; this patch
updates the remaining stale references to Documentation/mm.
Fixes:
|
||
David S. Miller
|
3ab0a7a0c3 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing it's initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <davem@davemloft.net> |
||
Jason Yan
|
a27026e95b |
bootconfig: init: make xbc_namebuf static
This eliminates the following sparse warning: init/main.c:306:6: warning: symbol 'xbc_namebuf' was not declared. Should it be static? Link: https://lkml.kernel.org/r/20200915070324.2239473-1-yanaijie@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
Masami Hiramatsu
|
82d083ab60 |
kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
Since kprobe_event= cmdline option allows user to put kprobes on the
functions in initmem, kprobe has to make such probes gone after boot.
Currently the probes on the init functions in modules will be handled
by module callback, but the kernel init text isn't handled.
Without this, kprobes may access non-exist text area to disable or
remove it.
Link: https://lkml.kernel.org/r/159972810544.428528.1839307531600646955.stgit@devnote2
Fixes:
|
||
John Ogness
|
550c10d28d |
printk: reduce LOG_BUF_SHIFT range for H8300
The .bss section for the h8300 is relatively small. A value of CONFIG_LOG_BUF_SHIFT that is larger than 19 will create a static printk ringbuffer that is too large. Limit the range appropriately for the H8300. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200812073122.25412-1-john.ogness@linutronix.de |
||
Jakub Kicinski
|
44a8c4f33c |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.
Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit
|