mirror of
https://github.com/torvalds/linux.git
synced 2024-11-27 22:51:35 +00:00
05e4ed1ce5
1838 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Arnd Bergmann
|
e6226997ec |
asm-generic: reverse GENERIC_{STRNCPY_FROM,STRNLEN}_USER symbols
Most architectures do not need a custom implementation, and in most cases the generic implementation is preferred, so change the polariy on these Kconfig symbols to require architectures to select them when they provide their own version. The new name is CONFIG_ARCH_HAS_{STRNCPY_FROM,STRNLEN}_USER. The remaining architectures at the moment are: ia64, mips, parisc, um and xtensa. We should probably convert these as well, but I was not sure how far to take this series. Thomas Bogendoerfer had some concerns about converting mips but may still do some more detailed measurements to see which version is better. Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Jeff Dike <jdike@addtoit.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: linux-ia64@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-um@lists.infradead.org Cc: linux-xtensa@linux-xtensa.org Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Helge Deller <deller@gmx.de> # parisc Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
Arnd Bergmann
|
b26b181651 |
microblaze: use generic strncpy/strnlen from_user
Remove the microblaze implemenation of strncpy/strnlen and instead use the generic versions. The microblaze version is fairly slow because it always does byte accesses even for aligned data, and it lacks a checks for user_addr_max(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
Eric W. Biederman
|
b48c7236b1 |
exit/bdflush: Remove the deprecated bdflush system call
The bdflush system call has been deprecated for a very long time. Recently Michael Schmitz tested[1] and found that the last known caller of of the bdflush system call is unaffected by it's removal. Since the code is not needed delete it. [1] https://lkml.kernel.org/r/36123b5d-daa0-6c2b-f2d4-a942f069fd54@gmail.com Link: https://lkml.kernel.org/r/87sg10quue.fsf_-_@disp2133 Tested-by: Michael Schmitz <schmitzmic@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Cyril Hrubis <chrubis@suse.cz> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> |
||
Linus Torvalds
|
81361b837a |
Kbuild updates for v5.14
- Increase the -falign-functions alignment for the debug option. - Remove ugly libelf checks from the top Makefile. - Make the silent build (-s) more silent. - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified. - Various script cleanups -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmDon90VHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGWFUP/RGNwlGD/YV1xg0ZmM0/ynBzzOy2 3dcr3etJZpipQDeqnHy3jt0esgMVlbkTdrHvP+2hpNaeXFwjF1fDHjhur9m8ZkVD efOA6nugOnNwhy2G3BvtCJv+Vhb+KZ0nNLB27z3Bl0LGP6LJdMRNAxFBJMv4k3aR F3sABugwCpnT2/YtuprxRl2/3/CyLur5NjY24FD+ugON3JIWfl6ETbHeFmxr1JE4 mE+zaN5AwYuSuH9LpdRy85XVCcW/FFqP/DwOFllVvCCCNvvS0KWYSNHWfEsKdR75 hmAAaS/rpi2eaL0vp88sNhAtYnhMSf+uFu0fyfYeWZuJqMt4Xz5xZKAzDsifCdif aQ6UEPDjiKABh9gpX26BMd2CXzkGR+L4qZ7iBPfO586Iy7opajrFX9kIj5U7ZtCl wsPat/9+18xpVJOTe0sss3idId7Ft4cRoW5FQMEAW2EWJ9fXAG1yDxEREj1V5gFx sMXtpmCoQag968qjfARvP08s3MB1P4Ij6tXcioGqHuEWeJLxOMK/KWyafQUg611d 0kSWNO0OMo+odBj6j/vM+MIIaPhgwtZnPgw2q4uHGMcemzQxaEvGW+G/5a5qEpTv SKm8W24wXplNot4tuTGWq5/jANRJcMvVsyC48DYT81OZEOWrIc0kDV4v4qZToTxW 97jn1NKa2H6L0J1V =Za8V -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Increase the -falign-functions alignment for the debug option. - Remove ugly libelf checks from the top Makefile. - Make the silent build (-s) more silent. - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified. - Various script cleanups * tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (27 commits) scripts: add generic syscallnr.sh scripts: check duplicated syscall number in syscall table sparc: syscalls: use pattern rules to generate syscall headers parisc: syscalls: use pattern rules to generate syscall headers nds32: add arch/nds32/boot/.gitignore kbuild: mkcompile_h: consider timestamp if KBUILD_BUILD_TIMESTAMP is set kbuild: modpost: Explicitly warn about unprototyped symbols kbuild: remove trailing slashes from $(KBUILD_EXTMOD) kconfig.h: explain IS_MODULE(), IS_ENABLED() kconfig: constify long_opts scripts/setlocalversion: simplify the short version part scripts/setlocalversion: factor out 12-chars hash construction scripts/setlocalversion: add more comments to -dirty flag detection scripts/setlocalversion: remove workaround for old make-kpkg scripts/setlocalversion: remove mercurial, svn and git-svn supports kbuild: clean up ${quiet} checks in shell scripts kbuild: sink stdout from cmd for silent build init: use $(call cmd,) for generating include/generated/compile.h kbuild: merge scripts/mkmakefile to top Makefile sh: move core-y in arch/sh/Makefile to arch/sh/Kbuild ... |
||
Linus Torvalds
|
4cad671979 |
asm-generic/unaligned: Unify asm/unaligned.h around struct helper
The get_unaligned()/put_unaligned() helpers are traditionally architecture specific, with the two main variants being the "access-ok.h" version that assumes unaligned pointer accesses always work on a particular architecture, and the "le-struct.h" version that casts the data to a byte aligned type before dereferencing, for architectures that cannot always do unaligned accesses in hardware. Based on the discussion linked below, it appears that the access-ok version is not realiable on any architecture, but the struct version probably has no downsides. This series changes the code to use the same implementation on all architectures, addressing the few exceptions separately. Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/ Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363 Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/ Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2 Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmDfFx4ACgkQmmx57+YA GNkqzRAAjdlIr8M+xI2CyT0/A9tswYfLMeWejmYopq3zlxI6RnvPiJJDIdY2I8US 1npIiDo55w061CnXL9rV65ocL3XmGu1mabOvgM6ATsec+8t4WaXBV9tysxTJ9ea0 ltLTa2P5DXWALvWiVMTME7hFaf1cW+8Uqt3LmXxDp2l5zasXajCHAH6YokON2PfM CsaRhwSxIu8Sbnu/IQGBI9JW5UXsBfKSyUwtM0OwP7jFOuIeZ4WBVA+j6UxONnFC wouKmAM/ThoOsaV9aP4EZLIfBx8d4/hfYQjZ958kYXurerruYkJeEqdIRbV0QqTy 2O6ZrJ6uqPlzfWz9h458me2dt98YEtALHV/3DCWUcBfHmUQtxElyJYEhG0YjVF3H 5RYtjw8Q2LS/QR5ask1Xn0JfT89rRnLi2migAtsA4Ce70JP4Us6wGobkj4SHlgDt P7+eVq2Mkhqw/kmV8N4p+ZS5lpkK0JniDN+ONDhkZqHL/zXG/HQzx9wLV69jlvo2 ASevKxITdi+bKHWs5ANungkBOnBUQZacq46mVyi4HPDwMAFyWvVYTbFumy9koagQ o9NEgX3RsZcxxi7bU1xuFPFMLMlUQT3Nb30+84B4fKe9FmvHC1hizTiCnp7q4bZr z6a6AMHke7YLqKZOqzTJGRR3lPoZZDCb775SAd70LQp6XPZXOHs= =IY5U -----END PGP SIGNATURE----- Merge tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm/unaligned.h unification from Arnd Bergmann: "Unify asm/unaligned.h around struct helper The get_unaligned()/put_unaligned() helpers are traditionally architecture specific, with the two main variants being the "access-ok.h" version that assumes unaligned pointer accesses always work on a particular architecture, and the "le-struct.h" version that casts the data to a byte aligned type before dereferencing, for architectures that cannot always do unaligned accesses in hardware. Based on the discussion linked below, it appears that the access-ok version is not realiable on any architecture, but the struct version probably has no downsides. This series changes the code to use the same implementation on all architectures, addressing the few exceptions separately" Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/ Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363 Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/ Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2 Link: https://lore.kernel.org/lkml/CAHk-=whGObOKruA_bU3aPGZfoDqZM1_9wBkwREp0H0FgR-90uQ@mail.gmail.com/ * tag 'asm-generic-unaligned-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: asm-generic: simplify asm/unaligned.h asm-generic: uaccess: 1-byte access is always aligned netpoll: avoid put_unaligned() on single character mwifiex: re-fix for unaligned accesses apparmor: use get_unaligned() only for multi-byte words partitions: msdos: fix one-byte get_unaligned() asm-generic: unaligned always use struct helpers asm-generic: unaligned: remove byteshift helpers powerpc: use linux/unaligned/le_struct.h on LE power7 m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS sh: remove unaligned access for sh4a openrisc: always use unaligned-struct header asm-generic: use asm-generic/unaligned.h for most architectures |
||
Linus Torvalds
|
71bd934101 |
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton: "190 patches. Subsystems affected by this patch series: mm (hugetlb, userfaultfd, vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock, migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap, zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc, core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs, signals, exec, kcov, selftests, compress/decompress, and ipc" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) ipc/util.c: use binary search for max_idx ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock ipc: use kmalloc for msg_queue and shmid_kernel ipc sem: use kvmalloc for sem_undo allocation lib/decompressors: remove set but not used variabled 'level' selftests/vm/pkeys: exercise x86 XSAVE init state selftests/vm/pkeys: refill shadow register after implicit kernel write selftests/vm/pkeys: handle negative sys_pkey_alloc() return code selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random kcov: add __no_sanitize_coverage to fix noinstr for all architectures exec: remove checks in __register_bimfmt() x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned hfsplus: report create_date to kstat.btime hfsplus: remove unnecessary oom message nilfs2: remove redundant continue statement in a while-loop kprobes: remove duplicated strong free_insn_page in x86 and s390 init: print out unknown kernel parameters checkpatch: do not complain about positive return values starting with EPOLL checkpatch: improve the indented label test checkpatch: scripts/spdxcheck.py now requires python3 ... |
||
Linus Torvalds
|
911a2997a5 |
\n
-----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmDcl7AACgkQnJ2qBz9k QNnsBQf+LBAPsfykQ/f8EdHErO1lfbVTmwf2g/JzTkjrIVZTZ6Ic47aCIiFxgHU2 Js9ufaPxpsbbopzpn2PAoCUzxNsZDqgXtnC03MOUAqoSFbAvgLHz2sQwjqeYJUGQ P6n7VipEA/qBVpQI5zeCUhHYcahoNrRjSLzaFnE2Z8CrQYQ6Ry9gVEhduvu2OTru 62cWlAWlTJfx/FcR1Y0F/ZznnNSKMiAHcEe3F6Beztplg2ooq+z6FclJYrkmnxMq SXSOsqTCdi1/oFx36NpvLkykrIS9I7N/iqCnKwbm6X+nyZZKyAwYZhWVqkbozPPu +u1Ppq8o0IuWwEA6/UAmxgAO3m/Gkw== =tn0h -----END PGP SIGNATURE----- Merge tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull misc fs updates from Jan Kara: "The new quotactl_fd() syscall (remake of quotactl_path() syscall that got introduced & disabled in 5.13 cycle), and couple of udf, reiserfs, isofs, and writeback fixes and cleanups" * tag 'fs_for_v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: writeback: fix obtain a reference to a freeing memcg css quota: remove unnecessary oom message isofs: remove redundant continue statement quota: Wire up quotactl_fd syscall quota: Change quotactl_path() systcall to an fd-based one reiserfs: Remove unneed check in reiserfs_write_full_page() udf: Fix NULL pointer dereference in udf_symlink function reiserfs: add check for invalid 1st journal block |
||
Anshuman Khandual
|
1c2f7d14d8 |
mm/thp: define default pmd_pgtable()
Currently most platforms define pmd_pgtable() as pmd_page() duplicating the same code all over. Instead just define a default value i.e pmd_page() for pmd_pgtable() and let platforms override when required via <asm/pgtable.h>. All the existing platform that override pmd_pgtable() have been moved into their respective <asm/pgtable.h> header in order to precede before the new generic definition. This makes it much cleaner with reduced code. Link: https://lkml.kernel.org/r/1623646133-20306-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Anshuman Khandual
|
fac7757e1f |
mm: define default value for FIRST_USER_ADDRESS
Currently most platforms define FIRST_USER_ADDRESS as 0UL duplication the same code all over. Instead just define a generic default value (i.e 0UL) for FIRST_USER_ADDRESS and let the platforms override when required. This makes it much cleaner with reduced code. The default FIRST_USER_ADDRESS here would be skipped in <linux/pgtable.h> when the given platform overrides its value via <asm/pgtable.h>. Link: https://lkml.kernel.org/r/1620615725-24623-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> [RISC-V] Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Brian Cain <bcain@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Kefeng Wang
|
63703f37aa |
mm: generalize ZONE_[DMA|DMA32]
ZONE_[DMA|DMA32] configs have duplicate definitions on platforms that subscribe to them. Instead, just make them generic options which can be selected on applicable platforms. Also only x86/arm64 architectures could enable both ZONE_DMA and ZONE_DMA32 if EXPERT, add ARCH_HAS_ZONE_DMA_SET to make dma zone configurable and visible on the two architectures. Link: https://lkml.kernel.org/r/20210528074557.17768-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> [RISC-V] Acked-by: Michal Simek <michal.simek@xilinx.com> [microblaze] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
f4cc74c938 |
Microblaze patches for 5.14-rc1
- Remove unused PAGE_UP/DOWN macros - Fix trivial spelling mistake -----BEGIN PGP SIGNATURE----- iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCYNstwwAKCRDKSWXLKUoM ISRRAJ9d+6RuVlID0X58bnKaFKNY/RcPMwCgkxwVfBDgOyTQT2LS3VOsxf5FpH0= =3KGM -----END PGP SIGNATURE----- Merge tag 'microblaze-v5.14' of git://git.monstr.eu/linux-2.6-microblaze Pull microblaze updates from Michal Simek: - Remove unused PAGE_UP/DOWN macros - Fix trivial spelling mistake * tag 'microblaze-v5.14' of git://git.monstr.eu/linux-2.6-microblaze: arch: microblaze: Fix spelling mistake "vesion" -> "version" microblaze: Cleanup unused functions |
||
Linus Torvalds
|
54a728dc5e |
Scheduler udpates for this cycle:
- Changes to core scheduling facilities: - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables coordinated scheduling across SMT siblings. This is a much requested feature for cloud computing platforms, to allow the flexible utilization of SMT siblings, without exposing untrusted domains to information leaks & side channels, plus to ensure more deterministic computing performance on SMT systems used by heterogenous workloads. There's new prctls to set core scheduling groups, which allows more flexible management of workloads that can share siblings. - Fix task->state access anti-patterns that may result in missed wakeups and rename it to ->__state in the process to catch new abuses. - Load-balancing changes: - Tweak newidle_balance for fair-sched, to improve 'memcache'-like workloads. - "Age" (decay) average idle time, to better track & improve workloads such as 'tbench'. - Fix & improve energy-aware (EAS) balancing logic & metrics. - Fix & improve the uclamp metrics. - Fix task migration (taskset) corner case on !CONFIG_CPUSET. - Fix RT and deadline utilization tracking across policy changes - Introduce a "burstable" CFS controller via cgroups, which allows bursty CPU-bound workloads to borrow a bit against their future quota to improve overall latencies & batching. Can be tweaked via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us. - Rework assymetric topology/capacity detection & handling. - Scheduler statistics & tooling: - Disable delayacct by default, but add a sysctl to enable it at runtime if tooling needs it. Use static keys and other optimizations to make it more palatable. - Use sched_clock() in delayacct, instead of ktime_get_ns(). - Misc cleanups and fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZcPoRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1g3yw//WfhIqy7Psa9d/MBMjQDRGbTuO4+w22Dj vmWFU44Q4KJxQHWeIgUlrK+dzvYWvNmflUs2CUUOiDVzxFTHMIyBtL4qCBUbx4Ns vKAcB9wsWZge2o3WzZqpProRhdoRaSKw8egUr2q7rACVBkckY7eGP/OjWxXU8BdA b7D0LPWwuIBFfN4pFYeCDLn32Dqr9s6Chyj+ZecabdG7EE6Gu+f1diVcxy7JE/mc 4WWL0D1RqdgpGrBEuMJIxPYekdrZiuy4jtEbztz5gbTBteN1cj3BLfqn0Pc/e6rO Vyuc5mXCAmzRVi18z6g6bsVl+IA/nrbErENB2OHOhOYtqiZxqGTd4GPWZszMyY17 5AsEO5+5pcaBsy4gyp09qURggBu9zhJnMVmOI3rIHZkmkhwzc6uUJlyhDCTiFWOz 3ZF3LjbZEyCKodMD8qMHbs3axIBpIfZqjzkvSKyFnvfXEGVytVse7NUuWtQ36u92 GnURxVeYY1TDVXvE1Y8owNKMxknKQ6YRlypP7Dtbeo/qG6hShp0xmS7qDLDi0ybZ ZlK+bDECiVoDf3nvJo+8v5M82IJ3CBt4UYldeRJsa1YCK/FsbK8tp91fkEfnXVue +U6LPX0AmMpXacR5HaZfb3uBIKRw/QMdP/7RFtBPhpV6jqCrEmuqHnpPQiEVtxwO UmG7bt94Trk= =3VDr -----END PGP SIGNATURE----- Merge tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler udpates from Ingo Molnar: - Changes to core scheduling facilities: - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables coordinated scheduling across SMT siblings. This is a much requested feature for cloud computing platforms, to allow the flexible utilization of SMT siblings, without exposing untrusted domains to information leaks & side channels, plus to ensure more deterministic computing performance on SMT systems used by heterogenous workloads. There are new prctls to set core scheduling groups, which allows more flexible management of workloads that can share siblings. - Fix task->state access anti-patterns that may result in missed wakeups and rename it to ->__state in the process to catch new abuses. - Load-balancing changes: - Tweak newidle_balance for fair-sched, to improve 'memcache'-like workloads. - "Age" (decay) average idle time, to better track & improve workloads such as 'tbench'. - Fix & improve energy-aware (EAS) balancing logic & metrics. - Fix & improve the uclamp metrics. - Fix task migration (taskset) corner case on !CONFIG_CPUSET. - Fix RT and deadline utilization tracking across policy changes - Introduce a "burstable" CFS controller via cgroups, which allows bursty CPU-bound workloads to borrow a bit against their future quota to improve overall latencies & batching. Can be tweaked via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us. - Rework assymetric topology/capacity detection & handling. - Scheduler statistics & tooling: - Disable delayacct by default, but add a sysctl to enable it at runtime if tooling needs it. Use static keys and other optimizations to make it more palatable. - Use sched_clock() in delayacct, instead of ktime_get_ns(). - Misc cleanups and fixes. * tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/doc: Update the CPU capacity asymmetry bits sched/topology: Rework CPU capacity asymmetry detection sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag psi: Fix race between psi_trigger_create/destroy sched/fair: Introduce the burstable CFS controller sched/uclamp: Fix uclamp_tg_restrict() sched/rt: Fix Deadline utilization tracking during policy change sched/rt: Fix RT utilization tracking during policy change sched: Change task_struct::state sched,arch: Remove unused TASK_STATE offsets sched,timer: Use __set_current_state() sched: Add get_current_state() sched,perf,kvm: Fix preemption condition sched: Introduce task_is_running() sched: Unbreak wakeups sched/fair: Age the average idle time sched/cpufreq: Consider reduced CPU capacity in energy calculation sched/fair: Take thermal pressure into account while estimating energy thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure sched/fair: Return early from update_tg_cfs_load() if delta == 0 ... |
||
Linus Torvalds
|
a15286c63d |
Locking changes for this cycle:
- Core locking & atomics: - Convert all architectures to ARCH_ATOMIC: move every architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the transitory facilities and #ifdefs. Much reduction in complexity from that series: 63 files changed, 756 insertions(+), 4094 deletions(-) - Self-test enhancements - Futexes: - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC). [ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of FLAGS_CLOCKRT & invert the flag's meaning to avoid having to introduce a new variant was resisted successfully. ] - Enhance futex self-tests - Lockdep: - Fix dependency path printouts - Optimize trace saving - Broaden & fix wait-context checks - Misc cleanups and fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmDZaEYRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hPdxAAiNCsxL6X1cZ8zqbWsvLefT9Zqhzgs5u6 gdZele7PNibvbYdON26b5RUzuKfOW/hgyX6LKqr+AiNYTT9PGhcY+tycUr2PGk5R LMyhJWmmX5cUVPU92ky+z5hEHB2gr4XPJcvgpKKUL0XB1tBaSvy2DtgwPuhXOoT1 1sCQfy63t71snt2RfEnibVW6xovwaA2lsqL81lLHJN4iRFWvqO498/m4+PWkylsm ig/+VT1Oz7t4wqu3NhTqNNZv+4K4W2asniyo53Dg2BnRm/NjhJtgg4jRibrb0ssb 67Xdq6y8+xNBmEAKj+Re8VpMcu4aj346Ctk7d4gst2ah/Rc0TvqfH6mezH7oq7RL hmOrMBWtwQfKhEE/fDkng30nrVxc/98YXP0n2rCCa0ySsaF6b6T185mTcYDRDxFs BVNS58ub+zxrF9Zd4nhIHKaEHiL2ZdDimqAicXN0RpywjIzTQ/y11uU7I1WBsKkq WkPYs+FPHnX7aBv1MsuxHhb8sUXjG924K4JeqnjF45jC3sC1crX+N0jv4wHw+89V h4k20s2Tw6m5XGXlgGwMJh0PCcD6X22Vd9Uyw8zb+IJfvNTGR9Rp1Ec+1gMRSll+ xsn6G6Uy9bcNU0SqKlBSfelweGKn4ZxbEPn76Jc8KWLiepuZ6vv5PBoOuaujWht9 KAeOC5XdjMk= =tH// -----END PGP SIGNATURE----- Merge tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Core locking & atomics: - Convert all architectures to ARCH_ATOMIC: move every architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the transitory facilities and #ifdefs. Much reduction in complexity from that series: 63 files changed, 756 insertions(+), 4094 deletions(-) - Self-test enhancements - Futexes: - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC). [ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of FLAGS_CLOCKRT & invert the flag's meaning to avoid having to introduce a new variant was resisted successfully. ] - Enhance futex self-tests - Lockdep: - Fix dependency path printouts - Optimize trace saving - Broaden & fix wait-context checks - Misc cleanups and fixes. * tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) locking/lockdep: Correct the description error for check_redundant() futex: Provide FUTEX_LOCK_PI2 to support clock selection futex: Prepare futex_lock_pi() for runtime clock selection lockdep/selftest: Remove wait-type RCU_CALLBACK tests lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING lockdep: Fix wait-type for empty stack locking/selftests: Add a selftest for check_irq_usage() lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage() locking/lockdep: Remove the unnecessary trace saving locking/lockdep: Fix the dep path printing for backwards BFS selftests: futex: Add futex compare requeue test selftests: futex: Add futex wait test seqlock: Remove trailing semicolon in macros locking/lockdep: Reduce LOCKDEP dependency list locking/lockdep,doc: Improve readability of the block matrix locking/atomics: atomic-instrumented: simplify ifdeffery locking/atomic: delete !ARCH_ATOMIC remnants locking/atomic: xtensa: move to ARCH_ATOMIC locking/atomic: sparc: move to ARCH_ATOMIC locking/atomic: sh: move to ARCH_ATOMIC ... |
||
Peter Zijlstra
|
7c3edd6d9c |
sched,arch: Remove unused TASK_STATE offsets
All 6 architectures define TASK_STATE in asm-offsets, but then never actually use it. Remove the definitions to make sure they never will. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210611082838.472811363@infradead.org |
||
Jan Kara
|
65ffb3d69e |
quota: Wire up quotactl_fd syscall
Wire up the quotactl_fd syscall. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> |
||
Colin Ian King
|
14a832498c |
arch: microblaze: Fix spelling mistake "vesion" -> "version"
There is a spelling mistake in the comment. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20210601103707.9701-1-colin.king@canonical.com Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Guo Ren
|
695efefb2e |
microblaze: Cleanup unused functions
These functions haven't been used, so just remove them. The patch just uses grep to verify. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Michal Simek <monstr@monstr.eu> Link: https://lore.kernel.org/r/1622350408-44875-3-git-send-email-guoren@kernel.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Masahiro Yamada
|
d92cc4d516 |
kbuild: require all architectures to have arch/$(SRCARCH)/Kbuild
arch/$(SRCARCH)/Kbuild is useful for Makefile cleanups because you can use the obj-y syntax. Add an empty file if it is missing in arch/$(SRCARCH)/. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Mark Rutland
|
3c1885187b |
locking/atomic: delete !ARCH_ATOMIC remnants
Now that all architectures implement ARCH_ATOMIC, we can make it mandatory, removing the Kconfig symbol and logic for !ARCH_ATOMIC. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210525140232.53872-33-mark.rutland@arm.com |
||
Mark Rutland
|
f5b1c0f951 |
locking/atomic: microblaze: move to ARCH_ATOMIC
We'd like all architectures to convert to ARCH_ATOMIC, as once all architectures are converted it will be possible to make significant cleanups to the atomics headers, and this will make it much easier to generically enable atomic functionality (e.g. debug logic in the instrumented wrappers). As a step towards that, this patch migrates microblaze to ARCH_ATOMIC, using the asm-generic implementations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210525140232.53872-22-mark.rutland@arm.com |
||
Mark Rutland
|
b68622a86c |
locking/atomic: microblaze: use asm-generic exclusively
Microblaze provides its own implementation of atomic_dec_if_positive(), but nothing else. For a while now, the conditional inc/dec ops have been optional, and the core code will provide generic implementations using the code templates in scripts/atomic/fallbacks/. For simplicity, and for consistency with the other conditional atomic ops, let's drop the microblaze implementation of atomic_dec_if_positive(), and use the generic implementation. With that, we can also drop the local asm/atomic.h and asm/cmpxchg.h headers, as asm-generic/atomic.h is mandatory-y, and we can pull in asm-generic/cmpxchg.h via generic-y. This matches what nios2 and nds32 do today. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210525140232.53872-5-mark.rutland@arm.com |
||
Jan Kara
|
5b9fedb31e |
quota: Disable quotactl_path syscall
In commit
|
||
Arnd Bergmann
|
637be9183e |
asm-generic: use asm-generic/unaligned.h for most architectures
There are several architectures that just duplicate the contents of asm-generic/unaligned.h, so change those over to use the file directly, to make future modifications easier. The exceptions are: - arm32 sets HAVE_EFFICIENT_UNALIGNED_ACCESS, but wants the unaligned-struct version - ppc64le disables HAVE_EFFICIENT_UNALIGNED_ACCESS but includes the access-ok version - most m68k also uses the access-ok version without setting HAVE_EFFICIENT_UNALIGNED_ACCESS. - sh4a has a custom inline asm version - openrisc is the only one using the memmove version that generally leads to worse code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> |
||
Linus Torvalds
|
9b1f61d5d7 |
tracing updates for 5.13
New feature: The "func-no-repeats" option in tracefs/options directory. When set the function tracer will detect if the current function being traced is the same as the previous one, and instead of recording it, it will keep track of the number of times that the function is repeated in a row. And when another function is recorded, it will write a new event that shows the function that repeated, the number of times it repeated and the time stamp of when the last repeated function occurred. Enhancements: In order to implement the above "func-no-repeats" option, the ring buffer timestamp can now give the accurate timestamp of the event as it is being recorded, instead of having to record an absolute timestamp for all events. This helps the histogram code which no longer needs to waste ring buffer space. New validation logic to make sure all trace events that access dereferenced pointers do so in a safe way, and will warn otherwise. Fixes: No longer limit the PIDs of tasks that are recorded for "saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows for a much larger range. This caused the mapping of PIDs to the task names to be dropped for all tasks with a PID greater than 32768. Change trace_clock_global() to never block. This caused a deadlock. Clean ups: Typos, prototype fixes, and removing of duplicate or unused code. Better management of ftrace_page allocations. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYI/1vBQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qiL0AP9EemIC5TDh2oihqLRNeUjdTu0ryEoM HRFqxozSF985twD/bfkt86KQC8rLHwxTbxQZ863bmdaC6cMGFhWiF+H/MAs= =psYt -----END PGP SIGNATURE----- Merge tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "New feature: - A new "func-no-repeats" option in tracefs/options directory. When set the function tracer will detect if the current function being traced is the same as the previous one, and instead of recording it, it will keep track of the number of times that the function is repeated in a row. And when another function is recorded, it will write a new event that shows the function that repeated, the number of times it repeated and the time stamp of when the last repeated function occurred. Enhancements: - In order to implement the above "func-no-repeats" option, the ring buffer timestamp can now give the accurate timestamp of the event as it is being recorded, instead of having to record an absolute timestamp for all events. This helps the histogram code which no longer needs to waste ring buffer space. - New validation logic to make sure all trace events that access dereferenced pointers do so in a safe way, and will warn otherwise. Fixes: - No longer limit the PIDs of tasks that are recorded for "saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows for a much larger range. This caused the mapping of PIDs to the task names to be dropped for all tasks with a PID greater than 32768. - Change trace_clock_global() to never block. This caused a deadlock. Clean ups: - Typos, prototype fixes, and removing of duplicate or unused code. - Better management of ftrace_page allocations" * tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (32 commits) tracing: Restructure trace_clock_global() to never block tracing: Map all PIDs to command lines ftrace: Reuse the output of the function tracer for func_repeats tracing: Add "func_no_repeats" option for function tracing tracing: Unify the logic for function tracing options tracing: Add method for recording "func_repeats" events tracing: Add "last_func_repeats" to struct trace_array tracing: Define new ftrace event "func_repeats" tracing: Define static void trace_print_time() ftrace: Simplify the calculation of page number for ftrace_page->records some more ftrace: Store the order of pages allocated in ftrace_page tracing: Remove unused argument from "ring_buffer_time_stamp() tracing: Remove duplicate struct declaration in trace_events.h tracing: Update create_system_filter() kernel-doc comment tracing: A minor cleanup for create_system_filter() kernel: trace: Mundane typo fixes in the file trace_events_filter.c tracing: Fix various typos in comments scripts/recordmcount.pl: Make vim and emacs indent the same scripts/recordmcount.pl: Make indent spacing consistent tracing: Add a verifier to check string pointers for trace events ... |
||
Linus Torvalds
|
17ae69aba8 |
Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com>
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAmCInP4ACgkQrZhLv9lQ BTza0g//dTeb9woC9H7qlEhK4l9yk62lTss60Q8X7m7ZSNfdL4tiEbi64SgK+iOW OOegbrOEb8Kzh4KJJYmVlVZ5YUWyH4szgmee1wnylBdsWiWaPLPF3Cflz77apy6T TiiBsJd7rRE29FKheaMt34B41BMh8QHESN+DzjzJWsFoi/uNxjgSs2W16XuSupKu bpRmB1pYNXMlrkzz7taL05jndZYE5arVriqlxgAsuLOFOp/ER7zecrjImdCM/4kL W6ej0R1fz2Geh6CsLBJVE+bKWSQ82q5a4xZEkSYuQHXgZV5eywE5UKu8ssQcRgQA VmGUY5k73rfY9Ofupf2gCaf/JSJNXKO/8Xjg0zAdklKtmgFjtna5Tyg9I90j7zn+ 5swSpKuRpilN8MQH+6GWAnfqQlNoviTOpFeq3LwBtNVVOh08cOg6lko/bmebBC+R TeQPACKS0Q0gCDPm9RYoU1pMUuYgfOwVfVRZK1prgi2Co7ZBUMOvYbNoKYoPIydr ENBYljlU1OYwbzgR2nE+24fvhU8xdNOVG1xXYPAEHShu+p7dLIWRLhl8UCtRQpSR 1ofeVaJjgjrp29O+1OIQjB2kwCaRdfv/Gq1mztE/VlMU/r++E62OEzcH0aS+mnrg yzfyUdI8IFv1q6FGT9yNSifWUWxQPmOKuC8kXsKYfqfJsFwKmHM= =uCN4 -----END PGP SIGNATURE----- Merge tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull Landlock LSM from James Morris: "Add Landlock, a new LSM from Mickaël Salaün. Briefly, Landlock provides for unprivileged application sandboxing. From Mickaël's cover letter: "The goal of Landlock is to enable to restrict ambient rights (e.g. global filesystem access) for a set of processes. Because Landlock is a stackable LSM [1], it makes possible to create safe security sandboxes as new security layers in addition to the existing system-wide access-controls. This kind of sandbox is expected to help mitigate the security impact of bugs or unexpected/malicious behaviors in user-space applications. Landlock empowers any process, including unprivileged ones, to securely restrict themselves. Landlock is inspired by seccomp-bpf but instead of filtering syscalls and their raw arguments, a Landlock rule can restrict the use of kernel objects like file hierarchies, according to the kernel semantic. Landlock also takes inspiration from other OS sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil. In this current form, Landlock misses some access-control features. This enables to minimize this patch series and ease review. This series still addresses multiple use cases, especially with the combined use of seccomp-bpf: applications with built-in sandboxing, init systems, security sandbox tools and security-oriented APIs [2]" The cover letter and v34 posting is here: https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/ See also: https://landlock.io/ This code has had extensive design discussion and review over several years" Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1] Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2] * tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: landlock: Enable user space to infer supported features landlock: Add user and kernel documentation samples/landlock: Add a sandbox manager example selftests/landlock: Add user space tests landlock: Add syscall implementations arch: Wire up Landlock syscalls fs,security: Add sb_delete hook landlock: Support filesystem access-control LSM: Infrastructure management of the superblock landlock: Add ptrace restrictions landlock: Set up the security framework and manage credentials landlock: Add ruleset and domain management landlock: Add object management |
||
Kefeng Wang
|
1f9d03c5e9 |
mm: move mem_init_print_info() into mm_init()
mem_init_print_info() is called in mem_init() on each architecture, and pass NULL argument, so using void argument and move it into mm_init(). Link: https://lkml.kernel.org/r/20210317015210.33641-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> [x86] Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> [powerpc] Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Anatoly Pugachev <matorola@gmail.com> [sparc64] Acked-by: Russell King <rmk+kernel@armlinux.org.uk> [arm] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Guo Ren <guoren@kernel.org> Cc: Yoshinori Sato <ysato@users.osdn.me> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
d0cc7ecacb |
Microblaze patches for 5.13-rc1
- Switch to generic syscall scripts - Some small fixes -----BEGIN PGP SIGNATURE----- iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCYIpg9AAKCRDKSWXLKUoM IXMaAKCMAvlTOl/sq+44yyqmZafzeka1kQCeOJ64km/oN86kZlhs72oXV3Nevmg= =VvZO -----END PGP SIGNATURE----- Merge tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblaze Pull Microblaze updates from Michal Simek: "No new features, just about cleaning up some code and moving to generic syscall solution used by other architectures: - Switch to generic syscall scripts - Some small fixes" * tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: add 'fallthrough' to memcpy/memset/memmove microblaze: Fix a typo microblaze: tag highmem_setup() with __meminit microblaze: syscalls: switch to generic syscallhdr.sh microblaze: syscalls: switch to generic syscalltbl.sh |
||
Linus Torvalds
|
767fcbc80f |
\n
-----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmCJU1UACgkQnJ2qBz9k QNk62AgAgp05OIXU/AgObb7DvSyI3ycwCV8PeWBpwD8yoDAh5x0tmT7vnJu974p6 yHdnF7rr69ZzvbNCHLJ5kRykRlUao9W7cO5fdOW1uTpL7Ic60QuJMks/NfgVTHp1 2zIQmBDerfn1/LTK8r2pPGcvtcjRcr7Ep4beN0Duw57lfVMJhjsNRPnBbXGBcp0r QzKk4/8V3DCZvOw+XNC3nto7avjvf+nU9sJmuh83546eqh0atjWivvO5aAlDOe6W rhBiLlmP0in5u2n1fYqzI1OQvtgtleyEZT2G0CrbAZn0xjmV/if9wl+3K6TOwDvR 778xDEX7sZCaO/xkB+WK3hrd15ftKg== =0kYE -----END PGP SIGNATURE----- Merge tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, ext2, reiserfs updates from Jan Kara: - support for path (instead of device) based quotactl syscall (quotactl_path(2)) - ext2 conversion to kmap_local() - other minor cleanups & fixes * tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs/reiserfs/journal.c: delete useless variables fs/ext2: Replace kmap() with kmap_local_page() ext2: Match up ext2_put_page() with ext2_dotdot() and ext2_find_entry() fs/ext2/: fix misspellings using codespell tool quota: report warning limits for realtime space quotas quota: wire up quotactl_path quota: Add mountpath based quota support |
||
Mickaël Salaün
|
a49f4f81cb |
arch: Wire up Landlock syscalls
Wire up the following system calls for all architectures: * landlock_create_ruleset(2) * landlock_add_rule(2) * landlock_restrict_self(2) Cc: Arnd Bergmann <arnd@arndb.de> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> Link: https://lore.kernel.org/r/20210422154123.13086-10-mic@digikod.net Signed-off-by: James Morris <jamorris@linux.microsoft.com> |
||
Randy Dunlap
|
47de4477a8 |
microblaze: add 'fallthrough' to memcpy/memset/memmove
Fix "fallthrough" warnings in microblaze memcpy/memset/memmove library functions. CC arch/microblaze/lib/memcpy.o ../arch/microblaze/lib/memcpy.c: In function 'memcpy': ../arch/microblaze/lib/memcpy.c:70:4: warning: this statement may fall through [-Wimplicit-fallthrough=] 70 | --c; ../arch/microblaze/lib/memcpy.c:71:3: note: here 71 | case 2: ../arch/microblaze/lib/memcpy.c:73:4: warning: this statement may fall through [-Wimplicit-fallthrough=] 73 | --c; ../arch/microblaze/lib/memcpy.c:74:3: note: here 74 | case 3: ../arch/microblaze/lib/memcpy.c:178:10: warning: this statement may fall through [-Wimplicit-fallthrough=] 178 | *dst++ = *src++; ../arch/microblaze/lib/memcpy.c:179:2: note: here 179 | case 2: ../arch/microblaze/lib/memcpy.c:180:10: warning: this statement may fall through [-Wimplicit-fallthrough=] 180 | *dst++ = *src++; ../arch/microblaze/lib/memcpy.c:181:2: note: here 181 | case 1: CC arch/microblaze/lib/memset.o ../arch/microblaze/lib/memset.c: In function 'memset': ../arch/microblaze/lib/memset.c:71:4: warning: this statement may fall through [-Wimplicit-fallthrough=] 71 | --n; ../arch/microblaze/lib/memset.c:72:3: note: here 72 | case 2: ../arch/microblaze/lib/memset.c:74:4: warning: this statement may fall through [-Wimplicit-fallthrough=] 74 | --n; ../arch/microblaze/lib/memset.c:75:3: note: here 75 | case 3: CC arch/microblaze/lib/memmove.o ../arch/microblaze/lib/memmove.c: In function 'memmove': ../arch/microblaze/lib/memmove.c:92:4: warning: this statement may fall through [-Wimplicit-fallthrough=] 92 | --c; ../arch/microblaze/lib/memmove.c:93:3: note: here 93 | case 2: ../arch/microblaze/lib/memmove.c:95:4: warning: this statement may fall through [-Wimplicit-fallthrough=] 95 | --c; ../arch/microblaze/lib/memmove.c:96:3: note: here 96 | case 1: ../arch/microblaze/lib/memmove.c:203:10: warning: this statement may fall through [-Wimplicit-fallthrough=] 203 | *--dst = *--src; ../arch/microblaze/lib/memmove.c:204:2: note: here 204 | case 3: ../arch/microblaze/lib/memmove.c:205:10: warning: this statement may fall through [-Wimplicit-fallthrough=] 205 | *--dst = *--src; ../arch/microblaze/lib/memmove.c:206:2: note: here 206 | case 2: ../arch/microblaze/lib/memmove.c:207:10: warning: this statement may fall through [-Wimplicit-fallthrough=] 207 | *--dst = *--src; ../arch/microblaze/lib/memmove.c:208:2: note: here 208 | case 1: Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Michal Simek <monstr@monstr.eu> Link: https://lore.kernel.org/r/20210421022041.10689-1-rdunlap@infradead.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Ingo Molnar
|
f2cc020d78 |
tracing: Fix various typos in comments
Fix ~59 single-word typos in the tracing code comments, and fix the grammar in a handful of places. Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
||
Michal Simek
|
2907f851f6 |
xsysace: Remove SYSACE driver
Sysace IP is no longer used on Xilinx PowerPC 405/440 and Microblaze systems. The driver is not regularly tested and very likely not working for quite a long time that's why remove it. Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Bhaskar Chowdhury
|
bbcee72c2f |
microblaze: Fix a typo
s/storign/storing/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Link: https://lore.kernel.org/r/20210319045323.21241-1-unixbhaskar@gmail.com Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Sascha Hauer
|
fa8b90070a |
quota: wire up quotactl_path
Wire up the quotactl_path syscall added in the previous patch. Link: https://lore.kernel.org/r/20210304123541.30749-3-s.hauer@pengutronix.de Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> |
||
David Hildenbrand
|
9dc12e4ccd |
microblaze: tag highmem_setup() with __meminit
With commit |
||
Masahiro Yamada
|
64f416c869 |
microblaze: syscalls: switch to generic syscallhdr.sh
Many architectures duplicate similar shell scripts. This commit converts microblaze to use scripts/syscallhdr.sh. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20210301142303.343727-2-masahiroy@kernel.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Masahiro Yamada
|
ce372128a7 |
microblaze: syscalls: switch to generic syscalltbl.sh
Many architectures duplicate similar shell scripts. This commit converts microblaze to use scripts/syscalltbl.sh. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20210301142303.343727-1-masahiroy@kernel.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Linus Torvalds
|
5695e51619 |
io_uring-worker.v3-2021-02-25
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmA4JRkQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpoWqD/9dbbqe8L701U6May1A/4hRsqL4THTA2flx vNCNRBl6XV3l/wBCtL6waKy6tyO4lyM8XdUdEvo3Kxl2kGPb8eVfpyYL/+77HqyH ctT4RMrs+84Mxn+5N6cM97hS1qVI2moTxxyvOEl/JTB7BYrutz9gvAoeY3/Dto47 J66oSaPeuqJ32TyihxfQHVxQopJcqFzDjyoYHGDu6ATio1PXfaIdTu8ywVYSECAh pWI4rwnqdurGuHMNpxyL1bA6CT/jC7s+sqU7bUYUCgtYI3eG0u3V0bp5gAQQIgl9 5sxxE3DidYGAkYZsosrelshBtzGddLdz4Qrt2ungMYv8RsGNpFQ095jDPKDwFaZj bSvSsfplCo7iFsJByb1TtpNEOW8eAwi81PmBDVQ9Oq5P5ygTYno9GBDc/20ql0Fk q6wcX28coE3IBw44ne0hIwvBOtXV4WJyluG/gqOxfbTH+kOy3pDsN8lWcY/P4X0U yzdU2MLHe8BNMyYlUiBF47Amzt4ltr85P4XD3WZ4bX71iwri6HvrdGWLuuKwX+Ie 66QiIDDQIYZQ6NMMJWS9DGW3y3DBizpSXGxONbOw1J2bQdNmtToR0D2UnK/9UnKp msnvkUNk8fkYGS4aptpJ6HxbmjMEG5YtbiGlPj6fz5/7MTvhRjPxt7A0LWrUIdqR f88+sHUMqg== =oc8u -----END PGP SIGNATURE----- Merge tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block Pull io_uring thread rewrite from Jens Axboe: "This converts the io-wq workers to be forked off the tasks in question instead of being kernel threads that assume various bits of the original task identity. This kills > 400 lines of code from io_uring/io-wq, and it's the worst part of the code. We've had several bugs in this area, and the worry is always that we could be missing some pieces for file types doing unusual things (recent /dev/tty example comes to mind, userfaultfd reads installing file descriptors is another fun one... - both of which need special handling, and I bet it's not the last weird oddity we'll find). With these identical workers, we can have full confidence that we're never missing anything. That, in itself, is a huge win. Outside of that, it's also more efficient since we're not wasting space and code on tracking state, or switching between different states. I'm sure we're going to find little things to patch up after this series, but testing has been pretty thorough, from the usual regression suite to production. Any issue that may crop up should be manageable. There's also a nice series of further reductions we can do on top of this, but I wanted to get the meat of it out sooner rather than later. The general worry here isn't that it's fundamentally broken. Most of the little issues we've found over the last week have been related to just changes in how thread startup/exit is done, since that's the main difference between using kthreads and these kinds of threads. In fact, if all goes according to plan, I want to get this into the 5.10 and 5.11 stable branches as well. That said, the changes outside of io_uring/io-wq are: - arch setup, simple one-liner to each arch copy_thread() implementation. - Removal of net and proc restrictions for io_uring, they are no longer needed or useful" * tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block: (30 commits) io-wq: remove now unused IO_WQ_BIT_ERROR io_uring: fix SQPOLL thread handling over exec io-wq: improve manager/worker handling over exec io_uring: ensure SQPOLL startup is triggered before error shutdown io-wq: make buffered file write hashed work map per-ctx io-wq: fix race around io_worker grabbing io-wq: fix races around manager/worker creation and task exit io_uring: ensure io-wq context is always destroyed for tasks arch: ensure parisc/powerpc handle PF_IO_WORKER in copy_thread() io_uring: cleanup ->user usage io-wq: remove nr_process accounting io_uring: flag new native workers with IORING_FEAT_NATIVE_WORKERS net: remove cmsg restriction from io_uring based send/recvmsg calls Revert "proc: don't allow async path resolution of /proc/self components" Revert "proc: don't allow async path resolution of /proc/thread-self components" io_uring: move SQPOLL thread io-wq forked worker io-wq: make io_wq_fork_thread() available to other users io-wq: only remove worker from free_list, if it was there io_uring: remove io_identity io_uring: remove any grabbing of context ... |
||
Linus Torvalds
|
6fbd6cf85a |
Kbuild updates for v5.12
- Fix false-positive build warnings for ARCH=ia64 builds - Optimize dictionary size for module compression with xz - Check the compiler and linker versions in Kconfig - Fix misuse of extra-y - Support DWARF v5 debug info - Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x exceeded the limit - Add generic syscall{tbl,hdr}.sh for cleanups across arches - Minor cleanups of genksyms - Minor cleanups of Kconfig -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmA3zhgVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsG0C4P/A5hUNFdkYI+EffAWZiHn69t0S8j M1GQkZildKu/yOfm6hp3mNwgHmYgw0aAuch1htkJuv+5rXRtoK77yw0xKbUqNHyO VqkJWQPVUXJbWIDiu332NaETHbFTWCnPZKGmzcbVOBHbYsXUJPp17gROQ9ke0fQN Ae6OV5WINhoS8UnjESWb3qOO87MdQTZ+9mP+NMnVh4kV1SUeMAXLFwFll66KZTkj GXB330N3p9L0wQVljhXpQ/YPOd76wJNPhJWJ9+hKLFbWsedovzlHb+duprh1z1xe 7LLaq9dEbXxe1Uz0qmK76lupXxilYMyUupTW9HIYtIsY8br8DIoBOG0bn46LVnuL /m+UQNfUFCYYePT7iZQNNc1DISQJrxme3bjq0PJzZTDukNnHJVahnj9x4RoNaF8j Dc+JME0r2i8Ccp28vgmaRgzvSsb8Xtw5icwRdwzIpyt1ubs/+tkd/GSaGzQo30Q8 m8y1WOjovHNX7OGnOaOWBGoQAX/2k/VHeAediMsPqWUoOxwsLHYxG/4KtgwbJ5vc gu/Fyk1GRDklZPpLdYFVvz8TGnqSDogJgF+7WolJ6YvPGAUIDAfd5Ky2sWayddlm wchc3sKDVyh3lov23h0WQVTvLO9xl+NZ6THxoAGdYeQ0DUu5OxwH8qje/UpWuo1a DchhNN+g5pa6n56Z =sLxb -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Fix false-positive build warnings for ARCH=ia64 builds - Optimize dictionary size for module compression with xz - Check the compiler and linker versions in Kconfig - Fix misuse of extra-y - Support DWARF v5 debug info - Clamp SUBLEVEL to 255 because stable releases 4.4.x and 4.9.x exceeded the limit - Add generic syscall{tbl,hdr}.sh for cleanups across arches - Minor cleanups of genksyms - Minor cleanups of Kconfig * tag 'kbuild-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (38 commits) initramfs: Remove redundant dependency of RD_ZSTD on BLK_DEV_INITRD kbuild: remove deprecated 'always' and 'hostprogs-y/m' kbuild: parse C= and M= before changing the working directory kbuild: reuse this-makefile to define abs_srctree kconfig: unify rule of config, menuconfig, nconfig, gconfig, xconfig kconfig: omit --oldaskconfig option for 'make config' kconfig: fix 'invalid option' for help option kconfig: remove dead code in conf_askvalue() kconfig: clean up nested if-conditionals in check_conf() kconfig: Remove duplicate call to sym_get_string_value() Makefile: Remove # characters from compiler string Makefile: reuse CC_VERSION_TEXT kbuild: check the minimum linker version in Kconfig kbuild: remove ld-version macro scripts: add generic syscallhdr.sh scripts: add generic syscalltbl.sh arch: syscalls: remove $(srctree)/ prefix from syscall tables arch: syscalls: add missing FORCE and fix 'targets' to make if_changed work gen_compile_commands: prune some directories kbuild: simplify access to the kernel's version ... |
||
Linus Torvalds
|
7d6beb71da |
idmapped-mounts-v5.12
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc
ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f
4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg=
=yPaw
-----END PGP SIGNATURE-----
Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull idmapped mounts from Christian Brauner:
"This introduces idmapped mounts which has been in the making for some
time. Simply put, different mounts can expose the same file or
directory with different ownership. This initial implementation comes
with ports for fat, ext4 and with Christoph's port for xfs with more
filesystems being actively worked on by independent people and
maintainers.
Idmapping mounts handle a wide range of long standing use-cases. Here
are just a few:
- Idmapped mounts make it possible to easily share files between
multiple users or multiple machines especially in complex
scenarios. For example, idmapped mounts will be used in the
implementation of portable home directories in
systemd-homed.service(8) where they allow users to move their home
directory to an external storage device and use it on multiple
computers where they are assigned different uids and gids. This
effectively makes it possible to assign random uids and gids at
login time.
- It is possible to share files from the host with unprivileged
containers without having to change ownership permanently through
chown(2).
- It is possible to idmap a container's rootfs and without having to
mangle every file. For example, Chromebooks use it to share the
user's Download folder with their unprivileged containers in their
Linux subsystem.
- It is possible to share files between containers with
non-overlapping idmappings.
- Filesystem that lack a proper concept of ownership such as fat can
use idmapped mounts to implement discretionary access (DAC)
permission checking.
- They allow users to efficiently changing ownership on a per-mount
basis without having to (recursively) chown(2) all files. In
contrast to chown (2) changing ownership of large sets of files is
instantenous with idmapped mounts. This is especially useful when
ownership of a whole root filesystem of a virtual machine or
container is changed. With idmapped mounts a single syscall
mount_setattr syscall will be sufficient to change the ownership of
all files.
- Idmapped mounts always take the current ownership into account as
idmappings specify what a given uid or gid is supposed to be mapped
to. This contrasts with the chown(2) syscall which cannot by itself
take the current ownership of the files it changes into account. It
simply changes the ownership to the specified uid and gid. This is
especially problematic when recursively chown(2)ing a large set of
files which is commong with the aforementioned portable home
directory and container and vm scenario.
- Idmapped mounts allow to change ownership locally, restricting it
to specific mounts, and temporarily as the ownership changes only
apply as long as the mount exists.
Several userspace projects have either already put up patches and
pull-requests for this feature or will do so should you decide to pull
this:
- systemd: In a wide variety of scenarios but especially right away
in their implementation of portable home directories.
https://systemd.io/HOME_DIRECTORY/
- container runtimes: containerd, runC, LXD:To share data between
host and unprivileged containers, unprivileged and privileged
containers, etc. The pull request for idmapped mounts support in
containerd, the default Kubernetes runtime is already up for quite
a while now: https://github.com/containerd/containerd/pull/4734
- The virtio-fs developers and several users have expressed interest
in using this feature with virtual machines once virtio-fs is
ported.
- ChromeOS: Sharing host-directories with unprivileged containers.
I've tightly synced with all those projects and all of those listed
here have also expressed their need/desire for this feature on the
mailing list. For more info on how people use this there's a bunch of
talks about this too. Here's just two recent ones:
https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
https://fosdem.org/2021/schedule/event/containers_idmap/
This comes with an extensive xfstests suite covering both ext4 and
xfs:
https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts
It covers truncation, creation, opening, xattrs, vfscaps, setid
execution, setgid inheritance and more both with idmapped and
non-idmapped mounts. It already helped to discover an unrelated xfs
setgid inheritance bug which has since been fixed in mainline. It will
be sent for inclusion with the xfstests project should you decide to
merge this.
In order to support per-mount idmappings vfsmounts are marked with
user namespaces. The idmapping of the user namespace will be used to
map the ids of vfs objects when they are accessed through that mount.
By default all vfsmounts are marked with the initial user namespace.
The initial user namespace is used to indicate that a mount is not
idmapped. All operations behave as before and this is verified in the
testsuite.
Based on prior discussions we want to attach the whole user namespace
and not just a dedicated idmapping struct. This allows us to reuse all
the helpers that already exist for dealing with idmappings instead of
introducing a whole new range of helpers. In addition, if we decide in
the future that we are confident enough to enable unprivileged users
to setup idmapped mounts the permission checking can take into account
whether the caller is privileged in the user namespace the mount is
currently marked with.
The user namespace the mount will be marked with can be specified by
passing a file descriptor refering to the user namespace as an
argument to the new mount_setattr() syscall together with the new
MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
of extensibility.
The following conditions must be met in order to create an idmapped
mount:
- The caller must currently have the CAP_SYS_ADMIN capability in the
user namespace the underlying filesystem has been mounted in.
- The underlying filesystem must support idmapped mounts.
- The mount must not already be idmapped. This also implies that the
idmapping of a mount cannot be altered once it has been idmapped.
- The mount must be a detached/anonymous mount, i.e. it must have
been created by calling open_tree() with the OPEN_TREE_CLONE flag
and it must not already have been visible in the filesystem.
The last two points guarantee easier semantics for userspace and the
kernel and make the implementation significantly simpler.
By default vfsmounts are marked with the initial user namespace and no
behavioral or performance changes are observed.
The manpage with a detailed description can be found here:
|
||
Linus Torvalds
|
74268693e0 |
Microblaze patches for 5.12-rc1
- Fix DTB alignment - Remove < GCC 4 support - Remove TRACING_SUPPORT selection -----BEGIN PGP SIGNATURE----- iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCYDUOpQAKCRDKSWXLKUoM IXV5AJwLYpmRfG1q4n4Vz4t8Zxonajws4wCfRwxSKSGIYUh5oagjpVohEfptq1M= =TtkZ -----END PGP SIGNATURE----- Merge tag 'microblaze-v5.12' of git://git.monstr.eu/linux-2.6-microblaze Pull microblaze updates from Michal Simek: - Fix DTB alignment - Remove code for very old GCC versions - Remove TRACING_SUPPORT selection * tag 'microblaze-v5.12' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Fix built-in DTB alignment to be 8-byte aligned microblaze: Remove support for gcc < 4 microblaze: do not select TRACING_SUPPORT directly |
||
Jens Axboe
|
4727dc20e0 |
arch: setup PF_IO_WORKER threads like PF_KTHREAD
PF_IO_WORKER are kernel threads too, but they aren't PF_KTHREAD in the sense that we don't assign ->set_child_tid with our own structure. Just ensure that every arch sets up the PF_IO_WORKER threads like kthreads in the arch implementation of copy_thread(). Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Masahiro Yamada
|
29c5c3ac63 |
arch: syscalls: remove $(srctree)/ prefix from syscall tables
The 'syscall' variables are not directly used in the commands. Remove the $(srctree)/ prefix because we can rely on VPATH. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Masahiro Yamada
|
865fa29f7d |
arch: syscalls: add missing FORCE and fix 'targets' to make if_changed work
The rules in these Makefiles cannot detect the command line change because the prerequisite 'FORCE' is missing. Adding 'FORCE' will result in the headers being rebuilt every time because the 'targets' additions are also wrong; the file paths in 'targets' must be relative to the current Makefile. Fix all of them so the if_changed rules work correctly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> |
||
Rob Herring
|
48783be427 |
microblaze: Fix built-in DTB alignment to be 8-byte aligned
Commit |
||
Geert Uytterhoeven
|
b68c8736a0 |
microblaze: Remove support for gcc < 4
Since commit
|
||
Christian Brauner
|
2a1867219c
|
fs: add mount_setattr()
This implements the missing mount_setattr() syscall. While the new mount
api allows to change the properties of a superblock there is currently
no way to change the properties of a mount or a mount tree using file
descriptors which the new mount api is based on. In addition the old
mount api has the restriction that mount options cannot be applied
recursively. This hasn't changed since changing mount options on a
per-mount basis was implemented in [1] and has been a frequent request
not just for convenience but also for security reasons. The legacy
mount syscall is unable to accommodate this behavior without introducing
a whole new set of flags because MS_REC | MS_REMOUNT | MS_BIND |
MS_RDONLY | MS_NOEXEC | [...] only apply the mount option to the topmost
mount. Changing MS_REC to apply to the whole mount tree would mean
introducing a significant uapi change and would likely cause significant
regressions.
The new mount_setattr() syscall allows to recursively clear and set
mount options in one shot. Multiple calls to change mount options
requesting the same changes are idempotent:
int mount_setattr(int dfd, const char *path, unsigned flags,
struct mount_attr *uattr, size_t usize);
Flags to modify path resolution behavior are specified in the @flags
argument. Currently, AT_EMPTY_PATH, AT_RECURSIVE, AT_SYMLINK_NOFOLLOW,
and AT_NO_AUTOMOUNT are supported. If useful, additional lookup flags to
restrict path resolution as introduced with openat2() might be supported
in the future.
The mount_setattr() syscall can be expected to grow over time and is
designed with extensibility in mind. It follows the extensible syscall
pattern we have used with other syscalls such as openat2(), clone3(),
sched_{set,get}attr(), and others.
The set of mount options is passed in the uapi struct mount_attr which
currently has the following layout:
struct mount_attr {
__u64 attr_set;
__u64 attr_clr;
__u64 propagation;
__u64 userns_fd;
};
The @attr_set and @attr_clr members are used to clear and set mount
options. This way a user can e.g. request that a set of flags is to be
raised such as turning mounts readonly by raising MOUNT_ATTR_RDONLY in
@attr_set while at the same time requesting that another set of flags is
to be lowered such as removing noexec from a mount tree by specifying
MOUNT_ATTR_NOEXEC in @attr_clr.
Note, since the MOUNT_ATTR_<atime> values are an enum starting from 0,
not a bitmap, users wanting to transition to a different atime setting
cannot simply specify the atime setting in @attr_set, but must also
specify MOUNT_ATTR__ATIME in the @attr_clr field. So we ensure that
MOUNT_ATTR__ATIME can't be partially set in @attr_clr and that @attr_set
can't have any atime bits set if MOUNT_ATTR__ATIME isn't set in
@attr_clr.
The @propagation field lets callers specify the propagation type of a
mount tree. Propagation is a single property that has four different
settings and as such is not really a flag argument but an enum.
Specifically, it would be unclear what setting and clearing propagation
settings in combination would amount to. The legacy mount() syscall thus
forbids the combination of multiple propagation settings too. The goal
is to keep the semantics of mount propagation somewhat simple as they
are overly complex as it is.
The @userns_fd field lets user specify a user namespace whose idmapping
becomes the idmapping of the mount. This is implemented and explained in
detail in the next patch.
[1]: commit
|
||
Viresh Kumar
|
d897a1670b |
arch: microblaze: Remove CONFIG_OPROFILE support
The "oprofile" user-space tools don't use the kernel OPROFILE support any more, and haven't in a long time. User-space has been converted to the perf interfaces. Remove the old oprofile's architecture specific support. Suggested-by: Christoph Hellwig <hch@infradead.org> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Robert Richter <rric@kernel.org> Acked-by: Michal Simek <michal.simek@xilinx.com> Acked-by: William Cohen <wcohen@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Thomas Gleixner <tglx@linutronix.de> |
||
Masahiro Yamada
|
57ddf08642 |
microblaze: do not select TRACING_SUPPORT directly
Microblaze is the only architecture that selects TRACING_SUPPORT. In my understanding, it is computed by kernel/trace/Kconfig: config TRACING_SUPPORT bool depends on TRACE_IRQFLAGS_SUPPORT depends on STACKTRACE_SUPPORT default y Microblaze enables both TRACE_IRQFLAGS_SUPPORT and STACKTRACE_SUPPORT, so there is no change in the resulted configuration. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20201223152947.698744-1-masahiroy@kernel.org Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Randy Dunlap
|
87dbc209ea |
local64.h: make <asm/local64.h> mandatory
Make <asm-generic/local64.h> mandatory in include/asm-generic/Kbuild and remove all arch/*/include/asm/local64.h arch-specific files since they only #include <asm-generic/local64.h>. This fixes build errors on arch/c6x/ and arch/nios2/ for block/blk-iocost.c. Build-tested on 21 of 25 arch-es. (tools problems on the others) Yes, we could even rename <asm-generic/local64.h> to <linux/local64.h> and change all #includes to use <linux/local64.h> instead. Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Willem de Bruijn
|
b0a0c2615f |
epoll: wire up syscall epoll_pwait2
Split off from prev patch in the series that implements the syscall. Link: https://lkml.kernel.org/r/20201121144401.3727659-4-willemdebruijn.kernel@gmail.com Signed-off-by: Willem de Bruijn <willemb@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
66fc6a6254 |
Microblaze patches for 5.11-rc1
- Remove noMMU support - Add support for TIF_NOTIFY_SIGNAL - Small header fix -----BEGIN PGP SIGNATURE----- iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCX9nbKgAKCRDKSWXLKUoM IUT4AKCJ/u4KiVjYp5ar4xPbzwYn/kMb9QCePT9+BUSKZ14NLzhpMr78rdm0Tf0= =+2wp -----END PGP SIGNATURE----- Merge tag 'microblaze-v5.11' of git://git.monstr.eu/linux-2.6-microblaze Pull microblaze updates from Michal Simek: "The biggest change is to remove support for noMMU configuration. FPGAs are bigger so people use Microblaze with MMU for a lot of years and there is likely no user of this code anymore. No one is updating libraries for this configuration either. - Remove noMMU support - Add support for TIF_NOTIFY_SIGNAL - Small header fix" * tag 'microblaze-v5.11' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Remove noMMU code microblaze: add support for TIF_NOTIFY_SIGNAL microblaze: Replace <linux/clk-provider.h> by <linux/of_clk.h> |
||
Linus Torvalds
|
7a932e5702 |
asm-generic: cross-architecture timer cleanup
This cleans up two ancient timer features that were never completed in the past, CONFIG_GENERIC_CLOCKEVENTS and CONFIG_ARCH_USES_GETTIMEOFFSET. There was only one user left for the ARCH_USES_GETTIMEOFFSET variant of clocksource implementations, the ARM EBSA110 platform. Rather than changing to use modern timekeeping, we remove the platform entirely as Russell no longer uses his machine and nobody else seems to have one any more. The conditional code for using arch_gettimeoffset() is removed as a result. For CONFIG_GENERIC_CLOCKEVENTS, there are still a couple of platforms not using clockevent drivers: parisc, ia64, most of m68k, and one Arm platform. These all do timer ticks slighly differently, and this gets cleaned up to the point they at least all call the same helper function. Instead of most platforms using 'select GENERIC_CLOCKEVENTS' in Kconfig, the polarity is now reversed, with the few remaining ones selecting LEGACY_TIMER_TICK instead. Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl/Y1v8ACgkQmmx57+YA GNmCvQ/9EDlgCt92r8SB+LGafDtgB8TUQZeIrs9S2mByzdxwnw0lxObIXFCnhQgh RpG3dR+ONRDnC5eI149B377JOEFMZWe2+BtYHUHkFARtUEWatslQcz7yAGvVRK/l TS/qReb6piKltlzuanF1bMZbjy2OhlaDRcm+OlC3y5mALR33M4emb+rJ6cSdfk3K v1iZhrxtfQT77ztesh/oPkPiyQ6kNcz7SfpyYOb6f5VLlml2BZ7YwBSVyGY7urHk RL3XqOUP4KKlMEAI8w0E2nvft6Fk+luziBhrMYWK0GvbmI1OESENuX/c6tgT2OQ1 DRaVHvcPG/EAY8adOKxxVyHhEJDSoz5GJV/EtjlOegsJk6RomczR1uuiT3Kvm7Ah PktMKv4xQht1E15KPSKbOvNIEP18w2s5z6gw+jVDv8pw42pVEQManm1D+BICqrhl fcpw6T1drf9UxAjwX4+zXtmNs+a+mqiFG8puU4VVgT4GpQ8umHvunXz2WUjZO0jc 3m8ErJHBvtJwW5TOHGyXnjl9SkwPzHOfF6IcXTYWEDU4/gQIK9TwUvCjLc0lE27t FMCV2ds7/K1CXwRgpa5IrefSkb8yOXSbRZ56NqqF7Ekxw4J5bYRSaY7jb+qD/e+3 5O1y+iPxFrpH+16hSahvzrtcdFNbLQvBBuRtEQOYuHLt2UJrNoU= =QpNs -----END PGP SIGNATURE----- Merge tag 'asm-generic-timers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic cross-architecture timer cleanup from Arnd Bergmann: "This cleans up two ancient timer features that were never completed in the past, CONFIG_GENERIC_CLOCKEVENTS and CONFIG_ARCH_USES_GETTIMEOFFSET. There was only one user left for the ARCH_USES_GETTIMEOFFSET variant of clocksource implementations, the ARM EBSA110 platform. Rather than changing to use modern timekeeping, we remove the platform entirely as Russell no longer uses his machine and nobody else seems to have one any more. The conditional code for using arch_gettimeoffset() is removed as a result. For CONFIG_GENERIC_CLOCKEVENTS, there are still a couple of platforms not using clockevent drivers: parisc, ia64, most of m68k, and one Arm platform. These all do timer ticks slighly differently, and this gets cleaned up to the point they at least all call the same helper function. Instead of most platforms using 'select GENERIC_CLOCKEVENTS' in Kconfig, the polarity is now reversed, with the few remaining ones selecting LEGACY_TIMER_TICK instead" * tag 'asm-generic-timers-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: timekeeping: default GENERIC_CLOCKEVENTS to enabled timekeeping: remove xtime_update m68k: remove timer_interrupt() function m68k: change remaining timers to legacy_timer_tick m68k: m68328: use legacy_timer_tick() m68k: sun3/sun3c: use legacy_timer_tick m68k: split heartbeat out of timer function m68k: coldfire: use legacy_timer_tick() parisc: use legacy_timer_tick ARM: rpc: use legacy_timer_tick ia64: convert to legacy_timer_tick timekeeping: add CONFIG_LEGACY_TIMER_TICK timekeeping: remove arch_gettimeoffset net: remove am79c961a driver ARM: remove ebsa110 platform |
||
Linus Torvalds
|
157807123c |
asm-generic: mmu-context cleanup
This is a cleanup series from Nicholas Piggin, preparing for later changes. The asm/mmu_context.h header are generalized and common code moved to asm-gneneric/mmu_context.h. This saves a bit of code and makes it easier to change in the future. Signed-off-by: Arnd Bergmann <arnd@arndb.de> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAl/Y1LsACgkQmmx57+YA GNm6kBAAq4/n6nuNnh6b9LhjXaZRG75gEyW7JvHl8KE5wmZHwDHqbwiQgU1b3lUs JJGbfKqi5ASKxNg6MpfYodmCOqeTUUYG0FUCb6lMhcxxMdfLTLYBvkNd6Y143M+T boi5b/iz+OUQdNPzlVeSsUEVsD59FIXmP/GhscWZN9VAyf/aLV2MDBIOhrDSJlPo ObexnP0Iw1E1NRQYDQ6L2dKTHa6XmHyUtw40ABPmd/6MSd1S+D+j3FGg+CYmvnzG k9g8FbNby8xtUfc0pZV4W/322WN8cDFF9bc04eTDZiAv1bk9lmfvWJ2bWjs3s2qt RO/suiZEOAta/WUX9vVLgYn2td00ef+AyjNUgffiUfvQfl++fiCDFTGl+MoCLjbh xQUPcRuRdED7bMKNrC0CcDOSwWEBWVXvkU/szBLDeE1sPjXzGQ80q1Y72k9y961I mqg7FrHqjZsxT9luXMAzClHNhXAtvehkJZBIdHlFok83EFoTQp48Da4jaDuOOhlq p/lkPJWOHegIQMWtGwRyGmG1qzil7b/QBNAPLgu9pF4TA+ySRBEB2BOr2jRSkj6N mNTHQbSYxBoktdt+VhtrSsxR+i8lwlegx+RNRFmKK3VH5da2nfiBaOY7zBQQHxCK yxQvXvsljSVpfkFKLc/S2nLQL1zTkRfFKV1Xmd3+3owR+EoqM60= =NpMX -----END PGP SIGNATURE----- Merge tag 'asm-generic-mmu-context-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic mmu-context cleanup from Arnd Bergmann: "This is a cleanup series from Nicholas Piggin, preparing for later changes. The asm/mmu_context.h header are generalized and common code moved to asm-gneneric/mmu_context.h. This saves a bit of code and makes it easier to change in the future" * tag 'asm-generic-mmu-context-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (25 commits) h8300: Fix generic mmu_context build m68k: mmu_context: Fix Sun-3 build xtensa: use asm-generic/mmu_context.h for no-op implementations x86: use asm-generic/mmu_context.h for no-op implementations um: use asm-generic/mmu_context.h for no-op implementations sparc: use asm-generic/mmu_context.h for no-op implementations sh: use asm-generic/mmu_context.h for no-op implementations s390: use asm-generic/mmu_context.h for no-op implementations riscv: use asm-generic/mmu_context.h for no-op implementations powerpc: use asm-generic/mmu_context.h for no-op implementations parisc: use asm-generic/mmu_context.h for no-op implementations openrisc: use asm-generic/mmu_context.h for no-op implementations nios2: use asm-generic/mmu_context.h for no-op implementations nds32: use asm-generic/mmu_context.h for no-op implementations mips: use asm-generic/mmu_context.h for no-op implementations microblaze: use asm-generic/mmu_context.h for no-op implementations m68k: use asm-generic/mmu_context.h for no-op implementations ia64: use asm-generic/mmu_context.h for no-op implementations hexagon: use asm-generic/mmu_context.h for no-op implementations csky: use asm-generic/mmu_context.h for no-op implementations ... |
||
Linus Torvalds
|
edd7ab7684 |
The new preemtible kmap_local() implementation:
- Consolidate all kmap_atomic() internals into a generic implementation which builds the base for the kmap_local() API and make the kmap_atomic() interface wrappers which handle the disabling/enabling of preemption and pagefaults. - Switch the storage from per-CPU to per task and provide scheduler support for clearing mapping when scheduling out and restoring them when scheduling back in. - Merge the migrate_disable/enable() code, which is also part of the scheduler pull request. This was required to make the kmap_local() interface available which does not disable preemption when a mapping is established. It has to disable migration instead to guarantee that the virtual address of the mapped slot is the same accross preemption. - Provide better debug facilities: guard pages and enforced utilization of the mapping mechanics on 64bit systems when the architecture allows it. - Provide the new kmap_local() API which can now be used to cleanup the kmap_atomic() usage sites all over the place. Most of the usage sites do not require the implicit disabling of preemption and pagefaults so the penalty on 64bit and 32bit non-highmem systems is removed and quite some of the code can be simplified. A wholesale conversion is not possible because some usage depends on the implicit side effects and some need to be cleaned up because they work around these side effects. The migrate disable side effect is only effective on highmem systems and when enforced debugging is enabled. On 64bit and 32bit non-highmem systems the overhead is completely avoided. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/XyQwTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoUolD/9+R+BX96fGir+I8rG9dc3cbLw5meSi 0I/Nq3PToZMs2Iqv50DsoaPYHHz/M6fcAO9LRIgsE9jRbnY93GnsBM0wU9Y8yQaT 4wUzOG5WHaLDfqIkx/CN9coUl458oEiwOEbn79A2FmPXFzr7IpkufnV3ybGDwzwP p73bjMJMPPFrsa9ig87YiYfV/5IAZHi82PN8Cq1v4yNzgXRP3Tg6QoAuCO84ZnWF RYlrfKjcJ2xPdn+RuYyXolPtxr1hJQ0bOUpe4xu/UfeZjxZ7i1wtwLN9kWZe8CKH +x4Lz8HZZ5QMTQ9sCHOLtKzu2MceMcpISzoQH4/aFQCNMgLn1zLbS790XkYiQCuR ne9Cua+IqgYfGMG8cq8+bkU9HCNKaXqIBgPEKE/iHYVmqzCOqhW5Cogu4KFekf6V Wi7pyyUdX2en8BAWpk5NHc8de9cGcc+HXMq2NIcgXjVWvPaqRP6DeITERTZLJOmz XPxq5oPLGl7wdm7z+ICIaNApy8zuxpzb6sPLNcn7l5OeorViORlUu08AN8587wAj FiVjp6ZYomg+gyMkiNkDqFOGDH5TMENpOFoB0hNNEyJwwS0xh6CgWuwZcv+N8aPO HuS/P+tNANbD8ggT4UparXYce7YCtgOf3IG4GA3JJYvYmJ6pU+AZOWRoDScWq4o+ +jlfoJhMbtx5Gg== =n71I -----END PGP SIGNATURE----- Merge tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kmap updates from Thomas Gleixner: "The new preemtible kmap_local() implementation: - Consolidate all kmap_atomic() internals into a generic implementation which builds the base for the kmap_local() API and make the kmap_atomic() interface wrappers which handle the disabling/enabling of preemption and pagefaults. - Switch the storage from per-CPU to per task and provide scheduler support for clearing mapping when scheduling out and restoring them when scheduling back in. - Merge the migrate_disable/enable() code, which is also part of the scheduler pull request. This was required to make the kmap_local() interface available which does not disable preemption when a mapping is established. It has to disable migration instead to guarantee that the virtual address of the mapped slot is the same across preemption. - Provide better debug facilities: guard pages and enforced utilization of the mapping mechanics on 64bit systems when the architecture allows it. - Provide the new kmap_local() API which can now be used to cleanup the kmap_atomic() usage sites all over the place. Most of the usage sites do not require the implicit disabling of preemption and pagefaults so the penalty on 64bit and 32bit non-highmem systems is removed and quite some of the code can be simplified. A wholesale conversion is not possible because some usage depends on the implicit side effects and some need to be cleaned up because they work around these side effects. The migrate disable side effect is only effective on highmem systems and when enforced debugging is enabled. On 64bit and 32bit non-highmem systems the overhead is completely avoided" * tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) ARM: highmem: Fix cache_is_vivt() reference x86/crashdump/32: Simplify copy_oldmem_page() io-mapping: Provide iomap_local variant mm/highmem: Provide kmap_local* sched: highmem: Store local kmaps in task struct x86: Support kmap_local() forced debugging mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL microblaze/mm/highmem: Add dropped #ifdef back xtensa/mm/highmem: Make generic kmap_atomic() work correctly mm/highmem: Take kmap_high_get() properly into account highmem: High implementation details and document API Documentation/io-mapping: Remove outdated blurb io-mapping: Cleanup atomic iomap mm/highmem: Remove the old kmap_atomic cruft highmem: Get rid of kmap_types.h xtensa/mm/highmem: Switch to generic kmap atomic sparc/mm/highmem: Switch to generic kmap atomic powerpc/mm/highmem: Switch to generic kmap atomic nds32/mm/highmem: Switch to generic kmap atomic ... |
||
Michal Simek
|
05cdf45747 |
microblaze: Remove noMMU code
This configuration is obsolete and likely none is really using it. That's why remove it to simplify code. Note about CONFIG_MMU in hw_exception_handler.S is left intentionally for better comment understanding. Cc: Mike Rapoport <rppt@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michal Simek <michal.simek@xilinx.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/43486cab370e0c0a79860120b71e0caac75a7e44.1606397528.git.michal.simek@xilinx.com |
||
Peter Zijlstra
|
58c644ba51 |
sched/idle: Fix arch_cpu_idle() vs tracing
We call arch_cpu_idle() with RCU disabled, but then use local_irq_{en,dis}able(), which invokes tracing, which relies on RCU. Switch all arch_cpu_idle() implementations to use raw_local_irq_{en,dis}able() and carefully manage the lockdep,rcu,tracing state like we do in entry. (XXX: we really should change arch_cpu_idle() to not return with interrupts enabled) Reported-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lkml.kernel.org/r/20201120114925.594122626@infradead.org |
||
Thomas Gleixner
|
a0e1699783 |
microblaze/mm/highmem: Add dropped #ifdef back
The conversion to generic kmap atomic broke microblaze by removing the
build fail.
Add it back.
Fixes:
|
||
Jens Axboe
|
ed2124c0b9 |
microblaze: add support for TIF_NOTIFY_SIGNAL
Wire up TIF_NOTIFY_SIGNAL handling for microblaze. Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/a2b78afc-5f60-8590-9df5-17302e356552@kernel.dk Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Geert Uytterhoeven
|
e167a59c65 |
microblaze: Replace <linux/clk-provider.h> by <linux/of_clk.h>
The MicroBlaze platform code is not a clock provider, and just needs to call of_clk_init(). Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20201110154851.3285695-1-geert+renesas@glider.be Signed-off-by: Michal Simek <michal.simek@xilinx.com> |
||
Thomas Gleixner
|
7ac1b26b0a |
microblaze/mm/highmem: Switch to generic kmap atomic
No reason having the same code in every architecture. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Michal Simek <monstr@monstr.eu> Cc: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20201103095857.777445435@linutronix.de |
||
Arnd Bergmann
|
0774a6ed29 |
timekeeping: default GENERIC_CLOCKEVENTS to enabled
Almost all machines use GENERIC_CLOCKEVENTS, so it feels wrong to
require each one to select that symbol manually.
Instead, enable it whenever CONFIG_LEGACY_TIMER_TICK is disabled as
a simplification. It should be possible to select both
GENERIC_CLOCKEVENTS and LEGACY_TIMER_TICK from an architecture now
and decide at runtime between the two.
For the clockevents arch-support.txt file, this means that additional
architectures are marked as TODO when they have at least one machine
that still uses LEGACY_TIMER_TICK, rather than being marked 'ok' when
at least one machine has been converted. This means that both m68k and
arm (for riscpc) revert to TODO.
At this point, we could just always enable CONFIG_GENERIC_CLOCKEVENTS
rather than leaving it off when not needed. I built an m68k
defconfig kernel (using gcc-10.1.0) and found that this would add
around 5.5KB in kernel image size:
text data bss dec hex filename
3861936 1092236 196656
|
||
Nicholas Piggin
|
97f130106f |
microblaze: use asm-generic/mmu_context.h for no-op implementations
Wire asm-generic/mmu_context.h to provide generic empty hooks for arch code simplification. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
Nicholas Piggin
|
94f89922e1 |
asm-generic: add generic MMU versions of mmu context functions
Many of these are no-ops on many architectures, so extend mmu_context.h to cover MMU and NOMMU, and split the NOMMU bits out to nommu_context.h Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: linux-arch@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> |
||
Joe Perches
|
33def8498f |
treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid complications with clang and gcc differences. Remove the quote operator # from compiler_attributes.h __section macro. Convert all unquoted __section(foo) uses to quoted __section("foo"). Also convert __attribute__((section("foo"))) uses to __section("foo") even if the __attribute__ has multiple list entry forms. Conversion done using the script at: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
4a22709e21 |
arch-cleanup-2020-10-22
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+SOXIQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgptrcD/93VUDmRAn73ChKNd0TtXUicJlAlNLVjvfs VFTXWBDnlJnGkZT7ElkDD9b8dsz8l4xGf/QZ5dzhC/th2OsfObQkSTfe0lv5cCQO mX7CRSrDpjaHtW+WGPDa0oQsGgIfpqUz2IOg9NKbZZ1LJ2uzYfdOcf3oyRgwZJ9B I3sh1vP6OzjZVVCMmtMTM+sYZEsDoNwhZwpkpiwMmj8tYtOPgKCYKpqCiXrGU0x2 ML5FtDIwiwU+O3zYYdCBWqvCb2Db0iA9Aov2whEBz/V2jnmrN5RMA/90UOh1E2zG br4wM1Wt3hNrtj5qSxZGlF/HEMYJVB8Z2SgMjYu4vQz09qRVVqpGdT/dNvLAHQWg w4xNCj071kVZDQdfwnqeWSKYUau9Xskvi8xhTT+WX8a5CsbVrM9vGslnS5XNeZ6p h2D3Q+TAYTvT756icTl0qsYVP7PrPY7DdmQYu0q+Lc3jdGI+jyxO2h9OFBRLZ3p6 zFX2N8wkvvCCzP2DwVnnhIi/GovpSh7ksHnb039F36Y/IhZPqV1bGqdNQVdanv6I 8fcIDM6ltRQ7dO2Br5f1tKUZE9Pm6x60b/uRVjhfVh65uTEKyGRhcm5j9ztzvQfI cCBg4rbVRNKolxuDEkjsAFXVoiiEEsb7pLf4pMO+Dr62wxFG589tQNySySneUIVZ J9ILnGAAeQ== =aVWo -----END PGP SIGNATURE----- Merge tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block Pull arch task_work cleanups from Jens Axboe: "Two cleanups that don't fit other categories: - Finally get the task_work_add() cleanup done properly, so we don't have random 0/1/false/true/TWA_SIGNAL confusing use cases. Updates all callers, and also fixes up the documentation for task_work_add(). - While working on some TIF related changes for 5.11, this TIF_NOTIFY_RESUME cleanup fell out of that. Remove some arch duplication for how that is handled" * tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block: task_work: cleanup notification modes tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume() |
||
Linus Torvalds
|
f56e65dff6 |
Merge branch 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull initial set_fs() removal from Al Viro: "Christoph's set_fs base series + fixups" * 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Allow a NULL pos pointer to __kernel_read fs: Allow a NULL pos pointer to __kernel_write powerpc: remove address space overrides using set_fs() powerpc: use non-set_fs based maccess routines x86: remove address space overrides using set_fs() x86: make TASK_SIZE_MAX usable from assembly code x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h lkdtm: remove set_fs-based tests test_bitmap: remove user bitmap tests uaccess: add infrastructure for kernel builds with set_fs() fs: don't allow splice read/write without explicit ops fs: don't allow kernel reads and writes without iter ops sysctl: Convert to iter interfaces proc: add a read_iter method to proc proc_ops proc: cleanup the compat vs no compat file ops proc: remove a level of indentation in proc_get_inode |
||
Minchan Kim
|
ecb8ac8b1f |
mm/madvise: introduce process_madvise() syscall: an external memory hinting API
There is usecase that System Management Software(SMS) want to give a memory hint like MADV_[COLD|PAGEEOUT] to other processes and in the case of Android, it is the ActivityManagerService. The information required to make the reclaim decision is not known to the app. Instead, it is known to the centralized userspace daemon(ActivityManagerService), and that daemon must be able to initiate reclaim on its own without any app involvement. To solve the issue, this patch introduces a new syscall process_madvise(2). It uses pidfd of an external process to give the hint. It also supports vector address range because Android app has thousands of vmas due to zygote so it's totally waste of CPU and power if we should call the syscall one by one for each vma.(With testing 2000-vma syscall vs 1-vector syscall, it showed 15% performance improvement. I think it would be bigger in real practice because the testing ran very cache friendly environment). Another potential use case for the vector range is to amortize the cost ofTLB shootdowns for multiple ranges when using MADV_DONTNEED; this could benefit users like TCP receive zerocopy and malloc implementations. In future, we could find more usecases for other advises so let's make it happens as API since we introduce a new syscall at this moment. With that, existing madvise(2) user could replace it with process_madvise(2) with their own pid if they want to have batch address ranges support feature. ince it could affect other process's address range, only privileged process(PTRACE_MODE_ATTACH_FSCREDS) or something else(e.g., being the same UID) gives it the right to ptrace the process could use it successfully. The flag argument is reserved for future use if we need to extend the API. I think supporting all hints madvise has/will supported/support to process_madvise is rather risky. Because we are not sure all hints make sense from external process and implementation for the hint may rely on the caller being in the current context so it could be error-prone. Thus, I just limited hints as MADV_[COLD|PAGEOUT] in this patch. If someone want to add other hints, we could hear the usecase and review it for each hint. It's safer for maintenance rather than introducing a buggy syscall but hard to fix it later. So finally, the API is as follows, ssize_t process_madvise(int pidfd, const struct iovec *iovec, unsigned long vlen, int advice, unsigned int flags); DESCRIPTION The process_madvise() system call is used to give advice or directions to the kernel about the address ranges from external process as well as local process. It provides the advice to address ranges of process described by iovec and vlen. The goal of such advice is to improve system or application performance. The pidfd selects the process referred to by the PID file descriptor specified in pidfd. (See pidofd_open(2) for further information) The pointer iovec points to an array of iovec structures, defined in <sys/uio.h> as: struct iovec { void *iov_base; /* starting address */ size_t iov_len; /* number of bytes to be advised */ }; The iovec describes address ranges beginning at address(iov_base) and with size length of bytes(iov_len). The vlen represents the number of elements in iovec. The advice is indicated in the advice argument, which is one of the following at this moment if the target process specified by pidfd is external. MADV_COLD MADV_PAGEOUT Permission to provide a hint to external process is governed by a ptrace access mode PTRACE_MODE_ATTACH_FSCREDS check; see ptrace(2). The process_madvise supports every advice madvise(2) has if target process is in same thread group with calling process so user could use process_madvise(2) to extend existing madvise(2) to support vector address ranges. RETURN VALUE On success, process_madvise() returns the number of bytes advised. This return value may be less than the total number of requested bytes, if an error occurred. The caller should check return value to determine whether a partial advice occurred. FAQ: Q.1 - Why does any external entity have better knowledge? Quote from Sandeep "For Android, every application (including the special SystemServer) are forked from Zygote. The reason of course is to share as many libraries and classes between the two as possible to benefit from the preloading during boot. After applications start, (almost) all of the APIs end up calling into this SystemServer process over IPC (binder) and back to the application. In a fully running system, the SystemServer monitors every single process periodically to calculate their PSS / RSS and also decides which process is "important" to the user for interactivity. So, because of how these processes start _and_ the fact that the SystemServer is looping to monitor each process, it does tend to *know* which address range of the application is not used / useful. Besides, we can never rely on applications to clean things up themselves. We've had the "hey app1, the system is low on memory, please trim your memory usage down" notifications for a long time[1]. They rely on applications honoring the broadcasts and very few do. So, if we want to avoid the inevitable killing of the application and restarting it, some way to be able to tell the OS about unimportant memory in these applications will be useful. - ssp Q.2 - How to guarantee the race(i.e., object validation) between when giving a hint from an external process and get the hint from the target process? process_madvise operates on the target process's address space as it exists at the instant that process_madvise is called. If the space target process can run between the time the process_madvise process inspects the target process address space and the time that process_madvise is actually called, process_madvise may operate on memory regions that the calling process does not expect. It's the responsibility of the process calling process_madvise to close this race condition. For example, the calling process can suspend the target process with ptrace, SIGSTOP, or the freezer cgroup so that it doesn't have an opportunity to change its own address space before process_madvise is called. Another option is to operate on memory regions that the caller knows a priori will be unchanged in the target process. Yet another option is to accept the race for certain process_madvise calls after reasoning that mistargeting will do no harm. The suggested API itself does not provide synchronization. It also apply other APIs like move_pages, process_vm_write. The race isn't really a problem though. Why is it so wrong to require that callers do their own synchronization in some manner? Nobody objects to write(2) merely because it's possible for two processes to open the same file and clobber each other's writes --- instead, we tell people to use flock or something. Think about mmap. It never guarantees newly allocated address space is still valid when the user tries to access it because other threads could unmap the memory right before. That's where we need synchronization by using other API or design from userside. It shouldn't be part of API itself. If someone needs more fine-grained synchronization rather than process level, there were two ideas suggested - cookie[2] and anon-fd[3]. Both are applicable via using last reserved argument of the API but I don't think it's necessary right now since we have already ways to prevent the race so don't want to add additional complexity with more fine-grained optimization model. To make the API extend, it reserved an unsigned long as last argument so we could support it in future if someone really needs it. Q.3 - Why doesn't ptrace work? Injecting an madvise in the target process using ptrace would not work for us because such injected madvise would have to be executed by the target process, which means that process would have to be runnable and that creates the risk of the abovementioned race and hinting a wrong VMA. Furthermore, we want to act the hint in caller's context, not the callee's, because the callee is usually limited in cpuset/cgroups or even freezed state so they can't act by themselves quick enough, which causes more thrashing/kill. It doesn't work if the target process are ptraced(e.g., strace, debugger, minidump) because a process can have at most one ptracer. [1] https://developer.android.com/topic/performance/memory" [2] process_getinfo for getting the cookie which is updated whenever vma of process address layout are changed - Daniel Colascione - https://lore.kernel.org/lkml/20190520035254.57579-1-minchan@kernel.org/T/#m7694416fd179b2066a2c62b5b139b14e3894e224 [3] anonymous fd which is used for the object(i.e., address range) validation - Michal Hocko - https://lore.kernel.org/lkml/20200120112722.GY18451@dhcp22.suse.cz/ [minchan@kernel.org: fix process_madvise build break for arm64] Link: http://lkml.kernel.org/r/20200303145756.GA219683@google.com [minchan@kernel.org: fix build error for mips of process_madvise] Link: http://lkml.kernel.org/r/20200508052517.GA197378@google.com [akpm@linux-foundation.org: fix patch ordering issue] [akpm@linux-foundation.org: fix arm64 whoops] [minchan@kernel.org: make process_madvise() vlen arg have type size_t, per Florian] [akpm@linux-foundation.org: fix i386 build] [sfr@canb.auug.org.au: fix syscall numbering] Link: https://lkml.kernel.org/r/20200905142639.49fc3f1a@canb.auug.org.au [sfr@canb.auug.org.au: madvise.c needs compat.h] Link: https://lkml.kernel.org/r/20200908204547.285646b4@canb.auug.org.au [minchan@kernel.org: fix mips build] Link: https://lkml.kernel.org/r/20200909173655.GC2435453@google.com [yuehaibing@huawei.com: remove duplicate header which is included twice] Link: https://lkml.kernel.org/r/20200915121550.30584-1-yuehaibing@huawei.com [minchan@kernel.org: do not use helper functions for process_madvise] Link: https://lkml.kernel.org/r/20200921175539.GB387368@google.com [akpm@linux-foundation.org: pidfd_get_pid() gained an argument] [sfr@canb.auug.org.au: fix up for "iov_iter: transparently handle compat iovecs in import_iovec"] Link: https://lkml.kernel.org/r/20200928212542.468e1fef@canb.auug.org.au Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Brian Geffon <bgeffon@google.com> Cc: Christian Brauner <christian@brauner.io> Cc: Daniel Colascione <dancol@google.com> Cc: Jann Horn <jannh@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Dias <joaodias@google.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksandr Natalenko <oleksandr@redhat.com> Cc: Sandeep Patil <sspatil@google.com> Cc: SeongJae Park <sj38.park@gmail.com> Cc: SeongJae Park <sjpark@amazon.de> Cc: Shakeel Butt <shakeelb@google.com> Cc: Sonny Rao <sonnyrao@google.com> Cc: Tim Murray <timmurray@google.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Florian Weimer <fw@deneb.enyo.de> Cc: <linux-man@vger.kernel.org> Link: http://lkml.kernel.org/r/20200302193630.68771-3-minchan@kernel.org Link: http://lkml.kernel.org/r/20200508183320.GA125527@google.com Link: http://lkml.kernel.org/r/20200622192900.22757-4-minchan@kernel.org Link: https://lkml.kernel.org/r/20200901000633.1920247-4-minchan@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Jens Axboe
|
3c532798ec |
tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()
All the callers currently do this, clean it up and move the clearing into tracehook_notify_resume() instead. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
Linus Torvalds
|
5a32c3413d |
dma-mapping updates for 5.10
- rework the non-coherent DMA allocator - move private definitions out of <linux/dma-mapping.h> - lower CMA_ALIGNMENT (Paul Cercueil) - remove the omap1 dma address translation in favor of the common code - make dma-direct aware of multiple dma offset ranges (Jim Quinlan) - support per-node DMA CMA areas (Barry Song) - increase the default seg boundary limit (Nicolin Chen) - misc fixes (Robin Murphy, Thomas Tai, Xu Wang) - various cleanups -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl+IiPwLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYPKEQ//TM8vxjucnRl/pklpMin49dJorwiVvROLhQqLmdxw 286ZKpVzYYAPc7LnNqwIBugnFZiXuHu8xPKQkIiOa2OtNDTwhKNoBxOAmOJaV6DD 8JfEtZYeX5mKJ/Nqd2iSkIqOvCwZ9Wzii+aytJ2U88wezQr1fnyF4X49MegETEey FHWreSaRWZKa0MMRu9AQ0QxmoNTHAQUNaPc0PeqEtPULybfkGOGw4/ghSB7WcKrA gtKTuooNOSpVEHkTas2TMpcBp6lxtOjFqKzVN0ml+/nqq5NeTSDx91VOCX/6Cj76 mXIg+s7fbACTk/BmkkwAkd0QEw4fo4tyD6Bep/5QNhvEoAriTuSRbhvLdOwFz0EF vhkF0Rer6umdhSK7nPd7SBqn8kAnP4vBbdmB68+nc3lmkqysLyE4VkgkdH/IYYQI 6TJ0oilXWFmU6DT5Rm4FBqCvfcEfU2dUIHJr5wZHqrF2kLzoZ+mpg42fADoG4GuI D/oOsz7soeaRe3eYfWybC0omGR6YYPozZJ9lsfftcElmwSsFrmPsbO1DM5IBkj1B gItmEbOB9ZK3RhIK55T/3u1UWY3Uc/RVr+kchWvADGrWnRQnW0kxYIqDgiOytLFi JZNH8uHpJIwzoJAv6XXSPyEUBwXTG+zK37Ce769HGbUEaUrE71MxBbQAQsK8mDpg 7fM= =Bkf/ -----END PGP SIGNATURE----- Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - rework the non-coherent DMA allocator - move private definitions out of <linux/dma-mapping.h> - lower CMA_ALIGNMENT (Paul Cercueil) - remove the omap1 dma address translation in favor of the common code - make dma-direct aware of multiple dma offset ranges (Jim Quinlan) - support per-node DMA CMA areas (Barry Song) - increase the default seg boundary limit (Nicolin Chen) - misc fixes (Robin Murphy, Thomas Tai, Xu Wang) - various cleanups * tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits) ARM/ixp4xx: add a missing include of dma-map-ops.h dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling dma-direct: factor out a dma_direct_alloc_from_pool helper dma-direct check for highmem pages in dma_direct_alloc_pages dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h> dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma dma-mapping: move dma-debug.h to kernel/dma/ dma-mapping: remove <asm/dma-contiguous.h> dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h> dma-contiguous: remove dma_contiguous_set_default dma-contiguous: remove dev_set_cma_area dma-contiguous: remove dma_declare_contiguous dma-mapping: split <linux/dma-mapping.h> cma: decrease CMA_ALIGNMENT lower limit to 2 firewire-ohci: use dma_alloc_pages dma-iommu: implement ->alloc_noncoherent dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods dma-mapping: add a new dma_alloc_pages API dma-mapping: remove dma_cache_sync 53c700: convert to dma_alloc_noncoherent ... |
||
Linus Torvalds
|
d5660df4a5 |
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton: "181 patches. Subsystems affected by this patch series: kbuild, scripts, ntfs, ocfs2, vfs, mm (slab, slub, kmemleak, dax, debug, pagecache, fadvise, gup, swap, memremap, memcg, selftests, pagemap, mincore, hmm, dma, memory-failure, vmallo and migration)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (181 commits) mm/migrate: remove obsolete comment about device public mm/migrate: remove cpages-- in migrate_vma_finalize() mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary memblock: use separate iterators for memory and reserved regions memblock: implement for_each_reserved_mem_region() using __next_mem_region() memblock: remove unused memblock_mem_size() x86/setup: simplify reserve_crashkernel() x86/setup: simplify initrd relocation and reservation arch, drivers: replace for_each_membock() with for_each_mem_range() arch, mm: replace for_each_memblock() with for_each_mem_pfn_range() memblock: reduce number of parameters in for_each_mem_range() memblock: make memblock_debug and related functionality private memblock: make for_each_memblock_type() iterator private mircoblaze: drop unneeded NUMA and sparsemem initializations riscv: drop unneeded node initialization h8300, nds32, openrisc: simplify detection of memory extents arm64: numa: simplify dummy_numa_init() arm, xtensa: simplify initialization of high memory pages dma-contiguous: simplify cma_early_percent_memory() KVM: PPC: Book3S HV: simplify kvm_cma_reserve() ... |
||
Mike Rapoport
|
b10d6bca87 |
arch, drivers: replace for_each_membock() with for_each_mem_range()
There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start = __pfn_to_phys(memblock_region_memory_base_pfn(reg); end = __pfn_to_phys(memblock_region_memory_end_pfn(reg)); /* do something with start and end */ } Using for_each_mem_range() iterator is more appropriate in such cases and allows simpler and cleaner code. [akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build] [rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring] Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Mike Rapoport
|
49645793bc |
mircoblaze: drop unneeded NUMA and sparsemem initializations
microblaze does not support neither NUMA not SPARSMEM, so there is no point to call memblock_set_node() and sparse_memory_present_with_active_regions() functions during microblaze memory initialization. Remove these calls and the surrounding code. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Daniel Axtens <dja@axtens.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: https://lkml.kernel.org/r/20200818151634.14343-8-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
8b05418b25 |
seccomp updates for v5.10-rc1
- heavily refactor seccomp selftests (and clone3 selftests dependency) to fix powerpc (Kees Cook, Thadeu Lima de Souza Cascardo) - fix style issue in selftests (Zou Wei) - upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich Felker) - replace task_pt_regs(current) with current_pt_regs() (Denis Efremov) - fix corner-case race in USER_NOTIF (Jann Horn) - make CONFIG_SECCOMP no longer per-arch (YiFei Zhu) -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl+E1LAWHGtlZXNjb29r QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJgRfD/0cq7W51+o34719vefC+oZaMjJJ Bd5HYshmr6NRpMqn0OhtT9kVi6OeV0sK0VJeNxSISDIaGNJ8xCI9YhnXwzY+7myK +IQu3i2Hv7dlWvTaXWFLL+mvfk6WopLntFGGJQ8KPMnP2gcfH2AZmOeAKGFGhBDe NwpAUZ9zriXg9JCQp6u0FzPJgk8KfgfHjUY6Hsa095gg0aPSJhc8bWEUNBQwjCe6 uIcxDP/zK2WWaEhO9BfHt6/VTcXw7QgTLS3yM+pwBCgR1JHs7HMhtgcwPT410qES LmYD8OiHmv5AZhDjcCcNipKEv3ZnxkLnpU/6hfaKM4zn/DoaR/zbfjO9U017rcNV 9gf7k5siAP7DH48IFlqf4Erzd3xyF0OJDnVfC7NiPtggPfO9aWOHJJZCuJRQOdrN qPMjkaQzFb02qb501PLEn55F24OLDjz1vFOqpkJm2/XamOBVV4uiRKmfpNEo/MOf QkhSvzvwEFErWwzPH95uFyVhs42stwnM3ppnwtya2+U5kxXdNvbAR8N5leH7siaU ab+YJIHW59+BxXTlKgXIcqBP/6RqJWJtuT9OqGs0K2A7FhQSexh5MOm+9vvGgIwZ Qjyijku8dB3aV94BNGnlJq6BV+4Hc6EGadh7h3b8GiRAUTYo0pk5G/iKL6Ii+R6p 0msJENqalKFtNCr70w== =a4u2 -----END PGP SIGNATURE----- Merge tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "The bulk of the changes are with the seccomp selftests to accommodate some powerpc-specific behavioral characteristics. Additional cleanups, fixes, and improvements are also included: - heavily refactor seccomp selftests (and clone3 selftests dependency) to fix powerpc (Kees Cook, Thadeu Lima de Souza Cascardo) - fix style issue in selftests (Zou Wei) - upgrade "unknown action" from KILL_THREAD to KILL_PROCESS (Rich Felker) - replace task_pt_regs(current) with current_pt_regs() (Denis Efremov) - fix corner-case race in USER_NOTIF (Jann Horn) - make CONFIG_SECCOMP no longer per-arch (YiFei Zhu)" * tag 'seccomp-v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits) seccomp: Make duplicate listener detection non-racy seccomp: Move config option SECCOMP to arch/Kconfig selftests/clone3: Avoid OS-defined clone_args selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit selftests/seccomp: Allow syscall nr and ret value to be set separately selftests/seccomp: Record syscall during ptrace entry selftests/seccomp: powerpc: Fix seccomp return value testing selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SET selftests/seccomp: Avoid redundant register flushes selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREG selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREG selftests/seccomp: Remove syscall setting #ifdefs selftests/seccomp: mips: Remove O32-specific macro selftests/seccomp: arm64: Define SYSCALL_NUM_SET macro selftests/seccomp: arm: Define SYSCALL_NUM_SET macro selftests/seccomp: mips: Define SYSCALL_NUM_SET macro selftests/seccomp: Provide generic syscall setting macro selftests/seccomp: Refactor arch register macros to avoid xtensa special case selftests/seccomp: Use __NR_mknodat instead of __NR_mknod selftests/seccomp: Use bitwise instead of arithmetic operator for flags ... |
||
Linus Torvalds
|
024fb66772 |
Microblaze patch for 5.10-rc1
- Fix Kbuild -----BEGIN PGP SIGNATURE----- iF0EABECAB0WIQQbPNTMvXmYlBPRwx7KSWXLKUoMIQUCX4QaqwAKCRDKSWXLKUoM IRtRAJwME21S1TrGKSQox5xBM4qYPyrjRACbBghBPxE2EUlCUFoE2hi1QupTWyk= =Z7i+ -----END PGP SIGNATURE----- Merge tag 'microblaze-v5.10' of git://git.monstr.eu/linux-2.6-microblaze Pull Microblaze build warning fix from Michal Simek. * tag 'microblaze-v5.10' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: fix kbuild redundant file warning |
||
YiFei Zhu
|
282a181b1a |
seccomp: Move config option SECCOMP to arch/Kconfig
In order to make adding configurable features into seccomp easier, it's better to have the options at one single location, considering especially that the bulk of seccomp code is arch-independent. An quick look also show that many SECCOMP descriptions are outdated; they talk about /proc rather than prctl. As a result of moving the config option and keeping it default on, architectures arm, arm64, csky, riscv, sh, and xtensa did not have SECCOMP on by default prior to this and SECCOMP will be default in this change. Architectures microblaze, mips, powerpc, s390, sh, and sparc have an outdated depend on PROC_FS and this dependency is removed in this change. Suggested-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@mail.gmail.com/ Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu> [kees: added HAVE_ARCH_SECCOMP help text, tweaked wording] Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/9ede6ef35c847e58d61e476c6a39540520066613.1600951211.git.yifeifz2@illinois.edu |
||
Christoph Hellwig
|
9f4df96b87 |
dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
Move more nitty gritty DMA implementation details into the common internal header. Signed-off-by: Christoph Hellwig <hch@lst.de> |
||
Christoph Hellwig
|
a1fd09e8e6 |
dma-mapping: move dma-debug.h to kernel/dma/
Most of dma-debug.h is not required by anything outside of kernel/dma. Move the four declarations needed by dma-mappin.h or dma-ops providers into dma-mapping.h and dma-map-ops.h, and move the remainder of the file to kernel/dma/debug.h. Signed-off-by: Christoph Hellwig <hch@lst.de> |
||
Christoph Hellwig
|
0b1abd1fb7 |
dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment describing the contiguous allocator into kernel/dma/contigous.c. Signed-off-by: Christoph Hellwig <hch@lst.de> |
||
Christoph Hellwig
|
5e6e9852d6 |
uaccess: add infrastructure for kernel builds with set_fs()
Add a CONFIG_SET_FS option that is selected by architecturess that implement set_fs, which is all of them initially. If the option is not set stubs for routines related to overriding the address space are provided so that architectures can start to opt out of providing set_fs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
||
Randy Dunlap
|
4a17e85133 |
microblaze: fix kbuild redundant file warning
Fix build warning since this file is already listed in
include/asm-generic/Kbuild.
../scripts/Makefile.asm-generic:25: redundant generic-y found in arch/microblaze/include/asm/Kbuild: hw_irq.h
Fixes:
|
||
Randy Dunlap
|
08134e79e0 |
microblaze: fix min_low_pfn/max_low_pfn build errors
Fix min_low_pfn/max_low_pfn build errors for arch/microblaze/: (e.g.) ERROR: "min_low_pfn" [drivers/rpmsg/virtio_rpmsg_bus.ko] undefined! ERROR: "min_low_pfn" [drivers/hwtracing/intel_th/intel_th_msu_sink.ko] undefined! ERROR: "min_low_pfn" [drivers/hwtracing/intel_th/intel_th_msu.ko] undefined! ERROR: "min_low_pfn" [drivers/mmc/core/mmc_core.ko] undefined! ERROR: "min_low_pfn" [drivers/md/dm-crypt.ko] undefined! ERROR: "min_low_pfn" [drivers/net/wireless/ath/ath6kl/ath6kl_sdio.ko] undefined! ERROR: "min_low_pfn" [crypto/tcrypt.ko] undefined! ERROR: "min_low_pfn" [crypto/asymmetric_keys/asym_tpm.ko] undefined! Mike had/has an alternate patch for Microblaze: https://lore.kernel.org/lkml/20200630111519.GA1951986@linux.ibm.com/ David suggested just exporting min_low_pfn & max_low_pfn in mm/memblock.c: https://lore.kernel.org/lkml/alpine.DEB.2.22.394.2006291911220.1118534@chino.kir.corp.google.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Michal Simek <monstr@monstr.eu> Acked-by: David Rientjes <rientjes@google.com> Cc: linux-mm@kvack.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> |
||
Gustavo A. R. Silva
|
df561f6688 |
treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> |
||
Xiaoming Ni
|
88db0aa242 |
all arch: remove system call sys_sysctl
Since commit
|
||
Peter Xu
|
aeb6aefc31 |
mm/microblaze: use general page fault accounting
Use the general page fault accounting by passing regs into handle_mm_fault(). It naturally solve the issue of multiple page fault accounting when page fault retry happened. Add the missing PERF_COUNT_SW_PAGE_FAULTS perf events too. Note, the other two perf events (PERF_COUNT_SW_PAGE_FAULTS_[MAJ|MIN]) were done in handle_mm_fault(). Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Simek <monstr@monstr.eu> Link: http://lkml.kernel.org/r/20200707225021.200906-11-peterx@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Peter Xu
|
bce617edec |
mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.
This is v5 of the pf accounting cleanup series. It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit
|
||
Christoph Hellwig
|
428e2976a5 |
uaccess: remove segment_eq
segment_eq is only used to implement uaccess_kernel. Just open code uaccess_kernel in the arch uaccess headers and remove one layer of indirection. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Greentime Hu <green.hu@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Mike Rapoport
|
c89ab04feb |
mm/sparse: cleanup the code surrounding memory_present()
After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP we have two equivalent functions that call memory_present() for each region in memblock.memory: sparse_memory_present_with_active_regions() and membocks_present(). Moreover, all architectures have a call to either of these functions preceding the call to sparse_init() and in the most cases they are called one after the other. Mark the regions from memblock.memory as present during sparce_init() by making sparse_init() call memblocks_present(), make memblocks_present() and memory_present() functions static and remove redundant sparse_memory_present_with_active_regions() function. Also remove no longer required HAVE_MEMORY_PRESENT configuration option. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Mike Rapoport
|
f9cb654cb5 |
asm-generic: pgalloc: provide generic pgd_free()
Most architectures define pgd_free() as a wrapper for free_page(). Provide a generic version in asm-generic/pgalloc.h and enable its use for most architectures. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200627143453.31835-7-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Mike Rapoport
|
ca15ca406f |
mm: remove unneeded includes of <asm/pgalloc.h>
Patch series "mm: cleanup usage of <asm/pgalloc.h>" Most architectures have very similar versions of pXd_alloc_one() and pXd_free_one() for intermediate levels of page table. These patches add generic versions of these functions in <asm-generic/pgalloc.h> and enable use of the generic functions where appropriate. In addition, functions declared and defined in <asm/pgalloc.h> headers are used mostly by core mm and early mm initialization in arch and there is no actual reason to have the <asm/pgalloc.h> included all over the place. The first patch in this series removes unneeded includes of <asm/pgalloc.h> In the end it didn't work out as neatly as I hoped and moving pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require unnecessary changes to arches that have custom page table allocations, so I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local to mm/. This patch (of 8): In most cases <asm/pgalloc.h> header is required only for allocations of page table memory. Most of the .c files that include that header do not use symbols declared in <asm/pgalloc.h> and do not require that header. As for the other header files that used to include <asm/pgalloc.h>, it is possible to move that include into the .c file that actually uses symbols from <asm/pgalloc.h> and drop the include from the header file. The process was somewhat automated using sed -i -E '/[<"]asm\/pgalloc\.h/d' \ $(grep -L -w -f /tmp/xx \ $(git grep -E -l '[<"]asm/pgalloc\.h')) where /tmp/xx contains all the symbols defined in arch/*/include/asm/pgalloc.h. [rppt@linux.ibm.com: fix powerpc warning] Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Joerg Roedel <joro@8bytes.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Joerg Roedel <jroedel@suse.de> Cc: Matthew Wilcox <willy@infradead.org> Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Linus Torvalds
|
4f30a60aa7 |
close-range-v5.9
-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygcpgAKCRCRxhvAZXjc ogPeAQDv1ncqtNroFAC4pJ4tQhH7JSjW0OltiMk/AocY/J2SdQD9GJ15luYJ0/om 697q/Z68sndRynhdoZlMuf3oYuBlHQw= =3ZhE -----END PGP SIGNATURE----- Merge tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range() implementation from Christian Brauner: "This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. This is coordinated with the FreeBSD folks which have copied our version of this syscall and in the meantime have already merged it in April 2019: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 The syscall originally came up in a discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive */ unshare(CLONE_FILES); /* we don't want anything past stderr here */ close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/* and calling close() on each file descriptor and other hacks. From looking at various large(ish) userspace code bases this or similar patterns are very common in service managers, container runtimes, and programming language runtimes/standard libraries such as Python or Rust. In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery. Based on Linus' suggestion close_range() also comes with a new flag CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping right before exec. This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which gets especially handy when we're closing all file descriptors above a certain threshold. Test-suite as always included" * tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLOSE_RANGE_UNSHARE tests close_range: add CLOSE_RANGE_UNSHARE tests: add close_range() tests arch: wire-up close_range() open: add close_range() |
||
Christian Brauner
|
714acdbd1c
|
arch: rename copy_thread_tls() back to copy_thread()
Now that HAVE_COPY_THREAD_TLS has been removed, rename copy_thread_tls() back simply copy_thread(). It's a simpler name, and doesn't imply that only tls is copied here. This finishes an outstanding chunk of internal process creation work since we've added clone3(). Cc: linux-arch@vger.kernel.org Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>A Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Greentime Hu <green.hu@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>A Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
140c8180eb
|
arch: remove HAVE_COPY_THREAD_TLS
All architectures support copy_thread_tls() now, so remove the legacy copy_thread() function and the HAVE_COPY_THREAD_TLS config option. Everyone uses the same process creation calling convention based on copy_thread_tls() and struct kernel_clone_args. This will make it easier to maintain the core process creation code under kernel/, simplifies the callpaths and makes the identical for all architectures. Cc: linux-arch@vger.kernel.org Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Acked-by: Greentime Hu <green.hu@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
ad1bb82c05
|
microblaze: switch to copy_thread_tls()
Use the copy_thread_tls() calling convention which passes tls through a register. This is required so we can remove the copy_thread{_tls}() split and remove the HAVE_COPY_THREAD_TLS macro. Cc: Michal Simek <monstr@monstr.eu> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> |
||
Christian Brauner
|
9b4feb630e
|
arch: wire-up close_range()
This wires up the close_range() syscall into all arches at once. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Jann Horn <jannh@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Dmitry V. Levin <ldv@altlinux.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Florian Weimer <fweimer@redhat.com> Cc: linux-api@vger.kernel.org Cc: linux-alpha@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-ia64@vger.kernel.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-mips@vger.kernel.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org |
||
Michel Lespinasse
|
c1e8d7c6a7 |
mmap locking API: convert mmap_sem comments
Convert comments that reference mmap_sem to reference mmap_lock instead. [akpm@linux-foundation.org: fix up linux-next leftovers] [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil] [akpm@linux-foundation.org: more linux-next fixups, per Michel] Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Michel Lespinasse
|
3e4e28c5a8 |
mmap locking API: convert mmap_sem API comments
Convert comments that reference old mmap_sem APIs to reference corresponding new mmap locking APIs instead. Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Michel Lespinasse
|
d8ed45c5dc |
mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Liam Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ying Han <yinghan@google.com> Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Mike Rapoport
|
974b9b2c68 |
mm: consolidate pte_index() and pte_offset_*() definitions
All architectures define pte_index() as (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1) and all architectures define pte_offset_kernel() as an entry in the array of PTEs indexed by the pte_index(). For the most architectures the pte_offset_kernel() implementation relies on the availability of pmd_page_vaddr() that converts a PMD entry value to the virtual address of the page containing PTEs array. Let's move x86 definitions of the PTE accessors to the generic place in <linux/pgtable.h> and then simply drop the respective definitions from the other architectures. The architectures that didn't provide pmd_page_vaddr() are updated to have that defined. The generic implementation of pte_offset_kernel() can be overridden by an architecture and alpha makes use of this because it has special ordering requirements for its version of pte_offset_kernel(). [rppt@linux.ibm.com: v2] Link: http://lkml.kernel.org/r/20200514170327.31389-11-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-12-rppt@kernel.org [rppt@linux.ibm.com: update] Link: http://lkml.kernel.org/r/20200514170327.31389-13-rppt@kernel.org [akpm@linux-foundation.org: fix x86 warning] [sfr@canb.auug.org.au: fix powerpc build] Link: http://lkml.kernel.org/r/20200607153443.GB738695@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-10-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Mike Rapoport
|
e05c7b1f2b |
mm: pgtable: add shortcuts for accessing kernel PMD and PTE
The powerpc 32-bit implementation of pgtable has nice shortcuts for accessing kernel PMD and PTE for a given virtual address. Make these helpers available for all architectures. [rppt@linux.ibm.com: microblaze: fix page table traversal in setup_rt_frame()] Link: http://lkml.kernel.org/r/20200518191511.GD1118872@kernel.org [akpm@linux-foundation.org: s/pmd_ptr_k/pmd_off_k/ in various powerpc places] Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-9-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |